Microsoft says DirectX Raytracing 1.2 will deliver up to 2.3x performance uplift

Microsoft this week announced DirectX Raytracing (DXR) 1.2, an update to its raytracing application programming interface that introduces features meant to improve visual quality and boost rendering performance by up to 2.3 times. AMD, Intel, Nvidia, and Qualcomm, along with game developers such as Remedy, are working to integrate DXR 1.2 technologies into future gaming hardware and software.

The DirectX Raytracing 1.2 update includes Opacity Micromaps (OMM) and Shader Execution Reordering (SER), two technologies that boost performance in raytraced games by up to 2 times (SER) and up to 2.3 times (OMM). Both must be implemented in games or game engines before players see the performance benefits.

2X – 2.3X performance boost

One of the main issues with alpha-tested geometry (foliage, fences, hair, etc.) in raytracing is the extra work required to determine whether a ray hits a surface or passes through it. Such geometry is typically a flat surface carrying a texture with an alpha channel, and texels below a certain transparency threshold are treated as empty, a check that a shader has to run for each candidate hit. Opacity Micromaps (OMM) store that opacity information in a compact form the GPU can consult directly, which reduces the number of times shaders need to be invoked, leading to higher efficiency and performance.
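
The idea can be illustrated with a toy model in Python. This is a minimal sketch, not how DXR or any GPU implements OMM: the `Opacity` states, the `classify` and `count_shader_calls` helpers, the 0.5 threshold, and the sample data are all hypothetical names and values chosen for illustration. It only shows why precomputed opacity states cut down on shader invocations.

```python
from enum import Enum


class Opacity(Enum):
    TRANSPARENT = 0  # ray always passes through, no shader needed
    OPAQUE = 1       # ray always hits, no shader needed
    UNKNOWN = 2      # mixed region, the alpha-test shader must still run


def classify(alpha_values, threshold=0.5):
    """Precompute the opacity state of one small region from its alpha texels."""
    if all(a >= threshold for a in alpha_values):
        return Opacity.OPAQUE
    if all(a < threshold for a in alpha_values):
        return Opacity.TRANSPARENT
    return Opacity.UNKNOWN


def count_shader_calls(hits, micromap=None):
    """Count how many candidate hits must run the alpha-test shader."""
    if micromap is None:
        # Without a micromap, every candidate hit runs the shader.
        return len(hits)
    # With a micromap, only regions marked UNKNOWN fall back to the shader.
    return sum(1 for region in hits if micromap[region] is Opacity.UNKNOWN)


# Four hypothetical regions of a leaf texture, three alpha texels each.
texels = [[1.0, 0.9, 0.8], [0.0, 0.1, 0.0], [0.2, 0.9, 0.6], [1.0, 1.0, 1.0]]
micromap = [classify(t) for t in texels]

# Indices of the regions struck by eight hypothetical rays.
hits = [0, 1, 2, 3, 3, 1, 0, 2]

without_omm = count_shader_calls(hits)
with_omm = count_shader_calls(hits, micromap)
print(f"shader calls without micromap: {without_omm}, with micromap: {with_omm}")
```

In this toy scene only the mixed region still needs the shader, so the savings depend entirely on how much of the geometry is clearly opaque or clearly transparent.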

In the best-case scenario, Microsoft claims an improvement of 2.3 times. However, keep in mind that not all games and scenes contain a lot of elements like foliage and fences. For example, while S.T.A.L.K.E.R. 2 has loads of grass, leaves, and fences in practically all scenes, Cyberpunk 2077 barely has any foliage.
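
A rough way to see why scene content matters is an Amdahl's-law style estimate. The sketch below assumes, purely for illustration, that the 2.3x figure applies only to the share of frame time spent on alpha-tested geometry; Microsoft does not break the number down this way, and `overall_speedup` and the sample fractions are hypothetical.

```python
def overall_speedup(alpha_fraction, local_speedup=2.3):
    """Amdahl-style estimate of the whole-frame speedup.

    alpha_fraction: assumed share of frame time spent on alpha-tested geometry.
    local_speedup: assumed speedup of that share alone (2.3 is the headline figure).
    """
    return 1.0 / ((1.0 - alpha_fraction) + alpha_fraction / local_speedup)


# Hypothetical scenes: no foliage, some foliage, nothing but foliage.
for fraction in (0.0, 0.25, 0.5, 1.0):
    print(f"alpha-tested share {fraction:.0%}: {overall_speedup(fraction):.2f}x overall")
```

Under these assumptions the full 2.3x only appears when the entire frame is alpha-tested work, and the gain shrinks quickly as that share falls.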

Shader Execution Reordering (SER) seems to be a more universal feature as it reorders how shaders are executed to avoid shader divergence. Shader divergence occurs when nearby pixels require shaders to do different tasks, a common situation in scenes with heavy raytracing effects, such as complex lighting, realistic shadows, and detailed reflections.

GPUs process shaders in parallel threads organized into groups called warps or wavefronts. Ideally, all threads within a group execute identical instructions simultaneously, maximizing GPU efficiency. Shader divergence occurs when threads in the same warp or wavefront need to perform different instructions. In this case, simultaneous execution is impossible, forcing the GPU to handle each instruction path separately, leaving some threads idle and increasing latency.

According to Microsoft, SER sorts or batches similar shader workloads together, which reduces divergence, improves GPU utilization, and speeds up rendering by up to two times.
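
The effect of grouping similar work can be sketched with a small simulation. This is a toy model under stated assumptions, not the DXR 1.2 API or any vendor's scheduler: the warp size of 32, the four hypothetical hit shaders, and the `divergent_passes` helper are all illustrative, and the model simply charges each warp one pass per distinct shader it contains.

```python
import random

WARP_SIZE = 32  # threads per group; assumed here for illustration


def divergent_passes(shader_ids, warp_size=WARP_SIZE):
    """Total serialized passes: each warp runs once per distinct shader in it."""
    passes = 0
    for start in range(0, len(shader_ids), warp_size):
        warp = shader_ids[start:start + warp_size]
        passes += len(set(warp))
    return passes


rng = random.Random(0)  # fixed seed so the run is repeatable
# 1,024 rays, each needing one of four hypothetical hit shaders.
hits = [rng.randrange(4) for _ in range(1024)]

unsorted_passes = divergent_passes(hits)
# Reordering by shader ID stands in for what SER asks the GPU to do.
sorted_passes = divergent_passes(sorted(hits))

print(f"passes without reordering: {unsorted_passes}")
print(f"passes with reordering:    {sorted_passes}")
```

Sorting leaves at most a handful of warps that straddle two shaders, so nearly every warp executes a single instruction path. Real hardware also pays a cost for the reordering itself, which this model ignores.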

Hardware support

Regarding hardware support, the situation is a mixed bag, which is common with new API features.

All Nvidia GPUs dating back to Turing (GeForce RTX 20-series) support Opacity Micromaps (OMM), so these graphics cards can potentially see a performance boost once game developers implement the feature in their titles. Intel said its next-generation Celestial (Xe3) GPUs will also support OMM.

Nvidia's GPUs have supported Shader Execution Reordering (SER) since the GeForce RTX 40-series Ada Lovelace family. Intel said it looks forward to supporting SER "when it is available in a future Agility SDK." However, whether it will be supported on Intel's Arc 'Alchemist' or Arc 'Battlemage' GPUs (or both) is unclear.

AMD does not seem to support OMM or SER on its RDNA 2/3/4 GPUs, though Microsoft said that the red company is working with it on the widespread adoption of these technologies. Also, AMD has certain scheduling optimizations that may mimic how SER works, so if game developers take time to optimize for Radeon GPUs, the latter may get some speed improvements.

Qualcomm does not currently support OMM or SER either, but said its next-generation integrated GPUs will.

The preview version of DXR 1.2 will launch in April 2025.

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • -Fran-
    "Also, AMD has certain scheduling optimizations that may mimic how SER works, so if game developers take time to optimize for Radeon GPUs, the latter may get some speed improvements."

    So AMD cards won't get any benefits from these. Got it.

    And good to see MS actually is doing something with DirectX. DX feels dead and has felt dead for a while. Zero new tech from their end. Same thing I can say about Khronos and Vulkan/OGL, though.

    Regards.
  • qwertymac93
    DXR is a real advantage over Vulkan. Someone has to push for a vendor agnostic standard here, else Nvidia will continue to monopolize it just like they do the AI space.
  • heffeque
    -Fran- said:
    So AMD cards won't get any benefits from these. Got it.
    Hopefully it's a SW thing and RDNA 2-4 will get it sooner or later 🤷‍♂
  • TerryLaze
    qwertymac93 said:
    DXR is a real advantage over Vulkan. Someone has to push for a vendor agnostic standard here, else Nvidia will continue to monopolize it just like they do the AI space.
    Whut?!
    The only vendor agnostic standard would be software rendering and that would be terrible.
    You have to use the hardware to make it fast and that hardware is going to change from one to the other.
    If they make a standard for certain hardware then hardware will not progress anymore for fear of losing compatibility.
    Which is why we still have the x86 base from the 1920ies (hyperbole)
  • TerryLaze
    heffeque said:
    Hopefully it's a SW thing and RDNA 2-4 will get it sooner or later 🤷‍♂
    It is a software thing as in somebody has to code for the specific hardware in AMD cards, but outside of consoles is there anybody that has any urge to do it? And the resources of course.
    Maybe steam/valve could but they are using linux so why should they bother.
  • JarredWaltonGPU
    heffeque said:
    Hopefully it's a SW thing and RDNA 2-4 will get it sooner or later 🤷‍♂
    My understanding in talking with Nvidia about SER is that there are hardware features alongside the software tweaks, which is why Ampere and Turing don't support SER. You can reorganize things in software before sending the shaders to the GPU to execute, but apparently that doesn't do much.

    Frankly, SER has always felt a bit like something that's mostly software and could be implemented for other GPUs. Sort of like how Framegen and MFG could be done on tensor cores. So, it wouldn't be surprising if there is a benefit to DXR 1.2 with SER on pre-Ada RTX cards, along with AMD and Intel GPUs. Maybe not as much of a benefit as if there are hardware hooks, though.
  • Peksha
    JarredWaltonGPU said:
    My understanding in talking with Nvidia about SER is that there are hardware features alongside the software tweaks, which is why Ampere and Turing don't support SER. You can reorganize things in software before sending the shaders to the GPU to execute, but apparently that doesn't do much.

    Frankly, SER has always felt a bit like something that's mostly software and could be implemented for other GPUs. Sort of like how Framegen and MFG could be done on tensor cores. So, it wouldn't be surprising if there is a benefit to DXR 1.2 with SER on pre-Ada RTX cards, along with AMD and Intel GPUs. Maybe not as much of a benefit as if there are hardware hooks, though.
    You are probably wrong about SER. There are several hardware requirements for it - a huge L2 (Nx10MB) capable of containing all the register files because it is through it that their reordering occurs, as well as the reordering function. In addition, these calls must be encoded in the game, which no one has done except CP2077. In my opinion, the technology looks terrible, if you imagine how many of these permutations will be written in each frame, it's just a waste of L2 bw and overhead...
    Probably for this reason this technology is not used and the implementation in dx will not advance it much. More about SER on c&c.

    AMD's approach to hardware thread scheduling in RDNA4 without the need for a huge L2 seems more interesting, but we don't know the implementation details yet.
  • ManDaddio
    So now Microsoft decides it wants to do something. Nvidia brought out Ray tracing in 2018. Microsoft is way behind. They need to get into the gaming mood. But I guess since they're partners with AMD they decided they're going to do something now that AMD finally has ok Ray tracing support after all these years. Shame.
  • umeng2002_2
    Opacity Micro Maps really helped path tracing in Cyberpunk. Prior to them being implemented, performance would really tank near areas with lots of trees or windows.
  • blppt
    -Fran- said:
    So AMD cards won't get any benefits from these. Got it.
    Well, you do have to consider that a lot of games are built with consoles in mind, and they are all AMD except for Nintendo. Their game engines could be built with that in mind.