DirectX Ray-Tracing [DXR]

I believe the main point here is that RTRT is moving forward and DXR will apparently become the standard on both PC and consoles.

On a side note, I myself prefer open standards like OpenGL, but let's face it, DXR has the edge right now.
 
I don't see any difference when comparing DX usage.
DX is used on PC and Xbox.
Sony has its own API, and the same goes for RT (it won't be using DXR).
Vulkan will support RT, with the same usage pattern Vulkan currently has.
 
I did not look at it in detail, but Vulkan seems on par with DXR and also has mesh shaders (not sure about sampler feedback yet).
At least I checked that it has support for inline tracing. Seems pretty much the same, and NV's extensions become core functionality after some time.
 
Scalable Ray Traced Global Illumination in Real Time
3/23/2020
Key Highlights of the RTXGI SDK v1.0 release include:

  • Full Source Code. We are providing full source code so you can easily integrate RTXGI into your tools and customize it to your specific needs.
  • No Baking, No Leaks. Dramatically speed up your iteration time with real-time ray traced lighting. No more waiting for bakes. No more obsessing over light probe positioning either; no light or shadow leaking.
  • Any DXR GPU. RTXGI runs on a broad range of hardware. All DXR-enabled GPUs are supported, including GeForce RTX 20 Series, GTX 1660 Series, and GTX 10 series.
  • Tuned & Optimized. Easy to integrate and fast to provide results, right out of the box. Performance optimized to fit in 60Hz frame budgets.
https://developer.download.nvidia.com/video/RTX/rtxgi_03_2020_web_v2.mp4

https://developer.nvidia.com/rtxgi
 
Should work ideally for an 8x8x4 node size (the 4 being the rotating axis by level) if you wanted the bitset to exactly fit a 32-byte cache line.
Little late reply, but I've come around to your way of thinking. Tracing a regular spatial subdivision is essentially a 2.5D problem, more akin to rasterization than intersection testing, which does seem nice. I think you can get the 64-bit intersection mask for the ray with three 2D-texture lookups and a small amount of math and logic operations; you AND it with the occupancy mask and on you go. It's not really an octree any more at that point, though, more a 64-tree. Google tells me it's an old idea (I'm not sure I'd build the tree the way they do, though; I don't like the sorting).
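To make the mask-AND step concrete, here's a minimal CPU sketch (the brick layout and all names are my own assumptions, not from the posts above): a 4x4x4 brick's occupancy is stored as a 64-bit integer, and the ray's cell mask is built here by brute-force sampling instead of the 2D-lookup trick the post describes.

```python
# Hypothetical sketch of the mask-AND traversal idea for one 4x4x4 brick.
# Bit layout assumption: bit index = x + 4*y + 16*z.

def cell_bit(x: int, y: int, z: int) -> int:
    return x + 4 * y + 16 * z

def ray_mask(origin, direction, steps=256) -> int:
    """64-bit mask of 4x4x4 cells touched by a ray through the unit cube.
    Brute-force point sampling; a real version would use a few 2D lookups."""
    mask = 0
    for i in range(steps + 1):
        t = i / steps
        p = [origin[k] + t * direction[k] for k in range(3)]
        if all(0.0 <= c < 1.0 for c in p):
            cx, cy, cz = (min(3, int(c * 4)) for c in p)
            mask |= 1 << cell_bit(cx, cy, cz)
    return mask

def candidates(occupancy: int, origin, direction) -> int:
    # AND the ray mask with the brick's occupancy mask: surviving bits
    # are the only cells that need actual intersection work.
    return occupancy & ray_mask(origin, direction)
```

The point is only that the candidate cells fall out of a single AND; everything interesting is in how cheaply `ray_mask` can be produced.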
 
Microsoft is proposing a new version of DXR that uses SSDs to reduce the pressure acceleration structures put on VRAM.

Increasingly, as part of video games and other such applications, the acceleration structures for ray tracing are explicitly edited or regenerated by the software to reflect the current set of potentially visible geometry. Such acceleration structures are now competing for storage (both persistent (e.g., flash memory) and non-persistent (e.g., RAM)) with other data, such as geometry and texture data.

This growth in the share of the memory by the acceleration structures has resulted in systems with significantly large memory requirements. Moreover, the bandwidth required to fetch the large amount of data for acceleration structures has also proportionally gotten bigger. The systems and methods described herein help minimize the space required for ray tracing acceleration structures.

The patent lists certain solutions, such as having more manageable pools of data associated with the acceleration structures. These can be kept either in memory or on storage devices such as SSDs, which offer the fastest access speeds among persistent storage options.


 
There is nothing in that patent that even mentions DXR or any kind of external block storage.

What they propose are methods to pre-build BVH nodes from low-detail geometry, then store the actual high-LOD geometry in child nodes placed in system memory, and let the CPU upload high-LOD nodes into dedicated graphics memory on demand when requested by the GPU - similarly to how graphics runtimes and video card drivers manage textures and other resources. This could also leverage tiled resources to automatically page non-resident nodes from system memory to graphics memory, and sampler feedback to determine which detail levels need to be paged.

What is new is that some of these methods would require non-opaque BVH structures, i.e. the GPU would have to disclose the exact format of its BVH nodes for the CPU to manage the BVH tree and/or for application developers to pre-build BVH structures for static geometry and store them with their application data or game levels.

But that's going a bit too far to claim that "DirectX will use SSDs to offload data from local video memory" - in a sense, it already does by using regular virtual memory management at the OS level.


US20240070960A1- Systems and methods for ray tracing acceleration structure level of detail processing
An example graphics processing system is to retrieve a first level of detail value for a sub-tree from a level of detail residency map corresponding to a bounding volume hierarchy of objects. The graphics processing system is to determine a second level of detail value for the sub-tree. The graphics processing system is to select a final level of detail value for the sub-tree based on a comparison between the first level of detail value for the sub-tree and the second level of detail value for the sub-tree. The graphics processing system is to, based on the final level of detail value for the sub-tree, select child nodes in an acceleration structure tree and trace the selected child nodes.

Accordingly, there is a need for systems and methods for better handling of the data associated with the acceleration structures.
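Purely to illustrate the abstract's retrieve/compute/compare flow, here is a toy sketch; every name and the distance-based heuristic are invented for illustration and are not taken from the patent.

```python
# Illustrative sketch of per-sub-tree LOD selection against a residency map.
# "First LOD value" = what is currently resident; "second LOD value" = what
# the traversal would like (here a made-up distance heuristic).

def desired_lod(distance: float, base: float = 1.0) -> int:
    """Hypothetical second LOD value: coarser (larger number) when farther."""
    lod = 0
    while distance > base:
        distance /= 2.0
        lod += 1
    return lod

def final_lod(residency_map: dict, subtree: str, distance: float) -> int:
    resident = residency_map[subtree]   # first LOD value (what is paged in)
    wanted = desired_lod(distance)      # second LOD value (what we'd like)
    # Comparison step: never select finer detail than is actually resident.
    return max(resident, wanted)
```

The traversal would then descend only into the child nodes of the acceleration structure that match the selected level.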

 
A peculiar property of inline RT is that you can integrate ray queries into compute shader nodes, which gives you the ability to dispatch rays with the Work Graphs API. Naturally, inline RT will disable any 'reordering' optimizations ...

 
Is that what it sounds like, arbitrary ray casts called from shaders?
Well, work graphs give you the ability to 'enqueue' shaders from other shaders. The nodes in our work graphs currently represent compute shader programs, and there aren't many usage restrictions on them in conjunction with work graphs, so you can embed ray query objects in these nodes, as alluded to by @Rys. Also, it's helpful to think of the implementation as nested parallelism/dispatch, where shaders *spawn* other shaders (which may or may not contain inline RT), rather than a shader doing true function calls to other shaders ('callable' shaders) ...

Another GPU-driven approach to ray tracing is using the ExecuteIndirect API: by specifying the indirect arguments within the command signature, you can either perform an indirect draw/dispatch (graphics/compute shaders with inline RT) or an indirect DispatchRays call, which will use the ray tracing pipeline (RT PSO)!
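The enqueue model described above can be mimicked on the CPU with a toy scheduler; all names here are invented, and this is only an analogy for how node shaders spawn records for other nodes, not any real API.

```python
from collections import deque

def run_graph(nodes: dict, initial_records: list) -> list:
    """Toy work-graph scheduler: nodes maps a name to fn(payload, enqueue).
    Nodes push records into a queue instead of calling each other directly."""
    queue = deque(initial_records)   # (node_name, payload) records
    outputs = []
    while queue:
        name, payload = queue.popleft()
        # enqueue lets a node spawn work on other nodes (nested dispatch),
        # the analogue of one shader enqueueing records for another.
        out = nodes[name](payload, lambda n, p: queue.append((n, p)))
        if out is not None:
            outputs.append(out)
    return outputs

def raygen(count, enqueue):
    for i in range(count):
        enqueue("shade", i)   # spawn one shade invocation per ray

def shade(i, enqueue):
    return i * i              # stand-in for inline RT + shading work
```

Running `run_graph({"raygen": raygen, "shade": shade}, [("raygen", 3)])` drains the queue in dispatch order; the key design point is that producers never call consumers, they only emit records.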
 
Area ReSTIR: Resampling for Real-Time Defocus and Antialiasing

Recent advancements in spatiotemporal reservoir resampling (ReSTIR) leverage sample reuse from neighbors to efficiently evaluate the path integral. Like rasterization, ReSTIR methods implicitly assume a pinhole camera and evaluate the light arriving at a pixel through a single predetermined subpixel location at a time (e.g., the pixel center). This prevents efficient path reuse in and near pixels with high-frequency details.

We introduce Area ReSTIR, extending ReSTIR reservoirs to also integrate each pixel's 4D ray space, including 2D areas on the film and lens. We design novel subpixel-tracking temporal reuse and shift mappings that maximize resampling quality in such regions. This robustifies ReSTIR against high-frequency content, letting us importance sample subpixel and lens coordinates and efficiently render antialiasing and depth of field.
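For context, the reservoir update at the heart of all ReSTIR variants can be sketched in a few lines; this is a generic weighted-reservoir sketch, not the paper's 4D area formulation.

```python
import random

class Reservoir:
    """Minimal weighted reservoir as used in ReSTIR-style resampling."""
    def __init__(self):
        self.sample = None
        self.w_sum = 0.0   # running sum of resampling weights
        self.m = 0         # number of candidates seen

    def update(self, sample, weight, rng):
        """Stream in one candidate; keep it with probability weight / w_sum."""
        self.w_sum += weight
        self.m += 1
        if rng.random() * self.w_sum < weight:
            self.sample = sample
```

Area ReSTIR's contribution is what goes into the candidates: each reservoir also carries subpixel film and lens coordinates, so reuse across neighbors resamples the full 4D ray space rather than a fixed pixel-center ray.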
 
What are the chances of adding support for intersecting curve primitives in the next iteration of DXR? Apple recently added it to Metal, but it's unclear whether it's hardware accelerated. OptiX supports it in CUDA.

Theoretically it would be faster to intersect curve primitives for stuff like hair, which I believe currently uses highly tessellated lines & triangles.
 
Can developers just use intersection shaders to test against curve primitives?
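They can today: DXR custom primitives are declared as AABBs, and the per-primitive hit test runs in an intersection shader. Conceptually, that shader just evaluates intersection math like the CPU sketch below (a sphere standing in for a curve segment's swept shape; the function name is mine).

```python
import math

def ray_sphere_t(origin, direction, center, radius):
    """CPU stand-in for the math a DXR intersection shader would run against
    a procedural primitive. Returns the nearest hit t, or None on a miss."""
    oc = [origin[i] - center[i] for i in range(3)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(oc[i] * direction[i] for i in range(3))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * a * c          # quadratic discriminant
    if disc < 0.0:
        return None                     # ray misses the primitive
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t >= 0.0 else None
```

The catch the question is getting at is performance: a software intersection shader works, but built-in curve support could run the test in fixed-function hardware the way triangle tests do.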
 
Apparently, Nvidia HW implements motion blurred AABBs to support per-query customizable search radii for arbitrary primitives, as described in this patent ...

Vendors would likely have to implement HW support for ray tracing moving geometry in the acceleration structure first, but I'm not sure other vendors would recognize the merit of ray traced motion blur effects for extending curve primitives ...
 