Introduction to DirectX Raytracing (DXR) - SIGGRAPH 2018

Discussion in 'Rendering Technology and APIs' started by DmitryKo, Oct 27, 2018.

  1. DmitryKo


    Introduction to DirectX Raytracing: Overview of Ray Tracing
    Peter Shirley, NVIDIA

    Ray Tracing Versus Rasterization for primary visibility
    Rasterization: stream triangles to pixel buffer to see what pixels they cover
    Ray Tracing: stream pixels to triangle buffer to see what triangles cover them

    Key Concept             Rasterization                                  Ray Tracing
    Fundamental question    What pixels does geometry cover?               What is visible along this ray?
    Key operation           Test if pixels are inside a triangle           Ray-triangle intersection
    How streaming works     Stream triangles (each triangle tests pixels)  Stream rays (each ray tests intersections)
    Inefficiencies          Shade many triangles per pixel (overdraw)      Test many intersections per ray
    Acceleration structure  (Hierarchical) Z-buffering                     Bounding volume hierarchies
    Drawbacks               Incoherent queries difficult to make           Traverses memory incoherently
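
    The "key operation" row for ray tracing can be illustrated with a CPU-side ray-triangle test. This is a minimal sketch of the well-known Möller-Trumbore algorithm, not code from the course; `Vec3`, the helper functions, and the epsilon value are assumptions made for illustration.

```cpp
#include <cmath>
#include <optional>

struct Vec3 { double x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Returns the ray parameter t of the hit point (origin + t * dir),
// or std::nullopt when the ray misses the triangle (v0, v1, v2).
std::optional<double> intersectTriangle(Vec3 origin, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2) {
    const double eps = 1e-9;  // illustrative tolerance
    const Vec3 e1 = sub(v1, v0);
    const Vec3 e2 = sub(v2, v0);
    const Vec3 p = cross(dir, e2);
    const double det = dot(e1, p);
    if (std::fabs(det) < eps) return std::nullopt;  // ray parallel to triangle plane
    const double invDet = 1.0 / det;
    const Vec3 s = sub(origin, v0);
    const double u = dot(s, p) * invDet;            // first barycentric coordinate
    if (u < 0.0 || u > 1.0) return std::nullopt;
    const Vec3 q = cross(s, e1);
    const double v = dot(dir, q) * invDet;          // second barycentric coordinate
    if (v < 0.0 || u + v > 1.0) return std::nullopt;
    const double t = dot(e2, q) * invDet;
    if (t <= eps) return std::nullopt;              // hit lies behind the ray origin
    return t;
}
```

    In DXR this test is the built-in triangle intersection; a bounding volume hierarchy only reduces how many triangles reach it.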

    Introduction to DirectX Raytracing: Overview and Introduction to Ray Tracing Shaders
    Chris Wyman, NVIDIA

    DirectX Rasterization Pipeline

    What do shaders do in today’s widely-used rasterization pipeline?

    • Run a shader, the vertex shader, on each vertex sent to the graphics card
    This usually transforms it to the right location relative to the camera

    • Group vertices into triangles, then run tessellation shaders to allow GPU subdivision of geometry
    Includes 3 stages with different goals: the hull shader, the (fixed-function) tessellator, and the domain shader

    • Run a shader, the geometry shader, on each tessellated triangle
    Allows computations that need to occur on a complete triangle, e.g., finding the geometric surface normal

    • Rasterize our triangles (i.e., determine the pixels they cover)
    Done by special-purpose hardware rather than user-software
    Only a few developer controllable settings

    • Run a shader, the pixel shader (or fragment shader), on each pixel generated by rasterization
    This usually computes the surface’s color

    • Merge each pixel into the final output image (e.g., doing blending)
    Usually done with special-purpose hardware
    Hides optimizations like memory compression and converting image formats

    Squint a bit, and that pipeline looks like:

    Input: Set of Triangles

    Shader(s) to transform vertices into displayable triangles → Rasterizer → Shader to compute color for each rasterized pixel → Output (ROP)

    Output: Final Image

    DirectX Ray Tracing Pipeline

    So what might a simplified ray tracing pipeline look like?

    Input: Set of Pixels

    Take input pixel position, generate ray(s) → Intersect Rays With Scene → Shade hit points; (Optional) generate recursive ray(s) → Output

    Output: Final Image

    One advantage of ray tracing:
    Algorithmically, much easier to add recursion
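
    The first stage of the simplified pipeline ("take input pixel position, generate ray(s)") can be sketched on the CPU as follows. This assumes a pinhole camera at the origin looking down -Z; the function name, the camera convention, and the `Vec3`/`Ray` types are illustrative and do not come from the slides.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Ray { Vec3 origin; Vec3 dir; };

inline Vec3 normalize(Vec3 v) {
    const double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// Builds the primary ray through the center of pixel (px, py).
// fovY is the vertical field of view in radians.
Ray generatePrimaryRay(int px, int py, int width, int height, double fovY) {
    // Pixel center mapped to [-1, 1], with +Y pointing up.
    const double ndcX = (px + 0.5) / width * 2.0 - 1.0;
    const double ndcY = 1.0 - (py + 0.5) / height * 2.0;
    const double halfHeight = std::tan(fovY * 0.5);
    const double halfWidth = halfHeight * width / height;
    return {{0.0, 0.0, 0.0}, normalize({ndcX * halfWidth, ndcY * halfHeight, -1.0})};
}
```

    A full tracer would call this once per pixel, intersect the ray with the scene, and shade the result.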

    Pipeline is split into five new shaders:

    – A ray generation shader defines how to start ray tracing
      • Runs once per algorithm (or per pass)

    – Intersection shader(s) define how rays intersect geometry
      • Defines geometric shapes, widely reusable

    – Miss shader(s) define behavior when rays miss geometry

    – Closest-hit shader(s) run once per ray (e.g., to shade the final hit)

    – Any-hit shader(s) run once per hit (e.g., to determine transparency)

    The miss, closest-hit, and any-hit shaders together define the behavior of ray(s) and differ between shadow, primary, and indirect rays

    Note: Read spec for more advanced usage, since meaning of “any” may not match your expectations

    A new, unrelated sixth shader:

    – A callable shader can be launched from another shader stage
      • Abstraction allows this; explicitly expose it
    #1 DmitryKo, Oct 27, 2018
    Last edited: Oct 27, 2018
  2. DmitryKo


    What Happens When Tracing a Ray?

    • A good mental model:

    First, we traverse our scene to find what geometry our ray hits
    When we find the closest hit, shade at that point using the closest-hit shader
    This shader is a ray property; in theory, each ray can have a different closest-hit shader

    • If our ray misses all geometry, the miss shader gets invoked
    Can consider this a shading routine that runs when you see the background
    Again, the miss shader is specified per-ray

    How Does Scene Traversal Happen?

    Traverse the scene acceleration structure to ignore trivially-rejected geometry
    – An opaque process, with a few developer controls
    – Allows vendor-specific algorithms and updates without changing render code

    If all geometry trivially ignored, ray traversal ends

    For potential intersections, an intersection shader is invoked
    – Specific to a particular geometry type (e.g., one shader for spheres, one for Bezier patches)
    – DirectX includes a default, optimized intersection for triangles

    No shader-detected intersection? Detected intersection not the closest hit so far?
    – Continue traversing through our scene

    Detected hit might be transparent? Run the any-hit shader
    – A ray-specific shader, specified in conjunction with the closest-hit shader
    – Shader can call IgnoreHit() to continue traversing, ignoring this surface

    Update the closest hit point with newly discovered hit

    Continue traversing to look for closer intersections
    – Had a valid hit along the ray? Shade via the closest-hit shader
    – No valid hits? Shade via the miss shader
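
    The traversal steps above can be mimicked on the CPU. The sketch below is a mental-model illustration only: the real traversal is opaque and driver-defined, and `Candidate`, `TraceResult`, and the `anyHitIgnores` flag (standing in for an any-hit shader that calls IgnoreHit()) are hypothetical names.

```cpp
#include <vector>

// One potential intersection reported along the ray.
struct Candidate {
    double t;            // distance along the ray
    bool opaque;         // opaque geometry skips the any-hit shader
    bool anyHitIgnores;  // models an any-hit shader calling IgnoreHit()
};

struct TraceResult {
    bool hit;               // true: run closest-hit; false: run miss
    double t;               // closest accepted distance (tMax if no hit)
    int anyHitInvocations;  // how often the any-hit step ran
};

TraceResult traceRay(const std::vector<Candidate>& candidates, double tMin, double tMax) {
    TraceResult result{false, tMax, 0};
    for (const Candidate& c : candidates) {  // candidates arrive in no particular order
        // Outside the ray interval, or not closer than the closest hit so far:
        // keep traversing.
        if (c.t < tMin || c.t >= result.t) continue;
        if (!c.opaque) {
            ++result.anyHitInvocations;
            if (c.anyHitIgnores) continue;   // surface ignored, keep traversing
        }
        result.hit = true;                   // update the closest hit
        result.t = c.t;
    }
    return result;
}
```

    After the loop, `result.hit` decides between the closest-hit shader and the miss shader, matching the last two bullets.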

    Summary: DirectX Ray Tracing Shaders

    • Control where your rays start? The ray generation shader

    • Control when your rays intersect geometry? The geometry’s intersection shader

    • Control what happens when rays miss? Your ray’s miss shader

    • Control how to shade your final hit points? Your ray’s closest-hit shader

    • Control how transparency behaves? Your ray’s any-hit shader

    What is a Ray Payload?

    Ray payload is an arbitrary user-defined, user-named structure
    – Contains intermediate data needed during ray tracing
    – Note: Keep ray payload as small as possible

    Large payloads will reduce performance; spill registers into memory

    A simple ray might look like this:
    – Sets color to blue when the ray misses
    – Sets color to red when the ray hits an object
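
    The slide's HLSL listing is not reproduced in this thread. As a rough C++ analogue (an assumption-laden sketch, not the original code), a payload holding one color plus a miss routine and a closest-hit routine could look like this; all names are hypothetical.

```cpp
// Small payload: a single RGB color.
struct SimplePayload {
    float r, g, b;
};

// Keeping the payload small is the point of the note above.
static_assert(sizeof(SimplePayload) == 3 * sizeof(float), "payload should stay compact");

// Analogue of a miss shader: the ray hit nothing, so report blue.
inline void simpleMiss(SimplePayload& payload) { payload = {0.0f, 0.0f, 1.0f}; }

// Analogue of a closest-hit shader: the ray hit an object, so report red.
inline void simpleClosestHit(SimplePayload& payload) { payload = {1.0f, 0.0f, 0.0f}; }

// Stand-in for what the pipeline does after traversal finishes.
inline SimplePayload shadePixel(bool rayHitSomething) {
    SimplePayload payload{0.0f, 0.0f, 0.0f};
    if (rayHitSomething) {
        simpleClosestHit(payload);
    } else {
        simpleMiss(payload);
    }
    return payload;
}
```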

    What Can DXR HLSL Shaders Do?

    All the standard HLSL data types, texture resources, user-definable structures and buffers

    Numerous standard HLSL intrinsic or built-in functions useful for graphics, spatial manipulation, and 3D mathematics

    – Basic math (sqrt, clamp, isinf, log), trigonometry (sin, acos, tanh), vectors (normalize, length), matrices (mul, transpose)

    New intrinsic functions for ray tracing

    – Functions related to ray traversal: TraceRay(), ReportHit(), IgnoreHit(), and AcceptHitAndEndSearch()
    – Functions for ray state, e.g.: WorldRayOrigin(), RayTCurrent(), InstanceID(), and HitKind()
  3. DmitryKo


    Tutorials: Build a Path Tracer Step-by-Step (Or Learning DirectX HLSL by example)
    Chris Wyman, NVIDIA

    Tutorial: Ray Traced Ambient Occlusion

    What is ambient occlusion?
    – Approximates incident light over hemisphere
    – Gives a (very) soft shadow

    Simplest implementation:
    – Shoot random ray over hemisphere
    – See if any occluders within specified radius
    – No? Return 1
    – Yes? Return 0
    Ambient occlusion with one shadow ray per pixel
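
    The "simplest implementation" can be sketched on the CPU as below. The `occluded` callback stands in for tracing a shadow ray and is an assumption, as are the cosine-weighted sampling choice and the use of a local frame whose +Z axis is the surface normal; the tutorial's actual shader may differ.

```cpp
#include <cmath>
#include <functional>
#include <random>

struct Vec3 { double x, y, z; };

// Cosine-weighted direction on the hemisphere around +Z (local shading frame).
// u1 and u2 are uniform random numbers in [0, 1).
Vec3 cosineSampleHemisphere(double u1, double u2) {
    const double pi = 3.14159265358979323846;
    const double r = std::sqrt(u1);
    const double phi = 2.0 * pi * u2;
    return {r * std::cos(phi), r * std::sin(phi), std::sqrt(1.0 - u1)};
}

// Averages visibility over numRays random directions.
// occluded(dir, radius) should return true if anything blocks the ray within radius.
double ambientOcclusion(const std::function<bool(const Vec3&, double)>& occluded,
                        double radius, int numRays, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    double visibleSum = 0.0;
    for (int i = 0; i < numRays; ++i) {
        const Vec3 dir = cosineSampleHemisphere(uniform(rng), uniform(rng));
        visibleSum += occluded(dir, radius) ? 0.0 : 1.0;  // No occluder? 1. Occluder? 0.
    }
    return visibleSum / numRays;
}
```

    With `numRays` set to 1 this gives the noisy one-ray-per-pixel result; raising it averages the noise away.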

    Want Less Noise? Shoot More Rays!

    Amazon Bistro (64 rays per pixel)
    UE4 Sun Temple (64 rays per pixel)

    Tutorial: Diffuse Shadows And Global Illumination

    How is this code different?

    Shoots two types of rays:
    Shadow rays (test visibility only)
    Indirect rays (return a color in selected direction)

    Shadow rays identical to AO rays
    Both test visibility in a specified direction
    Unless you want to rename for clarity
    And changes due to different number of ray shaders
    • For me, shadows are hit group #0, miss shader #0
    • For me, there are 2 hit groups (1 for shadows, 1 for color)

    Tutorial: Diffuse Shadows And Global Illumination

    Color rays are a bit more complex. How?
    – Payload contains a color, per-pixel random seed
    – Miss shader needs to return the background color
    – Any hit identical (i.e., discard transparent surfaces)
    – We have a closest hit shader

    At the hitpoint, gets material information then shades
    – Shooting a ray is simpler
    • Use no special ray flags
    • Use correct hit & miss shaders for color rays
    – For me, color rays are type #1 (of 2)

    What happens for our diffuse shading?
    – Pick a random light, so we don’t shoot one ray to each
    – Load information about the selected light
    – Compute our cosine (NdotL) term
    – Shoot our shadow ray
    – Surface color depends on the light’s intensity
    – Compute total diffuse color
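
    The diffuse shading steps can be sketched on the CPU as follows. The `visible` callback stands in for the shadow ray, and dividing by the selection probability (multiplying by the light count) plus the Lambertian 1/π factor are assumptions about how the estimate is normalized; the tutorial's shader is not shown in this thread.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

struct Vec3 { double x, y, z; };
struct Light { Vec3 position; Vec3 intensity; };

inline Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 normalize(const Vec3& v) {
    const double len = std::sqrt(dot(v, v));
    return {v.x / len, v.y / len, v.z / len};
}

// u is a uniform random number in [0, 1) used to pick one light.
Vec3 shadeDiffuse(const Vec3& hitPos, const Vec3& normal, const Vec3& albedo,
                  const std::vector<Light>& lights, double u,
                  const std::function<bool(const Vec3&, const Vec3&)>& visible) {
    if (lights.empty()) return {0.0, 0.0, 0.0};
    const double pi = 3.14159265358979323846;
    const std::size_t count = lights.size();
    // Pick a random light instead of shooting one ray to each.
    const std::size_t index = std::min(count - 1, static_cast<std::size_t>(u * count));
    const Light& light = lights[index];
    // Cosine (NdotL) term.
    const Vec3 toLight = normalize(sub(light.position, hitPos));
    const double nDotL = std::max(0.0, dot(normal, toLight));
    // Shadow ray: 1 if the light is visible, 0 otherwise.
    const double shadow = visible(hitPos, light.position) ? 1.0 : 0.0;
    // Each light is chosen with probability 1/count, so scale by count.
    const double scale = shadow * nDotL * static_cast<double>(count) / pi;
    return {albedo.x * light.intensity.x * scale,
            albedo.y * light.intensity.y * scale,
            albedo.z * light.intensity.z * scale};
}
```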
  4. DmitryKo


    Shawn Hargreaves, Microsoft
    Introduction to DirectX Raytracing
    Part 2 – the API

    D3D12 Binding Model (indirection ftw!)

    Descriptor = pointer to a GPU resource

    Descriptor table = indexable array of descriptors

    Descriptor heap = area of GPU memory containing multiple descriptor tables

    Root signature defines a binding convention, used by shaders to locate whatever data they need to access
    • Inlined root constants
    • Inlined root descriptors
    • Pointers to descriptor tables within the descriptor heap

    New Requirements For Raytracing

    Acceleration structure format is opaque
    • Unlike traditional vertex data used for rasterization, there is no standard layout suitable for all implementations
    Rays can go anywhere
    • So all geometry and shaders must be simultaneously available
    Different shaders may want different resource bindings

    More levels of indirection!

    Acceleration Structures

    Opaque geometry format optimized for ray traversal (e.g. BVH)
    Layout determined by driver and hardware
    Built at runtime on the GPU
    Immutable except for incremental in-place updates

    Memory Management

    Because the format is implementation defined, you cannot know up front how big an acceleration structure will be

    GetRaytracingAccelerationStructurePrebuildInfo
    • Runs on the CPU
    • Returns a conservative estimate
    • ResultDataMaxSizeInBytes
    • ScratchDataSizeInBytes

    EmitRaytracingAccelerationStructurePostbuildInfo
    • Runs on the GPU
    • Returns actual size, in GPU memory after the command list has finished executing

    Suballocate out of larger buffers
    Use conservative sizes while generating command list for initial build
    After real size data is available, perform a compaction pass
    Don’t compact things that animate…
    Beware CPU/GPU stalls!
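
    Suballocating out of a larger buffer can be sketched with a simple bump allocator that hands out offsets. This is an illustrative CPU-side model with no D3D12 calls; the class name is hypothetical, and the 256-byte figure mirrors D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BYTE_ALIGNMENT.

```cpp
#include <cstdint>
#include <optional>

// Acceleration structures must start on a 256-byte boundary.
constexpr std::uint64_t kAccelStructAlignment = 256;

constexpr std::uint64_t alignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Hands out aligned offsets from one large buffer of fixed capacity.
class LinearSuballocator {
public:
    explicit LinearSuballocator(std::uint64_t capacityBytes) : capacity_(capacityBytes) {}

    // Returns the byte offset of the new allocation, or nullopt if it does not fit.
    std::optional<std::uint64_t> allocate(std::uint64_t sizeBytes) {
        const std::uint64_t offset = alignUp(head_, kAccelStructAlignment);
        if (offset > capacity_ || sizeBytes > capacity_ - offset) return std::nullopt;
        head_ = offset + sizeBytes;
        return offset;
    }

    std::uint64_t usedBytes() const { return head_; }

private:
    std::uint64_t capacity_;
    std::uint64_t head_ = 0;
};
```

    One plausible use: allocate with the conservative size for the initial build, then allocate again from a second pool with the actual size once it is known and compact into that.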

    Shader Tables

    Rays can go anywhere and hit anything
    Different objects can have different materials
    Need to run different shaders depending on which object a ray hit

    • Array of pointers to shaders
    • Index into the array is determined by which object was hit

    Arrays of Pointers to Shaders

    Shader Identifier = ‘pointer’ to a shader (32 byte blob)

    Hit Group = { intersection shader, any hit shader, closest hit shader }

    Shader Record = { shader identifier, local root arguments }

    Shader Table = { shader record A }, { shader record B }, …

    No dedicated API for creating shader tables

    • These are just memory that can be filled however you like
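
    Because a shader table is plain memory, its layout can be modeled on the CPU. The sketch below assumes 32-byte shader identifiers and a 32-byte record alignment (the values of D3D12_SHADER_IDENTIFIER_SIZE_IN_BYTES and D3D12_RAYTRACING_SHADER_RECORD_BYTE_ALIGNMENT); the struct and function names are hypothetical, and the index helper follows the hit-group addressing formula in the DXR specification.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kShaderIdentifierSize = 32;
constexpr std::size_t kShaderRecordAlignment = 32;

// Shader Record = { shader identifier, local root arguments }
struct ShaderRecord {
    std::array<std::uint8_t, kShaderIdentifierSize> identifier{};
    std::vector<std::uint8_t> localRootArguments;
};

constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// All records in one table share a stride, sized for the largest record.
std::size_t recordStride(const std::vector<ShaderRecord>& records) {
    std::size_t maxArgs = 0;
    for (const ShaderRecord& r : records) maxArgs = std::max(maxArgs, r.localRootArguments.size());
    return alignUp(kShaderIdentifierSize + maxArgs, kShaderRecordAlignment);
}

// Packs the records back to back into a zero-padded byte buffer.
std::vector<std::uint8_t> buildShaderTable(const std::vector<ShaderRecord>& records) {
    const std::size_t stride = recordStride(records);
    std::vector<std::uint8_t> table(stride * records.size(), 0);
    for (std::size_t i = 0; i < records.size(); ++i) {
        auto dst = table.begin() + static_cast<std::ptrdiff_t>(i * stride);
        dst = std::copy(records[i].identifier.begin(), records[i].identifier.end(), dst);
        std::copy(records[i].localRootArguments.begin(), records[i].localRootArguments.end(), dst);
    }
    return table;
}

// Which hit-group record a hit selects, from the TraceRay() arguments,
// the geometry's index within its bottom-level structure, and the instance.
std::size_t hitGroupRecordIndex(std::size_t rayContribution, std::size_t geometryMultiplier,
                                std::size_t geometryIndex, std::size_t instanceContribution) {
    return rayContribution + geometryMultiplier * geometryIndex + instanceContribution;
}
```

    In a real application the buffer would live in GPU-visible memory and the identifiers would come from the state object's properties interface.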
    #4 DmitryKo, Oct 27, 2018
    Last edited: Nov 1, 2018
  5. DmitryKo


    Colin Barré-Brisebois, SEED - Electronic Arts
    Full Rays Ahead! From Raster to Real-Time Raytracing

    I’m a dev and I want to move to DXR… What should I do?

    Transition to DXR is not automagical
    • Some effects are easy to add: hard shadows/reflections, ambient occlusion
    • The fun starts when things get blurry & soft :)
    DXR is pretty intuitive!
    • Nice evolution from previous raster + compute pipelines
    • Easy to get quickly up and improve!
    Break down passes so you can easily swap & reuse!
    • HLSL makes it easy: DXR interops with Rasterization and Compute
    • Build shared functions that you will call from both Rasterization and Compute
    Prepare your passes for the transition: swapping inputs & outputs
    Start thinking about how to handle noise (TAA and other filtering)

    First Thing

    A few techniques should be implemented first (in difficulty order)
    • Shadows
    • Ambient Occlusion
    • Reflections

    Shadows

    Launch a ray towards light
    • Ray misses → Not in shadow
    • Handled by Miss Shader
    • Shadowed = !payload.miss;

    Soft shadows?
    • Random direction from cone [PBRT]
    • Cone width drives penumbra
    • [1;N] rays & filtering
    • We used SVGF [Schied 2017]
    • Temporal accumulation
    • Multi-pass weighted blur
    • Variance-driven kernel size
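
    "Random direction from cone" can be sketched with uniform cone sampling in the style described in PBRT. The direction is returned in a local frame whose +Z axis points at the light; that frame convention and the function name are assumptions, and a caller would still need to rotate the result into world space.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { double x, y, z; };

// Uniformly samples a direction inside a cone around +Z.
// u1, u2: uniform random numbers in [0, 1].
// cosThetaMax: cosine of the cone's half-angle; a wider cone gives a wider penumbra.
Vec3 uniformSampleCone(double u1, double u2, double cosThetaMax) {
    const double pi = 3.14159265358979323846;
    const double cosTheta = (1.0 - u1) + u1 * cosThetaMax;
    const double sinTheta = std::sqrt(std::max(0.0, 1.0 - cosTheta * cosTheta));
    const double phi = 2.0 * pi * u2;
    return {std::cos(phi) * sinTheta, std::sin(phi) * sinTheta, cosTheta};
}
```

    A half-angle of zero (`cosThetaMax` equal to 1) collapses to the single hard-shadow ray toward the light.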

    Ambient Occlusion

    Integral of the visibility function over the hemisphere around a surface point, oriented by the surface normal, with respect to the projected solid angle
    • Random cosine hemi sampling
    • Launch from g-buffer normals
    • AO = payload.miss ? 1.0 : 0.0

    Reflections

    Launch rays from G-Buffer
    Trace at half resolution
    • ¼ ray/pixel for reflection
    • ¼ ray/pixel for reflected shadow
    Reconstruct at full resolution
    Also supports:
    • Arbitrary normals
    • Spatially-varying roughness
    Extended info: GDC 2018 & DD 2018

    Reflection Pipeline

    Importance sampling → Screen-space reflection → Raytracing → Envmap gap fill → Spatial reconstruction → Temporal accumulation → Bilateral cleanup

    Validate Against Ground Truth!

    Validating against ground truth is key when building RTRT!

    Toggle between hybrid and path-tracer when working on a feature
    • Rapidly compare results against ground truth
      • Toggle between non-RT techniques and RT
      • e.g.: SSR → RT reflections, SSAO → RTAO
    • Check performance & check quality (and where you can cut corners)
      • PICA PICA: used constantly during production
    • Multi-layer material & specular, RTAO vs SSAO, Surfel GI vs path-traced GI

    No additional maintenance required between shared code
    • Because of interop!

    Take Advantage of Interop

    DirectX offers easy interoperability between raster, compute and raytracing
    • Raytracing, rasterization and compute shaders can share code & types
    • Evaluate your actual HLSL material shaders - directly usable for a hybrid raytracing pipeline
    The output from one stage can feed data for another
    • e.g.: Write to UAVs, read in next stage
    • e.g.: Prepare rays to be launched on one GPU, and trace on another (mGPU)
    • e.g.: Can update Shader Table from the GPU
    Interop extends opportunities for solving new sparse problems
    • Interleave raster, compute and raytracing
    • Use the power of each stage to your advantage
    • Interop will become your new best friend as we move towards this transition

    Speaking of Rays…

    Handling coherency is key for RTRT performance
    • Coherent adjacent work performing similar operations & memory access
      • Camera rays, texture-space shading
    • Incoherent → thrashes caches, kills performance
      • Reflection, shadows, refraction, Monte Carlo
    Use rays sparingly
    • Trace only where necessary
    • Tune ray count to importance
    • Adaptive techniques
    Reconstruct & denoise
    • Reuse results from spatial and temporal domains

    Texture Level-of-Detail

    Mipmapping [Williams 1983] is the standard method to avoid texture aliasing:
    Screen-space pixel maps to approximately one texel in the mipmap hierarchy
    Supported by all GPUs for rasterization via shading quad and derivatives

    No shading quads for ray tracing!

    Traditionally: Ray Differentials
    • Estimates the footprint of a pixel by computing world-space derivatives of the ray with respect to the image plane
    • Have to differentiate (virtual offset) rays
    • Heavier payload (12 floats) for subsequent rays (can) affect performance. Optimize!
    Alternative: always sample mip 0 with bilinear filtering (with extra samples)
    • Leads to aliasing and additional performance cost
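
    Once a footprint estimate exists (for example from ray differentials), picking a mip level follows the usual isotropic rule: take the base-2 logarithm of the footprint's longer axis measured in texels. The sketch below shows only that last step; the parameter names are assumptions, and how the UV derivatives are obtained is outside its scope.

```cpp
#include <algorithm>
#include <cmath>

// (dudx, dvdx): change in UV for a one-pixel step in screen x.
// (dudy, dvdy): change in UV for a one-pixel step in screen y.
// Returns a fractional mip level in [0, mipCount - 1].
double selectMipLevel(double dudx, double dvdx, double dudy, double dvdy,
                      int texWidth, int texHeight, int mipCount) {
    // Squared footprint lengths in texels along each screen axis.
    const double lenX2 = (dudx * texWidth) * (dudx * texWidth) +
                         (dvdx * texHeight) * (dvdx * texHeight);
    const double lenY2 = (dudy * texWidth) * (dudy * texWidth) +
                         (dvdy * texHeight) * (dvdy * texHeight);
    const double rho2 = std::max(lenX2, lenY2);
    if (rho2 <= 1.0) return 0.0;  // pixel covers at most one texel: use mip 0
    const double lod = 0.5 * std::log2(rho2);  // log2 of the longer footprint axis
    return std::min(lod, static_cast<double>(mipCount - 1));
}
```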

    Open Problems
    • Noise vs Ghosting vs Performance
    • Managing Coherency & Ray Batches
    • Transparency & Procedural Geometry
    • Specialized denoising & reconstruction
    • Real-Time Global Illumination
    • DXR’s Top & Bottom Accel → best for RTRT?
    • Managing Animations
    • New Hybrid rendering approaches?
    • Texture LOD?
  7. BRiT

    Remedy's new Ray-Tracing presentation and some tweets from Sebbi about it.

  8. pharma

    Parallel Shader Compilation for Ray Tracing Pipeline States
    November 19, 2018
    In ray tracing, a single pipeline state object (PSO) can contain any number of shaders. This number can grow large, depending on scene content and ray types handled with the PSO; construction cost of the state object can significantly increase. The DXR API makes it possible to distribute part of the creation work to multiple threads by utilizing collections. A collection is an ID3D12StateObject with type D3D12_STATE_OBJECT_TYPE_COLLECTION.

    Multiple threads can be used for state object creation, as shown in figure 1. One collection can store one or more shaders that are compiled from one or more DXIL libraries. Each collection is created with a single thread, but as lots of shaders can be used in one PSO, it is possible to distribute the related collection creation work to multiple threads. Additionally, one collection can be potentially used in multiple PSOs. It can be a good idea to cache created collections for reuse.
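
    The threading pattern described above can be sketched with standard C++ futures. No D3D12 calls are made here: `Collection` and `createCollection` are hypothetical stand-ins for building one collection state object from one DXIL library, and real code would also need error handling.

```cpp
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for a compiled collection state object.
struct Collection {
    std::string libraryName;
};

// Hypothetical stand-in for the expensive, single-threaded creation of one collection.
Collection createCollection(const std::string& libraryName) {
    return Collection{libraryName};
}

// Launches one creation task per library, then gathers results in input order.
std::vector<Collection> createCollectionsInParallel(const std::vector<std::string>& libraries) {
    std::vector<std::future<Collection>> tasks;
    tasks.reserve(libraries.size());
    for (const std::string& library : libraries) {
        tasks.push_back(std::async(std::launch::async, createCollection, library));
    }
    std::vector<Collection> collections;
    collections.reserve(tasks.size());
    for (std::future<Collection>& task : tasks) {
        collections.push_back(task.get());  // blocks until that collection is ready
    }
    return collections;
}
```

    The gathered collections would then be referenced when creating the final PSO, and could be kept in a cache keyed by library for reuse.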

    In order to allow compilation of shader code to native format during collection creation, the collections must define most of the state that would be defined in the final PSO as well with subobjects. A RAYTRACING_SHADER_CONFIG subobject must be defined. All shaders must have root signatures fully defined with GLOBAL_ROOT_SIGNATURE and LOCAL_ROOT_SIGNATURE subobjects. Additionally, a HIT_GROUP subobject must be defined for intersection, any-hit, and closest-hit shaders. Note that the RAYTRACING_SHADER_CONFIG subobjects in all collections and in the PSO itself in a PSO creation call must match. A RAYTRACING_PIPELINE_CONFIG subobject does not need to be defined in collections. You should avoid using the state object flags ALLOW_LOCAL_DEPENDENCIES_ON_EXTERNAL_DEFINITIONS or ALLOW_EXTERNAL_DEPENDENCIES_ON_LOCAL_DEFINITIONS for best performance from collections. They may prevent compiling shader code to native format or increase memory consumption.
  9. pharma

    This course is an introduction to Microsoft’s DirectX Raytracing API suitable for students, faculty, rendering engineers, and industry researchers. The first half focuses on ray tracing basics and incremental, open-source shader tutorials accessible for novices. It’s the definitive guide to getting started using this incredible technology. The video delivers more than 3 hours of training from several of the most experienced ray tracing engineers and researchers on the planet.