Game development presentations - a useful reference

Visibility TAA and Upsampling with Subsample History

http://filmicworlds.com/blog/visibility-taa-and-upsampling-with-subsample-history/

Guy is an idea machine. Really cool. Each blog post is a new way to use the visibility buffer.

Really fantastic series of articles. I might be missing something, but he seems to be using hardware triangle setup and rasterization in the first pass when rendering the visibility buffer. This seems not ideal for pixel-sized triangles.

Nanite works differently right? I believe it passes the raw geometry directly into a compute shader bypassing all geometry hardware.
 
Nanite works differently right? I believe it passes the raw geometry directly into a compute shader bypassing all geometry hardware.
Yes, which may be their primary motivation for using a visibility buffer. The CS rasterizer does one 64-bit atomic to VRAM per pixel. Generating a full G-buffer directly would need two passes or a spinlock per pixel.
Now RDNA2 already has 128-bit cache lines. I wonder if we could get 128-bit atomics too. Then we could put more stuff into a pixel, like barycentric coords or some material info.
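
For illustration, here is a minimal CUDA-style sketch of the per-pixel write described above (not Nanite's actual code): pack depth into the high 32 bits and a triangle/cluster ID into the low 32 bits, so a single 64-bit atomicMax keeps the closest sample. It assumes a reversed-Z style depth where larger means closer and a precomputed flat pixel index; the names are hypothetical.

```cuda
// Sketch of a compute-rasterizer visibility write: one 64-bit atomic per pixel.
__device__ void writeVisibility(unsigned long long* visBuffer, // one u64 per pixel
                                unsigned int pixelIndex,
                                float depth,                   // >= 0, larger = closer (reversed-Z style)
                                unsigned int triangleId)       // packed instance/triangle ID
{
    // For non-negative floats the raw bit pattern is monotonic, so comparing the
    // packed integers compares depth first, with the ID riding along in the low bits.
    unsigned int depthBits = __float_as_uint(depth);
    unsigned long long packed =
        ((unsigned long long)depthBits << 32) | (unsigned long long)triangleId;
    atomicMax(&visBuffer[pixelIndex], packed);
}
```

A full G-buffer obviously doesn't fit into those 64 bits, which is why writing one directly from a compute rasterizer would need the two passes or the per-pixel spinlock mentioned above; wider atomics would relax that constraint.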
 
Yes, which may be their primary motivation for using a visibility buffer. The CS rasterizer does one 64-bit atomic to VRAM per pixel. Generating a full G-buffer directly would need two passes or a spinlock per pixel.
Now RDNA2 already has 128-bit cache lines. I wonder if we could get 128-bit atomics too. Then we could put more stuff into a pixel, like barycentric coords or some material info.

Did you write chapter 26 of RT Gems II? :D

This chapter revisits solutions to some problems that can arise from combining ray tracing and rasterization but also touches on methods for more generalized ray traced LOD transitions.

Cross-faded LOD transitions are also a good choice for ray tracing applications as they prevent bounding volume hierarchy (BVH) refitting or rebuild operations for geomorphing geometries. Instead, bottom-level acceleration structures (BLASs) for discrete LODs can be instanced in the top-level acceleration structure (TLAS) as needed. So, similar to the fact that rasterization needs to render two LODs during transitions, it is necessary to put the geometry of two or more LODs in the BVH in order to enable ray traced transitions.

At this point in time, DXR doesn't provide obvious direct API support for letting potential traversal hardware handle high-quality LOD transitions, though. The DirectX specification mentions traversal shaders as a potential future feature, but it is unclear if and when they will become a reality. In a hybrid ray traced technique, where the ray origin is usually derived from the world-space position of a given pixel, a traversal shader would allow LOD-based cross-fading using the same logic that the pixel shaders described previously use. Instead of discarding pixels, the ray would just be forwarded into the BLAS of the selected LOD.
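
In the absence of traversal shaders, one way to approximate such a cross-fade is to pick the LOD stochastically per ray at generation time and trace into the matching BLAS instance. A hedged sketch of that idea, with all names hypothetical (this is not the chapter's code):

```cuda
// Dither the LOD transition by picking one of the two adjacent LODs per ray,
// with probability equal to the transition fraction. Averaged over many rays
// or pixels this approximates a cross-fade instead of a hard pop.
__device__ int selectLodStochastic(float continuousLod, // e.g. derived from distance
                                   float rnd)           // per-ray uniform random in [0,1)
{
    int   base = (int)floorf(continuousLod);
    float frac = continuousLod - (float)base;   // 0 = fully LOD 'base', 1 = fully 'base + 1'
    return (rnd < frac) ? base + 1 : base;
}
// The chosen index would then select the matching BLAS instance in the TLAS,
// e.g. via per-LOD instance masks or a per-LOD instance table.
```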
 
Did you write chapter 26 of RT Gems II? :D
hehe :) We discussed NV's stochastic LOD paper (which the chapter is based on) vs. the earlier Intel paper introducing 'traversal shaders' before.
IIRC, the limitation of NV's solution is that it cannot switch LODs as the ray goes through the scene; it can only select the LOD at ray generation or on a hit. So it can't be a full solution to the continuous LOD problem, but it is nice for hiding the popping of discrete LODs, as shown.
 
The first SIGGRAPH slides are up. Natascha was very quick with it this time. Thanks for that!
http://advances.realtimerendering.com/s2021/index.html

Advances in Real-Time Rendering in Games: Part I
Improved Spatial Upscaling through FidelityFX Super Resolution for Real-Time Game Engines
Timothy Lottes (Unity Technologies)
Kleber Garcia (Unity Technologies)

Experimenting with Concurrent Binary Trees for Large Scale Terrain Rendering
Thomas Deliot (Unity Technologies)
Jonathan Dupuy (Unity Technologies)
Kees Rijnen (Unity Technologies)
Xiaoling Yao (Unity Technologies)

A Deep Dive into Nanite Virtualized Geometry
Brian Karis (Epic Games)
Rune Stubbe (Epic Games)
Graham Wihlidal (Epic Games)

Large-Scale Global Illumination at Activision
Ari Silvennoinen (Activision Publishing)

Advances in Real-Time Rendering in Games: Part II
Real-Time Samurai Cinema: Lighting, Atmosphere, and Tone mapping in Ghost of Tsushima
Jasmin Patry (Sucker Punch Productions)

Radiance Caching for Real-time Global Illumination
Daniel Wright (Epic Games)

Global Illumination Based on Surfels
Henrik Halen (SEED at Electronic Arts),
Andreas Brinck (Ripple Effect Studios at Electronic Arts),
Kyle Hayward (Frostbite at Electronic Arts),
Xiangshun Bei (Ripple Effect Studios at Electronic Arts)
 
The SIGGRAPH slides for Part II (link: see two posts above) are online.

Surfel GI is interesting. The performance tests were done on PS5: a stress test, plus some scenes from a Plants vs. Zombies game that are more realistic for an actual game.

It is probably a bit faster on XSX, and faster on desktop RDNA 2 and NVIDIA GPUs.

It looks faster than Lumen, but for geometry with too much detail, where surfels overlap, it falls back to SSGI.

Not a solution for Nanite-level geometry.
 
for geometry with too much detail, where surfels overlap, it falls back to SSGI.
Would you mind sharing the quote from the presentation?

There are many cases where Lumen falls back to SSGI.
For example, Lumen builds its cards cache around geometry without an implicit parametrization (i.e., UVs for the cards). It does so in a cube-map manner, so it does not support complexly shaped geometry with curls and self-occlusion (cards can't be projected onto the occluded parts of the geometry, so the cards cache won't work for something like trees). That's where Lumen will always fall back to SSGI in the current implementation.
Unlike the cards in Lumen, surfels support arbitrarily shaped geometry and hence the self-illumination of geometry.

Not a solution for Nanite-level geometry.
Following this logic, neither are SDFs. The global SDF is heavily simplified and low-res and doesn't match the Nanite geometry precisely at all; local SDFs are a bit better, but still nowhere near the Nanite geometry's complexity. Also, both work only for non-deformable geometry (no foliage, skinned models, etc.).
Surfels, on the other hand, are just an irradiance cache and support all types of geometry; tracing happens against the actual geometry and should provide much better precision for GI and ambient shadowing in comparison with SDFs.
Overall, surfels seem to be a more versatile and advanced caching solution, but also more complex to implement.
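
To make the contrast with Lumen's cards concrete, here is a rough sketch of what a surfel cache entry might look like; the field choices are assumptions for illustration, not the actual GIBS data layout:

```cuda
// Hypothetical surfel cache entry.
struct Surfel
{
    float3 position;     // world-space center
    float3 normal;       // orientation; a surfel is an oriented disc
    float  radius;       // world-space footprint, smaller where detail is high
    float3 irradiance;   // cached diffuse irradiance, accumulated over frames
    float  sampleCount;  // how much history this cache entry has accumulated
};
```

Because an entry is just a point, a normal and a radius, surfels can be spawned on arbitrarily shaped geometry (foliage, skinned meshes) without requiring a card/UV parametrization of the surface.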
 
Would you mind sharing the quote from the presentation?


Page 204
We also have certain situations where high-detail geometry can over-spawn surfels. One of our options is to combine with screen space global illumination to minimize surfel spawning in areas where SSGI solves well.



Combining with SSGI will also help limit how many surfels we need to solve, and it will also help limit how far out we cull surfels and raytracing.

Surfels work better and have better performance: 3.41 ms per frame on a PS5, and they say they can improve performance further using intrinsics and other optimizations.

But it has some limitations for very detailed geometry in certain conditions, and they don't give details. If it falls back to SSGI most of the time, this is not a good solution for Nanite. It all depends on the limitations.

Imo I prefer the surfel solution.
 
Thanks!
So world-space tracing does not fall back to SSGI.
Screen traces can be used in places where SSGI works, falling back to the more expensive world-space tracing where SSGI fails.
Since surfels are spawned in an adaptive manner, using screen traces can also help with better surfel distribution in the current frame. But that's not a limitation of the surfel cache when it comes to complex geometry; there is no need for screen traces at all. It's just a small potential optimization that will most likely fail anyway, since screen traces are view dependent.
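
A small sketch of the hybrid scheme being discussed, with hypothetical helper functions (this is not the GIBS or Lumen code): try the cheap, view-dependent screen trace first and only fall back to the world-space surfel/RT path where it fails.

```cuda
struct GIRay { float3 origin; float3 dir; };

// Placeholders for illustration only.
__device__ bool   screenSpaceTrace(const GIRay& ray, float3& radiance); // depth-buffer march
__device__ float3 worldSpaceSurfelTrace(const GIRay& ray);              // HW RT + surfel cache

__device__ float3 traceDiffuse(const GIRay& ray)
{
    float3 radiance;
    // The screen trace can miss because the hit point is off-screen or hidden
    // in the depth buffer, which is why it can only be an optimization on top
    // of the world-space path, not a replacement for it.
    if (screenSpaceTrace(ray, radiance))
        return radiance;
    return worldSpaceSurfelTrace(ray);
}
```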
 
Not a solution for Nanite-level geometry.
Well, they could build an angular representation per surfel, so local normals get a directional term when shading. That increases memory cost, of course.
Further, one could precompute directional close-range occlusion on the geometry, which increases storage and memory costs.
But then high detail should already work pretty well without further SS sampling or rays.

Interesting: their tech only gives an incomplete scene representation, built from the frame buffer. Similar to Crytek's partial voxelization in Crysis 2.
Thus they get bounces only from some of the geometry (whatever is or was visible at some point).
Compared with DDGI, they have an accuracy advantage from surface samples, but in a global sense their surfel method is even less accurate.
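
As an illustration of that "angular representation per surfel" idea, a surfel could store a small directional basis, for example an ambient cube of six irradiance values, and evaluate it with the detailed shading normal. A purely hypothetical sketch, not the presented technique:

```cuda
// Six-axis (ambient cube) irradiance per surfel, accumulated from the surfel's rays.
struct SurfelAmbientCube
{
    float3 face[6];  // +X, -X, +Y, -Y, +Z, -Z irradiance
};

// Evaluate with the detailed shading normal: blend the three faces the normal
// points toward, weighted by the squared normal components (weights sum to 1).
__device__ float3 evaluateSurfelIrradiance(const SurfelAmbientCube& s, float3 n)
{
    float wx = n.x * n.x, wy = n.y * n.y, wz = n.z * n.z;
    float3 fx = (n.x >= 0.f) ? s.face[0] : s.face[1];
    float3 fy = (n.y >= 0.f) ? s.face[2] : s.face[3];
    float3 fz = (n.z >= 0.f) ? s.face[4] : s.face[5];
    return make_float3(wx * fx.x + wy * fy.x + wz * fz.x,
                       wx * fx.y + wy * fy.y + wz * fz.y,
                       wx * fx.z + wy * fy.z + wz * fz.z);
}
```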
 


EA SEED said:
We also have certain situations where high-detail geometry can over-spawn surfels

This is a performance problem, with high-detail geometry over-spawning surfels. Not such a small one if they are considering an alternative. Will it work well with, for example, the Nanite level of geometry in the first PS5 demo? They only use SSGI for this case. And it's not the only such case: for transparency they don't use surfels either, but probe-based GI.

Their tech only gives an incomplete scene representation, built from the frame buffer.

There is probably some solution; if there weren't, surfels would not have been used in offline rendering for GI. They say themselves it is a work in progress, and people will probably push the idea further.

I prefer it to Lumen: less performance-hungry, and it works for everything except transparency. :)

But at least surfels have arrived in real-time rendering, and more people, not only you @JoeJ, will become interested. :)
 
There is probably some solution; if there weren't, surfels would not have been used in offline rendering for GI.
Yeah, the solution simply is to generate surfels from the scene, not from what's currently visible.
But then we need a global parametrization, likely adding preprocessing that affects production, and a lot of work to implement the related tools.
Another option would be to use volume probes like DDGI as a fallback for offscreen stuff.
(Maybe they did something here and I'd need to correct myself; I'm not done reading the whole paper yet... too many of them at once :))

Edit: pasted in the picture of the froxel grid by accident and can't remove it. (I see the same happened with an earlier post; not intended.)
Funnily, I had this same idea some weeks ago in regard to doing full-scene fluid simulations. But I didn't realize it's interesting to map surfels onto the scene as well.
 

Yeah, the solution simply is to generate surfels from the scene, not from what's currently visible.

Here at least some solutions exist that can solve the problem. But it is funny that something so useful was forgotten: it was used in real time before by Michael Bunnell, though never shipped in any game, and it was the de facto method for GI in offline rendering before path tracing. I think the surfel patent is abandoned, maybe because everyone uses path tracing in offline rendering now.

In the real-time field, everyone was chasing voxels and SDFs as an alternative to triangle-based ray tracing.

And it uses hardware ray tracing. :) Something so useful is not lost.
 
Surfels shoot rays into the scene. These rays hit geometry, any geometry that is in the ray tracing acceleration structure, including dynamic and skinned geometry. On a hit point, we evaluate direct diffuse lighting and shadowing by tracing the light sources in the scene. We will talk about how we do that efficiently a bit later. Additionally at the hit locations we also evaluate the existing surfel lighting at those locations. This gives us effectively infinite bounce over time, as long as there is surfel coverage in that area.

This seems like it would be a problem since surfels only spawn for geometry that was on screen at some point. What if the ray hits back-facing or offscreen geometry? You get one bounce, but that's it.
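
For reference, a rough sketch of the mechanism the quote describes, with all helpers hypothetical (not the GIBS API): at a ray hit, direct lighting is evaluated and the cached surfel irradiance near the hit is read back, which is what feeds previous-frame energy back in and gives "infinite bounce over time". If no surfel covers the hit area, that term is simply zero, which is the concern raised above.

```cuda
// Hypothetical surfel record and lookup helpers for illustration only.
struct Surfel { float3 position; float3 normal; float radius; float3 irradiance; };

__device__ float3 evaluateDirectLighting(float3 p, float3 n);                      // placeholder
__device__ int    gatherNearbySurfels(float3 p, const Surfel** out, int maxCount); // placeholder
__device__ float  surfelWeight(const Surfel& s, float3 p, float3 n);               // distance/normal falloff

__device__ float3 shadeHit(float3 hitPos, float3 hitNormal)
{
    float3 result = evaluateDirectLighting(hitPos, hitNormal);

    const Surfel* nearby[16];
    int count = gatherNearbySurfels(hitPos, nearby, 16);

    // Weighted average of cached surfel irradiance around the hit point.
    // Zero surfels here means geometry that was never on screen: only the
    // direct term above contributes, i.e. a single bounce.
    float3 cached = make_float3(0.f, 0.f, 0.f);
    float  wsum   = 0.f;
    for (int i = 0; i < count; ++i)
    {
        float w = surfelWeight(*nearby[i], hitPos, hitNormal);
        cached.x += w * nearby[i]->irradiance.x;
        cached.y += w * nearby[i]->irradiance.y;
        cached.z += w * nearby[i]->irradiance.z;
        wsum += w;
    }
    if (wsum > 0.f)
    {
        result.x += cached.x / wsum;
        result.y += cached.y / wsum;
        result.z += cached.z / wsum;
    }
    return result;
}
```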

The different approaches to dynamic GI are intriguing. There are DDGI, Lumen, and GIBS so far, all taking slightly different approaches to the problem. Interestingly, both Lumen and GIBS use probe grids in addition to their surface caches. GIBS seems to be the most complete in terms of accuracy and asset compatibility. Lumen's advantage is support for older non-RT hardware, but that seems less relevant given the performance required. DDGI is less accurate, as it only uses probes, but is the most straightforward to implement.

What’s clear is that native 4K will not be a thing this generation. GIBS is struggling with checkerboard 1800p on PS5.
 