Game development presentations - a useful reference

This seems like it would be a problem since surfels only spawn for geometry that was on screen at some point.
Yet these surfels persist across many frames and should be better for disocclusion handling when compared to Lumen's Screen Space Radiance caching.
I compared Lumen's surface cache (cards) to surfels by mistake; these should not be compared directly, since cards store surface shading in world space while surfels cache radiance in world space, and these are two different things.
For radiance caching, Lumen uses the Screen Space Radiance Cache in conjunction with the World Space Radiance Cache (probes). The former is responsible for high-frequency GI shading and will not persist when the camera view changes, since it is tied to screen space; surfels should persist across view changes since they are stored in world space (and there are probes to add bounces for backfacing geometry in both solutions).
 
But it is funny that something so useful was forgotten: it was used in real time before by Michael Bunnell, though never shipped in any games, and it was the de facto method for GI before path tracing. I think the surfel patent has been abandoned, maybe because everyone uses path tracing in offline rendering now.
The remarkable (but forgotten) idea from Bunnell was replacing the visibility term with an approximation based on a shadowing term. This idea is not really limited to surfels; I also tried it with a volume grid of probes, for example, and it works for that too.
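As a rough illustration (not Bunnell's exact formulation; the names and constants are mine), here is a minimal sketch of the disc-to-point form factor approximation this family of techniques relies on: each emitter disc contributes an analytic shadowing/occlusion weight to a receiver, so no visibility ray ever needs to be traced.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

static float dot(const Vec3 &a, const Vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Approximate form factor between an emitter disc (surfel) and a receiver point,
// using the common analytic approximation A * cosThetaE * cosThetaR / (pi * d^2 + A).
// The point is that no visibility ray is traced; occlusion/bleeding comes from
// summing these cheap terms over nearby discs.
float discToPointFormFactor(Vec3 emitterPos, Vec3 emitterNormal, float emitterArea,
                            Vec3 receiverPos, Vec3 receiverNormal)
{
    const Vec3  v    = { receiverPos.x - emitterPos.x,
                         receiverPos.y - emitterPos.y,
                         receiverPos.z - emitterPos.z };
    const float d2   = dot(v, v);
    const float dist = std::max(std::sqrt(d2), 1e-6f);
    const Vec3  dir  = { v.x / dist, v.y / dist, v.z / dist };    // emitter -> receiver

    const float cosE = std::max(0.0f,  dot(emitterNormal,  dir)); // emitter faces receiver
    const float cosR = std::max(0.0f, -dot(receiverNormal, dir)); // receiver faces emitter

    return emitterArea * cosE * cosR / (3.14159265f * d2 + emitterArea);
}

// Occlusion at a receiver is then a sum of such factors over nearby surfels;
// Bunnell additionally iterates so that occluded occluders count less ("double shadowing").
```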

The other related method is using the radiance cache to get infinite bounces for free (also called the radiosity method). It requires some form of finite elements for the cache, but again works with surfels, volume probes, or whatever else, and it is now standard in all realtime methods that came up after the introduction of RTX.
Curiously, it was mostly ignored before, although Bunnell used it. I think there were some fancy VCT techniques that did it too, and Sonic Ether's Minecraft mod also used such caching, as did the official RT port.
Classical path tracing cannot do this, which is why I think it will never become the standard lighting solution for games. It's just inefficient to recalculate everything from scratch every frame.
But I'm convinced by PT with a probe cache to shorten paths. It still feels expensive, but Exodus demonstrates it already works today, being basically Quake2RTX + a DDGI cache.
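To make the "infinite bounces from a cache" point concrete, here is a toy, self-contained sketch (the three-patch scene and its coupling factors are invented): each frame gathers only a single bounce from the previous frame's cached values, yet the cache converges towards the full multi-bounce solution over time, which is exactly what surfel/probe caches and DDGI-style updates exploit.

```cpp
#include <array>
#include <cstdio>

int main()
{
    constexpr int N = 3;
    // F[i][j]: fraction of patch j's radiance received by patch i (toy numbers).
    const float F[N][N] = { {0.0f, 0.3f, 0.2f},
                            {0.3f, 0.0f, 0.4f},
                            {0.2f, 0.4f, 0.0f} };
    const std::array<float, N> emission = {1.0f, 0.0f, 0.0f};   // patch 0 is the light
    const std::array<float, N> albedo   = {0.7f, 0.7f, 0.7f};

    std::array<float, N> cache = emission;   // the persistent radiance cache
    for (int frame = 0; frame < 60; ++frame)
    {
        std::array<float, N> next{};
        for (int i = 0; i < N; ++i)
        {
            float gathered = 0.0f;
            for (int j = 0; j < N; ++j)
                gathered += F[i][j] * cache[j];                  // one gather from the cache
            next[i] = emission[i] + albedo[i] * gathered;
        }
        cache = next;   // one extra bounce gets folded in every frame
    }
    std::printf("converged radiance: %f %f %f\n", cache[0], cache[1], cache[2]);
    return 0;
}
```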

What’s clear is that native 4K will not be a thing this generation.
Like all probe solutions, it can be independent of resolution. If there is a problem, they can just reduce surfel density while keeping render resolution high. It won't look as good, of course, but it helps reduce the RT cost.
Though that's not clearly an advantage for probe solutions: reducing render resolution (e.g. on handhelds) does not automatically give you a performance win for GI, and achieving dynamic probe resolution may be harder than just changing a render target.
Increasing lag is another option to scale probe solutions.
 
Yet these surfels persist across many frames and should be better for disocclusion handling when compared to Lumen's Screen Space Radiance caching.
I compared Lumen's surface cache (cards) to surfels by mistake; these should not be compared directly, since cards store surface shading in world space while surfels cache radiance in world space, and these are two different things.
For radiance caching, Lumen uses the Screen Space Radiance Cache in conjunction with the World Space Radiance Cache (probes). The former is responsible for high-frequency GI shading and will not persist when the camera view changes, since it is tied to screen space; surfels should persist across view changes since they are stored in world space (and there are probes to add bounces for backfacing geometry in both solutions).

I got the impression that GIBS probes are only sampled when lighting transparent surfaces.
 
I got the impression that GIBS probes are only sampled when lighting transparent surfaces.
I do think it is only for transparency, but I believe they mention using them for general specularity in the future (spherical Gaussians were mentioned in future work). Currently it's diffuse only.
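For reference, a spherical Gaussian lobe (the representation mentioned for future specular work) is just an exponential falloff around an axis; a minimal sketch with illustrative parameter names:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Evaluate a spherical Gaussian lobe in unit direction v:
// G(v) = amplitude * exp(sharpness * (dot(axis, v) - 1)), peaking along 'axis'.
// Fitting a few such lobes to cached radiance is one way to get a glossy
// response out of an otherwise diffuse-only cache.
float evalSphericalGaussian(Vec3 axis, float sharpness, float amplitude, Vec3 v)
{
    const float cosTheta = axis.x * v.x + axis.y * v.y + axis.z * v.z;
    return amplitude * std::exp(sharpness * (cosTheta - 1.0f));
}
```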

One thing that is a bit of a question mark in the Lumen presentation at SIGGRAPH is that it is not shown where the PS5 numbers are coming from. Were those traces using the Hardware Path or the Software Path? That is a pretty important thing to point out for quality and performance!
 
Like all probe solutions, it can be independent of resolution. If there is a problem, they can just reduce surfel density while keeping render resolution high. It won't look as good, of course, but it helps reduce the RT cost.
Though that's not clearly an advantage for probe solutions: reducing render resolution (e.g. on handhelds) does not automatically give you a performance win for GI, and achieving dynamic probe resolution may be harder than just changing a render target.
Increasing lag is another option to scale probe solutions.

It’s also not obvious that lower surfel density at higher resolution is a net IQ win versus upscaling from a lower resolution with more accurate GI. I wonder what Frostbite is doing for shadows and reflections. Presumably they will also support RT versions. That’s a lot of rays.
 
I got the impression that GIBS probes are only sampled when lighting transparent surfaces.
This part is unclear to me: why wouldn't they simply fall back to probes on a first-bounce ray hit for a new surfel?

Also, surfels will accumulate quite a few bounces for all geometry in the scene (including backfacing geometry) within a few tens of frames.
 
This part is unclear to me: why wouldn't they simply fall back to probes on a first-bounce ray hit for a new surfel?

Yeah it would make sense but they didn’t mention it explicitly in the deck. It’s probably because they don’t need it thanks to your other point below.

Also, surfels will accumulate quite a few bounces for all geometry in the scene (including backfacing geometry) within a few tens of frames.

True.
 
One thing that is a bit of a question mark in the Lumen presentation at SIGGRAPH is that it is not shown where the PS5 numbers are coming from. Were those traces using the Hardware Path or the Software Path? That is a pretty important thing to point out for quality and performance!

Lol I assume it’s hardware RT since there was no mention of SDFs or software tracing until the very end of the deck.

The presentation starts off by saying casting a ray per pixel into the BVH is too expensive. So it follows that the whole point of the sparse screen space radiance cache is to cast fewer rays into said BVH.

But it’s the surface cache that enables multiple bounces of GI so I’m also a little confused as to how the screen space radiance cache interacts with the Lumen scene. It would have to be tracing into the surface cache to enable multi bounce GI.
 
But it’s the surface cache that enables multiple bounces of GI so I’m also a little confused as to how the screen space radiance cache interacts with the Lumen scene. It would have to be tracing into the surface cache to enable multi bounce GI.
I have watched the video. I did read the paper.
But I still have no idea how all those things are connected. Some overview of the whole system would have been nice.
 
My first question regarding GIBS is whether or not it's compatible with hardware ray tracing?

The immediate impression I had was that they have a hybrid approach, combining hardware acceleration with a surface cache (surfels) which has its own acceleration structure. BVHs and grids have different properties that are ideal in certain scenarios. A BVH is very slow to build but fast at convergence during ray tracing, so they still use it because hardware acceleration gets high-quality results in a short time. A hierarchical grid has the opposite characteristics compared to a BVH, so it can be useful for doing lookups to get plausible results for dynamic geometry in their case ...
 
My first question regarding GIBS is whether or not it's compatible with hardware ray tracing?
Yes, they use HW RT. Hitpoints are then shaded taking nearby surfels into account (if there are any).

The immediate impression I had was that they have a hybrid approach, combining hardware acceleration with a surface cache (surfels) which has its own acceleration structure.
Yes. It's a kind of regular grid, so lookup is O(1) and faster than a BVH. The pyramid helps to keep surfel density per cell uniform, and this way they avoid the need for a hierarchical grid.
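A minimal sketch of what such a lookup can look like (the layout, sizes and wrapping here are my assumptions, not DICE's implementation): a world position maps straight to a cell index with a few arithmetic operations, so there is no tree to traverse.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SurfelGrid
{
    float cellSize;                               // metres per cell (assumed)
    int   dim;                                    // cells per axis (assumed)
    std::vector<std::vector<uint32_t>> cells;     // surfel indices per cell

    SurfelGrid(float cell, int d) : cellSize(cell), dim(d), cells(std::size_t(d) * d * d) {}

    int cellIndex(float x, float y, float z) const
    {
        // Wrap coordinates into the grid; a real implementation would clamp or
        // scroll/cascade the grid origin with the camera instead.
        auto wrap = [&](float v) {
            const int c = int(std::floor(v / cellSize));
            return ((c % dim) + dim) % dim;
        };
        return (wrap(z) * dim + wrap(y)) * dim + wrap(x);
    }

    void insert(uint32_t surfelId, float x, float y, float z)
    {
        cells[cellIndex(x, y, z)].push_back(surfelId);   // O(1)
    }

    const std::vector<uint32_t> &lookup(float x, float y, float z) const
    {
        return cells[cellIndex(x, y, z)];                 // O(1)
    }
};
```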

so it can be useful for doing lookups to get plausible results for dynamic geometry in their case ...
It's not really specific to dynamic objects. They rebuild the grid every frame, so it does not matter what's dynamic or static. Also the surfel generation does not make a difference here.
But if we have very small objects with no surfels on them, or volumetric stuff, then they use a cascaded volume grid of probes. So that's kind of a fallback to solve the problems we usually have when thinking of dynamic objects.

...not sure if I got all this right ;)
 
My first question regarding GIBS is whether or not it's compatible with hardware ray tracing?

It’s not just compatible with hardware ray tracing, it requires it. I wonder if they will provide a fallback for non-RT hardware.

The immediate impression I had was that they have a hybrid approach, combining hardware acceleration with a surface cache (surfels) which has its own acceleration structure. BVHs and grids have different properties that are ideal in certain scenarios. A BVH is very slow to build but fast at convergence during ray tracing, so they still use it because hardware acceleration gets high-quality results in a short time. A hierarchical grid has the opposite characteristics compared to a BVH, so it can be useful for doing lookups to get plausible results for dynamic geometry in their case ...

The caches (surfels, cards) and geometry representations (SDF, BVH) serve different purposes, so they're not really competing with each other.

GIBS hardware traces from visible surfel locations into the BVH each frame, shades the hit point and also integrates lighting from any previously cached surfel data covering the hit point. It adds the result back into the world space surfel cache for bounce lighting.
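A rough sketch of that per-frame loop, with all types and helpers as invented placeholders (stubbed out here) rather than Frostbite's actual API; the real system also does importance sampling, ray budgeting and spatial filtering:

```cpp
#include <algorithm>

struct Vec3 { float x, y, z; };

static Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 mul(Vec3 a, Vec3 b) { return {a.x * b.x, a.y * b.y, a.z * b.z}; }
static Vec3 lerp(Vec3 a, Vec3 b, float t)
{
    return {a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t};
}

struct Surfel { Vec3 position, normal, irradiance; float sampleCount; };
struct Hit    { bool valid; Vec3 position, normal, albedo, directLight; };

// Placeholder for a hardware ray trace into the BVH (e.g. a DXR dispatch).
static Hit traceRayStub(Vec3, Vec3) { return {}; }
// Placeholder for gathering cached lighting from surfels covering a point,
// via the world-space grid lookup.
static Vec3 sampleSurfelCacheStub(Vec3, Vec3) { return {}; }

void updateSurfel(Surfel &s, Vec3 sampledDirection)
{
    const Hit hit = traceRayStub(s.position, sampledDirection);

    Vec3 radiance = {0.0f, 0.0f, 0.0f};
    if (hit.valid)
    {
        // Direct light at the hit plus previously cached bounce light covering it.
        const Vec3 bounce = sampleSurfelCacheStub(hit.position, hit.normal);
        radiance = add(hit.directLight, mul(hit.albedo, bounce));
    }

    // Temporal accumulation back into the world-space surfel cache.
    const float w = 1.0f / std::min(s.sampleCount + 1.0f, 64.0f);
    s.irradiance  = lerp(s.irradiance, radiance, w);
    s.sampleCount += 1.0f;
}
```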

Lumen is a bit more involved and has multiple caches. The world space surface cache (cards) essentially consists of reverse cube maps around each Nanite object. These cube maps are stored in a texture atlas and updated each frame by software rasterizing the Nanite geometry at the appropriate LOD based on distance. Cubemap pixels are then shaded and this shading step also samples the surface cache. This sampling of the cache while updating the cache is what provides Lumen’s bounce lighting.

Lumen then has a second level of caching in screen space using a 1/16th resolution grid of pixel regions. Each frame a different pixel within the region is selected to cast rays and the result is temporally accumulated over multiple frames. It seems this ray trace can use either the SDF (software) or BVH (hardware). It then samples the surface cache at the ray hit point.
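A small sketch of that behaviour (the 16x16 region size, jitter pattern and blend factor are assumptions for illustration, not the exact Lumen parameters):

```cpp
#include <cstdint>

struct Int2 { int x, y; };

// Choose which pixel inside a region traces this frame (simple hash scramble;
// the real pattern is presumably a proper low-discrepancy sequence).
Int2 probePixelInRegion(Int2 region, uint32_t frameIndex)
{
    const uint32_t h = (uint32_t(region.x) * 73856093u)
                     ^ (uint32_t(region.y) * 19349663u)
                     ^ (frameIndex * 83492791u);
    return { region.x * 16 + int(h % 16u),
             region.y * 16 + int((h / 16u) % 16u) };
}

// Exponential temporal accumulation of the new trace result into the cache.
float accumulate(float history, float newSample, float alpha = 0.1f)
{
    return history + alpha * (newSample - history);
}
```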

The question is why is this screen space caching and accumulation even necessary when the world space surface cache already has all the bounce lighting? Why not just sample the cache directly from each pixel during the final render like GIBS, instead of shooting and accumulating more rays? There is a hint of an answer in the Lumen video that says the bounce GI in the surface cache is low quality. However, it doesn't explain how the cache samples itself or why the result is low fidelity. Either way, it makes sense then that more rays need to be cast from screen space to accumulate an accurate result from the surface cache.

The key difference with GIBS is that the bounce GI in the world space surfel cache is already high fidelity because each surfel has already cast a bunch of rays. Therefore the cached data can be sampled directly for each pixel in the final render. Lumen is basically jumping through additional hoops with significant limitations in order to support much higher resolution geometry.
 
The question is why is this screen space caching and accumulation even necessary when the world space surface cache already has all the bounce lighting? Why not just sample the cache directly from each pixel during the final render
For Lumen the cache just seems too noisy; think of the visualization of the Lumen scene - too much noise. GIBS shows a similar but lesser issue, which they fight by blurring surfels with neighboring surfels.
 
The world space surface cache (cards) essentially consists of reverse cube maps around each Nanite object. These cube maps are stored in a texture atlas and updated each frame by software rasterizing the Nanite geometry at the appropriate LOD based on distance. Cubemap pixels are then shaded and this shading step also samples the surface cache.
Still confused about this...
The cards are all boxes, so they would correspond to rectangles of texture. And each texel would be one cached probe.
We can also have LOD by selecting a mip, and we could build the full mip pyramid to prefilter and eventually limit the noise problem.
So far so good.
But why do they need to rasterize geometry? Do the UV charts we see correspond to card faces? It does not look like this.
Why do they not just precompute a world-space pos/norm and simplified material per card texel? Then no rasterization would be needed.
 
Do the UV charts we see correspond to card faces?
Thats what’s described in the video. The UV chart is an atlas of all the card faces.

Why do they not just precompute a world-space pos/norm and simplified material per card texel? Then no rasterization would be needed.

As in treating the cache as a precomputed mini-gbuffer? Well, you would still need to light those pixels and store the shaded results somewhere. Also, you would need to precompute all LOD levels and choose the right level when sampling the cache. The Nanite rasterizer gives you all that for "free".
 
The question is why is this screen space caching and accumulation even necessary when the world space surface cache already has all the bounce lighting?
I guess the answer is to provide high-frequency per-pixel GI such as the indirect shadows, though the distance at which this cache works is ridiculously low (2 meters), and it seems local model SDFs are used exactly for this screen-space cache (to provide higher-frequency geometry details).
 
Also, surfels will accumulate quite a few bounces for all geometry in the scene (including backfacing geometry) within a few tens of frames.

Thinking about this some more, GIBS won't give you multiple bounces from back-facing or offscreen geometry that has never been on screen. At best you get the direct diffuse light and shadow at the hit point. There's no surfel because the geometry has never been on screen, so no indirect lighting. This seems pretty fragile for scenes with a lot of indirect diffuse contribution from occluded geometry.

It’ll be fine for one bounce though.
 
Thinking about this some more, GIBS won't give you multiple bounces from back-facing or offscreen geometry that has never been on screen.
Indirect diffuse contribution for backfacing geometry can be picked up from the closest probes; that's how it works in Metro Exodus EE, and I would be surprised if DICE did something different considering they already have the dynamically updated probes as a fallback solution.
The surfel cache in world space is more akin to the screen-space cache in Lumen; it adds higher-frequency GI lighting and reduces noise by casting multiple rays from each surfel, spatially and temporally, and by doing importance sampling.
It would have been nice if they had had per pixel GI or AO for the smallest scale details that couldn't be picked up by surfels, though they can capture these with screen space tracing in the same manner as Lumen does.
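To make the probe-fallback idea above concrete, here is a minimal sketch of trilinearly interpolating a world-space irradiance probe grid at a point with no surfel coverage; the layout and names are assumptions, and DDGI-style probes additionally weight by normal and visibility, which is omitted here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct ProbeGrid
{
    int   dimX, dimY, dimZ;
    float spacing;                   // world-space distance between probes
    std::vector<float> irradiance;   // one scalar per probe, for simplicity

    float probe(int x, int y, int z) const
    {
        x = std::min(std::max(x, 0), dimX - 1);
        y = std::min(std::max(y, 0), dimY - 1);
        z = std::min(std::max(z, 0), dimZ - 1);
        return irradiance[(std::size_t(z) * dimY + y) * dimX + x];
    }

    // Trilinear interpolation of the eight probes surrounding a world position.
    float sample(float wx, float wy, float wz) const
    {
        const float gx = wx / spacing, gy = wy / spacing, gz = wz / spacing;
        const int   x0 = int(std::floor(gx)), y0 = int(std::floor(gy)), z0 = int(std::floor(gz));
        const float fx = gx - x0, fy = gy - y0, fz = gz - z0;

        auto lerp = [](float a, float b, float t) { return a + (b - a) * t; };

        const float c00 = lerp(probe(x0, y0,     z0),     probe(x0 + 1, y0,     z0),     fx);
        const float c10 = lerp(probe(x0, y0 + 1, z0),     probe(x0 + 1, y0 + 1, z0),     fx);
        const float c01 = lerp(probe(x0, y0,     z0 + 1), probe(x0 + 1, y0,     z0 + 1), fx);
        const float c11 = lerp(probe(x0, y0 + 1, z0 + 1), probe(x0 + 1, y0 + 1, z0 + 1), fx);
        return lerp(lerp(c00, c10, fy), lerp(c01, c11, fy), fz);
    }
};
```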
 
Indirect diffuse contribution for backfacing geometry can be picked up from the closest probes; that's how it works in Metro Exodus EE, and I would be surprised if DICE did something different considering they already have the dynamically updated probes as a fallback solution.

I don’t know. They were pretty clear about the approach to lighting opaque geometry and there was no mention of querying probes. Wouldn’t rays shot from probes have the same problem anyway? If they hit back facing geometry with no surfel present you still get just one bounce.

It would have been nice if they had had per pixel GI or AO for the smallest scale details that couldn't be picked up by surfels, though they can capture these with screen space tracing in the same manner as Lumen does.

Yeah they’re capping surfel resolution so there’s probably some per-pixel element missing here. Maybe they’re doing RT AO and shadows too.
 
They were pretty clear about the approach to lighting opaque geometry and there was no mention of querying probes.
The Lumen presentation has no mention of the surface cache and cards in general, so what?
Such presentations often omit many details; that's 100+ pages to explain in a very limited session time, after all.

Also, DICE has some "to do" notes on page 206: "And improve performance by integrating and sharing rays with surfel ray dispatch".
The part about ray integration looks like sharing of radiance between probes and surfels to me.

If they hit back facing geometry with no surfel present you still get just one bounce.
Going by this paper, dynamic ray-traced probes handle surface shading and caching in the same way as Lumen's surface cache, so I don't see how these techniques are different in this regard; both iteratively accumulate bounces across many frames by blending in the history data.

Yeah they’re capping surfel resolution so there’s probably some per-pixel element missing here. Maybe they’re doing RT AO and shadows too.
Yeah, I guess one per-pixel bounce would still add a lot of shadows and would improve the final frame look drastically. Lumen is quite limited here too and adds small details mostly with screen space.
 