Next gen lighting technologies - voxelised, traced, and everything else *spawn*

This is really a great example to illustrate how denoising works. It's no trick or magic - it's just smoothing of a noisy and sparse input.
Well, there are different ways to do that. A Gaussian blur smooths a noisy input, but also destroys detail. There's definitely potential for smarter ways of denoising by taking various cues and using various reconstruction techniques. It's the kind of thing ML should be good at. Seeing the result of upscaling from DLAA2, I imagine ML processing of noisy lighting data combined with simple geometry cues to be potentially very good. Maybe clean, crisp lighting of a whole scene with <10% coverage good.
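To make that concrete, here is a minimal C++ sketch of the non-ML baseline such an approach would have to beat: a joint bilateral filter that smooths the noisy lighting buffer but weights neighbours by depth and normal similarity, so geometric edges survive. The buffer layout and weight constants are made up for illustration.

```cpp
// Minimal sketch of geometry-guided denoising: a joint bilateral filter over a
// noisy per-pixel lighting buffer, using depth and normal buffers as edge cues.
// Buffer layout and weight constants are illustrative, not from any engine.
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

void denoiseJointBilateral(const std::vector<Vec3>& noisy,   // noisy lighting
                           const std::vector<float>& depth,  // linear depth
                           const std::vector<Vec3>& normal,  // normalized world-space normals
                           std::vector<Vec3>& out,
                           int width, int height, int radius = 3)
{
    const float sigmaSpatial = 2.0f, sigmaDepth = 0.1f, sigmaNormal = 16.0f;
    out.resize(noisy.size());
    for (int y = 0; y < height; ++y)
    for (int x = 0; x < width;  ++x)
    {
        const int c = y * width + x;
        Vec3 sum = { 0, 0, 0 };
        float wSum = 0.0f;
        for (int dy = -radius; dy <= radius; ++dy)
        for (int dx = -radius; dx <= radius; ++dx)
        {
            const int nx = x + dx, ny = y + dy;
            if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
            const int n = ny * width + nx;
            // Spatial falloff: this part alone would just be a Gaussian blur.
            float w = std::exp(-(dx*dx + dy*dy) / (2.0f * sigmaSpatial * sigmaSpatial));
            // Geometry cues: suppress neighbours across depth discontinuities...
            const float dz = depth[n] - depth[c];
            w *= std::exp(-(dz*dz) / (2.0f * sigmaDepth * sigmaDepth));
            // ...and across normal discontinuities (creases, silhouettes).
            w *= std::pow(std::fmax(0.0f, dot(normal[n], normal[c])), sigmaNormal);
            sum.x += w * noisy[n].x; sum.y += w * noisy[n].y; sum.z += w * noisy[n].z;
            wSum += w;
        }
        out[c] = { sum.x / wSum, sum.y / wSum, sum.z / wSum };
    }
}
```

An ML denoiser would in effect learn weights like these (and better ones) from data instead of hand-tuning them.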
 
I'm not sure ML gives an advantage - the problem might be too simple to require ML. Compare a task that can already be addressed with a simple low-pass filter to, say, the task of labeling hand-drawn letters.

On the other hand the given samples can be very bad, making it difficult. In an earlier Q2 VKPT video they showed how the results would look without light sampling (next event estimation). This means rays are only shot randomly, hitting a light just by small chance.
While this would converge to the correct solution with many samples just as well, this input is not good enough to give any usable result - even the filtered image was mostly black with some flickering blotches.
Probably ML could not do much better here, or if it did, the trained model might only work in the same scene and conditions.
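For illustration, here is a small self-contained C++ sketch of that difference: a diffuse point lit by a small square lamp, estimated once with random (cosine-weighted) hemisphere rays that only hit the lamp by chance, and once with next event estimation that samples the lamp directly. The scene numbers are arbitrary; both estimators converge to the same value, but the first one is mostly zeros with rare spikes.

```cpp
// Brute force vs. next event estimation (NEE) for direct light from a small lamp.
// A diffuse point sits at the origin of a floor (normal +z); a square light of
// half-size 0.1 hangs at z = 2 and emits radiance Le downwards. Albedo = 1.
// All numbers are made up for illustration.
#include <cmath>
#include <cstdio>
#include <random>

int main()
{
    std::mt19937 rng(42);
    std::uniform_real_distribution<float> uni(0.0f, 1.0f);

    const float lightHalf = 0.1f, lightZ = 2.0f, Le = 50.0f;
    const float lightArea = (2 * lightHalf) * (2 * lightHalf);
    const int   N = 100000;

    // (a) Brute force: cosine-weighted hemisphere directions; a sample only
    //     contributes if the direction happens to pass through the lamp.
    double sumA = 0.0;
    for (int i = 0; i < N; ++i) {
        float u1 = uni(rng), u2 = uni(rng);
        float r = std::sqrt(u1), phi = 6.2831853f * u2;
        float dx = r * std::cos(phi), dy = r * std::sin(phi);
        float dz = std::sqrt(1.0f - u1);             // cos(theta); pdf = cos/pi
        if (dz <= 0.0f) continue;
        float t = lightZ / dz;                       // intersect plane z = lightZ
        if (std::fabs(dx * t) <= lightHalf && std::fabs(dy * t) <= lightHalf)
            sumA += Le;                              // BRDF(1/pi)*cos / pdf(cos/pi) = 1
    }

    // (b) Next event estimation: pick a point on the lamp (pdf = 1/area) and
    //     weight by the BRDF and the geometry term.
    double sumB = 0.0;
    for (int i = 0; i < N; ++i) {
        float lx = (uni(rng) * 2 - 1) * lightHalf;
        float ly = (uni(rng) * 2 - 1) * lightHalf;
        float d2 = lx*lx + ly*ly + lightZ*lightZ;
        float dist = std::sqrt(d2);
        float cosSurf  = lightZ / dist;              // cos at the shading point
        float cosLight = lightZ / dist;              // cos at the downward-facing lamp
        sumB += Le * (1.0f / 3.14159265f) * cosSurf * cosLight / d2 * lightArea;
    }

    std::printf("brute force: %f   NEE: %f\n", sumA / N, sumB / N);
}
```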

It seems more important to get better samples in the first place. We want a good representation of the direct lighting if it dominates. If direct lighting is absent, the indirect light is hopefully low frequency and its variance is still low enough to keep the filters working.
But there are always difficult situations where it all breaks down. Games will be designed to avoid them, and we get a new form of restriction that no kind of denoising magic alone could solve.

Intense skylight coming through small windows would be a good example of such a limitation. Simple next event estimation is inefficient because it's unaware of the small holes and where they are. So it would make sense to treat the holes similarly to area lights, which would require some kind of portal polygon data structure marking regions with a high probability of light passing through.
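A minimal sketch of that idea, assuming we can place a quad (portal) over each window by hand or in some preprocessing step: sample a point on the portal and convert the area pdf to a solid-angle pdf, so the sky radiance fetched through the window gets weighted correctly. The Portal struct and the calling convention are assumptions, not any engine's API.

```cpp
// Portal sampling sketch: treat a window opening like an area light proxy.
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };
static Vec3  sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static float len(Vec3 a)         { return std::sqrt(dot(a, a)); }

struct Portal {                 // hypothetical: a quad covering a window / hole
    Vec3 corner, edgeU, edgeV;  // one corner plus the two edge vectors
    Vec3 normal;                // facing into the room
};

// Samples a direction from shading point 'p' through the portal. Returns false
// if the portal faces away; otherwise outputs the direction and the solid-angle
// pdf needed to weight the sky radiance fetched along 'dir'.
bool samplePortal(const Portal& portal, Vec3 p, std::mt19937& rng,
                  Vec3& dir, float& pdf)
{
    std::uniform_real_distribution<float> uni(0.0f, 1.0f);
    float u = uni(rng), v = uni(rng);
    Vec3 q = { portal.corner.x + u * portal.edgeU.x + v * portal.edgeV.x,
               portal.corner.y + u * portal.edgeU.y + v * portal.edgeV.y,
               portal.corner.z + u * portal.edgeU.z + v * portal.edgeV.z };
    Vec3 d = sub(q, p);
    float dist = len(d);
    dir = { d.x / dist, d.y / dist, d.z / dist };
    float cosPortal = -dot(dir, portal.normal);      // portal must face the point
    if (cosPortal <= 0.0f) return false;
    // |cross(edgeU, edgeV)| = portal area; pdf_area = 1/area, converted to a
    // solid-angle pdf: pdf = dist^2 / (area * cosPortal).
    Vec3 c = { portal.edgeU.y*portal.edgeV.z - portal.edgeU.z*portal.edgeV.y,
               portal.edgeU.z*portal.edgeV.x - portal.edgeU.x*portal.edgeV.z,
               portal.edgeU.x*portal.edgeV.y - portal.edgeU.y*portal.edgeV.x };
    pdf = (dist * dist) / (len(c) * cosPortal);
    return true;
}
```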
Another example would be street and traffic lights at night in a city scene. With so many lights, we might want to cluster them, similarly to the Lightcuts algorithm.
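A very rough sketch of the many-lights idea, not the full Lightcuts algorithm: cluster the lights into a binary tree with summed power per node, pick one light per sample by descending proportionally to child power, and divide by the resulting probability. A real implementation would cluster by position and orientation and bound the error per cut; this only shows the tree and the sampling, and it assumes at least one light.

```cpp
// Hierarchical light selection sketch (Lightcuts-flavoured, heavily simplified).
#include <random>
#include <vector>

struct PointLight { float x, y, z; float power; };

struct LightNode {
    float power;       // total power of all lights below this node
    int   left, right; // child node indices, or -1 for a leaf
    int   lightIndex;  // valid for leaves
};

// Naive tree build: pair neighbouring lights level by level. Assumes the input
// is already roughly sorted by position (a real builder would cluster properly).
int buildLightTree(const std::vector<PointLight>& lights, std::vector<LightNode>& nodes)
{
    std::vector<int> level;
    for (int i = 0; i < (int)lights.size(); ++i) {
        nodes.push_back({ lights[i].power, -1, -1, i });
        level.push_back((int)nodes.size() - 1);
    }
    while (level.size() > 1) {
        std::vector<int> next;
        for (size_t i = 0; i + 1 < level.size(); i += 2) {
            int l = level[i], r = level[i + 1];
            nodes.push_back({ nodes[l].power + nodes[r].power, l, r, -1 });
            next.push_back((int)nodes.size() - 1);
        }
        if (level.size() & 1) next.push_back(level.back());   // odd one carries over
        level = next;
    }
    return level[0];                                           // root index
}

// Picks one light; 'weight' is 1/probability so the estimator stays unbiased.
int pickLight(const std::vector<LightNode>& nodes, int root,
              std::mt19937& rng, float& weight)
{
    std::uniform_real_distribution<float> uni(0.0f, 1.0f);
    weight = 1.0f;
    int n = root;
    while (nodes[n].left >= 0) {
        float pLeft = nodes[nodes[n].left].power / nodes[n].power;
        if (uni(rng) < pLeft) { weight /= pLeft;          n = nodes[n].left;  }
        else                  { weight /= (1.0f - pLeft); n = nodes[n].right; }
    }
    return nodes[n].lightIndex;
}
```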

I guess that after the recent leap in denoising we may not get much more out of it than what we already have. But using its variance measure as feedback to drive different sampling strategies is one interesting future direction. The idea is not new, but the realtime constraint could spur some new research that was not attractive for offline rendering.
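A small sketch of what such feedback could look like, assuming the denoiser already tracks per-pixel luminance moments (as SVGF-style filters do): pixels with a high relative variance get extra rays next frame. The moment buffers and the budget heuristic are illustrative only.

```cpp
// Variance-guided sample allocation: mean = E[x], meanSq = E[x^2] of per-pixel
// luminance, accumulated by the denoiser; returns a per-pixel ray budget.
#include <algorithm>
#include <cmath>
#include <vector>

std::vector<int> extraRaysPerPixel(const std::vector<float>& mean,
                                   const std::vector<float>& meanSq,
                                   int baseRays, int maxExtraRays)
{
    std::vector<int> budget(mean.size());
    for (size_t i = 0; i < mean.size(); ++i) {
        float variance = std::max(0.0f, meanSq[i] - mean[i] * mean[i]);
        // Relative standard deviation: bright, stable pixels need no help,
        // dark or flickering pixels do.
        float relStdDev = std::sqrt(variance) / (mean[i] + 1e-4f);
        float t = std::min(1.0f, relStdDev);           // crude mapping to [0,1]
        budget[i] = baseRays + (int)std::round(t * maxExtraRays);
    }
    return budget;
}
```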

Seeing the result of upscaling from DLAA2
It's too proprietary for me to draw conclusions. I assume most of the better results come from the added temporal jittering, so the high-res information is there in a still image. That's neither new nor machine learning just because it uses Tensor cores.
Results are good and they can call it whatever they want.

But it's not that I don't see any use for ML in games in general. It would be awesome if it could add detail to the wood in Dreams, for example.
 
Raytracing in Games VI: How GPUs accelerate rays
April 13, 2020
tl;dr: Real-time ray tracing in games is gaining momentum, and the new Xbox Series X and PlayStation 5 game consoles are almost certain to give it its breakthrough. But how do GPUs actually speed up the calculation of rays? A look at the technology provides answers.
...
The BVH algorithm requires good access to the cache and to VRAM at high bandwidth. As a separate unit, like the RT core at Nvidia, this means extra die area for an additional memory connection. AMD describes that the texture processor responsible for fetching textures during shading already has an optimal connection to memory. This makes it well suited to running the BVH algorithm memory-efficiently. The second step of ray tracing, the intersection determination, is then carried out by a new computing unit, the so-called "Ray Intersection Engine". This is a fixed-function unit, i.e., as with Nvidia, an ASIC, which is integrated into the texture processor. This unit could be the one that the Xbox developer called a dedicated computing unit for ray tracing.

AMD's approach differs from Nvidia's in two ways. For one thing, the new computing unit performs only one instead of two steps of the ray tracing pipeline. As a result, this area can be smaller, saving chip space. On the other hand, scheduling, i.e., the decision as to which steps are taken and when, is carried out to a greater extent by the shader unit. The texture processor only ever handles a ray's traversal of one box level of the BVH and delivers the result (a ray intersection, or further boxes that need to be traversed) back to the shader unit. This is done in the same way that textures have always been delivered, and it can therefore continue to use the existing shader infrastructure. After that, the shader unit decides what steps to take next, giving developers better control over the computing effort to be applied to ray tracing. The shader unit also determines how many secondary rays are to be traced.
https://www.computerbase.de/2020-04/raytracing-in-spielen-gpu-beschleunigung/
 
The texture processor only ever handles a ray's traversal of one box level of the BVH and delivers the result (a ray intersection, or further boxes that need to be traversed) back to the shader unit. This is done in the same way that textures have always been delivered, and it can therefore continue to use the existing shader infrastructure.

Phrasing it this way, maybe the BVH is not defined / restricted by hardware at all. If all AMD does for RT is providing box and triangle intersections, we would get full flexibility, and DXR would be just an example implementation.
...sounds too good to be true, and too hard to believe performance could be close to NV.
The console reveals have not satisfied my curiosity yet. Still knowing nothing... :)
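For what it's worth, here is a sketch of what "the shader unit decides what steps to take next" could look like if only the box and triangle tests were fixed-function: the traversal loop, the stack and the ordering stay in (shader-style) code. The intersection routines below are plain C++ stand-ins for such hardware instructions, and the node layout with one triangle per leaf is an assumption for illustration only.

```cpp
// Shader-managed BVH traversal with fixed-function-style box/triangle tests.
#include <cmath>
#include <utility>
#include <vector>

struct Ray  { float ox, oy, oz, dx, dy, dz, tMax; };
struct AABB { float min[3], max[3]; };
struct Tri  { float v0[3], v1[3], v2[3]; };

struct BVHNode {          // binary BVH, one triangle per leaf for brevity
    AABB bounds[2];       // bounds of the two children
    int  child[2];        // >= 0: inner node index, < 0: ~child = triangle index
};

// Stand-in for a hardware ray/box test (slab method).
static bool intersectAABB(const Ray& r, const AABB& b, float& tNear)
{
    float t0 = 0.0f, t1 = r.tMax;
    const float o[3] = { r.ox, r.oy, r.oz }, d[3] = { r.dx, r.dy, r.dz };
    for (int a = 0; a < 3; ++a) {
        float inv = 1.0f / d[a];
        float tA = (b.min[a] - o[a]) * inv, tB = (b.max[a] - o[a]) * inv;
        if (tA > tB) std::swap(tA, tB);
        if (tA > t0) t0 = tA;
        if (tB < t1) t1 = tB;
        if (t0 > t1) return false;
    }
    tNear = t0;
    return true;
}

// Stand-in for a hardware ray/triangle test (Moeller-Trumbore).
static bool intersectTriangle(const Ray& r, const Tri& t, float& tHit)
{
    float e1[3], e2[3], s[3];
    for (int i = 0; i < 3; ++i) { e1[i] = t.v1[i] - t.v0[i]; e2[i] = t.v2[i] - t.v0[i]; }
    const float d[3] = { r.dx, r.dy, r.dz }, o[3] = { r.ox, r.oy, r.oz };
    for (int i = 0; i < 3; ++i) s[i] = o[i] - t.v0[i];
    const float p[3] = { d[1]*e2[2]-d[2]*e2[1], d[2]*e2[0]-d[0]*e2[2], d[0]*e2[1]-d[1]*e2[0] };
    const float det = e1[0]*p[0] + e1[1]*p[1] + e1[2]*p[2];
    if (std::fabs(det) < 1e-8f) return false;
    const float inv = 1.0f / det;
    const float u = (s[0]*p[0] + s[1]*p[1] + s[2]*p[2]) * inv;
    if (u < 0.0f || u > 1.0f) return false;
    const float q[3] = { s[1]*e1[2]-s[2]*e1[1], s[2]*e1[0]-s[0]*e1[2], s[0]*e1[1]-s[1]*e1[0] };
    const float v = (d[0]*q[0] + d[1]*q[1] + d[2]*q[2]) * inv;
    if (v < 0.0f || u + v > 1.0f) return false;
    tHit = (e2[0]*q[0] + e2[1]*q[1] + e2[2]*q[2]) * inv;
    return tHit > 0.0f && tHit < r.tMax;
}

// The programmable part: the "shader" owns the stack and decides what to do
// next, so it could stop early, skip subtrees, or use its own BVH layout.
float traceClosest(const Ray& ray, const std::vector<BVHNode>& nodes,
                   const std::vector<Tri>& tris)
{
    float closest = ray.tMax;
    int stack[64], sp = 0;
    stack[sp++] = 0;                               // start at the root node
    while (sp > 0) {
        const BVHNode& n = nodes[stack[--sp]];
        for (int c = 0; c < 2; ++c) {
            float tNear;
            if (!intersectAABB(ray, n.bounds[c], tNear) || tNear > closest)
                continue;                          // box missed or already occluded
            if (n.child[c] >= 0) {
                stack[sp++] = n.child[c];          // push the inner child
            } else {
                float tHit;                        // leaf: test its triangle
                if (intersectTriangle(ray, tris[~n.child[c]], tHit) && tHit < closest)
                    closest = tHit;
            }
        }
    }
    return closest;                                // == ray.tMax means "no hit"
}
```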
 

Mark Cerny talked about the ray-AABB intersection test and the ray-polygon intersection test being accelerated in RDNA2 as well.

And Andrew Goossen (Xbox) said the BVH can be created and optimized offline, at least on console.
 
Yeah, there are still many hints around that point towards more flexibility.
And Andrew Goossen (Xbox) said the BVH can be created and optimized offline, at least on console.
But this is not one of them. I expect this to become (or it should become) a DXR feature in any case.
Similar to the pipeline cache in Vulkan, vendors could implement their own 'save black box data to disk' methods. Or better, hand out a blob of memory for the BVH per piece of geometry so RT becomes friendly for streaming.
Actually I wonder why this does not exist already (which could be the case, but IDK).
 

I was tricked into thinking he was talking about custom BVHs.

https://www.eurogamer.net/articles/digitalfoundry-2020-inside-xbox-series-x-full-specs

"[Series X] goes even further than the PC standard in offering more power and flexibility to developers," reveals Goossen. "In grand console tradition, we also support direct to the metal programming including support for offline BVH construction and optimisation. With these building blocks, we expect ray tracing to be an area of incredible visuals and great innovation by developers over the course of the console's lifetime."
 
How easy is it to combine dynamic objects with static ones in a BVH? And how well could a BVH be streamed in the case of something like GTA streaming precomputed city data?
 
To my understanding, dynamic objects require their own dynamic BVHs, which are completely separate from the static BVH for the scene.
 
How do you trace reflections of static scenery and dynamic objects? Let's say there's a window. How do you reflect the players and the scenery behind them? Surely that requires two RT passes?
 
I have no clue; you need someone with more understanding of how they work for that.
 
Isn't an AABB just a box containing some geometry, as in the lowest level of a BVH?

It can be used for primitives other than triangles. I imagine it can help if people use primitives other than triangles for the lighting of a scene, for example in voxel-based GI or point-based global illumination, or in games like Dreams that use primitives other than triangles, such as signed distance fields or voxels.
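To illustrate why a ray/AABB test helps beyond triangles, here is a small C++ sketch: wrap a custom primitive (an analytic sphere standing in for an SDF brick or voxel block) in a box, let the cheap box test reject most rays, and only run the custom intersection on a box hit. Shapes and layout are made up for illustration.

```cpp
// AABB as a bounding proxy for a non-triangle primitive.
#include <algorithm>
#include <cmath>
#include <utility>

struct Ray    { float o[3], d[3]; float tMax; };   // d assumed normalised
struct AABB   { float min[3], max[3]; };
struct Sphere { float c[3], r; };

// Cheap box test (slab method) - the part a hardware ray/AABB unit would do.
bool hitAABB(const Ray& ray, const AABB& b)
{
    float t0 = 0.0f, t1 = ray.tMax;
    for (int a = 0; a < 3; ++a) {
        float inv = 1.0f / ray.d[a];
        float tA = (b.min[a] - ray.o[a]) * inv, tB = (b.max[a] - ray.o[a]) * inv;
        if (tA > tB) std::swap(tA, tB);
        t0 = std::max(t0, tA);  t1 = std::min(t1, tB);
        if (t0 > t1) return false;
    }
    return true;
}

// Custom primitive intersection, only run when the bounding box was hit.
bool hitSphere(const Ray& ray, const Sphere& s, float& t)
{
    float oc[3] = { ray.o[0]-s.c[0], ray.o[1]-s.c[1], ray.o[2]-s.c[2] };
    float b = oc[0]*ray.d[0] + oc[1]*ray.d[1] + oc[2]*ray.d[2];
    float c = oc[0]*oc[0] + oc[1]*oc[1] + oc[2]*oc[2] - s.r*s.r;
    float disc = b*b - c;
    if (disc < 0.0f) return false;
    t = -b - std::sqrt(disc);
    return t > 0.0f && t < ray.tMax;
}

bool hitCustomPrimitive(const Ray& ray, const AABB& bounds, const Sphere& s, float& t)
{
    return hitAABB(ray, bounds) && hitSphere(ray, s, t);
}
```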
 
How easy is it to combine dynamic objects with static ones in a BVH? And how well could a BVH be streamed in the case of something like GTA streaming precomputed city data?
It is (or would be) very easy. Not sure if what I say next fits 100% with DXR / RTX, but I assume so:

Divide the static world into chunks of geometry, build an optimized BVH for each offline on the CPU, and store it on disk. (On PC: during game install on the client or at level load, build the per-vendor BVH once per object and store it on disk for the next play session, perhaps.)
Those chunks, together with rigid dynamic objects like cars, become bottom level acceleration structures.
Same for dynamic skinned characters, but they require refitting the BVH to the deformed geometry on the GPU each frame.

After uploading new chunks to the GPU and removing old ones, build the top level AS over the whole scene on the GPU each frame. The number of chunks, cars and characters should be something like 1000, so it is not very costly to build a tree over that.
We may also need to transform the BLAS, or we store transforms in the nodes and transform the rays instead.
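A minimal CPU-side sketch of that top-level build, with illustrative types rather than the actual DXR API: each instance points at a prebuilt bottom-level structure and carries a transform plus world-space bounds, and a small BVH is rebuilt over those bounds every frame by median split, which stays cheap for ~1000 instances (assumed non-empty).

```cpp
// Per-frame top-level build over chunk / car / character instances.
#include <algorithm>
#include <cstdint>
#include <vector>

struct AABB { float min[3], max[3]; };

struct Instance {
    uint64_t blas;          // handle of the prebuilt bottom-level structure
    float    transform[12]; // 3x4 object-to-world matrix
    AABB     worldBounds;   // bounds of the transformed BLAS
};

struct TlasNode {
    AABB bounds;
    int  left = -1, right = -1;  // child node indices for inner nodes
    int  instance = -1;          // instance index for leaves
};

static AABB merge(const AABB& a, const AABB& b)
{
    AABB r;
    for (int i = 0; i < 3; ++i) {
        r.min[i] = std::min(a.min[i], b.min[i]);
        r.max[i] = std::max(a.max[i], b.max[i]);
    }
    return r;
}

// Recursive median-split build over the instance indices in [begin, end).
static int buildRange(std::vector<TlasNode>& nodes, const std::vector<Instance>& inst,
                      std::vector<int>& order, int begin, int end)
{
    TlasNode node;
    node.bounds = inst[order[begin]].worldBounds;
    for (int i = begin + 1; i < end; ++i)
        node.bounds = merge(node.bounds, inst[order[i]].worldBounds);

    if (end - begin == 1) {                       // leaf: one instance
        node.instance = order[begin];
        nodes.push_back(node);
        return (int)nodes.size() - 1;
    }
    int axis = 0;                                  // split the widest axis
    float ext[3];
    for (int i = 0; i < 3; ++i) ext[i] = node.bounds.max[i] - node.bounds.min[i];
    if (ext[1] > ext[axis]) axis = 1;
    if (ext[2] > ext[axis]) axis = 2;
    int mid = (begin + end) / 2;
    std::nth_element(order.begin() + begin, order.begin() + mid, order.begin() + end,
        [&](int a, int b) {
            return inst[a].worldBounds.min[axis] + inst[a].worldBounds.max[axis]
                 < inst[b].worldBounds.min[axis] + inst[b].worldBounds.max[axis];
        });
    node.left  = buildRange(nodes, inst, order, begin, mid);
    node.right = buildRange(nodes, inst, order, mid, end);
    nodes.push_back(node);
    return (int)nodes.size() - 1;
}

// Called once per frame after streaming chunks in/out and refitting skinned BLASes.
int rebuildTlas(const std::vector<Instance>& instances, std::vector<TlasNode>& nodes)
{
    nodes.clear();
    std::vector<int> order(instances.size());
    for (int i = 0; i < (int)instances.size(); ++i) order[i] = i;
    return buildRange(nodes, instances, order, 0, (int)instances.size());
}
```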

How do you trace reflections of static scenery and dynamic objects? Let's say there's a window. How do you reflect the players and the scenery behind them? Surely that requires two RT passes?

The only difference between static and dynamic stuff is the necessary refitting for animation, but while tracing there are no issues with reflections from dynamic onto static objects, and there is no need for differing code paths for static / dynamic stuff at all (if we want to keep things simple).
Or do you mean needing two rays for reflection and refraction? This can be avoided too, with the usual stochastic approach of randomly deciding to do only one of them per sample and later combining multiple samples for an approximate solution. (Also useful for diffuse vs. specular.)
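A tiny sketch of that stochastic choice, assuming Schlick's Fresnel approximation: pick reflection or refraction with a probability close to the Fresnel term and divide by that probability, so averaging many samples converges to tracing both branches.

```cpp
// Randomly choose reflection OR refraction per sample, unbiased via weighting.
#include <cmath>
#include <random>

struct SampleChoice {
    bool  reflect;   // true: trace the reflection ray, false: the refraction ray
    float weight;    // multiply the traced radiance by this
};

float schlickFresnel(float cosTheta, float f0)
{
    return f0 + (1.0f - f0) * std::pow(1.0f - cosTheta, 5.0f);
}

SampleChoice chooseReflectionOrRefraction(float cosTheta, float f0, std::mt19937& rng)
{
    std::uniform_real_distribution<float> uni(0.0f, 1.0f);
    float F = schlickFresnel(cosTheta, f0);          // e.g. f0 ~ 0.04 for glass
    // Contribution of the branches is F * L_reflect and (1 - F) * L_refract.
    // Pick roughly proportionally to F, clamped so neither branch is starved
    // (avoids rare, very bright "firefly" samples), then divide by the pick
    // probability to keep the average unbiased.
    float pReflect = std::min(0.9f, std::max(0.1f, F));
    if (uni(rng) < pReflect)
        return { true,  F / pReflect };
    else
        return { false, (1.0f - F) / (1.0f - pReflect) };
}
```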
 
And they aren't on Turing?


Geometry is rarely 100% static.

But it's often MANY% static. Some engines might choose to use two separate BVHs and traverse both simultaneously: a super tight, optimized one baked offline, plus a slightly less optimal to traverse but quick to build one, generated in real time just for the dynamic portion of the scene. The speedup might offset the overhead.
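A sketch of how tracing against two BVHs could look at the top level; the actual traversals are passed in as callables, and everything here is generic rather than any particular API:

```cpp
// Trace the same ray through the baked static BVH and the per-frame dynamic
// BVH, and keep whichever hit is closer.
#include <functional>

struct Ray { float o[3], d[3]; float tMax; };
struct Hit { float t; int primitive; bool valid; };

Hit traceTwoBVHs(const Ray& ray,
                 const std::function<Hit(const Ray&)>& traceStatic,
                 const std::function<Hit(const Ray&)>& traceDynamic)
{
    // Trace the static world first, then let the dynamic trace use the current
    // closest distance as its far limit so it can exit early.
    Hit hs = traceStatic(ray);
    Ray clipped = ray;
    if (hs.valid) clipped.tMax = hs.t;
    Hit hd = traceDynamic(clipped);
    return (hd.valid && (!hs.valid || hd.t < hs.t)) ? hd : hs;
}
```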
 