GART: Games and Applications using RayTracing

I don't believe that is true at all. Even in an RDNA2-optimised implementation like RE8, RDNA2 takes a bigger hit.

We've had many console-optimised RT implementations: Doom Eternal, Call of Duty Cold War, Watch Dogs Legion, The Medium... and RDNA2 still falls behind Turing.

Not to mention that once you start increasing the complexity of the RT, or using multiple RT effects, the gap gets wider.


Wrong metric. You don't compare based on price, which is arbitrary and subject to the competitive landscape of its respective time; you compare based on technical specs. The 6700XT is a 3070-ish level GPU, and the fact it falls behind a 2070S in the light RT workload that Doom Eternal uses is telling enough; same for the 6800. The 6800XT is barely faster than a 3060Ti, and the 6900XT is either equal to a 2080Ti or barely faster. That is just pathetic scaling.

You don't have the console implementations on PC because PC uses DXR. And I am not sure devs will use the flexibility they have on consoles if it means creating two RT systems, one for consoles and one for PC. On consoles you have a standard BVH solution too, and I would not be surprised if all devs use that for now, and maybe forever, because they need to release their games on PC as well.

I am not sure Sony studios will use this flexibility either if they need to release titles on PC one day. Every console title will eventually come to PC.
 
Genuine question: how is this handled on the GTX side of things, which also supports DXR via the driver?
IDK, but they probably use the same BVH data structure and build/refit compute shaders.
I do know that GTX DXR is slower than other compute raytracers, so we can assume NV did not optimize the hell out of it. Still good enough. I wish AMD would do this too... :/
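
To make "refit" concrete, here is a minimal CPU-side sketch of what such a pass does, assuming a simple binary BVH where parents are stored before their children in the node array. The node layout and names are made up for illustration; this is not what NVIDIA's driver actually uses.

```cpp
// Hypothetical binary BVH node and a bottom-up "refit" pass: the tree topology
// stays fixed, only the bounding boxes are updated after geometry has moved.
#include <algorithm>
#include <cstdint>
#include <vector>

struct Aabb { float mn[3], mx[3]; };

struct BvhNode {
    Aabb    bounds;
    int32_t left  = -1;   // child indices into the node array, -1 = leaf
    int32_t right = -1;
};

Aabb merge(const Aabb& a, const Aabb& b) {
    Aabb r;
    for (int i = 0; i < 3; ++i) {
        r.mn[i] = std::min(a.mn[i], b.mn[i]);
        r.mx[i] = std::max(a.mx[i], b.mx[i]);
    }
    return r;
}

// Assumes leaf bounds were already updated from the animated vertices, and
// that every parent has a smaller index than its children, so a single
// back-to-front sweep visits children before parents. A GPU version would do
// the equivalent one tree level at a time in a compute shader.
void refit(std::vector<BvhNode>& nodes) {
    for (int i = static_cast<int>(nodes.size()) - 1; i >= 0; --i) {
        BvhNode& n = nodes[i];
        if (n.left < 0) continue; // leaf
        n.bounds = merge(nodes[n.left].bounds, nodes[n.right].bounds);
    }
}
```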
 
Wrong metric. You don't compare based on price, which is arbitrary and subject to the competitive landscape of its respective time; you compare based on technical specs. The 6700XT is a 3070-ish level GPU, and the fact it falls behind a 2070S is telling enough; same for the 6800. The 6800XT is barely faster than a 3060Ti, and the 6900XT is either equal to a 2080Ti or barely faster. That is just pathetic scaling.
I believe it is the correct metric, speaking from a consumer point of view and also given that it's a cut-down of the 2nd-largest die in its family.

But be that as it may. The topic was RDNA2 vs. Turing, not Ampere.
Then look at the 6800 vs. the 2080 Ti if you will, for which I provided an example as well. Or do you insist that I use the 6900 XT here, since it's also a 1000-EUR-class product, as the 2080 Ti was?
 
The comparison was not about DXR, but about overpriced first-gen RTX GPUs showcasing a feature which required GPUs at a price point previously reserved for the Titan to run smoothly.

Neither PhysX nor DXR was a primary driver of GPU prices.

Something looked wrong here: pay more for a feature that halves the framerate but does not look twice as good. It felt ridiculous to many, a marketing gag introduced too early to make real sense yet. Large-scale fluid simulations are ridiculous too and still haven't made it into games. To me the comparison makes sense, even if only one of the two features found adoption.

How can you say it was introduced too early when there are shipping games making practical use of RT today? You're comparing today's reality with some alternate version that either has no DXR or a more flexible DXR. I don't know why we would be better off with no RT in games today, and it's easy to say that something should be better, but how do you know that the flexibility you want is even achievable today? Where are the alternatives to DXR that prove your point?

I agree with you on large-scale fluid simulations; hardware is nowhere near powerful enough yet. At the same time, though, we still don't have good cloth simulation in games even though it's definitely within reach of today's GPUs. So yeah, it's great that proprietary PhysX lost, but the end result is that the bar remains low for everyone because nobody else picked up the torch. The long-predicted open solution never arrived. Not really something worth celebrating.

No. DirectX is not a first-version API; the lesson that flexibility is important has been learned before. But this was totally ignored, and now it is much harder to add afterwards, if it's possible at all.
The price of 'getting going with just something' is high and has long-standing consequences, already hurting progress on RT. See UE5, which suffers because it is innovative.

The price of doing nothing has even worse consequences. Would you be more satisfied if this console generation had no RT at all? How is that better for us long term?
 
I agree with you on large-scale fluid simulations; hardware is nowhere near powerful enough yet. At the same time, though, we still don't have good cloth simulation in games even though it's definitely within reach of today's GPUs. So yeah, it's great that proprietary PhysX lost, but the end result is that the bar remains low for everyone because nobody else picked up the torch. The long-predicted open solution never arrived. Not really something worth celebrating.
Yeah, I wanted to point that out as well. PhysX died, and instead of all the free and better alternatives, we're basically back to square one, with ragdoll physics running on the CPU and nobody in any hurry to create anything like what PhysX was.
 
Do you remember the reveal of DXR? It was demo'd on production engines (Northlight, 4A, Frostbite, Unreal); it is not like this just popped out of nowhere and devs were not consulted at all.

One thing I do know for an absolute fact is that some of the key people responsible for MS's API development quite literally only learned that Nanite and Lumen in UE5 existed at all on the day it was presented to the press on PS5. UE5 was actually kept secret from nearly everyone (except Sony, essentially). UE4 projects at MS, for example, had no visibility at all into UE5 up until it was publicly announced.

My impression is that selected devs were contacted at the last minute just to make some quick demos for the DXR showcase. In public, there was nothing.
Likely Epic knew about it some time before too. Maybe they complained, maybe they didn't. But it does not matter: LOD has been the elephant in the room for decades, and no matter which solutions people come up with, access to the BVH is necessary for any method I'm aware of. Next gen was coming close, so the assumption that there might be some progress on LOD was obvious, together with the assumption of increasing dynamic geometry altogether.

A nice hypothetical question about UE5 would be: 'If Epic had known about DXR and its limits long enough, would they have cancelled the work on Nanite, because its geometry is not compatible even though it's still traditional triangles?'
Personally I am in the same situation, thus my rant. I'm happy DXR was announced right before I would have started work on compute raytracing, so I dropped those plans in time. But actually I keep coming back to thinking it might have been the best compromise, although that really makes no sense.
 
How can you say it was introduced too early when there are shipping games making practical use of RT today?
Because I'm afraid the API limitations will never get fixed properly. So we have nice RT games now, but worse games than what would be possible tomorrow.
Notice this is no problem from the HW vendors' perspective: a reason to upgrade is a good thing for them, but a bad thing for developers and end users.
It's not about alternative fantasy realities, but about conflicting interests which need to be balanced. Increasing production costs require some rethinking on all ends.
 
Because I'm afraid the API limitations will never get fixed properly. So we have nice RT games now, but worse games than what would be possible tomorrow.
Notice this is no problem from the HW vendors' perspective: a reason to upgrade is a good thing for them, but a bad thing for developers and end users.
It's not about alternative fantasy realities, but about conflicting interests which need to be balanced. Increasing production costs require some rethinking on all ends.
It's the same old story again: better to have something usable now than to wait several more years for a more flexible solution to appear (or possibly never appear; see the PhysX example).
I also don't understand how having DXR1 precludes us from "fixing API limitations" in the future. Isn't that a bit like saying that DX5 precluded us from getting DX7-12?
 
Because I'm afraid the API limitations will never get fixed properly. So we have nice RT games now, but worse games than what would be possible tomorrow.

DXR doesn't appear to be fundamentally broken. In this first iteration BVHs are black-boxed for a very good reason: BVHs need to be hyper-optimized for the traversal and intersection hardware they're running on. Asking every developer to solve this problem on day one is a very bad idea. Imagine every game developer having to optimize their BVH implementation separately for AMD, Intel and Nvidia, e.g. if Nvidia hardware prefers BVH8 but AMD likes BVH4 (see the node layout sketch below).

It may be nice academically, but it would be a complete disaster for actually shipping games out the door. The IHVs would have no opportunity to optimize for their hardware, and performance would be all over the place depending on the developer's preferred platform and/or their ability to even code a proper BVH pipeline.
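
To illustrate the point (purely hypothetical layouts, not any vendor's actual format): a 4-wide and an 8-wide node don't just differ in branching factor, they differ in box encoding, child addressing and size, and each maps differently onto the hardware's intersection units and caches.

```cpp
// Two made-up wide BVH node layouts, sketching why a single API-mandated
// format would be awkward across vendors.
#include <cstdint>

struct Bvh4Node {                 // "BVH4"-style: 4 children per node
    float    boundsMin[4][3];     // full-precision AABB per child slot
    float    boundsMax[4][3];
    uint32_t child[4];            // node index, or leaf/invalid sentinel
};                                // 112 bytes, trivially decodable

struct Bvh8Node {                 // "BVH8"-style: 8 children per node
    float    origin[3];           // boxes quantized relative to a shared origin
    uint8_t  exponent[3];         // per-axis quantization scale
    uint8_t  pad;
    uint8_t  qMin[8][3];          // 8-bit quantized child AABBs
    uint8_t  qMax[8][3];
    uint32_t childBase;           // children stored contiguously from here
    uint8_t  childMeta[8];        // per-child type/offset bits
};                                // ~76 bytes, a completely different decode path
```

A driver that owns the format can pick whichever of these (or something else entirely) suits its hardware; a format baked into the API cannot.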

Notice this is no problem from the HW vendors' perspective: a reason to upgrade is a good thing for them, but a bad thing for developers and end users.
It's not about alternative fantasy realities, but about conflicting interests which need to be balanced. Increasing production costs require some rethinking on all ends.

A hypothetical future DXR version with a programmable BVH would run just fine on today's hardware. No need to upgrade.
 
Asking every developer to solve this problem on day one is a very bad idea.
That's not my request. Accessing / building / maintaining the BVH should be possible, but not required for those who won't see any benefit. So yes, DXR is not 'broken'; it just misses that essential option.
A hypothetical future DXR version with a programmable BVH would run just fine on today's hardware. No need to upgrade.
If we get it, yes. But if we don't, using more powerful HW is the usual way to achieve further progress, and I think we can't count on that as much anymore. So we need every option to optimize, and my request isn't even low-level.
 
There are even production software results where RDNA2 can't beat Ampere despite RDNA2 using h/w RT acceleration while Ampere is not (at least I think that's the case? Feel free to correct me on this one; I haven't looked into it much).
IDK either, but the primary implementation was OpenCL (so no access to intersection HW), and there is a Vulkan one (which in theory has access now).
My guess is there is still no HW acceleration, and Turing was also faster than RDNA with Radeon Rays. Maybe the benchmark lists whether the CL or VK version is used, to be sure.
 
That's not my request. Accessing / building / maintaining the BVH should be possible, but not required for those who won't see any benefit. So yes, DXR is not 'broken'; it just misses that essential option.

In order to give developers access to the BVH, the API would need to expose a well-defined BVH data structure instead of the current opaque TLAS/BLAS hierarchy (the sketch after this post shows how little the application currently sees). This in turn would impose constraints on the hardware traversal and intersection implementation, as it would need to adhere to this strict API definition and the compression/caching considerations that go along with it.

This seems quite a bit less flexible overall than the current approach where IHVs have room to innovate.
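
For reference, this is roughly what building a BLAS looks like through DXR today from the application's side: geometry descriptions go in, and an opaque blob comes out at a GPU virtual address the app is never allowed to parse. Resource creation, barriers and error handling are omitted, so treat it as a sketch rather than working setup code.

```cpp
// Sketch of a BLAS build through DXR as it exists today (D3D12): buffers go
// in, an opaque acceleration structure comes out at a GPU virtual address
// whose contents are entirely up to the driver.
#include <d3d12.h>

void BuildBlas(ID3D12Device5* device, ID3D12GraphicsCommandList4* cmdList,
               D3D12_GPU_VIRTUAL_ADDRESS vertexBuffer, UINT vertexCount,
               D3D12_GPU_VIRTUAL_ADDRESS indexBuffer, UINT indexCount,
               D3D12_GPU_VIRTUAL_ADDRESS scratchBuffer,
               D3D12_GPU_VIRTUAL_ADDRESS resultBuffer)
{
    D3D12_RAYTRACING_GEOMETRY_DESC geom = {};
    geom.Type = D3D12_RAYTRACING_GEOMETRY_TYPE_TRIANGLES;
    geom.Flags = D3D12_RAYTRACING_GEOMETRY_FLAG_OPAQUE;
    geom.Triangles.VertexBuffer.StartAddress = vertexBuffer;
    geom.Triangles.VertexBuffer.StrideInBytes = 3 * sizeof(float);
    geom.Triangles.VertexFormat = DXGI_FORMAT_R32G32B32_FLOAT;
    geom.Triangles.VertexCount = vertexCount;
    geom.Triangles.IndexBuffer = indexBuffer;
    geom.Triangles.IndexFormat = DXGI_FORMAT_R32_UINT;
    geom.Triangles.IndexCount = indexCount;

    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS inputs = {};
    inputs.Type = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
    inputs.Flags = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_TRACE;
    inputs.DescsLayout = D3D12_ELEMENTS_LAYOUT_ARRAY;
    inputs.NumDescs = 1;
    inputs.pGeometryDescs = &geom;

    // The driver reports how big the opaque result will be, but not what is in it.
    D3D12_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO prebuild = {};
    device->GetRaytracingAccelerationStructurePrebuildInfo(&inputs, &prebuild);

    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC build = {};
    build.Inputs = inputs;
    build.ScratchAccelerationStructureData = scratchBuffer; // >= prebuild.ScratchDataSizeInBytes
    build.DestAccelerationStructureData = resultBuffer;     // >= prebuild.ResultDataMaxSizeInBytes
    cmdList->BuildRaytracingAccelerationStructure(&build, 0, nullptr);
}
```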
 
It's the same old story again: better to have something usable now than to wait several more years for a more flexible solution to appear (or possibly never appear; see the PhysX example).
I also don't understand how having DXR1 precludes us from "fixing API limitations" in the future. Isn't that a bit like saying that DX5 precluded us from getting DX7-12?

A counter-example is the consoles: they have the same standard DXR in the API, multiple BVH solutions provided by the platform holder, and the possibility for everyone to customize their own BVH.

It doesn't mean AMD and Nvidia don't have their own solutions; it means more flexibility. Keep the standard solution for first-gen titles and give flexibility to the devs who need it.
 
A counter-example is the consoles: they have the same standard DXR in the API, multiple BVH solutions provided by the platform holder, and the possibility for everyone to customize their own BVH.

Is it true that developers can customize the BVH on consoles? Do you have a source for that?
 
what's stopping AMD from porting them into AGS or a proprietary VK extension if they are so helpful?
I guess interest is minor, so it's not worth the 10 minutes of work to expose that single intersection instruction >:/
Personally I'd use it, even if NV would then still need another solution...

In order to give developers access to the BVH, the API would need to expose a well-defined BVH data structure instead of the current opaque TLAS/BLAS hierarchy. This in turn would impose constraints on the hardware traversal and intersection implementation, as it would need to adhere to this strict API definition and the compression/caching considerations that go along with it.

This seems quite a bit less flexible overall than the current approach where IHVs have room to innovate.
Not my request either. I don't see a good solution for a vendor-independent BVH API, and I don't need one. Vendor extensions would be the way to go, but that's not popular within Microsoft's vision of 'standardize everything'.
So my biggest hope is Vulkan extensions. NV is usually creative here, e.g. the device-generated command buffers extension. But they already have the performance lead and likely see no need to complicate things. AMD has done nothing so far, maybe also because RDNA3 might use a different data structure.

So it's difficult and only temporary. But the current solution has zero flexibility, and nothing is less flexible than that.
To me, adding patches for each future generation would be much less work and fewer headaches than trying to achieve LOD with DXR, which is basically impossible. So I would welcome temporary extensions with open arms, until vendors start to agree on a BVH format (if they even disagree at all).
Take AMD's format, for example, which we know from the intersection instruction interface: 4 child pointers and bounding boxes. Super simple. Not as complicated as you and many here might think.
 
Is it true that developers can customize the BVH on consoles? Do you have a source for that?

There is no hardware traversal solution on consoles, no RT core. I don't see any limitation preventing devs from doing what they want. Microsoft and Sony each have their own BVH solution that devs can use; they can write their own or, more probably, customize the existing MS or Sony BVH and tailor it to their needs. Microsoft said, for example, that on Xbox Series X you can precompute the BVH offline for static geometry, stream it from the SSD, and do other optimisations on it. And all of this is out of scope on PC.

https://www.eurogamer.net/articles/digitalfoundry-2020-inside-xbox-series-x-full-specs

Andrew Goossen said:
It is important to put this into context, however. While workloads can operate at the same time, calculating the BVH structure is only one component of the ray tracing procedure. The standard shaders in the GPU also need to pull their weight, so elements like the lighting calculations are still run on the standard shaders, with the DXR API adding new stages to the GPU pipeline to carry out this task efficiently. So yes, RT is typically associated with a drop in performance and that carries across to the console implementation, but with the benefits of a fixed console design, we should expect to see developers optimise more aggressively and also to innovate. The good news is that Microsoft allows low-level access to the RT acceleration hardware.


"[Series X] goes even further than the PC standard in offering more power and flexibility to developers," reveals Goossen. "In grand console tradition, we also support direct to the metal programming including support for offline BVH construction and optimisation. With these building blocks, we expect ray tracing to be an area of incredible visuals and great innovation by developers over the course of the console's lifetime."
 
There is: they are called ray accelerators.

Ray intersection is hardware-accelerated on AMD GPUs, but BVH traversal is not. NVIDIA's hw acceleration uses the RT cores for both BVH traversal and ray intersection. Denoising and DLSS are accelerated by the Tensor Cores. A rough sketch of the difference follows below.
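
To make the distinction concrete, here is a rough shader-style sketch written in C++ syntax. Everything in the loop (the stack, pushing children, leaf handling) is ordinary shader code on RDNA2-class hardware, and only the node test corresponds to the hardware instruction; the node layout and the IntersectNode4 helper are hypothetical stand-ins, not the actual ISA interface. On Turing/Ampere, by contrast, this whole loop runs inside the RT core.

```cpp
// "Hardware intersection, software traversal" sketched in plain C++.
#include <algorithm>
#include <cstdint>

constexpr uint32_t kInvalid = 0xFFFFFFFFu; // empty child slot / no hit
constexpr uint32_t kLeafBit = 0x80000000u; // marks a leaf reference

struct Ray { float origin[3], dir[3], tMax; };

struct Node4 {                    // 4-wide box node, assumed layout
    float    boundsMin[4][3];
    float    boundsMax[4][3];
    uint32_t child[4];
};

// Stand-in for the hardware box-test instruction: intersect the ray against
// the node's 4 child boxes and report which child slots were hit. This single
// step is what the "ray accelerator" performs; here it is emulated in software.
void IntersectNode4(const Node4& n, const Ray& ray, uint32_t hit[4]) {
    for (int c = 0; c < 4; ++c) {
        float tNear = 0.0f, tFar = ray.tMax;
        for (int a = 0; a < 3; ++a) {
            float inv = 1.0f / ray.dir[a];
            float t0  = (n.boundsMin[c][a] - ray.origin[a]) * inv;
            float t1  = (n.boundsMax[c][a] - ray.origin[a]) * inv;
            if (inv < 0.0f) std::swap(t0, t1);
            tNear = std::max(tNear, t0);
            tFar  = std::min(tFar, t1);
        }
        hit[c] = (tNear <= tFar) ? n.child[c] : kInvalid;
    }
}

// The traversal loop itself: stack management, branching and leaf handling all
// run as regular shader ALU work on RDNA2. On NVIDIA this loop is what the RT
// core replaces with fixed-function logic.
void Traverse(const Node4* nodes, uint32_t root, const Ray& ray) {
    uint32_t stack[64];
    int top = 0;
    stack[top++] = root;
    while (top > 0) {
        uint32_t cur = stack[--top];
        if (cur & kLeafBit) {
            // Triangle tests and hit recording would go here (also shader code).
            continue;
        }
        uint32_t hit[4];
        IntersectNode4(nodes[cur], ray, hit);        // hardware-assisted step
        for (int i = 0; i < 4; ++i)
            if (hit[i] != kInvalid) stack[top++] = hit[i];
    }
}
```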
 