I think the 'hate' phase is already over. It was very visible in comments on gaming sites after the Turing launch, but recently I don't spot it anymore. Now that ALL vendors go RT, there's not much left to 'hate' on.
It's not silly just because this time NV is finally more successful.
DXR is Microsoft's API standard, and I still think it was NV proposing it, based on OptiX. The industry has to use it on PC and Xbox, but this does not prove they are happy with it. Otherwise Xbox would not have more flexibility than we have on PC, which indicates this standard is not good enough.
"It actually is quite silly to compare PhysX and DXR. One is owned and controlled by Nvidia. The other isn't."
The comparison was not about DXR, but about overpriced first-gen RTX GPUs showcasing a feature which required cards at a price point previously reserved for the Titan to run smoothly.
"It's the first version of the API. Given how quickly DXR has made it into shipping games, it would seem this first attempt was more than good enough to get the ball rolling. It's kinda meaningless to point out that the first version of something isn't perfect."
No. DirectX is not a first version of an API; we have learned that flexibility is important. But this was totally ignored, and flexibility is much harder to add afterwards, if it's possible at all.
"See UE5, which suffers because it is innovative."
?
What I mean is: they can't apply the same LOD mechanism of Nanite to raytraced geometry, because the BVH is black-boxed.
Rebuilding the whole BVH constantly just because a patch of surface changes detail is not practical; keeping the BVH at constant high detail does not fit into memory; keeping it at constant low detail misses RT's accuracy advantage and again becomes inefficient at distance.
What we need is the ability to access and modify the BVH, even if different vendors use different data structures.
As a bonus, this opens up reusing precomputed custom hierarchies (e.g. Nanite's BVH) by converting them into the HW format at very low cost, removing BVH build costs from the graphics driver completely.
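To make the idea concrete, here is a minimal C++ sketch of such a conversion. Everything in it is hypothetical: `VendorBvhNode` stands in for whatever node layout a driver might expose, and `ClusterNode` stands in for an engine-side precomputed hierarchy (Nanite-like); no such open format exists in DXR today.

```cpp
#include <cstdint>
#include <vector>

// Engine-side precomputed hierarchy (Nanite-like cluster tree) -- hypothetical stand-in.
struct ClusterNode {
    float   aabbMin[3], aabbMax[3];
    int32_t child[2]     = {-1, -1}; // both -1 means leaf
    int32_t clusterIndex = -1;       // triangle cluster referenced by a leaf
};

// Hypothetical node layout a driver could expose if the BVH were not black-boxed.
struct VendorBvhNode {
    float   aabbMin[3], aabbMax[3];
    int32_t child[2]     = {-1, -1};
    int32_t primitiveRef = -1;
};

// Convert the engine hierarchy into the "vendor" format in a single traversal,
// instead of having the driver rebuild a BVH from raw triangles.
static int32_t ConvertNode(const std::vector<ClusterNode>& src, int32_t srcIndex,
                           std::vector<VendorBvhNode>& dst)
{
    const ClusterNode& in = src[srcIndex];
    const int32_t outIndex = static_cast<int32_t>(dst.size());
    dst.emplace_back(); // reserve the slot; children are appended by the recursion

    VendorBvhNode out{};
    for (int i = 0; i < 3; ++i) { out.aabbMin[i] = in.aabbMin[i]; out.aabbMax[i] = in.aabbMax[i]; }

    if (in.child[0] < 0) {
        out.primitiveRef = in.clusterIndex;                // leaf: reuse the cluster directly
    } else {
        out.child[0] = ConvertNode(src, in.child[0], dst); // internal node: recurse
        out.child[1] = ConvertNode(src, in.child[1], dst);
    }
    dst[outIndex] = out; // write after recursion, since dst may have reallocated
    return outIndex;
}
```

The point is that this is a linear pass over data the engine already has, whereas a driver-side rebuild has to rediscover the same structure from raw triangles every time.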
Currently DXR has no support for any LOD mechanism. Standard discrete LODs do not count, because they are no real solution: you can't use them for larger models like terrain or architecture without noticeable popping and discontinuities.
Besides visibility, LOD is one of the two key problems in computer graphics. So far the games industry has not really tried to solve it; UE5 is the first serious attempt. The shortsighted DXR design hinders progress in this direction, and imo there is no excuse for such a failure.
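For context on why discrete LODs are dismissed above: the classic scheme just picks one of N prebaked meshes by camera distance, and the instantaneous switch at each threshold is exactly where the popping comes from. A minimal sketch (thresholds and mesh count are made up):

```cpp
#include <cstddef>

// Classic discrete LOD selection: pick one of N prebaked meshes by distance.
// The hard switch at each threshold causes the visible pop; on continuous
// surfaces like terrain it also leaves seams between neighbouring patches.
std::size_t SelectDiscreteLod(float distance, const float* thresholds, std::size_t lodCount)
{
    for (std::size_t lod = 0; lod + 1 < lodCount; ++lod) {
        if (distance < thresholds[lod])
            return lod;          // still close enough for this detail level
    }
    return lodCount - 1;         // beyond all thresholds: coarsest mesh
}
```

Crossfading can hide the pop for small props, but it does nothing for the continuity of large connected surfaces like terrain, which is the case made above.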
I do not view it at all as a failure or anything. It is just the beginning of an API - an API which can evolve as hardware and techniques do too. I find it rather unsurprising that a compute-based triangle renderer has quirks that do not support a seamless integration with DXR hardware and software acceleration. Especially since discrete LOD and hardware triangle creation are aspects of rendering that have existed as the de facto standard for nearly 30 years; nearly every game uses them for the primary view (easily 95+% of produced games). And the API was designed in 2017-2018 when UE5 did not even exist yet (and UE5 is only one engine and one solution to triangle density and LOD; there can of course be others). Can you imagine the hardware that must exist to both allow for serious hardware acceleration (4-10x) and programmability at the same time? Not sure that happens often, if at all, in the graphics accelerator space.
I feel like we need to wait a bit for DXR to mature, and for more exotic ideas like Nanite to actually be proven viable for shipping and development, before I get upset that I cannot get perfect acceleration in UE5 yet.
- There's some minor allowance for LOD-ish things via ray-test flags, but what are the implications of even using this feature? How much more incoherent do I end up if my individual rays have to decide LOD? Better yet, my ray needs to scan different LODs based on distance from the ray origin (or perhaps distance from the camera), but those are LODs *in the BVH*, so how do I limit what the ray tests as it gets further away? Do I spawn multiple "sub-rays" (line segments along the ray) and give them different non-overlapping range cutoffs, each targeting different LOD masks? Is that reasonable to do, or devastatingly stupid? How does this affect my ray-intersection budget? How does this affect scheduling? Do I fire all LODs' rays for testing at the same time, or do I only fire them as each descending LOD's ray fails to intersect the scene?
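One way to read the "sub-rays with non-overlapping range cutoffs" idea above is to split the ray's [tMin, tMax) interval at fixed distance thresholds and give each segment its own LOD selector (in DXR this would presumably map to per-LOD instance masks). A plain C++ sketch of just the segmentation, with the thresholds left to the caller:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// One ray segment restricted to a single LOD: trace [tMin, tMax) against
// geometry tagged with lodMask only (e.g. one instance-mask bit per LOD level).
struct LodSegment {
    float    tMin, tMax;
    uint32_t lodMask;
};

// Split a ray's [tMin, tMax) interval at the given distance thresholds so that
// each resulting segment tests exactly one LOD and no two segments overlap.
std::vector<LodSegment> BuildLodSegments(float tMin, float tMax,
                                         const std::vector<float>& lodThresholds)
{
    std::vector<LodSegment> segments;
    float start = tMin;
    for (std::size_t lod = 0; lod < lodThresholds.size() && start < tMax; ++lod) {
        const float end = std::min(lodThresholds[lod], tMax);
        if (end > start)
            segments.push_back({start, end, 1u << lod}); // finer LODs near the origin
        start = std::max(start, end);
    }
    if (start < tMax) // everything beyond the last threshold uses the coarsest LOD
        segments.push_back({start, tMax, 1u << lodThresholds.size()});
    return segments;
}
```

Whether to fire those segments all at once or one after another, stopping at the first hit, is exactly the scheduling and ray-budget question raised above, and with the implementation black-boxed the only way to answer it is to benchmark every vendor.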
Black Box Raytracing
The API, as it is now, is a bit of a “black box” one.
There's been massive work with modern graphics APIs like Vulkan, D3D12 and partially Metal to move away from black boxes in graphics. DXR seems to be a step against that, with a bunch of "ohh, you never know! might be your GPU, might be your driver, might be your executable name lacking a quake3.exe" in it.
- What acceleration structure is used, what are the pros/cons of it, the costs to update it, memory consumption etc.? Who knows!
- How is scheduling of work done; what is the balance between lane utilization vs latency vs register pressure vs memory accesses vs (tons of other things)? Who knows!
- What sort of "patterns" the underlying implementation (GPU + driver + DXR runtime) is good or bad at? Raytracing, or path tracing, can get super bad for performance with divergent rays (while staying conceptually elegant); what and how much of that is mitigated by any sort of ray reordering, bundling, coalescing (insert N other buzzwords here)? Is that done in some parts of the hardware, or some parts of the driver, or the DXR runtime? Who knows!
- The "oh we have BVHs of triangles that we can traverse efficiently" part might not be enough. How do you do LOD? As Sebastien and Brian point out, there are quite a few open questions in that area.
It probably would be better to expose/build whatever “magics” the upcoming GPUs might have to allow people to build efficient tracers themselves. Ability to spawn GPU work from other GPU work; whatever instructions/intrinsics GPUs might have for efficient tracing/traversal/intersection math; whatever fixed function hardware might exist for scheduling, re-scheduling and reordering of work packets for improved coherency & memory accesses, etc. etc.
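To make "whatever instructions/intrinsics GPUs might have for efficient tracing/traversal/intersection math" a bit more tangible, here is the standard slab test for a ray against a BVH node's AABB, i.e. the inner loop such building blocks would accelerate. This is textbook math, not any particular vendor's implementation:

```cpp
#include <algorithm>

// Standard slab test: does origin + t * dir hit the AABB for some t in [tMin, tMax]?
// invDir holds the precomputed reciprocals of the ray direction components.
bool RayIntersectsAabb(const float origin[3], const float invDir[3],
                       const float aabbMin[3], const float aabbMax[3],
                       float tMin, float tMax)
{
    for (int axis = 0; axis < 3; ++axis) {
        float t0 = (aabbMin[axis] - origin[axis]) * invDir[axis];
        float t1 = (aabbMax[axis] - origin[axis]) * invDir[axis];
        if (t0 > t1) std::swap(t0, t1); // handle negative direction components
        tMin = std::max(tMin, t0);      // shrink the overlap interval
        tMax = std::min(tMax, t1);
        if (tMax < tMin) return false;  // slabs no longer overlap: miss
    }
    return true;
}
```

In a hand-rolled traversal this test runs for every visited node, which is why having it (and triangle intersection) as fast primitives while keeping the traversal loop itself programmable is what the quoted post is asking for.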
"The flexibility exists on consoles but is not useful for multiplatform games because of DXR on PC."
You are overstating the flexibility of console ray tracing with regard to the topic that JoeJ and I are talking about. It is still all about the hardware-rasterised triangle there on console.
Since when can consoles test hits against arbitrary primitive types that are not triangles and still get the HW speed-up? That is what the acceleration is about. The HW acceleration is doing the exact same thing there; it's just that the traversal is programmable, since traversal is not HW-accelerated on AMD. But you're still writing traversal code to test against hardware-rendered triangles.
I can find tons of other blogs and tweets of graphics developers complaining.
The flexibility exists on consoles but is not useful for multiplatform games because of DXR on PC.
"RTX is 'flexible' too. Besides, there's RDNA2 on PC as well, in its true form with all its features/power (23+ TF)."
That's why I'm looking forward to a title that relies heavily on RT on the one hand and is optimized for the additional optimization paths RDNA2 offers on the other - to see what this difference will mean for performance in practice.
"an API which can evolve as hardware and techniques do too"
That's irrelevant. All current GPUs build and maintain the BVH entirely in compute and on the CPU, so there is no reason to black-box it at all. By opening it up, the developer takes the risk of having to update their code when future GPUs change their BVH format. It's then up to the developer to take that risk or to stick with the current approach of leaving it all to the driver. Most would have decided for the latter, of course, because they have no true LOD system yet. But if they want to compete with UE5 visuals, this will change now, so it would be nice if DXR were ready.
"I find it rather unsurprising that a compute-based triangle renderer has quirks that do not support a seamless integration with DXR hardware and software acceleration."
The compute rasterizer is also totally irrelevant. It receives attention because we are used to focusing on 'realtime rendering', but that is the wrong context.
"And the API was designed in 2017-2018 when UE5 did not even exist yet"
This API is just the trivial RT API we have known for decades from offline raytracing, OptiX being just one very similar example. It was no big challenge to design it.
"To see what this difference will mean for performance in practice."
And for differences in visual quality as well.
"And for differences in visual quality as well."
If any, yes, that too.
"Can you imagine the hardware that must exist to both allow for serious hardware acceleration (4-10x) and programmability at the same time?"
Any existing RTX or RDNA2 GPU!
"I feel like we need to wait a bit for DXR to mature, and for more exotic ideas like Nanite to actually be proven viable for shipping and development, before I get upset that I cannot get perfect acceleration in UE5 yet."
But LOD is no 'exotic' idea, and it never was. It is just an open problem we need to address. I doubt we'll ever solve it in a way that serves all needs; Nanite is just a very good example, but it still has restrictions.
"So I ask you: why the fuck did they seemingly pull off the raytracing revolution in secret, not asking devs for feedback? Why did they think the simple and easy-to-use practices from offline rendering would work for realtime games too?"
Do you remember the reveal of DXR? It was demo'd on production engines (Northlight, 4A, Frostbite, Unreal) - it is not like this just popped out of nowhere and devs were not consulted at all.
Whatever the answer is, it is no excuse for the failure.
"Any existing RTX or RDNA2 GPU!"
Genuine question: how is this handled on the GTX side of things, which also supports DXR via the driver?
Like I said, they build the BVH in compute - pure software, but black-boxed.
I do not request 'traversal shaders', which only AMD could support. All I want is access to the BVH data structure, which is pretty much the most important thing in raytracing.
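To show what "access to the BVH data structure" would buy in practice, here is a sketch of a refit pass over a minimal binary BVH: when geometry deforms or a patch changes detail, the bounds are updated bottom-up in place instead of rebuilding the whole tree. The node layout is made up for illustration; real drivers keep their formats private, which is the complaint here.

```cpp
#include <algorithm>
#include <vector>

// Minimal binary BVH node -- an illustrative layout, not any vendor's real format.
struct BvhNode {
    float aabbMin[3], aabbMax[3];
    int   child[2]  = {-1, -1}; // both -1 means leaf
    int   primitive = -1;       // leaf payload (triangle/cluster index)
};

// Refit: recompute node bounds bottom-up after primitives moved or changed detail.
// getLeafBounds is a stand-in for however the application computes the current
// AABB of a primitive (callable as getLeafBounds(primitiveIndex, outMin, outMax)).
template <typename GetLeafBounds>
void Refit(std::vector<BvhNode>& nodes, int nodeIndex, const GetLeafBounds& getLeafBounds)
{
    BvhNode& node = nodes[nodeIndex];
    if (node.child[0] < 0) {                       // leaf: pull fresh primitive bounds
        getLeafBounds(node.primitive, node.aabbMin, node.aabbMax);
        return;
    }
    Refit(nodes, node.child[0], getLeafBounds);    // refit children first
    Refit(nodes, node.child[1], getLeafBounds);
    const BvhNode& a = nodes[node.child[0]];
    const BvhNode& b = nodes[node.child[1]];
    for (int i = 0; i < 3; ++i) {                  // parent bounds = union of child bounds
        node.aabbMin[i] = std::min(a.aabbMin[i], b.aabbMin[i]);
        node.aabbMax[i] = std::max(a.aabbMax[i], b.aabbMax[i]);
    }
}
```

DXR does have an opaque update path for acceleration structures, but with the structure itself hidden you cannot do the more surgical edits discussed above, such as swapping one subtree for a coarser LOD of the same patch.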
"Most of the current RT workloads are developed with a focus on what Nvidia hardware can and cannot do."
I don't believe that is true at all; even in RDNA2-optimised implementations like RE8, RDNA2 takes a bigger hit.
"The 6700XT is a bit ahead at 1080p, a bit behind at 3840p; the 2070S is a €539 card, the 6700XT €480."
Wrong metric: you don't compare based on price, which is arbitrary and subject to the competitive landscape of its respective time. You compare based on technical specs. The 6700XT is a 3070-ish level GPU; the fact that it crashed behind a 2070S is telling enough, same for the 6800. The 6800XT is barely faster than a 3060Ti, and the 6900XT is either equal to a 2080Ti or barely faster. That is just pathetic scaling.