AMD Radeon RDNA2 Navi (RX 6500, 6600, 6700, 6800, 6900 XT)

Well that’s the thing. We need tangible evidence that flexibility produces better results today than the “limited and inefficient” DXR guardrails. Until then it’s just wishful thinking. If AMD doesn’t provide that evidence then yes, we can only hope someone else does.

My view on this is simple. We already have extremely flexible RT implementations (x86, CUDA), so the experts clearly understand the trade-offs of software vs hardware RT. If greater flexibility were the best option today then DXR would have been designed with that in mind.

I always go for seeing is believing, and we have not seen it yet from AMD/RDNA2. That doesn't mean they won't get there, but not until RDNA3 at least. Use of RT is going to eat into performance, and since the consoles don't have enough of it, it remains to be seen what kind of RT we will get. I think what Spider-Man, Ratchet etc. have shown is about what to expect. There's a reason DS doesn't use it.
 
Lol I would hardly give AMD’s documentation credit for the incredible imagination, talent and hard work of console devs and artists.



Well that’s the thing. We need tangible evidence that flexibility produces better results today than the “limited and inefficient” DXR guardrails. Until then it’s just wishful thinking. If AMD doesn’t provide that evidence then yes, we can only hope someone else does.

My view on this is simple. We already have extremely flexible RT implementations (x86, CUDA), so the experts clearly understand the trade-offs of software vs hardware RT. If greater flexibility were the best option today then DXR would have been designed with that in mind.

I think this will come from titles designed around next-generation consoles.
 
I think this will come from titles designed around next-generation consoles.

If we look at the outgoing generation, the interesting thing is that we shifted away from a static generation with the mid-life v2 consoles.

It could very well be that we see similar v2s this upcoming generation as well, which may have both an altered approach and priority with respect to ray tracing.
 
Ampere is twice as fast in compute performance.

Which Ampere is twice as fast as which Navi 21 card in which real world benchmark that measures compute performance?
Perhaps you should send an e-mail to the developers of SiSoft Sandra, LuxMark, IndigoBench, Geekbench, Passmark and Blender to tell them they're all probably using some super buggy code that puts Navi 21 cards at near parity with GA102 ones.

Or perhaps you're being tricked by nvidia's spec sheet for maximum FLOP throughput on Ampere, which means very little considering it's getting much lower ALU utilization than RDNA2 or even Turing.
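
To put numbers on that spec-sheet argument, here's a quick back-of-the-envelope sketch (using the public ALU counts and boost clocks for the RTX 3090 and RX 6900 XT; peak figures say nothing about sustained utilization, which is the whole point):

```cpp
#include <cstdio>

// Theoretical peak FP32 = ALU lanes * 2 (one FMA = 2 flops) * clock (GHz).
// These are paper numbers only; sustained ALU utilization is the catch.
int main() {
    const double ga102Lanes = 10496.0, ga102Ghz = 1.70;   // RTX 3090
    const double navi21Lanes = 5120.0, navi21Ghz = 2.25;  // RX 6900 XT

    const double ga102Peak  = ga102Lanes  * 2.0 * ga102Ghz  / 1000.0;
    const double navi21Peak = navi21Lanes * 2.0 * navi21Ghz / 1000.0;

    printf("GA102 peak:  %.1f TFLOPS\n", ga102Peak);   // ~35.7
    printf("Navi21 peak: %.1f TFLOPS\n", navi21Peak);  // ~23.0
    // ~55% faster on paper, yet compute benchmarks land near parity once
    // Ampere's lower real-world ALU utilization is accounted for.
    return 0;
}
```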




Flexibility helps you work smarter by doing less work for the same result, or by doing things that are simply impossible due to the constraints of hardware solutions. I suspect there isn’t much room to do the former, as DXR seems to provide decent enough support for skipping work where it’s not needed.

The real benefits would be in doing even more advanced RT with more complex data structures and more detailed geometry. But that would be dog slow on today’s compute hardware anyway so it’s a moot point.
You're drawing a whole lot of conclusions from very little information.
Why is more flexibility only useful for more complex data structures?

This isn't the scientific compute market, where effective throughput of a certain type of calculation is the determining factor.
What matters is how power/cost-effective an architecture is at producing XYZ visual results, and for GPUs flexibility has historically played a huge part in that.
Take for example the first medium-range chip with a unified shader architecture, the G84. When the 8600GTS came out it ran considerably behind the 7900GTX due to half the memory bandwidth and much lower fillrate, but when Crysis came out it was almost trading blows with the former high-end.


I've seen time and again developers claiming that nvidia's greatest problem with their RT approach is that it's "too precise" for real-time raytracing.
AMD's approach could simply be an answer to that, and it proposes to do less but smarter.
 
If modern means chips introduced this very year.
The trend is such indeed, but the interesting question is how far it will reach.
Brute force (e.g. the "Psycho" setting in Cyberpunk 2077), which seems to be what we're seeing in existing games, is limited by bandwidth. We've gone from 0 ray tracing to 11 in about 2 years, and there's no more bandwidth on the horizon.

So the algorithms are going to have to be smarter and under the control of developers. I guess that custom BVH nodes and custom traversal algorithms will be required. It's hard to tell if the available hardware supports those ideas, but I have my fingers crossed.
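
To make "custom nodes and custom traversal" concrete, here's a minimal CPU-side sketch (all types and names hypothetical, not any vendor's actual API) of a developer-owned traversal loop; on RDNA2 the box test is roughly what the intersection instructions would accelerate, while the loop itself stays in compute:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical node layout: because traversal runs in our own code, the
// node could just as well carry LOD info, SDF bricks, or compressed bounds.
struct Aabb { float lo[3], hi[3]; };
struct Node {
    Aabb    bounds;
    int32_t left, right;   // child indices; right < 0 marks a leaf
    int32_t prim;          // primitive index, valid in leaves
};
struct Ray { float org[3], invDir[3], tMax; };

// Slab test against one AABB -- the part RDNA2's intersection
// instructions would speed up.
static bool hitAabb(const Aabb& b, const Ray& r) {
    float tNear = 0.0f, tFar = r.tMax;
    for (int a = 0; a < 3; ++a) {
        float t0 = (b.lo[a] - r.org[a]) * r.invDir[a];
        float t1 = (b.hi[a] - r.org[a]) * r.invDir[a];
        if (t0 > t1) { float t = t0; t0 = t1; t1 = t; }
        if (t0 > tNear) tNear = t0;
        if (t1 < tFar)  tFar  = t1;
        if (tNear > tFar) return false;
    }
    return true;
}

// Any-hit style traversal with an explicit stack. A developer-owned loop
// can reorder children, cull by LOD, or swap the acceleration structure
// entirely -- the flexibility being asked for above.
int traverseAnyHit(const std::vector<Node>& nodes, const Ray& ray) {
    int stack[64];
    int sp = 0;
    stack[sp++] = 0;  // root at index 0
    while (sp > 0) {
        const Node& n = nodes[stack[--sp]];
        if (!hitAabb(n.bounds, ray)) continue;
        if (n.right < 0) return n.prim;  // leaf: report the candidate
        stack[sp++] = n.left;
        stack[sp++] = n.right;
    }
    return -1;  // no leaf bounds intersected
}
```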
 
You're drawing a whole lot of conclusions from very little information.
Why is more flexibility only useful for more complex data structures?

These concepts aren’t unique to raytracing; they’re the same fundamental trade-offs as for any algorithm.

This isn't the scientific compute market, where effective throughput of a certain type of calculation is the determining factor. What matters is how power/cost-effective an architecture is at producing XYZ visual results

Exactly. Flexibility costs transistors, which cost area and power. For the same transistor budget, flexibility is only a net win when it allows you to work smarter or do new things. AMD has shown us neither so far.

I've seen time and again developers claiming that nvidia's greatest problem with their RT approach is that it's "too precise" for real-time raytracing.
AMD's approach could simply be an answer to that, and it proposes to do less but smarter.

Source? Sorry, but that sounds silly given Nvidia has been promoting very low ray counts, with denoising and DLSS making up the difference. What exactly is “too precise” about that?
 
This has already been answered. All modern GPUs have RT h/w in them.
This answer is much more general than my question.
NV has a complete FF implementation of classical RT, including a fixed BVH structure and traversal.
AMD only has intersection of boxes and triangles. Traversal is seemingly entirely compute, so we could do our own and even use different acceleration structures, ignoring DXR.
Because of that I may have called AMD 'compute RT' and NV 'fixed function', which of course is off but makes sense to me personally.

But what about performance? You need a nice balance between speed and flexibility. Being flexible but with very low performance, or, kind of worse, having no way to tap into this flexibility (hello DXR?), is not a good thing...
That's the big question. Of course the missing traversal HW is a drawback, and it explains why NV is faster in current games.
But this could change. This time I'll use my own WIP instead of UE5 as an example:
AMD: I can use the same BVH for both GI and RT. I never need to build the BVH - instead I stream it with the asset. I can implement stochastic LOD, or progressive meshes.
NV: I need to build the BVH after asset load on the CPU. If geometry is very detailed, that's a huge and unsteady workload. I can't do LOD either, so probably limiting RT to some fixed distance around the camera ends up most practical.
I assume that in a situation like this, performance on AMD would be much better. Traversal being slow is no big problem, because I can use LOD to adapt if necessary.
I also think there is enough platform support to have per-vendor RT implementations in such cases.
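
A rough sketch of the asset-pipeline difference described above (file format and names entirely hypothetical): with a vendor-agnostic node layout the BVH can be baked offline and streamed with the asset, whereas an opaque driver format forces a rebuild after every asset load:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical packed node, stored on disk exactly as traversal consumes it.
struct PackedNode { float bounds[6]; int32_t child[2]; };

// Compute-traversal path as described: the BVH was built offline and
// shipped with the asset, so "building" at load time is just a file read.
std::vector<PackedNode> loadStreamedBvh(const char* path) {
    std::vector<PackedNode> nodes;
    if (FILE* f = fopen(path, "rb")) {
        fseek(f, 0, SEEK_END);
        long bytes = ftell(f);
        fseek(f, 0, SEEK_SET);
        nodes.resize(bytes / sizeof(PackedNode));
        fread(nodes.data(), sizeof(PackedNode), nodes.size(), f);
        fclose(f);
    }
    return nodes;  // ready to traverse immediately, no build step
}

// Opaque path as described: triangles must be handed to a runtime build
// whose cost grows with geometric detail (in DXR this is the
// acceleration-structure build; only stubbed here for contrast).
void buildDriverBvh(const float* tris, size_t triCount) {
    (void)tris; (void)triCount;  // ... heavy per-load work goes here ...
}
```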
 
NV has a complete FF implementation of classical RT, including a fixed BVH structure and traversal.
What's "FF" about a MIMD processor?

AMD only has intersection of boxes and triangles.
Which is what NV also has and Intel will likely have too.

Traversal is seemingly entirely compute
Traversal is entirely compute everywhere; the difference is in what this compute is running on and what is exposed to APIs and developers.

Because of that I may have called AMD 'compute RT' and NV 'fixed function', which of course is off but makes sense to me personally.
It's off by so much that it doesn't make sense at all. What would you call DXR running on Pascals?
 
This has already been answered. All modern GPUs have RT h/w in them.

Which is why triangle RT is, well, a dead end. Look at the performance it's getting: last-gen base games with a single raytracing effect on consoles, or a $1200 GPU with the best tracing around brought down to 30 fps, also in a limited title.

And it's not like it's "brand new" anymore. It's been around for 2 years now. Research has been done, best practices are getting established, and it's still a massive performance hog.

I am worried about RDNA2 consumer hardware; even without RT a 6900 fares no better than a 6800 (XT) in Cyberpunk (cache problems). And I love the idea of tracing. "It just works" is a great thing for devs to have once it's set up. But you can do tracing with specialized HW (the reason GPUs have it at all is for "realtime" and API limits to do with cross-vendor friendliness), or you can do tracing in compute. Tracing that's SIMD-friendly, that's fully optimized across all vendors and platforms.

HW RT is a shiny dead-end lure put up by Nvidia. Dreams can do traced global illumination on a PS4. There's no reason every dev can't do tracing with the same efficiency, no special HW needed. But they've been lured in by vague promises. Cyberpunk would've been much better off putting that R&D time elsewhere; it could've run and looked better on every platform. But it's only the beginning of this gen. And if triangle meshes are dead for primary rays (see UE5, Dreams, sebbie's project) in favor of compute, raster HW be damned, there's no reason incoherent secondary rays can't do the same thing, ray tracing HW be damned (see UE5, Claybook, Control, The Last of Us 2, etc.)
 
@Frenetic Pony


It looks like HW RT still has its uses, even in the case of games like Dreams.

Edit: I have no doubt that we're entering a period where we'll have more solutions exploring ideas outside of the traditional vertex shader pipeline, but the transition for the industry will take longer than generational changes in gpu hardware (two years). By the time the next Nvidia and AMD architectures come out in 2022, the industry will still be doing some form of triangle meshes with a hybrid DXR/Vulkan RT pipeline. Those gpus will most likely have even bigger HW improvements for real-time RT. The first generation of UE5 games are probably two to three years away. I don't see Unity having an alternative to their triangle mesh pipeline for a long time, because they'll need to build a whole infrastructure of tools around it. Not sure if Ubi, EA etc. are working on anything in secret. We're a LONG way (generations of hw improvements) from switching to a purely generalized compute processor without some form of hw acceleration for specific fixed functions.

Edit2: Here's Teardown running on my rtx3080 at 1080p. Pretty much 100% gpu usage, and I'm getting around 90 fps while just looking at trees and nothing much else is happening. And that's a voxel graphics style that's not suited to most games. Even generalized compute-based approaches to ray tracing, or cone tracing, are going to be expensive.

[Screenshot: Teardown at 1080p on an RTX 3080]
 
We're a LONG way (generations of hw improvements) from switching to a purely generalized compute processor without some form of hw acceleration for specific fixed functions.

Since the whole discussion has its roots in consoles: those 10 TF machines are not going to provide enough horsepower to deliver next-gen graphics and meaningful ray tracing anyway. Like many techheads have said, it will be rather subtle, since the hw isn't powerful enough. For now NV's solution makes sense for performance. It seems like the 2001 Xbox's pixel and vertex shaders: some more generations of GPUs and we might see more flexible solutions from NV.
 
I think triangle mesh RT will continue to be used for specular reflections and/or RT shadows, but GI will use another form of raytracing, because next-generation consoles are not powerful enough to use triangle-based RT for GI.

https://www.cryengine.com/news/view...-ray-traced-reflections-in-cryengine-and-more

Here it is, cleverly mixing voxel cone tracing and triangle-based RT.

It would be perfect to have both RT shadows and RT reflections, but I think a choice needs to be made on consoles, probably not on PC.
 
Since the whole discussion has its roots in consoles: those 10 TF machines are not going to provide enough horsepower to deliver next-gen graphics and meaningful ray tracing anyway. Like many techheads have said, it will be rather subtle, since the hw isn't powerful enough. For now NV's solution makes sense for performance. It seems like the 2001 Xbox's pixel and vertex shaders: some more generations of GPUs and we might see more flexible solutions from NV.

It might end up like Vulkan/DX12: devs realize how much work it is to optimize performance for different hw, while the gains from replacing the black box aren't what one would expect. It's also not at all trivial to generate a really good quality BVH, which at the moment is a black box in the driver. It doesn't help that different hw might prefer very differently built BVHs (memory access patterns). A black box early on is great to lower the barrier of entry and drive adoption. I do hope, though, that eventually DXR becomes more like unified shaders/compute == flexibility. But that might not be the right choice for every developer. It would be great for hobbyists/research though.
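
For context on what "really good quality" means for a BVH: builders typically score candidate splits with the surface area heuristic (SAH), and different hw can still want different cost constants and node layouts. A minimal sketch of the classic SAH split cost (constants illustrative):

```cpp
// Surface area of a box with the given extents, used to produce the
// SA inputs below.
inline float surfaceArea(float dx, float dy, float dz) {
    return 2.0f * (dx * dy + dy * dz + dz * dx);
}

// Classic SAH: cost of one traversal step plus each child's intersection
// cost, weighted by the chance (surface-area ratio) that a ray entering
// the parent also enters that child. cTrav/cIsect are tuning constants --
// one of the places where different hw wants a differently built BVH.
float sahSplitCost(float parentSA,
                   float leftSA,  int leftPrims,
                   float rightSA, int rightPrims,
                   float cTrav = 1.0f, float cIsect = 2.0f) {
    return cTrav + cIsect * ((leftSA  / parentSA) * leftPrims +
                             (rightSA / parentSA) * rightPrims);
}
```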
 
Which is why triangle RT is, well, a dead end. Look at the performance it's getting: last-gen base games with a single raytracing effect on consoles, or a $1200 GPU with the best tracing around brought down to 30 fps, also in a limited title.

And it's not like it's "brand new" anymore. It's been around for 2 years now. Research has been done, best practices are getting established, and it's still a massive performance hog.

I am worried about RDNA2 consumer hardware; even without RT a 6900 fares no better than a 6800 (XT) in Cyberpunk (cache problems). And I love the idea of tracing. "It just works" is a great thing for devs to have once it's set up. But you can do tracing with specialized HW (the reason GPUs have it at all is for "realtime" and API limits to do with cross-vendor friendliness), or you can do tracing in compute. Tracing that's SIMD-friendly, that's fully optimized across all vendors and platforms.

HW RT is a shiny dead-end lure put up by Nvidia. Dreams can do traced global illumination on a PS4. There's no reason every dev can't do tracing with the same efficiency, no special HW needed. But they've been lured in by vague promises. Cyberpunk would've been much better off putting that R&D time elsewhere; it could've run and looked better on every platform. But it's only the beginning of this gen. And if triangle meshes are dead for primary rays (see UE5, Dreams, sebbie's project) in favor of compute, raster HW be damned, there's no reason incoherent secondary rays can't do the same thing, ray tracing HW be damned (see UE5, Claybook, Control, The Last of Us 2, etc.)

I disagree; triangle RT is probably the long-term future of rendering, but in the next decade maybe we will have an intermediate step. In offline rendering, before moving to path tracing, some movies used other methods like point-based global illumination.

I think we will see the same phenomenon, with careful use of triangle-based RT.

And HW RT can accelerate other forms of raytracing than triangle-based raytracing too.

GI approximations are OK, but RT shadows have the big advantage of accurate shadows with area lights and better ambient occlusion. Very high-res shadow maps, screen-space shadows + capsule shadows + SSAO (or better, SSDO) are a compromise. And RT reflections have no real good compromise.
 
I disagree; triangle RT is probably the long-term future of rendering, but in the next decade maybe we will have an intermediate step.
Likely until the end of the RDNA2 console generation, so 2025-ish, with a possibility that PC h/w will be doing a lot more of the frame rendering with RT by the end of this period already. It is already happening with RTX games like WDL and CP2077.
The next console gen will probably have fast enough RT h/w to make even purely path-traced AAA games possible. Maybe it will even be flexible enough to trace something else through more than just a BVH.
 