AMD RDNA3 Specifications Discussion Thread

Weirdly, they just said ~1.7x performance increase gen on gen for regular raster games at 4K, but ~1.6x increase gen on gen for RT.
Unless they misspoke, are they regressing in RT performance? Or were they thinking 1.7 * 1.6 == 2.7x increase overall with RT in use?

Nothing weird in this. The number of raytracing units only increased by 1.2x, and clocks are 8-9% higher => ~1.3x raw theoretical raytracing performance compared to RDNA2.

That they still claim a ~1.6x average improvement means their architectural changes are delivering roughly 24% more real-world performance on top of the theoretical scaling.
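The arithmetic above can be sketched quickly (all inputs are the figures quoted in this thread, not official measurements):

```python
# Back-of-the-envelope check: raw throughput scaling from unit count and
# clocks vs the claimed average uplift.
unit_scaling = 1.2            # ~1.2x more raytracing units
clock_scaling = 1.085         # ~8-9% higher clocks
raw_scaling = unit_scaling * clock_scaling     # theoretical gen-on-gen gain
measured_uplift = 1.6         # claimed average RT uplift
per_unit_gain = measured_uplift / raw_scaling  # not explained by raw scaling
print(f"raw: {raw_scaling:.2f}x, architectural: {per_unit_gain:.2f}x")
# raw comes out to ~1.30x, leaving roughly 23-24% for per-unit improvements
```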
 
Also, Ryan brought up an interesting point... the whole reason for RDNA was to fix some of the inherent downfalls of GCN.
Wonder what Ryan was talking about. GCN is about as scalar as it gets, with literally zero instruction co-issue capability. If anything, Fermi (GF10x) and Kepler had FMA co-issue capabilities, so they are probably the closest match to RDNA 3's CUs in this regard.
 
The whole gimmick of RDNA3 is that it's cheap.
Focus on raw PPA metrics above all else.
It was supposed to be cheap by recent AMD standards (which have kept AMD from grabbing more market share in recent years, since producing more low-margin parts would drag the company margin down), but whether it will be competitive is another story.
Based on raw specs and a little bit of modeling magic, I can see how a full-blown and much smaller AD103 with a 256-bit bus could easily be on par in basic rasterization and much faster everywhere else. So if they aimed for better PPA vs competitors, it does not look like they have reached their goal yet. Though it feels like they are in a better spot PPA-wise (or rather perf/$, assuming the whole package is cheaper to produce than a larger monolithic die on the same process, which is at least a little questionable given the lack of any data on this matter) than they were with RDNA 2.
 
It was supposed to be cheap by recent AMD standards (which have kept AMD from grabbing more market share in recent years, since producing more low-margin parts would drag the company margin down), but whether it will be competitive is another story.
Based on raw specs and a little bit of modeling magic, I can see how a full-blown and much smaller AD103 with a 256-bit bus could easily be on par in basic rasterization and much faster everywhere else. So if they aimed for better PPA vs competitors, it does not look like they have reached their goal yet. Though it feels like they are in a better spot PPA-wise (or rather perf/$, assuming the whole package is cheaper to produce than a larger monolithic die on the same process, which is at least a little questionable given the lack of any data on this matter) than they were with RDNA 2.

Yeah, the area thing sucks. Power is fine; it'll match or often beat a 4080 at just 10% more power. But what is with that area? It makes me think they were hoping for 24 Gbps RAM and that the die could hit 20-25% higher clocks.

Also, anyone know wtf "Hypr-Rx" is even supposed to be? "One click better performance and latency", and then some bizarro slide. I could understand a Reflex competitor for better latency, but I've no idea what this is talking about.

AMD-HYPERRX.jpg
 
Yeah, the area thing sucks. Power is fine; it'll match or often beat a 4080 at just 10% more power. But what is with that area? It makes me think they were hoping for 24 Gbps RAM and that the die could hit 20-25% higher clocks.

Also, anyone know wtf "Hypr-Rx" is even supposed to be? "One click better performance and latency", and then some bizarro slide. I could understand a Reflex competitor for better latency, but I've no idea what this is talking about.

After some thinking and mulling over what their presentation shows it appears to be 3 pieces in one button:
1. Radeon Anti-lag (which is frame queue limiting like "previously rendered frames = 1")
2. FSR 2.0
3. Radeon boost (which turns the resolution down as the mouse moves)

It does not seem to be a reflex competitor, rather something else entirely.
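As a rough illustration of the Anti-lag piece, here's a toy latency model (entirely my own sketch, not AMD's implementation) of why limiting the render-ahead queue to 1 frame reduces input-to-display latency when the CPU outpaces the GPU:

```python
# Toy model: if the CPU submits frames faster than the GPU renders them, a
# deep render-ahead queue fills up, and each buffered frame adds one GPU
# frame-time of latency between input sampling and display.
def display_latency(queue_depth, gpu_frame_ms, cpu_frame_ms):
    # When CPU frames are cheaper than GPU frames, the queue stays full, so
    # a freshly submitted frame waits behind (queue_depth - 1) older frames.
    buffered = queue_depth - 1 if cpu_frame_ms < gpu_frame_ms else 0
    return cpu_frame_ms + gpu_frame_ms * (1 + buffered)

print(display_latency(3, 16.7, 5.0))  # deep default-style queue -> 55.1
print(display_latency(1, 16.7, 5.0))  # "previously rendered frames = 1" -> 21.7
```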
 
Not really because you're giving up spatio-temporal stability of being able to render perfect reflections in a single sample per pixel
Where did you get that? Planar reflections are just a small and very limited subset of what you can do with RT (and without making thousands of geometry passes for every planar surface in a scene).
You don't give up any spatio-temporal stability with RT at all. Just ignore normal maps and roughness in your ray gen shader and treat all surfaces as mirrors. That would probably be the dumbest thing to do with RT, but it will give you reflections just as spatio-temporally stable as planar reflections would, and it also lets you do these mirror reflections not just on flat surfaces but on the rest of the scene, be it curved or weirdly shaped surfaces or characters or whatever else, all in one go without rendering the scene 1000 times.
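For what it's worth, the mirror case really is deterministic: the reflected direction depends only on the incoming direction and the hit normal, so there's nothing stochastic to denoise, on flat or curved surfaces alike. A minimal sketch:

```python
# Perfect mirror reflection: r = d - 2*(d.n)*n, where d is the incoming ray
# direction and n the unit surface normal at the hit point. No random
# sampling is involved, hence no noise to accumulate or denoise.
def reflect(d, n):
    dot = sum(di * ni for di, ni in zip(d, n))
    return tuple(di - 2.0 * dot * ni for di, ni in zip(d, n))

# A ray going straight down onto an upward-facing surface bounces straight up;
# on a curved surface the same formula just sees a different n per hit.
print(reflect((0.0, -1.0, 0.0), (0.0, 1.0, 0.0)))  # (0.0, 1.0, 0.0)
```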

Any realistic BVH in games is going to have to weigh the cost between rebuilding/refitting, and often cut out some distant geometry
And you have to do way more of this stuff with planar reflections. To draw a planar reflection, you need to render the geometry in that reflection and if there are 1000 mirrors in a scene you have to redraw all the geometry 1000 times, how cool is that?

Rebuilding the acceleration structure every time while including all geometry would be extremely prohibitive for any modern game ...
And that's why nobody rebuilds it every frame. Most geometry is static anyway, so in the worst case it may need a refit if it was moved.
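As a sketch of that per-frame policy (my own toy classification, not any particular engine's; the rebuild interval is a made-up tuning knob):

```python
# Per-frame decision for one bottom-level acceleration structure: static
# geometry is left alone, rigidly moved geometry only needs its instance
# transform updated in the top-level structure, and only deforming geometry
# needs a refit (with an occasional full rebuild to restore tree quality).
def blas_action(moved, deformed, frames_since_rebuild, rebuild_interval=64):
    if deformed:
        # Refit keeps the tree topology; quality degrades over time, so
        # rebuild once in a while instead of every frame.
        return "rebuild" if frames_since_rebuild >= rebuild_interval else "refit"
    if moved:
        return "update_tlas_transform"   # the BLAS itself is untouched
    return "none"

print(blas_action(False, False, 10))   # none
print(blas_action(True, False, 10))    # update_tlas_transform
print(blas_action(True, True, 100))    # rebuild
```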

If you have to make cheats in the BVH such as using lower LoDs and cutting out geometry then it's not a clear cut increase in quality over planar reflections
First, you don't have to. :)
Second, even if you wanted to cheat, the same cheats would be required for planar reflections, because frame budgets are limited in both cases and planar reflections hurt frame times far more as the number of reflecting surfaces grows.
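A toy cost model makes the scaling argument concrete (all constants here are made-up placeholders; only the shape of the curves matters):

```python
# Planar reflections re-rasterize the scene once per reflective plane, so
# cost grows linearly with mirror count. RT reflection cost scales with the
# number of rays (roughly one per pixel), independent of the mirror count.
def planar_cost_ms(num_mirrors, scene_raster_ms=2.0):
    return num_mirrors * scene_raster_ms       # one extra geometry pass each

def rt_cost_ms(num_pixels, ms_per_megaray=1.5):
    return (num_pixels / 1e6) * ms_per_megaray # one reflection ray per pixel

for mirrors in (1, 10, 100):
    # RT cost stays flat while planar cost explodes with mirror count.
    print(mirrors, planar_cost_ms(mirrors), rt_cost_ms(3840 * 2160))
```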
 
And you have to do way more of this stuff with planar reflections. To draw a planar reflection, you need to render the geometry in that reflection and if there are 1000 mirrors in a scene you have to redraw all the geometry 1000 times, how cool is that?

Also, good luck rasterizing reflections of reflections. This is one use case where there really shouldn’t be any more debate about the right solution.
 
Avatar
And all UE5 games with Lumen
Ha ok, i thought with 'game can't turn off RT' you meant HW raytracing support in minimal specs.
Personally i see 3 relevant categories of raytracing:
1. HW accelerated triangle ray tracing
2. Software raytracing using approximated formats of geometry (Lumen, SDF, voxel cone tracing, i do it with surfels)
3. Software triangle raytracing (e.g. Crytek's Neon Noir demo)

Lumen can produce absolutely wonderful lighting. However, it's just cheap console raytracing. It can look gorgeous, but if you want to know how reality looks, you gotta go RT. :)
This only holds for offline RT, due to the massive sample counts needed to achieve photorealism.
Within realtime constraints, approximated geometry can give better quality. If this geometry is properly prefiltered, one ray can sample a large area of geometry, while triangle raytracing is always just point sampling.
Usually we associate this advantage with the problem of light leaks, remembering voxel cone tracing. But that does not mean it can't work; voxels are just a bad approximation of surfaces.
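A minimal sketch of the prefiltering idea (constants are illustrative): with a mip-mapped scene representation, one cone samples an area by picking the level whose texel size matches the cone's footprint at the hit distance, which point-sampled triangle RT cannot do:

```python
import math

# Select the mip level of a prefiltered scene representation so one sample
# covers the whole cone footprint: a wider (glossier) cone, or a hit farther
# away, maps to a coarser level instead of requiring many point samples.
def mip_for_cone(distance, cone_half_angle, base_texel_size):
    footprint = 2.0 * distance * math.tan(cone_half_angle)  # cone diameter
    return max(0.0, math.log2(max(footprint / base_texel_size, 1e-6)))

print(mip_for_cone(1.0, 0.01, 0.01))   # sharp cone, near hit  -> fine mip
print(mip_for_cone(50.0, 0.2, 0.01))   # glossy cone, far hit  -> coarse mip
```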

Thus the ideal would be to use prefiltered geometry for glossy reflections or GI,
and accurate triangle RT for hard shadows or sharp reflections.
And ideally we could do both using the same data structures and algorithms.
 
Ha ok, i thought with 'game can't turn off RT' you meant HW raytracing support in minimal specs.
Personally i see 3 relevant categories of raytracing:
1. HW accelerated triangle ray tracing
2. Software raytracing using approximated formats of geometry (Lumen, SDF, voxel cone tracing, i do it with surfels)
3. Software triangle raytracing (e.g. Crytek's Neon Noir demo)

Yeah, 1 and 3 are what Avatar is going to use. So with a RT capable card, you will have much better performance and likely quality too.

Previously I was under the assumption this would've been the case with UE5 too. But Lumen has a good software RT implementation that runs similarly to the hardware path (2, as you said).

So UE5 is kinda raining on raytracing's parade. This is why I believe RDNA3 is in a good position: you will still be able to turn HW-RT off and use Software-RT in games that run on UE5 (the engine nearly every game dev uses). And even with HW-RT, Alex compared performance in the Matrix demo, and RDNA2/Ampere performed very similarly in the GPU limit, despite Ampere being much stronger in RT. So right now, based on the very few examples we have, it appears that stronger hardware acceleration for raytracing in Nvidia cards won't matter that much in games using Unreal Engine 5.

I guess in-house engines like Snowdrop with Avatar are the only hope for us raytracing enthusiasts. Or game devs will decide to implement other lighting methods such as RTXGI.
 
Also, good luck rasterizing reflections of reflections. This is one use case where there really shouldn’t be any more debate about the right solution.
I think to get reflections of reflections we would approximate the 2nd bounce reflection with a lookup into whatever we use for irradiance cache, e.g. RTXGI probes.
I'd do this also for the full raytraced approach, because recursion is just crazy for realtime, imo. I mean, no matter what, you have to clamp recursion depth anyway at some point.
So i don't think that's a good argument regarding planar vs. RT reflections.
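A sketch of that probe lookup (the grid layout and values here are made up): instead of spawning another recursion level, the first reflection ray's hit point fetches irradiance from the 8 surrounding probes with a trilinear blend:

```python
import math

# Trilinear lookup into a probe grid: the irradiance at an arbitrary point is
# blended from the 8 probes at the surrounding integer grid corners. This
# stands in for the second reflection bounce instead of tracing it.
def trilerp(grid, p):
    x0, y0, z0 = (math.floor(c) for c in p)
    fx, fy, fz = (c - math.floor(c) for c in p)
    def g(i, j, k):
        return grid[(x0 + i, y0 + j, z0 + k)]
    lerp = lambda a, b, t: a + (b - a) * t
    return lerp(
        lerp(lerp(g(0, 0, 0), g(1, 0, 0), fx), lerp(g(0, 1, 0), g(1, 1, 0), fx), fy),
        lerp(lerp(g(0, 0, 1), g(1, 0, 1), fx), lerp(g(0, 1, 1), g(1, 1, 1), fx), fy),
        fz)

# 8 probes around a hit point, with made-up irradiance equal to the x corner:
probes = {(x, y, z): float(x) for x in (0, 1) for y in (0, 1) for z in (0, 1)}
print(trilerp(probes, (0.5, 0.5, 0.5)))  # 0.5
```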

The better argument is simply: if we use HW RT at all, we have already accepted the cost of maintaining the BVH, so ofc. we will use RT reflections and not planar reflections. Nobody will mix planar reflections with RT AO, for example.
But if HW RT can't be used, planar reflections and multi-view rasterization in general remain a topic.
Even with HW RT on, multi-view raster is still used everywhere for shadow maps. Only once HW RT becomes practical enough to get rid of that would the debate be settled.
 
Where did you get that? Planar reflections are just a small and very limited subset of what you can do with RT (and without making thousands of geometry passes for every planar surface in a scene).
You don't give up any spatio-temporal stability with RT at all. Just ignore normal maps and roughness in your ray gen shader and treat all surfaces as mirrors. That would probably be the dumbest thing to do with RT, but it will give you reflections just as spatio-temporally stable as planar reflections would, and it also lets you do these mirror reflections not just on flat surfaces but on the rest of the scene, be it curved or weirdly shaped surfaces or characters or whatever else, all in one go without rendering the scene 1000 times.
???

Coherent rays and curved surfaces? Introducing curved surfaces is going to contribute to noise due to the divergent directions of the rays ...

Coherent rays? Curved surfaces? A spatio-temporally stable image?

Pick two out of the three above, because you can't have all three currently (if ever) ...
And you have to do way more of this stuff with planar reflections. To draw a planar reflection, you need to render the geometry in that reflection and if there are 1000 mirrors in a scene you have to redraw all the geometry 1000 times, how cool is that?
I don't believe many real-time applications will realistically be able to use 1000 mirrors (let alone 100 or even 10), since there's a high likelihood that a portion of them will be occluded ...
And that's why nobody rebuilds it every frame. Most geometry is static anyway, so in the worst case it may need a refit if it was moved.
It sounds like you made an argument in favour of planar reflections. Planar reflections would be able to handle dynamic geometry more gracefully ...

With ray tracing you're at the mercy of the acceleration structure at all times. Refitting isn't going to work very well with dynamic deformable geometry in a scene, so you'll get tons of noise as a result of the lower-quality acceleration structure. Rebuilding an acceleration structure is astronomically expensive and needs to be amortized over several frames even on the most powerful hardware today. Cutting out geometry? Well, now you're just stuck with a less accurate scene representation, which sucks too ...
First, you don't have to. :)
Second, even if you wanted to cheat, the same cheats would be required for planar reflections, because frame budgets are limited in both cases and planar reflections hurt frame times far more as the number of reflecting surfaces grows.
At least you're guaranteed to be getting a spatio-temporally stable image and not having to deal with any acceleration structure with planar reflections ...

The tradeoff between full-blown RT reflections and planar reflections in terms of quality/elegance isn't as one-sided as you imply ...
 
So I've slept and watched the reveal again with a fresh head, and here are my thoughts.

Price

Amazing price from a raster point of view, not so good from a ray tracing point of view. The VRAM amount is a huge win and is something AMD/ATI have historically always been better at than Nvidia.

Design

Is it really chiplets? It just looks like repurposed HBM to me, and given this, would HBM2E/HBM3 not have been an option in terms of cost? Three HBM2E stacks would be 24GB of VRAM @ 1.38TB/s of bandwidth, with three HBM3 stacks giving 2TB/s. I remember an article claiming the 16GB of HBM2 used in the Radeon VII was estimated to cost $300; what would that cost in 2022?
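The quoted bandwidth figures check out against the commonly cited per-stack peak rates (1024-bit interface per stack; the pin rates here are the usual headline numbers, so treat them as approximate):

```python
# Per-stack HBM bandwidth = pin rate (Gb/s) * bus width (bits) / 8 bits/byte.
def stack_bandwidth_gbs(pin_rate_gbps, bus_width_bits=1024):
    return pin_rate_gbps * bus_width_bits / 8   # GB/s per stack

hbm2e = 3 * stack_bandwidth_gbs(3.6)   # three HBM2E stacks @ 3.6 Gb/s/pin
hbm3 = 3 * stack_bandwidth_gbs(6.4)    # three HBM3 stacks @ 6.4 Gb/s/pin
print(f"HBM2E: {hbm2e:.0f} GB/s, HBM3: {hbm3:.0f} GB/s")
# -> ~1382 GB/s (the 1.38TB/s figure above) and ~2458 GB/s
```

The 2TB/s figure for three HBM3 stacks corresponds to a lower ~5.2 Gb/s pin rate; at the full 6.4 Gb/s it would be closer to 2.46TB/s.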

Ray tracing

This is what I'm really disappointed with, as it's currently the single biggest selling point for me when choosing a GPU. They had a mammoth task trying to catch Nvidia, but this feels like a half-assed effort just to say they've improved it. It very much reminds me of the HD5000 series vs Fermi with regard to tessellation back in the day.

Power consumption

Is the power consumption actually good? When limited to 350W, the RTX 4090 doesn't actually lose that much performance and is still faster than the 7900XTX. So, watt for watt, is it actually as efficient as Nvidia's latest effort? I don't think it is. And RT perf/watt is so far behind Nvidia it's not even funny.

Physical size

This is a win, and the fact it uses standard PCIe power plugs is an even bigger win. Is this the reason the card is limited to 350W?

Clock speeds

What happened to them? Where are the 3GHz speeds? Hopefully running the front end at a slightly faster clock than the CUs hasn't restricted clock speeds and overclocking, and we'll see partner cards get closer to that mark.

Overall

I'm disappointed. Maybe I bought into the hype train too much, but the industry needed AMD to really stick it to Nvidia like they have been sticking it to Intel these last few years with Ryzen. It's another generation where Nvidia can charge and do what they want, as they are the ones with the massive RT advantage, and they have little incentive to do anything lower down in the price tiers.
 
When you have no transistor budget left, there is a problem. I don't know where all these transistors have gone. Navi31 has twice the transistors of Ampere, but rasterization performance will only be ~50% better, and raytracing performance will be worse in some cases. So the whole architecture is still bloated.
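For a rough sense of the claim (the transistor counts are commonly cited figures and may vary slightly by source; the 1.5x raster figure is taken from the post above):

```python
# Perf-per-transistor comparison behind the "bloated" claim. Note the Navi 31
# count includes the SRAM-heavy MCD cache chiplets, which inflates it.
navi31_xtors = 57.7e9   # Navi 31: GCD + 6 MCDs (commonly cited total)
ga102_xtors = 28.3e9    # GA102, the Ampere flagship die
perf_ratio = 1.5        # assumed raster advantage from the post above

xtor_ratio = navi31_xtors / ga102_xtors
print(f"{xtor_ratio:.2f}x transistors for {perf_ratio}x raster "
      f"-> {perf_ratio / xtor_ratio:.2f}x perf per transistor")
```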
 
So UE5 is kinda ruining the parade of Raytracing. This is why I believe RDNA3 is in a good position because you will still be able to turn HW-RT off and use Software-RT in games that run on UE5
Your argument is probably weaker than the other related argument, which is that Lumen's SW RT is often faster than any HW RT, depending on content.
There is this issue of HW RT becoming slow if multiple instances of BVH overlap in space, e.g. the many layers of rock models seen in the Land of Nanite demo.
IIRC, they often have overlaps of 20 models, and HW RT needs to traverse all of them to find the closest intersection. This is not a HW restriction, but a downside of the BVH data structure and ray tracing algorithms in general.
The SW SDF approach has some advantage here, because we can use the distance we get on the first ray entry to reject most of those 20 models quickly, plus SDFs can be merged into a single global SDF more easily than BVHs.
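A minimal sphere-tracing sketch of the merge point (toy scene, my own code): a global SDF is just the pointwise min of the instance fields, and the safe step size the tracer gets each iteration is exactly the distance bound that lets a real implementation reject far-away instances early:

```python
# Sphere tracing: each step queries the SDF, and because nothing in the scene
# is closer than the returned distance, the ray can safely advance that far.
def sphere_trace(sdf, origin, direction, max_t=100.0, eps=1e-3):
    t = 0.0
    for _ in range(128):
        p = tuple(o + t * v for o, v in zip(origin, direction))
        d = sdf(p)
        if d < eps:
            return t          # hit
        t += d                # safe step: nothing is closer than d
        if t > max_t:
            break
    return None               # miss

def merged_sdf(sdfs):
    # The union of instances is the pointwise minimum of their fields.
    return lambda p: min(s(p) for s in sdfs)

def sphere(center, radius):
    return lambda p: sum((pi - ci) ** 2 for pi, ci in zip(p, center)) ** 0.5 - radius

# Two instances merged into one global field; one trace finds the nearer one:
scene = merged_sdf([sphere((0, 0, 5), 1.0), sphere((0, 0, 20), 1.0)])
print(sphere_trace(scene, (0, 0, 0), (0, 0, 1)))  # hits the near sphere at t = 4.0
```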

In contrast, the Matrix City demo probably has very little to no overlap, because the models have tight BVH boxes that don't extend much beyond the model, and the models fit tightly against each other without any need to overlap them.
Here, HW RT will do better.

I assume this holds regardless of your GPU's vendor, though there surely are some cases where AMD would be faster with SW RT while NV is faster with HW RT for the same scene.

Still, i don't think AMD's weaker RT is so much of a problem right now. And, um, UE5 seems slow no matter what.
It's more of a problem for the future, maybe.
If i upgrade my GPU nowadays, i want to use it for at least 5 years. Because they are more expensive, i'll upgrade less often than i did before.
So how will it be in 5 years? Will the weak RT perf. force me to upgrade earlier than otherwise, if i buy some RDNA3 now?
Personally i think that's a no, because in 5 years we'll probably still have the same console generation. But that's surely a factor to consider, even for those who think RT is not that important.
 
I think to get reflections of reflections we would approximate the 2nd bounce reflection with a lookup into whatever we use for irradiance cache, e.g. RTXGI probes.
I'd do this also for the full raytraced approach, because recursion is just crazy for realtime, imo. I mean, no matter what, you have to clamp recursion depth anyway at some point.
So i don't think that's a good argument regarding planar vs. RT reflections.

Yeah there’s certainly a bounce limit but there are practical (though uncommon) use cases where you want at least that second bounce to be high fidelity. SDFs and probes wouldn’t really cut it. I’m thinking here about looking at the back of my head at the barber :cool:

The better argument is simply: if we use HW RT at all, we have already accepted the cost of maintaining the BVH, so ofc. we will use RT reflections and not planar reflections. Nobody will mix planar reflections with RT AO, for example.
But if HW RT can't be used, planar reflections and multi-view rasterization in general remain a topic.
Even with HW RT on, multi-view raster is still used everywhere for shadow maps. Only once HW RT becomes practical enough to get rid of that would the debate be settled.

That’s mostly still an option because the number of shadow casting lights is artificially limited in games. Granted that’s not likely to change this console generation or even the next one.

Once UE5 games start shipping in earnest we will have some useful data points on RT alternatives. As of now the most graphically advanced titles are all using HWRT.
 
RDNA’s shader architecture is heavily inspired by Nvidia’s Maxwell/Pascal, their doubling of FP32 ALUs seems to be inspired by Ampere…
 