AMD RDNA3 Specifications Discussion Thread

Yes, they've finally introduced B-frame support, but lmao, it still has to be implemented.

The more notable fact is Premiere support.
Looks like someone threw some cash at Adobe.
 
This doesn't really belong in this thread, but they've been "good enough" since early this year, and their h.264 encoder quality has actually been in the same ballpark as Intel's and NVIDIA's for a while already. H.265 was always fine. It's just that no one uses the new SDKs for whatever reason; that seems to be changing now thanks to the OBS cooperation. Hopefully it means older generations get to use the improvements they actually already have, too.

Fair point, but I believe they added AV1 encode and their top chip also has dual encoders. This comes along with co-developing some functionality within OBS.

I believe this part of the presentation also highlighted OBS enhancements for RX 6000 and RX 7000 series so you are correct about unused SDKs.
 
Async clocks for frontend and shaders.

1.5x RT per CU. Is he talking about the 48 WGPs as 48 CUs?
So ~1.8x RT increase?
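FWIW, here's the back-of-the-envelope I assume he's doing (unit counts are the commonly cited Navi 31 vs Navi 21 figures, the 1.5x is AMD's own "up to" number, and clock differences are ignored):

```c
/* Back-of-the-envelope behind the "~1.8x" guess. 48 vs 40 WGPs gives the
   same ratio as 96 vs 80 CUs; 1.5x per CU is AMD's "up to" claim. */
#include <stdio.h>

int main(void) {
    double units_navi31 = 96.0;  /* 48 WGPs x 2 CUs */
    double units_navi21 = 80.0;  /* 40 WGPs x 2 CUs */
    double rt_per_cu    = 1.5;   /* AMD's "up to 1.5x RT per CU" */

    double scaling = units_navi31 / units_navi21;                 /* 1.20x more units */
    printf("Combined RT uplift: ~%.2fx\n", scaling * rt_per_cu);  /* ~1.80x */
    return 0;
}
```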

I find the move to decoupled clocks really interesting, especially given the history. Nvidia went with a 2-2.5x shader clock back in the day.
AMD went the opposite route, and an 8% reduction in shader clocks saved 25% power?
Crazy way to rebalance the design, but it makes a lot of sense with all the investment made in the fine-grained power management and controllers we have now.
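Rough sanity check on those two numbers (my own toy math, assuming dynamic power goes roughly as f*V^2 and that the lower clock also allows roughly 10% lower voltage; neither assumption is from AMD):

```c
/* Toy estimate only: assumes dynamic power ~ f * V^2 and that the 8% clock
   cut also permits ~10% lower voltage. Neither assumption is from AMD. */
#include <stdio.h>

int main(void) {
    double f_scale = 0.92;  /* 8% lower shader clock */
    double v_scale = 0.90;  /* assumed voltage reduction enabled by it */

    double p_scale = f_scale * v_scale * v_scale;
    printf("Relative dynamic power: %.3f (~%.0f%% saved)\n",
           p_scale, (1.0 - p_scale) * 100.0);  /* ~0.745 -> ~25% saved */
    return 0;
}
```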

Wonder how much the decoupled clocks change had to do with their CU changes.
Also, Ryan brought up an interesting point... the whole reason for RDNA was to fix some of the inherent downfalls of GCN.
Moving back to being reliant on ILP after 2 generations... seems odd. Some of the other CU changes seem slightly odd but that is far outside my expertise.
https://www.anandtech.com/show/1763...first-rdna-3-parts-to-hit-shelves-in-december
 

Attachments

  • AMD RDNA 3 Tech Day_Press Deck 28.png
Moving back to being reliant on ILP after 2 generations... seems odd. Some of the other CU changes seem slightly odd but that is far outside my expertise.
I am confused by that account. If anything, RDNA is designed to extract more ILP over GCN by (finally) exposing the 5-stage EX pipeline to the instruction stream with full-rate issuing, rather than quarter-rate issuing + quarter-rate execution looping over the same wavefront for 4 cycles (to hide the 4-stage EX pipeline).

Even AMD themselves said so in their RDNA whitepaper.

One of the core principles of the RDNA architecture is reducing latency to exploit instruction-level parallelism for each wavefront. Accordingly, each RDNA SIMD can decode and issue instructions every cycle – boosting throughput and reducing latency by 4X.
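A toy way to see where that 4X comes from (my simplification, not AMD's model): GCN spends 4 cycles per wave64 instruction on a SIMD16, while an RDNA SIMD32 can issue a wave32 instruction every cycle, so one wavefront's instruction stream drains about 4x faster.

```c
/* Toy model of single-wavefront issue cadence, not a cycle-accurate simulation:
   GCN loops a wave64 over a SIMD16 for 4 cycles per instruction, while an RDNA
   SIMD32 can issue a wave32 instruction every cycle. */
#include <stdio.h>

int main(void) {
    int instructions          = 100;  /* dependent instructions in one wavefront */
    int gcn_cycles_per_issue  = 4;
    int rdna_cycles_per_issue = 1;

    printf("GCN:  ~%d cycles\n", instructions * gcn_cycles_per_issue);   /* ~400 */
    printf("RDNA: ~%d cycles\n", instructions * rdna_cycles_per_issue);  /* ~100 */
    return 0;
}
```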
 
When they say "dual issue", do they mean extracting ILP like back in the VLIW days?
No. They moved to software scoreboarding, and introduced a very limited form of VLIW2, supporting only a subset of opcodes. The ISA as a whole remains your standard SIMT "single-lane view" form.

There are also explicit clues (driver/LLVM patches) of Wave64 having single-cycle execution in RDNA 3. So that's another way of extracting ILP through reduced latency (albeit Wave64 only).

They might also be able to extract more TLP by co-issuing one instruction each from two different wavefronts that belong to the same SIMD, though there is no explicit indication of that yet. You could also argue that this has a second-order effect on ILP, since a higher chance of an instruction being issued = lower effective latency = higher ILP.
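To make the dual-issue part concrete, a purely illustrative snippet (plain C standing in for shader math, not actual compiler output):

```c
/* Purely illustrative: the first two FMAs are independent, so they are the
   kind of pair the limited VLIW2/dual-issue form could co-issue; the third
   depends on both, so it cannot be paired with them. Whether real code gets
   paired depends on the supported opcode subset and operand constraints. */
#include <stdio.h>

int main(void) {
    float x = 1.f, y = 2.f, z = 3.f, u = 4.f, v = 5.f, w = 6.f;

    float a = x * y + z;  /* independent of b -> dual-issue candidate */
    float b = u * v + w;  /* independent of a -> dual-issue candidate */
    float c = a * b + z;  /* depends on a and b -> must issue later */

    printf("%f %f %f\n", a, b, c);
    return 0;
}
```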
 
I am confused by that account. If anything, RDNA is designed to extract more ILP over GCN by (finally) exposing the 5-stage EX pipeline to the instruction stream with full-rate issuing, rather than quarter-rate issuing + quarter-rate execution looping over the same wavefront for 4 cycles (to hide the 4-stage EX pipeline).

Even AMD themselves said so in their RDNA whitepaper.
I'm out of my depth here and was just going off Ryan's take, specifically-
But, as with all dual-issue configurations, there is a trade-off involved. The SIMDs can only issue a second instruction when AMD’s hardware and software can extract a second instruction from the current wavefront. This means that RDNA 3 is now explicitly reliant on extracting Instruction Level Parallelism (ILP) from wavefronts in order to hit maximum utilization. If the next instruction in a wavefront cannot be executed in parallel with the current instruction, then those additional ALUs will go unfilled.

This is a notable change because AMD developed RDNA (1) in part to get away from a reliance on ILP, which was identified as a weakness of GCN – which was why AMD’s real-world throughput was not as fast as their on-paper FLOPS numbers would indicate. So AMD has, in some respects, walked backwards on that change by re-introducing an ILP dependence.
https://www.anandtech.com/show/1763...first-rdna-3-parts-to-hit-shelves-in-december
 
It's funny to hear that. We already have DLSS 2/3 and FSR 2 (with FSR 3 in development), upcoming games that won't even offer a setting to disable ray tracing, and games already on the market that look much worse without ray tracing.
I'd wager it will be a minuscule number of games that require RT GPUs to run.
 
I like the prices, except I agree with all the "Why $999.99, just say $1,000!" talk. That's been annoying me with everything for as long as I can recall, but lately gas prices ending in .99 have been setting me off. We didn't really get any real performance information, and I doubt we'll start seeing accurate leaks until a whole lot closer to December 13th. I'm not displeased, but I wish we had more to go on. Still, I like the size and the power connectors; it looks like it's got potential. :)
 
Credit to some of the leakers for taking it on the chin at least:


The clock speeds are not the only detriment, but they're a big one. If you kept the same price point but tacked on ~500 MHz, RT would still be lackluster, but that would also mean an easy win on raster vs. the 4090 and at least Ampere-level RT. That would change the value proposition substantially.

The 6nm 6500 XT was getting close to 3 GHz out of the box, and the rumors were almost a GHz above where the 7900 XTX currently is. :no:

Now the below doesn't even look likely with LN2.

 
As expected: great rasterization performance, pretty decent power consumption, and (not expected) OK RT performance (AMD might be confident because Lumen exists). The specs are out of my league, though (550 W PSU).

What worries me the most is that they avoided any direct comparison with nVidia, :unsure: which paints a very bleak prospect in that regard. Another generation goes to nVidia. One of how many?
 
Avatar
And all UE5 games with Lumen
Lumen can produce absolutely wonderful lighting. However, it's just cheap console raytracing. It can look gorgeous, but if you want to know how reality looks, you gotta go RT. :)

This short video explains it quite well. The guy is a professional filmmaker who uses UE5. Here he compares (VERY fairly, tbh) Lumen to ray tracing.

 
I find the move to decoupled clocks really interesting, especially given the history. Nvidia went with a 2-2.5x shader clock back in the day.
AMD went the opposite route, and an 8% reduction in shader clocks saved 25% power?
Crazy way to rebalance the design, but it makes a lot of sense with all the investment made in the fine-grained power management and controllers we have now.

Wonder how much the decoupled clocks change had to do with their CU changes.
Also, Ryan brought up an interesting point... the whole reason for RDNA was to fix some of the inherent downfalls of GCN.
Moving back to being reliant on ILP after 2 generations... seems odd. Some of the other CU changes seem slightly odd but that is far outside my expertise.
https://www.anandtech.com/show/1763...first-rdna-3-parts-to-hit-shelves-in-december
I thought those decoupled clock rates were interesting because they mentioned that games have been getting more front-end bound. I'm sure some people have noticed that GPU scaling has become very non-linear in recent years, and games aren't always CPU bound even at ultra-low resolutions like 640x360. The new Plague Tale comes to mind, and it goes a long way toward explaining how well the Series S performs compared to the bigger consoles.
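That lines up with the usual Amdahl-style arithmetic: if some fixed slice of the frame sits in the front end (or anywhere extra shaders can't help), widening the shader array alone scales sub-linearly. Made-up numbers, just to show the shape of it:

```c
/* Amdahl-style toy with made-up numbers: if a fixed slice of frame time is
   front-end bound, speeding up only the shader array scales sub-linearly. */
#include <stdio.h>

int main(void) {
    double frontend_fraction = 0.25;  /* hypothetical 25% of frame time */
    double shader_speedup    = 2.0;   /* hypothetical 2x faster shader work */

    double new_time = frontend_fraction + (1.0 - frontend_fraction) / shader_speedup;
    printf("Overall speedup: %.2fx instead of %.1fx\n",
           1.0 / new_time, shader_speedup);  /* 1.60x instead of 2.0x */
    return 0;
}
```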

Either way, I'm fairly hopeful from what I'm seeing in these cards. My 2080 Ti is still pretty okay at 4k144 but I'd be lying if I said I wasn't interested.
 
Async clocks for frontend and shaders.

1.5x RT per CU. Is he talking about the 48 WGPs as 48 CUs?
So ~1.8x RT increase?

No.

It's "up to 1.5x". Marketing speech.

I assume the average performance increase is much smaller. The 1.5x comes in situations where RDNA2 had some bottlenecks, did very badly, and was very far from its theoretical performance. Now RDNA3 gets its real-world performance closer to the theoretical performance.
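One way to see how an "up to" peak coexists with a much smaller average, with completely made-up per-scene numbers:

```c
/* Made-up per-scene RT uplifts: a couple of formerly bottlenecked cases hit
   the ~1.5x headline while the rest improve less, so the average is lower. */
#include <stdio.h>

int main(void) {
    double uplift[] = { 1.5, 1.45, 1.2, 1.15, 1.1 };  /* hypothetical */
    int n = (int)(sizeof(uplift) / sizeof(uplift[0]));

    double sum = 0.0;
    for (int i = 0; i < n; i++) sum += uplift[i];
    printf("Average uplift: %.2fx vs. \"up to\" %.2fx\n", sum / n, uplift[0]);  /* 1.28x vs 1.50x */
    return 0;
}
```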
 