AMD RDNA3 Specifications Discussion Thread

Ok, from what I'm understanding, the design is facing these problems/limits:
- the dual-issue instructions mean that if the compiler can't extract a second instruction, it effectively has only a little more shader throughput than N21; this can be alleviated by driver and engine tweaks (see the sketch after this list)
- the same architecture prevents it from scaling in frequency; apparently it was targeting about 3GHz and 73TF
- the EFB interconnect adds to power consumption, limiting the GCD
- the EFB interconnect adds to the cost, dunno how much compared to a bigger monolithic die
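A very rough toy sketch of that first point, just to illustrate the idea (the pairing rule here is invented for illustration; real dual-issue/VOPD pairing has much stricter opcode and operand constraints):

```python
# Toy illustration of dual-issue packing: the CU only gets ~2x FP32 throughput
# when two independent ops can be paired; a dependent chain issues one op per
# cycle, behaving like a single-issue design.
# (The pairing rule below is made up; real hardware is far more restrictive.)
from dataclasses import dataclass

@dataclass
class Op:
    name: str
    dst: str
    srcs: tuple

def can_pair(a: Op, b: Op) -> bool:
    # b must not read or overwrite a's result
    return a.dst not in b.srcs and a.dst != b.dst

def cycles_to_issue(ops):
    cycles, i = 0, 0
    while i < len(ops):
        if i + 1 < len(ops) and can_pair(ops[i], ops[i + 1]):
            i += 2   # both ops go out together (dual issue)
        else:
            i += 1   # unpairable op issues alone
        cycles += 1
    return cycles

# Independent stream: packs perfectly, 4 ops in 2 cycles.
independent = [Op("fma", "r0", ("a", "b", "c")),
               Op("fma", "r1", ("d", "e", "f")),
               Op("add", "r2", ("g", "h")),
               Op("mul", "r3", ("i", "j"))]

# Dependent chain: every op needs the previous result, so nothing pairs.
dependent = [Op("fma", "r0", ("a", "b", "c")),
             Op("fma", "r1", ("r0", "e", "f")),
             Op("add", "r2", ("r1", "h")),
             Op("mul", "r3", ("r2", "j"))]

print(cycles_to_issue(independent))  # 2 -> ~2x throughput
print(cycles_to_issue(dependent))    # 4 -> no gain over single issue
```

Which is why the compiler/driver side matters so much for how often that second ALU actually gets fed.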
Didn't Tom's representative blurt out that they actually do scale to 3 GHz in some other YT channel's coverage of the event?
 
What I've heard (which could be false, but doesn't contradict that video): the design is targeting 3GHz, and OEMs aren't reaching anywhere near that.
I have no idea if it's an architecture limit, a power limit, or a process node limit.
 
What I've heard (which could be false, but doesn't contradict that video): the design is targeting 3GHz, and OEMs aren't reaching anywhere near that.
I have no idea if it's an architecture limit, a power limit, or a process node limit.
I hardly believe it could be a process node limit, considering Zen 4's clocks...
 
What I've heard (which could be false, but doesn't contradict that video): the design is targeting 3GHz, and OEMs aren't reaching anywhere near that.
I have no idea if it's an architecture limit, a power limit, or a process node limit.
I would imagine a power limit. 2x 8-pin means only 375W, with the XTX already at 355W.
A 3x 8-pin partner card could run at 450W; some are suggesting another 10-20% more performance if it isn't bandwidth limited.
It is extremely unlikely to be node limited, and if they did indeed design it for higher clocks, it's unlikely to be an architecture limit.
Edit - Could be a software limit due to their new features and decoupled clocks.
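For reference, the back-of-the-envelope numbers behind those figures, assuming the usual spec budgets of 75W from the slot plus 150W per 8-pin (actual cards can and do pull more than the spec allows):

```python
# Rough power-budget math, using the usual spec limits:
# 75W from the PCIe slot plus 150W per 8-pin connector.
SLOT_W = 75
PIN8_W = 150

def board_budget(n_connectors: int) -> int:
    return SLOT_W + n_connectors * PIN8_W

XTX_TBP = 355  # reference 7900 XTX board power

for n in (2, 3):
    budget = board_budget(n)
    print(f"{n}x 8-pin: {budget}W budget, {budget - XTX_TBP}W headroom over {XTX_TBP}W")
# 2x 8-pin -> 375W budget, only 20W above the stock 355W
# 3x 8-pin -> 525W budget, plenty of room for a 450W partner card
```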
 
I would imagine a power limit. 2x 8-pin means only 375W, with the XTX already at 355W.
A 3x 8-pin partner card could run at 450W; some are suggesting another 10-20% more performance if it isn't bandwidth limited.
It is extremely unlikely to be node limited, and if they did indeed design it for higher clocks, it's unlikely to be an architecture limit.
Edit - Could be a software limit due to their new features and decoupled clocks.
I think they just want to win the efficiency crown even if they have to clock low
 
Someone please explain to a dumb console only pleb who happens to find this stuff interesting...they talked a lot about decoupled clocks and boost and all that

This may be totally off the mark but it reminded me of what the PS5 is said to do on its own hardware. Are they a similar sort of thing?
 
I have only third-hand information, but from what I'm reading it doesn't overclock at all.

Isn't that really limited by its TDP/Cooling as it only comes with 2 power inputs?

75W+2*150W limits you to 375W and it's a 350W card.

Edit: Forget it as somebody else already wrote the same basically.
 
I'm sure there will be some AIB variants with 3 8-pin power connectors, just like with many Nvidia and AMD models in the past.
 
Yeah I respect that if you want ultra high frame rates at high resolutions above all else then RT might be a non-starter on any card. But it seems to me a strange compromise to strive for crazy high framerates and particularly for very high native res (especially where upscaling is available) on a 50-60TF $900+ GPU while at the same time having the core graphics look worse than what is available on a $300 4TF Xbox Series S - which would certainly be the case in some titles without RT.

I'm not sure what that means - could you give an example of a game where this could possibly be the case?

Like the 7900 is not a non-RT GPU, it's just a GPU whose RT performance will likely only match a 2-year-old Ampere 3080. That's definitely disappointing, but it's far, far above the RT performance of the Series S, Series X and PS5. The main advantage of the PC is choice: you're not restricted to what the developer personally felt were the right compromises for a particular framerate. There is not going to be something released within this console generation that will only run well on Ada, as that would be financial suicide for the developer.

The argument isn't that "RT is worthless and has no future" - the argument is that the sacrifices it brings to performance and resolution are too great for the small improvements it brings in many current titles, at least by the framerate standards that have risen considerably in the past 5 years. You can mitigate those, sure - but the actual cost in $ is far too high for the majority of PC gamers.

If the 4070-whatever comes out in 2023, is in the same price ballpark as the 7900, and has a 20% raster disadvantage but a 100% RT advantage, then you can definitely make the argument that it's a shortsighted decision to go with Radeon - like it would have been when Ampere was introduced and the 6800/6900 series were priced almost identically, which explains why Nvidia completely took over in marketshare (that, and in many games you could argue it was also much faster in raster due to DLSS). We'll see with actual benchmarks I guess; I don't think the 4080 will necessarily put a cork in this argument though.

E-sports players, for whom high frame rates are essential, are a clear exception to this of course. But putting frame rate aside, if I care enough about graphics to insist they must be running at 4K native, but then have a Series S pushing better core graphics than me outside of resolution, I'm not sure that would sit right. If I'm going to spend that much on a GPU, I want better frame rate, better image quality and better graphics than a $300 or even a $500 console. To me, that's why RT performance is important. Without it, I can potentially only have 2 of those things, and not all 3, which isn't great if I'm spending $900+ on a GPU alone.

You get that with a 7900 series card though, in spades. And you can get that without RT as well - ports like God of War, Days Gone etc have significant raster improvements even outside of framerate/resolution. Again I don't think anyone is arguing that there is no point to ray tracing at all, just that the cost to get a product that can run full-RT games at the resolutions/framerates many gamers want is too high atm, so the sacrifice to that area of performance may be more acceptable if the other 2 can be met at a more reasonable price point.

I hope you're wrong, but after this launch I'm starting to suspect you're right. Unfortunately I'm not sure PC gaming can survive this pricing structure if we're going to have to spend nearly double the price of a console, 2 years after its launch, on the GPU alone just to get something that's significantly (as in 2x or more) faster.

It is definitely a concern of mine, I can appreciate the technology in something like a 4090 and it's good there's an option out there for people who value that unique experience. But from the perspective of what grows the PC gaming base, which in turn puts more pressure on publishers to take better care of their ports, it's definitely worrying.
 
I'm sure there will be some AIB variants with 3 8-pin power connectors, just like with many Nvidia and AMD models in the past.
The only AIB card shown so far, the Asus TUF, has 3x 8-pin, but they haven't revealed clocks yet.
 
Isn't that really limited by its TDP/Cooling as it only comes with 2 power inputs?

75W+2*150W limits you to 375W and it's a 350W card.

Edit: Forget it as somebody else already wrote the same basically.
I don't know how many were sold, but it looks like the R9 295X2 used two 8-pin connectors to draw 500W. So there is precedent for AMD drawing more than the 375W "allowed" by the spec.
 
- the dual-issue instructions mean that if the compiler can't extract a second instruction, it effectively has only a little more shader throughput than N21,
There are other (confirmed / speculated) ways beyond instruction fusion/pairing.


P.S. Wave64 single-cycle execution implies that both the lower and upper halves are co-issued to the dual-issue 32-lane hardware. This is contrary to RDNA 2 issuing the same Wave64 instruction over 2 cycles to single-issue 32-lane hardware (hence 2-cycle execution, 1 for lower, 1 for upper).
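A toy model of that Wave64 difference, just to make the cycle counts concrete (lane bookkeeping heavily simplified, and the exact issue mechanics are my assumption of what's being described above):

```python
# Toy model of a Wave64 instruction on 32-lane SIMD hardware.
# RDNA 2-style: one 32-wide issue per cycle, so the lower and upper halves
# take two cycles. RDNA 3-style (as described above): both halves are
# co-issued to the dual-issue hardware in a single cycle.
WAVE64 = list(range(64))  # 64 work-items in one wave

def rdna2_style(wave):
    # lower half one cycle, upper half the next
    return [[wave[:32]], [wave[32:]]]

def rdna3_style(wave):
    # both 32-lane halves go out together on the paired pipes
    return [[wave[:32], wave[32:]]]

print(len(rdna2_style(WAVE64)))  # 2 cycles
print(len(rdna3_style(WAVE64)))  # 1 cycle
```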
 
I hardly believe it could be a process node limit, considering Zen 4's clocks...

There's more to it than that; 28nm CPUs clocked twice as fast as 28nm GPUs too.
What they really mean is that for this specific design, that may be the process node limit; you end up being limited by certain critical paths inside the design that can't run at a higher frequency.

Easy way to think about it is Intel's new Alder Lake and Raptor Lake CPUs.

The P-Cores and E-Cores are on the exact same process, they're even on the same silicon, but the P-Cores top out about 1500MHz higher than the E-Cores do, because their architecture was designed with high clock rates as goal #1: more attention paid to optimizing critical paths, deeper pipelining, choosing leakier transistors with higher drive strength that switch faster but burn more power, etc.
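With completely made-up delay numbers, the clock relationship looks something like this:

```python
# The clock can only be as fast as the slowest register-to-register path
# (the critical path) allows. Numbers below are invented for illustration.
def fmax_ghz(critical_path_ps: float) -> float:
    return 1000.0 / critical_path_ps  # 1 / delay, ps -> GHz

print(fmax_ghz(400))           # ~2.5 GHz: a 400ps critical path caps the clock here

# Split that path over two pipeline stages (plus ~50ps of flop/setup overhead
# per stage) and the ceiling rises, at the cost of area, power and latency.
print(fmax_ghz(400 / 2 + 50))  # ~4.0 GHz
```

Shorter critical paths and deeper pipelines are exactly the knobs a "clocks first" design turns.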

You can also expend a lot of manual effort, both with AI analysis and regular old humans, trying to optimize the critical paths in the chip. Not to muddy the discussion too much by mentioning team green, but Anandtech's Pascal article goes into some pretty good detail on how careful design lets you run up the clocks on the same process.

Nothing to be done about it now after the fact though, the design is set in stone and the silicon is out in the wild.
 