AMD: RDNA 3 Speculation, Rumours and Discussion

The diagrams are now available, e.g.:

[diagram attachment: b3da048.png]


In this diagram each AID features:
  • cache (610)
  • command processor (606)
  • GDDR PHY (614)
Stacked upon each AID are shader engine dies, and the AIDs are connected to one another by bridge chiplets. The Multimedia and IO Die (708) is shown as a separate chiplet mounted on the MCM.
 
Model     | Die size [mm2] / node | Frequency [MHz] | SM or WGP | Shaders | TMU  | ROP    | Cache [MB] | Bus width [bit] | Mem speed [Gbps] | VRAM [GB]
AD106     | ~190, TSMC N5 (4N)    | 2610?           | 40?       | 5120?   | 160? | 40-48? | 32?        | 128?            | 21-22?           | 8?
N33       | ~203, TSMC N6         | 3200?           | 16        | 4096    | 128? | 48-64? | 32         | 128             | 20?              | 8
RX 6950XT | 520, TSMC N7          | 2310            | 40        | 5120    | 320  | 128    | 128        | 256             | 18               | 16

Model     | Peak FP32 [TFLOPS] | Peak texture fill rate [GT/s] | Peak pixel fill rate [GP/s] | Bandwidth [GB/s]
AD106     | 26.7               | 417.6                         | 104.4-125.3                 | 336-352
N33       | 26.2               | 409.6                         | 153.6-204.8                 | 320
RX 6950XT | 23.7               | 739.2                         | 295.7                       | 576
* RX 6950XT (N21) included just for comparison

Spec-wise, they look pretty comparable.
I have to wonder which one will be faster in pure raster performance; in RT Nvidia should win.
Although RDNA3 has lower performance per FLOP than RDNA2, I think it should still hold an advantage in performance per FLOP over Ada.
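If anyone wants to sanity-check the derived numbers, here's a minimal sketch of the arithmetic behind the second table (Python; assumes the usual 2 FLOPs per shader per clock for FMA and takes the upper end of the ranged AD106 ROP and memory-speed guesses):

```python
# Minimal sketch: reproduce the derived peak figures from the speculated specs.
# Assumes 2 FLOPs per shader per clock (FMA); for the ranged AD106 entries the
# upper end of the ROP and memory-speed guesses is used.

specs = {
    # name: (boost MHz, shaders, TMUs, ROPs, bus width [bit], mem speed [Gbps])
    "AD106":     (2610, 5120, 160, 48, 128, 22),
    "N33":       (3200, 4096, 128, 64, 128, 20),
    "RX 6950XT": (2310, 5120, 320, 128, 256, 18),
}

for name, (mhz, shaders, tmus, rops, bus, gbps) in specs.items():
    ghz = mhz / 1000
    fp32 = shaders * 2 * ghz / 1000      # TFLOPS
    tex = tmus * ghz                     # GT/s
    pix = rops * ghz                     # GP/s
    bw = bus * gbps / 8                  # GB/s
    print(f"{name}: {fp32:.1f} TFLOPS, {tex:.1f} GT/s, {pix:.1f} GP/s, {bw:.0f} GB/s")
```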
 
I doubt that RDNA3 on N6 will have a sizeable clock advantage over Lovelace on N5. 2.6 GHz seems a bit conservative for AD106, while 3.2 GHz seems a bit high for N33.
 
> I doubt that RDNA3 on N6 will have a sizeable clock advantage over Lovelace on N5. 2.6 GHz seems a bit conservative for AD106, while 3.2 GHz seems a bit high for N33.
2.6 GHz is the same as the official boost for the RTX 4080 12GB; that's why I used it.
BTW, I just checked the desktop RTX 3000 models and the official boost is in the range of 1665-1777 MHz, only the 3080 Ti has 1860 MHz. So AD106 should end up around ~2.7 GHz. Maybe later they will refresh the lineup with higher-clocked models.

Both RDNA1 and RDNA2 use the N7 process, yet the boost frequency increased from 1980 MHz (5700 XT 50th Anniversary Edition) to 2635 MHz (Radeon RX 6650 XT), and that's +33%.
The N6-based RX 6500 XT goes as high as 2815 MHz, although its power consumption is very bad.
I don't think 3.2 GHz (+14%) is unreasonably high, even if it's built on N6.
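A quick check of those percentages (Python; I'm assuming the +14% is measured against the 6500 XT's 2815 MHz, which is how I read the post):

```python
# Clock jumps quoted above, all boost clocks in MHz.
rx5700xt_aniv = 1980   # RDNA1, N7 (5700 XT 50th Anniversary Edition)
rx6650xt      = 2635   # RDNA2, N7
rx6500xt      = 2815   # RDNA2, N6
n33_rumour    = 3200   # speculated RDNA3 N33 boost

print(f"RDNA1 -> RDNA2 on N7:          +{(rx6650xt / rx5700xt_aniv - 1):.0%}")  # ~+33%
print(f"RX 6500 XT (N6) -> N33 rumour: +{(n33_rumour / rx6500xt - 1):.0%}")     # ~+14%
```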
 
Don't NV GPUs have this base clock, boost and game clock? I think the advertised clocks are lower than what these GPUs actually clock to (as high as temps, load etc. allow). This was introduced with Pascal, I think.
 
> Don't NV GPUs have this base clock, boost and game clock? I think the advertised clocks are lower than what these GPUs actually clock to (as high as temps, load etc. allow). This was introduced with Pascal, I think.
For Nvidia the official boost is lower than the actual clock speed achieved, from what I've seen in reviews.
AMD has base clock, game clock and boost.
 
> The 2080 Ti has a 1350 MHz base and 1545 MHz boost, but it does clock higher than that boost?
Yes. NVIDIA's logic is that they'll tell you what every chip will definitely boost to, and after that may the silicon lottery decide your actual clocks.
 
> Yes. NVIDIA's logic is that they'll tell you what every chip will definitely boost to, and after that may the silicon lottery decide your actual clocks.
It's less about "silicon lottery" and more about the type of load the card is running at each moment. The "silicon lottery" part is handled by chip binning, with better-performing chips ending up in AIBs' factory-OC models.
 
> Yes. NVIDIA's logic is that they'll tell you what every chip will definitely boost to, and after that may the silicon lottery decide your actual clocks.

Exactly, their advertised boost clocks aren't the highest clocks, quite far from it. Another example is my GTX 1050 in an older laptop: the advertised boost clock is 1455 MHz, but it averages 1720 MHz according to HWiNFO64 under gaming.

My 2080 Ti averages 1720 to 1740 MHz depending on the game/workload (HWiNFO64; BF2042, Infinite). Rarely do these GPUs not clock higher than the advertised boost clocks.

> It's less about "silicon lottery" and more about the type of load the card is running at each moment. The "silicon lottery" part is handled by chip binning, with better-performing chips ending up in AIBs' factory-OC models.

This. If you have bad airflow/higher temps, that will affect where the boost ends up. The advertised clocks are a minimum; the card can clock higher than that if temperatures and load allow.
 
> It's less about "silicon lottery" and more about the type of load the card is running at each moment. The "silicon lottery" part is handled by chip binning, with better-performing chips ending up in AIBs' factory-OC models.
Of course the best chips usually go to the best models, but it's still more or less the silicon lottery that decides the actual clocks on "basic models" too. Load plays its own role as well, but as shown in reviews there are loads that push them even under the supposed base clocks, let alone the advertised boost clocks, so the advertised figure can't be relied upon without knowing what specific load NVIDIA had in mind (if any).

As Diamond.G pointed out, the same happens to some extent on AMD, and it's the same silicon lottery there that decides where exactly your card lands, but AMD's reported clocks are usually much closer to the actual clocks.

I want to go back to the days when you knew exactly what you were getting when you bought it from the store: the card has this many units, they run at this many MHz, end of story.
Now I can buy a dozen 'identical' cards, none of which perform the same.
 
> Of course the best chips usually go to the best models, but it's still more or less the silicon lottery that decides the actual clocks on "basic models" too. Load plays its own role as well, but as shown in reviews there are loads that push them even under the supposed base clocks, let alone the advertised boost clocks, so the advertised figure can't be relied upon without knowing what specific load NVIDIA had in mind (if any).
>
> As Diamond.G pointed out, the same happens to some extent on AMD, and it's the same silicon lottery there that decides where exactly your card lands, but AMD's reported clocks are usually much closer to the actual clocks.
>
> I want to go back to the days when you knew exactly what you were getting when you bought it from the store: the card has this many units, they run at this many MHz, end of story. Now I can buy a dozen 'identical' cards, none of which perform the same.
I can't remember even one time when any load pushed any of my recent GeForce cards below base clocks. Some loads do put them slightly below the rated boost, but certainly not below base, and such loads are actually rare (and mostly weirdly "light", in the sense that a heavy RT game is unlikely to do this but some older raster-only game running at >100 fps might).

The difference in clocks between cards of the same model ("silicon lottery") is completely negligible (like a couple of clocking steps, ±20 MHz).
 
> I can't remember even one time when any load pushed any of my recent GeForce cards below base clocks. Some loads do put them slightly below the rated boost, but certainly not below base, and such loads are actually rare (and mostly weirdly "light", in the sense that a heavy RT game is unlikely to do this but some older raster-only game running at >100 fps might).
>
> The difference in clocks between cards of the same model ("silicon lottery") is completely negligible (like a couple of clocking steps, ±20 MHz).

"Base clocks" should be just that, guaranteed clocks for whatever bin you're buying. "Boost clocks" just mean "we think it can hit this in the right circumstances and anyway it's capped here". And then variability can as often go to temporary powerdraw/heat. We can already see this with the current AD test cards that are out, with one reviewer reporting a max powerdraw of 500 watts while another reports 600.

The difference in cards is a good thing, it means vendors are pulling everything they can out of the cards and being clever about how they handle variability by ensuring temporary power draw and the coolers are rated for the worst case scenario of the worst of the bin they're doing. The "old way" of getting everything to run at the base minimum means everything runs at the speed of the absolute worst cards with no variability in powerdraw/heat allowed unless you overclocked yourself. So sure this means some variability in performance as some cards hit power/heat limits quicker and whatever. But it's giving people the most for their money.
 
Base is meaningless these days; vendors should just quote fmax and the average workload clock (vidya for vidya GPUs and, say, GEMM brrrr for compute sticks).
 
> Spec-wise, they look pretty comparable.
> I have to wonder which one will be faster in pure raster performance; in RT Nvidia should win.
> Although RDNA3 has lower performance per FLOP than RDNA2, I think it should still hold an advantage in performance per FLOP over Ada.

The supposed N33 is on N6, much less dense but also less expensive overall than whatever Nvidia will be offering.

The 32 MB of LLC there really drops performance when moving from 1080p to 1440p. If they can stack another 32 MB on top of it, it'll be a great 1440p performer for the size/production cost.
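Purely as a back-of-the-envelope illustration of why a fixed 32 MB LLC covers less of the frame at higher resolutions; the ~16 bytes of "hot" render-target data per pixel is my own assumption, not anything AMD has published:

```python
# Rough render-target footprint vs a 32 MB last-level cache.
# BYTES_PER_PIXEL is an assumed figure (colour + depth + a couple of G-buffer
# targets); the point is only the scaling with resolution, not the exact numbers.

BYTES_PER_PIXEL = 16
LLC_MB = 32

for name, (w, h) in {"1080p": (1920, 1080),
                     "1440p": (2560, 1440),
                     "2160p": (3840, 2160)}.items():
    footprint_mb = w * h * BYTES_PER_PIXEL / 2**20
    print(f"{name}: ~{footprint_mb:.0f} MB hot render targets, "
          f"{LLC_MB} MB LLC covers ~{LLC_MB / footprint_mb:.0%}")
```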
 