AMD: RDNA 3 Speculation, Rumours and Discussion

Leoneazzurro5 · Sep 10, 2021

ToTTenTranz said:
I don't know who greymon is, but either he or @Bondrewd is wrong, then.

The answer to that tweet referred to Nvidia first, then to all next-gen. And IIRC no one said Q1/22 for RDNA3. I'd say everyone pointed to 3rd/4th quarter 2022 since the beginning.

LordEC911 · Sep 18, 2021

Leoneazzurro5 said:
The answer to that tweet referred to Nvidia first, then to all next-gen. And IIRC no one said Q1/22 for RDNA3. I'd say everyone pointed to 3rd/4th quarter 2022 since the beginning.

The only talk of Q4 '21 or Q1 '22 I can remember is a few months back when the regular clickbait rumor sites were posting about tapeouts.

Edit- Example from wccftech

Ferman · Sep 19, 2021

With so many not able to buy current generation GPUs, it is going to be very hard to release a refreshed line up. Especially if it will be just as hard to buy the new GPUs.

Leoneazzurro5 · Sep 19, 2021

https://twitter.com/x/status/1439486356548325387

Basically another confirmation of the specifications known so far

xpea · Sep 19, 2021

Leoneazzurro5 said:
https://twitter.com/x/status/1439486356548325387

Basically another confirmation of the specifications known so far

n33 6nm 128bit gddr6 perf>6900xt
With 128 bit bus thus half bandwidth? Hard to believe...

Bondrewd · Sep 19, 2021

xpea said:
n33 6nm 128bit gddr6 perf>6900xt

Yes

xpea said:
With 128 bit bus thus half bandwidth? Hard to believe...

The magick of gigacache!
Granted not much more perf than 6900XT, but the wattage is also way way lower.

Leoneazzurro5 · Sep 19, 2021

So in the same discussion the leaker says that the low end cards may appear as a N2X 6nm refresh, due in 2023. If this is true, I'd say that will include only the smallest cards.

Bondrewd · Sep 19, 2021

Leoneazzurro5 said:
low end cards

No such thing.

Jawed · Sep 19, 2021

Would Navi 33 need 256MB of Infinity Cache to exceed 6900XT and compensate for half the GDDR bus width? I don't think so. 128MB, the same as Navi 21, merely clocked higher should be dominant in terms of overall memory efficiency. And if we assume this GPU is aimed at 1440p gaming (instead of 4K) then we could say that this amount of Infinity Cache is comfortable at this performance level.

6600XT is essentially the same performance as 5700XT. The smaller bus (50%) and die (94%) along with 7% more transistors gives this performance despite using 71% of the power. Infinity Cache is somewhere in the region of 2% of the transistors if we use the 6 transistor per bit rule of thumb and add a bit for supporting hardware.

We could conclude that the narrower memory bus is allowing Navi 2x to spend more power on die for actual graphics work and that the power consumption of Infinity Cache is practically negligible, since most of those transistors are "idle" at any one time.

But Navi 3x won't get this Infinity Cache boost. So we come back to 128MB, on its own, as determining the performance of Navi 33.

I think it's reasonable to expect Navi 33 to be about 320-350mm². This article about TSMC 6nm:

TSMC Reveals 6 nm Process Technology: 7 nm with Higher Transistor Density (anandtech.com)

implies 15% more transistors. I suppose that puts it in the region of 21-23B transistors assuming that some of those extra transistors would be available because of a reduction in GDDR PHY size from 192-bit to 128-bit (saves about 16mm²?). That's about 5B short of the transistor count of Navi 21 (6900XT)

So the extra transistors available for GPU work, versus Navi 22 (6700XT), is about 5B. So that's not going to take Navi 33 to the equivalent of 64 CUs - only about 52. So that implies much higher clocks, e.g. 3.2GHz+.

I do expect the ALU:TMU ratio to get doubled in Navi 3x, but that's not going to save a huge amount of transistors. 2% of an "equivalent-CU" saving?

It seems reasonable that ALU:RA will get halved or quartered in Navi 3x, but I wouldn't be surprised if that eats up all the savings of reduced TMU count. I'm assuming RA math doesn't use TMU math units, merely that data-paths and scheduling (queuing and coalescing memory operations) are largely common. If the TMU math units have been generalised to allow them to also do ray-box and ray-triangle tests, then I guess texturing throughput will continue rising to beyond-crazy levels, just to advance ray-acceleration throughput.

The big unknown is still the hints of a radically different WGP design. AMD's trend with RDNA has been to increase the quantity of "uncore" per WGP, so I think that would put a strong limit on the increase in "equivalent-CU" count.

In other words I think brute clocks are going to be more significant than brute SIMD count, in getting to 6900XT performance at 350mm² or less. Extra uncore may well unlock yet better "SIMD IPC".

We may also see that ">6900XT" actually only refers to ray-tracing performance. In pure rasterisation workloads Navi 33 might fall far short of 6900XT when ALU-limited.

LordEC911 · Sep 19, 2021

Leoneazzurro5 said:
https://twitter.com/x/status/1439486356548325387

Basically another confirmation of the specifications known so far

I find it hard to believe that they would cut down the bus on N32.

Bondrewd said:
No such thing.

So what is the minimum price now? $199 or $299?
We have obviously seen the last of <$150 DGPUs that are current generation and in production.

Bondrewd · Sep 19, 2021

LordEC911 said:
I find it hard to believe that they would cut down the bus on N32.

?

LordEC911 said:
So what is the minimum price now?

Should be like $450 or so.

Jawed said:
I think it's reasonable to expect Navi 33 to be about 320-350mm²

A wee bit more.

Jawed said:
So that's not going to take Navi 33 to the equivalent of 64 CUs - only about 52

?
The ALU count is identical to N21, which is 5120.

Deleted member 13524 · Sep 19, 2021

LordEC911 said:
The only talk of Q4 '21 or Q1 '22 I can remember is a few months back when the regular clickbait rumor sites were posting about tapeouts.

I'm only assuming N3x is Q2 2022 because @Bondrewd claimed AMD is on a 6 quarter cadence between GPU families. I don't follow random rumors from wccftech.

Jawed said:
Would Navi 33 need 256MB of Infinity Cache to exceed 6900XT and compensate for half the GDDR bus width? I don't think so.

Going by AMD's famous graph of cache usage by target resolution, it does look like 256MB would fit some >75% of memory requests at 4K.

If 75% of the requests are done at ~2TB/s, then the remaining 25% can probably come at 256GB/s because the effective bandwidth will still end at ~1.6TB/s.

no-X · Sep 19, 2021

One thing is the targeted cadence, the other is the real situation, affected by Covid, TSMC being overloaded, mining, etc. That could easily cause a 4-month delay (Navi 23 was already delayed by several months).

Deleted member 13524 · Sep 19, 2021

no-X said:
One thing is the targeted cadence, the other is the real situation, affected by Covid, TSMC being overloaded, mining, etc. That could easily cause a 4-month delay (Navi 23 was already delayed by several months).

Then unless nvidia is getting special treatment by TSMC they're getting Lovelace pushed to 2023.

Qesa · Sep 19, 2021

ToTTenTranz said:
Going by AMD's famous graph of cache usage by target resolution, it does look like 256MB would fit some >75% of memory requests at 4K.

If 75% of the requests are done at ~2TB/s, then the remaining 25% can probably come at 256GB/s because the effective bandwidth will still end at ~1.6TB/s.

If there's a 25% miss rate and DRAM is 256 GB/s, the effective bandwidth can be no higher than 1 TB/s, no matter how fast the bandwidth is on hits.

If the cache is serving up 2 TB/s from hits and 0.25 TB/s from misses, by definition that's an 89% hit rate.
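A quick sketch of that arithmetic, for anyone who wants to plug in other numbers. It assumes a simple model where every miss must be served from DRAM, so DRAM bandwidth divided by the miss rate caps total throughput. The function names are made up for illustration, and 2 TB/s is taken as 2000 GB/s.

```python
def max_effective_bandwidth(dram_gbps, hit_rate):
    """Upper bound on total bandwidth when all misses go to DRAM.

    Misses are (1 - hit_rate) of all traffic, and they cannot exceed
    dram_gbps, so total traffic is capped at dram_gbps / miss_rate.
    """
    miss_rate = 1.0 - hit_rate
    return dram_gbps / miss_rate


def implied_hit_rate(hit_gbps, miss_gbps):
    """Hit rate implied by the bandwidth served from cache vs. DRAM."""
    return hit_gbps / (hit_gbps + miss_gbps)


# 75% hit rate with 256 GB/s of DRAM: the cap is about 1 TB/s
ceiling = max_effective_bandwidth(256.0, 0.75)

# 2 TB/s from hits plus 0.25 TB/s from misses: about an 89% hit rate
rate = implied_hit_rate(2000.0, 250.0)

print(f"ceiling: {ceiling:.0f} GB/s, implied hit rate: {rate:.1%}")
```

This ignores latency and any overlap effects; it only bounds bandwidth.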

trinibwoy · Sep 19, 2021

128-bit gddr6 > 6900xt. That would be some impressive voodoo if it’s true at higher resolutions.

Bondrewd · Sep 19, 2021

trinibwoy said:
That would be some impressive voodoo if it’s true at higher resolutions.

Still has the N21 limitation of sorta dies at 4k.

Either way you're not running higher resolutions off an 8GB framebuffer.

trinibwoy · Sep 20, 2021

Bondrewd said:
Still has the N21 limitation of sorta dies at 4k.

Either way you're not running higher resolutions off an 8GB framebuffer.

Makes sense.

Jawed · Sep 20, 2021

Jawed said:
Infinity Cache is somewhere in the region of 2% of the transistors if we use the 6 transistor per bit rule of thumb and add a bit for supporting hardware.

Sigh, when I was writing that I felt something was wrong, but couldn't put my finger on it. Bed resolved the problem: I hadn't accounted for bytes! So 8x 2%. ARGH.
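The corrected estimate can be reproduced roughly as below. The 6-transistors-per-bit rule of thumb is from the post. The 32MB Infinity Cache size and the ~11.06B transistor total for Navi 23 (6600XT) are figures brought in from outside this thread, so treat them as assumptions. Supporting hardware is not counted.

```python
def sram_transistors(cache_mib, transistors_per_bit=6):
    """Transistors in the SRAM array alone, using the 6T-per-bit rule of thumb."""
    bits = cache_mib * 1024 * 1024 * 8  # MiB -> bytes -> bits
    return bits * transistors_per_bit


# Assumptions, not stated in the thread:
NAVI23_CACHE_MIB = 32          # Infinity Cache on Navi 23 (6600XT)
NAVI23_TRANSISTORS = 11.06e9   # approximate total transistor count

cache = sram_transistors(NAVI23_CACHE_MIB)
share = cache / NAVI23_TRANSISTORS

# The original slip: treating megabytes as if they were megabits
bits_as_bytes_share = share / 8

print(f"cache array: {cache / 1e9:.2f}B transistors, {share:.1%} of the die")
print(f"with the bits/bytes mix-up: {bits_as_bytes_share:.1%}")
```

Under those assumptions the array alone lands a little under 15%, which is in line with "8x 2%" once some supporting hardware is added.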

Bondrewd said:
The ALU count is identical to N21, which is 5120.

Well, we can subtract 32mm² off for 128-bit GDDR6 and save ~15% area to translate 520mm² on 7nm to about 424mm².

Perhaps the "new" WGP arrangement means there's only two shader engines, not four. This would reduce the count of ROPs, e.g. to 64. That would save a fair amount of die space... I reckon 32 ROPs are about the same area as a WGP, so just 64 ROPs saves in the region of 10mm². Perhaps 32mm² total saving with only two shader engines?

So (520-64)/1.15 takes us to 396mm² (ignoring the non-scaling of 128-bit GDDR6).
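The two area estimates above follow the same recipe, sketched here. It assumes Navi 21 at 520mm² on 7nm, a flat 1.15 density factor for 6nm applied to the whole die, and the guessed savings of 32mm² (narrower GDDR6 PHY) and a further 32mm² (two shader engines instead of four). The helper name is hypothetical.

```python
def scaled_area(area_7nm_mm2, removed_mm2, density_gain=1.15):
    """Remove area on 7nm first, then shrink the remainder by the density gain."""
    return (area_7nm_mm2 - removed_mm2) / density_gain


NAVI21_AREA = 520   # mm² on 7nm, as used in the post
PHY_SAVING = 32     # mm², guess for going to 128-bit GDDR6
SE_SAVING = 32      # mm², guess for two shader engines instead of four

phy_only = scaled_area(NAVI21_AREA, PHY_SAVING)
phy_and_se = scaled_area(NAVI21_AREA, PHY_SAVING + SE_SAVING)

print(f"PHY saving only: {phy_only:.1f} mm²")
print(f"PHY + shader engine saving: {phy_and_se:.1f} mm²")
```

As noted, this applies the shrink to the GDDR6 PHY as well, which in practice scales poorly, so the real figure would likely be somewhat larger.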

If this is really an 8GB card then it seems as if it would need to be positioned as a 1080p card, "7600XT".

I think power consumption is actually a bigger problem than performance, if this is really a 1080p card and around 150W.

Bondrewd · Sep 20, 2021

Jawed said:
So (520-64)/1.15 takes us to 396mm² (ignoring the non-scaling of 128-bit GDDR6).

Yep!
Sorta-kinda there.

Jawed said:
If this is really an 8GB card then it seems as if it would need to be positioned as a 1080p card, "7600XT".

Unfortunately yes, clamshells are wildly impractical for anything resembling a mainstream GPU and 24Gb DRAMs are DDR5 only for now.
