AMD: RDNA 3 Speculation, Rumours and Discussion

Jawed · Jul 31, 2021

Bondrewd said:
Dawg this is from AMD slides.
The wonky looking internal ones at that.

The slides designed to out leakers? :mrgreen:

Bondrewd · Jul 31, 2021

Jawed said:
The slides designed to out leakers?

Don't think so, those aren't partner-distributed at all.
Whoever at Shanghai spilled the N31 config will probably get a whacking tho...

CarstenS · Jul 31, 2021

Bondrewd said:
N5 SRAM scaling is miserable, like 1.15x or so?

More like 30% denser. Or this is specific to AMD's SRAM implementation.

Entropy · Jul 31, 2021

CarstenS said:
More like 30% denser. Or this is specific to AMD's SRAM implementation.

At IEDM 2019, the 5nm process was quoted to have 1.84x logic density improvement compared to 1.35x SRAM density improvement.

Since we only really have two mobile SoCs to go by when it comes to what this means for actual products, their improvement in overall transistor density was roughly 50%. What this means for HP designs is not clear.

Bondrewd · Jul 31, 2021

Entropy said:
Since we only really have two mobile SoCs

Yea and SRAM scaling there isn't 1.35X either.

CarstenS · Jul 31, 2021

Thanks for restating what I just said guys.

Nebuchadnezzar · Jul 31, 2021

GCDs not having IMCs is pretty obvious, given that the whole point of chiplets is to put the badly scaling analog blocks on the larger, cheaper process. Furthermore, it would be a clusterfuck to have the L3 on the MCD and then have traffic from that interleave back through the GCDs to DRAM; it's utter nonsense.

Jawed · Jul 31, 2021

Nebuchadnezzar said:
GCDs not having IMCs is pretty obvious, given that the whole point of chiplets is to put the badly scaling analog blocks on the larger, cheaper process. Furthermore, it would be a clusterfuck to have the L3 on the MCD and then have traffic from that interleave back through the GCDs to DRAM; it's utter nonsense.

Like the 3D V-cache you mean?

Leoneazzurro5 · Jul 31, 2021

Jawed said:
Like the 3D V-cache you mean?

In V-cache the cache is stacked on top of the other cache, so the data paths are the shortest possible. If the GCD had its own memory interface and the stacked cache had to be positioned on the other side of the die like in your sketch, you would add a lot of distance to the data paths, which would limit frequency and add power consumption.

Nebuchadnezzar · Jul 31, 2021

Jawed said:
Like the 3D V-cache you mean?

As mentioned, V-cache is an extension of the existing L3 - it's just additional banks and zero change to the data flow. That's not what's happening with the MCD. Incidentally, I believe AMD will at some point move to a giant stacked L4; it cannot be anything but an L4, since it has to be centralised for coherency.

Jawed · Aug 1, 2021

We've discussed the active bridge chiplet before:

ACTIVE BRIDGE CHIPLET WITH INTEGRATED CACHE - ADVANCED MICRO DEVICES, INC. (freepatentsonline.com)

and we have seen other patent documents:

https://forum.beyond3d.com/posts/2212201/

that relate to distributed tasks and performing DMA operations across distributed processors and their respective PHYs.

MCD at around 300mm²:

ethernity said:
It should be on MCD.
I imagine AMD would take the best of both worlds. N5P GCD for absolute logic density and performance and N6 MCD with HD/SRAM optimized libraries for lower cost per MB IC.
N5P SRAM density gain over N7/N6 is very mediocre.
512MB SRAM on N6 with optimized libraries would only be 280-300mm2 (figures estimated from wikichip data, behind paywall). On N5 hardly any better, around 250+mm2, but much costlier.
But all those logic blocks can scale very high almost 1.48x with N5P (assuming AMD goes with N5P for GPUs else 1.85x on plain N5)
I suppose 2x GCD + 1x MCD would be closing in around 1000mm2 or maybe even more. Will cost a pretty penny.

will not have enough perimeter for 256-bit GDDR6 and all the other GPU IO and 2x 2TB/s (guess) L3 interfaces to each GCD.

Frenetic Pony · Aug 1, 2021

CarstenS said:
More like 30% denser. Or this is specific to AMD's SRAM implementation.

Err, was basing the "high density sram" off: https://www.anandtech.com/show/15219/early-tsmc-5nm-test-chip-yields-80-hvm-coming-in-h1-2020

Was this wrong? I just assumed the calculations for SRAM density at 5nm were right, never bothered to check. But 128MB of SRAM from this is what, just over 20 mm^2 there, so... not huge?
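As a rough check on the "just over 20 mm^2" figure, here is a minimal sketch of the on-paper arithmetic. It assumes a 0.021 µm² high-density bitcell for N5 (an assumption, the number is not quoted in this thread) and counts raw bitcells only: no tags, ECC, sense amps, redundancy or routing. The function and constant names are made up for illustration.

```python
def sram_area_mm2(capacity_mib: float, cell_um2: float) -> float:
    """Raw bitcell area in mm^2 for a cache of the given capacity (MiB)."""
    bits = capacity_mib * 1024 * 1024 * 8
    return bits * cell_um2 / 1e6  # 1 mm^2 = 1e6 um^2

# Assumed N5 high-density bitcell size in um^2 (not stated in this thread).
N5_HD_CELL_UM2 = 0.021

area_128mb = sram_area_mm2(128, N5_HD_CELL_UM2)
print(f"128 MiB of raw N5 HD bitcells: {area_128mb:.1f} mm^2")
```

This lands at roughly 22.5 mm^2, so the estimate holds as a bitcell-only lower bound; a real cache macro will be larger.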

Qesa · Aug 1, 2021

Frenetic Pony said:
Err, was basing the "high density sram" off: https://www.anandtech.com/show/15219/early-tsmc-5nm-test-chip-yields-80-hvm-coming-in-h1-2020

Was this wrong? I just assumed the calculations for SRAM density at 5nm were right, never bothered to check. But 128MB of SRAM from this is what, just over 20 mm^2 there, so... not huge?

N7 has an HD SRAM cell size of 27000 nm^2, so you're looking at a ~28% density improvement on paper.

Critically, this density is never even close to being achieved IRL. With their Zen 3 V-cache, AMD fit 64 MB of L3$ in a 36 mm^2 die. That's ~67000 nm^2 per bit, less than half the theoretical density. And, also from AMD, this was about twice the density of the L3 on the Zen 3 CCD and RDNA2.
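The numbers above can be reproduced with a short sketch. The 27000 nm^2 N7 cell and the 64 MB in 36 mm^2 V-cache die are from the post; the 21000 nm^2 N5 cell is an assumption chosen to match the ~28% figure, and the 512 MiB line is only a naive linear scaling at V-cache density, not a claim about any real MCD. All names are hypothetical.

```python
MM2_TO_NM2 = 1e12  # 1 mm^2 = 1e12 nm^2

def nm2_per_bit(capacity_mib: float, die_area_mm2: float) -> float:
    """Effective die area per stored bit, overheads included."""
    bits = capacity_mib * 1024 * 1024 * 8
    return die_area_mm2 * MM2_TO_NM2 / bits

N7_HD_CELL_NM2 = 27000   # from the post
N5_HD_CELL_NM2 = 21000   # assumed, not stated in the thread

# On-paper N7 -> N5 bitcell density gain.
paper_gain = N7_HD_CELL_NM2 / N5_HD_CELL_NM2

# Zen 3 V-cache die: 64 MB of L3 in 36 mm^2.
vcache_nm2 = nm2_per_bit(64, 36)

# Fraction of the theoretical N7 bitcell density actually achieved.
achieved_fraction = N7_HD_CELL_NM2 / vcache_nm2

# Naive scaling: area of 512 MiB at the same effective density.
area_512_mm2 = 512 * 1024 * 1024 * 8 * vcache_nm2 / MM2_TO_NM2

print(f"paper gain N7->N5: {paper_gain:.2f}x")
print(f"V-cache: {vcache_nm2:.0f} nm^2/bit, {achieved_fraction:.0%} of theoretical")
print(f"512 MiB at V-cache density: {area_512_mm2:.0f} mm^2")
```

At V-cache density, 512 MiB comes out near 288 mm^2, which happens to sit inside the 280-300 mm^2 range quoted earlier in the thread.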

Frenetic Pony · Aug 1, 2021

Qesa said:
N7 has an HD SRAM cell size of 27000 nm^2, so you're looking at a ~28% density improvement on paper.

Critically, this density is never even close to being achieved IRL. With their Zen 3 V-cache, AMD fit 64 MB of L3$ in a 36 mm^2 die. That's ~67000 nm^2 per bit, less than half the theoretical density. And, also from AMD, this was about twice the density of the L3 on the Zen 3 CCD and RDNA2.

So while TSMC provides standard reference libraries, actual implementation will be a question mark. Thus any estimation of SRAM size on RDNA3 is... kind of an open question.

Well, so much for early cost estimations then. Thanks for the info.

TESKATLIPOKA · Aug 1, 2021

Greetings to all members of beyond3d forum.
.

Bondrewd said:
Ehh, ballpark ~440mm^2 but it's also less mem than N22.
Feasible for 450 bucks.

So from your post I assume Navi 33 has only a 128-bit GDDR6 bus, right?
The question is who would want to buy Navi 33 for at least $450 with not even 12GB of VRAM next year? Even if it performs like an RX 6900 XT, with only 8GB of VRAM it's a hard sell in my opinion.

Kaotik · Aug 1, 2021

TESKATLIPOKA said:
Greetings to all members of beyond3d forum.
.

So from your post I assume Navi 33 has only a 128-bit GDDR6 bus, right?
The question is who would want to buy Navi 33 for at least $450 with not even 12GB of VRAM next year? Even if it performs like an RX 6900 XT, with only 8GB of VRAM it's a hard sell in my opinion.

Make it 16 gigs then? GDDR6 supports clamshelling

Bondrewd · Aug 1, 2021

TESKATLIPOKA said:
The question is who would want to buy Navi 33 for at least $450

Uh I mean do you really have a choice?

TESKATLIPOKA said:
Even if it performs like an RX 6900 XT, with only 8GB of VRAM it's a hard sell in my opinion.

It's a mobile-first part, much the same way N23 is.

CarstenS · Aug 1, 2021

Frenetic Pony said:
Err, was basing the "high density sram" off: https://www.anandtech.com/show/15219/early-tsmc-5nm-test-chip-yields-80-hvm-coming-in-h1-2020

Was this wrong? I just assumed the calculations for SRAM density at 5nm were right, never bothered to check. But 128MB of SRAM from this is what, just over 20 mm^2 there, so... not huge?

I was going by this, which seems to be based on newer data:
https://en.wikichip.org/wiki/5_nm_lithography_process#N5
https://fuse.wikichip.org/news/3398/tsmc-details-5-nm/

And yes, this is the reference implementation; that's why I asked whether this 15% figure was maybe based on an AMD-specific implementation.

edit: To which, funnily enough, I've not gotten an answer from you know who. Apparently not gotten cleared to leak juicy bits.

Leoneazzurro5 · Aug 1, 2021

Jawed said:
MCD at around 300mm²:

Probably bigger. It seems to be on N6 and not on N5, and if you integrate more things than just cache it could be over 400 mm^2.

Jawed said:
will not have enough perimeter for 256-bit GDDR6 and all the other GPU IO and 2x 2TB/s (guess) L3 interfaces to each GCD.

The whole purpose of stacking is to increase area density and interconnect bandwidth by using vertical connections, so you are not limited by perimeter. Only the VRAM bus and I/O connections would sit on the perimeter of the MCD, while the inter-GCD bandwidth would go through the cache itself and the vertical interconnect paths.

Bondrewd · Aug 1, 2021

Leoneazzurro5 said:
Probably bigger

Nope.
You're still not thinking the right size.
Think less AMD Rome and more AMD Navi31.
