AMD: RDNA 3 Speculation, Rumours and Discussion

Kaotik · Jan 26, 2021

Wesker said:
Plenty of opportunities for AMD to release a more mainstream GPU to replace the 5700 XT and compete with the RTX 3060.

This thing better be good...

Well that thing is coming and supposedly will compete with 3060 Ti (cut down model against 3060 then?), but it's unlikely related to this as we've seen several Navi2X-chip codenames and none are related to Nashira, summit or otherwise.

Bondrewd · Jan 26, 2021

DegustatoR said:
AMD Nashira Summit GPU spotted in AotS database

That's just a Navi21 (or 22) ES board.

Kaotik · Jan 26, 2021

Bondrewd said:
That's just a Navi21 (or 22) ES board.

It doesn't fit any of the previous codenames, though?

Bondrewd · Jan 26, 2021

Kaotik said:
It doesn't fit any of the previous codenames, though?

Cuz it's the board, not the GPU itself.
Remember that Vega12 ES? Big and bright-red.
Stuff like that.

Deleted member 90741 · Jan 28, 2021

AMD apparently filed two provisional patents in 2019 for ML based chiplets
"CHIPLET APPROACH FOR COUPLING GPU WITH MACHINE LEARNING ACCELERATION AT HIGH POWER EFFICIENCY," filed on Jul. 22, 2019
"HIGH BW INTER-CONNECTED CHIPLETS AND GPU FOR HIGH PERFORMANCE GAMING AND MACHINE LEARNING WORKLOADS," filed on Jul. 22, 2019

Full patent application is this (within 12 months of Provisional)
Filed: July 20, 2020

20210026686 CHIPLET-INTEGRATED MACHINE LEARNING ACCELERATORS

Techniques for performing machine learning operations are provided. The techniques include configuring a first portion of a first chiplet as a cache; performing caching operations via the first portion; configuring at least a first sub-portion of the first portion of the chiplet as directly-accessible memory; and performing machine learning operations with the first sub-portion by a machine learning accelerator within the first chiplet.

Seems like a ML accelerator for gaming implemented in the Infinity Cache chiplets. Seems gaming oriented.

https://www.freepatentsonline.com/20210026686.pdf

Deleted member 90741 · Jan 28, 2021

In continuation to the patent post
The Memory in the chiplet is divided into two parts, one acting as LLC and another part use to sync between the CUs and the Accelerator

... the APD scheduler 136 is capable of scheduling shader programs for execution in the compute units 132 while also scheduling operations for execution on the cache/machine learning accelerator chiplets 404

the machine learning accelerators 502 are capable of, and sometimes do, perform machine learning operations such as matrix multiplications that consume data within the directly-accessible memory 508 of the same chiplet 404 and output results of the operations to the directly-accessible memory 508 of the same chiplet 404.

CarstenS · Jan 28, 2021

Now, where have we seen something like this?

Rootax · Jan 28, 2021

This ML accelerator chiplet will never be found....

dobwal · Jan 28, 2021

CarstenS said:
Now, where have we seen something like this?

Xbox 360?

DegustatoR · Jan 28, 2021

Xe HPC more like

Megadrive1988 · Feb 8, 2021

According to Paul at Redgamingtech, Navi 31 and 32 will use chiplets, Navi 33 will be a monolithic die. All coming in 2022.

Navi 41 is early, no real silicon yet, but progressing well.

Frenetic Pony · Feb 9, 2021

Megadrive1988 said:
According to Paul at Redgamingtech, Navi 31 and 32 will use chiplets, Navi 33 will be a monolithic die. All coming in 2022.

Navi 41 is early, no real silicon yet, but progressing well.

Like, I know the guy was right before. But isn't this the same leak from like a month ago? The one that doesn't make a lot of sense because why would anyone make a chiplet that big with that low of a yield if you could just cut it in half and see yields skyrocket, design costs plummet, and get whatever flexibility you want with binning. Besides which, they'd need to design multiple chiplets for this (rather than one and just reuse like with Zen), cut another 25% of power just to hit 360 watts for this "160cu" top end chip, and run a 512bit bus with 18gbps GDDR6 or HBM just to supply the thing.

I'm just going to go ahead and doubt this one a bit, at least until concrete information on how this is supposed to be supported at all emerges.

Leoneazzurro5 · Feb 9, 2021

Frenetic Pony said:
Like, I know the guy was right before. But isn't this the same leak from like a month ago? The one that doesn't make a lot of sense because why would anyone make a chiplet that big with that low of a yield if you could just cut it in half and see yields skyrocket, design costs plummet, and get whatever flexibility you want with binning. Besides which, they'd need to design multiple chiplets for this (rather than one and just reuse like with Zen), cut another 25% of power just to hit 360 watts for this "160cu" top end chip, and run a 512bit bus with 18gbps GDDR6 or HBM just to supply the thing.

I'm just going to go ahead and doubt this one a bit, at least until concrete information on how this is supposed to be supported at all emerges.

Why they'd need to make multiple chiplets? They could have an "80 CU" chiplet and one I/O scalable die, both with adequate cache amount on die, so they could go for 80-160 CU with two chiplets and one I/O, and 40-80 CUs with one chiplet and I/o with less RAM bus size, and monolithic for everything below.

Rootax · Feb 9, 2021

Frenetic Pony said:
Like, I know the guy was right before. But isn't this the same leak from like a month ago? The one that doesn't make a lot of sense because why would anyone make a chiplet that big with that low of a yield if you could just cut it in half and see yields skyrocket, design costs plummet, and get whatever flexibility you want with binning. Besides which, they'd need to design multiple chiplets for this (rather than one and just reuse like with Zen), cut another 25% of power just to hit 360 watts for this "160cu" top end chip, and run a 512bit bus with 18gbps GDDR6 or HBM just to supply the thing.

I'm just going to go ahead and doubt this one a bit, at least until concrete information on how this is supposed to be supported at all emerges.

I was thinking the same thing about the chiplet size. But then I thought, it's their first gpu chiplet design, maybe they will keep it simple ? Anyway, we'll see

Jawed · Feb 9, 2021

GDDR bus size needs to scale with the count of graphics chiplets. An I/O chiplet providing GDDR doesn't do that.

sir doris · Feb 9, 2021

Jawed said:
GDDR bus size needs to scale with the count of graphics chiplets. An I/O chiplet providing GDDR doesn't do that.

Could an I/O chiplet be designed for the max bandwidth, then disable a proportion of the Infinity Fabric and GDDR channels for lesser designs?

no-X · Feb 9, 2021

Frenetic Pony said:
Like, I know the guy was right before. But isn't this the same leak from like a month ago? The one that doesn't make a lot of sense because why would anyone make a chiplet that big…

7nm Navi 21 is ~520 mm². Let's say the IO die (PCIe/DP/HDMI/UVD/VCE/ETC) will reduce the die by ~60 mm² (just a quick guess). That's 460 mm². At 5 nm the resulting chiplet size could be around 255 mm². Maybe a bit bigger because of the interface for chiplet interconnection. Is it really that big? At the time of RV770 AMD called it "sweet spot".

Kaotik · Feb 9, 2021

sir doris said:
Could an I/O chiplet be designed for the max bandwidth, then disable a proportion of the Infinity Fabric and GDDR channels for lesser designs?

Yes, but no.
edit:
To clarify, yes, it would be possible, but it would be beyond strange and stupid.
I/O wants to sit on the edge of the chip(let), scalable I/O die for GPU would need to be quite big just to accommodate wide enough bus for the high end GPUs, which would make it unpractical in anything lower end.
By having memory controller(s) in each compute chiplet the bus width would scale with GPU performance in sensible way.

Bondrewd · Feb 9, 2021

Frenetic Pony said:
a chiplet that big

Ugh broski it's 40 gfx11 WGPs on N5 without the fancy uncore even being there.
That's small.

no-X said:
At the time of RV770 AMD called it "sweet spot".

That was before costs exploded but yeah.

sir doris said:
Could an I/O chiplet be designed for the max bandwidth, then disable a proportion of the Infinity Fabric and GDDR channels for lesser designs?

Pointless, instead they're throwing N6 tapeouts at the problem.
See: Genoa.

Frenetic Pony · Feb 9, 2021

Bondrewd said:
Ugh broski it's 40 gfx11 WGPs on N5 without the fancy uncore even being there.
That's small.

It would be around 160-240mm or so as a good guess. That's 2-3x the size of a zen 3 chiplet on a node that ideally shrinks things by almost half. But still, yields don't actually go up that much if you cut it in half. So you have a point.

What's more, just found TSMC's tiny sram bragging for their 5nm. 256mb cache is tiny tiny, 5mm. Suddenly I can see why AMD went with SRAM cache on RDNA2. It doesn't make a lot of sense at the moment, but as a future investment for 5nm it seems sensible.

Ok, bandwidth and power are still problems. They'd need to 18gbps+ on a 512bit bus, or HBM. They'll also need to increase architecture power efficiency again to get that huge chip in a reasonable tdp, as realistically the 5nm shrink won't be enough alone. Still, it all seems more reasonable now, and a potential monster of a chip, which would explain why Nvidia is urging on their "Ada" arch ASAP.

For future RDNA though I can see AMD going partially, or wholly, Samsung for their GPUs. They make more money on CPUs, so it'd make more sense for them to stick with the best foundry there even though supply from them is limited. But Samsung's gaafet is a transition that's coming one way or another. And it should put up a competitive fight against TSMC's 5nm and even their first 3nm, which looks to be such a disappointment for their customers that they immediately rushed out an announcement that they'd be transitioning to gate all around quickly after that as well.

AMD: RDNA 3 Speculation, Rumours and Discussion

Kaotik

Drunk Member

Bondrewd

Kaotik

Drunk Member

Bondrewd

Deleted member 90741

Guest

Deleted member 90741

Guest

CarstenS

Moderator

Rootax

dobwal

DegustatoR

Megadrive1988

Frenetic Pony

Leoneazzurro5

Rootax

Jawed

sir doris

no-X

Kaotik

Drunk Member

Bondrewd

Frenetic Pony

Similar threads