AMD: Navi Speculation, Rumours and Discussion [2019-2020]

Kaotik · Jan 21, 2020

no-X said:
The 7nm process by itself should reduce power consumption by at least 50%. They spent a really long time (and a lot of resources) on Ampere development. Isn't it logical to expect significant improvements on the architectural side?

TSMC says 7nm compared to 16nm (12nm should be pretty much the same) gives 65% less power at the same performance or 35-40% more performance at the same power. That's light years away from +50% performance at -50% power, and too far for mere architectural improvements, save for some holy-grail-like revolutionary discovery that would transform the whole field.
The only way I can see that holding any water would be some specific scenario like pure RT performance.

trinibwoy · Jan 21, 2020

Kaotik said:
TSMC says 7nm compared to 16nm (12nm should be pretty much the same) gives 65% less power at the same performance or 35-40% more performance at the same power. That's light years away from +50% performance at -50% power, and too far for mere architectural improvements, save for some holy-grail-like revolutionary discovery that would transform the whole field.
The only way I can see that holding any water would be some specific scenario like pure RT performance.

That performance refers to transistor switching speed, correct? Which translates to clock speed. I don’t know how useful those numbers are. We know GPU performance increases primarily due to having more transistors, not faster ones. So I read that as 55% (1/0.65) more transistors at the same power. Then add any architecture efficiency gains on top of that.

Though 50% faster at 50% lower power is far-fetched by any measure.

Benetanegia · Jan 21, 2020

Kaotik said:
TSMC says 7nm compared to 16nm (12nm should be pretty much the same) gives 65% less power at the same performance or 35-40% more performance at the same power. That's light years away from +50% performance at -50% power, and too far for mere architectural improvements, save for some holy-grail-like revolutionary discovery that would transform the whole field.
The only way I can see that holding any water would be some specific scenario like pure RT performance.

You are forgetting about the over 3x density and the resulting 3x number of transistors. The 65% less power or 35% more performance numbers refer to perf/power of the transistors, not the chip. A really straightforward way of obtaining what the rumors claimed, with no architectural improvements involved at all, would be to put in 3x as many transistors at the same transistor performance. So 3x transistors @ 35% power.

Miniature Kaiju · Jan 21, 2020

Benetanegia said:
You are forgetting about the over 3x density and the resulting 3x number of transistors. The 65% less power or 35% more performance numbers refer to perf/power of the transistors, not the chip. A really straightforward way of obtaining what the rumors claimed, with no architectural improvements involved at all, would be to put in 3x as many transistors at the same transistor performance. So 3x transistors @ 35% power.

Nope, it's perf/power of the chip. What it means is that the same chip, just shrunk down to the new node, can get you either a 65% reduction in power OR a 35% increase in perf through clocks. Yes, there's still the density increase to be explored, but power draw increases roughly linearly with transistor count, they don't come for free.

Kaotik · Jan 21, 2020

Benetanegia said:
You are forgetting about the over 3x density and the resulting 3x number of transistors. The 65% less power or 35% more performance numbers refer to perf/power of the transistors, not the chip. A really straightforward way of obtaining what the rumors claimed, with no architectural improvements involved at all, would be to put in 3x as many transistors at the same transistor performance. So 3x transistors @ 35% power.

As @Miniature Kaiju pointed out, the numbers are for a theoretical chip that gets shrunk, not one with 3x the transistors. "65% less power" means the same chip running at the same clocks should consume 65% less, and "35% more performance" means that running the same chip at 35% higher clocks should use the same amount of power (as the previous process at those lower clocks).

As for the density, theory and real-life chips aren't exactly the same.
Coming from Vega 10 on GloFo 14nm to Navi 10 on TSMC 7nm, density could theoretically have improved by slightly over 100% (so 2x density), but in reality it only improved by a little over 60%.
The "3x density" TSMC talks about is for the low-power "mobile" variant of the 7nm process (96.5 Mtrans/mm2); the high-performance "HPC" variant gets only 66.7 Mtrans/mm2, which isn't even double the 12nm density (33.8 Mtrans/mm2).
Since NVIDIA is coming from TSMC 12nm, they should actually see slightly lower density improvements compared to AMD coming from 14nm (they used the same libraries on 12nm, so they didn't take advantage of 12nm's possible density benefits). GloFo 14nm theoretically has 32.5 Mtrans/mm2 density.
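
For a rough sense of what those peak figures imply, here is a minimal Python sketch that simply divides the quoted densities. The dictionary keys and the function name are made up for illustration; the numbers are the ones given above, and they are theoretical peaks that real chips do not reach.

```python
# Peak densities quoted in the post, in Mtransistors/mm^2 (theoretical figures).
PEAK_DENSITY = {
    "TSMC 7nm mobile": 96.5,
    "TSMC 7nm HPC": 66.7,
    "TSMC 12nm": 33.8,
    "GloFo 14nm": 32.5,
}


def density_ratio(new_node: str, old_node: str) -> float:
    """Return how many times denser new_node is than old_node (peak vs peak)."""
    return PEAK_DENSITY[new_node] / PEAK_DENSITY[old_node]


for new in ("TSMC 7nm mobile", "TSMC 7nm HPC"):
    for old in ("TSMC 12nm", "GloFo 14nm"):
        print(f"{new} vs {old}: {density_ratio(new, old):.2f}x")
```

On these numbers the HPC variant comes out just under 2x against TSMC 12nm and just over 2x against GloFo 14nm, which matches the "isn't even double" and "slightly over 100%" remarks.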

trinibwoy · Jan 21, 2020

I just realized it’s 65% “less” power so I have to adjust my math a bit. Assuming perfect scaling that means a chip with 2.85x the transistors will consume the same power.

Now obviously reality will be nowhere near that but even 2x would be huge. Question is how come Navi got the density improvement but not the expected reduction in power consumption?
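
The adjusted math can be sketched in a few lines of Python. This assumes, as the post does, perfect scaling: power grows linearly with transistor count and clocks stay where they were. The function name is only illustrative.

```python
def transistor_budget_at_same_power(power_reduction: float) -> float:
    """Multiple of the old transistor count that fits in the old power budget.

    Assumes power scales linearly with transistor count at unchanged clocks,
    which is an idealisation rather than a measured result.
    """
    if not 0.0 <= power_reduction < 1.0:
        raise ValueError("power_reduction must be in [0, 1)")
    return 1.0 / (1.0 - power_reduction)


# "65% less power" leaves 35% of the original, so the budget is 1 / 0.35.
print(f"{transistor_budget_at_same_power(0.65):.2f}x")

# The earlier reading (1 / 0.65) treated 65% as the remaining power instead.
print(f"{1 / 0.65:.2f}x")
```

The first figure is the roughly 2.85x quoted above; the second is the earlier, smaller estimate that came from dividing by 0.65 instead of 0.35.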

DegustatoR · Jan 21, 2020

trinibwoy said:
Question is how come Navi got the density improvement but not the expected reduction in power consumption?

Had to be clocked out of its optimal frequency window to compete with Turing?

Benetanegia · Jan 21, 2020

Miniature Kaiju said:
Nope, it's perf/power of the chip. What it means is that the same chip, just shrunk down to the new node, can get you either a 65% reduction in power OR a 35% increase in perf through clocks.

That's the same thing I said. The catch is the bolded part, "performance through clocks", which is why:

Yes, there's still the density increase to be explored, but power draw increases roughly linearly with transistor count, they don't come for free.

Exactly, so when in the same die size, you can have 3X amount of transistors (properly used that would mean 3x execution units), each drawing roughly 1/3 the power as before, what do you get? Exactly 3X performance at equal power consumption. And if you make a chip that is half the size? 1.5x performance at 0.5x power, or in other words 50% higher performance at 50% lower power.
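
To make that arithmetic explicit, here is a small Python sketch of the idealised model this argument relies on. It assumes a full 3x density gain, performance that scales linearly with transistor count at fixed clocks, and each transistor drawing about 1/3 of its old power. Those are the assumptions of this post rather than measured figures, and the function name is hypothetical.

```python
def idealised_chip(density_gain: float, power_per_transistor: float,
                   die_area: float = 1.0):
    """Return (relative performance, relative power) versus the old chip.

    Idealised assumptions, not measurements:
      - transistor count = density_gain * die_area
      - performance scales linearly with transistor count at fixed clocks
      - total power = transistor count * power_per_transistor
    """
    transistors = density_gain * die_area
    performance = transistors
    power = transistors * power_per_transistor
    return performance, power


# Same die size, 3x density, each transistor at about 1/3 of the old power.
perf, power = idealised_chip(3.0, 1 / 3)
print(f"full die: {perf:.2f}x perf at {power:.2f}x power")

# Half the die size under the same assumptions.
perf, power = idealised_chip(3.0, 1 / 3, die_area=0.5)
print(f"half die: {perf:.2f}x perf at {power:.2f}x power")
```

Swapping in TSMC's 35% per-transistor figure instead of 1/3 gives about 1.05x power for the full-size die, so the conclusion is sensitive to how close the real density and per-transistor power get to these ideal values.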

Miniature Kaiju · Jan 21, 2020

You won't get anywhere near a 3x increase in density for anything performant, though. You'll get barely a 2x increase going from 12nm to 7nm.

Also, it's worth noting that transistor count doesn't translate linearly to perf.

Kaotik · Jan 21, 2020

Miniature Kaiju said:
You won't get anywhere near a 3x increase in density for anything performant, though. You'll get barely a 2x increase going from 12nm to 7nm.

Also, it's worth noting that transistor count doesn't translate linearly to perf.

Probably even less; AMD got only 65%, and that was coming from GloFo 14nm, which is less dense than TSMC 12nm.

Benetanegia · Jan 21, 2020

Miniature Kaiju said:
You won't get anywhere near a 3x increase in density for anything performant, though. You'll get barely a 2x increase going from 12nm to 7nm.

That remains to be seen. All previous generations have achieved something really close to theoretical; for example, Pascal saw a 1.7x density increase on a 1.9x density node. 16/12nm to 7nm TSMC is 3.3x, so 3x is not out of the question. Plus Pascal relied heavily on increasing clocks, while here we'd be talking about lowest power at no increased clocks, which is always going to be accompanied by relatively higher densities if both cases are pushed to the max.

Also, it's worth noting that transistor count doesn't translate linearly to perf.

Of course it does, all things being equal. It can even bring larger-than-linear performance increases if one can concentrate those transistors on performance-producing units.

Benetanegia · Jan 21, 2020

Kaotik said:
Probably even less; AMD got only 65%, and that was coming from GloFo 14nm, which is less dense than TSMC 12nm.

You can't use a completely different company, coming from a completely different foundry, with a completely different architecture, and where there are a million unknowns about what changes were needed, to draw a conclusion on how well 7nm scales.

EDIT: Oh and I'm pretty sure GloFo 14nm has always been considered denser than TSMC's.

Kaotik · Jan 21, 2020

Benetanegia said:
You can't use a completely different company, coming from a completely different foundry, with a completely different architecture, and where there are a million unknowns about what changes were needed, to draw a conclusion on how well 7nm scales.

EDIT: Oh and I'm pretty sure GloFo 14nm has always been considered denser than TSMC's.

GloFo 14nm (32.5 MTrans/mm^2) is denser than TSMC 16nm (28.2 MTrans/mm^2) but less dense than TSMC 12nm (33.8 MTrans/mm^2).

Reported transistor densities should be comparable between fabs, and a given design should require roughly the same number of transistors no matter the manufacturer. Sure, there might be some differences, but when we're talking about numbers rounded to 100s of thousands, it should be within the margin of error. (Also, TSMC and their processes are by no means unknown to AMD; they've designed chips for their 16nm too.)

Also, for what it's worth, with Vega 20 they only managed to squeeze density up by 58 % compared to Vega 10.

w0lfram · Jan 21, 2020

So Turing, going to Ampere is going to gain 60% more die space, at same watts?

Kaotik · Jan 21, 2020

w0lfram said:
So Turing, going to Ampere is going to gain 60% more die space, at same watts?

Probably. Unless they go straight for N7+, which would of course change things. I'm just assuming it's N7 until someone confirms EUV.

DegustatoR · Jan 21, 2020

Kaotik said:
GloFo 14nm (32.5 MTrans/mm^2) is denser than TSMC 16nm (28.2 MTrans/mm^2) but less dense than TSMC 12nm (33.8 MTrans/mm^2).

12FFC is not the same thing as 12FFN. Turing is 12FFN.

Kaotik · Jan 21, 2020

DegustatoR said:
12FFC is not the same thing as 12FFN. Turing is 12FFN.

So are you suggesting it's similar to AMD using old "14nm libraries" with GloFo 12nm? Then the increase in density should be slightly higher than the number I used of course. Regardless, at 12/14/16nm processes AMD and NVIDIA GPUs have had quite similar densities from 22 to 25 Mtrans/mm^2 or so

Benetanegia · Jan 21, 2020

Kaotik said:
So are you suggesting it's similar to AMD using old "14nm libraries" with GloFo 12nm? Then the increase in density should be slightly higher than the number I used of course. Regardless, at 12/14/16nm processes AMD and NVIDIA GPUs have had quite similar densities from 22 to 25 Mtrans/mm^2 or so

Yeah, but it's not the same situation, even if the density numbers are similar. I mean, Nvidia being at 25 Mtrans/mm^2 on a 28.2 Mtrans/mm^2 node vs AMD being at 25 Mtrans/mm^2 on a 32.5 Mtrans/mm^2 node is quite a different story. We are talking about 88% vs 75% of the node's maximum density achieved.

But that's not even the problem, because it's Navi that is the big elephant in the room with 40 Mtrans/mm^2 in a node that should be able to provide at least 65 Mtrans/mm^2, let alone 90 Mtrans/mm^2 if low power cells are used. That's just 63% and taking it for comparison is obviously skewing the results.

If Nvidia achieved 88% of 65 Mtrans/mm^2, we would be talking about a 2.3x increase in density, but still we don't know what kind of libraries were used in each case, so I stand by my previous claims.
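
For anyone who wants to check these ratios, here is a minimal Python sketch. The function name is illustrative, and the inputs are the approximate densities quoted in this thread, so the results are only as good as those figures.

```python
def utilisation(achieved: float, node_peak: float) -> float:
    """Fraction of a node's peak density that a shipped chip reached."""
    return achieved / node_peak


# Densities in Mtrans/mm^2, as quoted in the thread (approximate).
print(f"Nvidia on TSMC 16nm: {utilisation(25, 28.2):.1%}")   # 88.7%
print(f"AMD on GloFo 14nm:   {utilisation(25, 32.5):.1%}")   # 76.9%
print(f"Navi on TSMC 7nm:    {utilisation(40, 65):.1%}")     # 61.5%

# Projected gain if Nvidia again reached 88% of a 65 Mtrans/mm^2 node,
# starting from roughly 25 Mtrans/mm^2 today.
projected_gain = 0.88 * 65 / 25
print(f"Projected density gain: {projected_gain:.2f}x")      # 2.29x
```

Computed from the quoted inputs, the AMD and Navi ratios come out near 77% and 62% rather than the rounded 75% and 63% above, while the 88% and 2.3x figures hold up.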

Kaotik · Jan 21, 2020

Benetanegia said:
Yeah, but it's not the same situation, even if the density numbers are similar. I mean, Nvidia being at 25 Mtrans/mm^2 on a 28.2 Mtrans/mm^2 node vs AMD being at 25 Mtrans/mm^2 on a 32.5 Mtrans/mm^2 node is quite a different story. We are talking about 88% vs 75% of the node's maximum density achieved.

But that's not even the problem, because it's Navi that is the big elephant in the room with 40 Mtrans/mm^2 in a node that should be able to provide at least 65 Mtrans/mm^2, let alone 90 Mtrans/mm^2 if low power cells are used. That's just 63% and taking it for comparison is obviously skewing the results.

If Nvidia achieved 88% of 65 Mtrans/mm^2, we would be talking about a 2.3x increase in density, but still we don't know what kind of libraries were used in each case, so I stand by my previous claims.

NVIDIA didn't reach any higher with Samsung 14nm which is the same process as GloFo 14nm, 24-25 MTrans/mm^2. Also it's not just Navi 10 & 14, Vega 20 also has 40-41 MTrans/mm^2 density

trinibwoy · Jan 21, 2020

DegustatoR said:
Had to be clocked out of its optimal frequency window to compete with Turing?

That certainly seems to be the case for the 5700xt but the 5600xt is showing much better perf/watt even at relatively high clocks.

The voltage/frequency curve for Navi isn’t much better than Turing though so there’s definitely room for improvement.
