Nvidia Post-Volta (Ampere?) Rumor and Speculation Thread

10nm was never an option for high-performance SoCs; it's designed only for mobile chips, so I wouldn't count it as a separate node when talking about GPUs.

So? It's two node jumps whether they were designed for big/high-performance chips or not, each with its own improvements. 10nm doubled density vs 16/12nm, and 7nm increases it by more than 60% vs 10nm again, for a combined increase of about 3.3x density from 16/12nm to 7nm. If 7nm+ is used, that's another ~20% density increase on top of 7nm, for something like 3.5x combined. Node jumps have never been more than 2x before; it's a huge difference.
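Taking the post's rough scaling factors at face value (2x for 16/12nm → 10nm, ~1.6x for 10nm → 7nm, ~1.2x for 7nm → 7nm+), a quick sketch of how those density claims compound; the factors are the discussion's assumptions, not official TSMC figures:

```python
# Compound-density sketch using the scaling factors assumed in the post
# (not official TSMC numbers).
density_16_to_10 = 2.0   # claimed density gain of 10nm over 16/12nm
density_10_to_7  = 1.6   # claimed density gain of 7nm over 10nm
density_7_to_7p  = 1.2   # claimed extra density of 7nm+ over 7nm

combined_7  = density_16_to_10 * density_10_to_7   # 16/12nm -> 7nm
combined_7p = combined_7 * density_7_to_7p         # 16/12nm -> 7nm+

print(f"16/12nm -> 7nm : {combined_7:.1f}x density")    # ~3.2x
print(f"16/12nm -> 7nm+: {combined_7p:.1f}x density")   # ~3.8x
```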
 
No, since the perf/power gains were those of a full node.

But zero or nearly zero on density. There are three ways nodes improve: density, performance, and power. Each node improves differently in each of those metrics; that doesn't make them less of a node*. Arbitrarily choosing one or two of those in order to categorize them based on whatever suits the moment is utterly stupid imo. 20nm came with pretty lame improvements to perf and power, in fact I'm pretty sure it was worse than 10nm in that respect iirc, but it was never called a half-node.

* For more than a decade now, "node" names have been marketing labels rather than actual feature sizes.
 
And 10nm is a half node in perf/power.

Arbitrary conclusion. While the current 16/12nm is vastly improved over the initial 16FF node, the first iteration wasn't really much better in those metrics than 10nm, at least not in power. So is 16nm a full node or a half node? Which one of them? Get the point? A node is a node, period. It might be a good one or a bad one overall. Or it might be good at one metric or another, but terrible in a third one. I don't care, it's still a node.

At the end of the day, what matters to the discussion before it started derailing is that 3.3x-3.7x density plus a 60-70% power reduction (depending on 7nm or 7nm+) is much more than the typical <2x density and 40-50% power reduction of a single node in the last decade+, and actually close to what two traditional nodes would look like (4x + 75%). If you prefer to call it 1.5 nodes, so be it. The tangible advantages of nearly two nodes are still there though, and 3x density plus a 2/3 power reduction (1/3 the power per transistor) is a perfectly fine recipe for 3x power efficiency, if things go right. Especially when (if the rumor is true) they clearly aimed for efficiency instead of a performance increase: for comparison, Turing was 45% faster at launch on the same node, and Pascal was 70% faster.
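As a sanity check of that "recipe", a minimal sketch of the implied scaling, assuming performance grows linearly with transistor count at fixed die area and that per-transistor power falls to roughly 1/3; both are the post's simplifying assumptions, not measured figures:

```python
# Minimal perf/W sketch for the argument above. Assumes (simplistically) that
# performance scales with transistor count and that per-transistor power drops
# to ~1/3 on the new node; both are assumptions from the post, not data.
density_gain = 3.0          # ~3x more transistors in the same die area
power_per_transistor = 1/3  # each transistor draws ~1/3 the power

perf  = density_gain                          # ~3x performance at the same area
power = density_gain * power_per_transistor   # ~1x total power (unchanged)

perf_per_watt = perf / power
print(f"Implied perf/W gain: {perf_per_watt:.1f}x")   # ~3x
```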
 
For what it's worth, TSMC considers its 7nm "a full node" over 16nm, and 10nm a "practice node for 7nm": https://en.wikichip.org/wiki/7_nm_lithography_process
TSMC started mass production of its 7-nanometer N7 node in April 2018. TSMC considers its 7-nanometer node a full node shrink over its 16-nanometer. Although TSMC has released a 10-nanometer node the year prior, the company considered its 10 nm to be a short-lived node and was intended to serve as a learning node on its way to 7.
 
For what it's worth, TSMC considers its 7nm "a full node" over 16nm, and 10nm a "practice node for 7nm": https://en.wikichip.org/wiki/7_nm_lithography_process

Sure, but again that's just arbitrary naming/classification based on marketing needs. Ffs, can we just agree to disagree on what constitutes a worthy node and actually look at the density and power* advantages it brings?

*I prioritize power over perf, because that's exactly what the chip in the rumor does. A 50% increase in performance is quite low for a new arch on a new node, compared to Pascal for example, which brought a 70% increase.
 
I can actually believe the 50% faster at half the power usage, but only if it is on some RT workloads (or talking about the RT blocks only). I mean, Turing was their first shot at it, and I don't doubt they have improved RT a lot since. Remember the jump in T&L performance between the GF256 and GF2 (15 => 25 Mpolys/s on paper, over 30 for the Ultra)? I can see something like that.
 
Is a new architecture from Nvidia expected? A Maxwell → Pascal type of transition seems more likely to me. Probably some RT tweaks in there.
 
Sure why not. It's been 3 years since Volta. Turing was the Volta tweak.

Coupling a new architecture with a new node just seems like a risk that isn't necessary given their lead. Turing is new for the consumer GPU market. It would be somewhat of a precedent in that market should Ampere be a new architecture.
 
If we follow past trends, comparing the biggest gaming dies NVIDIA releases, a certain pattern emerges:

580 to 780Ti is an 85% uplift (new node)
780Ti to 980Ti is a 45% uplift (same node)
980Ti to 1080Ti is a 75% uplift (new node)
1080Ti to 2080Ti is a 40% uplift (same node)

So at the very least I expect a 3080Ti to be around 70% faster than the 2080Ti at the same TDP; that would make it roughly 1.7x as efficient as Turing. It all depends on what NVIDIA can extract from the new node.
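A quick sketch of that projection, treating the listed generational uplifts as given and assuming a hypothetical "3080 Ti" at the same TDP as the 2080 Ti (the 70% figure and the flat TDP are the post's assumptions, not confirmed specs):

```python
# Projection sketch based on the uplift pattern listed above. The historical
# uplifts are the ones in the post; the 70% figure and flat-TDP assumption for
# a hypothetical "3080 Ti" come from the post, not from any confirmed spec.
historical_uplifts = {
    "580 -> 780 Ti (new node)":      0.85,
    "780 Ti -> 980 Ti (same node)":  0.45,
    "980 Ti -> 1080 Ti (new node)":  0.75,
    "1080 Ti -> 2080 Ti (same node)": 0.40,
}
for step, uplift in historical_uplifts.items():
    print(f"{step}: +{uplift:.0%}")

expected_uplift = 0.70   # assumed uplift for the hypothetical 3080 Ti
tdp_ratio = 1.0          # assumed same TDP as the 2080 Ti

perf_per_watt_gain = (1 + expected_uplift) / tdp_ratio
print(f"Projected perf/W vs Turing: {perf_per_watt_gain:.2f}x")   # ~1.7x
```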

If we try to go by the "50% uplift at half the TDP" rumor, I would say there is maybe a point in Ampere's lineup where a certain card is indeed 50% faster than the previous one at half the power (maybe a 3050 6GB style card compared to a cut-down 1650 Super 4GB), but big dies vs big dies, I really don't think the rumor is applicable there.
 
Pascal saw a 50-70% perf/W improvement over Maxwell, though the latter was not clocked as high with turbo boosting. 3x perf/W would be roughly three to four times that improvement and is highly unlikely.
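To put that comparison in numbers, a small sketch treating the 3x figure as the rumor states it and Pascal's 50-70% as the range quoted above; neither is a measured benchmark:

```python
# Compare the rumored 3x perf/W jump to Pascal's ~50-70% gain over Maxwell.
# All numbers come from the discussion above, not from benchmarks.
pascal_gain_low, pascal_gain_high = 0.50, 0.70   # Pascal: +50% to +70% perf/W
rumored_gain = 3.0 - 1.0                         # 3x perf/W == +200%

print(f"vs Pascal's best case : {rumored_gain / pascal_gain_high:.1f}x the improvement")  # ~2.9x
print(f"vs Pascal's worst case: {rumored_gain / pascal_gain_low:.1f}x the improvement")   # 4.0x
```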

With the 2080 Ti's die size, NVIDIA would have issues surpassing its performance the way they did with the 1080 vs. the 980 Ti, where the reference 980 Ti ran at only ~1200MHz. RT improvements should be much easier, and 4K 60fps with RTX on and no DLSS crutch could be a huge seller.
 
With the 2080 Ti's die size, NVIDIA would have issues surpassing its performance the way they did with the 1080 vs. the 980 Ti, where the reference 980 Ti ran at only ~1200MHz.
The 980 Ti was huge as well; the 1080 Ti was roughly 3/4 the size of the 980 Ti and still achieved a 75% uplift.
RT improvements should be much easier, and 4K 60fps with RTX on and no DLSS crutch could be a huge seller.
RT improvements don't come from fixed-function units alone; they need a big uplift in compute performance as well. 4K60 RTX performance would need a huge increase in TFLOPS too (at least 50% more).
Coupling a new architecture with a new node just seems like a risk that isn't necessary given their lead. Turing is new for the consumer GPU market. It would be somewhat of a precedent in that market should Ampere be a new architecture.
Turing is not really a new arch; it's an upgraded Volta with RTX, so NVIDIA might feel the incentive to push a new arch to satisfy both the gaming and HPC sectors.
 