Nvidia Post-Volta (Ampere?) Rumor and Speculation Thread

Status
Not open for further replies.
The rumors for the 2xxx generation were still way off base at this time in 2018. There's no reason to think 3xxx rumors are any better.
 
Yeah, it's just for theoretical analysis. Nevertheless, what do you think a more realistic analysis of the situation (clock scaling, TDP, gaming chips, etc.) would be?
No idea. But considering the clock delta between GV100 and TU102, I'd say that a bump from 1.1 GHz to 1.8 GHz seems unlikely. Then I don't really expect gaming Ampere to launch at die sizes similar to GV100/TU102. It will likely top out somewhere around 500-600 mm^2 for a chip which will go into Titans / $1000+ products.
 
No, I mean the other rumor that was spread after that.

Apart from that: How much memory do you think the next Xbox and PlayStation will have? Most likely, they will not stay at 8 GB, and for high-end desktop you need something more than "just what consoles have" in order to cater to their target audience, aka PC Gaming Master Race. Otherwise, they'd feel diminished.
That one is literally the exact same fake leak, same account posted the fake image first and then that one with the same specs in text instead.
I'm betting safely, 16 GB for both, no separate memory pool for OS.
 
No idea. But considering the clock delta between GV100 and TU102, I'd say that a bump from 1.1 GHz to 1.8 GHz seems unlikely.
Yeah, I expect the same range as V100: ~1500MHz, maybe 1600MHz @250W. It will still provide a substantial boost.
Then I don't really expect gaming Ampere to launch at die sizes similar to GV100/TU102. It will likely top out somewhere around 500-600 mm^2 for a chip which will go into Titans / $1000+ products.
We have two options:
Either the top gaming chip will follow the way of TU102 (~750 mm^2) and be as large as it can be,
or it will follow the path of GP102 and be heavily trimmed down (~500 mm^2).

I vote for as large as they can be; gaming chips need to double everything up: ALUs, INT32, RT, Tensor cores, etc. I don't think a small die can fit all of that.
 
Yeah, I expect the same range as V100: ~1500MHz, maybe 1600MHz @250W. It will still provide a substantial boost.

We have two options:
Either the top gaming chip will follow the way of TU102 (~750 mm^2) and be as large as it can be,
or it will follow the path of GP102 and be heavily trimmed down (~500 mm^2).

I vote for as large as they can be; gaming chips need to double everything up: ALUs, INT32, RT, Tensor cores, etc. I don't think a small die can fit all of that.
You do realize that a ~750mm^2 chip, for example, would be about twice as expensive as TU102 just to manufacture? Considering how high the high-end prices have already gone, I certainly don't hope for such a monstrosity.
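For anyone who wants to play with the numbers, here is a rough sketch of how die area feeds into cost per good die. It uses the common gross-dies-per-wafer approximation and a simple Poisson yield model. The wafer prices and the defect density below are made-up placeholders, not foundry data, and the model ignores the fact that big GPUs recover yield by selling cut-down dies.

```python
import math


def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """Approximate gross dies on a round wafer (area term minus edge loss)."""
    radius = wafer_diameter_mm / 2.0
    gross = (math.pi * radius ** 2 / die_area_mm2
             - math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2))
    return int(gross)


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Poisson model: fraction of dies that end up with zero defects."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)


def cost_per_good_die(die_area_mm2, wafer_cost, defects_per_cm2):
    """Wafer cost spread over the dies expected to be defect-free."""
    good_dies = dies_per_wafer(die_area_mm2) * poisson_yield(
        die_area_mm2, defects_per_cm2)
    return wafer_cost / good_dies


if __name__ == "__main__":
    # ASSUMPTIONS (hypothetical, for illustration only):
    old_node_wafer = 1.0   # normalized wafer price on the older node
    new_node_wafer = 2.0   # assume the newer node costs twice as much per wafer
    d0 = 0.1               # defects per cm^2, assumed equal on both nodes

    big_old = cost_per_good_die(750, old_node_wafer, d0)
    big_new = cost_per_good_die(750, new_node_wafer, d0)
    mid_new = cost_per_good_die(500, new_node_wafer, d0)

    print(f"750 mm^2, new node vs old node: {big_new / big_old:.2f}x")
    print(f"500 mm^2 vs 750 mm^2 on the new node: {mid_new / big_new:.2f}x")
```

With equal area and defect density, the cost ratio between nodes is just the wafer price ratio, so "twice as expensive" holds only if the wafer itself costs about twice as much. The second line shows how much a trimmed-down die saves under the same assumptions.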
 
You do realize that a ~750mm^2 chip, for example, would be about twice as expensive as TU102 just to manufacture? Considering how high the high-end prices have already gone, I certainly don't hope for such a monstrosity.
If a massive GeForce GPU would also somehow increase the prices of smaller (say 250mm^2 & 500mm^2) GPUs, I see your concern. If not, maybe there is some other downside I'm not thinking of?
 
If a massive GeForce GPU would also somehow increase the prices of smaller (say 250mm^2 & 500mm^2) GPUs, I see your concern. If not, maybe there is some other downside I'm not thinking of?
It doesn't, but the bigger the top gaming chip is, the bigger the other chips will most likely be too, to prevent huge gaps in market coverage. Of course it would be possible to make one huge-ass chip at the top and the rest in more moderate sizes, but I just find it unlikely.
 
I vote for as large as they can be; gaming chips need to double everything up: ALUs, INT32, RT, Tensor cores, etc. I don't think a small die can fit all of that.
Ampere's biggest target for gaming is significantly improving perf/price - which Turing kinda failed to do, even with recent "Super" price adjustments. A big die will go against this. But a lot will depend on where RDNA2 will be in 2020.
 
Yeah, I expect the same range as V100: ~1500MHz, maybe 1600MHz @250W. It will still provide a substantial boost.

We have two options:
Either the top gaming chip will follow the way of TU102 (~750 mm^2) and be as large as it can be,
or it will follow the path of GP102 and be heavily trimmed down (~500 mm^2).

I vote for as large as they can be; gaming chips need to double everything up: ALUs, INT32, RT, Tensor cores, etc. I don't think a small die can fit all of that.

It’s very unlikely that unit counts will double for gaming chips. I’m betting that ALUs will be in the range of

xx60 = 2560
xx70 = 3072
xx80 = 4096

With most performance gains coming from IPC, higher clocks, bigger caches, beefier RTX units, more memory bandwidth etc.

I would go conservative on the Ti/Titan too. Guessing 6144 ALUs on that bad boy.
 
xx80 = 4096
Maybe for the 3080, but 4096 is too low for a Ti or Titan. Titan RTX has 4608 already; you need higher than that for the next Titan.
With most performance gains coming from IPC, higher clocks, bigger caches, etc.
Ampere's biggest target for gaming is significantly improving perf/price - which Turing kinda failed to do, even with recent "Super" price adjustments. A big die will go against this.
emm, make no mistake, we are discussing the next gaming Ti/Titan here.

There is also the possibility of a 50% increase in ALU (~6700 cores) while operating at a frequency of ~2000MHz, that will demolish the need for an ultra big chip with lower clocks.
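To put that trade-off in numbers: peak FP32 throughput is usually estimated as 2 FLOPs (one fused multiply-add) per ALU per clock. Below is a quick sketch using figures floated in this thread. The 9216-ALU "doubled" chip is simply 2 x the Titan RTX's 4608 and is purely hypothetical, as are all of the clock speeds.

```python
def fp32_tflops(alus, clock_mhz):
    """Peak FP32 TFLOPS, counting one FMA per ALU per clock as 2 FLOPs."""
    return 2 * alus * clock_mhz * 1e6 / 1e12


if __name__ == "__main__":
    # All configurations are speculation from the thread, not real products.
    configs = {
        "doubled ALUs (9216, hypothetical) @ 1500 MHz": (9216, 1500),
        "~50% more ALUs (~6700) @ 2000 MHz": (6700, 2000),
        "conservative Ti/Titan guess (6144) @ 2000 MHz": (6144, 2000),
    }
    for name, (alus, mhz) in configs.items():
        print(f"{name}: {fp32_tflops(alus, mhz):.1f} TFLOPS")
```

On paper the smaller, faster chip lands close to the doubled one, which is the point of the argument. Peak FLOPS says nothing about bandwidth, power, or whether such clocks are reachable.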
 
That one is literally the exact same fake leak, same account posted the fake image first and then that one with the same specs in text instead.
I'm betting safely, 16 GB for both, no separate memory pool for OS.
Why is this then repeated here over and over again?
 
Please stop.

2070 super is most of the time better than 5700 xt : https://www.techspot.com/review/1902-geforce-rtx-2070-super-vs-radeon-5700-xt/


Most of the time..? Doesn't that mean that it competes with it...?

And how does a chip so small in transistors, compete with a much larger chip, if it doesn't have more efficient gaming uArch..? Don't you see how inane your arguments are, that you concede the argument, then ridicule me? Honestly, the point is that rdna(1) is killing Turing spec for spec, and that rdna(1) is already EOL as rdna2 (the full new uArch from AMD) is coming in a few months.

As you said, the 2070 SUPER (most of the time) is better... but if you compare only the MODERN games and the games coming out...? Navi10 competes directly with the much larger (transistor count?) TU-106 die. (w/2176 CUDA Cores, 136 TMUs, and 64 ROPs)

Subsequently, rdna2 has an uplift over rdna1, & we all know this. It should be unfettered and open for gaming engines to utilize.
 
And how does a chip so small in transistors, compete with a much larger chip, if it doesn't have more efficient gaming uArch..?
Turing still has to carry the rather large dead weight of all the Tensor logic, besides the few sprinkles for RT acceleration.
 
"Modern" games (whatever the hell that means; people play all sorts of games, not necessarily "modern" ones):

https://www.techpowerup.com/review/wolcen-benchmark-test-performance-analysis/4.html
https://gamegpu.com/action-/-fps-/-tps/zombie-army-4-test-gpu-cpu
https://www.pcgameshardware.de/Jour...Benchmarks-Test-Review-Release-Steam-1342546/

I can go on, but it was mostly like this for the whole of 2019. Worth noting that GCN-optimized console engines do tend to run better on RDNA cards than on Turing cards, which of course doesn't tell much about the architectures, only the levels of optimization effort.

It'll also be an interesting thing to see how engines made for RDNA2 in next gen consoles will run on RDNA1 vs Turing.

Also, TU106 is 2070, 2070S is a cut down TU104 (also found in 2060, 2080 and 2080S).
 
And how does a chip so small in transistors, compete with a much larger chip, if it doesn't have more efficient gaming uArch..? Don't you see how inane your arguments are, that you concede the argument, then ridicule me? Honestly, the point is that rdna(1) is killing Turing spec for spec, and that rdna(1) is already EOL as rdna2 (the full new uArch from AMD) is coming in a few months.
hmmm...

NAVI10 is 251 mm^2 on 7nm for 10.3 billion transistors
TU106 is 445 mm^2 on 16/12nm for 10.8 billion transistors

Number of transistors is in the same ballpark (within 5%), TDP is also very close despite AMD full node advantage, but RDNA lacks VRS, Ray Tracing acceleration, Tensor cores, DLSS, good video encoder and support of INT4/8 for fast inference !!! It's very clear which one has the upper hand.

So IMHO, RDNA is far away from Turing. AMD can only compete because of their node advantage. They already made their big move with RDNA, and RDNA 2 will be a small architecture evolution (mostly bringing VRS and RT) on a refined node, whereas Ampere is a totally new architecture with a full node shrink. Like everybody, I want close competition for the sake of reasonable prices. But let's be pragmatic: even with only a node shrink and zero uarch improvement (and it's not), it will be a bloodbath for AMD...
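The density gap behind those die sizes can be worked out directly from the figures quoted above. A small sketch using only the numbers in this post (the function and variable names are my own):

```python
def density_mtr_per_mm2(transistors_billion, area_mm2):
    """Transistor density in millions of transistors per mm^2."""
    return transistors_billion * 1000.0 / area_mm2


def relative_gap_pct(a, b):
    """Difference between a and b as a percentage of the larger value."""
    return abs(a - b) / max(a, b) * 100.0


if __name__ == "__main__":
    # Figures as quoted in the post above.
    navi10 = density_mtr_per_mm2(10.3, 251)
    tu106 = density_mtr_per_mm2(10.8, 445)

    print(f"Navi 10: {navi10:.1f} MTr/mm^2")
    print(f"TU106:   {tu106:.1f} MTr/mm^2")
    print(f"Density ratio: {navi10 / tu106:.2f}x")
    print(f"Transistor count gap: {relative_gap_pct(10.3, 10.8):.1f}%")
```

So the two chips have nearly the same transistor budget, and the difference in die area comes down to how densely the node packs them.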
 
Been thinking about this for a bit: With Ampere being the probable successor to Volta, the early rumors about Ampere being scheduled for H1 2019 (2 years after Volta), TSMC's good execution streak and promises of 5 nm being a full node compared to 7 nm, EUV reducing the number of masks (and 7 nm EUV only doing that for certain parts), AND the increased competition from AI startups and Intel as well, I would not completely dismiss the possibility that Nvidia revamped Ampere for 5 nm.

As a safety net, Turing successors for gaming could be on 7 nm straight and not carry much Ampereism, apart from maybe beefed up raytracing.

Just some idle thoughts though.

later-edits in italics
 
A huge HPC chip on 7nm in 1H19 would probably cost so much that it wouldn't make sense even in HPC/AI market.

5nm is unlikely for the very same reason: it'll probably be a couple of years until it's feasible to produce an 800 mm^2 die.
 
hmmm...

NAVI10 is 251 mm^2 on 7nm for 10.3 billion transistors
TU106 is 445 mm^2 on 16/12nm for 10.8 billion transistors

Number of transistors is in the same ballpark (within 5%), TDP is also very close despite AMD full node advantage, but RDNA lacks VRS, Ray Tracing acceleration, Tensor cores, DLSS, good video encoder and support of INT4/8 for fast inference !!! It's very clear which one has the upper hand.

So IMHO, RDNA is far away from Turing. AMD can only compete because of their node advantage. They already made their big move with RDNA, and RDNA 2 will be a small architecture evolution (mostly bringing VRS and RT) on a refined node, whereas Ampere is a totally new architecture with a full node shrink. Like everybody, I want close competition for the sake of reasonable prices. But let's be pragmatic: even with only a node shrink and zero uarch improvement (and it's not), it will be a bloodbath for AMD...

I was too pissed off to answer nicely. Thx you, sir.
 