Nvidia Post-Volta (Ampere?) Rumor and Speculation Thread

ShaidarHaran · Mar 11, 2020

So TU102 -> GA102:
72 SMs -> 84 SMs (+16.7%)
+15% IPC
= 34.2% more performance

Clockspeed has to make up the rest of the difference to reach 40%.
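
For anyone who wants to check the arithmetic, here is a rough sketch of how the SM and IPC gains compound and how much clock uplift would be left to hit the rumored 40%. It assumes performance scales perfectly with SM count, IPC, and clock, which real GPUs never quite manage; the function names are made up for illustration.

```python
def compound_speedup(sm_old, sm_new, ipc_gain):
    """Relative performance from more SMs and higher IPC.

    Assumes perfect (linear) scaling with both factors, which is optimistic.
    """
    return (sm_new / sm_old) * (1.0 + ipc_gain)


def required_clock_gain(target_speedup, speedup_so_far):
    """Fractional clock increase still needed to reach target_speedup."""
    return target_speedup / speedup_so_far - 1.0


# Rumored figures from this post: 72 -> 84 SMs, +15% IPC, +40% overall.
base = compound_speedup(72, 84, 0.15)
clock = required_clock_gain(1.40, base)

print(f"SMs + IPC: +{(base - 1) * 100:.1f}%")                # SMs + IPC: +34.2%
print(f"Clock uplift needed for +40%: +{clock * 100:.1f}%")  # about +4.3%
```

So under these assumptions only a few percent of extra clock would be needed on top of the wider chip.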

Coupled with the shift to Samsung 10nm, I'm feeling rather underwhelmed by these rumors. Hoping it's FUD.

Bondrewd · Mar 11, 2020

ShaidarHaran said:
Hoping it's FUD.

It is.
The dude has been making shit up then deleting it for like what, 2 weeks already?

DavidGraham · Mar 11, 2020

techuse said:
For equivalent tiers. 3080 being 50% faster than 2080ti is a pipe dream

The 3080Ti being only 40% faster than 2080Ti is hogwash, this is a new node we are talking about. Even AMD was able to do much more. Heck, even NVIDIA achieved that with Turing on the same node as Pascal already.

We don't need rumors and suspicious sources to tell us about next gen, we only need to look at next HPC chips and work our way from there.

techuse · Mar 11, 2020

DavidGraham said:
The 3080Ti being only 40% faster than 2080Ti is hogwash, this is a new node we are talking about. Even AMD was able to do much more. Heck, even NVIDIA achieved that with Turing on the same node as Pascal already.

We don't need rumors and suspicious sources to tell us about next gen, we only need to look at next HPC chips and work our way from there.

AMD was only able to because of the huge clock jump that Nvidia has already exhausted. It seems unlikely that we will see another Maxwell to Pascal clockspeed jump.

Kaotik · Mar 11, 2020

DavidGraham said:
The 3080Ti being only 40% faster than 2080Ti is hogwash, this is a new node we are talking about. Even AMD was able to do much more. Heck, even NVIDIA achieved that with Turing on the same node as Pascal already.

Yeah, by growing die size by 60% (yes, they added new things too, but they don't take much of it)

McHuj · Mar 11, 2020

DavidGraham said:
The 3080Ti being only 40% faster than 2080Ti is hogwash, this is a new node we are talking about. Even AMD was able to do much more. Heck, even NVIDIA achieved that with Turing on the same node as Pascal already.

We don't need rumors and suspicious sources to tell us about next gen, we only need to look at next HPC chips and work our way from there.

The design choices for performance gains could be driven by the economics as well. If this rumor is true, it means that they are going with a cheaper, more economical design. Samsung 10nm should be cheaper than 7FF, but that also comes at a price: density and performance will be worse than 7FF. Samsung 10nm, to me, says they're not going to push the price much higher than the 20xx series. (In fact, I think in this potential world economy it would probably be stupid to go for an even higher price.)

DavidGraham · Mar 11, 2020

techuse said:
AMD was only able to because of the huge clock jump that Nvidia has already exhausted. It seems unlikely that we will see another Maxwell to Pascal clockspeed jump.

NVIDIA didn't need a clock speed increase for Turing. Also, AMD merely increased the clocks by 300MHz to 400MHz, which is anything but huge. Huge would be Maxwell to Pascal, from 1GHz to 1.7/1.8GHz.

Kaotik said:
Yeah, by growing die size by 60% (yes, they added new things too, but they don't take much of it)

Yup, which they might do again here. Turing on 10nm/7nm will be smaller in size, and they will double it up from there if they need to.

McHuj said:
The design choices for performance gains could be driven by the economics as well.

Maybe, but I believe the prime thing here will be to maximize performance as much as possible, especially to distinguish themselves from next gen consoles and/or RDNA2. NVIDIA said goodbye to the small-die strategy when they integrated Tensor and RT cores and the rest of the new features. They will scale those up to increase performance even further, so bye bye small dies.

Anyway, I still think the HPC chips will tell us everything we need to know.

ShaidarHaran · Mar 11, 2020

DavidGraham said:
NVIDIA didn't need a clock speed increase for Turing. Also, AMD merely increased the clocks by 300MHz to 400MHz, which is anything but huge. Huge would be Maxwell to Pascal, from 1GHz to 1.7/1.8GHz.

Yup, which they might do again here. Turing on 10nm/7nm will be smaller in size, and they will double it up from there if they need to.

Maybe, but I believe the prime thing here will be to maximize performance as much as possible, especially to distinguish themselves from next gen consoles and/or RDNA2. NVIDIA said goodbye to the small-die strategy when they integrated Tensor and RT cores and the rest of the new features. They will scale those up to increase performance even further, so bye bye small dies.

Anyway, I still think the HPC chips will tell us everything we need to know.

Maxwell ran well above 1GHz. I had a 980 Ti that did 1550MHz under water.

Malo · Mar 11, 2020

ShaidarHaran said:
Maxwell ran well above 1GHz. I had a 980 Ti that did 1550MHz under water.

And Pascal can do up to 2200MHz on water, see how that works?

ShaidarHaran · Mar 12, 2020

Malo said:
And Pascal can do up to 2200MHz on water, see how that works?

Kepler ran around 1GHz. Maxwell around 1500MHz. Pascal around 2GHz, Turing same. By percentage the increase from Kepler to Maxwell was bigger than that of Maxwell to Pascal.

Benetanegia · Mar 12, 2020

ShaidarHaran said:
Kepler ran around 1GHz. Maxwell around 1500MHz. Pascal around 2GHz, Turing same. By percentage the increase from Kepler to Maxwell was bigger than that of Maxwell to Pascal.

Erm... no. Stock-clocked Kepler boosted to well over 1100 MHz. Even advertised boost clocks were around 1100 MHz in a few models, and of course actual boost clocks could be much higher. Again, on stock cards and clocks. There was around a 100 MHz difference between Kepler and Maxwell, and this difference didn't really get much higher with OC.

CarstenS · Mar 12, 2020

Compare apples to apples.

ShaidarHaran · Mar 12, 2020

Benetanegia said:
Erm... no. Stock-clocked Kepler boosted to well over 1100 MHz. Even advertised boost clocks were around 1100 MHz in a few models, and of course actual boost clocks could be much higher. Again, on stock cards and clocks. There was around a 100 MHz difference between Kepler and Maxwell, and this difference didn't really get much higher with OC.

I owned multiple Kepler products, and multiple Maxwell products. Average clock on Kepler (680, multiple 780s) was AROUND 1GHz, as I said. Average clock on Maxwell, across 970, 980, 980 Ti was AROUND 1500MHz.

No one is saying Pascal didn't receive large clock speed boosts over Maxwell. I'm just putting that boost in perspective by providing additional context that Maxwell also saw a large clock speed bump over Kepler.

Benetanegia · Mar 12, 2020

ShaidarHaran said:
I owned multiple Kepler products, and multiple Maxwell products. Average clock on Kepler (680, multiple 780s) was AROUND 1GHz, as I said. Average clock on Maxwell, across 970, 980, 980 Ti was AROUND 1500MHz.

I don't care how many cards you had. You are either remembering it very badly or lying through your teeth. Stock clocks vs stock clocks, Kepler was around 1100 MHz and Maxwell around 1200 MHz. OC under water was something like 1350 MHz vs 1500 MHz.

PSman1700 · Mar 12, 2020

Base clock for the 680 was 1GHz, with boost to somewhere around 1050MHz. The 780 was clocked lower, if I remember right.

https://www.techpowerup.com/gpu-specs/geforce-gtx-680.c342

https://www.techpowerup.com/gpu-specs/geforce-gtx-780.c1701

techuse · Mar 12, 2020

My take on the clock situation is as follows, and it accounts for what's usually achieved with average ASIC quality and air cooling.

Kepler - 1150 MHz
Maxwell - 1400 MHz
Pascal - 1950 MHz

Beyond that is usually reserved for higher quality chips and/or more advanced cooling but the majority of GPUs within each family should have no problem hitting the above clocks.
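
To put the competing estimates side by side, here is a quick sketch that turns each poster's typical clocks into generation-over-generation percentage gains. The numbers are the ones claimed in this thread (ShaidarHaran's round figures, Benetanegia's stock-vs-stock figures, and my air-cooled figures above), not measured data, and the helper names are made up for illustration.

```python
def pct_gain(old_mhz, new_mhz):
    """Percentage clock increase going from old_mhz to new_mhz."""
    return (new_mhz - old_mhz) / old_mhz * 100.0


# Typical clocks in MHz as claimed by posters in this thread (not measurements).
estimates = {
    "ShaidarHaran": [("Kepler", 1000), ("Maxwell", 1500), ("Pascal", 2000)],
    "Benetanegia (stock)": [("Kepler", 1100), ("Maxwell", 1200)],
    "techuse (air)": [("Kepler", 1150), ("Maxwell", 1400), ("Pascal", 1950)],
}


def generational_gains(clocks):
    """Pair each generation with its successor and compute the % clock gain."""
    return [
        (f"{old_name} -> {new_name}", pct_gain(old_mhz, new_mhz))
        for (old_name, old_mhz), (new_name, new_mhz) in zip(clocks, clocks[1:])
    ]


for poster, clocks in estimates.items():
    for step, gain in generational_gains(clocks):
        print(f"{poster}: {step} +{gain:.1f}%")
```

With the round 1000/1500/2000 figures the Kepler to Maxwell jump is the larger one in percentage terms (+50.0% vs +33.3%), while with my figures it is the other way around (+21.7% vs +39.3%), so the whole disagreement comes down to which baseline clocks you accept.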

Pinstripe · Mar 13, 2020

techuse said:
Very believable IMO. Hard to see overall performance being more than 50% faster as a best case.

I believe so, too. Nvidia developers already stated when they launched Turing that improved rasterization performance is becoming less important to them, so we should only expect moderate improvements from now on. Raytracing, on the other hand, can still achieve huge visual differences, so trying to improve on that front makes more sense now.

I'm a bit baffled about the VRAM amount, though. I'd have guessed a doubling through the whole product line was in order.

DegustatoR · Mar 13, 2020

Pinstripe said:
I believe so, too. Nvidia developers already stated when they launched Turing that improved rasterization performance is becoming less important to them, so we should only expect moderate improvements from now on. Raytracing, on the other hand, can still achieve huge visual differences, so trying to improve on that front makes more sense now.

I'm a bit baffled about the VRAM amount, though. I'd have guessed a doubling through the whole product line was in order.

You need to improve "rasterization performance" (which is what btw? general purpose FP32 SIMDs aren't tied to rasterization any more than your random memory access controller) to improve ray tracing performance since a lion's share of RT calculations do happen on your ordinary FP32 SIMD units.

jlippo · Mar 13, 2020

DegustatoR said:
You need to improve "rasterization performance" (which is what btw? general purpose FP32 SIMDs aren't tied to rasterization any more than your random memory access controller) to improve ray tracing performance since a lion's share of RT calculations do happen on your ordinary FP32 SIMD units.

They did add new features which can be used with rasterization. (Mesh shaders, texture shading stuff etc.)

If they reduce the possible limitations of the current RT core, we might see a decent improvement with the same number of RT units and ALUs.

Pinstripe · Mar 13, 2020

jlippo said:
They did add new features which can be used with rasterization. (Mesh shaders, texture shading stuff etc.)

If they reduce the possible limitations of the current RT core, we might see a decent improvement with the same number of RT units and ALUs.

Isn't that why there are now two FP32 units in every ALU, as this CorgiKitty claims? I'm no expert in microprocessor design, but wouldn't that mean more shading power can now be designated to help RT acceleration without the need to increase CUDA cores proportionally?
