Nvidia Volta Speculation Thread

You forgot to mention that efficiency went through the roof. In fact, Nvidia claims the Volta SM is 50% more efficient than the Pascal one, a clear indication that the Volta SM is not a tweaked update but a totally new design. With this advantage alone, Nvidia will keep a large lead in power efficiency vs. Vega, and thus in performance too, even if it implies a bigger die.
Actually, I did mention it - with a caveat:
"If that's not total marketing BS and only true for VERY select tasks, then maybe the higher energy efficiency of the SMs might help a Volta-based gaming card."
--
I agree that this is most likely not derived from using the Tensor cores, but I have seen VERY creative math from various vendors in the past. The 50% statement, for example, is suspiciously close to the peak TFLOPS numbers (15.0 vs. 10.6) while keeping (roughly) the same power envelope. I am not sure whether I buy that each single Volta SM delivers 50% more performance per watt, or rather that "the whole of all those SMs, when applying our boost clock, delivers 50% more performance in the same power envelope".
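As a back-of-envelope check on that suspicion, the published peak-FP32 figures (P100: ~10.6 TFLOPS, V100: ~15.0 TFLOPS, both in a ~300 W envelope) imply roughly a 42% throughput-per-watt gain - close to, but short of, the quoted 50%:

```python
# Simple perf/W comparison from the peak-TFLOPS figures quoted above.
# Both parts share a ~300 W TDP, so the efficiency gain tracks the
# raw throughput gain almost exactly.
p100_tflops, v100_tflops = 10.6, 15.0
tdp_w = 300

p100_eff = p100_tflops * 1e3 / tdp_w  # GFLOPS per watt
v100_eff = v100_tflops * 1e3 / tdp_w

gain = v100_eff / p100_eff - 1
print(f"P100: {p100_eff:.1f} GFLOPS/W, V100: {v100_eff:.1f} GFLOPS/W")
print(f"Implied efficiency gain: {gain:.0%}")  # ~42%, shy of the claimed 50%
```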
 
A more realistic and grounded expectation would suggest maybe a 15% increase in power efficiency, which, coupled with an increased ALU count and better scheduling, can provide a considerable performance boost over Pascal.
I'd generally agree, but I don't see Nvidia investing in decidedly larger die area for their more "economically optimized" GPUs. Transistor density between GP100 (601 mm², 15.3 bln) and GV100 (815 mm², 21.1 bln) has not evolved as much as the "12 nm" process tech would imply. I realize there is a possibility that Nvidia had to lower the density in GV100 for routing or other reasons, which might not be necessary with a possible GV102/4/6, but I do not see any hard indication of this.
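For illustration, the density argument is simple arithmetic on the numbers quoted in the post:

```python
# Transistor density from the die sizes and transistor counts above
# (GP100: 601 mm², 15.3 bn transistors; GV100: 815 mm², 21.1 bn).
gp100_density = 15.3e9 / 601 / 1e6   # million transistors per mm²
gv100_density = 21.1e9 / 815 / 1e6

print(f"GP100: {gp100_density:.1f} MTr/mm²")  # ~25.5
print(f"GV100: {gv100_density:.1f} MTr/mm²")  # ~25.9
print(f"Density gain: {gv100_density / gp100_density - 1:.1%}")  # ~1.7%
```

A ~1.7% density gain is essentially flat, which is the point being made: "12 nm" does not appear to have bought much density over 16 nm here.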
 
I'd generally agree, but I don't see Nvidia investing in decidedly larger die area for their more "economically optimized" GPUs. Transistor density between GP100 (601 mm², 15.3 bln) and GV100 (815 mm², 21.1 bln) has not evolved as much as the "12 nm" process tech would imply. I realize there is a possibility that Nvidia had to lower the density in GV100 for routing or other reasons, which might not be necessary with a possible GV102/4/6, but I do not see any hard indication of this.

Well, to be fair, they are effectively getting 50% more performance in the same power envelope, at more or less the same clocks (vs. P100). It's entirely possible that they forwent density improvements with a view to maximizing efficiency on their GV100 chip, and this could change with the SP-geared chips. It's also not unreasonable to assume that the GP102/4/6 successors will be larger, with the 470 mm² GP102 having roughly 25% leeway before hitting the maximum die size we have seen on consumer products.

If they can produce a ~16 TFLOPS GPU (without FP64/Tensor resources) on a ~600 mm² die with a power budget of 250 W, that will be in line with previous generational improvements.
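Using the commonly quoted GTX 1080 Ti figure of ~11.3 TFLOPS FP32 at a 250 W TDP as the baseline (treat both numbers as approximate), that scenario works out to roughly a 42% perf/W jump:

```python
# Sanity check on the "~16 TFLOPS in 250 W" scenario vs today's
# top consumer part (GTX 1080 Ti: ~11.3 TFLOPS FP32, 250 W TDP).
hypothetical_gv102 = 16.0 / 250 * 1e3   # GFLOPS per watt
gtx_1080_ti = 11.3 / 250 * 1e3

print(f"Hypothetical GV102: {hypothetical_gv102:.0f} GFLOPS/W")  # 64
print(f"GTX 1080 Ti:        {gtx_1080_ti:.0f} GFLOPS/W")         # 45
print(f"Gain: {hypothetical_gv102 / gtx_1080_ti - 1:.0%}")       # ~42%
```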
 
Well, to be fair, they are effectively getting 50% more performance in the same power envelope, at more or less the same clocks (vs. P100). It's entirely possible that they forwent density improvements with a view to maximizing efficiency on their GV100 chip, and this could change with the SP-geared chips.
Agreed, it is possible, which is why I wrote that as well. It is totally possible that they will be able to achieve higher density on gaming-optimized chips, or more clock headroom compared to the corresponding Pascal cards. But I fail to see any indication of this yet.

It's also not unreasonable to assume that the GP102/4/6 successors will be larger, with the 470 mm² GP102 having roughly 25% leeway before hitting the maximum die size we have seen on consumer products.

If they can produce a ~16 TFLOPS GPU (without FP64/Tensor resources) on a ~600 mm² die with a power budget of 250 W, that will be in line with previous generational improvements.
It may not be impossible, but I do not think it likely that Nvidia, who have been very carefully managing their margins as of late, would invest more die space than they absolutely must. And this factor seems to be determined by Vega, which is an unknown quantity as of yet (yes, I know about the "benchmarks" shown) with regard to gaming. Of course it is possible to throw more die space at the performance problem, and they certainly have a bit of wiggle room there, but the question is: will they do it if they don't have to?
 
Agreed, it is possible, which is why I wrote that as well. It is totally possible that they will be able to achieve higher density on gaming-optimized chips, or more clock headroom compared to the corresponding Pascal cards. But I fail to see any indication of this yet.


It may not be impossible, but I do not think it likely that Nvidia, who have been very carefully managing their margins as of late, would invest more die space than they absolutely must. And this factor seems to be determined by Vega, which is an unknown quantity as of yet (yes, I know about the "benchmarks" shown) with regard to gaming. Of course it is possible to throw more die space at the performance problem, and they certainly have a bit of wiggle room there, but the question is: will they do it if they don't have to?

Totally agree with this, and again, there are so many unknowns at this point that it's really hard to guess where they are going.

If dp4a is clipped from GV102/104/106, then they can't position GV102 as their best inferencing chip, which leaves pro graphics as the most profitable market segment for it.

Is that enough to warrant producing a 600 mm² die and having it trickle down to consumers? I don't know, but judging by their GV100 preview, it seems the thread-scheduling changes are unlikely to boost performance much in graphics-oriented workloads, because, to my knowledge, there's far less divergence there than in compute-oriented loads.

Then again, it was a compute-oriented preview.

I'm curious to see if there will also be front-end changes, as I am hard-pressed to believe geometry throughput will be the same as on the other six-GPC Pascal parts.

Hopefully we will get more information soon; I would be surprised if this clocks significantly higher than the average Pascal part.
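For reference, the dp4a instruction mentioned above computes a four-way dot product of signed int8 values accumulated into a 32-bit integer - the primitive behind Pascal's Int8 inferencing throughput. A minimal Python model of its semantics (the helper name `dp4a` is just illustrative, not an actual API):

```python
def dp4a(a: bytes, b: bytes, c: int) -> int:
    """Software model of the dp4a instruction: dot product of four
    signed 8-bit values, accumulated into a 32-bit integer.
    (Illustrative only - not how the hardware implements it.)"""
    assert len(a) == 4 and len(b) == 4
    # Reinterpret each byte as a signed int8 before multiplying.
    sa = [x - 256 if x > 127 else x for x in a]
    sb = [x - 256 if x > 127 else x for x in b]
    return c + sum(x * y for x, y in zip(sa, sb))

# Four int8 weights times four int8 activations, plus an accumulator:
print(dp4a(bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), 100))  # 5+12+21+32+100 = 170
```

One instruction thus replaces four multiplies and four adds, which is why losing it would hurt GV102's positioning for inferencing.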
 
And this factor seems to be determined by Vega, which is an unknown quantity as of yet (yes, I know about the "benchmarks" shown) with regard to gaming.
Absolutely not. Volta is finished and taped out. Nothing can be changed at this point in time.

Of course it is possible to throw more die space at the performance problem, and they certainly have a bit of wiggle room there, but the question is: will they do it if they don't have to?
Again, I disagree. They have to keep pushing performance to motivate customers to buy shiny new graphics cards. 15~25% more die can easily be compensated for by higher prices, or by selling a smaller die for the same money. If the Vega leaks are any indication of the final performance/die-size ratio, then Nvidia won't have any issue selling Volta with great margins.
 
Absolutely not. Volta is finished and taped out. Nothing can be changed at this point in time.


Again, I disagree. They have to keep pushing performance to motivate customers to buy shiny new graphics cards. 15~25% more die can easily be compensated for by higher prices, or by selling a smaller die for the same money. If the Vega leaks are any indication of the final performance/die-size ratio, then Nvidia won't have any issue selling Volta with great margins.

I think he meant that what they release depends on how Vega pans out, not that the design of the chips themselves is still up in the air.
 
I'd generally agree, but I don't see Nvidia investing in decidedly larger die area for their more "economically optimized" GPUs. Transistor density between GP100 (601 mm², 15.3 bln) and GV100 (815 mm², 21.1 bln) has not evolved as much as the "12 nm" process tech would imply. I realize there is a possibility that Nvidia had to lower the density in GV100 for routing or other reasons, which might not be necessary with a possible GV102/4/6, but I do not see any hard indication of this.

If they do not increase the die, then what happens to GV102?
You end up with a mismatched product range, and they need to improve the FP32 TFLOPS of GV102 just as they did with GP102 relative to GP100 - GV102 should be their top-tier FP32 card, as its purpose is to have no FP64 beyond the weakest ratio.
So GP100 was 9.3 TFLOPS FP32 for the PCIe model and GP102 was, I think, 12.1 TFLOPS FP32 - just commenting more broadly.
600 mm² has been around for a while now, so there are options there for Nvidia beyond GP100, especially as they are now hitting even better efficiency with the latest evolutionary update of TSMC's 16 nm in the form of their "12 nm" (which is not really 12 nm).
Cheers
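For reference, peak-FP32 figures like those quoted above follow from the usual formula: 2 ops (one FMA) × CUDA cores × boost clock. The core counts and boost clocks below are the commonly quoted specs for the Tesla P100 PCIe and Titan Xp, so treat them as approximate:

```python
# Peak FP32 throughput = 2 ops (FMA) x CUDA cores x boost clock (GHz).
def peak_tflops(cores: int, boost_ghz: float) -> float:
    return 2 * cores * boost_ghz / 1e3

# Commonly quoted specs; approximate.
print(f"GP100 (Tesla P100 PCIe): {peak_tflops(3584, 1.303):.1f} TFLOPS")  # ~9.3
print(f"GP102 (Titan Xp):        {peak_tflops(3840, 1.582):.1f} TFLOPS")  # ~12.1
```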
 
If they do not increase the die, then what happens to GV102?
You end up with a mismatched product range, and they need to improve the FP32 TFLOPS of GV102 just as they did with GP102 relative to GP100 - GV102 should be their top-tier FP32 card, as its purpose is to have no FP64 beyond the weakest ratio.
Absolutely. History has shown us that Nvidia is not afraid of big dies when the process is mature. JHH has claimed many times that Pascal yields are fantastic and, 12 nm being a custom 16 nm, I expect that to remain the case (otherwise we wouldn't be seeing the mammoth 815 mm² V100).
IMHO, consumer Volta will be 20~30% larger than the same-class Pascal chip. In other words, GV102 should be around 570 mm² and GV104 around 400 mm². In terms of performance, as usual with Nvidia, GV104 (the xx80 card) will beat the 1080 Ti by 10~20% (the hardware scheduler will give an even bigger advantage in DX12 titles).
 
If they do not increase the die, then what happens to GV102?
You end up with a mismatched product range, and they need to improve the FP32 TFLOPS of GV102 just as they did with GP102 relative to GP100 - GV102 should be their top-tier FP32 card, as its purpose is to have no FP64 beyond the weakest ratio.
So GP100 was 9.3 TFLOPS FP32 for the PCIe model and GP102 was, I think, 12.1 TFLOPS FP32 - just commenting more broadly.
600 mm² has been around for a while now, so there are options there for Nvidia beyond GP100, especially as they are now hitting even better efficiency with the latest evolutionary update of TSMC's 16 nm in the form of their "12 nm" (which is not really 12 nm).
Cheers
There are a lot of options, that's for sure. And of course I am not claiming that my feeling at this moment is the be-all and end-all. :)
But what we have seen over the past two Nvidia generations, where, for one reason or another, competition from AMD was less stiff than it had been in the past, is that they went back down with their "oh-four" chip from 400-ish mm² to 300-ish mm² (GF114: 332 mm², GK104: 294 mm², GM204: 398 mm², GP104: 314 mm²).

Now, you could extrapolate a cadence here: the first chip on a new process is small (GK104), the second one larger (GM204), so GV104 might indeed have a larger die than its predecessor GP104.

But OTOH, Nvidia is a very financially driven company, and right now they seem to have a very comfortable standing in the 380 to 550 EUR price bracket with GP104, without a Radeon GPU to immediately challenge them there (which of course implies that I expect Vega to compete at least with the 1080 Ti in the price bracket above that - maybe a cut-down version will launch for a symbolic 499 US-$/EUR). With GM204 we've seen how low prices can go, especially for cut-down versions, if they have to, without destroying margins.

I think, barring an unexpectedly low price (say, 399) for a fast cut-down version of Vega, Nvidia will happily be able to live with a disconnect in performance between GV100 and GV102/104, whichever one they market first in 2018.
 
There are a lot of options, that's for sure. And of course I am not claiming that my feeling at this moment is the be-all and end-all. :)
But what we have seen over the past two Nvidia generations, where, for one reason or another, competition from AMD was less stiff than it had been in the past, is that they went back down with their "oh-four" chip from 400-ish mm² to 300-ish mm² (GF114: 332 mm², GK104: 294 mm², GM204: 398 mm², GP104: 314 mm²).

Now, you could extrapolate a cadence here: the first chip on a new process is small (GK104), the second one larger (GM204), so GV104 might indeed have a larger die than its predecessor GP104.

But OTOH, Nvidia is a very financially driven company, and right now they seem to have a very comfortable standing in the 380 to 550 EUR price bracket with GP104, without a Radeon GPU to immediately challenge them there (which of course implies that I expect Vega to compete at least with the 1080 Ti in the price bracket above that - maybe a cut-down version will launch for a symbolic 499 US-$/EUR). With GM204 we've seen how low prices can go, especially for cut-down versions, if they have to, without destroying margins.

I think, barring an unexpectedly low price (say, 399) for a fast cut-down version of Vega, Nvidia will happily be able to live with a disconnect in performance between GV100 and GV102/104, whichever one they market first in 2018.
Nvidia these days, though, is driven by products oriented more toward HPC, Tegra, and Quadro, with consumer having some synergy with them.
GV102/GV104 will be focused to fit well with the HPC and professional markets, meaning IMO Nvidia has little option but to increase the die across most products.
They cannot offer the next generation to the market without a notable increase in performance; they are not Intel :)

Need to consider that this is not a refresh, and the market is bigger than consumer, which shares or has synergy with the products released into the Tesla and professional lines.
Cheers
 
Totally agree with this, and again, there are so many unknowns at this point that it's really hard to guess where they are going.

If dp4a is clipped from GV102/104/106, then they can't position GV102 as their best inferencing chip, which leaves pro graphics as the most profitable market segment for it.

Is that enough to warrant producing a 600 mm² die and having it trickle down to consumers? I don't know, but judging by their GV100 preview, it seems the thread-scheduling changes are unlikely to boost performance much in graphics-oriented workloads, because, to my knowledge, there's far less divergence there than in compute-oriented loads.

Then again, it was a compute-oriented preview.

I'm curious to see if there will also be front-end changes, as I am hard-pressed to believe geometry throughput will be the same as on the other six-GPC Pascal parts.

Hopefully we will get more information soon; I would be surprised if this clocks significantly higher than the average Pascal part.
Regarding inferencing:
It remains to be seen just how it fits into the DL ecosystem, but the 150 W single-slot GV100 is meant to be focused on inferencing, though this may still come down only to FP16 with mixed precision.
I still think they need the accelerated mixed-precision dp4a Int8 operation, and it logically makes sense for them to still offer this with GV102.
It is fair to say a lot of the market would also like to see the Vec2 packed-FP16 unit in GV102 and the lower Teslas as well, but who knows whether Nvidia will do this now or later.

Cheers
 
Nvidia these days, though, is driven by products oriented more toward HPC, Tegra, and Quadro, with consumer having some synergy with them.
Agreed, but apart from tight production supply, I don't see where they would need anything other than GV100 for HPC and the data center. And Quadro - well, I suppose they could just fit larger framebuffers on the Pascal-based models, add a GV100 board at the top end, and be done with it. Until Vega shows up and competes not only against the Titan Xp in SPECviewperf, but also against the existing Quadros, with all parties using optimized drivers.

FWIW, upgrade cycles in the professional market are largely driven by financials and tax-deduction possibilities as well, and Pascal was/is one of those cycles after Kepler.

Tegra is a different beast altogether.
 
I just stated that a 10 TF card can do it; game developers are held back by consoles. The assets that I'm making will not run on today's consoles - just too many polys, too many bones, and too much on the lighting. And it will run at 2K resolutions. Yeah, at 30 FPS on a GTX 1080, but hell, that's OK. Right now, I can animate 10 of these characters running around on a GTX 970 at 30 fps with no advanced lighting (using UE4's default GI); on a Titan P, it's running well above 100 fps. So I see no problem with games coming out that look like that and run on Volta. As long as they aren't limited by consoles, it's not a problem.

How do you recover a production cost of 250 million+ when your target audience is Titan P owners? Hm...
Games have been able to look surprisingly real on so-called "consumer" hardware for a long time, but it makes no money.
 
How do you recover a production cost of 250 million+ when your target audience is Titan P owners? Hm...
Games have been able to look surprisingly real on so-called "consumer" hardware for a long time, but it makes no money.


Because I'm not planning on releasing the game for another 2 or more years ;) probably 4 years.

I kinda know target audiences, ya know ;)

And this game isn't going to cost 250 million - nowhere near.

Game project budgets are way over the top; even 50 million is way too much.
 
How do you recover a production cost of 250 million+ when your target audience is Titan P owners? Hm...
Games have been able to look surprisingly real on so-called "consumer" hardware for a long time, but it makes no money.
Or they spend forever just taking cash from Kickstarter... cough Star Citizen :)
Probably one of the few times corporate execs at Ubisoft/EA/similar are a bit envious of someone else's business model :) - joking, because it is not bringing in hundreds of millions of dollars, but if they could combine that approach with their own model they would be rather ecstatic; they could milk consumers even before preorders, DLC, and in-game purchase items, lol (I really should not laugh, though, because they will probably end up doing something crappy like this eventually).
Cheers
 