Nvidia Volta Speculation Thread

Those are the numbers NV gave me. Their perf figures seem to be calculated against a ~1350MHz clockspeed, rather than the boost clock.
Or it could be a typo.
You can say the same about the Titan Xp when it comes to spec clocks and overclocking.
Even at 2.0GHz, the Titan Xp is only 15 TFLOPS vs the 20 TFLOPS of the Titan V. That is 33% more performance in favor of the Titan V.
Anyway, the point is that the gain over the Titan Xp is not massive (21.5% more relevant CUDA cores with the Titan V), which is reflected in some of those scores for the reason I mentioned.
Some claim the tester was having difficulties with GPU under-utilization and throttling. The performance is there, as evidenced by the Gears 4 and Superposition numbers; it's just not manifested well in 3DMark for some reason. We'll have to wait for a proper review and analysis from a credible site.
 
Or it could be a typo.

Even at 2.0GHz, the Titan Xp is only 15 TFLOPS vs the 20 TFLOPS of the Titan V. That is 33% more performance in favor of the Titan V.

Some claim the tester was having difficulties with GPU under-utilization and throttling. The performance is there, as evidenced by the Gears 4 and Superposition numbers; it's just not manifested well in 3DMark for some reason. We'll have to wait for a proper review and analysis from a credible site.
You can only take both from spec, then calculate clock and TFLOPS from that.
The Titan V has 21.5% more CUDA cores in the regard you are interested in, and both possibly boost to the same level, although you need to note the Titan V is a monster die and will have much greater total power consumption/thermals.
So that is why, like I mentioned, some results will not be that impressive and, as some have noted, they fit that scaling, but other tests may utilise the newer arch and will see a much greater benefit at similar clocks.
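For anyone wanting to redo the arithmetic, here is a rough sketch of the usual peak-FP32 estimate (2 FLOPs per CUDA core per clock, i.e. one fused multiply-add). The core counts (3840 for the Titan Xp, 5120 for the Titan V) are assumptions taken from the public spec sheets rather than from this thread, and the function names are just illustrative. It says nothing about sustained clocks or real-world utilisation.

```python
# Rough peak-FP32 estimate: each CUDA core retires one FMA (2 FLOPs) per clock.
# Core counts are assumptions from public spec sheets, not figures from this thread.
TITAN_XP_CORES = 3840
TITAN_V_CORES = 5120


def peak_fp32_tflops(cuda_cores: int, clock_mhz: float) -> float:
    """Theoretical peak FP32 TFLOPS at a given core clock in MHz."""
    return 2 * cuda_cores * clock_mhz / 1e6


def implied_clock_mhz(tflops: float, cuda_cores: int) -> float:
    """Core clock in MHz that a quoted peak-TFLOPS figure implies."""
    return tflops * 1e6 / (2 * cuda_cores)


if __name__ == "__main__":
    for name, cores in (("Titan Xp", TITAN_XP_CORES), ("Titan V", TITAN_V_CORES)):
        print(f"{name}: {peak_fp32_tflops(cores, 2000):.2f} TFLOPS at 2000 MHz")
```

At 2000 MHz this gives roughly 15.4 and 20.5 TFLOPS, which lines up with the 15 vs 20 TFLOPS comparison quoted above. `implied_clock_mhz` runs the same sum in reverse, which is handy for checking what clock a vendor's quoted figure was calculated against.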

Have you got a link showing a 20 TFLOPS FP32 figure for the PCIe version of the V100, either the Titan or the HPC card? I am not yet confident I would fully accept some of what is seen on Reddit and reported by tools such as EVGA Precision without further validation.
I still cannot see how it can sustain 2GHz with such a die and that cooler; we really need to see accurate power consumption figures as well for such a GPU die.
They rate Titan V the same as most other top Geforce cards at 250W, but the top NVLink V100 is 300W - yeah some of that will be the NVLink but also due to its higher clocks.
Thanks

Edit:
I mean, the WhyCry results I looked at just now sort of fit with what I am saying, both from the CUDA core increase and possibly the detriment in certain cases from the SM doubling, while other tests may gain a fair bit more due to benefiting from other aspects of the newer arch.
 
Priceless. 3k for the card, but free benchmarks only plz. :D I can understand though.

They rate Titan V the same as most other top Geforce cards at 250W, but the top NVLink V100 is 300W - yeah some of that will be the NVLink but also due to its higher clocks.
Three instead of four HBM2 stacks, and one memory controller plus its associated ROP cluster and 1.5 MiB L2 tile are gone as well.
 
Priceless. 3k for the card, but free benchmarks only plz. :D I can understand though.


Three instead of four HBM2 stacks, and one memory controller plus its associated ROP cluster and 1.5 MiB L2 tile are gone as well.
True, but one stack and the relevant L2 will not be more than what, 10W tops, if that, considering they are also clocked lower than the 2Gbps spec?
Look at the Tesla P100, which had both 12GB and 16GB models.
In the scheme of things core clocks would be more notable, where the NVLink version is a fair bit higher, albeit adding to the power demand somewhat as well.
The boost clock of the Titan V is the same as the 300W NVLink2 V100 Mezzanine model. As a side note, the Mezzanine model can also be said to have better cooling due to its implementation, but one needs to see the performance envelope against power demand to judge how important this could be; more so if the card is pushed, but it may also affect leakage/static power and efficiency somewhat.
 
Can tensor cores run at the same time as normal SMs at peak performance? I find it strange the tensor cores aren't promoted for visualization; someone should be able to find a graphics use for the 4x4 matrix flops.
 
Then again, when running pure fp32, there's lots of dark silicon to help with cooling (?)
You would still have hotspots, and it is still a 250W/300W die well before 2GHz. The context is that this is not cooled like the Mezzanine and is also using the reference blower, and like I mentioned there are thermal-related efficiency losses from power leakage/static power.
Yeah, it is not quite the same thermal issue as with transistor density, but the power consumption is still going to be large due to the core/SM structure increase, and that is a consideration for the VRM/power stage as well.

The Titan X (reduced core count Pascal version) hit 249W at around 1750MHz to 1775MHz. I cannot find an accurate figure for the Titan Xp, which has the full core count and is obviously still smaller than the Titan V, so the only context we have is the Mezzanine V100 with its 300W TBP and higher specification core clock than the V100 250W PCIe card.
But to reiterate, this needs to be taken in context with my full posts; I am not being critical here, as I think it is a great GPU for a top Titan model.
 
I'm thinking of buying a Titan V to use for deep learning; if so I'll probably spend a few days/weeks writing custom microbenchmarks for it. Sadly the current shipping date on the UK website is Dec 30, so I haven't pulled the trigger on it yet :(
 