Nvidia Volta Speculation Thread

TSMC's N7+ will do the trick.
That sounds like too small an increase for 7nm, and the timeframe is very early for 7nm, as they wrote early samples in late Q1 and availability in H2. "Post-Volta" can simply mean the gaming-oriented variant of the Volta architecture. Maybe they now count a change like going from 128 SPs to 64 SPs per SM as a different architecture, or there are now more differences between the compute and gaming architectures. There might be a lot of marketing in it. Xavier does 30 TOPS in 30 W with a dedicated deep learning accelerator, or 10 TOPS at 30 W with tensor cores. It might be that they just build a bigger DL accelerator into the next GPU. Without more details it's hard to guess.
 
This is technically "post-Volta", but I think this thread might be the next best place to share.

https://www.anandtech.com/show/1191...-pegasus-at-gtc-europe-2017-feat-nextgen-gpus

130 TOPS in roughly 220 W is a pretty sizeable increase considering the V100 does 120 TOPS in 300 W (about 0.59 TOPS/W versus 0.40 TOPS/W).

Not necessarily an improvement: a self-driving DL computer in a car only needs to run inference with a pre-trained DL model rather than train the network itself, so the 130 TOPS could very well be very low-precision throughput like INT8 or even lower, just as GP102 can do nearly 50T DL ops (INT8) while GP100 can only do about 22T DL ops (FP16).

Maybe Nvidia will put some INT8 tensor cores into their Volta GeForce product line.
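
For intuition on why the INT8 numbers come out so much higher: a DP4A-style instruction multiplies four INT8 pairs and accumulates them into a 32-bit register, so one 32-bit lane retires four MACs per cycle. Here's a minimal NumPy sketch of the quantize-then-accumulate arithmetic behind INT8 inference (the scale factors and vector sizes are made up for illustration):

```python
import numpy as np

def quantize_int8(x, scale):
    # Map float values onto the signed 8-bit grid [-128, 127].
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

rng = np.random.default_rng(0)
w = rng.standard_normal(64).astype(np.float32)   # pre-trained weights
a = rng.standard_normal(64).astype(np.float32)   # activations

w_scale, a_scale = 0.05, 0.1                     # assumed calibration scales
wq = quantize_int8(w, w_scale)
aq = quantize_int8(a, a_scale)

# DP4A-style: INT8 x INT8 products accumulated into a wide INT32 register,
# then rescaled back to float once at the end.
acc = np.dot(wq.astype(np.int32), aq.astype(np.int32))
approx = acc * (w_scale * a_scale)

print(approx, np.dot(w, a))  # close results, at a quarter of the bit width
```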
 
Looks like Baidu and Nvidia have made some big improvements with FP16 training, using loss scaling to preserve accuracy and reduce memory use; this is used with the Volta tensor cores and libraries, as the techniques are otherwise too slow due to the extra steps/cycles involved.

Nvidia blog said:
three techniques for successful training of DNNs with half precision: accumulation of FP16 products into FP32; loss scaling; and an FP32 master copy of weights. With these techniques NVIDIA and Baidu Research were able to match single-precision result accuracy for all networks that were trained (Mixed-Precision Training). Note that not all networks require training with all of these techniques.
https://devblogs.nvidia.com/parallelforall/mixed-precision-training-deep-neural-networks/
Baidu/Nvidia paper on the topic: https://arxiv.org/pdf/1710.03740.pdf
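
For intuition, here's a minimal NumPy sketch of those three techniques (FP32 master weights, loss scaling, FP32 accumulation of FP16 products) on a toy linear-regression step; the sizes, learning rate, and scale factor are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 32)).astype(np.float16)  # FP16 inputs
y = rng.standard_normal((256, 1)).astype(np.float16)

w_master = np.zeros((32, 1), dtype=np.float32)  # FP32 master copy of weights
loss_scale = 1024.0  # shifts small gradients up into FP16's representable range
lr = 1e-2

for step in range(200):
    w16 = w_master.astype(np.float16)            # FP16 weights for this pass
    # FP16 operands, FP32 accumulation -- mimics the tensor-core contract.
    pred = X.astype(np.float32) @ w16.astype(np.float32)
    err = pred - y.astype(np.float32)
    # Scale the loss so tiny gradients survive the cast to FP16...
    grad16 = (2.0 / len(X) * loss_scale * (X.astype(np.float32).T @ err)).astype(np.float16)
    # ...then unscale in FP32 before updating the master weights.
    w_master -= lr * grad16.astype(np.float32) / loss_scale
```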

They have shown this working well for a couple of Baidu applications now; it seems like a pretty important milestone.
Cheers
 
Am I missing something with regard to the claimed performance enhancements in Volta?
1. Twice the perf/W
2. Better L1 cache performance and size (some graph comparing the LDS vs. the new L1)
3. Reduced latency for dependent back-to-back ALU instructions

Anything else you can think of?
 
Given the price estimates for the Volta ASIC (around $2000-$3000 at full discount, or around that magnitude, I think), and given how over-engineered that thing is, does anyone actually believe we are going to see Volta-based GeForce cards? Especially given the complete lack of any such announcements?

If not, what else? Possibly a Pascal shrink instead? GP200 series?

It's not like Nvidia would need much to take the performance crown decisively again, but a more recent node should give sufficient headroom to push the perf/W boundary quite a bit further.

If we are actually getting a different architecture for the GV100-based Tesla cards than for the (possibly) GP200-based GeForce cards, it would also become unlikely that Nvidia would release a GV100-based Titan card.



Besides, I would be very careful with any numbers Nvidia publishes regarding the neural-network performance of the Volta cards, especially if the word "TensorRT" happens to be hidden somewhere in the footnotes. That essentially means it wasn't the same network but a minimized one (layers combined, near-zero weights eliminated, precision reduced wherever possible), whereas the CPU "reference" had to execute the full network.

Apply the same basic minimization methods to the network executed on the CPU, and I severely doubt the GV100 could still claim more than a 5-10x speedup.
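
To make the "minimized network" point concrete, here's a rough NumPy sketch of the kinds of optimizations listed above: folding a batch-norm layer into the preceding layer (one form of layer combining), pruning near-zero weights, and dropping the result to FP16. This illustrates the general idea only; it is not TensorRT's actual pipeline, and the threshold and shapes are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# A trained fully-connected layer followed by batch norm.
W = rng.standard_normal((128, 128)).astype(np.float32)
b = np.zeros(128, dtype=np.float32)
gamma, beta = rng.uniform(0.5, 1.5, 128).astype(np.float32), np.zeros(128, np.float32)
mean, var = rng.standard_normal(128).astype(np.float32), np.ones(128, np.float32)

# "Layers combined": fold the batch norm into the layer's weights and bias,
# so only a single matmul remains at inference time.
scale = gamma / np.sqrt(var + 1e-5)
W_fused = W * scale[:, None]
b_fused = (b - mean) * scale + beta

# "Near-zero weights eliminated": magnitude pruning with an arbitrary threshold.
threshold = 0.1
W_pruned = np.where(np.abs(W_fused) < threshold, 0.0, W_fused)
print(f"pruned {np.mean(W_pruned == 0):.0%} of weights")

# "Reduced precision where possible": drop the surviving weights to FP16.
W_deploy = W_pruned.astype(np.float16)
```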
 
Given the price estimates for the Volta ASIC (around $2000-$3000 at full discount, or around that magnitude, I think), and given how over-engineered that thing is, does anyone actually believe we are going to see Volta-based GeForce cards? Especially given the complete lack of any such announcements?
Isn't that for the V100, i.e. because it has tensor units? They have worked on shader performance as well as the tensor units, so why would they waste all that R&D money by not releasing gaming cards based on the architecture? And didn't they say they had big things in store for the graphics side? (I don't remember the exact quote.)
 
Besides, I would be very careful with any numbers Nvidia publishes regarding the neural-network performance of the Volta cards, especially if the word "TensorRT" happens to be hidden somewhere in the footnotes. That essentially means it wasn't the same network but a minimized one (layers combined, near-zero weights eliminated, precision reduced wherever possible), whereas the CPU "reference" had to execute the full network.
I think people have started focusing on performance numbers outside Nvidia's influence, and so far researchers' results seem to align with Nvidia's Volta statements. Comparisons at this stage are primarily against the P100, or whatever they previously used.
 
Am I missing something with regard to the claimed performance enhancements in Volta?
1. Twice the perf/W
2. Better L1 cache performance and size (some graph comparing the LDS vs. the new L1)
3. Reduced latency for dependent back-to-back ALU instructions

Anything else you can think of?
4. Increased physical clock speed of the HBM memory
5. Increased efficiency of the HBM memory controller compared to P100
6. Separate integer execution units as opposed to shared with FP32
 
If we are actually getting a different architecture for the GV100-based Tesla cards than for the (possibly) GP200-based GeForce cards, it would also become unlikely that Nvidia would release a GV100-based Titan card.

Of course there won't be a GV100-based Titan; it's a pure HPC chip like GP100. A Titan would only be possible with a GV102. But going by all the info published with Drive PX Pegasus, I agree that there won't be Volta-based GeForce cards. The GeForce cards will get the post-Volta architecture used in Pegasus.
 
4. Increased physical clock speed of the HBM memory
5. Increased efficiency of the HBM memory controller compared to P100
6. Separate integer execution units as opposed to shared with FP32
Thanks for the input. What are the chances any of the Volta graphics cards have HBM?
 
The gaming architecture will also have tensor cores. For compatibility and for devs there will be at least a small number of them, like DP on consumer cards at the moment. The chip in Pegasus has a high number of TCs, so there will even be GPUs with a lot of tensor cores, though maybe with lower precision than the V100's tensor cores.
 
The gaming architecture will also have tensor cores. For compatibility and for devs there will be at least a small number of them, like DP on consumer cards at the moment. The chip in Pegasus has a high number of TCs, so there will even be GPUs with a lot of tensor cores, though maybe with lower precision than the V100's tensor cores.
Double precision is fully programmable, while tensor cores are fixed-function units aimed at AI. There is likely zero chance they'll be in consumer cards, or even Quadros for that matter.
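
For reference on what the fixed function actually is: Nvidia's Volta material describes each tensor core as performing a 4x4x4 matrix fused multiply-add, D = A*B + C, with FP16 inputs accumulated in FP32. A minimal NumPy emulation of that contract:

```python
import numpy as np

def tensor_core_mma(A, B, C):
    """Emulate one Volta tensor-core op: D = A @ B + C on 4x4 tiles,
    with FP16 multiplicands and FP32 accumulation."""
    assert A.dtype == B.dtype == np.float16 and A.shape == (4, 4)
    return A.astype(np.float32) @ B.astype(np.float32) + C.astype(np.float32)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)).astype(np.float16)
B = rng.standard_normal((4, 4)).astype(np.float16)
C = np.zeros((4, 4), dtype=np.float32)
print(tensor_core_mma(A, B, C))
```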
 