Nvidia Volta Speculation Thread

Samwell · Oct 10, 2017

Bondrewd said:
TSMC's N7+ will do the trick.

That sounds like too small an increase for 7nm, and the timeframe is very early for 7nm, since they wrote early samples in late Q1 and availability in H2. "Post-Volta" could just mean the gaming-oriented architecture of Volta. Maybe they now count changes like going from 128 SPs to 64 SPs per SM as a different architecture, or there are more changes now between the compute and gaming architectures. There might be a lot of marketing in it. Xavier has 30 TOPS in 30W with a dedicated deep learning accelerator, or 10 TOPS at 30W with tensor cores. It might be that they just build a bigger DL accelerator into the next GPU. Without more details it's hard to guess.

LiXiangyang · Oct 11, 2017

ImSpartacus said:
This is technically "post-Volta", but I think this thread might be the next best place to share.

https://www.anandtech.com/show/1191...-pegasus-at-gtc-europe-2017-feat-nextgen-gpus

130 TOPS in 220ish W is pretty sizeable increase considering V100 does 120 TOPS in 300W.

Not necessarily an improvement. A self-driving DL computer in a car only needs to apply a pre-trained DL model to make predictions rather than train the network itself, so the 130 TOPS could very well be very low-precision work like int8 or even lower, just as GP102 can do nearly 50T DL ops but GP100 can only do 22T DL ops.
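
As a rough sanity check on the quoted figures, the perf/W comparison can be worked out as below. This is a minimal Python sketch using only the numbers from the posts above (130 TOPS at roughly 220 W versus 120 TOPS at 300 W). The function name `tops_per_watt` is made up for illustration, and the result says nothing about whether the two TOPS figures refer to the same precision, which is exactly the caveat raised here.

```python
def tops_per_watt(tops: float, watts: float) -> float:
    """Return throughput per watt for a quoted TOPS figure and power budget."""
    return tops / watts


# Figures as quoted in the thread (not independently verified).
post_volta = tops_per_watt(130, 220)  # next-gen GPU figure from the post
v100 = tops_per_watt(120, 300)        # V100 tensor figure from the post

# How much better the quoted post-Volta number is, per watt.
ratio = post_volta / v100

print(f"post-Volta: {post_volta:.2f} TOPS/W")
print(f"V100:       {v100:.2f} TOPS/W")
print(f"ratio:      {ratio:.2f}x")
```

On those quoted numbers the ratio comes out at roughly 1.48x, but if the 130 TOPS is int8 and the 120 TOPS is FP16 with FP32 accumulation, the two are not directly comparable.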

Maybe Nvidia will put some int8 tensor cores in their Volta GeForce product line.

CSI PC · Oct 12, 2017

Looks like Baidu and Nvidia have been making some big improvements with FP16 training using gradient scaling for accuracy and memory resource improvements; this is used with the Volta Tensor cores and libraries as the techniques are too slow otherwise due to the steps/cycles involved.

Nvidia blog said:
three techniques for successful training of DNNs with half precision: accumulation of FP16 products into FP32; loss scaling; and an FP32 master copy of weights. With these techniques NVIDIA and Baidu Research were able to match single-precision result accuracy for all networks that were trained (Mixed-Precision Training). Note that not all networks require training with all of these techniques.

https://devblogs.nvidia.com/parallelforall/mixed-precision-training-deep-neural-networks/
Baidu/Nvidia paper on the topic: https://arxiv.org/pdf/1710.03740.pdf
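
To illustrate why two of the quoted techniques matter (loss scaling and the FP32 master copy of weights), here is a small Python sketch. It is not the code from the blog or the paper: it only emulates FP16 and FP32 rounding with the standard `struct` module, and the helper names (`to_fp16`, `to_fp32`), the gradient value, the scale factor of 1024 and the update size are all assumptions chosen for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Loss scaling: a tiny gradient underflows to zero in FP16, but survives
# if it is scaled up before the FP16 cast and scaled back down afterwards.
grad = 1e-8                 # hypothetical small gradient
loss_scale = 1024.0         # hypothetical scale factor
unscaled_fp16 = to_fp16(grad)
recovered = to_fp16(grad * loss_scale) / loss_scale

# FP32 master weights: an update smaller than the FP16 spacing near 1.0
# is rounded away every step, while an FP32 copy keeps accumulating it.
update = 1e-4               # hypothetical weight update per step
w_fp16 = to_fp16(1.0)
w_master = to_fp32(1.0)
for _ in range(100):
    w_fp16 = to_fp16(w_fp16 + update)
    w_master = to_fp32(w_master + update)

print("unscaled FP16 gradient:", unscaled_fp16)
print("recovered gradient:    ", recovered)
print("FP16 weight after 100 steps:  ", w_fp16)
print("FP32 master after 100 steps:  ", w_master)
```

The FP16-only weight stays stuck at 1.0 because each update is below half the FP16 spacing at that value, which is the kind of stall the master copy is meant to avoid.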

They have shown this working well now for a couple of Baidu applications, seems like a pretty important milestone.
Cheers

Grall · Oct 12, 2017

Skynet is one step closer... Yayy!

xpea · Oct 15, 2017

Impressive V100 results on Huygens deconvolution GPU acceleration:
https://svi.nl/blogpost70-GPU-decon...VIDIA-s-brand-new-Volta-based-Tesla-V100-card
On 1000x1000x100 bench, V100 is nearly 2 times faster than P100, which is much better than the difference in specified FLOPs...

Deleted member 2197 · Oct 16, 2017

Volta DGX-1 User Guide ...
https://images.nvidia.com/content/technologies/deep-learning/pdf/DGX-1-UserGuide.pdf

Infinisearch · Oct 16, 2017

Am I missing something in regards to performance enhancements (claims) with volta?
1. Twice the perf/w
2. better L1 cache performance and size (some graph comparing LDS vs new L1)
3. reduced latency for dependent back to back alu instructions.

Anything else you can think of?

Ext3h · Oct 16, 2017

Given the price estimates for the Volta ASIC (around $2000-$3000 at full discount, or around that magnitude I think), and given how over-engineered that thing is, does anyone actually believe that we are going to see Volta-based GeForce cards? Especially given the complete lack of any such announcements?

If not, what else? Possibly a Pascal shrink instead? GP200 series?

It's not as if Nvidia needs much to decisively take the performance crown again, but a more recent node should give enough headroom to push perf/W quite a bit further.

If we are actually getting different architectures for the GV100-based Tesla cards and the (possibly) GP200-based GeForce cards, it would also become unlikely that Nvidia would release a GV100-based Titan card.

Besides, I would be very careful with all the numbers Nvidia publishes regarding neural network performance of the Volta cards, especially if the word "TensorRT" happens to be hidden somewhere in the footnotes. That essentially means it wasn't the same network but a minimized one (layers combined, near-zero weights eliminated, reduced precision in all parts where possible), whereas the CPU "reference" had to execute the full network instead.

Apply the same basic minimization methods to the network executed on the CPU, and I severely doubt whether the GV100 could still claim more than a 5-10x speedup at most.
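
For a feel of what two of those minimization steps do (dropping near-zero weights and reducing precision), here is a toy Python sketch. It is not TensorRT's actual implementation or API: the function names, the pruning threshold, the symmetric int8 scheme and the example weights are all assumptions made up for illustration.

```python
def prune(weights, threshold):
    """Zero out weights whose magnitude is below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def quantize_int8(weights):
    """Symmetric linear quantization to the int8 range; returns (ints, scale)."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate floating-point weights."""
    return [q * scale for q in quantized]


# Hypothetical layer weights, half of them close to zero.
weights = [0.8, -0.002, 0.35, 0.0004, -0.61, 0.001]

pruned = prune(weights, threshold=0.01)
quantized, scale = quantize_int8(pruned)
approx = dequantize(quantized, scale)

nonzero = sum(1 for w in pruned if w != 0.0)
max_error = max(abs(a - p) for a, p in zip(approx, pruned))

print("weights kept:", nonzero, "of", len(weights))
print("int8 values: ", quantized)
print("max quantization error:", max_error)
```

Half of the toy weights disappear and the rest fit in one byte each, so a fair comparison would apply the same treatment to the network run on the CPU.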

Infinisearch · Oct 16, 2017

Ext3h said:
Given the price estimates for the Volta ASIC (around $2000-$3000 at full discount, or around that magnitude I think), and given how over-engineered that thing is, does anyone actually believe that we are going to see Volta-based GeForce cards? Especially given the complete lack of any such announcements?

Isn't that for V100? As in it has tensor units. They have worked on the shader performance as well as tensor units so why would they waste all that R&D money by not releasing gaming cards based on the architecture? And didn't they say they had big things in store for the graphics side of things? (don't remember the exact quote)

Deleted member 2197 · Oct 16, 2017

Ext3h said:
Besides, I would be very careful with all the numbers Nvidia publishes regarding neural network performance of the Volta cards, especially if the word "TensorRT" happens to be hidden somewhere in the footnotes. That essentially means it wasn't the same network but a minimized one (layers combined, near-zero weights eliminated, reduced precision in all parts where possible), whereas the CPU "reference" had to execute the full network instead.

I think people have started focusing on performance numbers outside Nvidia's influence, and so far researchers' results seem to align with Nvidia's Volta statements. Comparisons at this stage seem to be primarily against P100, or whatever they previously used.

silent_guy · Oct 16, 2017

Infinisearch said:
Am I missing something in regards to performance enhancements (claims) with volta?
1. Twice the perf/w
2. better L1 cache performance and size (some graph comparing LDS vs new L1)
3. reduced latency for dependent back to back alu instructions.

Anything else you can think of?

4. Increased physical clock speed of the HBM memory
5. Increased efficiency of the HBM memory controller compared to P100
6. Separate integer execution units as opposed to shared with FP32

Samwell · Oct 16, 2017

Ext3h said:
If we are actually getting different architectures for the GV100-based Tesla cards and the (possibly) GP200-based GeForce cards, it would also become unlikely that Nvidia would release a GV100-based Titan card.

Of course there won't be a GV100-based Titan. This is a pure HPC chip like GP100. A Titan would only be possible with a GV102. But given all the info published with Drive PX Pegasus, I agree that there won't be Volta-based GeForce cards. The GeForce cards will get the post-Volta architecture used in Pegasus.

Infinisearch · Oct 16, 2017

silent_guy said:
4. Increased physical clock speed of the HBM memory
5. Increased efficiency of the HBM memory controller compared to P100
6. Separate integer execution units as opposed to shared with FP32

Thanks for the input. What are the chances any of the Volta graphics cards have HBM?

silent_guy · Oct 17, 2017

Infinisearch said:
Thanks for the input. What are the chances any of the Volta graphics cards have HBM?

With GDDR6 on the horizon? Quite low, I think.

At some point, I expect there to be a Volta HBM based Quadro version, just like for GP100. Does that count?

seahawk · Oct 17, 2017

Do you see many similarities between a GP104 and a GP100? It won't be much different for Volta.

Infinisearch · Oct 17, 2017

seahawk said:
Do you see many similarities between a GP104 and a GP100? It won't be much different for Volta.

Volta has tensor cores per SM, so the layout of things in the SM might have to be significantly different between GV100 and GV102/104/106/107.

Samwell · Oct 17, 2017

The gaming architecture will also have tensor cores. For compatibility and for devs there will be at least a small number of them, as with DP on consumer cards at the moment. The chip on Pegasus has a high number of TCs, so there will even be GPUs with a lot of tensor cores, but maybe with lower precision than the V100 tensor cores.

Infinisearch · Oct 17, 2017

Samwell said:
The gaming architecture will also have tensor cores. For compatibility and for devs there will be at least a small number of them, as with DP on consumer cards at the moment. The chip on Pegasus has a high number of TCs, so there will even be GPUs with a lot of tensor cores, but maybe with lower precision than the V100 tensor cores.

Double precision is fully programmable, while tensor cores are fixed-function units aimed at AI. There is likely zero chance they'll be in consumer cards, or even Quadros for that matter.

Infinisearch · Oct 17, 2017

silent_guy said:
At some point, I expect there to be a Volta HBM based Quadro version, just like for GP100. Does that count?

Yeah, it counts, but I was more focused on consumer cards when I asked. I was hoping to see HBM2 on the new Titan and Ti models at least.

Kaotik · Oct 17, 2017

Infinisearch said:
Volta has tensor cores per SM, so the layout of things in the SM might have to be significantly different between GV100 and GV102/104/106/107.

I think he was referring to just that: GP104 and GP100 are already a world apart, and GV104 and GV100 will be at least as far apart, if not more.
