The point is they still increased it by 41.5% while the physical side only grew by about 33%, and they did that while adding even more functionality, all within the same TDP, on an already massive die, and on essentially the same 16nm node (albeit the latest iteration, which in typical TSMC fashion is called 12nm).
They still do packed, accelerated 2xFP16 math on V100, just like on P100, btw.
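For reference, here's a minimal CUDA sketch of what that packed FP16 path looks like from the programmer's side (the kernel and variable names are just illustrative, not from any real codebase): each half2 holds two FP16 values, and intrinsics like __hfma2 operate on both lanes in a single instruction.

```cpp
// Minimal illustration of packed 2xFP16 math (requires sm_53+, e.g. P100/V100).
// Kernel/array names here are hypothetical, purely for illustration.
#include <cuda_fp16.h>

__global__ void axpy_half2(int n2, __half2 a, const __half2* x, __half2* y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n2) {
        // One fused multiply-add applied to both packed FP16 lanes at once:
        // y[i] = a * x[i] + y[i]
        y[i] = __hfma2(a, x[i], y[i]);
    }
}
```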
You get ~30 TFLOPS of FP16, plus the Tensor Cores (matrix function units). The Tensor Cores are more specialised, aimed primarily at Deep Learning frameworks/apps (in future they could in theory also be used for professional rendering/modelling, though I'm not talking about gaming here).
Those Tensor Cores take FP16 inputs but accumulate in FP32 as well, so with a DL framework/app that supports them I think that works out to around 2x faster.
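If you want to see what that mixed-precision operation actually is, CUDA exposes the Tensor Cores directly through the WMMA API. The sketch below is just a bare-bones single 16x16x16 tile (not a tuned GEMM), assuming row-major A, col-major B, and a zeroed accumulator; it shows the FP16-multiply / FP32-accumulate shape of the hardware op.

```cpp
// Bare-bones Tensor Core sketch via the WMMA API (requires sm_70, i.e. V100).
// Computes one 16x16x16 tile: D = A*B + 0, with FP16 inputs and FP32 accumulate.
// Launch with (at least) one full warp of 32 threads cooperating on the tile.
#include <mma.h>
using namespace nvcuda;

__global__ void wmma_tile(const half* A, const half* B, float* D)
{
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag;

    wmma::fill_fragment(acc_frag, 0.0f);              // start accumulator at zero
    wmma::load_matrix_sync(a_frag, A, 16);            // leading dimension 16
    wmma::load_matrix_sync(b_frag, B, 16);
    wmma::mma_sync(acc_frag, a_frag, b_frag, acc_frag); // the Tensor Core op itself
    wmma::store_matrix_sync(D, acc_frag, 16, wmma::mem_row_major);
}
```

In practice you wouldn't write this by hand though; the DL frameworks get it through cuDNN/cuBLAS, which is where that speedup actually shows up.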
Cheers