Earlier this week I found some NVIDIA presentations containing details that I had not seen before (or at least don't remember seeing).
Slide 6 of "The Future of HPC and The Path to Exascale" (from 3 March 2014) gives a roadmap with a DP GFLOPS/W value for Pascal. The presentation's date falls between the GTC 2013 roadmap, which does not contain Pascal, and the GTC 2014 roadmap, which does.
Below are the approximate DP GFLOPS/W values for the various architectures:
- Tesla: 0.5
- Fermi: 2
- Kepler: 5.5
- Maxwell: 10.5
- Pascal: 14
- Volta: 22
Slide 43 of "Accelerators : The Changing Landscape" (from around May 2015) shows that a Pascal GPU has a peak of over 3 TFLOPS (presumably DP). I'm not sure whether that includes boost, but it very well could, because the GFLOPS values of existing GPU parts on page 14 are from the maximum boost clocks.
Now, I had written up a long post analyzing these two pieces of information, but I subsequently found some more presentations and I had to change most of it.
Slides 12, 14, and 17 of "GPU Accelerated Computing" (from 10 November 2014) contain a roadmap of specific Tesla parts, not just architectures. This roadmap contains a single-GPU part called "Pascal-Solo" with a 235 W TDP. [Also, you may notice something missing in slide 11, which makes sense given the date.]
The presentation doesn't specifically state what chip the "Pascal-Solo" uses, but I think the slides following the roadmap may point to the GP100 chip with HBM2. The roadmap contains all the Kepler Teslas that I previously knew of but it does not have any Maxwell Teslas (Maxwell isn't mentioned in this presentation at all), so I think it's possible that the "Pascal-Solo" isn't the only Tesla planned for 2016. But I haven't found any direct evidence of any other Pascal Teslas in 2016, or any Pascal releases before the later part of this year for that matter.
EDIT: That being said, I also haven’t found any hard evidence that says there will be no Pascal parts before late 2016. There may be little reason for presentations to specifically mention a future Pascal chip, even in a Tesla part, that does not have HBM2 or NVLink. I’m still hoping for a GP102 or GP104 release in March or April.
The last piece of information I found may explain what I previously thought might be a discrepancy between the "Future of HPC" roadmap and the GTC 2015 roadmap. The GTC 2015 roadmap shows ~42 SGEMM/W for Pascal. Given how close the SGEMM/W and theoretical SP GFLOPS/W numbers are for Maxwell, and that Maxwell and Pascal seem to be architecturally similar, I guessed that Pascal has a theoretical ~43 SP GFLOPS/W. I had also assumed that fast DP Maxwell has a 1:2 DP rate, but 43 is much higher than two times 14.
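A quick sanity check of that mismatch, using only the approximate values quoted above (~43 SP GFLOPS/W guessed, ~14 DP GFLOPS/W from the roadmap). The variable names are just for illustration, and both inputs are rough readings off roadmap charts rather than official specs:

```python
# Approximate values quoted in this post; neither is an official spec.
sp_gflops_per_w = 43.0  # guessed theoretical SP GFLOPS/W for Pascal
dp_gflops_per_w = 14.0  # approximate DP GFLOPS/W from the "Future of HPC" roadmap

# What SP GFLOPS/W would be if the DP rate were exactly 1:2.
expected_sp_at_1_to_2 = 2 * dp_gflops_per_w

# The SP:DP ratio that the two roadmap values actually imply.
implied_sp_to_dp = sp_gflops_per_w / dp_gflops_per_w

print(expected_sp_at_1_to_2)       # 28.0
print(round(implied_sp_to_dp, 2))  # 3.07
```

So the two roadmap figures taken together point to something close to 1:3 rather than 1:2.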
Slide 75 of "New hardware features in Kepler, SMX and Tesla K40" (from April 2014) mentions that a Pascal with stacked memory has 4 DP TFLOPS, 12 SP TFLOPS, and 1024 GB/s. It's worth noting that the DP value matches the value from a presentation linked earlier in this thread.
I don't think these FLOPS numbers automatically imply a 1:3 DP rate; there are few enough significant figures to mask small differences from 1:3.
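To put a rough number on how much the rounding could hide, here is a small sketch. It assumes the slide's 12 SP and 4 DP TFLOPS were each rounded to the nearest whole TFLOPS, which is my assumption and not something the slide states; `ratio_bounds` is a made-up helper name:

```python
def ratio_bounds(sp_tflops, dp_tflops, half_step=0.5):
    """Range of SP:DP ratios consistent with both values being rounded
    to the nearest whole TFLOPS (assumed rounding, not from the slide)."""
    lowest = (sp_tflops - half_step) / (dp_tflops + half_step)
    highest = (sp_tflops + half_step) / (dp_tflops - half_step)
    return lowest, highest

lowest, highest = ratio_bounds(12, 4)
print(f"{lowest:.2f} to {highest:.2f}")  # 2.56 to 3.57
```

Under that rounding assumption, anything from about 2.56:1 to 3.57:1 would still show up as "12 SP, 4 DP", so ratios near 1:3 fit but an exact 1:2 would not.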
Question 1: Is it possible for a Pascal chip to consist of some SMs with a 1:2 DP rate and other SMs with no DP or a 1:32 DP rate?
Taking into account the above information, the 12 SP and 4 DP TFLOPS values more closely align with a ~280 W TDP than a 235 W TDP (4 DP TFLOPS at ~14 DP GFLOPS/W works out to roughly 286 W). So I'm thinking that either some roadmap information is outdated or there is some hidden > 235 W Tesla part that we don't know about. After all, the K20X wasn't unveiled at the same time as the K20, even though both parts launched at about the same time. So my current guess for the 2016 Tesla lineup is as follows:
- Tesla P##: 1x GP100, ~14 DP GFLOPS/W, 235 W, ~3.3 DP TFLOPS, ~9.9 SP TFLOPS, 1 TB/s
- Tesla P##X: 1x GP100, ~14 DP GFLOPS/W, 275-300 W, 4 DP TFLOPS, 12 SP TFLOPS, 1 TB/s [less likely?]
- Question 2: Is it possible to have 2x GP100 on the same interposer? (Or even 2x GP102 if that chip uses HBM2.)
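The numbers in the guessed lineup above come from simple arithmetic on the ~14 DP GFLOPS/W roadmap value. A minimal sketch of it follows; the function names are hypothetical, and the 3x SP multiplier is an assumption taken from the 12:4 figures on slide 75, not a confirmed DP rate:

```python
PASCAL_DP_GFLOPS_PER_W = 14.0  # approximate roadmap value, not an official spec

def implied_tdp_watts(dp_tflops, gflops_per_w=PASCAL_DP_GFLOPS_PER_W):
    """Power needed to reach dp_tflops at the given efficiency."""
    return dp_tflops * 1000.0 / gflops_per_w

def implied_dp_tflops(tdp_watts, gflops_per_w=PASCAL_DP_GFLOPS_PER_W):
    """DP throughput reachable within tdp_watts at the given efficiency."""
    return tdp_watts * gflops_per_w / 1000.0

# 4 DP TFLOPS part: what TDP does it imply?
print(round(implied_tdp_watts(4.0)))  # 286

# 235 W "Pascal-Solo": what DP and (assumed 3x) SP throughput does it imply?
dp_235 = implied_dp_tflops(235)
print(round(dp_235, 2))      # 3.29
print(round(3 * dp_235, 2))  # 9.87
```

That is where the ~3.3 DP / ~9.9 SP TFLOPS figures for a 235 W part and the 275-300 W range for a 4 DP TFLOPS part come from.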
By the way, I have collected a large number of NVIDIA roadmaps in this presentation file, including all four in this post.
https://www.icloud.com/keynote/000-oJJ9_Z8mkHNjW-08KaA3Q#NVIDIA_GPU_roadmaps