Doesn't seem like Nvidia sent any Titan Z cards to reviewers.
Also this is a Chinese review.
That is why they didn't send them to reviewers.
Well, as far as I managed to find out, there won't be any Maxwell-based Tesla soon (meaning: not in the next 6 months).
Instead, there will be a launch this month for a dual-chip, Kepler-based Tesla SKU. As strange as it sounds.
It could be the Tesla K80, which is rumored to be a dual-chip part.
My SMX assumption doesn't seem to hold. According to slide 16 in this slide deck, the K80 has 2.9 TF DP, 4992 CCs, and 480 GB/s memory bandwidth. These specs would imply 13 SMXs per chip and ~870 MHz core clock.
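For what it's worth, here is the back-of-the-envelope arithmetic behind those numbers as I read them; the assumptions (GK110-style SMXs with 192 FP32 / 64 FP64 lanes, FMA counted as two flops, a 384-bit bus per chip) are mine, not from the slide deck:

```cuda
// Sanity check of the slide-16 specs under my GK110-style assumptions.
#include <stdio.h>

int main(void) {
    double cores_per_board = 4992.0;                          // both chips combined
    double smx_per_chip    = cores_per_board / 2.0 / 192.0;   // 192 FP32 lanes per SMX -> 13
    double dp_lanes_board  = 2.0 * smx_per_chip * 64.0;       // 64 FP64 lanes per SMX -> 1664
    double core_clock_mhz  = 2.9e12 / (dp_lanes_board * 2.0) / 1e6;  // FMA = 2 flops -> ~871 MHz
    double mem_gbps        = (480.0 / 2.0) / (384.0 / 8.0);   // per-chip bandwidth / bus bytes -> ~5.0
    printf("SMX/chip: %.0f  core clock: %.0f MHz  memory: %.1f Gbps effective\n",
           smx_per_chip, core_clock_mhz, mem_gbps);
    return 0;
}
```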
If this image is real then the K80 (GK210-DUO) would have ~520 MHz core clock and ~5.0 Gbps memory, assuming all SMXs and memory interfaces are enabled.
My favorite thing about GK210 is the 512 kB of register file and 128 kB of L1/shared memory per SM. That can be nice for occupancy-limited code.
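A minimal sketch of how you could see that effect, assuming a hypothetical register-hungry kernel (heavyKernel and its body are placeholders I made up); the occupancy API reports resident blocks per SM for whatever device is current, so GK210's doubled register file should show up directly for kernels that are register-limited on GK110:

```cuda
#include <cstdio>

// Placeholder kernel; imagine a body that needs many registers per thread.
__global__ void heavyKernel(float *out, const float *in, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2.0f;
}

int main()
{
    const int blockSize = 256;
    int blocksPerSM = 0;
    // How many blocks of heavyKernel can stay resident on one SM of the current
    // device, given its register file and shared memory budget (CUDA 6.5+ API).
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSM, heavyKernel,
                                                  blockSize, 0 /* dynamic smem */);
    // 2048 = max resident threads per Kepler SMX.
    printf("Resident blocks per SM: %d (%.0f%% occupancy with %d-thread blocks)\n",
           blocksPerSM, 100.0 * blocksPerSM * blockSize / 2048.0, blockSize);
    return 0;
}
```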
http://www.guru3d.com/news-story/nvidia-tesla-k80-dual-gpu-compute-accelerator.html
The Tesla K80 dual-GPU accelerator delivers nearly two times higher performance and double the memory bandwidth of its predecessor, the Tesla K40 GPU accelerator. With ten times higher performance than today's fastest CPU, it outperforms CPUs and competing accelerators on hundreds of complex analytics and large, computationally intensive scientific computing applications.
The Tesla K80 delivers up to 8.74 teraflops single-precision and up to 2.91 teraflops double-precision peak floating point performance, and 10 times higher performance than today's fastest CPUs.
The Tesla K80 accelerates the broadest range of scientific, engineering, commercial and enterprise HPC and data center applications -- more than 280 in all. The complete catalog of GPU-accelerated applications (PDF) is available as a free download. More information about the Tesla K80 dual-GPU accelerator is available at NVIDIA booth 1727 at SC14, Nov. 17-20, and on the NVIDIA high performance computing website.
Key features of the Tesla K80 dual-GPU accelerator include:
- Two GPUs per board - Doubles throughput of applications designed to take advantage of multiple GPUs.
- 24GB of ultra-fast GDDR5 memory - 12GB of memory per GPU, 2x more memory than Tesla K40 GPU, allows users to process 2x larger datasets.
- 480GB/s memory bandwidth - Increased data throughput allows data scientists to crunch through petabytes of information in half the time compared to the Tesla K10 accelerator. Optimized for energy exploration, video and image processing, and data analytics applications.
- 4,992 CUDA parallel processing cores - Accelerates applications by up to 10x compared to using a CPU alone.
- Dynamic NVIDIA GPU Boost Technology - Dynamically scales GPU clocks based on the characteristics of individual applications for maximum performance.
- Dynamic Parallelism - Enables GPU threads to dynamically spawn new threads, allowing users to quickly and easily crunch through adaptive and dynamic data structures (a minimal device-side launch sketch follows below).
Users can also try the Tesla K80 dual-GPU accelerator for free on remotely hosted clusters. Visit the GPU Test Drive website for more information.
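Since the Dynamic Parallelism bullet above is rather abstract, here is a minimal, hypothetical sketch of what it means in practice: a kernel launching another kernel from device code. It needs sm_35+ and -rdc=true (linking cudadevrt), and the kernel names are made up:

```cuda
#include <cstdio>

__global__ void childKernel(int cell)
{
    printf("refining cell %d on the device\n", cell);
}

__global__ void parentKernel()
{
    // The GPU thread decides on its own to spawn more work, e.g. refining only
    // the cells of an adaptive data structure that turn out to need it,
    // without a round trip to the CPU.
    if (threadIdx.x == 0)
        childKernel<<<1, 1>>>(blockIdx.x);
}

int main()
{
    parentKernel<<<4, 32>>>();
    cudaDeviceSynchronize();
    return 0;
}
```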
Interesting observation on Anandtech: this is the first GPU that is created for Tesla only. This means that the Tesla business is now large enough to warrant separate silicon? Remarkable.
Another aspect, or maybe a free interpretation: GK210 is a failsafe for 16/20nm not being ready for another round of >500mm² products. This could be an(other) indication that GM200 was/is planned for a 16/20nm-only release.
GF100 was also an HPC chip.
Wouldn't a GPU with 15 SMXs enabled and at a lower clock improve performance/W? I'm also considering the possibility that the GK210 chip physically has only 13 SMXs, although I'm not sure why they would do that.
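For what it's worth, a rough iso-throughput estimate under my own simplifying assumptions (DP throughput ∝ SMX count × clock, dynamic power roughly ∝ f·V²):

```cuda
#include <stdio.h>

int main(void) {
    double clk_13smx = 871.0;                    // MHz, implied by the slide-16 specs
    double clk_15smx = clk_13smx * 13.0 / 15.0;  // ~755 MHz for the same DP throughput
    // Whether that wins on perf/W depends on how much the voltage can drop at the
    // lower clock versus the extra leakage of two more active SMXs and the yield
    // cost of requiring all 15 SMXs functional on both chips.
    printf("15-SMX clock for equal DP throughput: %.0f MHz\n", clk_15smx);
    return 0;
}
```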
I think the fact that it is called GM200 and not GM210 is highly revealing of its nature...
Umm why? If it was called GM100 instead of GM200 then that might have been something. GM200 is just following the standard Nvidia naming convention.
Nope, GM200 was planned for 28nm since at least late last year. GK210 being a failsafe makes no sense, as GM204 beats it in everything except DP. My guess is that, as it was a very minimal change, the design costs and time were low enough that it was worth doing it.
But GK210 is also a x10 part, while there was no GK200.
I would have thought so as well... but I guess there is a floor and diminishing returns as you go lower. Given the already low 562 MHz clock, perhaps there wasn't much benefit in going lower. And of course, given the dual-GPU config and 300 W TDP, there could simply be a hard power limit which limited them to 13 SMXs.
There might be a clue in this presentation (link to parent page). Pages 11-25.
There was no GK100...
Are ~500mm² tapeouts so cheap?
If you count GK180 as part of this "bigger cache Kepler" project, we are talking about two tapeouts and two years of work.
Also, Mike Clark saw GK210 as a summer 2014 product, while GM200 is/was end of 2014/early 2015.
It's probably a failed time-to-market project. Maybe they had too few resources because of Tegra Kepler/Denver and Maxwell.
The other odd aspect of GK210 is its MIA brother GK180, which showed up at Zauba in early 2013 and has its own device ID in the CUDA DLL (so it was not just the GK110B).
Maybe there is some internal lobby at NV that still wants to push the super-scalar approach...
Good read, thanks.