NVIDIA Maxwell Speculation Thread

$999 for the Titan X.

It seems to me they're anticipating that Fiji will put up quite a fight.
 
Pascal news:

- 2x Perf/Watt of Maxwell (I think this is related to neural network learning performance though, not games)
- Will inherit the FP16 "mixed precision" from Tegra X1's GPU.
- Claims 4x better FP16 performance than Maxwell. Probably means 2x the FP32 ALUs, combined with the theoretical 2x FP16 speedup from the mixed-precision capability.
- Stacked memory for up to 1TB/s bandwidth
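A quick back-of-envelope check of how the claimed 4x FP16 figure would fall out of those two rumored factors (the ALU count and per-lane rates here are assumptions, not published Pascal specs):

```python
# Sanity-check the "4x FP16 vs Maxwell" claim from the two rumored factors.
# All figures are normalized assumptions; NVIDIA gave no ALU counts at GTC 2015.

maxwell_fp32_rate = 1.0                       # normalize Maxwell FP32 throughput to 1
maxwell_fp16_rate = maxwell_fp32_rate         # Maxwell evaluates FP16 at FP32 rate

pascal_fp32_rate = 2.0 * maxwell_fp32_rate    # rumored ~2x more FP32 ALUs
pascal_fp16_rate = 2.0 * pascal_fp32_rate     # 2xFP16 packed per 32-bit lane

speedup = pascal_fp16_rate / maxwell_fp16_rate
print(speedup)  # 4.0 -- matches the claimed 4x FP16 figure
```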
 
It's a bit like explaining day to day ups and downs in the stock market: the urge to find a deeper reason in everything.
But sometimes everything can simply be explained by "it's just marketing". There is no rule book that requires Titan to have DP. It's just a label to separate mainstream from ultra high-end. And similarly, there's no magic formula that translates the presence, or not, of DP into a sticker price.
A bunch of people got together in a conference room and decided on a price. It's really that simple.
 
Is double precision somehow broken in the Maxwell architecture? Why build a ~600mm² GPU monster with only ~200 GFLOPS of DP?
Could there have been some internal fight over the superscalar approach used in Fermi Gen2 and Kepler, which is why GK210 was produced?
 
Is double precision somehow broken in the Maxwell architecture? Why build a ~600mm² GPU monster with only ~200 GFLOPS of DP?
Could there have been some internal fight over the superscalar approach used in Fermi Gen2 and Kepler, which is why GK210 was produced?
No, there's nothing broken. From what I heard, the Maxwell with double precision got extra features added and was renamed Pascal. Evidently Nvidia wanted the best possible gaming and deep learning performance for Maxwell, and was content with letting the traditional HPC market wait a little longer.
 
Could the "cancellation" of 20nm have played a role in the chip's configuration, with its low DP:SP ratio?
 
- Claims 4x better FP16 performance than Maxwell. Probably means 2x the FP32 ALUs, combined with the theoretical 2x FP16 speedup from the mixed-precision capability.
Moving less data around should be the big win for FP16 and a reason why it resurfaces again.
 
Fun fact: The Titan X has about the same DP throughput as a Geforce GTX 580.

I think GK110 is "young enough" to be kept on the market a bit longer for DP, and being stuck at 28nm means they had to cut somewhere.
Now nVidia is trying to spin FP32 and FP16 as spectacular for neural networks, which is why they spent 80% of the keynote talking about neural networks.
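The GTX 580 comparison checks out roughly, using publicly quoted specs (the 1/8 GeForce DP rate on GTX 580 and the 1/32 rate on Titan X are the assumptions in this sketch):

```python
# Rough check: GTX 580 DP throughput vs Titan X DP throughput, in GFLOPS.
# Assumes GTX 580 at 1/8 DP rate (GeForce cap) and Titan X at 1/32.

gtx580_sp = 512 * 2 * 1.544   # 512 ALUs, 2 FLOPs/clk (FMA), 1544 MHz shader clock
gtx580_dp = gtx580_sp / 8     # GeForce Fermi capped at 1/8 rate
titanx_dp = 7000 / 32         # 7 TFLOPS SP at a 1/32 rate

print(round(gtx580_dp), round(titanx_dp))  # ~198 vs ~219: same ballpark
```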
 
Moving less data around should be the big win for FP16 and a reason why it resurfaces again.

You can pack your data in FP16 even right now. There's already hardware support for encoding/decoding to FP16.
What they're adding is hardware support for directly evaluating calculations in FP16.
The big reason that helps performance is that they can pack double the number of variables in the same number of registers (2xFP16 in each 32bit register).
And if you use fewer registers, you gain more latency-hiding capability, which translates to more throughput.
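The packing scheme itself is easy to illustrate in software: two IEEE 754 half floats fit bit-for-bit in one 32-bit word, which is exactly the "2xFP16 per 32-bit register" layout described above (this is only an illustration; on the GPU the packing happens in hardware):

```python
import struct

# Pack two FP16 values into one 32-bit word, mimicking the 2xFP16-per-register
# layout. 'e' is the IEEE 754 half-precision format code in the struct module.

def pack_half2(a: float, b: float) -> int:
    raw = struct.pack('<ee', a, b)       # two half floats = 4 bytes total
    return struct.unpack('<I', raw)[0]   # reinterpret as one uint32

def unpack_half2(word: int) -> tuple:
    raw = struct.pack('<I', word)
    return struct.unpack('<ee', raw)     # recover both FP16 values

w = pack_half2(1.5, -2.0)
print(hex(w), unpack_half2(w))  # one 32-bit word holds both values
```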
 
http://i.imgur.com/tdzmIb3.png

7 TFLOPS single / 0.2 double. So that'd be about a 1140MHz clock, probably the boost clock. GTC stream for anyone who wants it:

http://www.ustream.tv/channel/gpu-technology-conference-2015
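The ~1140MHz estimate falls out of the quoted 7 TFLOPS figure directly, assuming Titan X's 3072 FP32 ALUs and one FMA (2 FLOPs) per ALU per clock:

```python
# Derive the implied clock from the quoted 7 TFLOPS FP32 figure.
# Assumes 3072 FP32 ALUs and 2 FLOPs per ALU per clock (one FMA).

alus = 3072
flops_per_clock = 2
clock_hz = 7e12 / (alus * flops_per_clock)
print(clock_hz / 1e6)  # ~1139 MHz, hence the ~1140 MHz boost-clock estimate
```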

1/35 DP? (I suppose the GPU is actually 1/32, like GM204.) I know this card will be excellent for FP32 ray tracing, but even at $999, you could just buy two older GPUs with 4-5 TFLOPS FP32 each (780 Ti or whatever)...

Argh, they will need to disable at least 8 SMs on the cut-down "gaming" version to make people buy a GPU with 6GB more at this price (before EVGA and others release a "980 Ti" with 12GB instead of 6...)
 
Fun fact: The Titan X has about the same DP throughput as a Geforce GTX 580.

I think GK110 is "young enough" to be kept on the market a bit longer for DP, and being stuck at 28nm means they had to cut somewhere.
Now nVidia is trying to spin FP32 and FP16 as spectacular for neural networks, which is why they spent 80% of the keynote talking about neural networks.
It's not that they're "trying to spin" this. I work in deep learning, and nobody uses FP64. We've been doing FP16 experiments instead; so far they look promising.

Also, they're not pushing GK110 for DP. Rather, GK210, a really different chip (2x register file, 2x shared memory).
 
Yes they did :( I was wrong in thinking that Nvidia would not degrade the Titan name by doing so.

This (Titan X) should not be a Titan at all, but rather the GTX 980 Ti (which we know would have gimped DP).
Since when has Nvidia ever cared about brand nomenclature confusion?
 
It's not that they're "trying to spin" this. I work in deep learning, and nobody uses FP64. We've been doing FP16 experiments instead; so far they look promising.
I understand that, but GTC has always been their place to brag about their GPU compute and they used to dedicate a good part of their time to FP64 performance.
It's good that they found an area where they could brag about FP16 performance, otherwise it'd be really awkward to talk about a card with very low FP64 performance in there.


Also, they're not pushing GK110 for DP. Rather, GK210, a really different chip (2x register file, 2x shared memory).
I'd bet that GK210 and GK110 come from the very same wafer, the only difference being a bit of laser trimming here and there.
 
Could the "cancellation" of 20nm have played a role in the chip's configuration, with its low DP:SP ratio?
With Maxwell, NV got rid of the superscalar ALU structure. 1:4 or even 1:8 should be very cheap, at least if you didn't make mistakes in the design. 1:32 is just ridiculous.

I'd bet that GK210 and GK110 come from the very same wafer, the only difference being a bit of laser trimming here and there.
GK110 has been around since the end of 2012, so they needed two years just to enable SM_37 with bigger caches? All the data (CUDA, A1 stepping, time frame) says it's a new chip.
 