- The incremental cost of double precision and rounding is relatively small, and shrinking. Clearly the multipliers and adders are a solved problem, with additional cost close to zero. The rounding is fairly complicated but not excessive.
- Register files need to double in size, but that's also a fairly small area hit.
Well, this is something of a mixed bag. The relative sizes are small and all, but if you're talking about a GPU, you basically have to multiply that out by 50, since that same increase is repeated 50 times over. And unlike a CPU, the vast majority of that die space is functional logic, which isn't loaded down with all sorts of self-scheduling, forwarding, OOOE, predictors, prefetchers, and god-knows-what-else. On a CPU, you're basically making an x% increase on <50% of the die area; on a GPU, you're making that same x% increase on what is effectively 80+% of the die.
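To put rough numbers on that, here's a back-of-the-envelope sketch. The 50% and 80% shares are the ballpark figures from above; the 10% per-unit growth is purely a made-up value for x, and `die_growth` is just a helper name for illustration.

```python
def die_growth(affected_share, unit_growth):
    """Relative growth of the whole die when only `affected_share` of it
    grows by `unit_growth` (both given as fractions, e.g. 0.10 for 10%)."""
    return affected_share * unit_growth

X = 0.10  # hypothetical per-unit area increase for double precision

cpu = die_growth(0.50, X)  # CPU: the increase lands on at most ~half the die
gpu = die_growth(0.80, X)  # GPU: the increase lands on 80+% of the die

print(f"CPU die grows by about {cpu:.1%}")
print(f"GPU die grows by about {gpu:.1%}")
print(f"GPU pays {gpu / cpu:.1f}x the relative cost")
```

Whatever x actually turns out to be, the ratio between the two stays at 0.8 / 0.5 = 1.6, and that's with the CPU share taken at its upper bound.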
If double precision never comes to the GPU, it will be because the grand vision of the GPU as a massive parallel general purpose calculation engine didn't materialize.
Either that, or nVidia, in their infinitesimal wisdom, somehow convinces everybody that half-precision floats are good enough for everything that can ever be conceived by the human mind. That and they'll also prove that pi is exactly 3.
If they do that, it will also generate a massive market for AIPUs and advancement in artificial intelligence in general -- after all, if people lack the natural kind, you need an artificial source.
But do ray-tracing, game physics, game graphics or AI need DP? As far as I can tell a lot of the emphasis here is on the data-parallel part (ultra high bandwidth, SIMD/vector) rather than precision, per se.
Mmmm... yes and no. Raytracing, I can see the need at some point as scale, complexity, and granularity grow... at that point the main value of precision is making sure that working with large numbers and small numbers at the same time doesn't turn the small ones into nothingness. Of course, there are also points in there where you expressly want less precision -- e.g. early rejection of samples because they won't make a meaningful difference to the final visual result -- and there are cute ways to cheat away comparisons and branches if you round down to nothing. Physics, I can definitely see it as you start getting into more complicated primitives and higher-frequency collision geometry -- there are reasons why a lot of the primitives you *think* are supported in most physics engines (e.g. cylinders) are absolutely not. AI... not much of a problem. Graphics, as in the illumination models and color arithmetic and such... pretty much not an issue.
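The "large and small numbers at the same time" problem is easy to show. This is a minimal sketch in plain Python, which only has doubles, so it fakes single precision by round-tripping through `struct`. `to_float32` is a helper name made up here, and the 1e8 coordinate and 1.0 step are arbitrary illustrative values, not numbers from any real renderer.

```python
import struct

def to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

big = 1.0e8    # e.g. a coordinate far from the origin (illustrative value)
small = 1.0    # e.g. a small step along a ray (illustrative value)

# Do the addition as a single-precision unit would: inputs and result
# are each rounded to float32.
single = to_float32(to_float32(big) + to_float32(small))
double = big + small

print(single - big)  # what survives of the step in single precision
print(double - big)  # what survives of the step in double precision
```

In single precision the step is absorbed completely, since floats near 1e8 are spaced 8 apart, so the first line prints 0.0. Double precision keeps it, and the second line prints 1.0.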
Though those problems are more or less down-the-road issues -- bandwidth is of course the bigger problem here and now... and I expect it to still be a problem when the name "Xbox 360" refers to the 360th-generation Xbox.