NVIDIA Maxwell Speculation Thread

Mayhaps. You don't need to buy a ludicrously expensive Xeon to get virtualization, though.

The thing about virtualization is that it's implemented in hardware that is non-negotiable as far as x86 CPU cores go, and it isn't a significant increase in hardware cost. It can be fused on or off independently of things like die size, IO, and other physical parameters.

There are low-end virtualization use cases that some buyers will pay money for, and it's also a prerequisite for many workloads for which buyers will pay massive amounts of money.
As such, there's some income from a higher-volume segment as well as income from very lucrative ones.

For a DP-capable GPU, there isn't a good dividing line. A high-throughput DP device will need high bandwidth, but so does a high-performance gaming GPU.
Their die sizes are going to be large no matter what, a GPU can hit its TDP with either SP or DP, and various extras don't significantly change the fact that there are going to be two large dies, with the engineering and manufacturing costs that go into each distinct ASIC.

A more economically established high-end niche might change that, as Nvidia's high-end Tesla chips seem to indicate, but AMD's not holding the high ground there.
Even then, the compute market is very focused on cost/performance, which is something more measurable than cost/virtualization (the latter tends to be more binary). On top of that, GPU compute is itself undercut by CPU products that frequently get better utilization in various workloads and which still have a vastly superior software situation.

Even when CPUs lose, they can fall back to very lucrative markets where they still win.
A DP-specific GPU that loses has nowhere to go if the hardware isn't also similarly at the top of its class in SP. But if that's the case, you can't charge more for the DP hardware if it's always enabled, which negates having it at all unless you jack up the prices on SP hardware.
 
Eh, why the fuck exactly should I pay $3000+ for a FirePro card? It's the same god damn silicon as in the regular Radeons. Same with Nvidia's overpriced "higher end" junk, by the way.
No GPU would have high DP rates at this point if there wasn't a market to pay for it. 1/4 or 1/2 rate DP adds significant cost to a GPU.
 
It will probably happen after tessellation, but it's possible to perform some HSR prior to tessellation. If there's displacement mapping, you probably need hints from software, though. Or just let software do this level of HSR for you.

IMG doesn't do HSR before tessellation.
 
Yes, I know. However, virtualization in a CPU, for example, is not comparable to a GPU's DP performance. But yeah, deliberately gimping hardware just to charge more for the ungimped version is crap, no matter who's doing it (and Intel has even experimented with paid CPU "ungimp DLC", making them the absolute worst of the worst by quite a degree, really).

As a side note, I know of one (modern) instance where CPU floating-point throughput was 'gimped' to provide market stratification for a single die:
AMD's 'Caspian' mobile CPU.
http://techreport.com/news/17567/amd-intros-new-notebook-platform-with-45nm-cpus
It can and has been done.
 
IMG doesn't do HSR before tessellation.
I never said they do, though I'm not sure how you would know, considering IMG hasn't discussed their tessellation implementation. I was speaking of a hypothetical tiling architecture and what's possible. This has gotten off topic, though, so let's leave this tangent.
 
I never said they do, though I'm not sure how you would know, considering IMG hasn't discussed their tessellation implementation. I was speaking of a hypothetical tiling architecture and what's possible. This has gotten off topic, though, so let's leave this tangent.
Just wanted to say that HSR before tessellation is very much doable (and an already-used technique in some rendering engines). However, it would be pretty hard for the GPU, since it needs to know the maximum distance the vertices can move. To allow GPUs to do this, a feature similar to DX11's "conservative depth output" could be introduced to give the GPU the guarantees it needs (to use the hi-z / tiling buffer to cull patches).
 
Just wanted to say that HSR before tessellation is very much doable (and an already-used technique in some rendering engines). However, it would be pretty hard for the GPU, since it needs to know the maximum distance the vertices can move. To allow GPUs to do this, a feature similar to DX11's "conservative depth output" could be introduced to give the GPU the guarantees it needs (to use the hi-z / tiling buffer to cull patches).
http://www.khronos.org/registry/gles/extensions/EXT/EXT_primitive_bounding_box.txt
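
To make that concrete, here's a minimal CPU-side sketch of the idea, assuming a hi-z buffer that stores the farthest depth per tile. Every name and structure here is mine for illustration, not from the extension or any driver: expand the patch's pre-displacement bounds by the guaranteed maximum displacement, then reject the patch if its nearest possible depth is still behind everything already drawn.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Hypothetical pre-tessellation cull, modelled on the bounding-box idea
// above: the application guarantees displaced vertices stay inside an
// expanded box, so whole patches can be rejected against a coarse hi-z
// buffer before any tessellation work happens.

struct AABB { float mn[3], mx[3]; };

struct HiZ {                         // one coarse hi-z mip level
    int w, h;
    std::vector<float> farDepth;     // farthest depth per tile, 0..1
    float tile(int x, int y) const { return farDepth[y * w + x]; }
};

// Grow the control-cage bounds by the shader's displacement limit -- the
// "maximum distance the vertices can move" guarantee discussed above.
AABB expand(AABB b, float maxDisp) {
    for (int i = 0; i < 3; ++i) { b.mn[i] -= maxDisp; b.mx[i] += maxDisp; }
    return b;
}

// True if the patch can be skipped: even its nearest possible depth lies
// behind the farthest depth already stored in every hi-z tile it touches.
bool patchOccluded(int tx0, int ty0, int tx1, int ty1,   // covered tile rect
                   float nearestDepth,                    // of expanded box
                   const HiZ& hiz) {
    tx0 = std::max(tx0, 0);          ty0 = std::max(ty0, 0);
    tx1 = std::min(tx1, hiz.w - 1);  ty1 = std::min(ty1, hiz.h - 1);
    for (int y = ty0; y <= ty1; ++y)
        for (int x = tx0; x <= tx1; ++x)
            if (nearestDepth <= hiz.tile(x, y))
                return false;        // could be visible here: must tessellate
    return true;                     // behind everything: cull the patch
}

int main() {
    HiZ hiz{4, 4, std::vector<float>(16, 0.5f)};   // scene already at z = 0.5
    AABB box = expand({{0, 0, 0.7f}, {1, 1, 0.9f}}, 0.05f);
    // The expanded box can reach no nearer than z = 0.65, still behind 0.5.
    std::printf("cull patch: %d\n", patchOccluded(0, 0, 3, 3, box.mn[2], hiz));
}
```

The displacement bound is the whole trick: without that guarantee from the application, the conservative box can't be trusted and the patch must always be tessellated.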
 
I never said they do, though I'm not sure how you would know, considering IMG hasn't discussed their tessellation implementation. I was speaking of a hypothetical tiling architecture and what's possible. This has gotten off topic, though, so let's leave this tangent.

They have filed patents for it though.
 
So 16 CUs total with 3 disabled on the 970? Any guesses on die size? I figure <= 350 mm².
You missed the GM204 PCB leak? http://www.techpowerup.com/202714/is-this-the-first-picture-of-geforce-gtx-880.html
That points to more like ~400 mm².

Isn't that too big a difference between the 970 and 980? What if that card is actually the 960 (or the 980 has 15 CUs)?
The driver they ran these cards with identifies them as "GTX 970", and there are also shop listings for the 970 and 980.
GM206 will probably come late 2014 / early 2015: the 35×35 mm package chips first shipped at Zauba in August, so probably three months to go.

Here is an N16E-GT (GTX 970M?) which uses only 10 CUs/SMMs: http://compubench.com/device.jsp?benchmark=compu20&os=Windows&api=cl&D=NVIDIA+N16E-GT&testgroup=info
Maybe there are some other yield strategies today; see Tonga in the R9 285...
 
Isn't that too big a difference between the 970 and 980? What if that card is actually the 960 (or the 980 has 15 CUs)?


Depends on the clocks, I guess. At similar clocks it's about a 20% deficit. The 7950 at launch was ~25% behind the 7970; the 670 was ~20% behind the 680. So even at slightly lower clocks, a 13-CU 970 falls in that range.
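
For what it's worth, the arithmetic behind that 20% figure, assuming throughput scales linearly with enabled CU count and clock (an assumption; real scaling is usually a bit sublinear):

```cpp
#include <cstdio>

// Sanity check on the quoted gap, assuming throughput scales linearly
// with enabled CU count and clock (it only roughly does in practice).
int main() {
    const double cus_full = 16.0, cus_cut = 13.0;     // 980 vs. a 13-CU 970
    const double clock_ratio = 1.0;                   // "at similar clocks"
    double rel = (cus_cut / cus_full) * clock_ratio;  // 13/16 = 0.8125
    std::printf("~%.0f%% deficit\n", (1.0 - rel) * 100.0);  // prints ~19%
}
```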
 
You missed the GM204 PCB leak? http://www.techpowerup.com/202714/is-this-the-first-picture-of-geforce-gtx-880.html
That points to more like ~400 mm².


The driver they ran these cards with identifies them as "GTX 970", and there are also shop listings for the 970 and 980.
GM206 will probably come late 2014 / early 2015: the 35×35 mm package chips first shipped at Zauba in August, so probably three months to go.

Here is an N16E-GT (GTX 970M?) which uses only 10 CUs/SMMs: http://compubench.com/device.jsp?benchmark=compu20&os=Windows&api=cl&D=NVIDIA+N16E-GT&testgroup=info
Maybe there are some other yield strategies today; see Tonga in the R9 285...

Depends on the clocks, I guess. At similar clocks it's about a 20% deficit. The 7950 at launch was ~25% behind the 7970; the 670 was ~20% behind the 680. So even at slightly lower clocks, a 13-CU 970 falls in that range.

Interesting, thank you
 