NVIDIA Maxwell Speculation Thread

lanek · Sep 13, 2014

dbz said:
Strangely enough, the Philippines were the first "source" of the GTX 670 pricing....at the same 20K Pesos ($470). Actual pricing: $400.

Wouldn't surprise me if Nvidia encouraged the circulation of higher prices prior to launch as they did with the GTX 680 also. Adds a little more to the PR war chest for no actual effort.

There's always the pre-order milking, plus shops that don't expect to get big quantities trying to bump the price a little (+$50 for the pre-order, +$50 for the "new, hype" factor, etc.). They are not dumb; they know some people will pay whatever price they set to get it as fast as possible.

But I don't know why, I don't expect a price under $500 (I hope for one, but I don't think so).

The GTX 680 launched at $500; the 780 was a bit of an exception at $650 (but was quickly pushed down).

Wynix · Sep 13, 2014

Kaotik said:
The first German shops have listed ASUS Strix GTX 970s, too.
Priced 572,99€ at ImaxAgency http://imaxagency.de/shop/details.php?art=155042&artname=//ASUS+STRIX-GTX970-DC2OC-4GD5&lang=de
and 535,50€ at Compunator http://www.compunator.de/shop/details.php?art=155042&artname=//ASUS+STRIX-GTX970-DC2OC-4GD5

I can already see them comparing these prices with the GTX 780 launch price and not its current price.

revan · Sep 14, 2014

GTX 980 & 970 specs from TechPowerUp's GPU database (take them with a grain of salt):
http://www.techpowerup.com/gpudb/2621/geforce-gtx-980.html
http://www.techpowerup.com/gpudb/2620/geforce-gtx-970.html
http://www.techpowerup.com/gpudb/b3051/palit-gtx-970-jetstream.html

Firestrike scores for a (supposed) GTX 980 at 1228MHz core / 1750MHz mem on the left [I presume 3DMark is showing the boost clock for this (unrecognized) card; it seems too high to be a base clock], and overclocked to 1400MHz(!) core (boost clock again, I think) with a 5960X on the right:
http://www.3dmark.com/compare/fs/2740221/fs/2741313 (could be fake, please remember...)

For comparison purposes I sent my 780 Ti into the arena (with a stock 4790K):
Stock GTX 980 (1228MHz boost / 1750MHz mem) on the left versus a similarly clocked 780 Ti (1228MHz core / 1750MHz mem). The 1098MHz (base) clock shown by 3DMark corresponds to a 1228MHz (boost) clock for the 780 Ti.
http://www.3dmark.com/compare/fs/2740221/fs/2751764 ... kind of useful for a clock-for-clock comparison
Overclocked GTX 980 (1406MHz core / 1803MHz mem) versus overclocked 780 Ti (1295MHz / 1750MHz mem). 1152MHz is the base clock; the "real" boost clock is 1295MHz.
http://www.3dmark.com/compare/fs/2741313/fs/2750884 ... a useful comparison from an overclocking-potential perspective (it seems Maxwell could go about 100MHz higher than its Kepler counterpart, 1400MHz versus 1300MHz)

I'll let you draw your own conclusions from this ...

PS: please note that the 5960X is more powerful than the 4790K, so the overall result could be misleading (much better score in the physics test for the 5960X); I suggest focusing on the graphics score ...
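
For anyone who wants to put numbers on the clock-for-clock idea, here is a minimal Python sketch. The function names are hypothetical, and the graphics scores are not reproduced in the post, so they are left as parameters to be read off the 3DMark links. It also assumes the score scales roughly linearly with core clock, which is a simplification.

```python
# Rough helpers for comparing two cards at different clocks.
# Assumption: graphics score scales about linearly with core clock.

def clock_advantage(clock_a_mhz, clock_b_mhz):
    """Fractional clock advantage of card A over card B (0.10 means +10%)."""
    return clock_a_mhz / clock_b_mhz - 1.0

def per_clock_ratio(score_a, clock_a_mhz, score_b, clock_b_mhz):
    """Graphics score per MHz of card A relative to card B (1.0 means equal)."""
    return (score_a / clock_a_mhz) / (score_b / clock_b_mhz)

# Overclocked pair from the post: GTX 980 at 1406MHz vs 780 Ti at 1295MHz (boost).
print(f"clock advantage of the 980: {clock_advantage(1406, 1295):.1%}")

# per_clock_ratio() needs the graphics scores from the compare pages, e.g.
# per_clock_ratio(score_980, 1406, score_780ti, 1295)
```

A ratio above 1.0 from per_clock_ratio would mean the first card does more per MHz; use the graphics score only, for the CPU reason given in the PS.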

colinisation · Sep 15, 2014

Videocardz has some pics. Can't see the die itself, but it looks like it will be a fair bit smaller than the 780 series.
http://videocardz.com/52321/nvidia-geforce-gtx-980-pictured

dnavas · Sep 15, 2014

colinisation said:
Videocardz has some pics. Can't see the die itself, but it looks like it will be a fair bit smaller than the 780 series.
http://videocardz.com/52321/nvidia-geforce-gtx-980-pictured

Three DisplayPorts?!
Looks like I'm going to have to go shopping for a pair of 4k monitors....

Kaotik · Sep 15, 2014

dnavas said:
Three DisplayPorts?!
Looks like I'm going to have to go shopping for a pair of 4k monitors....

Of course it's still unofficial but...
http://www.techpowerup.com/gpudb/2621/geforce-gtx-980.html

The GeForce GTX 980 will be a graphics card by NVIDIA. Built on the 28 nm process, and based on the GM204 graphics processor, in its GM204-300-A1 variant, the card supports DirectX 12.0. It features 1920 shading units, 120 texture mapping units and 32 ROPs. NVIDIA has placed 4,096 MB GDDR5 memory on the card, which are connected using a 256-bit memory interface. The GPU is operating at a frequency of 1050 MHz, which can be boosted up to 1178 MHz, memory is running at 1753 MHz.
We recommend the NVIDIA GeForce GTX 980 for gaming with highest details at resolutions up to, and including, 1920x1080.

So maybe the 4K monitor isn't the best idea

LordEC911 · Sep 15, 2014

Ailuros said:
If the die is truly in the 370-380mm2 region under 28nm, I'd need a VERY convincing reason why a shrink to 16FF asap would make any sense in terms of cost.

If the picture below is real, it is around R600 size.
With a 256-bit bus, it would seem like a pretty good candidate to shrink to < 300mm2 on 16FinFET with minimal redesign.

colinisation said:
Videocardz has some pics. Can't see the die itself, but it looks like it will be a fair bit smaller than the 780 series.
http://videocardz.com/52321/nvidia-geforce-gtx-980-pictured

I measured the first part of the PCIe connector at 1.1cm on my old 4850, and it spans 32 pixels in the GTX 980 picture.

1pixel = ~.343mm
I got 59pixels by 60pixels for the GTX980 die.
20.2mm x 20.6mm = ~416mm2 (±~5% moe)

Edit- Using a similar method I got 23.3mm x 24mm = 559mm2 on the GK110 picture.
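
The estimate above can be reproduced with a short Python sketch. The function name is hypothetical; the inputs (11mm reference, 32 pixels, 59 x 60 pixel die) are the figures from the post. Using the unrounded scale, the result lands within a few mm2 of the ~416mm2 quoted, and a one-pixel error per edge stays inside the stated ~5% margin.

```python
# Rough die-size estimate from a photo, using a feature of known physical size.
# Inputs are the figures quoted in the post; the function name is hypothetical.

def estimate_die_area_mm2(ref_mm, ref_px, die_w_px, die_h_px):
    """Return (width_mm, height_mm, area_mm2) for a die measured in pixels."""
    mm_per_px = ref_mm / ref_px            # scale taken from the reference feature
    width_mm = die_w_px * mm_per_px
    height_mm = die_h_px * mm_per_px
    return width_mm, height_mm, width_mm * height_mm

w, h, area = estimate_die_area_mm2(ref_mm=11.0, ref_px=32, die_w_px=59, die_h_px=60)
print(f"{w:.1f}mm x {h:.1f}mm = ~{area:.0f}mm2")

# Sensitivity: shift each edge by one pixel in either direction.
_, _, low = estimate_die_area_mm2(11.0, 32, 58, 59)
_, _, high = estimate_die_area_mm2(11.0, 32, 60, 61)
print(f"one pixel per edge either way: {low:.0f} to {high:.0f}mm2")
```

The same function works for the GK110 picture once its reference feature and die edges are measured in pixels.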

dnavas · Sep 15, 2014

Kaotik said:
So maybe the 4K monitor isn't the best idea

4k video NLE. I think the card would work for 4k games just fine, but as I don't game, I really don't know. Anyway, dual 4k is almost certainly nutty. :>

These sites disagree about outputs, which is why I'm wondering (maybe it's one 1.3 DP output?).

Ailuros · Sep 15, 2014

LordEC911 said:
If the picture below is real, it is around R600 size.
With a 256-bit bus, it would seem like a pretty good candidate to shrink to < 300mm2 on 16FinFET with minimal redesign.

I measured the first part of the PCIe connector at 1.1cm on my old 4850, and it spans 32 pixels in the GTX 980 picture.

1pixel = ~.343mm
I got 59pixels by 60pixels for the GTX980 die.
20.2mm x 20.6mm = ~416mm2 (±~5% moe)

Edit- Using a similar method I got 23.3mm x 24mm = 559mm2 on the GK110 picture.

Then let's leave it until final announcement, since it's nowhere near as big.

Whether it's from a hypothetical 420 down to less than 300 or from 3x0 down to 2x0mm2, it doesn't change one bit that 16FF is a good deal more expensive to manufacture on than 28nm. If a direct shrink to 16FF gains you relatively little in die area, while each square millimeter on 16FF costs significantly more than that gain is worth, then I'll leave it to you to consider where the hypothetical gain really is.
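
As a back-of-the-envelope illustration of that argument, here is a Python sketch. It deliberately ignores yield, wafer-edge losses and design/mask costs, all of which matter in practice. The function names are hypothetical, and the 1.5x cost-per-mm2 figure is a placeholder assumption, not a known 16FF price.

```python
# Break-even check for a die shrink: silicon cost only, no yield or NRE.

def breakeven_cost_ratio(area_old_mm2, area_new_mm2):
    """Highest new-node/old-node cost per mm2 at which the shrunk die is not dearer."""
    return area_old_mm2 / area_new_mm2

def relative_die_cost(area_old_mm2, area_new_mm2, cost_per_mm2_ratio):
    """Cost of the shrunk die relative to the original die (1.0 means equal)."""
    return (area_new_mm2 / area_old_mm2) * cost_per_mm2_ratio

# Hypothetical 420mm2 -> 300mm2 shrink, as in the post.
print(f"break-even cost ratio: {breakeven_cost_ratio(420, 300):.2f}x per mm2")

# Placeholder assumption: 16FF costs 1.5x as much per mm2 as 28nm.
print(f"relative die cost at 1.5x: {relative_die_cost(420, 300, 1.5):.2f}")
```

Any cost-per-mm2 ratio above the break-even value means the shrunk die costs more than the original, which is the point being made.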

I could think of more clusters for a GM204 refresh under 16FF, for instance, but then again that would make GM200 rather redundant as a desktop solution. Just because Charlie heard that they're going for 16FF shrinks of Maxwell doesn't necessarily mean that it's the case.

sheepdogexpress · Sep 15, 2014

I estimate GM204 is about 380mm2, assuming the mounting holes are the same.

Considering that the 256-bit bus, die size and power consumption of GM204 are quite similar to Tonga's, and that both products are designed for the midrange (a step below the halo chips), how far ahead is Nvidia from an architectural standpoint?

It's safe to say at this point, looking at the price drops and the leaks, that GM204 performs at around the level of the GTX 780 Ti, while Tonga, I assume, performs at around the level of a 7970 GHz Edition when fully enabled.

In my opinion, considering that a node shrink brings about 40-50 percent power savings per transistor, this seems like quite a difference. Nvidia's architectural efficiency gain is almost as large as a node shrink, which is pretty huge.

LiXiangyang · Sep 15, 2014

I think people give too much credit to Maxwell's power-efficient IC design.

There is a key difference between GK110 and GM204: GM204's DP units are basically non-existent. Compared to a GK110 die, they can save a lot of die size and energy consumption simply by cutting the DP units (since Nvidia uses a separate DP/SP unit design to better segment the market while saving R&D and manufacturing cost).

Based on some CUDA tools' reports, it is highly likely that GM200/210 will have a DP:SP ratio of 1:2 (improved from GK110's 1:3), so GM200/210's "performance per watt" is likely to shrink significantly given the same production process.

However, 16nm is a huge jump compared to 28nm: 3X higher transistor density, or 1/3 the die size for the same transistor count. So if a GM200 or a third generation of Maxwell were on a 16nm process, it would still be quite impressive, though by then it could be referred to as Pascal's first generation. Anyway, thanks to the delays in developing the new silicon process and Apple's new ugly mobile phone, we get a poorly performing and possibly short-lived generation of GPUs.

boxleitnerb · Sep 15, 2014

That makes no sense. DP units are power gated under normal gaming work loads afaik. They don't affect perf/W at all.

silent_guy · Sep 15, 2014

LiXiangyang said:
There is a key difference between GK110 and GM204: GM204's DP units are basically non-existent. Compared to a GK110 die, they can save a lot of die size and energy consumption simply by cutting the DP units.

Simple: compare it to gk104 instead of gk110.

3dcgi · Sep 16, 2014

boxleitnerb said:
That makes no sense. DP units are power gated under normal gaming work loads afaik. They don't affect perf/W at all.

While they might be power gated I wouldn't assume that.

Ailuros · Sep 16, 2014

LiXiangyang said:
I think people give too much credit to Maxwell's power-efficient IC design.

There is a key difference between GK110 and GM204: GM204's DP units are basically non-existent. Compared to a GK110 die, they can save a lot of die size and energy consumption simply by cutting the DP units (since Nvidia uses a separate DP/SP unit design to better segment the market while saving R&D and manufacturing cost).

I'm having a somewhat hard time completely understanding the above, but if you mean that GM204 doesn't have dedicated FP64 SPs, that would be news to me.

And what exactly do you mean by "a lot of die size"? Last time I checked, synthesis for an FP64 unit at 1GHz under 28nm was at 0.025mm2. Can it be that any additional logic for its implementation is "huge" on top of those 0.025?

Assuming GM204 has 16 FP64 SPs per SMM, as SiSoft Sandra seemed to "read", that would mean 256 SPs for the entire chip, or 6.4mm2 for the synthesis of those FP64 units. I'll be generous and say it's 15mm2 all together; is that really a LOT of die area? That's less than 4% of the entire die estate of GM204.
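
The arithmetic above, spelled out as a small Python sketch. The function names are hypothetical. The 16 SMM count is implied by the 256 total rather than stated, and the two die areas are the unofficial estimates floated earlier in the thread (about 380mm2 and about 416mm2), not confirmed figures.

```python
# FP64 area estimate using the figures quoted in the post.

FP64_UNIT_AREA_MM2 = 0.025   # synthesis figure for one FP64 unit (28nm, 1GHz)
FP64_PER_SMM = 16            # as SiSoft Sandra seemed to report
SMM_COUNT = 16               # assumption: implied by the 256 total in the post
GENEROUS_TOTAL_MM2 = 15.0    # the post's padded figure including extra logic

def fp64_area_mm2(units_per_smm, smm_count, unit_area_mm2):
    """Total synthesized area of all FP64 units on the chip."""
    return units_per_smm * smm_count * unit_area_mm2

def area_share(block_mm2, die_mm2):
    """Fraction of the die taken by a block."""
    return block_mm2 / die_mm2

raw = fp64_area_mm2(FP64_PER_SMM, SMM_COUNT, FP64_UNIT_AREA_MM2)
print(f"{FP64_PER_SMM * SMM_COUNT} FP64 units -> {raw:.1f}mm2")

# Die sizes below are the thread's own unofficial estimates.
for die_mm2 in (380.0, 416.0):
    share = area_share(GENEROUS_TOTAL_MM2, die_mm2)
    print(f"15mm2 on a {die_mm2:.0f}mm2 die: {share:.1%}")
```

With either die estimate the padded 15mm2 stays under the 4% mentioned above.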

Based on some CUDA tools' reports, it is highly likely that GM200/210 will have a DP:SP ratio of 1:2 (improved from GK110's 1:3), so GM200/210's "performance per watt" is likely to shrink significantly given the same production process.

If it didn't have dedicated DP units, obviously; in any other case, why exactly?

However, 16nm is a huge jump compared to 28nm: 3X higher transistor density, or 1/3 the die size for the same transistor count. So if a GM200 or a third generation of Maxwell were on a 16nm process, it would still be quite impressive, though by then it could be referred to as Pascal's first generation. Anyway, thanks to the delays in developing the new silicon process and Apple's new ugly mobile phone, we get a poorly performing and possibly short-lived generation of GPUs.

trinibwoy · Sep 16, 2014

Ailuros said:
Assuming GM204 has 16 FP64 SPs per SMM as SiSoft Sandra seemed to "read"...

If that's the case then either GM204 is not a compute capability 5.0 part or nVidia's documentation is wrong. They claim just one FP64 ALU per SMM for 5.0.

xDxD · Sep 16, 2014

videocardz.com/52362/only-at-vc-nvidia-geforce-gtx-980-final-specifications

McHuj · Sep 16, 2014

Only 2MB of cache? I thought they would have needed a bigger one given the bandwidth.

I hope the price isn't true because that's a real bummer.

tviceman · Sep 16, 2014

64 ROPs? I was arguing on here two, three, four weeks ago there was no way nvidia would go with 64 ROPs on a 256 bit bus. But now, this close to release, it seems legit.

xDxD · Sep 16, 2014

tviceman said:
64 ROPs? I was arguing on here two, three, four weeks ago there was no way nvidia would go with 64 ROPs on a 256 bit bus. But now, this close to release, it seems legit.

Perhaps because of the cache organization?
