Where are the GT200 mainstream GPUs?

Arty

It has been 10+ months since the GT200 launch and the mid-size GPUs in the architecture are still missing in action. I don't recall such a big delay with the 6xxx, 7xxx, 8xxx or 9xxx generations.

  • Bad yields
  • Improper planning (relying on TSMC 40nm over 55nm)
  • Huge inventory of old GPUs (*cough*rebrand*cough*)

What do you think could be the reasons?

FYI, I don't buy the theory "G92 is enough" because the R&D cost for these GPUs is already sunk. A tape-out is also a one-time cost, unlike the recurring expenses associated with a larger die, a wider bus, or a PCB with more layers.
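To make the one-time-versus-recurring cost point concrete, here is a minimal break-even sketch in Python; the NRE and per-board figures are purely illustrative assumptions, not actual Nvidia or TSMC numbers.

```python
# Break-even sketch: one-time tape-out (NRE) vs. recurring per-unit savings.
# Both figures below are illustrative placeholders, not real industry numbers.

def breakeven_units(nre_cost_usd: float, per_unit_saving_usd: float) -> float:
    """Units needed before a one-time tape-out pays for itself via per-unit savings."""
    return nre_cost_usd / per_unit_saving_usd

# Assume ~$5M for a 55nm midrange GT2xx tape-out, and ~$10 saved per board from
# a smaller die, narrower bus and fewer PCB layers versus a G92b-based card.
print(breakeven_units(5_000_000, 10))  # -> 500000.0 boards to break even
```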
 
What specifications, die size, transistor count, etc. would you expect from a mid-range GT200? Can you highlight the differences between that and a 9800 GTX+ / GTS 250?
 
FYI, I don't buy the theory "G92 is enough"

Then you would have to believe that a 55nm GT2xx variant would have worthy advantages over G92b. GT2xx improved both shading and texturing efficiency, so they could probably match G92b with somewhat fewer units, but how do you guesstimate the potential die area and cost savings there?

I still say Nvidia's 40nm efforts will determine whether this renaming gambit is recognized as brilliant or moronic. If their 40nm products look like something they spent a lot of effort on in lieu of wasting time on 55nm GT2xx, then they'll be redeemed and everybody will forget all about the renaming. If not, then the Nvidia clown show will just go on as usual.
 
What specifications, die size, transistor count, etc. would you expect from a mid-range GT200? Can you highlight the differences between that and a 9800 GTX+ / GTS 250?
Die size would be slightly smaller (better density), as it would be Nvidia's 5th or 6th GPU on 55nm. I can't ballpark the transistor count, but there has been a lot of talk about increasing the number of CUDA processors per TPC. And a smaller bus width (192-bit?) and a PCB with fewer layers.
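To put a rough number on that, here is a back-of-the-envelope sketch for a "half GT200b" part; the ~470 mm2 GT200b area is the commonly quoted approximation, and the 20% uncore share plus the bus scaling are my own assumptions, so the output is only a ballpark.

```python
# Back-of-the-envelope die-area guesstimate for a "half GT200b" midrange part.
# The 470 mm^2 GT200b area is the commonly quoted approximation; the 20% uncore
# share and the 192-bit (3/8 of 512-bit) memory-interface scaling are assumptions.

GT200B_AREA_MM2 = 470.0   # approximate 55nm GT200b die size
UNCORE_FRACTION = 0.20    # assumed share for memory I/O, display, video, misc.

def midrange_area(unit_fraction: float, bus_fraction: float) -> float:
    """Scale the unit arrays by unit_fraction and the I/O/uncore by bus_fraction."""
    core = GT200B_AREA_MM2 * (1 - UNCORE_FRACTION) * unit_fraction
    io = GT200B_AREA_MM2 * UNCORE_FRACTION * bus_fraction
    return core + io

# Half the TPCs/ROPs, 192-bit bus instead of 512-bit:
print(f"~{midrange_area(0.5, 192 / 512):.0f} mm^2")  # ~223 mm^2, i.e. roughly G92b territory
```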

Then you would have to believe that a 55nm GT2xx variant would have worthy advantages over G92b. GT2xx improved both shading and texturing efficiency, so they could probably match G92b with somewhat fewer units, but how do you guesstimate the potential die area and cost savings there?
Yes, see above.

I still say Nvidia's 40nm efforts will determine whether this renaming gambit is recognized as brilliant or moronic. If their 40nm products look like something they spent a lot of effort on in lieu of wasting time on 55nm GT2xx, then they'll be redeemed and everybody will forget all about the renaming. If not, then the Nvidia clown show will just go on as usual.
I think a lot of it comes down to relying on TSMC 40nm. I don't think the renaming shindig will go on. See the most recent DailyTech article on the GTS 240.
 
Well, in theory a GTX 2xx variant would carry such things as double precision, better CUDA support, any efficiency gains in the 2xx family, etc.

As it is, they risk losing more market share in the high-margin professional market, where FireGL has been making steady inroads.

I guess by this reasoning there's really no reason ever to make a midrange/budget part, as you can just keep renaming former enthusiast parts down into the midrange/budget areas...

Hmmm, one has to wonder why this isn't a widespread practice, and why it's generally only done when companies have no other choice.

Regards,
SB
 
I guess by this reasoning there's really no reason ever to make a midrange/budget part, as you can just keep renaming former enthusiast parts down into the midrange/budget areas...
You can't rename former enthusiast parts down into midrange/budget too much. They use too much power, which is bad for OEMs and bad for costs (power circuitry, cooling solution), the PCB is too complicated (a 256-bit bus, for instance), and they won't support any new features (not necessarily just new 3D APIs, but also things like video decoding).
That said, renaming G92 down sounds quite reasonable. GT200 might have higher efficiency per theoretical flop rate, but certainly NOT higher efficiency per die size. Even if Nvidia dropped the DP unit, it would probably still have only similar performance per die size (and that's assuming it could be clocked as high as G92); it would not use less power and would not be cheaper. And the feature set is, as far as that market segment cares, pretty much identical.
But Nvidia will certainly need new 40nm midrange chips.
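To illustrate the "efficiency per die size" point, a small sketch; the die areas are the commonly quoted approximations and the 1.5x relative performance figure is a rough assumption, so the numbers are only indicative.

```python
# Performance per die area, G92b vs. GT200b (both 55nm).
# Die sizes are commonly quoted approximations; the 1.5x relative performance
# of GT200b over G92b at the high end is a rough assumption for illustration.

dies = {
    "G92b (9800 GTX+)": {"area_mm2": 230, "relative_perf": 1.0},
    "GT200b (GTX 285)": {"area_mm2": 470, "relative_perf": 1.5},
}

for name, d in dies.items():
    per_100mm2 = d["relative_perf"] / d["area_mm2"] * 100
    print(f"{name}: {per_100mm2:.2f} relative perf per 100 mm^2")
# G92b comes out ahead per mm^2, which is why scaling GT200 down doesn't
# automatically yield a cheaper part at the same performance level.
```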
 
((Since we're all assuming here what a GT200-type midrange GPU would look like)) The main thing relevant to a gamer that I can see a GT200-type card offering over a G92 is superior geometry shading performance.
 
I guess by this reasoning there's really no reason ever to make a midrange/budget part, as you can just keep renaming former enthusiast parts down into the midrange/budget areas...

Except that's not what happened here. There were two very significant shrinks: 90nm -> 65nm -> 55nm.
 
The three 40nm chips Nvidia appears to be coming out with are a G92 with, I think, a 192-bit bus; a G96 with maybe a 128-bit bus (which will surely be pad-limited); and a third one even lower, definitely 64-bit.

(That is, if you believe the rumours.)

Assuming the above is true and they are close to straight shrinks of current chips, it implies they have been diverting resources to other projects, i.e. things like future GPUs such as GT300 and its children (where in some sense GT200 tech might live on), or maybe other things they are experimenting with like Ion and CPUs.
 
Well ChrisRay already mentioned this... Geometry shaders are much faster on GT200. I don't think anything else about DX10 loads has been massively improved.
And then you have to ask: are there any games out or in development that will make such heavy use of geometry shaders that a gamer would actually notice the difference?
 
Well ChrisRay already mentioned this...

Oooops, I missed that one, sorry.

Geometry shaders are much faster on GT200. I don't think anything else about DX10 loads has been massively improved.

In general, anything using streamout should run much better.

And then you have to ask: are there any games out or in development that will make such heavy use of geometry shaders that a gamer would actually notice the difference?

My only hands-on experience here is the Battleforge beta, which was rumoured to be DX10-focused. On XP, it runs twice as fast as on Vista with my 9600GT. I'll borrow a Radeon card today or tomorrow and see whether it's a case of an architectural difference or they just have a crappy DX10 codepath in the beta :smile:

Also, Stormrise is rumoured to be heavily GS intensive.
 
But G96 only has 32 SPs. How can they go any lower on 40nm?

The chip is said to replace the current G96 at similar performance levels. I think the current G96 is around ~140mm2 or so; if they can get it down close to 100mm2, they can still have a 128-bit bus. Provided yields are similar, they should be able to make more $$$.

(Edit: G96 was ~140mm2 on 65nm and ~120mm2 on 55nm, so they will likely have to add units to maintain a 128-bit bus.)
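For those die sizes, a quick scaling and pad-limit sketch; the ideal-shrink factor, pad pitch and pad counts are ballpark assumptions, intended only to show why a straight G96 shrink could struggle to keep a 128-bit bus.

```python
import math

# Ideal optical shrink from 55nm to 40nm applied to the ~120 mm^2 55nm G96.
# Real shrinks never reach the ideal factor, so treat this as a lower bound.
g96_55nm_mm2 = 120.0
ideal_factor = (40 / 55) ** 2                 # ~0.53
g96_40nm_mm2 = g96_55nm_mm2 * ideal_factor
print(f"Ideal 40nm G96: ~{g96_40nm_mm2:.0f} mm^2")        # ~63 mm^2

# Crude pad-limit check for a 128-bit memory interface on a square die.
# Pad pitch and pads-per-signal counts are ballpark assumptions, not foundry data.
pad_pitch_mm = 0.05                           # ~50 um pad pitch (assumed)
pads_needed = 128 * 3 + 200                   # data/strobes/power + addr/cmd/misc (rough)
edge_mm = math.sqrt(g96_40nm_mm2)
pads_available = int(4 * edge_mm / pad_pitch_mm)
print(f"Pads available: ~{pads_available}, pads needed: ~{pads_needed}")
# ~637 available vs. ~584 needed before PCIe, display and power pads are even
# counted - which is why growing the die back by adding units makes sense.
```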

Below that, I think, is some kind of G98-type part - guessing it will be used in mobile as well.

The final one is G92 with a 192-bit memory bus. They had trouble with this one and had to lay it out more sparsely to overcome 40nm leakage problems, so they didn't get quite the area savings they were after. Yields are not yet good enough for it to be cheaper overall than 55nm G92, so it will probably be the last of the three.

The above sounds conservative; I have not heard of any new features. I'm guessing they went this way because any Nvidia manager who OKs anything experimental or risky that later doesn't work out will likely pay a heavy price in the current environment.

The last chip, the G92-based one, has me a little concerned that it might be quite some time before we see a 40nm chip that uses a 256-bit bus.
 
ATI is using a 128-bit bus here.
768MB of 192-bit GDDR5 doesn't sound bad; that would give mighty bandwidth.
I speculate that GPU (GT215 presumably) has 144 SPs (6x24) and CUDA compute capability 1.2 (no FP64 support); the "new G96" (GT216) could be a 48 SP or even a 2x32, 64 SP part? I wish they had something competitive with RV730...

GT218 would be a 24 SP GPU; otherwise their new IGP might be faster than it, which would be quite shameful.
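On the "mighty bandwidth" point, a quick check; the 3.6 Gbps GDDR5 data rate is an assumed example, since no clocks were known at this point.

```python
# Peak memory bandwidth for the speculated 192-bit GDDR5 interface.
# The 3.6 Gbps effective data rate (900 MHz base clock) is an assumed example.

def bandwidth_gb_s(bus_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_bits / 8 * data_rate_gbps

print(bandwidth_gb_s(192, 3.6))   # 86.4 GB/s - well above a 9800 GTX+ (~70 GB/s)
print(bandwidth_gb_s(128, 3.6))   # 57.6 GB/s for the same memory on a 128-bit bus
```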
 
The chip called 'GT206' was canned; it was an 8 TMU discrete GPU mostly replacing G98. There were no other 65/55nm derivatives because 40nm was supposed to be available earlier than this. Is that a good enough reply? :)

FWIW, this is my current *guess* for GT218/GT216/GT215:
64-bit DDR3: 8xTMU/32xSP/4xROP/0xDP -> RV710
192-bit GDDR3: 32xTMU/128xSP/12xROP/0xDP -> RV740
192-bit GDDR5: 64xTMU/256xSP/24xROP/1xDP -> RV790
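To get a feel for what those guessed configs would yield, a small sketch computing theoretical rates; the core and shader clocks are placeholder assumptions (no clocks were known), and the unit counts are just the guesses above.

```python
# Theoretical rates for the guessed GT21x configs above.
# Clocks are placeholder assumptions; unit counts are the guesses from this post,
# not confirmed specifications.

configs = {
    "GT218 guess (64-bit DDR3)":   {"sps": 32,  "tmus": 8,  "core_mhz": 600, "shader_mhz": 1500},
    "GT216 guess (192-bit GDDR3)": {"sps": 128, "tmus": 32, "core_mhz": 600, "shader_mhz": 1500},
    "GT215 guess (192-bit GDDR5)": {"sps": 256, "tmus": 64, "core_mhz": 600, "shader_mhz": 1500},
}

for name, c in configs.items():
    gflops = c["sps"] * c["shader_mhz"] / 1000 * 2    # counting the MAD as 2 flops/SP/clk
    gtexels = c["tmus"] * c["core_mhz"] / 1000
    print(f"{name}: ~{gflops:.0f} GFLOPS, ~{gtexels:.1f} Gtexels/s")
```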
 
192-bit GDDR3: 32xTMU/128xSP/12xROP/0xDP -> RV740
If that turns out to be true, Nvidia had better have the frequencies up a bit, considering how close RV740 came to RV770 (4850) in Guru3D's test and how close the HD 4850 is to the 9800 GTX+ (now GTS 250). If GT216 even saves on TMUs and bandwidth without healthy compensation via clock rates, it's very likely that it cannot keep pace with the final RV740*.


BTW - by 1xDP do you mean per TPC?


*Assuming the individual units keep more or less the same capabilities.
 
Well, you can look at it that way, or you can consider that it'd have the same or more bandwidth if using 1.2GHz GDDR3 and, if it runs at the same core clock speeds as RV740 with similar shader clock multipliers as on G9x, would have similar TMU performance and slightly lower ALU performance.
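That bandwidth comparison is easy to sanity-check; the 1.2 GHz GDDR3 figure is from the post above, while the GDDR5 data rate assumed for RV740 is my own placeholder.

```python
# Bandwidth check: 192-bit GDDR3 at 1.2 GHz vs. 128-bit GDDR5 on RV740.
# GDDR3 at 1.2 GHz is 2.4 Gbps effective per pin; the 3.2 Gbps GDDR5 rate
# assumed for RV740 is a placeholder, since final clocks were not public.

gddr3_192bit = 192 / 8 * 2.4   # 57.6 GB/s
gddr5_128bit = 128 / 8 * 3.2   # 51.2 GB/s
print(gddr3_192bit, gddr5_128bit)
```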

Compared to G92, it will hopefully be clocked higher and remember that both the TMUs and ALUs are theoretically slightly more efficient (G92 can't get to peak bilinear throughput, ALUs can't make use of the MUL when not doing interpolation, etc.) - I don't really expect any clear winner versus RV740 if this config is right, but maybe I'm missing something.

And yes, that's what I mean by 1xDP (i.e. like GT200).
 