ELSA hints at GT206 and GT212

Sigh, why don't people get it? 384 SPs/96 TMUs is probably wrong.

I'm harboring some pretty optimistic wishes for GT212 but that's all going to come down to how much Nvidia has improved their transistor density. The other thing to consider is that DX11 is obviously going to be more expensive per flop so they can't do too much with GT212 without causing collateral damage to GT300.
 
Alright, let's forget the rumours. What do you think of these?

GT212 | 288 SPs | 96 TMUs | 4 ROP blocks, 256-bit interface, GDDR5
GT214 | 144 SPs | 48 TMUs | 3 ROP blocks, 192-bit interface
GT216 | 72 SPs | 24 TMUs | 2 ROP blocks, 128-bit interface

I'm basing it on :yep2:
- the ALU:TEX ratio being carried over from G200, as it was carried over from G8x to G9x
- TMU count still being higher compared to G92/G94/G96, even though not two times higher
- GT214 and GT216 could produce similar performance as G92 and G94
I'm not sure about :oops:
- the bandwidth needs of such chips
- the viability of the proposed interface widths on 40nm
- DDR3/GDDR3/GDDR5 prices compared to the cost of adding a wider interface
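
A quick way to sanity-check the table above is to count clusters, assuming the G200-style building block of 24 SPs + 8 TMUs is carried over unchanged. This is only a sketch: the names `LINEUP` and `clusters` are made up for illustration, and the chip configurations are the speculated ones from this post, not confirmed specs.

```python
# Speculated configurations from the table above: (SPs, TMUs).
LINEUP = {
    "GT212": (288, 96),
    "GT214": (144, 48),
    "GT216": (72, 24),
}

# Assumed building block, carried over from G200: 24 SPs + 8 TMUs.
SPS_PER_CLUSTER = 24
TMUS_PER_CLUSTER = 8


def clusters(sps, tmus):
    """Return the cluster count, or raise if the config is not a whole number of clusters."""
    if sps % SPS_PER_CLUSTER or tmus % TMUS_PER_CLUSTER:
        raise ValueError("not a whole number of clusters")
    n = sps // SPS_PER_CLUSTER
    if n != tmus // TMUS_PER_CLUSTER:
        raise ValueError("SP and TMU counts imply different cluster counts")
    return n


for chip, (sps, tmus) in LINEUP.items():
    print(f"{chip}: {clusters(sps, tmus)} clusters, ALU:TEX = {sps // tmus}:1")
```

Under that assumption the three chips come out at 12, 6 and 3 clusters, each halving the one above it, all at 3:1.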

The initial 40nm line-up would include only GT214, GT216 and perhaps GT218 - these would obsolete G92, G94, G96 and G98. G200b would stay for some time, to be later obsoleted by GT212. GT212 would be launched when the 40nm process is mature enough so it makes sense instead of 55nm for G200b.

What I don't like about my speculative roadmap is the gap between GT214 and GT212. Maybe GT212 could be toned down a bit, or GT214 toned up - probably depending on what die size nVidia will need to fit the pads for their chosen bus width. And there used to be an even greater gap between G80 and G84...

Now about GT218. You're right that the basic building block is 8 TMUs + 16 SPs on G8x/G9x and 24 SPs on G200. I wonder, though… several sources claim the Quadro NVS 420 to have two times 8 SPs - is that possible or is it a mistake?

Anyway, GT218 would be more of a video decoding GPU than a low-cost gaming GPU, so it could very well have just one SIMD:
GT218 | 24 SPs | 8 TMUs | ???
The problem is, I'm not sure whether the chip could support a 64-bit interface. 32-bit with GDDR5 would probably provide sufficient bandwidth, but nobody will stick expensive GDDR5 onto a $50 card, and a 32-bit bus sounds very unlikely. It would only make sense if GDDR5 were the standard choice for GT21x cards and nVidia designed the ROP/MC blocks with 32-bit channels (along with something similar to ATI's ring-bus or hub, so the crossbar doesn't get huge), to offset GDDR5's slower command rate. But something tells me that won't be the case: the majority of the cards will use GDDR3, which will be swapped for DDR3 as time goes on, and GDDR5 will be used only where really needed.
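
To put rough numbers on the 32-bit GDDR5 versus 64-bit idea, peak bandwidth is just bus width in bytes times the per-pin data rate. The sketch below uses a hypothetical helper `bandwidth_gb_s`, and the data rates are illustrative assumptions picked for the comparison, not figures from this thread or from any announced card.

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gbps


# Per-pin data rates below are illustrative assumptions, not figures from this thread.
configs = [
    ("64-bit DDR3 @ 1.6 Gbps", 64, 1.6),
    ("64-bit GDDR3 @ 2.0 Gbps", 64, 2.0),
    ("32-bit GDDR5 @ 4.0 Gbps", 32, 4.0),
]

for name, width, rate in configs:
    print(f"{name}: {bandwidth_gb_s(width, rate):.1f} GB/s")
```

With these assumed rates a 32-bit GDDR5 bus lands at the same peak figure as 64-bit GDDR3, which is why the idea is not crazy on bandwidth alone; the cost argument is a separate matter.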
 
Why? It has half the SPs and TMUs and looks terribly underpowered on paper, but a 9600 GT comes very close to 8800/9800 GT in real scenarios. So G92's memory capacity/bandwidth limitation seems more stupid, in my lame opinion.
 
Gotta go with Lukfi here. The 9600GT was and still is a great card. It's G92 that was bandwidth starved.

I'm still holding out for 4:1 on GT2xx though.
 
The two are so close in performance (most games seem mostly dominated by ROP/BW) that it looks like a mistake - I shouldn't have implied that 9600GT's design was inherently broken, it's the line-up formed by these chips that looks farcical. Remember the smoke'n'mirrors with GSOs and GSs and umpteen variants of G92...

8800GT was great value before Christmas 2007 and 9600GT was even better when it came out a few months later, just making the more expensive NVidia GPUs really poor value.

Though it's tempting to speculate NVidia will continue the farce, it's pretty unlikely. Isn't it?

Anyway, Lukfi, the stratification you've drawn up seems reasonable to me

Jawed
 
The line-up was strange, but very competitive. I still believe that GF8800GT was more a marketing tool than a real product. It was hardly available, and GF9600GT appeared before nVidia "solved" this issue. It seems that GF9600GT was simply late and nVidia needed anything to catch users' eyes - otherwise RV670 would have had no competitor.

I think GF8800GT was initially a purposely limited product - it was meant to hold the place for GF9600GT (which was much cheaper to produce, but late).

GF8800GT was very good for many brands - they sold them to suppliers under the condition that they'd also buy their overpriced 8400/8600, which nobody wanted.

It worked well.
 
8800GT was great value before Christmas 2007 and 9600GT was even better when it came out a few months later, just making the more expensive NVidia GPUs really poor value.

Though it's tempting to speculate NVidia will continue the farce, it's pretty unlikely. Isn't it?
The lineup was a farce because G92 was BW/memory starved. As I said, I have no idea about the number of ROPs and memory throughput needed for the GT21x chips I'm speculating about, so I expect nVidia not to make the same mistake as with G92 and use an adequate amount of memory, adequately fast.
 
Alright, let's forget the rumours. What do you think of these?

GT212 | 288 SPs | 96 TMUs | 4 ROP blocks, 256-bit interface, GDDR5
GT214 | 144 SPs | 48 TMUs | 3 ROP blocks, 192-bit interface
GT216 | 72 SPs | 24 TMUs | 2 ROP blocks, 128-bit interface
My first thought is that line-up is too dense; if you looked at the die sizes and board costs of GT214/GT216, I'm not 100% sure it'd be worth the trouble.

I really get the impression most people are still underestimating 40nm. From a density, performance and power perspective it is a very important process node - this is, however, offset by wafer price increases greater than historical norms.

I also get the feeling most people are overestimating the die size of NVIDIA's SPs; you do realize they only take 25% of GT200's die size, right? So let's say ~150mm² for 240 SPs, or ~5mm² per group of 8 SPs. On 40nm, that goes down to less than 2.5mm²... So the difference between 72 SPs and 96 SPs is certainly less than 8mm². Given the likely performance boost, does it really make sense to keep the SP ratio so low?
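
The arithmetic above can be spelled out as a small sketch. The 150mm² and 240 SP figures are the ones quoted in the post; the 0.5 shrink factor is an assumption standing in for "less than 2.5mm²" (so it is an upper bound on area), and the names are invented for illustration.

```python
# Figures quoted in the post above.
SP_AREA_65NM_MM2 = 150.0   # ~25% of GT200's die, covering all 240 SPs
TOTAL_SPS = 240
GROUP_SIZE = 8

# Assumption: the move to 40nm halves the area at worst ("less than 2.5mm²" per group).
SHRINK_FACTOR = 0.5


def sp_area_mm2(sp_count, shrink=SHRINK_FACTOR):
    """Estimated area of sp_count SPs after the shrink, in mm²."""
    per_group_65nm = SP_AREA_65NM_MM2 / (TOTAL_SPS / GROUP_SIZE)  # ~5 mm² per 8 SPs
    return sp_count / GROUP_SIZE * per_group_65nm * shrink


extra = sp_area_mm2(96) - sp_area_mm2(72)
print(f"72 -> 96 SPs costs about {extra:.1f} mm² on 40nm")
```

With the 0.5 factor the extra 24 SPs come to 7.5mm², consistent with the "certainly less than 8mm²" claim; a better-than-half shrink would make it smaller still.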

I would argue that it makes no sense at all. Of course, NVIDIA's 65nm line-up didn't make much sense either, so I fully understand people's skepticism.

Regarding G94, it's easy to say it doesn't make sense but the problem is if you cut the memory bus down to 192-bit, you had to use either 384MiB or 768MiB of DRAM; the former is too little in that market segment, the latter was too much in that timeframe. Fun stuff! :) I still do believe the most significant 'basic' mistake of the G9x line-up by far is not going for a 320-bit memory bus on G92, though.
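
The 384MiB/768MiB constraint falls out of simple chip counting. A minimal sketch, assuming one 32-bit DRAM chip per channel and chip densities of 512Mbit (64MiB) and 1Gbit (128MiB); both are assumptions about what was practical at the time, and `capacity_options_mib` is a made-up helper.

```python
CHANNEL_WIDTH_BITS = 32  # assumption: one 32-bit DRAM chip per channel


def capacity_options_mib(bus_width_bits, chip_sizes_mib=(64, 128)):
    """Possible total capacities for a bus width, one chip per 32-bit channel."""
    if bus_width_bits % CHANNEL_WIDTH_BITS:
        raise ValueError("bus width must be a multiple of 32 bits")
    chips = bus_width_bits // CHANNEL_WIDTH_BITS
    return [chips * size for size in chip_sizes_mib]


for width in (256, 192, 320):
    print(f"{width}-bit: {capacity_options_mib(width)} MiB")
```

Under these assumptions a 192-bit bus gives exactly the 384MiB or 768MiB choice described above, while a 320-bit bus would have meant 640MiB or 1280MiB.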

Now about GT218. You're right that the basic building block is 8 TMUs + 16 SPs on G8x/G9x and 24 SPs on G200. I wonder, though… several sources claim the Quadro NVS 420 to have two times 8 SPs - is that possible or is it a mistake?
It's not a mistake, G86 was 8 TMUs/16 SPs, MCP78 was 4 TMUs/16 SPs, but amusingly G98 was 8 TMUs/8 SPs as discussed in a recent thread (and very much to my surprise). MCP7A is 8 TMUs/16 SPs... There also was a SKU of G86 way back in the day that had 8 of its 16 SPs disabled.

---

BTW, one small comment - I think we're all assuming NV can't easily change the TMU-SP ratio in GPUs since, except for G98/MCP78, they seemingly never did so. I'm not sure that's right, and if so it'd make speculation about the ratio in different chips much more complex...
 
I still do believe the most significant 'basic' mistake of the G9x line-up by far is not going for a 320-bit memory bus on G92, though.
You'll get no argument from me there. If G80 had the same SP/TMU count and was designed with a 384-bit interface, there were bound to be problems with G92. Will nVidia make the same mistake twice?
It's not a mistake, G86 was 8 TMUs/16 SPs, MCP78 was 4 TMUs/16 SPs, but amusingly G98 was 8 TMUs/8 SPs as discussed in a recent thread (and very much to my surprise). MCP7A is 8 TMUs/16 SPs... There also was a SKU of G86 way back in the day that had 8 of its 16 SPs disabled.

BTW, one small comment - I think we're all assuming NV can't easily change the TMU-SP ratio in GPUs since, except for G98/MCP78, they seemingly never did so. I'm not sure that's right, and if so it'd make speculation about the ratio in different chips much more complex...
Hmm, my speculation was quite heavily dependent on the constant SP:TMU ratio. I didn't know that G98 and the MCP78 IGP were different. But they never did change the ratio on any other chips, that's strange. ATI has different ALU:TEX ratios for different market segments, so I guess this approach does make sense.

Unfortunately, without at least the SP:TMU ratio, we have close to nothing to fall back on :(
 
That's okay. Compared to G98, GT218 has 32 SPs instead of 8, so the shader power is much higher (132 instead of 34 GFLOPS).
In theory the TMU/ROP performance is the same as G98's, but GT2xx has several optimizations in these units.
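
For reference, the two GFLOPS figures can be reproduced with the usual peak-throughput formula. This is a sketch under assumptions: 3 flops per SP per clock (MAD + MUL), the 1375MHz shader clock quoted elsewhere in this thread for GT218, and a 1400MHz shader clock for G98, which is my guess at what the 34 figure is based on.

```python
FLOPS_PER_SP_PER_CLOCK = 3  # assumption: MAD (2 flops) + MUL (1 flop) per clock


def peak_gflops(sp_count, shader_clock_mhz):
    """Theoretical peak shader throughput in GFLOPS."""
    return sp_count * FLOPS_PER_SP_PER_CLOCK * shader_clock_mhz / 1000


# Shader clocks are assumptions chosen to match the quoted figures.
print(f"GT218: {peak_gflops(32, 1375):.1f} GFLOPS")
print(f"G98:   {peak_gflops(8, 1400):.1f} GFLOPS")
```

That gives 132.0 and 33.6, i.e. the quoted 132 and roughly 34.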

Let me remind you: 22 W TDP. ;)
 
Interesting, based on the PCIe slot that package is indeed 23mm², which is what Fudzilla claimed in this post: http://www.fudzilla.com/index.php?option=com_content&task=view&id=11715&Itemid=34

So at least the two match. "The core is clocked at 550MHz and shader clock at 1375MHz" though? What the hell? Not only is this surprisingly low, these are the EXACT same clocks as the G96b-based G9400GT. Very dubious indeed...
23mm²? Are you really sure Nvidia dared to make four small round GPUs and position them around a nearly square centerpiece that only serves stability purposes? At least, if I read VR-Zone's diagram correctly. ;)

*SCNR*

BTT: In this matter I concur with konkort that they must have realized by now the importance of shaders, thus having made the switch from the 2:1 ratio of G80 and the like to 4:1 on GT21x with G(T)2x0 being an intermediary with 3:1.

It'd only make sense after all, given the massive amount of die space their scheduling logic and associated overhead takes, to go for the slightly bigger die and try to outperform AMD. It's not like they have any other choice, given the enormous FLOPS/mm² AMD has already achieved with their 2nd 55nm generation.
 
23mm²? Are you really sure Nvidia dared to make four small round GPUs and position them around a nearly square centerpiece that only serves stability purposes? At least, if I read VR-Zone's diagram correctly. ;)
Sorry, I meant 23mm, not 23mm². So 23x23; this is the package size, not the die size.
 
thus having made the switch from the 2:1 ratio of G80 and the like to 4:1 on GT21x with G(T)2x0 being an intermediary with 3:1.
4:1 or 5:1 on GT21X wouldn't surprise me... though I'd be a little disappointed with the former. Where I will be really disappointed is if GT3xx isn't 6:1 (which is where Nvidia said they were going 5 years ago... and still haven't made it).
 
What interests me is that I don't recall Nvidia ever releasing so many different products in such a short time. I mean, we have the 285, 295, and the upcoming 212 and 218 etc... it seems like they are flooding the market with a lot of products with not much difference between them. I don't think there was a precedent like this before...
 
=>ninelven: What you are talking about are physical SP vs. TMU counts, but that way you're ignoring that SPs are running at a much higher frequency. So while for example G92 is physically 2:1, effectively it is >4:1.
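
The clock-weighted ratio can be sketched as follows. The G92 configuration used here (112 SPs, 56 TMUs, 1500MHz shader clock, 600MHz core clock, roughly an 8800 GT) is an assumption for illustration, and `effective_alu_tex_ratio` is a made-up helper name.

```python
def effective_alu_tex_ratio(sps, tmus, shader_clock_mhz, core_clock_mhz):
    """ALU:TEX ratio weighted by the clock each unit type runs at."""
    return (sps * shader_clock_mhz) / (tmus * core_clock_mhz)


# Assumed G92 configuration (8800 GT-like): 112 SPs, 56 TMUs, 1500MHz shader, 600MHz core.
physical = 112 / 56
effective = effective_alu_tex_ratio(112, 56, 1500, 600)
print(f"physical {physical:.1f}:1, effective {effective:.1f}:1")
```

With those clocks the physical 2:1 becomes an effective 5:1, in line with the ">4:1" above.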

=>suryad: I think G200b was delayed a bit. And the rest is closely connected with the availability of 40nm, which seems to be kind of a holy grail these days...
 