NVIDIA GT200 Rumours & Speculation Thread

I would have to say there is probably no way to predict with 100% certainty that there will be no failures, but with better tools there should be a much better chance of avoiding them.

On a side note, I have to admit that it will be interesting to see how GT200 compares to the 9800GX2. I imagine that when all is said and done, the GX2 will struggle to keep up in certain scenarios. I'm still waiting for the day when we see the monolithic 1024MB GPU.

I'm sure GT200-based 9900GTX will beat the twin-G92-based 9800GX2.

9900GT should be a decent value, while the 9900GX2 goes for the very high end.

I'm hoping the 9900GTX can get Crysis running (finally) at 60fps with everything cranked, so I don't have to get a 9900GX2.

GT200 should be the "real" GeForce 9 series, but yeah, I do agree that Nvidia will probably call it GeForce 10, even though the GeForce 10 name should be reserved for Nvidia's completely new clean-sheet architecture that should arrive in late 2009. Remember, even though Nvidia is calling GT200 a next-gen GPU, it's still just a major refresh/overhaul of G80, much like NV47/G70/7800 was a refresh of NV40/6800.
 
I'm hoping the 9900GTX can get Crysis running (finally) at 60fps with everything cranked, so I don't have to get a 9900GX2.

Unlikely, unless you're running at a low resolution.

It would have to be a full 4x faster than a 9800GTX to run at very high settings, 1920x1200, 60fps average.

And that's not even considering MSAA!
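For a rough sense of where that 4x figure comes from, here's a back-of-envelope sketch. The ~15fps baseline for a 9800GTX at those settings is just the assumption implied by the "4x faster" claim above, not a measured number:

```python
# Back-of-envelope: speedup over a 9800GTX needed for a 60fps average,
# assuming the 9800GTX averages ~15fps at Very High, 1920x1200, no AA.
baseline_fps = 15.0   # assumed baseline, implied by the "4x faster" claim
target_fps = 60.0

required_speedup = target_fps / baseline_fps
print(f"Required speedup: {required_speedup:.1f}x")  # -> 4.0x
```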
 
Don't we have several of those already? 8800GT, 2900XT, etc.

Was there a 1024MB version of 8800GT that was actually being sold? I remember reading about the 2900XT 1024MB, but thought it was just a very very limited special production run.

Also, it would be nice to see a 1024MB card that wasn't so bottlenecked in other areas that the extra 512MB of RAM offered little to no performance improvement.
 
I'm sure GT200-based 9900GTX will beat the twin-G92-based 9800GX2.

9900GT should be a decent value, while the 9900GX2 goes for the very high end.

I'm hoping the 9900GTX can get Crysis running (finally) at 60fps with everything cranked, so I don't have to get a 9900GX2.

GT200 should be the "real" GeForce 9 series, but yeah, I do agree that Nvidia will probably call it GeForce 10, even though the GeForce 10 name should be reserved for Nvidia's completely new clean-sheet architecture that should arrive in late 2009. Remember, even though Nvidia is calling GT200 a next-gen GPU, it's still just a major refresh/overhaul of G80, much like NV47/G70/7800 was a refresh of NV40/6800.

I have to admit that the 9800 GTX name does throw things off. It probably would have been more appropriately named 8900 GTX, but I suppose there are some marketing benefits to calling it 98xx instead of 89xx :)
 
Was there a 1024MB version of 8800GT that was actually being sold? I remember reading about the 2900XT 1024MB, but thought it was just a very very limited special production run.

Also, it would be nice to see a 1024MB card that wasn't so bottlenecked in other areas that the extra 512MB of RAM offered little to no performance improvement.

I can't help picturing you (even though I don't know what you look like) stepping into a Best Buy and asking for the 1024MB video card just like every other clueless Joe Sixpack.

I know you must know that memory does not equal performance, and I wonder: why the hell do you want a 1024MB video card in particular?
 
I can't help picturing you (even though I don't know what you look like) stepping into a Best Buy and asking for the 1024MB video card just like every other clueless Joe Sixpack.

I imagine you must know that memory does not equal performance, and I wonder: why the hell do you want a 1024MB video card in particular?

Notice I said earlier: Also, it would be nice to see a 1024MB card that wasn't so bottlenecked in other areas that the extra 512MB of RAM offered little to no performance improvement.

Of course it would be stupid to buy an 8800GT or 2900XT with 1024MB of RAM, because performance is severely bottlenecked elsewhere. Obviously I was looking forward to future ~1024MB cards that could handle the UE3 engine from Sweeney, among other things.
 
Notice I said earlier: Also, it would be nice to see a 1024MB card that wasn't so bottlenecked in other areas that the extra 512MB of RAM offered little to no performance improvement.

Of course it would be stupid to buy an 8800GT or 2900XT with 1024MB of RAM, because performance is severely bottlenecked elsewhere. Obviously I was looking forward to future ~1024MB cards that could handle the UE3 engine from Sweeney, among other things.

I can see 1024MB cards in SLI helping greatly, as SLI could use the larger memory per card. However, you're right: the other bottlenecks, such as the limit of the 256-bit bus, need to be addressed as well to make it all work efficiently.
 
Let's talk FLOPs for a minute again.

Last year, around this time, there was word/rumor/report that Nvidia would have a GPU that could do roughly 1 TFLOP in Q4 2007. That would be 2 or 3 times greater performance than G80. The G80 did roughly 1/2 a TFLOP by some measurements, or roughly 1/3 of a TFLOP by most accepted measurements.

So this new Nvidia GPU would need to be roughly 3x G80 in shader FLOPs performance.

The GPU to reach 1 TFLOP was at first thought to be the G92, even though that later turned out to be a mistake. Getting the name, and even the timeframe wrong aside, the 1 TFLOP GPU is still expected. The GT200 is that GPU.

How does GT200 get to ~1 TFLOP (assuming it does) with about 200 SPs, instead of the 128 SPs that give G80 & G92 about 1/3 of a TFLOP? A clockspeed increase in the SP domain? More powerful SPs with more math units?

It would be much easier to see ~1 TFLOP with 256 SPs and just a 50% increase in clockspeed, but most people doubt Nvidia will put that many stream processors onto a single chip.

I really should re-read that thread related to that article.
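For reference, a quick sketch of the peak-FLOP arithmetic behind those 1/3 and 1/2 TFLOP figures, plus the speculative GT200 configurations mentioned above (the GT200 shader counts and clocks are just the rumoured numbers from this thread, not confirmed specs):

```python
def peak_gflops(sps, flops_per_sp_per_clock, shader_clock_ghz):
    """Peak theoretical shader throughput in GFLOPs."""
    return sps * flops_per_sp_per_clock * shader_clock_ghz

# G80 (8800GTX): 128 SPs at a 1.35GHz shader clock
print(peak_gflops(128, 2, 1.35))    # ~346 GFLOPs counting MADD only -> the "1/3 TFLOP" figure
print(peak_gflops(128, 3, 1.35))    # ~518 GFLOPs counting the co-issued MUL -> the "1/2 TFLOP" figure

# Speculative GT200-class configurations (rumoured, not confirmed)
print(peak_gflops(240, 3, 1.5))     # ~1080 GFLOPs: 240 SPs, MADD+MUL, 1.5GHz
print(peak_gflops(256, 2, 2.025))   # ~1037 GFLOPs: 256 SPs, MADD only, ~50% clock bump over G80
```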
 
Let's talk FLOPs for a minute again.

Last year, around this time, there was word/rumor/report that Nvidia would have a GPU that could do roughly 1 TFLOP in Q4 2007. That would be 2 or 3 times greater performance than G80. The G80 did roughly 1/2 a TFLOP by some measurements, or roughly 1/3 of a TFLOP by most accepted measurements.

So this new Nvidia GPU would need to be roughly 3x G80 in shader FLOPs performance.

The GPU to reach 1 TFLOP was at first thought to be the G92, even though that later turned out to be a mistake. Getting the name, and even the timeframe wrong aside, the 1 TFLOP GPU is still expected. The GT200 is that GPU.

How does GT200 get to ~1 TFLOP (assuming it does) with about 200 SPs, instead of the 128 SPs that give G80 & G92 about 1/3 of a TFLOP? A clockspeed increase in the SP domain? More powerful SPs with more math units?

It would be much easier to see ~1 TFLOP with 256 SPs and just a 50% increase in clockspeed, but most people doubt Nvidia will put that many stream processors onto a single chip.

I really should re-read that thread related to that article.

The time frame wasn't wrong; Nvidia just missed the mark, and most people KNEW that G92 wasn't the "1 TFLOP chip" that was talked about.
As for the 1 TFLOP, Nvidia is most likely quoting peak theoretical performance:
240 SPs x 3 FLOPs per clock x 1500MHz ≈ 1.08 TFLOPs.
 
The time frame wasn't wrong; Nvidia just missed the mark, and most people KNEW that G92 wasn't the "1 TFLOP chip" that was talked about.
That's not really true...
As for the 1 TFLOP, Nvidia is most likely quoting peak theoretical performance:
240 SPs x 3 FLOPs per clock x 1500MHz ≈ 1.08 TFLOPs.
Or 240 SPs x 2 FLOPs x 2000MHz, which would actually be better since it'd be easier to achieve maximum utilization. In that case you might even have the MUL exposed too and just not counted in the 'nearly 1 TFLOP' figure, so you could achieve 1.1-1.5 TFLOPs depending on how it's exposed. However, it seems more likely that the 1 TFLOP figure is all-inclusive; who knows, though. Also, clocks can come out slightly differently than what they were expected to be back then, so that figure might even be outdated.

I would find it an appealing design choice, personally, if the 'SFU OR Interp OR MUL' unit was exposed ala G86 and could also do an ADD. That is, it couldn't do a MADD or a logic op or... - just an ADD or a MUL. The idea is that the incremental cost of a FP32 ADD reusing as much of the MUL hardware as possible is pretty low (well, MADD is pretty cheap too compared to MUL, but it would require even more operands being sent to the unit), and there are plenty of isolated ADD/MUL ops in modern shaders that aren't MADDs. Being able to handle those cases as efficiently as possible would be advantageous; this would also help guarantee neither the main nor the SFU unit would practically ever idle except in corner cases. Of course, whether this makes sense depends on the relative size of the SFU/Interpolation unit and the complexity of feeding the unit with 2 read operands per clock. And of course, this is just wishful thinking...
 
How does GT200 get to ~1 TFLOP (assuming it does) with about 200 SPs, instead of the 128 SPs that give G80 & G92 about 1/3 of a TFLOP? A clockspeed increase in the SP domain? More powerful SPs with more math units?

The same marketing, er, broken math that gets G80 to 1/2 a TFLOP? *ducks*

Aaron Spink
speaking for myself inc.
 
That's not really true...
Or 240 SPs x 2 FLOPs x 2000MHz, which would actually be better since it'd be easier to achieve maximum utilization. In that case you might even have the MUL exposed too and just not counted in the 'nearly 1 TFLOP' figure, so you could achieve 1.1-1.5 TFLOPs depending on how it's exposed. However, it seems more likely that the 1 TFLOP figure is all-inclusive; who knows, though. Also, clocks can come out slightly differently than what they were expected to be back then, so that figure might even be outdated.
The problem, as always, becomes power. 55nm vs 65nm doesn't really give that much headroom, so I'd stick with the marketing FLOPs for the moment. As it is, it will probably be a fairly hot card.

Aaron Spink
speaking for myself inc.
 
The 9800GX2 has 32 ROPs, 256 SPs, and 128 TMUs. The GT200 is rumoured to have 32 ROPs, 240 SPs and 80 TMUs. The former's TDP is ~200W; that's less than an R600. I know it's binned for power efficiency, but I still really don't see the problem.

And all indicators are clearly pointing at 1TFlop or 'Nearly 1TFlop' not being calculated that way, although who knows how the SFU/Interp unit will be reused for ALU ops this time around since that seems to change at every RTL revision, yay.
 
The 9800GX2 has 32 ROPs, 256 SPs, and 128 TMUs. The GT200 is rumoured to have 32 ROPs, 240 SPs and 80 TMUs. The former's TDP is ~200W; that's less than an R600. I know it's binned for power efficiency, but I still really don't see the problem.

And all indicators are clearly pointing at 1TFlop or 'Nearly 1TFlop' not being calculated that way, although who knows how the SFU/Interp unit will be reused for ALU ops this time around since that seems to change at every RTL revision, yay.

Don't forget the additional power draw of the NF200 chip present on the 9800 GX2. ;)
At least on the 780i SLI motherboards it was -another- cause for concern regarding power consumption/heat output.
 
Marketing FLOPs, NVflops... ah yes, I still remember the complete nonsense of the Xbox GPU initially being announced by Bill Gates and Nvidia as 140 GFLOPs, then being cut down to "only" 80 GFLOPs when NV2A was finalized. If the book 'Opening The Xbox' can be trusted, it turns out the Xbox was pushing about 20 GFLOPs, which is more realistic I suppose.

Then the PS3 GPU: 1.8 TFLOPs. Right. Same thing. Reality is closer to 1/4 of a TFLOP of programmable performance. At least Nvidia is using programmable figures nowadays instead of NVflops, even though it's still all peak theoretical.

In their recent ~6 hr conference call (I listened to the whole damn thing), Nvidia also mentioned having the performance of a couple/several TFLOPs in a couple of quarters. Where does that come from? A totally new next-generation architecture beyond G80/G92 and GT200? I doubt it. More likely JSS had in mind the 9900 GX2 (rumored 2x GT200 on a card), or two 9900 GX2s in SLI (2 cards, 4 GPUs). I think that's reasonable for the ultra high end: not so much for gaming, but for heterogeneous computing/Tesla stuff.
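If each GT200 really lands near the rumoured ~1 TFLOP, the multi-GPU arithmetic behind that "couple/several TFLOPs" remark is straightforward (the per-GPU figure below is the rumoured one, not a confirmed spec):

```python
per_gpu_tflops = 1.0              # assumed: the rumoured ~1 TFLOP per GT200
print(2 * per_gpu_tflops)         # one 9900 GX2 (2 GPUs)         -> ~2 TFLOPs
print(2 * 2 * per_gpu_tflops)     # two 9900 GX2s in SLI (4 GPUs) -> ~4 TFLOPs
```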

Oh, and assuming there is a 9900 GX2, or whatever they call the dual-GT200-on-a-card, it would be nice if the GPUs weren't crippled like the 9800 GX2's are, with 32 ROPs (16 per GPU) where they otherwise would've had 48.
 
One could argue that Tri-SLI 8800GTX offers nearly 1 teraflop too, depending on how you look at it. It's kinda interesting from a theoretical point of view, but obviously it doesn't really mean much.
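Checking that with the same kind of peak arithmetic, counting MADD only and ignoring the co-issued MUL:

```python
# Three 8800GTXs: 128 SPs each at a 1.35GHz shader clock, MADD only
print(3 * 128 * 2 * 1.35)   # ~1037 GFLOPs, i.e. roughly 1 TFLOP across the three cards
```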
 
That's not really true...
Actually it is.
Whether it was GT200/G100 or a G90 we may never know but they definitely were planning on a new series in Nov. This comes from some interesting sources and even some marketing documents.


Or 240 SPs x 2 FLOPs x 2000MHz, which would actually be better since it'd be easier to achieve maximum utilization. In that case you might even have the MUL exposed too and just not counted in the 'nearly 1 TFLOP' figure, so you could achieve 1.1-1.5 TFLOPs depending on how it's exposed. However, it seems more likely that the 1 TFLOP figure is all-inclusive; who knows, though. Also, clocks can come out slightly differently than what they were expected to be back then, so that figure might even be outdated.
So you really think that Nvidia is going to be clocking this beast of a chip, 520-600mm2, at an 800MHz core and 2000MHz shader clock? That simply goes against logic.
 
Whether it was GT200/G100 or a G90 we may never know but they definitely were planning on a new series in Nov.
Well, yes, the rumours were of a new series in Nov. My point is simply that, given the timeframe in which the 'nearly 1TFlop' comment was made, plans for Q4/Q1 were already finalized. So clearly that must have referred to GT200, not G9x.
So you really think that Nvidia is going to be clocking this beast of a chip, 520-600mm2, at an 800MHz core and 2000MHz shader clock? That simply goes against logic.
No, I'm expecting a higher shader/core clock ratio. Of course I predicted that for G92 too, and the ratio isn't that different. Who knows, we'll see.
 