NVIDIA GT200 Rumours & Speculation Thread

Status
Not open for further replies.
I dare to propose that the batch size is unchanged from G80, i.e. an additional SIMD array of eight SPs per cluster just increases the threading parallelism...
When talking about next-gen CUDA hardware (e.g. during the 2006/07 UIUC course) there have been very strong hints of 32-element batches.

Jawed
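For what it's worth, the 32-element figure is consistent with an 8-wide array; a quick sketch of the arithmetic (the per-array width and the 4-clock issue are the rumoured G80 numbers, not confirmed specs):

```python
# Hedged sketch: how a 32-element batch ("warp") falls out of an 8-wide
# SIMD array, assuming -- as widely rumoured for G80 -- that each
# instruction is held and issued to the array over 4 shader clocks.
SPS_PER_SIMD_ARRAY = 8   # assumption: eight scalar processors per array
ISSUE_CLOCKS = 4         # assumption: one instruction issued over 4 clocks

batch_size = SPS_PER_SIMD_ARRAY * ISSUE_CLOCKS
print(batch_size)  # 32 -- matching the 32-element batches hinted at
```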
 
AFAIK, G80 duplicated the same instruction across the two SIMD arrays, somewhat limiting the threading flexibility -- what is the chance of GT200 breaking from this "tradition", or is there no need to do so?
 
I was just checking Charlie Demerjian's news and comparing it to NDA info I have myself, and I realized that although Charlie's interpretation is very anti-nVidia, some of it is right. So I'm inclined to believe Charlie knows the GTX 200 specs and clocks; it's just his info about RV770 that is way off, which is why he claims that a single RV770 will be on par with the GTX 260.

How do you know he is not right about 770? Are you also under AMD NDA?
 
From the slide posted by trini

"Do not distribute" :LOL:

I kind of feel sorry for those who bought a 9800GX2. I mean, there would be no point for nVIDIA to spend a lot of resources, especially on the driver side of things, to maintain scaling for the GX2 in newer titles to come, since it's pretty much reached EOL (due to GT200). Quad SLI becomes meaningless once again, just like with the 7950GX2.

Quad SLI will take no more resources to maintain scaling than 3-way SLI does. N-way is fully backwards compatible with 2-way and 3-way SLI; hence the reason they made it Vista-only. Turn on max AFR, use the same SLI compatibility bits as other profiles, and call it a day. It actually works very well 95% of the time.

Chris
 
=>Jaaanosik: No, I am not under AMD NDA, so I don't *know*, but anybody with half a brain can tell Charlie is wrong on this. We know there's gonna be an R700, we know GT200 will be power-hungry, and we know that nVidia's architecture is more effective in terms of performance per transistor. Now, CJ says GT200 has 1.4 billion transistors. GTX 260 is basically 75% of GTX 280. There are parts of the chip that don't scale down with the number of SPs/TUs/ROPs/whatever, so the GTX 260 has to use more than 75% of GT200's transistors (1.05 B). RV770 can't be big, and even if it had around 1 billion transistors - which is IMO far more than it actually has - it would still lose to GTX 260. R700 will likely be the most direct opponent to GTX 260, while GTX 280 will remain unmatched. (btw, aren't you Slovak, by any chance? :) )
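As a back-of-the-envelope check of that 75% figure (all inputs are the rumoured numbers quoted above, not confirmed specs):

```python
# Back-of-the-envelope version of the transistor argument; the 1.4B
# count and the 75% unit ratio are the rumoured figures, not specs.
GT200_TRANSISTORS = 1.4e9   # CJ's rumoured count for GT200
UNIT_RATIO = 0.75           # GTX 260 as ~75% of the GTX 280's units

naive_estimate = GT200_TRANSISTORS * UNIT_RATIO
# This is only a lower bound: logic that doesn't scale with SP/TU/ROP
# count (scheduling, PCIe, display, etc.) is fully present either way.
print(round(naive_estimate / 1e9, 2))  # 1.05 (billion)
```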
 
BTW, here's a thought: I think the only way Charlie could possibly be right about the GT200b die size being 'a little more than 400mm²' while GT200 is presumably 576mm² is if GT200b uses GDDR5. 576*0.81 is ~466mm², but you could assume the per-bit I/O for GDDR5 isn't massively higher than for GDDR3, so you could save maybe 20mm² there. And if you save a bit on the MCs too, there you go: low 400s.
Where did you get the 0.81 figure? With perfect scaling you'd get (55/65)^2 = 0.72, which would be perfect for low 400 die size. I admit though you probably won't get perfect scaling, and I'd also assume nvidia wants to reintegrate nvio again, so yeah your theory makes sense.
I don't get the reasoning behind GT200 anyway, why not use GDDR5 in the first place? Why still on 65nm (55nm is hardly bleeding edge any more)? Though I guess if it's really delayed a lot that would explain both of those...
Speaking of delays, what happened to the 8600GT (G84) replacement (G96)? Those are still fabbed at 80nm, whereas AMD's current lineup hasn't included anything but 55nm parts for some months now...
 
Where did you get the 0.81 figure? With perfect scaling you'd get (55/65)^2 = 0.72, which would be perfect for low 400 die size. I admit though you probably won't get perfect scaling, and I'd also assume nvidia wants to reintegrate nvio again, so yeah your theory makes sense.
That kind of calculation is worthless for half-nodes; TSMC says it's a 10% linear shrink, so 0.9*0.9 = 0.81. Considering that apparently also holds for analogue and I/O (!!), that means 466.56mm² if GT200 is 576mm².
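The two shrink estimates under discussion, side by side (the 576mm² die size is the rumoured figure, not a confirmed one):

```python
# Naive full-node optical scaling vs TSMC's stated half-node shrink.
GT200_DIE_MM2 = 576                 # rumoured GT200 die size

full_node_factor = (55 / 65) ** 2   # naive optical scaling, ~0.72
half_node_factor = 0.9 * 0.9        # TSMC's stated 10% linear shrink

print(round(GT200_DIE_MM2 * full_node_factor, 1))  # 412.4 mm²
print(round(GT200_DIE_MM2 * half_node_factor, 2))  # 466.56 mm²
```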
I don't get the reasoning behind GT200 anyway, why not use GDDR5 in the first place? Why still on 65nm (55nm is hardly bleeding edge any more)? Though I guess if it's really delayed a lot that would explain both of those...
Remember that the cost per MiB of GDDR5 is higher than GDDR3's, and on an ultra-high-end part, which GT200 was always supposed to be, you'd definitely want >=768MiB... and with a 256-bit bus, that'd mean 1024MiB. For a 512MiB SKU, I'm sure GDDR5 would be a very good strategy; but at 1024MiB, I don't have the raw data, but I'm definitely very skeptical. Also, you're right of course that if GT200 had come out in Q4/Q1, GDDR5 wouldn't even have been an option.
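A rough sketch of the capacity arithmetic behind the 256-bit-bus point, assuming the usual 32-bit-wide GDDR devices (the chip densities are illustrative, not confirmed board configurations):

```python
# Chips needed to populate a bus of a given width, and the resulting
# total capacity, assuming 32-bit-wide GDDR devices throughout.
CHIP_WIDTH_BITS = 32

def memory_config(bus_bits, chip_mib):
    """Chips needed to populate the bus, and the resulting capacity."""
    chips = bus_bits // CHIP_WIDTH_BITS
    return chips, chips * chip_mib

print(memory_config(512, 64))   # (16, 1024): wide GDDR3-style bus
print(memory_config(256, 64))   # (8, 512): 256-bit bus, 64MiB chips
print(memory_config(256, 128))  # (8, 1024): >=768MiB on a 256-bit bus
                                # means jumping straight to 1024MiB
```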
Speaking of delays, what happened to the 8600GT (G84) replacement (G96)? Those are still fabbed at 80nm, whereas AMDs current lineup doesn't include anything but 55nm parts for some months now...
It will come on 65nm in time for the back-to-school OEM cycle, unless it gets delayed again for the billionth time...
 
=>Jaaanosik: No, I am not under AMD NDA, so I don't *know*, but anybody with half a brain can tell Charlie is wrong on this. We know there's gonna be an R700, we know GT200 will be power-hungry, and we know that nVidia's architecture is more effective in terms of performance per transistor. Now, CJ says GT200 has 1.4 billion transistors. GTX 260 is basically 75% of GTX 280. There are parts of the chip that don't scale down with the number of SPs/TUs/ROPs/whatever, so the GTX 260 has to use more than 75% of GT200's transistors (1.05 B). RV770 can't be big, and even if it had around 1 billion transistors - which is IMO far more than it actually has - it would still lose to GTX 260. R700 will likely be the most direct opponent to GTX 260, while GTX 280 will remain unmatched. (btw, aren't you Slovak, by any chance? :) )

That's true for R600 and derivatives, but since you have no data on R700 performance or transistor count, I think this statement is a bit of a guess, even if I too think RV770XT will not match a GTX260.
If G92b does not substantially bump the clocks and bandwidth, and the rumors of RV770 being 1.25x a 9800 GTX are true, then at least in the midrange this will no longer hold, as RV770's transistor count is also rumored to be well under 1 billion. Of course, that's a big if.
 
=>Jaaanosik: No, I am not under AMD NDA, so I don't *know*, but anybody with half a brain can tell Charlie is wrong on this. We know there's gonna be an R700, we know GT200 will be power-hungry, and we know that nVidia's architecture is more effective in terms of performance per transistor. Now, CJ says GT200 has 1.4 billion transistors. GTX 260 is basically 75% of GTX 280. There are parts of the chip that don't scale down with the number of SPs/TUs/ROPs/whatever, so the GTX 260 has to use more than 75% of GT200's transistors (1.05 B). RV770 can't be big, and even if it had around 1 billion transistors - which is IMO far more than it actually has - it would still lose to GTX 260. R700 will likely be the most direct opponent to GTX 260, while GTX 280 will remain unmatched. (btw, aren't you Slovak, by any chance? :) )

You are right; I just wanted to get out of you whether you are playing for both camps. :)
Yep, I am. I'd send you a PM, but it does not work for me yet. Can somebody fix it? A mod, anybody... please.
 
That's true for R600 and derivatives, but since you have no data on R700 performance or transistor count, I think this statement is a bit of a guess, even if I think RV770 will not match a GTX260.
I think the R700 could nicely fit into the mammoth hole ($200) between the 280 & 260.
 
AFAIK, G80 duplicated the same instruction across the two SIMD arrays, somewhat limiting the threading flexibility -- what is the chance of GT200 breaking from this "tradition", or is there no need to do so?

The two SIMD arrays in G80 are independent.
 
We know there's gonna be an R700, we know GT200 will be power-hungry and we know that nVidia's architecture is more effective in terms of performance per transistor.
I'd rather argue in terms of performance per die area, which looks quite comparable to me - RV670 and G94 have a very similar die area (if you factor in that one is fabbed at 55nm and the other at 65nm). OK, maybe G94 has a slight edge, but it's hardly what I'd call a blow-out victory.
Now, CJ says GT200 has 1.4 billion transistors. GTX 260 is basically 75% of GTX 280. There are parts of the chip that don't scale down with the number of SPs/TUs/ROPs/whatever, so the GTX 260 has to use more than 75% of GT200's transistors (1.05 B). RV770 can't be big, and even if it had around 1 billion transistors - which is IMO far more than it actually has - it would still lose to GTX 260.
I agree that it seems unlikely that any RV770-based single-chip product will reach GTX 260 performance. Even ATI puts 4850 performance "only" at the same level as the 9800GTX, so assuming the 4870 is 30% or so faster, it would mean the GTX 260 only performs roughly 30% better than the 9800GTX. For such a monster chip (even with parts of it disabled), that sure would be disappointing.
R700 will likely be the most direct opponent to GTX 260, while GTX 280 will remain unmatched.
This one I don't see yet. GTX 260 might be faster than a single RV770 (and it should be - otherwise nvidia really has a big problem), but the difference to GTX 280 doesn't look big enough to me that two RV770s can't catch it - assuming the multichip RV770 scales well (depending on how that works out, there could well be situations where even a GTX 260 is faster, but that might be the exception rather than the norm...)
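To put rough numbers on that argument (every relative performance figure here is a placeholder assumption, not a benchmark):

```python
# How well would a dual-RV770 board have to scale to tie a GTX 280?
rv770 = 1.00    # single RV770, normalised (assumption)
gtx280 = 1.45   # assumption: GTX 280 ~45% ahead of one RV770

# Multi-GPU scaling efficiency needed for 2x RV770 to tie a GTX 280:
needed_scaling = gtx280 / (2 * rv770)
print(round(needed_scaling, 3))  # 0.725 -> ~72.5% scaling already ties it
```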
 
[Image: gtx200seriespricenk1.jpg - GTX 200 series price slide]

You want one :D
 
That's true for R600 and derivatives, but since you have no data on R700 performance or transistor count, I think this statement is a bit of a guess, even if I too think RV770XT will not match a GTX260.
Then we agree on this, don't we?
You are right; I just wanted to get out of you whether you are playing for both camps. :)
I am not *kicking* for any camp, although some people call me an ATi fan. (You sound like you haven't heard my name before, how could that be? :cry: )
mczak said:
I'd rather argue in terms of performance per die area, which looks quite comparable to me - RV670 and G94 have a very similar die area (if you factor in that one is fabbed at 55nm and the other at 65nm). OK, maybe G94 has a slight edge, but it's hardly what I'd call a blow-out victory.
If you don't mind my asking, what's wrong with comparing transistor counts?
 
According to someone on another forum, in a letter directed to shareholders nVidia says that the number of transistors in GT200 is ~1.2 billion, as opposed to the 1.4 billion reported by CJ? (Was it posted here already?)
 