Beyond3D's GT200 GPU and Architecture Analysis

It's an FP format, s6e3 if I remember rightly (could be wrong there though, I'll check).
OK, it seems such a format really exists (still, the same would apply to the 10/10/10/2 int format: you'd need more-than-8-bit blend units for full speed).
And it looks like Tridam couldn't find full-speed FP10 or FP16...
Interesting. Looks like there are indeed twice as many blend units, but for everything other than 4x8int they need 2 cycles (or two need to be joined for 1 pixel, or whatever).
Btw, the AA numbers are also interesting - g80/g92/g200 have a very large drop with 8xAA. Why? I'd have expected half the z samples/s compared to 4xAA, but instead it's only about one third. The 3870 numbers are strange too, however: again I'd have expected half the z samples/s with 8xAA compared to 4xAA (with the architecture reportedly being able to handle four sub-samples per clock), and not the same number...
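To make the precision point concrete, here's a minimal decode sketch for a 10-bit float laid out as s6e3 (1 sign bit, 3 exponent bits, 6 mantissa bits). The bit layout, exponent bias and denormal handling are my own guesses for illustration, not anything documented:

```python
# Hypothetical s6e3 10-bit float: [sign | 3-bit exponent | 6-bit mantissa].
# Layout, bias and denormal handling are assumptions, not a documented format.
def decode_s6e3(bits: int) -> float:
    sign = -1.0 if (bits >> 9) & 0x1 else 1.0
    exp = (bits >> 6) & 0x7          # 3-bit exponent
    man = bits & 0x3F                # 6-bit mantissa
    bias = 3                         # assumed 2**(3-1) - 1, IEEE style
    if exp == 0:                     # denormal range
        return sign * (man / 64.0) * 2.0 ** (1 - bias)
    return sign * (1.0 + man / 64.0) * 2.0 ** (exp - bias)

print(decode_s6e3(0b0_011_000000))   # exponent == bias, zero mantissa -> 1.0
```

Whatever the exact layout, the point stands: both this and the 10/10/10/2 int format carry more than 8 bits per channel, so 8-bit-wide blend units can't handle them at full rate.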
 
I can't help but laugh at those that are underwhelmed by GTX 280. You're silly.
It's very similar to people being underwhelmed by R600 compared to R580. The GTX 280 has a lower clock than G92 and not as much of an increase in functional units as the transistor increase would allow.
 
I can't help but laugh at those that are underwhelmed by GTX 280. You're silly.

Ehh, being 50-90% faster isn't even up to Moore's Law, though, considering it's been 20.5 months since the 8800 Ultra.

I suppose the shock comes from the fact that last generation we had a 2.5-3x boost over the previous generation, with only 8 months between. Now we have a 20-month gap with only 50-90%.

I mean, it's nice, it is considerably faster, but for $650 and 20 months later I really expected more.
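For what it's worth, a quick back-of-envelope on that claim, taking the popular "performance doubles every 18-24 months" reading of Moore's Law as the assumption (it's really about transistor counts, of course):

```python
# Back-of-envelope: expected speedup over 20.5 months if performance
# doubled every 18 or 24 months (a loose reading of Moore's Law).
months = 20.5
for doubling_period in (18.0, 24.0):
    factor = 2.0 ** (months / doubling_period)
    print(f"doubling every {doubling_period:.0f} months -> {factor:.2f}x")
# ~2.2x (18-month doubling) or ~1.8x (24-month doubling),
# versus the observed ~1.5-1.9x over the 8800 Ultra.
```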
 
I view the G8x to G200 transition as akin to the NV4x to G7x transition.
 
nice article, but...

20th or 23rd of June: GT200/G80 Architecture Deep-Dive

I'm really waiting for this one :smile:

Up to this point, the GT200 seems to be a good product which falls short of being great and far short of the hype that was generated around it. The chip is huge and extremely hungry, and it's also something like 6 months late - but it's still 1.4bn transistors in a single chip, and by far the fastest graphics core ever produced.

what I wonder about is:
- are the 2.5x as many threads sufficient to solve some of the scheduling difficulties the G80 architecture seemed to have? (The gaming tests seem to suggest "not really" as yet, but let's wait a few driver revisions.)
- will it be capable of running more complex GS codes without doubling over and dying? see the Global Illumination demo...
- is there a technical reason behind the lack of DX10.1 support, or just the obvious political one?
 
I can't help but laugh at those that are underwhelmed by GTX 280. You're silly.
Really? I dunno, I kind of expected more. Maybe unrealistically...

The point is indeed that G80 not only brought with it a huge speed boost, but also a transition to unified shaders, scalar pipelines, fast DB (dynamic branching), orthogonal texture blending, filtering and MSAA, etc., all of which had both performance and algorithmic implications.

GT200 brings what over G8x/G9x:

1) More performance... but arguably nothing you couldn't already get in the form of the 9800GX2
2) Faster geometry shader... if I cared about the thing, maybe
3) Doubles... cool but not exposed for graphics and not breaking any speed records for HPC
4) ...
5) More chip, more power, more heat, more cost...

Don't get me wrong, I still may pick one up at some point, but I tend to agree with Anand's conclusion that at $650 it's a tough sell compared to the 9800GX2, which is not only $150 cheaper but also faster. Now I'm not a big multi-GPU guy, but with numbers like these, and arguably more explicit software programmability of multi-GPUs coming, maybe this is the generation that opinion changes...

Anyways I'm looking forward to seeing what ATI brings to the game this time around. Seems like they'll maybe have a nice mixed bag of performance *and* feature updates that might just be more tempting for a software guy like me :) That said, if they still haven't added an "aspect scaling" option to their drivers yet for LCDs that'll still be a show-stopper ;)

But give me a day... maybe the 280 will grow on me. Or maybe it'll go down to like $400 or less soon.
 
The big problem with Nvidia's line-up for me right now is there's no cool, quiet single GPU card that can consistently give me the performance I want at my native 1680x1050 resolution with 4xAA.

GTX260 might have the performance, but something about cut-down cards turns me off now (I currently have a 640MB GTS), plus it doesn't have that nice round teraflop number :) IMO Nvidia has left AMD a big opening to come in and grab major mind/market share with RV770.
 
Anyways I'm looking forward to seeing what ATI brings to the game this time around. Seems like they'll maybe have a nice mixed bag of performance *and* feature updates that might just be more tempting for a software guy like me :)
Features? Performance and/or driver updates yeah, but features, really?

That said, if they still haven't added an "aspect scaling" option to their drivers yet for LCDs that'll still be a show-stopper ;)
Been in the drivers for a few months I believe, worth googling...

Jawed
 
Based on a comparison with the previous generation, one can roughly estimate that around one quarter of the transistor count was invested into GP-functionality.

Most people are encoding huge videos maybe once a month, and I can hardly see a motivation to invest a single euro into gaining a few minutes of encoding time.
And physics on the GPU is not even worth mentioning until there is an industry standard for it. There is plenty of CPU time available for this anyway.

So this development is definitely not in the interest of the end-user.
 
Still said:
Based on a comparison with the previous generation, one can roughly estimate that around one quarter of the transistor count was invested into GP-functionality.
Uhhhh, all the SMs combined (ALUs+Schedulers) is barely more than 25%. Are you suggesting NVIDIA should have created a GPU without a single ALU in its shader core? :p
trinibwoy said:
GTX260 might have the performance, but something about cut-down cards turns me off now (I currently have a 640MB GTS), plus it doesn't have that nice round teraflop number :) IMO Nvidia has left AMD a big opening to come in and grab major mind/market share with RV770.
While I agree the GTX260 might not be the Perfect GPU, I find that reasoning to be extremely bizarre, to say the least. Care to clarify, perhaps? :)
 
While I agree the GTX260 might not be the Perfect GPU, I find that reasoning to be extremely bizarre, to say the least. Care to clarify, perhaps? :)

Who said anything about reason? :p

I just don't want to feel like I have a hobbled chip in my machine. My GTS was my first card like that and it always felt like I was missing out. So it's either a whole GT200 or a whole RV770 for me :)
 
Who said anything about reason? :p

I just don't want to feel like I have a hobbled chip in my machine. My GTS was my first card like that and it always felt like I was missing out. So it's either a whole GT200 or a whole RV770 for me :)
Heh, okay. That's a bit like saying you're missing out on an 8800GT because one cluster is disabled, so you'd rather buy a 9600GT. But meh, guess if you care about bragging rights it is a disadvantage... :)
 
Uhhhh, all the SMs combined (ALUs+Schedulers) is barely more than 25%. Are you suggesting NVIDIA should have created a GPU without a single ALU in its shader core? :p
From what I could see in the NV presentation, around 50% is spent on the shader processors and the thread scheduler. A big part of the memory controllers has to be added to this count as well.

And even without that, you can work this out by comparing with the G80/G9x cores.
 
Uhhhh, all the SMs combined (ALUs+Schedulers) is barely more than 25%. Are you suggesting NVIDIA should have created a GPU without a single ALU in its shader core? :p
He's talking about the difference between GT200 and a scaled up G92/G80. Various structural units must have grown in size, and the only real explanation offered is a larger register file and features for GPGPU.
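Just to illustrate the kind of comparison being made, here's a rough sketch of a "scaled-up G92" estimate. The unit-count ratios and the ~754M/1.4B transistor figures are the published ones, but the split of G92's budget between unit types is a pure guess on my part, so treat the output as an illustration of the method rather than a measurement:

```python
# Rough "scaled-up G92" estimate. Unit ratios are the published counts
# (SPs 128 -> 240, TMUs 64 -> 80, ROPs 16 -> 32, 256-bit -> 512-bit MC);
# the G92 area split between unit types is a made-up assumption.
G92_TRANSISTORS = 754e6
GT200_TRANSISTORS = 1.4e9

# Hypothetical share of G92's budget per block, and how much GT200 scales it.
blocks = {
    #                  (assumed share of G92, GT200/G92 scaling factor)
    "shader core":     (0.35, 240 / 128),
    "TMUs":            (0.25, 80 / 64),
    "ROPs + MC":       (0.25, 32 / 16),
    "everything else": (0.15, 1.0),
}

scaled = sum(G92_TRANSISTORS * share * ratio for share, ratio in blocks.values())
print(f"scaled-up G92 estimate: {scaled / 1e6:.0f}M transistors")
print(f"unexplained by scaling: {(GT200_TRANSISTORS - scaled) / 1e6:.0f}M "
      f"({(GT200_TRANSISTORS - scaled) / GT200_TRANSISTORS:.0%} of GT200)")
```

With those (invented) weights, a decent chunk of the budget is left over for the bigger register file, the DP units and the rest of the GPGPU plumbing - but shuffle the weights around and the number moves a lot, which is probably why nobody outside NVIDIA can pin it down.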
 
Heh, okay. That's a bit like saying you're missing out on an 8800GT because one cluster is disabled, so you'd rather buy a 9600GT. But meh, guess if you care about bragging rights it is a disadvantage... :)

Nah, it's nothing like that. The G80 GTS is a good 40-50% slower than the GTX, so there's a real sacrifice being made. Given equal performance I would prefer a fully enabled chip over a "yield enhancing" SKU, though. Of course, it's all pretty superficial, but I think you're allowed to indulge in those frivolities as a consumer.

I get the impression that GT200 could make use of a lot more ALU horsepower, so hopefully all the recent ROP/TMU enhancements last a while and Nvidia focuses more of the upcoming transistor budget on upping the flops.
 
He's talking about the difference between GT200 and a scaled up G92/G80. Various structural units must have grown in size, and the only real explanation offered is a larger register file and features for GPGPU.
Well, one part of this is certainly the DP units. Maybe they aren't that big (given their low number and probably little overhead for managing them, since apparently they share all the register/issue etc. resources), but for games these transistors are obviously a complete waste, doing absolutely nothing but consuming (hopefully only a little, thanks to clock gating) power.
The register file increase might also be targeted more towards GPGPU - claiming a ~10% increase in 3DMark due to this is nice, but I'm not sure the increase in transistors/die size/power usage would be worth it (well, I've no idea of the increase in these areas in absolute figures really, but is it below 10%?)
 