The Official 9600GT Reviews Thread

Discussion in '3D Hardware, Software & Output Devices' started by Unknown Soldier, Feb 20, 2008.

  1. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
    ROPs maybe too?
    I'll agree though this really is surprising. Most people (I'm guilty too) assumed that G92 SKUs are mostly limited by shader ALUs (certainly not texture mapping...), though that they are limited by memory bandwidth quite a bit too isn't exactly new either. Still, the amount of non-scaling between 64 shader units and 128 units is astonishing.
    OTOH if you think of 9600GT as twice the 8600GTS, the scaling is VERY good, even approaching superlinear...
    btw isn't it possible to disable some more shader clusters on a 8800GT? Would imho be an interesting experiment to see at which point performance really suffers.
    For the 9600GT though I'm reasonably confident that its performance is indeed limited by the shader ALUs :).
     
  2. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    17,884
    Likes Received:
    5,334
    another noob question: i have also seen it said that the ati card actually has only 64 shaders but for some reason you multiply that by 5 (because the shaders are vec5) and you get 320

    quote: " The Radeon HD 2000 series supports Superscalar marchitecture and it means that with Vec5 or Superscalar, ATI can process 5 scalar instructions per clock. It has 64 Unified Shaders and by multiplying these two numbers you end up with the amazing number of 320 Stream processors."

    but the g80 is vec4 so why dont we multiply its shaders by 4 ?
     
  3. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    That one's easy. It isn't vec4. :)
     
  4. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    17,884
    Likes Received:
    5,334
    its not ?
    the next line from that quote:
    "The Nvidia G80 GTX can only handle Vec4 instruction in a scalable way, or four independent instructions,"

    or does that mean the g80 has to do vec4 as 4 instructions while the ati card can do vec5 in 1 instruction ? (then why dont they say the g80 can do vec5 as 5 instructions)
     
  5. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
    I dunno where you got that quote from, but there's nothing Vec4 in that architecture. If you have a vec4 shader instruction, this will be decomposed into 4 scalar ones.
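    To make the decomposition concrete, here is a minimal sketch in Python of a vec4 multiply-add being issued as four independent scalar multiply-adds. The names `scalar_madd` and `decompose_vec4_madd` are made up for illustration. This only models the arithmetic, not how the hardware actually schedules the work.

```python
from typing import List, Tuple

Vec4 = Tuple[float, float, float, float]


def scalar_madd(a: float, b: float, c: float) -> float:
    """One scalar multiply-add: a * b + c."""
    return a * b + c


def decompose_vec4_madd(a: Vec4, b: Vec4, c: Vec4) -> List[float]:
    """Model a vec4 MADD as four independent scalar MADDs, one per component."""
    return [scalar_madd(a[i], b[i], c[i]) for i in range(4)]


# Hypothetical operands, chosen only to make the result easy to check by hand.
result = decompose_vec4_madd(
    (1.0, 2.0, 3.0, 4.0),
    (2.0, 2.0, 2.0, 2.0),
    (0.5, 0.5, 0.5, 0.5),
)
print(result)  # [2.5, 4.5, 6.5, 8.5]
```

    The point is that the vector instruction costs four scalar issue slots instead of one wide one, so nothing about the unit itself is "vec4".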
     
  6. Twinkie

    Regular

    Joined:
    Oct 22, 2006
    Messages:
    386
    Likes Received:
    5
    That one's easy too.

    Since G80 has scalar ALUs, they can only handle vec4 instructions in a scalable way, i.e. as mczak explains, by decomposing them into 4 scalar instructions.

    A 64SP 8800GT vs 9600GT at the same clocks would be quite interesting.
     
  7. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    17,884
    Likes Received:
    5,334
    actually i got it from everyone's favourite site fudzilla :D

    ps: i get it now, it's not vec4 (nor does the quote say it is) thanks twinkie

    so next noob question: the ati shaders can do 5 instructions per clock as opposed to a single instruction per clock on nvidia cards - have i got it right now ?
     
  8. Rangers

    Legend

    Joined:
    Aug 4, 2006
    Messages:
    12,791
    Likes Received:
    1,596

    Well a 8800GT has 112 shaders not 128 (not sure if that's what you're referring to). But most benchmarks compared the 9600GT to 8800GT, and I remember 8800GT wasn't much slower than 8800GTX. But also the 9600GTs are clocked a bit higher. I worked it out and I think at stock for both 9600GT should have like 69% the shader power, or ~70% of 8800GT. It's not really "almost double" for 8800GT. Seen that way it's slightly easier to grasp how the shader deficit might be not such a factor. 9600GT has 70% the shader power and is slightly faster in some other areas due to the higher core clock.
     
  9. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
    That's basically correct, yes. Though rv670 units aren't symmetric (4 "simple" MADD-capable units + 1 which can do more) and it's actually possible to co-issue a MUL alongside a MADD on G8x/G9x (so it's not really a "true" scalar unit I guess), but at least in terms of MADD instructions it's true.
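    For anyone following the "multiply by 5" arithmetic, here is a rough sketch in Python of how the per-clock MADD count falls out of units times lanes. The function name `madd_lanes_per_clock` is made up for illustration. The 64 x 5 figure is the one from the quote earlier in the thread, and the 9600GT is treated as 64 one-wide scalar units. It deliberately ignores the asymmetric fifth unit and the co-issued MUL, and it says nothing about clock speed.

```python
def madd_lanes_per_clock(units: int, lanes_per_unit: int) -> int:
    """Peak MADD instructions issued per clock: units times lanes per unit."""
    return units * lanes_per_unit


# ATI side: 64 units, each 5 wide, as in the quoted "64 x 5 = 320" marketing figure.
ati_lanes = madd_lanes_per_clock(units=64, lanes_per_unit=5)

# 9600GT side: 64 scalar units, one MADD each per clock (co-issued MUL ignored).
g94_lanes = madd_lanes_per_clock(units=64, lanes_per_unit=1)

print(ati_lanes, g94_lanes)  # 320 64
```

    Lane counts on their own are only half the story, since the two vendors run their shader ALUs at very different clocks.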
     
  10. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
    Hmm my quick calculation says it only has 62% of the shader power (comparing non-overclocked versions). That's definitely more than half (compared to the 8800GTS 512 it would indeed be exactly half), true, but not that much more either... And yes the higher core clock might provide some benefits (it's only 9% more), though I remain sceptical about this. Still, if the 8800GT had been shader limited quite significantly then the 9600GT would have no way to get that close in performance no matter what.
    To summarize, with 62% the shader power it achieves roughly 80% or so of the performance of the 8800GT overall - without AA. Once you enable 4xAA (and 16xAF) it's more like 90% (but of course it's expected it fares even better in that situation due to the 8800GT getting even more memory bandwidth limited). So there certainly is some benefit of having more shader units - it's just not that much...
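    The 62% and "exactly half" figures can be reproduced with a quick Python sketch that multiplies shader units by shader clock. The 64 units at 1625 MHz for the 9600GT and the 112 units for the 8800GT come from this thread. The 1500 MHz shader clock for the 8800GT and the 128 units at 1625 MHz for the 8800GTS 512 are assumed reference clocks that the thread does not state, so treat them as assumptions. `shader_throughput` is a made-up helper name.

```python
def shader_throughput(shader_units: int, shader_clock_mhz: int) -> int:
    """Relative shader throughput as units times shader clock (arbitrary units)."""
    return shader_units * shader_clock_mhz


# (shader units, shader clock in MHz); the 8800GT clock and the
# 8800GTS 512 figures are assumed reference specs, not taken from the thread.
cards = {
    "9600GT": (64, 1625),
    "8800GT": (112, 1500),
    "8800GTS 512": (128, 1625),
}

throughput = {name: shader_throughput(*spec) for name, spec in cards.items()}

ratio_vs_8800gt = throughput["9600GT"] / throughput["8800GT"]
ratio_vs_gts512 = throughput["9600GT"] / throughput["8800GTS 512"]

print(f"9600GT vs 8800GT:      {ratio_vs_8800gt:.0%}")   # 62%
print(f"9600GT vs 8800GTS 512: {ratio_vs_gts512:.0%}")   # 50%
```

    Partner cards with factory overclocks would shift these ratios a little, which is why the comparison is restricted to stock clocks.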
     
  11. mito

    mito beyond noob
    Veteran

    Joined:
    Jun 16, 2004
    Messages:
    1,392
    Likes Received:
    10
    Location:
    Feindesland
    I'm not sure if this is right place to ask this simple question:

    Why isn't nVidia releasing a high-end 9800gtx card like it did with previous models?

    The most expensive card today is still the relatively old 8800ultra.
     
  12. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    17,884
    Likes Received:
    5,334
    maybe they want to get rid of more 8 series chips first
     
  13. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
    Actually, another interesting comparison would be 9600GT to 8800GTS (old). Those cards are very close in raw numbers - the 8800GTS has a ~10% advantage in shader throughput and memory bandwidth (though 9600GT might compensate for the latter with its supposedly better compression), and a very small 4% deficit in ROPs. Yet, the 9600GT seems to be a bit faster typically (not by much, but based on the raw numbers you'd expect it to be a bit slower).
     
  14. Unknown Soldier

    Veteran

    Joined:
    Jul 28, 2002
    Messages:
    4,047
    Likes Received:
    1,670
    Why? The 9600's shader is faster, the memory is faster and the core is also faster.

    GTS(old)
    Core - 513
    Memory - 793(1586)
    Shader - 1188

    9600
    Core - 650
    Memory - 900(1800)
    Shader - 1625

    It's interesting to note that my ori. GTS gets quite a boost once I raise the shaders, core and memory .. so this would suggest that the 9600 is sitting pretty at its default settings. I hear it can be o/c'd though, so you can expect the 9600 to do quite well'ish.

    US
     
  15. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
    Clock speeds alone are irrelevant. What matters here is clock * nr of units. The old GTS has an advantage in shader throughput (96*1200 vs. 64*1625 - 10% advantage for GTS), memory bandwidth (800*320 vs. 900*256 - 10% advantage for GTS) and a very slight deficit in ROP capability (20*500 vs. 16*650). True, core clock is used for other things, and the 9600GT thus has an obvious advantage in texture address capability, though still less filtering capability (32*650 vs. 24*500 TA, 9600GT 73% more, and 32*650 vs. 48*500 bilerps, 8800GTS still 15% more), but neither one really looks texture limited on paper so I doubt it really matters.
    But yes the 9600GT will be faster in things like primitive assembly etc. which run at core clock (unless those things have been scaled back from G80 to G94), which is exactly why it could be faster...
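    The clock * units comparison above is easy to check with a short Python sketch. All unit counts and clocks are the rounded ones quoted in the post, using a 500 MHz core clock for the old GTS throughout, which is the value that reproduces the 73% texture address figure. `advantage_pct` and the dictionary layout are made up for illustration.

```python
def advantage_pct(leader: int, trailer: int) -> float:
    """Percentage by which `leader` exceeds `trailer`."""
    return (leader / trailer - 1.0) * 100.0


# Each entry: units * clock (MHz), or bus width (bits) * memory clock (MHz).
gts_old = {
    "shader": 96 * 1200,
    "memory": 800 * 320,
    "rop": 20 * 500,
    "tex_address": 24 * 500,
    "tex_filter": 48 * 500,
}
gt_9600 = {
    "shader": 64 * 1625,
    "memory": 900 * 256,
    "rop": 16 * 650,
    "tex_address": 32 * 650,
    "tex_filter": 32 * 650,
}

for key in gts_old:
    # Report whichever card leads on this metric, and by how much.
    if gts_old[key] >= gt_9600[key]:
        print(f"{key:12s} old GTS ahead by {advantage_pct(gts_old[key], gt_9600[key]):.1f}%")
    else:
        print(f"{key:12s} 9600GT ahead by {advantage_pct(gt_9600[key], gts_old[key]):.1f}%")
```

    With these inputs the old GTS leads on shader throughput, memory bandwidth and filtering, while the 9600GT leads on ROPs and texture addressing, which matches the rounded percentages in the post.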
     
  16. PsychoZA

    Newcomer

    Joined:
    Mar 1, 2007
    Messages:
    75
    Likes Received:
    0
    Now for Easter I want a 'G92' with 24 ROPs and GDDR4. Thanks.
     
  17. Blazkowicz

    Legend

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    sadly you won't get a non-existent GPU. (I want a 3dfx Rampage and it does exist!)

    I might get a 9600GT but the problem I have is it's overkill for the games I have and I'd need to buy a HDD to store newer, bigger games, let alone buying some of them :oops:
     
    #57 Blazkowicz, Feb 25, 2008
    Last edited by a moderator: Feb 25, 2008
  18. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
  19. mito

    mito beyond noob
    Veteran

    Joined:
    Jun 16, 2004
    Messages:
    1,392
    Likes Received:
    10
    Location:
    Feindesland
  20. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
    Huh where? In none of the graphs is a 9600GT really even close to a 8800 ultra (that's really a bit too much to ask of this chip!). Now, two of them is a different story...
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.