Nvidia GT200b rumours and speculation thread

Discussion in 'Architecture and Products' started by nicolasb, Jul 11, 2008.

  1. igg

    igg
    Newcomer

    Joined:
    May 16, 2008
    Messages:
    63
    Likes Received:
    0
    I really hope we'll get some official information on this chip after the R700 performance previews (are they still scheduled for tomorrow?) are published.

    Maybe Nvidia wants to spoil the R700 launch like they did with the 4850's launch (9800 GTX+).
     
  2. neliz

    neliz GIGABYTE Man
    Veteran

    Joined:
    Mar 30, 2005
    Messages:
    4,904
    Likes Received:
    23
    Location:
    In the know
    Nvidia DX10.1 part? I don't think so...
     
  3. annihilator

    Newcomer

    Joined:
    May 27, 2008
    Messages:
    87
    Likes Received:
    0
    Location:
    Istanbul
    Wouldn't the card have to be 512bit if they enabled the remaining cluster?
     
  4. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    No, clusters aren't tied to the memory controllers.
     
  5. INKster

    Veteran

    Joined:
    Apr 30, 2006
    Messages:
    2,110
    Likes Received:
    30
    Location:
    Io, lava pit number 12
    Trini already answered, but if you want proof of that, look no further than any run-of-the-mill 8800 GT.
    It has a disabled cluster, yet retains the 256bit memory interface and 512MB GDDR3 capacity of its 8800 GTS 512MB, 9800 GTX/GTX+ and 9800 GX2 cousins.
     
  6. Blazkowicz

    Legend

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    Well, they can independently disable shader clusters or ROP partitions (tied to memory, 4 ROPs = 64-bit). A shader cluster is two multiprocessors (8 SPs + SFU + registers) on G8x/G9x and three on GT200, so 16 SPs and 24 SPs respectively.

    So with G92 you have 128 SP/256-bit (the full GPU), 112 SP/256-bit (8800 GT, 9800 GT) and 96 SP/192-bit (8800 GS).
    The GTX 280 is 240 SP/512-bit and the GTX 260 is 192 SP/448-bit, so they respectively have 10 and 8 shader clusters, and 8 and 7 ROP partitions.

    Just recalling those boring facts because there seems to be some confusion around those dreadful "clusters" :)
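    The cluster arithmetic above can be sketched quickly. This is just an illustration of the counts stated in the post (clusters, multiprocessors per cluster, 8 SPs per multiprocessor, 64-bit per ROP partition), not an official spec table:

    ```python
    # Sketch of the cluster/partition arithmetic from the post above.
    SP_PER_MP = 8  # scalar processors per multiprocessor

    def shader_count(clusters, mps_per_cluster):
        """Total SPs = clusters x multiprocessors per cluster x 8 SPs each."""
        return clusters * mps_per_cluster * SP_PER_MP

    def bus_width_bits(rop_partitions):
        """Each ROP partition (4 ROPs) brings its own 64-bit memory channel."""
        return rop_partitions * 64

    # G92 family: 2 multiprocessors per cluster
    print(shader_count(8, 2))                      # full G92: 128
    print(shader_count(7, 2))                      # 8800 GT / 9800 GT: 112
    print(shader_count(6, 2), bus_width_bits(3))   # 8800 GS: 96, 192-bit
    # GT200: 3 multiprocessors per cluster
    print(shader_count(10, 3), bus_width_bits(8))  # GTX 280: 240, 512-bit
    print(shader_count(8, 3), bus_width_bits(7))   # GTX 260: 192, 448-bit
    ```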
     
  7. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    Obviously some like to think that way, yes. Me, I'd rather believe Nvidia vastly underestimated RV770. :)
     
  8. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Can't say I blame them. ATI's engineering didn't have much to brag about with their previous DX10 efforts.

    NVidia may be in a tough spot now due to their commitment to CUDA. I think Jawed pointed this out, but it seems like they're stuck with 8-wide SIMDs. ATI basically has 16x5 SIMDs right now and there's no pressing need to go for better granularity. Even after 55nm scaling, the former is more than half the size of the latter, and despite increased utilization and clock speed, that's not even close to being small enough.

    I think computational speed is starting to matter less, though. Games are probably using a bit more math, but it's not increasing as fast as GPU ability. We'll see if GT300 has some innovations there.
     
  9. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    In terms of math capability I think Nvidia can keep up even with the more expensive 8-way SIMD approach.

    What I don't get is why GT200 seems to have a lot more supporting logic than RV770. For example, ALU+TEX on RV770 seems to be a larger percentage of the die than ALU+TEX on GT200, even with NVIO parceled out to a separate chip. Since a lot of the arbitration logic is part of the clusters, what is all the extra stuff on the GT200 die?
     
  10. Freak'n Big Panda

    Regular

    Joined:
    Sep 28, 2002
    Messages:
    898
    Likes Received:
    4
    Location:
    Waterloo Ontario
    Can you elaborate as to why you think NV is in a tough spot due to their commitment to CUDA?
     
  11. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    NVidia's connecting 10 clusters to 8 ROP partitions, whereas ATI's connecting 10 clusters to 4 MCs. The interconnection logic scales faster than either side being connected: it's a combinatorial explosion.

    The sheer quantity of memory bus pins is probably also a factor.

    Jawed
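    Jawed's scaling point can be shown with a toy count. This is my own illustration, treating the interconnect as a full crossbar so the link count is simply the product of the two sides (real wiring is more complicated, but the growth rate is the point):

    ```python
    # Toy illustration: a full crossbar between N sources and M sinks needs
    # N x M links, so the interconnect grows with the product of the two
    # sides, i.e. faster than either side alone.
    def crossbar_links(clusters, partitions):
        return clusters * partitions

    print(crossbar_links(10, 8))  # GT200-style: 10 clusters x 8 ROP partitions = 80
    print(crossbar_links(10, 4))  # RV770-style: 10 clusters x 4 MCs = 40
    # Doubling both sides quadruples the links:
    print(crossbar_links(20, 8) / crossbar_links(10, 4))  # 4.0
    ```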
     
  12. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    IMO they really don't want to move away from 8-wide SIMDs, as that is something they want to keep consistent in their GPU computing framework. ATI hasn't made any such commitment.

    Sure, but not at the same cost as ATI. NVidia has a long history of designing as optimally as possible for a given set of design constraints (NV3x aside). I'm pretty sure their current design can't get any smaller.

    I don't think you're right about that. The ALU space is ~25% on both, and while TEX space is about the same for NV's DX10 chips, the TEX area is a lot smaller than the ALUs for ATI. It looks like ~40% ALU+TEX on RV770, and 50% ALU+TEX on GT200 and G92/G80.

    It's not just the ALUs that are awesome in RV770, but what ATI can do with 40 seemingly small TMUs is quite impressive compared to the 64 TMUs in G92. XBit labs has some fairly texture-intensive shaders (see R580 vs. R520), but RV770 is still beating G92 in them (here).

    Anyway, in the "extra stuff" there's still a lot of arbitration logic to decide which workloads go to which cluster. There's still all the rasterization w/ Z-cull, which needs to feed the shaders twice as fast to take advantage of twice the ROPs in GT200. IMO, 50% non-shader space on GT200 doesn't seem out of place compared to 60% in RV770, all things considered.
     
    #52 Mintmaster, Jul 14, 2008
    Last edited by a moderator: Jul 14, 2008
  13. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    If only CUDA were not so close to the hardware and a little bit more abstract...
    They would probably have lost a bit of performance here and there, but they would not have found themselves in this situation.
    I guess a future CUDA revision is going to address this issue.
     
    #53 nAo, Jul 14, 2008
    Last edited: Jul 15, 2008
  14. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Yup, that's why I think they're in a tough spot. They have to choose between changing some CUDA fundamentals and letting ATI keep the ALU per-mm2 efficiency crown (which may not be so bad). NVidia would rather not have to do either.

    I'm sure they felt that they made the right choice when R600 was out, and still felt fine with RV670. Only with RV770 does this years-old decision look a bit restricting.
     
  15. Freak'n Big Panda

    Regular

    Joined:
    Sep 28, 2002
    Messages:
    898
    Likes Received:
    4
    Location:
    Waterloo Ontario
    Oh I see, thanks for elaborating. My bet is that they will retool CUDA so they can move away from the area-inefficient 8-wide SIMD design.
     
  16. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Well, remember that it does give them better branching granularity and dependent instruction throughput (though I think ATI can achieve the latter with minimal cost as well, as I've argued before), and I think they have the option to get even better granularity if they want.

    I don't think it's too useful right now, but it could be in the future, especially for non-graphics loads. I think if NVidia improves its texturing, memory controller, and AA performance, the areal math inefficiency may not matter for games.

    However, gaudy math numbers must look tempting to HPC customers too, and if AMD pushes FireStream hard, NVidia may have no choice but to do what you suggested.
     
  17. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    955
    Likes Received:
    52
    Location:
    LA, California
    Slight tangent: what do you guys think of the bulk synchronous parallel processing model? There's a paper from Microsoft Research at SIGGRAPH 08 on it: BSGP: Bulk-Synchronous GPU Programming; scroll down a bit for the paper.
     
  18. PeterT

    Regular

    Joined:
    May 14, 2002
    Messages:
    702
    Likes Received:
    14
    Location:
    Austria
    Thanks for the link, I just finished reading it. It looks like a very interesting model and implementation, and the fact that they developed a non-trivial application on it (the X3D parser) makes it a lot more credible.

    However, I think it's more than just a slight tangent from the topic of this thread. Perhaps this part should be split off to a separate thread in the GPGPU forum?
     
  19. nicolasb

    Regular

    Joined:
    Oct 21, 2006
    Messages:
    421
    Likes Received:
    4
    Fudo says GT200b will be here in September, maybe even August:

    http://www.fudzilla.com/index.php?option=com_content&task=view&id=8515&Itemid=1

     
  20. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    "We believe that Shaders and clock of 55nm GT200 are definitely going to be higher than the 65nm and the chip itself should be a bit cooler."

    That's one hell of a speculation. It definitely takes more than just a thorough understanding of semiconductors and the industry as a whole to figure that out. ;)
     