ELSA hints GT206 and GT212

Discussion in 'Architecture and Products' started by AnarchX, Sep 9, 2008.

  1. Novum

    Regular

    Joined:
    Jun 28, 2006
    Messages:
    335
    Likes Received:
    8
    Location:
    Germany
    Nope. The TMUs are right next to the ALUs and should be much larger. The layout of the TMUs of this chip must be irregular.

    GT200:
    [​IMG]

    red = Vec8
    green = octo TMU.

    3xVec8 + octo TMU = TPC.

    What you have marked as ROPs+TMUs is most likely the GDDR5 interface.
     
    #741 Novum, Jun 30, 2009
    Last edited by a moderator: Jun 30, 2009
  2. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Hmm, I've been told that NVidia's specifications page is wrong and it's 24.

    So that would appear to indicate the entire line-up is based upon 3 multiprocessors with a pair of quad TMUs per cluster.

    So TMUs appear to be:
    • GT218 - 8
    • GT214 - 16
    • GT215 - 32
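A rough sketch of the arithmetic behind that list, for anyone who wants to check it. The cluster counts are an assumption, back-derived from the TMU totals rather than taken from any specification, and the chip labels are simply the ones used in the list.

```python
# Per-cluster layout as described above: 3 multiprocessors plus a pair of quad TMUs.
MULTIPROCESSORS_PER_CLUSTER = 3
TMUS_PER_CLUSTER = 2 * 4  # a pair of quad TMUs

# Assumption: cluster counts inferred from the TMU totals in the list,
# not from an official specification.
CLUSTERS = {"GT218": 1, "GT214": 2, "GT215": 4}


def tmu_count(clusters: int) -> int:
    """TMUs for a chip built from `clusters` identical clusters."""
    return clusters * TMUS_PER_CLUSTER


def multiprocessor_count(clusters: int) -> int:
    """Multiprocessors for a chip built from `clusters` identical clusters."""
    return clusters * MULTIPROCESSORS_PER_CLUSTER


tmus = {chip: tmu_count(n) for chip, n in CLUSTERS.items()}
multiprocessors = {chip: multiprocessor_count(n) for chip, n in CLUSTERS.items()}

for chip in CLUSTERS:
    print(f"{chip}: {multiprocessors[chip]} multiprocessors, {tmus[chip]} TMUs")
```

If the real cluster counts differ, only the `CLUSTERS` table needs changing.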
    Jawed
     
  3. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    I know it's this way with GT200 and older chips. GT21x are a new breed and 'til now, I fail to identify the TMU area(s) on those GPUs.
     
  4. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    What you've marked doesn't add up to an entire cluster. Could be general control or it could be TMU. Dunno.

    Also, what's interesting is that in the GT215 die shot the clusters appear to contain much less logic than GT200 (the ratio of area for "ALUs" to "TMUs" is wildly different comparing the two) - implying that the layout of GT215 doesn't have clusters as single contiguous units.

    Either that or there are far fewer TMUs. Or that scaling to 40nm has been wildly non-linear depending upon unit :???:

    The scaling of the ALUs, for what it's worth, appears to be ~2x, from 65nm GT200 to 40nm GT215. One "ALU" in GT200 is 0.654mm² and the same unit in GT215 is 0.323mm².
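For reference, a quick sketch of that ratio next to the ideal area shrink. The ideal figure assumes area scales with the square of the node name, which is only a rule of thumb and not something measured from the die shots.

```python
# Measured "ALU" areas quoted above, in mm^2.
GT200_ALU_MM2 = 0.654  # 65nm GT200
GT215_ALU_MM2 = 0.323  # 40nm GT215

# Observed area scaling between the two chips.
measured_scaling = GT200_ALU_MM2 / GT215_ALU_MM2

# Assumption: an ideal shrink scales area with the square of the feature size.
ideal_scaling = (65 / 40) ** 2

# How much of the ideal shrink the ALUs appear to have achieved.
fraction_of_ideal = measured_scaling / ideal_scaling

print(f"measured scaling:  {measured_scaling:.2f}x")
print(f"ideal scaling:     {ideal_scaling:.2f}x")
print(f"fraction of ideal: {fraction_of_ideal:.0%}")
```

So the ~2x figure sits noticeably below the roughly 2.6x that a perfect 65nm to 40nm shrink would give under that assumption.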

    Jawed
     
  5. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,490
    Likes Received:
    400
    Location:
    Varna, Bulgaria
    There are four similarly structured rectangular blocks, situated between the pairs of TPCs distinguishable in the die shot -- those could be texturing hardware, being just the samplers, mapping units or even both (too small for eight TMU quads, anyway... duh!). :???:
     
  6. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,430
    Likes Received:
    434
    Location:
    New York
    What's GT214? Isn't it GT216?
     
  7. RussSchultz

    RussSchultz Professional Malcontent
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,855
    Likes Received:
    55
    Location:
    HTTP 404
    It's oddly marked, that's for sure, but there are obviously 10x(3x+1) instances.

    For the die shot linked to, I don't believe the areas marked 'SIMD' should cover the area that they do. Each SIMD block does seem to represent 3x of something, but the piece attached to it (which I'm saying shouldn't be part of it) isn't a duplicate on each of the different blocks. It might be a routing issue that's making them look different (and they're only instanced on lower metal layers), but I kinda doubt that.

    I don't think that what that person has labeled as the same thing on the lower and left hand edges are actually the same thing.

    What I see is
    4x(3x) --what's marked SIMD
    8x --what's marked octo-dunnos
    8x --what's marked QTU on the left
    4x --what's marked QROP on the left
    8x --what's marked QTU on the bottom
    4x --what's marked QROP on the bottom

    I'd gather that there are 4 functional units, each composed of:
    3x something (SIMD)
    2x something (QROP on the left)
    2x something (QROP on the bottom)
    2x something (OCTO on the top)
    1x something (QTU on the left)
    1x something (QTU on the bottom)
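A small sketch of the division behind that grouping: it takes the instance totals counted above and splits them evenly across the 4 functional units. The dictionary keys are just shorthand for the labels on the marked-up die shot.

```python
FUNCTIONAL_UNITS = 4

# Instance totals as counted from the marked-up die shot above.
totals = {
    "SIMD": 4 * 3,
    "octo": 8,
    "QTU (left)": 8,
    "QROP (left)": 4,
    "QTU (bottom)": 8,
    "QROP (bottom)": 4,
}

per_unit = {}
for label, total in totals.items():
    count, remainder = divmod(total, FUNCTIONAL_UNITS)
    # Every total should split evenly if the blocks really are per-unit instances.
    assert remainder == 0, f"{label} does not divide evenly"
    per_unit[label] = count

for label, count in per_unit.items():
    print(f"{count}x {label}")
```

Going purely by the totals, this gives 2x for each QTU group and 1x for each QROP group per unit, the reverse of the per-unit list, so the QTU and QROP labels may be swapped in one of the two lists.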
     
  8. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    :oops: yep!

    Jawed
     
  9. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Agreed.

    Appears to be PCI Express
    IO connections for GDDR, with what's labelled QROP actually prolly corresponding with command bus with the remainder being data bus.

    Jawed
     
  10. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,490
    Likes Received:
    400
    Location:
    Varna, Bulgaria
  11. RussSchultz

    RussSchultz Professional Malcontent
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,855
    Likes Received:
    55
    Location:
    HTTP 404
    How sure are you about that?

    Those seem awfully busy and large to be pads and drivers for 4 pins for each square.
     
  12. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,490
    Likes Received:
    400
    Location:
    Varna, Bulgaria
    LOL, you obviously haven't seen what a truly large pad array looks like. :lol:
     
  13. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
  14. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    That shot doesn't distinguish at all between SIMD-control and TU logic - it's all "Texture", whereas the GT21x shot at least shows some additional logic besides the actual ALUs in the SIMD parts of the die - but just not enough to make me believe that TUs are still incorporated.

    I'm not saying fellix is wrong and I am right here, but I've yet to see convincing evidence for either position.

    And take into account that for DX10.1 Nvidia would have to overhaul their TMUs either way - and maybe they tried to get away with less space, maybe by combining some of the logic for accessing memory, which is replicated in both TMUs and ROPs.

    I could imagine you can get away with less space when routing a dual-lane path (1 for ROP use, 1 for TU use) to memory, compared to having to do the individual routing from two far-apart parts of the die (I guess that's the principle of highways or autobahns, too).
     
  15. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,490
    Likes Received:
    400
    Location:
    Varna, Bulgaria
    Just to make my statement more figurative (the red outline):

    [​IMG]
     
  16. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Try this, too:

    http://pc.watch.impress.co.jp/docs/2008/0617/kaigai_16l.gif

    Even though it, too, doesn't make the distinctions you require.

    I agree there should be some control stuff per cluster and I've no idea of the extent of the SIMD-specific stuff (i.e. 3x MAD-8, MI-2 and DP-1).

    Yes, they definitely have to do extra things (e.g. gather). We still don't even know how many TMUs there are. For all we know there's only 16 of them :razz:

    Yes, to a degree "repeater islands" across the die imply that routing will agglomerate. The routes themselves don't take space since they are in metal layers under the logic layer.

    Jawed
     
  17. Novum

    Regular

    Joined:
    Jun 28, 2006
    Messages:
    335
    Likes Received:
    8
    Location:
    Germany
    What's missing? NVIDIA marks it the same.

    Hrm that could explain it. But then that NVIDIA picture is wrong:

    [​IMG]

    There should be more "random" logic that is not texture that belongs to the ALUs.
     
  18. RussSchultz

    RussSchultz Professional Malcontent
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,855
    Likes Received:
    55
    Location:
    HTTP 404
    The red squares don't look to be instances. The contents look similar, but they don't look like instances.

    And the blue squares seem to be too big. (the areas closer to the center line do not match across instances)
     
  19. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,490
    Likes Received:
    400
    Location:
    Varna, Bulgaria
    I think we've already concluded here that irregularities between similar block instances are due to employing fully automatic design and tuning for the selected logic circuits.
     
  20. RussSchultz

    RussSchultz Professional Malcontent
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,855
    Likes Received:
    55
    Location:
    HTTP 404
    What you're pointing to doesn't look like a sea of gates, either (i.e. the product of automatic place and route).

    I mean, I guess it just doesn't make sense to me to 'halfway instance' something in a way that looks close to the same, but not quite.

    Usually instancing is either plopping hard macros down, or just letting the auto place and route do its thing and ending up with a sea of gates.

    Of course, I only tangentially work in the back end of chip design, so I just might not be familiar with the technique.
     