AMD: R9xx Speculation

Discussion in 'Architecture and Products' started by Lukfi, Oct 5, 2009.

  1. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
  2. no-X

    Veteran

    Joined:
    May 28, 2005
    Messages:
    2,455
    Likes Received:
    471
    Maybe this change is somewhat related to the patent that Jawed linked some time ago(?)
     
  3. SimBy

    Regular

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
    2 poly/clock?
     
  4. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,541
    Likes Received:
    964
    Well it would fit 24 SIMDs with 80 SPs each… but obviously that contradicts the slide itself. Weird. So either the TMUs are decoupled or it's yet another fake.
     
  5. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    Hmm, so this means ALU:TEX is 5:1 (in terms of cycles) rather than 4:1 as it has been for years now. So perhaps there's something in those patent applications that I've linked several times :razz:

    I expect this will be fine for games, 80 TMUs in Cypress seem to be wasted anyway.

    Compute applications which depend on L1->ALU bandwidth might be a bit constrained. Though there's always the possibility that TEX->ALUs could be beefed up. If, as one of the patent applications seems to suggest, ALUs can write to the L1s, then that'll be more interesting...

    2 polys per clock is definitely what we want to see.

    After Barts revealed that 16 ROPs ~ 32 ROPs as far as performance goes, I think it's reasonable to expect Cayman to be significantly more bandwidth efficient, and for 32 Cayman ROPs to be worth significantly more than 32 Cypress ROPs.

    I can't see anything here that looks faked, and I'm cautiously optimistic it'll work out well...


    One possible arrangement?:
    • 30 SIMDs - each 16 ALUs with 64 ALU lanes
    • 12 octo-TMUs - totalling 96 TMUs
    • Each set of 10 SIMDs has 4 octo-TMUs
    Or?:
    • 30 SIMDs - each 16 ALUs with 64 ALU lanes
    • 12 octo-TMUs - totalling 96 TMUs
    • Each set of 15 SIMDs has 6 octo-TMUs
    I dare say the latter accords with 2 polys per clock.
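The unit counts in the posts above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: the Cypress figures (20 SIMDs of 16 ALUs, 80 TMUs) are the shipping HD 5870 configuration, while every Cayman figure is the speculated one from this thread. The helper names `alu_tex_ratio` and `total_octo_tmus` are made up for illustration.

```python
# Back-of-the-envelope check of the speculated unit counts.
# Cayman numbers are thread speculation, not confirmed specs.

def alu_tex_ratio(simds, alus_per_simd, tmus):
    """ALUs per TMU, i.e. the ALU:TEX ratio in terms of issue cycles."""
    return simds * alus_per_simd / tmus

def total_octo_tmus(simds, simds_per_set, octo_tmus_per_set):
    """Total octo-TMUs when SIMDs are grouped into equal sets."""
    sets, remainder = divmod(simds, simds_per_set)
    if remainder:
        raise ValueError("SIMDs do not divide evenly into sets")
    return sets * octo_tmus_per_set

# Cypress: 20 SIMDs x 16 ALUs, 80 TMUs
cypress_ratio = alu_tex_ratio(simds=20, alus_per_simd=16, tmus=80)

# Speculated Cayman: 30 SIMDs x 16 ALUs, 12 octo-TMUs (8 TMUs each)
cayman_tmus = 12 * 8
cayman_ratio = alu_tex_ratio(simds=30, alus_per_simd=16, tmus=cayman_tmus)

# Lane totals: 30 SIMDs x 64 lanes matches the "24 SIMDs x 80 SPs" reading
lanes_30x64 = 30 * 64
lanes_24x80 = 24 * 80

# Both proposed groupings give the same octo-TMU total
grouping_a = total_octo_tmus(simds=30, simds_per_set=10, octo_tmus_per_set=4)
grouping_b = total_octo_tmus(simds=30, simds_per_set=15, octo_tmus_per_set=6)

print("Cypress ALU:TEX =", cypress_ratio)   # 4.0
print("Cayman  ALU:TEX =", cayman_ratio)    # 5.0
print("Lane totals:", lanes_30x64, lanes_24x80)
print("Octo-TMUs per grouping:", grouping_a, grouping_b)
```

Both groupings land on 12 octo-TMUs and 96 TMUs, so the counts alone cannot tell the two arrangements apart. Only the number of sets (3 versus 2) differs.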
     
  6. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    The specs from this slide surprisingly coincide with my early prediction for Cayman's architectural layout here. :grin:
     
  7. Mianca

    Regular

    Joined:
    Aug 7, 2010
    Messages:
    333
    Likes Received:
    19
    The rumor about "decoupled" TMUs is rather old now. It came from the same source that indicated a 6 module architecture (with each module sharing 4x4 TMUs and 1 dedicated tessellator unit) ...

    See my earlier post.
     
  8. PSU-failure

    Newcomer

    Joined:
    May 3, 2007
    Messages:
    249
    Likes Received:
    0
    Makes sense, considering Barts probably has 2x8 SIMD, there would be the same macro-redundancy in Cayman (1 spare SIMD for each bank of 16).

    I was also thinking AMD could deviate from Hemlock and use high yield, partly disabled dies (perfect if the die is big as it'll allow for $400-500-650 prices, or something similar).
     
  9. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    My interpretation of the patent applications is that an octo-TMU can deliver 4 texturing results, based on 64-bit texels (e.g. fp16 RGBA texels), per clock to an ALU.

    So one arrangement we might see for a shader engine, assuming 2 shader engines (quick and dirty photochop from your picture fellix :razz:):


    [​IMG]


    This consists of 3 clusters - each containing 5 SIMDs and 2 octo-TMUs. Each octo-TMU can deliver its results to any of the 5 pairs of ALUs aligned with it, delivering 2 quads of results to the respective ALU quads or a single quad of results to one or the other of the pair.
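To make the totals in this arrangement explicit, here is a minimal sketch that models it as plain data. The class names (`Cluster`, `ShaderEngine`) are hypothetical, and the figure of 4 results per octo-TMU per clock for 64-bit texels is only the interpretation of the patent applications given above, not a confirmed spec.

```python
from dataclasses import dataclass

TMUS_PER_OCTO_TMU = 8
# Assumption from the post: 4 results per clock per octo-TMU for 64-bit texels.
RESULTS_PER_CLOCK_64BIT = 4
BYTES_PER_64BIT_TEXEL = 8

@dataclass(frozen=True)
class Cluster:
    """One cluster: 5 SIMDs sharing 2 octo-TMUs (speculated layout)."""
    simds: int = 5
    octo_tmus: int = 2

@dataclass(frozen=True)
class ShaderEngine:
    """One shader engine made of 3 clusters (speculated layout)."""
    clusters: tuple = (Cluster(), Cluster(), Cluster())

    @property
    def simds(self):
        return sum(c.simds for c in self.clusters)

    @property
    def octo_tmus(self):
        return sum(c.octo_tmus for c in self.clusters)

# Assuming 2 shader engines per chip, as in the post
engines = (ShaderEngine(), ShaderEngine())

chip_simds = sum(e.simds for e in engines)
chip_octo_tmus = sum(e.octo_tmus for e in engines)
chip_tmus = chip_octo_tmus * TMUS_PER_OCTO_TMU

# Peak fp16 RGBA results per clock across the chip under the assumption above
peak_results_per_clock = chip_octo_tmus * RESULTS_PER_CLOCK_64BIT
peak_bytes_per_clock = peak_results_per_clock * BYTES_PER_64BIT_TEXEL

print(chip_simds, chip_octo_tmus, chip_tmus)
print(peak_results_per_clock, peak_bytes_per_clock)
```

Under these assumptions the chip totals come out to 30 SIMDs and 96 TMUs, which matches the counts discussed earlier in the thread.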
     
  10. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    But do 2 tris/clk fit with that too? If so, how?
     
  11. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    One triangle per clock fits well with Cypress' two SIMD blocks, so what's the trouble here?
    Just making an analogy. :razz:
     
  12. no-X

    Veteran

    Joined:
    May 28, 2005
    Messages:
    2,455
    Likes Received:
    471
    It would be 50% more efficient - now 1 of 2 rasterizers is nearly always idling. With 2 setups and 3 rasterizers, only one of them would be idling...
     
  13. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
    Now we can safely say that HD 5970 will be at least 30% more powerful than HD 5870.
     
  14. Wirmish

    Newcomer

    Joined:
    May 4, 2007
    Messages:
    160
    Likes Received:
    0
    Maybe it's another fake... :???: ... maybe not.
    [​IMG]

    · Difference in size between the "3" and the "0".
    · Different font for the "30" and the "32".
    · GDDR5 @ 5 GHz seems too slow especially since 2Gb @ 6 GHz is available from Hynix (H5GQ2H24MFR-R0C), Samsung (K4G20325FC-HC03), and Elpida (EDW2032BABG-60-F).
    · Number of TMUs vs SIMD seems a little strange.
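For a sense of what the memory speed point means in bandwidth terms, here is a rough calculation. It reads "5 GHz" and "6 GHz" as effective per-pin data rates in Gbps and assumes a 256-bit bus, which is an assumption and not something stated on the slide. The helper name is made up for illustration.

```python
def gddr5_bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin data rate times bus width, in bytes."""
    return data_rate_gbps * bus_width_bits / 8

BUS_WIDTH_BITS = 256  # assumption: bus width is not given on the slide

bw_at_5 = gddr5_bandwidth_gbs(5.0, BUS_WIDTH_BITS)
bw_at_6 = gddr5_bandwidth_gbs(6.0, BUS_WIDTH_BITS)
relative_gain = (bw_at_6 - bw_at_5) / bw_at_5

print(f"5 Gbps: {bw_at_5} GB/s")   # 160.0 GB/s
print(f"6 Gbps: {bw_at_6} GB/s")   # 192.0 GB/s
print(f"Gain from faster memory: {relative_gain:.0%}")
```

The relative gain is independent of the bus width chosen, since it depends only on the ratio of the two data rates.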
     
    #5174 Wirmish, Nov 21, 2010
    Last edited by a moderator: Nov 21, 2010
  15. gongo

    Regular

    Joined:
    Jan 26, 2008
    Messages:
    605
    Likes Received:
    25
    That is a lot of SPs! Is it still 4+1? I guess more SPs are needed for MLAA with only 32 ROPs. I wonder why they did not bump up Barts' SP count; it looks like a perf gap will form between Cayman and Barts. If the 2GB of VRAM is true, then on to the bandwidth: I know AMD's recent GPUs are not bandwidth limited. I guess they are designed with GDDR5's limitations in mind, or will bandwidth finally hold back Cayman's massive SP count? I think Cayman will be powerful... $499?
     
  16. SimBy

    Regular

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
    Maybe, maybe not. I guess it's a photo of a presentation slide on a projection screen, taken from an angle, so that may very well be the reason for the things you pointed out.
     
  17. neliz

    neliz GIGABYTE Man
    Veteran

    Joined:
    Mar 30, 2005
    Messages:
    4,904
    Likes Received:
    23
    Location:
    In the know
    Lol, zeroes don't shrink when they come closer ;)
     
  18. Tridam

    Regular Subscriber

    Joined:
    Apr 14, 2003
    Messages:
    541
    Likes Received:
    47
    Location:
    Louvain-la-Neuve, Belgium
    It's a fake based on slide #72 shown at the event in LA last month.
     
  19. SimBy

    Regular

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
    Good call then Wirmish.
     
  20. caveman-jim

    Regular

    Joined:
    Sep 19, 2005
    Messages:
    305
    Likes Received:
    0
    Location:
    Austin, TX
    The full time guys are, the part time guys want 2 weeks.
     