Trinity vs Ivy Bridge

Discussion in 'Architecture and Products' started by rpg.314, Jun 29, 2011.

  1. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,537
    Likes Received:
    496
    Location:
    Varna, Bulgaria
    Is Trinity supposed to have a new revision of the BD architecture? I think there's an additional SRAM bank to the instruction pre-decode array in the front-end, compared to the current revision of BD. :???:
     
  2. TKK

    TKK
    Newcomer

    Joined:
    Jan 12, 2010
    Messages:
    148
    Likes Received:
    0
    Trinity is Piledriver-based.
     
  3. cal_guy

    Newcomer

    Joined:
    Jun 27, 2008
    Messages:
    217
    Likes Received:
    3
    Yes, Trinity uses the Piledriver core.
     
  4. TKK

    TKK
    Newcomer

    Joined:
    Jan 12, 2010
    Messages:
    148
    Likes Received:
    0
    No, I think 1100 GFLOPs is for that ominous "2013 platform" (Trinity successor). If you look very closely, you see a slight bend in the line just above the 800 mark. So Trinity seems to be in the 800-850 range of that chart.
     
  5. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    4,051
    Likes Received:
    639
    Location:
    35.1415,-90.056
    Ahhh, yes I do see that. Ok, so Llano is ~600, Trinity is somewhere around the 850 mark. That's not too far off from 50% depending on the rounding error on Llano. I mean, if we look at the backside-kink of that line, Llano might be ~550 ;)

    The real deal is that it's just a really terrible graph and, given the multiple sources posted above, it is obviously wrong and should be dismissed.
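    As a rough check of the "not too far off from 50%" estimate, here is a minimal sketch using only the eyeballed chart readings quoted above (~600 or ~550 for Llano, ~850 for Trinity). The helper name `percent_increase` is made up for illustration, and the inputs are readings off a slide, not measured figures.

```python
def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


# Eyeballed chart readings from the posts above (GFLOPS), not measured data.
TRINITY_ESTIMATE = 850
LLANO_ESTIMATES = (600, 550)

for llano in LLANO_ESTIMATES:
    gain = percent_increase(llano, TRINITY_ESTIMATE)
    print(f"Llano ~{llano} -> Trinity ~{TRINITY_ESTIMATE}: +{gain:.1f}%")
```

    So the two Llano readings bracket the 50% claim from either side, which is about all a graph this coarse can support.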
     
  6. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,537
    Likes Received:
    496
    Location:
    Varna, Bulgaria
    Scaled comparison of Llano and Trinity, using the I/O pads on the left side for reference:

    [IMG]

    Some observations on the layout of the SIMD multi-processors -- the placement of the register file banks in the ALU array is different in Trinity, as well as the whole layout of the texture unit.

    Here are the differences (so far) on the CPU side -- BD vs. Piledriver cores:

    [IMG]

    Those banks are most probably the pre-decode bits (used for the BTB, branch selectors, end bits, etc.) that AMD has been using ever since the first K7 architecture to aid the instruction decode flow. And since these are located in the branch prediction area of the front-end block, I guess AMD is aiming to improve precisely this aspect of the architecture.
     
    #166 fellix, Jan 5, 2012
    Last edited by a moderator: Jan 5, 2012
  7. DarthShader

    Regular

    Joined:
    Jul 18, 2010
    Messages:
    350
    Likes Received:
    0
    Location:
    Land of Mu
    So it's 20% more CPU perf + 30% more GPU perf = 50% more perf! :lol:

    I am personally speculating there will be 2 x 256-bit FMACs in each module, so that would be doubling peak FLOPS, and then a clock boost on top. So over 200 GFLOPS from the CPU alone, so the GPU won't have to be clocked that high to reach the projected total GFLOPS values. But since I might be the only one thinking that, I could be very wrong. :D
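    To make the arithmetic behind that speculation concrete, here is a minimal sketch of a peak-FLOPS estimate. Everything beyond "2 x 256-bit FMAC per module" is an assumption for illustration: two modules per chip, a hypothetical 3.5 GHz clock, single-precision (32-bit) elements, and a fused multiply-add counted as two floating-point operations. The 128-bit case is included only as a baseline for the "doubling" comparison.

```python
def peak_gflops(modules, fmacs_per_module, fmac_width_bits, clock_ghz,
                element_bits=32):
    """Theoretical peak GFLOPS; an FMA counts as 2 FLOPs per lane per cycle."""
    lanes = fmac_width_bits // element_bits
    flops_per_cycle = modules * fmacs_per_module * lanes * 2
    return flops_per_cycle * clock_ghz


# Assumed configuration (not from the thread): 2 modules at 3.5 GHz.
MODULES = 2
CLOCK_GHZ = 3.5

baseline = peak_gflops(MODULES, 2, 128, CLOCK_GHZ)     # 2 x 128-bit FMAC
speculated = peak_gflops(MODULES, 2, 256, CLOCK_GHZ)   # 2 x 256-bit FMAC

print(f"2 x 128-bit FMAC per module: {baseline:.0f} GFLOPS")
print(f"2 x 256-bit FMAC per module: {speculated:.0f} GFLOPS")
```

    Under these assumed numbers the 256-bit case does land a little above 200 GFLOPS, but the result scales directly with the module count and clock, both of which are guesses here.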
     
  8. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,532
    Likes Received:
    957
    It is (or at least it was supposed to be) 50% more FLOPS on the GPU for 30% more performance in actual games, and up to 20% more performance on the CPU side for common applications.

    The FPU appears to be largely unchanged, so no 256-bit FMACs.
     
  9. TKK

    TKK
    Newcomer

    Joined:
    Jan 12, 2010
    Messages:
    148
    Likes Received:
    0
    What strikes me as odd is the GPU in Trinity. The 6 VLIW4(?)-SIMDs only take up ~as much space as the 5 SIMDs in Llano, yet the "uncore" of the Trinity GPU is MUCH larger and appears to be the only reason why Trinity is larger than Llano. Any idea what all that space is used for? Larger cache(s) to reduce memory bandwidth bottlenecks?
     
  10. no-X

    Veteran

    Joined:
    May 28, 2005
    Messages:
    2,412
    Likes Received:
    426
    Compared to the ALU blocks, the rest of Llano's GPU is 3.78 times bigger, but it is 4.75 times bigger for Trinity (rough numbers). I would expect exactly the opposite... :???:
     
  11. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,532
    Likes Received:
    957
    Would you? Cayman had fewer shaders than Cypress, but was significantly bigger. And (presumably) it didn't have as much redundancy for vias and stuff.
     
  12. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    Cayman had 24 SIMDs vs. Cypress' 20. So even though that's slightly fewer ALUs, that's 20% more texture units, L1 cache, LDS memory, etc.
     
  13. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    The GPUs of Trinity vs. Llano exhibit the exact same ratio as Cayman vs. Cypress (trading five VLIW5 vs. six VLIW4 SIMD engines). ;)
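    The ratio can be checked with a short sketch. It assumes 16 VLIW lanes per SIMD engine, which is not stated in the thread, and uses the SIMD counts and VLIW widths quoted in the posts above; the function name `alu_count` is made up for illustration.

```python
LANES_PER_SIMD = 16  # assumption: 16 VLIW lanes per SIMD engine


def alu_count(simds, vliw_width, lanes=LANES_PER_SIMD):
    """Total ALUs = SIMD engines x lanes per SIMD x ALUs per VLIW lane."""
    return simds * lanes * vliw_width


# (SIMD engines, VLIW width) as quoted in the thread.
chips = {
    "Cypress": (20, 5),
    "Cayman": (24, 4),
    "Llano": (5, 5),
    "Trinity": (6, 4),
}

for old, new in (("Cypress", "Cayman"), ("Llano", "Trinity")):
    old_alus = alu_count(*chips[old])
    new_alus = alu_count(*chips[new])
    simd_ratio = chips[new][0] / chips[old][0]
    print(f"{old} -> {new}: {old_alus} -> {new_alus} ALUs "
          f"(x{new_alus / old_alus:.2f}), SIMDs x{simd_ratio:.2f}")
```

    In both pairs the ALU count drops by 4% while the SIMD count, and with it the per-SIMD texture units, L1 cache and LDS, grows by 20%.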
     
  14. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,532
    Likes Received:
    957
    Yep, that was my point. :)
     
  15. tunafish

    Regular

    Joined:
    Aug 19, 2011
    Messages:
    627
    Likes Received:
    414
    The present desktop BDs cannot keep the FPU fed with data. What exactly would be the point of doubling the peak FLOPS when you are so bandwidth-starved that it would never increase real-world performance?
     
  16. DarthShader

    Regular

    Joined:
    Jul 18, 2010
    Messages:
    350
    Likes Received:
    0
    Location:
    Land of Mu
    This is not a BD, this is Trinity. It has a different memory controller and different goals, and maintaining maximum throughput is not one of them. Going to 256-bit will happen someday anyway; the sooner the better.

    PS. Source of your claim?
     
  17. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,537
    Likes Received:
    496
    Location:
    Varna, Bulgaria
    This is a comparison of the IGP "uncore" sections of Llano and Trinity, with the SIMDs cut out as well. Trinity's section takes up 40% more area than Llano's.

    [IMG]
     
  18. no-X

    Veteran

    Joined:
    May 28, 2005
    Messages:
    2,412
    Likes Received:
    426
    Any explanation? Cayman has bigger ROPs (EQAA, faster ops with single/dual-channel FP32, faster Int16, coalesced writes). Maybe VCE and 3rd display pipeline for Eyefinity have some impact, too. But I can't believe that improved ROPs, VCE processor and 3rd display output can be responsible for such a massive difference.
     
  19. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,537
    Likes Received:
    496
    Location:
    Varna, Bulgaria
    Second setup pipe?

    *runs for cover*

    :lol:
     
  20. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,797
    Location:
    Well within 3d
    The pictures do not look like they have an equivalent level of detail, but if the Trinity shot is accurate and not obfuscated, the GPU section looks like it has fewer ordered grids that correspond to SRAM or customized logic than Llano. The kind of featureless pudding in between all the storage is visually similar to the RV770 die shot.

    Perhaps AMD has allowed the logic on the periphery to bloat, due to more standard cells and automated layout. The logic sections may be physically bigger, out of proportion to any transistor count increase, possibly to reduce leakage or variation.
     