AMD: Southern Islands (7*** series) Speculation/ Rumour Thread

Discussion in 'Architecture and Products' started by UniversalTruth, Dec 17, 2010.

  1. EduardoS

    Newcomer

    Joined:
    Nov 8, 2008
    Messages:
    131
    Likes Received:
    0
    I put my laziness to sleep, opened TechPowerUp and Excel, and plotted an interesting chart of performance at 2560 (as a % of the 7870) per unit of pixel fill rate:
    [IMG]

    Maybe ROPs are a bottleneck, but then why does Tahiti perform so well? Are the ROPs and memory controllers tied together in the other parts? What is the associativity for the others? What else could be the bottleneck? For sure it isn't shader processing :cool:
     
  2. caveman-jim

    Regular

    Joined:
    Sep 19, 2005
    Messages:
    305
    Likes Received:
    0
    Location:
    Austin, TX
    The branding (GHz Edition) leaves room for 1GHz+ 7970, and if GK104 is dropping at $550 then it could replace the 7970 which moves down to, say, $499.

    I don't see how a GHz edition 7970 precludes the existence of a 7990 'New Zealand'. It could be a bit difficult if it's XT ASICs at Pro clocks, but there's a ton of room between single SKU pricing and two boards to fit a 375W card in there.

    If GK104 is indeed redefining perf/W then AMD needs a New Zealand, as a dual GK104 will be the product to beat (especially if NV have licked power monitoring and got a working hard TDP cap plus turbo, neatly leapfrogging PowerTune).
     
  3. caveman-jim

    Regular

    Joined:
    Sep 19, 2005
    Messages:
    305
    Likes Received:
    0
    Location:
    Austin, TX
    ...no, that says nothing at all. 'engineering to humiliate the competition' could mean 'ati vs nv' or it could mean 'gamer vs. opponent', depending on whose card is best at a price point.
     
  4. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    Maybe your picture will appear eventually. Until then:

    http://hexus.net/tech/reviews/graphics/36269-amd-radeon-hd-7850-vs-6850-vs-5850-clocks/

     
  5. EduardoS

    Newcomer

    Joined:
    Nov 8, 2008
    Messages:
    131
    Likes Received:
    0
    Maybe I missed the thumbnail?
    [IMG]
     
  6. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    The postimage.org website just doesn't seem to be working. Anyone else see the picture?
     
  7. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,244
    Likes Received:
    4,465
    Location:
    Finland
    The picture in EduardoS's earlier post has been showing up just fine on my screen ever since it was posted, same with the thumbnail he just posted.
     
  8. Mianca

    Regular

    Joined:
    Aug 7, 2010
    Messages:
    333
    Likes Received:
    19
    What I find most interesting about EduardoS's graph is that the perf/fillrate ratio seems to be strikingly similar for Pitcairn and Cape Verde - two chips that share a very similar balance of ROPs/CUs/TMUs/bandwidth.

    As for Tahiti's 35% gain in perf/fillrate: That's actually kind of disappointing given that everything but the ROP count was upped by at least 50% over Pitcairn. Perf/fillrate should have been at least 50% better than Pitcairn if ROPs weren't a limiting factor at all.

    Reading the graph that way, one could actually assume that Tahiti would have been about 15% faster with 50% more ROPs - as this would have put the perf/fillrate ratio right in line with the very nice scaling of the other two GCN chips. :grin:
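    A rough sketch of the arithmetic behind that reading, in Python. The 1.35 and 1.50 factors are the ones quoted above; the function name and the normalisation to Pitcairn = 1.0 are assumptions made for illustration, not values read off the chart.

```python
def speedup_to_reach(observed_gain, expected_gain):
    """Speedup needed for perf/fillrate to rise from the observed to the expected level.

    Both gains are relative to Pitcairn (Pitcairn = 1.0).
    """
    return expected_gain / observed_gain

observed = 1.35  # Tahiti perf/fillrate vs Pitcairn, as quoted above
expected = 1.50  # minimum scaling of everything except ROPs, as quoted above

speedup = speedup_to_reach(observed, expected)
print(f"missing performance as a ratio: {100 * (speedup - 1):.1f}%")
print(f"gap in percentage points: {100 * (expected - observed):.0f}")
```

    Counted as a ratio the gap is a bit over 11%; counted as a plain difference of the two percentages it is 15 points, so the exact figure depends on which way you count.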

    Ahh! So to keep things flexible in Tahiti you've actually got a configuration in which 3 physical RBs share 3 Channels over a crossbar - and for the current HD 79** series you decided to deactivate one RB per crossbar for redundancy? That would explain a lot. :wink:

    [IMG]
     
    #3128 Mianca, Mar 13, 2012
    Last edited by a moderator: Mar 13, 2012
  9. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    That would be quite a bit of redundancy. Would it make sense to include that in the reevaluation of the binning for Tahiti? At least assuming it doesn't cause aliasing issues with the assignment of render target tiles for the rasterizers and the ROP partitions.

    I wonder what it would do performance-wise. While it could help in some situations (8x MSAA for instance), I would almost think the hypothetical combination of a triple setup/rasterizer with 32 ROPs could be more effective for "modern" workloads.
     
  10. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    I'm going to go with the idea that the smiley at the end is how seriously that diagram should be taken.

    BF3 showed that there were scenarios where AMD's ROP throughput was hurt significantly more than Nvidia's. I'd expect those settings to be pushed aggressively in reviews going forward.
     
  11. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    Comparisons to nV are not very useful considering the different ROP capabilities and different limits which apply.

    From the numbers I looked at, I got a different impression when comparing HD7870 with HD7970. When activating MSAA in BF3, the performance delta between Pitcairn and Tahiti does not decrease (as it would if it were seriously ROP limited). On the contrary, it tends to increase, despite the higher ROP throughput of Pitcairn. Tahiti's higher memory bandwidth more than compensates for the (possible) limitation from ROP throughput. Therefore, Tahiti is not in a hard ROP limit in BF3.
     
  12. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    That would be one way to find areas where GCN could use some improvement.

    The numbers seem a bit slower in general than benchmarks I've seen on other sites, but I don't know German and it looks like the settings are not identical.

    I'm focusing on its lack of distance from the 580, and statements concerning BF3 where building the g-buffer takes an inordinately long time to complete on AMD chips versus Nvidia.
     
  13. Mianca

    Regular

    Joined:
    Aug 7, 2010
    Messages:
    333
    Likes Received:
    19
    Pitcairn's theoretical ROP throughput isn't much higher than Tahiti's. Plus, in that benchmark you quoted, HD 7970 gains an amazing 3% on HD 7870 when activating MSAA @ 2560x1600.

    Problem is that HD 7970 has about 72% [!] more memory bandwidth than HD 7870. So it actually shouldn't even be a competition. That's what trinibwoy pointed out: If a 264,000 MB/s card can't shake off a 153,600 MB/s card by a significant margin in those bandwidth-heavy scenarios, what gives?
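    For anyone who wants to check the 72% figure, here is a minimal Python sketch. The bus widths and per-pin data rates are the reference-card specs as I remember them (an assumption, they are not stated in this thread), and the function name is made up for the example.

```python
def gddr5_bandwidth_mb_s(bus_width_bits, data_rate_mbps_per_pin):
    """Peak bandwidth in MB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_mbps_per_pin

# Assumed reference specs (not taken from the thread):
hd7970 = gddr5_bandwidth_mb_s(384, 5500)  # 384-bit bus, 5.5 Gbps per pin
hd7870 = gddr5_bandwidth_mb_s(256, 4800)  # 256-bit bus, 4.8 Gbps per pin

advantage = hd7970 / hd7870 - 1
print(f"HD 7970: {hd7970:.0f} MB/s, HD 7870: {hd7870:.0f} MB/s, +{advantage:.0%}")
```

    With those inputs the two totals match the 264,000 and 153,600 MB/s quoted above, and the gap rounds to 72%.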
     
  14. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    You are right, they changed an additional setting (FXAA), which influences the shader load. So here is another set (only changing MSAA).
    But the trend is the same. The distance between Pitcairn and Tahiti doesn't decrease when activating MSAA. It is between about 25% and 35%, generally increases with higher resolution (higher pixel shader, ROP and bandwidth load combined, so that can be expected) and slightly increases (0% to 2%) when activating 4xMSAA (maybe an almost isolated addition of ROP and bandwidth load, but frankly I have no idea how the Frostbite 2 engine used by BF3 actually works; there are probably other contributing factors too, like increased shader load for some steps).
    And why do you think the dominating reason for that is the ROP count? It could also be some scheduling/work distribution problem where nVidia has the upper hand, couldn't it?

    The point is that Tahiti does not lose more than Pitcairn when ROP limitations should set in (activating MSAA), while it maintains a consistently higher performance than Pitcairn with MSAA as well.
    Is it a very bandwidth heavy scenario?

    Or look at it from the other side:
    The performance relation between Tahiti and Pitcairn stays basically the same. Whether you activate MSAA or not, Tahiti is always faster than Pitcairn by the same amount (it even gains a percent or two with MSAA). And that with 8% less ROP capacity. Doesn't that tell us that Tahiti shows consistent performance in comparison to Pitcairn and is therefore not completely off-balance? They just use different means to get there. But the performance picture is actually quite consistent between Pitcairn and Tahiti.

    Of course you can always say that if you had added 50% more ROPs you would be 10% faster in a certain game (and even 50% faster in a non-bandwidth-limited fillrate test). But that would also come at a cost (die size, power consumption and ultimately clock speed). You could also say that a triple setup/raster engine plus 48 ROPs on a 1.5GHz 384-bit interface might have bought them even 25% more performance in some games. A quad setup/raster with 8 pixels/clock per rasterizer would also improve the performance in setup-limited scenarios considerably while staying at the 32 pixel/clock raster and ROP limit. But with the available evidence I don't think it is justified to say that Tahiti is mainly ROP limited.
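    The "8% less ROP capacity" mentioned above can be reproduced with a small Python sketch. The ROP counts and reference core clocks are assumptions from the published card specs (they are not spelled out in this thread), and one pixel per ROP per clock is the usual theoretical peak, not a measured rate.

```python
def pixel_fill_gpix_s(rops, core_clock_mhz):
    """Theoretical peak pixel fill rate in Gpixels/s (one pixel per ROP per clock)."""
    return rops * core_clock_mhz / 1000

# Assumed reference specs (not taken from the thread):
tahiti = pixel_fill_gpix_s(32, 925)     # HD 7970: 32 ROPs at 925 MHz
pitcairn = pixel_fill_gpix_s(32, 1000)  # HD 7870: 32 ROPs at 1000 MHz

deficit = 1 - tahiti / pitcairn
print(f"Tahiti {tahiti:.1f} vs Pitcairn {pitcairn:.1f} Gpix/s, {deficit:.1%} less")
```

    With equal ROP counts the difference comes down to core clock alone, about 7.5%, which rounds to the 8% used above.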
     
  15. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    For BF3, it wasn't the ROP count but how the design handles wide MRTs. The cost is apparently higher than the total bandwidth needed.
    If they did not want to modify that part of the pipeline to reduce the impact of this corner case, upping the peak level of performance with additional ROPs could provide a higher peak to come down from.

    If Nvidia's design remains as consistent in Kepler as it is in Fermi, then I'd expect that Nvidia's reviews will focus on workloads like that.
     
  16. EduardoS

    Newcomer

    Joined:
    Nov 8, 2008
    Messages:
    131
    Likes Received:
    0
    Actually, since I included the HD7*50 models, the CUs/TMUs ratio increases by 25%, with no meaningful increase in performance.

    Since Tahiti has 3-way associative ROPs and the other chips don't (direct mapped?), this may increase the efficiency of Tahiti's ROPs.

    Initially I read this as 4 partitions each with 8 ROPs and 3 channels, then I realized that maybe Dave forgot to mention something...

    Could it be two partitions, each with 8 ROPs that access the high part of a dual-channel controller and 8 ROPs that access the lower part? A wild guess, in fact I have no idea...
     
  17. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    From your link:


    "Driver versions
    * AMD Catalyst 11.11c performance driver
    * AMD 8.921.2-111215a (HD 7970)
    * AMD 8.921.2-120119a (HD 7950)
    * AMD 8.932.2 (HD 7700)
    * AMD 8.95.5-120224a (HD 7900)
    * Nvidia GeForce 290.36"
    That seems to indicate that no new benchmarks were run for the older cards, so their results don't include the new FXAA path for GCN that came into effect with the recent patch.
     
  18. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    @Carsten:
    The FXAA thing (together with the driver versions) was the reason why I linked a different test one post later. This second test doesn't have these problems (all cards tested with the 12.3 preview driver) but shows the same trend. ;)
     
  19. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    Yesterday on a completely different internet connection your image did show up, while on multiple computers on this internet connection it didn't. Today it shows on my computer :???:
     
  20. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    BF3 is a system seller, or rather an NVidia system seller, since AMD is so bad in this game. As far as I'm aware there is no insight as to the key architectural features of these chips and how they affect performance in this game.

    In summary, Tahiti looks like rubbish, particularly in terms of performance per mm²:

    HD7970 0.136986
    HD7870 0.188679
    HD6970 0.089974
    HD6870 0.098039
    GTX580 0.084615

    from

    http://www.hardware.fr/articles/856-13/benchmark-battlefield-3.html

    based upon the 1920 ultra results.
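    For reference, the figures above read as frames per second divided by die area. A short Python sketch that reproduces them follows; the die areas are the commonly quoted ones, and the frame rates are back-derived from the ratios (ratio times area), so both tables are assumptions and should be checked against the linked article.

```python
# Die areas in mm^2: commonly quoted figures, an assumption (not given in the post).
DIE_AREA_MM2 = {"HD7970": 365, "HD7870": 212, "HD6970": 389, "HD6870": 255, "GTX580": 520}

# Frame rates at 1920 ultra: back-derived from the ratios in the post, an assumption.
FPS_1920_ULTRA = {"HD7970": 50, "HD7870": 40, "HD6970": 35, "HD6870": 25, "GTX580": 44}

def fps_per_mm2(card):
    """Frames per second per square millimetre of die."""
    return FPS_1920_ULTRA[card] / DIE_AREA_MM2[card]

for card in DIE_AREA_MM2:
    print(f"{card} {fps_per_mm2(card):.6f}")
```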
     