AMD: R9xx Speculation

Discussion in 'Architecture and Products' started by Lukfi, Oct 5, 2009.

  1. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    Cypress was ideal for a tessellation workload that AMD predicted would be in use.
    Fermi, and the preponderance of tessellation benchmarks and one or so games that for various reasons go beyond that, did not turn out to match AMD's vision.
    It happens.

    It has manufacturing advantages. Most everything else is marketing.
    What is possibly more important is that AMD does not have the same ability to leverage the professional market for high ASP products as Nvidia.
    I would be curious what money AMD would make if it made a chip the size of Fermi, since it would most likely not have a revenue profile significantly different from what it has right now.
     
  2. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Nice analysis, good of you to actually lay it out clearly.

    I'm utterly bemused by Cayman.

    Not only is ms per frame worse in HD6970 than HD6870, but ns per pixel is only 36% faster in HD6970 while theoretically it is 68% faster (assuming VLIW-5 and VLIW-4 are equivalent, knock off 10% if you like).

    It just seems utterly broken, struggling to be 18% faster than HD6870 at 2560x1600 in this game. It's 53% bigger, what the hell is going on in there?
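The 68% theoretical figure can be reproduced if it is read as SIMD count times engine clock. The sketch below assumes the commonly published specs (24 SIMDs at 880 MHz for HD6970, 14 SIMDs at 900 MHz for HD6870); those numbers are not stated in this thread, and the helper name is made up for illustration.

```python
# Assumed specs (not from this thread): SIMD count and engine clock in MHz.
HD6970 = {"simds": 24, "clock_mhz": 880}
HD6870 = {"simds": 14, "clock_mhz": 900}

def simd_rate(gpu):
    """Relative VLIW issue rate: SIMDs x clock, treating VLIW-4 and VLIW-5 as equivalent."""
    return gpu["simds"] * gpu["clock_mhz"]

theoretical = simd_rate(HD6970) / simd_rate(HD6870)
measured = 1.36  # per-pixel speedup quoted in the post above

print(f"theoretical speedup: {theoretical:.2f}x")                 # 1.68x
print(f"measured / theoretical: {measured / theoretical:.2f}")    # 0.81
```

Under these assumptions the card reaches roughly four fifths of its theoretical per-pixel gain, before the per-frame cost is even considered.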
     
  3. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    The performance in AMD's SubD11 SDK sample has been reported to AMD beforehand, and they said they're working on a new driver to improve certain tessellation workloads. In some, however, Cayman's tessellation hardware already works as advertised, showing a healthy increase over Cypress'. So I agree that with regard to tessellation, coming drivers might drastically improve certain workloads. But I am not so certain that a substantial increase in overall performance through improved drivers is very likely.
     
  4. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    How do you know? Can you run it in wireframe?

    The per-frame time is 5.0 ms for the 580, 7.6 ms for the 460, and 13.7 for the 450. It's ~8ms for the 5870, 6870, and 6970. In a different test of the same game, the per-frame time increases from 9ms to 13ms with DX11 enabled. All evidence is pointing to geometry.
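One way to arrive at a per-frame and per-pixel split like this is to assume frame time is roughly `per_frame + per_pixel * pixels` and solve it from results at two resolutions. The sketch below does that; the function name and the FPS figures in the usage lines are hypothetical and are not measurements from this thread.

```python
def split_frame_time(res_a, fps_a, res_b, fps_b):
    """Solve t = per_frame_ms + ns_per_pixel * pixels from two (resolution, fps) points.

    Returns (per_frame_ms, ns_per_pixel). Assumes the linear model holds,
    which ignores bandwidth or CPU limits that change with resolution.
    """
    pixels_a = res_a[0] * res_a[1]
    pixels_b = res_b[0] * res_b[1]
    if pixels_a == pixels_b:
        raise ValueError("need two different resolutions")
    t_a_ms = 1000.0 / fps_a
    t_b_ms = 1000.0 / fps_b
    ms_per_pixel = (t_b_ms - t_a_ms) / (pixels_b - pixels_a)
    per_frame_ms = t_a_ms - ms_per_pixel * pixels_a
    return per_frame_ms, ms_per_pixel * 1e6  # convert ms to ns

# Hypothetical numbers, for illustration only.
per_frame, per_pixel = split_frame_time((1920, 1200), 80.0, (2560, 1600), 55.0)
print(f"per-frame: {per_frame:.1f} ms, per-pixel: {per_pixel:.2f} ns")
```

A large constant term that barely moves between cards, as with the ~8 ms figure above, is what points away from pixel work and towards geometry or the front end.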
     
  5. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    I think improvement in some tessellation workloads is a large part of what ATI needs (HAWX, Dirt2, LP2 among others). The other games dragging down averages are Battleforge and StarCraft2, where ATI takes a huge AA hit.
     
  6. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    Dirt2 is using tessellation only on water, crowds and cloth. If none of them are in view you can rightfully assume there's no tessellation going on. The amount of tessellation obviously depends on your chosen benchmark run.

    Our test you linked was the first one, where I think more crowd and cloth were visible, before we switched to the Malaysia track, where less tessellation is used, percentage-wise. Here's a YouTube video from our benchmark: http://www.youtube.com/watch?v=OZ86KHKkA58

    Taking into account the Unigine Heaven 2.1 results we have published, it seems there's no simple answer as to what kind of workloads will profit from driver improvements. I've also benchmarked (but not published) TessMark, which derives its scores purely from tessellation prowess. There, Cayman did not show improvements over Cypress at any of the tessellation levels (amplification factors of 8, 16, 32 and 64).
     
    #6746 CarstenS, Dec 16, 2010
    Last edited by a moderator: Dec 16, 2010
  7. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,156
    Likes Received:
    5,090
    Yup, Tech Report shows an idle power difference between 5870 1 gig and 5870 2 gig of 12 watts. And at load (they use L4D2) of 39 watts. Definitely not a tiny difference.

    Speaks well of the idle power improvements they've made, where the 6970 draws 11 watts less than the 5870 2 gig. But at load it's only 11 watts less.

    AMD and Nvidia being in a ruinous price war at the time triggered historically low prices for their respective top end chips. Both companies were hurt a lot by that price war, and it's extremely unlikely that either one will go to those levels again.

    It's not a question of whether or not 5870 or 6970 can go to those price levels, they can. But more a matter of AMD not wanting to go down to those levels. Nvidia could obviously force them to but it's obvious they don't want to go down to those levels either.

    There may be some price jockeying in the coming months as demand starts to slow, but I doubt 6970 will be hitting the 200 or lower price point unless Nvidia wants to take a large hit to their margins and they've already indicated they aren't willing to do that this round, or at least this quarter.

    Regards,
    SB
     
  8. leoneazzurro

    Regular

    Joined:
    Nov 3, 2005
    Messages:
    518
    Likes Received:
    25
    Location:
    Rome, Italy
    This is really strange, as there are 2x tessellation units plus possibly also the Barts tessellation improvements and I would expect that synthetic tests should show this better, not the other way around. Is the load balancing controlled by the driver and it is in some cases broken, or?
     
  9. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    Consider me as puzzled as the next guy about this question. At first I thought my system might be borked somehow, but AMD did not point me in that direction, so I guess they could at least reproduce the general trend.
     
  10. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Thanks.

    In the per pixel load, BW and ROPs will play a role. I've added 5770 OC results to the spreadsheet and get 4.35ns/pix, so the 6870 is 56% faster per pixel despite only 40% more SIMDs. If you compare the 6970 with the 5770 OC, it has 1.96x ROP, 2.3x SIMD rate and BW, 1.88x FLOPs, and uses all this to render at 2.13x the speed.

    Everything seems in order on the per-pixel side, and beating the GTX 580 there is no mean feat. The per-frame time really is the killer here. We're waiting on Dave to see if the command processor is what he meant by front end, but that would be a real shame. To be clear, I expect that some of the per frame time is raster related, as shadow/reflection maps aren't 100% geometry limited, but if much of the rest is command processor limited then that would be quite baffling.
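As a quick sanity check, the per-pixel figures quoted above can be chained together. This is only arithmetic on the numbers in the post, assuming the 2.13x figure refers to per-pixel speed; the variable names are illustrative.

```python
ns_per_pixel_5770oc = 4.35  # from the post above
speedup_6870 = 1.56         # 6870 vs 5770 OC, per pixel
speedup_6970 = 2.13         # 6970 vs 5770 OC (assumed to be per pixel)

ns_6870 = ns_per_pixel_5770oc / speedup_6870
ns_6970 = ns_per_pixel_5770oc / speedup_6970
rel = speedup_6970 / speedup_6870

print(f"6870: {ns_6870:.2f} ns/pix, 6970: {ns_6970:.2f} ns/pix")  # 2.79 and 2.04
print(f"6970 vs 6870 per pixel: {rel:.2f}x")                      # 1.37x
```

The resulting ratio lines up with the roughly 36% per-pixel advantage of the 6970 over the 6870 mentioned earlier in the thread.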
     
  11. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,435
    Likes Received:
    263
    I agree with that.

    Cayman was designed before GTX480 launch, let alone GTX580.

    40nm is expensive. 55nm was cheap.
     
  12. ECH

    ECH
    Regular

    Joined:
    May 24, 2007
    Messages:
    682
    Likes Received:
    7
    So the question is can the odd performance issues seen between 6970 and the 5870 be improved upon with better drivers?
     
  13. Lightman

    Veteran Subscriber

    Joined:
    Jun 9, 2008
    Messages:
    1,804
    Likes Received:
    475
    Location:
    Torquay, UK
    Just plugged my card and in Endless City there is a huge improvement compared to Cypress for sure.
    [1920x1200 - Fraps average of 20FPS for stock HD6970 Phenom X6 3.45/4.0GHz compared to 13FPS for 1GHz HD5870 on I7 4.2GHz]

    Double checked TessMark, but as mentioned before its scores are the same as on Cypress for some reason. Maybe OpenGL4 is not yet aware of new Cayman capabilities?

    If any of you have quick bench requests, feel free to PM me. I might not have time to do a lot today but the weekend is already reserved for that :razz:.
     
    #6753 Lightman, Dec 16, 2010
    Last edited by a moderator: Dec 16, 2010
  14. Ancient

    Newcomer

    Joined:
    Mar 17, 2010
    Messages:
    120
    Likes Received:
    0
    Not true :wink:
     
  15. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,079
    Likes Received:
    648
    Location:
    O Canada!
    OpenGL is behind DX for Cayman's improvements; there's only benchmarks for OGL tessellation at the moment. There will be improvements for OpenGL in the new year.
     
  16. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    Does that mean the dual front end can only be capitalized upon if the driver explicitly controls the command streams accordingly, instead of the hardware taking care of itself in this regard?
     
  17. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,079
    Likes Received:
    648
    Location:
    O Canada!
    I believe the fixed function hardware will take care of itself, but I don't know that Tessmark is setup bound in the first place. That could be tested with an OpenGL vertex or triangle test (there should be one somewhere, I'm sure).

    BTW - the driver has been updated and should bring improvements in a few of the DX SDK tests. Download it again and give it a go.
     
  18. Lightman

    Veteran Subscriber

    Joined:
    Jun 9, 2008
    Messages:
    1,804
    Likes Received:
    475
    Location:
    Torquay, UK
    You mean HotFix driver available for everyone on AMD's website or Press driver only? And to think I've downloaded HotFix driver 1h ago :twisted:.

    PS. Love the PowerTune already, whoever had that idea at AMD deserves a cake!
     
  19. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    Thanks! Will try first thing tomorrow morning. :)

    edit: I think it's the official hotfix being updated. The EMEA-press FTP still shows the same files as a few days ago.
     
    #6759 CarstenS, Dec 16, 2010
    Last edited by a moderator: Dec 16, 2010
  20. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
    Could we assume that PowerTune on Antilles will enable one GPU to use more than 250W in scenarios of high inter frame dependencies?
     