AMD: R9xx Speculation

Discussion in 'Architecture and Products' started by Lukfi, Oct 5, 2009.

  1. John021

    Newcomer

    Joined:
    Jan 1, 2010
    Messages:
    29
    Likes Received:
    0
    From fudo:
    I think this is too much :oops:
    PS: Lol at Jimbo's comment on Fuad's page xD
     
  2. SimBy

    Regular Newcomer

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
    What was the biggest chip ATI made up till now?
     
  3. Topman

    Newcomer

    Joined:
    Oct 16, 2006
    Messages:
    73
    Likes Received:
    5
    R600 ~420mm² ?
     
  4. no-X

    Veteran

    Joined:
    May 28, 2005
    Messages:
    2,407
    Likes Received:
    416
    in terms of die size: R600 (420mm²)
    in terms of transistors: Cypress (2154M)
     
  5. Bouncing Zabaglione Bros.

    Legend

    Joined:
    Jun 24, 2003
    Messages:
    6,363
    Likes Received:
    83
    So AMD sees Nvidia do Epic Fail with GF100, and decides to abandon their hugely successful sweet-spot strategy after one generation and follow the same route as Nvidia with a giant 300 watt single chip that bleeds heat like a nuclear furnace?

    That seems really likely. :roll: Looks like Fudo is getting his information on future AMD products from Nvidia PR again.
     
  6. PSU-failure

    Newcomer

    Joined:
    May 3, 2007
    Messages:
    249
    Likes Received:
    0
    I think it's just as likely as AMD going half-rate DP with a 4-wide SPU (split/distributed T lane... edit: not implying it's the case here, take this as a "when" or "if" ).

    Evergreen-class GPUs have a post-tessellator issue and it's likely Cypress should have handled that. After all, they were marketing RV770's sideport back then, although it has never been used.
     
  7. RobertR1

    RobertR1 Pro
    Legend

    Joined:
    Nov 2, 2005
    Messages:
    5,841
    Likes Received:
    1,276
    This is an industry of bullet points and buzz words. Such "wins" go a long way.
     
  8. neliz

    neliz GIGABYTE Man
    Veteran

    Joined:
    Mar 30, 2005
    Messages:
    4,904
    Likes Received:
    23
    Location:
    In the know
    I think they just really like to hit the same stone, twice.
     
  9. DarthShader

    Regular

    Joined:
    Jul 18, 2010
    Messages:
    350
    Likes Received:
    0
    Location:
    Land of Mu
    That's something I thought after posting, but couldn't get back immediately to edit. :) Let's hope it does indeed include the "novelty tax".

    Maybe that's why we heard the rumours of vapour chamber coolers being standard?

    I can locally get a non-reference 5850 for the equivalent of 228 Euro. The cheapest GTX 460 1GB, from Palit, is 171 Euro! (Today's Euro prices)

    For being future-proof. I am not a person who changes gear often, so I'd like to get as much mileage from hardware as possible. For a reasonable price ofc. :)
     
  10. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
  11. SimBy

    Regular Newcomer

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
  12. LordEC911

    Regular

    Joined:
    Nov 25, 2007
    Messages:
    875
    Likes Received:
    205
    Location:
    'Zona
    Anyone going to take a whack at measuring?
     
  13. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,027
    Likes Received:
    90
    I might, depending on whether I get to keep my Barts samples or not.
     
  14. jaredpace

    Newcomer

    Joined:
    Sep 28, 2009
    Messages:
    157
    Likes Received:
    0
  15. SimBy

    Regular Newcomer

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
    Well by the looks of it, earlier measurements of 230mm^2 seem spot on.
     
  16. Unknown Soldier

    Veteran

    Joined:
    Jul 28, 2002
    Messages:
    3,768
    Likes Received:
    1,305
  17. Unknown Soldier

    Veteran

    Joined:
    Jul 28, 2002
    Messages:
    3,768
    Likes Received:
    1,305
    I thought Barts had already been measured?

    [IMG]
     
  18. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    In what way are they abandoning the sweet spot strategy?

    NVidia introduced GF100 first and only got GF104 out 3 months ago. That's what the old strategy looks like. AMD is introducing the sweet spot chip first, and judging by its die size and performance, they're doing exactly what they did with the RV7xx except this time NVidia won't be squeaking out a marginal victory at the $400+ price point.
     
  19. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,472
    Likes Received:
    1,832
    Location:
    London
    http://www.microsofttranslator.com/...pu.org/viewthread.php?tid=3405&extra=page%3D1

    i.e. 1/2 for POW and 1/2.5 for SIN.

    For HD5870, GPU Shader Analyzer shows 129 ALU instructions for the POW shader, i.e. half MUL rate. Here's how 4 POW instructions compile:

    Code:
    00 ALU: ADDR(32) CNT(12) KCACHE0(CB0:0-15) 
          0  t: LOG_sat     ____,  KC0[0].x      
          1  z: MUL         ____,  KC0[0].x,  PS0      
             t: LOG_sat     ____,  KC0[1].x      
          2  w: MUL         R127.w,  KC0[1].x,  PS1      
             t: EXP_e       R127.y,  PV1.z      
          3  t: LOG_sat     ____,  PS2      
          4  x: MUL         ____,  R127.y,  PS3      
             t: EXP_e       R127.z,  R127.w      
          5  t: EXP_e       ____,  PV4.x      
          6  t: LOG_sat     ____,  PS5      
          7  y: MUL         ____,  R127.z,  PS6      
          8  t: EXP_e       R0.x,  PV7.y      
    01 EXP_DONE: PIX0, R0.xxxx
    END_OF_PROGRAM
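    The LOG -> MUL -> EXP chain in the listing above is consistent with the usual identity x^y = 2^(y * log2(x)). A minimal Python sketch of that identity follows. The function names are hypothetical, and the bundle count assumes LOG and EXP can only issue in the t slot, which is how they appear in the listing:

```python
import math


def pow_via_log_exp(x: float, y: float) -> float:
    """x**y for x > 0, computed as exp2(y * log2(x)) like a LOG -> MUL -> EXP chain."""
    return 2.0 ** (y * math.log2(x))


def min_bundles_for_pows(n_pows: int) -> int:
    """Lower bound on VLIW bundles for n_pows POWs.

    Assumption: each POW needs one LOG and one EXP, and both can only
    issue in the single t slot of a bundle, so each POW costs two bundles.
    """
    return 2 * n_pows


if __name__ == "__main__":
    print(pow_via_log_exp(2.0, 3.0))
    print(min_bundles_for_pows(4))
```

    Under that assumption 4 POWs need at least 8 bundles, and the listing uses 9 (bundles 0 to 8), so the t slot is the limiter and the x/y/z/w slots sit mostly idle.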
    SIN is a bit of a tangle to compile in GPUSA, but with a bit of fiddling it comes out as 162 instructions, i.e. 1/2.5.

    The strange thing about the SIN shader is that it's mostly MUL, MULADD and FRACT instructions, apparently to normalise the input to the SIN instruction. So, ahem, it might have no use as a test, e.g. this is 4 SIN instructions:

    Code:
    00 ALU: ADDR(32) CNT(30) KCACHE0(CB0:0-15) 
          0  x: MULADD      ____,  KC0[0].x,  (0x3E22F983, 0.1591549367f).x,  0.5      
             w: MULADD      ____,  KC0[1].x,  (0x3E22F983, 0.1591549367f).x,  0.5      
          1  z: FRACT       ____,  PV0.w      
             w: FRACT       ____,  PV0.x      
          2  y: MULADD      ____,  PV1.z,  (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x      
             z: MULADD      ____,  PV1.w,  (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x      
          3  x: MUL         T0.x,  PV2.y,  (0x3E22F983, 0.1591549367f).x      
             y: MUL         ____,  PV2.z,  (0x3E22F983, 0.1591549367f).x      
          4  t: SIN         ____,  PV3.y      
          5  z: MULADD      ____,  PS4,  (0x3E22F983, 0.1591549367f).x,  0.5      
             t: SIN         ____,  T0.x      
          6  x: FRACT       ____,  PV5.z      
             y: MULADD      ____,  PS5,  (0x3E22F983, 0.1591549367f).x,  0.5      
          7  x: FRACT       ____,  PV6.y      
             y: MULADD      ____,  PV6.x,  (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x      
          8  x: MUL         ____,  PV7.y,  (0x3E22F983, 0.1591549367f).x      
             w: MULADD      ____,  PV7.x,  (0x40C90FDB, 6.283185482f).z, -(0x40490FDB, 3.141592741f).y      
          9  z: MUL         ____,  PV8.w,  (0x3E22F983, 0.1591549367f).x      
             t: SIN         T0.z,  PV8.x      
         10  t: SIN         ____,  PV9.z      
         11  x: ADD         R0.x,  T0.z,  PS10      
    01 EXP_DONE: PIX0, R0.xxxx
    END_OF_PROGRAM
    which compiles to 12 cycles.
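    The MULADD / FRACT / MULADD / MUL sequence ahead of each SIN reads like a range reduction. A rough Python sketch of one such chain follows, using the three hex constants from the listing. The helper names are hypothetical, and `hw_sin` is an assumption: it models the hardware SIN as taking its input in turns (radians divided by 2*pi), which is what the final MUL by 0.159... suggests:

```python
import math
import struct


def f32_from_hex(bits: str) -> float:
    """Decode a big-endian IEEE-754 single from its hex bit pattern."""
    return struct.unpack(">f", bytes.fromhex(bits))[0]


# Constants as they appear in the listing.
INV_TWO_PI = f32_from_hex("3E22F983")  # ~0.1591549367
TWO_PI = f32_from_hex("40C90FDB")      # ~6.283185482
PI = f32_from_hex("40490FDB")          # ~3.141592741


def fract(v: float) -> float:
    """Fractional part, always in [0, 1)."""
    return v - math.floor(v)


def hw_sin(turns: float) -> float:
    """Assumed model of the SIN instruction: input is in turns, not radians."""
    return math.sin(2.0 * math.pi * turns)


def sin_with_range_reduction(x: float) -> float:
    a = x * INV_TWO_PI + 0.5   # MULADD: radians -> turns, offset by half a turn
    f = fract(a)               # FRACT: wrap into [0, 1)
    b = f * TWO_PI - PI        # MULADD: back to radians in [-pi, pi)
    c = b * INV_TWO_PI         # MUL: radians -> turns in [-0.5, 0.5)
    return hw_sin(c)           # SIN


if __name__ == "__main__":
    for x in (-10.0, 0.3, 10.0):
        print(x, sin_with_range_reduction(x), math.sin(x))
```

    If that reading is right, four of the five operations per SIN are ordinary ALU work, which fits the observation that the shader is mostly MUL, MULADD and FRACT.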

    An 8 instruction MUL looks like this:

    Code:
    00 ALU: ADDR(32) CNT(32) KCACHE0(CB0:0-15) 
          0  z: MUL         R127.z,  KC0[0].y,  KC0[1].y      
             w: MUL         R127.w,  KC0[0].x,  KC0[1].x      
          1  x: MUL         R127.x,  KC0[0].w,  KC0[1].w      
             y: MUL         R127.y,  KC0[0].z,  KC0[1].z      
          2  x: MUL         R126.x,  KC0[1].w,  PV1.x      
             y: MUL         R126.y,  KC0[1].z,  PV1.y      
             z: MUL         R126.z,  KC0[1].y,  R127.z      
             w: MUL         R126.w,  KC0[1].x,  R127.w      
          3  x: MUL         R127.x,  R127.x,  PV2.x      
             y: MUL         R127.y,  R127.y,  PV2.y      
             z: MUL         R127.z,  R127.z,  PV2.z      
             w: MUL         R127.w,  R127.w,  PV2.w      
          4  x: MUL         R126.x,  R126.x,  PV3.x      
             y: MUL         R126.y,  R126.y,  PV3.y      
             z: MUL         R126.z,  R126.z,  PV3.z      
             w: MUL         R126.w,  R126.w,  PV3.w      
          5  x: MUL         R127.x,  R127.x,  PV4.x      
             y: MUL         R127.y,  R127.y,  PV4.y      
             z: MUL         R127.z,  R127.z,  PV4.z      
             w: MUL         R127.w,  R127.w,  PV4.w      
          6  x: MUL         R126.x,  R126.x,  PV5.x      
             y: MUL         R126.y,  R126.y,  PV5.y      
             z: MUL         R126.z,  R126.z,  PV5.z      
             w: MUL         R126.w,  R126.w,  PV5.w      
          7  x: MUL         ____,  R127.x,  PV6.x      
             y: MUL         ____,  R127.y,  PV6.y      
             z: MUL         ____,  R127.z,  PV6.z      
             w: MUL         ____,  R127.w,  PV6.w      
          8  x: MUL         R0.x,  R126.w,  PV7.w      
             y: MUL         R0.y,  R126.z,  PV7.z      
             z: MUL         R0.z,  R126.y,  PV7.y      
             w: MUL         R0.w,  R126.x,  PV7.x      
    01 EXP_DONE: PIX0, R0
    END_OF_PROGRAM
    As far as I can tell the throughput for XYZT would be the same as XYZWT in all three of these tests.

    But I think it rules out XYZW with emulated transcendentals.
     
  20. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    But still, 160 or 192 VLIW units (Pro and XT => 800/960 SPs with xyzwt or only 640/768 with xyzt) appear to be quite on the low side to reach close to Cypress performance. They really need to have widened some bottlenecks to get that performance.
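    As a quick check of the arithmetic, here is a small Python sketch, assuming xyzwt means five lanes per VLIW unit and xyzt means four (the function name is hypothetical):

```python
def stream_processors(vliw_units: int, lanes_per_unit: int) -> int:
    """Total SP count for a given number of VLIW units and lanes per unit."""
    return vliw_units * lanes_per_unit


if __name__ == "__main__":
    for layout, lanes in (("xyzwt", 5), ("xyzt", 4)):
        for units in (160, 192):
            print(layout, units, stream_processors(units, lanes))
```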
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.