AMD confirms R680 is two chips on one board

Discussion in 'Architecture and Products' started by nicolasb, Dec 14, 2007.

  1. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,535
    Likes Received:
    144
    Did you at least attempt to read the article I linked to? Did you check what fits and doesn't fit in the EDRAM? Did you check how it's typically used? Did you check the bandwidth that Xenos normally has available (22.4GB/s to main memory, 32GB/s between parent and daughter die, BTW) if stuff doesn't go to the EDRAM or it isn't used for some reason? Read page 4, paragraph 2 as well.

    Again, I'm asking: how do you determine what Lost Planet uses or doesn't use, and how do you carry that comparison over to the desktop realm? I personally have no friggin' clue what its BW demands are. You're working under the assumption that the EDRAM was absolutely needed and that Xenos as it is uses the entirety of the BW it brings. That's hardly the case. I think it was relatively cheap to add, looked good in terms of paper specs, and opened up some interesting possibilities (I'm thinking primarily of tiling, and thus a really low performance hit with 4X MSAA at 720p, but devs don't seem to have swarmed over it, for a number of reasons).

    There's nothing showing the 3870 to be horribly BW limited in normal usage scenarios. Other parts like the 8800GT, also a 16 ROP part, seem to do OK with even less bandwidth. Again, don't misunderstand this as some crusade against increasing BW; it isn't. But in the context of the RV670 a 512-bit bus would've been pointless, as it was in the context of the R600.

    Let's take another example: x1900xtx and 2900xt. The 2900xt gobbles more BW with the same number of ROPs, TUs and so on (let's call them RBEs, as per ATi's nomenclature for the 2900 line). Did that translate into a whopping defeat for the x1900xtx in supposedly BW limited scenarios? Nope. And let's ignore the "shader resolve killing performance" argument, as that's fairly invalid.

    To get my point across: IMHO, the RV670 isn't in great need of BW; it's hardly a limiting factor for it, considering its typical usage scenarios. Simply look at it, man, and show me at least some indication of BW limits IRL. What Fellix said is probably correct (I haven't checked that out... and I also don't have an RV670), but you're not going to be spending your time doing blending only, are you?
     
  2. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    Well, don't we? :lol:

    Anyway, my intention was to point out a simple BW-related situation, with an eye to easy comparison.
    IMHO, it is evident that for the moment, the most cost effective implementation for the R600 marchitecture is namely the 8-way 256-bit interface, with some *arguably* GDDR4 hi-speed touch. :wink:
     
  3. {Sniping}Waste

    Regular

    Joined:
    Jan 13, 2003
    Messages:
    833
    Likes Received:
    29
    Location:
    Garland TX
    I have an HD2900PRO with the 512-bit memory bus, and some tests suggest that the HD3870 could be memory bandwidth limited. I clocked the HD2900PRO's core to 845MHz and the memory to 850MHz. In 3DMark06 I was getting around 9000. I upped the memory speed to 910MHz, left the core at 845MHz, and the score went up to 9600, a 600 point jump. This suggests memory bandwidth was limiting the core at 845MHz, and that was on a 512-bit memory bus.
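
    One rough way to read those numbers is to compare the relative gain in score with the relative gain in memory clock. The sketch below does just that; the helper names are made up for illustration, and a single synthetic run is a thin basis for conclusions.

```python
def relative_gain(before: float, after: float) -> float:
    """Fractional increase going from `before` to `after`."""
    return (after - before) / before


def scaling_ratio(clock_before, clock_after, score_before, score_after):
    """Score gain divided by clock gain.

    Near 1.0: the score tracks the clock almost linearly, which hints that
    this clock is the limiter. Near 0.0: the clock barely matters.
    """
    return relative_gain(score_before, score_after) / relative_gain(clock_before, clock_after)


# Figures quoted in the post: memory 850 -> 910 MHz, 3DMark06 ~9000 -> 9600,
# core held at 845 MHz throughout.
mem_gain = relative_gain(850, 910)
score_gain = relative_gain(9000, 9600)
ratio = scaling_ratio(850, 910, 9000, 9600)

print(f"memory clock gain: {mem_gain:.1%}")
print(f"score gain:        {score_gain:.1%}")
print(f"scaling ratio:     {ratio:.2f}")
```

    A ratio this close to 1 means the score rose almost in step with the memory clock, which fits the bandwidth-limited reading for this particular test.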
     
  4. armchair_architect

    Newcomer

    Joined:
    Nov 28, 2006
    Messages:
    128
    Likes Received:
    8
    The main difference between console and PC is that the console has to perform well under far fewer configurations (resolutions, AA modes, etc.). This makes it possible to target the console very carefully at one or two high priority setups, and lets you design more towards the "worst-case" end of the spectrum than a "typical case".

    Xenos was intended to never bottleneck on pixel throughput, and in the timeframe it was being designed the obvious candidate for backend bottlenecks was heavy alpha blending (for particles/smoke/weather/etc.) rather than the current HDR (4xfp16) and deferred rendering candidates.

    So do the math. Facts:
    500 MHz
    32 samples/clk (8 pixels @ 4xAA)
    no compression
    32 bits/sample for z/stencil
    32 bits/sample color at full-speed, 64 bits/sample at half-speed

    Therefore, with Z/stencil read+write and alpha blending (color read+write) for each sample, the backend units can consume up to:
    500 MHz * 32 samples * ((2 * 4B) + (2 * 4B)) = 256 GB/s (238.4 GiB/s in binary units)

    Which is essentially the whole 256GB/s theoretical peak -- better than you'd get in a more complex setup (like more clients and having to arbitrate between them).

    So yeah, Xenos' EDRAM bandwidth and ability to consume that bandwidth are very well matched and provide exactly what was being aimed at: 4xAA with alpha-blend and z-test without the backend being a bottleneck.
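
    The arithmetic above is easy to check in a few lines. This is only a sketch of the same back-of-the-envelope sum, with made-up constant names; it prints the result in both decimal (GB) and binary (GiB) units, since 238.4 is the binary-unit reading of the same byte count while the 256GB/s peak is normally quoted in decimal.

```python
CLOCK_HZ = 500e6        # 500 MHz
SAMPLES_PER_CLK = 32    # 8 pixels at 4xAA
Z_BYTES = 4             # 32 bits/sample for z/stencil
COLOR_BYTES = 4         # 32 bits/sample color at full speed

# Read + write for Z/stencil, plus read + write for color (alpha blending).
bytes_per_sample = 2 * Z_BYTES + 2 * COLOR_BYTES

bytes_per_second = CLOCK_HZ * SAMPLES_PER_CLK * bytes_per_sample

gb_per_s = bytes_per_second / 1e9      # decimal gigabytes per second
gib_per_s = bytes_per_second / 2**30   # binary gibibytes per second

print(f"{gb_per_s:.1f} GB/s")
print(f"{gib_per_s:.1f} GiB/s")
```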

    Doing the same analysis for RV670 is much harder: compression has varying effectiveness, and the bandwidth is shared with textures, vertex data, scanout, etc. (which not only compete for bandwidth, but make it harder to get maximum efficiency out of the DRAMs). But it's easy to see that if you've got 4xFP16 rendertargets (RV670 can blend these at full speed) and/or less than perfect compression, RV670's RBEs can be bandwidth limited even though they only do two samples/pixel/clk.
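
    To put a rough number on that, here is the same kind of sum for an uncompressed worst case. Everything in it is an assumption made for illustration rather than something stated in this post: a 775 MHz core, 16 pixels per clock, one sample per pixel, 8 bytes per 4xFP16 pixel, and 72 GB/s of memory bandwidth. The function name `rbe_demand_gbs` is hypothetical.

```python
def rbe_demand_gbs(clock_mhz, pixels_per_clk, color_bytes, z_bytes=0):
    """Uncompressed backend traffic in GB/s (decimal).

    Color is read and written once per pixel (alpha blending); Z/stencil is
    read and written once per pixel when z_bytes is non-zero.
    """
    bytes_per_pixel = 2 * color_bytes + 2 * z_bytes
    return clock_mhz * 1e6 * pixels_per_clk * bytes_per_pixel / 1e9


# Assumed figures, not taken from the post: 775 MHz core, 16 pixels/clk,
# 72 GB/s of memory bandwidth shared with every other client.
CLOCK_MHZ = 775
PIXELS_PER_CLK = 16
AVAILABLE_GBS = 72.0

fp16_blend = rbe_demand_gbs(CLOCK_MHZ, PIXELS_PER_CLK, color_bytes=8)
fp16_blend_z = rbe_demand_gbs(CLOCK_MHZ, PIXELS_PER_CLK, color_bytes=8, z_bytes=4)

for label, demand in [("4xFP16 blend", fp16_blend), ("4xFP16 blend + Z", fp16_blend_z)]:
    print(f"{label:>18}: {demand:6.1f} GB/s, {demand / AVAILABLE_GBS:.1f}x available")
```

    Under these assumptions, compression would have to cut the traffic by well over half before the RBEs stop outrunning memory, and that is before textures, vertex data and scanout take their share.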

    Of course this kind of heavy alpha blending w/ Z read/write is only a fraction of current frames, so the RBEs will typically consume much less bandwidth than this -- that's what I meant by designing PC parts more around typical or average case than worst case.

    I'm not saying RV670 is often bandwidth-limited in current games -- just that it has the potential to be, unlike Xenos (considering only backend bandwidth, of course).
     
  5. PsychoZA

    Newcomer

    Joined:
    Mar 1, 2007
    Messages:
    75
    Likes Received:
    0
    The 8800GT isn't, but the 8800GTS is very limited by it.
     
  6. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
    Again, I'm not trying to make definite assertions either way. I'm not saying Xenos DOES use all that bandwidth or that RV670 IS bandwidth limited. What I'm saying is that two mutually exclusive statements have been made, and I'm asking for that either to be acknowledged or for the conflict to be resolved via some technical reason that I'm not aware of. For clarification, the two statements are:

    • Xenos can use near 256GB/s of bandwidth (it doesn't matter that it's only in limited situations and not main memory; it can still use it for some things)
    • RV670 cannot use more than 72GB/s
    To me, those two statements can't both be correct; one or the other is wrong.

    We can say that most of the time Xenos doesn't use anywhere near that much bandwidth, but the fact of the matter is that situations where it does use a lot of it (if they even exist) would also be able to exist in a PC game, and thus in those situations RV670 would be bandwidth limited.

    The question then becomes one of how often those situations occur. Is it so rarely that it doesn't have much noticeable impact on real world performance? And if that's the case, what implications does it have for the usefulness of all that bandwidth in the 360?
     
  7. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,535
    Likes Received:
    144
    The proof of that being where? Ignoring that we're talking cross-IHV comparisons that don't make that much sense (I picked the 8800GT as a (far-fetched) example because it packs a similar functional unit arrangement; the GTS (classic) handles more pixels (20>16), nV can handle 4 multisamples per cycle, ATi only does two, etc.). Show me this great BW limited scenario.
     
  8. IbaneZ

    Regular

    Joined:
    Apr 15, 2003
    Messages:
    743
    Likes Received:
    17
    Nothing to get excited about. OK, it might give the 14-month-old 8800 GTX a fight. Yawn...

    This is gonna be a crappy year.
     
  9. vertex_shader

    Banned

    Joined:
    Sep 8, 2006
    Messages:
    961
    Likes Received:
    14
    Location:
    Far far away
    After 441 days (Jan 28), ATi takes the "king of the hill" crown away from NV and you're not excited? :smile:
     
  10. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,535
    Likes Received:
    144
    I give up:|
     
  11. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
    602; NV has held the crown since 5.6.06 (7950GX2)... ;)
    And ATi will hold it from 28.1. to 14.2.
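
    The day counts are quick to verify. The snippet below is just a check, reading "5.6.06" as day.month.year (5 June 2006) and assuming both 2008 dates from the timing of this thread.

```python
from datetime import date

crown_taken = date(2006, 6, 5)   # "5.6.06", read as day.month.year
x2_launch = date(2008, 1, 28)    # "28.1.", year assumed to be 2008
gx2_launch = date(2008, 2, 14)   # "14.2.", year assumed to be 2008

days_nv_on_top = (x2_launch - crown_taken).days
days_ati_on_top = (gx2_launch - x2_launch).days

print(days_nv_on_top)    # 602
print(days_ati_on_top)   # 17
```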
     
  12. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
    Cheers, that basically answers my point.
     
  13. _xxx_

    Banned

    Joined:
    Aug 3, 2004
    Messages:
    5,008
    Likes Received:
    86
    Location:
    Stuttgart, Germany
    To be realistic, ATI has been trailing nV ever since the GF6800 appeared.

    But back on topic: what brand of crystal ball do you possess, vertex_shader? How can you possibly know ATI will get the crown back? Or even that they'll release the product on time? Or that CF will work properly? Or that nV won't counter immediately if so needed?
     
  14. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
    Lol, very good point. I would usually discount the 7950GX2 with it being a dual GPU card but that would be a little hypocritical under the circumstances :smile:
     
  15. vertex_shader

    Banned

    Joined:
    Sep 8, 2006
    Messages:
    961
    Likes Received:
    14
    Location:
    Far far away
    Yeah, right: if we count the HD3870X2, then we need to count that too :smile:

    Yes, better than zero :wink:
     
  16. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
    I don't know about that. NV had the feature advantage, but ATI certainly had the speed advantage in NV40 vs R420.

    NV didn't retake the speed crown until the 7800GTX, and then later ATI re-took it with the X1900XTX (I'm ignoring the minor blip of the X1800XT).
     
  17. vertex_shader

    Banned

    Joined:
    Sep 8, 2006
    Messages:
    961
    Likes Received:
    14
    Location:
    Far far away
    I can't talk about unreleased crystal ball sorry :lol:

    I have a good feeling about the HD3870X2. Of course it won't meet everyone's expectations 100%, and it has weak points in games (CF not working, or not scaling well), but I'm optimistic about overall performance now.

    The HD3870X2 is coming on Jan 28, so according to the rumors it has a 17-day lead over the 9800GX2, which is faster but costs $50 more (at least on paper).
     
  18. _xxx_

    Banned

    Joined:
    Aug 3, 2004
    Messages:
    5,008
    Likes Received:
    86
    Location:
    Stuttgart, Germany
    Well that is debatable, but as an overall package the GF6800 was capable of more than ATI's competing products and to me that counts more than a few % more speed in like half of the games.

    As for R680, I have no idea, but vertex, you claimed that above with the kind of confidence only found in people with just the "feeling" and no hard facts ;) Thus I asked.
     
  19. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    Okay, are we still on the bandwidth subtopic?

    Anyways, I did some cross testing today with my old 2900XT and the 3870 lying around here, to illustrate the puzzle a bit more.
    For the sake of an apples-to-apples comparison, both GPUs were clocked at 800MHz, while the memory on the 2900 and 3870 boards was set to 1800MHz and 2700MHz effective (115GB/s vs. 86GB/s), respectively.
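
    Those two bandwidth figures follow from bus width times data rate. A minimal sketch, assuming the quoted 1800 and 2700 are effective data rates in MT/s and taking the 512-bit and 256-bit bus widths mentioned earlier in the thread; the function name is made up:

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_mts: float) -> float:
    """Peak memory bandwidth in GB/s (decimal).

    bus_width_bits / 8 gives bytes moved per transfer; data_rate_mts is the
    effective transfer rate in millions of transfers per second.
    """
    return bus_width_bits / 8 * data_rate_mts * 1e6 / 1e9


hd2900_bw = memory_bandwidth_gbs(512, 1800)   # 512-bit bus
hd3870_bw = memory_bandwidth_gbs(256, 2700)   # 256-bit bus

print(f"HD2900: {hd2900_bw:.1f} GB/s")
print(f"HD3870: {hd3870_bw:.1f} GB/s")
print(f"ratio:  {hd3870_bw / hd2900_bw:.2f}")
```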

    -=3DMark'06 Single Texture FillRate=-

    HD2900: 8449 MPix;
    HD3870: 7635 MPix;


    -=FillrateBenchmark v0.92=-

    HD2900:
    Code:
               FrameBuffer Clear : 11392 FPS
                      Color Fill : 11936,15 M-Pixel/s
                          Z Fill : 22943,68 M-Pixel/s
                  Color + Z Fill : 11754,96 M-Pixel/s
                  Single Texture : 11825,42 M-Pixel/s
      Single Texture Alpha Blend : 7791,339 M-Pixel/s
                   Dual Textures : 6220,992 M-Pixel/s
                 Triple Textures : 4190,11 M-Pixel/s
                   Quad Textures : 3168,377 M-Pixel/s
        1 Floating Poing Texture : 11830,45 M-Pixel/s
                  Render to Self : 9139,809 M-Pixel/s
    HD3870:
    Code:
               FrameBuffer Clear : 12044,8 FPS
                      Color Fill : 11747,41 M-Pixel/s
                          Z Fill : 22407,65 M-Pixel/s
                  Color + Z Fill : 9427,118 M-Pixel/s
                  Single Texture : 11578,8 M-Pixel/s
      Single Texture Alpha Blend : 6963,384 M-Pixel/s
                   Dual Textures : 6140,461 M-Pixel/s
                 Triple Textures : 4142,295 M-Pixel/s
                   Quad Textures : 3125,595 M-Pixel/s
        1 Floating Poing Texture : 11573,76 M-Pixel/s
                  Render to Self : 9104,365 M-Pixel/s
    An additional set of numbers from an 8800GTS-512 board (800-2000/2200 MHz):
    Code:
               FrameBuffer Clear : 28019,2 FPS
                      Color Fill : 12205,42 M-Pixel/s
                          Z Fill : 57722,85 M-Pixel/s
                  Color + Z Fill : 11369,92 M-Pixel/s
                  Single Texture : 12187,81 M-Pixel/s
      Single Texture Alpha Blend : 6163,11 M-Pixel/s
                   Dual Textures : 12160,13 M-Pixel/s
                 Triple Textures : 12054,43 M-Pixel/s
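
    For a quick side-by-side, the sketch below divides each HD3870 result by the matching HD2900 result and sets that against the 86/115 memory bandwidth ratio. Only a few of the tests are copied in, and the helper `parse` is a made-up name that converts the benchmark's comma-decimal output.

```python
def parse(value: str) -> float:
    """Convert the benchmark's comma-decimal output ("11936,15") to a float."""
    return float(value.replace(",", "."))


# M-Pixel/s figures copied from the two listings above.
hd2900 = {
    "Color Fill": "11936,15",
    "Z Fill": "22943,68",
    "Color + Z Fill": "11754,96",
    "Single Texture Alpha Blend": "7791,339",
}
hd3870 = {
    "Color Fill": "11747,41",
    "Z Fill": "22407,65",
    "Color + Z Fill": "9427,118",
    "Single Texture Alpha Blend": "6963,384",
}

bandwidth_ratio = 86 / 115  # 3870 vs. 2900 memory bandwidth, as stated in the post

# HD3870 result as a fraction of the HD2900 result, per test.
ratios = {test: parse(hd3870[test]) / parse(hd2900[test]) for test in hd2900}

for test, r in ratios.items():
    print(f"{test:>28}: {r:.2f}")
print(f"{'memory bandwidth':>28}: {bandwidth_ratio:.2f}")
```

    On these numbers only Color + Z Fill and alpha blending drop noticeably (to roughly 0.80 and 0.89 of the 2900's result), and neither falls as far as the 0.75 bandwidth ratio.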
     
    #219 fellix, Jan 11, 2008
    Last edited by a moderator: Jan 11, 2008
  20. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,027
    Likes Received:
    90
    Test some real games and then we'll talk ;)
     