AMD: R7xx Speculation

Discussion in 'Architecture and Products' started by Unknown Soldier, May 18, 2007.

  1. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    1 billion transistors is not very many, if R600 is ~700M.

    R300 ~107M
    R420 ~160M
    R520 ~320M
    R600 ~700M

    If the multi-chip rumour is true, then wouldn't that add overhead transistors?
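    As rough back-of-the-envelope arithmetic (my own sketch; the counts are the approximate figures above, and the ~1B figure is only a rumour), the generational growth looks like this:

    Code:
    # Approximate transistor counts quoted in this thread.
    counts = {"R300": 107e6, "R420": 160e6, "R520": 320e6, "R600": 700e6}

    names = list(counts)
    for prev, cur in zip(names, names[1:]):
        print(f"{prev} -> {cur}: roughly x{counts[cur] / counts[prev]:.1f}")

    # If the roughly-2x trend held, the next part would land around 1.4B,
    # so a rumoured ~1B (minus any multi-chip glue) is not a huge jump.
    print(f"R600 x2 = {counts['R600'] * 2 / 1e9:.1f}B transistors")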

    Jawed
     
  2. gandalfthewhite

    Newcomer

    Joined:
    Feb 21, 2004
    Messages:
    43
    Likes Received:
    0
    ATI has been running semi-split development teams for a while. They said they were going to stop doing that around the R420 days, but who knows if they did or not.
     
  3. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,063
    Likes Received:
    5,016
    Well, considering both ATI and Intel are investigating multi-chip/multi-core scalable graphics chips, I'd imagine there's something there that makes it worth it.

    It may just be something as simple as this: at some point you aren't going to be able to keep shrinking the process, and thus you'll run into a wall for how many transistors you can put on a single chip.

    1 billion transistors on a single chip is going to make for a monster of a chip even at a smaller process.

    As such, is it possible to move the ringbus off chip? Such that it could serve multiple chips on a single substrate? Could this be one of the reasons that ATI has invested so heavily in that memory architecture?

    And if so, would the complexity of such a solution be offset by using more chips of a simpler, less transistor-heavy design? I'd imagine this would improve overall yields, no?
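    As a purely illustrative sketch of the yield angle (the defect density is a made-up number, not anything from a foundry), a simple Poisson yield model shows why two half-size dies can beat one big one:

    Code:
    import math

    def poisson_yield(area_mm2, defects_per_mm2):
        # Fraction of dies with zero defects under a simple Poisson model.
        return math.exp(-area_mm2 * defects_per_mm2)

    D0 = 0.002  # assumed defect density per mm^2, purely for illustration

    print(f"one 400 mm^2 die     : {poisson_yield(400, D0):.0%} yield")  # ~45%
    print(f"each of two 200 mm^2 : {poisson_yield(200, D0):.0%} yield")  # ~67%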

    It'd be interesting to see if Nvidia was also investigating a possible move to multi-chip/multi-core. Actually, I think we may already be seeing the signs of this with the NVIO chip.

    Regards,
    SB
     
  4. neliz

    neliz GIGABYTE Man
    Veteran

    Joined:
    Mar 30, 2005
    Messages:
    4,904
    Likes Received:
    23
    Location:
    In the know
    We talked about it in another thread, and yes, although there are separate teams (because of distance), much of the resources and "production" is shared nowadays. There is no such thing as an "R600 team" anymore.

    Although a multi-chip design is not a performance enhancer per se, it does open roads to future improvements far beyond what is possible within the current limitations.
    With R600 already being built with a parallel design in mind, I can't see performance degrading when going to a multi-core design.

    As Jawed pointed out, how are you going to feed such a beast when it's a group of mid-range processors stuck together?
    My guess is that this design will incorporate ATI's clock domains, grouping four RV610s together clocked at insane speeds, with the whole design made or broken by the interaction of the ring bus controller.
     
    #24 neliz, May 21, 2007
    Last edited by a moderator: May 21, 2007
  5. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    In the future, going multi-chip may be the only way to get past a certain level of performance, simply because a single chip could run into area/power constraints. But given the same amount of aggregate shader/texture/MC bandwidth, it can never be as efficient: it is just too easy on a single chip to add extremely high-bandwidth buses. The cost of external buses is very high.
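    To put rough numbers on that (both configurations are assumed for illustration, not real parts):

    Code:
    def gbytes_per_s(width_bits, clock_ghz):
        return width_bits * clock_ghz / 8

    on_chip  = gbytes_per_s(1024, 0.75)  # a very wide internal bus at a core-ish clock
    off_chip = gbytes_per_s(32, 2.5)     # 32 signal pins running at a fast I/O rate

    print(f"on-chip : {on_chip:.0f} GB/s, and the wires are nearly free")
    print(f"off-chip: {off_chip:.0f} GB/s, and every bit costs a pin, a pad and board routing")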

    Exactly how is R600 built with a parallel design in mind any more than other GPUs?
     
  6. neliz

    neliz GIGABYTE Man
    Veteran

    Joined:
    Mar 30, 2005
    Messages:
    4,904
    Likes Received:
    23
    Location:
    In the know
    That's what I was trying to say, yeah :(

    I meant that parallel operations on R600 seem to perform much better than on previous hardware, and I can see work for a dispatcher that would resolve bottlenecks in some situations. But then again... other bottlenecks arise with this kind of design.
     
  7. Megadrive1988

    Veteran

    Joined:
    May 30, 2002
    Messages:
    4,638
    Likes Received:
    148

    Sorry, I haven't been keeping up with the more recent ATI/AMD developments as well as I should. Interesting, thanks.
     
  8. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,063
    Likes Received:
    5,016
    Well, referring to this diagram from the B3D piece...

    http://www.beyond3d.com/images/reviews/r600-arch/r600-big.png

    It appears that the SPUs and ROPs (RBEs) are set up as 4 distinct and separate groups of processing clusters. And there's an overall command/setup(?) structure that controls the whole thing. And presumably it all communicates over the ringbus.

    On the surface at least, it would seem possible that you could

    1. Move the ringbus off chip to maintain the same type of communication.
    2. Have a central "command" processor chip.
    3. Have multiple dedicated processing chips.

    I realize this is a gross over-simplification of what is probably going on. But it wouldn't take much imagination to think that R600 was possibly just a stepping stone on the way to a multi-chip/multi-core architecture, R700 perhaps?
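    Purely as a thought experiment (this is my own toy illustration of the split above, not how R600 or any future part actually schedules work), the partitioning could look like a command chip handing tiles to identical processing chips:

    Code:
    from itertools import cycle

    def dispatch(tiles, num_chips):
        # "Command" chip cuts the frame into tiles and hands them out round-robin.
        workers = {chip: [] for chip in range(num_chips)}
        for tile, chip in zip(tiles, cycle(range(num_chips))):
            workers[chip].append(tile)  # in hardware, this hop would be the off-chip ring
        return workers

    screen_tiles = [(x, y) for y in range(4) for x in range(4)]  # a 4x4 grid of tiles
    for chip, work in dispatch(screen_tiles, 4).items():
        print(f"chip {chip}: {work}")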

    Regards,
    SB

    [Edit] Which makes me wonder if RV610 and RV630 are ways for them to experiment with different TMU/SPU ratios to find out which one works best for future multi-whatever chips.
     
    #28 Silent_Buddha, May 21, 2007
    Last edited by a moderator: May 21, 2007
  9. Unknown Soldier

    Veteran

    Joined:
    Jul 28, 2002
    Messages:
    2,238
    Likes Received:
    33
  10. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,429
    Likes Received:
    428
    Location:
    New York
    I'm not sure where the motivation for splitting up R700 will come from. Is it really going to be so complex a chip that the upcoming 65nm and 55nm processes will result in excessively large dies? Isn't 65nm something like a 50% reduction compared to 90nm?
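    For what it's worth, ideal scaling (and it is only ideal; real shrinks never scale perfectly) works out to roughly half:

    Code:
    def ideal_area_scale(old_nm, new_nm):
        # Ideal shrink: area scales with the square of the feature-size ratio.
        return (new_nm / old_nm) ** 2

    print(f"90nm -> 65nm: {ideal_area_scale(90, 65):.0%} of the original area")  # ~52%
    print(f"90nm -> 55nm: {ideal_area_scale(90, 55):.0%} of the original area")  # ~37%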
     
  11. Tim Murray

    Tim Murray the Windom Earle of mobile SOCs
    Veteran

    Joined:
    May 25, 2003
    Messages:
    3,278
    Likes Received:
    66
    Location:
    Mountain View, CA
    It's not so much an issue of die size as it is tape-out costs, or so Arun has convinced me.
     
  12. _xxx_

    Banned

    Joined:
    Aug 3, 2004
    Messages:
    5,008
    Likes Received:
    86
    Location:
    Stuttgart, Germany
    No way you could move the ring bus off chip; that's a rather ridiculous idea. It would slow everything down by an order of magnitude.
     
  13. PSU-failure

    Newcomer

    Joined:
    May 3, 2007
    Messages:
    249
    Likes Received:
    0
    An on-chip ring bus + "off-chip" clients is a solution.

    I don't know how much die space each part of R600 takes, but if the texturing units consume a lot of it, that's one possible application.

    By removing the ROPs (hey, doesn't that look like an extension of R600's "custom filters"?), only two parts remain, looking quite similar to the good old Voodoo2 design: TMU chips and one SP array + memory controller die.
     
  14. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    Even if the ring bus could be used as a basis for inter-chip communication, hard as it is, it's probably one of the least difficult problems to solve. All it does, after all, is just transport data...

    The overall architecture of how to partition a GPU into multiple dies and do it efficiently is much harder: what kind of data will travel between the dies? What will the memory architecture look like (more or less mirrored, like CF/SLI, or distributed and shared)? Will it duplicate setup engines, or will there be a master/slave configuration? Etc.
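    Quick arithmetic on the memory question (the board configuration is made up, just to show the trade-off): mirrored memory buys capacity you can't use, while a shared pool uses it all but makes texturing cross chips.

    Code:
    chips, mem_per_chip_mb = 4, 256  # assumed board configuration, for illustration only

    mirrored_usable    = mem_per_chip_mb          # CF/SLI style: every chip holds a full copy
    distributed_usable = chips * mem_per_chip_mb  # one shared pool spread across the chips

    print(f"memory bought      : {chips * mem_per_chip_mb} MB")
    print(f"usable, mirrored   : {mirrored_usable} MB")
    print(f"usable, distributed: {distributed_usable} MB (but texture fetches may cross chips)")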

    That's why I don't really see how the organization of the major blocks in R600 is significantly different from R580 or G80: there are no obvious indications of doing things differently in a way that would make separation easier.
     
  15. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,063
    Likes Received:
    5,016
    But isn't it already separated to an extent? At least the diagram would imply that the SPUs and ROPs are four separate entities.

    Regards,
    SB
     
  16. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    939
    Likes Received:
    35
    Location:
    LA, California
    I don't think it would be mirrored, since if 2-4 medium-size dice are used to make a high-end board, I would expect the cost of equivalent amounts of usable memory (compared to a single-chip design) to be significant. On the other hand, I'd also think that full-speed texturing from non-local memory would require significantly more latency tolerance than would otherwise be the case, so that would adversely affect performance/mm2.

    silent_guy - is chip-to-chip latency significantly reduced by placing multiple chips into a single package (like Clovertown)? Also, another possibly crazy question: is it possible to build a compute tile (where in this case a compute tile is an entire GPU) where each tile connects to the top/bottom/left/right tile on the same wafer? That way, maybe one could actually cut different-sized dies out of a wafer - you'd cut, say, 2x2 tiles for high-end dies, 1x1 for low end, 1x2 for midrange. I imagine each tile would be connected to its neighbours with some fairly wide/high-speed bus, and each tile would be designed to handle the fact that the bus leading to any one of its 4 neighbours might go nowhere...
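    A crude way to picture that tiling idea (entirely my own illustration; the tile sizes and link handling are made up):

    Code:
    def build_die(tiles_x, tiles_y):
        # For each tile, record which of its 4 neighbour links actually connect.
        die = {}
        for y in range(tiles_y):
            for x in range(tiles_x):
                die[(x, y)] = {
                    "left":  (x - 1, y) if x > 0           else None,  # None = link goes nowhere
                    "right": (x + 1, y) if x < tiles_x - 1 else None,
                    "up":    (x, y - 1) if y > 0           else None,
                    "down":  (x, y + 1) if y < tiles_y - 1 else None,
                }
        return die

    for name, (w, h) in {"low end": (1, 1), "midrange": (1, 2), "high end": (2, 2)}.items():
        die = build_die(w, h)
        live = sum(v is not None for links in die.values() for v in links.values())
        print(f"{name}: {w}x{h} tiles, {live} live inter-tile links")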
     
  17. PSU-failure

    Newcomer

    Joined:
    May 3, 2007
    Messages:
    249
    Likes Received:
    0
    If you look at the Z-buffer you'll see it's connected both to the "scheduler" part and to the "output" part.

    psurge> Clovertown doesn't have a die-to-die connection; they rely on the "Netburst" FSB for that.
     
  18. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    213
    Location:
    Uffda-land
    One of the truisms/mantras of engineering is something like "optimize for the common case, not the corner case". Maybe that idea is being applied here as well. Especially when that tiny percentage of high-end buyers has proven that cost isn't a major concern for them, you can start building into your models the idea that you can stick them with the extra memory costs associated with SLI/CF types of implementations.

    But some of you old timers know that I've been expecting this kind of thing that Inq is suggesting re R700 to become common for two years or more.

    The fly in that ointment, to some degree, however, is the experience with two GX2s, which didn't seem too promising, frankly.
     
  19. neliz

    neliz GIGABYTE Man
    Veteran

    Joined:
    Mar 30, 2005
    Messages:
    4,904
    Likes Received:
    23
    Location:
    In the know
    512-bit multi-chip boards all the way, Geo.
     
  20. satein

    Regular

    Joined:
    Aug 17, 2005
    Messages:
    483
    Likes Received:
    21
    Location:
    Sheffield, UK.
    I think two GX2s didn't seem too promising because the GPU wasn't designed with multiple chips in mind (architecture-wise). The baseline G70 architecture was aimed at dual-chip operation only. Anyway, more than 2 chips would work (theoretically), but it might not hit its sweet spot compared to dual chips in an SLI/Crossfire setup. The same analogy can be seen in the CPU area: adding more processors doesn't always mean more performance.

    It would be more interesting to see whether G80 and R600 could do well with more than 2 chips in SLI/Crossfire.

    Regards,

    Edit: typo as usual...
     