R700 Inter-GPU Connection Discussion

Discussion in 'Architecture and Products' started by Arty, Jun 28, 2008.

  1. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,205
    Likes Received:
    607
    To do single-frame parallel rendering without duplicating the vertex load, half the dynamic textures and half the transformed vertices have to go over the link ... that's quite a lot of data for what seems to be a link running at PCIe 2.0 x16 speed.
     
  2. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,007
    Likes Received:
    60
    I doubt he was the first. I'd been saying it for quite some time myself, and there were slides out there for years detailing some sort of "shared memory" scheme. I also believe HSI schemes had been, if not detailed, at least implied...
     
  3. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,007
    Likes Received:
    60
    Vertices can be streamed and textures can be compressed. Also, ~8GB/s bi-directionally ain't half-bad.
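
    For reference, the ~8GB/s figure follows from the published PCIe 2.0 signalling numbers (5 GT/s per lane, 8b/10b encoding). A minimal sketch of that arithmetic, assuming the inter-GPU link really does behave like a x16 PCIe 2.0 connection and ignoring packet and protocol overhead; the function name is made up for illustration:

```python
# PCIe 2.0 signalling: 5 GT/s per lane, 8b/10b line coding
# (8 payload bits for every 10 bits on the wire).
TRANSFERS_PER_S_PER_LANE = 5.0e9
ENCODING_EFFICIENCY = 8 / 10


def pcie2_bytes_per_s(lanes: int) -> float:
    """Raw bandwidth in one direction, in bytes/s, before protocol overhead."""
    payload_bits_per_s = TRANSFERS_PER_S_PER_LANE * ENCODING_EFFICIENCY * lanes
    return payload_bits_per_s / 8


per_direction = pcie2_bytes_per_s(16)
print(f"{per_direction / 1e9:.1f} GB/s in each direction")
```

    The real usable rate would be somewhat lower once packet headers and flow control are accounted for.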
     
  4. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    I'm pretty sure we'll still see AFR, but that doesn't mean we won't have textures split across the memory pools. Whether PCIe is the only link or there is another link (as has been suggested multiple times), bandwidth from static textures is usually not too high, especially when using DXT.

    Space occupied by render-to-texture, including shadow and reflection maps, would obviously be doubled, and I'm sure that games using under 512MB would duplicate everything, but with a good enough link and smart memory management there should be a lot more than 512MB available to a 3D app if needed without too big of a performance drop.
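
    The capacity argument above can be put as a rough calculation. This is only a sketch under simplifying assumptions: data is either duplicated in every pool (render targets and the like) or stored exactly once, and the cost of reaching the other GPU's pool over the link is ignored. The function name and the 128MB example value are hypothetical.

```python
def unique_capacity_mb(pool_mb: float, gpus: int, duplicated_mb: float) -> float:
    """Distinct data that fits when `duplicated_mb` must live in every pool.

    Each pool holds a full copy of the duplicated data; the space left over
    in each pool can hold data that exists only once across the card.
    """
    if duplicated_mb > pool_mb:
        raise ValueError("duplicated data does not fit in a single pool")
    stored_once = gpus * (pool_mb - duplicated_mb)
    return duplicated_mb + stored_once


# Everything duplicated: the app sees a single pool's worth of memory.
print(unique_capacity_mb(512, 2, 512))
# Only 128MB of render targets duplicated: well over 512MB is usable.
print(unique_capacity_mb(512, 2, 128))
```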

    SFR without duplicating vertex work will take quite some time, IMO. Interestingly the software solution is quite similar to tiling on the 360...
     
  5. Karma

    Newcomer

    Joined:
    Jul 28, 2004
    Messages:
    36
    Likes Received:
    0
    Well, we already have a picture of the card with 16 RAM chips on both sides. That seems to indicate 2x1GB.
     
  6. Karma Police

    Regular

    Joined:
    Sep 2, 2004
    Messages:
    433
    Likes Received:
    6
    Location:
    192.168.2.1
    Now that I look at the chips closer, it's 2x512MB total on the board.

    So assuming the card in the picture and the screenshot card are the same, it looks like ATi really does have working shared memory.

    EDIT: Hey, what's up with my ID?
     
  7. Anarchist4000

    Veteran Regular

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    16 total chips or 16 chips on each side? If the pool is shared, 2x512MB would be plenty of room, I'd think.
     
  8. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,239
    Likes Received:
    3,184
    Location:
    Finland
    The pics have 8 chips per side, 4 per GPU on back and 4 per GPU on front.
    However, there were 2x1GB models of the 3870X2 too, weren't there? So nothing says the slide couldn't just have two such cards there. Then again, we can hope it really is a shared memory pool per card.
     
  9. Karma Police

    Regular

    Joined:
    Sep 2, 2004
    Messages:
    433
    Likes Received:
    6
    Location:
    192.168.2.1
    Can anyone make out the labels on the chips?
     
  10. Pantagruel's Friend

    Newcomer

    Joined:
    Jun 17, 2007
    Messages:
    59
    Likes Received:
    0
    Location:
    Budapest, Hungary
    and if we see AFR, what will we see in those cases when the application wants to reuse geometry in the subsequent frame?
    um. what do we see now, actually? :?:
     
  11. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,402
    Likes Received:
    4,111
    Location:
    Well within 3d
    Combining both paths of a branch is possible with many ISAs, and Cell can do this as well.

    Whether it's done through predicated instructions or conditional moves for a given architecture, the effect of executing down both paths of a branch can be derived.

    I don't know of any significant architecture that does this in hardware.
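
    The software version of "executing down both paths" is usually called if-conversion. A minimal sketch of the idea in Python, where the arithmetic select stands in for a predicated instruction or conditional move; the functions and the branch bodies are made up for illustration:

```python
def branchy(x: int) -> int:
    """Reference version with a real branch."""
    if x < 0:
        return -x * 2
    return x + 1


def if_converted(x: int) -> int:
    """Both paths are evaluated; a predicate selects which result survives."""
    then_value = -x * 2      # result of the taken path
    else_value = x + 1       # result of the not-taken path
    take_then = int(x < 0)   # predicate as 0 or 1
    return take_then * then_value + (1 - take_then) * else_value


for value in (-3, 0, 5):
    assert branchy(value) == if_converted(value)
```

    The cost is that the work of both paths is always paid, which is why this only pays off for short branch bodies.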

    Right now, I believe each frame's geometry setup is done fresh.
    I'm not sure how to safely reuse geometry between frames without checking to see if it needs to be set up again.
     
  12. Pantagruel's Friend

    Newcomer

    Joined:
    Jun 17, 2007
    Messages:
    59
    Likes Received:
    0
    Location:
    Budapest, Hungary
    so, when a single card happily reuses geometry shader output from the previous frame, the dual card solutions are simply forced to redo the geometry calculations? it sounds possible. it also sounds ugly :???:
     
  13. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    214
    Location:
    Uffda-land
    My spidey-sense is still telling me what we're going to see here is incremental rather than revolutionary.

    Not that useful progress can't be made incrementally!
     
  14. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    I can't think of many reasons that you would use a geometry shader's output for multiple frames. I doubt any game does that today.

    More complicated are things like position data used in cloth and water simulation along with some image persistence techniques. The only way to handle that is to transfer the texture across the link immediately after updating it.
     
  15. Pantagruel's Friend

    Newcomer

    Joined:
    Jun 17, 2007
    Messages:
    59
    Likes Received:
    0
    Location:
    Budapest, Hungary
    I was thinking of stuff like waterfalls or explosions generated with GS. it was only my assumption that it preserves data across multiple frames - wouldn't it?

    but your examples work just the same. and your solution implies there's a lot of data passed back and forth every frame. even with the 8GB/sec link, I think it may become a bottleneck.
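
    To put a rough number on that concern, here is a back-of-the-envelope sketch. It assumes the link sustains the full 8GB/s in one direction with no overhead, and the 1024x1024 four-channel float texture is a hypothetical example, not something taken from a real game.

```python
LINK_BYTES_PER_S = 8e9  # the ~8GB/s figure quoted in this thread


def per_frame_budget_bytes(fps: float, link: float = LINK_BYTES_PER_S) -> float:
    """Most data the link can move in one direction during a single frame."""
    return link / fps


def transfer_time_ms(size_bytes: float, link: float = LINK_BYTES_PER_S) -> float:
    """Time to push `size_bytes` across the link, in milliseconds."""
    return size_bytes / link * 1000.0


# Hypothetical simulation texture: 1024x1024 texels, four 32-bit floats each.
texture_bytes = 1024 * 1024 * 4 * 4

print(f"budget at 60 fps: {per_frame_budget_bytes(60) / 1e6:.1f} MB per frame")
print(f"one texture copy: {transfer_time_ms(texture_bytes):.2f} ms "
      f"of a {1000 / 60:.1f} ms frame")
```

    A single buffer like this fits comfortably, but several such transfers per frame, each one stalling the other GPU until it arrives, would eat into the frame time quickly.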
     
  16. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    If you look at my posting history, you'll see that I haven't been a big fan of multi-GPU for just this reason. The two big problems are persistent data and wastage of memory.

    With a good link, both can be solved to a certain degree (I'm pretty sure ATI's Froblins demo is going to be trouble for R700). With an insanely fast link, multi-GPU design can probably be as fast as monolithic GPU design in all scenarios, but we're a long way from that.
     
  17. compres

    Regular

    Joined:
    Jun 16, 2003
    Messages:
    553
    Likes Received:
    3
    Location:
    Germany
    What if the link is as fast as the memory interface?
     
  18. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Still not as good as monolithic design, but it may be close enough 95% of the time.
     
  19. rwolf

    rwolf Rock Star
    Regular

    Joined:
    Oct 25, 2002
    Messages:
    968
    Likes Received:
    54
    Location:
    Canada
    Let's see...
    - lower cost
    - higher yields
    - better performance

    If they can share the same framebuffer, then there goes your argument.
     
  20. rwolf

    rwolf Rock Star
    Regular

    Joined:
    Oct 25, 2002
    Messages:
    968
    Likes Received:
    54
    Location:
    Canada