AMD: R8xx Speculation

Discussion in 'Architecture and Products' started by Shtal, Jul 19, 2008.

How soon will Nvidia respond with GT300 to the upcoming ATI RV870 lineup of GPUs?

Poll closed Oct 14, 2009.
  1. Within 1 or 2 weeks

    1 vote(s)
    0.6%
  2. Within a month

    5 vote(s)
    3.2%
  3. Within a couple of months

    28 vote(s)
    18.1%
  4. Very late this year

    52 vote(s)
    33.5%
  5. Not until next year

    69 vote(s)
    44.5%
  1. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    If you are clever about it, it's going to cost you nothing: you just need as many 1x1 exposure textures as you might have frames in flight (across all your processors).
    Typically that's 3 or 4 textures on PC (while on PS3 I have done it with just one, but that's another story...), so that you can lock one of them without running the risk of stalling the CPU.
    A few frames of latency, in this particular case, is not really an issue.
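    A minimal sketch of the ring-of-textures idea, assuming a plain Python simulation rather than any real graphics API. The class name `ExposureRing` and its methods are hypothetical, and the list slots merely stand in for 1x1 textures. Each frame the CPU reads the slot written longest ago (which the GPU should have finished with), then lets the current frame overwrite it.

```python
class ExposureRing:
    """Round-robin pool of slots standing in for 1x1 exposure textures."""

    def __init__(self, num_textures=3, initial=1.0):
        self.values = [initial] * num_textures    # simulated texture contents
        self.written_at = [None] * num_textures   # frame that last wrote each slot
        self.frame = 0                            # index of the next frame to submit

    def read_oldest(self):
        """Return (exposure, age_in_frames) from the slot written longest ago.

        That slot is also the one the upcoming frame will overwrite, so in a
        real renderer it is the one least likely to still be in use by the GPU.
        """
        slot = self.frame % len(self.values)
        written = self.written_at[slot]
        if written is None:
            return self.values[slot], None        # ring not full yet: initial value
        return self.values[slot], self.frame - written

    def submit_frame(self, measured_exposure):
        """Record the exposure the current frame renders into its slot."""
        slot = self.frame % len(self.values)
        self.values[slot] = measured_exposure
        self.written_at[slot] = self.frame
        self.frame += 1


ring = ExposureRing(num_textures=3)
for frame_index in range(5):
    exposure, age = ring.read_oldest()    # value to use as a shader constant
    ring.submit_frame(float(frame_index)) # pretend the measured exposure equals the frame index
```

    With three slots the value read is always three frames old once the ring has filled, which is the "few frames of latency" trade-off described above.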
     
  2. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    I understand that, but I was wondering more about the time between locking and reading the value, even when the resource you're locking isn't being used by anything. I assume this won't be as fast as a simple memory access.

    BTW, what are your thoughts on doing HDR this way? It's a moving window of range that itself is "only", say, 8000:1, but can be anywhere on the luminance scale you want it to be. I discussed this with you in some thread way back, but IIRC you weren't convinced at the time.
     
  3. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    Driver overhead aside, I don't see why reading back 4 bytes should be that slow, but maybe console development spoiled me :)

    IIRC I wasn't convinced by the approach Valve used for their HDR (I mean the way they compute, or rather determine, exposure), but deferring exposure usage is fine. In fact Heavenly Sword does that (mostly to speed up tone mapping, as it enabled me to replace a million tex2D() calls per frame with a simple scalar constant).
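    To illustrate the saving, here is a rough sketch assuming a 1280x720 frame and a simple Reinhard-style operator. Both are assumptions for illustration only, not the operator or resolution Heavenly Sword actually used, and the function names are hypothetical. The point is that the exposure is passed once as a scalar instead of being fetched once per pixel.

```python
def tonemap_reinhard(luminance, exposure):
    """Map an HDR luminance value into [0, 1) after scaling by exposure."""
    scaled = luminance * exposure
    return scaled / (1.0 + scaled)


def tonemap_frame(frame, exposure):
    """Tone-map every pixel with one scalar exposure shared by the whole frame."""
    return [[tonemap_reinhard(p, exposure) for p in row] for row in frame]


def exposure_fetches_saved(width, height):
    """Per-pixel exposure fetches avoided when exposure is a constant."""
    return width * height


tiny_frame = [[0.5, 1.0], [2.0, 4.0]]          # stand-in HDR luminance values
mapped = tonemap_frame(tiny_frame, exposure=1.0)
saved = exposure_fetches_saved(1280, 720)      # assumed 720p target
```

    At the assumed 720p resolution that is 921,600 fetches per frame, which is in line with the "million" figure quoted.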
     
  4. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Sure. And if you don't read back, it's always an option to sample in the vertex shader and pass the value as an interpolator, which takes the sampling cost away from the pixel shader. But the problem is of course how much time developers are going to spend optimizing for CrossFire/SLI. About as much as optimizing for S3 cards, I suppose, since I guess it represents about the same share of the market.
    And this example is of course one of the simpler ones to find a solution for; in other cases it may not be reasonable to let it lag a few frames. Not to mention all the cases where perfectly reasonable optimizations that work well on single-GPU setups have problems on multi-GPU. For instance, in your average racing game you may want to update a cubemap for the car reflections, and to speed things up you just update one face each frame. That reduces the overhead to a fraction of the cost of updating all faces, but for multi-GPU it introduces a sync point. If the update is relatively early in the frame it may not have to be a problem, but if it's late the GPUs may end up being idle for much of the frame.
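    The cubemap case can be sketched with a toy model, assuming alternate-frame rendering (AFR) where GPUs take turns on whole frames and one face is refreshed per frame in round-robin order. The function name and the bookkeeping are hypothetical; real drivers handle the transfers differently, but the dependency pattern is the same.

```python
def remote_faces_per_frame(num_gpus, num_frames, num_faces=6):
    """For each frame, list cubemap faces last rendered by a different GPU.

    Every face in that list needs data from another GPU before the current
    frame can sample the cubemap, which is what forces a sync point.
    """
    last_writer = [None] * num_faces      # GPU that most recently rendered each face
    report = []
    for frame in range(num_frames):
        gpu = frame % num_gpus            # AFR: GPUs alternate on whole frames
        face = frame % num_faces          # one face refreshed per frame
        last_writer[face] = gpu
        remote = [f for f, writer in enumerate(last_writer)
                  if writer is not None and writer != gpu]
        report.append((frame, gpu, remote))
    return report


single_gpu = remote_faces_per_frame(num_gpus=1, num_frames=6)
dual_gpu = remote_faces_per_frame(num_gpus=2, num_frames=6)
```

    On one GPU the remote list stays empty, so the optimization is free. With two GPUs, every frame after the first depends on at least one face that the other GPU produced.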
     
  5. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,420
    Likes Received:
    179
    Location:
    Chania
    Honestly, why not? If it doesn't double, then at least a healthy increase. What's the theoretical floating-point throughput difference between RV670 and RV770? That purely theoretical factor aside, the RV770 of course isn't that many times faster than the RV670 in practice.
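    As a rough worked example of the theoretical figure, the sketch below assumes the published specifications of the Radeon HD 3870 (RV670: 320 stream processors at 775 MHz) and Radeon HD 4870 (RV770: 800 stream processors at 750 MHz), and counts one multiply-add per stream processor per clock as 2 FLOPs. These numbers are not from the thread, and the helper name is hypothetical.

```python
def peak_gflops(stream_processors, core_mhz, flops_per_sp_per_clock=2):
    """Theoretical peak single-precision throughput in GFLOPS."""
    return stream_processors * flops_per_sp_per_clock * core_mhz / 1000.0


rv670 = peak_gflops(320, 775)   # assumed Radeon HD 3870 configuration
rv770 = peak_gflops(800, 750)   # assumed Radeon HD 4870 configuration
ratio = rv770 / rv670           # theoretical speed-up, roughly 2.4x
```

    Under those assumptions the paper gap is about 2.4x (496 vs 1200 GFLOPS), which is larger than the real-world gap the post refers to.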
     
  6. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    On our last engine I also just locked the 1x1 exposure map, and used that value as a shader constant later on. It's very fast on PC also, if you have multiple 1x1 textures to prevent lock stalling. Only the last texture is really needed, and 1x1 textures are very small (the memory overhead is really nothing).
     
  7. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    I'm glad to hear that works well on PC too. I thought multiple 1x1 textures would be able to avoid lock stalling, but I never tried to implement it for real.
    On PS3 I just used a single 1x1 render target stored in XDRAM; RSX renders directly to it and CELL reads it back and 'caches' it right after a flip().
     
  8. Megadrive1988

    Veteran

    Joined:
    May 30, 2002
    Messages:
    4,638
    Likes Received:
    148

    timeframes

    Pyramid3D was developed around 1995-1996 and announced in early 1996 (thus mid 90s). There were at least 2 different chips: one with an on-chip geometry engine/T&L, and one without that was just a rasterizer. It should've been released in late 1996 or early 1997.

    Glaze 3D (which I guess never taped out) was announced in 1998 and due to be released in 1999 (thus late 90s).

    XBA (Extreme Bandwidth Architecture) was either some evolution of Glaze 3D with eDRAM or an implementation of Glaze 3D; it was announced in 2000.

    Axe and Hammer were from early this decade (2000, 2001): evolutions or implementations of XBA that were DX8 and DX9 respectively.
     
    #68 Megadrive1988, Jul 26, 2008
    Last edited by a moderator: Jul 26, 2008
  9. kyetech

    Regular

    Joined:
    Sep 10, 2004
    Messages:
    532
    Likes Received:
    0
    Hey, just for a laugh, since we're talking about old-school multi-GPU configurations, I thought I would get my 3Dlabs Oxygen GMX2000 card down from the loft (attic).

    I bought it in 1999 for £1200 ($2400 at today's exchange rate).

    It had 96 MB of memory.
    3 fans. God knows how many different chips.
    And it was a beast.

    http://www.web2suite.com/temp/gmx2000.jpg

    I used it for visualisation of 3D work in Softimage 3.8 and XSI v1 (beta).

    Crazy how far things have moved on.
     
  10. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,489
    Likes Received:
    400
    Location:
    Varna, Bulgaria
    ...and people are complaining about the length of the current boards! :D
     
  11. kyetech

    Regular

    Joined:
    Sep 10, 2004
    Messages:
    532
    Likes Received:
    0
    Yeah, it wouldn't fit in my respectably sized case until I took a small plastic handle off the end. And then it fit with about 5 mm to spare at the end of the case!

    Crrrr AZY.
     
  12. Shtal

    Veteran

    Joined:
    Jun 3, 2005
    Messages:
    1,344
    Likes Received:
    3
    DX11 support in RV870!
     
  13. Shtal

    Veteran

    Joined:
    Jun 3, 2005
    Messages:
    1,344
    Likes Received:
    3
    Reminds me of R300... No DX9 API was available when the Radeon 9700 Pro launched in Aug 2002.
     
  14. Anarchist4000

    Veteran Regular

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    I guess it's not too surprising. I'm not sure about all the features being added in DX11, but doesn't the current hardware support most of them? I wouldn't think very many hardware changes would be necessary.
     
  15. ZerazaX

    Regular

    Joined:
    Oct 29, 2007
    Messages:
    280
    Likes Received:
    0
    AFAIK DX10 and DX10.1 cards can support DX11; they just won't support all features, I guess, since certain hardware changes must be made.

    That said, the ATI cards seem to support nearly all of the DX11 feature list released so far (the tessellation unit, shader computations, etc.), so I'm not sure what the big differences will be.
     
  16. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,183
    Likes Received:
    1,840
    Location:
    Finland
    The claim that "DX10 & 10.1" cards can support DX11 is most likely just the same thing that was true before: as long as you had drivers, even a Voodoo3 "supported" DX9. It could run with it; sure, it didn't sport even the DX7 feature set, but the drivers were still compatible (3rd-party drivers, that is).

    "Compute shaders" will most likely be available for 10/10.1 hardware too, though, and it's possible that the tessellation unit on HD3/4k can be used too, but that's about it, IMO.
     
  17. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    AFAIK the RV670 tessellation unit doesn't match the DX11-style tessellation pipeline. Same story regarding AMD ALUs suddenly being able to support SM5.0; I find it quite unlikely.
     
  18. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,183
    Likes Received:
    1,840
    Location:
    Finland
    I have no real idea, but I remember reading something saying that the tessellation unit on RV670 was a bit different from the R600 one, and that the RV770 unit would be identical to that of RV670. The only real reasons I can come up with for the initial modifications would be either to comply with DX11's tessellation or space savings.
     
  19. Shtal

    Veteran

    Joined:
    Jun 3, 2005
    Messages:
    1,344
    Likes Received:
    3
    NV30 supported DX9 SM2, but it was terribly slow. I would assume hardware modifications are required to run the DX11 SM5 code path properly; otherwise you have a problem.
     
  20. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,183
    Likes Received:
    1,840
    Location:
    Finland
    I don't think anyone really thinks the cards would be capable of SM5.0, but tessellation and compute shaders could be separated from SM5.0. As in, they wouldn't require an SM5.0-supporting GPU: just SM4.0 for compute shaders and a tessellation unit for tessellation. Think of geometry instancing, an "SM3.0 card" class feature which could be supported by ATI SM2.0 GPUs as well. Now just remove the need for "hacks" to use the feature on older cards and you're good to go.
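    A toy sketch of that separation, assuming a capability model where a feature can be exposed either directly by the device or implied by its shader model. `DeviceCaps`, `supports`, and the feature string are hypothetical names for illustration, not part of any real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceCaps:
    shader_model: float      # highest shader model the device reports
    features: frozenset      # features the driver exposes individually


def supports(caps, feature, implied_by_sm=None):
    """True if the feature is exposed directly or implied by the shader model."""
    if feature in caps.features:
        return True
    return implied_by_sm is not None and caps.shader_model >= implied_by_sm


# Geometry instancing as an "SM3.0-class" feature that an SM2.0 part may still expose.
sm2_with_instancing = DeviceCaps(2.0, frozenset({"geometry_instancing"}))
sm2_plain = DeviceCaps(2.0, frozenset())
sm3_card = DeviceCaps(3.0, frozenset())

can_instance = [supports(c, "geometry_instancing", implied_by_sm=3.0)
                for c in (sm2_with_instancing, sm2_plain, sm3_card)]
```

    Checking the feature flag first is what lets an older shader model still qualify, which is the situation described for instancing on SM2.0 hardware.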
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.