256MB Graphics Cards

Discussion in 'Architecture and Products' started by Dave Baumann, Feb 14, 2003.

When do you think 256MB cards will be necessary?

  1. Now (It's useful for FSAA)

    100.0%
  2. Next 6 months

    0 vote(s)
    0.0%
  3. Next 12 Months

    0 vote(s)
    0.0%
  4. When DoomIII Ships!!

    0 vote(s)
    0.0%
  5. 256MB?? This is getting silly...

    0 vote(s)
    0.0%
  1. demalion

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,024
    Likes Received:
    1
    Location:
    CT
    I'm not sure if the 8-bit stencil buffer has to be replicated for each sample set...I'd think it would be.

    There are several explanations in these forums from past discussions, and some concepts outlined by some of the ATI people in scattered places. Sorry, this computer and my connection are so slow (using a modem now, haven't configured the DSL yet) that until I get used to it, my annoyance factor makes doing the search for you prohibitive. But IIRC the keywords "compression", "bandwidth", and "memory" should help you find the comments (I'd guess it was sireric and OpenGL guy who made the comments I recall, so perhaps search by post and look for the threads with matches on their names).

    EDIT: I just realized that it might be necessary to look at this thread to understand some of what I say above.

    My brief explanation of why, if not how: what if you update the buffer and store it compressed? If so, what happens when you update the buffer for that screen position again? What if it can't be compressed into the same space? How do you manage the overflow that results? How do you maintain predictable alignment and addressing for each screen position such that the buffer can be randomly accessed?

    IIRC, the general indication is that IHVs haven't discovered how to do the above efficiently with a lossless compression scheme yet. We had an interesting discussion primarily about z buffer compression. For framebuffer compression, well, I think Matrox FAA might be considered to be one method of dealing with some of the above issues if it actually saves storage space, though the current implementation seems to have issues.

    As for the how of color compression for the R300: if several samples are the same color, you send the color once and have it replicated to the suitable number of places. Possibilities might be some flexibility in the addressing controller that lets a data value be sent to multiple locations (an actual way of doing the above), or, as I proposed in another post, some sort of "all or nothing" 1-bit mask representing that either all the sample colors are the same for a pixel or they are not; if set, only the first color sample is read at all, and it was the only one written in the first place (a conceptual way of doing the above).

    I'm pretty sure ATI engineers could conceivably have come up with something a bit different than the above ;), but since (in my limited knowledge) they appear feasible, there is a chance they didn't have to.
    I'll note the first works pretty well to my mind with the idea of a limited amount of samples being allowed, and something about addressing concerns tickles my memory in regards to the discussions I mentioned.
    Don't take the above as anything but a general indication of the issues and my own theories on how they might be resolved, though...search out the discussions for better answers!
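The "all or nothing" 1-bit mask idea described above can be sketched roughly in Python. This is purely illustrative — the function names and data layout are my own invention, not ATI's actual design:

```python
# Hypothetical sketch of the "all or nothing" 1-bit color compression
# idea: per pixel, one flag bit says whether all MSAA sample colors
# are identical. If so, only one color is stored and read back.

def compress_pixel(samples):
    """Return (flag, stored) for one pixel's list of sample colors."""
    if all(s == samples[0] for s in samples):
        return (1, [samples[0]])       # fully covered: store one color
    return (0, list(samples))          # edge pixel: store every sample

def decompress_pixel(flag, stored, num_samples):
    """Reconstruct the full sample list from the compressed form."""
    if flag:
        return stored * num_samples    # replicate the single color
    return stored

# Interior pixel (all samples agree) vs. edge pixel (samples differ):
interior = [(255, 0, 0)] * 4
edge = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)]

f1, s1 = compress_pixel(interior)    # flag=1, one stored color
f2, s2 = compress_pixel(edge)        # flag=0, all four stored
```

The win is that fully covered pixels (the vast majority of the screen) touch a quarter of the color data — but, as noted elsewhere in the thread, the worst case still needs the full allocation.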
     
  2. Nagorak

    Regular

    Joined:
    Jun 20, 2002
    Messages:
    854
    Likes Received:
    0
    It would be too slow to play at that resolution anyway! There's no point adding more memory when the rest of the card would bog down long before it became useful.
     
  3. no_way

    Regular

    Joined:
    Jul 2, 2002
    Messages:
    301
    Likes Received:
    0
    Location:
    estonia
    Ok, with 8X+ antialiasing, pure framebuffer memory requirements for IMRs ARE indeed getting silly.
    8 times the memory just to do polygon edge antialiasing? Bleh, what a waste. Bring out the tilers.
     
  4. Dave H

    Regular

    Joined:
    Jan 21, 2003
    Messages:
    564
    Likes Received:
    0
    Probably would be. I didn't even think of it. I'm a little slow today...

    Ouch, and on Valentine's Day too. :cry: Good luck. My thoughts are with you. :wink:

    :idea:

    Of course: you just skip writing and reading from all sub-sample framebuffers but one. But of course you need to "skip over" that memory position, to keep everything aligned, hence no savings in space.

    Brilliant. Why couldn't I think of that?! (Of course the bit mask will also need to be saved to memory, so we need to add another 229k per frame to the calculation. I think.)

    Hmm...ok; that answer was satisfying enough (proof-of-concept) that I can go to bed in peace and do a proper forum search re: z-compression tomorrow.

    Thanks! :)
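As a rough check of that bit-mask overhead figure (assuming 1600x1200, which is the resolution used elsewhere in the thread, and one flag bit per pixel):

```python
# Back-of-the-envelope check of the per-frame bit-mask overhead:
# one "all samples identical" flag bit per pixel at 1600x1200.

width, height = 1600, 1200
mask_bits = width * height           # 1 bit per pixel
mask_bytes = mask_bits // 8          # 240,000 bytes
mask_kib = mask_bytes / 1024         # about 234 KiB per frame
```

That lands in the same ballpark as the ~229k quoted above — negligible next to the multisample buffers themselves.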
     
  5. Katsa

    Newcomer

    Joined:
    Feb 12, 2003
    Messages:
    11
    Likes Received:
    0
    Guys,

    you've discounted the fact that it's desirable to have geometry reside in video memory as well.

    And the amount of memory per vertex is growing all the time due to the increased number of texture coordinates etc.

    I could throw out a figure off the top of my head and say a game released this year might use about 20 MB of vertex buffers.

    With 64 bytes/vertex (again, off the top of my head) this would be only about 325,000 vertices in, let's say, a level (divided maybe over 5-15 min of gameplay). That's not a feat at all for any card to push... So my estimate actually could be pretty low, and you'd need much more space for vertex buffers.
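The arithmetic above works out like this (both inputs are the poster's off-the-top-of-the-head estimates):

```python
# Checking the vertex-buffer estimate: 20 MB of vertex buffers at
# 64 bytes per vertex.

buffer_bytes = 20 * 1024 * 1024      # ~20 MB of vertex buffers
bytes_per_vertex = 64
vertices = buffer_bytes // bytes_per_vertex
# -> 327,680 vertices, i.e. roughly the ~325,000 quoted above
```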
     
  6. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,090
    Likes Received:
    694
    Location:
    O Canada!
    Yup, it's certainly quite interesting to see the size of the vertex buffering in 3DM03.
     
  7. DeanoC

    DeanoC Trust me, I'm a renderer person!
    Veteran Subscriber

    Joined:
    Feb 6, 2003
    Messages:
    1,469
    Likes Received:
    185
    Location:
    Viking lands
    Vertex compression is becoming more and more important. And everybody should be using it if they are using vertex shaders (in most cases it's completely speed-free, with no visual loss in quality).

    If 3DM03 isn't using any, then that's a bit crap :-(

    I'll just plug my vertex compression chapters in ShaderX1 and 2 here then in case they are reading :)
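A minimal sketch of one common vertex-compression technique of the kind discussed here: quantize float positions to 16-bit integers inside a known bounding box, then rescale (e.g. via a scale/bias in the vertex shader) on the way back. The details are illustrative only — not the method from the ShaderX chapters:

```python
# Quantize a float coordinate in a known range [lo, hi] down to an
# unsigned 16-bit integer, halving the storage of a float32, and
# dequantize it the way a vertex shader would with a scale and bias.

def quantize(value, lo, hi, bits=16):
    """Map a float in [lo, hi] to an unsigned integer of `bits` bits."""
    levels = (1 << bits) - 1
    t = (value - lo) / (hi - lo)
    return round(t * levels)

def dequantize(q, lo, hi, bits=16):
    """Inverse mapping: integer back to a float in [lo, hi]."""
    levels = (1 << bits) - 1
    return lo + (q / levels) * (hi - lo)

# A 12-byte float3 position becomes 6 bytes, with sub-millimetre
# error over a 100-unit bounding box:
lo, hi = -50.0, 50.0
x = 13.37
q = quantize(x, lo, hi)
x2 = dequantize(q, lo, hi)
err = abs(x2 - x)                    # at most half a quantization step
```

The decompression is a single multiply-add per component, which is why it's essentially free in a vertex shader.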
     
  8. Ante P

    Veteran

    Joined:
    Mar 24, 2002
    Messages:
    1,448
    Likes Received:
    0
    double buffering of course
     
  9. Ante P

    Veteran

    Joined:
    Mar 24, 2002
    Messages:
    1,448
    Likes Received:
    0
    Perhaps he's counting the stencil buffer too?

    As for compression, since the compression ratio isn't constant, you still need to reserve as much memory as you'd need without compression.
    (Sorry if that was a dumb answer.)
     
  10. Ante P

    Veteran

    Joined:
    Mar 24, 2002
    Messages:
    1,448
    Likes Received:
    0
    You have obviously not used an R300?
    I play with those settings (+16x aniso) in a handful of games.
    An R350 would increase that "handful" to many more, methinks.
     
  11. Jare

    Newcomer

    Joined:
    Feb 15, 2003
    Messages:
    8
    Likes Received:
    0
    Stencil needs to be multisampled just like the zbuffer, otherwise you wouldn't get antialiasing at the edges of your stencil masks.
     
  12. Basic

    Regular

    Joined:
    Feb 8, 2002
    Messages:
    846
    Likes Received:
    13
    Location:
    Linköping, Sweden
    And then throw in high precision MRT.
    1600x1200x4x128/8 = 117MB
    That's for four 4x32bit fp buffers, no AA.
    And that's in addition to the usual FB.
    Is it possible to get AA on MRTs? In that case 6*117MB = ouch.

    It wouldn't be useful in real time though. :)
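Spelling out the arithmetic above:

```python
# Memory for four high-precision MRTs at 1600x1200: each target is
# 4 channels x 32-bit float = 128 bits per pixel, no AA.

width, height = 1600, 1200
num_targets = 4
bits_per_pixel = 128                 # 4 x 32-bit fp channels
mb = width * height * num_targets * bits_per_pixel / 8 / (1024 * 1024)
# -> about 117 MB before AA; 6x multisampling would multiply that
# by six, on top of the usual framebuffer.
```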
     
  13. horvendile

    Regular

    Joined:
    Jun 26, 2002
    Messages:
    418
    Likes Received:
    2
    Location:
    Sweden
    This may have been answered earlier, but here goes:
    I believe that since it is paramount to keep the compression lossless, compression ratios cannot be guaranteed (I have no or very little technical knowledge about this, but it stands to reason). Thus, memory must be allocated for worst case scenarios, i.e. no compression at all.

    Naturally, in practice there is "always" compression, so bandwidth is saved, but the saved space is nevertheless not available for other data.
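The point is easy to demonstrate with any lossless codec — here zlib, a generic software compressor, not what GPUs actually use: the compressed size depends on the content, and incompressible data can even grow, so a fixed-size buffer must be sized for the uncompressed worst case.

```python
# Lossless compression ratios vary with content: a flat tile shrinks
# a lot, a noisy tile can actually grow past its original size.

import zlib
import os

tile = bytes(64)                     # a flat (all-zero) tile
noise = os.urandom(64)               # a noisy tile

flat_size = len(zlib.compress(tile))
noise_size = len(zlib.compress(noise))
# flat_size is far below 64 bytes; noise_size is not, so the buffer
# must still reserve the full uncompressed footprint.
```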
     
  14. Arun

    Arun Unknown.
    Legend

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    I'll respond to that by a direct quote from the DX9 SDK, in the "Multiple Render Target" subtitle:

    Uttar

    EDIT: I'd like to say I'm not impressed by MuFu's comment about R350 6x being as fast as R300 4x.
    According to Anand, the R300 is between 11% and 18% slower at 6x than at 4x - that's probably because Z & color compression become even more efficient.
    And that means you'd only need 18% faster RAM to get that. Really not impressive, IMO.
    Assuming that Anand figure was in an optimal case, and considering a 21% performance drop, and thus 21% faster RAM, you'd get 375MHz RAM.
    That's exactly what The Inquirer had predicted on the 4th of February:
    So, as you see, nothing new :)
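The RAM-clock arithmetic behind that figure, assuming the 310 MHz memory clock of the R300 (Radeon 9700 Pro) as the implied baseline:

```python
# 21% faster RAM than the R300 baseline gets you the quoted figure.

baseline_mhz = 310                   # Radeon 9700 Pro memory clock
speedup = 1.21                       # 21% faster RAM
target_mhz = baseline_mhz * speedup
# -> about 375 MHz, matching The Inquirer's prediction above
```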
     
  15. Tahir2

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,978
    Likes Received:
    86
    Location:
    Earth
    I know it could be considered egotistical to quote oneself (haw haw) but I just wanted everyone to know I was joking about my sources .. I don't have any - hehe. :D
     
  16. arjan de lumens

    Veteran

    Joined:
    Feb 10, 2002
    Messages:
    1,274
    Likes Received:
    50
    Location:
    gjethus, Norway
    Hmmm - an idea: 6x multisampling requires 6x the amount of memory per buffer. With compression, you can reduce the amount of memory that needs to be accessed by a large amount - most of the time. This would imply that in the multisample buffer, there will be large amounts of memory that are almost never accessed - wouldn't it be possible to map these memory areas out to AGP memory and that way free up lots of onboard memory?
     
  17. Arun

    Arun Unknown.
    Legend

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    Seems suboptimal to me. AGP has way too high latency...
    Might not be as much of a problem as I think, but I doubt it'd do miracles anyway.

    Uttar
     
  18. MuFu

    MuFu Chief Spastic Baboon
    Veteran

    Joined:
    Jun 12, 2002
    Messages:
    2,258
    Likes Received:
    51
    Location:
    Location, Location with Kirstie Allsopp
    Man, I think you speculate a little too quantitatively sometimes! I meant to say "as useable" as the 4x mode, sorry - corrected now. It'll depend on final clockspeeds anyway and that's pretty much a marketing call, what with there being no real competition. :-\

    256MB is definitely an option for R350, Tahir. It's an "option" for R300 and NV30 as well though - doesn't really mean much.

    MuFu.
     
  19. arjan de lumens

    Veteran

    Joined:
    Feb 10, 2002
    Messages:
    1,274
    Likes Received:
    50
    Location:
    gjethus, Norway
    Still I would think it's a better idea than AGP texturing, especially given that framebuffer accesses have much more predictable/prefetchable access patterns than texture map lookups and therefore are much less sensitive to latency.
     
  20. mboeller

    Regular

    Joined:
    Feb 7, 2002
    Messages:
    923
    Likes Received:
    3
    Location:
    Germany
    R350 :D

    How was the presentation, Dave?
     