Alternative AA methods and their comparison with traditional MSAA*

Discussion in 'Rendering Technology and APIs' started by mitran, Nov 15, 2009.

  1. Trejser

    Regular

    Joined:
    Dec 4, 2009
    Messages:
    621
    Likes Received:
    0
  2. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    Nice find ! Great to see more people using the technique. Will definitely pay attention to the AA once I have the game.

    This sounds useful too:

     
  3. Oninotsume

    Newcomer

    Joined:
    Dec 10, 2005
    Messages:
    169
    Likes Received:
    6
    Location:
    Japan
    Hi,

    Excellent find. I'm glad to see the GoW III devs are so open to
    answering user inquiries.

    Yes. If the method proves applicable across various game designs,
    it would be nice to see it rolled into the SDK.

    I'm even more hyped for this game now. Can't wait to feel my eyes
    pop out:grin:

    Oninotsume
     
  4. marcus_rocks

    Newcomer

    Joined:
    Jan 9, 2009
    Messages:
    86
    Likes Received:
    0
  5. DonaldDuck

    Newcomer

    Joined:
    Mar 15, 2009
    Messages:
    90
    Likes Received:
    0
    In fact, if someone achieves that goal (a fast and cheap way of handling transparencies on PS3), that would help programmers of multiplatform games a lot. Nowadays, that seems to be the main problem when porting a game from XTS to PS3.

    In some ways, the SPUs are allowing programmers to do things in ways no one really expected. It could be considered a fault in GPU hardware design (RSX being a bit rigid) or an achievement in system design. Maybe both.

    Well, to my taste it is complex but interesting. As we say in Spain, "el hambre agudiza el ingenio": being hungry makes us smart... I suppose some programmers prefer the other way. I probably would. Nevertheless, it's interesting.
     
  6. corduroygt

    Banned

    Joined:
    Nov 26, 2008
    Messages:
    1,390
    Likes Received:
    0
    If PS3 transparencies are an issue, why does the Frostbite engine have the exact opposite issue, with the snow-covered trees looking noticeably worse on the 360? I also think quarter resolution is a satisfactory method of dealing with transparencies on the PS3.
     
  7. nightshade

    nightshade Wookies love cookies!
    Veteran

    Joined:
    Mar 26, 2009
    Messages:
    3,392
    Likes Received:
    93
    Location:
    Liverpool
    I think quarter-res transparencies with MSAA applied & a slight blur or haze (as in KZ2) are just perfect for most games.
     
  8. _phil_

    Veteran

    Joined:
    Jan 3, 2003
    Messages:
    1,659
    Likes Received:
    13
    From Tim Moss twitter:


    The #gow3 AA technique saved 5ms from the GPU, costs ~20ms on 5 SPU's (~4ms Latency), its very pretty and only on #ps3 ;-P
     
  9. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,577
    Likes Received:
    16,029
    Location:
    Under my bridge
    Thanks _phil_. That actually makes sense! Although 20ms to add AA is damned expensive.

    Edit : Actually it doesn't make sense. 60fps affords ~17ms per frame. 20ms of AA processing just isn't possible! 4 frames would take 80ms, which would be 5 frames at 60fps. The only way I can see it working is if he really means that it adds 4ms of latency due to 4ms processing time. That amounts to a kinda weird total of 20ms: 16ms to generate the frame plus 4ms to apply AA, but during that 4ms the next frame is being generated, such that the game runs at 60fps with a 4ms lag added to the time taken from updating the game to creating the current frame image.
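A rough sketch of the pipelined reading proposed in this edit, assuming just two stages (frame generation, then AA) that overlap perfectly across consecutive frames. The function name `pipeline_timing` and the stage figures are illustrative only, not anything the GoW3 team has described.

```python
def pipeline_timing(stage_ms):
    """Return (frame_interval_ms, latency_ms) for perfectly overlapped stages.

    Assumption: each stage works on a different frame at the same time,
    so the slowest stage sets the frame rate while the sum of all stages
    sets the delay from starting a frame to having its finished image.
    """
    frame_interval_ms = max(stage_ms)
    latency_ms = sum(stage_ms)
    return frame_interval_ms, latency_ms


render_ms = 1000.0 / 60   # about 16.7ms to generate a frame at 60fps
aa_ms = 4.0               # the 4ms AA figure from the tweet

interval, latency = pipeline_timing([render_ms, aa_ms])
fps = 1000.0 / interval

print(f"frame interval {interval:.1f}ms, {fps:.0f}fps, latency {latency:.1f}ms")
```

Under those assumptions the frame rate stays at 60fps and only the latency grows by the 4ms AA stage.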
     
  10. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    Can you stuff even more "ms of work" to the SPUs ? Or are they all overwhelmed in GoW3 now ? What other improvements/problems/regrets would you overcome if you were to take another stab at it ?
     
  11. T.B.

    Newcomer

    Joined:
    Mar 11, 2008
    Messages:
    156
    Likes Received:
    0
    4ms * 5 SPUs = 20ms of SPU time.
     
  12. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,577
    Likes Received:
    16,029
    Location:
    Under my bridge
    Ahh, right. Still sounds expensive (the 20ms figure is a bit dumb. Are we going to count current GPU times in the hundreds of ms because they have so many shaders?) compared to hardware MSAA, especially at 60fps, but the quality seems worth it.
     
  13. T.B.

    Newcomer

    Joined:
    Mar 11, 2008
    Messages:
    156
    Likes Received:
    0
    That's just how we do it on Cell, TBH. You have 6 cores and while a GPU always runs the same program on all "cores", that's just not true for the SPUs. So if you have a properly parallelisable problem, it makes sense to measure performance in "1 SPU time".

    "Example": I have 100ms of SPU time at 60Hz and I budget up to 20ms for a piece of code. Maybe I just run it on 2 SPUs and get 10ms latency. Or I put it on 5 and get 4ms. That decision will depend on scheduling needs, but I still know how much SPU time I've committed.

    Still sounds dumb? ;)
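The budgeting arithmetic in this post can be sketched as below. It assumes the job splits perfectly across SPUs with no scheduling or synchronisation overhead, which a real workload will not quite manage. The names `spu_budget_ms` and `latency_ms` are made up for illustration.

```python
SPU_COUNT = 6  # SPUs counted in the post ("You have 6 cores")


def spu_budget_ms(refresh_hz, spus=SPU_COUNT):
    """Total "1 SPU time" available per frame, summed over all SPUs."""
    return spus * 1000.0 / refresh_hz


def latency_ms(spu_time_ms, spus_used):
    """Wall-clock time for a job costing spu_time_ms, split evenly.

    Assumption: perfectly parallelisable work, zero overhead.
    """
    return spu_time_ms / spus_used


budget = spu_budget_ms(60)        # 100ms of SPU time per frame at 60Hz
job = 20.0                        # SPU time committed to one piece of code

on_two = latency_ms(job, 2)       # 10ms latency on 2 SPUs
on_five = latency_ms(job, 5)      # 4ms latency on 5 SPUs
share = job / budget              # same 20% of the budget either way

print(budget, on_two, on_five, share)
```

Whichever SPU count is picked, the committed SPU time stays at 20ms; only the latency changes.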
     
  14. corduroygt

    Banned

    Joined:
    Nov 26, 2008
    Messages:
    1,390
    Likes Received:
    0
    Considering the PS3 architecture, I'd say 20ms total SPU time is less valuable than 5ms RSX time, not to mention the much better quality.
     
  15. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,577
    Likes Received:
    16,029
    Location:
    Under my bridge
    :razz: Not as a measure of SPU usage as you use it, but in its use in these quotes, it seems a mixed measurement. "We spend 20ms SPU time to do 4ms work that takes 9ms on GPU." That's all a bit muddled! But in terms of Cell's structure, I can see it makes sense. My language interests would rather see a different measure created though - a Cell unit where there are 6000 units (SPE ms) to a PS3's Cell. Maybe 'clicks'. "We saved 5ms by shifting the AA from GPU to CPU. Our FSAA process takes 20 clicks on Cell, which we spread across 5 SPEs, 4ms each."
     
  16. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    I agree with Shifty. I took 20ms to be the end-to-end duration. So it's 4ms in duration (over 5 SPUs) vs 5ms on RSX ? Same quality output ?

    EDIT:
    Don't quite understand his follow up:

    Ok... I think I understand it a little better. Each of the 5 SPUs took 4ms, but the duration is longer (depending on how the SPUs line up their work). And the overall end-to-end time is 5ms faster than on a GPU ? with same quality output ?
     
  17. Weaste

    Newcomer

    Joined:
    Nov 13, 2007
    Messages:
    175
    Likes Received:
    0
    Location:
    Castellon de la Plana
    For better or worse, Sony really like making funny/bizarre machines.
     
  18. Titanio

    Legend

    Joined:
    Dec 1, 2004
    Messages:
    5,670
    Likes Received:
    51
    I read it as ~4ms from start to finish when doing it on 5 SPUs. Occupying 5 SPUs for 20ms per frame each in a 30fps game (33ms per-frame budget) would be far too much.

    They previously said this was 5-6ms faster than doing 'regular' AA on GPU (I guess 2xMSAA given that's what they were using before). So I take that to mean 5 SPUs doing MLAA = less than half the cost of 2xMSAA on RSX.

    On a side note, by way of comparison...although it's likely not apples-to-apples, Intel reported performance of ~46ms of processing time on a single 3GHz Intel core with its MLAA implementation (for a 720p frame - 20M pixels per second). A single SPU with Santa Monica's implementation would be ~20ms.
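For anyone wanting to check the throughput figure, here is a small sketch converting per-frame time into pixels per second. It assumes "720p" means 1280x720 and takes the ~46ms and ~20ms figures at face value; the function name `pixels_per_second` is just for illustration, and as noted the two numbers are probably not apples-to-apples.

```python
WIDTH, HEIGHT = 1280, 720          # assumed 720p frame size
PIXELS_PER_FRAME = WIDTH * HEIGHT  # 921,600 pixels


def pixels_per_second(frame_ms):
    """Pixels processed per second if one frame takes frame_ms."""
    return PIXELS_PER_FRAME / (frame_ms / 1000.0)


intel_core = pixels_per_second(46.0)  # about 20.0 million pixels/s
single_spu = pixels_per_second(20.0)  # about 46.1 million pixels/s
ratio = single_spu / intel_core       # 46 / 20 = 2.3x

print(f"{intel_core / 1e6:.1f}M px/s vs {single_spu / 1e6:.1f}M px/s ({ratio:.1f}x)")
```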
     
  19. marcus_rocks

    Newcomer

    Joined:
    Jan 9, 2009
    Messages:
    86
    Likes Received:
    0
    I don't think transparency is a hard problem. Everything within a frame is predictable. We know which objects are transparent. We know which objects would have to have their color merged with the transparent object's color. We just need the SPUs to calculate a new color for every pixel and let the GPU process it.
     
  20. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    Yes, I understand this to be the best case scenario (All 5 SPUs start and run at the same time with no/trivial dependency between them). In this case, the RSX would take 9ms or so to complete.

    EDIT:
    Ha ha, the local store is to sidestep slow global memory access. The split memory pool is a little awkward. Other than those 2 features, heterogeneous computing, more cores, the EIB and NUMA are not uncommon in high performance computing today.
     