Alternative AA methods and their comparison with traditional MSAA*

Discussion in 'Rendering Technology and APIs' started by mitran, Nov 15, 2009.

  1. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    What exactly in AAA made it possible on Xenos, if you think it can't do MLAA (within the window) ?
    Sounds like the MLAA edge detection is taking up too much time.

    EDIT: Thanks nightshade.
     
  2. Ruskie

    Veteran

    Joined:
    Mar 7, 2010
    Messages:
    1,291
    Likes Received:
    1
    MLAA sounds like a pain for the 360 CPU, and I don't know whether it would be feasible on Xenos (Joker and somebody else mentioned it could be). I feel like AAA is the way to go for the 360. 4A gained a performance boost after dropping 2xMSAA for AAA, and it looked very comparable. On PC it was AAA ~ 4xMSAA, am I right? So maybe with some improvements you could get similar results on Xenos; plus you won't have to tile, and performance will be better, as will IQ.
     
  3. T.B.

    Newcomer

    Joined:
    Mar 11, 2008
    Messages:
    156
    Likes Received:
    0
    I assume by 'AAA' you mean the technique used by Metro 2033 and not actual analytical anti-aliasing (which is not even remotely related). My understanding from looking at the images is that for every pixel, they look at a small pixel neighborhood, find edges in there and blur (or maybe even cleverly blend) those to give you the final pixel colour.

    So every GPU thread processes a single pixel in this approach. In the MLAA algorithm however, pixels are not independent, but have a rather strict order in which they need to be processed. In other words, MLAA is not embarrassingly parallel and thus hard to implement on a GPU. Edge detection is not the issue.
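To make the contrast concrete, here is a minimal Python sketch of a per-pixel neighbourhood filter of the kind described above. It is not 4A's actual shader, only an illustration under the assumption of a greyscale image stored as a list of rows; `blend_pixel`, `filter_image`, `shuffled` and the `threshold` value are made up for the example. Each output pixel reads only the input image, so the pixels can be visited in any order (or all at once, one per GPU thread) with the same result.

```python
import random


def blend_pixel(img, x, y, threshold=0.25):
    """Return the filtered value of one pixel, reading only the input image."""
    h, w = len(img), len(img[0])
    centre = img[y][x]
    # 4-connected neighbours, clamped at the image border.
    neighbours = [
        img[max(y - 1, 0)][x],
        img[min(y + 1, h - 1)][x],
        img[y][max(x - 1, 0)],
        img[y][min(x + 1, w - 1)],
    ]
    contrast = max(abs(n - centre) for n in neighbours)
    if contrast < threshold:
        return centre  # no edge nearby: leave the pixel alone
    # Edge nearby: blend the pixel with its neighbourhood (a plain average here).
    return (centre + sum(neighbours)) / 5.0


def filter_image(img, order=None, threshold=0.25):
    """Filter every pixel; `order` optionally permutes the visiting order."""
    h, w = len(img), len(img[0])
    coords = [(x, y) for y in range(h) for x in range(w)]
    if order is not None:
        coords = order(coords)
    out = [[0.0] * w for _ in range(h)]
    for x, y in coords:
        out[y][x] = blend_pixel(img, x, y, threshold)
    return out


def shuffled(coords):
    """Return the coordinates in a fixed pseudo-random order."""
    coords = list(coords)
    random.Random(0).shuffle(coords)
    return coords


# A hard diagonal step edge on a 6x6 image.
image = [[1.0 if x > y else 0.0 for x in range(6)] for y in range(6)]
in_order = filter_image(image)
scrambled = filter_image(image, order=shuffled)
print(in_order == scrambled)  # the visiting order does not matter
```

In this sketch the work per pixel is confined to a fixed 3x3 footprint. An approach that has to follow an edge along its whole length before it can weight any pixel on it does not have that property, which is the difficulty described above.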

    I'll go back into my cage now. ;)
     
    #443 T.B., May 23, 2010
    Last edited by a moderator: May 23, 2010
  4. nightshade

    nightshade Wookies love cookies!
    Veteran

    Joined:
    Mar 26, 2009
    Messages:
    3,392
    Likes Received:
    93
    Location:
    Liverpool
    Well they didn't switch from 2*MSAA to AAA.
    Instead the switch happened from deferred rotated grid super-sampling to AAA.
     
  5. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,577
    Likes Received:
    16,029
    Location:
    Under my bridge
    That's what I was thinking, and it'd thus need a completely different GPU core architecture to be able to apply MLAA. Unless, as you say, the process can be reengineered for a GPU's structure.
     
  6. assen

    Veteran

    Joined:
    May 21, 2003
    Messages:
    1,377
    Likes Received:
    19
    Location:
    Skirts of Vitosha
    So what exactly is this AAA that you feel is the way to go?
     
  7. Ruskie

    Veteran

    Joined:
    Mar 7, 2010
    Messages:
    1,291
    Likes Received:
    1
    Well, I was reading the interview with 4A Games about their engine and tech, and it seemed to me that they got rather good results and performance with AAA in comparison with 2xMSAA (I was wrong, it was not MSAA; I checked again and it was running deferred rotated grid super-sampling). Then again, I was totally wrong: since they are doing deferred rendering they already had to tile, so that rather contradicts what I said in my previous post. I guess I thought that AAA could help a lot since you would then bypass tiling. I also got the impression that maybe MSAA is not really the way to go, since you have to tile and thus incur additional geometry overhead.
     
  8. Acert93

    Acert93 Artist formerly known as Acert93
    Legend

    Joined:
    Dec 9, 2004
    Messages:
    7,782
    Likes Received:
    162
    Location:
    Seattle
    It has been a common mantra that for parallel computing, and especially that required by the SPEs, developers need to think outside the box and re-think solutions to known problems. Maybe it is time some publishers begin cracking the Xbox developer whip and expect the same. Maybe we would see more ingenious solutions. Or maybe MSAA is "good enough" for most of them.
     
  9. assen

    Veteran

    Joined:
    May 21, 2003
    Messages:
    1,377
    Likes Received:
    19
    Location:
    Skirts of Vitosha

    See? That's Sony fanboys for you. You toil day and night, and they cancel your vacations ;-)
     
  10. brain_stew

    Regular

    Joined:
    Jun 4, 2006
    Messages:
    556
    Likes Received:
    0
    I'm going to have to disagree with this. Metro's AAA really wasn't a very good solution at all. Even in those still pics you can see that anything approaching a straight vertical or horizontal line gains no edge smoothing, and the lack of any sub-pixel rendering was a serious issue. I'd have taken bog standard 2xMSAA over it; it really didn't help the overall image quality much at all and may have introduced some side effects as well.
     
  11. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    :lol: That's what's happening to me now and for the rest of my 2010.
    Not related to fanboyism at all. It's called "Sh*t Happens". ^_^


    EDIT:
    Okie, I understand the issue at hand better now. Thanks for the info. What's a good name for Metro's AA scheme if it's not technically AAA?

    *hugs* T.B.
     
  12. Billy Idol

    Legend Veteran

    Joined:
    Mar 17, 2009
    Messages:
    5,984
    Likes Received:
    822
    Location:
    Europe
    According to the MLAA paper, the first step is to find the "edges", of which only the longest are considered (the primary edges). These are then split up into L-shaped structures, and the color averaging is applied using a connecting triangle (or rather its area)!
    Now, suppose you want to make a parallel version of this algorithm to fire up all the SPUs, for instance with a domain decomposition technique:

    -Considering 4 SPUs, one should split the image into at least four equal pieces to process each piece independently.

    -If you use this pattern detection independently for each piece of the image...the number of patterns and especially their shapes ('longest primary edge') could change, right?
    -Especially at the 'artificial' boundaries of the single sub-domains...

    -If the number and form of the patterns change, the triangle you use to determine the new color of the pixels differs compared to the single SPU case, thus the resulting color differs, thus the anti-aliasing of the image differs

    -Typically, if you want good load balancing, you should split the image into more than four pieces, which exacerbates these problems.

    -The problem with respect to load balancing I see is that in theory it could well be that one SPU detects no edges in its sub-domain, thus sitting idle while the others do their hard averaging work, if no special care is taken in such situations (i.e. dynamic load balancing!)

    What interests me:
    - Can one generally say that the shorter primary edges due to the domain decomposition yield a worse IQ when using the triangles to average, compared to the single SPU case?

    If this is right, this could be a major drawback of the algorithm...because the only alternative I see with respect to a parallel version of this algorithm is to somehow communicate with neighbor domains to find the unique pattern - this smells like a difficult "quality versus SPU time" quest!
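The effect of the artificial boundaries can be sketched in a few lines of Python. This is only a toy model, not the MLAA paper's pattern search: it assumes a single image row whose edge detection result is a list of booleans, and it treats a "primary edge" as a maximal run of consecutive edge pixels. The names `edge_runs` and `tiled_edge_runs` are hypothetical.

```python
def edge_runs(flags, offset=0):
    """Return (start, length) for each maximal run of True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i  # a run begins here
        elif not flag and start is not None:
            runs.append((start + offset, i - start))
            start = None
    if start is not None:  # run reaches the end of the row
        runs.append((start + offset, len(flags) - start))
    return runs


def tiled_edge_runs(flags, tile_width):
    """Detect runs independently in each tile, as isolated workers would."""
    runs = []
    for x0 in range(0, len(flags), tile_width):
        runs.extend(edge_runs(flags[x0:x0 + tile_width], offset=x0))
    return runs


# One 16-pixel row with a single edge run of length 10 starting at x = 3.
row = [False] * 3 + [True] * 10 + [False] * 3

whole = edge_runs(row)            # one worker sees the whole row
split = tiled_edge_runs(row, 4)   # four workers, 4 pixels each

print(whole)  # [(3, 10)]
print(split)  # [(3, 1), (4, 4), (8, 4), (12, 1)]
```

In this toy case the single run of 10 becomes four shorter runs, so any weight derived from the run length (such as the triangle area mentioned above) would be computed from different geometry in each tile.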
     
  13. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    The more general question is, besides MLAA, are there other algorithms/subsystems in the entire graphics pipeline that are not embarrassingly parallel?

    EDIT:
    In some problems, you may overlap the problem space, and have the SPUs recalculate the results in the overlapped areas. That way one reduces the amount of communication between the SPUs.

    Another common trick is to rearrange the data (e.g., stash the intermediate results somewhere convenient/shared), so that the SPUs fetch them together with the input data.

    Not sure if these tricks will work in MLAA since I have not studied it. :p
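A minimal Python sketch of the overlap idea, under stated assumptions: the image is reduced to one row of boolean edge flags, a "pattern" is just a maximal run of edge pixels, and each tile owns only the runs that start inside its own core region. The names `edge_runs`, `runs_with_halo` and the `halo` parameter are hypothetical, and nothing here is taken from an actual MLAA implementation.

```python
def edge_runs(flags, offset=0):
    """Return (start, length) for each maximal run of True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start + offset, i - start))
            start = None
    if start is not None:
        runs.append((start + offset, len(flags) - start))
    return runs


def runs_with_halo(flags, tile_width, halo):
    """Each tile reads `halo` extra pixels per side and recomputes them."""
    n = len(flags)
    owned = []
    for x0 in range(0, n, tile_width):
        x1 = min(x0 + tile_width, n)                    # core region [x0, x1)
        lo, hi = max(x0 - halo, 0), min(x1 + halo, n)   # core plus overlap
        for start, length in edge_runs(flags[lo:hi], offset=lo):
            # Keep a run only if it starts in this tile's core, so a run
            # seen by several tiles is reported exactly once.
            if x0 <= start < x1:
                owned.append((start, length))
    return owned


# One 16-pixel row with a single edge run of length 10 starting at x = 3.
row = [False] * 3 + [True] * 10 + [False] * 3

print(edge_runs(row))                # [(3, 10)]  reference, no tiling
print(runs_with_halo(row, 4, 2))     # [(3, 3)]   overlap too small: truncated
print(runs_with_halo(row, 4, 10))    # [(3, 10)]  overlap covers the run
```

No tile talks to another here; the price is that every tile reprocesses up to `2 * halo` extra pixels, and the result only matches the untiled one when the overlap is at least as long as the longest run.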
     
  14. Billy Idol

    Legend Veteran

    Joined:
    Mar 17, 2009
    Messages:
    5,984
    Likes Received:
    822
    Location:
    Europe
    Overlapping of domains is a valid option, although parallel efficiency decreases the more domains (i.e. the more domain boundary) you have...but more domains could be desired due to performance/load balancing...we here typically try to come up with an algorithm which (at least in theory) scales ideally :smile:
    Another problem I see is that you don't know a priori how much domain overlap you need (as the patterns, and hence how far they extend into the neighbor domain, are unknown when you decompose the image)

    I don't understand what you mean by rearranging the data...maybe you could be more specific?
     
    #454 Billy Idol, May 25, 2010
    Last edited by a moderator: May 25, 2010
  15. assen

    Veteran

    Joined:
    May 21, 2003
    Messages:
    1,377
    Likes Received:
    19
    Location:
    Skirts of Vitosha
    The thing is, we haven't put in the graphics pipeline things that are not embarrassingly parallelizable; this doesn't mean there aren't other domains interesting for computer graphics, which fall into that category.

    Classical radiosity, for example, where every patch of the scene interacts with every other, is not embarrassingly parallelizable. I think the updating of hierarchies needed for raytracing of dynamic scenes isn't, either.
     
  16. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    I can't ! (without knowing the MLAA calculations)

    The issue is: We need to partition the data but they depend on data in other partitions. In some problems, we can allocate the SPUs and organize the data in such a way that a worker SPU can get partial results from other SPUs first. Then when the dependent variable arrives, resolve the rest.

    Yeah, that's what I meant. Also, if we revisit some of the existing solutions, will we find new approaches? Embarrassingly parallel algorithms are low-hanging fruit. Computer graphics probably has a lot of them. In addition, were there mathematical approximations formulated to exploit the early SIMD GPU architecture, for instance?
     
  17. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,577
    Likes Received:
    16,029
    Location:
    Under my bridge
    Has anyone tried MLAA with cel-shaded graphics? I feel IQ is about the only area where cel-shaded loses out to real drawn artwork. If the edges could be AA'd, they'd look spectacular. I'm thinking of DQVIII here on PS2. Lose the jaggies and it'd be close to cartoon quality. Drawing the edges to a separate edge buffer and applying MLAA to that is all it'd take.
     
  18. jlippo

    Veteran Regular

    Joined:
    Oct 7, 2004
    Messages:
    1,451
    Likes Received:
    580
    Location:
    Finland
    MLAA should work great until the edge is too thin.

    In cel shading you usually use fins and such for black lines, so it would be preferable to do antialiasing while rendering those.
    You would still have sub-pixel accuracy and the ability to tweak lines when they are smaller than a pixel.

    Actually, fins might be quite a nice solution to get 'cheap' antialiasing for most games.
     
  19. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,577
    Likes Received:
    16,029
    Location:
    Under my bridge
    What do you mean by 'fins'?
     
  20. jlippo

    Veteran Regular

    Joined:
    Oct 7, 2004
    Messages:
    1,451
    Likes Received:
    580
    Location:
    Finland