Graphical effects rarely seen this gen that you expect/hope become standard next-gen

Discussion in 'Console Technology' started by L. Scofield, Nov 3, 2009.

  1. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
They are off-screen shots: blurry, with the lighting "enhanced" by the camera lens, environment light, monitor light and camera settings.

    No.

    Sure.

Yes, better IQ because of the detail density, and all edges receive AA + transparencies, plus supersampling for post-process buffers.
     
    #221 Neb, May 31, 2010
    Last edited by a moderator: May 31, 2010
  2. Arwin

    Arwin Now Officially a Top 10 Poster
    Moderator Legend

    Joined:
    May 17, 2006
    Messages:
    18,063
    Likes Received:
    1,660
    Location:
    Maastricht, The Netherlands
I was thinking DX11 because I don't know if DX9 gives you flexible enough access to the various stages of the graphics pipeline, but for MLAA you probably only use simple line detection in one or two render targets, so that's probably no issue for DX9.
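To give a flavour of how simple that detection pass can be, here is a minimal CPU-side sketch of luminance-based discontinuity detection (purely illustrative, assuming a readable resolved colour buffer; it is not the Intel or GOW3 implementation):

[code]
// Hypothetical illustration only: flag pixels whose luminance differs
// noticeably from the neighbour to the right or below. Real MLAA
// implementations differ; this just shows that the detection step is
// ordinary image processing.
#include <cmath>
#include <cstdint>
#include <vector>

struct Color { float r, g, b; };

static float luma(const Color& c)
{
    return 0.299f * c.r + 0.587f * c.g + 0.114f * c.b;
}

// Returns a per-pixel mask: 1 where a horizontal or vertical
// discontinuity starts, 0 elsewhere.
std::vector<uint8_t> detectEdges(const std::vector<Color>& image,
                                 int width, int height,
                                 float threshold = 0.1f)
{
    std::vector<uint8_t> mask(width * height, 0);
    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            const float l = luma(image[y * width + x]);
            const bool right = (x + 1 < width) &&
                std::fabs(l - luma(image[y * width + x + 1])) > threshold;
            const bool below = (y + 1 < height) &&
                std::fabs(l - luma(image[(y + 1) * width + x])) > threshold;
            if (right || below)
                mask[y * width + x] = 1;
        }
    }
    return mask;
}
[/code]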
     
  3. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,577
    Likes Received:
    16,028
    Location:
    Under my bridge
    T.B. posted about this...
    So why haven't Intel or nVidia/AMD created an implementation on GPUs? At the moment I'd have to say because the GPUs cannot run it efficiently enough to be worth bothering with. I'd be surprised if the reason is absolutely no-one has tried to get MLAA working well on GPUs, as it's a superb IQ tool and the GPU manufacturer who can offer an implementation in the drivers would have a significant advantage over their rival.

That's not to say GPUs won't find an MLAA implementation, but at the moment it's not a case of, "oh, just run it on CUDA on a DX9 GPU."
     
  4. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
    #224 Neb, May 31, 2010
    Last edited by a moderator: May 31, 2010
  5. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
But that is the GOW solution, done for the PS3 architecture. Nothing prevents achieving an even better MLAA solution on other hardware by taking advantage of that hardware's features. I've heard/read other devs say something was impossible or very hard to do on a given platform, and yet it appeared in some form...


Maybe because it is hit or miss regarding visual improvements. It might also need to be implemented via the game engine rather than forced from the drivers, as that would probably not work well; ATI's edge-detect mode algorithm, for example, doesn't work in all games. But there are some MLAA-like solutions, for example what Metro 2033 has for, IIRC, DX9-DX11.

At least ATI is doing some form of custom AA with edge detection and more with shaders.

    http://developer.amd.com/assets/Architecture_Overview_RH.pdf (page 42 and beyond)

Well, the algorithm's developer says the algorithm is better suited for the GPU than the CPU (on PC). And a 3.0GHz quad-core CPU needs ~5ms for Intel's MLAA...

    http://visual-computing.intel-research.net/publications/mlaa.pdf
     
    #225 Neb, May 31, 2010
    Last edited by a moderator: May 31, 2010
  6. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,577
    Likes Received:
    16,028
    Location:
    Under my bridge
Possibly. I have repeatedly said we may yet see an MLAA algorithm running on GPUs. However, it is wrong to say at this point, "it can be done on a DX9 GPU," when the algorithm, the hardware architecture and the current lack of examples do not support that view. It's also wrong to say nothing prevents other architectures running a better solution. Architectures have inherent limits due to the compromises made to suit them to their designed workloads; programmability versus performance in the case of GPUs. If GPUs were as capable as CPUs at running 'CPU code', we wouldn't have CPUs now, would we? ;)

    Again, that's not to say the task cannot be mapped onto GPUs; I have remarkable faith in the ingenuity of developers and researchers! But let's not jump the gun and say DX9 GPUs are capable of producing better IQ MLAA than GOW3 on faith alone.

How is it hit or miss? It's at worst no more hit-and-miss than HDR breaking MSAA.

That's edge detection for optimised multisampling. MLAA requires information for the whole edge, not just the immediate locality, and for this reason GPUs are a very poor fit. Optimised edge-based multisampling will provide better quality on sub-pixel meshes than MLAA, and may well offer excellent IQ that's a suitable alternative. Again though, nothing yet points to GPUs being able to implement the MLAA concept of filling in the antialiased detail based on edge information.
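To illustrate why whole-edge information matters (a hypothetical sketch, not the GOW3 or Intel code): the blend weight for a pixel depends on where it sits along the entire edge run, which means walking the edge mask in both directions rather than looking at a fixed local neighbourhood.

[code]
// Illustrative only: given an edge mask from a detection pass, find how
// far an edge run extends to the left and right of pixel (x, y). The
// coverage estimate for each pixel depends on its position within the
// whole run, information a purely local per-pixel shader does not have.
#include <cstdint>
#include <vector>

struct EdgeRun { int start, end; }; // inclusive pixel range on this row

EdgeRun findHorizontalRun(const std::vector<uint8_t>& mask,
                          int width, int x, int y)
{
    const uint8_t* row = &mask[y * width];
    int start = x, end = x;
    while (start > 0 && row[start - 1]) --start;    // walk left
    while (end + 1 < width && row[end + 1]) ++end;  // walk right
    return { start, end };
}
[/code]

The walk is inherently sequential and its length varies per edge, which is exactly the kind of divergent, data-dependent work current GPUs handle poorly.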
     
  7. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
    First of all I said...

    ..which is a big difference from your sentence morphing.

    Never have I stated it would do better... nor that it would be DX9 GPUs.

    Tell that to the sub pixel sized polygons and transparency.

The Intel MLAA concept seems well suited for GPUs, or are the Intel MLAA algorithm developer(s) lying in their document? (see comments from previous post)
     
  8. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
I think on PC they prefer a general solution, since the hardware has an abundant and upgradable amount of spare capacity. They can always scale MSAA higher and higher to get similar results, so the benefits may be smaller. However, Intel's MLAA should be doable on modern GPUs.

The real question is not whether it can be done on certain GPUs. It's whether it can be done within the allocated time, while reducing/minimizing the so-called strobing effects and supporting sub-pixel handling (if necessary). Why did the Metro 2033 guys not implement MLAA?
     
  9. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
Well, it might be about pushing hardware sales. But that's just speculation!

Yes, either on GPU or CPU, as it has a low impact on CPU performance and many games don't use all threads nor keep them fully occupied. Think about Crysis: it only uses 2 threads, leaving a quad-core with 2 idling threads, and the 2 threads in use are not kept fully busy!

The performance headroom should give them a wide playing field, which will change shape depending on efficiency. About Metro 2033 I don't know, but perhaps they wanted a solution suitable for multiplatform. According to the devs, quality should be slightly worse than MLAA at certain angles. I think they mentioned a ~5ms performance hit on Xenos.
     
  10. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
I am not familiar with the PC pipeline these days.
Don't you have to let the CPU read and update video memory efficiently first, especially when people love 60fps or higher on PCs with everything turned on?

I remember some other posters complained that aliasing is visible on near-vertical (or horizontal?) lines. How does Metro's AA compare with Intel MLAA?
     
  11. Johnny_Physics

    Newcomer

    Joined:
    Sep 12, 2003
    Messages:
    205
    Likes Received:
    3
    Location:
    Norway
    That demo is totally awesome.
For those on slower connections (like me) or slower computers/laptops (also like me), here's another link:
    http://www.youtube.com/watch?v=ON4N0yGz4n8

legendCNCD, would you mind if I PM'ed you about a few things?
     
  12. T.B.

    Newcomer

    Joined:
    Mar 11, 2008
    Messages:
    156
    Likes Received:
    0
That's some strong wording you've got there. Alex Reshetov thinks it should be portable to GPUs; I think it's not a problem well suited to them. I can see that that's confusing, but it's no reason to start calling people liars, especially when talking about their published research.

So, I just had a quick look at Alex's code and it seems that he's distributing work in blocks of 8 (horizontal or vertical) scanlines(*). So, is that embarrassingly parallel? For very large numbers of lines, it pretty much is. For small numbers of lines, it isn't. Given a 720p image, I get a maximum of 90 blocks for the vertical case. Is that enough for GPU parallelization? Not on CUDA it isn't. But on Larrabee it just might be.

    In other words: Different perspectives, different opinions.

    (* No, that's not what we do.)
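For concreteness, the block count falls straight out of the image height. A quick arithmetic sketch, assuming the 8-scanline work distribution described above (again, not how T.B.'s own implementation works):

[code]
// Back-of-the-envelope only, assuming work is split into blocks of 8
// scanlines as in the reference code described above. It shows how few
// independent work items a 720p frame yields.
#include <cstdio>

int main()
{
    const int scanlinesPerBlock = 8;
    const struct { const char* name; int width, height; } res[] = {
        { "1280x720",  1280, 720  },
        { "1920x1080", 1920, 1080 },
        { "2560x1600", 2560, 1600 },
    };
    for (const auto& r : res)
    {
        const int blocks = r.height / scanlinesPerBlock;
        std::printf("%-10s -> %4d blocks of %d scanlines\n",
                    r.name, blocks, scanlinesPerBlock);
    }
    // 720 / 8 = 90 blocks: plenty for a handful of CPU cores or a
    // many-core x86 design, but far short of the thousands of threads a
    // contemporary CUDA GPU needs in flight to hide memory latency.
    return 0;
}
[/code]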
     
  13. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    Ok, I see where the implementation discrepancies lie now. Thanks for clarifying, T.B.
     
  14. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
And neither have I called anyone a liar.


    http://visual-computing.intel-research.net/publications/mlaa.pdf
     
    #234 Neb, May 31, 2010
    Last edited by a moderator: May 31, 2010
  15. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    It looks like they are referring to pure CPU performance on an image when talking about their multicore implementations. In a game, the entire run-time needs to be integrated and shared with the GPU pipeline. Someone should already be experimenting with the setup on the PC side as we speak.

The author mentioned that his current CPU framework can be used for raytracing too. But that does not necessarily mean it can run a raytraced game efficiently. He may be talking about something of a much larger scale (hence massively parallelizable). This would tie in with T.B.'s observation (that it may be difficult for a small dataset, but can scale better for a large dataset): [size=-2]Remember, we are talking about a research paper in general. The application area is open and left for implementors to tackle on a case-by-case basis.[/size]

    EDIT:
This line of discussion reminds me of Guerrilla's slides and comments about dynamic radiosity:
    http://forum.beyond3d.com/showpost.php?p=1434701&postcount=209

    What can be done if the CPU, GPU and memory are "tightly coupled" ?
     
  16. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
Yes, the other link has one example of Intel MLAA performance on PC.

With software rendering, SSAA or MSAA(?) would cost tremendously, and perhaps that is what MLAA was mainly targeted at. Anyway, they give some megapixel numbers for what a single CPU thread can process per second with un-optimised code. And aren't HW like Nvidia and ATI GPUs mass multi-core hardware that relies on large-scale parallelisation?
     
  17. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
Yes, a single-threaded implementation means he's not relying much on parallelization (for now). :)

    And if he sets his eyes on grand challenges, his parallelization numbers and approach may not be applicable for small problems like games. :)
    There is overhead in parallelization. It's easier to hide and spread the overhead for larger datasets (and longer time horizon).
     
  18. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
Well, it is an example of performance, but it does mention it's 'embarrassingly parallel'. Anyway, whether it's done for a scene in a modelling program or in a game, both are images per second with X amount of megapixels per frame. The MLAA would be applied to the final output frames.

    I find this interesting.

     
  19. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
Exactly. I don't think it conflicts with T.B.'s observations/findings. For a large enough dataset and time horizon, you can splurge on the cores. There are data dependencies/ordering in the dataset, so for a small problem set you may not gain as much. Note that the author never promised that his algorithm will be applicable in all situations. He uses 8-bit data so that it can be ported to more platforms (CPU and GPU).
     
  20. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
As I understand it, it scales better at higher resolutions. 720p may not be suitable for CUDA, but how about the resolutions used on PC, which range from 2-4x that pixel count?

But would you really have to gain a lot to have it run fast? The example was a 3.0GHz quad-core having MLAA take 5ms of render time on a 1024x1024 frame.

That is understandable and raises the question of how to find a solution: adapt to the hardware. Isn't GOW3's MLAA a modified version of Intel's MLAA? I mean, it isn't running a carbon copy of the Intel MLAA 'un-optimised' algorithm, or is it? What about the other games' solutions on all platforms? :)
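As a rough feel for the numbers: a back-of-the-envelope extrapolation of that ~5ms / 1024x1024 figure to common PC resolutions, assuming the cost scales roughly linearly with pixel count (which ignores cache and memory effects, so treat it as a guess rather than a measurement):

[code]
// Rough scaling estimate only: extrapolate the quoted ~5 ms for a
// 1024x1024 frame on a 3.0 GHz quad-core, assuming cost grows linearly
// with pixel count. Purely illustrative; not measured data.
#include <cstdio>

int main()
{
    const double baseMs     = 5.0;
    const double basePixels = 1024.0 * 1024.0;
    const struct { const char* name; int w, h; } res[] = {
        { "1280x720",  1280, 720  },
        { "1920x1200", 1920, 1200 },
        { "2560x1600", 2560, 1600 },
    };
    for (const auto& r : res)
    {
        const double ms = baseMs * (double(r.w) * r.h) / basePixels;
        std::printf("%-10s -> ~%.1f ms (linear extrapolation)\n", r.name, ms);
    }
    return 0;
}
[/code]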
     