Since the recent use of screen-space AA techniques in Saboteur has generated a lot of interest (original console technology thread), I thought it would make sense to create a thread dedicated to these techniques, and particularly to implementing them on GPUs (e.g. using OpenCL). I've actually been toying with ideas in this space for a long time, with the goal of getting the best filtering of 720p images possible in under 10 ms.
First, here are the phases I'd consider in such an approach:
- edge detection - this can be done based on RGB, luminance, Z, or any combination of those. This step only consists of finding areas of large gradients; these aren't necessarily ones we want to filter
- "jaggy" detection - based on information from the previous step, find the start and end of staircase patterns (aliasing)
- slope calculation - find out the length of line fragments identified by start and end points (can also be dependent on surroundings if you want high quality)
- weight calculation - based on the slope and start/end points, calculate blending weights for in-between pixels
- blend - actually perform the blending
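To make the first of these phases concrete, here's a minimal CPU prototype of luminance-based edge detection. This is NumPy rather than OpenCL, and the function names, the Rec. 601 luma weights, and the 0.1 threshold are all just illustrative choices, not part of any particular implementation:

```python
import numpy as np

def luminance(rgb):
    # Rec. 601 luma weights; any perceptual weighting would do here.
    return rgb @ np.array([0.299, 0.587, 0.114])

def detect_edges(rgb, threshold=0.1):
    """Flag pixels whose luminance differs strongly from the pixel
    below (horizontal edges) or to the right (vertical edges).
    Returns two boolean masks, one per edge orientation."""
    luma = luminance(rgb)
    h_edges = np.zeros(luma.shape, dtype=bool)
    v_edges = np.zeros(luma.shape, dtype=bool)
    # Difference against the row below / the column to the right.
    h_edges[:-1, :] = np.abs(luma[:-1, :] - luma[1:, :]) > threshold
    v_edges[:, :-1] = np.abs(luma[:, :-1] - luma[:, 1:]) > threshold
    return h_edges, v_edges
```

As noted above, these are just large-gradient areas; the later phases decide which of them are actual staircase patterns worth filtering.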
Of course, these can be integrated or done separately. Interestingly, all of the steps except for slope and weight calculation can be done quite effectively with traditional pixel shaders, no fancy GPGPU-specific stuff needed. (the problem with weight calculation is that there is an arbitrary number of output values)
Here's a high level view of this:
Of course, it might make sense to combine the blending into the blend calculation step and never explicitly store the weights, or to start from a luminosity buffer for the edge detection instead of an image if you already have one for some purpose, but those are implementation details.
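As a sketch of what the last two phases might look like when fused, here's a toy version of weight calculation plus blending for one side of a detected staircase step. The linear falloff used here is a simplification for illustration, not the exact area-coverage computation a production implementation would likely use, and all names are hypothetical:

```python
import numpy as np

def blend_weights(length):
    """Per-pixel blend weights for one side of a staircase step of
    `length` pixels: a linear falloff from 0.5 at the step corner
    towards 0 at the far end. A simplification of the exact coverage
    areas a full implementation would compute from the slope."""
    i = np.arange(length)
    return (length - i) / (2.0 * length)

def blend_row(row, neighbour_row, start, length):
    """Blend `row` toward `neighbour_row` over [start, start+length),
    i.e. fuse the weight-calculation and blend phases so the weights
    are never stored explicitly."""
    out = row.copy()
    w = blend_weights(length)[:, None]  # broadcast over RGB channels
    out[start:start + length] = (1 - w) * row[start:start + length] \
                                + w * neighbour_row[start:start + length]
    return out
```

Fusing the two phases like this also sidesteps the "arbitrary number of output values" problem mentioned earlier, since each pixel only ever writes its own blended colour.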
The interesting stuff clearly happens in the jaggy detection and slope calculation steps. Here, I know of one particular method that seems like it could be very successful on GPUs:
"Double line scanning" as proposed here. This would require two passes: one for vertical edges and one for horizontal ones. The drawbacks I can see on GPUs is that you "only" get a degree of parallelism equal to half the number of lines/columns, and that quite a bit of dynamic branching is involved.
That's it for now, any comments are welcome (particularly if you actually implemented any method like this or similar, or are planning to!).