Convolution filtering as AA

Bumpyride

Newcomer
First post but long time reader, sorry if this is the wrong forum. I'm coming from a more audio signal processing point of view so forgive the lack of specific terminology. Anyway...

Why isn't convolution filtering used more in realtime 3D rendering for low-pass filtering (anti-aliasing)? With an appropriate filter mask, it seems like it could be implemented in a fixed fashion and would have an advantage over super-sampling in terms of memory use and bandwidth. It does have the downside of softening the entire scene, but that goes hand in hand with the positives of filtering transparent textures as well as polygon boundaries. I'm aware of nvidia's Quincunx AA modes, which used a tent filter as an approximation to the gaussian, but those didn't meet with much enthusiasm and haven't really been developed further (never saw them myself, so I can't say if they worked or not).

I'm just thinking about this because I recently got God of War for the ps2 and it has an option to soften the image. It comes at the cost of detail but I find I like enabling it more because getting rid of aliasing and noise is more pleasing to me than retaining all the detail. It just makes me think that a more robust filter, while not perfect, could be beneficial. Particularly on a console where a television isn't nearly as precise a display as a computer monitor. Anyway, I just wanted to see what people thought about it.
 
A convolution filter can soften the image and alleviate the impact of aliasing artifacts, but it cannot really remove them (other than completely evening the image out).

Just like when you do digital audio recording, in order to avoid aliasing you have to do two things:
1. choose a sample rate that is at least twice the highest "interesting" frequency (the audible range), and
2. use a low-pass filter that cuts out any higher frequencies, because they would otherwise be reflected back into the audible spectrum.

However, the low-pass has to take place before the sampling. After the sampling, no low-pass filter in the world can help you: the higher frequencies have already been mirrored back, so there are no frequencies in excess of half the sample rate any more.

So using a low-pass afterwards just removes frequencies that were intended to be there in the first place, which shows up as the reduction of detail (blur) you see with Quincunx.
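To make the folding concrete, here is a small sketch (plain Python, with made-up frequencies): a tone above the Nyquist rate produces exactly the same samples as its alias, so nothing you apply after sampling can separate them.

```python
import math

fs = 8.0                 # sample rate
f_high = 7.0             # tone above Nyquist (fs / 2 = 4.0)
f_alias = fs - f_high    # 1.0: the frequency it folds down to

# Point-sample both tones at the same instants.
n = 16
high  = [math.cos(2 * math.pi * f_high  * k / fs) for k in range(n)]
alias = [math.cos(2 * math.pi * f_alias * k / fs) for k in range(n)]

# The sample sequences are identical: once sampled, the 7 Hz tone and
# the 1 Hz tone are the same signal, so no filter applied afterwards
# can tell them apart.
assert all(abs(a - b) < 1e-9 for a, b in zip(high, alias))
```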

The big problem with aliasing in computer graphics is that you might be able to low-pass texture data (that's what mip-mapping is about(*)), but there is no known way to low-pass scene data. A polygon edge is a signal discontinuity, and as such contains infinitely high frequencies. What can be done is raising the sample rate to minimize visible aliasing artifacts.


(*) the texture filtering "low pass" traditionally has too high a cutoff frequency to remove all texture aliasing, btw. But this is a tradeoff between sharpness and aliasing.
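A minimal 1D sketch of "raising the sample rate" (the edge position and supersampling factor are made up): the hard edge is point-sampled at 4x and box-averaged down, which turns the discontinuity into fractional coverage values rather than removing the high frequencies before sampling.

```python
SS = 4        # supersampling factor (chosen for the example)
edge = 2.3    # edge position, in output-pixel units (made up)

def coverage(px):
    # Average SS point samples taken inside output pixel px.
    hits = [(px + (i + 0.5) / SS) >= edge for i in range(SS)]
    return sum(hits) / SS

row = [coverage(px) for px in range(5)]
print(row)  # [0.0, 0.0, 0.75, 1.0, 1.0] -- the step becomes a short ramp
```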
 
Ahhhhaahhhh. I wasn't thinking in terms of the sampling frequency. That clears things up nicely (on that matter at least).

Wouldn't it still be beneficial to use something more robust than a mean filter in the downsampling after the supersampled image is computed? I guess that goes back to Quincunx, but it seems like more development could have helped with its shortcomings.
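As a rough 1D sketch of what a wider-than-box downsampling filter does (the tent weights here are illustrative, not the actual quincunx layout, which works in 2D with samples shared between neighbouring pixels):

```python
# 4x supersampled step edge (made-up data).
row = [0.0] * 5 + [1.0] * 7

def downsample(samples, weights, centre):
    # Weighted average of taps centred on `centre`, clamped at the ends.
    lo = centre - len(weights) // 2
    taps = [samples[min(max(lo + i, 0), len(samples) - 1)]
            for i in range(len(weights))]
    return sum(t * w for t, w in zip(taps, weights)) / sum(weights)

box  = [1, 1, 1, 1]              # plain mean over one pixel's samples
tent = [1, 2, 3, 4, 4, 3, 2, 1]  # wider footprint overlapping neighbours

# Output pixel 1 covers supersamples 4..7; its centre sample is index 6.
print(downsample(row, box, 6), downsample(row, tent, 6))  # 0.75 0.7
```

The tent result is pulled toward its neighbours, which is exactly the "softening" tradeoff being discussed.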

Thanks.
 
Bear with me here, but I thought about it a bit and have another question.

It seems to me that a lot of the aliasing artifacts in graphics come from these infinite frequencies where a signal discontinuity is being sampled (polygon edges, transparent textures, etc.). That discontinuity would be infinite in frequency no matter what the sample rate, so a high-quality low-pass filter could be used to 'blur' the polygon edges while leaving most other details intact. I'm not saying this would be a particularly good approach; I'm just wondering if it would work.

The obvious trouble is that a gaussian filter kernel represents a low-pass filter with a fairly gradual frequency cutoff, so lots of harmless detail is lost in the process. However, using a good-sized and properly windowed sinc function as the filter kernel would give a low-pass filter with a much sharper cutoff (approaching a step function if the filter mask is large enough), so it would 'blur' the polygon edges and leave most other details unaffected. Of course, this would probably be prohibitively expensive computationally - I'm just curious to see if my thinking is right on this.

I'd also like to get some impressions on just how prohibitively expensive it would be. In acoustics, by my understanding at least, a windowed sinc filter may be as much as 40 taps wide. I'd imagine a 40x40 filter mask would be far too much for any kind of reasonable post-processing, but what size mask would be reasonable? Isn't something similar to a gaussian filter used in the tone mapping for HDR images?
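For what it's worth, the windowed sinc described above is easy to sketch in pure Python; the cutoff and width are arbitrary choices here, and Blackman is just one possible window:

```python
import math

def sinc(x):
    # Normalized sinc: sin(pi x) / (pi x)
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def windowed_sinc(cutoff, half_width):
    # Low-pass FIR taps: ideal sinc at the given cutoff (cycles/sample),
    # shaped by a Blackman window, then normalized for unity DC gain.
    n = 2 * half_width + 1
    taps = []
    for i in range(n):
        x = i - half_width
        w = (0.42
             - 0.5 * math.cos(2 * math.pi * i / (n - 1))
             + 0.08 * math.cos(4 * math.pi * i / (n - 1)))
        taps.append(2 * cutoff * sinc(2 * cutoff * x) * w)
    s = sum(taps)
    return [t / s for t in taps]

k = windowed_sinc(cutoff=0.25, half_width=8)  # a 17-tap example filter
assert abs(sum(k) - 1.0) < 1e-9               # unity gain at DC
assert any(t < 0 for t in k)                  # still has negative lobes
```

Even windowed, the kernel keeps negative lobes, which matters for the ringing discussion below.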
 
Ringing might not offend the ears, but it does offend the eyes. Our eyes perform area integration, not low-pass filtered point sampling ... you have to be careful applying sampling theory to vision; it's not as simple as in audio.

A little bit of ringing can look good, but the ideal sinc filter is far from ideal ;)
 
Good point. Proper windowing should reduce the ringing, but I think that just comes back to needing a ridiculously large filter.

Anyway, thanks for the help guys. I'm finding it interesting trying to apply my limited knowledge of digital filters to my limited knowledge of graphics.
 
Bumpyride said:
The obvious trouble is that a gaussian filter kernel represents a low-pass filter with a fairly gradual frequency cutoff, so lots of harmless detail is lost in the process. However, using a good-sized and properly windowed sinc function as the filter kernel would give a low-pass filter with a much sharper cutoff (approaching a step function if the filter mask is large enough), so it would 'blur' the polygon edges and leave most other details unaffected. Of course, this would probably be prohibitively expensive computationally - I'm just curious to see if my thinking is right on this.

<bolding on my part>

well, how do you determine which of those other high freqs can go together with the poly edges? 8)

IMO, until we get to use displays with a native resolution close to the limit of human visual acuity, any post-filtering of the already reconstructed image will kill many potentially 'good' freqs. yet, once we get to such resolutions, a post-filtering like the one you propose could be used to make the image 'reality-perfect' - i.e. remove any subtly-disturbing freqs in the already 'nearly-perfect' signal.

apropos, check out the Anti-Grain Geometry software library - an extremely good 2d rasterizer; it also has an example of recurrent reconstruction filtering.
 
I was thinking more from the point of view that the aliasing is more irritating to the eye than the lack of detail - to a certain extent. Higher resolutions would help because the frequency of the polygon edges would still be infinite, but the frequency of meaningful details would be lower compared to the sampling rate (and would thus be affected less by the filter - by my reckoning at least).

It seems to me, and please correct me if I'm mistaken, that realtime graphics are using more and more post-processing to get effects like depth of field, high dynamic range, and the like. Wouldn't it make sense to build hardware into GPUs that can accelerate these types of filtering?

That anti-grain geometry website is really interesting. I've been trying to find some sources that demonstrate digital filtering in image processing from a somewhat basic level, since what I do know doesn't apply very directly (as Mfa pointed out) and I don't have a lot of time to 'waste' just because I find it interesting.
 
Bumpyride said:
I was thinking more from the point of view that the aliasing is more irritating to the eye than the lack of detail - to a certain extent.
It's a difficult tradeoff. Our eyes try to focus on an image until the detail is sharp, so if the whole image is out of focus, it is hard to look at.

I interviewed at Real3D (obviously many years ago) and they showed off an arcade game that used their technology. The engineer told me about how they calculated the necessary mip-map level to eliminate aliasing in the texture maps, and how the game designer biased the mipmap level sharper so that it would look aliased, but less blurry...

Enjoy, Aranfell

PS: As distracting as visual artifacts are, I think sound artifacts are worse. I was taught in a video class that people will watch a video with horrible visual noise (snow etc.), but will turn it off if there is very much noise in the audio. That matches my own experience.
 
Bumpyride said:
Good point. Proper windowing should reduce the ringing, but I think that just comes back to needing a ridiculously large filter.
Not really, because a sinc filter with a large window will still ring ... if you only use, say, 2 lobes it isn't so large anymore (the Lanczos2 filter). If you are really interested in this stuff read this paper, and some of the papers which cite it.

I am personally of the opinion that reconstruction filters shouldn't have negative taps ... if you want sharpening, apply it in post-processing.
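For reference, the Lanczos2 kernel mentioned above is a sinc windowed by a stretched copy of itself, truncated to two lobes per side, and it does have the negative taps in question (a sketch, pure Python):

```python
import math

def lanczos2(x):
    # Lanczos kernel with a = 2: sinc(x) windowed by sinc(x / a),
    # truncated to two lobes on each side.
    a = 2
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

# Between its first and second zero crossings the kernel dips below
# zero -- these are the negative taps (and the source of ringing).
assert lanczos2(0.5) > 0
assert lanczos2(1.5) < 0
```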
 
You're right. Ringing does go down with a larger window, but not very quickly. You could choose a window (like Blackman) that would reduce ringing at the expense of a more gradual transition at the cutoff (more blurring).

All of this is academic really as I'm beginning to understand the difficulties with image processing. Your eyes don't process information in terms of frequency like your ears do. Hence, examining a filter in terms of frequency response ignores the things that are important in images like edge preservation and smoothness. It's really fascinating stuff, though, and thanks for the link to that paper.

Just out of curiosity, why do you feel that sharpening is better done after reconstruction than during? I'm sure that makes sense but I'm not quite with you on it.

As aranfell pointed out, these types of problems are equally annoying in audio when you come across them. Fortunately, the ear is quite good at filtering out noise, and there are never any discontinuous signals (aside from shockwaves) to deal with. So problems of this type are rare, but taking things to the next step from acceptable sound to actually fooling the ear becomes a sticky issue. The phase response of any filtering, as well as the rest of the reproduction chain, becomes an issue and you can get mired down pretty quickly.
 
For one, because it confuses the issue between what the reconstruction should do and aesthetics. Also, unsharp masking is a really outdated sharpening algorithm.
 