AA/AF enhancements

psurge said:
related to the subject of out of range texture samples:

assuming ddx and ddy work on 2x2 pixel stamps simply by computing
a difference between register values in adjacent stamp pixels, what happens when values in the stamp are missing (at the edges, due to z-rejects, etc.)?

At first I thought that you would just run the pixel shader for the missing samples (allowing ddx/ddy to be computed as they normally are), and discarding the actual results... But, you might easily end up running a shader with invalid inputs. Any ideas on how to handle this?
I'm not sure if I understand your question, but here's an answer :)

If you sample outside the polygon, you get your texture sample based on the current texture addressing mode. That's why sampling outside of the polygon isn't always a big deal (like on the edges of two triangles joined to form a quad).
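
To make the stamp mechanics concrete, here is a minimal Python sketch of derivatives taken by differencing register values across a 2x2 stamp. The particular scheme (forward differences from the top-left pixel) is an assumption for illustration only; real hardware may difference per quadrant or otherwise.

Code:
# Hypothetical 2x2 stamp: values[y][x] holds a shader register value for
# each stamp pixel.  Derivatives are simple forward differences across
# the stamp -- one plausible scheme, not a description of any real chip.
def quad_derivatives(values):
    v00, v01 = values[0][0], values[0][1]   # top-left, top-right
    v10 = values[1][0]                      # bottom-left
    ddx = v01 - v00   # horizontal difference within the stamp
    ddy = v10 - v00   # vertical difference within the stamp
    return ddx, ddy

# Example: a value increasing by 0.25 per pixel in x and 0.10 per pixel in y.
print(quad_derivatives([[0.00, 0.25], [0.10, 0.35]]))   # -> (0.25, 0.1)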
 
Coverage mask AA techniques can process samples in a pixel with the same precision as individual point samples if the information is kept around to do so. This includes the correct processing of implicit edges. Z3 does this by using z slopes (hence Z3: the z, dzdx, and dzdy); there are other techniques, but this is the most effective. FAA did not bother to do this.

When using z slopes with coverage mask AA, the result is actually a form of tile based renderer with pixel sized tiles. The rendering is done in two passes, much like a typical TBR. The first pass renders to "tiles" that are a rectangular collection of higher resolution screen samples. The second pass then renders each tile to the final frame buffer.

The major differences compared to typical TBRs are the format of the data in the tile and the size of the tile. In the case of coverage mask AA, the data is kept in the tile as a set of geometry information (z slopes and a z value), bit masks, and colors. This means that colors are rendered on the first pass, which is different from a deferred rendering tiler, which keeps the geometry as triangles and unrendered color information. However, both render at a lower resolution than the sample resolution on the first pass and collect up fragments in the "tile" buffer to be rendered on the second pass. On the second pass, coverage mask AA processes each pixel-sized "tile" by processing each of the fragments found on the first pass, typically front to back, to produce the final color for the pixel.
A typical TBR, of course, processes each tile by processing each of the triangles in the tile (not necessarily front to back).
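
As a very rough sketch of what that second pass might look like (my own simplification in Python, not a description of Z3 or any shipping hardware; the z-slope handling is omitted and fragments are assumed to be sorted front to back already):

Code:
# Hypothetical resolve of one pixel-sized "tile".  Each fragment carries a
# colour and a coverage bit mask over the pixel's sub-samples.
def resolve_pixel(fragments, num_samples=16):
    remaining = (1 << num_samples) - 1      # samples not yet coloured
    accum = [0.0, 0.0, 0.0]
    for colour, mask in fragments:          # walk fragments front to back
        visible = mask & remaining          # samples this fragment wins
        weight = bin(visible).count("1") / num_samples
        for c in range(3):
            accum[c] += weight * colour[c]
        remaining &= ~visible
        if remaining == 0:                  # pixel fully covered, stop early
            break
    return tuple(accum)

# Two fragments each covering half of a 16-sample pixel:
print(resolve_pixel([((1.0, 0.0, 0.0), 0x00FF),
                     ((0.0, 0.0, 1.0), 0xFF00)]))   # -> (0.5, 0.0, 0.5)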

Coverage mask AA derives many of the same benefits as typical TBRs, that is, it substantially reduces external memory bandwidth for a given sample resolution. This is because the processing on the first pass is done at a lower resolution than the sample resolution, and the results are kept in a compressed format for the second pass which processes each tile.

There is one notable difference when it comes to AA, though, between coverage mask techniques and typical TBRs. Typical TBRs are capable of rendering super-sampled AA at high sample densities (say 16x) while coverage mask techniques must use multi-sampling. This is because TBRs do not render the color on the first pass, but only when the tile is rendered. Since coverage mask AA renders colors on the first pass, it can only afford to render one color per fragment, since rendering a color for each sample would lose most of the benefits.

This may not be much of an advantage in actual practice though, since a TBR would be limited by pixel shader performance, even though it is not restricted by external memory bandwidth. Super-sampling each sample when running a sophisticated pixel shader is simply impractical. The computational resources could be put to much better use and multi-sampled AA is good enough.
 
dsx/dsy work fine on the edges of a poly, as the plane eqn is continuous (although it assumes it to be linear, so there's room for error there). This is also pretty much what all LOD calcs are based on these days...

However, a slight aside to this: they can fall to pieces when dynamic flow control is used, as each pixel in the 2x2 stamp may end up going down a different path, so that at the point that any one pixel performs the dsx/dsy, the source values for the other pixels may actually be invalid, or at least not what's expected. In these circumstances I believe PS3.0 states that they return 0 (or maybe it just ignores the issue in the doc). Basically this means that anyone using this type of functionality and dynamic flow control may have to resort to other methods.

John.

PS - this also applies to LODs supplied to texld and texldd...
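
Since the LOD point comes up here, this is roughly the textbook calculation those derivatives feed (a standard formulation sketched in Python, not any particular chip's implementation):

Code:
import math

# Mip LOD from texture-coordinate derivatives: ddx_uv and ddy_uv are the
# per-pixel changes of (u, v), already scaled to texel units.
def mip_lod(ddx_uv, ddy_uv):
    rho_x = math.hypot(*ddx_uv)             # footprint extent along screen x
    rho_y = math.hypot(*ddy_uv)             # footprint extent along screen y
    rho = max(rho_x, rho_y)
    return math.log2(max(rho, 1e-12))       # clamp to avoid log2(0)

# Texture minified by 4x in both directions -> LOD 2.
print(mip_lod((4.0, 0.0), (0.0, 4.0)))      # -> 2.0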
 
Ostsol said:
According to an FSAA tester program... it should be like this:
Code:
* - - - - -
- - - * - -
- * - - - -
- - - - - *
- - * - - -
- - - - * -

If you superimpose all the sample positions from all three available modes, it appears that R300 utilised a 12x12 sparse sample grid.
 
Completely OT (and pardon me for that), but it has been a mighty long time since SA has posted anything on these boards.
 
Code:
* - - - - -
- - - * - -
- * - - - -
- - - - - *
- - * - - -
- - - - * -

On the 6xAA, I have wondered why that particular pattern was chosen.

As was mentioned in a previous thread, if you show the whole continuous grid you can get all the sparse sample patterns for 6x6 depending on where you choose.

That pattern seems to be a bit 'cramped' in the top-left to bottom-right direction, so why not this one:

Code:
- - - * - -
- * - - - -
- - - - - *
- - * - - -
- - - - * -
* - - - - -

or one of the others?
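
To make the 'continuous grid' point concrete: the alternative pattern above is just the original one with its rows cyclically shifted, i.e. a different window onto the same tiled pattern. A quick Python check, taking the sample positions straight from the two diagrams:

Code:
# (row, column) of the '*' in each 6x6 pattern shown above.
original    = [(0, 0), (1, 3), (2, 1), (3, 5), (4, 2), (5, 4)]
alternative = [(0, 3), (1, 1), (2, 5), (3, 2), (4, 4), (5, 0)]

# Shift the original up one row with wrap-around (toroidal tiling); if the
# result matches, both are windows onto the same repeated pattern.
shifted = sorted(((r - 1) % 6, c) for r, c in original)
print(shifted == sorted(alternative))   # -> True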
 
If they eliminate the ONLY FSAA option (aka QCX AA) I will NEVER EVER buy an nVIDIA card.
They had best keep QCX AA in. Just because YOU hate it doesn't mean other people hate it.

I for one believe QCX AA is the only AA mode in existence worth using.

Thank you very much.

Typedef Enum said:
I can only hope, along with pretty much everybody else, that NV40 will finally address this issue, and ultimately eliminate that crappy Quincunx mode altogether.
 
Why is it that every time I try to post my serious opinion, people think I'm joking? :)

But every time I try to be funny, people don't laugh.

People. :p
 
K.I.L.E.R said:
Why is it that every time I try to post my serious opinion, people think I'm joking? :)

But every time I try to be funny, people don't laugh.

People. :p


Uhhhhm.... LOL? :oops:
 
Randell said:
:(

maybe Rev drove him off with his questioning of who he was.

Did Rev actually ask him?

In any case I don't think it would be reason enough for him to vanish. Maybe there just isn't anything exciting enough around for him to comment on. :cry:
 
Bambers said:
That pattern seems to be a bit 'cramped' in the top-left to bottom-right direction, so why not this one:
....
or one of the others?

An optimal sampling pattern for antialiasing with N samples from an NxN grid can be derived by solving the n-queens problem.

I suppose ATI’s engineers tested all the solutions and chose the one with the best results for the usual game environments.

The SRT Rendering Toolkit
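
For reference, a minimal backtracking solver in Python that enumerates such patterns (no two samples sharing a row, column, or diagonal); for a 6x6 grid there are only four of them:

Code:
# Minimal n-queens backtracking: each solution lists the sample column for
# every row, with no two samples sharing a row, column, or diagonal.
def n_queens(n, cols=()):
    if len(cols) == n:
        yield cols
        return
    r = len(cols)
    for c in range(n):
        if all(c != pc and abs(c - pc) != r - pr
               for pr, pc in enumerate(cols)):
            yield from n_queens(n, cols + (c,))

for solution in n_queens(6):
    print(solution)   # four solutions for n = 6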
 
But the patterns shown above are not solutions to N-queens, since two diagonals have attacking queens.

BTW, do you have any references to a proof that the N-queens solutions are optimal sampling patterns for N-by-N sparse grids?

I mean, intuitive plausibility (no two samples on the same row, column, or diagonal) doesn't necessarily mean that it is the optimal pattern for a given function. It might seem a good "precondition". That is, any optimal pattern is at least a solution to N-queens, or at most a distance of epsilon from an N-queens solution for some metric, but there must be some other constraints, such as considering the critical angles in most scenes.
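
The diagonal conflicts are easy to check in a few lines, taking the sample columns per row straight from the pattern posted above:

Code:
# Columns of the '*' in each row of the 6x pattern posted above.
cols = [0, 3, 1, 5, 2, 4]

# Two samples attack along a diagonal when their row and column distances
# are equal; rows and columns are already all distinct here.
attacks = [(a, b) for a in range(6) for b in range(a + 1, 6)
           if abs(cols[a] - cols[b]) == b - a]
print(attacks)   # -> [(1, 3), (2, 5)], i.e. two diagonal conflicts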
 
One thing I don't understand how to formulate is precisely what you mean by 'critical angles for most scenes'.

It seems like it's mostly the pathological cases that you want to optimize your grid pattern for, not necessarily the average case.
 
There are many pathological cases, and you can't optimize for all of them, so which do you pick? Given that the vast majority of games (except for shooters and flight sims) have a viewplane with fewer DOF, you should optimize for cases that occur in such settings. That's why ATI's AF implementation works so well.

This assumes you are forced to choose. Perhaps your grid is so large and you are taking so many samples that you can handle almost all pathological cases well. The question is, are 6x6 or 12x12 positions enough? And if not, what is the best you can do with 6x6 or 12x12?
 
Is this thread now about chess? :LOL:
But the patterns shown above are not solutions to N-queens, since two diagonals have attacking queens.

Anyway, as OpenGL Guy has said, he can arrange the sample patterns however he wants to.
I'm sure OpenGL Guy won't actually attempt the chess puzzle just so he can waste time trying to make AA look any better than it currently is.

If anything I would ask GL Guy to implement a filter like the QCX one.
 
If anything I would ask GL Guy to implement a filter like the QCX one.

There's an alternative solution for that. Take a large water tank, pour a glass of milk into it, and put it in front of your monitor. :oops:
 
Or you could just buy a monitor with a focus control and turn it up/down a bit.

Or if you haven't got that, take the case off and twiddle the focus knob at the back :p
 
Pavlos said:
An optimal sampling pattern for antialiasing with N samples from an NxN grid can be derived by solving the n-queens problem .
Surely that doesn't take into account the replication of the pattern.

I would have thought a system that, when replicated, satisfied a Poisson-disk distribution would be a better aim.

Of course, there is another tradeoff, and that is the cost of implementation. It would be cheaper in hardware to restrict the samples to certain fixed fractions, and that may be what really limits the sampling pattern to these choices.
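
One crude way to compare candidate patterns under replication is the smallest distance between any two samples measured with wrap-around, so that the tiling is accounted for; a pattern closer to a Poisson-disk distribution keeps that minimum large. A sketch in Python, assuming unit spacing between grid positions:

Code:
import itertools, math

# Minimum pairwise sample distance of an n x n pattern, measured with
# wrap-around so the replication (tiling) of the pattern counts.
def min_wrapped_distance(cols, n=6):
    pts = list(enumerate(cols))             # (row, column) of each sample
    best = float("inf")
    for (r1, c1), (r2, c2) in itertools.combinations(pts, 2):
        dr = min(abs(r1 - r2), n - abs(r1 - r2))
        dc = min(abs(c1 - c2), n - abs(c1 - c2))
        best = min(best, math.hypot(dr, dc))
    return best

print(min_wrapped_distance([0, 3, 1, 5, 2, 4]))   # the 6x pattern above, ~2.24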
 