AA/AF enhancements

Given that the biggest speed enhancement the NV35 had over the NV30 is with 4x AA, I suspect their color/z compression algorithms are not so great...

(This is why I don't think NV35 is what NV30 was supposed to be. I doubt NV30 was ever supposed to be 256bit. NV35 is a desperate attempt to catch up. Adding 256bit memory was probably the simplest thing (!) they could change in the architecture.)
 
Uttar said:
OpenGL guy said:
Why are you arguing with me?

Maybe we want to make you confused until you say everything? ;) j/k

More seriously though, let me summarize this...
ATI *is* using a sparse grid
That picture is incorrect.
ATI's goal was not to have one sample in each row and column of a 6x6 grid for a total of 6 samples, although that may still be the case (is it, actually? Hmm...)


Uttar
Could you please refer back to earlier parts of the discussion, and show the worst case scenario with ATI's 6x that produces worse AA than their 4x?
I am honestly curious.
 
Anyone willing to bet that the _AA solution_ will come from PowerVR ? Probably some kind of coverage mask AA. Should be reasonable to do on chip with TBR architecture.
 
Ostsol said:
OpenGL guy said:
Chalnoth said:
Btw, here's an edited pic that I pulled off of this post:

sparse.jpg
And your picture is incorrect.
According to a FSAA tester program. . . it should be like this:
Code:
* - - - - -
- - - * - -
- * - - - -
- - - - - *
- - * - - -
- - - - * -
Which is precisely what that picture shows. Notice the grey pixel. That's supposed to be the center of the pixel. Just move the bottom row to the top and you'll have the picture you wrote out.
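
For anyone who wants to check the "sparse" property by hand, here is a minimal Python sketch. It assumes the pattern exactly as typed out above (which may or may not match the real hardware positions), and the helper names `parse`, `is_sparse` and `rotate_rows` are made up for illustration. It checks that each row and column of the 6x6 grid holds exactly one sample, and that cyclically moving a row from the bottom to the top does not change that.

```python
# Pattern as written in the post above; '*' marks a sample position.
PATTERN_6X = [
    "* - - - - -",
    "- - - * - -",
    "- * - - - -",
    "- - - - - *",
    "- - * - - -",
    "- - - - * -",
]

def parse(rows):
    """Return (column, row) pairs for every '*' cell."""
    return [(c, r) for r, line in enumerate(rows)
            for c, cell in enumerate(line.split()) if cell == "*"]

def is_sparse(samples, n):
    """True if every row and every column of the n x n grid holds exactly one sample."""
    cols = sorted(c for c, _ in samples)
    rows = sorted(r for _, r in samples)
    return cols == list(range(n)) and rows == list(range(n))

def rotate_rows(rows, k):
    """Cyclically move the last k rows to the top."""
    k %= len(rows)
    return rows[-k:] + rows[:-k] if k else list(rows)

samples = parse(PATTERN_6X)
print(samples)                                           # [(0, 0), (3, 1), (1, 2), (5, 3), (2, 4), (4, 5)]
print(is_sparse(samples, 6))                             # True
print(is_sparse(parse(rotate_rows(PATTERN_6X, 1)), 6))   # True
```

Rotating rows only relabels which row is "first", so the one-per-row, one-per-column property survives any such shift.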

I select those 6 samples because those were the ones highlighted in the thread.

And OpenGL Guy, you're still not saying why it's wrong. The picture I posted was taken off of a Radeon 9700. The fact that the samples chosen were not all from the same pixel doesn't matter, not for this argument (it will have an effect on the edge AA, I'm sure, but it still shows what I meant it to...).

As for the fact that one cannot do 4x sparse sampling FSAA on a 6x6 grid, one can just use a 4x4 grid instead.

Edit:
Btw, one final thing. Whether or not it actually samples from a grid is a hardware optimization issue. The end result is that it certainly appears to.
 
Althornin said:
Could you please refer back to earlier parts of the discussion, and show the worst case scenario with ATI's 6x that produces worse AA than their 4x?
I am honestly curious.
I'll see if I can point it out adequately.

4x FSAA:
Code:
- x - -
- - - x
x - - -
- - x -
6x FSAA:
Code:
x - - - - -
- - - x - -
- x - - - -
- - - - - x
- - x - - -
- - - - x -
As you can see, the samples in 6x FSAA have a definite slant in one direction. If the angle were very close to this, one would expect that it would start to look a bit worse than 4x FSAA.
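
A rough way to put numbers on this, as a sketch only: take the two grids exactly as drawn above (an assumption, not confirmed hardware positions) and sweep a straight edge with direction (dx, dy) across the pixel. Samples that lie on the same line parallel to the edge flip from uncovered to covered together, so each such group is one coverage step. The function names `crossing_groups` and `largest_step` are hypothetical.

```python
from collections import Counter

# Sample positions (column, row) transcribed from the two grids above.
PATTERN_4X = [(1, 0), (3, 1), (0, 2), (2, 3)]
PATTERN_6X = [(0, 0), (3, 1), (1, 2), (5, 3), (2, 4), (4, 5)]

def crossing_groups(samples, dx, dy):
    """Sizes of the sample groups an edge with direction (dx, dy) crosses, in sweep order.

    dx * y - dy * x is proportional to a sample's signed distance from the
    edge, so samples sharing that value are crossed at the same moment.
    """
    counts = Counter(dx * y - dy * x for x, y in samples)
    return [counts[k] for k in sorted(counts)]

def largest_step(samples, dx, dy):
    """Largest single jump in coverage, as a fraction of all samples."""
    return max(crossing_groups(samples, dx, dy)) / len(samples)

# Edge running 1 column across per 2 rows down: the slant visible in the 6x grid.
print(crossing_groups(PATTERN_6X, 1, 2), largest_step(PATTERN_6X, 1, 2))  # [1, 1, 1, 3] 0.5
print(crossing_groups(PATTERN_4X, 1, 2), largest_step(PATTERN_4X, 1, 2))  # [1, 1, 1, 1] 0.25

# Horizontal edge: every sample is crossed on its own in both patterns.
print(crossing_groups(PATTERN_6X, 1, 0))  # [1, 1, 1, 1, 1, 1]
print(crossing_groups(PATTERN_4X, 1, 0))  # [1, 1, 1, 1]
```

With these transcribed positions, three of the six samples line up along the (1, 2) direction, so at that angle the 6x grid gives four steps with one 50% jump, while the 4x grid gives four even 25% steps. The 4x grid has its own aligned direction: `crossing_groups(PATTERN_4X, 2, 1)` returns `[2, 2]`.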

In my own experience with the Radeon 9700, I thought I saw a similar situation in Morrowind. Upon closer examination, I discovered that the problem I thought was edge aliasing was instead texture aliasing. So, I personally haven't noticed any deficiency of the 9700's 6x FSAA. But I will say that neither did I notice much improvement over the 4x setting.
 
Chalnoth said:
And OpenGL Guy, you're still not saying why it's wrong. The picture I posted was taken off of a Radeon 9700. The fact that the samples chosen were not all from the same pixel doesn't matter, not for this argument (it will have an effect on the edge AA, I'm sure, but it still shows what I meant it to...).
It's wrong because you are wrong. For one, if I wanted to, I could program all of the samples into a straight line. Pretty useless for AA, but complete programmability is there, as has been mentioned elsewhere.
As for the fact that one cannot do 4x sparse sampling FSAA on a 6x6 grid, one can just use a 4x4 grid instead.
Sure you could do that, but we don't. Get it yet? You're making assumptions based on what you see, which is like trying to reason from effect to cause, hence a fallacy. Since I have told you that you are wrong, why don't you just drop it? You're trying to prove me wrong which is not going to happen as long as I'm sitting here with the hardware specs in front of me.
 
Chalnoth said:
So, I personally haven't noticed any deficiency of the 9700's 6x FSAA. But I will say that neither did I notice much improvement over the 4x setting.

It's very hard to notice a difference, IMO, so I generally just stick with 4x (especially since the benchmarks on a 9800 Pro tend to indicate that the memory controller was optimized for this setting).
 
SA said:
BTW, I agree that FAA is one of the best AA algorithms currently available, without its current shortcomings of course.

What is missing is sparse grid sampling, intersections using z slopes, and a fixed number of levels per pixel with fragment merging, which would put it directly in the realm of Z3. By allocating a separate buffer for the AAed pixels the way FAA does, you could substantially reduce the storage requirements needed by Z3 while increasing the maximum number of levels per pixel for better AA. By using fragment merging the way Z3 does, you correctly handle order independent transparency, and put a cap on the memory requirements for worst case scenarios. Of course, using z slopes provides high quality AA at implicit intersections.

You know some of us had hopes for this in the NV30 this time last year :(
 
RussSchultz said:
OpenGL guy said:
For one, if I wanted to, I could program all of the samples into a straight line.

Could you program them different for each pixel?
Almost sounds like you're thinking of an adaptive algorithm that would change the sample pattern based on the angle of the edge as it appears on the screen.
 
Actually, I was aiming at seeding it with a deterministic pseudo-random sequence so that the sampling pattern is the same frame to frame, but different pixel to pixel. That should help break up patterns that form on angles that approach the critical ones.
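
Purely as a software illustration of the idea (not something any current hardware is claimed to do), a Python sketch might look like this. The function name `pixel_pattern` and the mixing constants are arbitrary assumptions. The seed is derived only from the pixel coordinates, so a pixel gets the same pattern every frame while different pixels generally get different ones.

```python
import random

def pixel_pattern(x, y, n=6):
    """Return n sample positions (column, row) for the pixel at (x, y).

    The seed depends only on the pixel coordinates, so the pattern is
    identical every frame but varies from pixel to pixel.
    """
    # Arbitrary mixing constants (an assumption); any coordinate hash would do.
    seed = (x * 73856093) ^ (y * 19349663)
    rng = random.Random(seed)
    cols = list(range(n))
    rng.shuffle(cols)
    # One sample per row and per column keeps the pattern sparse.
    return [(c, r) for r, c in enumerate(cols)]

print(pixel_pattern(10, 20) == pixel_pattern(10, 20))  # True: stable frame to frame
```

This sketch makes no attempt to keep samples in neighboring pixels evenly spaced.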
 
RussSchultz said:
Actually, I was aiming at seeding it with a deterministic pseudo-random sequence so that the sampling pattern is the same frame to frame, but different pixel to pixel. That should help break up patterns that form on angles that approach the critical ones.
Yeah that would be interesting, but we can't do that yet. Also, I'm not sure how useful it would be because neighboring pixels might have some weird interactions (samples may not be evenly spaced).
 
OpenGL guy said:
Yeah that would be interesting, but we can't do that yet. Also, I'm not sure how useful it would be because neighboring pixels might have some weird interactions (samples may not be evenly spaced).
Wouldn't this also involve a heck of a lot of state change, potentially having drastic effects on performance?
 
Ostsol said:
OpenGL guy said:
Yeah that would be interesting, but we can't do that yet. Also, I'm not sure how useful it would be because neighboring pixels might have some weird interactions (samples may not be evenly spaced).
Wouldn't this also involve a heck of a lot of state change, potentially having drastic effects on performance?
You can't change states per pixel... It would have to be done in the hardware.
 
OpenGL guy said:
Yeah that would be interesting, but we can't do that yet.
Feh. What good is it then? ;)
Also, I'm not sure how useful it would be because neighboring pixels might have some weird interactions (samples may not be evenly spaced).
True enough, though if the 'noise' were spread out, it should be mostly unnoticeable. Or, choose the pattern choices so that it works out to where the sample patterns don't get concentrated.

Now, what would be mighty cool to demonstrate the goodness of your pattern vs. that of the evil heinous competitor, is to have an app/slider that lets you change the sample pattern in real time.

Of course, I'm a sucker for gadgets.
 
RussSchultz said:
Now, what would be mighty cool to demonstrate the goodness of your pattern vs. that of the evil heinous competitor, is to have an app/slider that lets you change the sample pattern in real time.
The only trouble is that you can only make things worse... For the number of samples we support (2, 4, or 6), we are using the optimal placement. (Of course, optimal does not mean "best in all cases" it means "good for most cases"). We've considered exposing the sampling positions to the public, but the idea never caught on because there's no benefit in making things look worse.
Of course, I'm a sucker for gadgets.
Me too, that's why I work on drivers :D My gadgets have 110 million transistors and I can save on R&D costs by letting ATi design and build the chips ;)
 
Nice to see that someone agrees with me about FAA :). What I don't believe is that Matrox has what it takes in terms of money and manpower to bring the technology to the next level. They were the first to do it on consumer cards, but others may/will pick it up from here. A FAA-like algorithm would fit nicely into NV's plan of free antialiasing, wouldn't you say ;)
 
OpenGL guy said:
It's wrong because you are wrong. For one, if I wanted to, I could program all of the samples into a straight line. Pretty useless for AA, but complete programmability is there, as has been mentioned elsewhere.
The only question here is to what subpixel accuracy FSAA sample positions can be chosen. The exposed freedom in subpixel accuracy is no greater than the number of samples chosen. Is the actual freedom in subpixel accuracy much higher than this?

This is what led me to believe that the R3xx architecture's triangle setup unit always calculates an ordered grid, of which specific samples are taken depending on the situation. Of course, the triangle setup engine only needs to produce the added samples for z depth values to test for FSAA coverage, which may be a reason to not bother with only outputting an ordered grid.
 
OpenGL guy said:
Me too, that's why I work on drivers :D My gadgets have 110 million transistors and I can save on R&D costs by letting ATi design and build the chips ;)

I keep trying to convince marketing to put an ethernet MAC or 802.11 core on our next chip. They don't seem to think it's worth it just so I can build my own streaming audio system for my house.
 
RussSchultz said:
OpenGL guy said:
Me too, that's why I work on drivers :D My gadgets have 110 million transistors and I can save on R&D costs by letting ATi design and build the chips ;)

I keep trying to convince marketing to put an ethernet MAC or 802.11 core on our next chip. They don't seem to think it's worth it just so I can build my own streaming audio system for my house.
Tell 'em you could make one for me also.
You just doubled your market!

On topic - Sure, there is a limited case where the 6x pattern is less "accurate" (actual coverage is not approximated as accurately) than the 4x, but in most cases it's more accurate, and this will lead to better AA.
I can also easily note the differences between ATI's 6x and 4x.

Also, the only "bad" case comes at the angle which benefits LEAST from AA anyways - the perfect 45 degree (off of any axis) angle looks the best anyways. Jaggies are most irritating when there is a nice horizon, with like 3-4 jaggies racing across the top of it as you move around.
 