Investigating GeForce4 4XS AA

MikeC

Newcomer
I posted this information at nV News this evening...


I've finally had a chance to digest the various GeForce4 previews and reviews and am surprised that more emphasis wasn't put on image quality comparisons, especially the new 4XS AA mode that's available under Direct3D. I've spent most of the day investigating the qualities of this mode, which are very impressive. For example, here are two screenshots from Serious Sam 2 running under Direct3D.


http://www.nvnews.net/images/news/200202/ss2_4xaa.shtml - 4X Antialiasing - No anisotropic filtering (237KB)

http://www.nvnews.net/images/news/200202/ss2_4xsaa.shtml - 4XS Antialiasing - No anisotropic filtering (252KB)


My findings thus far when comparing 4X with 4XS antialiasing under Direct3D (at least in Serious Sam 2):


- Texture swimming/shimmering is significantly reduced with 4XS AA. If you have the Serious Sam 2 demo, run it under Direct3D and play this scene from the Sierra de Chiapas level. Take your character and move around with the steps and temple in view. You'll see texture swimming coming at you from many sources - the steps, the temple, the trees, the statues, etc. Once 4XS AA was enabled, the only noticeable texture shimmering was coming from the bottom three rows of stairs.


- When viewing the screenshots, notice the moire pattern that appears on the stairs in the 4X AA screenshot. With 4XS AA, it is entirely eliminated.


- The 4XS AA mode has "anisotropic-like" qualities in that textures are markedly sharper. I have another series of screenshots, which I will include in the preview, that show it's quite different from just using 4X AA with anisotropic filtering enabled.


- The drawback of 4XS AA is that it cuts frame rates by as much as 50%.
 
I haven't got a GF4 Ti yet but here're some words from me.

Before addressing your points, Mike, have you asked why only D3D for the 4XS mode?

- SSam (1st or 2nd encounter) uses a rather peculiar (in my experience, at least) system for dynamic LOD changes. It is quite unlike anything I've seen. Which means to say it really isn't what you usually see. This, however, isn't so much of a "problem" since SSam's engine is impressive in terms of wide open spaces, which leads back to my mentioning of LOD. That said, I'm not sure exactly what you mean by the "texture swimming". It could simply be the 4XS' aniso.

- re moire stuff... can't say until I have the damn board :smile:

- re aniso-like quality in 4XS. Eh? Read the NV material carefully... if you don't get it, ask NV.

- re performance. Duh.
 
I think it is the so-called mystery mode (MS + SS combined). I don't know the exact registry setting, but you can set it with NVMax on every GF3.
 
On 2002-02-09 04:11, Reverend wrote:

- re aniso-like quality in 4XS. Eh? Read the NV material carefully... if you don't get it, ask NV.

I've been over the documentation a few times and it states that Accuview incorporates anisotropic filtering and that 4XS goes beyond other AA modes by providing greater subpixel coverage.

When I did my Quake 3 2X and Quincunx antialiasing image quality comparison, I didn't see any anisotropic filtering effects unless I manually enabled them. Am I missing something here?
 
Unfortunately, that Mystery Mode only works in D3D, too. Rivatuner calls it "2x + 1x2 supersampling".

ta,
.rb

P.S. Hmm. There's also an option called "multisample masking" . . . hmm . . . I start wondering whether AccuView is in any way something new and GF4-unique? .rb

 
There's also an option called "multisample masking" . . . hmm . . . I start wondering whether AccuView is in any way something new and GF4-unique?

Indeed, it looks to me too as if this pattern isn't 'hardwired' in any way but is a driver matter. As soon as your texture sample location is programmable, you could do this new pattern on the GeForce3.

The 'only' new thing of "AccuView" seems to be some bandwidth savings.
 
The 'only' new thing of "AccuView" seems to be some bandwidth savings.

And I'm still trying to work out what they are. My suspicion is they've got rid of the back buffer.
 
I see two possible optimisations over the three steps in their technical brief right now. One would be a 3dfx-style approach: combining and filtering the subsamples directly in the RAMDAC.

Another idea, with more bandwidth savings for Quincunx, would be line caching. If you could cache the samples of 2 full lines on chip, you could do all the combining and Quincunx filtering of the color samples on chip. Less than 30 KB of cache would be sufficient to do this for Quincunx (2 lines à 1600 pixels * 2 subsamples à 32 bits each). Of course, this seems to be problematic for blending effects: it would add additional blurring if you read back the filtered values instead of the real subpixel data. But you could do this just for 2x MS, and do the filtering in the next step. Not sure; it may not be practicable in reality to read back filtered values.
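
To make the "less than 30 KB" figure concrete, here is a quick back-of-the-envelope check. It only multiplies out the numbers given above; the constant and function names are made up for illustration and say nothing about how the hardware is actually organised.

```python
# Rough size of an on-chip cache holding two full scanlines of subsamples.
# All figures come from the post above; the names are only illustrative.
LINES = 2                # scanlines kept on chip
PIXELS_PER_LINE = 1600   # widest display mode assumed in the post
SUBSAMPLES = 2           # 2x multisampling, as used by Quincunx
BYTES_PER_SAMPLE = 4     # 32-bit colour


def line_cache_bytes(lines=LINES, width=PIXELS_PER_LINE,
                     subsamples=SUBSAMPLES, bytes_per_sample=BYTES_PER_SAMPLE):
    """Bytes needed to cache `lines` scanlines of colour subsamples."""
    return lines * width * subsamples * bytes_per_sample


cache_bytes = line_cache_bytes()
cache_kb = cache_bytes / 1024
print(cache_bytes, "bytes =", cache_kb, "KB")
```

That comes to 25,600 bytes, so the estimate leaves some headroom. Note that it counts colour samples only, not Z.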

[ This Message was edited by: ram on 2002-02-10 09:08 ]
 
On 2002-02-09 04:12, MikeC wrote:
Interesting. What registry settings would that be? I could create yet another comparison.

Get the new RivaTuner 10; there you can choose the 4XS FSAA mode, and yes, it works on a GF3.
 
ram, I guess the GF3 uses a line cache too, but maybe not as optimized as in the GF4.

The Quincunx performance penalty on the GF3 was quite small, afaik.
 
Hmm, yes sometimes it was small, but sometimes, Quincunx was even slower than 4xMS. Descent3 for example.


[ This Message was edited by: ram on 2002-02-09 11:10 ]
 
On 2002-02-09 09:49, DaveBaumann wrote:
The 'only' new thing of "AccuView" seems to be some bandwidth savings.

And I'm still trying to work out what they are. My suspicion is they've got rid of the back buffer.

Having read NVIDIA's Accuview documentation once again, it alludes to using the back buffer on page 7. It also verifies what ram was explaining regarding the 4XS mode, in that some form of supersampling is going on.


The Accuview subsystem substantially increases performance by optimizing the way
pixels move through the graphics pipeline to create antialiased images.

1. Subpixels are rendered in parallel (thanks to multisampling technology) to a
back buffer. This back buffer is a factor that is larger than the final display
resolution.

2. The image is filtered and written out to a front frame buffer.

3. The frame buffer is sent to the display.
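
For anyone who wants to picture steps 1 and 2, here is a minimal sketch of resolving an oversized back buffer down to display resolution. It assumes a plain box filter over greyscale values, which is a stand-in and not NVIDIA's actual filter; `resolve`, `fx` and `fy` are hypothetical names.

```python
def resolve(back_buffer, fx, fy):
    """Box-filter an oversized back buffer down to display resolution.

    back_buffer is a list of rows of grey values; fx and fy are the
    horizontal and vertical size factors of the back buffer.
    """
    out_h = len(back_buffer) // fy
    out_w = len(back_buffer[0]) // fx
    front = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0
            # Sum the fx * fy subsamples that belong to this display pixel.
            for sy in range(fy):
                for sx in range(fx):
                    total += back_buffer[y * fy + sy][x * fx + sx]
            row.append(total / (fx * fy))
        front.append(row)
    return front


# A 4x2 back buffer resolved to a 2x1 front buffer (factor 2 on each axis).
back = [[0, 255, 255, 255],
        [0, 0,   255, 255]]
front = resolve(back, 2, 2)
print(front)
```

The left display pixel covers an edge, so it ends up as a blend of its four subsamples, while the fully covered right pixel keeps its original value.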
 
As a former V5 owner, I am now on my 3rd nVidia card, a GF3 TI500. I have never been able to use FSAA on a GF, because, well, it sucks! I found that I was only able to use the card at high rez (1280x1024) with full AF. After downloading & installing v10 RivaTuner, I spied the 4xS setting, & thought I'd check it out. After a bit of experimenting, I settled on these settings: 4xS FSAA, AF 4, LOD -1. Now I can truly say I'm impressed. This I can live with. EXCEPT.....I'm really killing the bandwidth! Good thing I'm old & suck at FPS's! While the image is a little bit softer than without FSAA, it's very slight. Nothing like the standard MS on the GF3. I have a pretty powerful system:
Tbird 1.4 @1510 (10x151)
GF3 TI500 @ 265/586
Shuttle AK31v3.1
512 meg Crucial PC2100(2x256)
here's some 3DMark numbers:
2000: 11,927/3953 w/FSAA
2001: 8462/2909 w/FSAA
YIKES! Maybe I need that TI4600! How bout a GF5? KUDO's (ACK! I'm choking! It's really hard to say this!) to nVidia for finally giving us some decent FSAA an old 3DFX'er can live with. Hate to be a complainer, but how bout some SPEED with that FSAA? Now.... just where is that 2000DDR ram at?.......
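
For scale, the drop in those 3DMark scores can be worked out directly. This is only arithmetic on the four numbers quoted above; `percent_drop` is a throwaway helper name, and a 3DMark score is not the same thing as a frame rate.

```python
def percent_drop(without_aa, with_aa):
    """Percentage by which the score falls once FSAA is enabled."""
    return (without_aa - with_aa) / without_aa * 100


drop_2000 = percent_drop(11927, 3953)   # 3DMark2000
drop_2001 = percent_drop(8462, 2909)    # 3DMark2001
print(f"3DMark2000: {drop_2000:.1f}% lower with FSAA")
print(f"3DMark2001: {drop_2001:.1f}% lower with FSAA")
```

Both scores fall by roughly two thirds with these settings (4xS plus AF 4 and LOD -1).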
 
Umh..Accuview...what does 'Accu' stand for?
Accumulation? ;)
What if they finally used information collected by their visibility subsystem to implement some kind of accumulation buffer through bit-masks? I know..this is a crazy thought :p
But we can all see how much performance improvement the GF4 has over the GF3 with AA modes..so I don't believe they obtained such better results with just some tweaking here and there, but maybe I'm wrong :rollseyes:
 
Mike, those 3 steps are describing what is normally done for AA - the next sentence states that Accuview does some of these steps in parallel, indicating that it is doing something different from the three steps.
 
Well then, someone with a GF4 Ti would be welcome to check out this old article of mine and compare things (if you have the games featured in the article). I'd mentioned this article of mine before in the B3D temp forum.

Here's how it is for a GF3 re my article after the RivaTuner guy got interested in my article:

1) The third lookup table is used for NV17, not NV25. So different FSAA lookups are used for NV10/11/15 (1.41x1.41 SSAA and 2x2 SSAA), NV17 (two unknown MSAA modes and 2x2 SSAA) and NV20/25

2) Both modes 5 and 6 on NV20/25 are hybrids of MSAA and SSAA. This is a fact I can state with 100% reliability. It is proved by the driver's code (the MSAA'ed frame is vertically supersampled with a 1x2 ratio). Such supersampling also causes the LOD bias to change a bit.

3) Mode 5 is probably 2x MSAA + 2x SSAA.

4) Mode 6 is probably 4x MSAA + 9-tap + 2x SSAA. This mode is also handled in different ways on Detonator 21.xx and 23.xx. The terrible flashing and color distortion in 23.xx lead me to assume that you guys use 4x MSAA and dynamically change the mask of active samples. I've tested it on a D3D sample application and got the same visual effect on 4x MSAA when masking one sample.

5) It looks like 4xMSAA+9-tap is a bit corrupted on 23.xx

All the Mode #s mentioned refer to the D3D AA registry settings as per my article.
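
A small sketch of why an MSAA + 1x2 SSAA hybrid would look sharper on textures than plain 4x MSAA. It assumes the usual behaviour that multisampling textures once per rendered pixel while supersampling multiplies the rendered pixels; the function names are hypothetical, and how the hardware actually folds the per-axis shift into its LOD selection is not something the driver notes above tell us.

```python
import math


def hybrid_counts(msaa_samples, ss_x, ss_y):
    """Return (coverage samples, texture samples) per displayed pixel."""
    ss_factor = ss_x * ss_y
    # MSAA textures once per rendered pixel, however many coverage samples
    # it stores; supersampling multiplies the number of rendered pixels.
    return msaa_samples * ss_factor, ss_factor


def lod_shift(ss_factor_along_axis):
    """Mip LOD offset along an axis rendered at N times display resolution."""
    return -math.log2(ss_factor_along_axis)


plain_4x = hybrid_counts(4, 1, 1)   # ordinary 4x MSAA
mode_5 = hybrid_counts(2, 1, 2)     # 2x MSAA with 1x2 vertical supersampling
vertical_shift = lod_shift(2)       # doubled vertical resolution
print(plain_4x, mode_5, vertical_shift)
```

Under these assumptions both modes give four coverage samples per displayed pixel, but the hybrid takes two texture samples instead of one, and the doubled vertical resolution pushes texture LOD toward a sharper mip level along that axis. That would fit the "aniso-like" look and the LOD bias change mentioned in point 2.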



_________________
Reverend
Beyond3D
3DPulpit

[ This Message was edited by: Reverend on 2002-02-09 17:55 ]
 