Why is NVDA still using impractical 8x AA setting w/ 6800U?

Why is NVDA still using an impractical 8x AA setting with the 6800 Ultra? Does anyone have any insight into why they chose to use an MS+SS mix for their 8xAA, which causes a huge performance drop and essentially renders this mode useless? Does this 8xAA method even save transistors or design complexity?

I am of the opinion that up to 4xAA and 16xAF is a good area of focus, but NVDA has essentially rendered its 8xAA mode useless because of the massive performance drop. In addition, some reviews test the 8xAA mode, and this makes the 6800 Ultra look exceptionally poor compared to its 4xAA performance.

How easy or difficult would this be to fix/remedy/improve?
 
Funny, I was thinking the same thing after I read a few reviews. Very, very good question. I don't have the answer to that. Broken drivers? Something that slipped through the cracks?

Maybe the best answer is that it's a known issue they put on the back burner. Remember, NV40 is really still in final testing; it is not a retail product yet. NV has been known to intro cards and then wait literally months before they are actually on the shelves.
 
From beyond3d's preview:
Pixel Engines (ROP)
Following on from the Texture and Shader engines are the ROPs (Render Outputs):

The ROPs are responsible for such basic operations as Z checking (to decide whether the pixel should actually be written, if it hadn't been rejected from an earlier compare) and either writing or blending pixels to the frame buffer. The NV40 pipeline has both a Z ROP, which does the Z writing, and a C ROP. The C ROP is a combined Z and Colour ROP. The use of the C ROP is what achieves NV2A's, NV3x's, and now NV4x's optimised Z / Stencil rendering path, such that during non-colour rendering situations the C ROP can be utilised to write a second Z/Stencil value per clock cycle, but it will be used for colour writes when values need to be written to the frame buffer.
This also signifies that NV4x is only capable of 2 FSAA Multi-Sample samples per clock cycle, and indeed David Kirk confirmed this to be the case - as it has, in fact, been since NV20. To achieve 4X Multi-Sampling FSAA a second loop must be completed through the ROP over two cycles - memory bandwidth makes it prohibitive to output more samples in a cycle anyway. Unlike ATI, only one loop is therefore allowed through the ROP which continues to restrict NV4x to a native Multi-Sample AA of 4X - modes greater than 4X still require combined Super-Sampling and Multi-Sampling.
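To make the loop arithmetic in the quote concrete, here is a rough back-of-the-envelope sketch in Python; the 2-samples-per-clock figure comes from the preview above, and everything else is purely illustrative:

# Rough sketch of the ROP loop arithmetic described in the preview: the colour
# ROP emits 2 multisample samples per clock, so higher sample counts need
# extra loops (cycles) through the ROP for every pixel.

SAMPLES_PER_CLOCK = 2  # per the Beyond3D quote

def rop_cycles(msaa_samples, samples_per_clock=SAMPLES_PER_CLOCK):
    """Clock cycles one ROP needs to emit all multisample samples for a pixel."""
    return -(-msaa_samples // samples_per_clock)  # ceiling division

for samples in (2, 4, 8):
    print(f"{samples}x MSAA -> {rop_cycles(samples)} ROP cycle(s) per pixel")

Since the hardware only allows the one extra loop, the 4-cycle case never happens natively, which is why anything above 4x falls back to mixing in supersampling.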
 
I think the question was why drivers expose 8xS instead of 8xM when 8xM (as well as two 16x FSAA modes) are significantly faster?
 
Geeforcer said:
I think the question was why drivers expose 8xS instead of 8xM when 8xM (as well as two 16x FSAA modes) are significantly faster?

Can you elaborate? What is this "8xM" mode? Are you just talking about the other 8x mode that used to be in the control panel? And AFAIK there is only one 16x mode, and it is OGL only. There was another D3D mode called 16x which was really 12x. It performed similarly to 8xS, but didn't look quite as good.

The obvious reason to me that NVIDIA is exposing a hybrid mode is that it needs to in order to expose an FSAA setting greater than 4x. NVIDIA doesn't want to look weaker than ATI by not offering something higher than 4x, especially when R420 has been rumored to support 8x. At least, that was in the rumor bin a while ago; I don't know if that's still in the cards or not.
 
Well, as Dave has explained, technically the GeForce 6 series is unable to do more than 4x multisample FSAA without mixing in some form of supersampling, like the 4xS modes in the previous generations.

Personally, I don't think that missing modes above 4x FSAA are a problem. On my Radeon 9700, 6x FSAA is mostly unusable because of the performance hit, and I don't perceive a great image quality difference. That is probably the reason why nVidia decided to stop at 4x FSAA.

I think the 8x FSAA mode in the GeForce 6 is not meant for us gamers, but for professionals. That might be the indication, since they have chosen a 4x supersampled, 2x multisampled configuration for 8x FSAA. It probably results in much clearer textures and images, which could be very useful for 3D professionals.
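As an aside, a quick sketch of what such a mix costs on paper; the exact MS/SS split for NV40's 8x mode is described differently in different posts here, so the numbers below are illustrative rather than a statement of the actual hardware configuration:

# Illustrative cost model for a hybrid MS+SS mode (not the confirmed NV40 split).

def hybrid_cost(ms_samples, ss_factor):
    """Return (samples per pixel on edges, shading/fill multiplier) for an MS+SS mix."""
    samples_per_pixel = ms_samples * ss_factor
    fill_multiplier = ss_factor  # every supersample is fully shaded and textured
    return samples_per_pixel, fill_multiplier

for ms, ss in ((4, 1), (4, 2), (2, 4)):
    spp, fill = hybrid_cost(ms, ss)
    print(f"{ms}x MS x {ss}x SS: {spp} samples/pixel, {fill}x fill cost")

Either 8x split gives the same edge sample count, but the fill cost scales with the supersampling factor, which lines up with the large drops discussed later in the thread.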
 
The hybrid or pure SSAA modes can still have their uses for older games and for resolution- or CPU-limited games.

If any kind of supersampling is "useless", then I frankly wonder why so many Radeon owners were pleading for SSAA support in the R3xx line.

Want an example? Assume I have an LCD monitor with a 1024 native resolution in front of me and want to play, let's say, 4x4Evo.
 
A small point worth making here is that as more effects move to the fragment level, FSAA will be required to remove aliasing; a good example to look at for this is parallax bump mapping. The hybrid modes are good as they reduce the overall fill impact while providing high levels of AA on edges with some level of fragment AA. Ultimately though, I wouldn't be surprised to see a shift back to FSAA only, and that MSAA may become useful for legacy applications only.

John.
 
JohnH said:
Ultimately though, I wouldn't be surprised to see a shift back to FSAA only, and that MSAA may become useful for legacy applications only.

John.

That seems odd, considering that legacy applications would probably fare better performance-wise than newer applications when using SSAA, as newer games tend to be more stressful. Also, if MSAA would be used only for legacy applications, why spend the extra transistors implementing it? It would make more sense to me to just drop support entirely if you were going in that direction.
 
What I find odd is the HUGE drop in performance. I know it does SS, but still...70-90% drop? I was like "wow" when I read Dave's review.
 
StealthHawk, the key point is that the emphasis may move back to FSAA in the not so distant future. MSAA may persist as a HW feature due to it still providing benefit in some application categories, and also to some extent due to the understanding of the issues by reviewers (edit: or rather the lack thereof!).

John.
 
Kombatant said:
What I find odd is the HUGE drop in performance. I know it does SS, but still...70-90% drop? I was like "wow" when I read Dave's review.
It's probably bandwidth; MSAA typically allows you to losslessly compress the FB by up to 4:1, which can't (easily) be done when applying SS.

John.
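For anyone wondering why the compression works for MSAA but not SS: with multisampling, every sample of a pixel that is fully covered by a single triangle gets the same shaded colour, whereas supersamples are shaded independently and generally all differ. A toy sketch (real hardware schemes are more involved than this):

# Toy illustration: fraction of the framebuffer you'd have to store if pixels
# whose samples are all identical were collapsed to a single value + flag.

def stored_fraction(pixels):
    """pixels: list of per-pixel sample lists."""
    stored = total = 0
    for samples in pixels:
        total += len(samples)
        stored += 1 if len(set(samples)) == 1 else len(samples)
    return stored / total

# 90% interior pixels (all 4 samples identical under MSAA), 10% edge pixels.
msaa_frame = [["red"] * 4] * 90 + [["red", "red", "blue", "blue"]] * 10
# Under SS every sample is shaded separately and is generally unique.
ssaa_frame = [[f"shade_{i}_{s}" for s in range(4)] for i in range(100)]

print("MSAA-like frame:", stored_fraction(msaa_frame))  # ~0.33 -> roughly 3:1
print("SSAA-like frame:", stored_fraction(ssaa_frame))  # 1.0  -> no gain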
 
JohnH said:
StealthHawk, the key point is that the emphasis may move back to FSAA in the not so distant future. MSAA may persist as a HW feature due to it still providing benefit in some application categories, and also to some extent due to the understanding of the issues by reviewers (edit: or rather the lack thereof!).

John.

How about leaving anti-aliasing entirely to ISVs and letting the developer decide whether MSAA, SSAA, or a combination of the two in the same scene would make more sense?

I don't think forcing AA through the driver control panel is the ideal solution, nor should developers treat anti-aliasing nowadays as a luxury item.
 
Yep, it should be up to the ISV to choose between SS and MS AA. I can also tell you the ISVs would rather the quality of the various MSAA techniques was standardised, as at the moment they can't predictably set an AA'd output quality...

John.
 
Standardisation sounds "dangerous" to me; I hope that if such a thing were ever a consideration, we wouldn't end up with ordered grid patterns, for example, just because they might be easier...
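For what it's worth, the usual argument against ordered grids is easy to show with a few lines of code: on near-horizontal or near-vertical edges, quality is set by how many distinct sample offsets the pattern has along each axis, and an ordered grid wastes them. The sample positions below are textbook examples, not any specific card's pattern:

# Count distinct per-axis offsets for a 4-sample ordered grid vs a rotated grid.

def distinct_offsets(pattern):
    xs = {x for x, _ in pattern}
    ys = {y for _, y in pattern}
    return len(xs), len(ys)

ordered_grid = [(-0.25, -0.25), (0.25, -0.25), (-0.25, 0.25), (0.25, 0.25)]
rotated_grid = [(-0.375, 0.125), (0.125, 0.375), (0.375, -0.125), (-0.125, -0.375)]

print("ordered grid:", distinct_offsets(ordered_grid))  # (2, 2) -> 2 gradient steps per axis
print("rotated grid:", distinct_offsets(rotated_grid))  # (4, 4) -> 4 gradient steps per axis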
 
Probably going to be a game of leapfrog through the rest of the year for those playing the waiting game. There's the excellent NV40 Ultra, which will then be jumped slightly in performance by the X800 XT a month or so later. That in turn will be surpassed by a speed-bumped (0.11µ?) NV45, updated to allow a couple more loops through the pixel engine ROP and up to 8x MSAA, only to again be trumped by an early R500 showing before the year is up.
 
JohnH said:
Kombatant said:
What I find odd is the HUGE drop in performance. I know it does SS, but still...70-90% drop? I was like "wow" when I read Dave's review.
It's probably bandwidth; MSAA typically allows you to losslessly compress the FB by up to 4:1, which can't (easily) be done when applying SS.

You're right and wrong - it is true that with SS your samples are usually different and cannot be compressed efficiently.
However, the main performance problem is fill rate - with SS you actually have to render 2, 4, or n times as many pixels, complete with texture reads and shader calculations.

I'd also say that brute force SSAA is not the answer to shader aliasing. It's better to implement some sort of AA in the pixel shader, as it can then be adaptive, which SS isn't.
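A rough sketch of what "AA in the pixel shader" can look like, reduced to a 1D toy in Python: a hard-edged procedural stripe pattern is faded towards its average based on the pixel footprint (the analogue of a shader's screen-space derivative), so the filtering only kicks in where the pattern would actually alias; that adaptivity is exactly what uniform supersampling lacks. This is a generic band-limiting trick, not anything NV40-specific:

import math

def stripes(x):
    """Hard-edged procedural stripe pattern: 1.0 or 0.0."""
    return 1.0 if math.sin(x) > 0.0 else 0.0

def stripes_aa(x, footprint):
    """Fade towards the pattern's average (0.5) as the pixel footprint grows
    past one stripe period, instead of letting it alias."""
    fade = max(0.0, 1.0 - footprint)  # footprint measured in stripe periods
    return 0.5 + (stripes(x) - 0.5) * fade

# Footprint grows as the pattern gets compressed on screen (e.g. towards the horizon).
for x, footprint in ((0.5, 0.05), (0.5, 0.5), (0.5, 2.0)):
    print(f"footprint {footprint}: raw {stripes(x)}, filtered {stripes_aa(x, footprint):.2f}")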
 
Ailuros said:
How about leaving anti-aliasing entirely to ISVs and let the developer decide wherever MSAA, SSAA or a combination of the two in the same scene would make more sense?
I think it'd be even better if the developer was given the option to switch between MSAA and SSAA (with the same number of samples, of course) on a per-triangle basis. This way SSAA could be used for those shaders that really need antialiasing, without dramatically reducing performance.

Even better would be to specify a minimum desired amount of SSAA, so that applications could use mixed-mode FSAA if available. Now, this would require completely programmable FSAA sample patterns to implement properly.
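Purely as a thought experiment, here is what exposing that to applications might look like; this is a hypothetical interface invented for illustration, not any real or proposed API:

from dataclasses import dataclass

@dataclass
class Sample:
    dx: float      # sub-pixel offset in pixel units
    dy: float
    shaded: bool   # True = supersample (own shader result), False = coverage-only

# A 4-sample rotated-grid pattern where two samples are fully shaded,
# i.e. a mixed mode that satisfies a "minimum 2x supersampling" request.
pattern = [
    Sample(-0.375,  0.125, True),
    Sample( 0.125,  0.375, False),
    Sample( 0.375, -0.125, True),
    Sample(-0.125, -0.375, False),
]

def meets_minimum_ss(pattern, minimum_ss):
    return sum(s.shaded for s in pattern) >= minimum_ss

print(meets_minimum_ss(pattern, 2))  # True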
 