nVidia going for 16 sample FSAA?

I think this presentation very clearly shows how nVidia has gone wrong. They are focused on 1) increasing numbers that have tangible value but don't make the scene look more realistic, and 2) doing this with brute force. If you want more realism you have to focus on a more accurate simulation of light. The more detailed the screen is, the easier it is to see how wrong things are. nVidia has lost perspective.
 
Sage said:
I think this presentation very clearly shows how nVidia has gone wrong. They are focused on 1) increasing numbers that have tangible value but don't make the scene look more realistic, and 2) doing this with brute force. If you want more realism you have to focus on a more accurate simulation of light. The more detailed the screen is, the easier it is to see how wrong things are. nVidia has lost perspective.

It seems that way doesn't it?

nVIDIA also have a reputation for being hypocritical.
 
Not that it really matters, but if my memory doesn't betray me, the various crystal balls across the rumour mill predicted 10x sample AA for the NV25.

Heck I'd be happy even with just 8x samples as long as it's at least sparse grid.
 
Sage, just take it at face value ... a summation of reasons why there will be plenty of things to sink fillrate and transistors in for the foreseeable future. How much you are trying to read into it is ridiculous.

OpenGL guy, that is not a problem ... just defer the shading and merge fragments when possible ;)
 
MfA said:
Sage, just take it at face value ... a summation of reasons why there will be plenty of things to sink fillrate and transistors in for the foreseeable future. How much you are trying to read into it is ridiculous.

OpenGL guy, that is not a problem ... just defer the shading and merge fragments when possible ;)

I think Sage is saying that nVIDIA aren't concentrating on IQ and it will become more noticeable in future. At least that's how I've taken her words.

I'm open for corrections.


Ailuros, the NV20 can even do 16xAA through OpenGL. There was a driver released a few months back which allowed it.
Of course the performance with it enabled even under some DX7 games at 640x480 isn't what I would consider playable.
 
The one thing that I did like about that presentation was the emphasis on accurate stereoscopic rendering. If you've played a game using GOOD stereoscopic effects you can't go back and be happy without them. I'm hoping that one of the Cat 3.8 features is page flipping for stereo 3D rendering, because nVidia is actually still way ahead in that game, even though their 45.23 stereo drivers are crap (can't use AA, and surfaces that use programmable shaders simply don't show up).
 
Sage said:
The one thing that I did like about that presentation was the emphasis on accurate stereoscopic rendering. If you've played a game using GOOD stereoscopic effects you can't go back and be happy without them. I'm hoping that one of the Cat 3.8 features is page flipping for stereo 3D rendering, because nVidia is actually still way ahead in that game, even though their 45.23 stereo drivers are crap (can't use AA, and surfaces that use programmable shaders simply don't show up).

Don't count on good stereoscopic support from Ati. You won't get it. It has been said in interviews time and time again.
 
Maybe I will be in left field, or more like falling off the bleachers, but I will give it a shot. Render 4x MSAA into a buffer, then do 4x SSAA on the buffer to give a 16x AA solution. How about 4x MSAA into the frame buffer, then 4x SSAA of the frame buffer, writing to another frame buffer or directly onto the screen when needed (post process)? Wouldn't this be sorta a 16x solution effect? Seems like the memory requirements using a buffer setup won't need to be much more than a 4x AA solution.
 
noko said:
Maybe I will be in left field, or more like falling off the bleachers, but I will give it a shot. Render 4x MSAA into a buffer, then do 4x SSAA on the buffer to give a 16x AA solution. How about 4x MSAA into the frame buffer, then 4x SSAA of the frame buffer, writing to another frame buffer or directly onto the screen when needed (post process)? Wouldn't this be sorta a 16x solution effect? Seems like the memory requirements using a buffer setup won't need to be much more than a 4x AA solution.
That's mixing of supersampling and multisampling, just what NVidia does with all the AA modes above 4x (4xS, 6x, 8x, 8xS, 12x, 16x).
But you can't save memory this way. When doing 1024x768 with 16x AA, you simply render 2048x1536 with 4x MSAA and do the 4-sample downfiltering twice.
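The "downfilter twice" idea above can be sketched in a few lines of (assumed, illustrative) Python: first average the 4 MSAA samples stored per high-res pixel, then 2x2 box-filter the result down to the display resolution. This is a toy on a tiny buffer; real hardware does the resolve in fixed-function logic, not in loops like this.

```python
# Sketch of the "downfilter twice" idea: resolve 4x MSAA, then 2x2 box filter.
# Toy example on a 2x2 grayscale buffer, not how actual resolve hardware works.

def resolve_msaa(samples):
    """Average the 4 MSAA samples stored per pixel (first downfilter)."""
    return [[sum(px) / 4.0 for px in row] for row in samples]

def downsample_2x2(img):
    """2x2 box filter: the 4x supersampling half of the 16x mode (second downfilter)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

# A 2x2 "high-res" buffer, each pixel holding 4 MSAA samples, i.e. 16 samples
# total contributing to one final pixel, matching the 16x mode described above.
buf = [[(0.0, 0.0, 1.0, 1.0), (1.0, 1.0, 1.0, 1.0)],
       [(0.0, 0.0, 0.0, 0.0), (0.5, 0.5, 0.5, 0.5)]]
final = downsample_2x2(resolve_msaa(buf))
print(final)  # one output pixel: the average of all 16 samples
```

At 1024x768 with 16x this would mean resolving a full 2048x1536 4x MSAA buffer, which is exactly why no memory is saved.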
 
I am beginning to feel like I am falling into the parking lot, but how about this: take the 4x MSAA buffer (your example, 2048x1536), downsize into overlapping tiles, let's say 256x192, and do the downfiltering twice for the display buffer. I can't see how the memory requirements would be much more than 4x MSAA. I am just a beginner, by the way. The GPU does the MSAA while the SSAA filter is external and reads the display buffer out to the screen. Voilà, 16x AA with slightly more than 4x MSAA memory requirements.
 
K.I.L.E.R said:
Sage said:
yeah, I know. but I can still hope can't I? :cry:

Not unless the market for it mystically grows exponentially over the next couple of weeks. ;)

Count me in with Sage.
Stereoscopic viewing is nigh on the only reason I would consider an nVidia card. Their support of the feature is appreciated.
Did that count as exponential growth? :)
 
noko said:
I am beginning to feel like I am falling into the parking lot, but how about this: take the 4x MSAA buffer (your example, 2048x1536), downsize into overlapping tiles, let's say 256x192, and do the downfiltering twice for the display buffer. I can't see how the memory requirements would be much more than 4x MSAA. I am just a beginner, by the way. The GPU does the MSAA while the SSAA filter is external and reads the display buffer out to the screen. Voilà, 16x AA with slightly more than 4x MSAA memory requirements.
Downsize into overlapping tiles? What would that be good for?

You can only start downsampling when you have finished rendering a frame (side note: a tile-based deferred renderer can downsample when it has finished rendering a tile, because it knows which primitives contribute to that tile even before starting rendering). While you are rendering you need all subsamples, because any of them could be updated by some following primitive. So you need all 1024x768x16 samples in memory at once.
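The "all subsamples at once" point can be made concrete with some back-of-the-envelope (assumed) arithmetic; the 8 bytes per subsample below (color plus Z/stencil) is illustrative and ignores compression:

```python
# Memory that must be resident WHILE rendering, before any downfiltering
# can happen. Assumes 4 bytes color + 4 bytes Z/stencil per subsample.
WIDTH, HEIGHT = 1024, 768
BYTES_PER_SAMPLE = 4 + 4

def backbuffer_mb(samples_per_pixel):
    """Back buffer footprint in MB for a given subsample count."""
    return WIDTH * HEIGHT * samples_per_pixel * BYTES_PER_SAMPLE / (1024 * 1024)

print(f"4x MSAA : {backbuffer_mb(4):.0f} MB")   # 24 MB
print(f"16x     : {backbuffer_mb(16):.0f} MB")  # 96 MB
```

On an immediate mode renderer the 16x figure is unavoidable; only a tiler, which finishes one tile at a time, can downsample early and hold less.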
 
noko said:
I am beginning to feel like I am falling into the parking lot, but how about this: take the 4x MSAA buffer (your example, 2048x1536), downsize into overlapping tiles, let's say 256x192, and do the downfiltering twice for the display buffer. I can't see how the memory requirements would be much more than 4x MSAA. I am just a beginner, by the way. The GPU does the MSAA while the SSAA filter is external and reads the display buffer out to the screen. Voilà, 16x AA with slightly more than 4x MSAA memory requirements.

Both MSAA and SSAA require that there be a "pixel slot" for every subsample. So ATI's 6x SG-MSAA at any given resolution takes up 6 times more memory than a non-AA screen at the same resolution. So if you blow up the frame buffer by a factor of 4x (SSAA), and then each of these new pixels has 4x MSAA, you will have a total of 16 pixel slots to allocate for each "original pixel" (4 MSAA slots for each of the 4 SSAA slots).

Or are you suggesting rendering only a fourth of the screen at once, then downsampling after this "tile" that has been rendered into a color buffer, then repeating this process for the other tiles? If so, an immediate mode renderer would choke since the primitive list sent to it "jumps" all over the screen. PVR caches the triangle list, sorting the tris into tile bins, but if we commit to TBDR there are many more FSAA options now open to us.
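The triangle-list caching and binning just described might look roughly like this (a toy sketch in assumed Python; the tile size and bounding-box binning are simplifications, not how PVR actually stores its bins):

```python
# Toy tile binning: sort triangles into the screen tiles their 2D bounding
# box touches. A TBDR can then render and downsample one tile at a time,
# never holding all subsamples for the whole frame at once.
TILE = 32  # tile size in pixels (illustrative)

def bin_triangles(tris, width, height):
    """tris: list of ((x0,y0),(x1,y1),(x2,y2)) in screen space."""
    tiles_x = (width + TILE - 1) // TILE
    tiles_y = (height + TILE - 1) // TILE
    bins = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for i, tri in enumerate(tris):
        xs = [v[0] for v in tri]
        ys = [v[1] for v in tri]
        x0, x1 = int(min(xs)) // TILE, int(max(xs)) // TILE
        y0, y1 = int(min(ys)) // TILE, int(max(ys)) // TILE
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                bins[(tx, ty)].append(i)  # triangle i may contribute to this tile
    return bins

bins = bin_triangles([((5, 5), (40, 8), (20, 30))], 64, 64)
print(bins[(0, 0)], bins[(1, 0)])  # triangle 0 lands in both top tiles
```

This is why the primitive list "jumping all over the screen" doesn't hurt a tiler: the jumping happens during binning, and each tile is then shaded with perfect locality.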

Don't feel too bad. I had (have?) this great idea for dramatically reducing the vertex transform requirements for shadow, or "first pass depth only," rendering (which in the future will take up a steadily increasing share of VS time) that has some slight problems with artifacts but nonetheless seems as if it could work with some refinement. However, over the years I've come to realize that there are hundreds of men at each of the firms in the industry who have had years of training in this field and do this all day long. Whatever idea I could ever have is sure to have been either thought out already or (more likely) dismissed outright already. But that's life for you... :cry:
 
akira888 said:
However, over the years I've come to realize that there are hundreds of men at each of the firms in the industry who have had years of training in this field and do this all day long. Whatever idea I could ever have is sure to have been either thought out already or (more likely) dismissed outright already. But that's life for you... :cry:

Don't think like that! Those trained and hired people work under different constraints than you. All great things started out as 'some wacky idea I had'!
 
Ailuros, the NV20 can even do 16xAA through OpenGL. There was a driver released a few months back which allowed it.
Of course the performance with it enabled even under some DX7 games at 640x480 isn't what I would consider playable.

Your last sentence above shows under what light I wrote my former post. The only other mode that was/is essentially fillrate free on NV25 is 2x sparse grid MSAA.

I'll borrow a picture from 3DCenter's article about the 16x OGL mode:

mode9.png


http://www.3dcenter.de/artikel/geforcefx_aa_modi/index6_english.php

A 4xOGSS + 4xOGMS combination; you end up with less than 1/4th of your fillrate, for just a 4x4 ordered grid. I don't think that NV will resort to such a mode in upcoming products in order to claim 16x sample AA, but if they do they deserve to be shot, heh.
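Why "just a 4x4 grid" is the damning part can be shown with a small (assumed) sketch: edge quality depends on the number of distinct sample offsets along each axis, not on the raw sample count, which is what makes sparse grids so much better per sample.

```python
# A 4x4 ordered grid burns 16 samples but yields only 4 distinct offsets
# per axis, so near-horizontal/vertical edges get only 4 extra gradation
# steps. A 16-sample sparse grid could give 16 distinct offsets per axis.
og = [(x / 4 + 1 / 8, y / 4 + 1 / 8) for x in range(4) for y in range(4)]
distinct_x = len({p[0] for p in og})
print(len(og), distinct_x)  # 16 samples, only 4 distinct x offsets
```

So the 16x OG mode pays full 4xOGSS fillrate for edge gradients no better than a well-placed 4-sample sparse pattern on the worst-case edge angles.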
 
aZZa said:
Would a high-end TBDR be the first card with the capability to offer usable real-time gaming 16-sample AA before the R500/NV50 tech arrives, late '04 to early '05?
I bet some Parhelia owners would argue you don't need a tiler for usable 16x AA. Then again, others would say the cards need to be fast enough without AA first.
 
3dcgi said:
aZZa said:
Would a high-end TBDR be the first card with the capability to offer usable real-time gaming 16-sample AA before the R500/NV50 tech arrives, late '04 to early '05?
I bet some Parhelia owners would argue you don't need a tiler for usable 16x AA. Then again, others would say the cards need to be fast enough without AA first.
Also the Parhelia's algorithm broke down in some cases leaving some edges without antialiasing.
 