New selective supersampling demo

BTW, I forget and am too lazy to check, but XB360's Xenos does 2xMSAA "for free", right?
Actually, any present GPU that is capable of outputting 2 multisamples per clock could do "free" 2xMS if there is enough bandwidth (once again), and Xenos has a lot of it, dedicated just to framebuffer ops like this, while supersampling would stress the texturing capacity and overflow the small eDRAM buffer even more than the HD screen mode already does.
 
Vysez said:
Other than the DB part, Humus, why does this demo require SM3.0?

Is it mandatory due to the technology you use or is it simply because you coded your app with SM3.0 in mind?
He uses gradient instructions, which I think are only available in SM3.0. Can't remember if they're in any of the SM2.x profiles. But that's likely why.
 
Vysez said:
Other than the DB part, Humus, why does this demo require SM3.0?

Is it mandatory due to the technology you use or is it simply because you coded your app with SM3.0 in mind?
Already cursing the X800? (Didn't take long! :p)
 
Dave Baumann said:
Already cursing the X800? (Didn't take long! :p)
I already miss the R9250, to be honest. :cry:

Poor R9250, she got smashed into pieces by some maniac...

:devilish:
 
fellix said:
Actually, any present GPU that is capable of outputting 2 multisamples per clock could do "free" 2xMS if there is enough bandwidth (once again), and Xenos has a lot of it, dedicated just to framebuffer ops like this, while supersampling would stress the texturing capacity and overflow the small eDRAM buffer even more than the HD screen mode already does.

Granted, a console and a PC graphics unit (especially Xenos) are two entirely different beasts, but I don't seem to be able to measure any higher memory consumption so far between 2xMS and 2xSSAA on the PC.

I'm obviously missing something or I haven't tested the correct scenarios with the appropriate tools. Any help maybe?
 
Reverend said:
Just so I know what Tim Sweeney as a developer thinks about doing something like this (SS directly in shaders, in/on specific surfaces within a scene) [snipped]

allow me.
tim sweeney 'as a developer' thinks much more about his release dates and schedules than about technology. he's much more inclined to discard something that could affect his development schedule negatively (as using selective supersampling in his shaders would) than to adopt it into production. as has been known for ages, brute-force supersampling is a pure waste of fragment power. as things stand today, mipmapping and aniso should address all your texture-sampling needs, MS should address all your edge sampling needs, 'dumb' selective SS should address all your alpha-testing needs and 'smart' selective SS (a la humus' demo) all your high-frequency shader needs. sure, some of those don't come automagically, but they save tons of bandwidth and ALU resources as a result. whether that is of high importance to devs/projects X, Y and Z is another matter.

just remember one thing: realtime graphics have always been about smoke and mirrors (and about the beauty of the female assistant, but that's outside the scope of this post) - whoever is more skilled with using those gets the spectators' ovations.
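
Just to put rough numbers on darkblu's "pure waste of fragment power" point, here is a toy count of shader evaluations for brute-force 4x supersampling versus a selective scheme that only supersamples where needed; the resolution and the 10% aliasing fraction are arbitrary assumptions, not measurements.

```python
# Toy cost comparison; 1280x1024 and the 10% "aliasing" share are arbitrary assumptions.
pixels = 1280 * 1024
aliasing_fraction = 0.10   # assumed share of pixels with a high-frequency shader term

brute_force = pixels * 4                              # every pixel shaded four times
selective = pixels + pixels * aliasing_fraction * 3   # one sample everywhere, 3 extra where flagged

print(brute_force / pixels)   # 4.0 shader evaluations per pixel
print(selective / pixels)     # ~1.3 shader evaluations per pixel
```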
 
*Quietly gets up and breaks the silence with a clap.* Well put, darkblu.

Realistically, though, how feasible are selective algorithms such as the ones mentioned above? How many "man hours" do they take to code and debug? I ask because it seems both Carmack and Sweeney are wary of spending their time on what seems to me an important IQ issue. It only took Humus a little while to come up with his selective solution based on branching via the gradient parameter. Perhaps Carmack and Sweeney are wary of spending time on techniques that only serve to improve the image quality to performance ratio.
 
That's tricky. Separating out the sampling frequency of some parts of a shader program from other parts would be difficult to specify. I'm skeptical that this will happen because it's a very complex solution, and will be obsoleted over the next 6 years or so due to plain old Moore's Law improvements. Video resolutions are going up much slower than GPU processing power, so supersampling will be practical in a few years.

Great, I'll wait just a couple of years then.
 
So, extrapolating from Tim's comments, it's tough to justify the software development cost of making things more efficient for the GPU, since the GPU will have ALU power to burn in the coming years (because increases in resolution will not scale linearly with increases in ALU/fragment capabilities, yada yada). Thus the GPU will have enough power to inefficiently apply supersampling to all pixels, even though the performance wasted by this could be applied somewhere else.

Maybe this line of thought will come back to bite ISVs that view things this way: they might want to concentrate their development efforts on furthering some other aspect of the 3D pipeline, yet such efforts will be less feasible if good IQ is to be sustained with reasonable performance. I'm beginning to wonder where the onus lies, when all is said and done, regarding good IQ to performance trade-offs. To what extent should the IHV pick up the slack and devise more "intelligent" hardware, as opposed to leaving it to the ISV, or vice versa?
 
Reverend said:
Just so I know what Tim Sweeney as a developer thinks about doing something like this (SS directly in shaders, in/on specific surfaces within a scene) :
Tim Sweeney said:
That's tricky. Separating out the sampling frequency of some parts of a shader program from other parts would be difficult to specify. I'm skeptical that this will happen because it's a very complex solution, and will be obsoleted over the next 6 years or so due to plain old Moore's Law improvements. Video resolutions are going up much slower than GPU processing power, so supersampling will be practical in a few years.
That's a pretty stupid thing to say.

If you don't make any software advancements whatsoever by the time a GPU quadruples in performance, then you've got a pretty old engine. As Humus, darkblu, and others have said, 4x supersampling will always quarter your pixel shader speed, fillrate, and texture sampling rate. A dev should be using those resources to make better graphics instead.

Put another way, you're basically paying for a 7800GTX 512 to get 4x supersampling at the same speed a 6800GS does 4x multisampling. Worthless except for outdated graphics engines.
 
Vysez said:
Other than the DB part, Humus, why does this demo require SM3.0?

Is it mandatory due to the technology you use or is it simply because you coded your app with SM3.0 in mind?

It's the gradients. Now that it was mentioned, the NV3x do support gradients in 2.x, so they could be supported too. And of course, it's always possible to let any DX9 card run at least the non-supersampled version of the shader. Actually, now that I think of it, it may be possible to implement this on ps2.0 too with a lookup texture that stores a sample radius in the mipmaps.
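
To make that ps2.0 fallback idea a bit more concrete, here is a rough CPU emulation of what such a lookup texture could do: each mip level stores a sample radius, so an ordinary tex2D fetch, with the hardware's own mip selection, hands back a footprint-dependent radius without any gradient instructions. The LOD formula and the per-level radii below are made up for illustration; this is one reading of the suggestion, not an actual implementation.

```python
# CPU emulation of the idea: a lookup "texture" whose mip levels store a sample radius,
# so the mip level the hardware would pick for a pixel's footprint directly yields the
# radius, with no gradient instructions in the shader. All numbers here are made up.
import math

TEX_SIZE = 256
# One radius per mip level: the more minified the surface, the larger the radius.
radius_per_mip = [0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0, 1.0]

def lookup_radius(dudx, dvdy):
    # Rough equivalent of the hardware's isotropic LOD calculation for a tex2D fetch.
    footprint = max(abs(dudx), abs(dvdy)) * TEX_SIZE
    lod = max(0.0, math.log2(max(footprint, 1e-8)))
    level = min(int(round(lod)), len(radius_per_mip) - 1)
    return radius_per_mip[level]

# A minified (distant) surface gets a radius, a magnified (close-up) one gets none.
print(lookup_radius(1.0 / 64, 1.0 / 64))      # -> 0.5
print(lookup_radius(1.0 / 1024, 1.0 / 1024))  # -> 0.0, no supersampling needed
```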
 
Humus said:
It's the gradients. Now that it was mentioned, the NV3x do support gradients in 2.x, so they could be supported too. And of course, it's always possible to let any DX9 card run at least the non-supersampled version of the shader. Actually, now that I think of it, it may be possible to implement this on ps2.0 too with a lookup texture that stores a sample radius in the mipmaps.
Well, one question about that: that'll give you the MIP map level, but it won't be able to separate minification due to distance from minification due to anisotropy. Maybe you could do two lookups into two copies of the same texture, one with anisotropy, one without?

But I suppose that's getting into doing way too much work.
 
Luminescent said:
Realistically, though, how feasible are selective algorithms such as the ones mentioned above? How many "man hours" do they take to code and debug? I ask because it seems both Carmack and Sweeney are wary of spending their time on what seems to me an important IQ issue. It only took Humus a little while to come up with his selective solution based on branching via the gradient parameter. Perhaps Carmack and Sweeney are wary of spending time on techniques that only serve to improve the image quality to performance ratio.

I'm not surprised if developers don't immediately jump on the bandwagon for a technique that's perhaps still in its initial research stage. I found that it was good for solving the common specular aliasing problem, but it's of course not clear whether this will translate well to other classes of aliasing. But I think it will be useful in many cases. If you keep solutions like this in mind when you start on a new engine/game, I don't think it's unrealistic to put it in, and it could easily be made artist controllable. It may not be as attractive to go back and patch a whole shader library, though, if you don't know whether it's worth it. But on the other hand, not all shaders/materials or properties would need anything like this either, which is kind of the point of it in the first place.
 
Humus said:
It's the gradients. Now that it was mentioned, the NV3x do support gradients in 2.x, so they could be supported too. And of course, it's always possible to let any DX9 card run at least the non-supersampled version of the shader. Actually, now that I think of it, it may be possible to implement this on ps2.0 too with a lookup texture that stores a sample radius in the mipmaps.

If you could implement it via PS2.0, then do so... I'm a little deprived here running a Radeon 9800 :p
 
If you're not doing dynamic branching to skip seemingly unaliased parts, you don't need gradients, do you? Can't you just use the normal and tangent vectors in the vertex shader to determine the offsets needed for supersampling the normal map?

Maybe I'll try fiddling with the source code. Haven't done so yet with any of your demos. BTW, Deathlike2, I'm on a 9800 as well.
 
Mintmaster said:
If you're not doing dynamic branching to skip seemingly unaliased parts, you don't need gradients, do you?

They are still needed (unless of course I can get that texture lookup trick working). They aren't used primarily for the texture fetch, but as a means to compute texture coordinates for the other samples.
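
To picture what that means in practice, here is a minimal CPU-side sketch of the general idea, not the demo's shader: finite differences over the pixel grid stand in for ddx/ddy, and those gradients supply the offsets for the extra samples, which are only taken where a crude aliasing metric fires. The shade() function, the 0.25 offsets and the threshold are illustrative assumptions.

```python
# CPU sketch: finite differences stand in for ddx/ddy, and those gradients provide the
# texture coordinates of the extra samples. shade(), the offsets and the threshold are
# illustrative assumptions, not the demo's shader.
import numpy as np

def shade(u, v):
    # Stand-in for an aliasing-prone shader term (e.g. a tight specular highlight).
    return 0.5 + 0.5 * np.sign(np.sin(80.0 * u) * np.sin(80.0 * v))

def render(width=64, height=64, threshold=0.01):
    image = np.zeros((height, width))
    u = np.linspace(0.0, 1.0, width)
    v = np.linspace(0.0, 1.0, height)
    for y in range(height):
        for x in range(width):
            # Differences to the neighbouring pixels play the role of ddx/ddy.
            dudx = u[min(x + 1, width - 1)] - u[x]
            dvdy = v[min(y + 1, height - 1)] - v[y]
            color = shade(u[x], v[y])
            # Crude "is this pixel aliasing?" metric; a shader would branch on a value
            # derived from ddx/ddy instead of re-evaluating its neighbours like this.
            if abs(shade(u[x] + dudx, v[y]) - color) > threshold or \
               abs(shade(u[x], v[y] + dvdy) - color) > threshold:
                # The gradients supply the offsets for the remaining three samples.
                for ou, ov in [(-0.25, -0.25), (0.25, -0.25), (-0.25, 0.25)]:
                    color += shade(u[x] + ou * dudx, v[y] + ov * dvdy)
                color *= 0.25
            image[y, x] = color
    return image

if __name__ == "__main__":
    print(render().mean())
```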
 
Humus said:
I'm not surprised if developers don't immediately jump on the bandwagon for a technique that's perhaps still in its initial research stage. I found that it was good for solving the common specular aliasing problem, but it's of course not clear whether this will translate well to other classes of aliasing. But I think it will be useful in many cases. If you keep solutions like this in mind when you start on a new engine/game, I don't think it's unrealistic to put it in, and it could easily be made artist controllable. It may not be as attractive to go back and patch a whole shader library, though, if you don't know whether it's worth it. But on the other hand, not all shaders/materials or properties would need anything like this either, which is kind of the point of it in the first place.
Well, to tell you the truth, Humus, I really like this method better. I mean, it's less accurate, but it should be much faster:
http://developer.nvidia.com/object/mipmapping_normal_maps.html

The idea is basically to take the length of the filtered normal from the mipmapped normal map, and use that as a measure of how quickly the surface normal is varying over the pixel.
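
For reference, a back-of-the-envelope sketch of that technique: average the normals without renormalizing when building a mip level, and use how much the averaged normal shrinks to damp the specular exponent. The exact remapping below is a recollection of the Toksvig paper linked above, so treat it as an assumption rather than the paper's verbatim math.

```python
# Sketch of the mipmapping-normal-maps idea: the averaged (unnormalized) normal gets
# shorter where the normals in the footprint disagree, and that shortening is used to
# lower the specular power. The remapping formula is an assumption from memory.
import numpy as np

def average_normal(normals):
    # Box-filter a set of unit normals without renormalizing, as a mip build would.
    return normals.mean(axis=0)

def adjusted_specular_power(avg_normal, spec_power):
    length = np.linalg.norm(avg_normal)
    # Shorter averaged normal => normals vary quickly over the pixel => damp the power.
    ft = length / (length + spec_power * (1.0 - length))
    return ft * spec_power

flat = np.array([[0.0, 0.0, 1.0]] * 4)                     # all normals agree
bumpy = np.array([[0.5, 0.0, 0.87], [-0.5, 0.0, 0.87],
                  [0.0, 0.5, 0.87], [0.0, -0.5, 0.87]])    # normals spread apart

print(adjusted_specular_power(average_normal(flat), 64.0))   # ~64: highlight stays sharp
print(adjusted_specular_power(average_normal(bumpy), 64.0))  # ~6: highlight pre-blurred
```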
 
Ailuros said:
Granted, a console and a PC graphics unit (especially Xenos) are two entirely different beasts, but I don't seem to be able to measure any higher memory consumption so far between 2xMS and 2xSSAA on the PC.

I'm obviously missing something or I haven't tested the correct scenarios with the appropriate tools. Any help maybe?
2xMSAA and 2xSSAA do have the same memory consumption. I'm not sure where fellix said otherwise though.
 