Motion blur and Depth of field in NV30?

Tokelil

Now, with all this talk about Nvidia bringing some of the 3dfx features into the game, will we see a resurrection of cinematic effects like motion blur and depth-of-field blur? Or are these techniques easy enough to do in the pixel shader that dedicated hardware support would be a step back?
They are hyping all these CineFX effects and "just like the movies", so I was just wondering…
 
PC-Engine said:
It's already being done on NV2A in Xbox games.
So that basically means that it is done in the vertex/pixel shader, since NV2A has no such features hardwired AFAIK.
 
Those 'T-Buffer techniques' can be done with GF3+ 'multisample buffer', so NVidia already has hardware support for it.
 
Depth of Field also gets "hacked in" on some titles by manipulating texture LOD and doing a blur filter/blend pass. It's significantly cheaper than the multisample approach. Cheap motion blur can also be done using the vertex shader to stretch geometry combined with some texture manipulation.
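A minimal sketch of the LOD-bias half of that hack, assuming OpenGL 1.4's texture LOD bias control (the same tokens come from EXT_texture_lod_bias in older headers); focus_distance, the 0.5 scale, and the clamp are illustrative numbers, and the follow-up blur/blend pass is omitted:

```c
#include <GL/gl.h>
#include <math.h>

/* Bias mipmap selection by distance from the focal plane so that
 * out-of-focus objects sample blurrier mip levels; a blur/blend pass
 * over the result would then hide the transitions between levels. */
void draw_object_with_dof(float object_depth, float focus_distance,
                          void (*draw_object)(void))
{
    float bias = fabsf(object_depth - focus_distance) * 0.5f;
    if (bias > 4.0f) bias = 4.0f;  /* don't hit absurdly coarse mips */

    glTexEnvf(GL_TEXTURE_FILTER_CONTROL, GL_TEXTURE_LOD_BIAS, bias);
    draw_object();
    glTexEnvf(GL_TEXTURE_FILTER_CONTROL, GL_TEXTURE_LOD_BIAS, 0.0f);
}
```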
 
Or you can just use a basic accumulation buffer, kinda like what Unreal Tournament (classic) uses in some cases. :D
 
But the AA buffer approach, while being (and looking) "more correct", is significantly more expensive than adjusting LOD bias by depth and doing a convolution. This is multiplied several times when you take Doom3-style lighting into account.

I bet that Halo2 uses the LOD hack for DOF.
 
Xmas said:
Those 'T-Buffer techniques' can be done with GF3+ 'multisample buffer', so NVidia already has hardware support for it.

The "T-Buffer" is also almost identical to an accumulation buffer that nVidia has supported since the TNT.

The primary problem with doing motion blur/depth of field in the way 3dfx was promoting is fillrate. Quite simply, we're still fillrate-limited, even on today's highest-end video cards (read: Radeon 9700). This will become more the case as games like DOOM3 are released.

So, given the increasing power of the pixel/vertex shaders, it becomes more expedient to use those instead.
 
Chalnoth said:
The "T-Buffer" is also almost identical to an accumulation buffer that nVidia has supported since the TNT.
I wasn't aware that NVidia supports the accumulation buffer in hardware. The main difference between T/MS and A-Buffer is that T/MS preserves the information per sample, which is why it allows a few more effects.
 
Well, all I know is that the effects associated with the 3dfx "M-Buffer" are possible on the NV3x line. I'm not sure if it's done through an actual accumulation buffer of sorts, or with the programmable pipelines. You want proof? Look at the background of the "fruit" pic from the NV30 demos... :) Doesn't it just look wonderful?
 
Xmas said:
I wasn't aware that NVidia supports the accumulation buffer in hardware. The main difference between T/MS and A-Buffer is that T/MS preserves the information per sample, which is why it allows a few more effects.

Right, it's an accumulation buffer with performance enhancements. And nVidia most certainly supports the accumulation buffer fully in hardware. There's really no other way for the GeForce/2 series to have such a small performance hit for enabling FSAA (for a supersampling implementation). If the accumulation buffer (and, by extension, FSAA) didn't have hardware acceleration, then the GeForce/2 series when running at, say, 800x600 with 4x FSAA would have been slower than 1600x1200 with no FSAA.

Still, as far as I could tell, making use of the T-buffer took up too much fillrate anyway, irrespective of the performance enhancements over the accumulation buffer.
 
Ailuros said:
Xmas,

Right-click in RivaTuner on "Multisample masking" ;)
Which is the answer to what question? ;)


Chalnoth said:
Xmas said:
I wasn't aware that NVidia supports the accumulation buffer in hardware. The main difference between T/MS and A-Buffer is that T/MS preserves the information per sample, which is why it allows a few more effects.

Right, it's an accumulation buffer with performance enhancements. And nVidia most certainly supports the accumulation buffer fully in hardware. There's really no other way for the GeForce/2 series to have such a small performance hit for enabling FSAA (for a supersampling implementation). If the accumulation buffer (and, by extension, FSAA) didn't have hardware acceleration, then the GeForce/2 series when running at, say, 800x600 with 4x FSAA would have been slower than 1600x1200 with no FSAA.

Still, as far as I could tell, making use of the T-buffer took up too much fillrate anyway, irrespective of the performance enhancements over the accumulation buffer.
That's not exactly an accumulation buffer (in the OpenGL sense). With an accumulation buffer, you render one image, add it to the accumulation buffer, render another image, add it to the accumulation buffer, and so on. The values are scaled every time so that every image contributes the same part to the final image. An accumulation buffer usually has higher precision than a normal framebuffer. The number of images need not be known beforehand. This is *slow*, but requires much less memory than supersampling. The application controls how to use the accumulation buffer.
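For reference, a minimal sketch of that loop in OpenGL (render_scene_at_time is a placeholder, and the context has to be created with accumulation buffer bits):

```c
#include <GL/gl.h>

/* Average n sub-frames through the accumulation buffer; each one
 * contributes 1/n to the final image, as described above. */
void accum_motion_blur(int n, float t0, float dt,
                       void (*render_scene_at_time)(float))
{
    glClear(GL_ACCUM_BUFFER_BIT);
    for (int i = 0; i < n; i++) {
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        render_scene_at_time(t0 + i * dt);
        glAccum(GL_ACCUM, 1.0f / n);  /* scale and add this sub-frame */
    }
    glAccum(GL_RETURN, 1.0f);  /* copy the result to the color buffer */
}
```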

What GF/2 (and Kyro, Radeon7x00, others) do is render an image in a higher resolution and then downsample the image (1x2 or 2x2 box filter). It's 100% driver controlled.
If 800x600 2x2 OGSS is really faster than 1600x1200, it's because of the lower memory requirements.
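The box filter in question is just a 2x2 average; here's a grayscale C sketch of what the downsample pass computes (real hardware presumably does this per channel during the copy to the front buffer):

```c
/* Downsample a sw x sh image to (sw/2) x (sh/2) by averaging each
 * 2x2 block of source pixels into one destination pixel. */
void box_downsample_2x2(const unsigned char *src, int sw, int sh,
                        unsigned char *dst)
{
    for (int y = 0; y < sh / 2; y++)
        for (int x = 0; x < sw / 2; x++) {
            int sum = src[(2 * y)     * sw + 2 * x]
                    + src[(2 * y)     * sw + 2 * x + 1]
                    + src[(2 * y + 1) * sw + 2 * x]
                    + src[(2 * y + 1) * sw + 2 * x + 1];
            dst[y * (sw / 2) + x] = (unsigned char)(sum / 4);
        }
}
```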

T-Buffer and M-Buffer (VSA-100/NV2x) can either work similarly to what GF/2 does (whether the samples are stored each in its own buffer or all in one isn't that important, but NV2x does multisampling), or in an application-controlled mode. The important thing here is sample masking. The application can control which buffer(s) should be rendered to. So if you want to use a fixed and known number of samples (and if this number is 2 or 4 ;) ), you can use the T/M-Buffer like a fast accumulation buffer with automatic accumulation of samples.
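On 3dfx hardware that masking was exposed through the GL_3DFX_tbuffer extension; a rough sketch of how a four-sample motion blur might use it, assuming the extension's single entry point and treating each mask bit as one sample buffer (render_scene_at_time is again a placeholder):

```c
#include <GL/gl.h>

/* Entry point from GL_3DFX_tbuffer; in a real program this is
 * fetched at runtime via wglGetProcAddress/glXGetProcAddress. */
typedef void (*PFNGLTBUFFERMASK3DFXPROC)(GLuint mask);
extern PFNGLTBUFFERMASK3DFXPROC glTbufferMask3DFX;

void tbuffer_motion_blur(float t0, float dt,
                         void (*render_scene_at_time)(float))
{
    /* Render four sub-frames, each written to exactly one sample
     * buffer; the hardware downsample then averages them for free. */
    for (int i = 0; i < 4; i++) {
        glTbufferMask3DFX(1u << i);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        render_scene_at_time(t0 + i * dt);
    }
    glTbufferMask3DFX(0xFu);  /* restore writes to all sample buffers */
}
```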
 
GeForce2 performs almost exactly the same at 800x600 with 2x2 AA as at 1600x1200 straight rendering (800x600 with 2x2 samples is exactly 1600x1200 pixels' worth of fill)... the memory savings are minimal at best (a reduced-resolution front buffer only); the core is just running a 1600x1200 back buffer and downsampling via a spiffy driver trick to an 800x600 front buffer.

ADDENDUM: Just for those who can't read between the lines, while nVidia does provide hardware accumulation buffer support, GeForce256 and GeForce2 do NOT use it for their FSAA!
 
Yup, exactly. FSAA on the GeForce2 was merely a driver hack to downsample. It was just something they managed to throw in at the last moment, not really a primary feature...
 
Xmas said:
That's not exactly an accumulation buffer (in the OpenGL sense). With an accumulation buffer, you render one image, add it to the accumulation buffer, render another image, add it to the accumulation buffer, and so on. The values are scaled every time so that every image contributes the same part to the final image. An accumulation buffer usually has higher precision than a normal framebuffer. The number of images need not be known beforehand. This is *slow*, but requires much less memory than supersampling. The application controls how to use the accumulation buffer.

Okay, then, I guess I was a bit off.

But this is just the exact same math as alpha blending, then. The obvious problem is, of course, precision. No current nVidia card supports storage formats with more than 8 bits per color channel.
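A quick numeric illustration of that precision point, with made-up values: averaging 16 sub-frames through an 8-bit framebuffer channel (the blend route) versus a higher-precision accumulator. Real hardware may round rather than truncate, but the per-pass quantization loss is the same idea:

```c
#include <stdio.h>

int main(void)
{
    const int n = 16;
    float exact = 0.0f;  /* high-precision accumulator */
    int   fb8   = 0;     /* 8-bit framebuffer channel, 0..255 */

    for (int i = 0; i < n; i++) {
        float sample = 200.0f;       /* each sub-frame's channel value */
        exact += sample / n;
        fb8   += (int)(sample / n);  /* 200/16 = 12.5 quantizes to 12 */
    }
    /* Prints 200.0 vs 192: the 8-bit path lost 8 levels (4%) here. */
    printf("float accum: %.1f  8-bit blend accum: %d\n", exact, fb8);
    return 0;
}
```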

Regardless, the point is that the enhancements in 3dfx's technique were all about speed, and they apparently didn't speed things up enough anyway, particularly relative to the amount of fillrate those cards had. (Yes, the techniques might have come into use if 3dfx hadn't lost its way around the time of the Voodoo3...)
 
Sage said:
Yup, exactly. FSAA on the GeForce2 was merely a driver hack to downsample. It was just something they managed to throw in at the last moment, not really a primary feature...

I still find it an absolute impossibility that the performance could have been nearly as high as it was without the hardware being designed with FSAA in mind from the start.

As another point of evidence, the TNT series did support FSAA, though only with a few driver sets. The FSAA could never be forced, though it could be enabled by an application (I had one Direct3D motorcycle game that used it...but wow was it slow!).
 
One more post!

Tagrineth said:
Or you can just use a basic accumulation buffer, kinda like what Unreal Tournament (classic) uses in some cases. :D

Just FYI, UT doesn't use the accumulation buffer. For the motion blur effect when somebody picks up the speed relic, all that happens is that multiple transparent copies of the player are placed in appropriate positions and rendered accordingly (i.e. with just normal blending techniques).
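A minimal sketch of that kind of effect, with illustrative names (draw_player_at and the position history are placeholders), using nothing but ordinary alpha blending:

```c
#include <GL/gl.h>

#define GHOSTS 4

/* Draw several transparent copies of the player at recent positions;
 * the oldest ghost is faintest, the newest most opaque. */
void draw_speed_trail(const float pos[GHOSTS][3],
                      void (*draw_player_at)(const float p[3], float alpha))
{
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    glDepthMask(GL_FALSE);  /* don't write depth, so overlapping ghosts
                             * blend instead of occluding each other */

    for (int i = 0; i < GHOSTS; i++)
        draw_player_at(pos[i], 0.15f + 0.15f * (float)i);

    glDepthMask(GL_TRUE);
    glDisable(GL_BLEND);
}
```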
 