Cost of 3dfx RGSS AA on current hardware?

Free how? Doing it on scanout may be cheaper, but it's not free. You're still consuming a lot more bandwidth than with no AA. Instead of reading a single sample, you read four. Still better than reading 5 and writing 1 which is the case with a resolve pass before scanout, but hardly free.
I was talking about fillrate requirements for resolve (previous post). Resolve didn't consume any additional fillrate (compared to competing solutions), so from a fillrate standpoint the resolve was free... hope you understand me :)
 
How could you only pay the postprocessing once on the V5? It blended the samples on scanout.
I wasn't referring to the V5. By postprocessing I mean shaders like bloom and/or tone mapping.
On the other hand, doing postprocessing after multisample resolve is standard practice.
That's the point, and this standard practice applies to our SSAA as well. At first sight it seems that since we need to render the scene n times, the total cost will be n times bigger than without SSAA. However, because we can factor out the postprocessing shaders and run them only once on the resolved image, it's not quite that bad.
Not sure how you'd get motion blur for free either.
Just as you jitter the camera spatially, you can also "jitter" the scene temporally, exactly the same way you would when implementing standard motion blur with an accumulation buffer. All you need to do is apply both "jitters" at the same time.

The "AntiAlias" demo on your site can be easily modified to show this effect. In the file Main.cpp, line 216, we have this fragment:
Code:
	[B]for (int i = 0; i < nSamples; i++){
		<<<loop body omitted>>>
	}[/B]
We need to add a few lines before, after, and inside that loop:
Code:
	// Camera orientation and position from the previous frame
	static vec2 oldRot;
	static vec3 oldPos;
	vec2 newRot(wx, wy);
	vec3 newPos = position;

	[B]for (int i = 0; i < nSamples; i++){[/B]
		// Temporal "jitter": move the camera a bit further along its path
		// for every sample, on top of the existing spatial jitter
		float fPhase = (i + 1.0f) / nSamples;
		vec2 lerpRot = lerp(oldRot, newRot, fPhase);
		vec3 lerpPos = lerp(oldPos, newPos, fPhase);
		glMatrixMode(GL_MODELVIEW);
		modelView = rotateXY(-lerpRot.x, -lerpRot.y) * translate(-lerpPos);
		glLoadMatrixf(transpose(modelView));
		glMatrixMode(GL_PROJECTION);

		[B]<<<the original loop body stays unchanged>>>
	}[/B]

	// Remember this frame's camera state for the next frame
	oldRot = newRot;
	oldPos = newPos;
 
Uh, I don't think you get free spatial and temporal AA by combining a spatial jitter with updating the scene at the same time. You seem to be claiming that the same 4 samples can do "double duty", but I think this is false.

Imagine I draw a 45-degree line from 0,0 to 1,1. I then render this line 4 times with subpixel offsets of the pixel centers (my jitter). I have now achieved 4x supersampling by sampling the function y = x at 4 times the sampling rate.

Now, what if this line is moving at a velocity of 5 pixels per frame? You seem to be claiming that if, for each subpixel offset, I also update the position of the line according to the scene (let's say, drawing it from 5,5->6,6, 10,10->11,11, and 15,15->16,16), and blend all four of these together, I will have simultaneously performed spatial AA and temporal AA.

But this is false. All you'll have achieved is drawing 4 *different* lines at 4 different positions, with a subpixel offset, and when you resolve these 4 frames, all you'll have is some minor temporal AA, but *no* spatial AA. The lines will appear just as aliased as they were before, albeit blended with their backgrounds.


The only time this could work to give you any kind of spatial AA is if the line is moving so slowly between frames, or your framerate is so high, that the 'temporal' motion of the scene amounts to subpixel adjustments only, and even then it won't really be spatial AA, because you're not oversampling the *same* function spatially.

I dunno, there seems to be some kind of mythology surrounding 3dfx and the T-Buffer, that it could perform some kind of voodoo magic with AA, as if the T-Buffer could somehow give you 4x spatial and temporal AA at the same time for the same cost.

Moreover, temporal antialiasing sucks at such low sampling rates.
 
Moreover, temporal antialiasing sucks at such low sampling rates.
I don't know, I think rendering at 240fps on a 60Hz display is quite neat. More so given that you could, if you wished, render only a small part of the screen at 240fps (the part containing a fast-moving object) while the rest was still rendered at 60fps, so as not to eat up the fill-rate.
 
I don't know, I think rendering at 240fps on a 60Hz display is quite neat. More so given that you could, if you wished, render only a small part of the screen at 240fps (the part containing a fast-moving object) while the rest was still rendered at 60fps, so as not to eat up the fill-rate.
How would you do that?
 
The only time this could work to give you any kind of spatial AA is if the line is moving so slowly between frames, or your framerate is so high, that the 'temporal' motion of the scene amounts to subpixel adjustments only, and even then it won't really be spatial AA, because you're not oversampling the *same* function spatially.
Uhh, I'm not sure that I agree with that. The "function" that needs to be super-sampled is indeed multidimensional over a pixel, varying in X, Y and time. Super-sampling over that domain *will* give you an anti-aliased result over all three variables, as long as enough samples are used. In particular, in the limit, many samples are taken at almost the same instant of time.
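
For what it's worth, one way to make that precise (my own formulation, not anything 3dfx published): the ideal value of a pixel with footprint $A$ over a shutter interval $[t_0, t_1]$ is roughly

$$I = \frac{1}{|A|\,(t_1 - t_0)} \int_{t_0}^{t_1}\!\!\int_{A} L(x, y, t)\, dA\, dt$$

and jittered samples spread over x, y and t are just a crude Monte Carlo estimate of that triple integral. With enough samples it converges for edges and motion at once.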

I do agree that many, many samples would be necessary to get a great result, but I'm not convinced that a smaller number - combined in a quincunx-like scheme perhaps - couldn't give a reasonable enough result. Motion blur doesn't have to be very high quality to be convincing, as some nasty texture tricks have taught us in the past ;)

To my knowledge though, 3dfx didn't support proper sampling of the time domain, as that requires separate geometry at arbitrary points in time, which implies some sort of callback system, or some sort of built-in frame-to-frame interpolation, which seems unlikely given that there's no guarantee of what will be rendered each frame. Maybe they just blended subsequent frames or something...
 
Uh, I don't think you get free spatial and temporal AA by combining a spatial jitter with updating the scene at the same time. You seem to be claiming that the same 4 samples can do "double duty", but I think this is false.
[thumbnail: Pixar's "1984" image]

FWIW, Pixar's "1984" image (thumbnail above) with its spatial and temporal antialiasing only used 8 samples per pixel. Of course, there was no correlation between sub-pixel sample locations and sample time and there was no repeating pattern from pixel to pixel (and which makes it expensive to achieve in hardware).
 
How would you do that?
With a T-Buffer. :)

What the T-buffer did was to allow up to 4 different versions of the same scene to be rendered in separate frame-buffer regions, and the four to be blended together not in memory but actually at the DAC stage to produce the final image. One possible application of this was "motion-blur" which involved rendering the frame (or, more likely, part of the frame) 4 times (corresponding to different moments in time) and then blending them.

The T-buffer could be used on only sections of the whole screen rather than the entire screen. So, if you identified one region of the screen as containing a fast-moving object, you could render 4 different versions of that one part of the screen while rendering the rest of the scene only once, and thus save on fill-rate.

The T-buffer was quite flexible: it could also do very high quality anti-aliasing, depth-of-field "out of focus" effects, soft-edged shadows, etc. (although certainly not all at the same time!) I think the intention was that future products would allow more than 4 versions of the scene to be rendered. A 16-way T-buffer would allow 4-way motion-blur and x4 antialiasing at the same time (if you had enough fill-rate available).
 
That's the point, and this standard practice applies to our SSAA as well. At first sight it seems that since we need to render the scene n times, the total cost will be n times bigger than without SSAA. However, because we can factor out the postprocessing shaders and run them only once on the resolved image, it's not quite that bad.

It's still pretty bad, though. Let's say that post-processing is about 20% of the workload. Then the cost of 4x SSAA is 0.8 * 4 + 0.2 = 3.4. Better, but still significantly heavier than MSAA.
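
To make the bookkeeping explicit, here is a trivial sketch of that cost model (the post-processing share is just an assumed number, not something measured):
Code:
	#include <cstdio>

	// Relative frame cost of n-sample SSAA when the post-processing shaders
	// run once on the resolved image instead of once per sample.
	// postShare is the fraction of a non-AA frame spent on post-processing.
	float ssaaCost(int n, float postShare)
	{
		return (1.0f - postShare) * n + postShare;
	}

	int main()
	{
		std::printf("4x: %.1f\n", ssaaCost(4, 0.2f));  // 0.8 * 4 + 0.2 = 3.4
		std::printf("8x: %.1f\n", ssaaCost(8, 0.2f));  // 0.8 * 8 + 0.2 = 6.6
		return 0;
	}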
 
What about Rampage's "M-buffer", a superset of the T-buffer bringing MSAA support?
Would it have been able to do some T-buffer-like effects at a multisampling-like cost, or is that nonsense?
Are there interesting things to do with such a Multisample Buffer?
 
[thumbnail: Pixar's "1984" image]

FWIW, Pixar's "1984" image (thumbnail above) with its spatial and temporal antialiasing only used 8 samples per pixel. Of course, there was no correlation between sub-pixel sample locations and sample time and there was no repeating pattern from pixel to pixel (and which makes it expensive to achieve in hardware).

Yeah, but that image is a best case. :) And that image seems like a scan of a print/slide.

Try it with spinning triangles moving across the table instead of balls. The edges of the moving triangles won't get AAed at all. I wrote a demo of this years ago, back when 3dfx was talking about it (a few other people on B3D did too), and I remember the results were pretty bad on low-poly scenes with 4 samples. I remember an OGL demo written by someone on B3D that did up to 16 samples and still seemed worse than 4xRGSS.
 
Yeah, but that image is a best case. :) And that image seems like a scan of a print/slide.
Well, I assume those who know about it have a copy of the 1984 Siggraph proceedings :p
Try it with spinning triangles moving across the table instead of balls. The edges of the moving triangles won't get AAed at all.
Errr...yes they will.
I wrote a demo of this years ago, back when 3dfx was talking about it (a few other people on B3D did too), and I remember the results were pretty bad on low-poly scenes with 4 samples. I remember an OGL demo written by someone on B3D that did up to 16 samples and still seemed worse than 4xRGSS.
Ahh, but it is difficult to implement Pixar's technique with existing hardware, simply because to get the equivalent of a few uncorrelated samples per pixel you'd need a lot of frames, since each frame's samples are highly correlated.
 
Simon, are you talking about Distributed Ray Tracing, or combining 8 jittered frames a la T-Buffer? Because I fail to see how the latter can achieve the necessary spatial antialiasing of moving primitives. As in the example I gave, simply rendering a moving line (say, a white line on a black background) will not result in any spatial antialiasing on the T-Buffer; how could it? The reductio ad absurdum: rendering 1 line in 8 different frames and rendering 8 lines in 1 frame (no temporal changes) are isomorphic situations, as long as the lines do not intersect. Thus, if drawing 8 lines in 1 frame at different positions and center offsets gives no spatial AA, neither will the former. There is nothing special about the time axis here; time is space as far as non-changing functions/primitives are concerned. It is only for moving objects (let's leave lighting and shadows out of the equation) that temporal sampling needs to be scrutinized.

Perhaps you misunderstood the "double duty" part of my post. If you are taking, say, 4xRGSS spatial samples, and you then claim that by simply distributing those samples over time you can achieve the same *spatial IQ* as well as gaining temporal AA (that is, you lose nothing from the prior case, and only gain), then I disagree. There's no free lunch, and you must give up quality on some pixels somewhere.

For stationary objects, one could easily prove that a per-scene temporal jitter can be isomorphic to a pure spatial supersample (with identical sample distribution). However, moving objects would get less spatial AA. If the moving objects intersect with their positions in previous frames, the motion blur effect may serve to obscure this, but the leading edges of the object would still be aliased.

So let's say you are doing an 8-sample jittered T-buffer, implemented two ways. In the first, you use traditional spatial AA: you oversample the same scene 8 times at different offsets. In the latter, you update the simulation of the scene for 8 frames, and render each one with a jitter. For stationary objects, the spatial IQ of the former and the latter can be identical. For moving objects, the latter's spatial AA IMHO will be inferior to the spatial AA of the former.
 
So let's say you are doing an 8-sample jittered T-buffer, implemented two ways. In the first, you use traditional spatial AA: you oversample the same scene 8 times at different offsets. In the latter, you update the simulation of the scene for 8 frames, and render each one with a jitter. For stationary objects, the spatial IQ of the former and the latter can be identical. For moving objects, the latter's spatial AA IMHO will be inferior to the spatial AA of the former.
You are comparing AA in 2D with AA in 3D, with the same sample budget. Let's instead try something more fair. When we are doing spatial and temporal AA at the same time, our sample grid from the 2D case effectively becomes a 'sample cube'. How do you propose to distribute samples in that cube for the best effect?

A fast-moving object won't benefit visually from spatial AA. What it needs most is motion blur.

A slow or stationary object won't benefit visually from motion blur. Most of all, it needs spatial AA.

By making the samples do the 'double duty' (as you called it yourself), you are making universally the best use of a given number of samples.
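
Just to illustrate how I would spread a given budget over that cube (the rotated-grid offsets below are only assumed example values, and the time phase is the same one the modified demo uses):
Code:
	#include <cstdio>
	#include <vector>

	struct Sample { float dx, dy, t; };  // subpixel offset + moment within the frame

	// Spread n samples over the (x, y, time) "cube": every sample gets a
	// different spatial offset *and* a different point in time, so the same
	// n renders give edge AA and motion blur at once ("double duty").
	std::vector<Sample> makeSamples(int n)
	{
		// Illustrative rotated-grid subpixel offsets (assumed values)
		const float ox[4] = { -0.375f,  0.125f, -0.125f,  0.375f };
		const float oy[4] = { -0.125f, -0.375f,  0.375f,  0.125f };

		std::vector<Sample> s(n);
		for (int i = 0; i < n; i++){
			s[i].dx = ox[i % 4];
			s[i].dy = oy[i % 4];
			s[i].t  = (i + 1.0f) / n;   // same phase as in the modified demo
		}
		return s;
	}

	int main()
	{
		std::vector<Sample> s = makeSamples(8);
		for (size_t i = 0; i < s.size(); i++)
			std::printf("%u: (%+.3f, %+.3f) t=%.3f\n", (unsigned)i, s[i].dx, s[i].dy, s[i].t);
		return 0;
	}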
 
It doesn't matter how you want to visualize it, the fact of the matter is the number of samples is too low. Now all you've done is shift the argument to whether or not moving objects need spatial-AA and what the definition of "fast moving" is. Since people currently complain about edge-aliasing on moving objects in games, one could conclude that edge aliasing effects are still annoying, even in motion. Any object moving more than 1 pixel per frame is going to have edge aliasing.

I'm of the opinion that 8 samples is way too low for T-buffer spatial+temporal AA, and that the resulting motion blur sucks as well. You end up with crappy-looking motion blur (motion trails instead of blur) and inferior spatial AA.

I'm pretty sure it would not be hard to come up with a scene with moving objects in the background on the horizon, for which the pixel-popping looks worse than with traditional spatial AA applied.
 
What about Rampage's "M-buffer", a superset of the T-buffer bringing MSAA support?
Would it have been able to do some T-buffer-like effects at a multisampling-like cost, or is that nonsense?
Are there interesting things to do with such a Multisample Buffer?
T-Buffer and Multisample-Buffer are not much different, with the addition that you can do alpha to coverage (if the API allows it) and fill multiple samples at once with the same color (which naturally leads to framebuffer compression).
That might help you somewhat with "T-buffer like effects" (DoF, soft reflections, motion blur), but only for those parts of the scene that are not affected by the effect, because the effects themselves rely on each sample being rendered separately.

The T-buffer could be used on only sections of the whole screen rather than the entire screen. So, if you identified one region of the screen as containing a fast-moving object, you could render 4 different versions of that one part of the screen while rendering the rest of the scene only once, and thus save on fill-rate.
I know what the T-buffer is, but you couldn't simply save fillrate with it. It's supersampling, remember? So even if you render the rest of the scene only once, you render it with 4 samples per pixel, meaning 4x the fillrate requirements of non-AA, non-T-buffer rendering. If you rendered only one sample, it would be mixed with the clear color.

Instead, what you describe is actually possible with multisampling. Render the static parts of the scene once, render the moving parts multiple times, each time with a different bit of the sample mask set.
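
A rough sketch of how that looks in GL terms on current hardware (just my illustration; it assumes a GL 3.2+ context with a multisampled framebuffer bound, and drawStatic()/drawMovingAtTime() are hypothetical placeholders for your own draw calls):
Code:
	// Static geometry: one pass, written to all samples like normal MSAA
	glDisable(GL_SAMPLE_MASK);
	drawStatic();                              // hypothetical helper

	// Moving geometry: one pass per sample, each restricted to a single
	// sample bit, with the objects advanced to a different point in time
	glEnable(GL_SAMPLE_MASK);
	for (int i = 0; i < nSamples; i++){
		glSampleMaski(0, 1u << i);             // only sample i gets written
		drawMovingAtTime((i + 1.0f) / nSamples);  // hypothetical helper
	}
	glDisable(GL_SAMPLE_MASK);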

Unfortunately the worst case for motion blur is very common: camera movement. So you'd have to render the whole scene multiple times exactly when you need a high framerate the most. But if you have enough performance to do that, then why not use supersampling all the time?
 
I'm of the opinion that 8 samples is way too low for T-buffer spatial+temporal AA, and that the resulting motion blur sucks as well. You end up with crappy-looking motion blur (motion trails instead of blur) and inferior spatial AA.

I'm pretty sure it would not be hard to come up with a scene with moving objects in the background on the horizon, for which the pixel-popping looks worse than with traditional spatial AA applied.
Well, I can only invite you to see it at work. It doesn't look as bad as you're describing it. I can hardly see any degradation in spatial AA.

As I wrote on the previous page, I took the liberty of modifying Humus' supersampling demo so that its samples perform the 'double duty'. I also added an 8-sample mode. For easier comparison, you can suppress the spatial and/or temporal jitter by holding the left or right mouse button respectively. Download Humus' AntiAlias demo and replace its exe with the modified version (it's free-hosted, so the link will expire eventually). If you have reservations about running an exe from an untrusted source (yours truly ;)), the program can be rebuilt by anyone from the provided sources.
 
I know what the T-buffer is, but you couldn't simply save fillrate with it.
It requires less fill-rate to render part of the screen four times and the rest of the screen once than it does to render the entire screen four times. Or are you trying to disagree with that?
 
Simon, are you talking about Distributed Ray Tracing, or combining 8 jittered frames a la T-Buffer? Because I fail to see how the latter can achieve the necessary spatial antialiasing of moving primitives...
Oh, I was definitely talking about distributed ray tracing (not that ray tracing per se is actually required).
 