will SLI return?

Xmas · Jun 14, 2005

I know this question sounds a little confusing now, but I mean SLI in its original meaning - scanline interleave.

Well, SLI already has returned - or rather, is announced to return. Today it just carries the name SuperAA. Strange, eh?

Generally, you can use multiple GPUs to either increase the framerate at a given resolution, or to increase the resolution while keeping the framerate the same.

When AFR or SFR (be it a single split or tiles) work, all is well and you either get a massive increase in fps or you can more than double the resolution (efficiency increases with resolution) while getting the same framerate.

But AFR breaks with framebuffer locks, and textures rendered to in the previous frame need to be transferred or re-rendered. With SFR, RTT textures generally need to be transferred or rendered twice.

Rendering textures twice does of course impact the fps you can reach, but even if each card rendered every texture, you still would be able to exactly double the final output resolution at the same framerate, ignoring the overhead. Well, but only if the resolution of those textures does not scale with output resolution.

Post processing effects, however, as well as a few other things, do scale with output resolution. So if you rendered the whole scene into a framebuffer-sized texture, then performed a simple bloom filter on it, your SFR system, set up to re-render textures on each card, would see close to no speed-up at all. And if that texture is used in the next frame as well, e.g. for light trails, AFR will significantly slow down, too.
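
To put rough numbers on the argument above, here is a toy cost model in Python. The function name `sfr_speedup` and the cost figures are made up for illustration. The model assumes frame time is simply the sum of a render-to-texture pass that every card repeats in full and a final pass that is split evenly between the cards, and it ignores transfer and driver overhead.

```python
def sfr_speedup(rtt_cost, final_cost, num_gpus=2):
    """Toy model of SFR speed-up when every card re-renders the RTT pass.

    rtt_cost   -- time for render-to-texture work, repeated in full on each card
    final_cost -- time for the work that is actually split between the cards
    (Both are hypothetical, unitless costs; overhead is ignored.)
    """
    single_card = rtt_cost + final_cost
    multi_card = rtt_cost + final_cost / num_gpus
    return single_card / multi_card


# Small fixed-size texture: most of the frame is split between the cards.
mostly_split = sfr_speedup(rtt_cost=1.0, final_cost=9.0)    # about 1.82

# Whole scene rendered into a framebuffer-sized texture, then a cheap bloom pass.
mostly_repeated = sfr_speedup(rtt_cost=9.0, final_cost=1.0)  # about 1.05
```

In the second case almost all of the work is duplicated on both cards, which is why the second card buys close to nothing.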

Now this is where SuperAA kicks in. A simple mode that pretty much guarantees higher AA levels at the same framerate. Because the output resolution stays the same. But does it really? Isn't supersampling just a higher resolution downfiltered?

With SuperAA, each card renders as if it were the only card in the system, bar a possible LOD bias. Each card renders its own RTT textures. No communication until scanout.
This is exactly how scanline interleave works, too, except that at scanout the lines are interleaved instead of blended. And that the application is aware of the increased output resolution.
What if we took this awareness away?
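
The difference between the two scanout modes can be sketched in a few lines of Python. This is only an illustration under simple assumptions: a frame is a list of rows of grey values, both cards have rendered a complete frame, and the second card's samples sit half a line below the first card's. The function names are hypothetical and do not correspond to any driver API.

```python
def blend_scanout(frame_a, frame_b):
    """SuperAA-style scanout: average the two cards' frames pixel by pixel."""
    return [[(a + b) / 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def interleave_scanout(frame_a, frame_b):
    """Scanline-interleave-style scanout: alternate rows from the two cards."""
    out = []
    for row_a, row_b in zip(frame_a, frame_b):
        out.append(list(row_a))
        out.append(list(row_b))
    return out


card_a = [[0, 0], [4, 4]]  # rows as sampled by card A
card_b = [[2, 2], [6, 6]]  # rows as sampled by card B (assumed half a line lower)

blended = blend_scanout(card_a, card_b)            # same height as each input
interleaved = interleave_scanout(card_a, card_b)   # twice the height
```

Both functions consume exactly the same per-card data. Only the last step differs, and only the interleaved result has more lines than what each card rendered.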

Say you have a game that delivers playable framerates at 1024x768 with whatever settings you like on a single-card setup, but not above. AFR and SFR both do not work as expected, but you have the option to enable SuperAA on a dual-card machine. So you get nice smooth edges at the same playable framerate. Unfortunately, 1024x768 is a bit low for a high quality monitor. Wouldn't it be nice, then, to be able to just double the pixel count by going to 1440x1080 instead of having 2xSSAA?
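
For reference, the arithmetic behind the 1440x1080 figure can be checked with a short Python snippet. The helper name `pixel_ratio` is just for illustration. The snippet only compares pixel counts and says nothing about how the samples would be laid out across the two cards.

```python
import math


def pixel_ratio(low, high):
    """Ratio of pixel counts between two (width, height) resolutions."""
    return (high[0] * high[1]) / (low[0] * low[1])


base = (1024, 768)
target = (1440, 1080)

ratio = pixel_ratio(base, target)  # just under 2

# Scaling both axes by sqrt(2) would double the pixel count exactly;
# this lands at roughly 1448 x 1086, so 1440x1080 is the nearby 4:3 mode.
scale = math.sqrt(2)
exact = (base[0] * scale, base[1] * scale)
```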

KimB · Jun 14, 2005

I claim that it would still be better to do AFR or SFR in combination with a similar degree of multisampling instead of using the multiple cards for supersampling. I just don't buy that going for scanline interleave would allow one to dodge the same hurdles that using multiple cards for supersampling can get you.

In other words, what we really need are cards that support higher levels of multisampling. IHVs have been asking developers for a long time to not do many of the things that cause inefficiencies with any sort of SLI-type technology, so the only real remaining hurdle is render to texture, which will increase in usage. All we really need, I think, is some optimization in dealing with render to texture scenarios in multi-GPU setups.

One possible option would be for the application to be able to let the hardware know when the texture currently being rendered to will be output later using a screenspace quad (e.g. for tone mapping). This would allow the drivers to pass much less information between the cards. A slightly more generalized idea would possibly give the driver limits on what areas of the screen the rendered texture could possibly affect (imagine rear view mirrors, for example).
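
A rough sketch of how a driver might use such a hint, in Python. Everything here is hypothetical: no such hint exists in any API I know of, the function name is invented, and the setup is a plain two-card SFR split where card 0 renders the rows above `split_row` and card 1 renders the rest.

```python
def gpus_needing_texture(affected_rows, split_row):
    """Return which cards need a rendered texture under a two-card SFR split.

    affected_rows -- (first_row, last_row), inclusive, of the screen rows the
                     texture can possibly touch (the hypothetical app hint)
    split_row     -- card 0 renders rows [0, split_row), card 1 the rest
    """
    first, last = affected_rows
    needed = []
    if first < split_row:
        needed.append(0)
    if last >= split_row:
        needed.append(1)
    return needed


# Rear view mirror near the top of a 768-line screen, split in the middle.
mirror = gpus_needing_texture((20, 120), split_row=384)

# Full-screen tone mapping quad: both cards need the texture.
fullscreen = gpus_needing_texture((0, 767), split_row=384)
```

With the mirror hint, only one card would have to render or receive that texture.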

ondaedg · Jun 14, 2005

Aren't there inherent advantages to SuperSampling (outside of AA quality) that make SS desirable, such as sharper textures and reduced shimmer?

KimB · Jun 14, 2005

ondaedg said:
Aren't there inherent advantages to SuperSampling (outside of AA quality) that make SS desirable, such as sharper textures and reduced shimmer?

Anisotropic filtering is more efficient at anti-aliasing of textures (by far) than supersampling.

RejZoR · Jun 14, 2005

The problem today is that graphics cards no longer render just basic textures and a few triangles. There are lots of framebuffer/Z tricks, pixel shading, multiple texture stages, several filtering methods, heavy stencil buffer usage etc...
I believe all this can be problematic in certain games, affecting the framerate and the SLI mode used to link both graphics cards into one image output "engine". The motion blur effect is the most common killer of AFR mode, for example...

no-X · Jun 14, 2005

Chalnoth said:
Anisotropic filtering is more efficient at anti-aliasing of textures (by far) than supersampling.

More efficient? Depends on the implementation. E.g. enabling AA on Kyro will cause the same performance drop as AF, but the quality of textures won't be worse (with AA enabled).

AF on

AA on (4x) - same quality and performance as AF on (with many games), so edge removal is "free"

AA on (2x) - faster than AF, bit blurrier too

KimB · Jun 14, 2005

Except that 4x supersampling AA is no better than 2-degree anisotropy as far as dealing with anisotropy is concerned. It may remove a bit of aliasing in the near-field, but that's typically a waste of performance.

And that doesn't even consider that 4x supersampling will pretty much always drop performance by a factor of four, whereas even up to 16-degree anisotropic filtering performance rarely drops by more than half.

Xmas · Jun 14, 2005

Chalnoth said:
I claim that it would still be better to do AFR or SFR in combination with a similar degree of multisampling instead of using the multiple cards for supersampling. I just don't buy that going for scanline interleave would allow one to dodge the same hurdles that using multiple cards for supersampling can get you.

Why not? Look at how SuperAA works. It works like scanline interleave, except for the half-pixel horizontal shift and the blending of pixel data.

Of course, using scanline interleave to increase the resolution to, say, 1440x1080 would not be identical to have a single card render 1440x1080.

One possible option would be for the application to be able to let the hardware know when the texture currently being rendered to will be output later using a screenspace quad (e.g. for tone mapping). This would allow the drivers to pass much less information between the cards. A slightly more generalized idea would possibly give the driver limits on what areas of the screen the rendered texture could possibly affect (imagine rear view mirrors, for example).

Giving such hints might be desirable sometimes, but I don't think it will happen. Rather, I think virtual memory and a faster multi-GPU interconnect will take care of this in the future.

Blazkowicz · Jun 14, 2005

That's interesting. Why don't we see "real SLI" again?

As for supersampling, 4x RGSS allowed me to set LOD -1.5, MUCH better than 2x AF. (crisp textures overall, and I wasn't even much noticing the bilinear filtering)
2x AF is worthless as far as I'm concerned.

I agree it's wasteful, but if even a Voodoo5 could afford to be so wasteful (for Quake 1/2 based games and such), why wouldn't our ridiculously powerful monsters of today? There's no waste if you're playing CS and still get 100fps anyway.

stevem · Jun 14, 2005

Blazkowicz_ said:
That's interesting. Why don't we see "real SLI" again?

Cache coherency.

Xmas · Jun 14, 2005

A LOD bias of -1.5 with 4xRGSS? Uh, that's like adding a LOD bias of about -0.6 on top of 2xAF...

Xmas · Jun 14, 2005

stevem said:
Cache coherency.

It's obviously not enough of an issue to make SuperAA worthless, so why would it be with scanline interleave?

stevem · Jun 14, 2005

Xmas said:
It's obviously not enough of an issue to make SuperAA worthless, so why would it be with scanline interleave?

SuperAA is claimed as an IQ, not performance enhancement. It's a catch-all. I'm not entirely sure the scene is rendered in alternating lines as per 3dfx SLI mkI, then reconstituted on scan out. What effects do higher poly counts have?

Humus · Jun 15, 2005

Maybe I'm missing something here, but if you just want higher resolution, then what's the problem with using SuperTiling or split screen rendering?

Xmas · Jun 15, 2005

Post processing.

Xmas said:
Post processing effects, however, as well as a few other things, do scale with output resolution. So if you rendered the whole scene into a framebuffer-sized texture, then performed a simple bloom filter on it, your SFR system, set up to re-render textures on each card, would see close to no speed-up at all. And if that texture is used in the next frame as well, e.g. for light trails, AFR will significantly slow down, too.

KimB · Jun 15, 2005

Sure, but tone mapping passes typically have outputs which depend upon neighboring pixels, so you can't keep the cards completely independent.

CMAN · Jun 15, 2005

Humus said:
Maybe I'm missing something here, but if you just want higher resolution, then what's the problem with using SuperTiling or split screen rendering?

Efficiencies? SuperTiling and/or split screen rendering would need to be dynamic to be theoretically as "efficient".

Humus · Jun 15, 2005

Xmas said:
Post processing.

Xmas said:
Post processing effects, however, as well as a few other things, do scale with output resolution. So if you rendered the whole scene into a framebuffer-sized texture, then performed a simple bloom filter on it, your SFR system, set up to re-render textures on each card, would see close to no speed-up at all. And if that texture is used in the next frame as well, e.g. for light trails, AFR will significantly slow down, too.

Ok, so you mean that one card should render the even lines and do the post-processing on that, and the other card on the odd lines and do its post-processing on that, then interleave it for the final pic? Well, theoretically that would be possible, but the output would not match that of a single-gpu solution (though I guess for a bloom filter you may not see any difference in practice), so there would have to be a way for the application to communicate to the driver that it desires this kind of setup.
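
To illustrate why the output would not match, here is a small one-dimensional Python sketch. It stands in for a single pixel column and uses a 3-tap box blur in place of a real bloom filter. The function names and the sample data are made up, and edge pixels are simply clamped.

```python
def box_blur(values):
    """3-tap box blur along a column, clamping at the edges."""
    n = len(values)
    return [(values[max(i - 1, 0)] + values[i] + values[min(i + 1, n - 1)]) / 3
            for i in range(n)]


def blur_per_field(column):
    """Each card blurs only its own lines (even or odd), then lines are interleaved."""
    even = box_blur(column[0::2])
    odd = box_blur(column[1::2])
    out = [0.0] * len(column)
    out[0::2] = even
    out[1::2] = odd
    return out


column = [0, 0, 0, 0, 9, 0, 0, 0]    # one bright pixel on an even line

single_gpu = box_blur(column)         # [0, 0, 0, 3, 3, 3, 0, 0]
interleaved = blur_per_field(column)  # [0, 0, 3, 0, 3, 0, 3, 0]
```

On the single GPU the bright pixel spreads onto its direct neighbours. With per-field processing it only spreads within the even lines, so the odd lines in between stay dark and the result looks like a comb.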

KimB · Jun 15, 2005

Humus said:
(though I guess for a bloom filter you may not see any difference in practice)

Oh, I bet there would be distinct feathering in the output with a bloom filter.

KimB · Jun 15, 2005

CMAN said:
Humus said:
Maybe I'm missing something here, but if you just want higher resolution, then what's the problem with using SuperTiling or split screen rendering?

Efficiencies? SuperTiling and/or split screen rendering would need to be dynamic to be theoretically as "efficient".

Dynamic? No, that has nothing to do with it. But split screen rendering is dynamic. This is required to keep load balancing in line.
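
A minimal sketch of what such dynamic load balancing could look like, in Python. This is a guess at the general idea, not how any actual driver does it. The function name, the `gain` smoothing factor and the clamping limits are all assumptions, and it assumes a two-card split where rendering cost is roughly uniform within each card's region.

```python
def rebalance_split(split, time_top, time_bottom, gain=0.5):
    """Move the SFR split line based on the last frame's per-card render times.

    split       -- fraction of screen height given to the top card (0 < split < 1)
    time_top    -- time the top card needed for its region last frame
    time_bottom -- time the bottom card needed for its region last frame
    gain        -- how far to move towards the estimated ideal (hypothetical knob)
    """
    if time_top + time_bottom <= 0:
        return split
    # Estimated cost per unit of screen height in each region.
    cost_top = time_top / split
    cost_bottom = time_bottom / (1 - split)
    # Split at which both cards would finish at the same time.
    ideal = cost_bottom / (cost_top + cost_bottom)
    new_split = split + gain * (ideal - split)
    # Keep both cards busy with at least a small strip of the screen.
    return min(max(new_split, 0.05), 0.95)


balanced = rebalance_split(0.5, 10.0, 10.0)      # stays at 0.5
bottom_heavy = rebalance_split(0.5, 10.0, 30.0)  # top card gets more lines
```

When the bottom half is three times as expensive as the top half, the split line moves down so that the top card takes a larger share in the next frame.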
