Programming for SLI in OpenGL

Geo

http://download.nvidia.com/developer/presentations/2005/GDC/OpenGL_Day/OpenGL_SLI.pdf

Hmm! Does this tell us anything useful about the capabilities and limitations of NV SLI?

"Programming for. . ." is certainly an evocative title in itself, don't you think? By that I mean we (well, at least I) have been mostly thinking of SLI as a transparent technology to the app, and this could lead in all sorts of directions for the industry. . .

Thanks to the NV GDC presentations thread... http://www.beyond3d.com/forum/viewtopic.php?t=21264
 
Haven't read it yet, but I'm guessing "Programming Around..." would be more apt. ;) Still, if SLI is here to stay, this stuff is probably important.

Edit: So PCIe traffic is routed by the CPU, not the northbridge (such as it is on A64 motherboards)?
 
I wonder how ATI will fare without the SLI bridge. In particular, I wonder what kind of CPU hit we are talking about with AMR.
 
That presentation is mainly an explanation of the two modes, AFR and SFR (Alternate and Split Frame Rendering respectively), intended to make developers conscious of their limitations so they don't paint themselves into an SLI corner. The seventh chapter of the GPU Programming Guide adds a few more details, and as they say (quote): "The efficiency of a multi-GPU system is inversely proportional to how much data the GPUs share." That's what this presentation details a bit.

In short, with AFR the frames are rendered alternately, each GPU handling every other frame, so if you're using some form of recursive post-processing filter that reuses elements computed for the previous frame, that's not going to work optimally with AFR (the two GPUs will have to spend time transferring those elements between them). With SFR the screen is split, so similarly, if the GPU rendering one part of the screen needs elements from the other part of the split screen, communication will have to occur and you can't expect this to work optimally with SFR either.
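To make that concrete, here's a rough C/OpenGL sketch of the kind of AFR-unfriendly pattern I mean; it's only my own illustration (the texture object prev_frame_tex and the function name are made up), not code from the presentation:

#include <GL/gl.h>

/* Each frame samples the previous frame's image (e.g. a motion-trail or
   feedback post-process), then captures the new result for the next frame. */
void draw_frame(GLuint prev_frame_tex, int width, int height)
{
    /* Read back what was rendered last frame... */
    glBindTexture(GL_TEXTURE_2D, prev_frame_tex);
    /* ...draw the scene, blending in prev_frame_tex... */

    /* ...then copy this frame's framebuffer into the texture for next frame.
       Under AFR the next frame runs on the other GPU, so this texture has to
       be transferred between the boards every single frame. */
    glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, width, height);
}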

That should apply to all AFR/SFR systems I think, regardless of which IHV proposes it.
 
3dfx's tech still seems the best solution to me. Scan Line Interleave has got to be the most direct way to double performance in a pure way.

This should never have died as a technology.

AFR from ATI was not nearly as good. I suppose their tile system is pretty good. But come on, someone license or bring back SLI in its pure unadulterated form.

Can you imagine the performance of two 6800s or 520s in a real SLI mode?
 
I'm not sure that 3dfx's SLI implementation would have fared any better. Some of the issues raised by NVidia seem pretty fundamental to me.
 
Hellbinder said:
3dfx's tech still seems the best solution to me. Scan Line Interleave has got to be the most direct way to double performance in a pure way.

This should never have died as a technology.

AFR from ATI was not nearly as good. I suppose their tile system is pretty good. But come on, someone license or bring back SLI in its pure unadulterated form.

Can you imagine the performance of two 6800s or 520s in a real SLI mode?

Scan Line Interleave is only good when you are pushing polys and texturing them. Primitive stuff. Modern GPUs are way beyond this: pixels take multiple hops/loops to get to the screen, and then there are inter-pixel dependencies. Scan Line Interleave would absolutely crush performance on a 6800. Everything would have to be recalculated twice or even more. It would suck (to use a technical term).
 
Hellbinder said:
3dfx's tech still seems the best solution to me. Scan Line Interleave has got to be the most direct way to double performance in a pure way.

This should never have died as a technology.
Well, even 3dfx went from scan line interleave (V2) to stripe interleave (V5). And there's a reason for that: much better utilization of spatial coherence.

Scan line interleave does not double performance, at least not if you do it correctly and adjust the LOD calculation to account for the fact that ddy doubles, because that adjustment leads to texture cache misses.
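For reference, standard mip-level selection looks roughly like this (just a sketch, not any particular hardware's actual implementation):

#include <math.h>

/* du_dx, dv_dx: texture-coordinate change per pixel horizontally;
   du_dy, dv_dy: change per rasterized line vertically.            */
float mip_level(float du_dx, float dv_dx, float du_dy, float dv_dy)
{
    float rho_x = sqrtf(du_dx * du_dx + dv_dx * dv_dx);
    float rho_y = sqrtf(du_dy * du_dy + dv_dy * dv_dy);
    return log2f(fmaxf(rho_x, rho_y));   /* lambda = log2(max(rho_x, rho_y)) */
}

/* A GPU doing scan line interleave rasterizes only every other screen line, so the
   vertical derivative it measures between its own lines is twice the true per-pixel
   value, which would push the result up by about one mip level. Halving it back
   keeps the image correct, but the lines that GPU renders back-to-back are now two
   screen lines apart, so their texel fetches are spread further apart and the
   texture cache hit rate drops. */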

And whether you use scan line interleave, stripe interleave, tiling or split frame rendering, you get the same problems regarding render-to-texture, post-processing, etc.
The only thing split frame rendering doesn't have is inherent load balancing, but that's the trade-off for saving the transistors necessary for interleaving/tiling.
 
geo said:
Yes, and did you mention that an app limiting the number of frames buffered will also break AFR? Hmm?
No, not really. Was that required to pass the exam? :D

I dunno, it just seems obvious. Moreover, as composers like to say, once a song is out it's no longer yours, it belongs to the public. Frames are like songs: how people consume them is their choice, not the developer's, which is why I have trouble imagining a developer limiting things such as the number of frames rendered ahead.

Therefore I tend to believe that if it's an issue, it's only a marginal one. But as I have no real field knowledge here (if someone does, please post), that's just what seems natural to me, not necessarily how things are in the real world.

By the way, while we're speaking of missing things, nobody has noticed we might well lack one more mode, let's call it "PFR" for Parallel Frame Rendering. You know, when you're using both your eyes in stereo 3D... Wouldn't it make sense to have one GPU render the left image while the other renders the right one, thus minimizing lag and benefiting from natural synchronization (the GPUs taking about the same time to render about the same images)?

Edit: Of course unlike AFR/SFR, it makes sense that it would be automatically selected by the driver when stereo is enabled, so while a valid rendering mode, it would get less public exposure too...
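From the app's side this is just ordinary quad-buffered stereo; a rough sketch with standard OpenGL calls, nothing SLI-specific (and whether a driver really splits the eyes across GPUs is pure speculation on my part):

#include <GL/gl.h>

/* The two eye images are independent and show the same scene state, which is
   what would make per-eye GPU assignment attractive. */
void draw_stereo_frame(void)
{
    glDrawBuffer(GL_BACK_LEFT);              /* left-eye image */
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    /* ...render the scene with the left-eye projection... */

    glDrawBuffer(GL_BACK_RIGHT);             /* right-eye image */
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    /* ...render the scene with the right-eye projection... */

    /* SwapBuffers() then presents both eyes at once. */
}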
 