Multi-GPU AFR vs. frame lag & smooth animation

Cubitus

Newcomer
Both ATI & Nvidia have offered multi-GPU solutions for some years now (Crossfire & SLI). There are three common modes of operation available:

- Super AA or SLI AA mode (for improved image quality with increased AA);
- Scissor or split frame rendering (SFR) mode;
- Alternate frame rendering (AFR) mode.

As outlined in some talks from both IHVs, for a modern game engine that requires some sort of post-processing effect (as they all do these days), the only usable mode is AFR. Note that even with AFR, care must be taken about when render targets (DirectX) or renderbuffers (OpenGL) are updated and cleared.

Here are a few tips from vendors:

- To avoid GPU starvation, make sure at least 2 frames of commands are buffered ahead. I think the default is 3 for Nvidia and 2 for ATI. This way, at the end of a frame (SwapBuffers with OpenGL), the CPU does not wait for one GPU to finish before starting to issue new commands for the other GPU.
- Disable VSync ("swap on VSync" / "wait for vertical refresh") to maximize FPS, if tearing is acceptable.
- Never call glFinish (OpenGL), since this would kill the asynchronous work between the CPU and the GPUs: the CPU would have to wait for one GPU to finish before submitting any new commands to the other GPU. (A minimal loop illustrating these tips is sketched after this list.)
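
To make the tips concrete, here is a minimal sketch of such a frame loop on Win32/OpenGL, assuming the device context and GL context already exist; wglSwapIntervalEXT comes from the WGL_EXT_swap_control extension, and drawScene() is a hypothetical helper:

```cpp
// Minimal AFR-friendly frame loop (Win32 + OpenGL) -- a sketch only.
// Assumes an HDC `hdc` and a current GL context created elsewhere.
#include <windows.h>
#include <GL/gl.h>

typedef BOOL (WINAPI *PFNWGLSWAPINTERVALEXTPROC)(int interval);
void drawScene(); // hypothetical: builds and issues the frame's GL commands

void renderLoop(HDC hdc, volatile bool& running)
{
    // Tip 2: disable VSync via WGL_EXT_swap_control, if tearing is OK.
    PFNWGLSWAPINTERVALEXTPROC wglSwapIntervalEXT =
        (PFNWGLSWAPINTERVALEXTPROC)wglGetProcAddress("wglSwapIntervalEXT");
    if (wglSwapIntervalEXT)
        wglSwapIntervalEXT(0); // 0 = do not wait for vertical refresh

    while (running)
    {
        drawScene();

        // Tip 3: no glFinish() anywhere -- it would stall the CPU until
        // one GPU goes idle. SwapBuffers lets the driver queue the frame
        // (tip 1: 2-3 frames deep) and return, so the CPU can start
        // issuing commands for the other GPU right away (AFR).
        SwapBuffers(hdc);
    }
}
```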

Obviously those tips maximize parallelism between the CPU issuing commands and the GPUs. But how is it possible to render a smooth animation this way? If the CPU is totally asynchronous with the GPUs, then at the time it computes object positions it has no idea when the resulting frame will actually be displayed on screen.
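
The only workaround I can imagine is to animate for a predicted display time instead of "now", i.e. offset the animation clock by an estimate of the queued latency. A minimal sketch of that idea follows; the 60 Hz refresh, the 3-frame queue depth and computePositions() are my own assumptions for illustration:

```cpp
// Sketch: evaluate the animation at the predicted display time.
#include <windows.h>

double nowSeconds()
{
    LARGE_INTEGER t, f;
    QueryPerformanceCounter(&t);
    QueryPerformanceFrequency(&f);
    return (double)t.QuadPart / (double)f.QuadPart;
}

void animateFrame()
{
    const double refreshPeriod = 1.0 / 60.0; // assumed 60 Hz display
    const int    queuedFrames  = 3;          // assumed driver queue depth
    // The frame we are about to build should reach the screen roughly
    // queuedFrames refreshes from now, so animate for that moment.
    double displayTime = nowSeconds() + queuedFrames * refreshPeriod;
    // computePositions(displayTime); // hypothetical animation evaluator
}
```

Of course this only shifts the problem: the queue depth is a guess, which is exactly why knowing the real display time matters.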

For my application tearing is not an option, so at least this way I know that frames are displayed at a constant rate (the display refresh rate). But since frames are still buffered, the CPU still doesn't know when the computed frames will be displayed! The lag may also become a problem: with 3 frames buffered and 2 frames being the maximum time a GPU takes to render the scene, the latency could max out at 5 frames. At 60 Hz that is 5 × 16.7 ms ≈ 83 ms…

I found that on Vista it is possible, using DirectX 9 or 10, to call a function named WaitForVBlank that would solve my problem.
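
For reference, here is a rough sketch of how I understand the DXGI (DirectX 10) path would look; the swap chain is assumed to be created elsewhere. (With Direct3D 9Ex the equivalent would be IDirect3DDevice9Ex::WaitForVBlank.)

```cpp
// Sketch: block until the next vertical blank via DXGI (Vista).
#include <dxgi.h>

void waitForVBlank(IDXGISwapChain* swapChain) // swap chain created elsewhere
{
    IDXGIOutput* output = 0;
    if (SUCCEEDED(swapChain->GetContainingOutput(&output)))
    {
        output->WaitForVBlank(); // returns right after the next vblank
        output->Release();
    }
}
```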

Has anybody found a way to address this in Windows XP with DirectX or using OpenGL?

/Cubitus
 