Hyp-X said:
Well I can think of 3 possible explanations:
1.) Designers of PCIe lied about its bandwidth (fairly unlikely).
2.) Crossfire cannot perform a DMA transfer between the two cards.
That's probably a limitation of either the MB chipset or the graphics chip.
This might have been solved by either moving the data to main memory and back to the other card, or by a CPU-assisted memory transfer. Either way it's slower.
3.) Crossfire cannot hide the latency of the frame transfer.
Normally it should start the rendering of the next frame while the previous one is being composited.
If I had to guess, it's a combination of 2 and 3.
Can you maybe find out if these are the real issues holding back the implementation?
I've done many tests around SLI AA and SuperAA. Unfortunately I haven't had enough time to write much about them.
CrossFire and SLI can't hide the latency of the PCIe transfer. Effective rendering time = rendering time + transfer time.
CrossFire does a DMA transfer. SLI doesn't, so it uses HT and main memory bandwidth, which also increases the latency.
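To make the difference between a hidden and an unhidden transfer concrete, here is a minimal sketch. The function names and the millisecond figures are hypothetical and are not measurements. The first function is the additive model stated above; the second is what an ideal implementation would get if it rendered the next frame while the previous one was being transferred.

```python
def serial_frame_time(render_ms, transfer_ms):
    """Transfer latency is not hidden: the two costs simply add up."""
    return render_ms + transfer_ms


def overlapped_frame_time(render_ms, transfer_ms):
    """Ideal case: the transfer of one frame overlaps the rendering of the
    next, so the slower of the two stages sets the pace."""
    return max(render_ms, transfer_ms)


# Hypothetical figures, for illustration only.
render_ms, transfer_ms = 10.0, 5.0
print(serial_frame_time(render_ms, transfer_ms))      # 15.0
print(overlapped_frame_time(render_ms, transfer_ms))  # 10.0
```

With these made-up numbers the unhidden transfer costs 50% more time per frame, and the penalty grows as the transfer time approaches the rendering time.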
Single cycle rendering:
Nf4 SLI, 6800 Ultra, AA4x : 3260 MPix/s
Nf4 SLI, 7800 GTX, AA4x : 3252 MPix/s
Xpress200 CE, X850 XT, AA4x : 2022 MPix/s
Nf4 SLI, 6800 Ultra SLI AA 8x : 33 MPix/s (-> 126 MB/s)
Nf4 SLI, 7800 GTX SLI AA 8x : 234 MPix/s (-> 893 MB/s)
Nf4 SLI 32, 6800 Ultra SLI AA 8x : 34 MPix/s (-> 130 MB/s)
Nf4 SLI 32, 7800 GTX SLI AA 8x : 286 MPix/s (-> 1091 MB/s)
Xpress200 CE, X850 XT, CF Super AA 10x (= SLI AA 8x) : 128 MPix/s (-> 488 MB/s)
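The bandwidth figures in parentheses can be reproduced from the pixel rates with a short sketch. It assumes 4 bytes per pixel (32-bit colour) and binary megabytes (2^20 bytes). Both are assumptions inferred from the numbers rather than stated anywhere, and the function name is hypothetical.

```python
BYTES_PER_PIXEL = 4      # assumption: 32-bit colour buffer
BYTES_PER_MB = 1 << 20   # assumption: binary megabytes (1,048,576 bytes)


def mpix_to_mb_per_s(mpix_per_s):
    """Convert a fill rate in MPix/s to a transfer rate in megabytes/s."""
    return mpix_per_s * 1_000_000 * BYTES_PER_PIXEL / BYTES_PER_MB


# Pixel rates taken from the SLI AA / Super AA results above.
for rate in (33, 34, 128, 234, 286):
    print(rate, "MPix/s ->", round(mpix_to_mb_per_s(rate)), "megabytes/s")
```

Under those two assumptions the five rates come out as 126, 130, 488, 893 and 1091, matching the listed values.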
Bandwidth when reading data back from a graphics board:
Nf4 SLI, 6800 Ultra : 862 MB/s (that's strangely high for a bridged solution)
Nf4 SLI, 7800 GTX : 929 MB/s
Xpress200 CE, X850 XT : 505 MB/s