How on earth does any of that actually work? My only guess is that everything is done on the CPU and the hardware is just a fancy PCIe bridge chip. I can't believe it actually touches the command stream on its way from the GPU driver to the GPU, since that would mean reverse engineering ATI's and NV's drivers and re-implementing them. However, the article claims the chip does some of the splitting and load balancing, which would mean data goes their driver -> their chip -> their driver -> GPU driver -> their chip as bridge -> GPU.
...
Oh, and how about transparency? Or off-screen buffers used for shadow maps and reflections? Or post-processing?
Perhaps their driver and chip keep track of resource allocation and the shader command stream, and cooperate to construct a dependency graph, in a fashion similar to what the Larrabee paper said was done to bin the work.
Chunks of rendering that don't share resources within the same frame can be routed to separate cards.
Buffers built on one card that persist between frames could be handled by the driver inserting an export from one card and a load on the next.
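Just to make that concrete, here's a toy sketch of the kind of bookkeeping I'm imagining. Everything in it is hypothetical (Chunk, schedule_frame, the round-robin balancing are my inventions, not anything the article describes):

    # Hypothetical sketch of the speculated scheme: build a dependency
    # graph of render "chunks" from the resources they touch, route
    # independent components to different GPUs, and insert copies for
    # buffers that persist across frames on a different card.
    from collections import defaultdict
    from dataclasses import dataclass, field

    @dataclass
    class Chunk:
        name: str
        reads: set = field(default_factory=set)   # resources read (textures, buffers)
        writes: set = field(default_factory=set)  # resources written

    def connected_components(chunks):
        """Group chunks that share any resource within the same frame."""
        parent = list(range(len(chunks)))  # union-find over chunk indices
        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i
        def union(i, j):
            parent[find(i)] = find(j)
        users = defaultdict(list)  # resource -> indices of chunks touching it
        for i, c in enumerate(chunks):
            for r in c.reads | c.writes:
                users[r].append(i)
        for indices in users.values():
            for i in indices[1:]:
                union(indices[0], i)
        groups = defaultdict(list)
        for i in range(len(chunks)):
            groups[find(i)].append(chunks[i])
        return list(groups.values())

    def schedule_frame(chunks, num_gpus, residency):
        """Assign each independent component to a GPU; emit a copy when a
        chunk reads a resource that a previous frame left on another GPU."""
        plan = []
        for idx, group in enumerate(connected_components(chunks)):
            gpu = idx % num_gpus  # naive round-robin load balancing
            for c in group:
                for r in c.reads:
                    if r in residency and residency[r] != gpu:
                        # The export-from-one-card / load-on-the-next step.
                        plan.append(("copy", r, residency[r], gpu))
                        residency[r] = gpu
                plan.append(("run", c.name, gpu))
                for r in c.writes:
                    residency[r] = gpu
        return plan

    residency = {}  # resource -> GPU holding the current copy
    # Frame 1: the shadow pass feeds the main pass; the UI pass is independent.
    frame1 = [
        Chunk("shadow", writes={"shadow_map"}),
        Chunk("main", reads={"shadow_map"}, writes={"backbuffer"}),
        Chunk("ui", writes={"overlay"}),
    ]
    # Frame 2: the shadow map persists, but "main" lands on the other card,
    # so the scheduler emits a cross-card copy for it.
    frame2 = [
        Chunk("ui", writes={"overlay"}),
        Chunk("main", reads={"shadow_map"}, writes={"backbuffer"}),
    ]
    for frame in (frame1, frame2):
        for op in schedule_frame(frame, num_gpus=2, residency=residency):
            print(op)

The "copy" ops are the export/load pairs from above; a real implementation would presumably balance on measured load rather than round-robin, and would derive the graph by snooping API calls rather than having chunks declared up front.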
It seems that such a system could hit a snag if the dependency graph doesn't split all that well, though.