When ATI initially introduced the X1000 series the Crossfire support for X1300 was always due to be based on two standard cards with the data transfer occurring over the PCI Express bus, however X1600 was intended to be based on the same master/slave configuration present at the high end. Sometime after the release ATI decided on a switch such that X1600 now can operate in Crossfire mode with two standard boards and using the same data transference method as X1300 - this is likely because of the increased costs of the compositing solution wouldn't be beneficial given the performance X1600 operate at.
Although we haven't given it a full test here we can see that the Crossfire gains for X1600 in with this solution can yield fairly tangible results, with performance gains up to nearly 70% in Splinter Cell, at least. However in other games its clear that it would benefit from increased bandwidth between the two boards, which is widely expected to come with ATI's next chipset. Of course, that raises other issues with Crossfire such as it being platform specific and to take benefits of it most users will likely need a new mainboard, pushing the costs up for these entry and mainstream level solutions.
That's from the rv5xx review:
http://www.beyond3d.com/reviews/ati/rv5xx/index.php?p=23
The "increased bandwidth between the two boards" Dave talked about is obviously what RD580 brings.
The dongle does NOT communicate.
The compositing engine merely "sticks" the parts of the frame rendered by the slave card together with the parts rendered by the master.
All other communication (i.e. vertex data, instructions, etc.) is passed over the PCIe bus. The frames (or parts thereof) are sent through the dongle on the X1800 and X1900 because of the limited bandwidth left available after inter-card communication.
The X1300 and X1600 simply did not require it to function properly but would obviously benefit from more available bandwidth, i.e. they're reaching their required bandwidth and probably run into some kind of Crossfire QoS.
I've already hypothesized that, should a full 16x16 product become available, it might just be marginally "fat" enough to secure proper transmission of inter-card communication on high-end cards.
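To put some rough numbers on that hunch, here's a back-of-the-envelope sketch. The only hard figure in it is the PCIe 1.x rate of roughly 250 MB/s per lane in each direction; the resolution, the frame rate and the idea that half of the finished pixels have to cross the bus are my own assumptions, and it deliberately ignores the vertex/instruction traffic, which I can't put a number on.

```python
# Back-of-the-envelope only. Resolution, frame rate and the "half of the
# finished pixels cross the bus" split are assumptions, not measurements.
PCIE1_MB_PER_LANE = 250  # PCIe 1.x: roughly 250 MB/s per lane, per direction


def link_bandwidth_mb(lanes):
    """One-way bandwidth of a PCIe 1.x link in MB/s."""
    return lanes * PCIE1_MB_PER_LANE


def frame_traffic_mb(width, height, fps, bytes_per_pixel=4, share=0.5):
    """MB/s needed to ship `share` of the rendered pixels to the other card."""
    return width * height * bytes_per_pixel * fps * share / 1e6


def bus_share(width, height, fps, lanes):
    """Fraction of the link eaten by finished-frame traffic alone."""
    return frame_traffic_mb(width, height, fps) / link_bandwidth_mb(lanes)


if __name__ == "__main__":
    for lanes in (8, 16):
        pct = 100 * bus_share(1600, 1200, 60, lanes)
        print(f"x{lanes}: {pct:.1f}% of the link used by frame data")
```

Under those assumptions the frame data by itself takes a noticeably smaller slice of an x16 link than of an x8 one, which is all I mean by a 16x16 board being "fat" enough; whatever is left still has to carry everything else.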
However, for compatibility reasons, even X1950s will need to work in older Xpress 200 systems, and thus I think we'll see the "internal" dongle available for those as well. Maybe someone will put it on existing models too, but that's more ATI's problem than the manufacturers'.
Of course, all this is sucked right out of my big thumb... but it is what I am thinking.
So, nVidia used the internal SLI bridge-type connector, while ATI stuck to a more Voodoo (Monster 3D)-esque approach.