The 256-bit bus on most GPUs is actually a collection of smaller buses, e.g. 4x64-bit or 8x32-bit.
What I'm expecting to see is that two GPUs on one package have a fat bus joining them (much like Xenos and its daughter die have a 32GB/s bus). The two separate memory systems (256/512MB per GPU) are then aggregated into a single memory address space.
I knew that (smaller buses), and it is what I figured. Could 4x64 become 8x64 using two of the same GPUs on the same package, with cross-usage of memory and load balancing done by the driver for the GPUs? I believe you may have answered it below in the positive (and through many subsequent answers to others).
In effect GPU A can request a texture from GPU B's memory, so the fetch request travels over to the relevant memory controller on GPU B which then obtains the data GPU A requires. All clients in both GPUs (TUs, RBEs, vertex fetch, etc.) see this single memory space and are able to use it freely.
Then you just need a decent driver that understands how best to assign memory to the clients on both GPUs, so that the chosen multi-GPU mode produces the most efficient usage of memory as well as the required performance.
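To make the single-address-space idea a bit more concrete, here is a minimal Python sketch of how a fetch might be routed to the owning memory controller. Everything in it is an assumption for illustration: the 512MB per GPU, the four channels per GPU, the 256-byte interleave, and the names `Route` and `route_fetch` are all hypothetical and not taken from any ATI documentation.

```python
from dataclasses import dataclass

# Assumed, illustrative sizes: two GPUs with 512 MB each, seen as one 1 GB space.
GPU_MEM_BYTES = 512 * 1024 * 1024
NUM_GPUS = 2
CHANNELS_PER_GPU = 4      # e.g. a 256-bit bus built from 4 x 64-bit channels
INTERLEAVE_BYTES = 256    # assumed granularity for striping across channels


@dataclass(frozen=True)
class Route:
    gpu: int       # GPU whose memory controller owns the address
    channel: int   # memory channel on that GPU
    remote: bool   # True if the request has to cross the inter-GPU bus


def route_fetch(requesting_gpu: int, address: int) -> Route:
    """Map an address in the aggregated space to the controller that serves it."""
    if not 0 <= address < GPU_MEM_BYTES * NUM_GPUS:
        raise ValueError("address outside the aggregated memory space")
    owner = address // GPU_MEM_BYTES            # which GPU's memory holds it
    local_offset = address % GPU_MEM_BYTES      # offset within that GPU's memory
    channel = (local_offset // INTERLEAVE_BYTES) % CHANNELS_PER_GPU
    return Route(gpu=owner, channel=channel, remote=(owner != requesting_gpu))
```

A texture unit on GPU 0 asking for an address in the upper half of the space would get a route with `remote=True`, which is the case where the fat inter-GPU bus matters. A real design could just as well interleave addresses across both GPUs rather than splitting the space in two halves; the sketch only shows the routing idea.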
This is what I wonder about: whether it's currently feasible, or if we'll have to wait a generation or two.
AFR seems like the prime candidate. Within AFR, though, it's possible to optimise the way textures are organised - e.g. classically textures are copied to both GPUs' memory. In theory the newer ATI GPUs don't need to be that wasteful.
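As a rough illustration of how wasteful the classic copy-everything approach is, here is a small Python sketch comparing the memory left over when textures are duplicated on each GPU versus stored once in a shared space. The function name and the 300MB texture figure are made up for the example, and it ignores everything else that AFR keeps per GPU (render targets and so on).

```python
def free_after_textures_mb(per_gpu_mb: int, num_gpus: int,
                           texture_mb: int, duplicated: bool) -> int:
    """Memory left across all GPUs once the texture set is resident.

    duplicated=True  -> classic AFR: every GPU holds its own copy.
    duplicated=False -> one copy in a shared address space (hypothetical).
    """
    total = per_gpu_mb * num_gpus
    used = texture_mb * num_gpus if duplicated else texture_mb
    if used > total:
        raise ValueError("texture set does not fit")
    return total - used


# Two 512 MB GPUs, 300 MB of textures (assumed figure).
classic = free_after_textures_mb(512, 2, 300, duplicated=True)    # 424
shared = free_after_textures_mb(512, 2, 300, duplicated=False)    # 724
```

Under these assumptions sharing frees up an extra 300MB, i.e. one whole copy of the texture set, at the price of some fetches going over the inter-GPU bus.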
I've got my fingers crossed that we'll see these kinds of efficiency gains, but the driver gods have been scowling upon these D3D10 GPUs and I see no sign of a let up.
Jawed
That, I find very interesting, and it would seem to make quite a bit of sense.
EDIT: I also did not know about MC addressing and the number of GPUs being irrelevant. Very good investigative reporting (and deductive reasoning) amigo.
Thanks a lot Jawed. Very interesting discussion about the possibilities as well as the pros of such a solution (yields, cost saved on packaging, a similar layout/heat profile to R600 could be used, etc.), many of which I figured would be the reason why such a product could, or rather should, exist. :smile:
BTW: the adjustable power to the PCI-E slots on 790 looks freakin' rad, and I agree the 512MB/1GB seems to imply R680 is indeed 2x670 Gladiator (in one form or another).
VR-ZONE gives me hope.
vr-zone said:
We even heard faintly that R680 could be AMD's ambitious plan to integrate two RV670 into a single die, if not on the same package.
Care to comment, CJ? You seemed to give them the rest of the scoop. I'm just curious how an R680 (2xRV670) could score 20k in 3DMark06, while a single Gladiator will reportedly do about 10.4k (edit: 11.4k). That's almost perfect 2x scaling, and while granted the CPU has to be taken into the mix, IIRC we don't see anything close to that in Crossfire, although 3DM may be the exception.
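For what it's worth, a quick sanity check on that scaling claim, using only the rumoured scores quoted above (20k for the dual part, 10.4k or 11.4k for a single one). The helper name `scaling` is just for this example, and the numbers are rumours, not measurements.

```python
def scaling(single_score: float, dual_score: float) -> tuple[float, float]:
    """Return (speedup over one GPU, efficiency relative to ideal 2x)."""
    speedup = dual_score / single_score
    return speedup, speedup / 2


# Rumoured 3DMark06 scores from the thread.
speedup_a, eff_a = scaling(10_400, 20_000)   # original single-GPU figure
speedup_b, eff_b = scaling(11_400, 20_000)   # edited single-GPU figure

print(f"vs 10.4k: {speedup_a:.2f}x, {eff_a:.0%} of ideal")
print(f"vs 11.4k: {speedup_b:.2f}x, {eff_b:.0%} of ideal")
```

So against the 10.4k figure it would be about 1.92x, and against the edited 11.4k figure about 1.75x, which is still high but a bit further from perfect 2x.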