For starters, you'll be lucky to get a bit more than half of that 32 GB/s on 128-bit DDR3. Second, that has to be shared with the CPU. I'm quite sure it'll be cheaper, and with far less overhead, to just go with x MB of RAM + 2-4x MB of VRAM, or just one huge shared memory pool.
If those devkit specs are even remotely true, then I'm almost certain that the final box will only have an APU without a discrete GPU, and it'll likely be using GDDR5 instead of DDR3 in a single unified memory pool.
Well, that just doesn't answer my question. No disrespect or anything, but look at the 360: the device got away with 22 GB/s of bandwidth to the main RAM where the textures were stored. That bandwidth was shared by the GPU and the CPU, and was also used for moving data from eDRAM to main RAM.
If I look at the HD6570, it uses DDR3 @ 900MHz, which works out to DDR3-1800 (the naming is so confusing...); that's 28.8 GB/s of bandwidth according to AMD's own data.
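To make the arithmetic behind that figure explicit, here is a rough sketch of the usual peak-bandwidth formula (bus width times transfer rate). The function name and the 128-bit bus width are my assumptions for illustration; the 128-bit figure is taken from the bus width mentioned earlier in the thread, not from a spec sheet I'm quoting.

```python
def peak_bandwidth_gb_s(bus_width_bits, memory_clock_mhz, transfers_per_clock=2):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes).

    DDR memory moves data twice per clock, hence transfers_per_clock=2.
    """
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# Assumed: 128-bit bus, 900 MHz DDR3 (1800 MT/s effective).
hd6570_ddr3 = peak_bandwidth_gb_s(128, 900)
print(f"HD6570 DDR3 peak: {hd6570_ddr3:.1f} GB/s")
```

By the same formula, a 32 GB/s figure on a 128-bit bus would need a 1000 MHz memory clock (2000 MT/s). These are theoretical peaks only; sustained bandwidth is lower.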
Using hardware.fr data, the CPU in a peak situation will reach ~20 GB/s while reading and ~16 GB/s while writing. Those are peak figures for reading and writing, so I would assume that 20 GB/s is the absolute limit for the CPU's bandwidth usage.
That's ~9 GB/s left for textures in the worst-case scenario. I still have no idea if that would be a bottleneck.
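The worst-case budget above is just a subtraction, sketched here so the numbers are easy to play with. I'm treating all figures as GB/s, since the ~9 left over only makes sense against the 28.8 GB/s total; the constant and function names are made up for illustration.

```python
TOTAL_GB_S = 28.8           # peak DDR3 bandwidth quoted above
CPU_PEAK_READ_GB_S = 20.0   # approximate CPU peak read figure quoted above
CPU_PEAK_WRITE_GB_S = 16.0  # approximate CPU peak write figure quoted above


def leftover_for_gpu(total_gb_s, cpu_usage_gb_s):
    """Bandwidth left for the GPU once the CPU has taken its share."""
    return max(total_gb_s - cpu_usage_gb_s, 0.0)


# Worst case: the CPU is pulling its highest peak figure the whole time.
cpu_worst = max(CPU_PEAK_READ_GB_S, CPU_PEAK_WRITE_GB_S)
worst_case = leftover_for_gpu(TOTAL_GB_S, cpu_worst)
print(f"Left for textures, worst case: {worst_case:.1f} GB/s")
```

In practice the CPU would rarely sit at its peak, so this is a pessimistic floor, not a typical value.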
Anyway, in Llano the same bandwidth feeds the CPU, the texture reads and the frame buffer operations. I'm close to asserting that texture reads from another GPU would not be a problem.
I don't expect this to work in a dual-graphics fashion, i.e. alternate frame rendering. I expect both GPUs to work on different tiles. Ultimately the "definitive" frame buffer would be in VRAM or RAM. You may not want the link to be a bottleneck when you move part of your frame buffer, but having more bandwidth on the link than to main RAM may be overkill.
If the CPU sends data to the GPU, it can be bursty, I don't know. PCI Express x16 may be enough, but more... is always tempting.
-------------------------------
How would you go about getting a unified memory pool?
A while ago, speaking of the next Xbox and not the 360, I thought that the second GPU would be "shader cores only", with all the ROPs on the APU. I don't know if it's doable: the Xenos ROPs were simplistic, whereas the HD6000 ROPs are tied to the L2, which is tied to the memory controller... that means the L2 would be off-chip, etc.
I don't know either if it's doable for the APU's GPU to render straight into VRAM.
It sounds like a complicated proposal to me: each GPU is likely to render straight into the memory pool it is connected to. So having a unified memory space sounds really complicated.
Another thing is that both GPUs will want to read textures. Would you want to waste memory and keep duplicates of the same texture in RAM and VRAM?
For the kind of resolutions that kind of card can push, I feel like 512 MB is really enough; more would be overkill. Let's consider this a tiny pool of fast memory intended only for frame buffer operations.
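As a sanity check on the 512 MB feeling, here is a back-of-the-envelope frame buffer size estimate. Everything in it is an assumption on my part (1080p, 4 bytes of colour plus 4 bytes of depth/stencil per sample, 4x MSAA); none of these settings come from the devkit rumours.

```python
def framebuffer_bytes(width, height, bytes_per_sample, samples=1, buffers=1):
    """Raw size in bytes of `buffers` render targets, ignoring padding and compression."""
    return width * height * bytes_per_sample * samples * buffers


# Hypothetical setup: 1920x1080, colour (4 B) + depth/stencil (4 B), 4x MSAA.
msaa_target = framebuffer_bytes(1920, 1080, bytes_per_sample=8, samples=4)
pool_bytes = 512 * 1024 * 1024

print(f"4x MSAA 1080p target: {msaa_target / 2**20:.1f} MiB")
print(f"Share of a 512 MB pool: {100 * msaa_target / pool_bytes:.1f}%")
```

Even this fairly heavy target takes a small fraction of the pool, which fits the idea of a small fast pool kept for frame buffer work only. Extra render targets for deferred shading or shadow maps would add to it.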