"AMD's upcoming Vega GPUs are built using their 'Infinity Fabric' technology, which gives them modularity like the CCXs in Ryzen. So it's possible that in the future they come out with a multi-chip, software-transparent product. The only question would be whether they use classic MCM assembly or a bigger interposer."
Infinity Fabric itself doesn't mean everything, when the problem at hand is that GPUs have a far higher scale of interconnect bandwidth needs. "Modularity" doesn't help either, once you consider that GPU caches serve first and foremost to amplify bandwidth and exploit spatial locality under a predominantly data-streaming access pattern, plus the increasing reliance on device-scope atomics and coherency.
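Just to make the "device scope atomics" point concrete (sketched in CUDA because the syntax is compact; HIP and OpenCL have equivalent scoped atomics, IIRC): a plain atomic RMW is expected to be visible to every thread on the device, so if the "device" were really two or four dies, every conflicting update would have to stay coherent across the die-to-die links.

    // Minimal sketch, not real driver or hardware code: a device-scope atomic
    // histogram. atomicAdd must be coherent across the whole GPU, so in an
    // MCM design it would also have to be coherent across the chip-to-chip fabric.
    __global__ void histogram(const int* data, int n, int* bins)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            atomicAdd(&bins[data[i]], 1);   // device-scope read-modify-write
    }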
Imagine having 2x Vega 10 as four smaller chips. For each chip, in addition to its x1024 HBM2 interface, you would also need roughly three more links of that class (in SerDes or whatever) just to maintain the full channel-interleaving bandwidth expected of a monolithic GPU. Then also take into account the L2 cache and ROP access needs. And let's not forget the GPU control flow (wavefront dispatchers, command processors, and especially the fixed-function blocks).
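A back-of-the-envelope version of that argument, with my own assumed numbers (one x1024 HBM2 stack per chip at roughly 256 GB/s; the exact figures don't change the shape of the result):

    // Rough sketch of why channel interleaving across four chips implies
    // HBM2-class cross-chip links. All numbers are assumptions for illustration.
    #include <cstdio>

    int main()
    {
        const int    chips       = 4;
        const double stack_gbs   = 256.0;          // assumed per-chip HBM2 stack bandwidth
        const double local_share = 1.0 / chips;    // fraction of accesses hitting the local stack

        // Each chip pulls (1 - local_share) of its demand from the other three stacks,
        // and serves the same share of its own stack to the other three chips.
        const double inbound_gbs  = stack_gbs * (1.0 - local_share);
        const double outbound_gbs = stack_gbs * (1.0 - local_share);

        std::printf("per chip: ~%.0f GB/s in + ~%.0f GB/s out of cross-chip traffic\n",
                    inbound_gbs, outbound_gbs);
        std::printf("provisioning each of the three neighbour links against hot spots\n"
                    "pushes every link towards full HBM2-stack bandwidth.\n");
        return 0;
    }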
As always, I am not saying it is impossible. But apparently "MCM or interposer" is not the only question.
"I think it's not realistic to expect that the hardware looks to the _driver_ like one GPU, but you can certainly just publish two graphics queues to the run-times (the DX12 and Vulkan runtimes, not games) and require no other adjustments at all (shared memory address space and so on, maybe e.g. a shared MMU instead of multi-MMU coherency or such)."
This makes little difference from explicit multi-GPU, though. A shared memory address space is already a thing IIRC (via the host).
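For reference, this is roughly what "already a thing (via the host)" looks like with today's explicit multi-GPU runtimes. The CUDA calls are shown because I know them off-hand; the ROCm/HIP side has equivalents as far as I remember. The host runtime stitches both GPUs into one virtual address space, but software still sees and schedules two devices:

    // Sketch only: unified addressing plus peer access across two GPUs.
    #include <cuda_runtime.h>
    #include <cstdio>

    int main()
    {
        int can_access = 0;
        cudaDeviceCanAccessPeer(&can_access, /*device*/ 0, /*peerDevice*/ 1);
        if (!can_access) { std::printf("no P2P path between GPU 0 and GPU 1\n"); return 0; }

        cudaSetDevice(1);
        float* buf_on_gpu1 = nullptr;
        cudaMalloc(&buf_on_gpu1, 1 << 20);     // allocation lives in GPU 1's memory

        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);      // GPU 0 may now dereference GPU 1's pointers
        // A kernel launched on GPU 0 can read or write buf_on_gpu1 directly:
        // one shared virtual address space, yet the application still sees
        // two GPUs and has to distribute work across them explicitly.

        cudaSetDevice(1);
        cudaFree(buf_on_gpu1);
        return 0;
    }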
"I agree with you, but I remember someone from AMD saying something along the lines of multi-chip being the way to go in the future. Without much thought I assumed that, with Vega having Infinity Fabric, this was at least the first step towards such a future."
Multi-chip is the way, but the tone was set for heterogeneous SoC integration in the first place. Their exascale proposal uses multiple GPUs per package, but that's because of the model they pursue for that particular project (in-memory computing).
"Why else would AMD choose to build Vega with Infinity Fabric...? But there's no guarantee it would be software transparent."
Cache coherency between multiple GPUs, Zen hosts and perhaps OpenCAPI appears to be an incentive though.
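To be clear about what that coherency would buy: with a coherent fabric between the host and the GPU(s), both sides can update the same memory with system-scope atomics instead of staging copies through the driver. A rough sketch of the programming model only, using CUDA's managed memory and atomicAdd_system as a stand-in (it needs hardware and driver support for concurrent access, and the launch sizes here are just for illustration):

    // Sketch: a counter shared coherently between a CPU thread and a GPU kernel.
    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void gpu_increment(int* counter, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            atomicAdd_system(counter, 1);    // system scope: visible to the CPU as well
    }

    int main()
    {
        int* counter = nullptr;
        cudaMallocManaged(&counter, sizeof(int));   // one allocation, mapped for CPU and GPU
        *counter = 0;

        gpu_increment<<<4, 256>>>(counter, 1024);
        __sync_fetch_and_add(counter, 1);           // CPU-side atomic on the same word (GCC builtin)
        cudaDeviceSynchronize();

        std::printf("counter = %d\n", *counter);    // 1025 if both sides stayed coherent
        cudaFree(counter);
        return 0;
    }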