The slide in question was found by Fudzilla, I believe, and it was released more recently than the HPC APU slide posted previously. At least the code names and general feature names have been corroborated so far.
http://www.fudzilla.com/news/processors/38402-amd-s-coherent-data-fabric-enables-100-gb-s
The slide had a Greenland GPU and 2 HBM2 stacks on an interposer. As drawn, it seemed to indicate that this interposer is put onto an MCM that mounts the 16-core Zeppelin CPU separately.
Multiple GMI links connect the CPU to the GPU over the MCM substrate.
Zeppelin may not be designed to use an interposer, and the GMI bandwidth is relatively modest compared to what can be achieved with an actual interposer. I'm not sure the cheaper variants have better odds than the one destined for HPC.
Thanks.
So it looks like Greenland will be self-contained on an interposer. My (optimistic) outlook would be that if Greenland is available for an HPC part, it could be made available for a consumer part without any re-engineering needed.
"4+" TF isn't a whole lot, but I suppose that this is going to be clocked for going on a package that already has 16 Zen cores on it, and that 100 GB/s should be enough both for CPU <-> GPU communication and for the GPU to plunder main memory bandwidth.
It seems plausible that the CPU's GMI and PCIe connectivity would use the same physical interface, although prior to the OpenCL driver strings people were using Vega interchangeably with Greenland.
A likely implementation for this would require that Greenland have a measurably wider PCIe/GMI bus than would be necessary for a consumer GPU, and with Vega listed separately it might not have that kind of capability.
I'm open to being pleasantly surprised, but it's possible that the engineering problems with the different memory types and CPU/GPU communication may not be addressed with Vega and consumer-space Zen. The HPC version may not be superior to the likely competition, but details are light.
So I suppose for a "consumer super APU" the likely best hope would be consumer Zen on an MCM with Greenland. Power limits allowing, that should give you a fast 8-core CPU with something around 470 8GB performance, but with a lot of power saved over the 256-bit GDDR5 setup.
I'll keep dreaming for now, but I think getting something clearly PS4 Neo beating in an ultra small form factor PC would be pretty cool.
The limitation would be the speed of the GMI link, so roughly 100GB/s based on some of the leaked docs a while back. That is possibly in addition to system memory bandwidth. It's hard to tell what the maximum rate is, as the configurations were all CPU to GPU and the links were only large enough to accommodate maximum system memory bandwidth. That's still a ton of bandwidth for a CPU to play with if it works. The problem is that it's likely not the CPU that needs the bandwidth. Most bandwidth-intensive tasks for a CPU using HSA would be better accelerated by the GPU portion.
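For a rough sense of scale, here is a back-of-the-envelope sketch of how that ~100GB/s figure compares with peak memory bandwidths. Only the 100GB/s number and the 256-bit GDDR5 bus come from the thread; the DDR4-2400 dual-channel setup, the 7 Gbps GDDR5 pin speed and the 2 Gbps HBM2 pin speed are my own assumptions picked for illustration, not anything from the slides.

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gtps: float) -> float:
    """Theoretical peak in GB/s: bus width in bytes times transfers per second (GT/s)."""
    return bus_width_bits / 8 * data_rate_gtps

GMI_LINK_GBS = 100.0  # rough figure quoted in the thread

# Every memory configuration below is an assumption for illustration only.
configs = {
    "dual-channel DDR4-2400 (assumed)": peak_bandwidth_gbs(128, 2.4),
    "256-bit GDDR5 @ 7 Gbps (assumed)": peak_bandwidth_gbs(256, 7.0),
    "2x HBM2 stacks @ 2 Gbps (assumed)": 2 * peak_bandwidth_gbs(1024, 2.0),
}

for name, bw in configs.items():
    print(f"{name}: {bw:.1f} GB/s, {bw / GMI_LINK_GBS:.2f}x the GMI figure")
```

These are theoretical peaks only; sustained rates would be lower, and the real clocks for any of these parts are unknown at this point.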
I suppose the key would be using the 8/16 GB of HBM2 wisely. Perhaps transferring in data ahead of time, or using it as an enormous cache?
Only one controller, using GMI links for the different chips, based on some leaked documents.
Excellent! *rubs hands together*
[QUOTE]Was there an agreement they had to use GF exclusively? If they can't produce Polaris fast enough to meet demand, meeting wafer requirements shouldn't be a huge concern. Article doesn't mention chip pricing either, so it's possible some of the costs are offset by cheaper production if there were yield/performance issues.[/QUOTE]
Hopefully AMD wouldn't have agreed to a contract whereby they had to use GF even if GF couldn't supply them with enough working dies....
AMD's biggest problem right at the moment seems to be that they can't make enough of Polaris 10. While performance is behind nVidia, they could still be selling well if they had the chips to sell, and if the 470 4GB weren't going for more than a 480 8GB was supposed to be.