Okay. I guess an APU+GPU combo would see the APU as a heavy-lifting processor via GPGPU. If the APU is running graphics in parallel with the GPU, I don't see the advantage versus a monolithic GPU.
I've been thinking about that lately. What if the AMD CPU is paired with some combination of GCN CUs, but without the ROPs, texture units, and anything else a full GPU needs? They're calling it an APU, but it's really just a stripped-down CPU with no FP unit of its own, using GCN SIMDs as a vector unit for all the FP processing.
Perhaps taking that approach would simplify the programming model. In theory, all your GPU and vectorizable CPU code could look the same, and if there's a large, coherent L3 cache shared between the CPU and GPU, you could move your code between the two easily.
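A GCN compute unit consists of four 16-wide vector SIMDs, or 64 lanes in total. If you paired one 500 MHz CU with each core (assuming 8 cores) and counted a fused multiply-add as 2 flops per lane per clock, you could get a theoretical peak of 512 gigaflops from the CPU (8 CUs x 64 lanes x 2 flops x 0.5 GHz).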
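
For anyone who wants to check the 512 gigaflops figure or plug in other clocks, here's a quick back-of-the-envelope sketch. It assumes one CU per core and that every lane retires one fused multiply-add (counted as 2 flops) per clock, which is the only way I can make the number come out; the function and parameter names are just made up for illustration.

```python
def peak_gflops(cus, simds_per_cu=4, lanes_per_simd=16,
                flops_per_lane_per_clock=2, clock_ghz=0.5):
    """Theoretical single-precision peak, in gigaflops.

    Assumptions (not confirmed specs): one FMA per lane per clock,
    counted as 2 flops, and every lane busy every cycle.
    """
    lanes = cus * simds_per_cu * lanes_per_simd  # 64 lanes per GCN CU
    return lanes * flops_per_lane_per_clock * clock_ghz


# 8 cores, each paired with one 500 MHz CU
print(peak_gflops(cus=8))  # 512.0
```

Keep in mind this is a peak number only. Real code won't keep all 64 lanes of every CU fed on every cycle.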