CD: Will a CPU backed by a GPU, passing stuff over a bus, satisfy the need for flexibility plus fixed function? And do you think moving that onto a single die will help, or will it still not be good enough and get left behind by more flexible designs?
*
TS: Well, you've touched on a key thing that's broken with CUDA, right? So you have a lot of high-level code that makes control flow decisions and high-level decisions that's running on the CPU, and then when it wants to instigate a vector computation it hands that work off to the GPU; the GPU can then be used to run it, and it hands the results back. So this workload is continually ping-ponging back and forth between the CPU and the GPU. The problem is there is a million clock cycles of latency between the two, and that's what's broken. What you need is at least the computing and the graphics side done on the same chip, so that the communications latency is minimal. Even preferable to that is a single unified core architecture which supports both scalar and vector computing, like Larrabee, so you can run all of this computation together without having to switch cores or switch caches or anything.
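
As a rough illustration of why that latency dominates, the sketch below models a workload that bounces between CPU and GPU and asks what fraction of its cycles go to the handoff itself. Only the one-million-cycle figure comes from the discussion above. Charging that latency once per exchange, the number of exchanges, the compute per exchange, the same-chip latency, and the function names are all assumptions chosen for illustration, not measurements.

```python
# Toy cost model for the CPU<->GPU ping-pong described above.
# Everything except the one-million-cycle latency is an illustrative assumption.

HANDOFF_LATENCY_CYCLES = 1_000_000  # figure quoted in the discussion


def total_cycles(handoffs, compute_cycles_per_handoff, latency_cycles):
    """Total cycles when every exchange pays the latency once plus its compute."""
    return handoffs * (latency_cycles + compute_cycles_per_handoff)


def latency_fraction(handoffs, compute_cycles_per_handoff, latency_cycles):
    """Fraction of total cycles spent waiting on handoffs rather than computing."""
    total = total_cycles(handoffs, compute_cycles_per_handoff, latency_cycles)
    return handoffs * latency_cycles / total


if __name__ == "__main__":
    handoffs = 100    # assumed number of CPU -> GPU -> CPU exchanges
    compute = 50_000  # assumed cycles of useful vector work per exchange
    scenarios = [
        ("separate chips", HANDOFF_LATENCY_CYCLES),
        ("same chip (assumed 1,000 cycles)", 1_000),
    ]
    for label, latency in scenarios:
        frac = latency_fraction(handoffs, compute, latency)
        print(f"{label}: {frac:.1%} of cycles spent on handoff latency")
```

Under these assumed numbers, nearly all of the time goes to the handoff when the chips are separate and only a small share when the latency is cut to the assumed same-chip value. The model ignores overlap between transfer and compute, so it is a worst-case sketch rather than a prediction.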
*
AR: I'd just like to say, I agree with Tim totally about this issue of latency between the CPU and GPU. I think that's a really, really, really serious issue. So I do think it makes sense to have the CPU and the GPU on the same chip, for exactly that reason: a million cycles of latency really messes things up.
*
CD: Does that fix the problem, or does that minimize the problem? Or does it just help a little?
*
AR: I think it makes a lot of difference if you can just say, 'Right, this bit of code here is making lots of decisions, so it should be on this core', and, 'This bit of code here is really wide vectors, so it should be on this core'. But, as Tim says, you need to be able to switch from one to the other very quickly, and that requires you to be on the same chip. So the interesting thing is that at the moment we have the CPU on one chip and the GPU on another chip. I think that division makes no sense. We should have CPU and GPU on one chip, and if you're going to have two chips, the second should be another CPU+GPU, because actually the CPU+CPU com... (Charlie cuts in)