OpenGL guy said: You want full programmability? Use a CPU then. There's a reason why CPUs don't compete with GPUs (or whatever your favorite term is). If you go to full programmability, how does a GPU keep its advantage?
Because a GPU is a stream-based SIMD vector processor and a CPU isn't, that's how. Full programmability (i.e. Turing completeness) can be achieved without losing the GPU's advantage of stream processing and highly parallel vectorized code.
CPUs are optimized mainly for scalar computation, and the vast majority of code that executes on them (OS, applications, databases, etc.) is scalar in nature. This code is also "non-pure" in that it mutates the environment and manipulates external state, which puts up all kinds of roadblocks to optimization. In addition, GPUs have a well-defined memory access pattern due to their streaming nature, making life much easier for the memory controller. CPUs have to deal with heavy indirection and random access.
For the most part, GPUs execute "pure functional" code (see functional programming languages like Haskell). They cannot store state between executions (except in multipass), and such programs take well-defined input and return well-defined output. There is also no aliasing. This allows a compiler to do aggressive liveness analysis, redundant code elimination, and equational reasoning. It also allows the hardware to parallelize execution heavily, since interdependencies can be determined exactly.
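To make the purity point concrete, here is a rough CPU-side sketch in Python (standing in for a real shading language). The kernel `shade` and the `run_*` helpers are hypothetical names made up for illustration, not anything from an actual GPU API. Since the kernel reads only its own input and writes only its own output, the scheduler is free to evaluate elements in any order, or concurrently, and the result stays the same.

```python
from concurrent.futures import ThreadPoolExecutor


def shade(fragment):
    """Hypothetical pure kernel: the output depends only on the input element."""
    x, y = fragment
    return (x * 0.5 + 0.25, y * y)


def run_sequential(kernel, stream):
    # Evaluate elements one by one, in stream order.
    return [kernel(item) for item in stream]


def run_reversed(kernel, stream):
    # Evaluate elements back to front, then restore stream order.
    out = [kernel(item) for item in reversed(stream)]
    out.reverse()
    return out


def run_parallel(kernel, stream, workers=4):
    # Evaluate elements on a thread pool; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(kernel, stream))


stream = [(float(i), float(i + 1)) for i in range(8)]
same = (
    run_sequential(shade, stream)
    == run_reversed(shade, stream)
    == run_parallel(shade, stream)
)
print(same)
```

If `shade` mutated a shared counter or read a neighbor's output, the three schedules could disagree, which is exactly the kind of dependency the hardware would otherwise have to guard against.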
It is certainly possible to come up with a stream-based model of computation that is nearly universal. Once you factor in multipass, where the output of one pass feeds the next, universality is assured.
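A rough Python sketch of the multipass idea, under the assumption that each pass is a pure kernel with no loops of its own and that the host simply feeds one pass's output stream into the next. The names `gcd_step` and `multipass` are hypothetical, and the fixed-point check on the host is just one possible way to decide when to stop. The point is that a data-dependent loop (Euclid's algorithm here) can be unrolled across passes while every individual pass stays stateless.

```python
def gcd_step(pair):
    """One bounded step of Euclid's algorithm: pure, no loop, no stored state."""
    a, b = pair
    return (a, b) if b == 0 else (b, a % b)


def multipass(kernel, stream, max_passes=1000):
    """Re-run the kernel over the whole stream until nothing changes.

    All state lives in the stream that is handed from pass to pass;
    the kernel itself remembers nothing between executions.
    """
    for _ in range(max_passes):
        next_stream = [kernel(item) for item in stream]
        if next_stream == stream:  # fixed point: every element is finished
            return stream
        stream = next_stream
    raise RuntimeError("stream did not converge within max_passes")


pairs = [(48, 18), (17, 5), (100, 75)]
gcds = [a for a, _ in multipass(gcd_step, pairs)]
print(gcds)
```

Elements that finish early just pass through unchanged in later passes, which wastes some work but keeps every pass a uniform map over the stream.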