We might disagree on what is straightforward or not, but GPU core complexity is going up, not down.
Considering that GPU "cores" are crap cores that (IMHO) don't fully count as cores in the first place, that's not saying much.
If we're going to rate things in the general-purpose core continuum, the shader clusters have to be compared against real general-purpose cores like Penryn.
Do you really expect the next major architectural shift from NVIDIA and AMD to not have more complex cores? It will eventually happen that more and more fixed-function units will be absorbed by programmable cores, and complexity will go up just to accommodate extra features.
If the fixed function is absorbed into the cores directly, it's just another unit.
I can put a swiss army knife in my pocket, but I'm not any more complex than I was prior.
If the designers don't want to constrain the generality of the design, more work will be done using simple operations to synthesize complex operations. That lends itself to simpler hardware.
The stated desire to eliminate a lot of the peculiarities of data structures, fixed pipelines, data formats, and specialized storage in many ways makes the hardware's job simpler.
If it's all "fetch from generic memory, execute generic operation, store to generic memory", then we're going back to what Von Neumann did.
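To illustrate that point, here's a minimal sketch in plain C (the function names and the float-texel layout are my own assumptions, not anyone's actual shader code): bilinear texture filtering, normally a fixed-function texture unit's job, synthesized from nothing but generic loads, compares, and multiply-adds.

```c
/* Sketch: a "fixed function" operation -- bilinear texture filtering --
 * built from generic memory fetches and ordinary arithmetic. No special
 * data formats, no dedicated filtering hardware: just fetch / compute. */
#include <math.h>

/* texels stored as plain floats in ordinary memory, row-major */
static float fetch(const float *tex, int w, int h, int x, int y)
{
    /* clamp-to-edge addressing, done with ordinary compares */
    if (x < 0) x = 0; if (x >= w) x = w - 1;
    if (y < 0) y = 0; if (y >= h) y = h - 1;
    return tex[y * w + x];
}

/* bilinear sample at floating-point texel coordinates (u, v) */
float bilinear(const float *tex, int w, int h, float u, float v)
{
    int   x0 = (int)floorf(u), y0 = (int)floorf(v);
    float fx = u - x0,         fy = v - y0;

    float t00 = fetch(tex, w, h, x0,     y0);
    float t10 = fetch(tex, w, h, x0 + 1, y0);
    float t01 = fetch(tex, w, h, x0,     y0 + 1);
    float t11 = fetch(tex, w, h, x0 + 1, y0 + 1);

    /* two lerps in x, one in y -- a handful of multiply-adds */
    float top    = t00 + fx * (t10 - t00);
    float bottom = t01 + fx * (t11 - t01);
    return top + fy * (bottom - top);
}
```

Slower per sample than dedicated filtering hardware, no doubt, but there is nothing in it that a generic fetch/execute/store pipeline can't express.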
From what we know, LRB won't be out for another 12-18 months, and in that time frame I expect NVIDIA to offer something more similar to LRB than to G8x (x86 aside). Just because NVIDIA isn't talking about G300 doesn't mean they aren't on exactly the same route. Do you remember what they were saying about unified shading? I do.
It would be quite a gamble to copy Intel when not even Intel knows how well its own gamble will pay off once it reaches final silicon.
While you may be right, this is an aspect where I expect Intel to shine, especially with regard to NVIDIA. Since no one expects Intel to initially be as good as NVIDIA at graphics, I don't see why we should expect NVIDIA to be as good as Intel at designing such general-purpose cores.
In my opinion, general-purpose cores are being overrated as far as graphics is concerned. A shader array isn't a massive failure if it can't run string comparison instructions or transition between software privilege levels.
The P54 is well over a decade old and nowhere near cutting-edge as a CPU.
I don't see anyone having a problem matching it within the confines of consumer graphics.
The cores themselves are not as important as what connects and coordinates them.
I'm reserving judgement until I see the physical implementation of Larrabee, coming as it does from a manufacturer with a history of cheaper but in many ways uncompetitive and overweight x86 cores (of which Larrabee is a direct descendant), so far uninspired many-core integration, and nascent massively parallel software.
Those parts are untried; fetch/decode/execute is not.