I think they went wrong when they decided to make it x86. They are starting a brand new design that will have a lot of baggage on day one...
How much baggage are we talking about, really? The first x86 CPU with most of this baggage was the 386, which had a measly 275,000 transistors. Sure, a modern 64-bit SMT architecture needs more, but if Intel was able to keep x86 afloat in the 386 days, when there was still plenty of competition from other so-called "superior" architectures, how bad can it really be?
x86 isn't particularly elegant, but I really wonder where the idea comes from that it has too much baggage or is even crippled. Every ISA has its curiosities, but unless you are a compiler back-end writer, they shouldn't affect you much, if at all.
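To make "curiosities" concrete, here's a toy sketch of my own (not anything Intel has published): most x86 ALU instructions are destructive two-operand forms, so a compiler occasionally has to emit an extra register copy that a three-operand RISC wouldn't need. The assembly in the comments is illustrative, not exact compiler output:

    int madd(int a, int b, int c)
    {
        /* x86 (two-operand, destructive):      three-operand RISC:
         *   mov  eax, edi   ; copy needed        mul r3, r1, r2
         *   imul eax, esi   ; eax *= b           add r3, r3, r0
         *   add  eax, edx   ; eax += c
         */
        return a * b + c;
    }

An annoyance for the register allocator, sure, but hardly "crippled".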
If you were to design a massively parallel multi-purpose CPU, what ISA do you believe would do significantly better than x86? Give me numbers...
...and by exposing x86 directly they are forced to maintain binary compatibility for all future generations, making the problem worse over time as new features are added, whereas ATI and NVIDIA can redesign their stuff from the ground up with essentially no backward compatibility at the hardware level.
As cores become more complex and ever more programmable, I doubt that the choice of x86 really makes things worse. People will always want more features, and unless there's some revolutionary hardware or design breakthrough I doubt you'll be able to do more with less. Heck, NVIDIA and ATI still have to invest a lot more transistors to be able to do everything Larrabee will be capable of on day one.
At the same time, I understand why they made that decision. Larrabee is motivated mostly by business reasons rather than technical ones. GPGPU is eating into their most profitable business, and they need to fight back. The quickest thing they can do is reuse existing technology. If they were to design a better architecture from scratch, they would lose another year or two, which they can't afford.
And you think that for ATI and NVIDIA it's really a good idea to ditch the ISA every generation and start over from scratch? I can already hear the cheers of joy from the code generation teams and everyone else who has to inspect assembly or binaries from time to time... I don't know what the future will bring, but it does look to me like, with G80 and CUDA, NVIDIA opted for high compatibility between generations. It will be interesting to see ATI's approach.
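For what it's worth, the CUDA compatibility story rests on PTX, NVIDIA's virtual ISA: nvcc emits PTX, and the driver JIT-compiles it for whatever GPU generation is actually installed, so the hardware ISA underneath is free to change. A minimal sketch using the CUDA driver API (error handling omitted; kernel_ptx and my_kernel are placeholder names of mine):

    #include <cuda.h>  /* CUDA driver API */

    /* PTX as plain text; in practice emitted by nvcc. The driver
     * JIT-compiles it for the installed GPU, which is what keeps
     * today's binaries running on future hardware generations. */
    extern const char *kernel_ptx;

    void load_kernel(void)
    {
        CUdevice   dev;
        CUcontext  ctx;
        CUmodule   mod;
        CUfunction fn;

        cuInit(0);
        cuDeviceGet(&dev, 0);
        cuCtxCreate(&ctx, 0, dev);
        cuModuleLoadData(&mod, kernel_ptx);        /* JIT happens here */
        cuModuleGetFunction(&fn, mod, "my_kernel");
        /* ... set parameters and launch ... */
    }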
Anyway, note that Intel will be able to introduce new Larrabee products at the same pace as its CPUs. Having existing x86 tools, and knowing that software written today will run unmodified on future generations, also helps keep up that pace. Not to mention compatibility with systems that don't have a Larrabee card but do have a powerful multi-core x86 CPU...
The software side is clearly becoming ever more important, so it doesn't necessarily hurt to settle on an extensible ISA even if it has its flaws. Intel has already proven that poor choices made in the past can be corrected to a large degree.