Screw all those deep pipelines ... just give me a dual-issue, in-order x86 core. One issue slot for x86 instructions (or rather macro-ops) and non-arithmetic SIMD instructions, and one for SIMD arithmetic (MMX/3DNow! only, forget about SSE). Give me as many of those as you can fit on a chip with, say, 16/64 KB of L1/L2 cache per core and a shared L3 cache of, say, 512-1024 KB. Oh, and support 3+ thread contexts in hardware (it won't take a lot of space; this type of processor doesn't need huge register files ... less space needed for renaming and for multiple accesses per cycle).
Of course I have said that before, but now people on comp.arch are saying it too ... so maybe AMD/VIA/Transmeta will stop being such pussies and actually do this instead of waiting for Intel to set the trend. It is going to happen sooner or later. There is a niche for these kinds of processors, and if the Cell patents are anything to go by, Sony/IBM's design will leave plenty of opportunity for x86 to live in that niche too.