How much is demand for sequential performance going to increase versus demand for parallelised performance? From the design criteria for Cell, the idea was to create a processor from scratch to handle the data processing that current 'sequential' processors aren't ideally suited for, and that makes sense to me. I think a lot of computer theory has evolved around the CPU architectures of the day, and it's good to step back and ask 'given the problems we'll have to tackle, what sort of processor would be best suited for them?' without worrying too much about existing programming theory or how current algorithms would map onto it. I just think it's a bad idea to give up sequential performance. Amdahl's law will f*ck over those that do.
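For anyone who hasn't run the numbers, here's a rough sketch of why Amdahl's law bites (plain Python, purely illustrative figures - the 90% parallel fraction is just an assumption for the example):

```python
# Amdahl's law: overall speedup is capped by the serial fraction of a workload,
# no matter how many cores you throw at the parallel part.
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup = 1 / ((1 - p) + p / N), where p is the parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

if __name__ == "__main__":
    # Even a 90%-parallel workload tops out below 10x, regardless of core count.
    for cores in (2, 8, 64, 1024):
        print(f"{cores:5d} cores -> {amdahl_speedup(0.9, cores):5.2f}x speedup")
```

That remaining 10% of serial work is exactly where a strong sequential core still earns its keep.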
At the moment I'm sure there are a lot of devs looking at the code they've written to date, thinking 'there's no way this algorithm is going to spread across multiple cores', and wanting one core that can really eke out the performance of its silicon. But parallelism across the board is in its infancy. We're already hearing about problems regarded as a bad fit for parallelism being reworked to fit very well, both on Cell and GPGPU. Personally, using my Tea Leaves of Infallibility, I think the way problems are tackled is going to shift towards parallelism, and the need for the serialised execution thread is going to remain limited. At the end of the day, perhaps the serial core in a multicore architecture won't need to progress far beyond something like an A64, and the focus for processor advancement will be on the supplementary cores and on memory interfacing, both between the cores and in trying to get past that damned slow main memory!
I'll also note that there'll likely remain different designs for server processors and home/workstation/console processors, due to the fundamentally different workloads.