Bring back high performance single core CPUs already!

Xenon at 22nm would probably need only a ~30 mm2 die, with the external I/O setting that floor.
That is very small. 22nm is about 4 times smaller linearly (about 16 times in area) than the original 90nm Xenon, so ~6.4 GHz could be possible, and it might even run cool.
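
As a rough check of the "4 times linear, 16 times square" figure, here is a minimal sketch. It assumes ideal scaling straight from the node names, which real processes rarely deliver, and it ignores I/O pads, which do not shrink this way.

```python
# Idealized shrink from the original 90nm Xenon to a 22nm process.
# Assumption: dimensions scale exactly with the node name.
OLD_NODE_NM = 90
NEW_NODE_NM = 22

linear_shrink = OLD_NODE_NM / NEW_NODE_NM  # how much smaller each dimension gets
area_shrink = linear_shrink ** 2           # area scales with the square of that

print(f"linear: {linear_shrink:.2f}x, area: {area_shrink:.1f}x")
```

So "4 times" and "16 times" are slight round-downs of the ideal figures.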

With 6 processors and 4 MB of L3 cache it might be around 80 mm2,
with 430 GFlops and good multitasking.

By entertainment I mean a high-definition HTPC with more flexibility.
 
For gaming and entertainment it would be fast and cheap.
Probably something around 230 GFlops peak :)

It would actually be more like ~153.6 GFLOPs. MS inflated the paper specs of Xenon by adding the VMX units' throughput to that of the floating point units in the core itself, which, if I'm not mistaken, cannot run in parallel. Xenon's real max theoretical FP throughput is 76.8 GFLOPs.

Also bear in mind they would likely have to redesign the core to reach such high speeds, so it might not end up being as small or as cool as you'd expect.

And as I say, it would still be slower than a modest quad Sandy or Ivy Bridge for both games and media.

For example, a quad Sandy Bridge at 2.6 GHz has a peak theoretical FP throughput of 166.4 GFLOPs. And for non-vector code I'd expect the SB to be around twice as fast in heavily threaded workloads, and much faster in lightly threaded or non-threaded workloads.
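
The peak figures in this post follow from cores x clock x FLOPs per cycle. Here is a minimal sketch of that arithmetic. The per-cycle values (8 for a Xenon core, 16 for a Sandy Bridge core) are assumptions chosen because they reproduce the numbers quoted above; they are not taken from the thread.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: every core issues its maximum FLOPs every cycle."""
    return cores * clock_ghz * flops_per_cycle

# Assumed per-core, per-cycle single-precision throughput (illustrative):
XENON_FLOPS_PER_CYCLE = 8    # one 4-wide VMX fused multiply-add
SANDY_FLOPS_PER_CYCLE = 16   # one 8-wide AVX add plus one 8-wide AVX multiply

xenon_stock = peak_gflops(3, 3.2, XENON_FLOPS_PER_CYCLE)  # Xenon as shipped
xenon_fast = peak_gflops(3, 6.4, XENON_FLOPS_PER_CYCLE)   # hypothetical 6.4 GHz part
sandy_quad = peak_gflops(4, 2.6, SANDY_FLOPS_PER_CYCLE)   # quad Sandy Bridge

print(f"Xenon @ 3.2 GHz:        {xenon_stock:.1f} GFLOPs")
print(f"Xenon @ 6.4 GHz:        {xenon_fast:.1f} GFLOPs")
print(f"Sandy Bridge @ 2.6 GHz: {sandy_quad:.1f} GFLOPs")
```

These are paper numbers only; they say nothing about how often real code keeps the units busy.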
 
Thanks for the info pjbliverpool
But the Sandy Bridge die size is around 294 mm2 at 32nm.
The Ivy Bridge has 1.4 B transistors in 160 mm2.

Based on the data above, the Xenon would have a die size of (guesstimation):

~20 mm2 = (160 mm2 x 165 M transistors) / (1.4 B transistors)
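
The same guesstimate as a small sketch, taking Ivy Bridge as 1.4 billion transistors in 160 mm2. It assumes Xenon's transistors would pack at Ivy Bridge's average density, which is a simplification (cache, logic and I/O have very different densities). The function name is made up for illustration.

```python
def scaled_area_mm2(ref_area_mm2: float, ref_transistors: float,
                    target_transistors: float) -> float:
    """Scale a reference die area by transistor count at constant density."""
    return ref_area_mm2 * target_transistors / ref_transistors

IVY_BRIDGE_AREA_MM2 = 160.0
IVY_BRIDGE_TRANSISTORS = 1.4e9
XENON_TRANSISTORS = 165e6

xenon_area = scaled_area_mm2(IVY_BRIDGE_AREA_MM2, IVY_BRIDGE_TRANSISTORS,
                             XENON_TRANSISTORS)
print(f"Estimated Xenon area at Ivy Bridge density: {xenon_area:.1f} mm2")
```

This lands a little under 19 mm2, which the estimate above rounds to 20 mm2.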

Probably a re-design would be needed not only for the core but the architecture itself.

I am thinking about something small like Xenon with ~200 M transistors (~30 mm2 at 20nm), cheap (less than $100) and fast at FP :)

Endless possibilities like HTPC, gaming, netbooks at low cost.
 
Endless possibilities like HTPC, gaming, netbooks at low cost.
Not many recent consumer operating systems or games run on the PPC instruction set. Even Apple is using x86 now. So you would need to install some ancient Mac OS on it, or Linux/UNIX of course.

Additionally, you would need separate binaries to get good performance out of the in-order architecture. Basically you need to unroll loops and inline functions very aggressively, and code around various stalls. In-order CPUs are not that great for general purpose code. You need to optimize specifically for them, or you easily lose half of your performance (or more).

If x86 support is not critical for you, and you want low power consumption, why not use an ARM-based architecture instead?
 
AFAIK current ARM processors do not reach FP performance in the range of ~50 GFlops per processor.

And maybe a simple OOE version of Xenon could be developed.
Just don't try to optimize everything.

Also add some power saving techniques.

Again, a full redesign could be needed, but in the end the result could be very interesting.
 
I wonder what the next consoles will do for CPU choice. How much pain will they cause game programmers this round? ;)
 
I wonder what the next consoles will do for CPU choice. How much pain will they cause game programmers this round? ;)

Yeah... I was kinda hoping they would go OoOE this round with a few powerful cores, so that less time is wasted trying to work around design flaws and more time is spent on ideas and gameplay. A dual core Ivy Bridge (the CPU side) or some customized version of it clocked at 4+ GHz would be pretty nice to have IMO. Haswell would be ideal, but it's a little late for that I think.
 
And maybe a simple OOE version of Xenon could be developed.
Just don't try to optimize everything.

Also add some power saving techniques.

Again, a full redesign could be needed, but in the end the result could be very interesting.
I am not sure what you are aiming to achieve, best possible single core performance or best possible throughput?

If you are looking for a new version of Xenon, with simple in-order cores and a heavy focus on having as high a thread count as possible (6 threads back then was 3x higher than any x86 CPU could achieve), IBM has already done that. That chip is called PowerPC A2. It has 16 cores and each of the cores can execute 4 threads simultaneously. In total that's 64 threads. Getting the best performance out of it would likely require similar optimization techniques as for Xenon (however this should be slightly better, as there are 4 threads per core now instead of 2 threads per core, so more TLP can be automatically exploited). A throughput monster can surely be created that way (but it requires software to be designed to run with 64 threads to get good performance out of it).

Since this thread title is "Bring back high performance single core CPUs already!", the highly threaded PowerPC A2 route seems to be the complete opposite. If you want good single threaded performance, you have to exploit ILP (instruction level parallelism) from a single thread as well as possible. Simple in-order cores are not good at that. You basically have to try to handle the ILP extraction statically at compile time (Intel tried to go that route with Itanium, and we all know how that ended up). So basically if you want to have the best possible single thread performance, you have to go the way Intel did with its newest designs (Core/Sandy/Ivy). Spend a huge number of transistors on parts other than pure execution units. Have deep OOO execution, exploit ILP as much as possible, fight against all stall cases to keep the pipelines occupied, focus on buffering/caches and memory latencies, etc. This is basically the opposite of simple in-order execution. Focus on having a clever core that is utilizing all its execution resources as much as possible (compared to a brute force core that stalls often and does nothing).

Intel's recent CPUs seem to be constrained by heat, not by clock rate. Turbo clocks are near 4 GHz, and the pipeline stages are still working just fine. A single core version could likely run at 5 GHz all the time. It wouldn't break any power efficiency records (as power usage grows very quickly as you increase clocks), but it would certainly beat everything in single threaded performance. If they wanted to go down that road, they could widen the core a bit (more execution units to handle ILP peaks) and just scale things up, as transistor budget wouldn't be anywhere near a limiting factor anymore. But again all this would decrease power efficiency, as reaching beyond the sweet spot (in many areas) would only increase the performance slightly, but increase the transistor count (and heat production) dramatically. The question really becomes: is there a large demand for a single core that has 2x the performance (or likely less) of the current high end cores, but eats as much power as four current high end cores (halving the throughput at equal TDP)?
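
The "halving the throughput at equal TDP" remark can be put in numbers with a back-of-the-envelope sketch. Everything is normalized to one current high end core, and it assumes the workload scales perfectly across four cores, which is the best case for the quad.

```python
# Normalized units: one current high end core = 1.0 performance, 1.0 power.
CURRENT_CORE_PERF = 1.0
CURRENT_CORE_POWER = 1.0

# The hypothetical big single core, using the figures from the post above.
BIG_CORE_PERF = 2.0    # "2x performance (or likely less)", so an upper bound
BIG_CORE_POWER = 4.0   # "as much power as four current high end cores"

# A quad of current cores at the same power budget.
# Assumption: perfect scaling across all four cores.
quad_perf = 4 * CURRENT_CORE_PERF
quad_power = 4 * CURRENT_CORE_POWER

throughput_ratio = BIG_CORE_PERF / quad_perf
print(f"Power budget: big core {BIG_CORE_POWER}, quad {quad_power}")
print(f"Big core throughput relative to the quad: {throughput_ratio:.2f}")
```

For a single-threaded workload the comparison flips, since the quad then delivers only one core's worth of performance.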
 
A dual core Ivy Bridge (the CPU side) or some customized version of it clocked at 4+ GHz would be pretty nice to have IMO. Haswell would be ideal, but it's a little late for that I think.
Way too expensive too. An Intel chip would account for half of the console's hardware cost, if not more, simply because Intel wouldn't want to sell their CPUs to a console manufacturer cheaper than what they could get for them through regular channels.
 
Way too expensive too. An Intel chip would account for half of the console's hardware cost, if not more, simply because Intel wouldn't want to sell their CPUs to a console manufacturer cheaper than what they could get for them through regular channels.

Well, there is opportunity cost. Having "Intel Inside" stamped on 100 million consoles playing games that port directly to the PC platform could be swanky for someone wanting to sell to both camps...
 
I am not sure what you are aiming to achieve, best possible single core performance or best possible throughput?
Something in the middle: SMP-style programming with a 30 W TDP.

Sorry to hijack this thread. I will open a new one in the future and lay out the arguments better there.
 
Way too expensive too. An Intel chip would account for half of the console's hardware cost, if not more, simply because Intel wouldn't want to sell their CPUs to a console manufacturer cheaper than what they could get for them through regular channels.

Yeah, I know it seems pretty unlikely, but it would be nice. Microsoft did say they were considering both x86 and ARM for Xbox720, though. I suspect they would go with AMD cores even if they did go x86.
 
A dual core Ivy Bridge (the CPU side) or some customized version of it clocked at 4+ GHz would be pretty nice to have IMO. Haswell would be ideal, but it's a little late for that I think.

Yep, even a dual core Haswell would be fairly decent for a next gen console. At, say, 3.2 GHz it would pack as much floating point throughput as Cell and at least twice the general processing performance of Xenon.

Plus without an IGP it would be absolutely tiny! And operate at very low power.

For real grunt, though, the consoles would want a quad core version. It would still be very small and pretty low power if the IGP was stripped out. Or, possibly even better, leave the IGP in and use it as a dedicated physics processor?

Half a TFLOP dedicated to physics sounds pretty good to me!
 