So what is the IPC for a single Jaguar core? I read it's less than 1!
Also, for comparison purposes, I would like to know the IPC for a single Sandy Bridge Core i7 core and a Bulldozer core.
I remember reading an IPC comparison article/benchmark years ago, but I can't find it anymore (it might have been from Realworldtech or Anandtech). If I remember correctly, in the general purpose integer test Bulldozer's average IPC was 1.1, Sandy Bridge's was 1.7, and Bobcat's was 0.8. According to the Jaguar (Kabini A4-5000) benchmarks, a 1.5 GHz Jaguar beats a 1.6 GHz Bobcat by 22% (23.46% IPC increase). Extrapolating from the Bobcat IPC score, Jaguar's average IPC should be very close to 1.0 in that general purpose integer test. However, the average IPC of each architecture might be completely different in another test case. If I remember correctly, Sandy Bridge's IPC in a mixed SIMD+integer code test was 2.9 (in the same benchmark article). Unfortunately I can't remember how the other chips performed in that test (and if I remember correctly they also used hyperthreading in that test to fill the Sandy Bridge core better, which raises the IPC).
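For anyone who wants to check the extrapolation, here is a rough sketch of the arithmetic. The Bobcat IPC of 0.8 is a figure recalled from memory, not a measurement, and the helper name `ipc_ratio` is just made up for illustration. The second calculation is an assumption about how one could normalize the 22% result by clock speed; it lands on a somewhat larger gain than the quoted 23.46%, so the quoted figure may come from different inputs. Either way the estimate ends up close to 1.0.

```python
def ipc_ratio(perf_ratio, clock_new_ghz, clock_old_ghz):
    """Clock-normalized IPC ratio: performance ratio divided by clock ratio."""
    return perf_ratio / (clock_new_ghz / clock_old_ghz)

BOBCAT_IPC = 0.8          # recalled from memory in the post, not a measured value
QUOTED_IPC_GAIN = 0.2346  # the 23.46% IPC increase quoted in the post

# Extrapolation as described in the post: scale Bobcat's IPC by the quoted gain.
jaguar_ipc_quoted = BOBCAT_IPC * (1 + QUOTED_IPC_GAIN)

# Alternative (assumption): derive the gain from the 22% benchmark lead
# of a 1.5 GHz Jaguar over a 1.6 GHz Bobcat.
derived_ratio = ipc_ratio(1.22, 1.5, 1.6)
jaguar_ipc_derived = BOBCAT_IPC * derived_ratio

print(f"Jaguar IPC from quoted gain:  {jaguar_ipc_quoted:.2f}")
print(f"Clock-normalized IPC ratio:   {derived_ratio:.3f}")
print(f"Jaguar IPC from derived gain: {jaguar_ipc_derived:.2f}")
```

The same normalization applies to any cross-chip comparison: a benchmark lead only says something about IPC once the clock difference is divided out.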
An IPC of 1.0 is actually quite good if you compare Jaguar to the chips it's going to replace and compete against. In-order PPC CPUs (current gen consoles) have an average IPC of around 0.2, Atom has an average IPC of around 0.5, and Bobcat around 0.8. In recent benchmarks a 1.5 GHz Jaguar beat a 1.7 GHz Cortex A15, indicating it has higher IPC than the top of the line ARM CPU. I don't think IPC is a problem for Jaguar.
Jaguar badly needs dynamic CPU clocking (turbo). Intel radically improved their dynamic clocking for Ivy Bridge, and thus the 17W parts can clock up to 3.0 GHz (single threaded tasks / boost burst performance). Haswell improved this further by shortening the idle<->alive transition time to around 1 ms. ARM of course has focused on dynamic clocking / turning off chip parts since day one (as mobile/integrated chips are their main business area).
One thing I do not understand in AMD's Kabini/Temash SoC configurations is the GPU. It has only 2 CUs. They could have included 4 CUs instead, clocked them at half the frequency, and had the same theoretical throughput at a lower TDP. Or even better... they could have created a dynamic GPU clocking system similar to Intel's, and had both lower power consumption in normal usage and much higher performance on demand. 17W Ivy Bridge parts have a 350 MHz nominal GPU frequency, and turbo up to 1200 MHz (about a 3.4x boost). This is something AMD needs badly if they want to conquer the tablet/ultraportable market.
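To illustrate the wide-and-slow argument, here is a back-of-the-envelope sketch using the usual first-order model where dynamic power scales with voltage squared times frequency. Everything is normalized, and the 0.8 relative voltage at half clock is a purely hypothetical number picked for illustration, not a Kabini spec. Leakage power and the extra die area of 4 CUs are ignored, so treat the result as a direction rather than a prediction.

```python
def throughput(cus, clock):
    """Relative shader throughput: number of CUs times clock."""
    return cus * clock

def dynamic_power(cus, clock, voltage):
    """First-order dynamic power model: P ~ N * V^2 * f (normalized units)."""
    return cus * voltage ** 2 * clock

# Narrow and fast: 2 CUs at full clock and full voltage.
narrow_perf = throughput(2, 1.0)
narrow_power = dynamic_power(2, 1.0, 1.0)

# Wide and slow: 4 CUs at half clock.
# The 0.8 relative voltage is a hypothetical value, assumed for illustration.
wide_perf = throughput(4, 0.5)
wide_power = dynamic_power(4, 0.5, 0.8)

power_ratio = wide_power / narrow_power

print(f"Throughput, 2 CUs full clock: {narrow_perf:.2f}")
print(f"Throughput, 4 CUs half clock: {wide_perf:.2f}")
print(f"Dynamic power ratio (wide / narrow): {power_ratio:.2f}")
```

Note that without a voltage drop the two configurations draw the same dynamic power in this model, so the saving depends entirely on how far the voltage can be lowered at the reduced clock.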
The perf/Watt improvement over Brazos is astounding.
Yeah, it even surpasses P4->Core2.