Opinion: Silverthorne fails but PowerVR impresses (+Montalvo trouble)

This is where the Cortex presentation pulled the Dhrystone 2.1 figures from:
http://homepage.virgin.net/roy.longbottom/dhrystone results.htm
Cheers, that's interesting! :)

I would take the benchmarks with a grain of salt, unless some are willing to believe the 4 A9 cores will be faster than the Core 2 Duo.
And why not? Remember those scores are for a SINGLE core in the Core 2 Duo case and FOUR cores in the Cortex-A9 case. Also remember that the Cortex would benefit from having what's basically an Integrated Memory Controller and much lower relative clock speeds, so memory latency is less of a problem. This also makes more advanced forms of OoOE less necessary IMO...

Now, I do think that Dhrystone likely isn't incredibly representative of real-world performance, but if the question is whether a 4x1GHz Cortex-A9 should beat a 1x2.4GHz Conroe, my answer would be that it should. Remember that ILP extraction is a game of diminishing returns, especially in terms of integers, and that the x86 ISA certainly doesn't make things any easier for Intel...
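For what it's worth, the 4x1GHz versus 1x2.4GHz comparison can be put as a break-even number. This is only a rough sketch: it assumes the 2.3 Dhrystone/MHz estimate for the A9 that comes up in this thread, and it assumes perfect scaling across all four cores, which Dhrystone run as four independent copies might approach but real workloads generally won't. The function and variable names are made up for illustration.

```python
# Rough break-even sketch: how many Dhrystone/MHz would a single 2.4GHz core
# need to match four 1GHz Cortex-A9 cores?
# ASSUMPTIONS: 2.3 Dhrystone/MHz for the A9 (an estimate read off an ARM graph)
# and perfect scaling across cores.

A9_CORES = 4
A9_CLOCK_MHZ = 1000
A9_DMIPS_PER_MHZ = 2.3
CONROE_CLOCK_MHZ = 2400


def aggregate_dmips(cores, clock_mhz, dmips_per_mhz):
    """Total Dhrystone throughput, assuming every core scales perfectly."""
    return cores * clock_mhz * dmips_per_mhz


a9_total = aggregate_dmips(A9_CORES, A9_CLOCK_MHZ, A9_DMIPS_PER_MHZ)
break_even_per_mhz = a9_total / CONROE_CLOCK_MHZ

print(f"4x A9 aggregate: {a9_total:.0f} DMIPS")
print(f"Single 2.4GHz core needs {break_even_per_mhz:.2f} DMIPS/MHz to match")
```

Under those assumptions the single core would need roughly 3.8 Dhrystone/MHz to keep up, which is the figure to compare against whatever the linked results page reports for it.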
Here's Intel's numbers for Silverthorne against A8:
http://pc.watch.impress.co.jp/docs/2008/0402/kaigai432.htm
EEMBC Suite v1.1 (compared to ARM11 400MHz)
Cortex A8 600MHz: 3.3x
Cortex A8 1GHz: 5.4x
Intel Atom Z510 1.1GHz: 6.8x
Intel Atom Z530 1.6GHz/w HT: 13x
Once again that's very interesting, thanks. Hmm, let's scale the Z510 result to 1GHz, so it becomes 6.2 - now, if the A8 has a Dhrystone score of 2/MHz and the A9 has a score of 2.3/MHz, that would give us a performance result of... 6.2 again! So the ILP between Silverthorne (without IMC) and the Cortex A9 (with IMC) seems comparable.
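The scaling above can be reproduced with a few lines. This is a back-of-the-envelope sketch, not a real performance model: it assumes the EEMBC score scales linearly with clock, and that the A9/A8 gap on EEMBC would match the estimated Dhrystone/MHz ratio. The helper names are invented for the example.

```python
# Back-of-the-envelope comparison of Silverthorne and a projected Cortex-A9.
# ASSUMPTIONS: linear scaling of score with clock, and an A9/A8 ratio taken
# from estimated Dhrystone/MHz figures (2.3 vs 2.0).

# EEMBC Suite v1.1 scores relative to ARM11 at 400MHz, as quoted above.
A8_1GHZ_SCORE = 5.4
Z510_SCORE = 6.8
Z510_CLOCK_GHZ = 1.1

A8_DMIPS_PER_MHZ = 2.0
A9_DMIPS_PER_MHZ = 2.3


def scale_to_1ghz(score, clock_ghz):
    """Normalize a score to 1GHz, assuming it scales linearly with clock."""
    return score / clock_ghz


def project_a9(a8_score):
    """Project an A9 score from an A8 score using the Dhrystone/MHz ratio."""
    return a8_score * A9_DMIPS_PER_MHZ / A8_DMIPS_PER_MHZ


z510_at_1ghz = scale_to_1ghz(Z510_SCORE, Z510_CLOCK_GHZ)
a9_at_1ghz = project_a9(A8_1GHZ_SCORE)

print(f"Z510 scaled to 1GHz: {z510_at_1ghz:.2f}x")
print(f"Projected A9 at 1GHz: {a9_at_1ghz:.2f}x")
```

Both values round to 6.2x, which is where the "comparable" conclusion comes from. Mixing an EEMBC score with a Dhrystone ratio is obviously crude, so treat the match as suggestive at best.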

I would actually expect the A9 to have a 10-20% ILP advantage, so that's slightly more positive for Intel than I thought. Of course I wouldn't be surprised if, in real-world scenarios, the A9 was more than 15% faster than the A8... Once again though, perf/watt for Intel's core will be massively lower than that of A9 cores on 40nm SoCs.

Do you know how the IGP in Poulsbo performs? It also says there: "Intel told us to expect a 3DMark '05 score around the 150 point mark."
I will refrain from judging the SGX IP based on that number since it's on 130nm and yet still quite small, so it wouldn't exactly be fair... :)
EDIT: According to my calculations based on public pictures, Poulsbo is 146mm² and the '3D' part of it is 42.5mm² (once again, on 130nm).
EDIT2: Oh and this also isn't very impressive: http://pc.watch.impress.co.jp/docs/2008/0402/kaigai01_10.gif
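To put the die-area estimate from the first EDIT in perspective, here is a small sketch of the '3D' block's share of the chip and what it might look like on a newer node. The shrink part is purely hypothetical: it assumes ideal area scaling with the square of the feature size, which real designs never quite achieve, and the 65nm target is just an example.

```python
# Share of the Poulsbo die taken by the '3D' block, plus an idealized shrink.
# ASSUMPTIONS: the 146mm² and 42.5mm² figures are estimates from public die
# photos; the shrink assumes perfect (feature size)^2 area scaling.

POULSBO_DIE_MM2 = 146.0
GRAPHICS_BLOCK_MM2 = 42.5
SOURCE_NODE_NM = 130
TARGET_NODE_NM = 65  # hypothetical target node, for illustration only


def ideal_shrink(area_mm2, from_nm, to_nm):
    """Area after a shrink, assuming ideal quadratic scaling."""
    return area_mm2 * (to_nm / from_nm) ** 2


graphics_share = GRAPHICS_BLOCK_MM2 / POULSBO_DIE_MM2
graphics_at_target = ideal_shrink(GRAPHICS_BLOCK_MM2, SOURCE_NODE_NM, TARGET_NODE_NM)

print(f"'3D' block share of die: {graphics_share:.1%}")
print(f"'3D' block at {TARGET_NODE_NM}nm (ideal): {graphics_at_target:.1f}mm²")
```

So the '3D' part is a bit under a third of the die on 130nm, and an ideal shrink to 65nm would put it at around a quarter of its current area.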
 
Yes, that was also my conclusion after studying the documentation a bit... :) This presentation is interesting, especially Page 22: http://www.jp.arm.com/event/pdf/forum2007/t1-2.pdf

Based on that graph, I estimated the Cortex-A8 to have 2.0 Dhrystone/MHz while the Cortex-A9 has 2.3 Dhrystone/MHz. ARM11 MPCore delivers 'only' 1.2 Dhrystone/MHz. Heck, Page 23 is also really impressive assuming those numbers are real and not too creatively selected. I'd be very very interested in the same programs being run on, say, Yorkfield or Barcelona... (btw, I'm not saying Dhrystone is the most representative benchmark around, it likely isn't, but it's the only real datapoint we have sadly!)
Dhrystone is really not representative at all of real world performance, it's basically unaffected by branches and memory latencies. The difference in IPC between the A9 and the A8 in non well-behaved codes will be much larger than what you see there.

EDIT: Oh, and I'm also not convinced Cortex-A9 would be clocked substantially lower than the A8 for a given process; maybe 10% or so though, I wouldn't exclude that. Not that we'd know anyway since I don't think anyone will synthesize it for 65nm, TBH...
Well, let's put it this way, the A9 is likely to clock significantly lower than what the A8 was supposed to. But only slightly lower (or not at all) compared to real A8 silicon ;)
 
Dhrystone is really not representative at all of real world performance, it's basically unaffected by branches and memory latencies. The difference in IPC between the A9 and the A8 in non well-behaved codes will be much larger than what you see there.
Thanks for the tip - I didn't know it didn't even have any branching, ouch.
INKster: Hmmm, that *is* quite small, nice. Although once again footprint isn't what I'm worried about, what I am worried about is power.
 
INKster: Hmmm, that *is* quite small, nice. Although once again footprint isn't what I'm worried about, what I am worried about is power.


Seeing what Intel can do with mature 65nm process Core 2s (MacBook Air and Core 2 ULVs come to mind), I wouldn't dismiss a mature 45nm "Moorestown" just yet.
More competition can only be good for consumers, even if handheld x86 is still just an "interesting" proposition for now.

Of course, I also agree with you that ARM-based products "are" very deeply entrenched in the handheld market right now, so I assume that this battle will be similar to the one Intel is facing with Itanium in the big-iron server market.
Being able to run Linux is just not a huge differentiator anymore, and Windows is practically out of the question due to interface and power/performance issues.

Speaking of ARM, since Nvidia stated at the APX 2500 launch that they intend to be very aggressive with product launches, how would they handle the Cortex A9 architecture and integrate it (if at all) with their in-house designs based on ARM11?
This would put them yet again head-to-head against several "Intel's" all at once. ;)
 
Speaking of ARM, since Nvidia stated at the APX 2500 launch that they intend to be very aggressive with product launches, how would they handle the Cortex A9 architecture and integrate it (if at all) with their in-house designs based on ARM11?
Hmm? My understanding is that this is NVIDIA's handheld roadmap. Please note that this is NOT based on ANY information from NVIDIA except old presentations and thus should NOT be considered of ANY competitive value or of sufficient reliability to be taken as anything else than my own speculation:
- 3GSM 2008: 65nm ARM11 High-End
- 2H08: 65nm ARM11 Derivatives
- 3GSM 2009: 40nm Cortex A9 High-End
- 2H09: 40nm Cortex A9 Derivatives
- 3GSM 2010: 32nm High-End
- 2H10: 32nm Derivatives
- 3GSM 2011: 32nm Half-Node High-End
- 2H11: 32nm Half-Node Derivatives
This would put them yet again head-to-head against several "Intel's" all at once. ;)
Indeed, they are part of the couple of companies competing directly against Moorestown and its future derivatives.
 
Thanks for the tip - I didn't know it didn't even have any branching, ouch.
Actually that's not true, I should have put it another way: it has branches and other things which were supposed to be representative at the time (late '80s), but a modern compiler makes mincemeat of it. Besides, it has a minuscule footprint (thus no memory effects) and lots of compile-time constants (which lead the compiler to turn many parts into specialized code). Finally, ARM supports conditional execution which, coupled with the fact that the benchmark is also very small, can remove quite a few of the branches that survive the optimizations.
 
Web page "benchmark" details?

Well, if Intel wants it to be a benchmark, then it should be treated like one.
Does anybody know the browsers involved in the webpage render benchmarks?
And the amount of RAM on the devices used for the comparison?
Does anybody have a link to more info on it?
 
OT:

One day, one day, I'll figure out why so much of the best semiconductor technology is designed in the UK (CSR, PowerVR, Icera, PicoChip...)

Probably because they drink a lot of tea :D
 