Overclocked OMAP3

Lazy8s

Veteran
The enthusiast communities for the Motorola Droid/Milestone and the OpenPandora routinely clock the 65 nm Cortex-A8s in their respective OMAP3s to a stable 800-900 MHz, with some Droid users even pushing upwards of 1.3 GHz. Texas Instruments themselves offer higher-clocked variants, of course.

TI supposedly allotted about forty-five engineers for over two years to hand-tune critical parts of their A8 layout. Qualcomm went completely custom for the ARMv7 Scorpion, and Intrinsity has supposedly produced the most extensively optimized A8 for Hummingbird, but I wonder how much higher they can clock, and how much more efficiently they perform, relative to the surprising potential of TI's A8.
 
Wasn't Apple known to underclock the SoCs in their iPods/iPhones? I seem to remember something along those lines regarding the early iPod touch/iPhones.

But generally, I believe the chip clock in those devices may have more to do with battery draw and guaranteeing a reasonable lifespan for the device than with the practical 'stable overclock' limits of the silicon. What good is a (hypothetically) '2x stable overclock' if the device lasts, say, a year, and needs recharging twice as often?
 
Texas Instruments rates only a 550 MHz clock for their A8 in their mobile-targeted OMAP3430 and CE-targeted OMAP3530 yet Qualcomm rates their 65 nm Scorpion at 1 GHz, so the question becomes whether that difference is more attributable to Qualcomm potentially pushing closer to the limit of their core or more attributable to the architectural customizations Scorpion features over TI's A8 implementation.

If the former, Qualcomm might not have much of a competitive advantage nor a return on investment from their ARMv7 effort if TI decides to be similarly aggressive in clocking. A potentially interesting observation is that overclocking has been fairly unsuccessful on Snapdragon phones like the Nexus One.
 
Scorpion is not a customized Cortex-A8; ARM does not offer any avenue for customizing A8 beyond the configuration options detailed in the TRM. The only means to producing an ARMv7 CPU outside of the configurable parameters of Cortex-A8 is to design a new processor.

We don't know very many of the finer details of Scorpion's implementation. From a high-level standpoint I'm sure it's very similar to Cortex-A8 (in-order dual-issue), but it's also reasonably well known to have a longer pipeline, which alone makes a clock headroom comparison invalid.

Similarly, not knowing these details means we can't meaningfully compare the two CPUs on clock speed alone. For instance, we do know that Scorpion's advanced SIMD unit can perform 4 FP operations per cycle instead of 2. We also know that it has a pipelined VFPU.
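To put rough numbers on why clock speed alone is misleading, here is a minimal sketch of the peak FP arithmetic, using only the per-cycle figures and rated clocks quoted in this thread. The function name `peak_mflops` is just made up for illustration, and these are theoretical peaks, not measured results; real code will fall well short on either core.

```python
# Peak FP throughput = FP ops per cycle * clock rate.
# Inputs are the figures quoted in this thread; the helper name is hypothetical.

def peak_mflops(fp_ops_per_cycle, clock_mhz):
    """Theoretical peak in millions of FP operations per second."""
    return fp_ops_per_cycle * clock_mhz

a8 = peak_mflops(2, 550)          # TI's A8 at its rated 550 MHz, 2 FP ops/cycle
scorpion = peak_mflops(4, 1000)   # Scorpion at its rated 1 GHz, 4 FP ops/cycle

clock_ratio = 1000 / 550
throughput_ratio = scorpion / a8

print(f"clock ratio:   {clock_ratio:.2f}x")
print(f"peak FP ratio: {throughput_ratio:.2f}x")
```

The peak FP gap comes out at twice the clock gap, which is the whole point: the per-cycle difference matters as much as the MHz figure.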

It's possible that not all of the integer pipe stalls apply (although with a longer pipeline you'd expect the stalls to be worse overall). It could also lack limitations such as Cortex-A8's inability to perform initial 64-bit loads/stores in one cycle due to static scheduling and potential misalignment. For all we know it could have two load/store units, or at least allow simultaneous load/store, or have the ability to prefetch into L1 cache, etc. It could also have any number of disadvantages. We just don't know (but if someone has more insight please tell me, because I'd love to know more about the platform).
 
TI have just released data on their 1 GHz 45 nm AM3715 "Sitara" Cortex-A8 based SoC, quoting up to double the performance for the SGX at 20M polys/sec.

The VERY detailed datasheet:-
http://focus.ti.com/lit/ug/sprugn4/sprugn4.pdf

-ultimately reveals that the SGX530 core can be switched by the SoC to run at either 96 MHz or 192 MHz. Look at page 337 under the power, reset and clock management chapter.
 
As a follow-up to my previous posting, some here might find the following TI power usage figures for the 1 GHz-capable AM3715 interesting.
http://processors.wiki.ti.com/images/b/b9/AM37x_DM37x_PowerEstimationSpreadsheet_v1_00.zip

A discussion article for the above spreadsheet can be seen here:-
http://processors.wiki.ti.com/index.php/AM/DM37x_Power_Estimation_Spreadsheet

The spreadsheet allows core CPU speed & temperature to be changed to see the effect on power consumption.

Of particular interest are:-

SGX530 (@ 192 MHz & 70% utilisation) is quoted at 75 mW.

ARM A8 core + NEON (@ 600 MHz & 70% utilisation) is quoted at 254 mW.
ARM A8 core + NEON (@ 1 GHz & 70% utilisation) is quoted at 585 mW.

The entire chip is quoted at 597 mW (for the 600 MHz part), or 937 mW for the 1 GHz part.
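For anyone who wants to see how the A8 figures scale, here is a quick sketch that works out mW per MHz at both operating points. The last step is my own assumption, not something the spreadsheet states: it treats the power as purely dynamic (proportional to V^2 * f, leakage ignored) to back out a rough implied core voltage ratio.

```python
import math

# (clock in MHz, power in mW) for the A8 core + NEON at 70% utilisation,
# as quoted from the TI spreadsheet above.
low = (600, 254)
high = (1000, 585)

mw_per_mhz_low = low[1] / low[0]
mw_per_mhz_high = high[1] / high[0]

freq_ratio = high[0] / low[0]
power_ratio = high[1] / low[1]

# Assumption (not from the spreadsheet): P ~ C * V^2 * f with leakage ignored,
# so the voltage ratio is sqrt(power_ratio / freq_ratio).
implied_voltage_ratio = math.sqrt(power_ratio / freq_ratio)

print(f"600 MHz: {mw_per_mhz_low:.3f} mW/MHz")
print(f"1 GHz:   {mw_per_mhz_high:.3f} mW/MHz")
print(f"power up {power_ratio:.2f}x for {freq_ratio:.2f}x the clock")
print(f"implied voltage ratio (rough): {implied_voltage_ratio:.2f}x")
```

So the 1 GHz point costs noticeably more per MHz (roughly 0.59 vs 0.42 mW/MHz), which is what you would expect if the higher clock needs a higher core voltage.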
 
3630 also has the 192MHz SGX option.

It's in Droid X, for what it's worth.

The AM3715 datasheet reveals that the L1 cache of the A8 has been doubled over OMAP3, nice.

To me another impressive point of the spreadsheet is how little the DSP uses, especially if you don't use any of the video encode/decode blocks. It's quite a bit more power efficient per MHz than the Cortex-A8. I would be curious to know exactly what utilization means; the spreadsheet says 90-100% is "unrealistic", so I think it must mean more than just percentage of the time the CPU is on (maybe reflects how full the pipes are?)
 
Another interesting stat from that TI spreadsheet is that the leakage power for the ARM sub-system is between 34 mW and 50 mW (depending on whether you tell the spreadsheet you have a 300 MHz chip or a 1 GHz chip), and with everything set to off-mode except the ARM subsystem, the power taken by the entire chip is 90 mW.

Intel has repeatedly stated that the Moorestown platform (and they insist it's the entire platform) draws 25 mW in standby.

Unless I'm comparing apples with oranges, it looks like Intel has done extremely well with the standby power.
 
Intel's CMOS tech is the best there is, head and shoulders above the common garbage companies like TI have to work with. If their leakage is only a fraction of their competitors' (despite being spread out across several chips), that shouldn't be too surprising really... :)
 
I think you are misinterpreting the chart. The "Leakage" on Chart B is when the device is active. I believe this because when you take the Full chip power chart and set most blocks to STBY4, the power usage is radically different. Leakage shouldn't change because of frequency anyway.

EDIT: The STBY modes on the "Full chip" power chart should be compared with the 100 µW of the Lincroft SoC. Yes, it's true that Intel's numbers still look really good.
 