Qualcomm SoC & ARMv8 custom core discussions

Do you know what the emergency temperature limits tend to be set to for devices like this? Where it'll just shut down? It could be getting pretty close...
I've heard of devices/chipsets that actually let it go quite a bit higher. Seems dangerous with the battery and all.

SoCs usually shut down at around 105°C on the silicon. I'll ask Josh to see if he can re-check the CPU sensors when running the test. Usually a 55°C skin temp means the insides are above 60 or even in the 70s, and the silicon must be in the 90s. This varies from vendor to vendor on which sensor they base their "internal" temp on and what kind of policy they use to shut things down. To be honest, the only device we have seen actually shut down due to heat was the Mate 7. Any normal and reasonable thermal driver will prevent that case from ever happening unless you leave your phone to bake in the sun.
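
If anyone wants to see where a given device draws those lines, the Linux thermal framework exposes each sensor's trip points through sysfs; here's a rough sketch of dumping them (standard thermal_zone paths, but sensor names, units and the exact trips are vendor-specific, so treat it as illustrative only):

[code]
# Rough sketch: list thermal zones and their trip points via sysfs.
# Standard Linux thermal framework paths; vendor kernels (Qualcomm
# included) name sensors differently and may report raw or
# millidegree-C values, so interpret the output accordingly.
import glob, os

for zone in sorted(glob.glob("/sys/class/thermal/thermal_zone*")):
    ztype = open(os.path.join(zone, "type")).read().strip()
    temp = open(os.path.join(zone, "temp")).read().strip()
    print(f"{ztype}: {temp}")
    for trip in sorted(glob.glob(os.path.join(zone, "trip_point_*_type"))):
        kind = open(trip).read().strip()            # passive / hot / critical
        tval = open(trip.replace("_type", "_temp")).read().strip()
        print(f"  {kind} trip at {tval}")
[/code]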
 

Seems that HTC fixed the high temps for the M9 via a firmware update; before it, the GPU clocks were not being throttled at all:

http://tweakers.net/nieuws/102040/htc-verhelpt-hitteprobleem-one-m9-met.html

Temps are down to ~40°C, but GFXBench scores suffer....

PS: a CMOS process is usually qualified up to 125°C, but IMHO you really don't want a mobile SoC running at high temps. We are not talking about an actively cooled (= fan) processor in a huge enclosure here. From earlier jobs I remember that a certain Finnish cell phone maker usually demanded a maximum Tamb of 40°C, and with the poor Rthermal of a cellphone housing (plastic? thin metal casing?) that doesn't give you much to play with.
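
To put rough numbers on how little headroom that leaves (the 4W and 15K/W below are made-up illustrative values, not measured ones), the usual junction temperature estimate is

T_j = T_{amb} + P \cdot R_{\theta ja} \approx 40\,^{\circ}\mathrm{C} + 4\,\mathrm{W} \times 15\,\mathrm{K/W} = 100\,^{\circ}\mathrm{C}

which is already brushing against the ~105°C shutdown point mentioned above, before the sun or a pocket gets involved.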
 
Anandtech's review of the M9 is a lot harsher than anything else I've seen so far, but the M9 really feels like a sidegrade to the M8 instead of a successor.
I guess the effects of all those executives bailing out of HTC back in 2013 and constant criticism of Peter Chou's (lack of) vision are starting to show.
 
Not many other sites tried to investigate the issues. Notice the PCMark scores versus the 5433 Note 4; that is something which is just unacceptably bad.
No, they didn't.
But in the end, for a smartphone, what does 25% more PCMarks really mean, and how much should that weigh in the general impression of a smartphone?
We know PCMark is a whole lot more important than e.g. AnTuTu (which all the non-tech-savvy websites seem to love), but in the end, how much do they really differ in how they represent end-user experience when dealing with >1.8GHz CPUs on smartphones?


I'm a very big critic of the M9 myself. I think the device will be HTC's doom, but not because the S810 is a disappointment or the screen lacks in color accuracy.
I think the device is a terrible flagship because HTC decided to keep almost the exact same design for 3 consecutive years. They're too slow at adopting changes from customer feedback.
The black bar saying "HTC" should have been gone a year ago in the M8, or if it had to stay, they should've kept the capacitive buttons and saved screen real estate. Speaker design should've been adapted from HTC's own contemporary designs like the HTC Eye or the Nexus 9. They can make good-sounding speakers look a lot better, yet they chose to go with the same design as the 2-year-old M7. They claim there's no space to include OIS in the camera, but they have the thickest flagship around. Screen-to-body ratio is the worst amongst all the flagships, and they can't even use IP67 certification as an excuse.

While the general industrial design doesn't look bad, their ability to cram things together remains at the same level as typical 2nd/3rd-rate Chinese manufacturers like ThL, Jiayu, etc.


BTW, if anything, the S810 being built on 20nm may just have a manufacturing price equivalent to the S801 while offering the performance+features of the S805 + modem. That alone could be a valid reason for the upgrade from the smartphone manufacturers' point of view..
My idea is that Qualcomm isn't really strong on making/marketing tablet chips. I think they originally thought of the S805 as the tablet contemporary to S801, but it got delayed (or they just didn't get much interest from their clients).
 

Did anybody try to come up with a sensible explanation for why the S810 is such a "disaster" for Qualcomm in terms of execution? Apple's A8 and the S810 are both using TSMC 20nm if I'm not mistaken, so it can't be process or yields. What went wrong here? Did Qualcomm use ready-to-use ARM A57 macro cells for time to market? Did Apple's custom layout for its ARMv8 cores pay off? Is it Apple's wider datapath in the CPUs, with less dependency on clock speed, so that ARM/Qualcomm simply has to hit higher clock speeds to reach decent performance?
 
Why is the S810 a disaster compared to Exynos 5433?
While they both use essentially the same CPU clusters and same process, the S810 has an integrated baseband processor which contributes to its thermal limitations, therefore its cores need more aggressive throttling.
The Exynos 5433 needs a separate chip for baseband, which consumes more power and should be more expensive overall.
 
The S810 throttles very badly compared to the 5433; the 810 can't sustain single-threaded performance. The PCMark tests are very representative of real use cases, and the CPU throttling in the writing test is extremely alarming for both performance and power. This whole story is reminiscent of the Tegra 3 in the One X.

The modem has no noticeable effect in scenarios where you don't use it. This was never a problem for the 800, so why should it be for the 810?
 
Relying on higher clock speeds and unit counts for performance, instead of staying in their sweet spots and pushing efficiencies in the underlying architectures and implementations, is what separates the S810 from some of the other SoCs. Those few extra tenths of a gigahertz can be killer.

Qualcomm has been going down this road for a while in some ways. I think the Adreno 430 also consumes more than some assume and was pushed beyond its architectural limitations to hit a competitive benchmark target.
 

This is something that is all too often overlooked. When Cortex-A15 SoCs first came out, they were getting slammed for performing 20% faster while using 100% more power versus the competition (Saltwell in that case, if I remember the numbers right). But there was no investigation into what the overall perf/W curve looks like. It doesn't matter how much power something uses at a peak the competitor can't match anyway, although the user may not be given sufficient choices to throttle down and normalize performance. We've seen cases where a big supply voltage jump was applied just to get that last 100 or 200MHz. Looking at the M9 review, the GPU appears to be best in class in most cases, so the question is: what does the power consumption look like when it's brought down to its competitors' level? Does the patch do more or less that, or does it have to take it further?
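
As a back-of-the-envelope illustration of why that last voltage bump is so costly, take the textbook dynamic-power relation P ∝ C·V²·f; the operating points below are invented for the sake of the example, not real S810 numbers:

[code]
# Crude perf/W illustration using dynamic power ~ C * V^2 * f
# (leakage ignored). Frequency/voltage pairs are invented examples,
# NOT actual S810 operating points.
opps = [
    (1.2, 0.90),   # GHz, volts
    (1.6, 1.00),
    (1.9, 1.10),
    (2.0, 1.20),   # the "last 100MHz needs a big voltage jump" case
]

base_f, base_v = opps[0]
for f, v in opps:
    rel_perf  = f / base_f
    rel_power = (v * v * f) / (base_v * base_v * base_f)
    print(f"{f:.1f}GHz @ {v:.2f}V: {rel_perf:.2f}x perf, "
          f"{rel_power:.2f}x power, perf/W {rel_perf / rel_power:.2f}x")
[/code]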

On the other hand, I can say with confidence that the CPU power management is very poor, or something else is broken. There's no justification for only being able to clock at 1.6GHz when a single core is active, or for gradually throttling by similar amounts in both single-threaded and multithreaded usage patterns. This was observed on LG's S810 phone too, so it's not just an M9 issue. Maybe they're throttling the entire chip if any single CPU core's temperature exceeds a certain value.
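
If anybody wants to check this on their own device, the per-core frequencies are visible through the standard cpufreq sysfs nodes; a quick polling sketch (needs a shell on the device, and hotplugged-off cores simply won't show up):

[code]
# Poll per-core CPU frequencies once a second to watch throttling while
# a benchmark runs. Uses the standard cpufreq sysfs interface; cores
# that are hotplugged off usually have no cpufreq directory at all.
import glob, time

def read_freqs():
    freqs = {}
    for path in sorted(glob.glob(
            "/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq")):
        core = path.split("/")[5]                        # e.g. "cpu4"
        try:
            freqs[core] = int(open(path).read()) // 1000  # kHz -> MHz
        except OSError:
            freqs[core] = 0
    return freqs

while True:
    print("  ".join(f"{core}:{mhz}MHz" for core, mhz in read_freqs().items()))
    time.sleep(1)
[/code]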

I'm hoping AT does a deep dive on the power consumption curve for A57 and A53 cores in S810, like they did with Exynos 5433.
 
I don't even know how many SPs the 430 has, but I wouldn't imagine more than 8*SIMD32@600MHz. That doesn't sound like any sort of crazy cluster count or exaggerated frequency, especially on TSMC 20SoC, if true.

What I would like to know is what the transistor count for the Adreno 330 GPU was and how it looks now with the Adreno 420 or 430 GPUs, now that those are DX11.x. If the difference there is too high, then one of the negative factors could lie in the feature set that is lying flat (unused) right now for those GPUs, and less in the unit count and their respective frequencies. Rumors and indications for the S810 point at the memory controller being problematic; again, if true, the GPU would be amongst the first candidates to suffer from its problems. Did the S805 have similar problems? No idea.

A die area hint under 20SoC could possibly also help, but it's not likely that we'll get any of those details anytime soon.

[strike]On a sidenote yes I realize that it's still way too early for Gfxbench3.1 related conclusions, but it's my impression that it's the first time I notice Adrenos to NOT benefit from a heavier workload.[/strike] ....scratch that, it's a lot closer to GK20A in 3.1 than in 3.0.
 
MOAR PHYSICAL DESIGN DETAILS FFS
It's the biggest SoC in use in smartphones this generation, by a long way (> 130mm²). A330 to A420 in the same technology (both 4-cluster designs with roughly the same on-paper perf) was > 50% more GPU area.
 

Thanks for that; at least now I feel less like the local village idiot for claiming that DX11 stands for a sizeable area overhead in GPUs.
 

Are you assuming the Adreno 4xx have the same ALU, TMU and ROP count per cluster as the Adreno 3xx?
 

The Adreno 330 already has SIMD32 ALUs; are you imagining that the 420 could have something like SIMD48, 64 or even 128? I don't. Yes, the 420 could theoretically have twice the MADDs per lane compared to the 330, but then again it's clocked a tad higher and, according to QCOM itself, delivers 40% higher performance, which doesn't suggest any radical increases anywhere. More like "we've detected a couple of bottlenecks in spots A, B, C and fixed them", like every other IHV out there does.

That aside, with almost 50% more die area at the SoC level, the S810 cannot even sustain the G6450-in-A8's performance or frequency for a reasonable amount of time. The HTC M9 after the latest update quickly throttles the GPU down from 600 to 390MHz, from tests I've seen.

***edit: it actually goes down to 305MHz apparently: http://www.computerbase.de/2015-03/htc-one-m9-test/2/
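
For reference, that works out to 390/600 = 65% of the advertised peak GPU clock after the update, and 305/600 ≈ 51% going by the ComputerBase figures.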

HTC One M9 vs. HTC One M8:

http://gfxbench.com/compare.jsp?benchmark=gfx31&D1=HTC+One+M9+(0PJA10,+6535LVW)&os1=Android&api1=gl&D2=HTC+One+M8+(2.2+GHz,+6525LVW,+831C)&cols=2

No idea why the long-term performance scores for S810 SoCs are in single-digit fps; however, judging from the frames listed, the former is barely 15% faster than the latter in an onscreen test at the same resolution.
 
It is clear that it is the Cortex-A57 that heats up and consumes more, but the A57 itself is not the root of the SD810's problem: the Exynos 5433 doesn't have these problems, and it is built on an earlier 20nm node.

Also, we have to mention that Huawei goes for the lowest price on the little cores.
 
I found some discussion on the Wikipedia Snapdragon talk page about this kernel source file that describes msm8996, a rumored Kryo SoC:

https://www.codeaurora.org/cgit/qui...ot/dts/qcom/msm8996.dtsi?h=LA.HB.1.1.1_rb1.10

It appears that there are two clusters of two CPUs (four CPUs total), and each cluster has independent L2 clock/power domains.
You mean like a Silvermont Atom :).
CPU frequencies go up to 1.6GHz, but this could be limited due to it being an ES.
That would be one explanation; the GPU clock also seems to be lower than in older designs.
It may also be possible this really is the top clock, which wouldn't necessarily mean it's slow - if Apple can do such a design, others presumably could as well... Needless to say, though, that would need to be a massively different CPU compared to Krait.
 