Samsung Exynos 9810

Samsung has announced the Exynos 9810: "Samsung Optimizes Premium Exynos 9 Series 9810 for AI Applications and Richer Multimedia Content." The 9810 uses the M3 CPU and is on 10 nm LPP. (AnandTech)
Samsung said:
The Exynos 9810, built on Samsung’s second-generation 10-nanometer (nm) FinFET process, […]

The processor has a brand new eight-core CPU under its hood, four of which are powerful third-generation custom cores that can reach 2.9 gigahertz (GHz), with the other four optimized for efficiency. With an architecture that widens the pipeline and improves cache memory, single-core performance is enhanced two-fold and multi-core performance is increased by around 40 percent compared to its predecessor.
(emphasis mine)
That is a very impressive single-core performance increase, and if the increase holds for Geekbench, then the 9810 would be around 3900 in the single-core benchmark.
  • Apple A11: 4215 SC / 10170 MC [iPhone 8 Plus]
  • Exynos 9810: ~3900 SC / ~9100 MC [estimates]
  • Apple A10: 3438 SC / 5766 MC [iPhone 7 Plus]
  • Exynos 8895: 1959 SC / 6472 MC [Galaxy Note 8]
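The two estimates in the list above come from applying Samsung's claimed gains to the Exynos 8895's scores. A minimal sketch of that arithmetic, assuming the "two-fold" and "around 40 percent" claims carry over to Geekbench unchanged (that is the post's assumption, not a measured result):

```python
# Exynos 8895 Geekbench scores quoted above (Galaxy Note 8).
EXYNOS_8895_SC = 1959
EXYNOS_8895_MC = 6472

# Samsung's claimed gains; assumed here to apply to Geekbench as-is.
SC_GAIN = 2.0   # "enhanced two-fold"
MC_GAIN = 1.4   # "increased by around 40 percent"

def project(score, gain):
    """Scale a baseline score by a claimed gain factor."""
    return score * gain

est_sc = project(EXYNOS_8895_SC, SC_GAIN)
est_mc = project(EXYNOS_8895_MC, MC_GAIN)
print(f"Estimated 9810: ~{est_sc:.0f} SC / ~{est_mc:.0f} MC")
```

Rounded to the nearest hundred, these are the ~3900 SC and ~9100 MC figures listed above.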
AnandTech also has a comparison of the Exynos 8895 and 9810. There was a rumor that the M3 is 6-issue, which would explain the large performance increase.
 
From Andrei Frumusanu at AnandTech forums:
Andrei. said:
[Images: 8vGjvW9.png, bblltQn.png]
The 9810 is not too far behind the A11.
 
Actual early results are lackluster and the SoC seems lopsided despite the impressive absolute performance of the big CPU core:

https://www.anandtech.com/show/12478/exynos-9810-handson-awkward-first-results

You get great Geekbench performance but on pretty much every other benchmark or any discipline using actual Android APIs, the 9810 falls behind the 845.

Their multicore CPU score seems modest and their GPU has fewer cores than the previous iteration, so this probably means the M3 core consumes scads of die space and runs fairly hot to boot. The memory hierarchy looks hobbled on the small cluster, with no L2 cache and a pooled L3, and the small cluster also seems less integrated with the big cluster, bus-wise, than on the 845, which uses ARM's DynamIQ. They probably should have gone with 2x M3 (this may be hard due to a rigid design target for their big cluster) and 4x A55s for more thermal and die-space breathing room, which could be spent on more multicore and GPU performance without hurting single-threaded performance.
 
I would assume the really terrible performance in everything but geekbench could be fixed (I would assume it's related to power management, what tasks to run on which cores, and when to power down big cores etc.).
That said, I think you are quite right that 4 of these cores is pointless (it has to be said the 4 A57 on the SD810 were entirely pointless too...). There's simply no way to run all 4 big cores at really useful frequencies simultaneously for even short periods of time (well, not in a smartphone); you'd be looking at 10W+ for the CPU cores alone. I'm not sure it really directly hurts either, but it's definitely wasted die space. There is of course a good reason the Apple SoCs only have 2 fast cores...
But I assume 2+4 cores was out of the question entirely, simply for marketing reasons (that's just about the only number you ever see for a smartphone regarding the CPU, "8 cores"; of course I'd take a SoC with 2 fast and 2 slow cores over one with 8 slow cores (which is unfortunately the norm for most non-high-end chips) any day of the week, but I guess I'm in the minority...).
 
4 + 4 with aggressive frequency management on the big cores is better than 4 + 2, you still get vastly better performance and efficiency at lower frequencies. They'll likely go 4 + 2 + 2 next year.
 
Not necessarily, that would depend on how well the big cores actually scale down. Powering down the big cores (or just not having them in the first place...) will allow you to increase frequency on the little cores, which may be more efficient (as the little cores in general are more power efficient, albeit if you clock them too high that will no longer hold true).
Some smartphones with SD810 actually never would run all 4 big cores simultaneously.
Apple's A11 ST->MT scaling in Geekbench seems to be slightly better than the Exynos 9810's, despite having only 2 big cores (of course, that scaling is notably worse than for chips with slower big cores, such as the SD835 and SD845). (Though one would need to compare power draw too, of course.)
This is all theory of course, maybe there is indeed an advantage of having 4 M3 cores (within the smartphone power limit) after all, it's really impossible to tell without extensive measurements. But even if so, I would expect it to be very small.
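For reference, the ST->MT scaling can be put in numbers using the scores listed in the first post. This is only a rough sketch: the Exynos 9810 entry uses the thread's estimates rather than measured results, so the comparison is illustrative at best.

```python
# Geekbench scores quoted earlier in the thread: (single-core, multi-core).
# The Exynos 9810 figures are the thread's estimates, not measurements.
scores = {
    "Apple A11": (4215, 10170),
    "Exynos 9810 (est.)": (3900, 9100),
    "Apple A10": (3438, 5766),
    "Exynos 8895": (1959, 6472),
}

def mt_scaling(sc, mc):
    """Multi-core score expressed as a multiple of the single-core score."""
    return mc / sc

scaling = {name: mt_scaling(sc, mc) for name, (sc, mc) in scores.items()}
for name, ratio in sorted(scaling.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {ratio:.2f}x")
```

With these inputs the A11 ratio comes out slightly above the estimated 9810 ratio, and the 8895, with its slower big cores, scales best of the four.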
 
Not necessarily, that would depend on how well the big cores actually scale down. Powering down the big cores (or just not having them in the first place...) will allow you to increase frequency on the little cores
No it wouldn't. The little cores are implemented for power and simply can't reach higher frequencies.
Some smartphones with SD810 actually never would run all 4 big cores simultaneously.
Because that would mean a 14W TDP. The 9810 will have the same power issue, but there's actually performance behind that power, so efficiency won't be as big of an issue.
Apple's A11 ST->MT scaling in Geekbench seems to be slightly better than the Exynos 9810's, despite having only 2 big cores
That's irrelevant, Apple has likely much more performant little cores, they're twice the size of an A53 and I somehow doubt they're in-order like an A53.
 
No it wouldn't. The little cores are implemented for power and simply can't reach higher frequencies
I meant if they weren't already running at their highest clock (because there's essentially not enough power left). But maybe that never actually happens (if the power management always reduces the clock of the big cores first, or even powers them down, so it can maintain the max clock of the little cores).

That's irrelevant, Apple has likely much more performant little cores, they're twice the size of an A53 and I somehow doubt they're in-order like an A53.
They are A55 (but OK, probably not much of a difference). Indeed I think this might be a bit problematic: the A55 at max clock might not really reach the performance of the M3 at the latter's "lowest power-efficient frequency" (that is, the frequency below which lowering the clock further would hurt power efficiency, so it would be better to run at a higher frequency and sleep for some of the time instead).
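The "lowest power-efficient frequency" idea can be illustrated with a toy DVFS model. Everything here is hypothetical: the constants are made up to show the shape of the curve, the voltage/frequency relation is a simple linear ramp above a voltage floor, and leakage is assumed to vanish entirely while the core sleeps. None of it is measured M3 or A55 data.

```python
# Toy DVFS model; every constant below is hypothetical and only chosen
# to make the shape of the curve visible. Not measured M3 or A55 data.
V_MIN = 0.6          # volts: supply voltage cannot drop below this floor
F_AT_V_MIN = 0.8     # GHz: highest frequency reachable at V_MIN
V_PER_GHZ = 0.25     # extra volts needed per GHz above F_AT_V_MIN
C_EFF = 1.0          # effective switched capacitance (arbitrary units)
P_STATIC = 0.15      # leakage power while the core is awake (arbitrary)

def voltage(f_ghz):
    """Supply voltage needed for a given frequency, with a floor."""
    return V_MIN + max(0.0, f_ghz - F_AT_V_MIN) * V_PER_GHZ

def energy_per_cycle(f_ghz):
    """Dynamic plus static energy spent per clock cycle of work."""
    v = voltage(f_ghz)
    dynamic = C_EFF * v * v          # C * V^2 per cycle
    static = P_STATIC / f_ghz        # leakage accrues while the work runs
    return dynamic + static

freqs = [round(0.2 * i, 1) for i in range(2, 15)]   # 0.4 .. 2.8 GHz
best = min(freqs, key=energy_per_cycle)
for f in freqs:
    print(f"{f:.1f} GHz  {energy_per_cycle(f):.3f}")
print("lowest power-efficient frequency in this toy model:", best, "GHz")
```

Below the voltage floor, each cycle costs the same dynamic energy but leakage is paid for longer, so energy per unit of work goes up as the clock goes down. In this model that is the point at which running faster and then sleeping wins.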

That said, I suppose I got a bit confused by the low Geekbench multi-core score. That looked to me like the big cores were running at a very low frequency (considering the small cores should contribute ~2000 points or so, that would mean the 4 big cores were only running at around 1.2 GHz or so). Now if these big cores actually maintain ~1.8 GHz, as the AnandTech article states, then yes, that looks like 4 of them would be very useful. However, in this case Geekbench MT scaling would simply be awful (the shared L3 could be a bottleneck?).
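The back-of-envelope reasoning in that post can be sketched as follows. The inputs are the thread's own estimates from the first post (not the actual results the post refers to, which are not quoted here), and the model assumes that score scales linearly with clock and that multi-core scaling is otherwise perfect. Both are simplifications.

```python
# Inputs: the thread's own estimates, plus stated simplifying assumptions.
MC_SCORE = 9100              # estimated multi-core score from the first post
SC_SCORE_AT_MAX = 3900       # estimated single-core score at the max clock
F_MAX_GHZ = 2.9              # Samsung's quoted peak clock for the big cores
LITTLE_CLUSTER_SCORE = 2000  # "~2000 points or so" from the four small cores
N_BIG = 4

def implied_big_core_clock(mc, sc_at_max, f_max, little, n_big):
    """Clock the big cores would need if score scaled linearly with
    frequency and multi-core scaling were otherwise perfect."""
    per_big_core = (mc - little) / n_big
    return f_max * per_big_core / sc_at_max

def expected_mc(f_big, sc_at_max, f_max, little, n_big):
    """Multi-core score the same model predicts for a given big-core clock."""
    return little + n_big * sc_at_max * f_big / f_max

clock = implied_big_core_clock(MC_SCORE, SC_SCORE_AT_MAX, F_MAX_GHZ,
                               LITTLE_CLUSTER_SCORE, N_BIG)
at_1_8 = expected_mc(1.8, SC_SCORE_AT_MAX, F_MAX_GHZ,
                     LITTLE_CLUSTER_SCORE, N_BIG)
print(f"Implied big-core clock: ~{clock:.2f} GHz")
print(f"Model score with big cores at 1.8 GHz: ~{at_1_8:.0f}")
```

Under these assumptions the implied clock lands in the same low range the post describes. A sustained 1.8 GHz would predict a much higher multi-core score, which is the gap the post attributes to poor MT scaling.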
 

Yes, it's one thing I have never seen talked about by ARM or seen really tested. If I were to guess, the L3 is single-ported. Even the Zen CCX L3 is single-ported, but they specifically designed buffering within the CCX to handle that, and that still results in 1/2 the bandwidth comparing L2 to L3.

So for general phone operations that's not going to be a big deal, but on CPU benchmarks that hit the memory subsystem a lot it could play out as an issue. The other thing is, I think AnandTech said the two core types are in two separate clusters, so there could also be bottlenecks between clusters that, again, wouldn't be a big deal in general phone operations.
 
Here's the first direct comparison I've seen between the S845 and the E9810 on the S9+:

https://www.androidcentral.com/samsung-galaxy-s9-tested-exynos-9810-vs-snapdragon-845

Looks like a blowout in favor of the S845 with the sole exception of Geekbench. Looking forward to Neb's detailed review.
Yes, indeed waiting for a more in-depth comparison.
Something just doesn't seem right with the 9810. I'd think in particular with these cpu cores it should dominate the browser benchmarks - but it hardly beats the old Exynos 8895 and gets blown away by the SD845 there.
Its FPU should also be way faster (and Geekbench says as much, albeit it's not surprising the MT scores are closer), yet again in 3DMark Physics it's nowhere close to the SD 845.
The GPU being slower was of course expected, but that it essentially loses to the SD 845 in everything CPU-related too, except Geekbench, is definitely unexpected. I wasn't as excited as some others when the early Geekbench results surfaced, mostly due to potential energy efficiency concerns with the new cores (in any kind of sustained MT load it doesn't matter what the peak performance of the cores is; if they have worse perf/power than the not-really-custom A75 then things will be slower), but I'd never have expected things to be THAT bad. After all, not everything is a sustained MT load...
 
I imagine the better memory subsystem and interconnects are the S845's secret sauce. It's not clear that the Exynos has an L4 cache, and the smaller cluster cores each lack L2 entirely. They have a very nice core in the M3, but that's about it; the rest of the design seems withered next to it, making it a bit like the Matthias Schlitte of SoCs.
 
Anand said:
I also have to remind reader that the devices were actively cooled in a reduced temperature environment, this is because the whole benchmark run takes 2-3 hours and we’re trying to look at peak performance. Transactional workloads are nowhere near this long-running and thus active cooling is warranted.

Whyyyyyyyyyyyy. The focus on peak performance is very disappointing. And worse, this will keep forcing SoC makers into making poor design decisions (based on marketing, not real-world benefits to the user) on future products.
 
Because it's an analysis of the CPU micro-architecture and its characteristics, and that demands a controlled environment. If you actually read the following page with system performance and the rest of the review, I think I'm plenty harsh on the lacking real-world benefits. I'm more disappointed by such knee-jerk reactions from readers.
 
We'll ignore that it was the "Galaxy S9 review" and not a SoC architectural analysis piece (I did read the whole article), but even if it was an architectural review who cares about hypothetical environments that these architectures will never be used in? Should we start benchmarking gpus using dry ice? Come on Neb you and I both know in android land IHVs focus on peak performance and chasing higher bars in (meaningless) benchmark graphs has ultimately impacted the end user's experience for the worse. Sustained performance is far more important and yet has always taken a backseat to peak performance. I think it's fair to say that's largely not due to architectural design decisions but marketing ones.
 
Do you load webpages in endless loops with your device or do software video encoding? Peak CPU performance is important because CPU workloads are transactional and bursty. I mean I'm not even going into this argument as I specifically changed the way the GPU performance is represented precisely because of these concerns. You're utterly overreacting here and using strawman arguments.
IHVs focus on peak performance and chasing higher bars in (meaningless) benchmark graphs has ultimately impacted the end user's experience for the worse
The rest of the review has the 9810 with the lowest bars all over the place. You're telling me that because I dared to test SPEC at peak, the whole rest of the article is invalid. Again, there are reasonable arguments and there are unreasonable ones.
Should we start benchmarking gpus using dry ice?
And why the hell shouldn't we? Getting both the peak and sustained numbers tells us more about the GPU and the way it's run.
 