Samsung Exynos 9810

Discussion in 'Mobile Devices and SoCs' started by iMacmatician, Jan 4, 2018.

  1. iMacmatician

    Regular

    Joined:
    Jul 24, 2010
    Messages:
    737
    Likes Received:
    182
    Samsung has announced the Exynos 9810: "Samsung Optimizes Premium Exynos 9 Series 9810 for AI Applications and Richer Multimedia Content." The 9810 uses the M3 CPU and is on 10 nm LPP. (AnandTech)
    (emphasis mine)
    That is a very impressive single-core performance increase, and if the increase holds for Geekbench, then the 9810 would be around 3900 in the single-core benchmark.
    • Apple A11: 4215 SC / 10170 MC [iPhone 8 Plus]
    • Exynos 9810: ~3900 SC / ~9100 MC [estimates]
    • Apple A10: 3438 SC / 5766 MC [iPhone 7 Plus]
    • Exynos 8895: 1959 SC / 6472 MC [Galaxy Note 8]
    AnandTech also has a comparison of the Exynos 8895 and 9810. There was a rumor that the M3 is 6-issue, which would explain the large performance increase.
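    For reference, the ratios implied by the scores listed above can be checked with a few lines of Python. This is only a sketch: the 9810 figures are the post's own estimates rather than measurements, and the names SCORES and uplift are illustrative.

```python
# Geekbench scores quoted in the post above: (single-core, multi-core).
# The Exynos 9810 entries are the poster's estimates, not measurements.
SCORES = {
    "Apple A11": (4215, 10170),
    "Exynos 9810 (est.)": (3900, 9100),
    "Apple A10": (3438, 5766),
    "Exynos 8895": (1959, 6472),
}


def uplift(new: str, old: str) -> tuple[float, float]:
    """Return (single-core, multi-core) score ratios of `new` over `old`."""
    new_sc, new_mc = SCORES[new]
    old_sc, old_mc = SCORES[old]
    return new_sc / old_sc, new_mc / old_mc


# Generational gain implied by the estimate.
sc_gain, mc_gain = uplift("Exynos 9810 (est.)", "Exynos 8895")
print(f"9810 vs 8895: {sc_gain:.2f}x single-core, {mc_gain:.2f}x multi-core")

# Remaining gap to the A11 under the same estimate.
sc_gap, mc_gap = uplift("Exynos 9810 (est.)", "Apple A11")
print(f"9810 vs A11:  {sc_gap:.2f}x single-core, {mc_gap:.2f}x multi-core")
```

    Under these estimates the single-core gain over the 8895 works out to roughly 2x, while the multi-core gain is closer to 1.4x.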
     
    Grall likes this.
  2. juicytuna

    Newcomer

    Joined:
    Jul 27, 2005
    Messages:
    71
    Likes Received:
    0
  3. iMacmatician

    Regular

    Joined:
    Jul 24, 2010
    Messages:
    737
    Likes Received:
    182
    From Andrei Frumusanu at AnandTech forums:
    The 9810 is not too far behind the A11.
     
    el etro likes this.
  4. Raqia

    Regular

    Joined:
    Oct 31, 2003
    Messages:
    507
    Likes Received:
    18
    Actual early results are lackluster and the SoC seems lopsided despite the impressive absolute performance of the big CPU core:

    https://www.anandtech.com/show/12478/exynos-9810-handson-awkward-first-results

    You get great Geekbench performance but on pretty much every other benchmark or any discipline using actual Android APIs, the 9810 falls behind the 845.

    Their multicore CPU score seems modest and their GPU has fewer cores than the previous iteration, so this probably means the M3 core consumes scads of die space and runs fairly hot to boot. The memory hierarchy looks hobbled on the small cluster, with no L2 cache and a pooled L3, and it also seems less integrated with the big cluster bus-wise than on the 845, which uses ARM's DynamIQ. They probably should have gone with 2x M3 (this may be hard due to a rigid design target for their big cluster) and 4x A55s for more thermal and die space breathing room, which could be spent on more multicore and GPU performance without hurting single-threaded performance.
     
    #4 Raqia, Feb 28, 2018
    Last edited: Feb 28, 2018
  5. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    2,971
    Likes Received:
    90
    I would assume the really terrible performance in everything but geekbench could be fixed (I would assume it's related to power management, what tasks to run on which cores, and when to power down big cores etc.).
    That said, I think you are quite right that 4 of these cores is pointless (it has to be said the 4 A57 on the SD810 were entirely pointless too...). There's simply no way to run all 4 big cores at really useful frequencies simultaneously for even short periods of time (well, not in a smartphone); you'd be looking at 10W+ for the CPU cores alone. I'm not sure it really directly hurts either, but it's definitely wasted die space. There is of course a good reason the Apple SoCs only have 2 fast cores...
    But I assume 2+4 cores was out of the question entirely, simply for marketing reasons (that's just about the only number you ever see for a smartphone regarding the CPU, "8 cores"). Of course I'd take a SoC with 2 fast and 2 slow cores over one with 8 slow cores (which is unfortunately the norm for most non-high-end chips) any day of the week, but I guess I'm a minority...
     
  6. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    934
    Likes Received:
    81
    Location:
    Luxembourg
    4 + 4 with aggressive frequency management on the big cores is better than 4 + 2; you still get vastly better performance and efficiency at lower frequencies. They'll likely go 4 + 2 + 2 next year.
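    The usual argument for more cores at lower clocks can be illustrated with a toy dynamic-power model, where per-core power is roughly C * V^2 * f. Everything below is hypothetical: the voltage and frequency endpoints, the linear voltage curve, and the assumption that throughput scales perfectly with core count and clock. None of it is measured 9810 data.

```python
# Toy model: per-core dynamic power ~ C * V^2 * f, with voltage rising
# linearly with frequency. All constants are made-up illustration values.
F_MIN_GHZ, F_MAX_GHZ = 0.8, 2.8   # hypothetical DVFS range
V_MIN, V_MAX = 0.60, 1.10         # hypothetical voltages at F_MIN / F_MAX
C_EFF = 1.0                       # arbitrary switched-capacitance constant


def voltage(f_ghz: float) -> float:
    """Linearly interpolate voltage between the two hypothetical endpoints."""
    t = (f_ghz - F_MIN_GHZ) / (F_MAX_GHZ - F_MIN_GHZ)
    return V_MIN + t * (V_MAX - V_MIN)


def cluster_power(cores: int, f_ghz: float) -> float:
    """Total dynamic power (arbitrary units) of `cores` cores at `f_ghz`."""
    return cores * C_EFF * voltage(f_ghz) ** 2 * f_ghz


def cluster_throughput(cores: int, f_ghz: float) -> float:
    """Aggregate throughput, assuming perfect scaling with cores and clock."""
    return cores * f_ghz


narrow = (2, 2.8)  # two cores at the top of the range
wide = (4, 1.4)    # four cores at half that clock, same aggregate throughput

for label, cfg in (("2 cores @ 2.8 GHz", narrow), ("4 cores @ 1.4 GHz", wide)):
    print(f"{label}: throughput {cluster_throughput(*cfg):.1f}, "
          f"power {cluster_power(*cfg):.2f}")
```

    In this model the wider configuration delivers the same throughput for less than half the power, purely because of the V^2 term. It ignores leakage and any voltage floor, so it says nothing about how well a real core scales down.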
     
  7. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    2,971
    Likes Received:
    90
    Not necessarily, that would depend on how well the big cores actually scale down. Powering down the big cores (or just not having them in the first place...) will allow you to increase frequency on the little cores, which may be more efficient (as the little cores in general are more power efficient, albeit if you clock them too high that will no longer hold true).
    Some smartphones with SD810 actually never would run all 4 big cores simultaneously.
    Apple's A11 ST->MT scaling in Geekbench seems to be slightly better than the Exynos 9810's, despite having only 2 big cores (of course, that scaling is notably worse than for chips with slower big cores, such as the SD835 and SD845). (Though one would need to compare power draw too, of course.)
    This is all theory of course, maybe there is indeed an advantage of having 4 M3 cores (within the smartphone power limit) after all, it's really impossible to tell without extensive measurements. But even if so, I would expect it to be very small.
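    As a rough check on the ST->MT scaling point, the ratio can be computed from the scores in the first post of this thread. This is a sketch only: the 9810 pair is an estimate, so its ratio is indicative at best, and the helper name mt_scaling is illustrative.

```python
# (single-core, multi-core) Geekbench scores from the first post in this thread.
# The Exynos 9810 pair is an estimate, so its ratio is indicative only.
SCORES = {
    "Apple A11": (4215, 10170),
    "Exynos 9810 (est.)": (3900, 9100),
    "Apple A10": (3438, 5766),
    "Exynos 8895": (1959, 6472),
}


def mt_scaling(single_core: int, multi_core: int) -> float:
    """Multi-core score expressed as a multiple of the single-core score."""
    return multi_core / single_core


scaling = {name: mt_scaling(sc, mc) for name, (sc, mc) in SCORES.items()}

# Print chips from best to worst ST->MT scaling.
for name, ratio in sorted(scaling.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} MT/ST = {ratio:.2f}")
```

    With these inputs the A11 comes out slightly ahead of the estimated 9810 ratio, and the 8895, with its slower big cores, scales best of the four.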
     
  8. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    934
    Likes Received:
    81
    Location:
    Luxembourg
    No it wouldn't. The little cores are implemented for power and simply can't reach higher frequencies.
    Because that would mean a 14W TDP. The 9810 will have the same power issue, but there's actually performance behind that power, so efficiency won't be as big of an issue.
    That's irrelevant, Apple has likely much more performant little cores, they're twice the size of an A53 and I somehow doubt they're in-order like an A53.
     
    BRiT likes this.
  9. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    2,971
    Likes Received:
    90
    I meant if they weren't already running at their highest clock (because there's essentially not enough power left). But maybe that doesn't actually ever happen (if the power management always reduces the clock of the big cores first, or even powers them down, so it can maintain the max clock of the little cores).

    They are A55 (but OK, probably not much of a difference). Indeed, I think this might be a bit problematic - the A55 at max clock might not really reach the performance of the M3 at the latter's "lowest power-efficient frequency" (that is, the frequency below which lowering the clock further would be detrimental to power efficiency, so it would be better to run at a higher frequency and then sleep for some time instead).
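    The "lowest power-efficient frequency" idea can be sketched with a toy energy-per-work model. The constants below (leakage, voltage floor, knee frequency, voltage slope) are hypothetical and chosen only to make the shape visible; they are not M3 or A55 figures. The model also assumes a sleeping core draws no power.

```python
# Toy model: energy per unit of work = (leakage + dynamic power) / frequency.
# Below F_KNEE_GHZ the voltage sits at its floor, so slowing down further only
# stretches the time over which leakage is paid. All constants are made up.
P_LEAK = 0.2        # hypothetical static power while the core is awake
V_FLOOR = 0.60      # hypothetical minimum operating voltage
F_KNEE_GHZ = 1.0    # hypothetical frequency above which voltage must rise
V_SLOPE = 0.25      # hypothetical volts per GHz above the knee


def voltage(f_ghz: float) -> float:
    """Voltage needed at `f_ghz`: flat at the floor, then rising linearly."""
    return V_FLOOR + V_SLOPE * max(0.0, f_ghz - F_KNEE_GHZ)


def energy_per_work(f_ghz: float) -> float:
    """Energy spent per unit of work, assuming work rate is proportional to clock."""
    dynamic = voltage(f_ghz) ** 2 * f_ghz
    return (P_LEAK + dynamic) / f_ghz


# Scan 0.4 GHz to 2.8 GHz in 0.1 GHz steps and pick the cheapest point.
freqs = [round(0.4 + 0.1 * i, 1) for i in range(25)]
best = min(freqs, key=energy_per_work)
print(f"most efficient clock in this toy model: {best} GHz")
```

    Under these assumptions, running below the knee costs more energy per unit of work than running at the knee and sleeping afterwards, which is the behaviour described above.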

    That said, I suppose I got a bit confused by the low Geekbench multi-core score. That looked to me like the big cores were running at a very low frequency (considering the small cores should contribute ~2000 points or so, that would mean the 4 big cores were only running at around ~1.2 GHz). Now if these big cores really do maintain ~1.8 GHz as the AnandTech article states, then yes, it looks like 4 of them would be very useful. However, in that case Geekbench MT scaling would simply be awful (the shared L3 could be a bottleneck?).
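    The back-of-envelope reasoning in this post can be written out as follows. It is a sketch under stated assumptions, not a measurement: the scores are the estimates from the first post of the thread, score is taken to scale linearly with clock, and the 2.7 GHz peak clock is an assumed value that the thread itself does not state.

```python
# Back-of-envelope estimate of the big-core clock implied by a multi-core score.
# Assumptions (not measurements): score scales linearly with clock, the little
# cluster adds a fixed ~2000 points, and the big core's peak clock is 2.7 GHz.
ST_SCORE = 3900        # estimated single-core score from earlier in the thread
MT_SCORE = 9100        # estimated multi-core score from earlier in the thread
LITTLE_CLUSTER = 2000  # rough little-core contribution quoted in the post
BIG_CORES = 4
PEAK_GHZ = 2.7         # assumed clock at which ST_SCORE was obtained


def implied_big_clock(mt_score: float) -> float:
    """Clock each big core would need to explain `mt_score` under ideal scaling."""
    per_core = (mt_score - LITTLE_CLUSTER) / BIG_CORES
    return per_core / ST_SCORE * PEAK_GHZ


def expected_mt_score(big_clock_ghz: float) -> float:
    """Multi-core score expected if all big cores hold `big_clock_ghz`."""
    per_core = ST_SCORE * big_clock_ghz / PEAK_GHZ
    return BIG_CORES * per_core + LITTLE_CLUSTER


clock = implied_big_clock(MT_SCORE)
ideal = expected_mt_score(1.8)
print(f"implied big-core clock: {clock:.2f} GHz")
print(f"ideal MT score at 1.8 GHz: {ideal:.0f} "
      f"(estimated score is {MT_SCORE / ideal:.0%} of that)")
```

    With these inputs the implied clock lands near 1.2 GHz. If the big cores instead hold 1.8 GHz, the same model predicts a noticeably higher multi-core score than the estimate, which is the poor MT scaling the post points to.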
     
  10. itsmydamnation

    Veteran Regular

    Joined:
    Apr 29, 2007
    Messages:
    1,196
    Likes Received:
    298
    Location:
    Australia
    Yes, it's one thing I have never seen talked about by ARM or seen really tested. If I were to guess, the L3 is single-ported. Even the Zen CCX L3 is single-ported, but they specifically designed buffering within the CCX to handle that, and that still results in half the bandwidth comparing L2 to L3.

    So for general phone operations that's not going to be a big deal, but on CPU benchmarks that hit the memory subsystem a lot it could play out as an issue. The other thing is, I think AnandTech said the two core types are in two separate clusters, so there could also be bottlenecks between clusters that, again, wouldn't be a big deal in general phone operations.
     
  11. kalelovil

    Regular

    Joined:
    Sep 8, 2011
    Messages:
    553
    Likes Received:
    93
    It may not entirely be the SoC to blame.
     
  12. Raqia

    Regular

    Joined:
    Oct 31, 2003
    Messages:
    507
    Likes Received:
    18
    Later tweets say it's fixable by software and that "it's not cheating" so it may not be performance related per se. If it is, it could be they're throttling the S845 to match the Exynos... I still have a jaundiced view of the SoC level competency of Samsung's design.
     
  13. Raqia

    Regular

    Joined:
    Oct 31, 2003
    Messages:
    507
    Likes Received:
    18
  14. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    2,971
    Likes Received:
    90
    Yes, indeed waiting for a more in-depth comparison.
    Something just doesn't seem right with the 9810. I'd think in particular with these cpu cores it should dominate the browser benchmarks - but it hardly beats the old Exynos 8895 and gets blown away by the SD845 there.
    Its FPU should also be way faster (and geekbench tells as much, albeit it's not surprising the MT scores are closer), yet again in 3dmark physics it's nowhere close to SD 845.
    The GPU being slower was of course expected, but that it essentially loses to the SD 845 in everything CPU-related except Geekbench is definitely unexpected. I wasn't as excited as some others when the early Geekbench results surfaced, mostly due to potential energy efficiency concerns about the new cores (in any kind of sustained MT load it doesn't matter what the peak performance of the cores is; if they have worse perf/power than the not-really-custom A75 then things will be slower), but I'd never have expected things to be THAT bad - after all, not everything is a sustained MT load...
     
  15. Raqia

    Regular

    Joined:
    Oct 31, 2003
    Messages:
    507
    Likes Received:
    18
    I imagine the better memory subsystem and interconnects are the S845's secret sauce. It's not clear that the Exynos has an L4 cache, and the smaller cluster cores each lack L2 entirely. They have a very nice core in the M3, but that's about it; the rest of the design seems withered next to it, making it a bit like the Matthias Schlitte of SoCs.
     
  16. mboeller

    Regular

    Joined:
    Feb 7, 2002
    Messages:
    922
    Likes Received:
    1
    Location:
    Germany
  17. willardjuice

    willardjuice super willyjuice
    Moderator Veteran Alpha Subscriber

    Joined:
    May 14, 2005
    Messages:
    1,351
    Likes Received:
    179
    Location:
    NY
    Whyyyyyyyyyy. The focus on peak performance is very disappointing. And worse, this will keep pushing SoC makers into making poor design decisions (based on marketing, not real-world benefits to the user) on future products.
     
    Grall and BRiT like this.
  18. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    934
    Likes Received:
    81
    Location:
    Luxembourg
    Because it's an analysis of the CPU micro-architecture and its characteristics, and that demands controlled environments. If you actually read the following page with system performance and the rest of the review, I think I'm plenty harsh enough on the lacking real-world benefits. I'm more disappointed by such knee-jerk reactions from readers.
     
    Lodix and BRiT like this.
  19. willardjuice

    willardjuice super willyjuice
    Moderator Veteran Alpha Subscriber

    Joined:
    May 14, 2005
    Messages:
    1,351
    Likes Received:
    179
    Location:
    NY
    We'll ignore that it was the "Galaxy S9 review" and not a SoC architectural analysis piece (I did read the whole article), but even if it was an architectural review who cares about hypothetical environments that these architectures will never be used in? Should we start benchmarking gpus using dry ice? Come on Neb you and I both know in android land IHVs focus on peak performance and chasing higher bars in (meaningless) benchmark graphs has ultimately impacted the end user's experience for the worse. Sustained performance is far more important and yet has always taken a backseat to peak performance. I think it's fair to say that's largely not due to architectural design decisions but marketing ones.
     
  20. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    934
    Likes Received:
    81
    Location:
    Luxembourg
    Do you load webpages in endless loops with your device or do software video encoding? Peak CPU performance is important because CPU workloads are transactional and bursty. I mean I'm not even going into this argument as I specifically changed the way the GPU performance is represented precisely because of these concerns. You're utterly overreacting here and using strawman arguments.
    The rest of the review has the 9810 with the lowest bars all over the place. You're telling me because I dared to test SPEC at peak that this invalidates the whole rest of the article. Again there's reasonable arguments and there's unreasonable ones.
    And why the hell shouldn't we? Getting both the peak and sustained numbers tells us more about the GPU and the way it's run.
     
    Laurent06 and Vitaly Vidmirov like this.
