Samsung Exynos 8890

Discussion in 'Mobile Devices and SoCs' started by Rys, Nov 12, 2015.

  1. Rys

    Rys PowerVR
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,156
    Likes Received:
    1,433
    Location:
    Beyond3D HQ
    http://www.anandtech.com/show/9781/samsung-announces-exynos-8890-with-cat1213-modem-and-custom-cpu

    Samsung's own Mongoose CPU microarchitecture for the big complex (now called M1)
    Cortex-A53 for the little complex
    Mali-T880MP12 GPU
    Custom fabric/interconnect called SCI (Samsung Coherent Interconnect), so no ARM CCI
    Powerful LTE modem (@Nebuchadnezzar speculates it's off-chip and on-package, which given the size of the rest of the big blocks, especially GPU, I'd say is probably likely)

    So that makes three of the big four very high performance ARM-based SoC vendors not using Cortex for their fastest CPUs (although Samsung still use it for the little complex here).
     
  2. juicytuna

    Newcomer

    Joined:
    Jul 27, 2005
    Messages:
    71
    Likes Received:
    0
    Still 3-wide apparently. Apple really caught the rest of the industry with their pants down it seems. When are we going to see someone other than Apple attempt a really wide core?
     
  3. Turbotab

    Newcomer

    Joined:
    Feb 19, 2013
    Messages:
    214
    Likes Received:
    3
    Mali-T880MP12: it seems that at least one of the Galaxy devices, probably the Note, will have a 4K display.
     
  4. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,015
    Likes Received:
    112
    T880MP12 sounds huge indeed. Considering the T760-MP8 in the Exynos 7420 already had to downclock to roughly half the peak frequency for sustained use, and this thing has 50% more clusters _and_ 50% more ALUs per cluster, I wonder what the magic was to make this work reasonably in a low-TDP environment... Is T8xx really that much more power efficient?
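To put a rough number on the scaling described above, here is a minimal back-of-the-envelope sketch. It assumes ALU capacity scales linearly with cluster count and with ALUs per cluster at a fixed clock; the function name and the linear-scaling assumption are illustrative only and do not come from ARM or Samsung.

```python
def relative_alu_capacity(cluster_scale: float, alus_per_cluster_scale: float) -> float:
    """Relative ALU capacity at equal clock, assuming perfectly linear scaling."""
    return cluster_scale * alus_per_cluster_scale


# Figures from the post: 8 -> 12 clusters (50% more) and 50% more ALUs per cluster.
ratio = relative_alu_capacity(12 / 8, 1.5)
print(f"T880MP12 vs T760MP8 ALU capacity at equal clock: {ratio:.2f}x")
```

Under those assumptions the new GPU has a bit more than twice the ALU capacity to feed at the same clock, which is why the sustained-power question matters.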
     
  5. Turbotab

    Newcomer

    Joined:
    Feb 19, 2013
    Messages:
    214
    Likes Received:
    3
    No doubt Samsung have improved their 14nm process during the past year, so that should account for some of the improvement. I bet that the Exynos 8890 will be fabbed on their 14LPP, rather than the older LPE of the 7420.
    From the TSMC vs. Samsung A9 die area comparison, it seems that Samsung have an advantage over TSMC in die area, so going for a wide and slow (MHz) GPU is less risky than for TSMC users.
     
  6. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    It's also Samsung's first custom core.
    Apple started putting the groundwork together at least as far back as 2008 with the PA Semi acquisition, and Swift was the more modest first deployment. 6-wide came about in 2014.

    Samsung seems to have started a bit later, so a riff on an existing template seems like a more conservative choice that has less risk in terms of time to market, and real-world feedback for a more advanced core. This could be Samsung's version of Swift.

    The PR is thick and details are light, so I do not know how to rank this versus an A72 implementation. The benefits of a custom core might be muted by Samsung's custom power management and process switch with the 7420, and thanks to ARM going back and performing more optimization and physical IP design, compared to what the prior cores left on the table.
     
  7. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    974
    Likes Received:
    141
    Location:
    Luxembourg
    Agree. LPP should be 14% faster than LPE, somewhat closing the gap to 16FF+.

    The supposed scores reported a few months ago were 59.4 fps MH and 108.9 fps T-Rex, which would suggest that they kept frequencies stable at 700-772MHz.
     
  8. Vitaly Vidmirov

    Newcomer

    Joined:
    Jul 9, 2007
    Messages:
    108
    Likes Received:
    10
    Location:
    Russia
  9. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,420
    Likes Received:
    179
    Location:
    Chania
    I think it sounds huge because "12" sounds like a very high number. If you consider though that it actually has "just" 12 TMUs, then it's not "huge", at least compared to the GT7600 in the Apple A9 for instance. Frequency then is another chapter; the 760MP8 in the 7420 clocks at 700MHz with a burst frequency of 772MHz. Nebu or anyone else might correct me, but if memory serves well, in T-Rex it goes only up to 700MHz and throttles down to 400MHz over N period of time.

    The 7600 in A9 should be clocked at 533MHz or somewhere around that frequency either way. Now assume that in a worst case it throttles by ~20%: it drops to roughly 425MHz. Now ask yourself why the frequencies at which ULP mobile GPUs usually throttle today are not so far apart.
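The throttling arithmetic above can be checked with a small sketch. The clocks are the estimates quoted in this post (533MHz for the A9 GPU is a guess, and the 700MHz to 400MHz drop for the 7420 is from memory), and the helper names are hypothetical.

```python
def throttled_mhz(peak_mhz: float, throttle_fraction: float) -> float:
    """Clock after throttling down by the given fraction of peak."""
    return peak_mhz * (1.0 - throttle_fraction)


def throttle_fraction(peak_mhz: float, sustained_mhz: float) -> float:
    """Fraction of peak clock lost once the GPU settles at its sustained clock."""
    return 1.0 - sustained_mhz / peak_mhz


# A9 GPU: assumed ~533MHz peak, assumed ~20% worst-case throttle.
a9_sustained = throttled_mhz(533, 0.20)

# Exynos 7420 GPU in T-Rex: 700MHz dropping to 400MHz, per the post.
exynos_drop = throttle_fraction(700, 400)

print(f"A9 GPU sustained clock: {a9_sustained:.0f}MHz")
print(f"Exynos 7420 GPU throttle: {exynos_drop:.0%} below peak")
```

So despite very different peak clocks and throttle percentages, both land within a few tens of MHz of each other once throttled.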

    For the record's sake the Kirin 950 is stuck with "just" 4 clusters but clocks its Mali to 900MHz *cough*
     
  10. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    711
    Likes Received:
    282
  11. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,015
    Likes Received:
    112
    The "huge" was really in relation to the 760MP8 used in the Exynos 7420, as this one already has to throttle down to ~400MHz. Now it's quite possible this is a reasonable frequency for power efficiency reasons (as a side note, even the Carrizo GPU throttles to roughly this level in its 15W form), but I have some doubts going even lower would be helpful. So with 50% more clusters (which themselves have 50% more ALU capacity) you're still looking at quite a big efficiency improvement needed (somewhere - either architecture or process, or more likely both) to make this really useful.
     
  12. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    429
    Location:
    Cleveland, OH
    A4 and A5 weren't CPU names, they were SoC names.. it's not like Exynos 3xxx was a rebranding for Cortex-A8, 4xxx for A9, etc.

    The details posted for Mongoose look very similar to both A72 and A57, which both look even more similar to each other based on this sort of information. There are some small differences, for example most integer SIMD operations are lower latency. But really the devil is in the details, which is how A72 has measurably better IPC than A57 despite having the same basic execution resources with the same latencies. This doesn't account for the impact of things like instruction window/scheduler size, cache sizes and latencies, prefetch performance, branch prediction, TLB sizes, reordering capability (particularly, memory disambiguation and alias prediction) and so on.
     
  13. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    711
    Likes Received:
    282
    The mobile SoC trend for CPUs seems to go in 2 directions:
    - 2 cores but wide decode, like 6 instructions (Apple, Qualcomm 820 (+2 low power cores))
    - 4 cores but only 3-wide decode (A57, A72, M1 (+4 low power cores))
    All with high clocks > 2GHz
     
  14. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    974
    Likes Received:
    141
    Location:
    Luxembourg
    We don't know how wide Kryo is, integer workloads for example don't seem any faster than ARM cores.
     
  15. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,420
    Likes Received:
    179
    Location:
    Chania
    Look at it this way: IF it should throttle to the same degree/percentage as the 7420 GPU, you still have N% more usable performance compared to the former.
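A minimal sketch of that argument, assuming sustained performance is simply peak performance times the fraction of peak clock that survives throttling. The 2.25x peak ratio is a hypothetical input built from the "50% more clusters, 50% more ALUs per cluster" figures earlier in the thread, not a measured number, and the function name is illustrative.

```python
def sustained_ratio(peak_ratio: float, kept_fraction_new: float, kept_fraction_old: float) -> float:
    """New vs old sustained performance, given how much of peak each GPU keeps."""
    return peak_ratio * kept_fraction_new / kept_fraction_old


# Hypothetical peak ratio at equal clock (1.5x clusters * 1.5x ALUs per cluster).
peak_ratio = 1.5 * 1.5

# Both GPUs keep the same share of peak clock, e.g. 400MHz out of 700MHz.
kept = 400 / 700

same_throttle = sustained_ratio(peak_ratio, kept, kept)
print(f"Sustained performance ratio with identical throttling: {same_throttle:.2f}x")
```

If both parts throttle by the same percentage, the throttle terms cancel and the whole peak advantage carries over to sustained use; the new part only loses ground if it has to throttle harder.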
     
  16. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    974
    Likes Received:
    141
    Location:
    Luxembourg
    Keep in mind that the actual GPU power in the 7420 was only a portion of the total SoC power. Given Samsung did stuff in terms of their interconnect and memory controllers, things can end up either way.
     
  17. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    974
    Likes Received:
    141
    Location:
    Luxembourg
    So as I suspected, something got lost in translation. The 30% perf 10% efficiency figure actually was 30% perf 10% lower power. That's about a 44% increase in efficiency over the 7420's A57.
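For reference, the 44% figure follows from treating efficiency as performance per watt. A minimal sketch, with a hypothetical helper name and the 30% and 10% inputs taken from the figures above:

```python
def efficiency_gain(perf_gain: float, power_reduction: float) -> float:
    """Relative perf-per-watt improvement from a perf gain and a power reduction."""
    return (1.0 + perf_gain) / (1.0 - power_reduction) - 1.0


# 30% more performance at 10% lower power, relative to the 7420's A57.
gain = efficiency_gain(0.30, 0.10)
print(f"Efficiency improvement: {gain:.1%}")
```

That is 1.30 / 0.90, or roughly 1.44x the performance per watt.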

    Seems they were pretty clear that this is also an on-die modem.
     
    #17 Nebuchadnezzar, Nov 18, 2015
    Last edited: Nov 18, 2015
  18. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    429
    Location:
    Cleveland, OH
    That sounds pretty close to Cortex-A72's improvements, depending on how generous you are with measuring the performance increase (they say 10-50% at same clock with lower power consumption, but the more robust benchmarks are probably closer to the 10% mark)
     
  19. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    974
    Likes Received:
    141
    Location:
    Luxembourg
    http://images.anandtech.com/doci/9762/P1030611.jpg

    Smaller power improvements but larger performance improvements, making for larger efficiency improvements overall.

    Anyway I don't really trust Samsung to give representative numbers on their SoCs (for better or worse) - the 7420 marketing numbers were for example focused on the process gains but the real gains were far greater than that.
     
  20. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,496
    Likes Received:
    910
    Are those numbers for the same process? Because it's surprising to see that A72 is advertised as smaller than A57.
     