Samsung SoC & ARMv8 discussions

Discussion in 'Mobile Devices and SoCs' started by Nebuchadnezzar, Apr 10, 2014.

  1. Rys

    Rys Graphics @ AMD Moderator Veteran Alpha

    For ARM-based designs built on CCI-400, it's not just bandwidth you need to focus on. That interconnect is "deep", and GPUs in particular tend to have reasonably hard requirements on latency, so making sure those requirements are met in a complex SoC can be difficult and GPU performance can be affected. Measured bandwidth only tells you part of the story.
     
  2. Erinyes

    Erinyes Regular

    Samsung confirmed what we pretty much already knew..14nm for the Exynos 7420, 2.1 GHz clock for the A57s and LPDDR4. They also claim 20 percent faster speed, 35 percent less power consumption and 30 percent productivity gain for 14nm compared to 20nm. Not sure what they mean by productivity but Joshua at AT seems to think it is performance per watt. Unless my reasoning is completely wrong, wouldn't a 35% reduction in power consumption mean ~50% higher performance per watt (i.e. 1/0.65)?

    Source - Anandtech
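
    As a rough check of the 1/0.65 reasoning above, here is a minimal Python sketch. It assumes the 35 percent power reduction is quoted at equal performance, which the announcement does not state; the function name `perf_per_watt_gain` is made up for illustration.

```python
def perf_per_watt_gain(power_reduction, perf_gain=0.0):
    """Relative perf/W improvement for a fractional power cut and perf gain.

    Assumption (not from Samsung): both figures apply at the same
    operating point, so perf/W scales as (1 + perf_gain) / (1 - power_reduction).
    """
    return (1.0 + perf_gain) / (1.0 - power_reduction) - 1.0


# 35% less power at unchanged performance
iso_perf_gain = perf_per_watt_gain(0.35)
print(f"{iso_perf_gain:.1%}")  # prints 53.8%
```

    Under that assumption the gain lands a little above the ~50% estimate, at roughly 54%.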
    2. Yes..agreed that those don't scale well but surely they wouldn't grow in size compared to 28nm? And cache should scale well so something doesn't quite add up. The overall size of the clusters certainly should not have gone up significantly.
    3. Yes..I understand the results are for the full SoC. But my point was that the additional power consumption for a 4 core load on A53 seems to be much higher than the A7, compared to a single core. On a one core load..SoC power is 0.271W v/s 0.213W for A53 v/s A7, i.e. a difference of 58 mW. But for 4 cores..SoC power is 0.847W v/s 0.453W, a difference of 394 mW or about 100 mW per core. This is almost double the difference of 58 mW for one core alone.
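
    The deltas above can be reproduced with a short Python sketch. The wattages are the full-SoC figures quoted in this post; the dictionary layout and the `delta_mw` helper are just illustrative names.

```python
# Full-SoC power in watts, as quoted in the post (1-core and 4-core loads)
soc_power_w = {
    "A53": {1: 0.271, 4: 0.847},
    "A7": {1: 0.213, 4: 0.453},
}


def delta_mw(cores):
    """A53 minus A7 SoC power at the given core count, in milliwatts."""
    return (soc_power_w["A53"][cores] - soc_power_w["A7"][cores]) * 1000.0


one_core_delta = delta_mw(1)           # difference with one core loaded
four_core_delta = delta_mw(4)          # difference with four cores loaded
per_core_delta = four_core_delta / 4   # four-core difference spread per core
ratio = per_core_delta / one_core_delta

print(f"1 core: {one_core_delta:.1f} mW")
print(f"4 cores: {four_core_delta:.1f} mW ({per_core_delta:.1f} mW per core)")
print(f"per-core delta vs single-core delta: {ratio:.2f}x")
```

    That works out to 98.5 mW per core, about 1.7 times the single-core difference. Since these are whole-SoC numbers, the split per core is only an approximation.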

    P.S. I have another unrelated query if you could indulge me and have the time to test. You guys did a test on encryption and storage performance on Lollipop on the Nexus 6 and we saw that performance dropped drastically if encryption was enabled. With the A57 this should be mitigated due to the encryption units. Could you possibly test this on the Note 4?
    Ahh yes..thanks I forgot about that. Seems like higher bandwidth does not seem to be helping performance all that much even in benchmarks. The Geekbench scores for 7420 vs 5433 are ~15% and ~10% higher for single and multicore respectively. If you normalize for clocks (2.1 v/s 1.9 GHz) this reduces to ~5% and 0%. The other slight surprise is that multi-core advantage is lower than single core. I would have thought it would be the opposite due to the process advantage and presumably less throttling.
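
    The clock normalization can be sketched in Python as below. It assumes scores scale linearly with CPU clock, which is a simplification, and uses the rounded ~15% and ~10% gains from this post; `clock_normalized_gain` is a hypothetical helper name.

```python
def clock_normalized_gain(score_gain, new_clock_ghz, old_clock_ghz):
    """Score gain left over after dividing out the clock ratio.

    Assumption: benchmark score scales linearly with CPU frequency.
    """
    clock_ratio = new_clock_ghz / old_clock_ghz
    return (1.0 + score_gain) / clock_ratio - 1.0


# Exynos 7420 at 2.1 GHz vs Exynos 5433 at 1.9 GHz
single_core = clock_normalized_gain(0.15, 2.1, 1.9)
multi_core = clock_normalized_gain(0.10, 2.1, 1.9)

print(f"single-core: {single_core:.1%}")  # prints single-core: 4.0%
print(f"multi-core: {multi_core:.1%}")    # prints multi-core: -0.5%
```

    With those rounded inputs the per-clock gain is about 4% single-core and essentially flat multi-core, in line with the ~5% and 0% above.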
    Thanks Rys..always appreciate your valuable inputs :) Do you see this changing with CCI-500?
     
    Last edited: Feb 17, 2015
  3. That was our best guess by what that "productivity" gain meant. The announcement was weird so we'll know more by the end of this week.

    As stated before, I was told by ARM that this may have been caused by decreasing work per thread with increasing threads/cores due to resource constraints on the clusters of the A7/A15 which has been "resolved" on the newer architectures.
    I didn't update to Lollipop yet on the Note 4 and AFAIK the Samsung ROM doesn't have the encryption option.
     
    Last edited: Feb 17, 2015
  4. Lodix

    Lodix Newcomer

    Maybe productivity gain = density ?
     
  5. Entropy

    Entropy Veteran

    "even in benchmarks" is an odd turn of words.
    Geekbench is very much an example of a benchmark suite that is designed on purpose to separate main memory performance from the rest of the benchmark suite. This allows assessment of the per core low level performance for a number of code examples, and how this scales with number of cores. The memory performance IS tested in a separate part of the overall benchmark. (It would have been nice if they added some kind of latency test as well.) It is not strange that the main memory bandwidth doesn't affect the other scores. It is not really meant to.
     
  6. Rys

    Rys Graphics @ AMD Moderator Veteran Alpha

    Yep. I'd be very surprised if the interconnect's real-world performance and influence on full-SoC performance didn't get better versus the last gen.
     
  7. juicytuna

    juicytuna Newcomer

    Samsung are giving a talk tomorrow at ISSCC titled:
    This could be where we hear first details of their custom GPU and/or CPU cores.
     
  8. Rys

    Rys Graphics @ AMD Moderator Veteran Alpha

    That's highly likely to just be 5433.
     
  9. juicytuna

    juicytuna Newcomer

    You're right, I completely missed the heterogeneous bit. The 5433 does fit that description perfectly.
     
  10. It's their custom architecture. The 5433 is not a quad core.
     
  11. Rys

    Rys Graphics @ AMD Moderator Veteran Alpha

    I take "Heterogeneous 64b Quad-Core CPUs" to mean two quad-core clusters. If it's not 5433 I'll eat a hat.
     
  12. Start seasoning your hat just in case.

    From what I understood of their architecture the heterogeneous is supposed to refer to the HSA-ity between the CPU and GPU.

    Edit*

    This report from back in December matches what I hear:
    http://www.zdnet.co.kr/news/news_view.asp?artice_id=20141202145608

    I know at least what their gpu is supposed to look like based on research papers but they have several designs including one ray-tracer that looked promising. Not sure what the presented one will be.
     
    Last edited: Feb 24, 2015
  13. Rys

    Rys Graphics @ AMD Moderator Veteran Alpha

    Why do a 20nm tapeout of something that works well enough to present at ISSCC, a full application processor no less with all the baggage that means (including a Cat6 LTE modem according to that ZDNet article!), and never have sold it or even announced it properly?

    I'll season the hat but I'll be very surprised if I have to eat it.
     
  14. Last edited: Feb 25, 2015
  15. juicytuna

    juicytuna Newcomer

    I take it from the lack of any reports that this was indeed just the Exynos 5433 then?
     
  16. Probably NDA for the audience. It was already weird that only a single outlet worldwide reported on the 2013 piece. The MediaTek presentation on a "2.5GHz Octa-core" afterwards is also nowhere to be seen.
     
    Last edited: Feb 26, 2015
  17. Alexko

    Alexko Veteran Subscriber

    Do you think that's just the effect of the 14nm process, or is there something else?
     
  18. Correct me if I'm wrong.
    With Exynos 7420's GPU, we're looking at:
    ~130 to 220 GFLOPS
    8 TMUs, 8 ROPs @ 700-770MHz
    25GB/s memory bandwidth + 1MB L2 cache
     
  19. juicytuna

    juicytuna Newcomer
