Samsung Exynos 5250 - production starting in Q2 2012

Discussion in 'Mobile Devices and SoCs' started by Deleted member 13524, Nov 30, 2011.

  1. balagamer

    Newcomer

    Joined:
    Jul 5, 2013
    Messages:
    3
    Likes Received:
    0
    With GFXBench, the fill rate scores are drastically different compared to the iPhone 5, despite it having a similar GPU to the S4. Is that a limitation of the platform, or is it due to some other factors?
     
  2. Clock speeds. The GPU in the iPhone 5 is clocked lower.
     
  3. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
  4. balagamer

    Newcomer

    Joined:
    Jul 5, 2013
    Messages:
    3
    Likes Received:
    0
  5. tangey

    Veteran

    Joined:
    Jul 28, 2006
    Messages:
    1,537
    Likes Received:
    282
    Location:
    0x5FF6BC
    In GLBenchmark, the Samsung 5410 has a 13% better off-screen fill rate than the iPhone 5, but a 60%+ higher clock (assuming the 5410 is clocking at 533MHz for the test).

    So either Samsung's graphics datapath is inferior to the one in the A6, or it's a driver issue.
     
  6. tangey

    Veteran

    Joined:
    Jul 28, 2006
    Messages:
    1,537
    Likes Received:
    282
    Location:
    0x5FF6BC
    Yes, and even that is probably being generous, as the graph might assume a 480MHz clock to work out the theoretical fill rate.
     
  7. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    My understanding so far is that it actually clocks at 480MHz in the vast majority of cases and at 532MHz in only a couple of benchmarks. I might have misunderstood Nebu, but I think it clocks at 480MHz in GLB.

    Besides that, I'd still love to know which variant is the one with nearly 100% efficiency.
     
  8. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    1,061
    Likes Received:
    329
    Location:
    Luxembourg
    480 in GFXBench and 532 in the old GLB apps.

    I'm curious about the wording in that IMG blog, as if it wants to say that the inefficiency is because of the high clocks. I'm grasping at straws here.

    I did a quick bench of 350 vs 480MHz; both of those GPU clocks force a memory lock to 800MHz, so bandwidth shouldn't be an issue:

    350:
    2.5 Egypt offscreen: 3534 frames
    [strike]Fill-rate offscreen: 987526ktex/s[/strike]

    480:
    2.5 Egypt offscreen: 4517 frames
    [strike]Fill-rate offscreen: 1323662ktex/s[/strike]

    37.14% higher frequency for a 27.81% improvement in Egypt [strike]and 34.03% improvement in fill-rate.[/strike]

    I can do some more synthetic benches while locking all of the phone's frequencies and several runs if somebody would like to see that.

    PS: Does that ImgTech blog even take into account Exynos's cheating?
    Would be funny if the efficiency is calculated based on a 532MHz clock but 480MHz results :D


    PS2: I found the fill-rate to be very bogus, reran it:

    480:
    Run1: 1902600 ktex/s
    Run2: 1415437 ktex/s
    Run3: 1911574 ktex/s
    Run4: 1936990 ktex/s

    350:
    Run1: 1630870 ktex/s
    Run2: 1644457 ktex/s
    Run3: 1674204 ktex/s

    Given the above reruns, it's even worse: only a 15.69% improvement between the best scores at 350 and 480. That's a bandwidth limitation, right?
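
    For anyone who wants to redo the scaling arithmetic, here is a minimal Python sketch. It only uses the numbers quoted in this post (the Egypt frame counts and the best rerun fill-rate scores); the helper name gain_pct is made up for illustration.

```python
def gain_pct(low, high):
    """Percentage improvement going from `low` to `high`."""
    return (high / low - 1.0) * 100.0

# Numbers quoted in the post: GPU clock in MHz, Egypt offscreen frames,
# and the best rerun fill-rate scores in ktex/s.
freq_gain = gain_pct(350, 480)
egypt_gain = gain_pct(3534, 4517)
fill_gain = gain_pct(1674204, 1936990)

for name, gain in [("Egypt", egypt_gain), ("fill rate", fill_gain)]:
    # Share of the clock increase that actually showed up in the score.
    print(f"{name}: +{gain:.2f}% for +{freq_gain:.2f}% clock "
          f"({gain / freq_gain:.0%} of ideal scaling)")
```

    A score that scaled perfectly with the GPU clock would show 100% of ideal scaling; anything well below that suggests a bottleneck other than the GPU clock, without saying which one.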

    I'll have to investigate GPU thermal throttling...
     
    #628 Nebuchadnezzar, Jul 5, 2013
    Last edited by a moderator: Jul 5, 2013
  9. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    6 TMUs * 480MHz = 2.88 GTexels/s
    Kishonti results onscreen = 1.97 GTexels/s = 68%
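
    The same arithmetic as a small Python sketch, assuming one texel per TMU per clock. The function name theoretical_gtexels is hypothetical, and the 532MHz row is only there to show how much the efficiency figure moves if the wrong clock is assumed.

```python
def theoretical_gtexels(tmus, clock_mhz):
    """Peak texel rate in GTexels/s, assuming one texel per TMU per clock."""
    return tmus * clock_mhz / 1000.0

measured = 1.97  # GTexels/s, the Kishonti onscreen result quoted above

for clock_mhz in (480, 532):
    peak = theoretical_gtexels(6, clock_mhz)
    print(f"{clock_mhz} MHz: peak {peak:.2f} GTexels/s, "
          f"efficiency {measured / peak:.0%}")
```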


    Alas if it's already throttling in a simple fillrate test. Either the driver needs some serious work, or something else is wrong, with bandwidth being one of probably many candidates.
     
  10. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    1,061
    Likes Received:
    329
    Location:
    Luxembourg
    And if you lower the frequency, the efficiency goes up.

    I ran an across-the-board sweep of some possible scenarios:

    [Two images showing the sweep results; not preserved]

    I also tested lowering the internal bus but that didn't have any effect at all on the scores.

    What are the actual bandwidth requirements per TMU per cycle?

    I'm still not aware of any GPU throttling mechanism, but memory has throttling in place.
     
    #630 Nebuchadnezzar, Jul 5, 2013
    Last edited by a moderator: Jul 5, 2013
  11. tangey

    Veteran

    Joined:
    Jul 28, 2006
    Messages:
    1,537
    Likes Received:
    282
    Location:
    0x5FF6BC
    The iPhone 5 isn't far off.

    Assuming it's 325MHz, that gives 650 Mtexels/s per core, x3 = 1950 Mtexels/s.
    We have to allow a small reduction, as IMG has said multi-core performance scales about 95% linearly. That would work out to about 1850.

    Offscreen fill rate in GFXBench is 1835.
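
    As a rough Python sketch of that estimate: the 2 TMUs per core is inferred from the "650 per core at 325MHz" figure, the 325MHz clock is itself an assumption, and the function name is hypothetical.

```python
def estimated_fill_mtexels(clock_mhz, tmus_per_core, cores, scaling):
    """Estimated fill rate in Mtexels/s for a multi-core GPU.

    `scaling` is the fraction of linear multi-core scaling actually achieved.
    """
    return clock_mhz * tmus_per_core * cores * scaling

ideal = estimated_fill_mtexels(325, 2, 3, 1.0)    # perfect scaling
scaled = estimated_fill_mtexels(325, 2, 3, 0.95)  # ~95% scaling per IMG
measured = 1835                                   # GFXBench offscreen result

print(f"ideal {ideal:.0f}, with 95% scaling {scaled:.1f}, "
      f"measured is {measured / scaled:.1%} of the scaled estimate")
```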
     
    #631 tangey, Jul 6, 2013
    Last edited by a moderator: Jul 6, 2013
  12. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    Interesting.


    No idea to be honest.

    Or Series5XT cores simply aren't meant for very high frequencies, unlike Series6 of course, as far as I understand it so far.
     
  13. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    1,061
    Likes Received:
    329
    Location:
    Luxembourg
    http://browser.primatelabs.com/geekbench2/compare/2136607/2136542

    Can anybody theorise about the difference in the stream scores in the above results? The higher one is from the A15s at 800MHz and the other one is from the A7s at 1500MHz.

    Which, by the way, also answers my question from the MediaTek thread about how high the 5410's A7s can go. The cores are very frugal even at a higher voltage.
     
  14. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    This is by no means a solid analysis, but maybe the A7 is latency bound by the FPU operations while the A15 isn't (technically even the copy operation should be going through the FPU). Stream consists of very tight loops; if the compiler isn't unrolling it then you could end up with loading and storing to the same registers causing stalls due to WAW hazards (and hitting RAW with load-use latency, to some degree). The A15 would hide this due to its register renaming.

    But that doesn't explain why Cortex-A9s get much better scores in Geekbench, since they should be subject to the same problem.
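
    To show how tight those loops are, here is a sketch of the four classic STREAM kernels in Python. This follows the well-known STREAM definitions (copy, scale, add, triad); how Geekbench implements its stream tests is an assumption, and Python obviously says nothing about register allocation or hazards.

```python
def stream_kernels(a, b, c, scalar=3.0):
    """Run the four classic STREAM kernels in place on lists a, b, c."""
    n = len(a)
    for i in range(n):  # Copy: one load, one store
        c[i] = a[i]
    for i in range(n):  # Scale: one load, one multiply, one store
        b[i] = scalar * c[i]
    for i in range(n):  # Add: two loads, one add, one store
        c[i] = a[i] + b[i]
    for i in range(n):  # Triad: two loads, multiply-add, one store
        a[i] = b[i] + scalar * c[i]
    return a, b, c

a, b, c = stream_kernels([1.0] * 4, [2.0] * 4, [0.0] * 4)
print(a, b, c)
```

    Each loop body is only a load or two, at most one arithmetic operation, and a store, so in compiled code without unrolling every iteration would tend to reuse the same few registers.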
     
  15. Arun

    Arun Unknown.
    Legend

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    Very interesting data, thanks! :)
    Is there any way to change memory CAS like on the desktop? That could be very interesting to test the impact of latency vs bandwidth (although I don't know if the memory controller itself is clocked based on memory frequency and whether it plays a noticeable part in total latency or not).

    It's worth pointing out that LPDDR3-1600 has worse latency than LPDDR2-1066 (not sure exactly how it compares to LPDDR2-800) so you might have a double whammy of higher latency than some competing systems with higher GPU frequency as well. So I suspect the memory latency in cycles might be higher than on any other SGX device (except for very badly designed ones perhaps, I don't really know).
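
    To make the "double whammy" concrete, here is a tiny Python sketch that converts a memory latency in nanoseconds into GPU cycles. The latency values are purely hypothetical placeholders, not measurements; only the 325MHz and 480MHz GPU clocks come from this thread, and the function name is made up.

```python
def latency_in_gpu_cycles(latency_ns, gpu_clock_mhz):
    """Number of GPU clock cycles that elapse during `latency_ns`."""
    return latency_ns * gpu_clock_mhz / 1000.0

# HYPOTHETICAL latencies, chosen only to illustrate the effect.
cases = [
    ("lower latency, 325MHz GPU", 100, 325),
    ("same latency, 480MHz GPU", 100, 480),
    ("higher latency, 480MHz GPU", 120, 480),
]

for label, latency_ns, clock_mhz in cases:
    cycles = latency_in_gpu_cycles(latency_ns, clock_mhz)
    print(f"{label}: {cycles:.1f} GPU cycles to hide")
```

    The same latency costs more cycles at a higher GPU clock, and a higher latency on top of that compounds the effect.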
     
  16. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    1,061
    Likes Received:
    329
    Location:
    Luxembourg
    Yes, but the value fields are undocumented, so I would be changing them blindly. See line 192: https://github.com/AndreiLux/Perseus-UNIVERSAL5410/blob/perseus/drivers/devfreq/exynos5410_bus_mif.c
     
  17. french toast

    Veteran

    Joined:
    Jan 5, 2012
    Messages:
    1,667
    Likes Received:
    9
    Location:
    Leicestershire - England
  18. wishiknew

    Regular

    Joined:
    May 19, 2004
    Messages:
    341
    Likes Received:
    9
  19. Nebuchadnezzar

    Legend

    Joined:
    Feb 10, 2002
    Messages:
    1,061
    Likes Received:
    329
    Location:
    Luxembourg
  20. french toast

    Veteran

    Joined:
    Jan 5, 2012
    Messages:
    1,667
    Likes Received:
    9
    Location:
    Leicestershire - England
    Surprising, I thought the T624 would be used... and MP6? Didn't see that one coming.

    That settles it then: a Galaxy Note 3 containing this beast will be my next phone.
    What about power consumption? I hope they have implemented big.LITTLE better on this.
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.