Samsung Orion SoC - dual-core A9 + "5 times the 3D graphics performance"

Discussion in 'Mobile Devices and SoCs' started by Mike11, Sep 7, 2010.

  1. Arun

    Arun Unknown. Legend

    Very good catch! :)

    I'm pretty sure they must have improved their ALU precision to FP24 and their depth buffer support from 16-bit to 24-bit, although they don't expose 24-bit depth in OpenGL ES, probably because it would have a noticeable performance hit (a full +50% depth bandwidth, since they don't support framebuffer compression AFAICT). Every time I do performance analysis on Tegra at work my eyes bleed at all the depth fighting artifacts... (it's not as bad in games/benchmarks that set their zmin/zmax intelligently, but it's still fairly bad).
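
    A rough sketch of why 16-bit depth fights so badly and why sensible zmin/zmax helps. It assumes a standard perspective projection into a fixed-point depth buffer; the near/far/distance values are made up for illustration and are not from any particular game or from Tegra itself.

```python
def depth_resolution(z, near, far, bits):
    """Approximate smallest eye-space separation resolvable at distance z.

    Assumes window depth d(z) = far * (z - near) / (z * (far - near)),
    quantised into 2**bits - 1 equal steps (fixed-point depth buffer).
    """
    steps = 2 ** bits - 1
    # Slope of d(z): how much of the [0, 1] depth range one unit of z covers.
    slope = far * near / ((far - near) * z * z)
    return 1.0 / (steps * slope)


def depth_bandwidth_ratio(new_bits, old_bits):
    """Relative depth traffic per pixel with no compression."""
    return new_bits / old_bits


if __name__ == "__main__":
    for bits in (16, 24):
        loose = depth_resolution(100.0, 0.1, 1000.0, bits)
        tight = depth_resolution(100.0, 1.0, 1000.0, bits)
        print(f"{bits}-bit: loose near plane {loose:.4f}, tight near plane {tight:.4f}")
    print("24-bit vs 16-bit depth traffic:", depth_bandwidth_ratio(24, 16))
```

    Under these assumptions 24-bit depth resolves about 256 times finer than 16-bit at the same distance, and pushing the near plane out also helps a lot, which matches the zmin/zmax remark above. The 24/16 ratio is where the +50% bandwidth figure comes from.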

    Obviously these are all trade-offs and I understand some of the reasons why they made them, but I think at a basic level NVIDIA designed the original Tegra GPU in an era where they thought handheld GPU performance wouldn't increase anywhere near as fast as it has, and more importantly they thought they'd be more limited by area than they actually are at this point (leading to things like no framebuffer compression). It will be interesting to see how aggressive they are with handheld Kepler (and how similar it is to PC Kepler) once that comes to market, although it remains to be seen when that actually is and what the competition will be at that point... :twisted:
     
  2. french toast

    french toast Veteran

    Ha, nice ;)
     
  3. Ailuros

    Ailuros Epsilon plus three Legend Subscriber

    That's what I thought too; but it should be there in T3 at least (FP24 PS and 24-bit Z) for whatever Windows it may run on.
     
  4. Exophase

    Exophase Veteran

    2004 standards are ancient history, even in the mobile world, but I think the TMU to ROP ratio was non-one long before that. Look at the first dual-texturing GPUs: they could handle two texels for every pixel output. That right there indicates a 2:1 ratio. At that point fragment shading wasn't programmable (and I wouldn't consider it fully programmable until DX9 level) so it's difficult to identify how many "ALUs" these units had. But the combiners often had several stages for each pixel output too, so if you count those as ALUs that number is higher. You need at least one combiner stage for every texel, but you could have more for other inputs.

    He's saying that all of the SIMDs are capable of FP16 or better per lane, but that doesn't mean that they're limited to that.

    I bet it's more that the shaders do half the blending and the ROP does the other half, something like this:

    Code:
    shader: color.rgb *= color.a
    shader: color.a = 1 - color.a
    ROP: render_target.rgb = (render_target.rgb * color.a) + color.rgb
    
    Because normally the shader can't read the render target directly, you really want to keep that decoupled.
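
    For what it's worth, the split above works out to ordinary source-over alpha blending. A minimal sketch to check that, with the function names being hypothetical and plain floats standing in for whatever precision the hardware actually uses:

```python
def shader_stage(src_rgb, src_a):
    """Shader half: premultiply colour by alpha, then output 1 - alpha."""
    premultiplied = tuple(c * src_a for c in src_rgb)
    return premultiplied, 1.0 - src_a


def rop_stage(dst_rgb, shader_rgb, shader_a):
    """ROP half: one multiply-add per channel against the render target."""
    return tuple(d * shader_a + s for d, s in zip(dst_rgb, shader_rgb))


def reference_blend(dst_rgb, src_rgb, src_a):
    """Textbook source-over blend: src * a + dst * (1 - a)."""
    return tuple(s * src_a + d * (1.0 - src_a) for s, d in zip(src_rgb, dst_rgb))


if __name__ == "__main__":
    src, alpha, dst = (1.0, 0.5, 0.0), 0.25, (0.0, 0.5, 1.0)
    rgb, inv_a = shader_stage(src, alpha)
    print("split:    ", rop_stage(dst, rgb, inv_a))
    print("reference:", reference_blend(dst, src, alpha))
```

    The ROP only ever needs a multiply-add on the destination, and the shader never has to see the render target.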
     
  5. Arun

    Arun Unknown. Legend

    Both SGX and Tegra have full access to the previously rendered color in the pixel shader :)
    So with the right extensions/low-level access you can do HDR in non-RGB colorspaces very efficiently for example...
     
  6. Ailuros

    Ailuros Epsilon plus three Legend Subscriber

    Not the right topic anyway, but I've been wondering for quite some time now where the 9th FLOP per ALU comes from in Series5XT GPU IP.
     
  7. wco81

    wco81 Legend

    Well, Samsung announced the Galaxy Tab 2 at 7 and 10.1-inch screen sizes.

    They have ICS and are described as 1GHz dual-core. They don't try to compete with the high-resolution screen on the iPad, but they are lower in price, especially the 7-inch model at $250.
     
  8. Ailuros

    Ailuros Epsilon plus three Legend Subscriber

  9. So the kernel sources for the S3 were released and the final clock on the Mali is the same as I posted several months ago, 440MHz.

    While I'm pretty sure there's no driver magic going on here, the only reasonable explanation would be that the memory bandwidth is vastly improved. Bandwidth tests put the S3 at roughly 30% higher speeds in real-world metrics over the S2. I don't see any other explanation for a 95% performance increase from only a 65% clock increase on the GPU. Reports have been posted that it has an "internal 128-bit bus" versus 64-bit in the 4210, but I do not understand how exactly this helps memory bandwidth, as the memory itself remained (apparently) unchanged.
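
    To put numbers on that gap, here is a small back-of-the-envelope sketch. It only uses the percentages quoted in this post plus the 266MHz S2 default that comes up further down the thread; the helper names are made up.

```python
def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new / old - 1.0) * 100.0


def per_clock_gain(perf_increase_pct, clock_increase_pct):
    """Performance gain left over after dividing out the clock increase."""
    perf_ratio = 1.0 + perf_increase_pct / 100.0
    clock_ratio = 1.0 + clock_increase_pct / 100.0
    return (perf_ratio / clock_ratio - 1.0) * 100.0


if __name__ == "__main__":
    clock = pct_increase(266.0, 440.0)  # MHz, S2 default vs S3
    print(f"clock increase: {clock:.1f}%")
    print(f"unexplained per-clock gain: {per_clock_gain(95.0, 65.0):.1f}%")
```

    So the part that needs explaining is roughly an 18% per-clock improvement, which is in the same ballpark as the ~30% bandwidth difference, but could just as well come from somewhere else.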
     
  10. french toast

    french toast Veteran

    I get 797MB/s on that test using a 70MB test size-45%. I would have thought they'd have stuck in some LPDDR2-1066? Anyway, I'm not running out of bandwidth anytime soon, I know that much; this thing chews through anything. The only gripe is that Samsung didn't stick in 2GB of RAM. I've only got 780MB to play with for some reason; I expect that's ICS + TouchWiz taking the rest. I regularly find around 400MB used constantly while doing nothing on the home screen, and even when I close every app from the task manager and clear RAM I barely get it under 400MB at best. I've installed a RAM booster app to clear RAM when I get down to only 300MB free.

    I've noticed I sometimes get kicked out of apps like Opera when I'm loading it heavily; I wonder if that's ICS/TouchWiz cutting apps back? You would have thought they'd have started with some annoying background processes first?

    Anyway, once I put that auto RAM booster in I don't run out of RAM for my main apps. Even with everything running, some 25 apps, I've experienced only a slight stutter when RAM gets filled and ICS starts closing or tombstoning. Bandwidth, it seems, is more than fine. :smile:

    Let me know if you want me to run any benchmarks; I have quite a few installed already. Cheers.
     
  11. Ailuros

    Ailuros Epsilon plus three Legend Subscriber

    What am I missing, where's the 95% performance difference? http://www.glbenchmark.com/compare....00 Galaxy S III&D2=Samsung GT-i9100 Galaxy S2 ....or are you saying that the S2 results are at 440MHz in the latter?

    If the above Egypt offscreen results are at default GPU frequencies (266 and 440MHz respectively), the results look quite reasonable.
     
  12. 65fps?! Wait a moment. ......... Okay, I just ran it again in 2.1.4 and now it gives me 61fps. A few weeks ago I was constantly getting 53fps. This is silly. Either they changed something between 2.1.3 > 2.1.4 or drivers did indeed improve performance and I didn't notice it in the meantime. Bollocks. French toast, can you run an Egypt test on the latest GLBenchmark? I'm still waiting on my blue S3.
     
  13. french toast

    french toast Veteran

    Egypt offscreen 720p - 11064 frames, 98fps; (98/65 - 1)*100 ≈ 51%

    Pro offscreen 720p -6167 frames 123fps

    Note: on the latest run I've edged a bit higher at 99 and 125; however, I've only conducted 3 tests.

    EDIT: As I'm quite new to this app it seems I've made a mistake, or at least I'm confused, as the app says it's running in 1080p offscreen on version 2.1.4? I thought 1080p was version 2.5?
     
    Last edited by a moderator: Jun 1, 2012
  14. Well well. This pretty much leaves driver improvements as the only viable explanation, if Kishonti didn't change much in the benchmark.

    So basically scores on the S2 improved 30-40% over the last year or so..
     
  15. Ailuros

    Ailuros Epsilon plus three Legend Subscriber

    Doesn't surprise me one bit; ARM isn't the only IHV with GPU IP where some driver and/or compiler tweaking brought significant performance increases over time.
     
  16. almighty

    almighty Banned

    Adreno released a new driver for the Adreno 220 GPU in December and I saw a 2-3x performance jump in loads of apps.

    I can get 53fps in Nenamark 2, which is much higher than even the Galaxy S2.
     
  17. french toast

    french toast Veteran

    Right, quick update about Exynos 4412 and quad cores in general on Android.

    Loaded System Tuner Pro (amazing piece of software) to track all four threads to see whether they actually are being used, and if so how much.
    It is quite clear that they do get used very frequently, sometimes clocking all 4 at 800MHz, sometimes thrashing all four at 1.4GHz (more often than you would think), with those optimisations Samsung mentioned being evident as 3 cores shut off, or 2 or even 1, and with different cores being able to clock at different frequencies for extra efficiency.

    The minimum speed is 200MHz, and you can limit it to only go up to 1GHz using power saving mode, which does very slightly impact performance even with all 4 cores available. That, along with the speed at which you run out of RAM, tells me Android could easily use another GB of RAM and some higher single-thread performance. Amazing, considering my phone is now considerably faster than my netbook.
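
    If anyone wants to check this without a tuner app, the same per-core numbers can usually be read from the Linux cpufreq sysfs files. A minimal sketch, assuming the standard `/sys/devices/system/cpu/cpuN/cpufreq/scaling_cur_freq` layout is present and readable on the device (not verified on the S3 here); the function name is made up, and a core that is shut off is assumed to simply have no readable file.

```python
from pathlib import Path


def read_core_freqs_khz(cpu_root="/sys/devices/system/cpu"):
    """Return {core name: current frequency in kHz, or None if unreadable}."""
    freqs = {}
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not sibling dirs such as cpufreq.
    for core in sorted(Path(cpu_root).glob("cpu[0-9]*")):
        freq_file = core / "cpufreq" / "scaling_cur_freq"
        try:
            freqs[core.name] = int(freq_file.read_text().strip())
        except (OSError, ValueError):
            # Missing or unreadable file: treat the core as offline/unknown.
            freqs[core.name] = None
    return freqs
```

    Calling `read_core_freqs_khz()` a few times a second from a shell on the phone should show cores dropping out and coming back at different clocks, if the description above holds.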

    So all those naysayers saying that 4 cores were a waste of battery (battery life is very good) and would be a waste of resources as they would be redundant can pipe down. I have seen for myself that all 4 threads are indeed used at differing frequencies, with max frequency used more often than you would think. Android is silky smooth as a result.

    Cheers.
     
  18. Ailuros

    Ailuros Epsilon plus three Legend Subscriber

    While there will always be naysayers for pretty much everything, it remains a fact that efficiency per core (or per thread) is way more important than a sterile number of cores. Besides, personally I wouldn't care how often N hw is really needed, but rather how badly it's needed when it is.
     
  19. french toast

    french toast Veteran

    Yes, I get your point, but most of what you have said is in software. Exynos 4412 is nearly at the apex of what can be done on a mobile device; I suspect Snapdragon S4 Pro, built on 28nm HKMG, would be the ultimate summit of both performance and battery life, but for the next 2 years, apart from RAM, I can't say I'm going to want or need any extra power in my pocket. We have got to the stage of a PC in your pocket, ridiculous.

    Can I ask you, how does the Adreno 320 compare to the S3's Mali-400 MP4?

    EDIT: One more thing, in head-to-head gaming, i.e. Nova 3, Tegra 3 gets trounced by Exynos 4412.
     
    Last edited by a moderator: Jun 7, 2012