Samsung Orion SoC - dual-core A9 + "5 times the 3D graphics performance"

http://www.nocutnews.co.kr/show.asp?idx=1688473
http://www.careace.net/2011/01/19/galaxy-s2-device-codenamed-seine-teased-mwc-announcement/


The Galaxy S2 will be officially announced on 13 February and will sport an Orion SoC.

- Orion (Dual A9 1GHz + Mali 400)
- 1GB RAM
- 4.3" SAMOLED+ screen
- 1080p video recording
- 8MP camera
- Samsung Cloud Services (whatever that may be)
- Proposed goal of selling 10 million units before the end of 2011.


With a formal announcement in mid-February and that sales goal, the device will probably be available for sale by the beginning of H2 2011 at the latest.
This pretty much contradicts some assumptions I've seen around here that Orion would only become available in 2012.


With A5 using the SGX543MP2, Orion using Mali 400 and 3rd gen Snapdragons using Adreno 220, I guess the TI OMAP4 will be the graphically-weakest "high-end oriented" SoC available during 2011 (trading blows with Tegra2, maybe), using last year's SGX540 (unless they clock the GPU to astonishingly high levels).

That's a serious performance push. I, for one, welcome this take-no-prisoners performance war.
 
OK, I didn't know USSE2 doubled to vec4.

http://www.imgtec.com/News/Release/index.asp?NewsID=428

  • USSE2 - extended USSE instruction set with comprehensive vector operations and co-issue capability
  • upgraded tile handling to further reduce memory bandwidth and improve performance for setup-bound applications
  • typically 40% faster performance for 'shader-heavy' applications
  • 2x floating point and 2x hidden surface removal performance
  • enhanced triangle setup delivering up to 50% higher throughput
  • improved multi-sampling anti-aliasing performance
  • features for optimised performance when used with POWERVR VXD and VXE video cores
  • advanced colour space handling and gamma correction
  • further optimised OpenVG 1.x support
  • cache and MMU improvements
Not only that, as I said: the triangle setup improvements are reflected in the triangle rates I posted, and the MSAA & HSR performance improvements in the z/stencil unit counts, apart from all the others.
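As a back-of-envelope illustration of why the vec4 upgrade maps to the "2x floating point" claim in the list above (issue widths and shader length are assumptions for illustration, not vendor figures):

```python
# Back-of-envelope: cycles to execute a run of vec4 MADs on a
# vec2-issue ALU (original USSE, as I understand it) vs a
# vec4-issue ALU (USSE2). Assumed widths, not vendor figures.

def cycles_for_shader(vec4_ops, lanes_per_cycle):
    components = vec4_ops * 4            # each vec4 op touches 4 components
    # Ceiling division: a partially filled issue slot still costs a cycle.
    return -(-components // lanes_per_cycle)

shader_vec4_ops = 10                     # hypothetical: 10 vec4 MADs per pixel
usse_cycles = cycles_for_shader(shader_vec4_ops, 2)    # 20 cycles
usse2_cycles = cycles_for_shader(shader_vec4_ops, 4)   # 10 cycles
print(usse_cycles / usse2_cycles)        # 2.0, matching the "2x floating point" line
```

Real shaders mix scalar and vector work, so the ~40% figure for "shader-heavy" applications quoted above is the more realistic number.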
 
Oh come on, how hard can it be to guess that? There's a 2 in there. ;)

¬¬

So how much faster should a single core SGX543 be, compared to a SGX540, clock for clock?
Up to 30-40%, if it's not bottlenecked by the TMUs?
 
Orion is somewhat further along than I thought, apparently. Samsung is allocating the resources necessary to be competitive for design wins throughout much of the year. Still, a real-world performance comparison with the A5, which will still precede it by quite a while, will be interesting, both in video and in graphics.

With Samsung's aggressive push in the mobile SoC market, perhaps they'll even take an ARM architecture license in the near future, as some of their competitors have. Apple will eventually show the results of their own.
 
You'll probably know more than me, but I've never heard any GPU clock numbers for OMAP4. OMAP34xx is 110MHz and OMAP36xx is 200MHz. Since the SGX540 is pretty much an SGX530 doubled, it didn't cross my mind that TI would double the GPU clocks again, achieving 4x the previous generation's performance.

TI have recently made the 4430 datasheet available on their website:-
http://focus.ti.com/pdfs/wtbu/OMAP4430_ES2.x_PUBLIC_TRM_vO.zip

In section 11.3.2 you'll see the max clock for the SGX540 is stated as 307MHz. In their recent PR for the 4440, TI indicated the graphics would go 25% higher, which gets you in the region of 384MHz.
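The 25% uplift works out as simple arithmetic on the TRM number:

```python
# SGX540 max clock from the OMAP4430 TRM, plus TI's stated 25%
# graphics uplift for the OMAP4440.
omap4430_sgx540_mhz = 307
omap4440_uplift = 1.25
omap4440_sgx540_mhz = omap4430_sgx540_mhz * omap4440_uplift
print(round(omap4440_sgx540_mhz))  # ~384 MHz
```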
 
So how much faster should a single core SGX543 be, compared to a SGX540, clock for clock?
Up to 30-40%, if it's not bottlenecked by the TMUs?

50% at least, in non-fill-rate-bound scenarios. It's all theoretical anyway, since I don't think there are going to be single-core SGX543s in the end.

If indeed the SGX540 is clocked at 300-400MHz, then Tegra2 should become the slowest "top performer" of the bunch.
I'd say you're lucky if you surpass the 10M tris/s mark on the latter.

Since you're wondering about frequencies compared to former-generation SoCs, a good starting point is the manufacturing process each SoC is built on.
 
I just noticed there are also dual Cortex-M3 cores in the OMAP4430. As I understand it, they serve as imaging and video co-processors.
That seems nice.
 
http://www.nocutnews.co.kr/show.asp?idx=1688473
http://www.careace.net/2011/01/19/galaxy-s2-device-codenamed-seine-teased-mwc-announcement/


The Galaxy S2 will be officially announced on 13 February and will sport an Orion SoC.

- Orion (Dual A9 1GHz + Mali 400)
- 1GB RAM
- 4.3" SAMOLED+ screen
- 1080p video recording
- 8MP camera
- Samsung Cloud Services (whatever that may be)
- Proposed goal of selling 10 million units before the end of 2011.
- thinner than 9mm (looks like a dick-size competition with Apple)

With a formal announcement in mid-February and that sales goal, the device will probably be available for sale by the beginning of H2 2011 at the latest.
This pretty much contradicts some assumptions I've seen around here that Orion would only become available in 2012.


With A5 using the SGX543MP2, Orion using Mali 400 and 3rd gen Snapdragons using Adreno 220, I guess the TI OMAP4 will be the graphically-weakest "high-end oriented" SoC available during 2011 (trading blows with Tegra2, maybe), using last year's SGX540 (unless they clock the GPU to astonishingly high levels).
Isn't the Vivante GC860 faster in a lot of respects compared to the Tegra 2?

http://www.glbenchmark.com/compare....ocity A7&D4=Marvell Armada SmartPhone 800x480

eLocity a7 = Tegra 2, and the Marvell Armada = GC860
 
Taking those results at face value suggests that Vivante might have faster computational units and finer-grained or lower-latency branching, while Tegra 2 has better loop handling (perhaps a matter of explicit looping support versus none) and more/faster TMUs. The huge difference in texturing performance strikes me as the most significant win, at least for anything texturing-limited.
 
I never really understood what it is about GPU hardware that has to be specialized for 3D displays. Isn't it enough that it can render to two different framebuffers? I wouldn't be surprised at all if we see a bunch of other mobiles doing 3D without a GPU that promotes support for it. Come to think of it, I don't recall PICA200 ever advertising anything about 3D, yet the 3DS seems to support it fine.

The 200MHz boost is nice but not really earth-shattering. It will make it a little more competitive against the OMAP4440, for tablets.
 
For 3D you mainly want the display pipeline to be able to interleave two buffers.
 
I figured this is something the LCD itself could do... it would be a little more logic to handle, but worth it for the ease of integration.
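The interleaving of two eye buffers mentioned above is indeed a trivial operation; here's a minimal sketch in Python with nested lists standing in for framebuffers (names and the row-interleaved layout are illustrative assumptions, since actual panels differ):

```python
def interleave_rows(left, right):
    """Build a display buffer whose even rows come from the left-eye
    image and odd rows from the right-eye image, as a line-interleaved
    (e.g. parallax barrier) 3D panel would scan it out."""
    assert len(left) == len(right)
    out = []
    for y in range(len(left)):
        out.append(left[y] if y % 2 == 0 else right[y])
    return out

# Tiny 4-row "framebuffers": L = left-eye pixels, R = right-eye pixels
left = [["L"] * 4 for _ in range(4)]
right = [["R"] * 4 for _ in range(4)]
frame = interleave_rows(left, right)
print([row[0] for row in frame])  # ['L', 'R', 'L', 'R']
```

Whether this merge step lives in the display controller or in the panel's own logic is exactly the integration question being discussed.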
 
Vivante's stuff is still a ways off from release in end products. What was interesting to me about the numbers, and what someone helped put into perspective for me, is the impact AA and resolution scaling have on its performance.
 
Vivante's stuff is still a ways off from release in end products. What was interesting to me about the numbers, and what someone helped put into perspective for me, is the impact AA and resolution scaling have on its performance.

I doubt it's a lack of raw bandwidth; I'd put my money on inefficient bandwidth-saving capabilities. Multi-sampling typically has only a very small impact on fill-rate in comparison to bandwidth.

It could also be some driver quirk, though. We'll probably find out once products ship and drivers mature. In any case, apart from MSAA and resolution scaling, the Vivante-based platforms seem to score exceptionally well. Not a contender I'd overlook. The ARM Mali200 cores I see left and right in the GL Benchmark database seem to need a whole lot more work than the Vivante ones.
 
resolution scaling impact on performance?

I doubt it's a lack of raw bandwidth; I'd put my money on inefficient bandwidth-saving capabilities. Multi-sampling typically has only a very small impact on fill-rate in comparison to bandwidth.

It could also be some driver quirk, though. We'll probably find out once products ship and drivers mature. In any case, apart from MSAA and resolution scaling, the Vivante-based platforms seem to score exceptionally well. Not a contender I'd overlook. The ARM Mali200 cores I see left and right in the GL Benchmark database seem to need a whole lot more work than the Vivante ones.

Do you have an idea how performance should scale with resolution on these tests ideally?

If I had Windows executables for these tests, I'd try them on a desktop ATI or NVIDIA card at different resolutions, à la the FutureMark benchmarks.

I only find a couple of examples with the same chip at different resolutions on the GLbenchmark site:

Example 1: Apple A4 chip (Imagination SGX535)

iPhone4 640x960 = 614,400 pixels
iPad 768x1024 = 786,432 pixels

iPad has 1.28 times the pixels of iPhone4

GLbenchmark 2.0 Egypt test:
iPhone 723 frames
iPad 571 frames

=> performance drop for the SGX535 is 21% for 1.28x more pixels (assuming both A4 chips are the same)

Example 2: Marvell Armada (Vivante GC860)

Armada 600x1024 = 614,400 pixels
Armada 720x1280 = 921,600 pixels

720x1280 has 1.5 times the pixels of 600x1024

GLbenchmark 2.0 Egypt test:
600x1024 2478 frames
720x1280 1753 frames

=> performance drop for Vivante GC860 is 29% for 1.5x more pixels (assuming both Armada chips are the same)

I'm not sure what conclusions to draw from the limited data in these two examples. It will be good once a higher-resolution Tegra 2 platform is tested, to make the same comparison for NVIDIA on the same tests.
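The two examples above boil down to a few lines of arithmetic on the quoted GLBenchmark frame counts:

```python
def drop(frames_lo_res, frames_hi_res):
    """Percentage performance drop going to the higher resolution."""
    return round((1 - frames_hi_res / frames_lo_res) * 100)

# Apple A4 / SGX535: iPhone 4 (640x960) vs iPad (768x1024)
pixel_ratio_a4 = (768 * 1024) / (640 * 960)      # 1.28x the pixels
drop_a4 = drop(723, 571)                         # 21% fewer frames

# Marvell Armada / GC860: 600x1024 vs 720x1280
pixel_ratio_gc860 = (720 * 1280) / (600 * 1024)  # 1.5x the pixels
drop_gc860 = drop(2478, 1753)                    # 29% fewer frames

print(pixel_ratio_a4, drop_a4)        # 1.28 21
print(pixel_ratio_gc860, drop_gc860)  # 1.5 29
```

In both cases the drop is well below linear in pixel count, which suggests neither chip is purely fill-rate bound at these resolutions.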
 
I had another look at both the Armada's 1.1 and 2.0 results in comparison to other solutions, and there doesn't seem to be anything wrong with its resolution scaling after all.

I assume that Armada tablets have a higher-clocked and Armada smartphones a lower-clocked SoC.

However, if you take the 1024*600 resolution (far more common for this year's tablets), with 4xMSAA the score drops from 2478 to 1646, about 34%, while other solutions are rather in the 17-20% ballpark.

Those are obviously all just reference devices, that's why I also thought of a possible driver quirk.

A Rockchip RK29 (tablet?) appeared in the database at 1024*768, based on a Vivante GC800 (and not 860). 1754 overall score, and 1391 with MSAA (a 21% drop). Now go figure, heh....
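For reference, the two MSAA drops quoted above come straight from the database scores:

```python
def msaa_drop(score_no_aa, score_with_aa):
    """Percentage score drop when MSAA is enabled."""
    return round((1 - score_with_aa / score_no_aa) * 100)

# Marvell Armada / GC860 at 1024x600: 2478 -> 1646 with 4xMSAA
print(msaa_drop(2478, 1646))  # 34
# RK29 / GC800 at 1024x768: 1754 -> 1391 with MSAA
print(msaa_drop(1754, 1391))  # 21
```

So the GC800 device lands in the same 17-20%-ish ballpark as other solutions, which makes the GC860's 34% drop look more like a driver or platform quirk than an architectural limit.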
 
I note that Samsung have put some brochures on their site:-

First, the S5PC110/S5PV210 (Hummingbird):-
http://www.samsung.com/global/busin...ort/brochures/downloads/systemlsi/S5PV210.pdf

Now Orion:-
http://www.samsung.com/global/business/semiconductor/support/brochures/downloads/systemlsi/Orion.pdf

Two interesting things.

1) The Orion functional block diagram shows the same 3D/2D block as the S5PC110, i.e. 20M polys/s. I assume this is a lazy mistake by someone.

2) The Orion narrative does not mention a poly rate; however, they do mention a fill rate of 3200M pix/s. My understanding is that Mali-400MP4 does 1 pixel per clock per core, giving 1600M pix/s @ 400MHz. So it looks like Samsung are using an equivalent x2 multiplier to allow for their culling of otherwise-drawn pixels. If true, this is ironic, as ARM posted an article before Christmas lambasting the notion of "virtual" pixels, which I took to be a very thinly veiled criticism of IMG's x2-x2.5 overdraw allowance.

http://blogs.arm.com/multimedia/353-of-philosophy-and-when-is-a-pixel-not-a-pixel/
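The 3200M pix/s figure only lines up under that x2 assumption (1 pixel/clock/core is my reading of the Mali-400, not a Samsung-stated number):

```python
# Mali-400 MP4 fill-rate, assuming 1 pixel per clock per fragment core.
cores = 4
clock_mhz = 400
raw_fill_mpix = cores * clock_mhz       # 1600 Mpix/s of actually drawn pixels

# Apparent "virtual pixel" multiplier needed to reach the brochure number,
# crediting pixels culled before being drawn.
overdraw_factor = 2
quoted_fill_mpix = raw_fill_mpix * overdraw_factor
print(quoted_fill_mpix)  # 3200, matching the brochure's 3200M pix/s
```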
 