Old 29-Jul-2012, 01:42   #1
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 7,818
Default GLBenchmark 2.5 Discussion

Quote:
Originally Posted by french toast View Post
Edit: version 2.5 is already used in the gl benchmark pro I think?
No. 2.5 is the newest benchmark set and far more demanding than 2.1, which contains Egypt and PRO. The new results are floating in at http://www.glbenchmark.com/result.jsp and so far it's mostly 32nm Exynos and Tegra3, with nothing all "that" fast as yet.
__________________
People are more violently opposed to fur than leather; because it's easier to harass rich ladies than motorcycle gangs.
Old 29-Jul-2012, 04:09   #2
Lazy8s
Senior Member
 
Join Date: Oct 2002
Posts: 2,833
Default

Taking into consideration the GLBenchmark 2.5 results that were revealed for the new iPad in TI's OMAP5 promo video (which might've been using a slightly earlier, beta version of GLBench 2.5 compared to the version just released for Android) and some of my own testing with a Galaxy Nexus, PowerVR's SGX performs comparatively better, as predicted, than other architectures under the added graphics detail of GLBench 2.5 over 2.1.

I'd guess the higher effective fill rate of PowerVR is playing a significant part considering the bump to 1080p, yet GLBench 2.5 also enhanced several other aspects of the graphics over 2.1 which should theoretically play to PowerVR's strengths.
Old 29-Jul-2012, 10:31   #3
french toast
Senior Member
 
Join Date: Jan 2012
Location: Leicestershire - England
Posts: 1,488
Default

I'm shocked to see tegra 3 thrashing exynos..
Old 29-Jul-2012, 10:35   #4
french toast
Senior Member
 
Join Date: Jan 2012
Location: Leicestershire - England
Posts: 1,488
Default

Yea but changing uarch is a time consuming costly business is it not?...by the sounds of it it's only bringing 10-15% better performance.
Old 29-Jul-2012, 12:01   #5
Jubei
Member
 
Join Date: Dec 2011
Posts: 282
Default

Quote:
Originally Posted by french toast View Post
Yea but changing uarch is a time consuming costly business is it not?...by the sounds of it it's only bringing 10-15% better performance.
I think it's a cheaper and more logical solution than being beaten in performance by Cortex A15 and not being able to compete until the next generation is ready.

10-15% better performance should allow them to stay competitive while they work on a new architecture.
Old 29-Jul-2012, 14:09   #6
Lazy8s
Senior Member
 
Join Date: Oct 2002
Posts: 2,833
Default

The biggest advantage Qualcomm has gotten out of their ISA license is getting to market earlier with a next generation part, helping them to secure important design win momentum.

That One X listing which ranked at the top of the 2.5 results is far from stock... lots of different software modifications to the environment, at least... EDIT: Scratch that. The environment listings just compile all the data from all of the different runs, and the One X has been released in different flavors of processors.

Last edited by Lazy8s; 29-Jul-2012 at 14:42.
Old 29-Jul-2012, 14:53   #7
Arun
Unknown.
 
Join Date: Aug 2002
Location: UK
Posts: 4,882
Default

Quote:
Originally Posted by french toast View Post
I'm shocked to see tegra 3 thrashing exynos..
Those results are for the Adreno 225-based One X; I tested the HTC One S and got a score very close to that. The reason why Adreno is doing so well in GLB2.5 is that it's a much more ALU-heavy benchmark than GLB2.1, and that has always been Adreno's strength. For Tegra you should be looking at the Transformer scores, which are accurate.

As for Exynos, it's easy to figure out that it's heavily vertex shader limited, and that's despite Kishonti implementing lots of things to remove unnecessary geometry (portal culling, reflection LODs, etc.) - it was much worse in some earlier benchmark versions.
Quote:
Originally Posted by Lazy8s
Taking into consideration the GLBenchmark 2.5 results that were revealed for the new iPad in TI's OMAP5 promo video (which might've been using a slightly earlier, beta version of GLBench 2.5 compared to the version just released for Android)
That demo was a version for Mobile World Congress, you should ignore the results completely - there have been about a gazillion new builds since then.
Quote:
and some of my own testing with a Galaxy Nexus
I'm curious, which ones are these? The result that seems correct for the Nexus is 573 for 1080p Offscreen, which is pretty good for a GPU with less than 5 GFLOPS of ALU performance, but if you're impressed by that then SGX-XT will blow your mind. I only tested on an ICS Nexus on Friday, but JB shouldn't make a huge difference.

Quote:
PowerVR's SGX performs comparatively better, as predicted, than other architectures under the added graphics detail of GLBench 2.5 over 2.1.
It certainly does, although TBDR only plays a small part in it as Kishonti are doing fairly aggressive front-to-back sorting combined with portal culling (it's clearly still an advantage as there's still a bit of overdraw plus we have faster depth testing than anybody else, but it's not the main factor). SGX is perfectly capable of beating everyone else even in that kind of scenario.

The main reasons are that we're very fast in high geometry workloads (how ironic given some of the marketing fud against us) and that we're very fast in all of the vertex and pixel shaders in terms of cycle count and real-world performance.
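For anyone unfamiliar with the front-to-back sorting mentioned above, here is a minimal sketch of the general idea, assuming opaque draw calls only. It is not Kishonti's code; the `Draw` record, the object names and the positions are hypothetical. Submitting the nearest objects first lets the depth test reject hidden pixels early on any architecture, which is why it narrows the gap that a TBDR would otherwise enjoy from overdraw removal.

```python
from dataclasses import dataclass
import math


@dataclass
class Draw:
    """Hypothetical opaque draw call with a world-space centre point."""
    name: str
    center: tuple  # (x, y, z)


def sort_front_to_back(draws, camera_pos):
    """Return the draws ordered nearest-first relative to the camera."""
    return sorted(draws, key=lambda d: math.dist(d.center, camera_pos))


# Illustrative scene: three objects at different depths along the z axis.
draws = [
    Draw("far_wall", (0.0, 0.0, 50.0)),
    Draw("pillar", (0.0, 0.0, 5.0)),
    Draw("statue", (0.0, 0.0, 20.0)),
]
ordered = sort_front_to_back(draws, (0.0, 0.0, 0.0))
print([d.name for d in ordered])
```

Sorting by object centre is only an approximation; large or interpenetrating objects can still produce some overdraw, which fits the "still a bit of overdraw" remark.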
__________________
Focusing on non-graphics projects in 2013 (but I still love triangles)
"[...]; the kind of variation which ensues depending in most cases in a far higher degree on the nature or constitution of the being, than on the nature of the changed conditions."
Old 29-Jul-2012, 14:59   #8
french toast
Senior Member
 
Join Date: Jan 2012
Location: Leicestershire - England
Posts: 1,488
Default

Thanks arun.

So maybe the adreno 225 is a better allround gpu than tegra ulv and mali 400 mp4?...

Just early benchmarks favoured certain scenarios that fixed function shaders could excel at.
Old 29-Jul-2012, 15:10   #9
Lazy8s
Senior Member
 
Join Date: Oct 2002
Posts: 2,833
Default

Some of the Galaxy Nexus scores on the site are from my uploads. And I was impressed in a relative context (how it was at least somewhat more competitive in 2.5 vs 2.1), not an absolute context where the larger GPUs obviously overpower it. I've got an iPad 2, so I know Series 5XT MP can step quite a bit higher in performance!

Yeah, Jelly Bean doesn't change much when it comes to benchmarks. It has done wonders for the smoothness of the UI, as intended, though.
Old 29-Jul-2012, 15:28   #10
french toast
Senior Member
 
Join Date: Jan 2012
Location: Leicestershire - England
Posts: 1,488
Default

I was wondering... what does Jelly Bean do to battery life? I read that some of the optimizations raise the clock frequency when you touch the screen...
Old 29-Jul-2012, 15:33   #11
Arun
Unknown.
 
Join Date: Aug 2002
Location: UK
Posts: 4,882
Default

Quote:
Originally Posted by french toast View Post
So maybe the adreno 225 is a better allround gpu than tegra ulv and mali 400 mp4?...

Just early benchmarks favoured certain scenarios that fixed function shaders could excel at.
I'm not sure I'd say that. It's not fixed-function vs ALUs, it's texture-heavy vs arithmetic-heavy. Most real games on the Android market are more texture-heavy than GLB2.5 so the Adreno 225 wouldn't have as much of an advantage.

But more importantly, the biggest problem with Adreno in my mind is that their performance drops off a cliff much more easily when the developer does something wrong. The early versions of GLBenchmark 2.5 did a few things wrong with their vertex buffers and framebuffer objects, and Adreno's relative performance was the worst of anyone before Kishonti fixed it. This is probably partly down to driver issues, but also to their entire tiling architecture, which is very fragile - I can imagine that's one of the reasons they added an IMR mode to Adreno 320. Even if it's faster for them, that doesn't necessarily mean IMR would be faster for anyone else.
Old 29-Jul-2012, 16:32   #12
french toast
Senior Member
 
Join Date: Jan 2012
Location: Leicestershire - England
Posts: 1,488
Default

Yea still don't know what to make of the 320...it looks like it should have come sooner than now.

It gets trashed by an iPad 3 in gl 2.1 so I would like to see some 2.5 results before passing judgement....also iPad 3 carries some funky quad channel memory bandwidth to play with vs the standard dual channel set up of the mdp platform.

Doesn't that platform support LPDDR3 when it becomes available? That would equal the bandwidth and maybe lower the latency as well.

Certainly apple a6 is going to rip it a new one.
Old 30-Jul-2012, 08:29   #13
mboeller
Member
 
Join Date: Feb 2002
Location: Germany
Posts: 849
Default

Quote:
Originally Posted by french toast View Post
It gets trashed by an iPad 3 in gl 2.1
but isn't that due to the comparatively low fillrate? 732 MTexels/s (according to AnandTech) seems to point to 366 MHz x 2 TMUs or 183 MHz x 4 TMUs.

Therefore I suspect that the Adreno 320 in the dev-tablet does not run at full speed yet.
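The arithmetic behind that guess, as a rough sketch: it assumes the measured figure equals the theoretical peak of one texel per TMU per clock, which is a simplification. The function names are made up for illustration; only the 732 MTexels/s figure and the TMU counts come from the post.

```python
def peak_fillrate_mtexels(clock_mhz, tmus):
    """Theoretical peak texel rate, assuming one texel per TMU per clock."""
    return clock_mhz * tmus


def implied_clock_mhz(measured_mtexels, tmus):
    """Clock that would be needed if the measured rate were the peak rate."""
    return measured_mtexels / tmus


MEASURED = 732  # MTexels/s, the AnandTech figure quoted above

for tmus in (2, 4):
    clock = implied_clock_mhz(MEASURED, tmus)
    print(f"{tmus} TMUs would imply a clock of {clock} MHz")
```

If the benchmark does not actually reach peak fillrate on this GPU, the real clock would be higher than either number.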
Old 30-Jul-2012, 12:23   #14
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 7,818
Default

Quote:
Originally Posted by Arun View Post
That demo was a version for Mobile World Congress, you should ignore the results completely - there have been about a gazillion new builds since then.
That's all I need to read to make my day even further

***edit: http://www.glbenchmark.com/phonedeta...chmark=glpro25

49.0 fps

Isn't the iPad2 result low?

Quote:
It certainly does, although TBDR only plays a small part in it as Kishonti are doing fairly aggressive front-to-back sorting combined with portal culling (it's clearly still an advantage as there's still a bit of overdraw plus we have faster depth testing than anybody else, but it's not the main factor). SGX is perfectly capable of beating everyone else even in that kind of scenario.

The main reasons are that we're very fast in high geometry workloads (how ironic given some of the marketing fud against us) and that we're very fast in all of the vertex and pixel shaders in terms of cycle count and real-world performance.
Old 30-Jul-2012, 13:09   #15
french toast
Senior Member
 
Join Date: Jan 2012
Location: Leicestershire - England
Posts: 1,488
Default

Quote:
Originally Posted by mboeller View Post
but isn't that due to the comparatively low fillrate? 732 MTexels/s (according to AnandTech) seems to point to 366 MHz x 2 TMUs or 183 MHz x 4 TMUs.

Therefore I suspect that the Adreno 320 in the dev-tablet does not run at full speed yet.
I hope so... well, the last MDP actually showed realistic performance for a smartphone, so maybe, as Qualcomm knows this is going into smartphones and not just tablets, they have clocked it for realistic smartphone performance?

Certainly when lpddr3 gets loaded into it that would help the 1080p results on 2.5.
Old 30-Jul-2012, 13:15   #16
french toast
Senior Member
 
Join Date: Jan 2012
Location: Leicestershire - England
Posts: 1,488
Default

Also take into consideration the crap drivers Qualcomm has previously shipped and the likely sensitivity of adreno uarch to such things...
As arun suggested.
Old 30-Jul-2012, 13:19   #17
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 7,818
Default

Quote:
Originally Posted by mboeller View Post
but isn't that due to the comparatively low fillrate? 732 MTexels/s (according to AnandTech) seems to point to 366 MHz x 2 TMUs or 183 MHz x 4 TMUs.

Therefore I suspect that the Adreno 320 in the dev-tablet does not run at full speed yet.
Adreno320 should have 4 TMUs, or more precisely probably 1 TMU/cluster. Since when do GLBenchmark2.1 fillrate results reflect the actual theoretical peak fillrate on IMRs anyway? Yes, the 543MP4@250MHz gives almost 2.0 GTexels/s for 8 TMUs, but that's a TBDR and I've seen that phenomenon in 2.1 fillrate results only on IMG's cores.

http://www.glbenchmark.com/phonedeta...group=lowlevel

The Mali400MP4 in the S3 should be clocked at 440MHz; for 4 TMUs that gives you a theoretical peak of 1760 MTexels/s, yet it delivers almost 790 MTexels/s in the warm-up fill test.

It's likelier that the Adreno320 is clocked at around 400MHz than anything else, and yes, that's with 4 TMUs.
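Put as numbers, a small sketch of measured versus theoretical fillrate using the figures above. The Mali clock and the Adreno320's 400MHz with 4 TMUs are the estimates from this post, not confirmed specs, and the `fill_utilisation` helper is purely illustrative.

```python
def fill_utilisation(measured_mtexels, clock_mhz, tmus):
    """Fraction of the theoretical peak (clock x TMUs) actually measured."""
    return measured_mtexels / (clock_mhz * tmus)


# Mali400MP4 in the S3: ~790 MTexels/s measured, 440 MHz x 4 TMUs assumed.
mali = fill_utilisation(790, 440, 4)

# Adreno320: 732 MTexels/s measured, 400 MHz x 4 TMUs is only a guess.
adreno = fill_utilisation(732, 400, 4)

print(f"Mali400MP4: {mali:.0%} of theoretical peak")
print(f"Adreno320:  {adreno:.0%} of theoretical peak (clock is a guess)")
```

Under those assumptions both IMR-style results land at well under half of the theoretical peak, which is why reading a clock speed straight off the 2.1 fill test looks unsafe.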
Old 31-Jul-2012, 06:21   #18
mboeller
Member
 
Join Date: Feb 2002
Location: Germany
Posts: 849
Default

Quote:
Originally Posted by Ailuros View Post
Adreno320 should have 4 TMUs, or more precisely probably 1 TMU/cluster. Since when do GLBenchmark2.1 fillrate results reflect the actual theoretical peak fillrate on IMRs anyway?
Ah..Ok. I had only compared the Adreno 320 score with the SGX scores.
Old 31-Jul-2012, 16:12   #19
ltcommander.data
Member
 
Join Date: Apr 2010
Posts: 433
Default

http://www.anandtech.com/show/6121/g...25-performance

Anandtech has results up for a variety of Android SoCs.

EDIT:
http://www.anandtech.com/show/6126/g...ndroid-devices

iOS results now up too. Looks like the SGX543MP2 struggles in the triangle tests against the Tegra 3 and Adreno 225.

Last edited by ltcommander.data; 01-Aug-2012 at 02:40.
Old 01-Aug-2012, 04:11   #20
ams
Member
 
Join Date: Jul 2012
Posts: 419
Default

This is all well and good, but IMO the benchmarks used for comparing different mobile GPUs/CPUs are truly lacking compared to the benchmarks used for comparing different desktop/notebook GPUs/CPUs. Imagine if, month after month, the only benchmark used for comparing desktop systems was 3DMark. It would be nice to see a wider variety of real-world benchmarks to compare and contrast mobile GPU/CPU performance.
Old 01-Aug-2012, 06:40   #21
Lazy8s
Senior Member
 
Join Date: Oct 2002
Posts: 2,833
Default

The marginal struggle in triangle performance of the A5 versus the newer competition appears to be a condition of the higher image resolutions; I'm guessing it's a consequence of USC balancing.
Old 01-Aug-2012, 09:02   #22
mboeller
Member
 
Join Date: Feb 2002
Location: Germany
Posts: 849
Default

Quote:
Originally Posted by ltcommander.data View Post
http://www.anandtech.com/show/6121/g...25-performance

Anandtech has results up for a variety of Android SoCs.

EDIT:
http://www.anandtech.com/show/6126/g...ndroid-devices

iOS results now up too. Looks like the SGX543MP2 struggles in the triangle tests against the Tegra 3 and Adreno 225.
So they compared Tegra3 to Tegra3....wow
I didn't even bother to look at the second page.

The comparison of the A5 with the A5X (second link) was more interesting. The A5X is nearly always roughly twice as fast as the A5. Therefore PowerVR seems to have a really scalable Multi-GPU implementation. Nice!
Old 01-Aug-2012, 09:37   #23
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 7,818
Default

Quote:
Originally Posted by Lazy8s View Post
The marginal struggle in triangle performance of the A5 versus the newer competition appears to be a condition of the higher image resolutions; I'm guessing it's a consequence of USC balancing.
Could be one of the factors; keep in mind that the ULP GF in T30 clocks at 520MHz, which also goes for its Vec4 VS unit in a specific singled out synthetic case scenario.

The whole 2.5 affair (also counting Arun's comments above) smells like quite an ALU-intensive benchmark, and that's probably the reason why the 543MP2 ranks so close to the Adreno225. Essentially both have, oversimplified, 8 Vec4 ALUs; the iPad2's MP2 comes out at a good advantage considering that it's clocked at 250MHz, which is significantly lower than what the 225 is clocked at, and of course also lower than the Mali400MP4 in the 32nm Exynos.

Execution is still a question mark, but Intel's and TI's 544MP2s clocked at ~532MHz will fare a tad better than the MP4 in iPad3. Even worse, I don't want to imagine what kind of performance a "simple" Rogue GC6200 could deliver in that one.

I am now very curious about 2 solutions that haven't appeared in the results for 2.5 yet: Adreno320 and SGX544. My gut feeling is that the former will more or less break even with the iPad3 and the latter will probably break even with T30.

Quote:
Originally Posted by mboeller View Post
The comparison of the A5 with the A5X (second link) was more interesting. The A5X is nearly always roughly twice as fast as the A5. Therefore PowerVR seems to have a really scalable Multi-GPU implementation. Nice!
Performance also scales as expected according to clockspeed between iPad2 (GPU@250MHz) and iPhone4S (GPU@200MHz).
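As a back-of-the-envelope check, here is a sketch of the scaling you would expect if throughput simply followed clock x core count. That is an idealised assumption that ignores bandwidth and CPU limits, and the `expected_speedup` helper is made up for illustration. The clocks and core counts are the ones mentioned in this thread.

```python
def expected_speedup(clock_a_mhz, cores_a, clock_b_mhz, cores_b):
    """Ideal speedup of config A over config B, assuming perfect scaling."""
    return (clock_a_mhz * cores_a) / (clock_b_mhz * cores_b)


# iPad2 (543MP2 @ 250 MHz) vs iPhone4S (543MP2 @ 200 MHz)
ipad2_vs_iphone4s = expected_speedup(250, 2, 200, 2)

# iPad3 (543MP4 @ 250 MHz) vs iPad2 (543MP2 @ 250 MHz)
ipad3_vs_ipad2 = expected_speedup(250, 4, 250, 2)

print(f"iPad2 vs iPhone4S: {ipad2_vs_iphone4s:.2f}x expected")
print(f"iPad3 vs iPad2:    {ipad3_vs_ipad2:.2f}x expected")
```

Dividing a measured fps ratio by these ideal values would give a rough scaling efficiency for each pair of devices.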
Old 05-Aug-2012, 12:16   #24
Ryan Smith
Junior Member
 
Join Date: Mar 2010
Posts: 73
Default

Quote:
Originally Posted by mboeller View Post
Therefore PowerVR seems to have a really scalable Multi-GPU implementation. Nice!
PowerVR's scalable core count is as much multi-GPU as a Radeon HD 7970 is a 7770 in tri-CF. Which is to say it's 1 GPU with multiple functional units, a far different beast than discrete GPUs.
Old 05-Aug-2012, 13:49   #25
Arun
Unknown.
 
Join Date: Aug 2002
Location: UK
Posts: 4,882
Default

Quote:
Originally Posted by Ryan Smith View Post
PowerVR's scalable core count is as much multi-GPU as a Radeon HD 7970 is a 7770 in tri-CF. Which is to say it's 1 GPU with multiple functional units, a far different beast than discrete GPUs.
That's not completely true - there are several fundamental differences, including separate triangle setup/rasterisation units (only introduced in Fermi - you could make an argument that the GTX480 had 4 cores) but also all the complexity inherent in multiple cores having their own binning unit which can write triangles to memory, even though these triangles must ultimately still be rasterised/rendered in order.

If you wanted to make a multi-GPU comparison, it's probably closest to SFR but without having to process the geometry multiple times and each GPU being able to do as much geometry processing as it wants. This is obviously reliant on sharing the same memory so it's impossible with multiple chips today, but in the future who knows with technologies like TSV etc...
