OMAP4 & SGX540

Either way you turn it, since it's a marketing stunt it doesn't have to reflect average realistic performance. I'm not going to argue about the silliness of those types of marketing claims, but I recall NV specifically claiming that the ULP GF in T3 is 3x faster than the ULP GF in T2. Let me pull the same nasty marketing trick here and show how it can backfire if you claim bullshit:

ULP GF T2 @ 333MHz:
1 Vec4 PS = 8 FLOPs/clock * 0.333GHz = 2.664 GFLOPS
ULP GF T3 @ 520MHz:
2 Vec4 PS = 16 FLOPs/clock * 0.52GHz = 8.32 GFLOPS
------------------------------------------------------
8.32 / 2.664 = 3.12x difference, what a coincidence :rolleyes::LOL:
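
For what it's worth, here's a minimal sketch of that peak-FLOPs accounting, assuming the usual convention that a Vec4 MAD unit counts as 4 lanes * 2 FLOPs (MUL + ADD) = 8 FLOPs per clock; the unit counts and clocks are the ones quoted above:

```python
# Peak-FLOPs accounting for the ULP GeForce in Tegra 2 vs. Tegra 3, assuming
# a Vec4 MAD unit counts as 4 lanes * 2 FLOPs (MUL + ADD) = 8 FLOPs/clock.
def peak_gflops(vec4_ps_units: int, clock_ghz: float, flops_per_vec4: int = 8) -> float:
    return vec4_ps_units * flops_per_vec4 * clock_ghz

t2 = peak_gflops(1, 0.333)  # ULP GF in Tegra 2 @ 333 MHz -> 2.664 GFLOPS
t3 = peak_gflops(2, 0.520)  # ULP GF in Tegra 3 @ 520 MHz -> 8.32 GFLOPS
print(f"T2: {t2:.3f} GFLOPS, T3: {t3:.2f} GFLOPS, ratio: {t3 / t2:.2f}x")
# -> T2: 2.664 GFLOPS, T3: 8.32 GFLOPS, ratio: 3.12x
```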



Again, if you poke the Amazon CEO he'll take another 180-degree turn and tell you, sorry, I meant the highest-end variant only. I think we should all be familiar with marketing crap these days.

-------------------------------------------------------------------------------------------------------------------
Arun,

Help me out here: don't the ULP GF PS ALUs contain a programmable blending unit that, when no blending is in use, could count for another theoretical FLOP?

In other words, if the story were about 4+1 ALUs, you'd probably have to count the +1 towards theoretical peak arithmetic throughput on Tegras, Adrenos and possibly others too.
Programmable blending != more flops.
 
Programmable blending != more flops.

Then I'm misinterpreting the following:

http://www.nvidia.com/content/PDF/t...ing_High-End_Graphics_to_Handheld_Devices.pdf

Integrated Pixel Shader and Programmable Blend Unit
The OpenGL ES 2.0 logical GPU pipeline defines a separate stage for pixel blending that is performed after pixel shading. The fixed function blend unit defined by the logical pipeline supports only a limited set of blend operations. The GeForce pipeline integrates the blend unit with the pixel shader, making it a fully programmable blender. Due to the integrated design, the pixel shader can harness the processing power of the blender when there are no blend operations in progress. In addition, the programmable blender allows the GeForce GPU to implement blend modes that are not defined by the OpenGL spec. For example, Adobe's Flash Player uses several blend modes that are not supported by OpenGL, but the GeForce GPU is able to handle these blend modes due to its programmable blender.
 
NVIDIA does rate 3D performance of Tegra 3 relative to Tegra 2 as "Up to 3x". And while 3x performance improvement vs. Tegra 2 is not typical, there are some examples to back that up. GLBenchmark 2.5 has some tests that show ~ 2.6-3.2x performance improvement in fps (http://images.anandtech.com/graphs/graph6121/48839.png). Lost Planet 2, Da Vinci, and Glowball show ~ 2.1-2.7x performance improvement in fps (shown in one of the Tegra whitepapers).

No idea about Lost Planet 2 and Da Vinci, but Glowball is too CPU-bound for my taste to stand as a worthwhile case example, especially since performance drops radically if you switch off two of the four A9 CPU cores.

As for Egypt Classic offscreen in GLBenchmark 2.5: if I take the fastest available device with a Tegra 3 SoC vs. the fastest available Tegra 2 SoC, it looks more like this:

http://www.glbenchmark.com/compare.jsp?benchmark=glpro25&showhide=true&certified_only=1&D1=LG%20LU6500&D2=Asus%20Transformer%20Pad%20TF700T%20Infinity

4312 frames / 1779 frames = a 2.4x difference.

That's for Egypt Classic 2.1 offscreen, by the way; Egypt 2.5 offscreen shows a 1.85x difference according to the above, due to 2.5 taxing geometry quite a bit more than 2.1.

That would be the correct 2.5 Egypt offscreen slide, with the Galaxy Tab 10.1 not being the fastest T2 device by far:

http://images.anandtech.com/graphs/graph6126/48888.png

Either way you twist it: since neither vertex processing nor texel or Z fillrates are remotely close to a factor of 3 between T3 and T2, and only pixel shading is, you need to find corner cases where the application is mostly pixel-shading bound in order to find a 3x GPU performance increase.
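
To make that bottleneck argument concrete, here's a toy Amdahl-style sketch: the frame time is split into a pixel-shading-bound part (scaling by the theoretical 3.12x) and a part bound by units that only gained clock (1.56x). The split fractions are made up purely for illustration:

```python
# Toy bottleneck model: overall speedup when only a fraction of frame time is
# pixel-shading bound (~3.12x faster on T3) while the rest is limited by units
# that only gained clock (~1.56x). The fractions are illustrative, not measured.
def overall_speedup(ps_fraction: float, ps_gain: float = 3.12, other_gain: float = 1.56) -> float:
    return 1.0 / (ps_fraction / ps_gain + (1.0 - ps_fraction) / other_gain)

for f in (0.25, 0.50, 0.75, 1.00):
    print(f"{f:.0%} PS-bound -> {overall_speedup(f):.2f}x")
# -> 25% PS-bound -> 1.78x, 50% -> 2.08x, 75% -> 2.50x, 100% -> 3.12x
```

Only a workload that is almost entirely pixel-shading bound gets anywhere near the 3x figure; the measured 1.85-2.4x results above sit comfortably in the middle of that range.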

NV wasn't lying when they claimed up to 3x the GPU performance, but it's true only in selected, convenient cases.
 
NV wasn't lying when they claimed up to 3x the GPU performance, but it's true only in selected, convenient cases.

Their claim of up to 3x performance improvement vs. Tegra 2 is based on GLBenchmark 2.0 (as noted on their website). Of course, as I mentioned earlier, this is not typical, but not totally out of the question either in a few instances (as Glowball sees 2.7x performance improvement, Da Vinci sees 2.3x improvement, and GLBenchmark 2.5 sees 2.6x improvement). Regardless, this is still different than grossly misrepresenting a competitor's product and architectural design, which is exactly what Amazon did.
 
Again, the improvement in the GLBenchmark 2.5 score between T3 and T2 is at 1.85x at the moment and not 2.5x or 2.6x (see above).

It still doesn't change the fact that the only theoretical increase that reaches/exceeds a factor of 3.0x between the ULP GF in T3 and T2 is in the PS ALUs.

At 520MHz vs. 333MHz respectively, the increase in vertex throughput between the two is 1.56x (which is actually quite a bit closer to the increase in GLBenchmark 2.5, for good reason), and the texel fillrate increase is the very same 1.56x, amongst others.
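
As a sanity check, here are those scaling factors worked out, assuming vertex throughput and texel/Z fillrate scale with clock only, while peak pixel-shader throughput scales with both unit count and clock (the numbers are the ones quoted above):

```python
# Theoretical T3/T2 scaling factors for the ULP GeForce, assuming fixed-function
# rates (vertex, texel, Z) scale with clock alone, while peak PS throughput
# scales with the number of Vec4 PS ALUs times the clock.
t2_clock, t3_clock = 0.333, 0.520  # GHz
t2_ps_units, t3_ps_units = 1, 2    # Vec4 pixel-shader ALUs

clock_ratio = t3_clock / t2_clock                      # vertex/texel/Z: ~1.56x
ps_ratio = (t3_ps_units / t2_ps_units) * clock_ratio   # pixel shading: ~3.12x

print(f"clock-bound units: {clock_ratio:.2f}x, pixel shading: {ps_ratio:.2f}x")
# The measured 1.85x (Egypt 2.5) and 2.4x (Egypt Classic) results land between
# these two theoretical bounds, as you'd expect for mixed workloads.
```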

So while we're arguing marketing semantics here, I'm pretty sure that the "up to 3x GPU performance increase" for T3 comes from hidden borgo sphinxter units and definitely not the PS ALUs.
 
You are looking at the wrong data set. I specifically mentioned GLBenchmark 2.5 - Egypt Classic (Offscreen 1080p), not GLBenchmark 2.5 - Egypt HD (Offscreen 1080p). Looking at some reasonably well regarded commercial tablets, the difference in performance between Tegra 2 and Tegra 3 is indeed 2.6-3.2x with GLBenchmark 2.5 - Egypt Classic (Offscreen 1080p): http://images.anandtech.com/graphs/graph6121/48839.png . Looking at these same reasonably well regarded commercial tablets, the difference in performance between Tegra 2 and Tegra 3 with GLBenchmark 2.5 - Egypt HD (Offscreen 1080p) is 2.0-2.6x: http://images.anandtech.com/graphs/graph6121/48837.png . The Egypt HD results are actually less useful because the framerates are so hopelessly low on this hardware (between 4-11 fps). Of course, the maximum performance of both Tegra 2 and Tegra 3 is subject to change at any time with SoC hardware and/or software improvements, but that doesn't simply invalidate all other previous comparisons.

Anyway, Tegra 2 and Tegra 3 are two products from the same company based on similar non-unified architectures, and their performance delta was listed as "Up to 3x" based specifically on GLBenchmark 2.0 as noted by the company. To attempt to equate that with a completely different scenario involving one company grossly misrepresenting a competitor's architecture is crazy.
 
You are looking at the wrong data set.

Am I? Let's see.

I specifically mentioned GLBenchmark 2.5 - Egypt Classic (Offscreen 1080p), not GLBenchmark 2.5 - Egypt HD (Offscreen 1080p). Looking at some reasonably well regarded commercial tablets, the difference in performance between Tegra 2 and Tegra 3 is indeed 2.6-3.2x with GLBenchmark 2.5 - Egypt Classic (Offscreen 1080p): http://images.anandtech.com/graphs/graph6121/48839.png . Looking at these same reasonably well regarded commercial tablets, the difference in performance between Tegra 2 and Tegra 3 with GLBenchmark 2.5 - Egypt HD (Offscreen 1080p) is 2.0-2.6x: http://images.anandtech.com/graphs/graph6121/48837.png .

Have a 2nd more careful look at the results in this link again:

http://www.glbenchmark.com/compare....U6500&D2=Asus Transformer Pad TF700T Infinity

...and let me know what the Kishonti database states exactly next to "Egypt Classic". It might be part of the GLBenchmark 2.5 suite, yet Egypt Classic is named by Kishonti itself "GLBenchmark 2.1 Egypt Classic - Offscreen (1080p)".

And no, I refuse to take the fastest implementation of one SoC and compare it against another that doesn't represent its peak performance. There are piles of differences in OEM/vendor-specific implementations, SoC bandwidth, underlying software and whatnot which account for the differences between one SoC's implementation and another's. The Infinity TF700T has additional bandwidth compared to other T3 solutions, and that's why it pulls somewhat ahead.

The Egypt HD results are actually less useful because the framerates are so hopelessly low on this hardware (between 4-11 fps). Of course, the maximum performance of both Tegra 2 and Tegra 3 is subject to change at any time with SoC hardware and/or software improvements, but that doesn't simply invalidate all other previous comparisons.

Synthetic benchmarks like GLBenchmark aim, IMHO, to somewhat "predict" or, if you prefer, estimate GPU performance in future 3D applications. Sometimes such synthetics manage to paint as accurate a picture as possible, and sometimes they fail. It remains open whether, and by how much, Kishonti has managed an accurate prediction with this one. Frames per second aren't relevant in such a case, since you don't play a synthetic benchmark; it's an attempt to artificially read out each piece of hardware's capabilities, and it might expose weaknesses.

2.5 must carry a much higher geometry load compared to 2.1, otherwise the results are hard to explain. I don't expect geometry to remain idle in future mobile games, nor do I expect it to shrink, and I don't expect any mobile ISV to create a game today that would run on a Tegra 3 or equivalent SoC at single-digit average framerates either.

Anyway, Tegra 2 and Tegra 3 are two products from the same company based on similar non-unified architectures, and their performance delta was listed as "Up to 3x" based specifically on GLBenchmark 2.0 as noted by the company. To attempt to equate that with a completely different scenario involving one company grossly misrepresenting a competitor's architecture is crazy.

It still doesn't change the fact that whatever there is of the up-to-3x comes from the added second PS Vec4 ALU in T3 and the higher GPU frequency, which also get reflected up to a point in a synthetic benchmark that is deemed useful in its iteration A and questionable in iteration B, whenever it's convenient.

I never said or implied that Amazon's stuff wasn't complete bullshit. But that's marketing for you, and in that department NVIDIA is hardly a company without equal or worse marketing stunts in its own track record. And no, I can't think of any other IHV or company that's innocent or immune to that, since the actual task of marketing is to exaggerate or paint the most optimistic picture of whatever is being marketed. Some even go as far as to call it a lie.
 
As I understood it, they added a few specialized instructions for blending that can be issued by the shader if it's not already doing blending.

Hence my original question; it's not uncommon in a lot of mobile GPU architectures for the SFUs in the ALUs to get used, under certain conditions, for a single extra FLOP. It's obviously just a corner case which isn't usable in the majority of cases.

But since Arun mentioned the 9th FLOP for Series5XT cores, I thought I'd ask whether the ULP GFs in Tegras can, under certain conditions, add another FLOP too, at least in theory.
 
Hence my original question; it's not uncommon in a lot of mobile GPU architectures for the SFUs in the ALUs to get used, under certain conditions, for a single extra FLOP. It's obviously just a corner case which isn't usable in the majority of cases.

But since Arun mentioned the 9th FLOP for Series5XT cores, I thought I'd ask whether the ULP GFs in Tegras can, under certain conditions, add another FLOP too, at least in theory.

I don't know about Tegra's internals but if I had to guess, I'd say no.
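
Purely to illustrate what such a conditional "+1" would mean on paper (the Vec4 baseline matches the accounting earlier in the thread; whether Tegra's ULP GF can actually issue such an extra op is exactly the open question above):

```python
# Illustration only: how counting a hypothetical "+1" FLOP per Vec4 ALU would
# change the theoretical peak. The 8-FLOP Vec4 MAD baseline matches the earlier
# accounting; the 9-FLOP variant mirrors the Series5XT "9th FLOP" counting.
def peak_gflops(units: int, clock_ghz: float, flops_per_unit: int) -> float:
    return units * flops_per_unit * clock_ghz

base = peak_gflops(2, 0.52, 8)      # 2 Vec4 PS ALUs @ 520 MHz -> 8.32 GFLOPS
plus_one = peak_gflops(2, 0.52, 9)  # hypothetical 8+1 counting -> 9.36 GFLOPS
print(f"{base:.2f} vs. {plus_one:.2f} GFLOPS (+{plus_one / base - 1:.1%})")
# A +12.5% paper-spec bump, which is why marketing likes counting corner cases.
```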
 
By the way, on the Kindle Fire HD 8.9" product page, Amazon claims that Omap 4470 has 40% more memory bandwidth than Tegra 3 (on the basis that the former has 7.5 GB/sec memory bandwidth, and the latter has 5.3 GB/sec memory bandwidth). This is yet another false and deceptive statement from Amazon. The Google Nexus 7 has 5.3 GB/sec memory bandwidth, but Tegra 3 as an SoC is not limited to that. In fact, the Asus Transformer Pad Infinity has 6.4 GB/sec memory bandwidth. So in that scenario, the difference in memory bandwidth would be 17% and not 40%! Amazing how so-called facts can be twisted so easily.
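
For reference, the percentages fall straight out of the bandwidth figures quoted above (a quick sketch; the GB/sec values are the ones cited in this post):

```python
# Memory bandwidth deltas from the figures quoted above (GB/sec).
omap4470 = 7.5           # OMAP 4470, as claimed by Amazon
tegra3_nexus7 = 5.3      # Tegra 3 as configured in the Google Nexus 7
tegra3_infinity = 6.4    # Tegra 3 in the Asus Transformer Pad Infinity

print(f"vs. Nexus 7:  +{omap4470 / tegra3_nexus7 - 1:.0%}")    # ~+42%, the "40%" claim
print(f"vs. Infinity: +{omap4470 / tegra3_infinity - 1:.0%}")  # ~+17%
```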
 
By the way, on the Kindle Fire HD 8.9" product page, Amazon claims that Omap 4470 has 40% more memory bandwidth than Tegra 3 (on the basis that the former has 7.5 GB/sec memory bandwidth, and the latter has 5.3 GB/sec memory bandwidth). This is yet another false and deceptive statement from Amazon. The Google Nexus 7 has 5.3 GB/sec memory bandwidth, but Tegra 3 as an SoC is not limited to that. In fact, the Asus Transformer Pad Infinity has 6.4 GB/sec memory bandwidth. So in that scenario, the difference in memory bandwidth would be 17% and not 40%! Amazing how so-called facts can be twisted so easily.
So they're not lying; they're just selectively telling the truth. Realistically, as far as marketing goes, that's pretty good restraint. Now, if there were a high-end Tegra 3 configuration that had more bandwidth than the OMAP 4470, I could see more customers being offended that they aren't being told. But since the difference is between Tegra 3 losing and Tegra 3 losing less in terms of memory bandwidth, I doubt even NVIDIA will try to point out the complete facts to customers.
 
By the way, on the Kindle Fire HD 8.9" product page, Amazon claims that Omap 4470 has 40% more memory bandwidth than Tegra 3 (on the basis that the former has 7.5 GB/sec memory bandwidth, and the latter has 5.3 GB/sec memory bandwidth). This is yet another false and deceptive statement from Amazon. The Google Nexus 7 has 5.3 GB/sec memory bandwidth, but Tegra 3 as an SoC is not limited to that. In fact, the Asus Transformer Pad Infinity has 6.4 GB/sec memory bandwidth. So in that scenario, the difference in memory bandwidth would be 17% and not 40%! Amazing how so-called facts can be twisted so easily.
I wonder why Bezos even bothers. The Kindle buying audience is even less technically oriented than that of Apple and Android. And almost nobody follows the keynote.

The only thing that matters is whether the UI is smooth. The few previews that I've seen were not too positive on that, which is not surprising given its ancestry.
 
The only thing that matters is whether the UI is smooth. The few previews that I've seen were not too positive on that, which is not surprising given its ancestry.
Does OMAP have a reputation for being responsible for jittery UI performance on Android, or do you mean Android itself?
 
Does OMAP have a reputation for being responsible for jittery UI performance on Android, or do you mean Android itself?


I think he meant the video previews made with the new Kindle Fires.
 
I wonder why Bezos even bothers. The Kindle buying audience is even less technically oriented than that of Apple and Android. And almost nobody follows the keynote.

I think making comments about the technical orientation of buyers is pretty ignorant, in all honesty. I know many people who have bought iPhones and iPads that are certainly more technically knowledgeable than 99.9999% of Android buyers. Given the overall market demographics and usage patterns, it is probably fair to say that the average iOS buyer is more technically oriented than the average Android buyer.

And it is also probably fair to say that the average Kindle Fire buyer is more likely to already be a heavy Amazon customer and already have Amazon Prime. Beyond that, their technical knowledge is fairly unknown. And the Kindle Fire HD is very likely to be the best-selling "Android-based" tablet, just like the Kindle Fire before it.

As far as UI snappiness goes, no Android device is great in that department, with even 4.1 well behind iOS in that regard.
 
Yes.
Maybe Amazon has some time to copy over some JB features before launch, but I'm not holding my breath.

Having one sitting in front of me, it doesn't appear that they have incorporated the 4.1 improvements to Android's horrible UI smoothness. Even 4.1 needs a lot more work on the smoothness front, still being behind the UI responsiveness of the original iPhone.
 