Very good catch! From \kernel\drivers\media\video\samsung\mali\platform\pegasus-m400\mali_platform_dvfs.c
So the 4412 is running at 440MHz at the very least, assuming they haven't upped it even more since the source drop, which would certainly explain the benchmarks.
I'm pretty sure they must have improved their ALU precision to FP24 and their depth buffer support from 16-bit to 24-bit. They don't expose 24-bit depth in OpenGL ES, though, probably because it would carry a noticeable performance hit (a full +50% depth bandwidth, since they don't support framebuffer compression AFAICT). Every time I do performance analysis on Tegra at work my eyes bleed at all the depth fighting artifacts... (it's not as bad in games/benchmarks that set their zmin/zmax intelligently, but it's still fairly bad).
Tegra 2's pixel shaders are actually FP20 (bottom of page 7 in the above white paper). No idea about Tegra 3, but seeing as it is mainly an expansion of Tegra 2 rather than a redesign, it's probably still at FP20. Which is why I've been curious how they meet the DX9 compliance necessary for the Windows 8 support they've been demoing.
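To put rough numbers on the depth-fighting and bandwidth points above, here is a small back-of-the-envelope sketch. It assumes a standard perspective projection and a fixed-point depth buffer, which is my assumption and not anything confirmed about the hardware; the function names and the near/far values are purely illustrative.

```python
def depth_step(z, near, far, bits):
    """Approximate eye-space distance covered by one depth-buffer step at distance z.

    Assumes window depth d(z) = far * (z - near) / (z * (far - near)),
    i.e. a standard perspective projection stored as fixed point.
    """
    levels = 2 ** bits - 1
    # dd/dz = far * near / (z^2 * (far - near)); one step in d is 1 / levels,
    # so the matching step in z is (1 / levels) / (dd/dz).
    return (z * z * (far - near)) / (far * near * levels)


def extra_depth_bandwidth(old_bits, new_bits):
    """Relative increase in uncompressed depth traffic when widening the format."""
    return (new_bits - old_bits) / old_bits


if __name__ == "__main__":
    # Hypothetical scene: object 100 units away, far plane at 1000.
    for near in (0.1, 1.0):
        for bits in (16, 24):
            step = depth_step(100.0, near, 1000.0, bits)
            print(f"near={near} bits={bits}: one depth step ~ {step:.5f} units")
    print(f"16 -> 24 bit: +{extra_depth_bandwidth(16, 24):.0%} depth bandwidth")
```

Under these assumptions the resolvable depth step shrinks by roughly 256x going from 16 to 24 bits, and pushing the near plane out by 10x buys about 10x on its own, which is why apps that pick their zmin/zmax carefully look so much better on 16-bit depth.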
Obviously these are all trade-offs and I understand some of the reasons why they made them. But I think at a basic level NVIDIA designed the original Tegra GPU in an era when they thought handheld GPU performance wouldn't increase anywhere near as fast as it has. More importantly, they expected to be far more constrained on die area than has turned out to be the case, which led to things like no framebuffer compression. It will be interesting to see how aggressive they are with handheld Kepler (and how similar it is to PC Kepler) once that comes to market, although it remains to be seen when that actually is and what the competition will be at that point...