Last OT post wrt Mali and FLOPS: the best response comes from ARM itself, and real-world evidence seems to corroborate their claims:
http://community.arm.com/groups/arm...ops--how-arm-measures-gpu-compute-performance
As useful as any marketing blog out there. The T760MP8 at a turbo frequency of 772MHz in the Exynos 7420 (the biggest Midgard integrated to date that I'm aware of) comes to 198 GFLOPs FP32 as a theoretical peak and a 6.18 GTexels/s fillrate (1 TMU/cluster); what is left of that in real time is another chapter, but it doesn't change one bit my quote above from several months ago.
Up to T7xx, Midgard cores are capable of 32 FLOPs/clock FP32, or 34/clock if you also want to count SFU ops. The part in the above blog writeup where they mention that they don't "count" SFU FLOPs is probably the joke of the day. I recall them originally counting those into the peak FLOP count with their T604.
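For anyone who wants to check the arithmetic, here is a quick sketch of how those peak numbers fall out. It assumes that MP8 means 8 clusters and that the 32 (or 34 with SFU) FLOPs/clock figure applies per cluster; the function names are made up for illustration and this is only the theoretical peak, not a measurement.

```python
# Back-of-the-envelope peak throughput for a Mali T760MP8 at 772 MHz.
# Assumptions (not from any official tool): MP8 = 8 clusters, and the
# per-clock FLOP figure is per cluster. Function names are hypothetical.

def peak_gflops(clusters, clock_mhz, flops_per_clock):
    """Theoretical peak in GFLOPs: clusters * FLOPs/clock * clock."""
    return clusters * flops_per_clock * clock_mhz / 1000.0

def peak_gtexels(clusters, clock_mhz, tmus_per_cluster=1):
    """Theoretical texel fillrate in GTexels/s: one texel per TMU per clock."""
    return clusters * tmus_per_cluster * clock_mhz / 1000.0

CLUSTERS = 8      # T760MP8
CLOCK_MHZ = 772   # turbo frequency in the Exynos 7420

fp32 = peak_gflops(CLUSTERS, CLOCK_MHZ, 32)      # without SFU ops
fp32_sfu = peak_gflops(CLUSTERS, CLOCK_MHZ, 34)  # counting SFU ops too
fill = peak_gtexels(CLUSTERS, CLOCK_MHZ)

print(f"FP32 peak:          {fp32:.1f} GFLOPs")
print(f"FP32 peak incl SFU: {fp32_sfu:.1f} GFLOPs")
print(f"Texel fillrate:     {fill:.2f} GTexels/s")
```

Counting the SFU ops or not moves the peak by about 6%, so it matters little which convention you pick as long as you use the same one on both sides of a comparison.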