Games designed for modern-day consoles and PC are not designed with lower-precision pixel rendering in mind. So console and PC game developers who are looking to bring their games to the ultra-mobile space with high visual fidelity and console quality will surely be making use of the FP32 ALUs for pixel rendering.
OK, first I'll quote Sebbbi from the new Rogue architectural thread:
sebbbi said:
I personally think Rogue is a step in the right direction. With many other mobile architectures you still have to use the painful ~FP10 formats to extract the best performance. I sincerely hope that lowp vanishes soon, and the ES standard would dictate FP16 minimum for mediump.
FP16 is great for pixel shaders. You can avoid vast majority of precision issues by thinking about your numeric ranges and transforms. Doing your pixel shader math in view space or in camera centered world space goes long way to the right direction.
Obviously you need FP24/32 for vertex shaders. However most of the GPU math is done in pixel shaders as high resolutions such as 2048x1536 and 2560x1600 are common in mobile devices. The decision to increase the FP16 ALUs was a correct one.
Increased FP16 ALUs also means that the performance trade off from dropping the lowp ALUs is decreased. This is certainly a good thing. I hope that other mobile GPU developers also think alike and lowp is soon gone.
So there you have it from someone who is actually intimate with this problem area.
I also somehow feel that you don't quite grasp the hows and whys of numerical precision in general. That's OK, by the way; even among professionals in fields that are affected, it is a topic that is often considered tangential (i.e. swept under the rug). It's typically not treated at all below university level, and there the initial courses typically look at worst-case error propagation, truncation vs. rounding, and classical problem areas such as derivatives and matrix inversion. That's useful, but a bit removed from practical problems, which typically involve other sources of error, in the underlying models or elsewhere, that fall outside the absolute numerical measurability of the mathematical discipline of the course, and which determine how significant numerical precision actually is.
Basics: when we are talking pixels on cell phone screens these days, those are 8-bit integer values for red, green and blue respectively. In days gone by you would do your graphics calculations in 8-bit integer color, progressing to 10-bit integer for better precision, to FP10 for less precision but better handling of dynamic range (plus various fixed-point formats, which I'll ignore for simplicity), and on to FP16 and FP32.
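To put some numbers on that, here's a small sketch in Python/NumPy (whose float16 is IEEE binary16, the same format as GPU FP16) showing why FP16 is comfortably finer-grained than the 8-bit output it eventually gets quantized to:

```python
import numpy as np

# Every 8-bit colour value (0..255) survives a round trip through FP16
# unchanged: binary16 represents all integers up to 2048 exactly.
channel = np.arange(256, dtype=np.float32)
assert np.array_equal(channel, channel.astype(np.float16).astype(np.float32))

# With colour normalised to [0, 1], adjacent FP16 values near 1.0 are
# 2^-10 apart -- finer than the 1/255 step of the 8-bit output format.
one = np.float16(1.0)
print(float(one + np.float16(2**-10) - one))  # 0.0009765625
print(1 / 255)                                # ~0.0039
```

So as long as your intermediate values stay in a sane range, FP16 has headroom to spare before the 8-bit framebuffer would even notice.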
FP16, since it is the focus of the discussion, has one sign bit, a 5-bit exponent, and a 10-bit mantissa, which yields 11 bits' worth of significand (through a trick: the implicit leading bit). For most calculations you have precision to spare. Errors accumulate slowly enough that they don't reach significant levels. However, there are problematic operations, and of course algorithms that are good in some ways but dubious in terms of numerical precision. That is where higher-precision formats may be of aid. Or you could work around those specific problems in other ways, depending on your priorities and the hardware capabilities at hand.
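If you want to see that bit layout directly, here's a little decoder (again NumPy; the field extraction itself is just the IEEE binary16 layout):

```python
import numpy as np

# Pull apart the IEEE 754 binary16 layout: 1 sign bit, 5 exponent bits,
# and 10 stored mantissa bits (an implicit leading 1 gives 11 bits of
# effective significand for normal numbers).
def fp16_fields(x):
    bits = np.float16(x).view(np.uint16)   # reinterpret the raw 16 bits
    sign     = (bits >> 15) & 0x1
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    return int(sign), int(exponent), int(mantissa)

print(fp16_fields(1.0))   # (0, 15, 0)   -- the exponent bias is 15
print(fp16_fields(-2.0))  # (1, 16, 0)
print(fp16_fields(1.5))   # (0, 15, 512) -- mantissa 0b1000000000 = .5
```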
At the end of the day, we are talking about gaming graphics on mobile devices with very high pixel densities; we are not talking about life support systems or satellite control, after all. So if you lack previous experience, a viable approach is simply to default to FP16, run your code, and see if everything looks OK. If it does, you're good. If it doesn't, look at what causes the problem, and see whether you can work around it, or whether it is small and contained enough to be fixed by increasing the precision. (If you're using a strongly divergent algorithm, you're asking for problems and might want to change approach.) Given the nature of gaming, and the devices the games run on, some errors are perfectly OK; it's not as if compromises with large visual consequences aren't being made in areas other than numerical precision. With accumulated experience in the field, you'll know where to expect issues (as in sebbbi's example above), and you'll save yourself a bit of work.
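The "run it and see" check can be as cheap as running the same computation at both precisions and comparing. A contrived but instructive sketch (NumPy float16 standing in for shader FP16) of how an accumulation loop can silently stall once values get large:

```python
import numpy as np

# At 2048 the gap between adjacent FP16 values is 2.0, so 2048 + 1.0
# lands exactly halfway and round-to-nearest-even sends it back down.
# The FP16 accumulator never moves; FP32 has no trouble.
acc16 = np.float16(2048.0)
acc32 = np.float32(2048.0)
for _ in range(500):
    acc16 = np.float16(acc16 + np.float16(1.0))
    acc32 = acc32 + np.float32(1.0)

print(float(acc16))  # 2048.0 -- every single +1.0 was lost to rounding
print(float(acc32))  # 2548.0
```

This is exactly the kind of failure that is obvious the moment you look at the output, and that points you either at a local precision bump or at restating the algorithm so values stay in a friendlier range.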
Generally, if you have limited resources, and in these devices you always do, you avoid wasting them unnecessarily. Saving bandwidth, power, computational resources and, to some extent, memory is always a good idea. So using FP16 (or even lower) as the default makes sense. The interesting part is actually in those areas where precision starts being an issue to be reckoned with. Can you find alternative ways to express your algorithm that are less numerically demanding? Are there visually similar algorithms that are less demanding? If the problem arises from a particular operation in the algorithm, can you do type-casting tricks to ensure that you have enough precision right there, and then convert back? Is that profitable in terms of computational intensity, or would it actually be more efficient to brute-force it? And so on. My experience in these matters is from a different field, where brute force typically wins out because of very lax financial accountability, but even there, making poor algorithmic choices will bite you. Throwing significand bits at a numerically poorly expressed algorithm is a band-aid at best.
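A hypothetical illustration of that type-casting trick, and of its limits (the numbers here are made up; NumPy float16/float32 stand in for shader precisions): subtracting two nearby positions far from the origin. If the inputs were already stored in FP16, the damage is done before any cast can help, which is also why sebbbi's advice above is to move the math itself into camera-centered space rather than up-cast late.

```python
import numpy as np

a, b = 1000.25, 1000.0

# Stored in FP16, the inputs are quantised first (the step near 1000 is
# 0.5, and 1000.25 is a tie that rounds down to 1000.0) -- the true
# difference of 0.25 is destroyed before the subtraction even happens.
diff16 = np.float16(a) - np.float16(b)
print(float(diff16))     # 0.0

# Keeping the *inputs* and the sensitive operation in FP32, and only
# converting the result back down, preserves it fine.
diff_cast = np.float16(np.float32(a) - np.float32(b))
print(float(diff_cast))  # 0.25
```

The general pattern: promote precisely the operands and operation that need it, do the one dangerous step, and drop back down, provided the extra precision still exists at that point to be promoted.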
Ouch, too wordy, and probably too vague. I'd better quit. Apologies.
Bottom line is you use the precision you need for the problem at hand. In the face of limited resources, waste does not make sense, and weakens your competitiveness in the marketplace.