Tegra X1 can use half precision, but as far as we know Maxwell can't use async compute in any productive way. sebbbi said around 70% of the pixel shaders in his games could be done in FP16. How much real-world performance does that translate into? How does it compare to Mark Cerny's claim of a 50% GPU performance boost from using async compute on GCN GPUs?
Besides, there are limits to using FP16 on Maxwell 2.5. You can't assume every FP16 operation will simply run 2x faster. It probably won't, because both FP16 calculations in each ALU need to be doing the exact same operation, and you don't know how effective the scheduler will be at distributing the work that way.
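To put a rough number on that, here's a back-of-the-envelope sketch. It's an Amdahl's-law style estimate, not a measurement: it assumes the shaders are purely ALU-bound, it treats the "70% of pixel shaders" figure as if it were 70% of ALU time (which it isn't necessarily), and `pack_efficiency` is a made-up parameter for how much of the FP16 work actually gets paired into two-wide operations.

```python
def fp16_speedup_bound(fp16_fraction, pack_efficiency=1.0):
    """Amdahl-style upper bound on speedup from double-rate FP16.

    fp16_fraction:   share of ALU time that could run in FP16 (assumption).
    pack_efficiency: share of that FP16 work that really gets paired into
                     two-wide operations (hypothetical knob, 0..1).
    """
    if not 0.0 <= fp16_fraction <= 1.0:
        raise ValueError("fp16_fraction must be between 0 and 1")
    if not 0.0 <= pack_efficiency <= 1.0:
        raise ValueError("pack_efficiency must be between 0 and 1")

    paired = fp16_fraction * pack_efficiency   # runs at 2x rate
    unpaired = fp16_fraction - paired          # FP16, but no gain
    fp32 = 1.0 - fp16_fraction                 # untouched FP32 work

    new_time = fp32 + unpaired + paired / 2.0  # relative to old time of 1.0
    return 1.0 / new_time


if __name__ == "__main__":
    # Perfect pairing of 70% FP16 work: about 1.54x, not 2x.
    print(f"{fp16_speedup_bound(0.7):.2f}")
    # Only half of the FP16 work gets paired: about 1.21x.
    print(f"{fp16_speedup_bound(0.7, 0.5):.2f}")
```

So even with everything going right, 70% FP16 coverage tops out at roughly 1.5x under these assumptions, and anything bandwidth- or texture-bound gains nothing from it.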
Do you know how the two features compare in practical performance? I don't. Probably few people do.
And async has been in use for over 3 years by all major developers for the PS4Bone. Who's been using FP16?
As great as half precision is, you can't just pretend handheld mode gets a boost to 400 GFLOPS and that those are equivalent to PS4Bone GFLOPS.