Performance of a particular shader is often limited by a single bottleneck (or a combination of two). The most common bottlenecks are ALU, texture filtering, memory latency, memory bandwidth, fill rate and the geometry front end. Double-rate FP16 only helps if the shader's main bottleneck is ALU. FP16 registers also help a bit with memory latency, since 16-bit registers use 50% less register file storage than 32-bit registers -> the GPU gets better occupancy -> more threads can be kept ready to run -> better latency hiding.
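To put rough numbers on the register/occupancy chain above, here is a minimal sketch. The register file size, wave width, wave limit and the count of 96 live values per thread are all made-up illustrative figures, not the specs of any particular GPU, and the function name is hypothetical.

```python
# All three constants are assumptions chosen for illustration only.
REGISTER_FILE_BYTES = 64 * 1024  # assumed register file size per SIMD
THREADS_PER_WAVE = 64            # assumed wave width
MAX_WAVES = 10                   # assumed scheduler limit on resident waves


def waves_in_flight(live_values, bytes_per_value):
    """Waves that fit in the register file for a given per-thread footprint."""
    bytes_per_wave = live_values * bytes_per_value * THREADS_PER_WAVE
    return min(MAX_WAVES, REGISTER_FILE_BYTES // bytes_per_wave)


# Hypothetical shader keeping 96 values live per thread.
fp32_waves = waves_in_flight(96, 4)  # every value held in a 32-bit register
fp16_waves = waves_in_flight(96, 2)  # every value held in a 16-bit register
print(fp32_waves, fp16_waves)
```

In a real shader only some of the values can drop to 16 bits, so the gain would land somewhere between these two extremes.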
People pay too much attention to the GPU's peak FLOP rate. FP16 doubles this theoretical number, but it's important to realize that FP16 doesn't double the number of TMUs or ROPs, or the memory bandwidth. When GPU manufacturers scale up a GPU, they scale all of these up together. GPUs with more FLOPs also have more TMUs, more ROPs, more bandwidth and fatter geometry front ends. Marketing departments like to use the FLOP count as a simple number to describe a GPU's performance level, but this creates the illusion that the FLOP count is the only thing that matters. If the other parts didn't scale up equally, the performance advantage would be very limited.
Thus doubling the peak FLOP rate with FP16 doesn't suddenly make a GPU equivalent to another GPU with double the FLOP rate, unless all the other parts of the GPU are also scaled up.
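A toy model makes the difference concrete. It assumes a shader's time is set purely by its slowest unit (perfect overlap between units, which real hardware only approximates), and every throughput and workload number below is invented for illustration.

```python
def shader_time(work, gpu):
    """Time is set by the slowest unit: max over units of work / throughput."""
    return max(work[unit] / gpu[unit] for unit in work)


# Hypothetical throughputs in arbitrary units.
base = {"alu": 10.0, "texture": 10.0, "bandwidth": 10.0}
fp16_only = dict(base, alu=base["alu"] * 2)                  # only the ALU rate doubles
scaled_up = {unit: rate * 2 for unit, rate in base.items()}  # every unit doubles

# Hypothetical workloads: amount of work demanded from each unit.
alu_bound = {"alu": 100.0, "texture": 40.0, "bandwidth": 30.0}
bandwidth_bound = {"alu": 40.0, "texture": 30.0, "bandwidth": 100.0}

for name, work in (("ALU bound", alu_bound), ("bandwidth bound", bandwidth_bound)):
    print(name,
          shader_time(work, base),
          shader_time(work, fp16_only),
          shader_time(work, scaled_up))
```

In this model the FP16-only GPU matches the fully scaled GPU on the ALU-bound shader, but gains nothing over the baseline on the bandwidth-bound one.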
FP16 is a very useful feature for developers, but mixing it up with FLOP-based marketing simply confuses consumers.