From a hardware perspective, I never thought it made much sense that UAV reads/writes should only be possible in the pixel shader. It all executes on the same compute units.
But obviously there must have been a reason for at least one IHV to not allow it. My guess would be Intel.
I can assure you that there are lots of computations in real-time 3D graphics that would be just fine in FP16 without any noticeable quality loss. And this is unlikely to ever change.
This is probably wrong. Even if the ALU throughput is the same, you still have lower register pressure and therefore higher compute-unit occupancy.
But one could also imagine higher FP16 throughput in some future GPUs, even on the desktop. Some circuitry scales non-linearly with the...
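To make the register-pressure point concrete, here is a back-of-the-envelope occupancy calculation. All the numbers (register file size, wave width, wave cap) are illustrative round figures, not any specific GPU:

```python
# Hypothetical compute unit: all numbers are illustrative, not a real GPU.
REGISTER_FILE_WORDS = 65536   # 32-bit registers per compute unit
THREADS_PER_WAVE = 64
MAX_WAVES = 40                # hardware cap on resident waves

def occupancy(regs_per_thread):
    # Waves that fit in the register budget, clamped to the hardware cap.
    waves_by_regs = REGISTER_FILE_WORDS // (regs_per_thread * THREADS_PER_WAVE)
    return min(waves_by_regs, MAX_WAVES)

# A shader needing 64 FP32 registers per thread:
print(occupancy(64))   # 16 resident waves
# If some values fit in FP16 (two packed per 32-bit register),
# the footprint might drop to, say, 48 registers:
print(occupancy(48))   # 21 resident waves
```

More resident waves means more candidates to switch to while one wave waits on memory, which is exactly why halving register footprint can help even at identical ALU throughput.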
Crysis 3 uses a two-channel YCbCr framebuffer: http://www.slideshare.net/TiagoAlexSousa/rendering-technologies-from-crysis-3-gdc-2013
Luminance is written for every pixel; chrominance is interleaved across pixels.
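A minimal CPU-side sketch of that idea (my own simplification for illustration, not Crytek's actual code): store full-resolution luma, interleave Cb and Cr between adjacent pixels, and reconstruct the missing chroma channel from neighbors on unpack:

```python
def rgb_to_ycbcr(r, g, b):
    # BT.601-style full-range conversion.
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b
    cr =  0.500 * r - 0.419 * g - 0.081 * b
    return y, cb, cr

def pack(row_rgb):
    # Two channels per pixel: luma everywhere, chroma interleaved
    # (Cb on even pixels, Cr on odd pixels).
    packed = []
    for x, (r, g, b) in enumerate(row_rgb):
        y, cb, cr = rgb_to_ycbcr(r, g, b)
        packed.append((y, cb if x % 2 == 0 else cr))
    return packed

def unpack(packed):
    # Reconstruct the missing chroma channel from the nearest neighbors
    # that carry it (simple average; a real shader filters more carefully).
    out = []
    n = len(packed)
    for x, (y, c) in enumerate(packed):
        neighbors = [packed[i][1] for i in (x - 1, x + 1) if 0 <= i < n]
        other = sum(neighbors) / len(neighbors)
        cb, cr = (c, other) if x % 2 == 0 else (other, c)
        out.append((y, cb, cr))
    return out

# Constant-color rows reconstruct exactly; edges are where the filtering matters.
print(unpack(pack([(1.0, 0.0, 0.0)] * 4)))
```

The payoff is the same as in the presentation: two stored channels per pixel instead of three, at the cost of a reconstruction filter on read.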
I meant modern GPUs obviously :)
Interesting. I definitely would like to see that. We had that problem in our tile-based lighting compute shader, and FP16 would likely have been more than enough precision for a lot of the calculations there.
Well, for integers it definitely could work (int24 is faster than int32), but I don't think any desktop GPU at the moment has hardware support for anything below FP32. Correct me if I'm wrong.
If you don't update them every frame, you probably shouldn't use D3D11_USAGE_DYNAMIC.
This is a good read on buffer management: https://developer.nvidia.com/sites/default/files/akamai/gamedev/files/gdc12/Efficient_Buffer_Management_McDonald.pdf
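As a concrete illustration of the advice above, here is a rough C++ sketch of the two usual choices. The byte widths are placeholder values, and this is a Windows-only configuration fragment rather than runnable sample code; see the linked talk for the full decision tree:

```cpp
#include <d3d11.h>

// CPU writes every frame: DYNAMIC usage, updated via Map(WRITE_DISCARD).
D3D11_BUFFER_DESC PerFrameDesc()
{
    D3D11_BUFFER_DESC d = {};
    d.ByteWidth      = 256;   // constant buffers: multiple of 16 bytes
    d.Usage          = D3D11_USAGE_DYNAMIC;
    d.BindFlags      = D3D11_BIND_CONSTANT_BUFFER;
    d.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
    return d;
}

// Updated only occasionally: DEFAULT usage, no CPU access flags,
// updated through ID3D11DeviceContext::UpdateSubresource when needed.
D3D11_BUFFER_DESC RarelyUpdatedDesc()
{
    D3D11_BUFFER_DESC d = {};
    d.ByteWidth = 256;
    d.Usage     = D3D11_USAGE_DEFAULT;
    d.BindFlags = D3D11_BIND_CONSTANT_BUFFER;
    return d;
}
```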
I sometimes wish modern GPUs still had FP16 support. Not even because of throughput, but because of register file pressure. We have compute shaders where occupancy is a real problem.