The issue with the dynamic range demo is probably the DX driver. As discussed before, the FX can only use 8-bit-per-component texture storage (apparently this is still true in the recent drivers?). Running multiple passes with 8-bit integer textures would save a bit of bandwidth, but if the bottleneck is computational, it would still be slow, and the image quality would suffer on top of that.
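
To make the trade-off concrete, here is a rough sketch in Python (not the demo's actual shader code). It assumes an 8-bit integer texture clamps values to [0, 1] and rounds them to one of 256 levels each time a pass writes its result. The per-pass operation (scaling by 0.9) and the helper names are made up purely for illustration.

```python
def quantize8(x):
    """Clamp to [0, 1] and round to the nearest of 256 levels,
    roughly what storing into an 8-bit integer texture does."""
    x = min(max(x, 0.0), 1.0)
    return round(x * 255) / 255


def run_passes(value, passes, store):
    """Apply a hypothetical per-pass operation (scale by 0.9)
    and store the intermediate result after every pass."""
    for _ in range(passes):
        value = store(value * 0.9)
    return value


def bytes_per_texel(bits_per_component, components=4):
    """Storage cost of one RGBA texel at the given component width."""
    return bits_per_component * components // 8


# Same eight passes, once with exact storage and once with 8-bit storage.
exact = run_passes(0.5, 8, lambda v: v)
quantized = run_passes(0.5, 8, quantize8)
error = abs(exact - quantized)

print("accumulated error after 8 passes:", error)
print("bright value 4.0 stored in 8 bits:", quantize8(4.0))
print("bytes per RGBA texel at 8/16/32 bits:",
      bytes_per_texel(8), bytes_per_texel(16), bytes_per_texel(32))
```

The 8-bit path moves a half or a quarter of the bytes per texel compared to 16-bit or 32-bit components, which is the bandwidth saving. The cost is that anything brighter than 1.0 is clamped and rounding error builds up from pass to pass, and none of it helps if the shader math itself is the bottleneck.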