GPU Texture Compression demo

I spent one evening thinking about how to output all the bits correctly to a floating point render target. I had an especially hard time with automatic floating point normalization and the special bit patterns (NaN, +INF, -INF)... until I realized that I don't have to create the render target in a floating point format :). I am now using a 16-16-16-16 integer render target. The 23 bit mantissa of the 32 bit floating point pixel shader output is enough to write each channel at full precision, and I now get all the bits set properly. A 32-32 integer target would have been much harder to output to (with only 32 bit floats in the pixel shader), but the 16 bit integers are a piece of cake.
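To make that concrete, here is a tiny C sketch (my own addition, not part of the demo) showing why any 16 bit value can be written exactly from a 32 bit float while full 32 bit values generally can't:

/* Checks that every 16 bit integer survives a round trip through a
 * 32 bit float, while a large 32 bit integer does not -- the reason a
 * 16-16-16-16 target is easy to hit exactly from float shader output. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* Floats hold all consecutive integers up to 2^24, so every
       16 bit value (0..65535) round-trips exactly. */
    int exact16 = 1;
    for (uint32_t v = 0; v <= 65535u; ++v) {
        float f = (float)v;
        if ((uint32_t)f != v) { exact16 = 0; break; }
    }
    printf("all 16 bit values exact in float: %s\n", exact16 ? "yes" : "no");

    /* A 32 bit integer above 2^24 usually rounds when stored in a float,
       which is why a 32-32 integer target would be much harder to feed. */
    uint32_t big = 0xFFFFFFFFu;          /* 4294967295 */
    float    fb  = (float)big;
    printf("0xFFFFFFFF through float: %.0f\n", (double)fb);
    return 0;
}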

I'll report how well my code performs when I get it up and running properly... It's basically a port of the Humus algorithm, which builds the palette from the highest and lowest colors.
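For reference, here is a rough C sketch of that kind of min/max block compressor (just my illustration of the idea, not the demo shader or my actual code; the names and layout are made up, and a real compressor also handles the c0 <= c1 three-color case):

/* DXT1 compression of one 4x4 block with min/max endpoints: per channel
 * min and max become the two 565 endpoints, the two interpolants fill
 * out the 4 entry palette, and each pixel stores a 2 bit palette index. */
#include <stdint.h>

typedef struct { int r, g, b; } rgb;

static uint16_t pack565(rgb c)
{
    return (uint16_t)(((c.r >> 3) << 11) | ((c.g >> 2) << 5) | (c.b >> 3));
}

static int dist2(rgb a, rgb b)
{
    int dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
    return dr * dr + dg * dg + db * db;
}

/* block: 16 pixels, row major. Outputs the 64 bit DXT1 block as two
 * 565 endpoints plus a 32 bit word of packed 2 bit indices. */
void dxt1_block_minmax(const rgb block[16],
                       uint16_t *c0, uint16_t *c1, uint32_t *indices)
{
    rgb lo = block[0], hi = block[0];
    for (int i = 1; i < 16; ++i) {
        if (block[i].r < lo.r) lo.r = block[i].r;
        if (block[i].g < lo.g) lo.g = block[i].g;
        if (block[i].b < lo.b) lo.b = block[i].b;
        if (block[i].r > hi.r) hi.r = block[i].r;
        if (block[i].g > hi.g) hi.g = block[i].g;
        if (block[i].b > hi.b) hi.b = block[i].b;
    }

    /* c0 must be the numerically larger endpoint for 4 color mode. */
    *c0 = pack565(hi);
    *c1 = pack565(lo);

    /* Palette: the two endpoints and the two 1/3 interpolants. */
    rgb pal[4];
    pal[0] = hi;
    pal[1] = lo;
    pal[2] = (rgb){ (2 * hi.r + lo.r) / 3, (2 * hi.g + lo.g) / 3, (2 * hi.b + lo.b) / 3 };
    pal[3] = (rgb){ (hi.r + 2 * lo.r) / 3, (hi.g + 2 * lo.g) / 3, (hi.b + 2 * lo.b) / 3 };

    uint32_t idx = 0;
    for (int i = 0; i < 16; ++i) {
        int best = 0, bestd = dist2(block[i], pal[0]);
        for (int j = 1; j < 4; ++j) {
            int d = dist2(block[i], pal[j]);
            if (d < bestd) { bestd = d; best = j; }
        }
        idx |= (uint32_t)best << (2 * i);   /* 2 bits per pixel, LSB first */
    }
    *indices = idx;
}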

My floating point port of the DXT1 GPU compressor compresses 8888 1024x1024 tiled textures in around 0.6 ms (on the target platform, to tiled DXT). I had to do the alpha block swapping quite differently (no binary integer operations), and the packing of bits is done with floors and floating point multiplies instead of binary shifts. Getting bit perfect float output into the 4 x 16 bit signed integer render target took a bit of experimenting: the code accumulates the values in the [0, 65535] range first and then converts them to the signed range [-32768, 32767] to match the render target. I am pretty happy with the results. The GPU compressor is much faster than our CPU compressor.
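The arithmetic boils down to something like this little C sketch (illustration only; the real thing is shader code, the 5-6-5 field layout here is just an example, and I'm assuming the signed target wants the same bits reinterpreted as two's complement):

/* "Shift" with multiplies and "or" with adds, then remap the unsigned
 * 16 bit result into the signed range the render target expects. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Pack an already quantized 5-6-5 color into one 16 bit channel value
       using only float multiplies, adds and floors. */
    float r5 = 23.0f, g6 = 41.0f, b5 = 7.0f;
    float packed = floorf(r5) * 2048.0f               /* r5 << 11 */
                 + floorf(g6) * 32.0f                 /* g6 << 5  */
                 + floorf(b5);                        /* | b5     */

    /* The value is accumulated in [0, 65535]; the target is signed 16 bit,
       so remap to [-32768, 32767] before writing it out. */
    float as_signed = (packed >= 32768.0f) ? packed - 65536.0f : packed;

    printf("packed = %.0f, written as %.0f\n", packed, as_signed);
    return 0;
}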

Thanks for the idea and the DX10 source to experiment with! That sped up my process nicely :)