GPU Texture Compression demo

Humus

I have a new demo showing how to do texture compression entirely on the GPU in Direct3D 10.1. I'm compressing a luminance texture to the BC4 format. The shader is merely 49 hardware ALU instructions on an HD 3870, which for 16 pixels in a block is ~3 instructions per pixel. Since it's just a single channel texture I've also optimized the compression by sampling with Gather (AKA Fetch4) to reduce the texture fetches from 16 down to 4. :)
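
For reference, here's roughly what the per-block work amounts to, written as a minimal CPU sketch rather than the demo's actual shader code. It assumes the trivial min/max endpoint selection; on the GPU the 16 block values come from 4 Gather fetches instead of a loop, and the packing is done with the DX10 integer ops.

```cpp
#include <cstdint>
#include <algorithm>

// CPU reference sketch of encoding one 4x4 block of 8-bit luminance into a
// 64-bit BC4 block. Names are illustrative, not taken from the demo.
uint64_t EncodeBC4Block(const uint8_t texels[16])
{
    // 1. Endpoint selection is trivial for a single channel: just min and max.
    uint8_t lo = 255, hi = 0;
    for (int i = 0; i < 16; i++) {
        lo = std::min(lo, texels[i]);
        hi = std::max(hi, texels[i]);
    }

    // red0 > red1 selects the 8-value interpolation mode.
    uint64_t block = uint64_t(hi) | (uint64_t(lo) << 8);
    if (hi == lo)
        return block;                 // flat block: all indices stay 0 (= red0)

    // 2. For each texel, pick the nearest of the 8 palette steps between the
    //    endpoints and remap to BC4's index ordering (0 = red0, 1 = red1,
    //    2..7 = the interpolated values from red0 towards red1).
    float scale = 7.0f / float(hi - lo);
    for (int i = 0; i < 16; i++) {
        int step = int(float(hi - texels[i]) * scale + 0.5f); // 0 = red0 ... 7 = red1
        uint32_t index = (step == 0) ? 0 : (step == 7) ? 1 : uint32_t(step + 1);
        block |= uint64_t(index) << (16 + 3 * i);
    }
    return block;
}
```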

Not so much eye-candy in this demo as it just shows you a static texture (which obviously is just a photo), but here's the usual screenshot:
[Screenshot: GPUTextureCompression.jpg]


You can toggle between the compressed and uncompressed texture with F5.

Download here
 
Well, it's not really comparable. Those are offline compressors, even if the latest NV one can use the GPU to speed up the process. Clearly those are aimed at quality, whereas I went for something quick and reasonably good, which is what you'd likely want in a real-time application. I don't know what format they were compressing to in the chart in your link (probably DXT1), but at 2.29 textures/second that's not real-time. This demo could compress thousands of hi-res textures in a second; the bottleneck would be reading them in from disk.

I should say that BC4 is clearly the fastest format to compress to, since there's only a single dimension to deal with and selecting reasonable control points is trivial, whereas it's a much harder problem for DXT1. DXT1 should be possible in real-time too, though a fair amount slower than BC4.
 
Interesting!

If you have a game with render-to-texture, would it be beneficial to texture compress it by the GPU first before using the texture later on? I guess the answer will be 'it depends on how much you're using that compressed texture and how much BW you're otherwise consuming.' ;)
So let's rephrase: is real time texture compression something that's already being done in current games? Does it result in significant speedups in practice?
 
So let's rephrase: is real time texture compression something that's already being done in current games?

No. It's only in DX10.1 that it becomes a reasonable thing to do.

Does it result in significant speedups in practice?

It could potentially. Depends on how many reads there will be for each time you compress. A guesstimate is that you'd need to read maybe 5 times or more before it pays off. One place where it almost certainly would have been beneficial is if I integrated it into my DynamicLightmapping demo. There the lightmap could be reused for thousands of frames before it's regenerated.
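
For the curious, here's the kind of back-of-the-envelope math behind that guesstimate. It counts memory traffic only and the cost model is rough, so treat the numbers as illustrative rather than measured.

```cpp
#include <cstdio>

// Rough break-even sketch for compressing an L8 (8 bpp) texture to BC4 (4 bpp).
int main()
{
    const float uncompressedBpp = 8.0f;   // L8 luminance
    const float compressedBpp   = 4.0f;   // BC4: 64 bits per 4x4 block

    // One-time cost per pixel: read the source once for compression, write the
    // packed blocks, then copy them into the BC4 resource (read + write again).
    float compressCost = uncompressedBpp + 3.0f * compressedBpp;

    // Each later use of the texture saves the difference in read bandwidth.
    float savingPerRead = uncompressedBpp - compressedBpp;

    printf("break-even after ~%.1f reads\n", compressCost / savingPerRead); // ~5
    return 0;
}
```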

Very nice, but it's not a bloody fireworks screensaver, is it :D

:p
Yeah, this is mostly going to be interesting to programmers.
 
No. It's only in DX10.1 it's a reasonable thing to do.
Is this because of the Fetch4?

One place where it almost certainly would have been beneficial is if I integrated it into my DynamicLightmapping demo. There the lightmap could be reused for thousands of frames before it's regenerated.
I can't wait to see the results. ;)

Thanks!
 
Is this because of the Fetch4?

No, it's because in earlier APIs there's no way to copy data on the GPU from a resource of another type into a compressed texture. So to make this work you would have to transfer the data back to the system memory across the PCIe/AGP bus, and then copy it back to a compressed texture in video memory.
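
In D3D10.1 the whole thing stays in video memory. Roughly, the setup looks like this; the variable names and exact code are illustrative, not taken from the demo:

```cpp
#include <d3d10_1.h>

// The compression shader writes one 64-bit BC4 block per pixel into a
// quarter-resolution R32G32_UINT render target. A Direct3D 10.1 device then
// allows that data to be copied straight into a BC4 texture on the GPU.
void CreateCompressionTargets(ID3D10Device1* device, UINT width, UINT height,
                              ID3D10Texture2D** packedBlocks,
                              ID3D10Texture2D** compressed)
{
    D3D10_TEXTURE2D_DESC desc = {};
    desc.Width            = width / 4;               // one texel per 4x4 block
    desc.Height           = height / 4;
    desc.MipLevels        = 1;
    desc.ArraySize        = 1;
    desc.Format           = DXGI_FORMAT_R32G32_UINT; // 64 bits, same as a BC4 block
    desc.SampleDesc.Count = 1;
    desc.Usage            = D3D10_USAGE_DEFAULT;
    desc.BindFlags        = D3D10_BIND_RENDER_TARGET;
    device->CreateTexture2D(&desc, nullptr, packedBlocks);

    desc.Width     = width;                          // full-size compressed texture
    desc.Height    = height;
    desc.Format    = DXGI_FORMAT_BC4_UNORM;
    desc.BindFlags = D3D10_BIND_SHADER_RESOURCE;
    device->CreateTexture2D(&desc, nullptr, compressed);
}

// After rendering the compression pass into packedBlocks:
//     device->CopyResource(compressed, packedBlocks);
// Direct3D 10.0 rejects this copy between uncompressed and block-compressed
// formats, which is why the data would otherwise have to take the round trip
// through system memory.
```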
 
Long overdue, but I've uploaded another demo on this theme, except it's now compressing to the DXT1 format, which is probably more useful, if a bit more complex to compress to. It's still certainly real-time. :)
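
For reference, one common quick approach to DXT1 looks roughly like this on the CPU: bounding-box endpoints and a single projection per texel. This is a sketch of the general technique, not necessarily what the demo's shader does, and the names are my own.

```cpp
#include <cstdint>
#include <algorithm>

struct RGB { int r, g, b; };

// Pack an 8-bit-per-channel color into RGB565.
static uint16_t To565(const RGB& c)
{
    return uint16_t(((c.r >> 3) << 11) | ((c.g >> 2) << 5) | (c.b >> 3));
}

// Quick encoding of one 4x4 block of RGB texels into a 64-bit DXT1 (BC1) block.
uint64_t EncodeDXT1Block(const RGB texels[16])
{
    // 1. Endpoints: the per-channel bounding box extremes of the block.
    RGB lo = {255, 255, 255}, hi = {0, 0, 0};
    for (int i = 0; i < 16; i++) {
        lo.r = std::min(lo.r, texels[i].r); hi.r = std::max(hi.r, texels[i].r);
        lo.g = std::min(lo.g, texels[i].g); hi.g = std::max(hi.g, texels[i].g);
        lo.b = std::min(lo.b, texels[i].b); hi.b = std::max(hi.b, texels[i].b);
    }

    uint16_t c0 = To565(hi), c1 = To565(lo);     // c0 >= c1 by construction
    uint64_t block = uint64_t(c0) | (uint64_t(c1) << 16);
    if (c0 == c1)
        return block;                            // flat block: all indices 0 (= c0)

    // 2. Project each texel onto the lo->hi diagonal, quantize to 4 steps and
    //    remap to DXT1's index order (0 = c0, 1 = c1, 2 and 3 = interpolants).
    RGB d = {hi.r - lo.r, hi.g - lo.g, hi.b - lo.b};
    int len2 = d.r * d.r + d.g * d.g + d.b * d.b;
    static const uint32_t remap[4] = {1, 3, 2, 0};
    for (int i = 0; i < 16; i++) {
        int dot = (texels[i].r - lo.r) * d.r +
                  (texels[i].g - lo.g) * d.g +
                  (texels[i].b - lo.b) * d.b;
        int step = (3 * dot + len2 / 2) / len2;  // 0 = lo ... 3 = hi
        block |= uint64_t(remap[step]) << (32 + 2 * i);
    }
    return block;
}
```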

[Screenshot: GPUTextureCompression2.jpg]


Download here
 
The framerate is a bit sporadic, but it hovers between 1500 and 2100 on my 3.6GHz Q9450 and a pair of 3870s in CrossFire on Vista64 Ultimate. Cool :)
 
Resurrecting an old thread...

I just wanted to ask if you (Humus) or anyone else has done this kind of GPU DXT compression using only floating-point instructions (instead of all the DX10 integer ops in the code)?

I am programming a virtual texture system (for current consoles) and I'd like to compress textures to DXT on the fly to optimize system memory and bandwidth usage. Doing the compression on the GPU sounds like the best choice for my system.
 
I haven't attempted that. Most of it is done with floating-point math already; there's some bit crunching at the end, but it should be possible to accomplish that with floating-point math too. On the Xbox 360 I suppose a memexport shader is the way to go, but on the PS3 I would guess using the SPUs might be faster than using the GPU.
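
To illustrate the bit-crunching part: ORing together non-overlapping bit fields is just a sum of multiply-adds, which float math computes exactly as long as each partial result fits in the 24-bit mantissa. Here's a toy CPU sketch of that idea for the BC4 index bits; how the packed values then reach memory (e.g. which render-target format they're written to) is platform-specific and not shown.

```cpp
#include <cassert>
#include <cstdint>

int main()
{
    uint32_t indices[16];
    for (int i = 0; i < 16; i++)
        indices[i] = (i * 5) & 7;               // arbitrary 3-bit test values

    // Pack the 16 three-bit BC4 indices (48 bits) as two 24-bit halves.
    for (int half = 0; half < 2; half++) {
        // Integer reference: fields at bit offsets 0, 3, ..., 21.
        uint32_t reference = 0;
        for (int i = 0; i < 8; i++)
            reference |= indices[half * 8 + i] << (3 * i);

        // Float version: multiply-add instead of shift-or; 24 bits stays exact.
        float packed = 0.0f;
        float scale  = 1.0f;
        for (int i = 0; i < 8; i++) {
            packed += float(indices[half * 8 + i]) * scale;
            scale  *= 8.0f;                     // next field starts 3 bits higher
        }
        assert(uint32_t(packed) == reference);
    }
    return 0;
}
```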
 
It's available, but my site has moved from .ca to .name since the thread was originally posted. I've updated the links.
 