ASTC compression algorithm

msxyz

I'm looking for information about the fairly recent ASTC texture compression algorithm (both compression and decompression); I'm particularly interested in the 4x4 and 8x8 modes as they look like a step up in quality from DXT1/3/5 and the PVRTC 2bpp and 4bpp modes.

There is not a lot of literature around that explains the math behind it, even though the format is quite widespread and supported by recent hardware. The most useful page on the subject I have found is this:
https://community.arm.com/graphics/b/blog/posts/astc-does-it

Also, does anybody know how the decompression hardware compares, in terms of complexity and computation, to that of other widespread formats?
 
Thanks Simon. So gathering all the bits of info together, it seems that ASTC does things a bit differently from other texture compression algorithms. Every block has four 'endpoints' (versus two for S3TC) and, instead of interpolation coefficients, it contains a sort of pointer into a pre-set (hardcoded?) lookup table containing a large number of patterns. During the compression phase, the algorithm has to pick the pattern that gives the closest match between the reconstructed color values and the original rectangle (or square) of pixels. Am I right?
 
IIRC it's vastly more complex than that.

I'll probably have many of the details wrong, but IIRC each block may be (independently?)
  • Low precision (8bit/channel) OR high precision (HDR)**
  • Have a single map (e.g. R, RG, RGB) or two maps (e.g. RGB + A)
  • Have 1 through 4 different colour partitions. If > 1 partition, then the pixel to partition map is generated with pseudo-random functions from a 10bit seed.
  • There are about 4 different ways of encoding the end point colours for the partition.
  • The index/blending weights can be specified at a resolution different to that of the block (and are thus interpolated to the block resolution)
  • The precision of the specified weights and colour end points can vary.
  • Can, IIRC, encode that other blocks in the neighbourhood are similar.
It's, err, quite involved.
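
To illustrate the point about weights being stored at a lower resolution than the block, here is a minimal sketch in Python. It uses plain floating-point bilinear interpolation; the real ASTC decoder uses its own fixed-point scheme defined in the spec, so `upsample_weights` and its parameters are hypothetical names for illustration only, not the actual decode procedure.

```python
def upsample_weights(grid, grid_w, grid_h, block_w, block_h):
    """Bilinearly stretch a grid_w x grid_h weight grid over a block.

    grid is a list of rows; the result is a list of block_h rows,
    each holding block_w interpolated weights.
    """
    out = []
    for y in range(block_h):
        # Map the texel row into grid coordinates (corners line up).
        gy = y * (grid_h - 1) / (block_h - 1) if block_h > 1 else 0.0
        y0 = int(gy)
        y1 = min(y0 + 1, grid_h - 1)
        fy = gy - y0
        row = []
        for x in range(block_w):
            gx = x * (grid_w - 1) / (block_w - 1) if block_w > 1 else 0.0
            x0 = int(gx)
            x1 = min(x0 + 1, grid_w - 1)
            fx = gx - x0
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

# A 2x2 weight grid stretched over a 4x4 block: 4 stored weights drive 16 texels.
weights = upsample_weights([[0.0, 1.0], [0.0, 1.0]], 2, 2, 4, 4)
```

The point of the trick is that the encoder can spend fewer bits on weights and more on endpoints (or the reverse) on a per-block basis.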


**Though I think it's been said that the BC6 (or is it BC7?) format is a better choice for HDR content. (Actually we did an HDR encoding via PVRTC at 6bpp (basically 1x4bpp and 1x2bpp) which I felt did a better job of HDR)
 
ASTC key differences compared to BC6/7 formats:
- ASTC calculates the color partitioning table by a random generator. Around half of the patterns are repeats or otherwise bad. BC6/7 use a LUT for partitioning patterns. BC thus saves one bit of storage and makes decoding/encoding easier than ASTC, but the LUT takes space in the sampler hardware.
- ASTC bit-packs values into fractional bits (Bounded Integer Sequence Encoding). This encoding is more complex than that of BC6/7 (which use whole bits). This gives ASTC a slight advantage, but at the cost of a lot more complexity.
- ASTC's fractional bit encoding allows flexible non-power-of-two block sizes from 4x4 to 12x12. The block size in bytes stays the same (thus the compression ratio increases). BC6/7 use a fixed 4x4 block size (similar to the old BC/DXT/S3TC formats).
- ASTC has modes for filtered upscaling of decompressed data (the weights are stored per grid point, not per pixel). The partition mask is still pixel precise. This is how you can reach compression rates of less than 1 bit per pixel (with the largest block sizes).
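
For anyone wondering what "fractional bits" means in practice, here is a small sketch of the BISE size arithmetic as I understand it from the spec: five base-3 digits (trits) are packed into 8 bits, and three base-5 digits (quints) into 7 bits, with any remaining low bits stored as plain bits. The function name `bise_bits` is my own, and this only counts bits; it does not perform the actual packing.

```python
def bise_bits(count, plain_bits, trits=False, quints=False):
    """Bits needed to store `count` values with BISE-style packing.

    Each value has `plain_bits` ordinary bits plus, optionally,
    one trit (5 trits share 8 bits) or one quint (3 quints share 7 bits).
    """
    if trits and quints:
        raise ValueError("a value carries a trit or a quint, not both")
    total = count * plain_bits
    if trits:
        total += (8 * count + 4) // 5   # ceil(8 * count / 5)
    if quints:
        total += (7 * count + 2) // 3   # ceil(7 * count / 3)
    return total

# 16 weights with three levels each (range 0..2):
packed = bise_bits(16, 0, trits=True)   # 26 bits, about 1.6 bits per weight
whole = 16 * 2                          # 32 bits if each weight took 2 whole bits
print(packed, whole)
```

Those few saved bits per block are what let ASTC offer ranges like 0..2, 0..4, 0..5, 0..9 and so on instead of only powers of two.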

Otherwise ASTC and BC6/7 are highly similar. Both have lots of different compression modes and partitioning modes. Each tile has a separate header that describes the mode. Tiles are fully independent, with no dependencies on neighbors (unlike PVRTC, which has them).

ASTC and BC6/7 produce similar maximum quality (4x4 blocks) at the same memory footprint. ASTC's compression ratio, however, scales much higher if you don't need the absolute best quality. BC6/7 offer only a single bitrate. ASTC also supports 3D tiles (3x3x3 to 6x6x6) for improved volume texture compression.
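
The bitrate scaling follows directly from the block always being 128 bits, whatever its footprint. A quick sketch of the arithmetic (the helper name is just for illustration):

```python
ASTC_BLOCK_BITS = 128  # every ASTC block is 128 bits (16 bytes)

def astc_bits_per_texel(width, height, depth=1):
    """Bitrate for a given block footprint; depth > 1 means a 3D block."""
    return ASTC_BLOCK_BITS / (width * height * depth)

for dims in [(4, 4), (6, 6), (8, 8), (12, 12), (3, 3, 3), (6, 6, 6)]:
    label = "x".join(str(d) for d in dims)
    print(label, round(astc_bits_per_texel(*dims), 2), "bpp")
```

So 4x4 lands on the same 8 bpp as BC6/7, 8x8 gives 2 bpp, and 12x12 dips below 1 bpp.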
 
- ASTC calculates the color partitioning table by a random generator. Around half of the patterns are repeats or otherwise bad.
I think in our experiments, only about 8 partitions actually got used with any reasonable frequency
 
I think in our experiments, only about 8 partitions actually got used with any reasonable frequency
Their papers claim that half of the ASTC patterns are usable. They most likely used a test set of millions of images and encountered some patterns only once (= usable pattern). A LUT based approach is certainly better. In comparison BC7 mode 0 has 4 partition bits (16 partitions). Every one is useful. Modes 1,2,3,7 have 6 partition bits (64 partitions). I would assume that more than 16 partitions are used in common BC7 images, as 4x more partitions need 4x larger LUT in hardware (but 128 byte LUT shouldn't be that bad).
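
A back-of-the-envelope check on that LUT figure, assuming the table is stored naively with one subset id per texel and no exploitation of symmetry (real hardware may well store it differently). The helper name is hypothetical:

```python
def partition_lut_bytes(num_partitions, texels_per_block, num_subsets):
    """Naive LUT size: one subset id per texel for every partition pattern."""
    bits_per_texel = (num_subsets - 1).bit_length()  # 2 subsets -> 1 bit, 3 -> 2 bits
    return num_partitions * texels_per_block * bits_per_texel // 8

print(partition_lut_bytes(64, 16, 2))  # 64 two-subset patterns on a 4x4 block
print(partition_lut_bytes(64, 16, 3))  # 64 three-subset patterns
print(partition_lut_bytes(16, 16, 3))  # 16 three-subset patterns (mode 0 style)
```

Under that assumption the 64 two-subset patterns come to 128 bytes, and a three-subset table of the same length would be twice that.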

Official BC7 mode docs (not very good): https://msdn.microsoft.com/en-us/library/windows/desktop/hh308954(v=vs.85).aspx

All 64 BC7 partitions visualized. It seems that all are useful. No noisy bad ones:
https://rockets2000.wordpress.com/
 
A couple of days ago, I found a whitepaper detailing BC6. They use only 32 patterns which are a subset of those used in BC7 (or possibly BC7 expanded the number of patterns). A shame I didn't save it. If I stumble on that page again, I'll post the link here.

Actually I'm more interested (though my interest is merely an intellectual exercise) in the highest saving that can be achieved with a reasonably fast and straightforward algorithm. How many texture compression schemes can claim to achieve 2bpp rates out of a 15-24 bit RGB texture? I count three: vector quantisation with 2x2 blocks and a 256-entry LUT, PVRTC 2bpp, and ASTC in 8x8 mode. VQ seems to have gone a bit out of fashion nowadays, but it's hard to imagine a simpler algorithm. One thing I remember about VQ is that the final quality is highly dependent on the compressor and that the slowest ones could take minutes to compress a single texture (I'm talking about early 2000s hardware).
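
To show just how simple the VQ decode side is, here is a minimal sketch. It assumes one 8-bit index per 2x2 block, indices in plain row-major order, and a codebook of 256 entries holding four 16-bit texels each; real hardware may order the data differently, and the function names are made up for the example.

```python
def vq_decode(indices, codebook, width, height):
    """Expand one codebook index per 2x2 block into a full image.

    codebook[i] is a tuple (top_left, top_right, bottom_left, bottom_right).
    """
    image = [[None] * width for _ in range(height)]
    blocks_per_row = width // 2
    for i, idx in enumerate(indices):
        bx = (i % blocks_per_row) * 2
        by = (i // blocks_per_row) * 2
        tl, tr, bl, br = codebook[idx]
        image[by][bx], image[by][bx + 1] = tl, tr
        image[by + 1][bx], image[by + 1][bx + 1] = bl, br
    return image

def vq_bits_per_pixel(width, height, codebook_entries=256, texel_bits=16):
    """Index bits plus codebook overhead, averaged over the whole texture."""
    index_bits = (codebook_entries - 1).bit_length()      # 8 bits for 256 entries
    blocks = (width // 2) * (height // 2)
    codebook_bits = codebook_entries * 4 * texel_bits     # 2 KB at 16-bit texels
    return (blocks * index_bits + codebook_bits) / (width * height)

print(vq_bits_per_pixel(1024, 1024))  # a bit over 2 bpp once the codebook is counted
```

Decoding is a single table lookup per texel; all the difficulty is in choosing the 256 codebook entries, which is why the compressor matters so much.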
 
One thing I remember about VQ is that the final quality is highly dependent on the compressor and that the slowest ones could take minutes to compress a single texture (I'm talking about early 2000s hardware).
ASTCEnc takes minutes to compress a single 4096*4096 texture on a high end i7 (2016 hardware). Intel has their own ASTC encoder: https://software.intel.com/en-us/articles/fast-ispc-texture-compressor-update. The quality is slightly worse than ASTCEnc, but it is up to 44x faster at same quality. It is programmed with ISPC (a GPU-style SPMD programming language that generates SSE/AVX/AVX2 code). Intel also has a very good ISPC based BC7 compressor.
 
Speaking of texture compression, is there a tool/utility or a nifty little program that does vector quantization of images using an approach similar to that employed with PowerVR chip inside the Dreamcast?
 
Other than a purely academic interest? None.

But VQ techniques are a very interesting subject. It's also interesting to note that many of these texture compressing techniques describe a way to decompress the images but not the algorithm used to generate them. This is an interesting challenge because you know your destination but you want to look for the most efficient road (i.e PSNR or coding speed or a mix of both) to get there.
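
For reference, PSNR is just a log-scaled mean squared error, so it is easy to bolt onto any experimental encoder. A minimal sketch for 8-bit data passed as flat lists (the function name and signature are my own):

```python
import math

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)

print(psnr([10, 20, 30, 40], [12, 18, 30, 41]))
```

Higher is better, and it says nothing about how visible the errors are, which is part of why encoder design stays interesting.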
 
Speaking of texture compression, is there a tool/utility or a nifty little program that does vector quantization of images using an approach similar to that employed with PowerVR chip inside the Dreamcast?
Apart from, say, IMG's VQ compressor tool itself? It's quite old but I wonder if it could be released?


It's also interesting to note that many of these texture compressing techniques describe a way to decompress the images but not the algorithm used to generate them. This is an interesting challenge because you know your destination but you want to look for the most efficient road (i.e PSNR or coding speed or a mix of both) to get there.
Agreed.
 
It's also interesting to note that many of these texture compressing techniques describe a way to decompress the images but not the algorithm used to generate them.
Format spec only needs to specify the meaning of the bits and how the bits are decoded. Compression standard doesn't need to force any specifics for the encoder, as long as it produces legal bit patterns that match the spec.

Encoders will usually evolve in time. Slowest reference encoders tend to use brute force exhaustive search. This results in absolutely best quality, but takes hours to encode large images.
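
As a toy illustration of the brute-force approach (this is a made-up single-channel format with two 5-bit endpoints and 2-bit indices, not ASTC or BC7), an exhaustive encoder simply tries every endpoint pair and keeps the one with the lowest squared error:

```python
def encode_block_exhaustive(texels, endpoint_bits=5):
    """Try every (lo, hi) endpoint pair; return (error, lo, hi, indices)."""
    levels = 1 << endpoint_bits
    best = None
    for lo in range(levels):
        for hi in range(lo, levels):
            # Expand quantised endpoints to the 0..255 range.
            a = lo * 255 / (levels - 1)
            b = hi * 255 / (levels - 1)
            # Four-entry palette: the endpoints plus two interpolated values.
            palette = [a + (b - a) * k / 3 for k in range(4)]
            err = 0.0
            indices = []
            for t in texels:
                k = min(range(4), key=lambda i: (palette[i] - t) ** 2)
                indices.append(k)
                err += (palette[k] - t) ** 2
            if best is None or err < best[0]:
                best = (err, lo, hi, indices)
    return best

block = [0] * 8 + [255] * 8          # a hard edge through a 4x4 block
err, lo, hi, indices = encode_block_exhaustive(block)
```

Even this toy visits 528 endpoint pairs per block. Add more channels, partitions and modes and the search space explodes, which is why practical encoders rely on heuristics instead.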

There exist various trade-offs between quality and performance. For example, Intel's ASTC encoder is super fast but not as high quality as the (44x) slower alternatives: https://software.intel.com/en-us/articles/fast-ispc-texture-compressor-update
 
Apart from, say, IMG's VQ compressor tool itself? It's quite old but I wonder if it could be released?
I've seen an old SEGA Photoshop plugin that was part of the DC development kit. Is this what you're referring to? It has been out in the wild for quite some time, at least since people began to be interested in DC emulation :)

Also, is that the same tool you used when you wrote this comparison? http://web.onetel.net.uk/~simonnihal/texcom/texcompcomp.html
 
what does PVRTexTool do?
I have some games that come supplied with it, why I dont know
Do you mean how does the compression algorithm work? If so, do you mean for the PVRTC cases?

I've seen an old SEGA Photoshop plugin that was part of the DC development kit. Is this what you're referring to? It has been out in the wild for quite some time, at least since people began to be interested in DC emulation :)
I don't remember a photoshop plugin, but if it has a vqdll.dll file associated with it, then it may be derived from the IMG VQ compressor.

Also, is that the same tool you used when you wrote this comparison? http://web.onetel.net.uk/~simonnihal/texcom/texcompcomp.html
Yes, I certainly used the IMG compressor (VQGen.exe + vqdll.dll) to create those.
 
Do you mean how does the compression algorithm work? If so, do you mean for the PVRTC cases?
No I mean why does the game give me a copy of pvrtextool.exe (arkham origins, borderlands the pre sequel, colonial marines) what am i supposed to do with it
or does the game execute it (if i delete it the game still works)
 