Simon F said:
If you mean "find a way to parallelise the decode" then "good luck". Someone on this board recently gave a pointer to a parallel huffman decoding technique and it didn't look pretty.
No, what I meant was to invent a code wherein the Nth symbol doesn't require decoding the previous N-1 symbols. I don't think all variable-length codes impose this restriction. For example, I don't see why a code can't exist such that, if you need to decompress the Nth symbol, a function f(N) exists that returns a location M near the start of the symbol and guarantees at most 1 retry to "align" to the proper offset.
Just as an off-the-top-of-my-head analogy, imagine a code where symbol lengths are quantized to be either 2, 4, or 8 bits long. If you read 8 bits at a given location f(N) you have either read 1, 2, or 4 compressed symbols, or you have a misalignment. The encoding would contain a sync or check scheme, so that if, in fact, you needed to read an 8-bit symbol but you were misaligned (by, say, 4 bits), you could detect this and "back up" or "advance" to the next 4-bit boundary, and likewise for 2-bit misalignments.
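To make that a bit more concrete, here's a rough sketch of one way to get random access with quantized symbol lengths. It isn't the sync-bit scheme above: instead, a coarse index table plays the role of f(N) by recording the exact bit offset of every K-th symbol, and the decoder walks forward from there. The prefix assignment (0 = 2-bit code, 10 = 4-bit, 11 = 8-bit), the value of K, and the index are all my own made-up illustration, not any established format.

```cpp
// Hypothetical sketch: symbol lengths are quantized to 2, 4, or 8 bits,
// with the length signalled by a short prefix so any symbol-aligned read
// parses unambiguously.  A coarse index stands in for f(N).
#include <cstdint>
#include <vector>

struct QuantizedCode {
    std::vector<uint8_t>  bits;         // packed bitstream, MSB first
    std::vector<uint32_t> index;        // bit offset of every K-th symbol
    static constexpr uint32_t K = 16;   // index granularity (assumed)

    uint32_t readBits(uint32_t pos, uint32_t n) const {
        uint32_t v = 0;
        for (uint32_t i = 0; i < n; ++i) {
            uint32_t b = pos + i;
            v = (v << 1) | ((bits[b >> 3] >> (7 - (b & 7))) & 1u);
        }
        return v;
    }

    // Decode one symbol at bit offset 'pos' and advance 'pos'.
    // Prefix 0 -> 2-bit symbol, 10 -> 4-bit symbol, 11 -> 8-bit symbol.
    uint32_t decodeOne(uint32_t& pos) const {
        if (readBits(pos, 1) == 0) { uint32_t s = readBits(pos, 2); pos += 2; return s; }
        if (readBits(pos, 2) == 2) { uint32_t s = readBits(pos, 4); pos += 4; return s; }
        uint32_t s = readBits(pos, 8); pos += 8; return s;
    }

    // Random access: jump to the indexed offset at or below N, then walk
    // forward at most K-1 symbols.  The "at most 1 retry" idea in the post
    // would replace this walk with a sync/check test at a guessed offset.
    uint32_t decodeNth(uint32_t n) const {
        uint32_t pos = index[n / K];
        uint32_t sym = 0;
        for (uint32_t i = 0; i <= n % K; ++i)
            sym = decodeOne(pos);
        return sym;
    }
};
```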
In a given 4x4 or 8x8 block, if one saved Y bits from entropy encoding, the extra space could be used for other purposes, like increased fidelity, perhaps by encoding as many correction factors as will fit in the remaining space.
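The arithmetic is simple; assuming a fixed 64-bit budget per 4x4 block and 3-bit correction factors (both numbers made up for illustration), it's just:

```cpp
// Hypothetical bit-budget arithmetic for a fixed-size 4x4 block: whatever
// the entropy coder didn't use out of the per-block budget can hold extra
// correction deltas.  Both constants are assumptions, not a real format.
constexpr int kBlockBits      = 64;  // assumed per-block budget
constexpr int kCorrectionBits = 3;   // assumed size of one correction factor

int correctionsThatFit(int usedBits) {
    int spare = kBlockBits - usedBits;        // the Y bits saved by entropy coding
    return spare > 0 ? spare / kCorrectionBits : 0;
}
```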
Anyway, I think entropy encoding is not necessarily the place to start. The place to start is at the input textures: what non-perceptual information can be thrown away, and what low-entropy information can be interpolated or looked up, such as gradient or frequency distributions.
I also don't think implementing something like JPEG-2000 is "out of the question". I think you could use DXTC/VQ/PACKMAN/et al at the level of the L1 texture cache, and use JPEG-2000 to get data from system memory or the L2 cache. There's no shame in a tiered solution. The ultimate goal is to make better use of memory space as well as trade off computation for bandwidth.
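As a rough sketch of what that tiered path might look like (a map standing in for the L1 texture cache, a placeholder decode standing in for the JPEG-2000 tile decoder; every name here is illustrative, not a real API):

```cpp
// Sketch of a tiered texture path: tiles sit in "system memory" in some
// heavyweight format (standing in for JPEG-2000), and a small cache holds
// fixed-rate 4x4 blocks (standing in for DXTC/PACKMAN) for cheap random access.
#include <cstdint>
#include <unordered_map>
#include <vector>
#include <array>

using Block = std::array<uint8_t, 8>;             // one 64-bit fixed-rate block

struct TieredTextureCache {
    std::unordered_map<uint64_t, Block> l1;       // stands in for the L1 texture cache

    // Placeholder for the expensive tier: in a real system this would be a
    // JPEG-2000 tile decode followed by a fast fixed-rate re-encode.
    std::vector<Block> decodeTile(uint32_t tileId) {
        return std::vector<Block>(256, Block{});  // 256 blank blocks per tile (assumed)
    }

    Block fetch(uint32_t tileId, uint32_t blockId) {
        uint64_t key = (uint64_t(tileId) << 32) | blockId;
        auto it = l1.find(key);
        if (it != l1.end()) return it->second;    // hit: the cheap, random-access path

        // Miss: decode the whole tile once and populate the fast tier,
        // trading computation for bandwidth as described above.
        std::vector<Block> blocks = decodeTile(tileId);
        for (uint32_t i = 0; i < blocks.size(); ++i)
            l1[(uint64_t(tileId) << 32) | i] = blocks[i];
        return l1[key];
    }
};
```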
One of the things that gives me hope that a conceptual breakthrough may still lurk out there is the awesome efficiency of EdgeBreaker as well as the simplicity of the decode.