Will Microsoft ever adopt a new compression method?


Brimstone
21-Dec-2002, 16:08
From what I understood FXT1 by 3dfx is better than S3TC. Microsoft stuck with S3's technology because they wanted one standard. It seems algorithms are getting improved upon all the time for techniques like AA. Why not texture compression? At some point will Microsoft have a compelling reason to look for a better technology than S3's?

Colourless
21-Dec-2002, 16:24
FXT1 is better in some ways and worse in others.

andypski
21-Dec-2002, 16:30
From what I understood FXT1 by 3dfx is better than S3TC. Microsoft stuck with S3's technology because they wanted one standard. It seems algorithms are getting improved upon all the time for techniques like AA. Why not texture compression? At some point will Microsoft have a compelling reason to look for a better technology than S3's?

The advantages of FXT1 over S3TC are marginal - certainly not worth Microsoft dividing what is otherwise a simple standard and making things harder to get right. After all, it's apparently difficult for some vendors to even implement the current DXTC standard correctly (or, at least, in what could be regarded as an intelligent manner...) :wink:

More seriously, new compression formats will certainly have to have tangible and demonstrable advantages over DXTC in either compression ratio or quality to have a chance of being included.

Hyp-X
21-Dec-2002, 17:27
Well, a normal map friendly compressed format would be nice.

High dynamic range compressed format would also be a good thing.

mboeller
21-Dec-2002, 17:43
SimonF mentioned that MS will not introduce a new compression scheme in the foreseeable future. IMHO IMG have tried to "sell" their 2bit/texel PVRTC to MS.

ActionNews
21-Dec-2002, 17:49
What about this:
PVR-TC is not the same compression scheme as implemented in KYRO (DXTC), nor is it the same as was used in Dreamcast (VQ) - it is instead one which achieves up to twice the compression rates of DXTC - a very important factor for embedded systems where bandwidth is so limited - while maintaining high image quality. It is a compression scheme that we will be incorporating in future products. (Kristof Beets (PowerVR) in a PowerVR Generations interview (http://www.pvrgenerations.co.uk/cgi-bin/viewarticle.cgi?page=/articles/2002/kristof1102&printer=0&pagenum=1))

This is an interesting new texture compression scheme :)!

CU ActionNews

ActionNews
21-Dec-2002, 17:57
SimonF mentioned that MS will not introduce a new compression scheme in the foreseeable future. IMHO IMG have tried to "sell" their 2bit/texel PVRTC to MS.

Sorry, I wasn't fast enough :)! I had to search for the interview :)! I didn't know that you had posted about PVRTC in the meantime.

CU ActionNews

Humus
21-Dec-2002, 18:52
S3TC has served us well for a while now, but I feel the time for retirement of texture compression is closing in. In the future, with long shaders etc., I think texture compression will be more or less a forgotten feature, as shader execution will be the performance-determining factor and not memory bandwidth.

790
21-Dec-2002, 19:38
And what about memory space? I don't see compression going anywhere. I can eat 128MB on current cards even with DXT1 compression; it'll be many, many years before space is no longer an issue. Surely you aren't proposing we stream all our textures over the AGP bus every frame?

MDolenc
21-Dec-2002, 21:21
Once shaders are fast enough, why not do texture (de)compression with them?

Hyp-X
21-Dec-2002, 21:25
I agree that space is the most important part (while being faster IS a nice thing as well).

And you can no longer rely on AGP memory either, as you'll soon run out of base memory as well.

Especially with texture management - where you have a copy of everything in the system memory.

And while we are at it, vertex compression is also important for the same reasons. I think it is very important that R300 supports many packed vertex formats. It is the way to go.

Brimstone
22-Dec-2002, 00:07
Nvidia and ATI both want to be everywhere pixels are utilized. The PowerVR interview brings up a good point about the value of PVR-TC. As you move away from desktops into other markets like embedded systems, bandwidth is going to come at a premium, as will ample amounts of system memory. Cell phones are starting to get some good displays, and when they get around to doing 3D graphics, I would assume texture compression will be of benefit.

MfA
22-Dec-2002, 02:39
3D textures are screaming for a storage scheme tailored to them ... not even so much pure compression, since that would only be of marginal help; adaptive resolution storage of the textures would make 3D textures vastly more useful.

Hell, I'd love to see adaptive resolution 2D textures and render targets too ... given the choice I would take it over better compression any day. Because you have to be able to efficiently decode the textures in real time, pure compression won't give too much gain over DXTC; gains from adaptive resolution textures would be in a different order of magnitude for things such as lightmaps (orders in the case of 3D).

Simon F
23-Dec-2002, 10:05
From what I understood FXT1 by 3dfx is better than S3TC.
As others have said, there are probably a few times when S3 is better, however I think the biggest problem is that at least one of the FXT1 modes probably infringes S3's patent and so adoption could be risky.

SimonF mentioned that MS will not introduce a new compression scheme in the foreseeable future. IMHO IMG have tried to "sell" their 2bit/texel PVRTC to MS.
I don't think I said exactly that. :o
What I believe I did say was that the last time I demonstrated PVR-TC to a leading member of the DX research team, he said something equivalent to them being reluctant to introduce a new compression scheme unless it had a significant increase in compression rate.

Basic
23-Dec-2002, 15:18
It's possible to tweak FXT1 so that the only drawback compared to S3TC is that it selects compression mode per 8x4 pixel block instead of per 4x4 pixel block. Or in other words, the 15 bit vs 16 bit issue can be removed. However, that would bring it even closer to S3's patent.

Still, it would only be a marginal gain.

andypski
23-Dec-2002, 16:07
It's possible to tweak FXT1 so that the only drawback compared to S3TC is that it selects compression mode per 8x4 pixel block instead of per 4x4 pixel block. Or in other words, the 15 bit vs 16 bit issue can be removed. However, that would bring it even closer to S3's patent.

Still, it would only be a marginal gain.

Overall it might not be a gain at all - block->block noise (low frequency noise created by colour choice mismatches at block intersections) is one of the key problems that a high quality DXTC compressor needs to deal with, and is very difficult to solve optimally. In addition to this the low-frequency and structured nature of this noise makes it one of the most noticeable artifacts caused by the compression. With a larger block size the errors from block->block are likely to get larger.

Dio
23-Dec-2002, 18:41
As a general rule of thumb, the maximum compression you can achieve is dependent largely on the block size, larger blocks giving better compression.

However, blocking artifacts then generally dominate at high compression ratios (this is clearly shown by JPEG, MPEG etc.).

DXTC is a very finely chosen tradeoff...

Simon F
24-Dec-2002, 09:43
As a general rule of thumb, the maximum compression you can achieve is dependent largely on the block size, larger blocks giving better compression.
Of course the downside (at least with a scheme such as S3TC) is that the quality goes down as it gets very much harder to represent the increased number of pixels in the block with a limited number of 'base' colours.

The big advantage of compression methods such as JPEG is the fact that the data per block is variable which, in turn, makes it rather unsuitable for texturing. I suppose in a way, VQ was a 'variable rate' compression method in the sense that areas of low detail would be assigned, on average, a lower number of bits. The only problem is that the HW guys didn't like implementing it because it needed a 2nd cache to hide the indirection.
However, blocking artifacts then generally dominate at high compression ratios (this is clearly shown by JPEG, MPEG etc.).
Have you seen the MPEG 4 spec? It has some horrendous post-processing to remove the block artefacts. Effective, I suppose, but not exactly elegant. In many ways it'd be nicer to avoid blocks entirely, but I guess backward compatibility is essential.

DXTC is a very finely chosen tradeoff...
But I wonder what really drove the decisions. I have a feeling it may have been influenced by the number of bits needed to store each block. 64 bits is a nice granularity (which would also fit the typical bus widths of the time) which then equates to the 2x16bit colours plus the 16x2bits of indexing. The next logical size is 128 bits which would have required two transactions from the external bus and a cache that was twice as wide.
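
To make the 64-bit budget above concrete, here is a minimal C sketch of one block. The struct and function names are invented for this illustration, and the index order (texel 0 in the lowest two bits) follows the commonly documented DXT1 layout rather than anything stated in this thread.

```c
#include <stdint.h>

/* One DXT1 block covers 4x4 texels in 64 bits:
 * two 16-bit RGB565 base colours plus sixteen 2-bit indices. */
typedef struct {
    uint16_t color0;   /* first base colour, RGB565 */
    uint16_t color1;   /* second base colour, RGB565 */
    uint32_t indices;  /* 16 texels x 2 bits, texel 0 in the low bits */
} Dxt1Block;

/* Bits per block: 2 x 16 colour bits + 16 x 2 index bits. */
enum { DXT1_BLOCK_BITS = 2 * 16 + 16 * 2 };

/* Return the 2-bit palette index of texel (x, y), with 0 <= x, y < 4. */
static unsigned dxt1_texel_index(const Dxt1Block *b, unsigned x, unsigned y)
{
    unsigned texel = y * 4u + x;
    return (b->indices >> (2u * texel)) & 3u;
}

/* Storage rate: block bits divided by the 16 texels in a block. */
static unsigned dxt1_bits_per_texel(void)
{
    return DXT1_BLOCK_BITS / 16;
}
```

A 128-bit block of the same style would double every field, which is where the "two bus transactions" concern comes from.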

Dio
24-Dec-2002, 12:28
I do think variable-rate compression is interesting and possibly has some potential, but if it starts needing index blocks and the like it's messy - and the surface simplicity of DXTC was a big help in its adoption, I think.

I will correct my statement to 'horrendous blocking artifacts unless you put a horrendous amount of effort into trying to get rid of them' :). I did work with the MPEG2 and preliminary MPEG4 specs a couple of years back but I'd forgotten about the deblocking.

As to what drove the decision - well, only the mathemagician knows that :). I think DXTC is a great format myself...

Dave B(TotalVR)
24-Dec-2002, 12:40
Personally I think VQ compression is great. Sure, compressing the textures is an overnight job, which is an ass, but you get so much better compression ratios than with S3TC, especially as the texture gets larger. You can also read and decompress a VQ compressed texture quicker than you can read an uncompressed texture, which is stunning if you ask me.

IMO the 4:1 compression ratio of alpha textures with S3TC is pitiful, I remember VQ getting 8:1 compression ratios, there was some comparison article written ages ago. I'll see if I can find it.


Simon, are you allowed to hint as to exactly how PVR-TC works? Is it an extension of VQ or a whole new thing or what?

Dave B(TotalVR)
24-Dec-2002, 12:47
Umm, you know, I'm sure what I have found is actually a plagiarism of what I was looking for...

http://gpp.netfirms.com/cgi-bin/resourceCanada.cgi?ImageCompressionVectorQuantization

Simon F
24-Dec-2002, 13:02
I do think variable-rate compression is interesting and possibly has some potential, but if it starts needing index blocks and the like it's messy - and the surface simplicity of DXTC was a big help in its adoption, I think.
AFAICS, the only other option for variable rate would be to have "large" data blocks (e.g. 256 bit) that decode to large NxN pixel blocks with variable compression rates internally. I tried something a little less ambitious when I was researching texture compression methods, and decoding Huffman-like data in one or two cycles is deeply unpleasant!

As to what drove the decision - well, only the mathemagician knows that :). I think DXTC is a great format myself...

Well, there are a few other hints: S3TC clearly grew from "CCC" (described in one of the SIGGRAPH proceedings) which, IIRC, used 4x4 blocks of pixels. Each block stored two 8-bit palette indices (to select two base colours) and then each of the 16 pixels had a one-bit index to choose which base colour. S3TC (sort of) doubles the storage cost, adds the implied colours, while eliminating the palette indirection.
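
For comparison, a CCC cell as described above could be sketched in C roughly like this. The names, the 32-bit palette entries and the bit order of the mask are assumptions made for the example; only the field sizes come from the description.

```c
#include <stdint.h>

/* One CCC cell as described above: 4x4 texels in 32 bits. */
typedef struct {
    uint8_t  index0;  /* palette entry used where the mask bit is 0 */
    uint8_t  index1;  /* palette entry used where the mask bit is 1 */
    uint16_t mask;    /* one bit per texel, texel 0 in bit 0 (assumed order) */
} CccBlock;

/* Look up texel (x, y), 0 <= x, y < 4, through a shared 256-entry palette.
 * The palette read is the indirection that S3TC removes by storing
 * the two base colours inside the block itself. */
static uint32_t ccc_decode_texel(const CccBlock *b, const uint32_t palette[256],
                                 unsigned x, unsigned y)
{
    unsigned bit = (b->mask >> (y * 4u + x)) & 1u;
    return palette[bit ? b->index1 : b->index0];
}

/* Storage rate: (8 + 8 + 16) bits spread over 16 texels. */
static unsigned ccc_bits_per_texel(void)
{
    return (8u + 8u + 16u) / 16u;
}
```

Doubling that 2bpp cost gives the 4bpp of S3TC, with the extra bits spent on in-block colours and a second index bit per texel.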

What did disappoint me WRT DXTC is that it didn't have a 4bpp variable alpha variant. I did quite a few experiments when S3TC/DXTC first came out using a variant that had N-levels of alpha, and in most cases the quality was fine. The 8bpp for the DXT2+ modes just seemed like overkill. <shrug>

andypski
24-Dec-2002, 13:48
Personally I think VQ compression is great. Sure, compressing the textures is an overnight job, which is an ass, but you get so much better compression ratios than with S3TC, especially as the texture gets larger. You can also read and decompress a VQ compressed texture quicker than you can read an uncompressed texture, which is stunning if you ask me.

You can get about twice the compression ratio of DXTC (for colour-only images), but the difference between 2bpp and 4bpp is not really of much interest in the video card market. Smaller than 4bpp is of interest mainly in areas where memory is at a huge premium (handheld devices etc). As far as video cards go if VQ's higher compression ratio couldn't win the day back when it was first introduced (when devices might typically have only about 8-16 MB of onboard RAM) then it is hardly likely to be a convincing argument now.

In the consumer 3D space the most interesting aspect of compression is increasing the efficiency of texturing, and DXTC solves that problem just fine - the added benefits of dropping to 2bpp vs. 4bpp in overall texturing efficiency are generally pretty marginal (considering you've already dropped from 24bpp->4bpp, and effectively from 32bpp->4 bpp, since most 3D hardware does not use packed texel formats).

Whether the image quality of VQ at 2bpp is equivalent to DXTC at 4bpp is a long and involved discussion in and of itself, but in most typical cases I believe it to be somewhat lower quality overall (although in the same ballpark). Of course each compression method has different strong and weak points in terms of IQ, and therefore the exact situation varies from image to image. I know that Simon had a comparison of some aspects of this on his homepage where he made some interesting observations on quality/bit.

VQ compression is also not great for hardware, as Simon has touched upon, since you need to hide an additional indirection. Also, for properly orthogonal support you have to be able to use N different sets of VQ palettes, where N is the number of simultaneous textures you support.

IMO the 4:1 compression ratio of alpha textures with S3TC is pitiful, I remember VQ getting 8:1 compression ratios, there was some comparison article written ages ago. I'll see if I can find it.

'Pitiful' is an interesting choice of words, and is probably taking things too far. It's certainly a low compression rate, and assigning as many bits of storage to one 8-bit component as to the other 3 is obviously not optimal. On the other hand it does its job well, and you get an alpha channel that is compressed practically without any fidelity loss. It could certainly be better, but if it was 'pitiful' it would not be very useful, and that is certainly not the case - going from 32bpp to 8 bpp is very useful.

Simon F
27-Dec-2002, 11:33
Once shaders are fast enough, why not do texture (de)compression with them?
As shaders get faster, developers are just as likely to want to use all those extra cycles for something else. Besides, doing random access of texture data and performing bit unpacking is not going to be fast with the current instruction set and so, if you're going to add specialised HW to make the system faster, you might as well make it automatically decompress textures (all IMHO).

I agree that space is the most important part (while being faster IS a nice thing as well). And you can no longer rely on AGP memory either, as you'll soon run out of base memory as well.
If you have lots of textures being used in the same image then it's not just space: bandwidth is still going to be a problem.

(To Humus:) I don't see bandwidth issues magically going away simply because there's now this great opportunity to write really slow shader code :-)
Personally I think VQ compression is great. Sure, compressing the textures is an overnight job...
Hardly. It took about 10-20% longer than the S3 compression tool on my old PC.
...but you get so much better compression ratios than with S3TC, especially as the texture gets larger.
I would also say that, on average, the quality was slightly lower with DC's VQTC than S3TC but, given the ~2-fold decrease in storage costs, this was completely acceptable.
You can also read and decompress a VQ compressed texture quicker than you can read an uncompressed texture, which is stunning if you ask me.
You'll get that with S3TC as well. Because the HW to do fast decompression has to be included in the system, you get savings because compressed textures use the external memory bus far less, freeing it up for other tasks.

IMO the 4:1 compression ratio of alpha textures with S3TC
I personally dislike quoting a 'ratio' for texture compression since the schemes are nearly always lossy. FWIW, DXTC storage costs are 4bpp for opaque (and punch-through) and 8bpp for translucent textures, while the DC's VQ was always ~2bpp.
(Note that putting alpha in the VQ sometimes meant more degradation because it was trying to represent more data with the same number of bits.)
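
The per-texel rates quoted above translate into storage as simple arithmetic. A minimal sketch follows; the function name is made up here, and mipmaps and any VQ codebook are deliberately left out.

```c
#include <stddef.h>

/* Bytes needed for one mip level stored at a fixed rate in bits per texel.
 * Mipmaps and any per-texture codebook are not included. */
static size_t texture_bytes(size_t width, size_t height, size_t bits_per_texel)
{
    return width * height * bits_per_texel / 8u;
}
```

For a 1024x1024 texture this gives 4 MB at 32bpp, 1 MB at 8bpp (translucent DXTC), 512 KB at 4bpp (opaque DXTC) and 256 KB at 2bpp.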

Simon, are you allowed to hint as to exactly how PVR-TC works? Is it an extension of VQ or a whole new thing or what?
It'll become public eventually, but for now I'll not say anything on it other than it's nothing like the VQ (i.e. it has no indirection) and is reasonably cheap to implement in HW (e.g. it's used in MBX which has a very tight gate budget).

VQ compression is also not great for hardware, as Simon has touched upon, since you need to hide an additional indirection. Also, for properly orthogonal support you have to be able to use N different sets of VQ palettes, where N is the number of simultaneous textures you support.
IIRC DC used a second cache stage and each texture had its own palette/codebook (although, having said this, for small textures where the compression ratio would effectively decrease, you could actually pack textures together so they borrowed codes from neighbouring textures).
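
The indirection being discussed can be sketched in C as two dependent reads. Everything concrete here is an assumption for illustration: one 8-bit index per 2x2 group of 16-bit texels (which works out to 2bpp plus the codebook), a 256-entry codebook per texture, and plain row-major index order rather than whatever ordering the real hardware used.

```c
#include <stdint.h>

/* Hypothetical VQ layout: each code holds a 2x2 group of 16-bit texels,
 * and one 8-bit index per group selects one of 256 codes. */
typedef struct {
    uint16_t texel[4];  /* 2x2 texels: row 0, then row 1 */
} VqCode;

typedef struct {
    const VqCode  *codebook;  /* 256 entries, one codebook per texture */
    const uint8_t *indices;   /* (width / 2) * (height / 2) entries, row-major */
    unsigned       width;     /* in texels, assumed even */
} VqTexture;

static uint16_t vq_fetch(const VqTexture *t, unsigned x, unsigned y)
{
    /* First read: the index for the 2x2 group containing (x, y). */
    uint8_t code = t->indices[(y / 2u) * (t->width / 2u) + (x / 2u)];
    /* Second, dependent read: the texel inside the selected code.
     * This is the access a second cache stage would have to hide. */
    return t->codebook[code].texel[(y & 1u) * 2u + (x & 1u)];
}
```

With N simultaneous textures, N such codebooks have to be reachable at once, which is the orthogonality cost mentioned in the quote.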

Dio
30-Dec-2002, 11:35
AFAICS, the only other option for variable rate would be to have "large" data blocks (e.g. 256 bit) that decode to large NxN pixel blocks with variable compression rates internally. I tried something a little less ambitious when I was researching texture compression methods, and decoding Huffman-like data in one or two cycles is deeply unpleasant!
Oh yes. Been there, when I was writing a JPEG decoder on a 56001 DSP. It's very clumsy to try to do in HW.

I think the problem with a 'fixed/variable' JPEG-based system is that the (compressed) size ratio between 'large' and 'small' blocks in a JPEG image is large, so I envisage that in this kind of system the 'easy' bits of the image get a larger percentage of the bandwidth, while the 'hard' bits get less....


What did disappoint me WRT DXTC is that it didn't have a 4bpp variable alpha variant. I did quite a few experiments when S3TC/DXTC first came out using a variant that had N-levels of alpha, and in most cases the quality was fine. The 8bpp for the DXT2+ modes just seemed like overkill. <shrug>
I think the assumption was that the alpha channel would need higher precision. Certainly this is what I tend to see nowadays... of course there are trivial modifications that would give some alpha support for lower bit accuracy in RGB.

Basic
30-Dec-2002, 12:45
OK, I'm a bit late here, but I've been away.

Overall it might not be a gain at all - block->block noise (low frequency noise created by colour choice mismatches at block intersections) is one of the key problems that a high quality DXTC compressor needs to deal with, and is very difficult to solve optimally. In addition to this the low-frequency and structured nature of this noise makes it one of the most noticeable artifacts caused by the compression. With a larger block size the errors from block->block are likely to get larger.

If I got it right, DXT1 (I used that name to explicitly say "no alpha on the side") has two different compression blocks: 3-color-block, and 2-color-1-transparent-block. Each block has its own mini-palette. The type of the block can be chosen independently per block.

So palette is chosen per 4x4 texel block, compression type per 4x4 texel block.

In the S3TC-similar compression blocks, FXT1 selects the palette per 4x4 texel block. But the compression type is as always in FXT1 selected per 8x4 block.

Are you saying that:
1) The slightly higher granularity in compression type selection will reduce block->block noise.
And/or
2) None of the extra modes in FXT1 would ever be used. Not even for, say, slow gradients, or multi-bit alpha modes still at 4bpp.

Simon F
30-Dec-2002, 13:23
If I got it right, DXT1 (I used that name to explicitly say "no alpha on the side") has two different compression blocks: 3-color-block, and 2-color-1-transparent-block.
A minor correction: The two modes are "4-colours" (2 implied from the 2 stored base colours) and "3-colours + transparent black".
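
The two modes can be sketched in C as a palette builder. This is an illustration, not a reference decoder: the helper names are made up, the mode test (color0 > color1 as unsigned 16-bit values) is the commonly documented DXT1 rule, and the exact rounding of the implied colours is an assumption since hardware differs.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b, a; } Rgba8;

/* Expand RGB565 to 8 bits per channel by bit replication. */
static Rgba8 rgb565_to_rgba8(uint16_t c)
{
    unsigned r = (c >> 11) & 31u, g = (c >> 5) & 63u, b = c & 31u;
    Rgba8 out;
    out.r = (uint8_t)((r << 3) | (r >> 2));
    out.g = (uint8_t)((g << 2) | (g >> 4));
    out.b = (uint8_t)((b << 3) | (b >> 2));
    out.a = 255;
    return out;
}

/* Weighted blend of two opaque colours (truncating division, assumed). */
static Rgba8 blend(Rgba8 p, Rgba8 q, unsigned wp, unsigned wq)
{
    unsigned d = wp + wq;
    Rgba8 out;
    out.r = (uint8_t)((wp * p.r + wq * q.r) / d);
    out.g = (uint8_t)((wp * p.g + wq * q.g) / d);
    out.b = (uint8_t)((wp * p.b + wq * q.b) / d);
    out.a = 255;
    return out;
}

/* Build the 4-entry palette of one DXT1 block from its two stored colours. */
static void dxt1_palette(uint16_t color0, uint16_t color1, Rgba8 pal[4])
{
    Rgba8 c0 = rgb565_to_rgba8(color0), c1 = rgb565_to_rgba8(color1);
    pal[0] = c0;
    pal[1] = c1;
    if (color0 > color1) {
        /* "4-colours" mode: two implied colours at 1/3 and 2/3. */
        pal[2] = blend(c0, c1, 2, 1);
        pal[3] = blend(c0, c1, 1, 2);
    } else {
        /* "3-colours + transparent black" mode: one implied midpoint. */
        pal[2] = blend(c0, c1, 1, 1);
        pal[3].r = pal[3].g = pal[3].b = pal[3].a = 0;
    }
}
```

Because the mode is signalled by the order of the two stored colours, it costs no extra bits and can change freely from one 4x4 block to the next.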

andypski
30-Dec-2002, 15:32
Are you saying that:
1) The slightly higher granularity in compression type selection will reduce block->block noise.
And/or
2) None of the extra modes in FXT1 would ever be used. Not even for, say, slow gradients, or multi-bit alpha modes still at 4bpp.

I had a long reply to this typed in but lost it in a crash, so this version is going to be fast and dirty, but I hope clear enough.

My recollection of FXT1 is that it had 4 compression modes -

CC_MIXED. 4x4 block, 2 565 endpoints and 2 interpolant colours. Basically identical to S3TC.

CC_HI. 8x4 block, 2 555 endpoints and 5 interpolants with 1 explicit transparent encoding.

CC_CHROMA. 8x4 block, 4 555 explicit colours

CC_ALPHA. 8x4 block. 3 5555 endpoints, 2 interpolants between endpoint 0 and 1 for the left 4x4 area and 2 between endpoints 1 and 2 for the right 4x4 area.

The CC_MIXED mode could not coexist in an image with the other formats because this created a dependency between the format chosen per block and the texel addressing (how do you find a specific texel on a line that is composed of different block sizes?) You would need to add an index of some kind, which would get messy.
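
The addressing concern can be made concrete with a small C sketch. When every block in a texture has the same footprint and size, finding the block for a texel is pure arithmetic with no index table; the function name and the row-major block order are assumptions for this example.

```c
#include <stddef.h>

/* Byte offset of the block that holds texel (x, y), assuming every block
 * in the texture covers block_w x block_h texels, occupies block_bytes,
 * and blocks are stored row by row. */
static size_t block_offset(size_t x, size_t y, size_t tex_width,
                           size_t block_w, size_t block_h, size_t block_bytes)
{
    size_t blocks_per_row = (tex_width + block_w - 1) / block_w;
    return ((y / block_h) * blocks_per_row + (x / block_w)) * block_bytes;
}
```

For a 256-texel-wide texture, texel (5, 9) lands at byte 1032 with 4x4 blocks of 8 bytes and at byte 1024 with 8x4 blocks of 16 bytes. Both are 4bpp, but the formula only works if one footprint is used throughout.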

So in mixed-mode images only the 8x4 formats are available. The compression of both S3TC and FXT1 on colour images is 4bpp, so for each 8x4 FXT1 block you have 2 S3TC blocks in a direct apples->apples comparison. Looking at the descriptions above it can be seen that for colour-only data S3TC should always be a superior representation to either the CC_CHROMA or CC_ALPHA formats for any data, and it is questionable whether CC_HI is better for gradients as well (lower endpoint precision with 1 extra interpolant vs. 2 extra explicit colours at higher precision.)

So for colour images FXT1 is pretty much a bust - I would expect that on almost all images the best S3TC compressor would beat the best FXT1 compressor for quality.

For images with alpha the presence of the CC_ALPHA format makes things interesting because it allows 4bpp compression of complex alpha, but the image quality will be lower than the 8bpp S3TC equivalent, so it is questionable if this is a big advantage.

Overall FXT1 seemed like a bit of a 'me too' exercise on the part of 3dfx and didn't really offer anything compelling above what S3TC provided, so it's not surprising that it didn't generate much interest.

- Andy.

Basic
30-Dec-2002, 21:05
Simon:
DOH :oops: Are you sure that two bits can represent 4 values? :)
I'll blame it on my always-occurring post-Xmas cold.

andypski:

CC_MIXED has two sub modes (just as S3TC). Both are 8x4 blocks, but they split this 8x4 block into two 4x4 sub blocks.
Sub-modes:
CC_MIXED non-transparent: One 555 endpoint and one 565 endpoint, 2 interpolant colors.
CC_MIXED transparent: One 555 endpoint and one 565 endpoint, 1 interpolant color, 1 transparent "color".

And it certainly is possible to mix CC_MIXED freely with the other modes in the same texture. The only limitation is that both S3TC-like blocks in a CC_MIXED block must use the same mode (transparent/non-transparent).

Enhancements possible by small changes in the standard (not using any more memory):
CC_MIXED: All endpoints 565
CC_HI: One endpoint 565
CC_CHROMA: All endpoints 565
CC_ALPHA: One extra bit in one of the endpoints, not really worth it.
There is coding space left for other compression modes, if someone sees a useful new mode.

andypski
31-Dec-2002, 12:15
CC_MIXED has two sub modes (just as S3TC). Both are 8x4 blocks, but they split this 8x4 block into two 4x4 sub blocks.
Sub-modes:
CC_MIXED non-transparent: One 555 endpoint and one 565 endpoint, 2 interpolant colors.
CC_MIXED transparent: One 555 endpoint and one 565 endpoint, 1 interpolant color, 1 transparent "color".

And it certainly is possible to mix CC_MIXED freely with the other modes in the same texture. The only limitation is that both S3TC-like blocks in a CC_MIXED block must use the same mode (transparent/non-transparent).

Aha! My (somewhat old) memory of FXT1 must be playing tricks on me.

Being able to include CC_MIXED blocks does make it a bit more interesting, but the other block modes still seem to be fairly uninteresting in general. The CC_ALPHA mode is still the only really interesting extension as explained above. In addition, reducing the precision of one of the endpoints can cause some problems due to increased quantisation noise, and the ability to freely mix 3 and 4 colour blocks in DXTC (without restrictions) can also marginally improve compression quality in some cases with smart compressors.

Changing the encoding to keep higher resolution endpoints at all times would make the spec much more interesting as it should then beat S3TC in all cases (although perhaps only marginally) - can you outline this suggestion and the new encoding?

- Andy.

Basic
31-Dec-2002, 19:09
The redundancy in FXT1 is in the placement of base colors. It's possible to switch places on the base colors, and then do the corresponding changes in the index field.

This can be used in different ways:
You could see the colors as 15 bit integers (removing the last green bit), and do compares between them.
So, e.g., for (one half of) CC_MIXED:

    if (color0 < color1) {
        color0.greenlsb = 0;
        color1.greenlsb = the_other_bit_stored_explicitly;
    }
    else {
        color0.greenlsb = the_other_bit_stored_explicitly;
        color1.greenlsb = 1;
    }

This scheme would btw be "compatible" with FXT1 in the sense that if you compress with FXT1 and decompress with this "FXT2" (or the other way around), the output will still be correct except for the green lsb.

Another way is to lock certain texels to certain (groups of) colors:
For CC_CHROMA:
Texel0 always uses color0 => 2 bits freed in the index array
Texel1 always uses color0/1 => 1 bit freed in the index array
And then do a comparison as above between color2/3 for the fourth bit.
There's the four needed green lsb bits.
There is actually one bit that isn't used at all in this mode, so that one could be used as the fourth bit. But then you'd waste easy coding space for future enhancements with new compression modes.

andypski
01-Jan-2003, 13:17
So in your encoding I have 1 explicit bit that I specify per (4x4) CC_MIXED block as 0 or 1, and an implicit encoding from the ordering of the endpoint values that manipulates the effective LSB that is then substituted back into the green channel of each endpoint? I worked your example through, but couldn't generate the case where I could have colour 0's LSB as 1 and colour 1's LSB as 0 - have I misunderstood the encoding?

Basic
01-Jan-2003, 15:47
You're right that color0.glsb=1 and color1.glsb=0 together isn't possible. But you don't need that!
Think of it like this: the color order determines the green lsb of the "smallest" color, while the explicit bit is the green lsb of the "largest" color. With that in mind, you can easily see that you can set the green lsb of both colors to anything you want.
You just never need to set color0.glsb=1 and color1.glsb=0; the colors would be swapped instead.

There is just one "problem", if the colors are equal when the green lsb is stripped off. But otoh, the solution is quite simple. If the colors are equal, then color1.greenlsb=1 according to the rules above. So set the explicit bit to 0, and you've got all cases covered.
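
The whole scheme, including the equal-colours case, can be checked with a small C sketch. The decoder follows the rules given earlier in this thread; the encoder, the type names and the "slot" bookkeeping are one possible way to implement the idea and are assumptions of this example. Colours are held as 15-bit values (green LSB removed) with the wanted green LSB passed separately.

```c
#include <stdint.h>

typedef struct {
    uint16_t c0, c1;        /* 15-bit colours, green LSB removed */
    unsigned explicit_bit;  /* the one green LSB stored explicitly */
} PackedPair;

typedef struct { unsigned glsb0, glsb1; } GreenLsbs;

/* Decoder: colour order gives one green LSB, the stored bit gives the other. */
static GreenLsbs decode_green_lsbs(PackedPair p)
{
    GreenLsbs g;
    if (p.c0 < p.c1) { g.glsb0 = 0;              g.glsb1 = p.explicit_bit; }
    else             { g.glsb0 = p.explicit_bit; g.glsb1 = 1; }
    return g;
}

typedef struct {
    PackedPair packed;
    unsigned slot_a;  /* slot (0 or 1) that decodes to colour a */
    unsigned slot_b;  /* slot (0 or 1) that decodes to colour b */
} EncodedPair;

/* Encoder sketch: place colours a and b so the decoder reproduces both LSBs.
 * The caller must remap the index field according to slot_a / slot_b. */
static EncodedPair encode_green_lsbs(uint16_t a, unsigned lsb_a,
                                     uint16_t b, unsigned lsb_b)
{
    EncodedPair e;
    int a_is_small = a < b;
    uint16_t small = a_is_small ? a : b;
    uint16_t large = a_is_small ? b : a;
    unsigned lsb_small = a_is_small ? lsb_a : lsb_b;
    unsigned lsb_large = a_is_small ? lsb_b : lsb_a;

    if (a == b) {
        /* Equal colours: slot 0 decodes with LSB 0, slot 1 with LSB 1. */
        e.packed.c0 = a;
        e.packed.c1 = a;
        e.packed.explicit_bit = 0;
        e.slot_a = lsb_a;
        e.slot_b = lsb_b;
        return e;
    }
    /* The smaller colour goes in slot 0 if its LSB is 0, else in slot 1. */
    e.packed.c0 = lsb_small == 0 ? small : large;
    e.packed.c1 = lsb_small == 0 ? large : small;
    e.packed.explicit_bit = lsb_large;
    e.slot_a = a_is_small ? lsb_small : 1u - lsb_small;
    e.slot_b = a_is_small ? 1u - lsb_small : lsb_small;
    return e;
}
```

Whenever the colours end up in swapped slots, the per-texel indices (including the interpolated entries) have to be mirrored to match; that remapping is not shown here.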

andypski
01-Jan-2003, 16:46
Gotcha.

Basic
01-Jan-2003, 18:37
I'll read that as "I see what you're saying". (Normally I would interpret "Gotcha" as "I nailed you down there".)

Btw, you didn't seem too impressed about the CC_CHROMA mode at all. I know that it doesn't have the fine gradients in it, so it's just 4 colors total over the whole 8x4 block. But remember that it's the only mode that breaks the "one-dimensional" color limit. A block where three different colors meet will get big errors in all other modes. Try to get red, green and blue into the other modes.


Now back to a different place where compression might be interesting.
What if a GPU has virtualized memory (like P10)? Textures are split into blocks. (I don't remember the exact size for P10, was it 4KB? => 32x32texel@32bit.) If each of those 32x32 blocks were ~jpg compressed, and decompressed by the GPU as they were loaded into gfx card mem, then AGP memory could suddenly become a lot more useful.

AGP bandwidth would be virtually multiplied by the jpg compression ratio. The jpg decompression wouldn't need to work in random order (as normal texture decompression schemes need). The possibly variable block size wouldn't be such a big problem here, since we could use a table to see where the blocks are stored. The table would be just 1/1024 of the (uncompressed) texture size, and the cost of the indirection isn't that bad since it's only going to be used when loading new textures over AGP, and it will be paired with a transfer of a rather large block of data.
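
The 1/1024 figure follows if each 4KB block gets one 4-byte table entry. That entry size is an assumption (it is the one that makes the ratio come out), as are the names in this small C sketch.

```c
#include <stddef.h>

enum {
    PAGE_BYTES  = 4096,  /* 32 x 32 texels at 32 bits each */
    ENTRY_BYTES = 4      /* assumed: one 32-bit offset per page */
};

/* Size of the lookup table for a texture of the given uncompressed size. */
static size_t page_table_bytes(size_t uncompressed_bytes)
{
    size_t pages = (uncompressed_bytes + PAGE_BYTES - 1) / PAGE_BYTES;
    return pages * ENTRY_BYTES;
}
```

A 1024x1024 32-bit texture is 4 MB uncompressed, so its table would take 4 KB.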

It should be noted though, that the working set of textures still should fit into gfx card mem. The "working set" of textures being those needed to render one entire frame, and that probably will be there next frame too. Or in other words, textures should still only be loaded over AGP when getting into new areas/revealing new textures.

andypski
01-Jan-2003, 19:09
I'll read that as "I see what you're saying". (Normally I would interpret "Gotcha" as "I nailed you down there".)


I see what you're saying :)

Btw, you didn't seem too impressed about the CC_CHROMA mode at all. I know that it doesn't have the fine gradients in it, so it's just 4 colors total over the whole 8x4 block. But remember that it's the only mode that breaks the "one-dimensional" color limit. A block where three different colors meet will get big errors in all other modes. Try to get red, green and blue into the other modes.

It may have some uses, but I suspect the instances where it would be selected over standard S3TC blocks, in terms of overall error, are fairly rare - it might reduce blocking slightly in some circumstances, but in real-world images/textures you can usually trade off some chroma accuracy in small areas without much visual blocking, particularly if there is any high frequency noise structure in the area - this would tend to mask the chroma inaccuracy. I suspect CC_CHROMA would come into play if you were trying to compress images such as HUD displays (which might have widely different primary colours in a single block).

With alterations in encoding FXT(2!) becomes interesting, but still not much of an advance over S3TC on the kinds of images it's designed to represent. It would be interesting to write a good encoder for this new format to see how it performs. 3DFX's original FXT1 encoder was really not very good at all (and it was so slow...)

Basic
01-Jan-2003, 20:11
You're probably right that the places where CC_CHROMA would be of most use would be HUDs, or displays/signs/maps/diagrams, which can often be colorful. But I guess I can't be sure unless I have some example pictures. Maybe it's OK with rather large chroma errors, if it's on just a small area. (But since it's for textures, you'd never know how large an area they will be expanded to.)

3DFX's original FXT1 encoder was really not very good at all (and it was so slow...)

Well, that's one thing that we agreed on all the time. If you have the encoder, then try anything with small black and white details. (So that there's black and white in every block.) Then try to compress it over a weekend or so... In some modes they actually made exhaustive searches, but used an incorrect measure for what was optimal. :shock: I think I have the FXT1 source laying around somewhere.

I actually started to write a FXT1/2 en-/de-coder, but kinda' lost interest when no one seemed to be interested in the format, and then 3dfx disappeared.

Btw, I know that 3dfx's FXT1 encoder didn't attempt to do any dithering, did S3's compressor do that?

andypski
01-Jan-2003, 22:58
Btw, I know that 3dfx's FXT1 encoder didn't attempt to do any dithering, did S3's compressor do that?

Of course the source code for S3's encoder was never made public.

(But no, it didn't... :wink:)

You can check that out on example images easily enough.

Basic
02-Jan-2003, 00:22
But you know because you were involved writing it? :)

I just have a feeling you're already closer to a FXT* compressor than I ever was, because you already have some nifty optimization source laying around.

andypski
02-Jan-2003, 10:42
But you know because you were involved writing it? :)

I just have a feeling you're already closer to a FXT* compressor than I ever was, because you already have some nifty optimization source laying around.

I can't lay claim to writing the original S3 compressor, although I do work extensively with the guy who did write it. :) (On compression, amongst other things...)

Since I no longer work for S3 I'm afraid I don't have access to the original compressor any more (it was a very clever one in many ways)

It should be possible to write an even better version than the original S3 version, but I think it would take significant time and research to beat it by any significant margin. I know that Simon F. thought that his own S3TC compressor gave better results on some images at least - he commented on this on his home page, although we would respectfully disagree (on the limited basis we have for comparison). :wink:

Dio
02-Jan-2003, 11:15
Btw, I know that 3dfx's FXT1 encoder didn't attempt to do any dithering, did S3's compressor do that?
I would say that targeting things like dithering and error diffusion is barking up the wrong tree a bit. The errors introduced by the compression process don't really lend themselves to an error-diffusion type model as far as I can see.

Concentrate on a really good block compressor and you'll see much better results. (Any kind of extensive search, by the way, will never be fast). After that look at advanced stuff!

Of course, that's not easy :)

Simon F
02-Jan-2003, 12:13
It should be possible to write an even better version than the original S3 version, but I think it would take significant time and research to beat it by any significant margin. I know that Simon F. thought that his own S3TC compressor gave better results on some images at least - he commented on this on his home page, although we would respectfully disagree (on the limited basis we have for comparison). :wink:
In my defense, it was only on "some images" and it was a long time ago :D.

I think the problem I found with the S3 compressor was that it had some default weights that emphasised the green channel over the red and blue. (IIRC, according to Poynton's colour FAQ, although the eye is less spatially sensitive to blue than the other primaries, it is still quite sensitive to overall blue levels.) On one of the two test images I tried, the S3 compressor produced an almost monochromatic green result, although I suspect that in general the S3 compressor would be superior.
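To show how channel weights can tip a choice, here is a small sketch of a per-channel weighted squared error. The weights, the pixel and the candidate colours are all hypothetical values picked for the example; the actual weights used by the S3 compressor are not given anywhere in this thread.

```python
def weighted_error(pixel, candidate, weights=(1, 1, 1)):
    """Squared error between two RGB colours with a weight per channel."""
    return sum(w * (p - c) ** 2 for w, p, c in zip(weights, pixel, candidate))

def pick(pixel, candidates, weights=(1, 1, 1)):
    """Return the candidate colour with the lowest weighted error."""
    return min(candidates, key=lambda c: weighted_error(pixel, c, weights))

pixel = (200, 100, 200)
candidates = [
    (180, 100, 180),  # exact in green, 20 off in red and blue
    (200, 112, 200),  # exact in red and blue, 12 off in green
]
GREEN_HEAVY = (1, 8, 1)  # hypothetical weights, for illustration only

print(pick(pixel, candidates))               # equal weights prefer the second candidate
print(pick(pixel, candidates, GREEN_HEAVY))  # green-heavy weights prefer the first
```

With equal weights the errors are 800 against 144; with the green-heavy weights they are 800 against 1152, so the choice flips toward the colour that matches green and gives up red and blue.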

FWIW, having looked at the description of the compression process in the S3 patent, I noticed some of the techniques described (e.g. principal vector analysis) are probably similar to those I used in the VQ tool (which I'd based on the work by Wu).

I suppose I could update my compression comparison page to use the S3 compression tool but I'm a bit busy (or lazy). Besides, the old version I have has the feature that makes it crash immediately after you output a decompressed .bmp file from a .s3tc input file! :shock: It's a bit tedious.

andypski
02-Jan-2003, 13:00
I think the problem I found with the S3 compressor was that it had some default weights that emphasised the green channel over the red and blue. (IIRC, according to Poynton's colour FAQ, although the eye is less spatially sensitive to blue than the other primaries, it is still quite sensitive to overall blue levels.) On one of the two test images I tried, the S3 compressor produced an almost monochromatic green result, although I suspect that in general the S3 compressor would be superior.

As I recall I think your analysis is pretty correct here - there can be some rare instances where it doesn't do so well (although we had a large test suite for the compressor to try to avoid this). Certainly I believe that on regions of completely random colour noise (essentially uncompressible) it would tend to bias heavily towards green (so blocks that had an overall 'neutral' character in the original would end up greenish), but on real world images this effect didn't show up.

Actually there was a bug/feature that could cause a very slight (1 lsb) green shift in some circumstances, but I don't think this was ever tracked down and fixed (it can sometimes be seen when compressing greyscales)

Where the S3 compressor was pretty good was manipulating the endpoints and clustering to squeeze out more signal->noise.

Simon F
03-Jan-2003, 10:16
Actually there was a bug/feature that could cause a very slight (1 lsb) green shift in some circumstances, but I don't think this was ever tracked down and fixed (it can sometimes be seen when compressing greyscales)
Isn't that almost an inevitable side-effect of using 565 base colours? (Though, frankly, I don't think it's at all important)
Where the S3 compressor was pretty good was manipulating the endpoints and clustering to squeeze out more signal->noise.
I tried something along those lines as well. It's annoying how you can't analytically compute the optimum end points. :-(

andypski
03-Jan-2003, 10:50
Actually there was a bug/feature that could cause a very slight (1 lsb) green shift in some circumstances, but I don't think this was ever tracked down and fixed (it can sometimes be seen when compressing greyscales)
Isn't that almost an inevitable side-effect of using 565 base colours? (Though, frankly, I don't think it's at all important)

I think it may be pretty much inevitable in a compressor that is trying to make the best use of the available colour precision. I have seen compressors that didn't exhibit this overall shift, but they weren't as good at representing the shallow gradients.
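The 565 effect is easy to reproduce in isolation. The sketch below rounds an 8-bit grey to 5/6/5 bits and expands it back by bit replication; both the rounding rule and the expansion rule are assumptions made for the example, and real encoders and hardware may differ. Because green keeps one more bit than red and blue, the three channels rarely land on the same value again.

```python
def to_565(r, g, b):
    """Quantise 8-bit channels to 5, 6 and 5 bits (round to nearest)."""
    def q(v, bits):
        return int(v * ((1 << bits) - 1) / 255 + 0.5)
    return q(r, 5), q(g, 6), q(b, 5)

def from_565(r5, g6, b5):
    """Expand back to 8 bits by replicating the top bits into the low bits."""
    return (r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2)

def roundtrip_grey(v):
    """Push a grey level through 565 and back."""
    return from_565(*to_565(v, v, v))

print(roundtrip_grey(100))  # (99, 101, 99): slightly green
print(roundtrip_grey(128))  # (132, 130, 132): slightly magenta

# Count how many of the 256 grey levels stay neutral after the round trip.
neutral = sum(1 for v in range(256) if len(set(roundtrip_grey(v))) == 1)
print(neutral)
```

The shift goes either way depending on the grey level, so a compressor that picks endpoints to minimise error on shallow gradients will sometimes settle on the greenish side.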

Where the S3 compressor was pretty good was manipulating the endpoints and clustering to squeeze out more signal->noise.
I tried something along those lines as well. It's annoying how you can't analytically compute the optimum end points. :-(

Yes - this is an irritating problem with getting an optimal compression solution - balancing error by manipulation of the endpoints, and finding the global minimum rather than a local one.

Dio
03-Jan-2003, 11:08
It's annoying how you can't analytically compute the optimum end points. :-(
Actually, I did find an analytical solution to most of the problem (everything except the rounding to 565). But I never managed to make it work - which means it is possible (likely?) it wasn't actually a solution :)

Simon F
03-Jan-2003, 14:36
It's annoying how you can't analytically compute the optimum end points. :-(
Actually, I did find an analytical solution to most of the problem (everything except the rounding to 565). But I never managed to make it work - which means it is possible (likely?) it wasn't actually a solution :)

You've got me very intrigued :?. I don't understand at all how you can do it analytically with all the discontinuities.

For example, with the DC VQ compressor, given a set of image vectors and a chosen partition of that set into two subsets (on either side of a plane perpendicular to the principal axis), you can easily compute two representative vectors for those subsets that give a local minimum in the error. The problem is that you might be able to move the plane (and suddenly the sets change (i.e. the error function is discontinuous)) and you might get a lower error.

I just chose to sweep the partition plane through the entire set to find all possible minima (which, luckily, is a linear operation in the number of vectors).
Although it'd be more complicated, I suppose you could do the same with the S3TC encoding whereby you have "4" subsets.
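The plane sweep described above can be sketched roughly as follows. This is not the actual VQ tool: the axis is simply passed in (finding the principal axis is left out), the function name is invented, and the input needs at least two vectors. After sorting by projection onto the axis, running sums let the error of every split be evaluated in constant time, so the sweep itself is linear in the number of vectors (the sort is the only super-linear step).

```python
def best_split(vectors, axis):
    """Sweep a plane perpendicular to `axis` and return the lowest-error split.

    The error of a subset is its sum of squared distances to its own mean,
    computed as sum(|x|^2) - |sum(x)|^2 / n from running sums.
    """
    order = sorted(vectors, key=lambda v: sum(a * b for a, b in zip(v, axis)))
    n, dim = len(order), len(axis)
    total = [sum(v[d] for v in order) for d in range(dim)]
    total_sq = sum(x * x for v in order for x in v)

    left = [0.0] * dim   # running vector sum of the left subset
    left_sq = 0.0        # running sum of squared norms of the left subset
    best_err, best_k = float("inf"), None
    for k in range(1, n):            # k = size of the left subset
        v = order[k - 1]
        left = [l + x for l, x in zip(left, v)]
        left_sq += sum(x * x for x in v)
        right = [t - l for t, l in zip(total, left)]
        err_left = left_sq - sum(l * l for l in left) / k
        err_right = (total_sq - left_sq) - sum(r * r for r in right) / (n - k)
        if err_left + err_right < best_err:
            best_err, best_k = err_left + err_right, k
    return best_err, order[:best_k], order[best_k:]

err, low, high = best_split([(0, 0), (11, 0), (1, 0), (10, 0)], axis=(1, 0))
print(err, low, high)
```

Extending this to the S3TC case would mean sweeping three boundaries instead of one, since the four palette entries split the pixels into four groups along the axis.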

Dio
03-Jan-2003, 15:20
I suppose I still had to check a bunch of cases and pick the best, but it was reduced to a maximum of 16, and it was guaranteed to give me the minimum error, so I call it analytical :)

I also had to make a pretty big bunch of assumptions :D but most of them I was making anyway in the progressive refinement version I was trying to replace - reduction to a 1D problem, midpoint is centre of extremities, etc.

Anyway, it never worked, and the progressive refinement was both faster (it rarely executed more than a couple of iterations) and great quality, so I dumped it.

Brimstone
07-Sep-2003, 17:36
Personally I think VQ compression is great, sure compressing the textures is an overnight job which is an ass but you get so much better compression ratios than with S3TC, especially as the texture gets larger. You can also read and decompress a VQ compressed texture quicker than you can read an uncompressed texture which is stunning if you ask me.

You can get about twice the compression ratio of DXTC (for colour-only images), but the difference between 2bpp and 4bpp is not really of much interest in the video card market. Smaller than 4bpp is of interest mainly in areas where memory is at a huge premium (handheld devices etc). As far as video cards go if VQ's higher compression ratio couldn't win the day back when it was first introduced (when devices might typically have only about 8-16 MB of onboard RAM) then it is hardly likely to be a convincing argument now.

In the consumer 3D space the most interesting aspect of compression is increasing the efficiency of texturing, and DXTC solves that problem just fine - the added benefits of dropping to 2bpp vs. 4bpp in overall texturing efficiency are generally pretty marginal (considering you've already dropped from 24bpp->4bpp, and effectively from 32bpp->4 bpp, since most 3D hardware does not use packed texel formats).

Whether the image quality of VQ at 2bpp is equivalent to DXTC at 4bpp is a long and involved discussion in and of itself, but in most typical cases I believe it to be somewhat lower quality overall (although in the same ballpark). Of course each compression method has different strong and weak points in terms of IQ, and therefore the exact situation varies from image to image. I know that Simon had a comparison of some aspects of this on his homepage where he made some interesting observations on quality/bit.

VQ compression is also not great for hardware, as Simon has touched upon, since you need to hide an additional indirection. Also, for properly orthogonal support you have to be able to use N different sets of VQ palettes, where N is the number of simultaneous textures you support.



What about Light Field mapping which gains from both VQ and Texture Compression? Any merit in going this direction?

http://www.romulus2.com/articles/guides/lfm/math31.jpg


http://www.romulus2.com/articles/guides/lfm/math32.jpg




Light Field Mapping analysis (http://www.romulus2.com/articles/guides/lfm/lfm1.htm)


Intel Research on Light Field Mapping (http://www.intel.com/research/mrl/research/lfm/index.htm)

gkar1
07-Sep-2003, 23:40
S3TC has served us well for a while now, but I feel the time for retirement of texture compression is closing in. In the future with long shaders etc. I think texture compression will be more or less a forgotten feature as shader execution will be the performance determining factor and not the memory bandwidth.

I agree, with the increasing flexibility and performance of pixel shaders, procedural textures are the way to go IMO.

Dio
08-Sep-2003, 11:55
Bandwidth isn't the only thing. Textures are getting bigger too - I know of applications that consider 2048x2048 restrictive.

For 80% of textures, even DXTC compression makes no visible difference to image quality. Why use 32-bit for those?

Texture compression will become more, not less, important for the future - particularly on highly restricted memory platforms such as budget cards, laptop, PDA, phone, console... but even when more memory is available, a game developer who chooses not to use compression will make their game look worse because they will not be able to use textures of the same resolution.
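For a sense of scale, the arithmetic behind the memory argument is simple enough to write down. The sketch below only counts the top mip level and uses the bit rates quoted in this thread (32 bpp uncompressed, 4 bpp for DXTC colour, 2 bpp for VQ); the function name is made up for the example.

```python
def texture_bytes(width, height, bits_per_texel):
    """Storage for one mip level at a given bit rate."""
    return width * height * bits_per_texel // 8

MIB = 1024 * 1024
for label, bpp in [("32 bpp uncompressed", 32), ("4 bpp DXTC", 4), ("2 bpp VQ", 2)]:
    print(label, texture_bytes(2048, 2048, bpp) / MIB, "MiB")
```

A single 2048x2048 texture drops from 16 MiB to 2 MiB at 4 bpp, and to 1 MiB at 2 bpp, so the first step saves far more in absolute terms than the second.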

Simon F
08-Sep-2003, 12:59
I agree, with the increasing flexibility and performance of pixel shaders, procedural textures are the way to go IMO.
But how would you code a procedural texture for, say, the earth's surface viewed from space? (and I don't mean something that looks like a random blue-green planet). Sometimes a stored texture is an easier approach.

Having said that, the "Wang Tiles for Image and Texture Generation" presented at Siggraph this year would potentially offer a good compromise.

Joe DeFuria
08-Sep-2003, 13:03
But how would you code a procedural texture for, say, the earth's surface viewed from space?

Pre or post Borg?

Simon F
08-Sep-2003, 13:16
Pre or post Borg?
Post? (http://slashdot.org) http://images.slashdot.org/topics/topicms.gif

Humus
08-Sep-2003, 13:17
Just a note for those who didn't spot it: This is an old resurrected thread.

WaltC
08-Sep-2003, 14:50
As others have said, there are probably a few times when S3 is better, however I think the biggest problem is that at least one of the FXT1 modes probably infringes S3's patent and so adoption could be risky.

I agree the differences aren't worth adoption of FXT1 at this stage in the API. But I would think that once M$ included the S3 technique in the API that anybody could use it on M$'s license. Who bought S3--VIA? Can't recall at the moment.

Joe DeFuria
08-Sep-2003, 15:48
I agree the differences aren't worth adoption of FXT1 at this stage in the API. But I would think that once M$ included the S3 technique in the API that anybody could use it on M$'s license.

Which is fine for D3D, but could be an issue with OpenGL.

Simon F
08-Sep-2003, 16:51
Which is fine for D3D, but could be an issue with OpenGL.
I believe you do have to license it for any use outside of D3D.

Joe DeFuria
08-Sep-2003, 16:51
I believe you do have to license it for any use outside of D3D.

My belief as well.

Humus
08-Sep-2003, 21:54
IP Status

Contact S3 Incorporated (http://www.s3.com) regarding any intellectual
property issues associated with implementing this extension.

WARNING: Vendors able to support S3TC texture compression in Direct3D
drivers do not necessarily have the right to use the same functionality in
OpenGL.


http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compression_s3tc.txt

arjan de lumens
08-Sep-2003, 22:41
Hmph. http://www.s3.com (or for that matter http://www.sonicblue.com) redirects me to http://www.digitalnetworksna.com/default2.asp, which in turn presents me with 3 links, none of which seem to be going to an organization with S3 in its name or have anything to do with 3d graphics ... so who would one contact?

OpenGL guy
08-Sep-2003, 23:40
Hmph. http://www.s3.com (or for that matter http://www.sonicblue.com) redirects me to http://www.digitalnetworksna.com/default2.asp, which in turn presents me with 3 links, none of which seem to be going to an organization with S3 in its name or have anything to do with 3d graphics ... so who would one contact?
Maybe www.s3graphics.com? I'm not sure if that IP went with S3 Graphics or not...

Simon F
09-Sep-2003, 08:13
http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compression_s3tc.txt
Ahh cheers. This is one of the reasons MBX has its own texture compression scheme ;-)

I also noted the following from that spec:
(5) Is the encoding of the RGB components for DXT1 formats correct in this spec? MSDN documentation does not specify an RGB color for the "transparent" encoding. Is it really black?

RESOLVED: Yes. The specification for the DXT1 format initially required black, but later changed that requirement to a recommendation. All vendors involved in the definition of this specification support black. In addition, specifying black has a useful behavior.


Useful? Surely not!!.

IM(NS)HO, premultiplied alpha (which is effectively what we have here) is a mistake once you have any form of texture filtering! When you get a fully transparent texel, the information content drops from 4 dimensions (ARGB) down to 1 (just alpha) so this means you can't filter the colour channels sensibly around the fully transparent pixels. The result: ugly grey/black halos.
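The halo is easy to reproduce with two texels. The sketch below filters halfway between an opaque red texel and a fully transparent one, then blends the result over a white background with ordinary source-alpha blending. Colours are floats in 0..1, the helper names are invented, and the "transparent red" case is a hypothetical alternative that a forced-black transparent encoding cannot express.

```python
def lerp(a, b, t):
    """Linear filter between two RGBA texels."""
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))

def over(texel, background):
    """Blend: src * alpha + dst * (1 - alpha)."""
    r, g, b, a = texel
    return tuple(c * a + bg * (1 - a) for c, bg in zip((r, g, b), background))

WHITE = (1.0, 1.0, 1.0)
opaque_red = (1.0, 0.0, 0.0, 1.0)
transparent_black = (0.0, 0.0, 0.0, 0.0)  # what the transparent encoding decodes to
transparent_red = (1.0, 0.0, 0.0, 0.0)    # hypothetical: same colour, zero alpha

halo = over(lerp(opaque_red, transparent_black, 0.5), WHITE)
clean = over(lerp(opaque_red, transparent_red, 0.5), WHITE)
print(halo)   # (0.75, 0.5, 0.5): the edge is darkened
print(clean)  # (1.0, 0.5, 0.5): the edge keeps full red
```

The filtered colour gets pulled toward black before alpha is applied, which is where the dark fringe comes from.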

JohnH
09-Sep-2003, 09:55
Probably useful where you're only going to be applying additive blending to the texture, questionable for anything else...

John.

Simon F
09-Sep-2003, 11:38
Probably useful where you're only going to be applying additive blending to the texture, questionable for anything else...

John.
Even then I can't see much point. You can just as easily set the dest blend to 1 and src to alpha and, hey presto, "pre-multiplied alpha". <shrug>

.... unless, of course, the developer wanted to square the alpha I suppose.... :?

xGL
09-Sep-2003, 11:45
Does anyone know why S3 is no longer part of the Futuremark BETA program?

Maybe it's a repeat of the NV30 scenario : they noticed their Deltachrome performed very badly in 3D Mark and therefore decided to boycott it... :roll:

tEd
09-Sep-2003, 18:11
Does anyone know why S3 is no longer part of the Futuremark BETA program?

Maybe it's a repeat of the NV30 scenario : they noticed their Deltachrome performed very badly in 3D Mark and therefore decided to boycott it... :roll:

too expensive :wink:

OpenGL guy
09-Sep-2003, 18:40
http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compression_s3tc.txt
Ahh cheers. This is one of the reasons MBX has its own texture compression scheme ;-)

I also noted the following from that spec:
(5) Is the encoding of the RGB components for DXT1 formats correct in this spec? MSDN documentation does not specify an RGB color for the "transparent" encoding. Is it really black?

RESOLVED: Yes. The specification for the DXT1 format initially required black, but later changed that requirement to a recommendation. All vendors involved in the definition of this specification support black. In addition, specifying black has a useful behavior.


Useful? Surely not!!.
Sure it is. If you don't care about transparency, you can use this encoding to get black without sacrificing one of your two palette colors for that block.
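For reference, the encoding being discussed works like this: a DXT1 block stores two 565 colours, and their numeric order selects the mode. If color0 > color1 the block has four opaque colours; otherwise it has three colours and the fourth index decodes to black with zero alpha. The sketch below builds the palette for one block. The 565 expansion by bit replication and the integer rounding of the blends are assumptions made for the example; exact rounding varies between implementations.

```python
def expand_565(c):
    """Expand a packed 565 colour to 8-bit RGB by bit replication."""
    r5, g6, b5 = (c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F
    return (r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2)

def dxt1_palette(color0, color1):
    """Return the four RGBA palette entries of a DXT1 block."""
    c0, c1 = expand_565(color0), expand_565(color1)
    if color0 > color1:
        # Four-colour mode: two blends at 1/3 and 2/3.
        c2 = tuple((2 * a + b) // 3 for a, b in zip(c0, c1))
        c3 = tuple((a + 2 * b) // 3 for a, b in zip(c0, c1))
        return [c0 + (255,), c1 + (255,), c2 + (255,), c3 + (255,)]
    # Three-colour mode: one midpoint, and index 3 is transparent black.
    c2 = tuple((a + b) // 2 for a, b in zip(c0, c1))
    return [c0 + (255,), c1 + (255,), c2 + (255,), (0, 0, 0, 0)]

RED_565, BLUE_565 = 0xF800, 0x001F
print(dxt1_palette(RED_565, BLUE_565))  # four opaque colours
print(dxt1_palette(BLUE_565, RED_565))  # red, blue, their midpoint, and black
```

If the alpha channel is ignored, the second ordering gives red, blue, purple and black from a single block, which is the "black for free" case.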

Simon F
10-Sep-2003, 08:34
Useful? Surely not!!.
Sure it is. If you don't care about transparency, you can use this encoding to get black without sacrificing one of your two palette colors for that block.
Well I was meaning WRT translucency but, yes, I can see that it does give you black for free. Now where's that image of the Model T Ford I need to compress....? :)

OpenGL guy
10-Sep-2003, 17:14
Useful? Surely not!!.
Sure it is. If you don't care about transparency, you can use this encoding to get black without sacrificing one of your two palette colors for that block.
Well I was meaning WRT translucency but, yes, I can see that it does give you black for free. Now where's that image of the Model T Ford I need to compress....? :)
Just think how great a black hole would look! :D