Isn't it high time we abstracted compression methods?

Guden Oden

We've had the simple and straightforward, yet flawed and limited, S3TC for a good long while now. While it fulfils its purpose, it isn't stellar from either an image quality or a compression ratio standpoint.

Replacing it, however, isn't easily accomplished with the way APIs are designed. Rather than doing what Microsoft did and deciding upon one compression standard, why didn't they introduce function calls to compress textures, either at runtime or when installing a game, using algorithms the graphics driver supplies?

As we move along, more and more texture formats appear, formats that the S3TC algorithm is poorly equipped to handle: 3D textures, normal maps, etc. ATi now introduces their own "standard" just like 3dfx did before they went bust, and even though they probably made it open it won't help much because there is no cap in DX for either 3dfx's or ATi's compression format.

So why not decouple the algorithm completely from the API? The application could indicate what type of texture it is when it calls the compress function, or at least hint whether lossy compression is permitted if it is some type of texture there is no ready-made definition for, and the driver could do some autodetection to find out which algorithm works best for that particular texture. It would take more time, but today's CPUs are fast, and if compression was done at the time of installation it would only have to be done once, or at least infrequently, in case the user switches graphics cards and the game needs to recompress the textures due to unsupported formats etc.
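
In rough pseudocode, the kind of call I'm imagining would be something along these lines (every name here is made up, purely to illustrate):

// Purely hypothetical: the app says what the texture is, the driver picks
// the algorithm and format its hardware handles best.
enum TextureUsage { USAGE_DIFFUSE, USAGE_NORMAL_MAP, USAGE_VOLUME, USAGE_OTHER };

struct DriverTexture;  // whatever compressed format the driver chose

// Called at install time (or when the user changes graphics cards), so the
// driver can spend as long as it likes picking and tuning a format.
DriverTexture* CompressForThisCard(const void* pixels, int width, int height,
                                   TextureUsage usage, bool allowLossy);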

Surely this would be the smartest approach, rather than setting one or a few standards in stone and then being locked into an aging format that will gain support only slowly and then has to be supported for all time to come to ensure legacy apps continue to function?
 
Guden Oden said:
there is no cap in DX for either 3dfx's or ATi's compression format.

Incorrect. All the Direct3D calls that take a D3DFORMAT parameter (except index related calls probably) can take a FourCC value. That means you can query support for the desired IHV texture/surface format. 3dfx Direct3D drivers DO expose FXT1 btw.
 
But the app would have to know specifically about these formats, and the hardware would have to support them. What chip in use today supports 3dfx's format? I know of none at all.

What you speak of is not what I speak of. :)
 
Yes, the 3D API should work more like Image compression APIs on other operating systems.

I should be able to say

Compressor* c = CodecFactory::createCompressor(NORMALMAP_HINT);

c->compress(normalMapBuffer);

or

Compressor* c = CodecFactory::createCompressor(GEOMETRY_MESH_HINT);

or

c = CodecFactory::createCompressor(VOLUME_TEXTURE_HINT);

etc

I hate the DX method of asking for Caps bits for everything, and then having the application code do its own polymorphic dispatch based on caps.
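
Fleshed out a little, the sort of interface I have in mind might look like this (all of it hypothetical, of course; the names are just illustrative):

struct ImageBuffer;        // raw source pixels (hypothetical)
struct CompressedImage;    // driver-chosen format plus the compressed bits (hypothetical)

enum CompressorHint { NORMALMAP_HINT, GEOMETRY_MESH_HINT, VOLUME_TEXTURE_HINT, GENERIC_HINT };

class Compressor {
public:
    virtual ~Compressor() {}
    // The driver picks whatever algorithm suits the hint and the hardware;
    // the app never has to know which one was used.
    virtual CompressedImage* compress(const ImageBuffer* src) = 0;
};

class CodecFactory {
public:
    static Compressor* createCompressor(CompressorHint hint);  // supplied by the driver
};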
 
And OpenGL is any better, where instead of caps you need to query for support of all those lovely extensions?

[ADDED]

OpenGL actually already supports 'something' along these lines. GL_ARB_texture_compression (I think I've got that right) allows you to ask the driver to compress a texture in whatever format it likes.
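
Roughly like this (from memory, so treat it as a sketch; assumes <GL/gl.h> and <GL/glext.h> for the ARB tokens):

// Sketch: let the driver pick a compressed format for us.
// 'pixels' is the uncompressed RGBA source image.
void UploadAndLetDriverCompress(int width, int height, const void* pixels)
{
    glHint(GL_TEXTURE_COMPRESSION_HINT_ARB, GL_NICEST);   // favour quality over speed
    glTexImage2D(GL_TEXTURE_2D, 0, GL_COMPRESSED_RGBA_ARB,
                 width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

    // Ask what the driver actually did.
    GLint wasCompressed = 0, chosenFormat = 0;
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_COMPRESSED_ARB, &wasCompressed);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_INTERNAL_FORMAT, &chosenFormat);
}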
 
But does that opengl function have any way of defining the use of that texture (as you wouldn't want your bumpmaps compressed in a lossy format for example), or does it just pick an arbitrary compression method and run with it, or does it have some fixed rules based on...what?
 
OpenGL has less granularity. True, it's not free of queryable capabilities either, but the number of queryable bits you have to validate in your app is much, much lower. Moreover, over time, many of the most useful extensions are moved into the core, instead of remaining a queryable extension.

Microsoft's APIs violate many good practices of programming design IMHO. Ironically, Managed DirectX is much better.
 
I think it's a little much to say that DXTC isn't stellar in compression ratio or image quality. Used for typical base texture maps with trilinear on, it is pretty rare to find something that isn't acceptably compressed.

As for the compression ratio, DXT1 is (I think) 'small enough' - it is questionable as to whether increasing the compression ratio is worth sacrificing any further image quality for (I tend to think it unlikely that a realtime method can achieve significantly better compression on such a wide range of images without a cost in image quality - although I always hope to be proved wrong).
 
No arguments there. Hence 3Dc - pretty much comparable to the 16:16 textures that our demo-porting group used to insist on at 1/4 the size.
 
So it is 16:16 -> 3Dc that gives the 4:1 compression ratio. From most places I read it seemed like XYZ* 8:8:8:8 -> 3Dc was the reason.

Does that mean that the interpolation in R420 is done at high enough precision to get the 10-11 bits per component that is theoretically possible to get from 3Dc on a favourable texture?
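
(Presumably the 10-11 bit figure comes from the interpolation itself: with 3-bit indices between two 8-bit endpoints the intermediate values land on multiples of 1/7 of an 8-bit step, so in the best case there are about 255*7 + 1 = 1786 distinct representable levels, i.e. log2(1786) ≈ 10.8 bits.)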
 
Remember that we have filtering for 16:16 and 10:10:10:2 textures ;)

3Dc will be significantly higher quality than an 8:8:8 bump map in most cases.
 
If you start out with FP16 normal maps, you can even claim higher ratios. You go from 1024-bit to 128-bit = 8:1 compression.

I assume that 3Dc is two DXT5 alpha blocks?

For each of the two components you store:
2 8-bit values (min/max extremes)
and a 4x4 set of 3-bit indices to select from either 6 or 8 interpolated values
(if you designate special values, indicated via the order of the two interpolation parameters)

This yields (8+8) bits base + 4*4*3 bits index = 64 bits per component, * 2 (both components) = 128 bits. Each texel can select from 8 possibilities per component, yielding 8*8 = 64 unique possible normal values per 4x4 block (more than needed, BTW, since you only have 16 texels :))
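
As a sketch, assuming it really is two DXT5-style alpha blocks in the eight-value mode, the layout and decode would look roughly like this (all names made up):

#include <cstdint>

// One 64-bit channel block, same layout as a DXT5 alpha block.
struct ChannelBlock {
    uint8_t v0, v1;       // the two 8-bit extremes
    uint8_t indices[6];   // 16 packed 3-bit selectors (48 bits)
};

// A full two-channel (X/Y) block: 2 x 64 bits = 128 bits for 4x4 texels.
struct TwoChannelBlock {
    ChannelBlock x, y;
};

// Build the 8-entry palette a 3-bit index selects from (eight-value mode,
// i.e. v0 > v1): the two extremes plus six evenly spaced interpolants.
static void BuildPalette(const ChannelBlock& b, uint8_t palette[8]) {
    palette[0] = b.v0;
    palette[1] = b.v1;
    for (int i = 1; i <= 6; ++i)
        palette[i + 1] = (uint8_t)(((7 - i) * b.v0 + i * b.v1) / 7);
}

// Fetch the 3-bit selector for texel t (0..15) out of the 48 index bits.
static int Selector(const ChannelBlock& b, int t) {
    int bit = t * 3;
    int byteIdx = bit >> 3;
    int shift = bit & 7;
    int bits = b.indices[byteIdx] | (byteIdx + 1 < 6 ? b.indices[byteIdx + 1] << 8 : 0);
    return (bits >> shift) & 7;
}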


The straight DXT5 method described by ATI gives you one channel at equal precision to 3Dc, and another with 4 possible component values per texel yielding 32 different possible normals per 4x4 block, except that one of the components is somewhat more limited in its variance, meaning it's not as isotropic. The question is, is it visually distinguishable on most normal maps?


Seems to me that the ideal encoding for a 4x4 block would yield 16 possible values for one component, and 16 possible values for the other component, which can be combined in whatever way needed. Of course, perfection would be 16 possible values for the entire 4x4 block, but that's hoping for too much.
 
I haven't read the whole thread, but one reason it is left out is to let the developers explicitly make the choice of compression, and the choice of the TIME OF COMPRESSION.

Compression takes time, and quality matters, too. If you have an arbitrary amount of time (e.g. while saving the image in Photoshop), it can analyze the image and compress it in the way that takes the least quality hit (and if this means trying out all combinations, so be it).

If it is at runtime, it normally means the data should be uploaded quite fast (a millisecond? more? depends), as the API doesn't have a "loading state", only rendering (and some games don't even have a loading state).

It's like asking "why can't I have runtime-generated DivX at highest quality?" Because it simply has to process the whole movie at least twice, and do a lot of other work.

And, depending on the situation, you don't WANT the compression to change. You simply don't.

Most games even store their textures precompressed (the .dds format), so changing to a "better" algorithm at runtime would mean decompressing and recompressing, which first takes a lot of time, and second incurs another quality hit.


Oh, and (at least in OpenGL) there IS support to simply "compress the image" at runtime, but the quality will not be the best, due to the above restrictions.
 
I've been thinking about 3Dc again, and how to slightly improve on it.

You can view the problem as: given N base points and M interpolated points packed inside 128 bits, which curves have the best fitting properties for the 16 data points?

For example, with N=2 we have M=64, so our challenge is to find the best endpoints for lines that have the least error in fitting all 16 data points. This sweeps out a bunch of lines in parameter space. We can use least squares or the Hough Transform to find this.
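
Something like this for the least-squares flavour (just a sketch, not tuned: take the principal axis of the 16 samples and use the extreme projections as the endpoints; no quantisation back to 8 bits here):

#include <cmath>

struct P2 { double x, y; };

// Fit a least-squares line through 16 (x,y) samples via the principal axis
// of their covariance, then use the extreme projections as the two base points.
static void FitEndpoints(const P2 pts[16], P2* e0, P2* e1) {
    // Mean
    double mx = 0, my = 0;
    for (int i = 0; i < 16; ++i) { mx += pts[i].x; my += pts[i].y; }
    mx /= 16; my /= 16;

    // 2x2 covariance
    double cxx = 0, cxy = 0, cyy = 0;
    for (int i = 0; i < 16; ++i) {
        double dx = pts[i].x - mx, dy = pts[i].y - my;
        cxx += dx * dx; cxy += dx * dy; cyy += dy * dy;
    }

    // Principal axis = eigenvector of the larger eigenvalue
    double tr = cxx + cyy;
    double det = cxx * cyy - cxy * cxy;
    double lambda = 0.5 * (tr + std::sqrt(tr * tr - 4 * det));
    double ax = cxy, ay = lambda - cxx;
    double len = std::sqrt(ax * ax + ay * ay);
    if (len < 1e-12) { ax = 1; ay = 0; } else { ax /= len; ay /= len; }

    // Extreme projections along the axis become the endpoints of the line
    double tmin = 1e30, tmax = -1e30;
    for (int i = 0; i < 16; ++i) {
        double t = (pts[i].x - mx) * ax + (pts[i].y - my) * ay;
        if (t < tmin) tmin = t;
        if (t > tmax) tmax = t;
    }
    e0->x = mx + tmin * ax;  e0->y = my + tmin * ay;
    e1->x = mx + tmax * ax;  e1->y = my + tmax * ay;
}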

But what about N=3? What if we try to fit quadratic curves? That is, we store 3 base points (24:24), and use 5-bit indices as lookups. This yields 32 interpolated points, but potentially a closer fit. Hmm. The previous method generates more potential interpolants (64 vs 32), but the average distance might not be as close as with the 32. Need some experiments.

Of course, we can go one further and store 4 base points (32:32) leaving 4-bit indices and exactly 16 interpolated points for 16-texels. Only problem is it eats up more transistors to do higher degrees of interpolation. I guess this is why we don't have >linear interpolation on texture units.
 
DemoCoder said:
Of course, we can go one further and store 4 base points (32:32) leaving 4-bit indices and exactly 16 interpolated points for 16-texels. Only problem is it eats up more transistors to do higher degrees of interpolation. I guess this is why we don't have >linear interpolation on texture units.


The other problem is of course more texture lookups too, so there is a memory component. It's not so much of a problem with a magnification filter, but once you start getting into the mip-maps I believe it would start to take a big performance hit.

Dio said:
Remember that we have filtering for 16:16 and 10:10:10:2 textures ;)

3Dc will be significantly higher quality than an 8:8:8 bump map in most cases.

If you really need better than 8-bit int accuracy (if you're talking about float, well, 3Dc can't handle floats in the general case) then you have made a very, very big mistake. Hey, it might all work fine on ATI hardware, BUT I don't believe the 3Dc specs state what the precision of the calculations is, and seeing as this was aimed at A8R8G8B8 textures, most implementations would end up using 8-bit accuracy and you would have major problems.
 
Dio said:
3Dc will be significantly higher quality than an 8:8:8 bump map in most cases.
3DC appears to be covered by the original S3TC patent. I was wondering how licensing of this scheme will work.
 
Simon F said:
Dio said:
3Dc will be significantly higher quality than an 8:8:8 bump map in most cases.
3DC appears to be covered by the original S3TC patent. I was wondering how licensing of this scheme will work.

Well, it probably shouldn't be a problem for DX then, but I don't exactly have a copy of the licensing agreement between M$ and S3.
 
Dio said:
Remember that we have filtering for 16:16 and 10:10:10:2 textures ;)

3Dc will be significantly higher quality than an 8:8:8 bump map in most cases.
I just wanted to make sure, since the interpolation in DXTC and 3Dc isn't necessarily done with the same hardware as the bilinear filtering. And there were some rumours that the compressor only used 8 bits per component as input.

[Edit]
OK, looking some more at it, it seems that the most efficient way is to use the same hardware anyway.
 