Colour Subsampling

aths

While I still don't have my professor's approval, I plan to also cover efficient texture storage in my diploma thesis, meaning texture compression.

Traditionally, one can lower either the spatial resolution or the colour resolution. But we want both highly detailed textures and no colour banding.

So the problem is how to keep the spatial detail high without sacrificing too much colour resolution. Let's have a look at this picture:

Lagos03w.jpg


We could maybe lower the resolution of some less important colour channels, especially R and B. To show the effect, each 8x8 tile of texels now gets only one RB value, while G is still stored per texel.

Bild1.jpg


Now we of course have a visible tile structure. Let's try a bilinear interpolation of the RB channels:

Bild2.jpg


The entire picture is now very blurry. In earlier tests I found that one can lower the R channel resolution by a factor of 2/3 (so it is upscaled by 1.5) and the B channel even by a factor of 0.5 (upscaled by 2.0) without noticeable blurriness. Maybe we should try another colour model. I took YCbCr. Y (brightness) is stored at full resolution, but CbCr (colour) only once per 8x8 tile:

Bild3.jpg


That looks much better. Let's go to even larger tiles of 16x16:

Bild4.jpg


Now we can finally see the tile structure easily. Let's, even though one expects blurriness, try again using a bilinear filter for the CbCr channels:

Bild5.jpg


This is still acceptable, considering that only one colour value is stored per 256 texels.
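The experiment above can be sketched in a few lines of NumPy. This is my own minimal reconstruction, not the tool that produced the pictures; it assumes floating-point RGB in 0..1, the JPEG/BT.601 YCbCr matrix, and plain per-tile averaging of the chroma channels:

```python
import numpy as np

def subsample_cbcr(rgb, tile=8):
    """Convert RGB (floats in 0..1) to YCbCr, keep Y per texel,
    replace Cb/Cr by their average over each tile x tile block,
    then convert back to RGB."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b
    h, w = y.shape
    for chan in (cb, cr):
        for ty in range(0, h, tile):
            for tx in range(0, w, tile):
                chan[ty:ty+tile, tx:tx+tile] = chan[ty:ty+tile, tx:tx+tile].mean()
    # Inverse transform back to RGB.
    out = np.stack([y + 1.402 * cr,
                    y - 0.344136 * cb - 0.714136 * cr,
                    y + 1.772 * cb], axis=-1)
    return np.clip(out, 0.0, 1.0)
```

The bilinear variant shown in the later pictures would interpolate the per-tile Cb/Cr values instead of replicating them.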


To avoid visible tiling artifacts, Simon Fenney suggests using bilinear filtering of the reference colours. (Have a look at http://www.powervr.com/Downloads/Factsheets/.) His compression method is not simple RB downsampling, but more sophisticated.

Nevertheless, I think that to deliver the best sharpness it is best to use the YCbCr colour system and sacrifice only CbCr information. Traditional texture compression techniques still work in the RGB colour space. But even though the reference colours are stored as 5:6:5, so that the resolution of G is the highest, YCbCr offers more flexibility and allows larger tiles without too much visible loss.

On the other hand, DXT1 is often good enough, and uses fully independent tiles, which are also nicely small.
 
Isn't this effectively YUV compression? Y is luminance (brightness) and UV are colour. MPEG-2 and other compression formats use this technique to compress the data. I am sure the same technique has also been used for still image compression, but it's best known for video codecs.

I think some hardware already supports YUV textures, so this kind of compression is more or less available in some form depending on the HW:

http://msdn.microsoft.com/library/d...tx/graphics/reference/d3d/enums/d3dformat.asp

I am sure Simon is the better person to comment on this though.

K-

P.S. You might want to use a wider variety of test images, since I suspect an image with more colour variety will show more artefacts with your proposed technique.
 
Kristof said:
Isn't this effectively YUV compression? Y is luminance (brightness) and UV are colour. MPEG-2 and other compression formats use this technique to compress the data. I am sure the same technique has also been used for still image compression, but it's best known for video codecs.
Yes, video uses YUV, not YCbCr (which is similar, though.)

Even high-quality video uses only one UV value per 2 pixels. JPEG stores one CbCr value for each 2x2 block of pixels.

Kristof said:
I think some hardware already supports YUV textures, so this kind of compression is more or less available in some form depending on the HW:

http://msdn.microsoft.com/library/d...tx/graphics/reference/d3d/enums/d3dformat.asp
Yes.

My experiments at this time are purely "academic", to get a feeling for what could be done. Colour subsampling in RGB now looks like a bad idea, while YCbCr looks like a usable approach to subsampling the colour information. At the same subsampling ratio, the picture is sharper and shows fewer colour fringes at high-contrast edges.

Kristof said:
P.S. You might want to use a wider variety of test images, since I suspect an image with more colour variety will show more artefacts with your proposed technique.
I tested more images (photographs, computer game screenshots, also images of the Mandelbrot set) but decided to show that particular picture.

The advantage of YCbCr colour subsampling is that even with insanely big tiles (of 32x32) the picture does not look "wrong" at first sight. Of course, the entire picture is less colourful, and fine colour structures are filled with the average surrounding colour, but if you consider that the colour value is stored only once per 1000 pixels (1024 to be exact), the result is wondrous, at least for me :)

I don't know if JFIF JPEG allows this today, but I think it should offer storing colour (CbCr) only once per 4x4 tile, and of course use a bilinear filter for decompression to avoid blocky artifacts. The RMS error (I chose squared RGB differences, weighted by their influence on the brightness) is lower than without bilinear magnification.
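The error measure mentioned above can be sketched as follows. The post does not state the exact weights, so using the 0.299/0.587/0.114 luma weights is my assumption (they match the 30/60/10 split mentioned later in the thread):

```python
import numpy as np

# Per-channel influence on brightness (assumed: Rec. 601 luma weights).
WEIGHTS = np.array([0.299, 0.587, 0.114])

def weighted_rms(original, decoded):
    """RMS of squared per-channel RGB differences, each channel
    weighted by its influence on perceived brightness."""
    diff2 = (original.astype(float) - decoded.astype(float)) ** 2
    return np.sqrt((diff2 * WEIGHTS).sum(axis=-1).mean())
```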
 
aths said:
Nevertheless, I think that to deliver the best sharpness it is best to use the YCbCr colour system and sacrifice only CbCr information. Traditional texture compression techniques still work in the RGB colour space. But even though the reference colours are stored as 5:6:5, so that the resolution of G is the highest, YCbCr offers more flexibility and allows larger tiles without too much visible loss.
It's certainly good for "natural" images, but you run into problems with "graphics".
 
Simon F said:
aths said:
Nevertheless, I think that to deliver the best sharpness it is best to use the YCbCr colour system and sacrifice only CbCr information. Traditional texture compression techniques still work in the RGB colour space. But even though the reference colours are stored as 5:6:5, so that the resolution of G is the highest, YCbCr offers more flexibility and allows larger tiles without too much visible loss.
It's certainly good for "natural" images, but you run into problems with "graphics".
The results with photo textures are excellent, in my opinion. YCbCr with CbCr subsampling fails in some special cases, though.

I don't know how expensive TC decompression is today. To get better results, I suggest:

- Use larger tiles of 8x8.

- Use colour subsampling heavily. One colour per 8x8 tile? YCbCr allows that without too many artifacts. For most photo textures, it is good enough.

- Store the colour once per tile corner and use bilinear interpolation of the colour value. In total one needs more colour values than tiles, but not that many more. My pictures use the average colour value of the tile and an offset for determining the bilinear interpolation coordinates, so it seems possible to store the same number of colour values as there are tiles.

- Also store the brightness (Y) once per tile corner and interpolate it over the tile. Maybe use the average tile brightness, to have only as many values as there are tiles. But maybe it's better to determine the average brightness at the corners, even though the compression gets more complicated.

- Brightness interpolation should allow a very low precision for the Y modification over the tile.

- One should investigate non-linear quantisation of the modulation. I don't mean your weights of 3/8 and 5/8, but allowing both slight and strong deviations from the interpolated Y value.

But nevertheless, this approach will not lead to a data rate substantially below 4 bits per texel. In the end it's more of a fixed-data-rate, JPEG-like compression with a lower bitrate and lower picture quality.

The next idea is to store some DCT coefficients for both the colour value and the brightness. This makes it possible to store the most important (still low-frequency) characteristics ...


I tested for S3TC the interpolation weights 0/3, 1/3, 2/3, 3/3 versus 0/8, 3/8, 5/8, 8/8. It was very hard to find an example where the latter table delivers a better result. In almost every case, linear interpolation wins. I don't know why; intuitively it should be better to use the more central weights, because middle values should be more common than extremely dark and bright ones.
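The comparison can be reproduced with a toy one-channel model (helper names are mine; endpoints are fixed at 0 and 1 for simplicity). On a smooth gradient block, the evenly spaced table indeed gives the lower squared error:

```python
def palette(c0, c1, weights):
    """Expand two endpoint values into a 4-entry lookup table."""
    return [c0 + (c1 - c0) * w for w in weights]

def block_error(pixels, c0, c1, weights):
    """Sum of squared errors when every pixel snaps to its
    nearest palette entry."""
    pal = palette(c0, c1, weights)
    return sum(min((p - q) ** 2 for q in pal) for p in pixels)

LINEAR  = (0/3, 1/3, 2/3, 3/3)
CENTRIC = (0/8, 3/8, 5/8, 8/8)

# A smooth 16-texel gradient: the evenly spaced table wins here.
grad = [i / 15 for i in range(16)]
assert block_error(grad, 0.0, 1.0, LINEAR) < block_error(grad, 0.0, 1.0, CENTRIC)
```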
 
Better results

There is another way to get better results. This is a very difficult image for CbCr-subsampling:

mandela.jpg


Now I chose heavy colour subsampling (16x16 tiles)

mandelb.jpg


Note the blurry colour seam at the fine structures. Now let's assume this image stores colour in the sRGB model. We linearize before the conversion to YCbCr and re-apply the gamma to the final RGB values.

mandelc.jpg


While the colour resolution is of course still very low, the fine structures are better visible.

Primarily, photo textures should be stored in sRGB space and treated as sRGB. Interpolating sRGB values as if they were linear (for example in the texture filter) is simply wrong and leads to wrong results.
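The linearize-then-re-gamma step can be sketched as below. Like the post, this approximates sRGB with a plain power curve of 2.2; the exact sRGB transfer function additionally has a small linear segment near black:

```python
def srgb_to_linear(v, gamma=2.2):
    """Approximate sRGB decoding with a plain power curve.
    (Exact sRGB uses exponent 2.4 plus a linear toe below ~0.04045.)"""
    return v ** gamma

def linear_to_srgb(v, gamma=2.2):
    """Approximate sRGB encoding, the inverse of the above."""
    return v ** (1.0 / gamma)

# Usage: subsample in linear light, then re-encode, e.g.
# linear_to_srgb(subsample(srgb_to_linear(texel)))
```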

Here are the subsampled images without interpolation of CbCr:
(The right edge is calculated wrongly due to a bug in my tool.)

mandel4.jpg


That was with a linear RGB interpretation. And this is with an sRGB interpretation (approximated with gamma 2.2):

mandel5.jpg


If your monitor is calibrated correctly, the bottom image should be better. To see if your monitor shows sRGB correctly, compare inner and outer brightness of this pattern:

22gamma.png
 
Re: Better results

aths said:
If your monitor is calibrated correctly, the bottom image should be better. To see if your monitor shows sRGB correctly, compare inner and outer brightness of this pattern:
Something seems to be very wrong with those sRGB treated images. They're clearly less saturated and brighter than the original image, showing strong banding and line artifacts.
The compressed and then decompressed image should resemble the original. You don't need to calibrate your monitor to see that there's a big difference.
 
Re: Better results

Xmas said:
aths said:
If your monitor is calibrated correctly, the bottom image should be better. To see if your monitor shows sRGB correctly, compare inner and outer brightness of this pattern:
Something seems to be very wrong with those sRGB treated images. They're clearly less saturated and brighter than the original image, showing strong banding and line artifacts.
The compressed and then decompressed image should resemble the original. You don't need to calibrate your monitor to see that there's a big difference.
It is really funny: the sRGB-interpreted images get a higher average error per pixel than the linearly interpreted images, even if I calculate the error in sRGB space.

Nevertheless, without handling the content as sRGB, the colour transitions at high-contrast edges produce dark areas. With sRGB handling, these transitions are brighter and look better.
 
S3TC is just too damn good. It's very hard to improve upon.

I think a luminance/saturation/hue split makes some sense. I've been thinking about this a bit today, and the biggest problem I have with it is that I don't see any acceptable way to encode luminance with less than 3 bits per pixel in a fixed-ratio format that works on completely isolated blocks.

FWIW I suggest 4x4 blocks with two 8-bit endpoints, yielding a four-entry LUT via interpolation, plus two bits per pixel that index the LUT. 2*8+16*2 = 48 bits per block. A linear gradient can be encoded very well, with just a little loss of precision if it runs diagonally through the block. High-contrast blocks are much less precise, but it's harder to see those errors.
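A sketch of that block format (my own minimal encoder; it simply uses the min/max values as endpoints, whereas a real encoder would search for better ones):

```python
def encode_lum_block(pixels):
    """Encode 16 luminance values (one 4x4 block) as two 8-bit
    endpoints plus one 2-bit LUT index per pixel: 48 bits total."""
    lo, hi = min(pixels), max(pixels)
    lut = [lo + (hi - lo) * w // 3 for w in range(4)]
    indices = [min(range(4), key=lambda k: abs(p - lut[k])) for p in pixels]
    return lo, hi, indices

def decode_lum_block(lo, hi, indices):
    """Rebuild the 4-entry LUT and look up each pixel."""
    lut = [lo + (hi - lo) * w // 3 for w in range(4)]
    return [lut[k] for k in indices]
```

A flat block and a smooth gradient round-trip with small error; high-contrast blocks lose more, as noted above.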
 
zeckensack said:
S3TC is just too damn good. It's very hard to improve upon.

I think a luminance/saturation/hue split makes some sense.
I doubt that. For my "Großer Beleg" thesis I worked with HSL and HSV. S and L/V are purely mathematical expressions for saturation and luminance. The same difference in hue, saturation or luminance can make a small or a big difference in the resulting colour, depending on the other two values. That is not suitable for linear interpolation. It is also much harder to convert RGB into HSx (or HSx into RGB).

In my opinion, it's most important to split brightness and colour. YCbCr provides that. I also tried lower colour precision, but due to some strange behaviour the entire picture gets a strange "Farbstich" (damn, how do you say that in English??)

zeckensack said:
I've been thinking about this a bit today and the biggest problem I have with this is that I don't see any acceptable way to encode luminance with less than 3 bits per pixel in a fixed-ratio format that works on completely isolated blocks.

FWIW I suggest 4x4 blocks with two 8 bit endpoints, yielding a four-entry LUT via interpolation. Plus two bits per pixel that index the LUT. 2*8+16*2=48 bits per block. A linear gradient can be encoded very well, with just a little loss of precision if it's running diagonally through the block. High contrast blocks are much less precise, but it's harder to see these errors.
It's extremely hard to get under 4 bits per texel. I thought about compressing CbCr as normal S3TC tiles, but then using that information for 4x4 Y tiles. With Y, Cb and Cr stored with 8 bits each, the Y tile is only 48 bits long. But the total data rate is still 3.25 bits per texel, then.
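The 3.25 bpp figure can be checked with a little arithmetic. This is my reading of the scheme: one 48-bit Y block per 4x4 texels, plus one 64-bit S3TC-style CbCr block shared by 4x4 Y tiles, i.e. by 16x16 texels:

```python
# One 48-bit luminance block per 4x4 texels:
y_bpp = (2 * 8 + 16 * 2) / (4 * 4)   # two endpoints + 2-bit indices -> 3.0 bpp

# One 64-bit S3TC-style CbCr block, shared by 4x4 Y tiles
# (i.e. one colour block per 16x16 texels):
cbcr_bpp = 64 / (16 * 16)            # -> 0.25 bpp

total_bpp = y_bpp + cbcr_bpp         # -> 3.25 bits per texel
```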

S3TC is, despite its simplicity, very good indeed.
 
aths said:
zeckensack said:
S3TC is just too damn good. It's very hard to improve upon.

I think a luminance/saturation/hue split makes some sense.
I doubt that. For my "Großer Beleg" thesis I worked with HSL and HSV. S and L/V are purely mathematical expressions for saturation and luminance. The same difference in hue, saturation or luminance can make a small or a big difference in the resulting colour, depending on the other two values. That is not suitable for linear interpolation. It is also much harder to convert RGB into HSx (or HSx into RGB).
Yeah, maybe you're right.
I thought a hue representation could be beneficial because it's a very compact colour representation. Six or seven bits should be acceptable. I don't know about L or V; I only know that you need high precision for shades of green and yellow, and little precision for shades of blue, with red somewhere in the middle. So the colour "circle" should be very non-linear.

But if you've actually made the experiments, I'll trust your judgement.
aths said:
In my opinion, it's most important to split brightness and colour. YCbCr provides that. I also tried lower colour precision, but due to some strange behaviour the entire picture gets a strange "Farbstich" (damn, how do you say that in English??)
"Tint".
Don't know. I liked the saturation idea because I would have thought that you can get away with "medium" precision for it (low precision for hue, high precision for luminance), whereas if you use YCbCr you'll encode something similar to combined saturation/hue with a uniform resolution. I think hue doesn't need as much resolution as saturation, and that's why I wanted to split it there.

edited: I mean spatial resolution here, aka block size, not bits.

aths said:
It's extremely hard to get under 4 bits per texel. I thought about compressing CbCr as normal S3TC tiles, but then using that information for 4x4 Y tiles. With Y, Cb and Cr stored with 8 bits each, the Y tile is only 48 bits long. But the total data rate is still 3.25 bits per texel, then.
3.25 bits is a nice data rate. I don't think you can do much better without some serious magic.
"My" approach doesn't do any better either, but has the additional downside of being completely unimplemented and untested ;)

But this is highly interesting stuff. Did I ever tell you that my first C/C++ "learning" project was an S3TC compressor? :D
 
zeckensack said:
aths said:
zeckensack said:
S3TC is just too damn good. It's very hard to improve upon.

I think a luminance/saturation/hue split makes some sense.
I doubt that. For my "Großer Beleg" thesis I worked with HSL and HSV. S and L/V are purely mathematical expressions for saturation and luminance. The same difference in hue, saturation or luminance can make a small or a big difference in the resulting colour, depending on the other two values. That is not suitable for linear interpolation. It is also much harder to convert RGB into HSx (or HSx into RGB).
Yeah, maybe you're right.
I thought a hue representation can be beneficial because it's a very compact color representation. Six or seven bits should be acceptable. I don't know about L or V, I only know that you need high precision for shades of green and yellow, and little precision for shades of blue, with red somewhere in the middle. So the color "circle" should be very non-linear.
You also need nearly no hue precision at very low saturation and/or luminance. But I consider it difficult to store more or fewer bits per component; the decoder complexity for the HSx conversion is already high enough. While non-linear hue coding is a good idea, you can maybe save 1 or 2 bits. That is not enough for a significant decrease of the total data rate.

To keep the full resolution, one needs at least 1 bit per texel. Of course, 2 bits is much better. And we have to store the reference colours, leading to at least 3 bits per texel, even with colour subsampling.

Simon Fenney's approach to a 2 bpp data rate uses either only half resolution or can be used for 2 colours only. But there is no way to get a reasonable quality on common textures with only 2 bpp.

zeckensack said:
But if you've actually made the experiments, I'll trust your judgement.
aths said:
In my opinion, it's most important to split brightness and colour. YCbCr provides that. I also tried lower colour precision, but due to some strange behaviour the entire picture gets a strange "Farbstich" (damn, how do you say that in English??)
"Tint".
Don't know. I liked the saturation idea because I would have thought that you can get away with "medium" precision for it (low precision for hue, high precision for luminance), whereas if you use YCbCr you'll encode something similar to combined saturation/hue with a uniform resolution. I think hue doesn't need as much resolution as saturation, and that's why I wanted to split it there.
Hue also influences the actual brightness. Therefore one still needs quite high hue precision. Hue can be yellow (with 89% Y intensity) or blue (with 11% Y intensity). So you need to have a high resolution around yellow, as you already wrote.

zeckensack said:
aths said:
It's extremely hard to get under 4 bits per texel. I thought about compressing CbCr as normal S3TC tiles, but then using that information for 4x4 Y tiles. With Y, Cb and Cr stored with 8 bits each, the Y tile is only 48 bits long. But the total data rate is still 3.25 bits per texel, then.
3.25 bits is a nice data rate. I don't think you can do much better without some serious magic.
"My" approach doesn't do any better either, but has the additional downside of being completely unimplemented and untested ;)

But this is highly interesting stuff. Did I ever tell you that my first C/C++ "learning" project was an S3TC compressor? :D
My approach is untested, too. I think I will try to implement it in the next days.

How do you calculate the two reference colours for S3TC? I search every tile for the min and max R, G and B values. Then I round them to 5, 6 and 5 bits and expand them by placing the MSBs into the lowest bits. Then I calculate the two interpolated colours (in full precision), compare every texel colour in the tile with each of the four colours, and choose the one with the smallest error, weighted by the influence on Y (30% for R, 60% for G, 10% for B).

This works, but you often get some tiling artifacts.
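The procedure described above, sketched in Python (helper names are mine; the 0.3/0.6/0.1 luma weights are the ones stated; note that the palette order here differs from the on-disk DXT1 layout):

```python
def expand(v, bits):
    """Quantize an 8-bit value to `bits` bits, then replicate the
    MSBs into the low bits, as described above."""
    q = v >> (8 - bits)
    return (q << (8 - bits)) | (q >> (2 * bits - 8))

LUMA = (0.3, 0.6, 0.1)  # influence weights for R, G, B

def encode_tile(texels):
    """texels: list of (r, g, b) 8-bit triples for one 4x4 tile.
    Returns the two reference colours and a palette index per texel."""
    c0 = tuple(expand(min(t[i] for t in texels), b)
               for i, b in enumerate((5, 6, 5)))
    c1 = tuple(expand(max(t[i] for t in texels), b)
               for i, b in enumerate((5, 6, 5)))
    # The two endpoints plus two interpolated colours, in full precision.
    palette = [tuple(c0[i] + (c1[i] - c0[i]) * w / 3 for i in range(3))
               for w in range(4)]
    def err(t, p):
        return sum(LUMA[i] * (t[i] - p[i]) ** 2 for i in range(3))
    indices = [min(range(4), key=lambda k: err(t, palette[k])) for t in texels]
    return c0, c1, indices
```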

There is another problem. Maybe one texel has a very low blue component. It doesn't look wise to put that low blue value into the reference colour, since it sacrifices precision for all the other texels. I think one should maybe "jitter" the reference colours, meaning try other nearby values and test whether the average RMS error can be reduced. You are trading dynamic range against precision.

It also looks promising to consider the average colours around the tile to minimize tiling artifacts. But this makes the whole compression process even more complicated.
 
Re: Better results

aths said:
If your monitor is calibrated correctly, the bottom image should be better. To see if your monitor shows sRGB correctly, compare inner and outer brightness of this pattern:

22gamma.png

Well, if I scroll the pattern to the top of the window, the centre is darker than the surrounding pattern.
If I scroll the pattern to the bottom of the window, the centre is lighter than the surrounding pattern.

LCD monitors are FUN! ;)
 
It's extremely hard to get under 4 bits per texel.
I suppose it all depends. If you want to delve into some areas of greater complexity, you can get pretty close to 4 bits per texel while maintaining near-lossless quality. Although I can't say that the alpha channel would maintain high quality under such constraints. Alpha is another challenge of the same order of importance as the Y channel, as it can carry soft gradients just as easily as hard edges... and oftentimes both in the same image.

I have my own approach that, while extremely demanding in computation, is mainly vector/matrix tasks -- requiring matrix inversion for 3 texels out of every 2x2 block. So it's well suited to GPUs anyway. But it's far from having all that great a bitrate as of yet. A lot of the things I'd like to do are still on paper -- I need some free time someday this decade. About the only real advantage it carries over other texture compression schemes is that it can implicitly store all miplevels. I still need to implement some kind of progressive encoding scheme, so you can basically get any miplevel by just partially decoding... Now if only there was some suitable method that wasn't already patented (damn software patent system).


More than anything else, throwing a monkey wrench into the whole mess is HDR. A YCbCr HDR scheme would be rather interesting, since from a pure theoretical standpoint, only the Y channel really needs to be HDR. The chrominance channels are pure hue. And ultimately, I do think HDR will be important in the long run wherever textures become more than just colour information, but also numerical data to feed computational tasks. That's going to get all the more interesting. While we can use things like logarithmic encodings to fit the same data within the 8-8-8-8 space, that doesn't mean the same coding schemes work just as well in a new domain.
 
ShootMyMonkey said:
It's extremely hard to get under 4 bits per texel.
I suppose it all depends. If you want to delve into some areas of greater complexity, you can get pretty close to 4 bits per texel while maintaining near-lossless quality.
You could perform a lossy compression based on wavelets, but then the decoding looks very expensive.

ShootMyMonkey said:
Although I can't say that the alpha channel would maintain high quality under such constraints. Alpha is another challenge of the same order of importance as the Y channel, as it can carry soft gradients just as easily as hard edges... and oftentimes both in the same image.
I'm not investigating alpha compression now, because I think it's best to store compressed alpha per tile in an extra alpha tile. I doubt that the DXT3 / DXT5 alpha channel compression can be beaten. It's a single-channel compression where you don't have many options, in my opinion.

ShootMyMonkey said:
I have my own approach that, while extremely demanding in computation, it's mainly vector/matrix tasks -- requiring matrix inversion at 3 texels out of every 2x2 block. So it's well suited to GPUs anyway. But it's far from having all that great of a bitrate as of yet. A lot of the things I'd like to do are still on paper -- I need some free time someday this decade. About the only real advantage it carries over other texture compression schemes is that it can implicitly store all miplevels. I still need to implement some kind of progressive encoding scheme, so you can basically get any miplevel by just partially decoding... Now if only there was some suitable method that wasn't already patented (damn software patent system).
I don't know if it is best not to store MIPs separately, because with some rendering modes one needs differently generated MIPs. OK, all common rendering techniques are fine with block-generated MIPs. But if I prefer fast trilinear with a 16x16 filter kernel, one has to generate 16 texels from the base map (per clock) instead of 2x 4 (in two clocks). Having MIPs means up to 33% more texture data. That is not that much in my opinion, and it gives you additional freedom regarding the MIP content.

ShootMyMonkey said:
More than anything else, throwing a monkey wrench into the whole mess is HDR. A YCbCr HDR scheme would be rather interesting, since from a pure theoretical standpoint, only the Y channel really needs to be HDR. The chrominance channels are pure hue. And ultimately, I do think HDR will be important in the long run whereever textures become more than just color information, but also numerical data to feed computational tasks. That's going to get all the more interesting. While we can use things like logarithmic encodings to fit the same data within the 8-8-8-8 space, that doesn't mean the same coding schemes can work all the same in a new domain.
I already attempted to find a way to compress at least FP16 textures. But a reasonable compression is harder to find than I thought. FP16 is not only for dynamic range but also for precision, so the compression must not be too lossy.

Your approach of storing only Y as HDR reminds me of the RGBE format. It is, by the way, also 32 bits wide.
 
aths said:
You also need nearly no hue precision at very low saturation and/or luminance. But I consider it difficult to store more or fewer bits per component; the decoder complexity for the HSx conversion is already high enough. While non-linear hue coding is a good idea, you can maybe save 1 or 2 bits. That is not enough for a significant decrease of the total data rate.
Of course it's difficult :D
You'd probably want the full 8 bits for the endpoints for storage reasons, but the point is that hue doesn't need much numeric precision to be (visually) accurate. That's good, IMO. Your idea includes relatively low spatial resolution for the actual colour, and that works okay, as you illustrated above.
E.g. you could encode hue just once per 4x4 block and do some interpolation (on the hue circle, not in RGB space). If you pack this into an S3TC-like block format, a "hue block" spans 16x16 pixels and would take the same 48 bits as the 4x4 pixel "luminance block" I suggested.

Keeping sat at a higher spatial resolution than hue (2x2) would help with this kind of problem.

Yes, there's some interdependence between hue and sat. I'd rather test the staggered subsampling approach for viability first before thinking this through, but I think this dependence can be exploited with some encoding tricks, while keeping the staggered subsampling intact.
aths said:
Hue also influences the actual brightness. Therefore one still needs quite high hue precision. Hue can be yellow (with 89% Y intensity) or blue (with 11% Y intensity). So you need to have a high resolution around yellow, as you already wrote.
I don't think we're agreeing on the meaning of "luminance" here. I'd store NTSC-weighted intensity, so there would be no dependence on hue. That's necessary because intensity must have high spatial resolution. That's where humans are best at detecting errors.
aths said:
My approach is untested, too. I think I will try to implement it in the next days.

How do you calculate the two reference colours for S3TC? I search every tile for the min and max R, G and B values. Then I round them to 5, 6 and 5 bits and expand them by placing the MSBs into the lowest bits. Then I calculate the two interpolated colours (in full precision), compare every texel colour in the tile with each of the four colours, and choose the one with the smallest error, weighted by the influence on Y (30% for R, 60% for G, 10% for B).
The first step was to convert colors into an NTSC-weighted color space and tackle it geometrically. The next task is "find the line segment with the smallest average distance to these 16 points". As an intermediate step toward this goal, you can also first try to "find the plane with smallest average distance to these 16 points". Then, when you have the "best" plane, you find the "best" line. Then you find the "best" line segment.
From there it gets really complicated ... it's not always best to have the min/max values as the endpoints. In many cases you should rather move the endpoints so that the two interpolated steps on the line segment come closer to some points near the middle.

My program never performed well ;)
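The plane-then-line search described above can be started with a principal-axis fit. A minimal NumPy sketch of that first step (the real problem, finding the best quantized segment, is much harder):

```python
import numpy as np

def principal_line(points):
    """Fit a line through the centroid along the direction of
    largest variance (first principal component), and return the
    extreme projections as candidate endpoints."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    # First right singular vector = direction of largest variance.
    _, _, vt = np.linalg.svd(pts - mean)
    direction = vt[0]
    t = (pts - mean) @ direction
    return mean + t.min() * direction, mean + t.max() * direction
```

For a block of 16 (weighted) colour points, the two returned points are a reasonable starting guess for the S3TC endpoints before any refinement.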
aths said:
There is another problem. Maybe one texel has a very low blue component. It doesn't look wise to put that low blue value into the reference colour, since it sacrifices precision for all the other texels. I think one should maybe "jitter" the reference colours, meaning try other nearby values and test whether the average RMS error can be reduced. You are trading dynamic range against precision.
That's one of the few problems with S3TC. You can't really encode a block with green, red and blue pixels. All points must be close to a straight line segment, curves just aren't possible. With such difficult blocks, you'll always have to pick "unimportant" colors, where you know that you won't be able to encode them properly, while keeping visual errors low. Not simple at all.

It helps a lot if you split this 3D problem into three 1D problems (or one 2D problem and one 1D problem). That's one of the benefits of colour subsampling: the split just comes naturally :)
 
I don't know if it is best not to store MIPs separately, because with some rendering modes one needs differently generated MIPs.
Well, it's actually possible to have uniquely different MIPs at every level with my approach, but it is guaranteed to come at a bitrate cost. Especially since, if you want full control, you end up overwriting the anchor pixels in a particular miplevel which are used to generate the next higher-res level. Ultimately, what I'm doing is not that radically different from wavelet techniques, except that it's effectively an adaptive basis down to the pixel level (or at least 3 out of every 4 pixels) -- that, and error is not stored in the basis, since there's no real basis function per se. In reality, my tests are showing that the best bitrates so far come from building the miplevels using nearest-neighbour downsampling. But I can imagine that doesn't look quite so good when used in real hardware (I ought to actually try it one of these days). While you can theoretically edit the individual miplevels and such, it alters the error components. I think that's one of the main strengths of S3TC and DXT#... in that you have pretty predictable, even losses at a constant compression rate. When you dive into DCT and DWT approaches, it's hard to do the same thing (i.e., you can do one or the other reasonably well, but both?... eh...).

Your approach with storing only Y with HDR reminds me of the RGBE-format. It is by the way also 32 bit wide.
RGBE is nice and all, but my biggest problem with it is that if only one or two components of the colour dominate, you end up wasting a lot of precision in the smaller component(s).
I don't particularly see Y-only-HDR YCbCr color as an extremely clean idea either mainly because the transformation can't be completely affine anymore. It's just that from a storage standpoint, there's obviously a lot that can be done to bring down the space requirements. I'm not so much concerned with storage in this respect as with bandwidth considering we are talking about textures here. I'm generally very cynical about the growth rate of bandwidth. :?
 
Hi ShootMyMonkey,

Without more specific details I can't imagine how your approach is supposed to work. While it sounds interesting, my main goal is to think about how to store separate MIPs in a separate space.

RGBE vs. YCbCr: I consider the main advantage of YCbCr to be its use of only 3 channels. Compared to the overall brightness, you have low colour precision anyway.
 
Hi zeckensack,

I still don't like the idea of storing HSL or HSV values. Let me give you a longer answer as to why. If you get the feeling my protest is only because I didn't understand you correctly, I beg your pardon in advance.

A colour is a spectral band of electromagnetic radiation. To store the "real" colour, one would have to store that spectral band at full resolution. Now, our eyes have colour receptors that respond to red, green and blue (each with a certain sensitivity curve). Storing only the red, green and blue parts of that band is already a lossy compression, but one good enough to "deceive" us believably. Nevertheless, it is no longer possible to do everything with light colour and intensity that the "real" (physical) world can do.

Converting RGB into HSx and back is complicated. Fast hue calculations are also just an approximation. In any case, it is possible to store HSx values that make no sense, say, a hue value while the saturation or luminance value is zero. It is obvious that 8-bit-per-channel RGB can only be converted with losses to 8-bit-per-channel HSx. I don't think there is a way to get a reasonable colour format with fewer than 16 bits (like RGB 5:6:5). As for lowering the spatial resolution, storing the 3 channels in extra tiles sounds too complicated to me as well.

The lum value (Lightness in HSL or Value in HSV) is not a good representation of the actual brightness. We would also have to store at least saturation with greater precision (greater than hue).

A non-linear hue representation (via lookup table?) makes the decoder even more complex. The conversion already includes if-cases, so there is no way to convert from or to HSx with a few dp3 instructions.

HSL is a great way to generate the colour you have in mind. Doing that in RGB is far more difficult and requires experience with the RGB colour space. Since RGB is so widely used, we are accustomed to it. YCbCr can be seen as a stack of colour planes at different levels of brightness (depending on Y). In some ways it is a crossover of HSL and RGB. In any case, we have brightness separated from the colour. That is, in my opinion, the main advantage over RGB, and over HSx. Also, we can convert to and from YCbCr with only 3 dp3 instructions. I consider this an advantage, too.
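The "3 dp3 instructions" point can be sketched directly: RGB to YCbCr is a plain linear transform, one dot product per output channel. The sketch below assumes the standard BT.601 full-range coefficients (the thread does not name a specific variant):

```python
# RGB <-> YCbCr as three dot products per pixel ("dp3"), BT.601 full range.
RGB_TO_YCBCR = [
    ( 0.299,     0.587,     0.114   ),   # Y
    (-0.168736, -0.331264,  0.5     ),   # Cb
    ( 0.5,      -0.418688, -0.081312),   # Cr
]

YCBCR_TO_RGB = [
    (1.0,  0.0,       1.402   ),         # R
    (1.0, -0.344136, -0.714136),         # G
    (1.0,  1.772,     0.0     ),         # B
]

def dp3(row, v):
    return row[0]*v[0] + row[1]*v[1] + row[2]*v[2]

def rgb_to_ycbcr(rgb):
    return tuple(dp3(row, rgb) for row in RGB_TO_YCBCR)

def ycbcr_to_rgb(ycc):
    return tuple(dp3(row, ycc) for row in YCBCR_TO_RGB)
```

No branches are involved, and the round trip is exact up to floating-point error, which is exactly what makes this cheaper for hardware than an HSx conversion.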

It may also be an option to use a CbCr palette for the spatially low-res colour tiles. Then we can reach a total data rate of 3 bits per texel. The lookup table only has to be accessed once per colour tile, but it is still 512 bytes per texture and a significant additional load on the (L2) texture cache.
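As a back-of-the-envelope check of those figures (the breakdown is my own assumption: Y stored at 3 bits per texel, one 8-bit palette index per 16x16 colour tile, and a 256-entry CbCr palette at 2 bytes per entry):

```python
# Bitrate estimate for the palette idea; the per-channel split is assumed.
TILE = 16 * 16                  # texels per colour tile
Y_BPT = 3.0                     # assumed bits per texel for the Y channel
INDEX_BITS = 8                  # one palette index per colour tile

total_bpt = Y_BPT + INDEX_BITS / TILE
lut_bytes = 256 * 2             # 256 CbCr entries, 8 bits each for Cb and Cr

print(total_bpt)   # → 3.03125
print(lut_bytes)   # → 512
```

Under those assumptions the index adds only about 0.03 bits per texel on top of Y, and the lookup table comes out at exactly the 512 bytes mentioned above.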



Thanks to your posting, I now have a new idea for finding good S3TC reference colours: First, sort all 16 colour values by brightness. Then use the method of least squares (i.e. linear regression) to find a good (though still not optimal) linear interpolation. It may be good to overweight the start and end values. Also, it may still be an advantage to "jitter" the calculated values to find other, better reference points via trial and error. One could use two reference colours of each kind: one found via min-max, the other via linear regression. That way we have a range worth searching for optimal values (where optimal means the lowest average Y-weighted RMS error of the sRGB components converted into RGB space).
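The regression step above can be sketched as follows (my own reading of the idea, without the 5:6:5 quantisation, the overweighting of the endpoints, or the jitter search): fit a least-squares line of each channel against luma, then evaluate it at the minimum and maximum luma of the block to get the two reference colours.

```python
# Sketch: S3TC endpoint candidates via linear regression against luma.
def fit_endpoints(colors):
    """colors: list of 16 (r, g, b) tuples. Returns (low, high) endpoints."""
    n = len(colors)
    luma = [0.299*r + 0.587*g + 0.114*b for (r, g, b) in colors]
    mean_x = sum(luma) / n
    sxx = sum((xi - mean_x)**2 for xi in luma)
    endpoints = []
    for x in (min(luma), max(luma)):
        point = []
        for ch in range(3):
            ys = [c[ch] for c in colors]
            mean_y = sum(ys) / n
            sxy = sum((xi - mean_x)*(yi - mean_y) for xi, yi in zip(luma, ys))
            slope = sxy / sxx if sxx > 1e-12 else 0.0
            point.append(mean_y + slope * (x - mean_x))    # line at min/max luma
        endpoints.append(tuple(point))
    return endpoints
```

For a block whose colours already lie on a line (e.g. a grey ramp) this reproduces the true endpoints exactly; for noisy blocks it gives the starting point for the jitter search described above.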

How do you round the RGB values? I check the following bit, and if it is 1, I round up (careful: clamp at 255 then!). Is it better to round up only if we get an even number (like the standard IEEE rounding mode)?
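For reference, the two variants in that question look like this for non-negative 8-bit channel values (a minimal sketch; the clamp at 255 is included in both):

```python
# Round-half-up with clamp vs. IEEE-style round-half-to-even, for x >= 0.
def round_half_up(x):
    return min(255, int(x + 0.5))

def round_half_even(x):
    lo = int(x)
    frac = x - lo
    if frac > 0.5:
        return min(255, lo + 1)
    if frac < 0.5:
        return lo
    return lo if lo % 2 == 0 else min(255, lo + 1)   # exact tie: pick the even value

print(round_half_up(2.5), round_half_even(2.5))   # → 3 2
```

They only differ on exact ties (x.5); round-to-even avoids the slight upward bias that always rounding up introduces when such ties are common.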
 
Aths,
I'd like to make comments on your work but to do it properly requires quite a bit of time and I'm really a bit busy at the moment.

But just quickly...
aths said:
Thanks to your posting, I now have a new idea of finding good S3TC reference colours:
The way to do it is to compute the principal axis through the set of pixels in the block. You do that by computing the correlation matrix of the RGB values and then calculate the principal eigenvector of that matrix. That gives the direction of the axis which also passes through the average of the colours.

Once you have that, you can map all your pixels into a 1 dimensional system which makes it a lot simpler.
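A minimal sketch of that procedure (my own illustration, using the covariance matrix of the block and a few power-iteration steps to find the dominant eigenvector, rather than any particular production implementation):

```python
# Principal axis of a block's colours, then projection onto it (1-D mapping).
def principal_axis(colors):
    n = len(colors)
    mean = [sum(c[i] for c in colors) / n for i in range(3)]
    d = [[c[i] - mean[i] for i in range(3)] for c in colors]
    # 3x3 covariance matrix of the centred colours
    cov = [[sum(v[i]*v[j] for v in d) / n for j in range(3)] for i in range(3)]
    # power iteration converges to the dominant eigenvector
    axis = [1.0, 1.0, 1.0]
    for _ in range(32):
        axis = [sum(cov[i][j]*axis[j] for j in range(3)) for i in range(3)]
        norm = sum(a*a for a in axis) ** 0.5
        if norm < 1e-12:
            return mean, [1.0, 0.0, 0.0]   # degenerate (flat) block: any axis works
        axis = [a / norm for a in axis]
    return mean, axis

def project(color, mean, axis):
    """Coordinate of a colour along the principal axis through the mean."""
    return sum((color[i] - mean[i]) * axis[i] for i in range(3))
```

Once every texel is reduced to its 1-D coordinate, choosing and quantising the two reference colours becomes a one-dimensional problem, which is the simplification Simon describes.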
 