On textures and compression

Discussion in 'Console Technology' started by n00body, May 7, 2006.

  1. DudeMiester

    Regular

    Joined:
    Aug 10, 2004
    Messages:
    636
    Likes Received:
    10
    Location:
    San Francisco, CA
    Looking at the Xenos capabilities, it looks like you could set up a 128-bit paletted texture using only 4 bits per pixel in the index buffer. That would give you 32:1 compression, which I think is pretty darn good. Of course, if you factor in, say, a palette of 2048 colours with a 1024x1024 texture, then the compression is only about 30:1, which is still good. Just looking at some pictures in Photoshop, even a 256-colour palette looks OK for large full-colour photos, so a 2048-colour palette would actually be pretty ridiculous. As for smooth gradients, you could probably just do them as shader math if you really had to. Other than the lack of AF, I think it would be a fantastic compression system, and you get HDR textures (which afaik don't have AF anyway).
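    [Editor's sketch] The two ratios quoted above can be checked with quick host-side arithmetic. This assumes 128-bit (4x FP32) source texels, a 1024x1024 texture, 4-bit indices, and palette entries stored at full 128-bit precision, as the post implies:

```python
# Back-of-the-envelope check of the ratios quoted above.
SOURCE_BITS = 128                 # 4x FP32 per texel
W = H = 1024
INDEX_BITS = 4
PALETTE_ENTRIES = 2048

source_bits = W * H * SOURCE_BITS             # uncompressed size
index_bits = W * H * INDEX_BITS               # index plane
palette_bits = PALETTE_ENTRIES * SOURCE_BITS  # palette overhead

print(source_bits / index_bits)                   # 32.0 -> the 32:1 figure
print(source_bits / (index_bits + palette_bits))  # ~30.1 -> the "about 30:1"
```

    Note the figures only work out if the index stays at 4 bits; a 2048-entry palette would actually need 11-bit indices, which is exactly the packing problem raised below.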
     
  2. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,563
    Likes Received:
    171
    Location:
    In the Island of Sodor, where the steam trains lie
    The page that was linked earlier did mention tiles and regions but ignoring that, you could indeed use specific wavelet terms to construct a particular pixel except that those are surely entropy encoded making access to them just as difficult. :???:

    Actually, I have had some ideas in that respect but just need a little bit of time to try them out.

    But how do you pack the colours? A 2k palette => 11 bits per pixel and that's not going to fit very well into a binary machine. You'd need horrible addressing calculations!
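    [Editor's sketch] The "horrible addressing calculations" can be made concrete. This is illustrative only (a software bitstream reader, not how real texture hardware fetches): every 11-bit index requires a bit-offset computation and a read that can straddle byte boundaries.

```python
# Fetching the i-th 11-bit index from a tightly packed bitstream.
def fetch_11bit_index(packed: bytes, i: int) -> int:
    bit = i * 11
    byte, shift = divmod(bit, 8)
    # An 11-bit field can span up to 3 bytes, so read a 3-byte window.
    window = int.from_bytes(packed[byte:byte + 3].ljust(3, b"\0"), "little")
    return (window >> shift) & 0x7FF

# Pack three sample indices into a bitstream, then fetch them back.
values = [0x123, 0x456, 0x7FF]
acc = 0
for n, v in enumerate(values):
    acc |= v << (11 * n)
packed = acc.to_bytes((11 * len(values) + 7) // 8, "little")

print([hex(fetch_11bit_index(packed, i)) for i in range(3)])
# -> ['0x123', '0x456', '0x7ff']
```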
     
    #22 Simon F, May 9, 2006
    Last edited by a moderator: May 9, 2006
  3. Richard

    Richard Mord's imaginary friend
    Veteran

    Joined:
    Jan 22, 2004
    Messages:
    3,508
    Likes Received:
    40
    Location:
    PT, EU
    I don't follow. It's not multiple files. Could you clarify please?
     
  4. andypski

    Regular

    Joined:
    May 20, 2002
    Messages:
    584
    Likes Received:
    28
    Location:
    Santa Clara
    In the (somewhat limited) testing I've done with PVRTC at 4bpp it's an interesting matchup when compared to S3TC - the blocking artifacts of S3TC are naturally not present with this method. On photographic-style data sets the most noticeable artifacts are some modulation around colour boundaries (introducing some noise that can look a bit like ringing or dithering) and a tendency to blur fine details - not unexpected, and usually not particularly troubling. It's hard to see how it's doing against S3TC in terms of RMS error - I'm not sure that the compressors are actually trying to minimise the same metric. The S3TC compressor I'm using typically targets weighted error based on perceived luminance, and generally does better at this than the PVRTC compressor, but the opposite is true of unweighted error. If I alter the S3TC compressor to minimise unweighted error then it trades blows with PVRTC. Overall I'm not sure that there's a clear winner in RMS terms, with different images seeming to favour one or the other technique.
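    [Editor's sketch] For readers unfamiliar with the distinction drawn above, here is a minimal comparison of unweighted vs. luminance-weighted RMS error. The Rec.601 weights are an assumption for illustration; the post doesn't say which weighting its S3TC compressor actually uses:

```python
import math

def rms(orig, comp, weights=(1.0, 1.0, 1.0)):
    """RMS error between two lists of RGB tuples, with per-channel weights."""
    total = sum(w * (a - b) ** 2
                for o, c in zip(orig, comp)
                for w, a, b in zip(weights, o, c))
    return math.sqrt(total / (len(orig) * 3))

orig = [(200, 100, 50), (10, 240, 30)]
comp = [(198, 104, 40), (12, 236, 33)]

unweighted = rms(orig, comp)
weighted = rms(orig, comp, weights=(0.299, 0.587, 0.114))  # Rec.601 luma
print(unweighted, weighted)
```

    A compressor minimising one of these metrics can easily lose to another compressor when scored on the other, which is the comparison difficulty the post describes.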

    It does have trouble with some types of image data other than photographic (as does S3TC in different ways) - if you have an image with clear colour boundaries they tend to become smeared, which could be undesirable. It's certainly a very interesting alternative to S3TC style encoding, and I'm not sure how much headroom is left with future improvements to the compressor - maybe quite a lot...? Writing a high-quality compressor for this format certainly seems to be a very different class of problem than writing a good S3TC compressor, so I expect there's quite a learning curve to get good results.

    At 2bpp the smearing and modulation artifacts start to become pretty bad, but it is only 2bpp after all, and the quality on some images should be usable. In some environments you might not notice the low quality (small screens...). Then again, you could just take the 4bpp version and reduce the dimensions by 50% - it might not look much worse.
     
    Jawed likes this.
  5. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    Weird; in theory it can do hard (aliased) boundaries between two color planes almost perfectly (ignoring the color quantization). I have never been entirely convinced by the initial estimate of A&B, or by the validity of the iterative procedure used to improve them in the original paper ... but I have no idea where their present compressor is now.
     
  6. andypski

    Regular

    Joined:
    May 20, 2002
    Messages:
    584
    Likes Received:
    28
    Location:
    Santa Clara
    Yes - it is strange. I wasn't particularly expecting this sort of artifact - maybe a compressor issue.
     
  7. DudeMiester

    Regular

    Joined:
    Aug 10, 2004
    Messages:
    636
    Likes Received:
    10
    Location:
    San Francisco, CA
    You wouldn't do that. I propose storing the palette in shader constant arrays. The reason you would limit your palette is performance, not memory packing. Afaik, the constant arrays should be mostly loaded into registers, so memory consumption is the concern here, not bandwidth. Thus, you want to limit how many colours you need as much as possible. I guess you could say that you would pack the palette into 128-bit FP 4-component vectors. However, this would be entirely separate from the texture itself, which would just be a single-channel index table. Therefore, lookup would be trivial: read the texture, look up the corresponding colour in the constant array, and render.

    For more compression, you could arrange the colours in the palette by grouping similar colours. This would allow for a margin of error in the index value you read from the texture, since being off by +/- 3 or 4 values shouldn't be too horrible. In turn, this allows you to use some basic single-channel compression. Of course, this may not work on all textures, but even without it you achieve significant compression. After all, while 8 bits would give you only 256 colours, which is useful enough, 16 bits gives you 65,536 potential colours, which is far more than necessary or than the hardware can provide. Thus at minimum you get 8:1 compression (128-bit to 16-bit). Most textures vary little in colour, like brick or dirt, so you shouldn't need a huge number of colours, and 8 bits should work there. After all, people still use GIFs for a reason. You could also pack 2 or 3 texture palettes into one array, and share palettes between textures, as further optimisations.
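    [Editor's sketch] The "group similar colours" idea amounts to ordering the palette so that a small index error still lands on a similar colour. Sorting by luminance is one simple choice for illustration; a real tool might cluster in a perceptual colour space instead:

```python
def luma(c):
    """Rec.601 luminance of an RGB tuple (one simple similarity key)."""
    r, g, b = c
    return 0.299 * r + 0.587 * g + 0.114 * b

palette = [(255, 0, 0), (0, 0, 64), (250, 10, 5), (4, 2, 60), (128, 128, 128)]
ordered = sorted(palette, key=luma)

# After sorting, neighbouring indices hold similar colours, so an index
# that is off by +/-1 decodes to something close to the intended colour.
print(ordered)
```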

    The way I see it, full 32-bit textures are only necessary for very colourful and moderately noisy textures, of which there are a limited number. With low noise, the error from compressing the index texture will usually result in picking like colours anyway, and with high noise you won't notice the error (or much of anything, really). The worst problem would be potential banding on smooth textures, but even then you can use bilinear filtering as an input to blend between palette colours (use the fractional part of the sample as the blending factor). Come to think of it, if you have a well-ordered and smooth palette you might be able to use AF too, but I'm not sure. Also, if you notice a texture has a very regular palette, it may be possible to replace it with a palette generated by a function (it could be a complex waveform if you have the GPU power). A sort of pseudo-procedural texture, which would free up registers for other palettes or things of interest. There's a host of other possibilities I'm sure, and this is all rendered in 128-bit FP colour.

    EDIT: In PS I took a diagonal rainbow gradient at 1024x1024, converted it to a 256-colour GIF with random selective dithering, then applied a 1 px Gaussian blur to approximate bilinear filtering, and I'll be damned if it doesn't somehow look smoother than the original! Even a pure red/green gradient and some game screens I have look great.
     
    #27 DudeMiester, May 10, 2006
    Last edited by a moderator: May 10, 2006
  8. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,563
    Likes Received:
    171
    Location:
    In the Island of Sodor, where the steam trains lie
    I wasn't concerned about the storage of the palette - that's meant to be an insignificant part of the texture data and so it can be padded. I was referring to the texture indices. 11 bits per pixel does not pack very nicely into power-of-2 words! :???:


    Yes - it needs some looking at. :???:
     
  9. DudeMiester

    Regular

    Joined:
    Aug 10, 2004
    Messages:
    636
    Likes Received:
    10
    Location:
    San Francisco, CA
    So? Then pad the indices to 16 bits, distribute the values over the added range, throw on some compression (DXN or CTX1 from the Xenos docs), and in the shader rectify the errors induced by the compression. It should work, assuming the errors are not too tremendously large. You would have a margin of error of +/- 16, and by grouping colours as I suggest, you could double or triple the acceptable error.

    In the PS you would have something like this for a nicely filtered paletted texture:
    Code:
    //Get the index from the texture. Bilinear filtering should be OK to enable.
    float2 IndexFS = tex2D(IndexTexture, coords);
    //Recombine the components, scaling to a max of 2048, which should mitigate compression/filtering error.
    float IndexF = (IndexFS.x + IndexFS.y) * 1024;
    //Get the integer indices on either side.
    int Index1 = floor(IndexF);
    int Index2 = ceil(IndexF);
    //Blend the palette colours together, where Palette is a constant array and frac(IndexF) is the blending factor.
    float4 Colour = Palette[Index1] + (Palette[Index2] - Palette[Index1]) * frac(IndexF);
    Of course, 2048 is a totally arbitrary number. You could just as well use 256, which fits nicely into a single 8-bit integer texture (DXT3A or DXT5A compression possible). Although I would imagine that compression here wouldn't be such a good idea, because there is much less tolerance. Still, for textures with uniform colouring it may be possible.

    I'm not saying that this will work all the time. Rather, I'm saying each texture should be considered on its own for optimum compression. Different textures will have different needs, so going for a "universal" compression scheme will limit your results. Of course, that doesn't mean you can't make an app that analyses a texture and determines the optimum compression, be it a palette or some other system.
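    [Editor's sketch] The blend arithmetic in the shader snippet above can be sanity-checked on the host. This sketch assumes a placeholder greyscale ramp palette; the floor/ceil/frac logic mirrors the shader:

```python
import math

# Placeholder palette: a 2049-entry greyscale ramp (an assumption for testing).
PALETTE = [i / 2048.0 for i in range(2049)]

def sample(index_f: float) -> float:
    """Mirror of the shader: lerp between the two neighbouring palette entries."""
    i0 = math.floor(index_f)
    i1 = math.ceil(index_f)
    t = index_f - i0          # frac(IndexF)
    return PALETTE[i0] + (PALETTE[i1] - PALETTE[i0]) * t

# A fractional index blends linearly between adjacent entries.
print(sample(100.25))  # -> 100.25 / 2048, between PALETTE[100] and PALETTE[101]
```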
     
    #29 DudeMiester, May 10, 2006
    Last edited by a moderator: May 10, 2006
  10. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,563
    Likes Received:
    171
    Location:
    In the Island of Sodor, where the steam trains lie
    Right.... so where is the (significant) compression? You've now got 16 bits/texel + 2k*32 bits of palette. If all the texels of, say, a 1kx1k texture are used in a single shader application, then you're achieving close to 16 bits/texel (so a moderate 50% compression), but if in a render there are only, say, 2000 texels being accessed, then you're achieving ~50 bits/texel, i.e. 1.5x expansion :shock:
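    [Editor's sketch] Spelling out that accounting, with the numbers as stated in the post (16-bit indices, a 2048-entry palette of 32-bit colours):

```python
INDEX_BITS = 16
PALETTE_BITS = 2048 * 32  # fixed palette overhead, amortised over accessed texels

def bits_per_texel(texels_accessed: int) -> float:
    return INDEX_BITS + PALETTE_BITS / texels_accessed

print(bits_per_texel(1024 * 1024))  # ~16.06 bits/texel: whole 1kx1k texture used
print(bits_per_texel(2000))         # ~48.8 bits/texel -> the "~50 bits" expansion
```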
     
  11. DudeMiester

    Regular

    Joined:
    Aug 10, 2004
    Messages:
    636
    Likes Received:
    10
    Location:
    San Francisco, CA
    I don't think that would happen that often. If you shared palettes, you would minimise the added load. Also, the palettes would be 128-bit, not 32-bit. Given that only 256 colours is OK for many images, you could definitely share a 2048-colour palette between many textures. Finally, I'm talking in reference to what DX10 offers, where constant arrays are stored separately and independently from the shaders. This way, if you have a series of shaders that use it, there is no additional BW cost.

    Of course, if you only have one texture that is displayed over a small area of the screen, then you won't use paletting. However, how often do you have only one small compressed texture in a scene? If that were the case, then I don't know why you'd bother to compress anything at all, lol. No, I imagine you would have many textures on the screen sharing a small set of palettes. If there was still a problem, you would use LOD to enable paletting only on foreground objects and their textures. Memory consumption would be a bit higher, but not too much, since the distant textures are lower resolution. This way you maximise the savings, giving you 8 bits per pixel of BW/storage (if you use DXN compression on a 16-bit index) on a 128-bit source texture. By my calculations, you could store about 200 such textures in 201MB of RAM (assuming 1 palette per 10 textures). If that's not good compression, then I don't know what is. (Maybe I really don't, though, heh.)
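    [Editor's sketch] The 201MB figure checks out under the stated assumptions: 1024x1024 textures at 8 bits/pixel after index compression, 128-bit palette entries, 2048 entries per palette, and one palette shared per 10 textures:

```python
MB = 1024 * 1024
texture_bytes = 1024 * 1024 * 1  # 8 bits/pixel -> 1 MB per index texture
palette_bytes = 2048 * 16        # 128-bit entries -> 32 KB per palette

textures = 200
palettes = textures // 10        # one shared palette per 10 textures

total_mb = (textures * texture_bytes + palettes * palette_bytes) / MB
print(total_mb)  # 200.625 -> roughly the 201MB quoted
```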
     
  12. Xmas

    Xmas Porous
    Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    3,344
    Likes Received:
    176
    Location:
    On the path to wisdom
    Filtering of palette indices will give you ugly artifacts and aliasing. Imagine having two neighboring texels with indices from each end of the palette. If you move a polygon with this texture in tiny sub-pixel-steps, you will cycle through the whole palette.
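    [Editor's sketch] The artifact described above is easy to simulate: linearly filtering the indices of two texels taken from opposite ends of a 256-entry palette visits every intermediate palette entry as the sample point moves between them:

```python
left_index, right_index = 0, 255  # neighbouring texels, opposite palette ends

def filtered_index(t: float) -> int:
    """The index bilinear filtering would produce at blend factor t."""
    return round(left_index + (right_index - left_index) * t)

# Sweep the sample point between the two texels in tiny sub-texel steps.
visited = {filtered_index(i / 1000) for i in range(1001)}
print(len(visited))  # 256 -> every palette entry appears during the sweep
```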
     
  13. DudeMiester

    Regular

    Joined:
    Aug 10, 2004
    Messages:
    636
    Likes Received:
    10
    Location:
    San Francisco, CA
    I see what you mean, heh. That seems obvious now. I was thinking about it, and the best balance I can come up with is something like this:

    You take a 1024x1024 8-bit index texture and reduce it to a 512x512 8:8:8:8 texture, where the components are indices to the colours the reduced pixel replaced. Then in the shader, you calculate where the current pixel would be located relative to the 4 original texels represented in the reduced index texture, and blend appropriately. You would still have some aliasing between the 2x2 blocks on high-contrast textures, but the pseudo-bilinear filtering you do get only requires one texture lookup. For textures without much colour variation it should be fine (but so would direct interpolation of indices), and for contrasting textures the aliasing would be somewhat limited. Kind of like how 4xAA looks on polygon edges, I would think. Of course, this wouldn't work if you needed 16-bit indices, but that should be fairly rare. Still, you could pair indices and do two texture lookups for some improvement.
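    [Editor's sketch] The data layout proposed above can be sketched in a few lines: each 2x2 block of 8-bit indices is folded into one RGBA8 texel of a half-resolution texture. The helper name is hypothetical; this is just the packing step, not the shader-side blend:

```python
def pack_indices(indices, w, h):
    """Pack an h x w grid of 8-bit indices into an (h//2) x (w//2) RGBA grid."""
    packed = []
    for y in range(0, h, 2):
        row = []
        for x in range(0, w, 2):
            # R,G = top-left, top-right; B,A = bottom-left, bottom-right.
            row.append((indices[y][x], indices[y][x + 1],
                        indices[y + 1][x], indices[y + 1][x + 1]))
        packed.append(row)
    return packed

print(pack_indices([[0, 1], [2, 3]], 2, 2))  # -> [[(0, 1, 2, 3)]]
```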

    Obviously, it's not usable everywhere, but neither are other types of compression. I'm sure it has its place somewhere. Certainly it's better than rendering 128-bit FP textures directly, so there's one definite avenue.
     
    #33 DudeMiester, May 10, 2006
    Last edited by a moderator: May 10, 2006
  14. Idiot_stupid_head

    Newcomer

    Joined:
    Dec 27, 2006
    Messages:
    12
    Likes Received:
    0
    Is DXT adaptive? By adaptive I mean applying different compression ratios to different blocks of the same texture, thus achieving the maximum C : Q ratio.

    DXT / S3TC is kinda old; it compresses textures in a similar fashion to JPG. Is it OK by today's standards, or are there newer compression schemes being developed?
     
  15. mboeller

    Regular

    Joined:
    Feb 7, 2002
    Messages:
    923
    Likes Received:
    3
    Location:
    Germany
    TREC maybe ;)
     
  16. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,563
    Likes Received:
    171
    Location:
    In the Island of Sodor, where the steam trains lie
    No, and that's what makes it good as a texture compression method as opposed to a generic image compression method. Have a look at the first section of this paper on why this is a desirable feature.
    No, it doesn't really have that much in common with JPEG.
    Yes there are other compression methods "out there" and being developed.
     
  17. Dave

    Newcomer

    Joined:
    Jan 31, 2002
    Messages:
    167
    Likes Received:
    3
    Whatever happened to the free TC that 3dfx introduced? FXT1 had some definite advantages back in the day over S3TC, plus it was free. I'm surprised nobody ever picked it up. Then again, maybe something about NVIDIA buying the IP complicated that.

    -Dave
    (showing signs that I've been out of the loop)
     
  18. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,563
    Likes Received:
    171
    Location:
    In the Island of Sodor, where the steam trains lie
    IMHO, it probably infringes on S3TC's patent so "free" is not exactly the word I would use.
     
  19. Dave

    Newcomer

    Joined:
    Jan 31, 2002
    Messages:
    167
    Likes Received:
    3
    If I recall correctly there was prior work in the area upon which both were built. But I suspect with 3dfx no longer around to defend that, nobody would want to risk it.

    -Dave
     
  20. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,563
    Likes Received:
    171
    Location:
    In the Island of Sodor, where the steam trains lie
    If you refer to CCC (don't have the ref to it off hand but it is in the paper I linked to below), then I would argue that S3TC is a valid improvement on it.
     