Have Palettes become obsolete?

Himself

There is a rarely used desktop mode in which the display is a bitmap of indices ranging from 0-255 (one byte each) into an array of 3- or 4-byte colour values. If you wish to change every pixel of a given colour on the display, you can do it with one write to the palette, instead of scanning all the pixels and doing a comparison, a read, and a write for each. The main problem has been that 256 colours are not enough to go around when you have many display elements; Windows switches to the palette of the active window and generally forces all windows to share a subset of colours.
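
Roughly, the difference looks like this (a minimal sketch; the buffer names and sizes are made up for illustration, not any real API):
[code]
unsigned char framebuffer[640 * 480];  /* paletted mode: one index per pixel */
unsigned int  palette[256];            /* 0x00RRGGBB entries                 */

/* Paletted route: one write, no matter how many pixels use the entry. */
void recolour_via_palette(int entry, unsigned int new_rgb)
{
    palette[entry] = new_rgb;   /* every pixel with this index changes */
}

/* Direct-colour route: scan, compare, read and write each matching pixel. */
void recolour_via_scan(unsigned int *rgb_buffer, int num_pixels,
                       unsigned int old_rgb, unsigned int new_rgb)
{
    int i;
    for (i = 0; i < num_pixels; i++)
        if (rgb_buffer[i] == old_rgb)
            rgb_buffer[i] = new_rgb;
}
[/code]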

What if each pixel were a two-byte index into a palette of 4-byte entries? You could fetch twice as many pixels per read, but you have to read the palette entries anyway, making the advantage moot; you are also accessing non-sequential parts of RAM, another inefficiency. The benefit would be on the side of changing large sections of the display with one write. Overall, though, you can see why such a mode isn't on the roster: it is just as much work, and direct colour gives you more control over individual pixels. There is also 16-bit mode, which many consider pretty good, but I find it really has limited colour values, which would be neatly solved if each pixel were an index into full-precision colours.

OK, the thought came to me when considering larger colour values of 64 bits and up: wouldn't the balance of savings and features shift back to palettes? From what I understand, textures larger than 256x256 are rarely used in today's games, and future games are using finer detail. If you wanted textures with greater than 32-bit precision, or with a large alpha channel, it seems to me a palette would be handy; with a two-byte index, a 256x256 texture would have a palette entry per pixel. There is probably something like that already in existence, but I wouldn't know. :) The problem has become one of filling larger and larger areas of the display, between higher desktop resolutions, multiple monitors, and Windows heading towards more and more eye candy; it seems to me a reconsideration of palettes would solve some problems.

Of course, it would probably never be an option; any new idea has to fit into the current legacy of the PC. But this seems to be a forum for discussing random ideas, so I thought I'd ramble on. :)
 
Palettes are typically handled in hardware, using a built-in LUT (look-up table).
Calls to set the palette are translated by the driver into writes into the hardware LUT. Then, when the indexed data is fetched from the framebuffer, the DAC depalettizes it using the LUT. This has no impact on memory access, as the LUT is on-chip, so you really do read one byte per pixel.
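
As a rough software model of that path (purely illustrative; real hardware does this during scanout, not in a loop like this):
[code]
/* Toy model of the scanout path: the framebuffer read is one byte per
   pixel; the palette lookup happens in the on-chip LUT. */
void scanout_line(const unsigned char *indices, int width,
                  const unsigned int lut[256], unsigned int *dac_out)
{
    int x;
    for (x = 0; x < width; x++)
        dac_out[x] = lut[indices[x]];   /* no extra framebuffer traffic */
}
[/code]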

Expanding the LUT to be indexed with two bytes gives you 64K unique entries, so you could represent images with greater color accuracy, but given how cheap memory is at the moment this seems like too much work.

You also cannot do arithmetic operations on the data in the framebuffer if it is palettized, as the colors for a given entry are completely arbitrary. Standard 16 bpp can be (and is) used in blending and arithmetic operations.

Textures are (I think) getting bigger and bigger.

Windows is heading towards everything running through the 3D pipe, which requires arithmetic operations on the texture values. The only way to handle this would be for the 3D hardware to depalettize the data and do the operation; it would then have to either quantize the result to the existing palette, or modify the palette to allow for the new color value, as the result of the arithmetic operation is certainly not guaranteed to already exist even in your expanded palette.


Interesting idea though. :)


CC.
 
Captain Chickenpants said:
Windows is heading towards everything running through the 3D pipe, which requires arithmetic operations on the texture values. The only way to handle this would be for the 3D hardware to depalettize the data and do the operation; it would then have to either quantize the result to the existing palette, or modify the palette to allow for the new color value, as the result of the arithmetic operation is certainly not guaranteed to already exist even in your expanded palette.

This would not theoretically be too difficult to perform on shader hardware (with support for 8 bit render output) provided you know the palettes in advance, or better yet have a global palette - in this case you just need to construct (in the case of 8 bit palettes) a 256x256 texture to act as a shade table into which you can do a dependent lookup for the result of a given operation. The table automatically maps the result of the calculation back to the closest palette entry to the 'correct' one.

You then produce versions of this table for each possible operation - ADD, SUB, MUL etc.
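
A rough CPU-side sketch of precomputing such a shade table (the helper names here are invented for illustration; a shader would then do a dependent read into the resulting 256x256 texture using the two source indices as coordinates):
[code]
typedef struct { unsigned char r, g, b; } rgb_t;

/* Brute-force nearest palette entry by squared RGB distance. */
static int nearest_entry(const rgb_t pal[256], int r, int g, int b)
{
    int best = 0, best_d = 0x7fffffff, i;
    for (i = 0; i < 256; i++) {
        int dr = pal[i].r - r, dg = pal[i].g - g, db = pal[i].b - b;
        int d  = dr * dr + dg * dg + db * db;
        if (d < best_d) { best_d = d; best = i; }
    }
    return best;
}

static int clamp255(int v) { return v > 255 ? 255 : v; }

/* table[a][b] = palette index closest to colour(a) + colour(b). */
void build_add_table(const rgb_t pal[256], unsigned char table[256][256])
{
    int a, b;
    for (a = 0; a < 256; a++)
        for (b = 0; b < 256; b++)
            table[a][b] = (unsigned char)nearest_entry(pal,
                              clamp255(pal[a].r + pal[b].r),
                              clamp255(pal[a].g + pal[b].g),
                              clamp255(pal[a].b + pal[b].b));
}
[/code]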
 
This would not theoretically be too difficult to perform on shader hardware (with support for 8 bit render output) provided you know the palettes in advance, or better yet have a global palette - in this case you just need to construct (in the case of 8 bit palettes) a 256x256 texture to act as a shade table into which you can do a dependent lookup for the result of a given operation. The table automatically maps the result of the calculation back to the closest palette entry to the 'correct' one.

You then produce versions of this table for each possible operation - ADD, SUB, MUL etc.


This requires doing a lot of calculations any time someone alters even one palette entry, which kind of defeats the original suggestion of using palettes.

This also assumes that the data you are using is always palettized; if one of the textures is not palettized then the table approach does not work.

CC
 
Captain Chickenpants said:
This would not theoretically be too difficult to perform on shader hardware (with support for 8 bit render output) provided you know the palettes in advance, or better yet have a global palette - in this case you just need to construct (in the case of 8 bit palettes) a 256x256 texture to act as a shade table into which you can do a dependent lookup for the result of a given operation. The table automatically maps the result of the calculation back to the closest palette entry to the 'correct' one.

You then produce versions of this table for each possible operation - ADD, SUB, MUL etc.


This requires doing a lot of calculations any time someone alters even one palette entry, which kind of defeats the original suggestion of using palettes.

This also assumes that the data you are using is always palettized; if one of the textures is not palettized then the table approach does not work.

CC

I did point out in the post that you have to know the palettes in advance (this is inherently a static scheme). Dynamic manipulation of simulated palettes is not easy, and displaying the results of intermingling multiple different non-static palettes really requires promotion of the on-screen results to a non-indexed mode.

Attempting to display unpaletted data on a paletted screen would be somewhat strange behaviour, and very difficult to do quickly.

Fundamentally the time for palettes _is_ probably past, except for some rather esoteric uses like vector map compression, and even for these the quality tradeoff is not good.
 
fresh said:
That's all anybody uses on the PS2.

Not surprising - it doesn't have any better form of compression, and it doesn't have enough texture memory to use higher bit-depth textures. This isn't a case of palettes being any good - just that they're the only thing available.

On the vast majority of typical texture images DXTC provides better results at 1/2 the size of 8 bit paletted textures, so palettes are largely obsolete for texture compression on all the other game consoles and PC hardware.
 
Apple, Sun, and SGI all did some cool tricks with 8-bit and 4-bit paletted textures, back in the day. If you build lookup tables, and dither, you can actually do a lot of color space conversions and 3D math with paletted textures.

But these days, with cheap RAM, it doesn't make much sense.

With next-generation hardware (9700, NV30) we're seeing pixels getting larger, not smaller. The future is going to be floating point pixels.
 
"Palette" is nothing more than a compression method.
With 1.9M pixels on screen in 1600x1200 mode, and 32-bit color allowing for about 4,300M different color values, a palette-based compression scheme would obviously make some sense. It's just that with traditional palette implementations multipass rendering would become prohibitively expensive. Blending ops would perhaps be possible with a dynamic palette size, i.e. if the result colour of an op is not found in the palette then a new entry is created; a toy sketch of that idea follows.
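
A toy sketch of the "grow the palette on demand" idea (names invented for illustration; it sidesteps the hard questions of what happens when the palette fills up, and the exact-match search is itself the "reverse" palette op discussed below):
[code]
/* 50/50 mix of two packed 0x00RRGGBB colours. */
static unsigned int blend_rgb(unsigned int x, unsigned int y)
{
    return ((x & 0xfefefeu) >> 1) + ((y & 0xfefefeu) >> 1);
}

/* Blend two paletted pixels; reuse an existing entry if the result is
   already in the palette, otherwise append a new one. */
int blend_indices(unsigned int *pal, int *pal_size, int max_size, int a, int b)
{
    unsigned int result = blend_rgb(pal[a], pal[b]);
    int i;
    for (i = 0; i < *pal_size; i++)      /* "reverse" lookup: linear search */
        if (pal[i] == result)
            return i;
    if (*pal_size < max_size) {          /* not found: create a new entry */
        pal[*pal_size] = result;
        return (*pal_size)++;
    }
    return a;                            /* palette full: punt */
}
[/code]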
Compression works best if the compression method matches the data patterns, i.e. some (lossless) compression algos work best with audio data, some with still image data, etc. Obviously, the compression schemes for Z-buffer, stencil buffer, RGB and alpha channels should not be the same. It's just that selecting the best algos for every situation and implementing all the coders/decoders in HW is pretty much impossible.
Also, to my understanding, "reverse" palette ops (i.e. finding a palette entry in the LUT given RGB values) are not well suited to hardware at all. I can't imagine quicksort and binary search ever being implemented in HW :p
 
Perhaps what I was looking for is more of a bitmap format standard, and possibly driver/hardware support for BitBlts to a higher-level screen buffer. Thinking more of visual styles and colour preferences: changing a palette to make the entire GUI brown or blue would be less expensive than sending messages to all the windows and controls for them to change pixel contents. :)
 
GL_EXT_paletted_texture

Himself said:
Perhaps what I was looking for is more of a bitmap format standard, and possibly driver/hardware support for BitBlts to a higher-level screen buffer. Thinking more of visual styles and colour preferences: changing a palette to make the entire GUI brown or blue would be less expensive than sending messages to all the windows and controls for them to change pixel contents. :)

I think that palettes will be used a lot again for normal map compression, where blocky compression cannot be applied.
OpenGL supports GL_EXT_paletted_texture for using palettised textures.
The biggest problem with paletted textures is that if you use linear filtering, you double the number of memory transfers needed for a single texel fetch (32 fetches for quadrilinearly filtered 3D textures!), and you cannot initiate the LUT fetch until you have done the colour index fetch, so the latency increases (unless you cache the palette on-chip).
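
For reference, a hedged sketch of uploading a palettised texture through that extension (assuming the driver advertises GL_EXT_paletted_texture and the glColorTableEXT entry point has been obtained in the usual way; the buffers here are placeholders):
[code]
#include <GL/gl.h>
#include <GL/glext.h>   /* GL_COLOR_INDEX8_EXT and the EXT prototypes */

void upload_palettised_texture(GLuint tex, int w, int h,
                               const GLubyte palette[256 * 4], /* RGBA8 */
                               const GLubyte *indices)         /* w*h   */
{
    glBindTexture(GL_TEXTURE_2D, tex);

    /* Per-texture colour table: 256 RGBA entries. */
    glColorTableEXT(GL_TEXTURE_2D, GL_RGBA8, 256,
                    GL_RGBA, GL_UNSIGNED_BYTE, palette);

    /* The texel data itself is just the 8-bit indices. */
    glTexImage2D(GL_TEXTURE_2D, 0, GL_COLOR_INDEX8_EXT, w, h, 0,
                 GL_COLOR_INDEX, GL_UNSIGNED_BYTE, indices);
}
[/code]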

EDIT: Added bumpmapping url
 
A palette is only 1 KB; it is likely cached away in some fast memory when texturing is performed (on the PS2 it is). Sure, you gotta do a few more lookups, but you save on texture cache misses because the texture data is smaller.

For most games you can't even tell the difference; as long as the textures aren't too big and you have a unique palette per texture, it looks just as good as 32-bit. We're not talking 8-bit palettes like back in the software rendering days, where all you had was ONE SINGLE palette. Now you can have one per texture, and it's a great compression scheme. You can also do neat stuff like CLUT animations very cheaply.
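
For example, a classic CLUT animation is just palette cycling: rotate a band of entries each frame and every pixel that references them appears to move, without touching a single texel. A rough sketch (the entry range and pacing are made up):
[code]
/* Rotate entries [first, first+count) by one slot; re-upload just
   those entries next frame. */
void cycle_palette(unsigned int pal[256], int first, int count)
{
    unsigned int last = pal[first + count - 1];
    int i;
    for (i = count - 1; i > 0; i--)
        pal[first + i] = pal[first + i - 1];
    pal[first] = last;
}
[/code]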
 
Fresh, can you recommend a good program to convert pictures to 8-bit color? MS Paint did a horrible job on my test picture; it looked like it used a default palette instead of making one optimised for the picture. I also tried the Gimp. Half the test picture was a block of solid color, but the Gimp actually failed to put that color in the palette!! Instead it dithered with three other colors to generate the correct shade.
 
fresh said:
A palette is only 1 KB; it is likely cached away in some fast memory when texturing is performed (on the PS2 it is). Sure, you gotta do a few more lookups, but you save on texture cache misses because the texture data is smaller.

However, if you are dealing with modern PC hardware (say DX9 type) that does multitexturing, and you want to handle all texture types fully orthogonally, then you either have to have static storage for 16 different 1 KB palettes on-chip, or continually dereference the palettes in main memory, possibly through some palette cache. For the PS2, which can only use a single texture at once, this problem doesn't exist.

Palettes are no longer a great compression format because, for almost all types of images, DXTC produces better results at half the storage size without additional dereferences. There are, of course, exceptions where palettes do produce better results (normal maps, as mentioned above).

For most games you can't even tell the difference; as long as the textures aren't too big and you have a unique palette per texture, it looks just as good as 32-bit. We're not talking 8-bit palettes like back in the software rendering days, where all you had was ONE SINGLE palette. Now you can have one per texture, and it's a great compression scheme. You can also do neat stuff like CLUT animations very cheaply.

I think a lot of people would say that you can tell the difference between palettised and 32 bit textures. Certainly there are many who can tell the difference between 16 and 32 bit textures, and this is generally a much smaller leap than 32-bit to palettised.

'As long as the textures aren't too big?'

I want bigger textures - that seems to be one of the major points of using compression in the first place (although perhaps not on architectures with really low memory limits...). With DXTC, generally, the bigger the textures get, the better the visual quality of the compression becomes (when you think about how the compression is performed, the locality of the colourspace is likely to improve when each block covers a smaller percentage of the texture's area).

With palettes the larger the textures become the more compromise has to be made in mapping the limited number of available colours over the whole image. Palettes are very good at compressing small textures, but if they're already small then the advantages become more nebulous.

Palettes are not a 'great' compression scheme - there are manifestly better ones to be had for general texture compression in hardware. They do, however, have some nice secondary properties (such as the clut animation you touch on above)
 
Humus said:
Paint Shop Pro generally does a good job.
It's good, but it doesn't manage the solid color block quite right either. If I select ‘Optimized Octree’ as color reduction method, and turn dithering down to 99%, then the single colored area gets saved as a single color with no dithering, but it gets the RGB values 113, 118, 138 instead of the 113, 118, 139 of the original. Except for that, it did great, so thanks for the suggestion.

no_way said:
http://www.irfanview.com/
Yet another that is good, but still doesn't manage the single color block. It dithered it with two colors. It's a very cool program though, I love it :)

But I did find one program that finally managed to do it: the 'convert' command in Linux (I also know that Solaris has it, so I assume it's standard on most UNIX-like systems).

Oh, and sorry if this post is a little OT.
 
Since we are discussing the relative quality of S3TC/DXTC, the following comparison of texture compression methods pages I wrote a long while ago would be relevant. (One day I might add PVR-TC.)

The biggest problem with palettised (or, more generally, VQ) textures is the indirection needed to access the palette/LUT (look-up table). To get good quality you really want an individual LUT per texture, and to overcome the latency it needs to be "cached". Furthermore, it has to be multi-ported (i.e. expensive) to support texture filtering. The simplest thing to do is to load the entire LUT when you first encounter the texture, but that is very expensive if you only process a few texels out of that texture.

For these reasons there's been a shift toward more block oriented texture compression methods, even though they generally achieve less compression.
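
To make the cost concrete, here is a rough software model of a single bilinearly filtered sample from a palettised texture (types and names invented for the sketch): four index reads, then four LUT reads that cannot start until the indices are back.
[code]
typedef struct { float r, g, b, a; } colour_t;

static float lerp(float a, float b, float t) { return a + (b - a) * t; }

/* Eight dependent accesses per filtered sample: 4 index fetches, then
   4 palette fetches that have to wait for them. */
colour_t bilinear_palettised(const unsigned char *idx, int pitch,
                             const colour_t lut[256],
                             int x, int y, float fx, float fy)
{
    colour_t c00 = lut[idx[ y      * pitch + x    ]];
    colour_t c10 = lut[idx[ y      * pitch + x + 1]];
    colour_t c01 = lut[idx[(y + 1) * pitch + x    ]];
    colour_t c11 = lut[idx[(y + 1) * pitch + x + 1]];
    colour_t out;
    out.r = lerp(lerp(c00.r, c10.r, fx), lerp(c01.r, c11.r, fx), fy);
    out.g = lerp(lerp(c00.g, c10.g, fx), lerp(c01.g, c11.g, fx), fy);
    out.b = lerp(lerp(c00.b, c10.b, fx), lerp(c01.b, c11.b, fx), fy);
    out.a = lerp(lerp(c00.a, c10.a, fx), lerp(c01.a, c11.a, fx), fy);
    return out;
}
[/code]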
 
Thowllly said:
Humus said:
Paint Shop Pro generally does a good job.
It's good, but it doesn't manage the solid color block quite right either. If I select ‘Optimized Octree’ as color reduction method, and turn dithering down to 99%, then the single colored area gets saved as a single color with no dithering, but it gets the RGB values 113, 118, 138 instead of the 113, 118, 139 of the original. Except for that, it did great, so thanks for the suggestion.
The problem with an octree is that the "splitting planes" are always aligned with the R, G, and B axes. You really need to split along arbitrary planes determined by finding the greatest spread of colour. In the Dreamcast VQ compressor (which, in theory, can handle standard palettised images) I used a variant of Wu's colour quantiser (you can find the paper using Google).

[Update]By the sound of your example, it seems that the quantizer might first be reducing the colour precision of the original image, possibly to make the searching faster.[/Update]
 
Obviously palettes or VQ compression would have their uses in modern cards as well; it's just that the implementation needs to be more flexible than old-skool 256-colour palettes.
As an aside, stencil buffers and framebuffer alpha channels would generally yield a pretty high compression rate. I wonder if anybody is doing something in that area? Or does Z-buffer compression already take care of stencil as well?
 