Speed of Decompression

Kelemit

Newcomer
I had a wild idea, and have no idea if it would actually work or not.

Most programs (games especially) come off the disk compressed and are decompressed into RAM to be used as needed. But the decompression and loading mostly happen during load screens (or in the background if the game is streaming).

So, my question is this...

How fast can decompression algorithms work?

Why?

Because, is it possible to keep data in RAM compressed, effectively increasing your memory pool, by decompressing data when it's requested, using it, and then (if the data changed) recompressing it back into the memory pool?

Is there any compression / decompression technique that is fast enough, and compresses well enough, to make such a thing worth it?
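
Something like this is what I'm imagining, just as a rough sketch. I'm using zlib's compress2/uncompress here purely as an example, and the pool wrapper and its names are made up:

// Rough sketch only: keep blobs deflated in RAM, inflate on request, and
// re-deflate if the caller changed the data. Compile with -lz.
#include <map>
#include <vector>
#include <zlib.h>

struct CompressedPool {                            // hypothetical wrapper, not a real API
    std::map<int, std::vector<Bytef>> blobs;       // id -> compressed bytes
    std::map<int, uLong> rawSizes;                 // id -> original (uncompressed) size

    void store(int id, const Bytef* data, uLong size) {
        uLongf packedSize = compressBound(size);
        std::vector<Bytef> packed(packedSize);
        compress2(packed.data(), &packedSize, data, size, Z_BEST_SPEED);
        packed.resize(packedSize);                 // keep only what compression produced
        blobs[id] = packed;
        rawSizes[id] = size;
    }

    std::vector<Bytef> fetch(int id) {             // decompress on request
        uLongf rawSize = rawSizes[id];
        std::vector<Bytef> out(rawSize);
        uncompress(out.data(), &rawSize, blobs[id].data(), blobs[id].size());
        return out;
    }
};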

Might be a stupid idea. If it is... oh well.

But if it isn't... hopefully some people can use it.

Kelemit
 
I thought that this was already happening: that GPUs could read compressed textures and didn't need to have them uncompressed before use, but could do it on the fly...
 
I thought that this was already happening: that GPUs could read compressed textures and didn't need to have them uncompressed before use, but could do it on the fly...
The compressed textures that the GPU reads generally use a fixed-rate compression scheme. A higher level of compression can be achieved by (also) applying the techniques used in gzip etc.

I presume the OP was referring to the latter.
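
To make the difference concrete: with a fixed-rate format like DXT1/S3TC every 4x4 texel block is exactly 8 bytes, so the hardware can compute where any texel lives with simple arithmetic, without touching anything that comes before it. A small illustrative helper (not real driver code):

#include <cstddef>

// Byte offset of the DXT1 block containing texel (x, y) -- 8 bytes per 4x4 block.
size_t dxt1BlockOffset(int x, int y, int widthInTexels) {
    const int blocksPerRow = (widthInTexels + 3) / 4;
    return (static_cast<size_t>(y / 4) * blocksPerRow + (x / 4)) * 8;
}

// A gzip-style variable-rate stream has no such formula: you generally have to
// decode from a known sync point up to the data you want.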
 
I don't have references on hand, but I'll try to think through some reasons why this is hard to use in the general case.

First, it will have to be lossless compression, otherwise your machine will wreck everything in short order.

Let's say we use a simple dictionary-based compression scheme.
From a computational standpoint, decompression is about linear, I think.
See a symbol in the compressed data, look it up in the dictionary.
It scales linearly, but the scaling factor is at least 2, since you need to read two pieces of data to get at the one thing you are actually after.
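
As a toy illustration of that two-reads-per-symbol cost (a completely made-up format, not any real codec):

#include <cstdint>
#include <string>
#include <vector>

// The "compressed" data is just a list of 16-bit indices into a dictionary of
// byte strings. Work per symbol is constant, so the whole thing is linear, but
// every symbol costs one read of the stream plus one read of the dictionary.
std::string decompress(const std::vector<uint16_t>& stream,
                       const std::vector<std::string>& dictionary) {
    std::string out;
    for (uint16_t symbol : stream) {       // read 1: the compressed symbol
        out += dictionary[symbol];         // read 2: its dictionary entry
    }
    return out;
}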

Things get more complicated if the dictionary is too big to cache well, or if the architecture of the machine is too narrow in the memory pipeline or too lean on execution resources to handle what is at least a doubling of the computational load per data access.

That doesn't necessarily mean everything will take twice as long, but unlike reading compressed files from a hard disk, there is a lot less idle time to hide the extra work in when reading from RAM.

For compression, a dictionary method will have to scan through the data set at least once.
This is a problem simply because a single write is almost always much smaller than the data file you are writing to. It means that frequently written data cannot be compressed economically; only rarely used and read-only data files are really worth compressing.

In doing a scan, the compressor will accumulate a set of candidate patterns that will make up the dictionary. This happens in memory, and depending on the data being compressed, the number of candidates can be large. It will likely be more than one, so the amount of work per write is increased by at least the number of candidates, on top of the scan that came before.

For compression to work, the candidates must be sorted by the number of times they are encountered in the data set. Sorting takes on the order of N log N steps.
The best case is if there is only one candidate; a write then only doubles the work involved, taken in isolation from all the other work already done. Otherwise, things get worse than a simple multiple of the size of the data.

After this sorting, the data must be reread, compared against the final dictionary, and rewritten. That work scales with the data set's size, and even if it were the only work involved, it would still at least double the total.
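
A toy version of that compression side, just to make the passes visible (fixed 4-byte chunks as the candidate patterns, to keep it simple; nothing here resembles a production compressor):

#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Pass 1: scan the data and count candidate patterns. Then sort candidates by
// frequency (~N log N) and keep the most common ones as the dictionary.
// A second full pass (not shown) would re-read the data and emit indices.
std::vector<std::string> buildDictionary(const std::string& data, std::size_t maxEntries) {
    std::map<std::string, std::size_t> counts;
    for (std::size_t i = 0; i + 4 <= data.size(); i += 4)
        ++counts[data.substr(i, 4)];

    std::vector<std::pair<std::string, std::size_t>> candidates(counts.begin(), counts.end());
    std::sort(candidates.begin(), candidates.end(),
              [](const std::pair<std::string, std::size_t>& a,
                 const std::pair<std::string, std::size_t>& b) { return a.second > b.second; });

    std::vector<std::string> dictionary;
    for (std::size_t i = 0; i < candidates.size() && i < maxEntries; ++i)
        dictionary.push_back(candidates[i].first);
    return dictionary;
}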

There's tons of overhead involved in all of this, and there is no guarantee that the combined size of the dictionary and the compressed data will be less than the size of the original plus the size of the compression program.

In addition, the scratch space needed for all the intermediate work will probably be too much to justify the space savings for anything but very large data sets.

In certain special cases where the data is highly compressible and the format and type is known beforehand, it is possible to do a few space-saving tricks or use specialized hardware. General on-demand compression is way out of bounds, though.
 
I have a compression scheme that uses a large "combinations" byte dictionary. Coded in assembler, look-ups should be fast, and looking up a segment address should be faster still. Except the dictionary itself is compressed (very fast and simple) by a means other than conventional algorithms. The compression ratio is 1:210:(recompressed)210:1000 (with a 128 MB dictionary; a 1 TB dictionary increases the ratio to 1:210:210:1,000,000).


Not only is it fast (coded in assembler, few conversion steps, a relatively simple decompression scheme) but it's lossless. If I could pay for an attorney to defend a patent, I would patent this and license it to Nintendo (personal bias), the DVD Forum (5120p @ 60 fps? You would need a movie theater projector to even see most of it), the HD-DVD alliance, DirecTV (decreases cost of transmission), multiple mobile phone companies (decreases cost of transmission), and AT&T. It can also increase WAN bandwidth (decompression needed at the ISP headend), which would drop the cost of transmission and increase profits at the same time.

Of course, like the guy above stated, you need to streamline the compression scheme (hardware optimized for decompression, i.e. a modified CELL processor or a similar, cheaper solution) to get the yields you are expecting.
 
There are some issues I think might apply to reading from a compressed structure. Depending on the implementation, it may be that a desired data element's position cannot be known without decompressing everything from the start of the file up to that element.

It depends on the organization of the data structure and data type, but it could happen.

This can be avoided with some kind of active/inactive compression scheme, where infrequently used files are compressed in memory, while files that have recently been touched remain uncompressed.

Another option is some kind of index file that maps addresses, but that is also dependent on the nature of the data and wouldn't work with data elements that are smaller than the size of their entries in the index.
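
Roughly what I mean by the index option, as a sketch (the block size and names are arbitrary, not from any particular library):

#include <cstdint>
#include <vector>

// Compress the data in fixed-size logical blocks and keep a small table of
// where each compressed block starts. A random read then decompresses only one
// block, at the cost of a slightly worse ratio plus the table itself.
struct BlockIndexEntry {
    std::uint64_t compressedOffset;   // where this block starts in compressedData
    std::uint32_t compressedSize;     // how many compressed bytes it occupies
};

struct CompressedFile {
    static const std::uint32_t kBlockSize = 64 * 1024;   // uncompressed bytes per block
    std::vector<std::uint8_t> compressedData;
    std::vector<BlockIndexEntry> index;                   // one entry per block
    // To read the byte at 'offset': decompress block offset / kBlockSize and
    // take the byte at offset % kBlockSize inside it.
};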


None of this changes the fact that it is quite possible to have data that "compresses" to a size larger than the original, once you count the size of the index file or the compressor/decompressor code.
 
http://en.wikipedia.org/wiki/LZO (or other compression schemes by the same author) would most likely be ideal for this kind of thing. No reason to reinvent the wheel or prove you're a NIH fan! :) Especially since it's unlikely any of us has a billionth of his expertise, at least when it comes to lossless compression schemes. Unless the goal is something other than a serial software implementation, of course. Or if the GPL license isn't suitable for you...

Personally, when it comes to games, I'm a fan of just doing things properly in the first place rather than having to resort to such schemes. But maybe for some things it would indeed make some sense...
 
The cost of most dictionary-based decompressors is comparable to a raw memcpy, since speed is gated by the memory reads/writes rather than the "work" done in between.

But it depends what sort of data you're talking about; having a compressed graphics resource cache is certainly doable and has been done, although how useful it is depends somewhat on the game type.

If you're talking about in-game data structures, most decompression algorithms do not lend themselves to decompression of random data, and most accesses are relatively random.

Compression is also significantly more expensive than decompression.
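
If anyone wants to sanity-check the memcpy comparison on their own machine, something along these lines will do (zlib here purely as a stand-in; a fast LZO-class codec should land much closer to memcpy than inflate does). Compile with -lz:

#include <chrono>
#include <cstdio>
#include <cstring>
#include <vector>
#include <zlib.h>

int main() {
    std::vector<Bytef> raw(16 * 1024 * 1024, 'a');              // very compressible test data
    uLongf packedSize = compressBound(raw.size());
    std::vector<Bytef> packed(packedSize);
    compress2(packed.data(), &packedSize, raw.data(), raw.size(), Z_BEST_SPEED);

    std::vector<Bytef> out(raw.size());
    auto t0 = std::chrono::steady_clock::now();
    std::memcpy(out.data(), raw.data(), raw.size());            // baseline: raw copy
    auto t1 = std::chrono::steady_clock::now();
    uLongf outSize = out.size();
    uncompress(out.data(), &outSize, packed.data(), packedSize); // decompress the same amount
    auto t2 = std::chrono::steady_clock::now();

    std::printf("memcpy: %lld us, uncompress: %lld us\n",
        (long long)std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count(),
        (long long)std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count());
    return 0;
}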
 
Would there be any particular advantage to using these types of compression schemes on the local store of an SPE? As well as negating the latency involved in accessing main memory, you could potentially simplify certain types of work by having an increased local address space.
 
Well, in the real world this is impractical due to memory allocation of the recompressed data: the newly compressed size can differ, so a reallocation of the data is quite possible, and in the worst case a reorganization of memory. The only practical approach is to store non-alterable data compressed in memory and decompress it into a cache in real time when needed, but for this a modified cache and MMU system would be required, with a good quantity of cache, and it would only be useful for textures, images, audio and objects, not for typical data. For this, range encoding can be very convenient; the ratio can be low in some scenarios (around 1:0.8). For example, on the Wii the memory gain could be around 12 MB, which is small but genuinely useful.

Sorry for my bad English.
 
http://en.wikipedia.org/wiki/LZO (or other compression schemes by the same author) would most likely be ideal for this kind of thing. No reason to reinvent the wheel or prove you're a NIH fan! :) Especially since it's unlikely any of us has a billionth of his expertise, at least when it comes to lossless compression schemes. Unless the goal is something other than a serial software implementation, of course. Or if the GPL license isn't suitable for you...

I imagine most (all?) game companies prohibit the use of GPL and LGPL code in their games. Other open source licenses like BSD are more acceptable, however.
 
I imagine most (all?) game companies prohibit the use of GPL and LGPL code in their games. Other open source licenses like BSD are more acceptable, however.

LZO is available relatively cheaply for commercial use; there are big-name games out there using it.
 
I imagine most (all?) game companies prohibit the use of GPL and LGPL code in their games. Other open source licenses like BSD are more acceptable, however.
GPL, probably, but surely LGPL should be fine.
 
LGPL on a console game would be almost impossible... as you need to include the LGPL license as well as give access to the LGPL source code used. For PC software it's fairly easy, but on a console, not so easy (OK, you could have a splash screen full of LGPL stuff that contains all the required details, but that would be annoying).
 
LGPL on a console game would be almost impossible... as you need to include the LGPL license as well as give access to the LGPL source code used. For PC software it's fairly easy, but on a console, not so easy (OK, you could have a splash screen full of LGPL stuff that contains all the required details, but that would be annoying).
Surely one could just put this in a page of the documentation.
 
If you're talking about in-game data structures, most decompression algorithms do not lend themselves to decompression of random data, and most accesses are relatively random.
Indeed, one might be so bold as to say that no algorithm can consistently compress random data in a lossless manner (see Shannon's work on information theory). All compression exploits patterns in the data, be it textures, video, code, or text; that is why there are specialized compression algorithms for each. For all intents and purposes, you can assume that random data is incompressible.
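
For anyone who wants the short counting argument behind that: there are 2^n distinct files of n bits, but only 2^n - 1 files shorter than n bits, so no lossless scheme can shrink every possible input. Whatever it saves on structured data, it has to give back (and then some) on inputs that look random.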
 