An idea to consider for a new compression app...

NeoCool

Newcomer
Hello all,

Perhaps you all already know WinRAR and WinZip, the efficient tools that take a file and compress it to save disk space? Well, let's say you want to compress a 1-gigabyte file in WinZip or WinRAR. The result would likely be around ~445 MB. Wouldn't it be cool if you could take 2 gigs with some newly written compression program and compress that down to 2 megabytes? Do you guys believe it's possible? (Well, of course it is, but at the moment?) Now, of course I'm not saying I'm going to do it, but what I'm asking is: is there a program like that (I doubt it), and do you guys have any real ideas on how that could be achieved? You've all probably thought of this before. All feedback is greatly appreciated. Thanks. :)

--NeoCool
 
It's not possible. (Though I'd LOVE to be proven wrong... :D)

Compression works by getting rid of redundant data, plus some clever substitution rules (in its simplest form, at least).

That's not to say I can't create a file full of 0's, two gigs in size, and have it compress down to a few hundred KB (not that I've tried)... but that's not very useful, is it?

Quantum computers could possibly change the way we think about compression...

EDIT: Also, there are two types of compression: lossy and lossless. Lossy compression can be applied to analog representations of data, e.g. a picture or a sound, with acceptable results; however, lossy compression on other data (computer data) would be _bad_. :D (You realize your example was asking for a 1000x lossless compression ratio, right?)
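
To put rough numbers on the "file full of 0's" point, here's a throwaway Python sketch (my own illustration, not anything WinZip or WinRAR specifically does) comparing how a general-purpose compressor handles a highly redundant buffer versus the same amount of random bytes:

    import os
    import zlib

    SIZE = 16 * 1024 * 1024          # 16 MB here; scale up if you have the RAM and patience

    zeros = b"\x00" * SIZE           # maximally redundant input
    noise = os.urandom(SIZE)         # effectively incompressible input

    print(len(zlib.compress(zeros, 9)))   # a few tens of KB -- a huge ratio
    print(len(zlib.compress(noise, 9)))   # slightly LARGER than SIZE

The zeros collapse to almost nothing because a handful of run descriptions cover the whole buffer; the random bytes actually grow a little, since there is no redundancy left to remove.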
 
zsouthboy said:
It's not possible. (Though I'd LOVE to be proven wrong... :D)

Compression works by getting rid of redundant data, plus some clever substitution rules (in its simplest form, at least).

That's not to say I can't create a file full of 0's, two gigs in size, and have it compress down to a few hundred KB (not that I've tried)... but that's not very useful, is it?

Quantum computers could possibly change the way we think about compression...

EDIT: Also, there are two types of compression: lossy and lossless. Lossy compression can be applied to analog representations of data, e.g. a picture or a sound, with acceptable results; however, lossy compression on other data (computer data) would be _bad_. :D (You realize your example was asking for a 1000x lossless compression ratio, right?)
The answer would have to lie in some computational algorithm. If you can find a set of equations (even if they are 2 MB in length) that 100% accurately reproduces a gigabyte-sized file, then yes, it is possible. My understanding is that currently it is hard to create such an algorithm, and so it is not used.
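
As a toy version of that idea (purely a hypothetical Python sketch): a generator a few hundred bytes long can reproduce a gigabyte-sized file exactly, but only because that particular file happens to be a simple pattern a short program can describe. An arbitrary gigabyte of data has no such short description.

    import sys

    # A tiny "set of equations": a few hundred bytes of program that
    # regenerates a specific 1 GiB file exactly. It only works because
    # this file IS a trivial pattern; most real files are not.
    CHUNK = b"0123456789ABCDEF" * 4096   # 64 KB repeating pattern
    TOTAL = 1024 ** 3                    # 1 GiB

    written = 0
    out = sys.stdout.buffer
    while written < TOTAL:
        n = min(len(CHUNK), TOTAL - written)
        out.write(CHUNK[:n])
        written += n

Redirect the output to a file (any name you like) and you get a gigabyte back from a program a few hundred bytes long.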
 
In reply to Althornin and zsouthboy,

Yes, it is apparent that with current algorithms and techniques for compressing files, compressing a huge 1 GB file down to a small 1 MB file is not quite possible. Perhaps with future techniques it will be. Or at least, I bloody hope so. :)
 
I hate to think of the time it would take to do that on a current system.
Maybe in 5-10 years when processors are a hell of a lot faster!

Imagine trying to compress a wave file to mp3 on an old 486 or the like.
 
madmartyau said:
I hate to think of the time it would take to do that on a current system.
Maybe in 5-10 years when processors are a hell of a lot faster!

Imagine trying to compress a wave file to mp3 on an old 486 or the like.

Compress?!

Just playing an MP3 on a 486 is quite comical (I had to play it in mono and downsample it to like 22 kHz... it did work, though :p)
 
Nope, not bloody likely. The problem is that all compression algorithms really do is store data in a form that, given the average characteristics of that data, will produce a smaller representation of the original information.

Think of it this way: can you construct a bijection (Google it if you don't know the term) from the set of single digits to the set of letters? Of course not; there simply are not enough digits.
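
A quick Python sketch of that counting argument (just for illustration): any attempt to give each of the 26 letters its own single digit is forced to reuse digits, because there are only 10 of them.

    import string

    letters = string.ascii_lowercase   # 26 letters
    digits = string.digits             # only 10 digits

    # Assign digits to letters as evenly as possible; collisions are
    # unavoidable because 26 > 10 (the pigeonhole principle).
    mapping = {letter: digits[i % len(digits)] for i, letter in enumerate(letters)}
    print(len(letters) - len(set(mapping.values())), "letters had to share a digit")   # 16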
 
NeoCool - no, it cannot happen. Every file has a certain information content (termed entropy), and you can't possibly compress it below that.

For example, there are endless posts by "trolls" on comp.compression (use Google Groups to read it) who claim to
  • be able to compress files with random data, or
  • take the output of a compressor and compress it further ad infinitum.

They then go on to solve the halting problem and build a perpetual motion machine.
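
The second claim is easy to sanity-check yourself. A rough zlib sketch (the filename is just a placeholder) shows why "compress the output again, forever" goes nowhere:

    import zlib

    data = open("some_big_file", "rb").read()   # placeholder: any reasonably large file
    for i in range(5):
        data = zlib.compress(data, 9)
        print("pass", i + 1, ":", len(data), "bytes")
    # Typical result: a big drop on pass 1, then the size creeps UP by a few
    # bytes each pass, because compressed output looks random to the compressor.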
 
Not to mention a machine that reduces the amount of chaos in a closed system.

Same principle at the core: whether it's information theory or statistical physics, you cannot reduce the level of entropy in a given system without some outside intervention.
 
I like the elegance of the pigeonhole school of thought. Simply put, no compression algorithm is capable of compressing every possible combination of bits, not even by just one bit. There simply must be some input that is not compressible.

For example, given a data record of two bits, it is simply not possible to create a compressed record that is one bit long, since a single bit has only two possible values, and those must account for the four possible combinations of the original record. The input can still be run through the compressor, but the compressor will either leave it untouched or, worse, produce a compression record consisting of a single-bit entry with a data tag of at least one more bit added on, which defeats the purpose. Edit: Well, it can compress the input, but at best it can only compress two of the four combinations successfully.

Increasing the number of bits does not make the task any easier, as reducing the input by just one bit halves the number of combinations that can be represented in the compressed record. This is why compression relies on regularity in a data record, or makes specific assumptions about the data being used, reducing record sizes by ruling out unnecessary combinations or by building a set composed of the fewest bits necessary to fit the circumstance.
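
The two-bit case is small enough to brute-force. This short Python sketch (again, just an illustration) tries every possible "compressor" from two-bit records to one-bit records and confirms that every single one of them maps two different inputs to the same output:

    from itertools import product

    two_bit = ["00", "01", "10", "11"]
    one_bit = ["0", "1"]

    # Every function from the four 2-bit records to the two 1-bit records.
    # If any were one-to-one, it could be reversed losslessly -- none can be.
    for outputs in product(one_bit, repeat=len(two_bit)):
        assert len(set(outputs)) < len(two_bit)
    print("all", len(one_bit) ** len(two_bit), "candidate compressors lose information")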

edit: For an interesting take on super-compression, I suggest looking at this oldie-but-goodie: http://lzip.sourceforge.net/

Who needs all those extra 1's and 0's anyhow. ;)
 