Cryptology, and more specifically, the breaking thereof.

Grall

Invisible Member
Legend
I've been thinking about this off and on, and never really seen any answer to it on forums or tech websites and such (I don't read any more specialized publications on the matter, as the subject matter is generally far above my head), but when brute-force breaking a crypto, how do you actually know you've broken it...?

Presumably, you'd do it by recognizing the content, since I doubt cryptos come with Pachinko machine-style bells and flashing lights when you succeed... :p So, if you scramble your content according to some sufficiently complicated algorithm (like, Rot13, but better) before running it through encryption, how would an intruder know they've succeeded in breaking the crypto? I guess there isn't any sure way of knowing.

You'd have to run some kind of statistical analysis looking for repeating patterns that would correspond to say, known written languages or machine language opcodes that could help you unscramble the data for every single set of keys attempted, if you don't suddenly end up with plaintext data.

Seems very complicated, and of course time-consuming...
 
It sounds to me like the concept you are looking for is "unicity distance":

http://www.schneier.com/crypto-gram-9812.html#plaintext
http://en.wikipedia.org/wiki/Unicity_distance

If you want to try to protect data against brute-force attacks by scrambling them, then do not use block-based scrambling schemes (e.g. ROT13 is a "block-based scrambling scheme" with an 8-bit block), especially not with small block sizes. Instead, you should look into things like the All-Or-Nothing transform:

http://en.wikipedia.org/wiki/All-or-nothing_transform
 
There are many data files with specific patterns. For example, all JPEG files begin with something like 0xFF 0xD8. So you can eliminate all but one in 65,536 possible results already. Then you can check further to make sure it looks like a JPEG file.

Or, if you know the target file is a plain text file, all you need to do is to check whether the first block (for most cipher it's 8 bytes) are all ASCII displayable characters (i.e. from 0x20 ~ 0x7e, and common control characters like 0x0d and 0x0a). For a random data (if you decrypt with a "wrong" key it should look random) the chance is one in 2557. If it fits then you can check the second block, and so on. In general, you don't even need to do other heuristic checks, as the probability of a random data file somehow looks like a text file is extremely small (even compared to the key space, i.e. 2^128 for a 128 bits key). Unicode text files normally also have a "byte order mark" in front of the file, so it can also be used.
 
So you'd need to design your scrambling algorithm to get rid of these identifying marks, possibly by using unique scrambles per file type. If all jpegs for example start with a certain byte sequence, well, shuffle the file contents around then so that the start ends up somewhere in the body of the file... Something like that.

Or you could just use strong crypto with ludicrously large crypto keys I suppose... Say RSA, 8k bits? :p
 
So you'd need to design your scrambling algorithm to get rid of these identifying marks, possibly by using unique scrambles per file type. If all jpegs for example start with a certain byte sequence, well, shuffle the file contents around then so that the start ends up somewhere in the body of the file... Something like that.

Or you could just use strong crypto with ludicrously large crypto keys I suppose... Say RSA, 8k bits? :p

Large crypto keys are probably not that useful though. AES's 128 bits is already good enough if it's theoretically secure (but no one knows that of course), as 2^128 is a very large key space.

The real problem is that no matter how large your key space is, users are going to use bad passwords. So the key space is never actually 2^128. Even if you use really sensible passwords, like mixing capital letters, numerals, and punctuation, it still needs to be quite long (unpractical for normal person to remember) to be anywhere near 2^128 key space. Using pass phrases are better, but even that's not easy to achieve 2^128 key space. So increasing key space is not a solution here.

So another solution is to make decryption process longer. However, you can't just make your cipher slow, because that'd be inefficient. So there are some other ways, such as the "all or nothing transform" mentioned by arjan. In the transform, you need to decrypt all data to know whether the data is good or not. If you data is generally big enough, attacks will need much more time to brute force the key.

Sometimes you don't really need a real "all or nothing transform." For example, if you use CFB, then in theory you just need to protect your first block. So you can randomly generate a block sized number (using a real random number generator), XOR all your data with the number, then append the number to the end of your data file. Then you use CFB to encrypt the data file. An attack will have to decrypt the whole file to get the random number to XOR the first block to check whether the file is correct, and that makes brute force much more difficult if your file is large enough.
 
Couldn't you salt the user password to algorithmically increase keyspace resolution?
 
It doesn't do much, because no matter how you generate your encryption key, it needs to come from a source that's basically from the user password. So you can just iterate possible user passwords to generate the "hashed" password, salted or not. So the key space is still much smaller.

Supposed that, say, your system accepts pass phrase of unlimited length, and use MD-5 or SHA-1 to convert it into a 128 bits key. And you salt it with a random number (of course you'll need to store the number). If I want to crack the key, I don't just randomly generate 128 bits keys. I should generate possible pass phrases, add the random salt number (it's stored so I can get it), then use MD-5 or SHA-1 to hash it. So I only have to try possible pass phrases, so the key space is smaller.

Of course, pass phrases are still better than simple passwords. A simple four words pass phrase has much better entropy than a normal 8 character password. But it's still far from 2^128 in most case.
 
Deniable encryption is a different issue though, and it's not immune to brute force attack if the encryption key is not secure. If I can just guess your password, I don't really need to care whether it's deniable or not.
 
Back
Top