From that list, only AES and ChaCha20 are considered secure today; the others have been broken in interesting ways ... people figured out how to predict them. AES itself has a bit of a patchy reputation, because ...
Now the magic and personality of the encryption algorithm really comes in. It's really all about how we generate one block after another, so that each block is seemingly random, with no predictable relationship to what came before or after, to someone who doesn't have the key.
I'm not going to go too deep into how these work, but if you want to get a sense yourself: start by looking up linear congruential generators, which are really simple functions that can make a block of data "cycle" in a random-looking, non-repeating way.
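If you want to poke at one, here's a minimal sketch of a linear congruential generator in Python. The function name and the tiny parameters (a=5, c=3, m=16) are my own picks for illustration; they're chosen so the state visits every value from 0 to 15 exactly once before repeating. An LCG like this is trivially predictable, so it only shows the "cycling" idea, not anything secure.

```python
def lcg(seed, a, c, m):
    """Yield successive states of a linear congruential generator.

    Each step computes: state = (a * state + c) mod m
    """
    state = seed % m
    while True:
        state = (a * state + c) % m
        yield state

# Toy parameters (an assumption for this sketch): with m=16, a=5, c=3 the
# generator walks through all 16 possible states before it repeats.
tiny = lcg(seed=0, a=5, c=3, m=16)
values = [next(tiny) for _ in range(16)]
print(values)
```

Anyone who sees a couple of outputs and knows (or guesses) a, c and m can predict the rest, which is exactly why real ciphers need much more than this.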
Then look up Feistel networks, which are sort of the next level of that. Look up S-Boxes then if you're more curious, and finally, take a look at how Salsa20 does the rotations that ChaCha20 uses. It's all more approachable than you might think!
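To make the Feistel idea a bit more concrete, here's a rough sketch in Python. Everything named here (`round_function`, `feistel_encrypt`, `feistel_decrypt`, the round keys) is hypothetical, and the round function is just a truncated SHA-256, which no real cipher uses. The point is only the structure: split the block in two, mix one half into the other each round, and notice that decryption is the same steps run backwards.

```python
import hashlib

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def round_function(half: bytes, round_key: bytes) -> bytes:
    # Stand-in round function (an assumption for this sketch): hash the
    # round key and the half-block, then truncate to the half-block size.
    return hashlib.sha256(round_key + half).digest()[:len(half)]

def feistel_encrypt(block: bytes, round_keys) -> bytes:
    half = len(block) // 2  # block length must be even
    left, right = block[:half], block[half:]
    for k in round_keys:
        # New left is the old right; new right is old left XOR F(old right).
        left, right = right, xor_bytes(left, round_function(right, k))
    return left + right

def feistel_decrypt(block: bytes, round_keys) -> bytes:
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for k in reversed(round_keys):
        # Undo one round: the round function never needs to be inverted.
        left, right = xor_bytes(right, round_function(left, k)), left
    return left + right

round_keys = [b"k1", b"k2", b"k3", b"k4"]  # toy round keys
scrambled = feistel_encrypt(b"Ovaltine", round_keys)
print(scrambled.hex(), feistel_decrypt(scrambled, round_keys))
```

The neat design property is that the round function can be any messy one-way thing you like, and the network as a whole is still reversible.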
O.k. so now we know how random streams of data can be combined with plaintext to do encryption and decryption, and we sort of know how random streams of data are produced. Isn't that it?
For full disk encryption, it almost is really. We basically just encrypt every block or sector of storage, under the same key, and using an IV that's derived from the "position" on disk. So we can always just decrypt any block anywhere on the disk, as long as we have the key.
But there's a problem with this ... someone can mess with the encrypted data. If I change the value of any byte, even if I don't have the key, it will mess with the decrypted output. There's no real security against that kind of meddling.
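Here's a toy sketch of both ideas: an IV derived from the sector's position, and what tampering does. Big assumptions: the "cipher" is a stand-in keystream built from SHA-256 in a counter loop and XORed with the data, and `keystream`, `crypt_sector` and the 32-byte sector size are names and values I made up. Real disk encryption uses AES in a mode like XTS, where a changed byte garbles a whole block rather than flipping one chosen bit, but it likewise can't tell you that anything was changed.

```python
import hashlib

SECTOR_SIZE = 32  # toy sector size for this sketch

def keystream(key: bytes, sector_number: int, length: int) -> bytes:
    # The IV is derived from the sector's position on "disk".
    iv = sector_number.to_bytes(8, "big")
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def crypt_sector(key: bytes, sector_number: int, data: bytes) -> bytes:
    # XOR with the keystream; the same call encrypts and decrypts.
    ks = keystream(key, sector_number, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

key = b"toy key, not a real one"
sector = b"PAY ALICE $100".ljust(SECTOR_SIZE, b" ")
encrypted = crypt_sector(key, 7, sector)

# An attacker without the key flips bits in byte 11 of the ciphertext.
tampered = bytearray(encrypted)
tampered[11] ^= ord("1") ^ ord("9")
decrypted = crypt_sector(key, 7, bytes(tampered))
print(decrypted)  # decrypts without complaint, but the amount has changed
```

Nothing in the decryption step complains, and that's the gap the integrity check is there to close.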
For sending messages and data over the network, that's not going to cut it. We don't want people tampering with our information. So we need to add an integrity check! There are a few schemes for this.
HMAC, GCM and Poly1305 are the common ones in use right now. In each case, these algorithms basically take the data as input, along with another key (an integrity key) and produce a MAC or a tag, which is just another piece of data that acts as a signature.
So to encrypt, and protect, our string, one scheme might be:
AES(key, IV, "Ovaltine") -> encrypted_output
HMAC(key, encrypted_output) -> MAC
and then on the wire, we send:
IV | encrypted_output | MAC
To decrypt, we check the MAC first by generating it again and making sure they are the same, and then we decrypt the output. Internally there are differences between how HMAC, GCM and Poly1305 generate these signatures, but you don't need to worry about that.
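The encrypt-then-MAC flow above can be sketched with Python's standard library. Assumptions to flag: there's no AES in the standard library, so the "cipher" is a toy SHA-256 keystream; the function names are mine; and I've used a separate integrity key and included the IV in the MAC input, which is a choice of this sketch rather than something the scheme above spells out. The HMAC parts (`hmac.new`, `hmac.compare_digest`) are the real standard-library calls.

```python
import hashlib
import hmac
import os

def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    # Toy stand-in for a real cipher such as AES or ChaCha20.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_then_mac(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    iv = os.urandom(16)
    ciphertext = xor_bytes(plaintext, keystream(enc_key, iv, len(plaintext)))
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag  # IV | encrypted_output | MAC

def verify_then_decrypt(enc_key: bytes, mac_key: bytes, message: bytes) -> bytes:
    iv, ciphertext, tag = message[:16], message[16:-32], message[-32:]
    expected = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    # Check the MAC first, in constant time, before touching the ciphertext.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC check failed: message was tampered with")
    return xor_bytes(ciphertext, keystream(enc_key, iv, len(ciphertext)))

enc_key = b"toy encryption key"
mac_key = b"toy integrity key"
wire = encrypt_then_mac(enc_key, mac_key, b"Ovaltine")
print(verify_then_decrypt(enc_key, mac_key, wire))
```

Flip any byte of `wire` and `verify_then_decrypt` raises instead of handing back altered plaintext.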
Today, this combination of operations is wrapped up in a function we call "AEAD", which means Authenticated Encryption with Associated Data, and it does all of this in a mostly-foolproof way for you. Basically: AEAD(key, IV, plaintext, additional_data) -> IV_encrypted_data_MAC
The "additional data" is just any other data you might want to "prove" the sender has, but not send; like say some meta-data that establishes a permission. It's often left empty.
Now you can still screw up with AEAD. If you re-use the same IV, that's bad!! There are attempts to make this better: my colleague Shay has been working on a cool scheme called SIV, which adds a measure of protection against that too.
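To see why a re-used IV is so bad, here's a small sketch. It assumes a stream-style cipher where ciphertext is plaintext XOR keystream, and the keystream here is a toy one made from SHA-256 (the names `keystream` and `xor_bytes` are mine). Same key plus same IV means the same keystream, so XORing two ciphertexts cancels it out completely.

```python
import hashlib

def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    # Toy keystream (an assumption for this sketch, not a real cipher).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = b"toy key"
iv = b"same IV twice!!!"  # the mistake: one IV used for two messages

p1 = b"attack at dawn"
p2 = b"retreat at ten"
c1 = xor_bytes(p1, keystream(key, iv, len(p1)))
c2 = xor_bytes(p2, keystream(key, iv, len(p2)))

# The eavesdropper never sees the key, only c1 and c2.
leak = xor_bytes(c1, c2)            # equals p1 XOR p2, keystream is gone
recovered_p2 = xor_bytes(leak, p1)  # knowing (or guessing) p1 reveals p2
print(recovered_p2)
```

So one known or guessable message gives away every other message sent under that IV.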
If you do use unique IVs, modern encryption is really robust. In general, you could publish some encrypted text in the New York Times, and no-one will be able to crack it. This is true even if /some/ of the text is known. For example ...
In internet protocols a lot of the text is known: an HTTP server always responds the same way, and the first few bytes are known and totally guessable. This doesn't matter at all. It doesn't help an attacker figure anything else out, not even one bit. We've come a long way from WWII.
But there are attacks that do work! If you're sending this data over a network, someone can see the timing and size of each message. This opens us up to traffic analysis. pic.twitter.com/8qeI9A3Ozp
Let's look at length first. O.k. so the length is obviously not hidden. That's fine if you're trying to protect your password or credit card number in the middle of a response. No big deal. But it does mean that someone might be able to fingerprint the content you're sending.
Simple example: if you send a gif over a messaging app, and the size of that gif is unique, someone in the middle can probably guess what gif you just sent. There are more sophisticated versions of this for Google Maps, Netflix, Wikipedia, and so on.
The way we protect against this is to "pad" messages, to make large numbers of messages appear to be the same size no matter what. Military grade network encryption actually pads all traffic all the time, so it's always the same!
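A minimal sketch of padding, assuming a made-up format: a 4-byte length prefix, then the message, then zero bytes up to the next multiple of a bucket size. The function names and the 256-byte bucket are hypothetical. You'd pad before encrypting, so that only the padded size is visible on the wire.

```python
def pad_to_bucket(message: bytes, bucket: int = 256) -> bytes:
    # Prefix the true length so the receiver can strip the padding later.
    body = len(message).to_bytes(4, "big") + message
    # Round the total size up to the next multiple of the bucket size.
    padded_len = -(-len(body) // bucket) * bucket
    return body.ljust(padded_len, b"\x00")

def unpad(padded: bytes) -> bytes:
    true_len = int.from_bytes(padded[:4], "big")
    return padded[4:4 + true_len]

short = pad_to_bucket(b"hi")
longer = pad_to_bucket(b"x" * 200)
print(len(short), len(longer))  # both land in the same size bucket
```

The trade-off is bandwidth: bigger buckets hide more, and cost more bytes per message.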
Another problem with length is that if you're using compression, and you let attackers control any of the content on a page that a user sees, that can let the attackers figure out even small secrets. Look up the "CRIME" attack. It's awesome, and scary.
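Here's a toy illustration of the compression leak using Python's `zlib`. The secret, the page layout and the `observed_length` helper are all made up for this sketch, and it compares two whole guesses, whereas the real attack works through the secret piece by piece. The idea is the same though: text that repeats the secret compresses better, and the attacker can see the length.

```python
import zlib

# A secret the attacker can't read, but which shares a page with their input.
SECRET = b"Cookie: session=7f3a9c2e1b5d"

def observed_length(attacker_text: bytes) -> int:
    # The page is compressed and then encrypted; encryption hides the
    # content but not the length, so the length is all the attacker sees.
    page = SECRET + b"\n" + attacker_text
    return len(zlib.compress(page))

right = observed_length(b"Cookie: session=7f3a9c2e1b5d")
wrong = observed_length(b"Cookie: session=q8w4z6x1k5m0")
print(right, wrong)  # the matching guess produces the shorter page
```

This is why mixing secrets and attacker-controlled text in the same compression context is risky.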
I said the other problem is timing. Obviously the timing of each message is public, but is that a big deal? It can be! For example, if you send a message for every user keystroke, it's trivial to figure out what they're typing through timing analysis. WOW.
Another example is VOIP. If your call app only sends data when people are speaking, but not during the silences, that's still enough to guess about 70% of English-language speech. Just from the silences! Scary cool.
These examples underline: even when you use encryption algorithms and schemes we've been perfecting for about 80 years, there are still some gaps you can walk into and break the security. Which is why this stuff is worth knowing!
Anyway, that's the level I'm going to stick at for now, but we've covered a lot of ground. If you've finished this thread, thank you! But also you should now have some kind of better understanding of what's going on, and what to be wary of. Feel free to AMA.
Oh the truth table for XOR is wrong. I guess it's more of a lies table. Should be:
a | b | c
0 | 0 | 0
1 | 0 | 1
0 | 1 | 1
1 | 1 | 0
Also Cryptographers: AES implementations in software (not hardware) are either insecure, or slow, or both. It wasn't designed with caching side-channels in mind. 