Hashing is not encryption

In a job interview years ago, the interviewer asked me to explain the difference between encryption, encoding, and hashing. At the time I was working for a company that specialized in encryption, so I took knowing the difference for granted.

It wasn’t until much later that I understood how easily most folks can confuse the three topics for one another. Let’s take a look at each in turn.

Encoding

Encoding is the practice of taking data in one format and converting it to another. There are no secrets involved – the specifications for each format are public, well-documented, and easily implemented everywhere.

The content of this article is rendered in your browser as either ASCII or UTF-8. What that means is that the ones and zeros representing it are interpreted in a specific way that converts them into readable characters.

The text “Hi, folks!” is nothing more than ones and zeros, interpreted as ASCII and converted into readable text. You could just as easily encode the same data as hex, which would instead be the string “48 69 2C 20 66 6F 6C 6B 73 21”.

These representations are completely interchangeable. There’s nothing special or magic about encoding; it’s merely a way to interpret and present the underlying, raw data.
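
To make that concrete, here is a minimal Python sketch (Python is simply my choice for illustration) that moves “Hi, folks!” between text, raw bytes, and hex. Notice that no key or password appears anywhere:

```python
text = "Hi, folks!"

# Text -> raw bytes, using the public ASCII table.
raw = text.encode("ascii")

# Raw bytes -> hex digits, separated by spaces and upper-cased.
as_hex = raw.hex(" ").upper()
print(as_hex)  # 48 69 2C 20 66 6F 6C 6B 73 21

# Going back needs nothing but the same public rules.
round_trip = bytes.fromhex(as_hex).decode("ascii")
print(round_trip)  # Hi, folks!
```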

Encryption

With encryption, everything changes. Encryption requires a secret (a key, password, or passphrase) that is used to convert usable data into something indistinguishable from random noise. Given an encrypted message, you can only decrypt it if you have the original secret.

Unlike an encoded message, an encrypted message, like the base64-encoded text “lqNBja38qFHnITloKNzdVg==,” is useless on its own. Even if you know the algorithm used, there is no practical way to get from an encrypted message back to the plain text original without the password.

If, however, you know this message used AES for encryption, and you know it used “thisisasecret” as the password, then you can properly decrypt the message and read “Hi, folks!” yet again.
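
A short Python sketch shows why the base64 wrapper on that message is no help to an eavesdropper. Base64 is only an encoding, so anyone can strip it off, but what comes out is just the raw ciphertext. The sketch assumes nothing about the AES mode or how the password was turned into a key, and it deliberately stops short of decrypting, since Python's standard library has no AES implementation:

```python
import base64

encoded = "lqNBja38qFHnITloKNzdVg=="

# Undo the encoding layer; no secret is needed for this step.
ciphertext = base64.b64decode(encoded)

print(len(ciphertext))   # 16 bytes, the size of one AES block
print(ciphertext.hex())  # still opaque without the key
```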

Hashing

Anyone who’s spent any period of time working with encryption has likely used a hashing algorithm as well. Hashes look somewhat like encrypted messages. The underlying algorithms take a piece of plain text and convert it (with or without a key) into something indistinguishable from random noise.

Unlike encryption, there is no way back from a hash.

These algorithms are one-way. Even if you know the algorithm and any secret keys involved, there is no way to un-hash a string. It’s an entirely destructive operation.
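
A quick Python sketch using the standard hashlib and hmac modules illustrates the shape of the operation. SHA-256 is my choice here rather than anything the discussion depends on, and the key is just the sample password from the encryption section:

```python
import hashlib
import hmac

message = b"Hi, folks!"

# Unkeyed hash: the same input always gives the same digest.
digest = hashlib.sha256(message).hexdigest()
print(digest)

# The digest length is fixed, however long the input is.
long_digest = hashlib.sha256(message * 1000).hexdigest()
print(len(digest), len(long_digest))  # 64 64

# Keyed hash (HMAC): knowing the key lets you recompute the tag,
# but it still gives you no way to recover the message from it.
tag = hmac.new(b"thisisasecret", message, hashlib.sha256).hexdigest()
print(tag)
```

Neither module offers an inverse function. The only way to check a guess at the input is to hash the guess and compare the results.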

Remembering the difference

The easiest way to remember how these topics differ is with a simple mental model.

Encoding is a way of translating between different formats. Like converting a Spanish recipe for cake into English.

Encryption is a way of protecting data behind a secret. Like sealing a box of chocolate in a locked safe so your kids don’t find it.

Hashing is a way of permanently converting from one recognizable thing to something uniform and simple. Like grinding a cow into a hamburger – you can always make a burger, but you can never put the cow back together again.

These are rough analogies, but they should help the next time you’re faced with the same interview question.

2 thoughts on “Hashing is not encryption”

  1. Aditya

    I guess the other difference between encryption and hashing is that a hash function need not be one-to-one. Many inputs can result in the same hash, though hopefully it's hard to find collisions.

    So a hash function is allowed to destroy information, whereas it's pretty important that an encryption algorithm doesn't!

  2. Aditya

    A cryptographic hash function H can be used for encryption though, by mimicking a one-time pad: Pick a random integer k, concatenate it with a secret s, hash the result obtaining H(s•k), XOR this with the plaintext yielding the ciphertext C and send the pair (k,C). Pick another integer if you need to encrypt more data.

    Decryption proceeds in the same way, mutatis mutandis. This is the basic idea behind one of the best currently used ciphers, ChaCha20: It builds on a hash function.

    From a regulatory and legal perspective, this means that if you want to ban strong encryption, you must ban cryptographic hash functions.