This is a basic introduction to the Scrypt hash function, or more accurately, key derivation function (KDF). I will assume most of my audience is here to gain an understanding of why Scrypt is used and the basics of how it works. My goal is to explain it in a general sense, so I will be omitting proofs and implementation details and instead focusing on the high-level principles.
What is Scrypt?
Scrypt is a slow-by-design hash function. Its purpose is to take some input data and create a fingerprint of that data, but to do it very slowly. In practice, this often means taking a password and deriving a 256-bit secret key from it.
For example, let’s pretend your password is password1234. By using scrypt, we can deterministically stretch that into a 256-bit key.
That 256-bit key can now be used as the secret key to encrypt and decrypt data using the AES-256 cipher.
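As a minimal sketch of that derivation, the snippet below uses `hashlib.scrypt` from Python's standard library (available when Python is built against OpenSSL 1.1 or newer). The salt value and the cost parameters `n`, `r`, and `p` are assumptions chosen for illustration and do not come from this article. A real application would generate a random salt for each password and store it alongside the encrypted data.

```python
import hashlib

# Hypothetical example values. A real application would use a random,
# per-password salt (for example os.urandom(16)) instead of a constant.
password = b"password1234"
salt = b"example-salt-000"


def derive_key(password: bytes, salt: bytes) -> bytes:
    """Derive a 32-byte (256-bit) key from a password using scrypt."""
    # n, r and p are illustrative cost settings, not a recommendation.
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)


key = derive_key(password, salt)
print(len(key) * 8, "bit key:", key.hex())
```

The resulting 32 bytes are the right size to hand to an AES-256 implementation as its key.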
Why not use the password to encrypt directly?
Most encryption algorithms, including AES-256, require a key of a specific length; AES-256 needs exactly 256 bits. By hashing the password, we get a fixed-size key of the right length, no matter how short or long the password is.
Furthermore, we chose to use the scrypt algorithm as opposed to a faster hash like SHA-256 for two reasons:
- It is slow
- It uses memory as well as CPU resources
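The reason we want a slow hash is so that an attacker has a harder time guessing the user’s password. An attacker who is trying to brute-force their way into a vault is simply guessing passwords over and over until one works. AES-256 and fast hashes like SHA-256 both take very little time per attempt, so without a slow step the attacker would be able to try a huge number of passwords per second on a modern computer. Requiring a lot of memory for every guess also makes it expensive to run many guesses in parallel on specialized hardware such as GPUs.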
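To make the cost difference concrete, here is a rough sketch that times a single guess with SHA-256 and a single guess with scrypt. The salt and the scrypt parameters are illustrative assumptions, and the absolute timings depend entirely on the machine running the code, so only the relative gap is meaningful.

```python
import hashlib
import time

password = b"password1234"
salt = b"example-salt-000"  # hypothetical fixed salt, for illustration only


def time_one_guess(hash_guess) -> float:
    """Return the wall-clock seconds taken to hash one password guess."""
    start = time.perf_counter()
    hash_guess(password)
    return time.perf_counter() - start


sha256_seconds = time_one_guess(lambda pw: hashlib.sha256(salt + pw).digest())
scrypt_seconds = time_one_guess(
    lambda pw: hashlib.scrypt(pw, salt=salt, n=2**14, r=8, p=1, dklen=32)
)

print(f"SHA-256: {sha256_seconds:.6f} s per guess")
print(f"scrypt:  {scrypt_seconds:.6f} s per guess")
```

Every guess the attacker makes has to pay the scrypt cost, while a legitimate user only pays it once per unlock.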
Like all cryptographic hash functions, scrypt has the following properties:
- Deterministic (The same input produces the same output every time)
- Fixed-size output (The output length does not depend on the length of the input)
- Irreversible (Given only the output, an attacker can’t feasibly work out the input)
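The first two properties in the list above can be checked directly. The sketch below again assumes Python's `hashlib.scrypt` with an illustrative salt and illustrative parameters. Irreversibility cannot be demonstrated by running code, so it is not covered here.

```python
import hashlib

salt = b"example-salt-000"  # hypothetical fixed salt, for illustration only


def derive_key(password: bytes) -> bytes:
    """Derive a 32-byte key with illustrative scrypt parameters."""
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)


short_key = derive_key(b"pw")
long_key = derive_key(b"a much longer passphrase with many words")

# Deterministic: hashing the same input again gives the same output.
same_input_same_output = derive_key(b"pw") == short_key
# Fixed-size: both outputs are 32 bytes regardless of password length.
same_length = len(short_key) == len(long_key) == 32

print("deterministic:", same_input_same_output)
print("fixed-size:", same_length)
```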
Additionally, Scrypt has the following properties:
- Computationally expensive and slow (It takes a long time for a computer to run the hash)
- Memory intensive (Depending on the parameters chosen, potentially several gigabytes of RAM are used to run the hash)
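How much memory is used depends on the cost parameters. As a rough estimate, scrypt's main working array takes about 128 * N * r bytes, where N and r are the cost and block-size parameters from the scrypt specification (RFC 7914). These parameter names are not used elsewhere in this article, and the helper below is a hypothetical one written for illustration. It ignores the smaller buffers an implementation also needs.

```python
def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate size in bytes of scrypt's main working array."""
    return 128 * n * r


# Illustrative parameter choices, from modest to very memory-hungry.
for n, r in [(2**14, 8), (2**20, 8), (2**22, 8)]:
    mib = scrypt_memory_bytes(n, r) / 2**20
    print(f"N=2**{n.bit_length() - 1}, r={r}: about {mib:.0f} MiB")
```

With r fixed at 8, each doubling of N doubles the memory needed, which is how scrypt can scale from a few megabytes up to several gigabytes.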