Hash Functions
A public function
-
is simple and fast to compute -
takes as input message of arbitrary length and outputs a message digest of fixed length
Security Properties
- Collision resistant: it should be infeasible to find any two values
and such that - Second-preimage resistant: given a value
, it should be infeasible to find a different value such that - Preimage resistant (one-way): given a value
, it should be infeasible to find any input such that
If an attacker can break second-preimage resistance, they can also break collision resistance.
Birthday Paradox
If there is a group of 23 people, there is over a 50% chance that at least two people have the same birthday.
If choosing
Hence, if
Today,
In comparison, to get a 50% change of guessing the key of a block cipher requires only
Iterated Hash Functions
Iterated hash functions splits the input into fixed-size blocks and operates on them sequentially.
Merkle-Damgård Construction
A compression function
The compression function takes two
m = m_1 || m_2 || m_3 || … || m_l
m_1 m_2 m_3 m_l pad+len
| | | | |
| | | | |
v v v v v
IV ---> h ---> h ---> h --...-> h---> h ---> H(m)
Security: if the compression function
Weaknesses:
- Length extension attacks: if padding appended but not message length, there is no difference in the output of the message and the message with the right padding added. Hence, if the message is
, they can extend this to , where is the additional contents added by the attacker. Since the hash is the full internal state of the hash function, the attacker can use the hash and continue calulating the rest of the hash for , allowing them to create valid hashes without knowledge of or the rest of the message. Adding message length stops this attack - Second-preimage attacks: not as hard as they should be
- Collisions for multiple messages: not that too much more difficult than finding collisions for two messages
The Merkle-Damgård construction is used in MD5, SHA-1 and SHA-2.
Standardized Hash Functions
MDx Family
Proposed by Ron Rivest, widely used in the 1990s.
128-bit output, all broken.
Secure Hash Algorithm (SHA)
Based on MDx family, more complex design with 160 bit output.
SHA-0 introduced 1993, SHA-1 in 1995 with minor changes. Both broken.
SHA-2 Family
Standardized 2015.
Hash sizes of 224, 256, 384 or 512 bits.
Minimum recommended is SHA-256 (256 bit hash, 512 bit blocks) - same security as AES-128.
Most secure is SHA-512: 512 bit hash, 1024 bit blocks.
Padding:
- Message length field: 64 bits for 512 bit blocks, 128 bits for 1024 bit blocks.
- Always at least one bit of padding
- There is one
1bit, some number of0bits (enough to make blocks full) and then the length field - Padding and length fields may add an extra block
SHA-3
MDx, SHA families based on the same basic design which were vulnerable to a few unexpected attacks.
Keccak function picked by NIST in 2015 as as SHA-3: uses sponge function instead of compression.
Using Hash Functions
Hash functions do not depend a key; anyone can calculate it so it is not encryption.
However, it does help to provide data authentication:
- Authenticating the hash of a message to authenticate the message
- Building blocks for MACs
- Building blocks for signatures
Password storage:
- Pick random salt: makes it resistant to rainbow table attacks as there needs to be a different dictionary for every salt
- Compute
- Store salt and hash value
Message Authentication Code
Ensures message integrity. takes in a message
The tag
Unforgeability: not feasible to produce a valid pair
Unforgeability under chosen message attack: even with access to an oracle that can calculate the MAC for an input message of the attacker’s choosing, they cannot create a valid tag themselves (i.e. guess the private key from tags they ask the oracle to generate).
MAC from Hash Function (HMAC)
Proposed by Bellare, Canetti and Krawczyk in 1996.
Can be built from any iterated hash function
With a key
opad = 0x5c5c...5cipad = 0x3636...36
HMAC is secure if
HMAC is often used as a pseudorandom function to derive subkeys.
Authenticated Encryption
A and B share a key
Two options:
- Split
into and , encrypting with and using with a MAC - Use an authenticated encryption algorithm that provides both
Combining Encryption and MAC
Three options:
- Encrypt-and-MAC: encrypt
, apply MAC to and send the ciphertext and tag - Encryption algorithms usually have an IV but MACs usually don’t; if the same message is sent multiple times the MAC will be the same
- MAC-then-encrypt: calculate the MAC on
, encrypt then send the ciphertext - Encrypt-then-MAC: encrypt
, calculate the MAC on the ciphertext and then send and tag
Encrypt-then-MAC is the safest option:
MAC-then-encrypt was used in older versions of TLS while newer versions use authentication encryption modes.
https://crypto.stackexchange.com/questions/202/should-we-mac-then-encrypt-or-encrypt-then-mac
CTR Mode for Block Ciphers
Synchronous stream cipher. Counter initialized with random nonce
where
Encryption and decryption is simply XORing the plain/ciphertext with
Galois Counter Mode (GCM)
CCM mode cannot be used for processing of streaming data: formatting function for
Combines CTR mode on the block cipher
- Input: plaintext
, authenticated data , nonce - Outputs: ciphertext
, tag - Lengths
and are 64 bit values -
and are the minimum numbers of zeros required to expand and to complete blocks
-
- Length
of is 128 bits, is 96 bits long - Initial block input:
- Function
increments 32 MSB of the input string by
GHASH:

Output is
Decryption:
- Receiver receives ciphertext
, nonce , tag , authenticated data - Receiver computes tag
using shared key , compares with - If the same,
can be computed by generated the same keystream from CTR mode