10. Public Key Cryptography

Public key cryptography (PKC) has some features that symmetric key cryptography (SKC) does not and is applied for key management in protocols such as TLS and IPsec.

Discrete log-based ciphers are alternatives to PKC.

One-Way Functions

A function is one-way if $f(x) = y$ is easily computed given $x$ , but $f^{-1}(y) = x$ is hard to computed given $y$ .

It is not known if one-way functions exist, but there are many functions that are believed to be one-way:

Multiplication of large primes: the inverse function is integer factorization
Exponentiation: the inverse function takes discrete logarithms

Trapdoor One-Way Functions

A function such that $f^{-1}(y)$ is easily computed given additional information, called a trapdoor.

Example: let $f(x) = x^e \bmod n$ for any $x$ co-prime to $n$ :

The inverse function will be to find $d$ such that $\left(x^e\right)^d \bmod n = x^{ed} \bmod n = x$
From Euler’s theorem, $a^{\phi(n)} \bmod n = a^{k\phi(n)} \bmod n = 1$ for any integer $k$ , assuming $a$ co-prime to $n$
Hence, $x \cdot x^{k\phi(n)} \bmod n = x^{k\phi(n) + 1} \bmod n = x$ ( $x$ and $n$ are co-prime to each other)
- $ed = k\phi(n) + 1$ for some integer $k$
- That is, $ed \equiv 1 \pmod{\phi(n)}$
- Hence, $d = e^{-1} \bmod \phi(n)$
Integer factorization is assumed to be a hard problem; hence $\phi(n)$ cannot be computed easily
Hence, only someone with knowledge of $n$ ’s factors, the trapdoor, can find the inverse function

Asymmetric Cryptography

Another word to describe public key cryptography.

Public key cryptosystems, such as the Diffie-Hellman key exchange, are designed by using a trapdoor one-way function: the trapdoor is the decryption key.

This allows a public key to be stored in a public directory: anyone can obtain the public key and use it to form an encrypted message that only a person with the private key can decrypt.

Asymmetry: encryption and decryption keys are different. Encryption key is public and known to anybody; the decryption key is private and known only to its owner.

Finding the private key from the public key must be a hard computational problem.

Advantages of shared key/symmetric cryptography:

Key management is simplified: they dot not need to be transported confidentially
Digital signatures can be obtained

RSA

Designed in 1977 by Rivest, Shamir, and Adleman from MIT (patent expired 2000).

It is based on integer factorization problem.

Algorithm

Key generation:

Randomly choose two distinct primes, $p$ and $q$ from the set of all primes of a certain size
Compute $n = pq$
Randomly choose $e$ such that $\mathrm{gcd}(e, \phi(n)) = \mathrm{gcd}(e, (p - 1)(q - 1)) = 1$
Compute $d = e^{-1} \bmod \phi(n)$
Set the public key $K_E = (n, e)$
Set the private key $K_D = (p, q, d)$

Encryption:

Input is value $M$ such that $0 < M < n$
Compute $C = \mathrm{Enc}(M, K_E) = M^e \bmod n$

Decryption:

Compute $M = \mathrm{Dec}(C, K_D) = C^d \bmod n$

Any message must be pre-processed:

Coding it as a number
Adding randomness (to avoid repeating ciphertext for the same plaintext)

Numerical Example

Key generation:

Let $p = 43$ , $q = 59$
$n = pq = 2537$
$\phi(n) = (p - 1)(q - 1) = 2436$
Let $e = 5$
- $d = e^{-1} \bmod \phi(n) = 5^{-1} \bmod{2436} = 1949$ (Solve $ed + k'\phi(n) = 1$ using the Euclidean algorithm)

Encryption:

Let $M = 50$
$C = M^e \bmod n = 50^5 \bmod{2537} = 2488$

Decryption:

$C^d \bmod n = 2488^{1949} \bmod{2537} = 50 = M$

Numerical Example 2

Key generation:

Let $p = 11$ , $q = 13$
$n = pq = 11 \cdot 13 = 143$
$\phi(n) = (p - 1)(q - 1) = 10 \cdot 12 = 120$
Let $e = 7$
- Find $d = e^{-1} \bmod \phi(n) = 7^{-1} \bmod{120}$
- Solve $ed + k'\phi(n) = 1$ using the Euclidean algorithm:
  
  $\begin{aligned} 120 &= 7 \cdot 17 + 1 \\ \therefore 1 &= 120 \cdot 1 + 7 \cdot (-17) \\ \therefore 7^{-1} \bmod{120} &= -17 = 103 \end{aligned}$

Encryption:

Let $M = 5$
$C = M^e \bmod n = 5^7 \bmod{143} = 47$

Decryption:

$C^d \bmod n = 47^{103} \bmod{143} = 5 = M$

Correctness

Decrypting an encrypted message: does $(M^e)^d \bmod n = M$ ?

As $d = e^{-1} \bmod \phi(n)$ , $ed \bmod \phi(n) = 1$ : this can be written as being some integer $k$ such that $ed = 1 + k\phi(n)$ .

Hence, $(M^e)^d \bmod n = M^{ed} \bmod n = M^{1 + k\phi(n)} \bmod n$ .

Now we must show that $M^{1 + k\phi(n)} \bmod n = M$ . There are two cases:

Case 1: Coprime to n

$\mathrm{gcd}(M, n) = 1$ . In this case, we can apply Euler’s theorem, $M^{\phi(n)} \bmod n = 1$ :

\begin{aligned} M^{1 + k\phi(n)} \bmod n &= M \cdot (M^{\phi(n)})^k \bmod n \\ &= M \cdot 1^k \bmod n \\ &= M \end{aligned}

Case 2: Multiple of p or q

Since $p$ and $q$ are prime and $M < pq$ , if $\mathrm{gcd}(M, n) \neq 1$ , $M$ must be a multiple of either $p$ and $q$ and hence coprime to the other prime.

Supposing that $\mathrm{gcd}(M, p) = 1$ , $\mathrm{gcd}(M, q) = q$ and hence, there is some integer $l$ such that $M = lq$ .

Applying Fermat’s theorem, $M^{\phi(p)} \bmod p = M^{p - 1} \bmod p = 1$ :

\begin{aligned} M^{1 + k\phi(n)} \bmod p &= M \cdot \left(M^{\phi(n)}\right)^k &\bmod p \\ &= M \cdot \left(M^{p - 1}\right)^{k(q - 1)} &\bmod p \\ &= M \cdot 1^{k(q - 1)} &\bmod p \\ &= M &\bmod p \end{aligned}

As $M = lq$ $M^{1 + k\phi(n)} \bmod q = 0$ . As $n = pq$ and $p$ and $q$ are primes, we can use the Chinese Remainder Theorem:

There is a unique solution $x = M^{1 + k\phi(n)} \bmod n$ to:
- $M = M^{1 + k\phi(n)} \bmod p$
- $M = M^{1 + k\phi(n)} \bmod q \,( = 0)$
- Hence, $x = M$
And $M = M^{1 + k\phi(n)} \bmod n$ is satisfied too

Applications

Message encryption
Digital signature
Distribution of key for symmetric key encryption (hybrid encryption)
User authentication

Implementation

A few implementation details

Key Generation

Generating large primes $p$ and $q$ :

At least 1024 bits recommended for today
Simple algorithm: select random odd number $r$ of required length, check if prime, incrementing by two if not

Choice of $e$ :

Choose at random for best security
But small values are often used in practice: more efficient
$e = 3$ is smallest possible value; very fast but has security issues
$e = 2^{16} + 1$ is a popular choice
$d$ should be at least $\sqrt{n}$ to prevent known attacks such as Wiener’s attack

Encryption/decryption

Fast Exponentiation

A square-and-multiply modular exponentiation algorithm.

In binary, $e = e_0 \cdot 2^0 + e_1 \cdot 2^1 + \dots + e_k \cdot 2^k$ , where $e_i$ are bits.

If $M$ is the plaintext, $M^e = M^{e_0} \cdot (M^2)^{e_1} \cdot \dots \cdot (M^{2^k})^{e_k}$ where $(M^{2^i})^{e_i}$ is zero when $e_i = 0$ and $M^{2^i}$ when $e_i = 1$ .

Code:

// 2^66 % 100
M = 2   // base
n = 100 // modulus
e = [0, 1, 0, 0, 0, 0, 1] // exponent as bits, LSB first
z = 1; // Result - M^e \bmod n
for(let i = 0; i < e.length - 1; i++) {
  if(e[i] == 1) {
    z = (z * M) % n;
    // z = 1 initially so first multiplication is unnecessary
    // Hence e.filter(i => i == 1).length - 1 multiplications
  }
  
  M = (M * M) % n; // equals base^{2^{i + 1}}
  // When i = 0, M = base^1 - right for the next iteration
  // Does not need to be calculated for the i + 1 == e.length
  // as the value is used for the next iteration
}

Cost:

If $2^k \le e \lt 2^{k + 1}$ the algorithm uses $k$ squarings
If $b$ bits of $e$ are high, there are $b - 1$ multiplications (first computation has $z = z \cdot M$ , but $z$ is initially one)
$n$ is 2048 bits so $e$ is at most 2048 bits. Hence, computing $M^e \bmod n$ requires at most 2048 modular squarings or multiplications
On average, only half of $e$ ’s bits are high so there are only 1024 multiplications

Faster Decryption Using CRT

Decrypting $C$ with respect to $p$ and $q$ separately.

Compute $M_p = C^{d \bmod (p - 1)} \pmod p$ and $M_q = C^{d \bmod (q - 1)} \pmod q$

Solve $M \bmod n$ using CRT. $d = (d \bmod (p - 1)) + k(p - 1)$ for some $k$ . Hence:

\begin{aligned} M \bmod p &= C^{d \bmod n} \bmod p = C^d \bmod p \\ &= C^{d \bmod (p - 1)} C^{k(p - 1)} \bmod p = C^{d \bmod (p - 1)} \\ &= M_p \end{aligned}

Hence, $M \equiv M_p \bmod p$ and similarly, $M \equiv M_q \bmod q$ .

Then we can output $M = q\cdot (q^{-1} \bmod p) \cdot M_p + p \cdot (p^{-1} \bmod q) \cdot M_q \bmod n$ .

Speedup

Exponents $d \bmod (p - 1)$ and $d \bmod (q - 1)$ are about half the length of $d$ and the complexity of exponentiation with square-and-multiply increases with the cube of input length.

Hence, computing each of $M_p$ and $M_q$ uses about an eighth of the computation, leading to four times less computation.

Hence, storing $p$ and $q$ instead of just $n$ allows for faster decryption.

Padding

Encryption directly on the message encoded as a number is bad as it is vulnerable to attacks:

Building up a dictionary of known plaintexts
Guessing the plaintext and checking if it encrypts to the ciphertext
Håstad’s attack

Hence, a padding mechanism must be used to prepare the message for encryption, adding redundancy and randomness.

Håstad’s Attack

The same message $M$ is encrypted without padding to three different ciphertexts $C_1$ , $C_2$ , $C_3$ with the public exponent $e = 3$ being used by all recipients.

\begin{aligned} C_1 = M^3 \bmod n_1 \\ C_2 = M^3 \bmod n_2 \\ C_3 = M^3 \bmod n_3 \end{aligned}

Equations can be solved using the CRT to obtain $M^3$ in the ordinary (non-modular integers). The attacker can then simply take the cube root to find $M$ .

Padding Types

PKCS 1: simple ad-hoc design
Optimal asymmetric encryption padding (OAEP):
- Bellaware and Rogaway, 1994
- Security proof in a suitable model
- Standardized in IEEE P1363

Security

Attacks

Mostly avoided through the use of standardized padding mechanisms.

Possible attacks:

Factorization of the modulus $n$ : this is believed to be a hard problem, so should be fine as long as $n$ is large enough.
Finding $d$ from $n$ and $e$ : as hard as factorizing $n$ (Miller’s theorem)
Quantum computers: Shor’s theorem can theoretically factorize $n$ in polynomial time
Timing analysis: using timing of decryption process to obtain information about $d$
- Demonstrated in practice in smart cards
- Avoided by randomizing the decryption processes

Practical Problems with Key Generation

OpenSSL implementation in some systems would use massively-reduced randomness (2008)
Lenstra in 2012 analyzed 6 million RSA keys:
- Found 4% of keys were identical
- Found 0.2% of keys provided no security as they shared one prime factor with each other

Diffie-Hellman Key Exchange

Two users sharing a secret using only public communication.

Public elements:

Large prime $p$
Generator $g \in \mathbb{Z}_p^*$

Alice and Bob randomly select values $a$ and $b$ respectively, where $1 < a, b < p$ .

Over an insecure channel, Alice sends $g^a$ , Bob sends $g^b$ .

Both compute the secret key $Z = g^{ab} \bmod p$ : $(g^b)^a \equiv (g^a)^b) \pmod p$ .

Example

Let $p = 181$ , $g = 2$ , $a = 50$ , $b = 33$ .

Alice sends $g^a \bmod p = 2^{50} \bmod{181} = 116$
Bob sends $g^b \bmod p = 2^{33} \bmod{181} = 30$

Both compute $Z = (g^b)^a \bmod p = (g^a)^b \bmod p = 30^{50} \bmod{181} = 116^{33} \bmod{181} = 49$

Properties

Security

Relies on the difficulty of the discrete logarithm problem.

If an attacker intercepts $g^a \bmod p$ and take the discrete logarithm to find $a$ , they can compute $(g^b)^a$ in the same way as Bob.

There is no better known way for a passive adversary to find the shared secret.

In the basic protocol, messages are not authenticated: a man-in-the-middle-attack is possible where the attacker acts as a proxy between the two parties, decrypting messages from one party and then re-encrypting messages to send to the other party with the other key.

Alice chooses $a$ , sending $A$ and $g^a \bmod p$
Bob chooses $b$ , sending $B$ , $g^b \bmod p$ and $\mathrm{Sig}_B(B, A, g^b)$
Alice sends $\mathrm{Sig}_A(A, B, g^a)$
Both computes $g^{ab} \bmod p$

Both parties must know each other’s public signature verification keys, $A$ and $B$ (identity + public key).

Static and Ephemeral Diffie-Hellman

The above protocol uses ephemeral keys: keys are used once. In the static protocol:

Alice chooses a long-term private key $x_A$ and public key $y_A = g^{x_A} \bmod p$
Bob chooses a long-term private key $x_B$ and public key $y_B = g^{x_B} \bmod p$
Alice and Bob find a shared secret $S = g^{x_A x_B} \bmod p$ which is static
- $S$ stays the same until either party changes their public key

Elgamal Cryptosystem

Proposed by Elgamal in 1985, turning the Diffie-Hellman protocol into a cryptosystem for encryption and for signature.

Alice combines her ephemeral private key with Bob’s long-term public key

Algorithms

Key generation:

Select prime $p$ , generator $g \in \mathbb{Z}_p^*$
Select long-term private key $1 < K_D < p$
Compute $y = g^{K_D} \bmod p$
Set the long-term public key $K_E = (p, g, y)$

Encryption:

Select a message $0 < M < p$
Choose at random an ephemeral private key $k$
Compute $g^k \bmod p$ and $My^k \bmod p$
Compute the ciphertext:
- $C = (C_1, C_2) = \mathrm{Enc}(M, K_E) = (g^k \bmod p, My^k \bmod p)$

Decryption:

Compute $C_1^{K_D} \bmod p$
$\mathrm{Dec}(C, K_D) = C_2 \cdot (C_1^{K_D})^{-1} \bmod p = M$

Correctness

Alice knows ephemeral private key $k$
Bob knows long-term private key $K_D$
Both compute the Diffie-Hellman value for the two public keys:
- $C_1 = g^k \bmod p$
- $y = g^{K_d} \bmod p$
Diffie-Hellman value $y^k \equiv C_1^{K_d} \pmod p$ used as the mask for the message $M$

Example

Key generation:

Prime $p = 181$
Generator $g = 2$
Long-term private key $K_D = 50$
Compute $y = g^{K_D} \bmod p = 2^{50} \bmod{181} = 116$
Bob’s public key is $(181, 2, 116)$

Encryption:

Alice wants to send $M = 97$
$k = 31$ chosen at random
Computes

$\begin{aligned} C &= (C_1,\, C_2) \\ &= (g^k \bmod p,\, My^k \bmod p) \\ &= (2^{31} \bmod{181},\, 97 \cdot 116^{31} \bmod{181}) \\ &= (98,\, 173) \end{aligned}$

Decryption:

Bob computes $C_1^{K_D} \bmod p = 98^{50} \bmod{181} = 138$
Bob recovers

$\begin{aligned} M &= C_2 \cdot (C_1^{K_D})^{-1} \bmod p \\ &= 173 \cdot 138^{-1} \bmod{181} \\ &= 173 \cdot 101 \bmod{181} \\ &= 97 \end{aligned}$

Security

Dependent on the difficulty of the discrete logarithm problem: if broken, they could determine the private key $K_D$ from $y = g^{K_D} \bmod p$
Many users could share the same $p$ and $g$
Padding not required: ephemeral key $k$ randomizes the ciphertext

Elliptic Curves

Algebraic structures formed from cubic equations.

Curves are defined over any field.

e.g. set of all $(x, y)$ pairs which satisfy $y^2 = x^3 + ax + b \bmod p$ , then creates a curve over the field $\mathbb{Z}_p$ .

A point on the curve is the identity element, and by defining a binary operation on the points (e.g. multiplication), we can form a group over elliptic curve’s points: the elliptic curve group.

Any elliptic curve can be used but most applications use standardized curves generated in a verifiably random way (e.g. NIST curve P-192 has curve of $n$ points over $\mathbb{Z}_p$ with generator $(G_x, G_y)$ and equation $y^2 = x^3 - 3x + b \bmod p$ with a defined random values for $p$ , $n$ , $b$ , $G_x$ , $G_y$ and seed $s$ .

Discrete log defined on elliptic curve groups: if elliptic curve operation operation denoted as multiplication, definition is the same as in $\mathbb{Z}_p^*$ .

Elliptic curve implementations require smaller keys compared to RSA:

Symmetric key	RSA modulus	EC element length
80	1024	160
128	3072	256
192	7680	384
256	15360	512

From Ars:

Symmetric across horizontal axis
Any non-vertical line intersects with the curve at most three times. Group operation:
- Infinity, $\mathcal{O}$ , is the identity element
- Draw line intersecting the two points
  - If the points are the same, use the tangent
- Find the third point intersecting the line
  - If the points are the same and it is at an inflection point, use the same point
- Find the opposite point $(x, y) \to (x, -y)$