Diffie-Hellman and RSA

Diffie-Hellman: Opening the Framework RSA: Filling in Encryption What RSA's Security Rests On Built on Factoring and Logarithms

Diffie-Hellman: Opening the Framework

The previous pages built the public-key idea as a specification and then named the hardness assumptions that might fill it, but exhibited no working system. The first system to realize the idea historically did not encrypt messages at all. It solved a narrower and, at the time, more urgent problem: how two parties who share no secret can agree on one over a channel an adversary is reading. This is the key-distribution problem that symmetric cryptography could not escape on its own, and the construction that broke it opened the public-key era.

The mechanism rests on a single asymmetry already in hand. In a finite cyclic group with generator \(\alpha\), raising \(\alpha\) to a power is cheap, while recovering the exponent from the result is the discrete logarithm problem, believed intractable in suitable groups. Two parties exploit this by each choosing a private exponent, publishing only the corresponding power, and combining what they receive with what they kept.

Definition: Diffie-Hellman Key Agreement

Let \(G\) be a finite cyclic group of order \(n\) with generator \(\alpha\), both fixed in advance and public. Two parties \(A\) and \(B\) establish a shared secret as follows.

  1. \(A\) chooses a private integer \(x\) with \(1 \leq x \leq n-1\), keeps it secret, and sends \(\alpha^x\) to \(B\).
  2. \(B\) chooses a private integer \(y\) with \(1 \leq y \leq n-1\), keeps it secret, and sends \(\alpha^y\) to \(A\).
  3. On receiving \(\alpha^y\), party \(A\) computes \((\alpha^y)^x\); on receiving \(\alpha^x\), party \(B\) computes \((\alpha^x)^y\).

Both parties obtain the same group element \[ (\alpha^y)^x = \alpha^{xy} = (\alpha^x)^y, \] the equality holding because exponents in a group multiply, and this common value \(K = \alpha^{xy}\) is their shared secret.

The structural novelty is that the shared secret is never transmitted and never chosen by either party alone. It emerges from the two private exponents jointly, and the commutativity \((\alpha^x)^y = (\alpha^y)^x\) is exactly what lets the parties arrive at the same value from opposite sides. This is a sharper departure from the trapdoor picture than it first appears: there is no secret that reverses a public operation here. Knowing \(x\) inverts nothing an adversary could otherwise not compute — it is simply the private contribution \(A\) holds back. Key agreement asks only that the parties reach a common value an eavesdropper cannot, and achieves this with a one-way function and the commutativity of exponentiation, demanding no trapdoor at all.

Example: A Small Key Agreement

Take the multiplicative group modulo the prime \(29\), generated by \(\alpha = 2\). Party \(A\) draws the private exponent \(x = 6\) and sends \[ \alpha^x = 2^6 \bmod 29 = 6; \] party \(B\) draws \(y = 9\) and sends \[ \alpha^y = 2^9 \bmod 29 = 19. \] Now \(A\) raises what it received to its own exponent, \((\alpha^y)^x = 19^6 \bmod 29 = 22\), and \(B\) does likewise, \((\alpha^x)^y = 6^9 \bmod 29 = 22\). Both reach the shared secret \(K = 22\), which equals \(\alpha^{xy} = 2^{54} \bmod 29\) computed either way. An eavesdropper sees \(2\), \(6\), and \(19\), and to find \(K\) must extract one of the hidden exponents from \(6 = 2^x\) or \(19 = 2^y\) — a discrete logarithm, easy at this size but infeasible once the prime is large.

Diffie-Hellman key agreement over an open channel Party A holds a private exponent x and sends alpha to the x; party B holds a private exponent y and sends alpha to the y. Each raises the value it receives to its own private exponent, and both arrive at alpha to the x y, the shared secret. An eavesdropper sees only alpha to the x and alpha to the y and cannot form alpha to the x y. A private x B private y αˣ αʹ (αʹ)ˣ = αˣʹ (αˣ)ʹ = αˣʹ shared secret, never sent an eavesdropper sees αˣ and αʹ but cannot form αˣʹ

What stops the eavesdropper is precisely the problem named on the previous page. The adversary sees \(\alpha\), \(\alpha^x\), and \(\alpha^y\), and wants \(\alpha^{xy}\) — which is the Diffie-Hellman problem exactly. The security of the agreement is the conjectured intractability of that problem, and the reduction established there bounds it from above: since the Diffie-Hellman problem is no harder than the discrete logarithm, anyone who can take discrete logarithms in \(G\) recovers \(x\) from \(\alpha^x\) and computes the secret directly. The agreement is therefore secure only in groups where the discrete logarithm itself resists computation; the choice of group is not cosmetic but the entire basis of the guarantee.

That guarantee is also bounded in a second way, and naming the boundary now prevents a misreading later. Diffie-Hellman protects the secret against an adversary who only listens. It offers nothing against one who sits in the middle and substitutes messages: such an adversary can run a separate agreement with each party, holding one shared secret with \(A\) and another with \(B\), relaying and re-encrypting traffic while reading all of it, and neither party can detect the interposition from the protocol alone. The exchanged values carry no evidence of who sent them. This is the same gap the previous page isolated for public keys in general — that publishing a value removes the need to keep it secret but not the need to know whose it is — and closing it requires binding each transmitted value to an identity, a matter of authentication the track takes up further along. Against a passive adversary, the agreement solves key distribution completely.

Nothing in the construction used the integers modulo a prime specifically. It required only a finite cyclic group in which exponentiation is efficient and the discrete logarithm is hard, and any such group serves. The choice among them is a choice of where the hardness is most concentrated per unit of key length, and a later page returns to this when a group of a very different character — the points of an elliptic curve — sharpens the same protocol. For now the framework has its first inhabitant: a working public-key construction, resting on a named hard problem, that establishes a secret with no secret shared in advance. What it does not yet do is encrypt. Supplying that is the contribution of the construction taken up next.

RSA: Filling in Encryption

Key agreement establishes a shared secret but transmits no message of the sender's choosing. The construction that followed closed that gap, realizing the other half of the public-key idea: encryption of an arbitrary plaintext under a published key. Where the previous construction needed no trapdoor, this one is built on a trapdoor one-way function in the literal sense — its first concrete instance. The forward map is easy for anyone; the inverse is infeasible without a secret and easy with it; and the secret is precisely the private key. The hardness it draws on is the integer factorization problem.

The scheme is assembled from modular exponentiation in a way that makes the exponents undo one another exactly when the modulus is a product of two primes. We state the construction, then prove it decrypts correctly.

Definition: The RSA Scheme

Key generation. Choose two distinct large primes \(p\) and \(q\), and set the modulus \(n = pq\). Compute \(\phi = (p-1)(q-1)\). Select an encryption exponent \(e\) with \(1 \lt e \lt \phi\) and \(\gcd(e, \phi) = 1\), and compute the decryption exponent \(d\), the unique integer with \(1 \lt d \lt \phi\) and \[ ed \equiv 1 \pmod{\phi}. \] The integer \(d\) exists and is unique because \(\gcd(e, \phi) = 1\) makes \(e\) invertible modulo \(\phi\). The public key is the pair \((n, e)\); the private key is \(d\). The primes \(p\), \(q\), and the value \(\phi\) are discarded or kept secret, as knowledge of any of them yields \(d\).

Encryption and decryption. A plaintext is an integer \(m\) with \(0 \leq m \leq n-1\). Encryption and decryption are \[ c = m^e \bmod n, \qquad m = c^d \bmod n. \]

The trapdoor structure is visible in the keys: the forward map \(m \mapsto m^e \bmod n\) is published in full as \((n, e)\), while the inverse \(c \mapsto c^d \bmod n\) requires \(d\), which an adversary holding only \((n, e)\) cannot compute without, in effect, breaking the factorization of \(n\) — a dependence the next section makes precise. That the two exponentiations genuinely invert one another is not assumed but follows from the choice of \(d\), which we now prove.

Theorem: Correctness of RSA Decryption

With notation as above, for every plaintext \(m\) with \(0 \leq m \leq n-1\), \[ (m^e)^d \equiv m \pmod{n}. \] Decryption recovers the plaintext exactly.

Proof

Since \(ed \equiv 1 \pmod{\phi}\), write \(ed = 1 + k\phi = 1 + k(p-1)(q-1)\) for some integer \(k \geq 0\). We show \(m^{ed} \equiv m\) modulo \(p\) and modulo \(q\) separately, then combine the two congruences.

Modulo \(p\). Fermat's little theorem gives \(a^p \equiv a \pmod{p}\) for every integer \(a\). When \(\gcd(m, p) = 1\), cancelling one factor of \(m\) yields \(m^{p-1} \equiv 1 \pmod{p}\), so \[ m^{ed} = m \cdot \bigl(m^{p-1}\bigr)^{k(q-1)} \equiv m \cdot 1^{k(q-1)} = m \pmod{p}. \] When instead \(\gcd(m, p) = p\), that is \(p \mid m\), both \(m^{ed}\) and \(m\) are divisible by \(p\) and hence both \(\equiv 0 \pmod{p}\), so \(m^{ed} \equiv m \pmod{p}\) holds in this case too. In either case \[ m^{ed} \equiv m \pmod{p}. \]

Modulo \(q\). The argument is identical with the roles of \(p\) and \(q\) exchanged, using \(ed = 1 + k(p-1)(q-1)\) regrouped as \(1 + k'(q-1)\) with \(k' = k(p-1)\), and gives \[ m^{ed} \equiv m \pmod{q}. \]

Combining. We now have \(p \mid (m^{ed} - m)\) and \(q \mid (m^{ed} - m)\). Since \(p\) and \(q\) are distinct primes they are relatively prime, so the additive group \(\mathbb{Z}_{pq}\) splits as \(\mathbb{Z}_p \times \mathbb{Z}_q\): a residue modulo \(pq\) is determined by its residues modulo \(p\) and modulo \(q\). Both residues of \(m^{ed} - m\) are zero, so \(m^{ed} - m \equiv 0 \pmod{pq}\), that is \[ m^{ed} \equiv m \pmod{n}. \] Because \(0 \leq m \leq n-1\), the residue class determines \(m\) itself, so \(c^d \bmod n = (m^e)^d \bmod n = m\).

Example: A Small Encryption

Take the primes \(p = 7\) and \(q = 13\), so \(n = 91\) and \(\phi = 6 \cdot 12 = 72\). Choose \(e = 5\), which is coprime to \(72\); inverting it modulo \(72\) gives \(d = 29\), since \(5 \cdot 29 = 145 = 1 + 2 \cdot 72\). The public key is \((91, 5)\) and the private key is \(29\). To encrypt the plaintext \(m = 9\), \[ c = 9^5 \bmod 91 = 81, \] and to decrypt, \(81^{29} \bmod 91 = 9\), recovering the plaintext.

The recovery is the proof in miniature. Here \(ed - 1 = 144 = 2\,\phi\), and the congruence \(m^{ed} \equiv m\) can be checked one prime at a time: modulo \(7\), one finds \(9^{145} \equiv 2 \equiv 9 \pmod 7\), and modulo \(13\), \(9^{145} \equiv 9 \pmod{13}\). The two agree with \(m\) in each factor, and the splitting of \(91\) into \(7\) and \(13\) combines them into \(9^{145} \equiv 9 \pmod{91}\) — exactly the route the proof took.

The proof exposes why the modulus must be a product of distinct primes rather than an arbitrary integer: the exponent arithmetic that makes \(e\) and \(d\) inverse runs in each prime factor separately, and the recombination is exactly the splitting of the modulus into coprime parts. The same splitting that makes correctness work is what an attacker would exploit by factoring — the structure serving the legitimate user and the structure exposed to the adversary are one and the same, a tension the next section examines.

One caution carries over from the framework and must not be skipped. The map \(m \mapsto m^e \bmod n\) is a fixed function: a given plaintext always produces the same ciphertext. Applied directly, it is a deterministic encryption, and the page defining security ruled deterministic public-key encryption out — equal plaintexts yield equal ciphertexts, and an adversary who can encrypt candidate plaintexts can match them against an observed ciphertext, the failure that randomized encryption exists to repair. The trapdoor function is the primitive; a secure scheme wraps it with fresh randomness at encryption time, so that the hard-to-invert core is preserved while the plaintext-to-ciphertext map becomes one-to-many. Textbook RSA — the bare exponentiation above — is the trapdoor, not yet the scheme deployed in practice, and the gap between the two is filled by structured randomized padding rather than by any change to the arithmetic just proved.

What RSA's Security Rests On

A passive adversary against RSA holds the public key \((n, e)\) and a ciphertext \(c = m^e \bmod n\), and wants the plaintext \(m\). Recovering \(m\) from \(c\) under \((n, e)\) is the RSA problem, and no efficient algorithm for it is known. The question this section settles is how that problem relates to the factoring of \(n\), since the scheme was introduced as resting on factoring and the precise nature of that dependence is easy to overstate.

One direction is clear and complete. If an adversary can factor \(n\), it recovers \(p\) and \(q\), forms \(\phi = (p-1)(q-1)\) exactly as the key generator did, and computes \(d\) from \(e\) by the same modular inversion. With \(d\) in hand it decrypts every ciphertext. So the ability to factor breaks the scheme outright: RSA can be no harder to break than \(n\) is to factor.

The reverse direction is the subtle one, and it must be stated in two steps that are often collapsed into one. The first step concerns not the plaintext but the private exponent: recovering \(d\) from the public key is exactly as hard as factoring \(n\).

Theorem: Recovering the Private Exponent Is Equivalent to Factoring

Given the public key \((n, e)\), computing the RSA decryption exponent \(d\) and factoring the modulus \(n\) are computationally equivalent: an efficient algorithm for either yields an efficient algorithm for the other.

Proof

Factoring yields \(d\). This is key generation: from \(p\) and \(q\) compute \(\phi = (p-1)(q-1)\), then invert \(e\) modulo \(\phi\) to obtain \(d\). Both steps are efficient.

\(d\) yields a factorization. Suppose \(d\) is known, so the integer \(ed - 1\) is a positive multiple of \(\phi\). The multiplicative group modulo \(n\) has order \(\phi\), so for every \(a\) with \(\gcd(a, n) = 1\) the order of \(a\) divides the group order \(\phi\), and hence divides \(ed - 1\), so \(a^{ed-1} \equiv 1 \pmod{n}\). Write \(ed - 1 = 2^s t\) with \(t\) odd. The element \(a^{t}\) then has order a power of \(2\) modulo \(n\), so the sequence \[ a^{t},\ a^{2t},\ a^{4t},\ \ldots,\ a^{2^s t} \equiv 1 \pmod{n} \] reaches \(1\), and we may locate the last entry before it equals \(1\). If some \(b = a^{2^{i-1} t}\) in this sequence satisfies \(b \not\equiv \pm 1 \pmod{n}\) yet \(b^2 \equiv 1 \pmod{n}\), then \(b\) is a square root of \(1\) other than \(\pm 1\). Such a \(b\) gives \((b-1)(b+1) \equiv 0 \pmod{n}\) while neither factor is divisible by \(n\), so \(\gcd(b - 1, n)\) is a non-trivial factor of \(n\). A non-trivial square root of \(1\) exists precisely because \(n\) has two distinct prime factors, and a randomly chosen \(a\) produces one for at least half of the residues coprime to \(n\); repeating with fresh \(a\) succeeds after a constant expected number of trials. Thus knowledge of \(d\) factors \(n\) efficiently with high probability.

Each direction is an efficient procedure calling the other's solver, so the two problems are computationally equivalent.

This equivalence is exact, but it is an equivalence about \(d\), not about \(m\). The second step — relating the recovery of a plaintext to factoring — is where the honest account must stop short. It is not known whether the RSA problem is as hard as factoring. An adversary might conceivably recover \(m\) from \(c\) without ever computing \(d\) or factoring \(n\), by some route that exploits the specific structure of the map \(m \mapsto m^e\); no such route has been found, but none has been ruled out. The scheme's security therefore rests on the hardness of factoring without being equivalent to it: factoring breaks RSA, and recovering \(d\) is exactly factoring, but whether breaking RSA requires either remains open. Stating the dependence as an equivalence would claim a theorem that does not exist.

Granting that the core problem is hard, a scheme can still be defeated through how it is used rather than through the problem it rests on. These usage failures do not contradict the hardness of factoring; they show that hardness of the underlying problem is necessary but not sufficient for a secure deployment. Each identifies a discipline that correct use must observe, and the disciplines are permanent even as the individual attacks are refined.

Small-exponent and small-message exposure. A small encryption exponent such as \(e = 3\) speeds up encryption, but if the same message is sent to enough recipients under the same small exponent, or if the message itself is small enough that \(m^e \lt n\), the modular reduction does nothing and \(m\) is recovered by an ordinary integer root of the ciphertext. The fix is not a larger exponent but the randomized padding already required for semantic security: padding destroys the algebraic relationships these attacks exploit.

Forward search over a small message space. Because the encryption map is public, an adversary facing a message drawn from a small or guessable set can encrypt every candidate and compare against the observed ciphertext — the deterministic-encryption failure in its starkest form. This is again answered by randomized padding, which makes re-encryption of a guessed plaintext almost never reproduce the target ciphertext.

The multiplicative structure. RSA respects multiplication: the product of the encryptions of \(m_1\) and \(m_2\) is the encryption of \(m_1 m_2\), since \((m_1 m_2)^e \equiv m_1^e m_2^e \pmod{n}\). An adversary can exploit this to have a disguised ciphertext decrypted and then strip the disguise, an adaptive chosen-ciphertext attack that the bare scheme does not withstand. The defense is to require plaintexts to carry a rigid structure that a tampered ciphertext will almost surely fail to produce, so that malformed decryptions are rejected — once more a property supplied by structured padding, not by the arithmetic.

A shared modulus. If several parties were issued distinct exponent pairs under one common modulus \(n\), the equivalence just proved would be fatal: any single party, knowing its own \((e_i, d_i)\), could factor \(n\) and then compute every other party's private exponent. A modulus can belong to exactly one party. This is a direct corollary of the equivalence theorem rather than a separate phenomenon, and it is why key generation produces a fresh modulus per user.

None of these failures touches the hardness of factoring; each is a requirement on construction around a core that is assumed hard. And that assumption carries the qualification every page of this track has attached to it. The hardness of factoring is hardness against classical computation, the model under which every algorithm in this section was reckoned. The padding disciplines above secure RSA against an adversary confined to that model; no choice of key length secures it against a model in which factoring itself becomes easy. What such a model does to RSA and to the key agreement of the first section, and why the two fall together, is the final matter.

Built on Factoring and Logarithms

The two constructions of this page descend from the two oldest hardness assumptions, one each. Key agreement rests on the discrete logarithm: an eavesdropper who could take logarithms would read every shared secret. Encryption rests on factoring: an adversary who could factor would recover every private exponent. The abstract problems named on the previous page have become two working systems, and the systems inherit not only the hardness of those problems but their fragility.

That fragility is shared, and the sharing is not a coincidence of history but a fact about the algebra. Both problems live in finite abelian groups — the integers modulo a prime for the discrete logarithm, the multiplicative structure modulo \(n\) for factoring — and the difficulty in each case is that of recovering hidden information about a periodicity in that group. A quantum computer running Shor's algorithm finds such periods efficiently, and so solves both problems in time polynomial in the input size. The single algorithmic advance that breaks the one breaks the other, because they were never as different as their statements suggest; they are two faces of the same underlying structure, and a method that exploits that structure does not distinguish between them.

For the constructions of this page the consequence is immediate and total. Against an adversary with a large-scale quantum computer, factoring \(n\) becomes feasible, so the private exponent \(d\) is recovered by the equivalence proved above and RSA is decrypted at will. In the same setting the discrete logarithm becomes feasible, so the private exponents in a key agreement are recovered and the shared secret with them. The collapse does not stop at the discrete logarithm, moreover: the reduction of the previous page runs the difficulty downward, so an algorithm that takes logarithms also solves the Diffie-Hellman problem directly — every construction resting anywhere in that chain of hardness falls at once. None of the padding disciplines of the previous section helps, because none of them touches the hardness of the core problem; they guard the construction within the classical model and have nothing to say once the model itself changes.

The reach of this attack is sharp but bounded, and the boundary is what the rest of the track is organized around. What Shor's algorithm exploits is the hidden abelian structure these two problems share; cryptographic constructions that do not rest on that structure are not broken by it. Symmetric-key ciphers and hash functions face only a generic quantum search that halves the effective key length — an erosion answered by doubling keys, not by abandoning the design. So the event on the horizon is not the end of cryptography but the obsolescence of one family within it: the public-key constructions of this page, and the number-theoretic assumptions beneath them, whose security has a known expiry condition rather than an indefinite future.

This is why the search for a successor does not look for a better factoring-based scheme or a harder instance of the discrete logarithm. A harder instance of the same problem is still an instance of the same problem, and falls to the same algorithm. The replacement must rest on a hardness that does not reduce to a hidden abelian period — a problem for which no efficient quantum algorithm is known and to which factoring and the discrete logarithm do not connect by any known reduction. The most developed candidates come from the geometry of high-dimensional lattices, where finding a shortest or nearest vector appears to resist classical and quantum attack alike. Moving there means leaving the arithmetic of primes and exponents that has carried public-key cryptography from its first key agreement to its first cipher, and rebuilding the trapdoor on ground a quantum computer does not level. The framework of the earlier pages — what it means to encrypt, to agree on a key, to be secure against a bounded adversary — carries over unchanged. Only the hard problem at the center is exchanged, and exchanging it is the work that follows.