The previous page fixed an encryption scheme as a family of injective maps \(E_e \colon
\mathcal{M} \to \mathcal{C}\), one per key, and treated each \(E_e\) as a fixed function: a given
plaintext under a given key always produces the same ciphertext. We flagged there that this
deterministic choice carries a weakness, and set the repair aside. We collect that debt now,
because the weakness is not a detail of one construction — it is a barrier that no deterministic
scheme can clear, and the security definitions that follow are built to measure exactly what clearing
it requires.
Fix a key and consider what a deterministic \(E_e\) leaks before any question of key recovery
arises. Three failures are structural, present in every deterministic scheme regardless of how hard
its key is to find.
1. Repetition. Because \(E_e\) is a function, equal plaintexts produce
equal ciphertexts: \(m = m'\) forces \(E_e(m) = E_e(m')\). An observer who sees the same ciphertext
twice learns that the same message was sent twice, without breaking anything — the equality of
ciphertexts is visible directly. In a setting where messages recur, the pattern of repetitions is
itself information, and a deterministic scheme transmits it in clear.
2. Distribution dependence. Suppose the adversary knows that the
plaintext is drawn from a small or skewed set — a yes/no instruction, a choice among a handful of
fixed commands, a field whose likely values are few. Under Kerckhoffs's principle the scheme itself
is public; if the encryption transformation is also public to the adversary, as in a public-key
setting, it can encrypt each likely plaintext directly, and even when it is not, a chosen-plaintext
adversary can obtain such encryptions. Either way the adversary can compare against the observed
ciphertext. A deterministic scheme offers no defense here, because the ciphertext it produces for a
given plaintext is a fixed target the adversary can reproduce and match.
3. Partial information. The third sharpens the second: even when the
adversary cannot recover the whole plaintext, a
deterministic scheme may leak some part of it — some predicate, some
fragment, some single bit that the ciphertext determines and the adversary can compute. Recovering
the plaintext in full is only the crudest notion of a break; a scheme that hides the message but
discloses a reliable bit of it has still failed against an adversary who needs only that bit. Making
this precise — what it means to leak nothing, not merely to hide the whole — is the work of the
sections ahead.
All three failures share one root: a deterministic \(E_e\) is a fixed map, so the ciphertext is a
deterministic function of the plaintext, and any structure in the plaintext passes through to
structure in the ciphertext that the adversary can read or reproduce. The repair is to break the
functional dependence. A randomized encryption scheme draws fresh randomness at
encryption time, so that a single plaintext maps to many possible ciphertexts and the particular one
sent carries no reproducible signature of the plaintext's structure.
Definition: Randomized Encryption
A randomized encryption scheme augments encryption with a space \(\mathcal{R}\)
of random values. Encryption under key \(e\) is a map
\[
E_e \colon \mathcal{M} \times \mathcal{R} \to \mathcal{C},
\]
and to encrypt a plaintext \(m\) the sender draws \(r \in \mathcal{R}\) afresh, uniformly and
independently of past encryptions, and transmits \(E_e(m, r)\). Decryption recovers the plaintext
from the ciphertext alone: there is a map \(D_d\) with
\[
D_d\bigl(E_e(m, r)\bigr) = m \quad \text{for every } m \in \mathcal{M} \text{ and every } r \in
\mathcal{R}.
\]
For each fixed \(m\), varying \(r\) ranges over a set of ciphertexts all decrypting to \(m\); the
scheme is no longer a function of the plaintext but a one-to-many assignment, while decryption
remains unambiguous.
The decryption condition forces a structural fact: distinct plaintexts must have disjoint ciphertext
sets. If some ciphertext \(c\) could arise as \(E_e(m, r)\) and also as \(E_e(m', r')\) with \(m \neq
m'\), then \(D_d(c)\) would have to equal both \(m\) and \(m'\), contradicting that decryption returns
a single plaintext. So the ciphertext space partitions: each plaintext owns a private region of
\(\mathcal{C}\), and randomness selects which point of its own region a given encryption lands on.
Re-encrypting a message now lands on a different point, and an adversary that encrypts a candidate
plaintext itself cannot predict which point the sender chose.
Randomization is the precondition for the security this page defines, not an optional hardening of
it. But possibility is not security: a plaintext having many ciphertexts is worthless if the
adversary can still tell one plaintext's ciphertexts from another's. We need a definition that holds
the scheme to the right standard — not "the adversary cannot recover the key," nor even "cannot
recover the plaintext," but "cannot extract anything the plaintext was meant to hide." The next two
sections build that standard in two equivalent forms.
The Indistinguishability Game
We want a definition of security that captures "the ciphertext reveals nothing useful," and we want
it phrased so that it can be checked. The route is to turn security into a contest the adversary
plays and is required to lose. If the best the adversary can do is no better than guessing, the
scheme leaks nothing the adversary could act on.
The contest is sharp precisely because the adversary is handed every advantage short of the key.
Rather than wait for messages it cannot control, the adversary itself names the two plaintexts whose
encryptions it will try to tell apart — the easiest possible discrimination task, since it knows the
candidates exactly. It is then shown the encryption of one of them, chosen by a fair coin it cannot
see, and asked which. A scheme that hides information must defeat the adversary even on this
self-chosen, two-way choice; if it can leak enough to bias that one guess, it has leaked something.
Definition: Indistinguishability of Encryptions
Security is measured by the following game between a challenger holding the key and an adversary
\(\mathcal{A}\), played at security parameter \(n\) (the key length). Throughout, \(\mathcal{A}\)
may encrypt plaintexts of its own choosing: in a public-key setting the encryption key is public,
and in a shared-key setting this is the chosen-plaintext access granted by the attack model of
the previous page.
\(\mathcal{A}\) selects two plaintexts \(m_0, m_1 \in \mathcal{M}\) of equal length and
submits both to the challenger.
The challenger draws a bit \(b \in \{0, 1\}\) uniformly at random and a random value \(r \in
\mathcal{R}\), computes the challenge ciphertext \(c = E_e(m_b, r)\), and
returns \(c\) to \(\mathcal{A}\).
\(\mathcal{A}\) outputs a guess \(b' \in \{0, 1\}\) and wins if \(b' = b\).
The scheme has indistinguishable encryptions if every adversary running in
polynomial time in \(n\) wins with probability at most
\[
\Pr[b' = b] \le \tfrac{1}{2} + \varepsilon(n),
\]
where the advantage \(\varepsilon(n)\) over a blind coin-flip is negligible: it shrinks faster
than the reciprocal of any polynomial in \(n\).
Three features of the game are deliberate, and each answers a question the previous page left open.
The bound is stated against polynomial-time adversaries. This is where the
complexity notions invoked on the previous page enter the definition itself. "The adversary cannot
win" does not mean no winning strategy exists in principle — exhaustive key search always exists —
but that no strategy runs within the resource budget of a feasible attacker, made precise as
time polynomial in
the security parameter.
A scheme is secure against the bounded adversary the previous page described, not against an
unbounded one; the unbounded case is the separate and stronger guarantee taken up later.
The winning margin must be negligible. An adversary can always achieve exactly
\(\tfrac{1}{2}\) by ignoring the ciphertext and flipping its own coin, and can do slightly better by
luck on any given run. What security forbids is a margin a feasible attacker can convert into a
reliable advantage — one that survives being
asymptotically negligible
and so cannot be amplified into a usable edge by any polynomial number of repetitions, since a
polynomial multiple of a negligible quantity remains negligible.
This is the precise form of the "non-negligible probability" the previous page used to separate a
genuine break from a lucky guess.
The adversary chooses the plaintexts. This single feature subsumes both
deterministic failures from the first section at once. If the scheme were deterministic, the
adversary would name any two distinct plaintexts, encrypt \(m_0\) itself, and compare with the
challenge: a match reveals \(b = 0\), a mismatch \(b = 1\), and the adversary wins with certainty.
Indistinguishability is therefore unattainable for deterministic schemes — the game formalizes
exactly why randomization was forced. It also disposes of distribution dependence: by letting the
adversary pick the worst pair of plaintexts, the definition demands security for every distribution
over messages, not merely an average-case one, since the hardest two-point distribution is among the
adversary's choices.
One further constraint of the game is not a design choice but an
honest accounting of what encryption does not promise to hide: \(m_0\) and \(m_1\) are required to
have equal length. A ciphertext generally reveals the
length of its plaintext, and an adversary allowed to submit plaintexts of different lengths could
win trivially by reading off the ciphertext size, defeating the scheme for a reason that has nothing
to do with the secrecy of content. Fixing equal lengths isolates the property we actually want — that
nothing about the content leaks — and the question of concealing length is a separate
concern addressed by other means.
This definition pins security to a single, concrete failure: distinguishing two known plaintexts. It
is operationally clean, but it leaves a question. Hiding a one-bit choice between two messages the
adversary already knows sounds weaker than the intuitive goal — that the ciphertext betray
nothing about a plaintext drawn from any distribution. The next section states that stronger
intuition directly, and the section after proves the two are the same.
Semantic Security
The indistinguishability game is a clean test, but it phrases security negatively and narrowly: the
adversary fails to win one specific two-way guessing contest. The intuition we began with was
broader and positive — a ciphertext should betray nothing about its plaintext that the adversary
could not already have known. We now state that intuition directly, in a form that makes no
reference to any game.
The difficulty is to say what "betray nothing" means without smuggling in a list of forbidden
leaks. We cannot enumerate every fact an adversary might extract — a parity bit, a price bracket,
whether the message names a particular person. The semantic formulation avoids the list by a single
move: it compares two worlds. In one, the adversary holds the ciphertext; in the other, it holds
nothing but the public description of the message distribution. The scheme is secure if these two
worlds are computationally the same — anything the adversary can compute from the ciphertext, it
could already compute without it.
Definition: Semantic Security
A scheme is semantically secure if, for every probability distribution over the
message space and every function \(f\) of the plaintext that an adversary might wish to compute,
whatever a polynomial-time adversary can compute about the plaintext given the ciphertext, it can
also compute in polynomial time without the ciphertext. Formally, for every polynomial-time
adversary \(\mathcal{A}\) given a challenge ciphertext there is a polynomial-time
simulator \(\mathcal{S}\) given only the length of the plaintext, such that the
probability that \(\mathcal{A}\) correctly computes \(f(m)\) exceeds the probability that
\(\mathcal{S}\) does so by only a negligible amount.
The simulator is the heart of the definition. It is a hypothetical algorithm that must reproduce the
adversary's success while being denied the one thing the adversary has — the ciphertext. If such a
simulator always exists, the ciphertext was useless to the adversary: every inference it supported
was already available from public knowledge of the message distribution alone. The simulator may
know everything about how messages are drawn and what \(f\) is, exactly as the adversary does, along
with the length of the plaintext — the one thing a ciphertext does not hide, as the
indistinguishability game already acknowledged. It is denied only the encrypted challenge itself.
This is the definition that retires the third deterministic failure from the first section. There we
observed that a scheme can hide a plaintext in full yet leak a reliable bit of it, and that
recovering the whole message is too crude a notion of a break. Semantic security is the notion that
is not too crude: it forbids leaking any function \(f\) of the plaintext, from the whole
message down to a single predicate, because \(f\) ranges over everything the adversary might want.
A scheme that disclosed even one reliable bit would admit an \(f\) — that bit — which the adversary
computes from the ciphertext and no simulator can match, and the definition would catch it.
Semantic security also clarifies what was meant, on the previous page, by tying secrecy to the whole
message distribution rather than a single message. The guarantee is quantified over every
distribution: it must hold whether the plaintext is a uniform random string or one of two known
commands. This is the same universality the indistinguishability game achieved by letting the
adversary choose the plaintexts, approached now from the opposite side — not "the adversary picks the
worst case" but "the guarantee holds in all cases." That the two routes lead to the same place is
the content of the next section.
The Two Definitions Agree
We now have two definitions of the same intention. Indistinguishability is operational and narrow: win
a two-way guessing game no more often than chance. Semantic security is conceptual and broad: leak no
function of the plaintext beyond what is already public. The first is easy to test against a scheme;
the second captures what we actually want. Their value comes from a single fact: they are the same
condition.
Theorem: Equivalence of Security Definitions
A scheme has indistinguishable encryptions if and only if it is semantically secure.
The two directions are unequal in difficulty, and seeing why is more instructive than a formal
reduction. We give the mechanism of each.
Semantic security implies indistinguishability. This direction is immediate in
contrapositive: an adversary who wins the indistinguishability game is already a violation of
semantic security. Suppose some adversary distinguishes the encryptions of two chosen plaintexts
\(m_0, m_1\) with non-negligible advantage. Consider the function \(f\) that the semantic definition
quantifies over, taken to be the single bit "is the plaintext \(m_1\) rather than \(m_0\)." From the
ciphertext, the winning adversary computes this bit with probability non-negligibly above one half.
No simulator denied the ciphertext can match this: knowing only that the plaintext is one of \(m_0,
m_1\) — each equally likely — the best blind guess of the bit succeeds with probability exactly one
half. The adversary's edge over every simulator is therefore non-negligible, which is precisely the
failure of semantic security. So a semantically secure scheme cannot have a distinguishing
adversary; it has indistinguishable encryptions.
Indistinguishability implies semantic security. This direction carries the real
content, and its mechanism is the construction of a simulator. We are given that no efficient
adversary wins the guessing game, and must produce, for any adversary \(\mathcal{A}\) that computes
some \(f(m)\) from the ciphertext, a simulator \(\mathcal{S}\) that does as well without it. The
simulator's strategy is to run \(\mathcal{A}\) on a decoy: rather than the true ciphertext,
which it does not have, \(\mathcal{S}\) encrypts a fixed dummy plaintext of the correct length —
which it can do, since encryption is available to it as it is to the adversary — drawing the
randomness itself, and feeds the result to \(\mathcal{A}\). Indistinguishability guarantees that
\(\mathcal{A}\) cannot tell it has been fed a decoy rather than a real challenge — if it could, that
detection would itself win the guessing game. So \(\mathcal{A}\) computes \(f\) on the decoy run
essentially as well as it would have on the real ciphertext, and the simulator simply reports
\(\mathcal{A}\)'s output. The ciphertext is thereby shown to have been dispensable: whatever
\(\mathcal{A}\) extracted from it could be extracted from a decoy carrying none of the true
plaintext's information. Any gap between the two runs would be a distinguisher, contradicting the
hypothesis.
The equivalence is what makes the theory workable. One proves indistinguishability — a finite,
game-shaped obligation that reduces to a hardness assumption about some underlying problem — and
obtains semantic security, the guarantee one actually cares about, for free. Every later claim that a
scheme "leaks nothing about the plaintext" is established this way: not by arguing directly that no
function leaks, which quantifies over all functions and all distributions, but by winning a single
guessing game.
The equivalence also closes the account opened on the previous page, where security split into the
unconditional and the computational. Perfect secrecy demanded that the ciphertext be statistically
independent of the plaintext — no information, against an adversary of unlimited power — and we saw
it costs a key as long as the message. Semantic security is the computational shadow of that
demand: the ciphertext need not be independent of the plaintext, only independent as far as a
polynomial-time adversary can tell. Relaxing "no information" to "no efficiently extractable
information" is exactly what lifts the key-length barrier, allowing a short key to protect a long
message — the trade the one-time pad could not make. The unconditional guarantee and its
computational relaxation are the same idea held to two different standards of adversary, and the
definitions of this page are the computational one made exact.
Stronger Notions
Semantic security, in either of its equivalent forms, is the baseline a modern encryption scheme is
expected to meet. It is not the ceiling. The guessing game of this page handed the adversary a
specific power — it chose plaintexts and saw their encryptions — but real adversaries have more.
Strengthening the definition is a matter of enlarging what the adversary may do, and the same
game-shaped template absorbs each enlargement.
The first enlargement follows the attack hierarchy of the previous page. The indistinguishability
game as stated lets the adversary obtain encryptions of plaintexts it chooses; this is security
against a chosen-plaintext attack. But the previous page placed a stronger adversary above it — one
that can also obtain decryptions of ciphertexts of its choosing, even adaptively, even
after seeing the challenge, for every ciphertext but the target itself. Granting the game's
adversary that decryption access, and still demanding it cannot win, defines security against an
adaptive chosen-ciphertext attack. The definition does not change shape — it is the same guessing
game with a more powerful player — but the bar is far higher, and it is the bar that schemes
deployed in adversarial settings are actually held to, because real systems do decrypt
attacker-supplied ciphertexts.
A second strengthening asks for something the guessing game does not directly address:
non-malleability. A scheme is malleable if, given a ciphertext, an adversary can
produce a different ciphertext whose plaintext is related to the original in a controlled way —
without ever learning either plaintext. Indistinguishability forbids reading the message;
non-malleability forbids tampering with it coherently. The two are not independent: a non-malleable
scheme is also semantically secure, so non-malleability sits strictly above the baseline of this
page. The distinction matters wherever a ciphertext is not merely read but acted upon — a transferred
amount, a command — since there the threat is an adversary who alters the effect without breaking
the secrecy.
These notions branch from a single baseline along two independent axes, and the branching pattern is
itself the durable object. Each stronger notion is the same construction — a game in which an
adversary, granted some catalogue of powers, must fail to win — instantiated against a stronger
adversary or aimed at a different goal. The catalogue grows; the shape does not. This is the
framework the previous page promised would survive even a change in the model of computation: the
definitions of security are claims about what a bounded adversary cannot do, and they remain
meaningful whatever the adversary is built from, classical or quantum. What a quantum adversary
changes is not the definitions but which schemes meet them — which underlying problems remain hard
enough to anchor the bound.
That anchoring is the next concern. Every definition on this page is conditional: a scheme is secure
if no efficient adversary wins, and whether one exists turns on the hardness of some computational
problem the scheme is built from. We have defined what security means without yet exhibiting a single
scheme that achieves it, because achieving it requires a problem believed intractable and a way to
bind a scheme's security to that problem. Supplying both is the work the rest of the track takes up.
It begins with the divide the previous page drew between a shared secret key and a public one, and
the number-theoretic problems — factoring, the discrete logarithm — on whose presumed hardness the
first concrete public-key schemes, and the meaning of "secure" made precise here, will rest.