Class NP

Polynomial Verifiability Nondeterministic Polynomial Time (NP) P vs NP Question NP-Completeness

Polynomial Verifiability

The Hamiltonian cycle problem illustrates an important phenomenon called polynomial verifiability. Although no known algorithm can determine whether a Hamiltonian cycle exists in polynomial time (the best known algorithms require exponential time), if someone hands us a proposed cycle, we can verify that it is indeed a valid Hamiltonian cycle in polynomial time — simply by checking that it visits every vertex exactly once.

Definition: Verifiability

A verifier for a language \(A\) is an algorithm \(V\) such that \[ A = \{w \mid V \text{ accepts } \langle w, c \rangle \text{ for some string } c\}. \] We measure the time of a verifier only in terms of the length of \(w\), so a polynomial-time verifier runs in polynomial time in the length of \(w\). A language \(A\) is polynomially verifiable if it has a polynomial-time verifier. The string \(c\) is called a certificate of membership in \(A\). Since the verifier runs in time polynomial in \(|w|\), the certificate \(c\) it inspects has length at most polynomial in \(|w|\) — there is no time to read more.

The string \(c\) plays the role of additional information that supports the claim \(w \in A\); we may think of it as a proof of membership presented by an external source. For the Hamiltonian cycle problem, a certificate is a specific cycle, and the verifier inspects it to confirm that it visits every vertex exactly once.

Nondeterministic Polynomial Time (NP)

Definition: Class NP

NP is the class of languages that have polynomial-time verifiers.

Definition: Nondeterministic Turing Machine

A nondeterministic Turing machine (NTM) is a 7-tuple \((Q, \Sigma, \Gamma, \delta, q_0, q_{\mathrm{accept}}, q_{\mathrm{reject}})\) defined exactly as a Turing machine except that the transition function is multi-valued: \[ \delta: Q \times \Gamma \to \mathcal{P}(Q \times \Gamma \times \{L, R\}), \] where \(\mathcal{P}(\cdot)\) denotes the power set. At each configuration, the machine may have several possible next moves, producing a computation tree rather than a linear sequence. A branch is a root-to-leaf path in this tree.

An NTM \(N\) accepts input \(w\) if at least one branch of its computation on \(w\) ends in state \(q_{\mathrm{accept}}\). It rejects \(w\) if all branches halt in \(q_{\mathrm{reject}}\). The machine decides the language \(L = \{w \mid N \text{ accepts } w\}\) in time \(t(n)\) if every branch on every input of length \(n\) halts within \(t(n)\) steps.

There is an equivalent characterization of NP using NTMs, in which the certificate \(c\) is replaced by a nondeterministic guess. The following theorem makes the equivalence precise.

Definition: Nondeterministic Time Complexity Class

Let \(t\colon \mathbb{N} \to \mathbb{R}^+\) be a function. The nondeterministic time complexity class \[ \text{NTIME}(t(n)) = \{L \mid L \text{ is a language decided by an } O(t(n))\text{-time nondeterministic Turing machine}\}. \]

Theorem: Verifier–NTM Equivalence

A language \(A\) is in NP if and only if some nondeterministic polynomial-time Turing machine decides \(A\). Consequently, \[ NP = \bigcup_{k \in \mathbb{N}} \text{NTIME}(n^k). \]

Proof.

(\(\Rightarrow\)) Verifier \(\Rightarrow\) NTM. Suppose \(A\) has a polynomial-time verifier \(V\) running in time \(n^k\) on inputs \(\langle w, c \rangle\) of length \(|w| = n\). By the certificate-length bound established in Verifiability, we may assume \(|c| \leq n^k\). Construct an NTM \(N\) that, on input \(w\):

  1. Nondeterministically writes a string \(c\) of length at most \(n^k\) on the tape (one symbol per branch, for at most \(n^k\) steps).
  2. Simulates \(V\) on \(\langle w, c \rangle\), accepting iff \(V\) accepts.

Step 1 takes \(O(n^k)\) steps; step 2 takes \(O(n^k)\) steps. The total time per branch is \(O(n^k)\), polynomial in \(n\). Some branch of \(N\) accepts \(w\) iff some certificate \(c\) makes \(V\) accept \(\langle w, c \rangle\), iff \(w \in A\).

(\(\Leftarrow\)) NTM \(\Rightarrow\) Verifier. Suppose an NTM \(N\) decides \(A\) in time \(n^k\). Define a deterministic verifier \(V\) that, on input \(\langle w, c \rangle\):

  1. Interprets \(c\) as a sequence of nondeterministic choices, one per step of \(N\).
  2. Simulates \(N\) on \(w\) following the choices encoded by \(c\), accepting iff this branch ends in \(q_{\mathrm{accept}}\).

Each step of \(N\) has at most \(|\delta_{\max}| := \max_{q,a} |\delta(q,a)|\) choices, which is a finite constant since \(Q\) and \(\Gamma\) are finite. Hence \(c\) of length \(O(n^k \log |\delta_{\max}|) = O(n^k)\) suffices to encode any branch. The simulation runs in time polynomial in \(|w|\). \(V\) accepts \(\langle w, c \rangle\) for some \(c\) iff some branch of \(N\) accepts \(w\), iff \(w \in A\). \(\blacksquare\)

P vs NP Question

Recall that P is the class of languages for which membership can be decided in polynomial time, while NP is the class of languages for which membership can be verified in polynomial time.

Problems like the Hamiltonian cycle are known to be in NP but are not known to be in P. Intuitively, it seems that verifying a solution should be easier than finding one — but this has never been proved. The question of whether \(P = NP\) or \(P \neq NP\) is one of the most important open problems in mathematics and computer science, listed among the Clay Mathematics Institute's Millennium Prize Problems (with a \$1 million prize). Most researchers believe that \(P \neq NP\).

Definition: Class EXPTIME

EXPTIME is the class of languages decidable in exponential time on a deterministic Turing machine: \[ \text{EXPTIME} = \bigcup_{k \in \mathbb{N}} \text{TIME}(2^{n^k}). \]

What is known is the chain of containments \[ P \subseteq NP \subseteq \text{EXPTIME}. \] Whether either inclusion is strict remains open. The two containments are established by the following theorems.

Theorem: \(P \subseteq NP\)

Every language in P is in NP. That is, \(P \subseteq NP\).

Proof.

Let \(A \in P\) and let \(M\) be a deterministic polynomial-time Turing machine deciding \(A\). Define a verifier \(V\) that, on input \(\langle w, c \rangle\), ignores \(c\) and simulates \(M\) on \(w\), accepting iff \(M\) accepts. Since \(M\) runs in polynomial time, so does \(V\). For any \(w \in A\), every \(c\) (including the empty string) is a valid certificate; for any \(w \notin A\), no \(c\) leads \(V\) to accept. Hence \(V\) is a polynomial-time verifier for \(A\), so \(A \in NP\). \(\blacksquare\)

Theorem: \(NP \subseteq \text{EXPTIME}\)

Every language in NP is in EXPTIME. That is, \(NP \subseteq \text{EXPTIME}\).

Proof.

Let \(A \in NP\). By Verifier–NTM Equivalence, some NTM \(N\) decides \(A\) in time \(n^k\) for some constant \(k\). Construct a deterministic TM \(D\) that, on input \(w\) of length \(n\), simulates every branch of \(N\) on \(w\) by exhaustive breadth-first search of the computation tree.

Each step of \(N\) has at most \(b := |\delta_{\max}|\) choices, where \(b\) is a finite constant determined by \(N\). The computation tree of \(N\) on \(w\) has depth at most \(n^k\), so it contains at most \(b^{n^k}\) branches. Simulating one branch takes \(O(n^k)\) steps. Total time: \[ O(n^k \cdot b^{n^k}) = O(n^k \cdot 2^{n^k \log_2 b}) = 2^{O(n^k)}. \] \(D\) accepts iff some branch of \(N\) accepts, iff \(w \in A\). Hence \(D\) runs in time \(2^{c \cdot n^k}\) for some constant \(c\). For \(n \geq c\), this is bounded by \(2^{n^{k+1}}\), so \(A \in \text{TIME}(2^{n^{k+1}}) \subseteq \text{EXPTIME}\). \(\blacksquare\)

NP-Completeness

Some problems in NP are as hard as any problem in NP — solving any one of them in polynomial time would imply \(P = NP\). These are the NP-complete problems. The Hamiltonian cycle problem is a classical example, established by reduction from a known NP-complete problem; we make the general transfer principle precise below.

Definition: Polynomial Time Reduction

Language \(A\) is polynomial-time reducible to language \(B\), denoted \[ A \leq_{P} B, \] if a polynomial-time computable function \(f: \Sigma^* \to \Sigma^*\) exists such that \[ w \in A \iff f(w) \in B \quad \text{for all } w \in \Sigma^*. \] The function \(f\) is called a polynomial-time reduction of \(A\) to \(B\).

Here, a function \(f\) is polynomial-time computable if there exists a polynomial-time Turing machine \(M\) that, on any input \(w\), halts with \(f(w)\) on its tape.

Definition: NP-Completeness

A language \(B\) is NP-complete if it satisfies the following two conditions:

  1. \(B\) is in NP.
  2. Every language \(A\) in NP is polynomial-time reducible to \(B\) (i.e., \(A \leq_P B\)).

A language satisfying condition 2 (but not necessarily condition 1) is called NP-hard.

Theorem: Reduction Transfers Tractability

If \(A \leq_P B\) and \(B \in P\), then \(A \in P\). Consequently, if any NP-complete language belongs to P, then \(P = NP\).

Proof.

Part 1. Let \(f\) be a polynomial-time reduction from \(A\) to \(B\), computable by a TM \(M_f\) in time \(O(n^j)\). Let \(M_B\) be a polynomial-time decider for \(B\) running in time \(O(m^k)\) on inputs of length \(m\). Define a decider \(M_A\) for \(A\) that, on input \(w\) of length \(n\):

  1. Computes \(f(w)\) using \(M_f\), in time \(O(n^j)\).
  2. Runs \(M_B\) on \(f(w)\), accepting iff \(M_B\) accepts.

Since \(M_f\) writes at most \(O(n^j)\) symbols, \(|f(w)| = O(n^j)\). Step 2 then runs in time \(O((n^j)^k) = O(n^{jk})\). Total time \(O(n^j + n^{jk}) = O(n^{jk})\), polynomial in \(n\). By \(A \leq_P B\), \(w \in A \Leftrightarrow f(w) \in B\), so \(M_A\) decides \(A\). Hence \(A \in P\).

Part 2. Suppose \(B\) is NP-complete and \(B \in P\). For any \(A \in NP\), condition 2 of NP-completeness gives \(A \leq_P B\); Part 1 then gives \(A \in P\). Hence \(NP \subseteq P\). Combined with \(P \subseteq NP\), we get \(P = NP\). \(\blacksquare\)

To illustrate, we introduce the closely related Traveling Salesperson Problem.

Example: Traveling Salesperson Problem (TSP) — Decision Version

Given a set of cities, the distances between them, and a number \(k\), is there a tour that visits each city exactly once, returns to the starting city, and has total distance \(\leq k\)?

The decision version of TSP is a classical NP-complete problem (proved via reduction from the Hamiltonian cycle problem). The optimization version — which asks for a tour of minimum total weight, rather than testing whether one of weight \(\leq k\) exists — is NP-hard but not NP-complete, because it is not a decision problem and thus does not belong to NP in the formal sense.

Under the widely believed assumption that \(P \neq NP\), the landscape of complexity classes is as follows:

complexity

Aside: Collapse Under \(P = NP\)

If \(P = NP\), the classes P, NP, and NP-complete all collapse to a single class. To see why, let \(A \in P\) be any nontrivial language (so there exist strings \(a^+ \in A\) and \(a^- \notin A\)). For any \(B \in NP = P\), let \(M_B\) be a polynomial-time decider for \(B\). The map \[ f_B(w) = \begin{cases} a^+ & \text{if } M_B \text{ accepts } w, \\ a^- & \text{otherwise} \end{cases} \] is a polynomial-time reduction \(B \leq_P A\). Combined with \(A \in NP\), this makes \(A\) NP-complete. Hence under \(P = NP\): \[ P = NP = \text{NP-complete (modulo trivial cases)}. \]

Math ↔ CS: Living with Intractability

NP-completeness theory has profound practical implications. When an industrial problem in scheduling, routing, network design, or resource allocation is shown to be NP-complete, an exact polynomial-time algorithm is almost certainly out of reach. The response is not to give up but to change the question: practitioners turn to approximation algorithms with provable suboptimality bounds, heuristics such as simulated annealing or genetic algorithms, fixed-parameter tractable solvers for restricted inputs, and SAT/SMT solvers that exploit problem structure despite worst-case exponential behavior. The boundary between P and NP-complete thus does not end engineering — it reshapes it.