The Ring of Polynomials
In our journey through algebra so far, we have treated groups and rings as abstract containers for symmetry and arithmetic.
To reach the modern standard of security, such as AES (Advanced Encryption Standard), we must change how we
view one of the most familiar objects in mathematics: the polynomial.
The critical shift here is treating polynomials not just as functions where we "plug in" numbers, but as
formal algebraic elements in their own right. By viewing polynomials as members of a ring,
we can mathematically manipulate digital bitstreams as if they were numbers. This abstraction allows us to
construct the finite fields necessary for error-correcting codes and encryption.
Definition: Ring of Polynomials over \(R\)
Let \(R\) be a commutative ring. The set of formal symbols
\[
R[x] = \{a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \mid a_i \in R, \, n \in \mathbb{Z}_{\geq 0}\}
\]
is called the ring of polynomials over \(R\) in the indeterminate \(x\).
Two elements
\[
a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0
\]
and
\[
b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0
\]
of \(R[x]\) are said to be equal if and only if \(a_i = b_i\) for all nonnegative integers \(i\).
Note that \(a_i =0\) if \(i > n\) and \(b_i =0\) if \(i > m\).
Definition: Addition and Multiplication in \(R[x]\)
Let \(R\) be a commutative ring and let
\[
f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0
\]
and
\[
g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0
\]
belong to \(R[x]\). Then
\[
f(x) + g(x) = (a_s + b_s)x^s + (a_{s-1} + b_{s-1} )x^{s-1} + \cdots + (a_1 + b_1) x + a_0 + b_0,
\]
where \(s\) is the maximum of \(m\) and \(n\), \(a_i = 0\) for \(i > n\), and \(b_i = 0\) for \(i > m\).
Also,
\[
f(x)g(x) = c_{m+n} x^{m+n} + c_{m+n-1} x^{m+n-1} + \cdots + c_1 x + c_0,
\]
where
\[
c_k = a_k b_0 + a_{k-1}b_1 + \cdots + a_1 b_{k-1} + a_0 b_k
\]
for \(k = 0, \ldots , m+n\), where we adopt the convention that \(a_i = 0\) for \(i > n\) and \(b_j = 0\) for \(j > m\).
Equivalently, using summation notation:
\[
c_k = \sum_{i=0}^k a_i b_{k-i}.
\]
Under these operations, \(R[x]\) inherits the ring axioms from \(R\) coefficient-wise: associativity
and distributivity reduce to the corresponding identities for finite sums and products of coefficients in \(R\),
the zero polynomial (all coefficients \(0\)) is the additive identity, and additive inverses are obtained by
negating each coefficient. We treat \(R[x]\) as a ring throughout this page.
The formula for \(c_k\) is a discrete convolution - the same operation that appears throughout
signal processing and deep learning. In Convolutional Neural Networks (CNNs), the "convolution" of a filter
with an input follows this identical algebraic pattern. Understanding polynomial multiplication thus provides
insight into why convolution is so natural for processing sequential and spatial data.
Computational Note: For degree-\(n\) polynomials, naive polynomial multiplication requires
\(O(n^2)\) operations, which becomes a bottleneck for high-degree polynomials. However, by leveraging the
Fast Fourier Transform (FFT), we can
reduce this time complexity to \(O(n \log n)\).
The key insight lies in the Convolution Theorem:
while convolution is computationally expensive in the coefficient representation, it transforms into simple
pointwise multiplication in the evaluation (frequency) representation. This FFT-based approach is not just a
theoretical curiosity; it provides the backbone for efficient implementations of CRC, Reed-Solomon codes,
and large-integer arithmetic in modern cryptographic libraries.
Also, for a computer scientist, polynomial addition over \(\mathbb{Z}_2\) is equivalent to a bitwise XOR
operation, making these structures extremely efficient to implement in hardware.
Definition: Leading Coefficient
If a polynomial \(f(x)\) is written by
\[
f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,
\]
where \(a_n \neq 0\), then \(a_n\) is called the leading coefficient of \(f(x)\), and \(n\) is
the degree of \(f(x)\), denoted \(\operatorname{deg} f(x)\).
By convention, the zero polynomial has no degree (or is sometimes assigned degree \(-\infty\)).
This convention ensures the Division Algorithm statement remains valid when the remainder is zero.
Theorem: \(D[x]\) Inherits the Integral Domain Property
If \(D\) is an integral domain, then \(D[x]\) is also an integral domain.
Proof:
Since \(D[x]\) is a ring, we show that \(D[x]\) is commutative with a unity and has no zero-divisors.
Commutativity of \(D[x]\) follows from the commutativity of \(D\): writing \(c_k = \sum_{i=0}^k a_i b_{k-i}\)
for the coefficients of \(f(x)g(x)\) and \(d_k = \sum_{j=0}^k b_j a_{k-j}\) for those of \(g(x)f(x)\),
the substitution \(j = k - i\) reindexes the second sum as \(d_k = \sum_{i=0}^k b_{k-i} a_i = \sum_{i=0}^k a_i b_{k-i} = c_k\),
where the second equality uses commutativity of \(D\).
For the unity, let \(1\) denote the unity of \(D\); then the constant polynomial \(f(x) = 1\) satisfies
\(1 \cdot g(x) = g(x)\) for every \(g(x) \in D[x]\) (the product formula reduces to copying \(g\)'s coefficients),
so \(f(x) = 1\) is the unity of \(D[x]\).
Now suppose that
\[
f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0
\]
and
\[
g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0,
\]
where \(a_n \neq 0\) and \(b_m \neq 0\). The coefficient of \(x^{n+m}\) in \(f(x)g(x)\) is
\[
c_{n+m} = \sum_{i=0}^{n+m} a_i b_{n+m-i}.
\]
For \(i < n\), the index \(n+m-i > m\) forces \(b_{n+m-i} = 0\); for \(i > n\), \(a_i = 0\).
Thus only the term \(i = n\) survives, giving \(c_{n+m} = a_n b_m\). Since
\(D\) is an integral domain and \(a_n, b_m \neq 0\), we have \(a_n b_m \neq 0\).
Therefore, \(f(x)g(x) \neq 0\), which shows that \(D[x]\)
has no zero-divisors. Hence, \(D[x]\) is an integral domain.
This result tells us that if we start with "nice" coefficient rings (like \(\mathbb{Z}\) or
any field), the polynomial ring inherits that niceness. In particular, since every field is an integral domain,
\(F[x]\) is always an integral domain when \(F\) is a field.
Corollary: Degree of a Product in an Integral Domain
Let \(D\) be an integral domain and let \(f(x), g(x) \in D[x]\) be nonzero polynomials. Then
\[
\operatorname{deg}(f(x)g(x)) = \operatorname{deg} f(x) + \operatorname{deg} g(x).
\]
This is immediate from the proof above: the leading coefficient of \(f(x)g(x)\) is \(a_n b_m \neq 0\).
For computer scientists, this guarantees that polynomial multiplication behaves predictably -
there is no "degree collapse" that could corrupt data in polynomial-based encodings.
The Division Algorithm
The power of integers lies in our ability to perform division with remainders (modular arithmetic). The Division Algorithm
for polynomials provides the same capability. This is the foundation of Cyclic Redundancy Checks (CRC) and
the construction of extension fields.
Theorem: Division Algorithm for \(F[x]\)
Let \(F\) be a field and let \(f(x), g(x) \in F[x]\) with \(g(x) \neq 0\). Then there exist unique
polynomials \(q(x)\) and \(r(x)\) in \(F[x]\) such that
\[
f(x) = g(x)q(x) + r(x)
\]
and either \(r(x) =0 \) or \(\operatorname{deg} r(x) < \operatorname{deg} g(x)\).
Proof:
Existence. We argue by strong induction on \(n = \operatorname{deg} f(x)\).
Base case: if \(f(x) = 0\) or \(\operatorname{deg} f(x) < \operatorname{deg} g(x)\),
set \(q(x) = 0\) and \(r(x) = f(x)\); the required conditions are immediate.
Inductive step: assume \(n = \operatorname{deg} f(x) \geq \operatorname{deg} g(x) = m\),
and write
\[
f(x) = a_n x^n + \cdots + a_0, \quad g(x) = b_m x^m + \cdots + b_0, \quad a_n \neq 0,\ b_m \neq 0.
\]
Since \(F\) is a field, \(b_m^{-1}\) exists. Define
\[
f_1(x) = f(x) - a_n b_m^{-1} x^{n-m} g(x).
\]
The coefficient of \(x^n\) in \(f_1(x)\) is \(a_n - a_n b_m^{-1} \cdot b_m = 0\), so either
\(f_1(x) = 0\) or \(\operatorname{deg} f_1(x) < n\).
Applying the induction hypothesis to \(f_1(x)\) (whose degree is strictly less than \(n\),
or which is zero), there exist \(q_1(x), r_1(x) \in F[x]\) with
\[
f_1(x) = g(x) q_1(x) + r_1(x), \quad r_1(x) = 0 \text{ or } \operatorname{deg} r_1(x) < \operatorname{deg} g(x).
\]
Substituting back:
\[
f(x) = a_n b_m^{-1} x^{n-m} g(x) + g(x) q_1(x) + r_1(x) = g(x)\bigl[a_n b_m^{-1} x^{n-m} + q_1(x)\bigr] + r_1(x).
\]
Setting \(q(x) = a_n b_m^{-1} x^{n-m} + q_1(x)\) and \(r(x) = r_1(x)\), we have \(f(x) = g(x) q(x) + r(x)\),
and the condition on \(r(x)\) is inherited from \(r_1(x)\).
Uniqueness. Suppose \(f(x) = g(x) q(x) + r(x) = g(x) \bar{q}(x) + \bar{r}(x)\),
where each of \(r(x), \bar{r}(x)\) is either \(0\) or of degree less than \(\operatorname{deg} g(x)\).
Subtracting:
\[
g(x)\bigl[q(x) - \bar{q}(x)\bigr] = \bar{r}(x) - r(x).
\]
Suppose for contradiction that \(q(x) - \bar{q}(x) \neq 0\). Since \(F[x]\) is an
integral domain, the
degree of the product
on the left equals
\(\operatorname{deg} g(x) + \operatorname{deg}\bigl(q(x) - \bar{q}(x)\bigr) \geq \operatorname{deg} g(x)\).
On the other hand, since each of \(r(x), \bar{r}(x)\) is either \(0\) or has degree strictly less than
\(\operatorname{deg} g(x)\), the difference \(\bar{r}(x) - r(x)\) is either \(0\) or has degree strictly
less than \(\operatorname{deg} g(x)\) — in either case it cannot have degree \(\geq \operatorname{deg} g(x)\),
a contradiction.
Hence \(q(x) - \bar{q}(x) = 0\), giving \(q(x) = \bar{q}(x)\). Substituting back into
\(g(x)[q(x) - \bar{q}(x)] = \bar{r}(x) - r(x)\) yields \(\bar{r}(x) = r(x)\).
This theorem ensures that we can always "reduce" a large polynomial modulo a smaller one, creating a
finite set of remainders. Two immediate and powerful consequences follow:
Corollary: Remainder Theorem
Let \(F\) be a field, \(a \in F\), and \(f(x) \in F[x]\). Then \(f(a)\) is the remainder in
the division of \(f(x)\) by \(x - a\).
Proof:
By the
Division Algorithm, \(f(x) = (x - a) q(x) + r(x)\) with \(r(x) = 0\) or
\(\operatorname{deg} r(x) < \operatorname{deg}(x - a) = 1\); that is, \(r(x)\) is a constant \(c \in F\).
Substituting \(x = a\) gives \(f(a) = (a - a)q(a) + c = c\), so the remainder equals \(f(a)\).
Corollary: Factor Theorem
Let \(F\) be a field, \(a \in F\), and \(f(x) \in F[x]\). Then \(a\) is a zero of \(f(x)\)
if and only if \(x -a\) is a factor of \(f(x)\).
Proof:
By the
Remainder Theorem, the remainder
of \(f(x)\) divided by \(x - a\) equals \(f(a)\). Hence \(x - a\) divides \(f(x)\) (i.e., the remainder is \(0\))
if and only if \(f(a) = 0\), i.e., \(a\) is a zero of \(f(x)\).
Theorem: A Polynomial of Degree \(n\) Has at Most \(n\) Zeros
A polynomial of degree \(n\) over a field has at most \(n\) zeros, counting multiplicity.
Proof:
We argue by induction on \(n = \operatorname{deg} f(x)\). The case \(n = 0\) is immediate: a nonzero constant
polynomial has no zeros. For \(n \geq 1\), if \(f(x)\) has no zeros in \(F\), the bound holds vacuously.
Otherwise, let \(a\) be a zero of \(f(x)\); by the
Factor Theorem,
\(f(x) = (x - a) q(x)\) for some \(q(x) \in F[x]\) with \(\operatorname{deg} q(x) = n - 1\) (by
Degree of a Product,
using that \(F\) is an integral domain). By induction, \(q(x)\) has at most \(n - 1\) zeros (counted with multiplicity). Any zero \(b\) of \(f\) other than \(a\)
satisfies \((b - a) q(b) = 0\), and since \(b - a \neq 0\) and \(F\) has no zero-divisors, \(q(b) = 0\).
Thus \(f\) has at most \(1 + (n - 1) = n\) zeros. The same bound holds when zeros are counted with multiplicity:
if \(a\) appears in \(f\) with multiplicity \(k \geq 1\), then \(a\) appears in \(q\) with multiplicity \(k - 1\),
and the multiplicities of all other zeros of \(f\) coincide with their multiplicities in \(q\); the total multiplicity
of zeros of \(f\) is therefore \(1\) (for the new factor \((x - a)\)) plus the total multiplicity of zeros of \(q\),
which is at most \(n - 1\) by induction.
The Division Algorithm does more than just count zeros - it reveals the ideal structure of polynomial rings.
Recall that an ideal is a subset closed under addition and "absorbing" multiplication from the ring.
In \(F[x]\), every ideal turns out to have a remarkably simple form:
Definition: Principal Ideal Domain (PID)
A principal ideal domain is an integral domain \(R\) in which every ideal has
the form
\[
\langle a \rangle = \{ra \mid r \in R\}
\]
for some \(a \in R\).
Theorem: \(F[x]\) is a Principal Ideal Domain
Let \(F\) be a field. Then \(F[x]\) is a principal ideal domain.
Proof:
First, we know that \(F[x]\) is an
integral domain. Let \(I\) be an ideal of \(F[x]\).
If \(I = \{0\}\), then \(I = \langle 0 \rangle\) (since \(\langle 0 \rangle = \{r \cdot 0 \mid r \in F[x]\} = \{0\}\)).
If \(I \neq \{0\}\), then among all the nonzero elements of \(I\), let \(g(x)\) be
one of minimum degree.
Since \(g(x) \in I\) and \(I\) is an ideal, every multiple of \(g(x)\) lies in \(I\); that is,
\(\langle g(x) \rangle = \{r(x) g(x) \mid r(x) \in F[x]\} \subseteq I\).
Conversely, let \(f(x) \in I\).
Then, by the
Division Algorithm,
\[
f(x) = g(x)q(x) + r(x),
\]
where \(r(x) = 0\) or \(\operatorname{deg} r(x) < \operatorname{deg} g(x)\).
Since \(r(x) = f(x) - g(x)q(x) \in I\), the minimality of \(\operatorname{deg} g(x)\) implies that the latter condition
cannot hold. So, \(r(x) = 0\) and thus \(f(x) \in \langle g(x) \rangle\). Therefore, \(I \subseteq \langle g(x) \rangle\).
Combining both inclusions gives \(I = \langle g(x) \rangle\), so \(I\) is principal. Since this holds for every ideal of \(F[x]\),
\(F[x]\) is a principal ideal domain.
The fact that \(F[x]\) is a PID has profound algorithmic consequences. Every ideal being principal means
that for any set of polynomials, we can find a single "Greatest Common Divisor" that generates the same ideal -
this is crucial for the Extended Euclidean Algorithm, which is used in AES to find multiplicative inverses.
Irreducibility
In the world of integers, prime numbers are the atoms that cannot be split. In the ring of polynomials, these "atoms" are called
irreducible polynomials. Understanding how to find them is the key to building secure cryptographic fields.
Definition: Irreducible Polynomial
Let \(D\) be an integral domain. A polynomial \(f(x)\) from \(D[x]\) that is neither the zero polynomial nor
a unit in \(D[x]\) is said to be irreducible over \(D\) if, whenever \(f(x)\) is expressed
as a product \(f(x) = g(x)h(x)\), with \(g(x)\) and \(h(x)\) from \(D[x]\), then \(g(x)\) or \(h(x)\) is a
unit in \(D[x]\).
Also, a nonzero, nonunit element of \(D[x]\) that is not irreducible over \(D\) is called reducible
over \(D\).
Important: Irreducibility is relative to the ring. The polynomial \(x^2 - 2\) is
irreducible over \(\mathbb{Q}\) (it has no rational roots), but it factors as \((x - \sqrt{2})(x + \sqrt{2})\)
over \(\mathbb{R}\). Similarly, \(x^2 + 1\) is irreducible over \(\mathbb{R}\) but factors as \((x-i)(x+i)\)
over \(\mathbb{C}\). For cryptography, we specifically need irreducibility over finite fields like \(\mathbb{Z}_2\).
Theorem: Reducibility Test for Degrees 2 and 3
Let \(F\) be a field. If \(f(x) \in F[x]\) and \(\operatorname{deg} f(x)\) is 2 or 3, then \(f(x)\) is reducible
over \(F\) if and only if \(f(x)\) has a zero in \(F\).
Proof:
(\(\Rightarrow\)) Suppose \(f(x)\) is reducible: there exist \(g(x), h(x) \in F[x]\) both nonunit with
\(f(x) = g(x) h(x)\). Since \(F[x]\) units are exactly the nonzero constants, both \(g(x)\) and \(h(x)\) have degree
\(\geq 1\). Since \(F\) is a field (hence an integral domain), the
degree of the product gives
\(\operatorname{deg} f(x) = \operatorname{deg} g(x) + \operatorname{deg} h(x)\). Because \(\operatorname{deg} f(x)\) is 2 or 3,
at least one of \(g(x), h(x)\) must have degree 1. Say \(g(x) = ax + b\) with \(a \neq 0\). Then \(-a^{-1}b \in F\) is a zero
of \(g(x)\), and therefore a zero of \(f(x)\).
(\(\Leftarrow\)) Suppose \(f(\alpha) = 0\) for some \(\alpha \in F\). By the
Factor Theorem, \(x - \alpha\) is a factor of \(f(x)\),
so \(f(x) = (x - \alpha) q(x)\) for some \(q(x) \in F[x]\). By the
degree of the product,
\(\operatorname{deg} q(x) = \operatorname{deg} f(x) - 1 \geq 1\), so \(q(x)\) is nonconstant.
Both \(x - \alpha\) (degree 1) and \(q(x)\) (degree \(\geq 1\)) have positive degree, so neither is a unit in \(F[x]\).
Moreover, \(\operatorname{deg} f(x) \geq 2\) ensures \(f(x)\) is itself nonzero and nonunit. Hence \(f(x)\) is reducible over \(F\).
For higher-degree polynomials, we need more sophisticated tools. The key insight is to leverage the relationship
between polynomials over \(\mathbb{Z}\) and polynomials over \(\mathbb{Q}\). This requires the concept of
content:
Definition: Content of a Polynomial, Primitive Polynomial
The content of a nonzero polynomial \(a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0\), where the \(a\)'s
are integers, is the greatest common divisor of the integers \(a_n, a_{n-1}, \cdots, a_0\). A primitive polynomial
is an element of \(\mathbb{Z}[x]\) with content \(1\).
Theorem: Gauss's Lemma
The product of two primitive polynomials is primitive.
Proof:
Let \(f(x), g(x) \in \mathbb{Z}[x]\) be primitive, and suppose for contradiction that \(f(x)g(x)\) is not primitive.
Then there exists a prime \(p\) dividing the content of \(f(x)g(x)\); that is, \(p\) divides every coefficient of \(f(x)g(x)\).
Let \(\bar{f}(x), \bar{g}(x) \in \mathbb{Z}_p[x]\) be the polynomials obtained from \(f(x), g(x)\) by reducing each
coefficient modulo \(p\). The map \(\mathbb{Z}[x] \to \mathbb{Z}_p[x]\) given by coefficient-wise reduction is a ring
homomorphism, so \(\overline{f(x)g(x)} = \bar{f}(x)\bar{g}(x)\). By hypothesis, \(\overline{f(x)g(x)} = 0\) in \(\mathbb{Z}_p[x]\).
Since \(p\) is prime,
\(\mathbb{Z}_p\) is a field,
hence an integral domain. By the result that
\(D[x]\) is an integral domain whenever \(D\) is,
\(\mathbb{Z}_p[x]\) is an integral domain. Thus \(\bar{f}(x)\bar{g}(x) = 0\) forces \(\bar{f}(x) = 0\) or \(\bar{g}(x) = 0\).
The first case means \(p\) divides every coefficient of \(f(x)\), contradicting the primitivity of \(f(x)\); the second
case contradicts the primitivity of \(g(x)\). This contradiction completes the proof.
At first glance, Gauss's Lemma may seem like a niche property of primitive polynomials. However, its true power
lies in how it connects the arithmetic of integers to the arithmetic of rational numbers. Since any polynomial with
rational coefficients can be transformed into a primitive integer polynomial by clearing denominators, we obtain
the following crucial corollary:
Theorem: Reducibility over \(\mathbb{Q}\) Implies Reducibility over \(\mathbb{Z}\)
Let \(f(x) \in \mathbb{Z}[x]\). If \(f(x) = g(x) h(x)\) with \(g(x), h(x) \in \mathbb{Q}[x]\) and
\(\operatorname{deg} g(x), \operatorname{deg} h(x) < \operatorname{deg} f(x)\), then there exist
\(g_1(x), h_1(x) \in \mathbb{Z}[x]\) with \(f(x) = g_1(x) h_1(x)\), \(\operatorname{deg} g_1(x) = \operatorname{deg} g(x)\),
and \(\operatorname{deg} h_1(x) = \operatorname{deg} h(x)\). In particular, if \(f(x)\) is reducible over \(\mathbb{Q}\),
then it is reducible over \(\mathbb{Z}\).
Proof:
Reduction to primitive \(f(x)\). Write \(f(x) = c \cdot f'(x)\) where \(c \in \mathbb{Z}\) is the
content of \(f(x)\) and \(f'(x) \in \mathbb{Z}[x]\) is primitive. Then \(f(x) = g(x) h(x)\) yields
\(f'(x) = (g(x)/c) \cdot h(x)\) in \(\mathbb{Q}[x]\), still with degrees \(\operatorname{deg}(g(x)/c) = \operatorname{deg} g(x)\)
and \(\operatorname{deg} h(x)\), both \(< \operatorname{deg} f'(x) = \operatorname{deg} f(x)\).
If we prove the result for the primitive case — obtaining \(f'(x) = g_1'(x) h_1'(x)\) in \(\mathbb{Z}[x]\) with the
required degree equalities — then \(f(x) = (c \cdot g_1'(x)) \cdot h_1'(x)\) is the required \(\mathbb{Z}\)-factorization,
with \(\operatorname{deg}(c \cdot g_1'(x)) = \operatorname{deg} g_1'(x) = \operatorname{deg} g(x)\).
So we may assume \(f(x)\) is primitive.
Main argument. Let \(a\) be the least common multiple of the denominators of the coefficients of \(g(x)\),
and \(b\) the least common multiple of the denominators of the coefficients of \(h(x)\). Then \(a g(x), b h(x) \in \mathbb{Z}[x]\), and
\[
a b f(x) = a g(x) \cdot b h(x).
\]
Let \(c_1\) be the content of \(a g(x)\) and \(c_2\) the content of \(b h(x)\). Writing \(a g(x) = c_1 g_1(x)\) and
\(b h(x) = c_2 h_1(x)\) where \(g_1(x), h_1(x) \in \mathbb{Z}[x]\) are primitive, we obtain
\[
a b f(x) = c_1 c_2 g_1(x) h_1(x).
\]
By
Gauss's Lemma, \(g_1(x) h_1(x)\) is primitive,
so the content of the right side is \(c_1 c_2\). Since \(f(x)\) is primitive, the content of the left side is \(ab\).
Therefore \(ab = c_1 c_2\), so \(a b f(x) = ab \cdot g_1(x) h_1(x)\). Since \(\mathbb{Z}[x]\) is an
integral domain and \(ab \neq 0\), we may cancel to obtain
\[
f(x) = g_1(x) h_1(x), \quad g_1(x), h_1(x) \in \mathbb{Z}[x].
\]
Since dividing a polynomial by a nonzero integer constant does not change its degree,
\(\operatorname{deg} g_1(x) = \operatorname{deg}(a g(x)) = \operatorname{deg} g(x)\), and similarly
\(\operatorname{deg} h_1(x) = \operatorname{deg} h(x)\).
While it may seem intuitive that integer polynomials should have integer factors, this result provides the formal guarantee
that allows us to ignore the infinite complexity of rational numbers.
Specifically, it ensures that if a polynomial \(f(x) \in \mathbb{Z}[x]\) cannot be factored using integer coefficients,
it is mathematically impossible to find a factorization using rational coefficients either. This effectively reduces a
search across the infinite set \(\mathbb{Q}\) to a finite, decidable problem within \(\mathbb{Z}\) - a necessity for any computational approach
to irreducibility.
Theorem: Mod \(p\) Irreducibility Test
Let \(f(x) \in \mathbb{Z}[x]\) with \(\operatorname{deg} f(x) > 1\) and let \(p\) be a prime that does not divide the
lead coefficient of \(f(x)\) nor the constant term of \(f(x)\). Let \(\bar{f}(x)\) be the polynomial in
\(\mathbb{Z}_p [x]\) obtained from \(f(x)\) by reducing all the coefficients of \(f(x)\) modulo \(p\). If
\(\bar{f}(x)\) is irreducible over \(\mathbb{Z}_p\), then \(f(x)\) is irreducible over \(\mathbb{Q}\).
Proof:
We prove the contrapositive: if \(f(x)\) is reducible over \(\mathbb{Q}\), then \(\bar{f}(x)\) is reducible over \(\mathbb{Z}_p\).
Suppose \(f(x)\) is reducible over \(\mathbb{Q}\); that is, \(f(x) = g(x) h(x)\) with \(g(x), h(x) \in \mathbb{Q}[x]\)
both nonunit, equivalently with \(\operatorname{deg} g(x), \operatorname{deg} h(x) \geq 1\).
By the
degree-preserving \(\mathbb{Z}\)-factorization,
we may assume \(g(x), h(x) \in \mathbb{Z}[x]\) with \(\operatorname{deg} g(x), \operatorname{deg} h(x) < \operatorname{deg} f(x)\).
Let \(\bar{g}(x), \bar{h}(x)\) be obtained from \(g(x), h(x)\) by reducing coefficients modulo \(p\).
Since coefficient-wise reduction is a ring homomorphism \(\mathbb{Z}[x] \to \mathbb{Z}_p[x]\), we have
\(\bar{f}(x) = \bar{g}(x) \bar{h}(x)\).
Since \(p\) does not divide the lead coefficient of \(f(x)\), the leading term survives reduction,
so \(\operatorname{deg} \bar{f}(x) = \operatorname{deg} f(x)\). Reduction modulo \(p\) cannot increase degree,
so \(\operatorname{deg} \bar{g}(x) \leq \operatorname{deg} g(x) < \operatorname{deg} f(x) = \operatorname{deg} \bar{f}(x)\),
and similarly \(\operatorname{deg} \bar{h}(x) < \operatorname{deg} \bar{f}(x)\).
The hypothesis \(p \nmid\) constant term of \(f(x)\) ensures \(\bar{f}(0) \neq 0\), so \(\bar{g}(0) \neq 0\) and
\(\bar{h}(0) \neq 0\); in particular, neither \(\bar{g}(x)\) nor \(\bar{h}(x)\) is the zero polynomial.
Since \(p\) is prime,
\(\mathbb{Z}_p\) is a field,
hence an integral domain, so by the result that
\(D[x]\) is an integral domain whenever \(D\) is,
\(\mathbb{Z}_p[x]\) is also an integral domain. The
degree of the product then gives
\(\operatorname{deg} \bar{g}(x) + \operatorname{deg} \bar{h}(x) = \operatorname{deg} \bar{f}(x)\).
Combined with \(\operatorname{deg} \bar{g}(x), \operatorname{deg} \bar{h}(x) < \operatorname{deg} \bar{f}(x)\), this forces
\(\operatorname{deg} \bar{g}(x) \geq 1\) and \(\operatorname{deg} \bar{h}(x) \geq 1\); otherwise one factor would be a
nonzero constant and the other would have degree equal to \(\operatorname{deg} \bar{f}(x)\).
Hence neither \(\bar{g}(x)\) nor \(\bar{h}(x)\) is a unit in \(\mathbb{Z}_p[x]\), and \(\bar{f}(x) = \bar{g}(x) \bar{h}(x)\)
is a nontrivial factorization. So \(\bar{f}(x)\) is reducible over \(\mathbb{Z}_p\), completing the contrapositive.
The Mod \(p\) Test is a powerful bridge between infinite and finite structures.
In the world of rational numbers \(\mathbb{Q}\), checking for irreducibility is a global problem. By reducing the
coefficients modulo a prime \(p\), we project the polynomial into a finite field \(\mathbb{Z}_p\).
Theorem: Eisenstein's Criterion (1850)
Let
\[
f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0 \in \mathbb{Z}[x].
\]
If there is a prime \(p\) such that:
- \(p \nmid a_n\) (the prime does not divide the leading coefficient)
- \(p \mid a_{n-1}, \cdots, p \mid a_0\) (the prime divides all other coefficients)
- \(p^2 \nmid a_0\) (the square of the prime does not divide the constant term)
Then \(f(x)\) is irreducible over \(\mathbb{Q}\).
Proof Sketch:
We argue by contradiction. Suppose \(f(x)\) is reducible over \(\mathbb{Q}\). By the
degree-preserving Z-factorization,
there exist \(g(x) = b_r x^r + \cdots + b_0\) and \(h(x) = c_s x^s + \cdots + c_0\) in \(\mathbb{Z}[x]\) with
\(f(x) = g(x) h(x)\), \(1 \leq r, s\), and \(r + s = n\).
Comparing constant terms, \(a_0 = b_0 c_0\). Conditions (2) and (3) — \(p \mid a_0\) but \(p^2 \nmid a_0\) —
force \(p\) to divide exactly one of \(b_0, c_0\). Without loss of generality, say \(p \mid b_0\) and \(p \nmid c_0\).
Comparing leading coefficients, \(a_n = b_r c_s\); by condition (1), \(p \nmid a_n\), so \(p \nmid b_r\) (and \(p \nmid c_s\)).
Since \(p \mid b_0\) but \(p \nmid b_r\), there is a least index \(t\) with \(1 \leq t \leq r\) such that \(p \nmid b_t\).
Now examine the coefficient
\[
a_t = \sum_{i + j = t} b_i c_j = b_t c_0 + \sum_{\substack{i + j = t \\ i < t}} b_i c_j,
\]
where the second sum ranges over the valid index pairs \((i, j)\) with \(0 \leq i \leq r\), \(0 \leq j \leq s\),
and \(i < t\) (this sum has the form \(b_{t-1} c_1 + b_{t-2} c_2 + \cdots\), truncated at \(j = s\) when \(t > s\)).
Since \(t \leq r < n\), condition (2) gives \(p \mid a_t\). By the choice of \(t\), every term in the second sum
involves some \(b_i\) with \(i < t\) and is therefore divisible by \(p\). Subtracting, \(p \mid b_t c_0\).
But \(p\) is prime, \(p \nmid b_t\), and \(p \nmid c_0\) — a contradiction.
This is remarkably powerful because it allows us to verify the irreducibility of a polynomial almost at a glance,
without needing to perform long division or search for roots. It provides a sufficient (but not necessary) condition;
if a polynomial doesn't meet these criteria, it might still be irreducible, but if it does meet them, its
irreducibility is guaranteed.
For example, consider the polynomial \(f(x) = 3x^5 + 10x^4 + 0x^3 + 10x^2 + 0x + 20\).
Choosing \(p = 5\):
- \(5 \nmid 3\) (the leading coefficient; Condition 1 met)
- \(5\) divides all other coefficients: \(10, 0, 10, 0, 20\) (Condition 2 met)
- \(5^2 = 25 \nmid 20\) (the constant term; Condition 3 met)
Thus, \(f(x)\) is irreducible over \(\mathbb{Q}\).
In cryptography and error-correction, we often need to construct large extension fields. Eisenstein's Criterion provides a
constructive way to generate irreducible polynomials of any degree \(n\) we desire - simply choose a prime \(p\) and set
the coefficients accordingly. This ensures the mathematical integrity of the field we are building.
Algorithmic Perspective: Testing irreducibility is a fundamental operation in Computer Algebra Systems (CAS)
like Mathematica. The tests we've covered (degree 2-3 test, Mod \(p\) test, Eisenstein's Criterion)
are exactly what these systems use internally. For large polynomials over finite fields, more sophisticated algorithms
are employed: Berlekamp's algorithm (1967), a deterministic method based on linear algebra, and the
Cantor-Zassenhaus algorithm (1981), a probabilistic method that is now the dominant approach
in modern implementations.
Theorem: \(\langle p(x) \rangle\) is Maximal iff \(p(x)\) is Irreducible
Let \(F\) be a field and let \(p(x) \in F[x]\).
Then \(\langle p(x) \rangle\) is a maximal ideal in \(F[x]\) if and only if \(p(x)\) is irreducible over \(F\).
Proof:
(\(\Rightarrow\)) Suppose \(\langle p(x) \rangle\) is a
maximal ideal in \(F[x]\).
Since maximal ideals are proper, \(\langle p(x) \rangle \neq F[x]\), so \(p(x)\) is not a unit.
Also \(\langle p(x) \rangle \neq \{0\}\) — otherwise, by the
characterization of factor rings as fields via maximal ideals,
the quotient \(F[x]/\{0\} \cong F[x]\) would be a field, but \(x \in F[x]\) has no
multiplicative inverse — so \(p(x) \neq 0\).
Suppose \(p(x) = g(x) h(x)\) with \(g(x), h(x) \in F[x]\). Then \(p(x) \in \langle g(x) \rangle\), so
\(\langle p(x) \rangle \subseteq \langle g(x) \rangle \subseteq F[x]\). By maximality, either
\(\langle g(x) \rangle = \langle p(x) \rangle\) or \(\langle g(x) \rangle = F[x]\). In the second case,
\(1 \in \langle g(x) \rangle\), so \(g(x)\) is a unit. In the first case, \(g(x) \in \langle p(x) \rangle\), so
\(g(x) = p(x) \ell(x)\) for some \(\ell(x) \in F[x]\); substituting, \(p(x) = p(x) \ell(x) h(x)\), and since
\(p(x) \neq 0\) and \(F[x]\) is an
integral domain, we may cancel to obtain
\(\ell(x) h(x) = 1\), so \(h(x)\) is a unit. In either case, one of \(g(x), h(x)\) is a unit, so \(p(x)\) is irreducible.
(\(\Leftarrow\)) Suppose \(p(x)\) is irreducible over \(F\), and let \(I\) be any ideal of \(F[x]\) with
\(\langle p(x) \rangle \subseteq I \subseteq F[x]\). Since
\(F[x]\) is a principal ideal domain,
\(I = \langle g(x) \rangle\) for some \(g(x) \in F[x]\). Now \(p(x) \in \langle g(x) \rangle\), so
\(p(x) = g(x) h(x)\) for some \(h(x) \in F[x]\). Note \(g(x) \neq 0\), since otherwise \(I = \{0\}\) would not contain
the nonzero \(p(x)\). By irreducibility, either \(g(x)\) or \(h(x)\) is a unit in \(F[x]\).
If \(g(x)\) is a unit, then \(1 = g(x) \cdot g(x)^{-1} \in \langle g(x) \rangle = I\), so \(I = F[x]\).
If \(h(x)\) is a unit, then \(g(x) = p(x) h(x)^{-1} \in \langle p(x) \rangle\), so
\(I = \langle g(x) \rangle \subseteq \langle p(x) \rangle\); combined with \(\langle p(x) \rangle \subseteq I\),
this gives \(I = \langle p(x) \rangle\). Finally, \(\langle p(x) \rangle\) is proper: since \(p(x)\) is irreducible,
\(p(x)\) is not a unit, so \(\langle p(x) \rangle \neq F[x]\). Hence \(\langle p(x) \rangle\) is a proper ideal, and
the only ideals of \(F[x]\) containing \(\langle p(x) \rangle\) are \(\langle p(x) \rangle\) and \(F[x]\) themselves;
therefore \(\langle p(x) \rangle\) is maximal.
This theorem is the cornerstone for constructing new fields. Combining it with the
characterization of factor rings as fields via maximal ideals,
we reach the ultimate goal of this chapter:
Corollary: \(F[x] / \langle p(x) \rangle\) Is a Field iff \(p(x)\) Is Irreducible
Let \(F\) be a field and let \(p(x) \in F[x]\). Then the quotient ring \(F[x] / \langle p(x) \rangle\) is a field
if and only if \(p(x)\) is irreducible over \(F\).
For a computer scientist, this is the "Factory" for creating finite fields. If you need a field with
\(2^8 = 256\) elements (like in AES), you start with the base field \(\mathbb{Z}_2\), find an irreducible
polynomial of degree 8, and use it to form the quotient ring. Because the polynomial is irreducible, the
resulting structure is guaranteed to have all the properties of a field including the existence of multiplicative
inverses for every non-zero byte.
Theorem: Unique Factorization in \(\mathbb{Z}[x]\)
Every polynomial in \(\mathbb{Z}[x]\) that is not the zero polynomial or a unit in \(\mathbb{Z}[x]\) can be
written in the form
\[
b_1 b_2 \cdots b_s p_1(x)p_2(x) \cdots p_m(x),
\]
where the \(b_i\)'s are irreducible polynomials of degree \(0\) and the \(p_i(x)\)'s are irreducible polynomials
of positive degree in \(\mathbb{Z}[x]\).
Furthermore, if
\[
b_1 b_2 \cdots b_s p_1(x)p_2(x) \cdots p_m(x) = c_1 c_2 \cdots c_t q_1(x)q_2(x) \cdots q_n(x),
\]
where the \(b_i\)'s and \(c_i\)'s are irreducible polynomials of degree \(0\) and the \(p_i(x)\)'s and \(q_i(x)\)'s are
irreducible polynomials of positive degree,
then \(s =t, \, m = n\), and after renumbering the \(c\)'s and \(q(x)\)'s, we have
\[
\begin{align*}
b_i &= \pm c_i \text{ for } i = 1, \ldots, s \\\\
p_i(x) &= \pm q_i(x) \text{ for } i = 1, \ldots, m.
\end{align*}
\]
The Essence of the Proof:
Existence. Given a polynomial that is not yet a product of irreducibles, we keep splitting it into
smaller pieces. Each split either lowers the degree of a factor or shrinks the integer coefficients (when we peel off
a prime constant from the content). Both quantities are non-negative integers, so the splitting cannot continue forever;
the process terminates with a product of irreducible "atoms."
Uniqueness. Two ingredients combine. First,
\(\mathbb{Q}[x]\) is a principal ideal domain,
and from this it can be shown that \(\mathbb{Q}[x]\) admits unique factorization (up to units in \(\mathbb{Q}\)).
So if our polynomial in \(\mathbb{Z}[x]\) had two essentially different factorizations into positive-degree irreducibles,
the same disagreement would persist over \(\mathbb{Q}[x]\) — which is impossible. Second,
Gauss's Lemma
guarantees that the rational unit factors picked up when passing through \(\mathbb{Q}[x]\) reassemble into integers,
so the factorization remains genuinely a \(\mathbb{Z}[x]\)-factorization. Together, these force the integer-constant
parts and the positive-degree parts to match (up to signs), as stated.
This theorem tells us that irreducible polynomials are truly the "prime numbers" of the polynomial world.
Just as we can break down 60 into \(2^2 \cdot 3 \cdot 5\), we can break down any complex polynomial into
a unique set of irreducible atoms.
For a computer scientist, this atomicity is what makes data structures based on polynomials reliable.
In Reed-Solomon codes (used in QR codes, CDs, DVDs, and deep-space communication), data is encoded
as coefficients of a polynomial. The uniqueness of the factorization ensures that the error pattern can be
unambiguously identified and corrected.