Complete Reducibility
Building new representations from old ones, we found that the direct sum runs several
representations side by side without interaction, so that a
direct sum of representations
decomposes into its summands as a matter of construction. The question we did not answer
was the reverse one: given a representation handed to us, can we always take it apart into
irreducible pieces? An
irreducible
representation is one with no invariant subspaces other than \(\{0\}\) and the whole space, and these are the atoms we would like
to build everything from. The favorable situation is the one in which every representation
is a direct sum of such atoms.
Definition: Completely Reducible Representation
A finite-dimensional representation of a group or Lie algebra is
completely reducible if it is isomorphic to a direct sum of finitely
many irreducible representations.
The definition asks for a decomposition to exist; it says nothing about whether every
representation of a given group admits one. That is a property of the group itself, and it
deserves its own name.
Definition: The Complete Reducibility Property
A group or Lie algebra has the complete reducibility property if every
finite-dimensional representation of it is completely reducible.
Many groups and Lie algebras do not have this property. The point of the next two sections
is to identify a large and important class that does — the compact matrix Lie groups
— and to extract from the proof the structural mechanism (an invariant complement)
that makes complete reducibility work. First, an example showing that the property genuinely
can fail, so that the theorems to come are not vacuous.
Example: A Representation That Is Not Completely Reducible
Let \(\Pi : \mathbb{R} \to GL(2; \mathbb{C})\) be given by
\[
\Pi(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}.
\]
Then \(\Pi\) is a representation of \(\mathbb{R}\), but it is not completely reducible.
Proof:
That \(\Pi\) is a representation is the matrix identity
\[
\Pi(s)\,\Pi(x) = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & s + x \\ 0 & 1 \end{pmatrix} = \Pi(s + x),
\]
so \(\Pi\) carries addition in \(\mathbb{R}\) to multiplication in \(GL(2; \mathbb{C})\).
Let \(\{e_1, e_2\}\) be the standard basis of \(\mathbb{C}^2\). Since
\(\Pi(x)\,e_1 = e_1\) for every \(x\), the line \(\langle e_1 \rangle\) is an invariant
subspace. We claim it is the only nontrivial invariant subspace. Suppose \(V\)
is a nonzero invariant subspace containing a vector outside \(\langle e_1 \rangle\), say
\(v = a e_1 + b e_2\) with \(b \neq 0\). Invariance gives \(\Pi(1) v \in V\), hence also
\[
\Pi(1) v - v = b\, e_1 \in V.
\]
Because \(b \neq 0\), this forces \(e_1 \in V\), and then
\(e_2 = (v - a e_1)/b \in V\) as well. Thus \(V = \mathbb{C}^2\). Every invariant
subspace is therefore one of \(\{0\}\), \(\langle e_1 \rangle\), or \(\mathbb{C}^2\).
Suppose, for contradiction, that \(\mathbb{C}^2\) were a direct sum of irreducible
invariant subspaces. Each summand has dimension \(1\) or \(2\). A two-dimensional
summand would be all of \(\mathbb{C}^2\), making \(\mathbb{C}^2\) itself irreducible;
but \(\langle e_1 \rangle\) is a nontrivial invariant subspace, so \(\mathbb{C}^2\) is
not irreducible. The decomposition would therefore have to be a sum of two
one-dimensional invariant subspaces. Yet we showed \(\langle e_1 \rangle\) is the
only one-dimensional invariant subspace, and a space cannot be the direct sum
of one line with itself. No such decomposition exists, so \(\Pi\) is not completely
reducible.
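The two computations the proof rests on can be checked mechanically. The following is a minimal sketch in Python with exact rational arithmetic; the helper names `Pi`, `matmul` and `apply` are illustrative, not part of the text. It verifies the homomorphism identity on a few sample points and the key step \(\Pi(1) v - v = b\, e_1\) for one sample vector.

```python
from fractions import Fraction

def Pi(x):
    """The 2x2 matrix Pi(x) = [[1, x], [0, 1]] from the example."""
    return [[Fraction(1), Fraction(x)], [Fraction(0), Fraction(1)]]

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(A, v):
    """Apply a 2x2 matrix to a vector of length 2."""
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

# Homomorphism property Pi(s) Pi(x) = Pi(s + x) on a few sample points.
samples = [Fraction(-3), Fraction(1, 2), Fraction(7, 5)]
is_homomorphism = all(matmul(Pi(s), Pi(x)) == Pi(s + x)
                      for s in samples for x in samples)

# Key step of the proof: for v = a e1 + b e2, Pi(1) v - v = b e1.
a, b = Fraction(2), Fraction(5)
v = [a, b]
w = apply(Pi(1), v)
difference = [w[0] - v[0], w[1] - v[1]]

print(is_homomorphism)        # True
print(difference == [b, 0])   # True
```

A finite check of this kind illustrates the identities but does not replace the proof, which covers all \(s\), \(x\), \(a\) and \(b\) at once.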
The failure is instructive. The invariant line \(\langle e_1 \rangle\) has no invariant
complement: any complementary line is moved off itself by some \(\Pi(x)\). Complete
reducibility, when it holds, is precisely the guarantee that such a complement can always be
found. The next section makes that equivalence exact.
Invariant Complements
A general subspace of a vector space always has a complement; one simply extends a basis.
What the example of the previous section showed is that an invariant subspace need
not have an invariant complement. The following proposition shows that complete
reducibility is exactly the condition that repairs this: in a completely reducible
representation, every invariant subspace splits off, and moreover each invariant piece is
itself completely reducible. This is the structural fact that the deeper theorems of the
next sections will deliver, and it is what makes the irreducible decomposition usable rather
than merely existent.
Proposition (Invariant Complements)
Let \(V\) be a completely reducible representation of a group or Lie algebra. Then the
following hold.
-
For every invariant subspace \(U\) of \(V\), there is an invariant subspace \(W\)
such that \(V = U \oplus W\).
-
Every invariant subspace of \(V\) is itself completely reducible.
Proof of Point 1:
By complete reducibility, write
\[
V = U_1 \oplus U_2 \oplus \cdots \oplus U_k,
\]
where each \(U_j\) is an irreducible invariant subspace, and let \(U\) be any invariant
subspace of \(V\). If \(U = V\), take \(W = \{0\}\) and there is nothing to prove. If
\(U \neq V\), then some summand is not contained in \(U\); choose \(j_1\) with
\(U_{j_1} \not\subseteq U\). Because \(U_{j_1}\) is irreducible and
\(U_{j_1} \cap U\) is an invariant subspace of \(U_{j_1}\) that is not all of
\(U_{j_1}\), we must have \(U_{j_1} \cap U = \{0\}\). Consequently the sum
\(U + U_{j_1}\) is direct.
If \(U + U_{j_1} = V\) we are finished, with \(W = U_{j_1}\). Otherwise some summand is
not contained in \(U + U_{j_1}\); choose \(j_2\) with
\(U_{j_2} \not\subseteq U + U_{j_1}\). The same irreducibility argument applied to
\(U_{j_2}\) gives \((U + U_{j_1}) \cap U_{j_2} = \{0\}\), so
\(U + U_{j_1} + U_{j_2}\) is direct. Proceeding in this way, and using that \(V\) is
finite-dimensional so the process terminates, we obtain indices
\(j_1, j_2, \dots, j_l\) with
\[
U + U_{j_1} + \cdots + U_{j_l} = V
\]
and the sum direct. Setting \(W := U_{j_1} + \cdots + U_{j_l}\), which is an invariant
subspace as a sum of invariant subspaces, gives \(V = U \oplus W\), as required.
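The selection procedure in this proof is effectively a greedy algorithm, and it can be sketched in a few lines of Python. The sketch assumes each subspace is given by a list of basis vectors with rational entries, and that the summands are already known to be irreducible and invariant; the code only performs the linear algebra (the containment test by rank), not the invariance check. The function names `rank` and `greedy_complement` are illustrative.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def greedy_complement(U_basis, summands):
    """Pick the summands U_j not contained in the running sum, as in the proof."""
    current = list(U_basis)
    chosen = []
    for j, basis in enumerate(summands):
        if rank(current + basis) > rank(current):   # U_j is not contained
            current = current + basis
            chosen.append(j)
    return chosen, current

# Toy case: the trivial action on Q^3, where every line is irreducible
# and every subspace is invariant.
summands = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]]
U_basis = [[1, 1, 0]]
chosen, total = greedy_complement(U_basis, summands)
print(chosen, rank(total))   # [0, 2] 3
```

Here \(W\) is spanned by the first and third coordinate lines; the second is skipped because it already lies in \(U + U_1\). A single pass over the summands suffices because a summand contained in the running sum stays contained as the sum grows.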
Proof of Point 2:
Let \(U\) be an invariant subspace of \(V\). We first show that \(U\) inherits the
invariant-complement property of Point 1 internally: if \(X \subseteq U\) is an
invariant subspace, then \(X\) has an invariant complement within \(U\). By
Point 1 applied in \(V\), there is an invariant subspace \(Y\) with \(V = X \oplus Y\).
Put \(Z := Y \cap U\), an invariant subspace contained in \(U\); we claim
\(U = X \oplus Z\).
For any \(u \in U\), write \(u = x + y\) with \(x \in X\) and \(y \in Y\). Since
\(X \subseteq U\), we have \(x \in U\), and therefore \(y = u - x \in U\); thus
\(y \in Y \cap U = Z\). This shows \(U = X + Z\). The sum is direct because
\(X \cap Z \subseteq X \cap Y = \{0\}\). Hence \(U = X \oplus Z\), establishing the
internal invariant-complement property.
We may now decompose \(U\) into irreducibles. If \(U\) is irreducible, it is already a
(one-term) direct sum of irreducibles. If not, \(U\) has a nontrivial invariant
subspace \(X\), and by the property just proved \(U = X \oplus Z\) for some invariant
\(Z\). If \(X\) and \(Z\) are irreducible we are done; if not, we apply the same
splitting to whichever factor is reducible. Since \(U\) is finite-dimensional, each
split strictly lowers the dimension of the factor being decomposed, so the process
terminates with \(U\) written as a direct sum of irreducible invariant subspaces. Thus
\(U\) is completely reducible.
The first point is the one we will use directly: it converts the abstract existence of an
irreducible decomposition into the concrete ability to peel off any invariant subspace with
an invariant complement still attached. The remaining task is to find groups for which the
hypothesis — complete reducibility of every representation — actually
holds. The route runs through inner products.
The Unitarian Trick
Inner products give complete reducibility almost for free. If a representation preserves an
inner product, then the orthogonal complement of an invariant subspace is again invariant,
and orthogonal complements always exist. The work, carried out in the theorem that follows,
is to manufacture such an inner product for any representation of a compact group by
averaging an arbitrary one over the group. That averaging step is the device known as the
unitarian trick, and it is the reason compact groups are so well behaved.
Proposition (Unitary Representations Are Completely Reducible)
Let \(G\) be a matrix Lie group and let \(\Pi\) be a finite-dimensional
unitary representation
of \(G\). Then \(\Pi\) is completely reducible. Likewise, if \(\mathfrak{g}\) is a real
Lie algebra and \(\pi\) is a finite-dimensional representation of \(\mathfrak{g}\) on an
inner product space with \(\pi(X)^\ast = -\pi(X)\) for all \(X \in \mathfrak{g}\), then
\(\pi\) is completely reducible.
Proof:
Let \(V\) denote the inner product space on which \(\Pi\) acts, with inner product
\(\langle \cdot, \cdot \rangle\). The crux is the claim that the orthogonal complement
of an invariant subspace is again invariant. Let \(W \subseteq V\) be an invariant
subspace and \(W^{\perp}\) its orthogonal complement, so that \(V = W \oplus W^{\perp}\)
as inner product spaces. Because \(\Pi\) is unitary, \(\Pi(A)^\ast = \Pi(A)^{-1} =
\Pi(A^{-1})\) for every \(A \in G\). Then for any \(w \in W\) and any
\(v \in W^{\perp}\),
\[
\begin{align*}
\langle \Pi(A) v, w \rangle
&= \langle v, \Pi(A)^\ast w \rangle = \langle v, \Pi(A^{-1}) w \rangle \\
&= \langle v, w' \rangle = 0,
\end{align*}
\]
where \(w' := \Pi(A^{-1}) w\) lies in \(W\) by invariance, so the pairing vanishes
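because \(v \perp W\). Thus \(\langle \Pi(A) v, w \rangle = 0\) for all \(w \in W\), i.e. \(\Pi(A) v \in
W^{\perp}\), and \(W^{\perp}\) is invariant. For the Lie algebra statement the same
computation applies with \(\Pi(A)^\ast = \Pi(A^{-1})\) replaced by \(\pi(X)^\ast = -\pi(X)\):
for \(v \in W^{\perp}\) and \(w \in W\),
\(\langle \pi(X) v, w \rangle = -\langle v, \pi(X) w \rangle = 0\).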
With the orthogonal complement of an invariant subspace known to be invariant, the
decomposition is immediate. If \(V\) is irreducible we are done. Otherwise choose an
invariant subspace \(W\) with \(\{0\} \neq W \neq V\); then \(V = W \oplus W^{\perp}\)
with both summands invariant, hence representations in their own right. Each of \(W\)
and \(W^{\perp}\) is either irreducible or splits further as an orthogonal direct sum of
invariant subspaces. Since \(V\) is finite-dimensional this cannot continue
indefinitely, and we arrive at a decomposition of \(V\) into irreducible invariant
subspaces. Hence \(\Pi\) is completely reducible.
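As a small illustration of the orthogonal-complement mechanism, consider the cyclic shift on \(\mathbb{C}^3\), which generates a unitary representation of the finite group \(\mathbb{Z}/3\). This example is an assumption made for illustration and does not come from the text; a finite group is used so that everything is exact. The sketch checks that the line spanned by \((1,1,1)\) is invariant and that its orthogonal complement, the vectors whose coordinates sum to zero, is invariant as well.

```python
def shift(v):
    """Cyclic shift on C^3: the generator of Z/3 acting by a permutation matrix."""
    return [v[2], v[0], v[1]]

def inner(v, w):
    """Standard Hermitian inner product on C^3."""
    return sum(x * complex(y).conjugate() for x, y in zip(v, w))

w = [1, 1, 1]                          # spans an invariant line W
perp_basis = [[1, -1, 0], [0, 1, -1]]  # a basis of W-perp (coordinates sum to zero)

# W is invariant: the shift fixes w.
w_invariant = shift(w) == w

# W-perp is invariant: shifted basis vectors stay orthogonal to w.
perp_invariant = all(inner(shift(v), w) == 0 for v in perp_basis)

# Unitarity on the sample vectors: the shift preserves inner products.
unitary = all(inner(shift(u), shift(v)) == inner(u, v)
              for u in perp_basis for v in perp_basis)

print(w_invariant, perp_invariant, unitary)   # True True True
```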
The proposition reduces complete reducibility to the existence of an invariant inner
product. For a compact group such an inner product can always be produced by averaging, and
the averaging requires a notion of integration over the group that is invariant under the
group's own action. We construct that integral from a differential form, which is lighter
machinery than a measure and exactly suited to the top-degree averaging we need.
Theorem (Complete Reducibility for Compact Groups)
If \(G\) is a compact matrix Lie group, then every finite-dimensional representation of
\(G\) is completely reducible.
Construction of the invariant integral:
We require a notion of integration over \(G\) that is invariant under the right action
of the group. We obtain it from a right-invariant differential form. If
\(G \subseteq M_n(\mathbb{C})\) is a matrix Lie group, the tangent space at the identity
is its
Lie algebra
\(\mathfrak{g}\), and the tangent space \(T_A G\) at any point \(A \in G\) is the space
of matrices \(\{ X A : X \in \mathfrak{g} \}\). Let \(k\) be the dimension of
\(\mathfrak{g}\) as a real vector space, and choose a nonzero \(k\)-linear alternating
form \(\alpha_I : \mathfrak{g}^k \to \mathbb{R}\); such a form exists and is unique up
to a scalar multiple, since the alternating \(k\)-forms on a \(k\)-dimensional space
form a one-dimensional space. We transport \(\alpha_I\) to every point of \(G\) by the
right action, defining a \(k\)-linear alternating form \(\alpha_A\) on \(T_A G\) by
\[
\alpha_A(Y_1, \dots, Y_k) = \alpha_I\bigl(Y_1 A^{-1}, \dots, Y_k A^{-1}\bigr),
\qquad Y_1, \dots, Y_k \in T_A G.
\]
The assignment \(A \mapsto \alpha_A\) is a \(k\)-form \(\alpha\) on \(G\), right-invariant
by construction.
Such a top-degree form is exactly what is needed to integrate. For a smooth function
\(f : G \to \mathbb{R}\), the product \(f\alpha\) is again a \(k\)-form, and on the
compact manifold \(G\) it has a well-defined integral, which we write as
\[
\int_G f(A)\, \alpha(A).
\]
That this integral does not depend on the choice of local coordinates is the algebraic
content of how a top-degree form pulls back: under a change of coordinates the form
acquires precisely the
Jacobian determinant
that the classical change-of-variables theorem inserts, and the two cancel. The full
construction of the integral of a differential form — orientation, the
partition-of-unity assembly over charts — belongs to integration theory on
manifolds and we take it as given here; only the right-invariance, recorded next, enters
the argument. One caveat belongs here. Integration of a top-degree form is sensitive to
orientation. We orient \(G\) by \(\alpha\) itself, which is possible because \(\alpha\)
is nowhere zero. Every right translation preserves \(\alpha\) and hence this orientation,
also on a disconnected group such as \(O(n)\), so the right-invariance and the positivity
used below hold without a connectedness assumption. What can reverse the orientation on a
disconnected group, for instance \(O(2)\), is a left translation by an element outside the
identity component, and the argument never uses one. A reader who prefers to avoid
orientations altogether may integrate the associated density \(|\alpha|\) instead, which
is right-invariant and positive for the same reasons. We continue to write \(\alpha\).
Because \(\alpha\) was built from the right action, the resulting integral is invariant
under that action: for every \(B \in G\),
\[
\int_G f(AB)\, \alpha(A) = \int_G f(A)\, \alpha(A).
\]
Compactness of \(G\) is what guarantees the integral is finite, so this right-invariant
integral exists precisely in the setting we care about.
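To see the construction in the simplest case, take \(G = SO(2)\); this worked example is an illustration and is not part of the text. Write \(A(\theta)\) for the rotation by \(\theta\) and \(J\) for the rotation by a right angle, so that \(\mathfrak{g} = \mathbb{R} J\) and \(k = 1\). Then \(\tfrac{d}{d\theta} A(\theta) = J A(\theta)\), which confirms that \(T_A G = \{ X A : X \in \mathfrak{g} \}\). Choosing \(\alpha_I(tJ) = t\) gives
\[
\alpha_{A(\theta)}\bigl(J A(\theta)\bigr) = \alpha_I\bigl(J A(\theta) A(\theta)^{-1}\bigr) = \alpha_I(J) = 1,
\]
so \(\alpha = d\theta\) and the invariant integral is \(\int_0^{2\pi} f(A(\theta))\, d\theta\). Right translation by \(B = A(\varphi)\) replaces \(\theta\) by \(\theta + \varphi\), and invariance is the familiar fact that a periodic function has the same integral over any full period. A minimal numerical sketch, with a sample function `f` and a sample angle `phi` chosen arbitrarily:

```python
import math

def f(theta):
    """A sample smooth function on SO(2), written in the angle coordinate."""
    return math.cos(theta) ** 2 + math.sin(3 * theta) + 0.5 * math.cos(2 * theta)

def integrate(g, n=64):
    """Riemann sum for the integral of g(theta) d(theta) over [0, 2*pi)."""
    h = 2 * math.pi / n
    return h * sum(g(k * h) for k in range(n))

phi = 0.7                                   # angle of the right translation B = A(phi)
plain = integrate(f)                        # approximates the integral of f(A)
shifted = integrate(lambda t: f(t + phi))   # approximates the integral of f(AB)

print(abs(plain - shifted) < 1e-9)          # True
```

The equally spaced Riemann sum is used because it is very accurate for trigonometric polynomials of low degree, so the two values agree to rounding error rather than merely approximately.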
Proof of the theorem:
Let \(\Pi\) be a finite-dimensional representation of \(G\) on \(V\). Choose any inner
product \(\langle \cdot, \cdot \rangle\) on \(V\) and average it over the group, defining
\(\langle \cdot, \cdot \rangle_G : V \times V \to \mathbb{C}\) by
\[
\langle v, w \rangle_G = \int_G \langle \Pi(A) v, \Pi(A) w \rangle\, \alpha(A).
\]
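This is again an inner product (a complex-valued integrand being integrated through its
real and imaginary parts): it is sesquilinear and conjugate-symmetric because the
integrand is, for each fixed \(A\), and it is positive definite because each \(\Pi(A)\)
is invertible, so the continuous integrand \(\langle \Pi(A) v, \Pi(A) v \rangle\) is
strictly positive for every \(A\) when \(v \neq 0\), and its integral against \(\alpha\)
is positive.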
We claim \(\Pi(B)\) is unitary with respect to \(\langle \cdot, \cdot \rangle_G\) for
every \(B \in G\). Using that \(\Pi\) is a homomorphism and then the right-invariance of
the integral,
\[
\begin{align*}
\langle \Pi(B) v, \Pi(B) w \rangle_G
&= \int_G \langle \Pi(A)\Pi(B) v, \Pi(A)\Pi(B) w \rangle\, \alpha(A) \\
&= \int_G \langle \Pi(AB) v, \Pi(AB) w \rangle\, \alpha(A) \\
&= \int_G \langle \Pi(A) v, \Pi(A) w \rangle\, \alpha(A) \\
&= \langle v, w \rangle_G,
\end{align*}
\]
where the third equality is the right-invariance applied to the function
\(A \mapsto \langle \Pi(A) v, \Pi(A) w \rangle\). Thus every \(\Pi(B)\) preserves
\(\langle \cdot, \cdot \rangle_G\), so \(\Pi\) is a unitary representation with respect
to this averaged inner product. By the preceding proposition, \(\Pi\) is completely
reducible.
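The averaging step can be tried by hand on a finite group, where the invariant integral reduces to an average over the group elements. The sketch below is an illustration under that assumption and is not taken from the text: it uses a non-unitary representation of \(\mathbb{Z}/2\) on \(\mathbb{R}^2\) given by a matrix `P` with `P*P = I`, forms the Gram matrix `M` of the averaged inner product, and checks that `P` preserves it although it does not preserve the standard inner product.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

I2 = [[1, 0], [0, 1]]
P = [[1, 1], [0, -1]]          # P*P = I, so {I, P} is a representation of Z/2
group = [I2, P]

# Gram matrix of the averaged inner product: <v, w>_G = v^T M w, where
# M = (1/|G|) * sum over A of Pi(A)^T Pi(A). Entries are real, so the
# adjoint is just the transpose.
M = [[Fraction(sum(matmul(transpose(A), A)[i][j] for A in group), len(group))
      for j in range(2)] for i in range(2)]

def preserves(gram, A):
    """True if A^T * gram * A == gram, i.e. A is unitary for that inner product."""
    return matmul(transpose(A), matmul(gram, A)) == gram

print(preserves(I2, P))   # False: P is not unitary for the standard inner product
print(preserves(M, P))    # True: P is unitary for the averaged inner product
```

In this toy case the averaged Gram matrix is \(\tfrac12\begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}\), which is positive definite, as the proof predicts.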
The argument isolates exactly where compactness is used: only in guaranteeing that the
averaging integral converges. The rest is the unitarian trick — replace an arbitrary
inner product by a group-averaged one, under which the representation becomes unitary, and
let the orthogonal-complement mechanism of the proposition finish the decomposition. This is
why groups such as \(SO(n)\) and \(SU(n)\), being compact, have every finite-dimensional
representation splitting into irreducible pieces — the structural fact on which the
classification of their representations rests.
Schur's Lemma
Complete reducibility tells us that a representation breaks into irreducible pieces. Schur's
lemma governs the maps between irreducible pieces, and it is startlingly rigid:
between two irreducibles there is essentially no room for an
intertwining map
to be anything other than zero or an isomorphism, and a self-map of a complex irreducible
can only be a scalar. This is the lemma that converts symmetry into constraint, and the
following sections of the wider theory — characters, the orthogonality relations, the
decomposition of tensor products — all lean on it.
Theorem (Schur's Lemma)
-
Let \(V\) and \(W\) be irreducible real or complex representations of a group or Lie
algebra, and let \(\phi : V \to W\) be an intertwining map. Then either
\(\phi = 0\) or \(\phi\) is an isomorphism.
-
Let \(V\) be an irreducible complex representation of a group or Lie
algebra, and let \(\phi : V \to V\) be an intertwining map of \(V\) with itself.
Then \(\phi = \lambda I\) for some \(\lambda \in \mathbb{C}\).
-
Let \(V\) and \(W\) be irreducible complex representations, and let
\(\phi_1, \phi_2 : V \to W\) be nonzero intertwining maps. Then
\(\phi_1 = \lambda\, \phi_2\) for some \(\lambda \in \mathbb{C}\).
The last two points hold only over \(\mathbb{C}\) (or another algebraically closed
field), not over \(\mathbb{R}\).
Proof of Point 1:
We argue the group case; the Lie algebra case requires only the obvious notational
changes. Write \(\Pi\) for the action on \(V\) and \(\Sigma\) for the action on \(W\),
so that \(\phi(\Pi(A) v) = \Sigma(A) \phi(v)\) for all \(A\). Then \(\ker\phi\) is an
invariant subspace of \(V\): if \(v \in \ker\phi\), then
\[
\phi(\Pi(A) v) = \Sigma(A) \phi(v) = \Sigma(A) \cdot 0 = 0,
\]
so \(\Pi(A) v \in \ker\phi\). Since \(V\) is irreducible, \(\ker\phi = \{0\}\) or
\(\ker\phi = V\); thus \(\phi\) is either one-to-one or zero.
Suppose \(\phi\) is one-to-one. Then \(\operatorname{im}\phi\) is a nonzero subspace of
\(W\), and it is invariant: if \(w = \phi(v)\) lies in the image, then
\[
\Sigma(A) w = \Sigma(A) \phi(v) = \phi(\Pi(A) v) \in \operatorname{im}\phi.
\]
Since \(W\) is irreducible and \(\operatorname{im}\phi\) is nonzero and invariant, we
must have \(\operatorname{im}\phi = W\). Thus \(\phi\) is either zero or one-to-one and
onto, that is, an isomorphism.
Proof of Point 2:
Suppose \(V\) is an irreducible complex representation and \(\phi : V \to V\) intertwines,
so \(\phi\,\Pi(A) = \Pi(A)\,\phi\) for all \(A\). Since we are working over the
algebraically closed field \(\mathbb{C}\), the operator \(\phi\) has at least one
eigenvalue \(\lambda \in \mathbb{C}\). Let \(U\) be the corresponding eigenspace. Then
each \(\Pi(A)\) maps \(U\) into itself: if \(\phi u = \lambda u\), then
\[
\phi\bigl(\Pi(A) u\bigr) = \Pi(A)\,\phi(u) = \Pi(A)(\lambda u) = \lambda\,\Pi(A) u,
\]
so \(\Pi(A) u\) is again a \(\lambda\)-eigenvector, i.e. \(\Pi(A) u \in U\). Thus \(U\)
is an invariant subspace. Since \(\lambda\) is an eigenvalue, \(U \neq \{0\}\), and
irreducibility forces \(U = V\). But \(U = V\) means \(\phi\) acts as \(\lambda\) on all
of \(V\), that is, \(\phi = \lambda I\).
This is exactly where working over \(\mathbb{C}\) is essential: the existence of an eigenvalue used the
algebraic closure of \(\mathbb{C}\). Over \(\mathbb{R}\) the conclusion fails. The
rotation representation of \(SO(2)\) on \(\mathbb{R}^2\) is irreducible, since no real
line is mapped to itself by all rotations, yet the rotation by a right angle commutes with
every element of the representation while being no real scalar multiple of the identity.
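The real counterexample can be checked with exact arithmetic. The sketch below uses one sample rotation with rational entries (\(\cos\theta = 3/5\), \(\sin\theta = 4/5\), an arbitrary choice) and the right-angle rotation `J`. It confirms that the two commute and that the characteristic polynomial of `J` has negative discriminant, so `J` has no real eigenvalue and cannot be a real multiple of the identity.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# A sample rotation with rational entries: cos = 3/5, sin = 4/5.
c, s = Fraction(3, 5), Fraction(4, 5)
R = [[c, -s], [s, c]]
J = [[0, -1], [1, 0]]          # rotation by a right angle

commutes = matmul(J, R) == matmul(R, J)

# The characteristic polynomial of J is t^2 - trace*t + det; a negative
# discriminant means there is no real eigenvalue.
trace = J[0][0] + J[1][1]
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
discriminant = trace * trace - 4 * det

print(commutes, discriminant)   # True -4
```

Only one rotation is tested here; since every rotation has the form \(\begin{pmatrix} c & -s \\ s & c \end{pmatrix}\), the same two-line computation shows that `J` commutes with all of them.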
Proof of Point 3:
Let \(\phi_1, \phi_2 : V \to W\) be nonzero intertwining maps between irreducible
complex representations. By Point 1, \(\phi_2\) is an isomorphism, so \(\phi_2^{-1}\)
exists and \(\phi_1 \circ \phi_2^{-1} : W \to W\) is an intertwining map of \(W\) with
itself. By Point 2 it equals \(\lambda I\) for some \(\lambda \in \mathbb{C}\), whence
\(\phi_1 = \lambda\, \phi_2\).
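Point 2 can be observed numerically by manufacturing an intertwining map and seeing that it comes out scalar. The sketch below is an illustration, not part of the text: it assumes the two-dimensional representation of the symmetric group \(S_3\) on the sum-zero vectors of \(\mathbb{C}^3\), written in the basis \(e_1 - e_2,\ e_2 - e_3\), which is irreducible over \(\mathbb{C}\). Averaging \(\Pi(A)\, X\, \Pi(A)^{-1}\) over the group turns an arbitrary matrix `X` into a map commuting with the representation, and Schur's lemma predicts the result is \((\operatorname{tr} X / 2)\, I\).

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def inverse(A):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    det = Fraction(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    return ((A[1][1] / det, -A[0][1] / det), (-A[1][0] / det, A[0][0] / det))

# The transpositions (1 2) and (2 3) in the chosen basis.
S = ((-1, 1), (0, 1))
T = ((1, 0), (1, -1))

# Close {S, T} under multiplication to obtain all six group elements.
group = {S, T}
while True:
    new = {matmul(A, B) for A in group for B in group} - group
    if not new:
        break
    group |= new

X = ((1, 2), (3, 4))           # an arbitrary matrix, not an intertwining map
avg = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]
for A in group:
    Y = matmul(matmul(A, X), inverse(A))
    for i in range(2):
        for j in range(2):
            avg[i][j] += Y[i][j] / len(group)

print(len(group))   # 6
print(avg == [[Fraction(5, 2), 0], [0, Fraction(5, 2)]])   # True
```

The scalar is fixed by the trace: conjugation preserves it, so the average has the trace of `X`, here \(5\), spread evenly over the diagonal.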
Two consequences record the rigidity in the forms used most often. The first concerns
elements acting centrally.
Corollary (Central Elements Act by Scalars)
Let \(\Pi\) be an irreducible complex representation of a matrix Lie group \(G\). If
\(A\) lies in the center of \(G\), then \(\Pi(A) = \lambda I\) for some
\(\lambda \in \mathbb{C}\). Likewise, if \(\pi\) is an irreducible complex representation
of a Lie algebra \(\mathfrak{g}\) and \(X\) lies in the center of \(\mathfrak{g}\), then
\(\pi(X) = \lambda I\).
Proof:
If \(A\) is central, then \(\Pi(A)\) commutes with \(\Pi(B)\) for every \(B \in G\),
since \(\Pi(A)\Pi(B) = \Pi(AB) = \Pi(BA) = \Pi(B)\Pi(A)\); thus \(\Pi(A)\) is an
intertwining map of the irreducible complex representation with itself, and Point 2 of
Schur's lemma gives \(\Pi(A) = \lambda I\). The Lie algebra case is identical, with the
bracket relation \(\pi(X)\pi(Y) - \pi(Y)\pi(X) = \pi([X, Y]) = 0\) for central \(X\)
showing \(\pi(X)\) commutes with all \(\pi(Y)\).
The second consequence settles the abelian case completely.
Corollary (Irreducible Complex Representations of an Abelian Group)
An irreducible complex representation of a commutative group or Lie algebra is
one-dimensional.
Proof:
We argue the group case. If \(G\) is commutative, then the center of \(G\) is all of
\(G\), so by the preceding corollary \(\Pi(A) = \lambda I\) for each \(A \in G\), with
the scalar depending on \(A\). But if every \(\Pi(A)\) is a multiple of the identity,
then every subspace of \(V\) is invariant. The only way \(V\) can avoid having
a nontrivial invariant subspace — that is, the only way it can be irreducible
— is for \(V\) to be one-dimensional.
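For a concrete instance, take the cyclic shift on \(\mathbb{C}^3\), which generates a representation of the commutative group \(\mathbb{Z}/3\). This example is an assumption made for illustration. By complete reducibility and the corollary it must split into one-dimensional pieces, and the sketch below checks that the three vectors with entries \(\omega^{jk}\), where \(\omega = e^{2\pi i/3}\), each span an invariant line on which the generator acts by a scalar.

```python
import cmath

n = 3
omega = cmath.exp(2j * cmath.pi / n)      # primitive n-th root of unity

def shift(v):
    """Generator of Z/n acting on C^n by cyclic shift: (Sv)[k] = v[k-1]."""
    return [v[(k - 1) % n] for k in range(n)]

def fourier_vector(j):
    """Candidate eigenvector with entries omega^(j*k), k = 0, ..., n-1."""
    return [omega ** (j * k) for k in range(n)]

def is_eigenvector(v, eigenvalue, tol=1e-12):
    """True if shift(v) equals eigenvalue * v up to rounding error."""
    return all(abs(a - eigenvalue * b) < tol for a, b in zip(shift(v), v))

# Each Fourier vector spans an invariant line on which the generator
# acts by the scalar omega^(-j).
checks = [is_eigenvector(fourier_vector(j), omega ** (-j)) for j in range(n)]
print(checks)   # [True, True, True]
```

Floating-point roots of unity are used here, so the comparison is made up to a small tolerance rather than exactly.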
Why Equivariant Layers Have So Few Weights
A linear layer that respects a group's symmetry is exactly an intertwining map between
the representation carried by its input and the one carried by its output — the
layer is an
equivariant map
in the linear case. Decompose input and output into irreducible pieces, as complete
reducibility permits for a compact symmetry group. Schur's lemma then dictates the block
structure: between two inequivalent irreducible pieces the only intertwining
map is zero, so those blocks are forced empty; between equivalent complex irreducibles
the map is a single scalar multiple of a fixed isomorphism. The free parameters of an
equivariant layer therefore collapse to one scalar per pair of equivalent irreducible
pieces, one taken from the input and one from the output, rather than a full dense
matrix between channels. The drastic reduction in weights, and the resulting gain in
sample efficiency, is not a heuristic of architecture design; it is Schur's lemma
counting the admissible maps. One qualification: the "single scalar" count is the
complex statement, the second and third parts of Schur's lemma, which hold only over
\(\mathbb{C}\). A network whose weights are real does not automatically inherit it; the
space of equivariant maps between two real irreducibles can be larger, and pinning down
its exact dimension is a separate question about the endomorphism algebra of a real
representation, beyond the complex lemma proved here. The qualitative lesson —
inequivalent types do not mix, matched types couple through a small fixed number of
parameters — survives regardless.
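The count can be made explicit. The following sketch assumes the complex setting and assumes that input and output have already been decomposed, so that each is described by the multiplicity of every irreducible type; the type labels, dimensions and multiplicities below are hypothetical. An equivariant map then has one free scalar for each pair consisting of an input copy and an output copy of the same type.

```python
def equivariant_param_count(in_mult, out_mult):
    """One complex scalar per (input copy, output copy) pair of the same type."""
    return sum(m * out_mult.get(t, 0) for t, m in in_mult.items())

def dense_param_count(in_mult, out_mult, dims):
    """Entries of an unconstrained matrix between the same two spaces."""
    dim_in = sum(m * dims[t] for t, m in in_mult.items())
    dim_out = sum(m * dims[t] for t, m in out_mult.items())
    return dim_in * dim_out

# Hypothetical layer: three irreducible types with dimensions 1, 3 and 5.
dims = {"a": 1, "b": 3, "c": 5}
in_mult = {"a": 4, "b": 2, "c": 1}    # copies of each type in the input
out_mult = {"a": 2, "b": 2}           # copies of each type in the output

print(equivariant_param_count(in_mult, out_mult))   # 12
print(dense_param_count(in_mult, out_mult, dims))   # 120
```

Type `"c"` contributes nothing because it has no partner in the output, which is the "inequivalent types do not mix" part of the lemma. For real weights the per-pair count can exceed one, as noted above, so the first function is a count for the complex case only.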