Complete Reducibility and Schur's Lemma

Complete Reducibility

Building new representations from old ones, we found that the direct sum runs several representations side by side without interaction, so that a direct sum of representations decomposes into its summands as a matter of construction. The question we did not answer was the reverse one: given a representation handed to us, can we always take it apart into irreducible pieces? An irreducible representation is one with no invariant subspaces other than \(\{0\}\) and the whole space, and these are the atoms we would like to build everything from. The favorable situation is the one in which every representation is a direct sum of such atoms.

Definition: Completely Reducible Representation

A finite-dimensional representation of a group or Lie algebra is completely reducible if it is isomorphic to a direct sum of finitely many irreducible representations.

The definition asks for a decomposition to exist; it says nothing about whether every representation of a given group admits one. That is a property of the group itself, and it deserves its own name.

Definition: The Complete Reducibility Property

A group or Lie algebra has the complete reducibility property if every finite-dimensional representation of it is completely reducible.

Most groups and Lie algebras do not have this property. The point of the next two sections is to identify a large and important class that does — the compact matrix Lie groups — and to extract from the proof the structural mechanism (an invariant complement) that makes complete reducibility work. First, an example showing that the property genuinely can fail, so that the theorems to come are not vacuous.

Example: A Representation That Is Not Completely Reducible

Let \(\Pi : \mathbb{R} \to GL(2; \mathbb{C})\) be given by \[ \Pi(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}. \] Then \(\Pi\) is a representation of \(\mathbb{R}\), but it is not completely reducible.

Proof:

That \(\Pi\) is a representation is the matrix identity \[ \Pi(s)\,\Pi(x) = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & s + x \\ 0 & 1 \end{pmatrix} = \Pi(s + x), \] so \(\Pi\) carries addition in \(\mathbb{R}\) to multiplication in \(GL(2; \mathbb{C})\).

Let \(\{e_1, e_2\}\) be the standard basis of \(\mathbb{C}^2\). Since \(\Pi(x)\,e_1 = e_1\) for every \(x\), the line \(\langle e_1 \rangle\) is an invariant subspace. We claim it is the only nontrivial invariant subspace. Suppose \(V\) is a nonzero invariant subspace containing a vector outside \(\langle e_1 \rangle\), say \(v = a e_1 + b e_2\) with \(b \neq 0\). Invariance gives \(\Pi(1) v \in V\), hence also \[ \Pi(1) v - v = b\, e_1 \in V. \] Because \(b \neq 0\), this forces \(e_1 \in V\), and then \(e_2 = (v - a e_1)/b \in V\) as well. Thus \(V = \mathbb{C}^2\). Every invariant subspace is therefore one of \(\{0\}\), \(\langle e_1 \rangle\), or \(\mathbb{C}^2\).

Suppose, for contradiction, that \(\mathbb{C}^2\) were a direct sum of irreducible invariant subspaces. Each summand has dimension \(1\) or \(2\). A two-dimensional summand would be all of \(\mathbb{C}^2\), making \(\mathbb{C}^2\) itself irreducible; but \(\langle e_1 \rangle\) is a nontrivial invariant subspace, so \(\mathbb{C}^2\) is not irreducible. The decomposition would therefore have to be a sum of two one-dimensional invariant subspaces. Yet we showed \(\langle e_1 \rangle\) is the only one-dimensional invariant subspace, and a space cannot be the direct sum of one line with itself. No such decomposition exists, so \(\Pi\) is not completely reducible.

The failure is instructive. The invariant line \(\langle e_1 \rangle\) has no invariant complement: any complementary line is moved off itself by some \(\Pi(x)\). Complete reducibility, when it holds, is precisely the guarantee that such a complement can always be found. The next section makes that equivalence exact.
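The computations in this example are small enough to check mechanically. The following is a minimal sketch in plain Python with exact integer arithmetic; the helper names `Pi`, `matmul`, and `apply` are illustrative choices and not part of the text, and the checks run over a finite grid of integer samples rather than all of \(\mathbb{R}\).

```python
def Pi(x):
    """The matrix Pi(x) = [[1, x], [0, 1]] from the example."""
    return [[1, x], [0, 1]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(m, v):
    """Apply a 2x2 matrix to a column vector given as a list."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

# Homomorphism property: Pi(s) Pi(x) = Pi(s + x) on a grid of integer samples.
homomorphism_ok = all(matmul(Pi(s), Pi(x)) == Pi(s + x)
                      for s in range(-3, 4) for x in range(-3, 4))

# The line spanned by e1 is invariant: Pi(x) e1 = e1.
e1 = [1, 0]
e1_fixed = all(apply(Pi(x), e1) == e1 for x in range(-3, 4))

# For v = a e1 + b e2 with b != 0, Pi(1) v - v = b e1, as in the proof.
a, b = 5, 7
v = [a, b]
image = apply(Pi(1), v)
difference = [image[0] - v[0], image[1] - v[1]]

# The line spanned by e2 is not invariant: Pi(1) e2 = e1 + e2.
e2_image = apply(Pi(1), [0, 1])
```

The last line illustrates the missing complement for one particular candidate: the line through \(e_2\) is moved off itself by \(\Pi(1)\).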

Invariant Complements

A general subspace of a vector space always has a complement; one simply extends a basis. What the example of the previous section showed is that an invariant subspace need not have an invariant complement. The following proposition shows that complete reducibility is exactly the condition that repairs this: in a completely reducible representation, every invariant subspace splits off, and moreover each invariant piece is itself completely reducible. This is the structural fact that the deeper theorems of the next sections will deliver, and it is what makes the irreducible decomposition usable rather than merely existent.

Proposition (Invariant Complements)

Let \(V\) be a completely reducible representation of a group or Lie algebra. Then the following hold.

1. For every invariant subspace \(U\) of \(V\), there is an invariant subspace \(W\) such that \(V = U \oplus W\).
2. Every invariant subspace of \(V\) is itself completely reducible.

Proof of Point 1:

By complete reducibility, write \[ V = U_1 \oplus U_2 \oplus \cdots \oplus U_k, \] where each \(U_j\) is an irreducible invariant subspace, and let \(U\) be any invariant subspace of \(V\). If \(U = V\), take \(W = \{0\}\) and there is nothing to prove. If \(U \neq V\), then some summand is not contained in \(U\); choose \(j_1\) with \(U_{j_1} \not\subseteq U\). Because \(U_{j_1}\) is irreducible and \(U_{j_1} \cap U\) is an invariant subspace of \(U_{j_1}\) that is not all of \(U_{j_1}\), we must have \(U_{j_1} \cap U = \{0\}\). Consequently the sum \(U + U_{j_1}\) is direct.

If \(U + U_{j_1} = V\) we are finished, with \(W = U_{j_1}\). Otherwise some summand is not contained in \(U + U_{j_1}\); choose \(j_2\) with \(U_{j_2} \not\subseteq U + U_{j_1}\). The same irreducibility argument applied to \(U_{j_2}\) gives \((U + U_{j_1}) \cap U_{j_2} = \{0\}\), so \(U + U_{j_1} + U_{j_2}\) is direct. Proceeding in this way, and using that \(V\) is finite-dimensional so the process terminates, we obtain indices \(j_1, j_2, \dots, j_l\) with \[ U + U_{j_1} + \cdots + U_{j_l} = V \] and the sum direct. Setting \(W := U_{j_1} + \cdots + U_{j_l}\), which is an invariant subspace as a sum of invariant subspaces, gives \(V = U \oplus W\), as required.
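The selection procedure in this proof is effectively an algorithm, and it can be sketched in a few lines. The example below is an illustration under stated assumptions, not part of the text: the representation is the two-element group acting on \(\mathbb{Q}^3\) by swapping the first two coordinates, the irreducible summands are the three listed lines, and \(U\) is the invariant line through \((1,1,1)\). The function names `rank` and `greedy_complement` are hypothetical. The code takes invariance and irreducibility of its inputs for granted and only performs the containment tests.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def greedy_complement(U, summands, dim):
    """Pick summands one at a time, skipping any already inside the current span."""
    current = list(U)       # spanning set of U plus the summands chosen so far
    chosen = []             # the indices j_1, j_2, ... from the proof
    for j, S in enumerate(summands):
        if rank(current) == dim:
            break
        if rank(current + S) > rank(current):   # S is not contained in the span
            chosen.append(j)
            current += S
    return chosen, current

# Assumed example: swap of the first two coordinates acting on Q^3.
U = [[1, 1, 1]]                                       # invariant line
summands = [[[1, 1, 0]], [[1, -1, 0]], [[0, 0, 1]]]   # irreducible invariant lines
chosen, spanning = greedy_complement(U, summands, dim=3)
W = [v for j in chosen for v in summands[j]]          # basis of the complement
```

Here the procedure selects the first two summands, so \(W\) is spanned by \((1,1,0)\) and \((1,-1,0)\). Since `U + W` consists of three vectors of rank three, the sum \(U \oplus W\) is direct and fills the space.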

Proof of Point 2:

Let \(U\) be an invariant subspace of \(V\). We first show that \(U\) inherits the invariant-complement property of Point 1 internally: if \(X \subseteq U\) is an invariant subspace, then \(X\) has an invariant complement within \(U\). By Point 1 applied in \(V\), there is an invariant subspace \(Y\) with \(V = X \oplus Y\). Put \(Z := Y \cap U\), an invariant subspace contained in \(U\); we claim \(U = X \oplus Z\).

For any \(u \in U\), write \(u = x + y\) with \(x \in X\) and \(y \in Y\). Since \(X \subseteq U\), we have \(x \in U\), and therefore \(y = u - x \in U\); thus \(y \in Y \cap U = Z\). This shows \(U = X + Z\). The sum is direct because \(X \cap Z \subseteq X \cap Y = \{0\}\). Hence \(U = X \oplus Z\), establishing the internal invariant-complement property.

We may now decompose \(U\) into irreducibles. If \(U\) is irreducible, it is already a (one-term) direct sum of irreducibles. If not, \(U\) has a nontrivial invariant subspace \(X\), and by the property just proved \(U = X \oplus Z\) for some invariant \(Z\). If \(X\) and \(Z\) are irreducible we are done; if not, we apply the same splitting to whichever factor is reducible, which is legitimate because \(X\) and \(Z\) are themselves invariant subspaces of \(V\) and so enjoy the same internal invariant-complement property. Since \(U\) is finite-dimensional, each split strictly lowers the dimension of the factor being decomposed, so the process terminates with \(U\) written as a direct sum of irreducible invariant subspaces. Thus \(U\) is completely reducible.

The first point is the one we will use directly: it converts the abstract existence of an irreducible decomposition into the concrete ability to peel off any invariant subspace with an invariant complement still attached. The remaining task is to find groups for which the hypothesis — complete reducibility of every representation — actually holds. The route runs through inner products.

The Unitarian Trick

Inner products give complete reducibility almost for free. If a representation preserves an inner product, then the orthogonal complement of an invariant subspace is again invariant, and orthogonal complements always exist. The work, carried out in the theorem that follows, is to manufacture such an inner product for any representation of a compact group by averaging an arbitrary one over the group. That averaging step is the device known as the unitarian trick, and it is the reason compact groups are so well behaved.

Proposition (Unitary Representations Are Completely Reducible)

Let \(G\) be a matrix Lie group and let \(\Pi\) be a finite-dimensional unitary representation of \(G\). Then \(\Pi\) is completely reducible. Likewise, if \(\mathfrak{g}\) is a real Lie algebra and \(\pi\) is a finite-dimensional representation of \(\mathfrak{g}\) on an inner product space with \(\pi(X)^\ast = -\pi(X)\) for all \(X \in \mathfrak{g}\), then \(\pi\) is completely reducible.

Proof:

Let \(V\) denote the inner product space on which \(\Pi\) acts, with inner product \(\langle \cdot, \cdot \rangle\). The crux is the claim that the orthogonal complement of an invariant subspace is again invariant. Let \(W \subseteq V\) be an invariant subspace and \(W^{\perp}\) its orthogonal complement, so that \(V = W \oplus W^{\perp}\) as inner product spaces. Because \(\Pi\) is unitary, \(\Pi(A)^\ast = \Pi(A)^{-1} = \Pi(A^{-1})\) for every \(A \in G\). Then for any \(w \in W\) and any \(v \in W^{\perp}\), \[ \begin{align*} \langle \Pi(A) v, w \rangle &= \langle v, \Pi(A)^\ast w \rangle = \langle v, \Pi(A^{-1}) w \rangle \\ &= \langle v, w' \rangle = 0, \end{align*} \] where \(w' := \Pi(A^{-1}) w\) lies in \(W\) by invariance, so the pairing vanishes because \(v \perp W\). Thus \(\Pi(A) v \perp w\) for all \(w \in W\), i.e. \(\Pi(A) v \in W^{\perp}\), and \(W^{\perp}\) is invariant. For the Lie algebra statement the same computation applies with \(\Pi(A)^\ast = \Pi(A^{-1})\) replaced by \(\pi(X)^\ast = -\pi(X)\): then \(\langle \pi(X) v, w \rangle = -\langle v, \pi(X) w \rangle = 0\), since \(\pi(X) w \in W\).

With the orthogonal complement of an invariant subspace known to be invariant, the decomposition is immediate. If \(V\) is irreducible we are done. Otherwise choose an invariant subspace \(W\) with \(\{0\} \neq W \neq V\); then \(V = W \oplus W^{\perp}\) with both summands invariant, hence representations in their own right. Each of \(W\) and \(W^{\perp}\) is either irreducible or splits further as an orthogonal direct sum of invariant subspaces. Since \(V\) is finite-dimensional this cannot continue indefinitely, and we arrive at a decomposition of \(V\) into irreducible invariant subspaces. Hence \(\Pi\) is completely reducible.
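The orthogonal-complement mechanism can be watched in a concrete case. The sketch below is an assumed example, not taken from the text: the symmetric group \(S_3\) acting on \(\mathbb{R}^3\) by permuting coordinates. Permutation matrices are real orthogonal, hence unitary when viewed over \(\mathbb{C}\). The invariant subspace \(W\) is the line through \((1,1,1)\), and \(W^{\perp}\) is the plane of vectors whose coordinates sum to zero. The helper names `act` and `dot` are illustrative.

```python
from itertools import permutations

def act(perm, v):
    """Permutation matrix action: basis vector e_i is sent to e_{perm[i]}."""
    out = [0] * len(v)
    for i, x in enumerate(v):
        out[perm[i]] = x
    return out

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(x * y for x, y in zip(u, v))

group = list(permutations(range(3)))      # the six elements of S_3
w = [1, 1, 1]                             # spans the invariant line W
perp_basis = [[1, -1, 0], [0, 1, -1]]     # spans W-perp (coordinates sum to zero)

# Permutation matrices preserve the standard inner product (checked on samples).
samples = [[1, 2, 3], [0, -1, 4], [2, 2, -5]]
preserves_inner_product = all(
    dot(act(g, u), act(g, v)) == dot(u, v)
    for g in group for u in samples for v in samples)

# W is invariant, and so is W-perp: images of W-perp stay orthogonal to w.
w_invariant = all(act(g, w) == w for g in group)
perp_invariant = all(dot(act(g, v), w) == 0 for g in group for v in perp_basis)
```

Checking invariance on a basis of \(W^{\perp}\) suffices, because each group element acts linearly.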

The proposition reduces complete reducibility to the existence of an invariant inner product. For a compact group such an inner product can always be produced by averaging, and the averaging requires a notion of integration over the group that is invariant under the group's own action. We construct that integral from a differential form, which is lighter machinery than a measure and exactly suited to the top-degree averaging we need.

Theorem (Complete Reducibility for Compact Groups)

If \(G\) is a compact matrix Lie group, then every finite-dimensional representation of \(G\) is completely reducible.

Construction of the invariant integral:

We require a notion of integration over \(G\) that is invariant under the right action of the group. We obtain it from a right-invariant differential form. If \(G \subseteq M_n(\mathbb{C})\) is a matrix Lie group, the tangent space at the identity is its Lie algebra \(\mathfrak{g}\), and the tangent space \(T_A G\) at any point \(A \in G\) is the space of matrices \(\{ X A : X \in \mathfrak{g} \}\). Let \(k\) be the dimension of \(\mathfrak{g}\) as a real vector space, and choose a nonzero \(k\)-linear alternating form \(\alpha_I : \mathfrak{g}^k \to \mathbb{R}\); such a form exists and is unique up to a scalar multiple, since the alternating \(k\)-forms on a \(k\)-dimensional space form a one-dimensional space. We transport \(\alpha_I\) to every point of \(G\) by the right action, defining a \(k\)-linear alternating form \(\alpha_A\) on \(T_A G\) by \[ \alpha_A(Y_1, \dots, Y_k) = \alpha_I\bigl(Y_1 A^{-1}, \dots, Y_k A^{-1}\bigr), \qquad Y_1, \dots, Y_k \in T_A G. \] The assignment \(A \mapsto \alpha_A\) is a \(k\)-form \(\alpha\) on \(G\), right-invariant by construction.

Such a top-degree form is exactly what is needed to integrate. For a smooth function \(f : G \to \mathbb{R}\), the product \(f\alpha\) is again a \(k\)-form, and on the compact manifold \(G\) it has a well-defined integral, which we write as \[ \int_G f(A)\, \alpha(A). \] That this integral does not depend on the choice of local coordinates is the algebraic content of how a top-degree form pulls back: under a change of coordinates the form acquires precisely the Jacobian determinant that the classical change-of-variables theorem inserts, and the two cancel. The full construction of the integral of a differential form (orientation, the partition-of-unity assembly over charts) belongs to integration theory on manifolds and we take it as given here; only the right-invariance, recorded next, enters the argument. One remark on orientation belongs here. Integration of a top-degree form depends on a choice of orientation, and \(G\) need not be connected, as \(O(n)\) shows. No difficulty arises: \(\alpha\) is nowhere zero, so it orients every component of \(G\), and because each right translation pulls \(\alpha\) back to itself, right translations preserve this orientation. A reader who prefers to avoid orientations altogether may integrate the associated density \(|\alpha|\) instead; the right-invariance and the positivity used below hold in either formulation.

Because \(\alpha\) was built from the right action, the resulting integral is invariant under that action: for every \(B \in G\), \[ \int_G f(AB)\, \alpha(A) = \int_G f(A)\, \alpha(A). \] Compactness of \(G\) is what guarantees the integral is finite, so this right-invariant integral exists precisely in the setting we care about.

Proof of the theorem:

Let \(\Pi\) be a finite-dimensional representation of \(G\) on \(V\). Choose any inner product \(\langle \cdot, \cdot \rangle\) on \(V\) and average it over the group, defining \(\langle \cdot, \cdot \rangle_G : V \times V \to \mathbb{C}\) by \[ \langle v, w \rangle_G = \int_G \langle \Pi(A) v, \Pi(A) w \rangle\, \alpha(A). \] This is again an inner product: it is sesquilinear and conjugate-symmetric because the integrand is, and it is positive definite because, for \(v \neq 0\), the integrand \(\langle \Pi(A) v, \Pi(A) v \rangle\) is a continuous function of \(A\) that is strictly positive everywhere (each \(\Pi(A)\) is invertible), so its integral against \(\alpha\) is positive.

We claim \(\Pi(B)\) is unitary with respect to \(\langle \cdot, \cdot \rangle_G\) for every \(B \in G\). Using that \(\Pi\) is a homomorphism and then the right-invariance of the integral, \[ \begin{align*} \langle \Pi(B) v, \Pi(B) w \rangle_G &= \int_G \langle \Pi(A)\Pi(B) v, \Pi(A)\Pi(B) w \rangle\, \alpha(A) \\ &= \int_G \langle \Pi(AB) v, \Pi(AB) w \rangle\, \alpha(A) \\ &= \int_G \langle \Pi(A) v, \Pi(A) w \rangle\, \alpha(A) \\ &= \langle v, w \rangle_G, \end{align*} \] where the third equality is the right-invariance applied to the function \(A \mapsto \langle \Pi(A) v, \Pi(A) w \rangle\). Thus every \(\Pi(B)\) preserves \(\langle \cdot, \cdot \rangle_G\), so \(\Pi\) is a unitary representation with respect to this averaged inner product. By the preceding proposition, \(\Pi\) is completely reducible.

The argument isolates exactly where compactness is used: only in guaranteeing that the averaging integral converges. The rest is the unitarian trick: replace an arbitrary inner product by a group-averaged one, under which the representation becomes unitary, and let the orthogonal-complement mechanism of the proposition finish the decomposition. This is why the groups \(SO(n)\) and \(SU(n)\), being compact, have every finite-dimensional representation splitting into irreducible pieces, the structural fact on which the classification of their representations rests.
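For a finite group, which is a compact group of dimension zero, the invariant integral becomes a finite sum, and the averaging argument can be carried out exactly. The sketch below is an assumed example rather than anything from the text: the cyclic group of order three acts on \(\mathbb{Q}^2\) through a matrix `M` with \(M^3 = I\) that is not orthogonal. The averaged form is stored as its Gram matrix `G`, so that \(\langle v, w \rangle_G = v^{T} G\, w\). Dividing by the group order is a normalization added here for convenience; the argument does not need it.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

identity = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
M = [[Fraction(0), Fraction(-1)], [Fraction(1), Fraction(-1)]]   # M^3 = I
group = [identity, M, matmul(M, M)]      # image of Z/3 under the representation

# Gram matrix of the averaged form: G = (1/|group|) * sum over A of A^T A.
G = [[sum(matmul(transpose(A), A)[i][j] for A in group) / len(group)
      for j in range(2)] for i in range(2)]

# M does not preserve the standard inner product ...
standard_preserved = matmul(transpose(M), M) == identity
# ... but every group element preserves the averaged one: B^T G B = G.
averaged_preserved = all(matmul(matmul(transpose(B), G), B) == G for B in group)
```

The invariance check succeeds for the same reason as in the proof: replacing \(A\) by \(AB\) merely permutes the terms of the sum.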

Schur's Lemma

Complete reducibility tells us that a representation breaks into irreducible pieces. Schur's lemma governs the maps between irreducible pieces, and it is startlingly rigid: between two irreducibles there is essentially no room for an intertwining map to be anything other than zero or an isomorphism, and a self-map of a complex irreducible can only be a scalar. This is the lemma that converts symmetry into constraint, and the following sections of the wider theory — characters, the orthogonality relations, the decomposition of tensor products — all lean on it.

Theorem (Schur's Lemma)

1. Let \(V\) and \(W\) be irreducible real or complex representations of a group or Lie algebra, and let \(\phi : V \to W\) be an intertwining map. Then either \(\phi = 0\) or \(\phi\) is an isomorphism.
2. Let \(V\) be an irreducible complex representation of a group or Lie algebra, and let \(\phi : V \to V\) be an intertwining map of \(V\) with itself. Then \(\phi = \lambda I\) for some \(\lambda \in \mathbb{C}\).
3. Let \(V\) and \(W\) be irreducible complex representations, and let \(\phi_1, \phi_2 : V \to W\) be nonzero intertwining maps. Then \(\phi_1 = \lambda\, \phi_2\) for some \(\lambda \in \mathbb{C}\).

The last two points hold only over \(\mathbb{C}\) (or another algebraically closed field), not over \(\mathbb{R}\).

Proof of Point 1:

We argue the group case; the Lie algebra case requires only the obvious notational changes. Write \(\Pi\) for the action on \(V\) and \(\Sigma\) for the action on \(W\), so that \(\phi(\Pi(A) v) = \Sigma(A) \phi(v)\) for all \(A\). Then \(\ker\phi\) is an invariant subspace of \(V\): if \(v \in \ker\phi\), then \[ \phi(\Pi(A) v) = \Sigma(A) \phi(v) = \Sigma(A) \cdot 0 = 0, \] so \(\Pi(A) v \in \ker\phi\). Since \(V\) is irreducible, \(\ker\phi = \{0\}\) or \(\ker\phi = V\); thus \(\phi\) is either one-to-one or zero.

Suppose \(\phi\) is one-to-one. Then \(\operatorname{im}\phi\) is a nonzero subspace of \(W\), and it is invariant: if \(w = \phi(v)\) lies in the image, then \[ \Sigma(A) w = \Sigma(A) \phi(v) = \phi(\Pi(A) v) \in \operatorname{im}\phi. \] Since \(W\) is irreducible and \(\operatorname{im}\phi\) is nonzero and invariant, we must have \(\operatorname{im}\phi = W\). Thus \(\phi\) is either zero or one-to-one and onto, that is, an isomorphism.

Proof of Point 2:

Suppose \(V\) is an irreducible complex representation and \(\phi : V \to V\) intertwines, so \(\phi\,\Pi(A) = \Pi(A)\,\phi\) for all \(A\). Since we are working over the algebraically closed field \(\mathbb{C}\), the operator \(\phi\) has at least one eigenvalue \(\lambda \in \mathbb{C}\). Let \(U\) be the corresponding eigenspace. Then each \(\Pi(A)\) maps \(U\) into itself: if \(\phi u = \lambda u\), then \[ \phi\bigl(\Pi(A) u\bigr) = \Pi(A)\,\phi(u) = \Pi(A)(\lambda u) = \lambda\,\Pi(A) u, \] so \(\Pi(A) u\) is again a \(\lambda\)-eigenvector, i.e. \(\Pi(A) u \in U\). Thus \(U\) is an invariant subspace. Since \(\lambda\) is an eigenvalue, \(U \neq \{0\}\), and irreducibility forces \(U = V\). But \(U = V\) means \(\phi\) acts as \(\lambda\) on all of \(V\), that is, \(\phi = \lambda I\).

This is exactly where working over \(\mathbb{C}\) is essential: the existence of an eigenvalue used the algebraic closure of \(\mathbb{C}\). Over \(\mathbb{R}\) the conclusion fails. The rotation representation of \(SO(2)\) on \(\mathbb{R}^2\) is irreducible, since no real line is mapped into itself by all rotations, yet the rotation by a right angle commutes with every element of the representation while being no real scalar multiple of the identity.
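The real counterexample can be verified with exact arithmetic. The following sketch checks, on a few sample rotations with rational entries built from Pythagorean triples (an illustrative choice, not from the text), that the right-angle rotation `J` commutes with them, and that `J` has no real eigenvalue and so cannot be a real scalar.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(c, s):
    """Rotation matrix with cosine c and sine s (assumes c*c + s*s == 1)."""
    return [[c, -s], [s, c]]

# Exact rotations from Pythagorean triples, so no rounding is involved.
rotations = [rotation(Fraction(3, 5), Fraction(4, 5)),
             rotation(Fraction(5, 13), Fraction(12, 13)),
             rotation(Fraction(-8, 17), Fraction(15, 17))]

J = rotation(Fraction(0), Fraction(1))    # rotation by a right angle

# J intertwines the representation with itself: J R = R J for each sample R.
commutes = all(matmul(J, R) == matmul(R, J) for R in rotations)

# Characteristic polynomial of J is t^2 - tr(J) t + det(J); a negative
# discriminant means J has no real eigenvalue at all.
trace = J[0][0] + J[1][1]
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
discriminant = trace * trace - 4 * det
```

Over \(\mathbb{C}\) the same matrix has eigenvalues \(\pm i\), and the corresponding eigenlines are invariant, which is consistent with the complexified representation being reducible.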

Proof of Point 3:

Let \(\phi_1, \phi_2 : V \to W\) be nonzero intertwining maps between irreducible complex representations. By Point 1, \(\phi_2\) is an isomorphism, so \(\phi_2^{-1}\) exists and \(\phi_1 \circ \phi_2^{-1} : W \to W\) is an intertwining map of \(W\) with itself. By Point 2 it equals \(\lambda I\) for some \(\lambda \in \mathbb{C}\), whence \(\phi_1 = \lambda\, \phi_2\).

Two consequences record the rigidity in the forms used most often. The first concerns elements acting centrally.

Corollary (Central Elements Act by Scalars)

Let \(\Pi\) be an irreducible complex representation of a matrix Lie group \(G\). If \(A\) lies in the center of \(G\), then \(\Pi(A) = \lambda I\) for some \(\lambda \in \mathbb{C}\). Likewise, if \(\pi\) is an irreducible complex representation of a Lie algebra \(\mathfrak{g}\) and \(X\) lies in the center of \(\mathfrak{g}\), then \(\pi(X) = \lambda I\).

Proof:

If \(A\) is central, then \(\Pi(A)\) commutes with \(\Pi(B)\) for every \(B \in G\), since \(\Pi(A)\Pi(B) = \Pi(AB) = \Pi(BA) = \Pi(B)\Pi(A)\); thus \(\Pi(A)\) is an intertwining map of the irreducible complex representation with itself, and Point 2 of Schur's lemma gives \(\Pi(A) = \lambda I\). The Lie algebra case is identical, with the bracket relation \(\pi(X)\pi(Y) - \pi(Y)\pi(X) = \pi([X, Y]) = 0\) for central \(X\) showing \(\pi(X)\) commutes with all \(\pi(Y)\).

The second consequence settles the abelian case completely.

Corollary (Irreducible Complex Representations of an Abelian Group)

An irreducible complex representation of a commutative group or Lie algebra is one-dimensional.

Proof:

We argue the group case. If \(G\) is commutative, then the center of \(G\) is all of \(G\), so by the preceding corollary \(\Pi(A) = \lambda I\) for each \(A \in G\), with the scalar depending on \(A\). But if every \(\Pi(A)\) is a multiple of the identity, then every subspace of \(V\) is invariant. The only way \(V\) can avoid having a nontrivial invariant subspace — that is, the only way it can be irreducible — is for \(V\) to be one-dimensional.
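A standard illustration of this corollary is the cyclic group \(\mathbb{Z}/n\) acting on \(\mathbb{C}^n\) by cyclic shifts, which splits into \(n\) invariant lines. The sketch below assumes \(n = 4\) and uses floating-point complex numbers, so equalities are tested up to a tolerance; the names `shift`, `character_vector`, and `is_multiple` are illustrative.

```python
import cmath

n = 4
omega = cmath.exp(2j * cmath.pi / n)     # primitive n-th root of unity

def shift(v):
    """Generator of Z/n acting on C^n by cyclically shifting coordinates."""
    return [v[-1]] + v[:-1]

def character_vector(k):
    """The vector (1, w^k, w^(2k), ...) with w = omega."""
    return [omega ** (k * j) for j in range(n)]

def is_multiple(u, v, scalar, tol=1e-12):
    """Check u == scalar * v up to floating-point tolerance."""
    return all(abs(x - scalar * y) < tol for x, y in zip(u, v))

# Each character vector is an eigenvector of the shift with eigenvalue
# omega^(-k), so its span is a one-dimensional invariant subspace.
lines_invariant = all(
    is_multiple(shift(character_vector(k)), character_vector(k), omega ** (-k))
    for k in range(n))
```

Since the generator acts on each of these lines by a scalar, so does every power of it, and each line is a one-dimensional representation of the whole group.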

Why Equivariant Layers Have So Few Weights

A linear layer that respects a group's symmetry is exactly an intertwining map between the representation carried by its input and the one carried by its output; in other words, the layer is an equivariant map in the linear case. Decompose input and output into irreducible pieces, as complete reducibility permits for a compact symmetry group. Schur's lemma then dictates the block structure: between two inequivalent irreducible pieces the only intertwining map is zero, so those blocks are forced to vanish; between equivalent complex irreducibles the intertwining maps are the scalar multiples of a single fixed isomorphism. The free parameters of an equivariant layer therefore collapse to one scalar for each pair consisting of an input summand and an output summand of the same irreducible type, rather than a full dense matrix between channels. The drastic reduction in weights, and the resulting gain in sample efficiency, is not a heuristic of architecture design; it is Schur's lemma counting the admissible maps. One qualification: the "single scalar" count is the complex statement, the second and third parts of Schur's lemma, which hold only over \(\mathbb{C}\). A network whose weights are real does not automatically inherit it; the space of equivariant maps between two real irreducibles can be larger, and pinning down its exact dimension is a separate question about the endomorphism algebra of a real representation, beyond the complex lemma proved here. The qualitative lesson (inequivalent types do not mix, matched types couple through a small fixed number of parameters) survives regardless.
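The parameter count can be made concrete by solving the equivariance constraint directly. The sketch below is an assumed example, not taken from the text: a linear layer from \(\mathbb{Q}^3\) to itself that must commute with all coordinate permutations. It sets up the linear equations \(XP = PX\) for two generators of \(S_3\) and counts the free parameters as the dimension of the solution space. The function names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def perm_matrix(perm):
    """Matrix sending basis vector e_j to e_{perm[j]}."""
    n = len(perm)
    return [[1 if perm[j] == i else 0 for j in range(n)] for i in range(n)]

def commutant_dimension(generators, n):
    """Dimension of {X : X P = P X for every generator P}, X an n x n matrix."""
    equations = []
    for P in generators:
        for i in range(n):
            for j in range(n):
                row = [0] * (n * n)          # coefficients of the unknowns X[a][b]
                for k in range(n):
                    row[i * n + k] += P[k][j]    # from (X P)[i][j]
                    row[k * n + j] -= P[i][k]    # from (P X)[i][j]
                equations.append(row)
    return n * n - rank(equations)

generators = [perm_matrix((1, 0, 2)), perm_matrix((1, 2, 0))]  # a swap and a 3-cycle
free_weights = commutant_dimension(generators, 3)
dense_weights = commutant_dimension([perm_matrix((0, 1, 2))], 3)  # no symmetry imposed
```

In this example the nine weights of a dense layer reduce to two, matching the decomposition of the permutation representation into a trivial line and a two-dimensional irreducible piece, each occurring once. Both pieces remain irreducible over \(\mathbb{C}\), so in this particular case the real and complex counts agree.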
