The Classification
We have built, for each integer \(m \geq 0\), an explicit polynomial representation
\(\pi_m\) of dimension \(m + 1\), and we proved each one irreducible.
That was the constructive half of the story: it produced one irreducible representation in
each dimension. The present section completes it. We will show that these are, up to
isomorphism, the only finite-dimensional irreducible complex representations of
\(\mathfrak{sl}(2;\mathbb{C})\).
The computation is worth doing carefully for three reasons. First,
\(\mathfrak{sl}(2;\mathbb{C})\) is the
complexification
of \(\mathfrak{su}(2)\), which is in turn isomorphic to \(\mathfrak{so}(3)\); the
representations of \(\mathfrak{so}(3)\) are the mathematics of angular momentum in quantum
mechanics, and the calculation we are about to perform is exactly the one found in physics
texts under that heading. Second, the method is a template: it shows how the commutation
relations alone, with no further input, force the entire structure of a representation.
Third, the representation theory of larger semisimple Lie algebras is built by locating
copies of \(\mathfrak{sl}(2;\mathbb{C})\) inside them and applying precisely this result.
The Working Basis
We use the basis of \(\mathfrak{sl}(2;\mathbb{C})\) introduced when we formed the
complexification of \(\mathfrak{su}(2)\),
\[
H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
E = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
F = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},
\]
with the commutation relations
\[
\begin{align*}
[H, E] &= 2E, \\\\
[H, F] &= -2F, \\\\
[E, F] &= H.
\end{align*}
\]
These three relations are the entire content of the Lie algebra; everything that follows is
extracted from them. The element \(H\) is the one we will diagonalize, \(E\) will raise
eigenvalues, and \(F\) will lower them.
A useful consequence is that representations of \(\mathfrak{sl}(2;\mathbb{C})\) are cheap to
specify. If \(V\) is a finite-dimensional complex vector space and \(A, B, C\) are operators
on \(V\) satisfying the same three relations \([A, B] = 2B\), \([A, C] = -2C\), and
\([B, C] = A\), then by the bilinearity and skew-symmetry of the bracket the unique linear
map \(\pi : \mathfrak{sl}(2;\mathbb{C}) \to \mathfrak{gl}(V)\) determined by
\[
\pi(H) = A, \qquad \pi(E) = B, \qquad \pi(F) = C
\]
is automatically a representation: the bracket relations it must preserve are exactly the
three we imposed on \(A, B, C\). Building a representation thus reduces to finding three
operators with the right commutators.
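As a minimal numerical sketch of this criterion, one can test whether three given matrices satisfy the three relations. NumPy is assumed here, and the helper names `bracket` and `is_sl2_triple` are hypothetical, introduced only for illustration. The check is run on the defining matrices \(H, E, F\) and then on the same matrices with the roles of \(E\) and \(F\) swapped, which should fail.

```python
import numpy as np

def bracket(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def is_sl2_triple(A, B, C, tol=1e-12):
    """True if [A,B] = 2B, [A,C] = -2C and [B,C] = A, up to tol."""
    return bool(
        np.allclose(bracket(A, B), 2 * B, atol=tol)
        and np.allclose(bracket(A, C), -2 * C, atol=tol)
        and np.allclose(bracket(B, C), A, atol=tol)
    )

H = np.array([[1.0, 0.0], [0.0, -1.0]])
E = np.array([[0.0, 1.0], [0.0, 0.0]])
F = np.array([[0.0, 0.0], [1.0, 0.0]])

print(is_sl2_triple(H, E, F))   # defining matrices: True
print(is_sl2_triple(H, F, E))   # E and F swapped: False
```

Any triple that passes this check defines a representation by sending \(H, E, F\) to the three matrices in that order.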
Theorem (Classification of Irreducible Representations of \(\mathfrak{sl}(2;\mathbb{C})\))
For each integer \(m \geq 0\), there is an irreducible complex representation of
\(\mathfrak{sl}(2;\mathbb{C})\) of dimension \(m + 1\). Any two irreducible complex
representations of \(\mathfrak{sl}(2;\mathbb{C})\) of the same dimension are isomorphic.
If \(\pi\) is an irreducible complex representation of \(\mathfrak{sl}(2;\mathbb{C})\) of
dimension \(m + 1\), then \(\pi\) is isomorphic to the polynomial representation
\(\pi_m\).
The existence half is already in hand — the \(\pi_m\) supply one irreducible
representation in each dimension. What remains is the uniqueness, and it is the substantial
part: we must show that an arbitrary irreducible representation, presented with no structure
beyond the bracket relations, is forced into the shape of some \(\pi_m\). The strategy is to
diagonalize \(\pi(H)\) and watch how \(E\) and \(F\) move its eigenvalues. The single lemma
of the next section is the engine.
Raising and Lowering
When we built the polynomial representations, we saw the operators \(\pi_m(E)\) and
\(\pi_m(F)\) shift the monomial basis up and down a ladder of \(\pi_m(H)\)-eigenvalues in
steps of two. That was a feature of the explicit model. The following lemma shows it is not
special to the model at all: in any representation, \(E\) and \(F\) move
\(\pi(H)\)-eigenvalues by exactly \(+2\) and \(-2\). The fact is forced by the commutation
relations alone, and it is the whole engine of the classification.
Lemma (Raising and Lowering)
Let \(\pi\) be a representation of \(\mathfrak{sl}(2;\mathbb{C})\) on a complex vector
space \(V\), and let \(u\) be an eigenvector of \(\pi(H)\) with eigenvalue
\(\alpha \in \mathbb{C}\). Then
\[
\pi(H)\,\pi(E)\,u = (\alpha + 2)\,\pi(E)\,u.
\]
Thus either \(\pi(E) u = 0\), or \(\pi(E) u\) is an eigenvector of \(\pi(H)\) with
eigenvalue \(\alpha + 2\). Similarly,
\[
\pi(H)\,\pi(F)\,u = (\alpha - 2)\,\pi(F)\,u,
\]
so either \(\pi(F) u = 0\), or \(\pi(F) u\) is an eigenvector of \(\pi(H)\) with
eigenvalue \(\alpha - 2\).
Proof:
Since \(\pi\) is a representation, it carries brackets to commutators:
\([\pi(H), \pi(E)] = \pi([H, E]) = 2\,\pi(E)\), using the relation \([H, E] = 2E\).
Rearranging the commutator gives \(\pi(H)\pi(E) = \pi(E)\pi(H) + 2\pi(E)\). Applying both
sides to \(u\) and using \(\pi(H) u = \alpha u\),
\[
\begin{align*}
\pi(H)\,\pi(E)\,u &= \pi(E)\,\pi(H)\,u + 2\,\pi(E)\,u \\\\
&= \pi(E)\,(\alpha u) + 2\,\pi(E)\,u \\\\
&= (\alpha + 2)\,\pi(E)\,u.
\end{align*}
\]
If \(\pi(E) u \neq 0\) this says \(\pi(E) u\) is a \(\pi(H)\)-eigenvector with eigenvalue
\(\alpha + 2\). The argument for \(\pi(F)\) is identical, starting from
\([\pi(H), \pi(F)] = \pi([H, F]) = -2\,\pi(F)\), which gives
\(\pi(H)\pi(F) = \pi(F)\pi(H) - 2\pi(F)\) and hence
\(\pi(H)\pi(F)u = (\alpha - 2)\pi(F) u\).
The mechanism is purely commutational. Moving \(\pi(H)\) past \(\pi(E)\) costs an extra
\(+2\pi(E)\), and that surcharge is what bumps the eigenvalue up by two; moving it past
\(\pi(F)\) costs \(-2\pi(F)\) and bumps it down. So from a single \(\pi(H)\)-eigenvector we
can manufacture a whole string of them, with eigenvalues marching in steps of two, until an
application of \(\pi(E)\) or \(\pi(F)\) finally returns zero. Controlling where those strings start and
stop is exactly what pins down the representation, and that is the next section.
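The lemma can be watched in action on a small example. The sketch below assumes NumPy and uses, purely as an illustration, the representation on \(\mathbb{C}^2 \otimes \mathbb{C}^2\) in which each \(X\) acts as \(X \otimes I + I \otimes X\); the helper name `on_tensor_square` is hypothetical. Starting from an eigenvector with \(\alpha = 0\), it checks that raising and lowering land on eigenvalues \(+2\) and \(-2\).

```python
import numpy as np

H = np.array([[1.0, 0.0], [0.0, -1.0]])
E = np.array([[0.0, 1.0], [0.0, 0.0]])
F = np.array([[0.0, 0.0], [1.0, 0.0]])
I2 = np.eye(2)

def on_tensor_square(X):
    """Action of X on C^2 (x) C^2: X (x) I + I (x) X."""
    return np.kron(X, I2) + np.kron(I2, X)

pH, pE, pF = (on_tensor_square(X) for X in (H, E, F))

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
u = np.kron(e1, e2)          # an eigenvector of pi(H) with eigenvalue 0
alpha = 0.0
assert np.allclose(pH @ u, alpha * u)

raised = pE @ u              # nonzero, so an eigenvector for alpha + 2
lowered = pF @ u             # nonzero, so an eigenvector for alpha - 2
raise_ok = np.allclose(pH @ raised, (alpha + 2) * raised)
lower_ok = np.allclose(pH @ lowered, (alpha - 2) * lowered)
print(raise_ok, lower_ok)    # True True
```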
Proof of the Classification
Let \(\pi\) be an irreducible complex representation of \(\mathfrak{sl}(2;\mathbb{C})\)
acting on a finite-dimensional space \(V\). We prove it must take the explicit form of one
of the \(\pi_m\). The argument has three movements: find a top of the ladder, descend it to
discover that the eigenvalues are integers symmetric about zero, and check that the ladder
spans everything.
Finding the Top of the Ladder
Since we work over \(\mathbb{C}\), the operator \(\pi(H)\) has at least one eigenvector
\(u\), say with eigenvalue \(\alpha\). Applying the raising lemma repeatedly,
\[
\pi(H)\,\pi(E)^k u = (\alpha + 2k)\,\pi(E)^k u,
\]
so each nonzero \(\pi(E)^k u\) is a \(\pi(H)\)-eigenvector with eigenvalue \(\alpha + 2k\).
These eigenvalues \(\alpha, \alpha + 2, \alpha + 4, \dots\) are all distinct, but
\(\pi(H)\) acts on the finite-dimensional space \(V\) and so has only finitely many
eigenvalues; hence the vectors \(\pi(E)^k u\) cannot all be
nonzero. Let \(N\) be the largest index with \(\pi(E)^N u \neq 0\); then
\(\pi(E)^{N+1} u = 0\). Set
\[
u_0 := \pi(E)^N u, \qquad \lambda := \alpha + 2N.
\]
By construction \(u_0\) is a nonzero \(\pi(H)\)-eigenvector at the top of its raising chain,
\[
\pi(H)\,u_0 = \lambda\, u_0, \qquad \pi(E)\,u_0 = 0.
\]
This \(u_0\) is the highest weight vector: raising it gives nothing.
Descending the Ladder
Now lower repeatedly. Define
\[
u_k := \pi(F)^k u_0 \qquad (k \geq 0).
\]
By the lowering half of the lemma, applied \(k\) times to \(u_0\),
\[
\pi(H)\,u_k = (\lambda - 2k)\, u_k.
\]
The action of \(\pi(E)\) on this descending chain is governed by an identity we verify by
induction:
\[
\pi(E)\,u_k = k\,[\lambda - (k - 1)]\, u_{k-1} \qquad (k \geq 1). \tag{$\ast$}
\]
For \(k = 1\), use the bracket relation \([E, F] = H\) in the form
\(\pi(E)\pi(F) = \pi(F)\pi(E) + \pi(H)\), applied to \(u_0\):
\[
\begin{align*}
\pi(E)\,u_1 = \pi(E)\pi(F)\,u_0
&= \pi(F)\,\pi(E)\,u_0 + \pi(H)\,u_0 \\\\
&= 0 + \lambda\, u_0 = 1 \cdot [\lambda - 0]\, u_0,
\end{align*}
\]
using \(\pi(E) u_0 = 0\). This is \((\ast)\) at \(k = 1\). Assume \((\ast)\) holds at \(k\).
Then, again from \(\pi(E)\pi(F) = \pi(F)\pi(E) + \pi(H)\) applied to \(u_k\),
\[
\begin{align*}
\pi(E)\,u_{k+1} &= \pi(E)\pi(F)\,u_k = \pi(F)\,\pi(E)\,u_k + \pi(H)\,u_k \\\\
&= \pi(F)\bigl(k[\lambda - (k-1)]\,u_{k-1}\bigr) + (\lambda - 2k)\,u_k \\\\
&= k[\lambda - (k-1)]\,u_k + (\lambda - 2k)\,u_k \\\\
&= \bigl(k\lambda - k(k-1) + \lambda - 2k\bigr)\,u_k \\\\
&= (k + 1)\,[\lambda - k]\,u_k,
\end{align*}
\]
where the last line factors \(k\lambda + \lambda - k^2 + k - 2k = (k+1)\lambda - k(k+1)
= (k+1)(\lambda - k)\). This is \((\ast)\) at \(k + 1\), completing the induction.
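The identity \((\ast)\) can be checked numerically on a concrete ladder. The following sketch assumes NumPy and, as an illustrative choice not made in the text, the representation on \(\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2\) in which each \(X\) acts on one tensor factor at a time; there \(e_1 \otimes e_1 \otimes e_1\) is killed by \(\pi(E)\) and has \(\lambda = 3\). The helper name `on_tensor_cube` is hypothetical.

```python
import numpy as np
from functools import reduce

H = np.array([[1.0, 0.0], [0.0, -1.0]])
E = np.array([[0.0, 1.0], [0.0, 0.0]])
F = np.array([[0.0, 0.0], [1.0, 0.0]])
I2 = np.eye(2)

def on_tensor_cube(X):
    """Action of X on C^2 (x) C^2 (x) C^2: X on factor i, identity elsewhere, summed over i."""
    return sum(
        reduce(np.kron, [X if j == i else I2 for j in range(3)])
        for i in range(3)
    )

pH, pE, pF = (on_tensor_cube(X) for X in (H, E, F))

e1 = np.array([1.0, 0.0])
u0 = reduce(np.kron, [e1, e1, e1])   # pi(E) u0 = 0, pi(H) u0 = 3 u0
lam = 3.0

# u[k] = pi(F)^k u0 for k = 0, ..., 4
u = [u0]
for _ in range(4):
    u.append(pF @ u[-1])

# Check pi(E) u_k = k (lam - (k - 1)) u_{k-1} for k = 1, ..., 4
identity_holds = all(
    np.allclose(pE @ u[k], k * (lam - (k - 1)) * u[k - 1])
    for k in range(1, 5)
)
print(identity_holds)   # True
```

In this example the chain stops after four nonzero vectors: `u[4]` is the zero vector.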
The Eigenvalue Is a Non-Negative Integer
The descending eigenvalues \(\lambda, \lambda - 2, \lambda - 4, \dots\) are again distinct,
so the \(u_k\) cannot all be nonzero. Let \(m\) be the largest index with \(u_m \neq 0\);
then
\[
u_{m+1} = \pi(F)^{m+1} u_0 = 0.
\]
Apply \(\pi(E)\) to this zero vector and use \((\ast)\) at \(k = m + 1\):
\[
0 = \pi(E)\,u_{m+1} = (m + 1)\,[\lambda - m]\, u_m.
\]
Since \(u_m \neq 0\) and \(m + 1 \neq 0\), we are forced to conclude \(\lambda - m = 0\),
that is,
\[
\lambda = m,
\]
a non-negative integer. The highest weight could not have been anything else; the bracket
relations admit no finite-dimensional representation whose highest weight is a non-integer
or a negative number. This is the crux of the classification.
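The arithmetic behind this step is simple enough to tabulate. The sketch below, in plain Python with exact fractions, looks for the first \(k \geq 1\) at which the coefficient \(k\,[\lambda - (k - 1)]\) in \((\ast)\) vanishes; the function name `first_vanishing_step` and the search bound `max_k` are hypothetical choices for illustration. A finite chain needs such a \(k\), and one exists only when \(\lambda\) is a non-negative integer.

```python
from fractions import Fraction

def first_vanishing_step(lam, max_k=50):
    """Smallest k >= 1 with k * (lam - (k - 1)) == 0, or None if none occurs up to max_k."""
    for k in range(1, max_k + 1):
        if k * (lam - (k - 1)) == 0:
            return k
    return None

print(first_vanishing_step(Fraction(3)))      # 4, that is m + 1 with m = 3
print(first_vanishing_step(Fraction(5, 2)))   # None: non-integer lambda
print(first_vanishing_step(Fraction(-1)))     # None: negative lambda
```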
The Ladder Is Everything
With \(\lambda = m\), the vectors \(u_0, u_1, \dots, u_m\) are eigenvectors of \(\pi(H)\)
with the distinct eigenvalues \(m, m - 2, \dots, -m\), and
eigenvectors for distinct eigenvalues are linearly independent;
so the \(u_k\) are independent and their span
\(W = \langle u_0, \dots, u_m \rangle\) has dimension \(m + 1\). Collecting the three actions,
\[
\begin{align*}
\pi(H)\,u_k &= (m - 2k)\,u_k, \\\\
\pi(F)\,u_k &= \begin{cases} u_{k+1} & k < m, \\ 0 & k = m, \end{cases} \\\\
\pi(E)\,u_k &= \begin{cases} k\,[m - (k-1)]\,u_{k-1} & k > 0, \\ 0 & k = 0, \end{cases}
\end{align*} \tag{$\ast\ast$}
\]
we see \(W\) is carried into itself by \(\pi(H)\), \(\pi(E)\), and \(\pi(F)\), hence by
\(\pi(Z)\) for every \(Z \in \mathfrak{sl}(2;\mathbb{C})\). So \(W\) is a nonzero invariant
subspace. Because \(\pi\) is irreducible, \(W = V\). In particular \(\dim V = m + 1\), and
\(\pi\) is given on the basis \(u_0, \dots, u_m\) by the formulas \((\ast\ast)\).
The formulas \((\ast\ast)\) determine \(\pi\) completely once \(m\) is fixed. If \(\pi'\) is
a second irreducible complex representation of dimension \(m + 1\), on a space \(V'\) with
ladder basis \(u'_0, \dots, u'_m\), then the linear map \(V \to V'\) sending
\(u_k \mapsto u'_k\) commutes with the actions of \(H\), \(E\), and \(F\), so it is an
isomorphism of representations. Hence any two irreducible complex representations of the
same dimension \(m + 1\) are isomorphic. Conversely, defining operators by \((\ast\ast)\) on
an \((m+1)\)-dimensional space yields operators satisfying the three bracket relations,
as a direct check on the formulas shows, and the same ladder argument shows the resulting
representation is irreducible. The explicit polynomial representation \(\pi_m\) is
irreducible of dimension \(m + 1\), so in a suitable basis it too is given by
\((\ast\ast)\); hence every irreducible complex representation of dimension \(m + 1\) is
isomorphic to \(\pi_m\). This proves the classification.
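The direct check on the formulas \((\ast\ast)\) can also be delegated to a few lines of code. This is a sketch assuming NumPy; the function name `irrep_matrices` is hypothetical. It writes \(\pi(H)\), \(\pi(E)\), \(\pi(F)\) as matrices in the basis \(u_0, \dots, u_m\) (column \(k\) holds the image of \(u_k\)) and verifies the three bracket relations for small \(m\).

```python
import numpy as np

def irrep_matrices(m):
    """Matrices of pi(H), pi(E), pi(F) in the basis u_0, ..., u_m, read off from (**)."""
    n = m + 1
    H = np.diag([float(m - 2 * k) for k in range(n)])
    E = np.zeros((n, n))
    F = np.zeros((n, n))
    for k in range(n):
        if k < m:
            F[k + 1, k] = 1.0                  # pi(F) u_k = u_{k+1}
        if k > 0:
            E[k - 1, k] = k * (m - (k - 1))    # pi(E) u_k = k [m - (k-1)] u_{k-1}
    return H, E, F

def bracket(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

for m in range(6):
    H, E, F = irrep_matrices(m)
    relations_hold = bool(
        np.allclose(bracket(H, E), 2 * E)
        and np.allclose(bracket(H, F), -2 * F)
        and np.allclose(bracket(E, F), H)
    )
    print(m, H.shape[0], relations_hold)   # m, dimension m + 1, True
```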
General Representations
The classification described the irreducible representations. For applications —
locating copies of \(\mathfrak{sl}(2;\mathbb{C})\) inside larger algebras, and reading the
weight structure of a representation that has not been decomposed — we need a few
facts about finite-dimensional representations that need not be irreducible. They follow
from the classification by decomposing into irreducibles, but several can be seen directly.
Theorem (Properties of General Representations)
Let \((\pi, V)\) be a finite-dimensional representation of \(\mathfrak{sl}(2;\mathbb{C})\),
not necessarily irreducible. Then:
1. Every eigenvalue of \(\pi(H)\) is an integer. If \(v\) is an eigenvector of
   \(\pi(H)\) with eigenvalue \(\lambda\) and \(\pi(E) v = 0\), then \(\lambda\) is a
   non-negative integer.
2. The operators \(\pi(E)\) and \(\pi(F)\) are nilpotent.
3. The operator \(S = e^{\pi(E)} e^{-\pi(F)} e^{\pi(E)}\) satisfies
   \(S\,\pi(H)\,S^{-1} = -\pi(H)\).
4. If an integer \(k\) is an eigenvalue of \(\pi(H)\), then so is each of the numbers
   \(-|k|, -|k| + 2, \dots, |k| - 2, |k|\).
Proof of Point 1:
Let \(v\) be an eigenvector of \(\pi(H)\) with eigenvalue \(\lambda\). By the raising
lemma, \(\pi(E)^j v\) is, when nonzero, a \(\pi(H)\)-eigenvector with eigenvalue
\(\lambda + 2j\); since \(\pi(H)\) has finitely many eigenvalues, there is some
\(N \geq 0\) with \(\pi(E)^N v \neq 0\) and \(\pi(E)^{N+1} v = 0\). Then
\(\pi(E)^N v\) is a highest weight vector with eigenvalue \(\lambda + 2N\). The descent
argument of the classification proof, namely the identity \((\ast)\) and the termination
of the lowering chain, used only that \(V\) is finite-dimensional and never that \(\pi\)
is irreducible. It therefore shows that the highest weight of any such chain is a
non-negative integer \(m\), so \(\lambda + 2N = m\) and hence \(\lambda = m - 2N\) is an
integer. If already \(\pi(E) v = 0\), then \(N = 0\) and \(\lambda = m\) is a
non-negative integer.
Proof of Point 2:
Over \(\mathbb{C}\), the space \(V\) has a basis of generalized eigenvectors for
\(\pi(H)\) — vectors \(v\) for which \([\pi(H) - \lambda I]^k v = 0\) for some
\(\lambda\) and some positive integer \(k\). This is the standard decomposition of a
complex operator into its generalized eigenspaces, a fact of linear algebra that we take
as given here. The point is that raising respects it. Using the relation
\([H, E] = 2E\) and induction on \(k\),
\[
\bigl[\pi(H) - (\lambda + 2) I\bigr]^k \pi(E)
= \pi(E)\,\bigl[\pi(H) - \lambda I\bigr]^k.
\]
Thus if \(v\) is a generalized eigenvector for \(\pi(H)\) with eigenvalue \(\lambda\),
then \(\pi(E) v\) is either zero or a generalized eigenvector with eigenvalue
\(\lambda + 2\). Applying \(\pi(E)\) repeatedly to a generalized eigenvector must
eventually give zero, since \(\pi(H)\) has only finitely many generalized eigenvalues
and each application raises the eigenvalue by two. Hence \(\pi(E)\) is nilpotent. The
same argument, lowering by two, shows \(\pi(F)\) is nilpotent.
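Nilpotency is easy to observe in an example that is not irreducible. The sketch below assumes NumPy and again uses, as an illustrative choice, the action \(X \otimes I + I \otimes X\) on \(\mathbb{C}^2 \otimes \mathbb{C}^2\); the helper names `on_tensor_square` and `nilpotency_index` are hypothetical. It finds the smallest power of \(\pi(E)\) and of \(\pi(F)\) that vanishes.

```python
import numpy as np

E = np.array([[0.0, 1.0], [0.0, 0.0]])
F = np.array([[0.0, 0.0], [1.0, 0.0]])
I2 = np.eye(2)

def on_tensor_square(X):
    """Action of X on C^2 (x) C^2: X (x) I + I (x) X."""
    return np.kron(X, I2) + np.kron(I2, X)

def nilpotency_index(N, tol=1e-12):
    """Smallest j >= 1 with N^j = 0 (up to tol), or None if N^dim is still nonzero."""
    power = np.eye(N.shape[0])
    for j in range(1, N.shape[0] + 1):
        power = power @ N
        if np.all(np.abs(power) < tol):
            return j
    return None

pE, pF = on_tensor_square(E), on_tensor_square(F)
print(nilpotency_index(pE), nilpotency_index(pF))   # 3 3
```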
Proof of Point 3:
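Write \(S = e^{\pi(E)} e^{-\pi(F)} e^{\pi(E)}\), so that
\(S\,\pi(H)\,S^{-1} = e^{\pi(E)} e^{-\pi(F)} e^{\pi(E)}\,\pi(H)\,e^{-\pi(E)} e^{\pi(F)} e^{-\pi(E)}\).
Conjugation by each exponential is computed by the exponential-of-\(\mathrm{ad}\) formula:
for any \(Z \in \mathfrak{sl}(2;\mathbb{C})\) and any operator \(X\) on \(V\),
\(e^{\pi(Z)} X e^{-\pi(Z)} = e^{\mathrm{ad}(\pi(Z))}(X)\), where
\(\mathrm{ad}(Y)(X) = [Y, X]\). We
apply this three times to \(\pi(H)\), working from the innermost conjugation outward and
tracking the result through the bracket relations
\([E, H] = -2E\), \([F, H] = 2F\), \([E, F] = H\) (equivalently
\(\mathrm{ad}(\pi(E))(\pi(H)) = -2\pi(E)\) and so on). First,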
\[
e^{\mathrm{ad}(\pi(E))}(\pi(H)) = \pi(H) + [\pi(E), \pi(H)] = \pi(H) - 2\pi(E),
\]
the series terminating because \([\pi(E), \pi(E)] = 0\) kills all higher brackets.
Next, applying \(e^{-\mathrm{ad}(\pi(F))}\),
\[
e^{-\mathrm{ad}(\pi(F))}\bigl(\pi(H) - 2\pi(E)\bigr) = -\pi(H) - 2\pi(E),
\]
where the brackets \([F, H] = 2F\) and \([F, E] = -H\) generate the surviving terms; a
short calculation shows the \(\pi(F)\) contributions appearing at first and second order
cancel exactly, and the series again terminates. Finally,
\[
e^{\mathrm{ad}(\pi(E))}\bigl(-\pi(H) - 2\pi(E)\bigr) = -\pi(H).
\]
Composing the three conjugations, \(S\,\pi(H)\,S^{-1} = -\pi(H)\), as claimed.
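The reflection property can be confirmed numerically. This sketch assumes NumPy and uses the illustrative representation \(X \otimes I + I \otimes X\) on \(\mathbb{C}^2 \otimes \mathbb{C}^2\); the helper names are hypothetical. Because \(\pi(E)\) and \(\pi(F)\) are nilpotent, their exponentials are finite sums, so no general-purpose matrix exponential is needed.

```python
import numpy as np

H = np.array([[1.0, 0.0], [0.0, -1.0]])
E = np.array([[0.0, 1.0], [0.0, 0.0]])
F = np.array([[0.0, 0.0], [1.0, 0.0]])
I2 = np.eye(2)

def on_tensor_square(X):
    """Action of X on C^2 (x) C^2: X (x) I + I (x) X."""
    return np.kron(X, I2) + np.kron(I2, X)

def exp_nilpotent(N):
    """exp(N) for a nilpotent n-by-n matrix N, as the finite sum of N^j / j! for j <= n."""
    n = N.shape[0]
    result = np.eye(n)
    term = np.eye(n)
    for j in range(1, n + 1):
        term = term @ N / j      # term is now N^j / j!
        result = result + term
    return result

pH, pE, pF = (on_tensor_square(X) for X in (H, E, F))

S = exp_nilpotent(pE) @ exp_nilpotent(-pF) @ exp_nilpotent(pE)
reflects = np.allclose(S @ pH @ np.linalg.inv(S), -pH)
print(reflects)   # True
```

In the defining two-dimensional representation the same product works out to the matrix with rows \((0, 1)\) and \((-1, 0)\).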
Proof of Point 4:
First, the operator \(S\) of Point 3 reflects eigenvalues. Rearranging Point 3 gives
\(S^{-1}\,\pi(H)\,S = -\pi(H)\), so if \(\pi(H) v = k v\), then
\[
\pi(H)(S v) = S\,(S^{-1}\pi(H) S)\,v = S\,(-\pi(H))\,v = -k\,(S v).
\]
Since \(S\) is invertible, \(S v \neq 0\), so \(-k\) is an eigenvalue whenever \(k\) is. In particular, whenever \(k\) is an
eigenvalue so is \(|k| \geq 0\). It therefore suffices to run the chain argument from the
non-negative eigenvalue \(|k|\). As in Point 1, raising from a \(\pi(H)\)-eigenvector of
eigenvalue \(|k|\) produces a highest weight vector with eigenvalue
\(m = |k| + 2N \geq |k|\) for some \(N \geq 0\); note \(m \equiv |k| \pmod 2\). The
classification chain attached to it carries the \(\pi(H)\)-eigenvalues
\(m, m - 2, \dots, -m\), which include every number from \(|k|\) down to \(-|k|\) in
steps of two. Thus \(-|k|, -|k| + 2, \dots, |k| - 2, |k|\) are all eigenvalues of
\(\pi(H)\), as claimed.
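Point 4 can be seen in small reducible examples. The sketch below assumes NumPy and uses, as an illustrative family not taken from the text, the \(n\)-fold tensor powers of \(\mathbb{C}^2\) with \(H\) acting on one factor at a time; the helper names `H_on_tensor_power` and `string_through` are hypothetical. In this basis \(\pi(H)\) is diagonal, so its eigenvalues can be read off the diagonal.

```python
import numpy as np
from functools import reduce

H = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def H_on_tensor_power(n):
    """pi(H) on the n-fold tensor power of C^2: H on factor i, identity elsewhere, summed."""
    return sum(
        reduce(np.kron, [H if j == i else I2 for j in range(n)])
        for i in range(n)
    )

def string_through(k):
    """The set {-|k|, -|k| + 2, ..., |k| - 2, |k|}."""
    a = abs(k)
    return set(range(-a, a + 1, 2))

for n in (2, 3, 4):
    pH = H_on_tensor_power(n)
    eigenvalues = {int(round(x)) for x in np.diag(pH)}   # pH is diagonal here
    strings_present = all(string_through(k) <= eigenvalues for k in eigenvalues)
    print(n, sorted(eigenvalues), strings_present)
```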
The Weight Lattice a Network Sees
Point 4 is the structural fact behind the feature types of a rotation-equivariant
network. When \(\mathfrak{so}(3) \cong \mathfrak{su}(2)\) acts on a layer's activations,
the activations decompose into irreducible pieces indexed by the highest weight \(m\),
and within each piece the \(\pi(H)\)-eigenvalues — the weights — run
symmetrically from \(m\) down to \(-m\) in steps of two. These are exactly the
\(m + 1\) components of a "type-\(m\)" feature: a scalar for \(m = 0\), a two-component
spinor for \(m = 1\), a three-component vector for \(m = 2\), and higher tensors beyond.
The classification guarantees there is exactly one
irreducible type in each dimension, and Point 4 guarantees its weights form this
symmetric ladder; together they fix, with no freedom left, the channel structure that a
rotation-equivariant architecture is built from.
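As a small illustration of this bookkeeping, the following plain-Python sketch lists the weights of a type-\(m\) piece for the first few \(m\). The function name `weights` is hypothetical, and the indexing by highest weight \(m\) follows this section rather than the conventions of any particular library.

```python
def weights(m):
    """Weights of the irreducible piece with highest weight m: m, m - 2, ..., -m."""
    return [m - 2 * k for k in range(m + 1)]

for m in range(4):
    print(m, len(weights(m)), weights(m))
# 0 1 [0]
# 1 2 [1, -1]
# 2 3 [2, 0, -2]
# 3 4 [3, 1, -1, -3]
```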