The Representations of \(\mathfrak{sl}(2;\mathbb{C})\)

The Classification

We have built, for each integer \(m \geq 0\), an explicit polynomial representation \(\pi_m\) of dimension \(m + 1\), and we proved each one irreducible. That was the constructive half of the story: it produced one irreducible representation in each dimension. The present section completes it. We will show that these are, up to isomorphism, the only finite-dimensional irreducible complex representations of \(\mathfrak{sl}(2;\mathbb{C})\) — nothing else exists. The classification is exact and complete.

The computation is worth doing carefully for three reasons. First, \(\mathfrak{sl}(2;\mathbb{C})\) is the complexification of \(\mathfrak{su}(2)\), which is in turn isomorphic to \(\mathfrak{so}(3)\); the representations of \(\mathfrak{so}(3)\) are the mathematics of angular momentum in quantum mechanics, and the calculation we are about to perform is exactly the one found in physics texts under that heading. Second, the method is a template: it shows how the commutation relations alone, with no further input, force the entire structure of a representation. Third, the representation theory of larger semisimple Lie algebras is built by locating copies of \(\mathfrak{sl}(2;\mathbb{C})\) inside them and applying precisely this result.

The Working Basis

We use the basis of \(\mathfrak{sl}(2;\mathbb{C})\) introduced when we formed the complexification of \(\mathfrak{su}(2)\), \[ H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad E = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad F = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \] with the commutation relations \[ \begin{align*} [H, E] &= 2E, \\\\ [H, F] &= -2F, \\\\ [E, F] &= H. \end{align*} \] These three relations are the entire content of the Lie algebra; everything that follows is extracted from them. The element \(H\) is the one we will diagonalize, \(E\) will raise eigenvalues, and \(F\) will lower them.
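The three relations can be confirmed by direct matrix multiplication. The following is a minimal sketch in plain Python; the helper names `matmul`, `bracket`, and `scale` are illustrative choices for this example, not notation from the text.

```python
# Basis of sl(2;C) as 2x2 integer matrices (nested lists).
H = [[1, 0], [0, -1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def scale(c, A):
    """Scalar multiple c * A."""
    return [[c * x for x in row] for row in A]

# Each comparison checks one of the three defining relations.
print(bracket(H, E) == scale(2, E))
print(bracket(H, F) == scale(-2, F))
print(bracket(E, F) == H)
```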

A useful consequence is that representations of \(\mathfrak{sl}(2;\mathbb{C})\) are cheap to specify. If \(V\) is a finite-dimensional complex vector space and \(A, B, C\) are operators on \(V\) satisfying the same three relations \([A, B] = 2B\), \([A, C] = -2C\), and \([B, C] = A\), then by the bilinearity and skew-symmetry of the bracket the unique linear map \(\pi : \mathfrak{sl}(2;\mathbb{C}) \to \mathfrak{gl}(V)\) determined by \[ \pi(H) = A, \qquad \pi(E) = B, \qquad \pi(F) = C \] is automatically a representation: the bracket relations it must preserve are exactly the three we imposed on \(A, B, C\). Building a representation thus reduces to finding three operators with the right commutators.
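As a sketch of how little needs checking, the function below tests a candidate triple \(A, B, C\) against the three relations. The name `is_sl2_triple` and the 3×3 example are assumptions made for illustration: the example matrices are those of \(X \mapsto [H, X]\), \(X \mapsto [E, X]\), \(X \mapsto [F, X]\) written in the ordered basis \((E, H, F)\).

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def is_sl2_triple(A, B, C):
    """True when [A, B] = 2B, [A, C] = -2C and [B, C] = A."""
    two_B = [[2 * x for x in row] for row in B]
    minus_two_C = [[-2 * x for x in row] for row in C]
    return (bracket(A, B) == two_B
            and bracket(A, C) == minus_two_C
            and bracket(B, C) == A)

# Illustrative 3x3 triple (an assumption of this sketch, see the lead-in).
A = [[2, 0, 0], [0, 0, 0], [0, 0, -2]]
B = [[0, -2, 0], [0, 0, 1], [0, 0, 0]]
C = [[0, 0, 0], [-1, 0, 0], [0, 2, 0]]

print(is_sl2_triple(A, B, C))  # the triple as given
print(is_sl2_triple(A, C, B))  # the same matrices with B and C swapped
```

Swapping the roles of \(B\) and \(C\) reverses the sign of the first two relations, so the second call fails the test even though the matrices are the same.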

Theorem (Classification of Irreducible Representations of \(\mathfrak{sl}(2;\mathbb{C})\))

For each integer \(m \geq 0\), there is an irreducible complex representation of \(\mathfrak{sl}(2;\mathbb{C})\) of dimension \(m + 1\). Any two irreducible complex representations of \(\mathfrak{sl}(2;\mathbb{C})\) of the same dimension are isomorphic. If \(\pi\) is an irreducible complex representation of \(\mathfrak{sl}(2;\mathbb{C})\) of dimension \(m + 1\), then \(\pi\) is isomorphic to the polynomial representation \(\pi_m\).

The existence half is already in hand — the \(\pi_m\) supply one irreducible representation in each dimension. What remains is the uniqueness, and it is the substantial part: we must show that an arbitrary irreducible representation, presented with no structure beyond the bracket relations, is forced into the shape of some \(\pi_m\). The strategy is to diagonalize \(\pi(H)\) and watch how \(E\) and \(F\) move its eigenvalues. The single lemma of the next section is the engine.

Raising and Lowering

When we built the polynomial representations, we saw the operators \(\pi_m(E)\) and \(\pi_m(F)\) shift the monomial basis up and down a ladder of \(\pi_m(H)\)-eigenvalues in steps of two. That was a feature of the explicit model. The following lemma shows it is not special to the model at all: in any representation, \(E\) and \(F\) move \(\pi(H)\)-eigenvalues by exactly \(+2\) and \(-2\). The fact is forced by the commutation relations alone, and it is the whole engine of the classification.

Lemma (Raising and Lowering)

Let \(\pi\) be a representation of \(\mathfrak{sl}(2;\mathbb{C})\) on a complex vector space \(V\), and let \(u\) be an eigenvector of \(\pi(H)\) with eigenvalue \(\alpha \in \mathbb{C}\). Then \[ \pi(H)\,\pi(E)\,u = (\alpha + 2)\,\pi(E)\,u. \] Thus either \(\pi(E) u = 0\), or \(\pi(E) u\) is an eigenvector of \(\pi(H)\) with eigenvalue \(\alpha + 2\). Similarly, \[ \pi(H)\,\pi(F)\,u = (\alpha - 2)\,\pi(F)\,u, \] so either \(\pi(F) u = 0\), or \(\pi(F) u\) is an eigenvector of \(\pi(H)\) with eigenvalue \(\alpha - 2\).

Proof:

Since \(\pi\) is a representation, it carries brackets to commutators: \([\pi(H), \pi(E)] = \pi([H, E]) = 2\,\pi(E)\), using the relation \([H, E] = 2E\). Rearranging the commutator gives \(\pi(H)\pi(E) = \pi(E)\pi(H) + 2\pi(E)\). Applying both sides to \(u\) and using \(\pi(H) u = \alpha u\), \[ \begin{align*} \pi(H)\,\pi(E)\,u &= \pi(E)\,\pi(H)\,u + 2\,\pi(E)\,u \\\\ &= \pi(E)\,(\alpha u) + 2\,\pi(E)\,u \\\\ &= (\alpha + 2)\,\pi(E)\,u. \end{align*} \] If \(\pi(E) u \neq 0\) this says \(\pi(E) u\) is a \(\pi(H)\)-eigenvector with eigenvalue \(\alpha + 2\). The argument for \(\pi(F)\) is identical, starting from \([\pi(H), \pi(F)] = \pi([H, F]) = -2\,\pi(F)\), which gives \(\pi(H)\pi(F) = \pi(F)\pi(H) - 2\pi(F)\) and hence \(\pi(H)\pi(F)u = (\alpha - 2)\pi(F) u\).

The mechanism is purely commutational. Moving \(\pi(H)\) past \(\pi(E)\) costs an extra \(+2\pi(E)\), and that surcharge is what bumps the eigenvalue up by two; moving \(\pi(H)\) past \(\pi(F)\) costs \(-2\pi(F)\) and bumps it down. So from a single \(\pi(H)\)-eigenvector we can manufacture a whole string of them, with eigenvalues marching in steps of two, until an application of \(\pi(E)\) or \(\pi(F)\) finally returns zero. Controlling where those strings start and stop is exactly what pins down the representation, and that is the next section.
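The lemma can be watched in action on a small example. The sketch below assumes a particular 3-dimensional representation written in a basis of \(\pi(H)\)-eigenvectors (the matrices `H`, `E`, `F` are an illustrative choice), and the helper `shifted_eigenvalue` is a hypothetical name: it applies an operator to an eigenvector and asserts the eigenvalue shift the lemma predicts.

```python
def apply(M, v):
    """Matrix-vector product for a nested-list matrix and a list vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Assumed example: operators satisfying [H,E]=2E, [H,F]=-2F, [E,F]=H.
H = [[2, 0, 0], [0, 0, 0], [0, 0, -2]]
E = [[0, 2, 0], [0, 0, 2], [0, 0, 0]]
F = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]

def shifted_eigenvalue(op, u, alpha, shift):
    """Return op(u), checking H u = alpha u and H op(u) = (alpha + shift) op(u)."""
    w = apply(op, u)
    assert apply(H, u) == [alpha * x for x in u]
    assert apply(H, w) == [(alpha + shift) * x for x in w]
    return w

u = [0, 1, 0]                            # H-eigenvector with eigenvalue 0
up = shifted_eigenvalue(E, u, 0, +2)     # eigenvalue raised to 2
down = shifted_eigenvalue(F, u, 0, -2)   # eigenvalue lowered to -2
top = shifted_eigenvalue(E, up, 2, +2)   # zero vector: the raising string ends
print(up, down, top)
```

The final call illustrates the "either zero or an eigenvector" alternative: the identity \(\pi(H)\pi(E)u = (\alpha + 2)\pi(E)u\) still holds, but only trivially.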

Proof of the Classification

Let \(\pi\) be an irreducible complex representation of \(\mathfrak{sl}(2;\mathbb{C})\) acting on a finite-dimensional space \(V\). We prove it must take the explicit form of one of the \(\pi_m\). The argument has three movements: find a top of the ladder, descend it to discover that the eigenvalues are integers symmetric about zero, and check that the ladder spans everything.

Finding the Top of the Ladder

Since we work over \(\mathbb{C}\), the operator \(\pi(H)\) has at least one eigenvector \(u\), say with eigenvalue \(\alpha\). Applying the raising lemma repeatedly, \[ \pi(H)\,\pi(E)^k u = (\alpha + 2k)\,\pi(E)^k u, \] so each nonzero \(\pi(E)^k u\) is a \(\pi(H)\)-eigenvector with eigenvalue \(\alpha + 2k\). These eigenvalues \(\alpha, \alpha + 2, \alpha + 4, \dots\) are all distinct, but \(\pi(H)\) acts on the finite-dimensional space \(V\) and so has only finitely many eigenvalues; hence the vectors \(\pi(E)^k u\) cannot all be nonzero. Let \(N\) be the largest index with \(\pi(E)^N u \neq 0\); then \(\pi(E)^{N+1} u = 0\). Set \[ u_0 := \pi(E)^N u, \qquad \lambda := \alpha + 2N. \] By construction \(u_0\) is a nonzero \(\pi(H)\)-eigenvector at the top of its raising chain, \[ \pi(H)\,u_0 = \lambda\, u_0, \qquad \pi(E)\,u_0 = 0. \] This \(u_0\) is a highest weight vector: raising it gives nothing.
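The climb to the top of the ladder is a short loop. The sketch below assumes a 4-dimensional example in a basis of \(\pi(H)\)-eigenvectors (the matrices are illustrative) and a hypothetical helper `climb_to_top`, which applies the raising operator until the next application would give zero and reports \(u_0\), \(\lambda\), and \(N\).

```python
def apply(M, v):
    """Matrix-vector product for a nested-list matrix and a list vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Assumed example: H is diagonal with eigenvalues 3, 1, -1, -3 and E raises.
H = [[3, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -3]]
E = [[0, 3, 0, 0], [0, 0, 4, 0], [0, 0, 0, 3], [0, 0, 0, 0]]

def climb_to_top(u, alpha):
    """Given a nonzero H-eigenvector u with eigenvalue alpha, return (u0, lam, N)."""
    N = 0
    while True:
        nxt = apply(E, u)
        if not any(nxt):          # E u = 0: u is at the top of its chain
            return u, alpha, N
        u, alpha, N = nxt, alpha + 2, N + 1

# Start from the eigenvector with eigenvalue -1.
u0, lam, N = climb_to_top([0, 0, 1, 0], -1)
print(u0, lam, N)
print(apply(H, u0) == [lam * x for x in u0], apply(E, u0))
```

The loop terminates here for the same reason as in the text: the eigenvalue rises by two at each step and only finitely many eigenvalues are available.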

Descending the Ladder

Now lower repeatedly. Define \[ u_k := \pi(F)^k u_0 \qquad (k \geq 0). \] By the lowering half of the lemma, applied \(k\) times to \(u_0\), \[ \pi(H)\,u_k = (\lambda - 2k)\, u_k. \] The action of \(\pi(E)\) on this descending chain is governed by an identity we verify by induction: \[ \pi(E)\,u_k = k\,[\lambda - (k - 1)]\, u_{k-1} \qquad (k \geq 1). \tag{$\ast$} \] For \(k = 1\), use the bracket relation \([E, F] = H\) in the form \(\pi(E)\pi(F) = \pi(F)\pi(E) + \pi(H)\), applied to \(u_0\): \[ \begin{align*} \pi(E)\,u_1 = \pi(E)\pi(F)\,u_0 &= \pi(F)\,\pi(E)\,u_0 + \pi(H)\,u_0 \\\\ &= 0 + \lambda\, u_0 = 1 \cdot [\lambda - 0]\, u_0, \end{align*} \] using \(\pi(E) u_0 = 0\). This is \((\ast)\) at \(k = 1\). Assume \((\ast)\) holds at \(k\). Then, again from \(\pi(E)\pi(F) = \pi(F)\pi(E) + \pi(H)\) applied to \(u_k\), \[ \begin{align*} \pi(E)\,u_{k+1} &= \pi(E)\pi(F)\,u_k = \pi(F)\,\pi(E)\,u_k + \pi(H)\,u_k \\\\ &= \pi(F)\bigl(k[\lambda - (k-1)]\,u_{k-1}\bigr) + (\lambda - 2k)\,u_k \\\\ &= k[\lambda - (k-1)]\,u_k + (\lambda - 2k)\,u_k \\\\ &= \bigl(k\lambda - k(k-1) + \lambda - 2k\bigr)\,u_k \\\\ &= (k + 1)\,[\lambda - k]\,u_k, \end{align*} \] where the last line factors \(k\lambda + \lambda - k^2 + k - 2k = (k+1)\lambda - k(k+1) = (k+1)(\lambda - k)\). This is \((\ast)\) at \(k + 1\), completing the induction.
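The induction can be read as a recursion on the coefficients: writing \(\pi(E)\,u_k = c_k\,u_{k-1}\) with \(c_0 = 0\), the inductive step says \(c_{k+1} = c_k + (\lambda - 2k)\). The sketch below compares that recursion with the closed form \(k[\lambda - (k-1)]\) in exact rational arithmetic; the function names and the sample values of \(\lambda\) are illustrative assumptions.

```python
from fractions import Fraction

def coeff_by_recursion(lam, k):
    """c_k from c_0 = 0 and c_{j+1} = c_j + (lam - 2j), the induction step."""
    c = Fraction(0)
    for j in range(k):
        c += lam - 2 * j
    return c

def coeff_closed_form(lam, k):
    """The closed form k * (lam - (k - 1)) claimed in the identity."""
    return k * (lam - (k - 1))

# Sample values of lambda, deliberately including non-integers and negatives:
# the identity is algebraic and does not yet know that lambda is an integer.
samples = [Fraction(5), Fraction(-3), Fraction(7, 2), Fraction(0)]
agree = all(coeff_by_recursion(lam, k) == coeff_closed_form(lam, k)
            for lam in samples for k in range(0, 12))
print(agree)
```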

The Eigenvalue Is a Non-Negative Integer

The descending eigenvalues \(\lambda, \lambda - 2, \lambda - 4, \dots\) are again distinct, so the \(u_k\) cannot all be nonzero. Let \(m\) be the largest index with \(u_m \neq 0\); then \[ u_{m+1} = \pi(F)^{m+1} u_0 = 0. \] Apply \(\pi(E)\) to this zero vector and use \((\ast)\) at \(k = m + 1\): \[ 0 = \pi(E)\,u_{m+1} = (m + 1)\,[\lambda - m]\, u_m. \] Since \(u_m \neq 0\) and \(m + 1 \neq 0\), we are forced to conclude \(\lambda - m = 0\), that is, \[ \lambda = m, \] a non-negative integer. The highest weight could not have been anything else: the bracket relations admit no finite-dimensional representation whose highest-weight eigenvalue is non-integer or negative. This is the crux of the classification.
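The constraint can be seen numerically: the coefficient \(k[\lambda - (k-1)]\) in \((\ast)\) vanishes for some \(k \geq 1\) only when \(\lambda = k - 1\). The sketch below searches for the first such \(k\); the function name and the cutoff `kmax` are hypothetical, and a result of `None` only reports that no zero was found up to the cutoff.

```python
from fractions import Fraction

def first_vanishing_index(lam, kmax=50):
    """Smallest k in 1..kmax with k * (lam - (k - 1)) == 0, else None."""
    for k in range(1, kmax + 1):
        if k * (lam - (k - 1)) == 0:
            return k
    return None

# Non-negative integers terminate at k = lam + 1; the other samples never do
# within the search range.
for lam in [Fraction(3), Fraction(0), Fraction(1, 2), Fraction(-2)]:
    print(lam, first_vanishing_index(lam))
```

For a non-negative integer \(\lambda = m\) the first zero sits at \(k = m + 1\), matching the step at which the descending chain is allowed to stop.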

The Ladder Is Everything

With \(\lambda = m\), the vectors \(u_0, u_1, \dots, u_m\) are eigenvectors of \(\pi(H)\) with the distinct eigenvalues \(m, m - 2, \dots, -m\), and eigenvectors for distinct eigenvalues are linearly independent; so the \(u_k\) are independent and their span \(W = \langle u_0, \dots, u_m \rangle\) has dimension \(m + 1\). Collecting the three actions, \[ \begin{align*} \pi(H)\,u_k &= (m - 2k)\,u_k, \\\\ \pi(F)\,u_k &= \begin{cases} u_{k+1} & k < m, \\ 0 & k = m, \end{cases} \\\\ \pi(E)\,u_k &= \begin{cases} k\,[m - (k-1)]\,u_{k-1} & k > 0, \\ 0 & k = 0, \end{cases} \end{align*} \tag{$\ast\ast$} \] we see \(W\) is carried into itself by \(\pi(H)\), \(\pi(E)\), and \(\pi(F)\), hence by \(\pi(Z)\) for every \(Z \in \mathfrak{sl}(2;\mathbb{C})\). So \(W\) is a nonzero invariant subspace. Because \(\pi\) is irreducible, \(W = V\). In particular \(\dim V = m + 1\), and \(\pi\) is given on the basis \(u_0, \dots, u_m\) by the formulas \((\ast\ast)\).

The formulas \((\ast\ast)\) determine \(\pi\) completely once \(m\) is fixed, so any two irreducible complex representations of the same dimension \(m + 1\) have the same matrices in their respective bases \(u_0, \dots, u_m\); the linear map matching one basis to the other intertwines the two actions, and the representations are therefore isomorphic. Conversely, defining operators by \((\ast\ast)\) on an \((m+1)\)-dimensional space yields operators satisfying the three bracket relations, as a direct check on the formulas shows, and the same ladder argument shows the resulting representation is irreducible. The explicit polynomial representation \(\pi_m\) is irreducible of dimension \(m + 1\), so it too is described by \((\ast\ast)\) in a suitable basis; hence every irreducible complex representation of dimension \(m + 1\) is isomorphic to \(\pi_m\). This proves the classification.
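The direct check on the formulas \((\ast\ast)\) can be automated. The following sketch builds the three matrices on the basis \(u_0, \dots, u_m\) and tests the bracket relations for several values of \(m\); the function names `irreducible_rep` and `satisfies_relations` are illustrative, and column \(k\) of each matrix is taken to hold the image of \(u_k\).

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def irreducible_rep(m):
    """Matrices of H, E, F on u_0..u_m; column k is the image of u_k."""
    n = m + 1
    H = [[0] * n for _ in range(n)]
    E = [[0] * n for _ in range(n)]
    F = [[0] * n for _ in range(n)]
    for k in range(n):
        H[k][k] = m - 2 * k                   # H u_k = (m - 2k) u_k
        if k < m:
            F[k + 1][k] = 1                   # F u_k = u_{k+1}
        if k > 0:
            E[k - 1][k] = k * (m - (k - 1))   # E u_k = k[m-(k-1)] u_{k-1}
    return H, E, F

def satisfies_relations(H, E, F):
    """Check [H,E] = 2E, [H,F] = -2F and [E,F] = H."""
    return (bracket(H, E) == [[2 * x for x in row] for row in E]
            and bracket(H, F) == [[-2 * x for x in row] for row in F]
            and bracket(E, F) == H)

print(all(satisfies_relations(*irreducible_rep(m)) for m in range(6)))
```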

General Representations

The classification described the irreducible representations. For applications — locating copies of \(\mathfrak{sl}(2;\mathbb{C})\) inside larger algebras, and reading the weight structure of a representation that has not been decomposed — we need a few facts about finite-dimensional representations that need not be irreducible. They follow from the classification by decomposing into irreducibles, but several can be seen directly.

Theorem (Properties of General Representations)

Let \((\pi, V)\) be a finite-dimensional representation of \(\mathfrak{sl}(2;\mathbb{C})\), not necessarily irreducible. Then:

  1. Every eigenvalue of \(\pi(H)\) is an integer. If \(v\) is an eigenvector of \(\pi(H)\) with eigenvalue \(\lambda\) and \(\pi(E) v = 0\), then \(\lambda\) is a non-negative integer.
  2. The operators \(\pi(E)\) and \(\pi(F)\) are nilpotent.
  3. The operator \(S = e^{\pi(E)} e^{-\pi(F)} e^{\pi(E)}\) satisfies \(S\,\pi(H)\,S^{-1} = -\pi(H)\).
  4. If an integer \(k\) is an eigenvalue of \(\pi(H)\), then so is each of the numbers \(-|k|, -|k| + 2, \dots, |k| - 2, |k|\).

Proof of Point 1:

Let \(v\) be an eigenvector of \(\pi(H)\) with eigenvalue \(\lambda\). By the raising lemma, \(\pi(E)^j v\) is, when nonzero, a \(\pi(H)\)-eigenvector with eigenvalue \(\lambda + 2j\); since \(\pi(H)\) has finitely many eigenvalues, there is some \(N \geq 0\) with \(\pi(E)^N v \neq 0\) and \(\pi(E)^{N+1} v = 0\). Then \(\pi(E)^N v\) is a highest weight vector with eigenvalue \(\lambda + 2N\). The argument of the classification proof shows that the highest weight of any such chain is a non-negative integer \(m\), so \(\lambda + 2N = m\) and hence \(\lambda = m - 2N\) is an integer. If already \(\pi(E) v = 0\), then \(N = 0\) and \(\lambda = m\) is a non-negative integer.

Proof of Point 2:

Over \(\mathbb{C}\), the space \(V\) has a basis of generalized eigenvectors for \(\pi(H)\) — vectors \(v\) for which \([\pi(H) - \lambda I]^k v = 0\) for some \(\lambda\) and some positive integer \(k\). This is the standard decomposition of a complex operator into its generalized eigenspaces, a fact of linear algebra that we take as given here. The point is that raising respects it. Using the relation \([H, E] = 2E\) and induction on \(k\), \[ \bigl[\pi(H) - (\lambda + 2) I\bigr]^k \pi(E) = \pi(E)\,\bigl[\pi(H) - \lambda I\bigr]^k. \] Thus if \(v\) is a generalized eigenvector for \(\pi(H)\) with eigenvalue \(\lambda\), then \(\pi(E) v\) is either zero or a generalized eigenvector with eigenvalue \(\lambda + 2\). Applying \(\pi(E)\) repeatedly to a generalized eigenvector must eventually give zero, since \(\pi(H)\) has only finitely many generalized eigenvalues and each application raises the eigenvalue by two. Hence \(\pi(E)\) is nilpotent. The same argument, lowering by two, shows \(\pi(F)\) is nilpotent.
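Both the intertwining identity and the resulting nilpotency can be checked on a concrete case. The sketch below assumes a 4-dimensional example (the matrices are illustrative) and hypothetical helpers `intertwines` and `nilpotency_index`; the identity is tested for several values of \(\lambda\) and \(k\), since it holds as an operator identity for every \(\lambda\), not only for eigenvalues.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, p):
    """A raised to the non-negative integer power p."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(p):
        P = matmul(P, A)
    return P

def shifted(A, c):
    """The matrix A - c*I."""
    n = len(A)
    return [[A[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]

# Assumed example satisfying the three bracket relations.
H = [[3, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -3]]
E = [[0, 3, 0, 0], [0, 0, 4, 0], [0, 0, 0, 3], [0, 0, 0, 0]]
F = [[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]

def intertwines(lam, k):
    """Check (H - (lam+2) I)^k E == E (H - lam I)^k."""
    left = matmul(matpow(shifted(H, lam + 2), k), E)
    right = matmul(E, matpow(shifted(H, lam), k))
    return left == right

def nilpotency_index(A):
    """Smallest p >= 1 with A^p = 0, searching up to the matrix size."""
    P = A
    for p in range(1, len(A) + 1):
        if not any(any(row) for row in P):
            return p
        P = matmul(P, A)
    return None

print(all(intertwines(lam, k) for lam in range(-4, 5) for k in range(1, 4)))
print(nilpotency_index(E), nilpotency_index(F))
```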

Proof of Point 3:

Write \(S = e^{\pi(E)} e^{-\pi(F)} e^{\pi(E)}\). Conjugation by each exponential is computed by the exponential-of-\(\mathrm{ad}\) formula: for any operator \(X\) on \(V\) and any \(Z \in \mathfrak{sl}(2;\mathbb{C})\), \(e^{\pi(Z)} X e^{-\pi(Z)} = e^{\mathrm{ad}(\pi(Z))}(X)\). We apply this three times to \(\pi(H)\), tracking the result through the bracket relations \([E, H] = -2E\), \([F, H] = 2F\), \([E, F] = H\) (equivalently \(\mathrm{ad}(\pi(E))(\pi(H)) = -2\pi(E)\) and so on). First, \[ e^{\mathrm{ad}(\pi(E))}(\pi(H)) = \pi(H) + [\pi(E), \pi(H)] = \pi(H) - 2\pi(E), \] the series terminating because \([\pi(E), \pi(E)] = 0\) kills all higher brackets. Next, applying \(e^{-\mathrm{ad}(\pi(F))}\), \[ e^{-\mathrm{ad}(\pi(F))}\bigl(\pi(H) - 2\pi(E)\bigr) = -\pi(H) - 2\pi(E), \] where the brackets \([F, H] = 2F\) and \([F, E] = -H\) generate the surviving terms. Explicitly, \(e^{-\mathrm{ad}(\pi(F))}(\pi(H)) = \pi(H) - 2\pi(F)\) and \(e^{-\mathrm{ad}(\pi(F))}(\pi(E)) = \pi(E) + \pi(H) - \pi(F)\), both series terminating because \([\pi(F), \pi(F)] = 0\); subtracting twice the second from the first, the \(\pi(F)\) contributions, which appear at first order from \(\pi(H)\) and at second order from \(\pi(E)\), cancel exactly. Finally, since \(e^{\mathrm{ad}(\pi(E))}\) fixes \(\pi(E)\) and sends \(-\pi(H)\) to \(-\pi(H) + 2\pi(E)\), \[ e^{\mathrm{ad}(\pi(E))}\bigl(-\pi(H) - 2\pi(E)\bigr) = -\pi(H). \] Composing the three conjugations, \(S\,\pi(H)\,S^{-1} = -\pi(H)\), as claimed.
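Because \(\pi(E)\) and \(\pi(F)\) are nilpotent, their exponentials are finite sums and \(S\) can be computed exactly. The sketch below does this in rational arithmetic for the representations given by the ladder formulas of the classification; the helper names are illustrative, and the check is phrased as \(S\,\pi(H) = -\pi(H)\,S\), which is equivalent to \(S\,\pi(H)\,S^{-1} = -\pi(H)\) since \(S\) is invertible.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def irreducible_rep(m):
    """H, E, F on u_0..u_m: H u_k = (m-2k) u_k, F u_k = u_{k+1},
    E u_k = k[m-(k-1)] u_{k-1}; column k is the image of u_k."""
    n = m + 1
    H = [[0] * n for _ in range(n)]
    E = [[0] * n for _ in range(n)]
    F = [[0] * n for _ in range(n)]
    for k in range(n):
        H[k][k] = m - 2 * k
        if k < m:
            F[k + 1][k] = 1
        if k > 0:
            E[k - 1][k] = k * (m - (k - 1))
    return H, E, F

def exp_nilpotent(N):
    """Exact exponential of a nilpotent n x n matrix: sum of N^j / j! for j < n."""
    n = len(N)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for j in range(1, n):
        term = [[x / j for x in row] for row in matmul(term, N)]  # N^j / j!
        for a in range(n):
            for b in range(n):
                result[a][b] += term[a][b]
    return result

def reflection(m):
    """Return (S, H) with S = exp(E) exp(-F) exp(E) in the (m+1)-dim example."""
    H, E, F = irreducible_rep(m)
    minus_F = [[-x for x in row] for row in F]
    exp_E = exp_nilpotent(E)
    S = matmul(matmul(exp_E, exp_nilpotent(minus_F)), exp_E)
    return S, H

def reflects(m):
    """Check S H == -H S."""
    S, H = reflection(m)
    minus_H = [[-x for x in row] for row in H]
    return matmul(S, H) == matmul(minus_H, S)

print(all(reflects(m) for m in range(5)))
```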

Proof of Point 4:

First, the operator \(S\) of Point 3 reflects eigenvalues. If \(\pi(H) v = k v\), then \[ \pi(H)(S v) = S\,(S^{-1}\pi(H) S)\,v = S\,(-\pi(H))\,v = -k\,(S v), \] so \(-k\) is an eigenvalue whenever \(k\) is. In particular, whenever \(k\) is an eigenvalue so is \(|k| \geq 0\). It therefore suffices to run the chain argument from the non-negative eigenvalue \(|k|\). As in Point 1, raising from a \(\pi(H)\)-eigenvector of eigenvalue \(|k|\) produces a highest weight vector with eigenvalue \(m = |k| + 2N \geq |k|\) for some \(N \geq 0\); note \(m \equiv |k| \pmod 2\). The classification chain attached to it carries the \(\pi(H)\)-eigenvalues \(m, m - 2, \dots, -m\), which include every number from \(|k|\) down to \(-|k|\) in steps of two. Thus \(-|k|, -|k| + 2, \dots, |k| - 2, |k|\) are all eigenvalues of \(\pi(H)\), as claimed.
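Point 4 is a closure condition on the set of \(\pi(H)\)-eigenvalues, and it is easy to test on a list of weights. The sketch below uses two hypothetical helpers: `weights_of_irreducible` lists the eigenvalues \(m, m-2, \dots, -m\), and `is_ladder_closed` checks that every eigenvalue \(k\) brings the whole string from \(-|k|\) to \(|k|\) with it. The direct sum used as an example is an illustrative choice.

```python
def weights_of_irreducible(m):
    """Eigenvalues of H on the (m+1)-dimensional irreducible representation."""
    return [m - 2 * k for k in range(m + 1)]

def is_ladder_closed(weights):
    """True when each weight k comes with -|k|, -|k|+2, ..., |k|."""
    present = set(weights)
    return all(j in present
               for k in present
               for j in range(-abs(k), abs(k) + 1, 2))

# Weights of a direct sum of the 4- and 3-dimensional irreducibles.
direct_sum = weights_of_irreducible(3) + weights_of_irreducible(2)
print(sorted(direct_sum), is_ladder_closed(direct_sum))

# Sets of integers that cannot be the eigenvalues of H in any
# finite-dimensional representation.
print(is_ladder_closed([3, -3]), is_ladder_closed([3, 1, -1]))
```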

The Weight Lattice a Network Sees

Point 4 is the structural fact behind the feature types of a rotation-equivariant network. When \(\mathfrak{so}(3) \cong \mathfrak{su}(2)\) acts on a layer's activations, the activations decompose into irreducible pieces indexed by the highest weight \(m\), and within each piece the \(\pi(H)\)-eigenvalues, the weights, run symmetrically from \(m\) down to \(-m\) in steps of two. These are exactly the \(m + 1\) components of a feature of highest weight \(m\): a scalar for \(m = 0\), a two-component spinor for \(m = 1\), a vector for \(m = 2\) (weights \(2, 0, -2\)), and higher tensors beyond. In the labeling by \(\ell = m/2\) often used for such features, these are the type-\(\ell\) features of dimension \(2\ell + 1\), and only even \(m\) (integer \(\ell\)) arise from representations of the rotation group \(\mathrm{SO}(3)\) itself. The classification guarantees there is exactly one irreducible type in each dimension, and Point 4 guarantees its weights form this symmetric ladder; together they fix, with no freedom left, the channel structure that a rotation-equivariant architecture is built from.