The Tensor Product as a Representation
We have classified the irreducible representations of \(\mathfrak{sl}(2;\mathbb{C})\):
for each non-negative integer \(m\) there is, up to isomorphism, exactly one of dimension
\(m + 1\), the irreducible representation \((\pi_m, V_m)\). We have also learned how to
combine two representations of a single group or Lie algebra into a representation on the
tensor-product space. The natural question now joins these
two threads: given two irreducibles \(V_m\) and \(V_n\), the tensor product \(V_m \otimes V_n\)
is again a representation of \(\mathfrak{sl}(2;\mathbb{C})\) — but it is almost never
irreducible. How does it break into irreducible pieces? The answer is the
Clebsch-Gordan decomposition, and it turns out to be governed entirely by a single
arithmetic fact: on a tensor product, the eigenvalues of the diagonal generator
add.
The Action and Its Diagonal Generator
We keep the working basis \(H, E, F\) of \(\mathfrak{sl}(2;\mathbb{C})\), with \(H\) the
generator we diagonalize, \(E\) raising \(H\)-eigenvalues by \(2\), and \(F\) lowering them
by \(2\). For a single
irreducible \((\pi_m, V_m)\) the operator \(\pi_m(H)\) is diagonalizable with eigenvalues
\[
m,\; m - 2,\; \dots,\; -m,
\]
each occurring once; we write \(u_m, u_{m-2}, \dots, u_{-m}\) for a corresponding basis of
eigenvectors, so that \(\pi_m(H)\,u_j = j\,u_j\). The eigenvalue \(j\) of a vector under
\(\pi_m(H)\) is its weight.
On the tensor product \(V_m \otimes V_n\), the Lie algebra acts by the tensor-product rule
for two representations of a single Lie algebra
\[
(\pi_m \otimes \pi_n)(X) = \pi_m(X) \otimes I + I \otimes \pi_n(X),
\qquad X \in \mathfrak{sl}(2;\mathbb{C}).
\]
The additive form on the right is not a convention but a consequence of differentiating the
group rule \((\Pi_m \otimes \Pi_n)(A) = \Pi_m(A) \otimes \Pi_n(A)\) at the identity: along a
curve \(A = e^{tX}\), the product rule applied at \(t = 0\) yields one term for each factor.
It is exactly the structure that makes weights add.
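As a sanity check that the additive rule really defines a representation, the following sketch builds it on \(V_1 \otimes V_1\) and tests the three bracket relations. It assumes the standard \(2 \times 2\) matrices for \(H, E, F\) and the relations \([H,E] = 2E\), \([H,F] = -2F\), \([E,F] = H\); the helper names (`kron`, `bracket`, `tensor_action`) are illustrative and not taken from any library.

```python
# Pure-Python check that X -> X (x) I + I (x) X respects the sl(2;C) brackets.
# Assumption: H, E, F below are the standard 2x2 matrices acting on V_1 = C^2.

def identity(d):
    return [[1 if r == c else 0 for c in range(d)] for r in range(d)]

def kron(a, b):
    """Kronecker product of square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def tensor_action(x):
    """(pi_1 (x) pi_1)(X) = X (x) I + I (x) X on C^2 (x) C^2."""
    return add(kron(x, identity(2)), kron(identity(2), x))

H = [[1, 0], [0, -1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]

H2, E2, F2 = tensor_action(H), tensor_action(E), tensor_action(F)

print(bracket(H2, E2) == scale(2, E2))    # tests [H, E] = 2E
print(bracket(H2, F2) == scale(-2, F2))   # tests [H, F] = -2F
print(bracket(E2, F2) == H2)              # tests [E, F] = H
```

The cross terms \(X \otimes I\) and \(I \otimes Y\) commute, which is why the brackets of the sum reduce to the brackets on each factor. A product rule \(X \otimes X\) would not pass the same test.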
Weights Add
Take eigenvectors \(u_j \in V_m\) and \(v_k \in V_n\), so that \(\pi_m(H)\,u_j = j\,u_j\)
and \(\pi_n(H)\,v_k = k\,v_k\). Applying the diagonal generator to the elementary tensor
\(u_j \otimes v_k\),
\[
\begin{align*}
(\pi_m \otimes \pi_n)(H)\,(u_j \otimes v_k)
&= \bigl(\pi_m(H) \otimes I + I \otimes \pi_n(H)\bigr)(u_j \otimes v_k) \\
&= (\pi_m(H)\,u_j) \otimes v_k + u_j \otimes (\pi_n(H)\,v_k) \\
&= j\,(u_j \otimes v_k) + k\,(u_j \otimes v_k) \\
&= (j + k)\,(u_j \otimes v_k).
\end{align*}
\]
So each elementary tensor \(u_j \otimes v_k\) is again an eigenvector of the diagonal
generator, now with weight \(j + k\). The \((m+1)(n+1)\) tensors \(u_j \otimes v_k\) form a
basis of \(V_m \otimes V_n\), and they diagonalize \((\pi_m \otimes \pi_n)(H)\) outright.
The entire decomposition problem reduces to bookkeeping: count how many basis tensors land
at each weight.
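The bookkeeping can be sketched in a few lines. The example below assumes only the weight lists \(m, m-2, \dots, -m\) and the additive rule; the function names are hypothetical, chosen for illustration.

```python
from collections import Counter

def weights(m):
    """Weights of V_m: m, m-2, ..., -m."""
    return list(range(m, -m - 1, -2))

def tensor_weight_multiplicities(m, n):
    """Count how many elementary tensors u_j (x) v_k land at each weight j + k."""
    return Counter(j + k for j in weights(m) for k in weights(n))

mult = tensor_weight_multiplicities(3, 1)
for w in sorted(mult, reverse=True):
    print(w, mult[w])
```

For \(V_3 \otimes V_1\) the count covers all \(4 \cdot 2 = 8\) basis tensors, with the extreme weights \(\pm 4\) occurring once and the interior weights twice.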
The same elementary tensors are not, however, eigenvectors of the raising and lowering
operators. By the raising-lowering relations, \((\pi_m \otimes \pi_n)(E)\) sends a
weight-\(w\) vector either to zero or to a vector of weight \(w + 2\), and
\((\pi_m \otimes \pi_n)(F)\) sends it either to zero or to a vector of weight \(w - 2\),
just as on a single irreducible. What changes is that the eigenspace at a given weight may
now be more than one-dimensional, and the raising operator may annihilate vectors other
than the single top vector. Recovering the irreducible pieces means finding, inside this
graded space, the vectors that are annihilated by \(E\) (the highest weight vectors) and
following the chains they generate. That is the content of the next two sections.
Why a Rotation-Equivariant Network Cares
In a rotation-equivariant network the activations of a layer carry an action of
\(\mathfrak{so}(3) \cong \mathfrak{su}(2)\), and they decompose into irreducible
"type-\(\ell\)" features: a type-\(0\) scalar, a type-\(1\) vector, and higher tensors,
the type-\(\ell\) feature being a copy of \(V_{2\ell}\), of dimension \(2\ell + 1\). When
two such features interact multiplicatively, the layer
forms their tensor product, and the tensor product of two irreducibles is what we are
about to decompose. The arithmetic just established — that weights add on a tensor
product — is the reason the combined object reorganizes into a predictable list of new
type-\(\ell\) features rather than into an unstructured mixture. The Clebsch-Gordan
decomposition is the rule a layer follows when it multiplies two equivariant features
together.
The Motivating Example: \(V_1 \otimes V_1\)
Before stating the general theorem, we work out the smallest non-trivial case by hand. It
is small enough to see every vector explicitly, yet it already exhibits the full mechanism:
locate the highest weight, descend with the lowering operator until the chain closes, and
read off the leftover. Let \(V_1 = \mathbb{C}^2\) be the
standard representation
of \(\mathfrak{sl}(2;\mathbb{C})\), for which \(\pi_1(X) = X\). With \(\{e_1, e_2\}\) the
standard basis of \(\mathbb{C}^2\) and \(H = \operatorname{diag}(1, -1)\), we have
\(\pi_1(H)\,e_1 = e_1\) and \(\pi_1(H)\,e_2 = -e_2\), so \(e_1\) and \(e_2\) carry weights
\(+1\) and \(-1\).
The Four Weights
The four elementary tensors \(e_k \otimes e_l\), \(1 \le k, l \le 2\), form a basis of
\(\mathbb{C}^2 \otimes \mathbb{C}^2\). By the additive rule of the previous section, each is
an eigenvector of \((\pi_1 \otimes \pi_1)(H)\) whose weight is the sum of the two factor
weights. The tensor \(e_1 \otimes e_1\) has weight \(+2\); the two tensors
\(e_1 \otimes e_2\) and \(e_2 \otimes e_1\) have weight \(0\); and \(e_2 \otimes e_2\) has
weight \(-2\). The weights run \(2, 0, 0, -2\): the value \(0\) occurs twice, the values
\(\pm 2\) once each. A single irreducible never repeats a weight, so this four-dimensional
space cannot be irreducible — it must split.
Descending from the Top
The largest weight is \(+2\), attained only by \(e_1 \otimes e_1\). Since there is no vector
of weight \(+4\) for it to map to, the raising operator must annihilate it:
\((\pi_1 \otimes \pi_1)(E)\,(e_1 \otimes e_1) = 0\). It is therefore a highest weight
vector, and the
chain it generates under the lowering operator
spans an irreducible subspace. We follow that chain. Writing
\(L := (\pi_1 \otimes \pi_1)(F) = \pi_1(F) \otimes I + I \otimes \pi_1(F)\) and using
\(\pi_1(F)\,e_1 = e_2\), \(\pi_1(F)\,e_2 = 0\),
\[
\begin{align*}
L\,(e_1 \otimes e_1)
&= (\pi_1(F)\,e_1) \otimes e_1 + e_1 \otimes (\pi_1(F)\,e_1)
= e_2 \otimes e_1 + e_1 \otimes e_2, \\
L\,(e_1 \otimes e_2 + e_2 \otimes e_1)
&= 2\,(e_2 \otimes e_2), \\
L\,(e_2 \otimes e_2) &= 0.
\end{align*}
\]
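The chain terminates at the third application of \(L\). Its three vectors,
\(e_1 \otimes e_1\), \(e_1 \otimes e_2 + e_2 \otimes e_1\), and \(e_2 \otimes e_2\), carry
weights \(2, 0, -2\) and span a three-dimensional invariant irreducible subspace, a copy of
\(V_2\).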
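The descent can be replayed numerically. The sketch below assumes the basis order \(e_1 \otimes e_1,\ e_1 \otimes e_2,\ e_2 \otimes e_1,\ e_2 \otimes e_2\) for coordinate vectors and the matrix of \(\pi_1(F)\) determined by \(e_1 \mapsto e_2\), \(e_2 \mapsto 0\); the helper names are illustrative.

```python
# Coordinates are taken in the basis e1(x)e1, e1(x)e2, e2(x)e1, e2(x)e2.

def kron(a, b):
    """Kronecker product of square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

I2 = [[1, 0], [0, 1]]
F = [[0, 0], [1, 0]]                     # pi_1(F): e1 -> e2, e2 -> 0
L = add(kron(F, I2), kron(I2, F))        # F (x) I + I (x) F

chain = [[1, 0, 0, 0]]                   # start at e1 (x) e1
while any(chain[-1]):                    # keep lowering until the zero vector appears
    chain.append(matvec(L, chain[-1]))

for vector in chain:
    print(vector)
```

The loop records \(e_1 \otimes e_1\), then \(e_1 \otimes e_2 + e_2 \otimes e_1\), then \(2\,(e_2 \otimes e_2)\), and stops once the zero vector is reached.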
The Leftover
One direction of the four-dimensional space has not been used: the antisymmetric
combination \(e_1 \otimes e_2 - e_2 \otimes e_1\), of weight \(0\). A direct calculation
with the same operators shows that \(\mathfrak{sl}(2;\mathbb{C})\) acts on it as zero,
\[
\begin{align*}
(\pi_1 \otimes \pi_1)(H)\,(e_1 \otimes e_2 - e_2 \otimes e_1) &= 0, \\
(\pi_1 \otimes \pi_1)(E)\,(e_1 \otimes e_2 - e_2 \otimes e_1) &= 0, \\
(\pi_1 \otimes \pi_1)(F)\,(e_1 \otimes e_2 - e_2 \otimes e_1) &= 0,
\end{align*}
\]
so it spans a one-dimensional invariant subspace on which the action is trivial: a copy of
\(V_0\). Together with the three-dimensional piece it accounts for all four dimensions, and
the two subspaces meet only in zero. We have shown
\[
V_1 \otimes V_1 \;\cong\; V_2 \oplus V_0,
\]
an isomorphism of \(\mathfrak{sl}(2;\mathbb{C})\) representations. The familiar splitting of
a rank-two tensor into its symmetric and antisymmetric parts is exactly this decomposition:
the three symmetric tensors are the \(V_2\), the single antisymmetric tensor is the \(V_0\).
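The "direct calculation" for the antisymmetric vector is short enough to check mechanically. This sketch assumes the standard matrices for \(H, E, F\) on \(\mathbb{C}^2\) and the basis order \(e_1 \otimes e_1,\ e_1 \otimes e_2,\ e_2 \otimes e_1,\ e_2 \otimes e_2\); the names are illustrative.

```python
def kron(a, b):
    """Kronecker product of square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

I2 = [[1, 0], [0, 1]]
generators = {
    "H": [[1, 0], [0, -1]],
    "E": [[0, 1], [0, 0]],
    "F": [[0, 0], [1, 0]],
}

singlet = [0, 1, -1, 0]                  # e1(x)e2 - e2(x)e1

# Apply X (x) I + I (x) X to the antisymmetric vector for each generator X.
images = {name: matvec(add(kron(x, I2), kron(I2, x)), singlet)
          for name, x in generators.items()}
for name, image in images.items():
    print(name, image)
```

All three images are the zero vector, which is the statement that the antisymmetric line carries the trivial representation.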
Every feature of the general case is already visible here. The weights add; the top weight
is unrepeated and seeds the largest irreducible; descending with the lowering operator
traces out that irreducible; what remains assembles into strictly smaller irreducibles, each
appearing once. The next section turns this procedure into a theorem.
The Clebsch-Gordan Theorem
The example generalizes without surprise. The top weight seeds the largest irreducible; one
peels it off, passes to an invariant complement, and repeats, each step lowering the top
weight by two, until nothing is left. We state the result and then carry out exactly this
induction.
Theorem (Clebsch-Gordan Decomposition)
Let \(m\) and \(n\) be non-negative integers with \(m \ge n\). As a representation of
\(\mathfrak{sl}(2;\mathbb{C})\),
\[
V_m \otimes V_n \;\cong\; V_{m+n} \oplus V_{m+n-2} \oplus \cdots \oplus V_{m-n}
\;=\; \bigoplus_{k=0}^{n} V_{\,m+n-2k}.
\]
Every irreducible appearing on the right occurs exactly once: the decomposition is
multiplicity-free.
The dimensions check at once: the right-hand side has dimension
\(\sum_{k=0}^{n} (m + n - 2k + 1) = (n + 1)(m + 1)\), matching
\(\dim(V_m \otimes V_n) = (m+1)(n+1)\). Multiplicity-freeness is special to
\(\mathfrak{sl}(2;\mathbb{C})\); for tensor products of representations of larger Lie
algebras, an irreducible can occur more than once.
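The dimension identity is easy to spot-check over a range of cases. The sketch below assumes only the list of summands stated in the theorem; `summand_dimensions` is a hypothetical helper name.

```python
def summand_dimensions(m, n):
    """Dimensions of V_{m+n}, V_{m+n-2}, ..., V_{|m-n|} as listed in the theorem."""
    hi, lo = max(m, n), min(m, n)
    return [hi + lo - 2 * k + 1 for k in range(lo + 1)]

print(summand_dimensions(4, 2))          # dimensions of V_6, V_4, V_2

# Compare the total against dim(V_m (x) V_n) = (m+1)(n+1) for small m >= n.
all_match = all(sum(summand_dimensions(m, n)) == (m + 1) * (n + 1)
                for m in range(8) for n in range(m + 1))
print(all_match)
```

A matching dimension count is necessary but not sufficient for the isomorphism, which is why the proof tracks weights rather than dimensions.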
Weights and Their Multiplicities
Choose weight bases \(u_m, u_{m-2}, \dots, u_{-m}\) of \(V_m\) and
\(v_n, v_{n-2}, \dots, v_{-n}\) of \(V_n\), where \(\pi_m(H)\,u_j = j\,u_j\) and
\(\pi_n(H)\,v_k = k\,v_k\). The products \(u_j \otimes v_k\) are a basis of
\(V_m \otimes V_n\), and by the additive rule each has weight \(j + k\). The weights range
from \(m + n\) down to \(-(m+n)\) in steps of \(2\), and we record how many basis tensors
realize each.
The top weight \(m + n\) is attained only by \(u_m \otimes v_n\): its multiplicity is one.
Lowering by \(2\) to weight \(m + n - 2\), the realizations are \(u_{m-2} \otimes v_n\) and
\(u_m \otimes v_{n-2}\) — multiplicity two — provided \(n > 0\). Each further step of \(-2\)
adds one new realization, because the second index \(k\) gains one more admissible value,
until \(k\) saturates at \(-n\). Concretely the multiplicity of weight \(w\) is the number
of pairs \((j, k)\) with \(j + k = w\), \(j \in \{m, \dots, -m\}\), \(k \in \{n, \dots,
-n\}\); since \(m \ge n\), this count climbs by one at each step from \(w = m+n\) until it
reaches \(n + 1\) at weight \(w = m - n\), holds steady at \(n + 1\) across the plateau
\(m - n \ge w \ge -(m-n)\), and then falls symmetrically by one per step down to \(-(m+n)\).
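The climb, plateau, and descent can be packaged in one expression. The closed form used below, \(\min\bigl(n, \tfrac{1}{2}(m + n - |w|)\bigr) + 1\) for \(m \ge n\), is a restatement of the profile just described rather than a formula quoted from the text, and the sketch compares it with a direct count of pairs \((j, k)\).

```python
def weights(m):
    """Weights of V_m: m, m-2, ..., -m."""
    return range(m, -m - 1, -2)

def multiplicity_by_count(m, n, w):
    """Number of pairs (j, k) with j + k = w."""
    return sum(1 for j in weights(m) for k in weights(n) if j + k == w)

def multiplicity_closed_form(m, n, w):
    """Assumed closed form: min(lo, (hi + lo - |w|) / 2) + 1 on admissible weights."""
    hi, lo = max(m, n), min(m, n)
    if abs(w) > hi + lo or (hi + lo - w) % 2:
        return 0                          # weight out of range or of the wrong parity
    return min(lo, (hi + lo - abs(w)) // 2) + 1

m, n = 5, 2
profile = [multiplicity_closed_form(m, n, w) for w in weights(m + n)]
print(profile)
```

For \(m = 5\), \(n = 2\) the plateau value \(n + 1 = 3\) is held on the four weights \(3, 1, -1, -3\), that is, on \(m - n \ge w \ge -(m - n)\).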
Peeling Off the Irreducibles
Proof:
First we record that \(V_m \otimes V_n\) is completely reducible. Restrict the action to
the real Lie algebra \(\mathfrak{su}(2)\), whose complexification is
\(\mathfrak{sl}(2;\mathbb{C})\). Because \(SU(2)\) is simply connected, the restricted
representation arises from a representation of the compact group \(SU(2)\), and every
finite-dimensional representation of a compact group is completely reducible. Complete
reducibility transfers back to \(\mathfrak{sl}(2;\mathbb{C})\) because a subspace invariant
under \(\mathfrak{su}(2)\) is, by the complex-linear extension
\(\pi(X + iY) = \pi(X) + i\,\pi(Y)\), automatically invariant under all of
\(\mathfrak{sl}(2;\mathbb{C})\), and conversely. We may therefore split off any invariant
subspace and find an invariant complement that is again completely reducible.
Now run the induction on the top weight. Consider the vector \(u_m \otimes v_n\), of weight
\(m + n\). It is annihilated by the raising operator \((\pi_m \otimes \pi_n)(E)\): there is
no weight \(m + n + 2\) for its image to occupy, so the image is forced to be zero. Thus
\(u_m \otimes v_n\) is a highest weight vector, and the
chain it generates under repeated lowering
spans an invariant irreducible subspace \(W\) whose weights are \(m + n, m + n - 2, \dots,
-(m+n)\), each once; that is, \(W \cong V_{m+n}\).
Take an invariant complement \(W'\) with \(V_m \otimes V_n = W \oplus W'\), itself
completely reducible. Because \(W\) contributes multiplicity exactly one to every weight
from \(m + n\) down to \(-(m+n)\), removing it lowers each multiplicity by one. In
particular weight \(m + n\) no longer appears in \(W'\), and (assuming \(n > 0\)) the
largest weight surviving in \(W'\) is \(m + n - 2\), now with multiplicity one. The same
argument applied inside \(W'\) produces a highest weight vector of weight \(m + n - 2\),
annihilated by \(E\), generating an irreducible subspace isomorphic to \(V_{m+n-2}\).
Iterate. At each stage we pass to the invariant complement of all irreducibles extracted so
far; this lowers every remaining multiplicity by one and so lowers the top surviving weight
by two. The process extracts \(V_{m+n}, V_{m+n-2}, \dots\) in turn. It halts when the
complement is exhausted, which by the multiplicity profile occurs precisely after
\(V_{m-n}\) is removed: at that point every weight has been accounted for the correct number
of times, and the remaining space is zero. Collecting the extracted irreducibles,
\[
V_m \otimes V_n \;\cong\; \bigoplus_{k=0}^{n} V_{\,m+n-2k},
\]
and since each top weight \(m + n, m + n - 2, \dots, m - n\) was extracted once, no
irreducible repeats. \(\blacksquare\)
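At the level of weight multiplicities, the induction is a short loop. The following sketch mirrors only the bookkeeping of the proof (it does not construct the invariant subspaces), and `peel_irreducibles` is a hypothetical name.

```python
from collections import Counter

def weights(m):
    """Weights of V_m: m, m-2, ..., -m."""
    return range(m, -m - 1, -2)

def peel_irreducibles(m, n):
    """Return the highest weights extracted from V_m (x) V_n, largest first."""
    remaining = Counter(j + k for j in weights(m) for k in weights(n))
    extracted = []
    while remaining:
        top = max(remaining)              # largest weight still present
        extracted.append(top)
        for w in weights(top):            # remove one copy of V_top's weight profile
            if remaining[w] == 0:
                raise ValueError("weight profile is not a sum of irreducibles")
            remaining[w] -= 1
            if remaining[w] == 0:
                del remaining[w]
    return extracted

print(peel_irreducibles(4, 2))
print(peel_irreducibles(1, 1))
```

The loop halts exactly when the multiset of weights is empty, which matches the stopping condition in the proof.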
Why the Count Stops at \(V_{m-n}\)
The smaller factor \(V_n\) sets the length of the decomposition. The number of summands is
\(n + 1\), one for each admissible value of the second weight \(k\), and the smallest
irreducible that appears is \(V_{m-n}\) — the difference of the two highest weights. The
larger factor \(V_m\) sets where the list begins, at \(V_{m+n}\), the sum of the two
highest weights. The whole decomposition is thus pinned by two numbers, the sum and the
difference of the highest weights, with everything in between filled in by steps of two.
Decomposition in Practice
The theorem is best absorbed through a concrete case larger than the motivating one. Take
\(m = 4\), \(n = 2\), so that \(V_4 \otimes V_2\) is \(15\)-dimensional. The theorem predicts
\[
V_4 \otimes V_2 \;\cong\; V_6 \oplus V_4 \oplus V_2,
\]
of dimensions \(7 + 5 + 3 = 15\). We verify it by the weight count alone, with no operator
algebra.
Reading the Decomposition Off the Weights
The first factor contributes weights \(4, 2, 0, -2, -4\); the second contributes
\(2, 0, -2\). Adding every pair gives the weight multiplicities of the tensor product, which
we tabulate.
| Weight \(w\) | Multiplicity | Realizing tensors \(u_j \otimes v_k\) |
| --- | --- | --- |
| \(6\) | \(1\) | \(u_4 \otimes v_2\) |
| \(4\) | \(2\) | \(u_2 \otimes v_2,\; u_4 \otimes v_0\) |
| \(2\) | \(3\) | \(u_0 \otimes v_2,\; u_2 \otimes v_0,\; u_4 \otimes v_{-2}\) |
| \(0\) | \(3\) | \(u_{-2} \otimes v_2,\; u_0 \otimes v_0,\; u_2 \otimes v_{-2}\) |
| \(-2\) | \(3\) | \(u_{-4} \otimes v_2,\; u_{-2} \otimes v_0,\; u_0 \otimes v_{-2}\) |
| \(-4\) | \(2\) | \(u_{-4} \otimes v_0,\; u_{-2} \otimes v_{-2}\) |
| \(-6\) | \(1\) | \(u_{-4} \otimes v_{-2}\) |
The multiplicities are \(1, 2, 3, 3, 3, 2, 1\). Subtracting the weight profile of \(V_6\)
(one copy of each weight \(6, 4, 2, 0, -2, -4, -6\)) leaves \(0, 1, 2, 2, 2, 1, 0\); the
largest surviving weight is \(4\), seeding a \(V_4\). Subtracting the profile of \(V_4\)
(weights \(4, 2, 0, -2, -4\)) leaves \(0, 0, 1, 1, 1, 0, 0\), the weight profile of \(V_2\).
Nothing remains after that, confirming \(V_4 \otimes V_2 \cong V_6 \oplus V_4 \oplus V_2\)
with each summand once. The peeling induction of the proof is exactly this subtraction of
weight profiles, carried out one irreducible at a time from the top.
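The successive subtractions for \(V_4 \otimes V_2\) can be reproduced as list arithmetic over the fixed weight grid \(6, 4, \dots, -6\). This is a sketch of the bookkeeping only; the variable and function names are illustrative.

```python
m, n = 4, 2
grid = list(range(m + n, -(m + n) - 1, -2))          # weights 6, 4, ..., -6

def irreducible_profile(top, grid):
    """Weight profile of V_top on the grid: 1 where the weight occurs, else 0."""
    own = set(range(top, -top - 1, -2))
    return [1 if w in own else 0 for w in grid]

# Multiplicities of V_m (x) V_n, obtained by adding every pair of factor weights.
tensor_profile = [sum(1 for j in range(m, -m - 1, -2)
                        for k in range(n, -n - 1, -2) if j + k == w)
                  for w in grid]

steps = [tensor_profile]
for top in (6, 4, 2):                                # summands predicted by the theorem
    removed = irreducible_profile(top, grid)
    steps.append([a - b for a, b in zip(steps[-1], removed)])

for profile in steps:
    print(profile)
```

The final profile is identically zero, and no intermediate entry goes negative, which is what certifies the decomposition at the level of weights.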
The Coupling Rule of an Equivariant Layer
A rotation-equivariant network stores its activations as type-\(\ell\) features, each a
copy of an irreducible of \(\mathfrak{so}(3) \cong \mathfrak{su}(2)\); under the
correspondence between \(SO(3)\) and \(\mathfrak{sl}(2;\mathbb{C})\) representations
the type-\(\ell\) feature is the \(V_m\) with \(m = 2\ell\), of dimension \(2\ell + 1\).
When a layer multiplies a type-\(\ell_1\) feature by a type-\(\ell_2\) feature, the product
lives in \(V_{2\ell_1} \otimes V_{2\ell_2}\), and the Clebsch-Gordan decomposition is the
rule that reorganizes it back into a sum of admissible type-\(\ell\) features, with \(\ell\)
ranging over \(|\ell_1 - \ell_2|, |\ell_1 - \ell_2| + 1, \dots, \ell_1 + \ell_2\).
Multiplicity-freeness means each output type occurs exactly once in the product, so the
coupling carries no ambiguity that the architecture would have to resolve by extra
bookkeeping. This is the algebraic content behind the tensor-product layers that combine
directional features in equivariant models; the decomposition fixes which output channels a
product of two input channels is allowed to populate.
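In code, the coupling rule amounts to a range of integers. The sketch below assumes the usual identification of a type-\(\ell\) feature with \(V_{2\ell}\) (dimension \(2\ell + 1\)) and is not tied to any particular equivariant-network library; `output_types` is a hypothetical name.

```python
def output_types(l1, l2):
    """Types l appearing in the product of a type-l1 and a type-l2 feature."""
    m, n = 2 * l1, 2 * l2                           # assumed: type l  <->  V_{2l}
    hi, lo = max(m, n), min(m, n)
    highest_weights = [hi + lo - 2 * k for k in range(lo + 1)]
    return sorted(w // 2 for w in highest_weights)

print(output_types(1, 1))    # vector times vector
print(output_types(2, 1))    # the V_4 (x) V_2 example in type-l language

# Dimension bookkeeping: the output types fill the whole product space.
l1, l2 = 2, 1
total = sum(2 * l + 1 for l in output_types(l1, l2))
print(total == (2 * l1 + 1) * (2 * l2 + 1))
```

For two type-\(1\) vectors the outputs are types \(0, 1, 2\), of dimensions \(1 + 3 + 5 = 9\): the scalar, vector, and traceless symmetric parts of a \(3 \times 3\) product.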
The construction reaches further than the case worked here. The complete reducibility that
closed the proof, together with Schur's lemma, governs how operators relate the irreducible
pieces, and pushing in
that direction leads to the selection rules and reduced matrix elements that organize how a
rotationally symmetric system responds to vector quantities. The decomposition of a tensor
product into irreducibles is the first and most basic of these structural rules.