From Eigenvalues to the Spectrum
In Eigenvalues & Eigenvectors,
we learned that every square matrix \(A \in \mathbb{R}^{n \times n}\) has eigenvalues —
scalars \(\lambda\) for which \(A\mathbf{v} = \lambda\mathbf{v}\) has a nonzero solution.
Equivalently, \(\lambda\) is an eigenvalue if and only if the operator \(A - \lambda I\)
is not invertible, i.e., \(\det(A - \lambda I) = 0\).
In finite dimensions, every linear operator has at least one eigenvalue (over \(\mathbb{C}\)),
and the Spectral Theorem for Symmetric Matrices
guarantees that symmetric matrices possess a complete orthonormal eigenbasis.
In infinite dimensions, the situation is far more subtle. The characteristic polynomial
\(\det(T - \lambda I) = 0\) is no longer available — determinants are not defined for
operators on infinite-dimensional spaces. We must replace this algebraic criterion with
a topological one: invertibility of the operator \(T - \lambda I\).
The Resolvent and Spectrum
Definition: Resolvent Set and Spectrum
Let \(\mathcal{X}\) be a Banach space and \(T \in \mathcal{B}(\mathcal{X})\) a bounded
linear operator. The resolvent set of \(T\) is
\[
\rho(T) = \bigl\{\lambda \in \mathbb{C} : (T - \lambda I) \text{ has a bounded inverse in } \mathcal{B}(\mathcal{X})\bigr\}.
\]
The spectrum of \(T\) is its complement:
\[
\sigma(T) = \mathbb{C} \setminus \rho(T).
\]
In finite dimensions, the spectrum equals the set of eigenvalues. In infinite dimensions,
the operator \(T - \lambda I\) can fail to be invertible in fundamentally different ways,
giving rise to three distinct parts of the spectrum.
Definition: Decomposition of the Spectrum
The spectrum \(\sigma(T)\) decomposes into three disjoint parts:
(a) Point Spectrum \(\sigma_p(T)\): The set of \(\lambda\) for which
\(T - \lambda I\) is not injective. These are the true eigenvalues —
there exists a nonzero \(x\) with \(Tx = \lambda x\).
(b) Continuous Spectrum \(\sigma_c(T)\): The set of \(\lambda\) for which
\(T - \lambda I\) is injective, has dense range, but is not surjective.
The "inverse" exists on a dense subspace but is unbounded.
(c) Residual Spectrum \(\sigma_r(T)\): The set of \(\lambda\) for which
\(T - \lambda I\) is injective but does not have dense range.
The continuous spectrum is the genuinely new phenomenon of infinite dimensions.
Let us see it in action.
Example: The Right Shift Operator on \(\ell^2\)
Define the right shift \(S : \ell^2 \to \ell^2\) by
\(S(x_1, x_2, x_3, \ldots) = (0, x_1, x_2, x_3, \ldots)\).
This operator has no eigenvalues: if \(S\mathbf{x} = \lambda\mathbf{x}\),
then \(0 = \lambda x_1\), \(x_1 = \lambda x_2\), \(x_2 = \lambda x_3\), etc.
For \(\lambda = 0\), the equation \(S\mathbf{x} = 0\) forces \(\mathbf{x} = 0\).
For \(\lambda \neq 0\), the first equation gives \(x_1 = 0\), and then
\(x_2 = x_1/\lambda = 0\), \(x_3 = 0\), and so on, so again \(\mathbf{x} = 0\).
So \(\sigma_p(S) = \emptyset\).
Yet the spectrum is nonempty: one can show that \(\sigma(S) = \{\lambda \in \mathbb{C} : |\lambda| \leq 1\}\),
the entire closed unit disk. The open disk \(\{|\lambda| < 1\}\) is residual spectrum
(there \(S - \lambda I\) is bounded below but its range is not dense), and the unit circle
\(\{|\lambda| = 1\}\) is continuous spectrum, a phenomenon with no finite-dimensional counterpart.
This example reveals why infinite-dimensional spectral theory requires new tools.
A general bounded operator on a Hilbert space may have spectrum that is a continuous
"smear" rather than a discrete set of points. To recover the clean, discrete
eigenvalue structure of finite-dimensional linear algebra, we need to impose
additional structure on our operators.
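To see how badly finite truncations can misrepresent an infinite-dimensional spectrum, here is a small NumPy sketch of the right shift (the truncation size \(N\) is an arbitrary choice):

```python
import numpy as np

# N x N truncation of the right shift: ones on the first subdiagonal.
N = 8
S = np.diag(np.ones(N - 1), k=-1)

# Every finite truncation is nilpotent (S^N = 0 exactly), so its spectrum
# is {0} -- the finite sections see none of the closed unit disk.
print(np.allclose(np.linalg.matrix_power(S, N), 0))   # True

# S is an isometry whose range misses e_1, so S - 0*I is injective but
# not invertible; the truncation exhibits the isometry up to its edge.
print(np.diag(S.T @ S))                 # [1. 1. 1. 1. 1. 1. 1. 0.]
```

The lesson: computing spectra by truncating to finite matrices can be wildly misleading for non-self-adjoint operators.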
Connection to the Graph Laplacian
In Graph Laplacians & Spectral Methods
and Spectral Clustering,
we exploited the eigenvalues and eigenvectors of the (finite) graph Laplacian \(\mathbf{L} = \mathbf{D} - \mathbf{W}\)
to partition data. The Spectral Theorem for symmetric matrices guaranteed that \(\mathbf{L}\) is
orthogonally diagonalizable. This chapter generalizes that machinery to
continuous operators on infinite-dimensional spaces — replacing the
discrete graph Laplacian with integral operators, and finite eigenvectors with
eigenfunctions.
The Adjoint & Self-Adjoint Operators
In finite-dimensional linear algebra, the transpose (or conjugate transpose) of a matrix
plays a central role — symmetric matrices (\(A = A^T\)) have real eigenvalues and
orthogonal eigenvectors. We now generalize this to Hilbert spaces using the inner product.
The Hilbert Space Adjoint
Theorem: Existence of the Adjoint
Let \(\mathcal{H}\) be a Hilbert space and \(T \in \mathcal{B}(\mathcal{H})\).
There exists a unique operator \(T^* \in \mathcal{B}(\mathcal{H})\),
called the adjoint of \(T\), satisfying
\[
\langle Tx, y \rangle = \langle x, T^* y \rangle \quad \text{for all } x, y \in \mathcal{H}.
\]
Moreover, \(\|T^*\| = \|T\|\).
The existence of \(T^*\) follows from the
Riesz Representation Theorem:
for each fixed \(y\), the map \(x \mapsto \langle Tx, y \rangle\) is a continuous
linear functional on \(\mathcal{H}\). By Riesz, there exists a unique vector — which we
call \(T^* y\) — such that \(\langle Tx, y \rangle = \langle x, T^* y \rangle\).
The map \(y \mapsto T^* y\) is linear, and \(\|T^* y\| \leq \|T\| \|y\|\) follows from
Cauchy-Schwarz, giving \(\|T^*\| \leq \|T\|\); applying the same argument to \(T^*\)
(using \(T^{**} = T\)) yields \(\|T^*\| = \|T\|\).
Example: Adjoints of Familiar Operators
Matrices: For \(A \in \mathbb{C}^{m \times n}\) acting on
\(\mathbb{C}^n\), the adjoint is the conjugate transpose: \(A^* = \overline{A}^T\).
Over \(\mathbb{R}\), this is simply \(A^T\).
Integral operators: If \(T : L^2([a,b]) \to L^2([a,b])\) is defined by
\[
(Tf)(x) = \int_a^b K(x, t)\, f(t)\, dt,
\]
then the adjoint is
\[
(T^* g)(t) = \int_a^b \overline{K(x, t)}\, g(x)\, dx.
\]
That is, the kernel is "transposed and conjugated": \(K^*(t, x) = \overline{K(x, t)}\).
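As a sanity check, the adjoint relation can be verified on a discretized integral operator; a minimal sketch, assuming a midpoint quadrature rule and an arbitrary real kernel \(K(x,t) = e^{-|x-t|}\):

```python
import numpy as np

# Discretize a real kernel on [0, 1]: the integral operator becomes a
# matrix-vector product with quadrature weight h, and the adjoint kernel
# K*(t, x) = K(x, t) becomes the matrix transpose.
n = 400
h = 1.0 / n
x = (np.arange(n) + 0.5) * h                      # midpoint grid
K = np.exp(-np.abs(x[:, None] - x[None, :]))      # sample kernel (assumption)

def apply_T(f):
    return h * (K @ f)                            # (Tf)(x) = int K(x,t) f(t) dt

def apply_T_adj(g):
    return h * (K.T @ g)                          # (T*g)(t) = int K(x,t) g(x) dx

f = np.sin(2 * np.pi * x)
g = np.cos(3 * np.pi * x)
lhs = h * np.dot(apply_T(f), g)                   # <Tf, g> in L^2([0,1])
rhs = h * np.dot(f, apply_T_adj(g))               # <f, T*g>
print(abs(lhs - rhs))                             # ~ machine precision
```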
Self-Adjoint Operators
Definition: Self-Adjoint (Hermitian) Operator
A bounded operator \(T \in \mathcal{B}(\mathcal{H})\) is self-adjoint
if \(T = T^*\), i.e.,
\[
\langle Tx, y \rangle = \langle x, Ty \rangle \quad \text{for all } x, y \in \mathcal{H}.
\]
Self-adjoint operators are the infinite-dimensional generalization of real symmetric
(or Hermitian) matrices. They inherit the crucial spectral properties:
Theorem: Spectral Properties of Self-Adjoint Operators
Let \(T = T^* \in \mathcal{B}(\mathcal{H})\). Then:
(a) Every eigenvalue of \(T\) is real.
(b) Eigenvectors corresponding to distinct eigenvalues are orthogonal.
(c) The entire spectrum \(\sigma(T) \subseteq \mathbb{R}\).
(d) \(\|T\| = \sup_{\|x\| = 1} |\langle Tx, x \rangle|\), and both
\(\inf_{\|x\|=1} \langle Tx, x \rangle\) and \(\sup_{\|x\|=1} \langle Tx, x \rangle\)
belong to \(\sigma(T)\).
Proof of (a) and (b):
(a) If \(Tx = \lambda x\) with \(x \neq 0\), then
\(\lambda \langle x, x \rangle = \langle Tx, x \rangle = \langle x, Tx \rangle
= \overline{\lambda} \langle x, x \rangle\). Since \(\langle x, x \rangle \neq 0\),
we get \(\lambda = \overline{\lambda}\), so \(\lambda \in \mathbb{R}\).
(b) If \(Tx = \lambda x\) and \(Ty = \mu y\) with \(\lambda \neq \mu\),
then \(\lambda \langle x, y \rangle = \langle Tx, y \rangle = \langle x, Ty \rangle = \mu \langle x, y \rangle\).
Since \(\lambda \neq \mu\), we conclude \(\langle x, y \rangle = 0\). \(\square\)
Property (d) is particularly important — it says the operator norm of a self-adjoint
operator is determined by its quadratic form \(\langle Tx, x \rangle\),
the infinite-dimensional analogue of the Rayleigh quotient from
quadratic forms.
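These properties are easy to verify numerically in the finite-dimensional case; a quick NumPy check on a random symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = (B + B.T) / 2                        # symmetric = self-adjoint on R^6

evals, evecs = np.linalg.eigh(A)         # (a) eigh returns real eigenvalues

# (b) eigenvectors for distinct eigenvalues are orthogonal; eigh returns
# a full orthonormal eigenbasis:
print(np.allclose(evecs.T @ evecs, np.eye(6)))        # True

# (d) operator norm = sup over the unit sphere of |<Ax, x>| = max |lambda_i|:
op_norm = np.linalg.norm(A, 2)
print(np.isclose(op_norm, np.max(np.abs(evals))))     # True
```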
Connection to Quantum Mechanics
In quantum mechanics, observables (position, momentum, energy) are modeled as
self-adjoint operators on a Hilbert space. The requirement that eigenvalues be real
(property (a)) ensures that measurement outcomes are real numbers.
The orthogonality of eigenvectors (property (b)) ensures that distinct measurement
outcomes correspond to mutually exclusive states — the mathematical basis of the
Bra-Ket formalism.
Compact Operators: "Almost Finite-Dimensional"
The right shift example in Section 1 showed that a general
bounded operator can have a continuous spectrum with no eigenvalues at all.
To recover the discrete, diagonalizable structure of finite-dimensional linear algebra,
we restrict our attention to a special class of operators that are, in a precise sense,
"close to finite-dimensional."
Definition: Compact Operator
A bounded linear operator \(T : \mathcal{X} \to \mathcal{Y}\) between normed spaces
is called compact if it maps every bounded set in \(\mathcal{X}\)
to a relatively compact (precompact) set in \(\mathcal{Y}\).
Equivalently, for every bounded sequence \(\{x_n\}\) in \(\mathcal{X}\),
the image sequence \(\{Tx_n\}\) has a convergent subsequence in \(\mathcal{Y}\).
Recall from Intro to Functional Analysis
that in infinite dimensions, bounded sequences need not have convergent subsequences —
the closed unit ball is not compact. A compact operator "repairs" this: it
compresses the infinite-dimensional unit ball into a set that is
(relatively) compact. In other words, it "squashes away" the infinite-dimensional
directions, leaving behind something that behaves like a finite-dimensional object.
Finite-Rank Operators
The simplest compact operators are those with finite-dimensional range.
Definition: Finite-Rank Operator
An operator \(T \in \mathcal{B}(\mathcal{X}, \mathcal{Y})\) is finite-rank
if \(\dim(\operatorname{Range}(T)) < \infty\). Every finite-rank operator is compact
(its image lies in a finite-dimensional subspace, where Heine-Borel applies).
The deep connection between compact and finite-rank operators is captured by the following
approximation property:
Theorem: Compact Operators as Limits of Finite-Rank Operators
Let \(\mathcal{H}\) be a Hilbert space. An operator \(T \in \mathcal{B}(\mathcal{H})\)
is compact if and only if there exists a sequence of finite-rank operators \(\{T_n\}\)
such that \(\|T - T_n\| \to 0\).
This is the functional-analytic analogue of the
truncated SVD:
a compact operator can be approximated arbitrarily well by operators with
finite-dimensional range, just as any matrix can be approximated by low-rank matrices
obtained by discarding small singular values.
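The finite-dimensional shadow of this theorem is exactly the truncated SVD; a short NumPy demonstration (the test matrix, built to have decaying singular values, is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
# A 50 x 50 matrix with rapidly decaying singular values.
A = rng.standard_normal((50, 50)) @ np.diag(0.5 ** np.arange(50)) \
    @ rng.standard_normal((50, 50))

U, s, Vt = np.linalg.svd(A)
errs = []
for k in (1, 3, 5):
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]     # rank-k truncation
    errs.append(np.linalg.norm(A - A_k, 2))
    print(k, errs[-1], s[k])                 # operator-norm error = sigma_{k+1}
```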
The Canonical Example: Integral Operators
Example: Hilbert-Schmidt Integral Operators
Let \(K : [a,b] \times [a,b] \to \mathbb{R}\) be a square-integrable kernel, meaning
\[
\int_a^b \int_a^b |K(x,t)|^2 \, dx\, dt < \infty.
\]
The integral operator defined by
\[
(Tf)(x) = \int_a^b K(x, t)\, f(t)\, dt
\]
is a compact operator on \(L^2([a,b])\).
Such operators are called Hilbert-Schmidt operators, and they form
a two-sided ideal in \(\mathcal{B}(L^2)\).
The Hilbert-Schmidt norm is
\[
\|T\|_{\text{HS}} = \left(\int_a^b \int_a^b |K(x,t)|^2 \, dx\, dt\right)^{1/2},
\]
and it satisfies \(\|T\| \leq \|T\|_{\text{HS}}\)
(the operator norm is bounded by the Hilbert-Schmidt norm).
These integral operators are the "continuous matrices" of functional analysis.
The kernel function \(K(x,t)\) plays the role of the matrix entries \(a_{ij}\),
and the integral replaces the finite sum. When the kernel is symmetric
(\(K(x,t) = K(t,x)\)), the integral operator is self-adjoint — the
continuous analogue of a symmetric matrix.
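A numerical illustration, assuming the kernel \(K(x,t) = \min(x,t)\) on \([0,1]\) (the Brownian-motion covariance, chosen because its exact eigenvalues \(\lambda_n = 1/((n-\tfrac{1}{2})^2\pi^2)\) are known) and a midpoint discretization:

```python
import numpy as np

n = 500
h = 1.0 / n
x = (np.arange(n) + 0.5) * h                 # midpoint grid on [0, 1]
K = np.minimum(x[:, None], x[None, :])       # Brownian covariance kernel
T = h * K                                    # quadrature approximation of T

evals = np.sort(np.linalg.eigvalsh(T))[::-1]
exact = 1.0 / (((np.arange(1, 6) - 0.5) * np.pi) ** 2)
print(evals[:5])                             # close to the exact values below
print(exact)

# ||T|| <= ||T||_HS, here with little room to spare:
hs_norm = np.sqrt(h * h * np.sum(K ** 2))    # discretized double integral
print(evals[0], "<=", hs_norm)
```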
Key Properties of Compact Operators
Theorem: Properties of Compact Operators
(a) The set of compact operators \(\mathcal{K}(\mathcal{X}, \mathcal{Y})\)
is a closed subspace of \(\mathcal{B}(\mathcal{X}, \mathcal{Y})\).
(b) If \(T\) is compact and \(S\) is bounded, then both \(ST\) and \(TS\)
are compact (compact operators form a two-sided ideal in \(\mathcal{B}(\mathcal{H})\)).
(c) If \(T\) is compact, then \(T^*\) is compact (Schauder's theorem).
(d) The identity operator \(I\) on an infinite-dimensional space is
never compact (since the closed unit ball is not compact).
Property (d) highlights the key distinction: compact operators are the "well-behaved"
operators that do not include the identity — they necessarily "lose dimensions"
in the sense that their image cannot contain an open set.
Connection to Machine Learning: Kernel Matrices as Finite Approximations
In Kernel PCA,
we compute the Gram matrix \(K_{ij} = k(x_i, x_j)\) for \(N\) data points
and extract its eigenvectors. This \(N \times N\) matrix is a
finite-rank approximation of the underlying integral operator
\((Tf)(x) = \int k(x, t) f(t)\, d\mu(t)\) on the function space.
As \(N \to \infty\), the eigenvalues of the suitably scaled Gram matrix
(\(\tfrac{1}{N}\mathbf{K}\)) converge to the eigenvalues of the underlying compact
integral operator, so the finite computation approximates a well-defined
infinite-dimensional object; this is why kernel methods work with finite data.
The Spectral Theorem for Compact Self-Adjoint Operators
We now arrive at the central result of this chapter — the infinite-dimensional
generalization of the
Spectral Theorem for Symmetric Matrices.
Theorem: Spectral Theorem (Compact Self-Adjoint Operators)
Let \(\mathcal{H}\) be a separable Hilbert space and
\(T = T^* \in \mathcal{B}(\mathcal{H})\) a compact, self-adjoint operator. Then:
(a) There exists a finite or countably infinite orthonormal sequence
of eigenvectors \(\{e_n\}_{n=1}^N\) (\(N \leq \infty\)) with corresponding
real eigenvalues \(\{\lambda_n\}\), such that
\[
Tx = \sum_{n=1}^{N} \lambda_n \langle x, e_n \rangle \, e_n \quad \text{for all } x \in \mathcal{H}.
\]
(b) If \(N = \infty\), then \(\lambda_n \to 0\) as \(n \to \infty\).
(c) Each nonzero eigenvalue has finite multiplicity
(its eigenspace is finite-dimensional).
(d) The only possible accumulation point of \(\{\lambda_n\}\) is \(0\).
(e) The eigenvectors \(\{e_n\}\) together with an orthonormal basis for
\(\ker(T)\) form a complete orthonormal basis for \(\mathcal{H}\).
The Spectral Theorem resurrects the full power of finite-dimensional diagonalization:
a compact self-adjoint operator is completely determined by its eigenvalues and eigenvectors,
just as a symmetric matrix is determined by \(A = Q\Lambda Q^T\).
The only new feature is that the eigenvalues must decay to zero — the
"price" of being compact in infinite dimensions.
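A concrete check of the expansion in part (a), using a discretized self-adjoint integral operator (the kernel \(\min(s,t)\) is an arbitrary choice):

```python
import numpy as np

n = 300
h = 1.0 / n
t = (np.arange(n) + 0.5) * h
K = np.minimum(t[:, None], t[None, :])       # symmetric PSD kernel
T = h * K                                    # discretized integral operator

evals, evecs = np.linalg.eigh(T)             # columns of evecs: the e_n

x = np.exp(t)                                # arbitrary test vector
coeffs = evecs.T @ x                         # <x, e_n>
reconstruction = evecs @ (evals * coeffs)    # sum_n lambda_n <x, e_n> e_n
print(np.allclose(T @ x, reconstruction))    # True
```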
Comparison: Finite vs. Infinite Dimensions
| Property | Symmetric Matrix \(A \in \mathbb{R}^{n \times n}\) | Compact Self-Adjoint \(T \in \mathcal{B}(\mathcal{H})\) |
| --- | --- | --- |
| Eigenvalues | Finitely many (\(n\)), all real | Countably many, all real, \(\lambda_n \to 0\) |
| Eigenvectors | Orthonormal basis for \(\mathbb{R}^n\) | Orthonormal basis for \(\overline{\operatorname{Range}(T)}\) |
| Decomposition | \(A = Q\Lambda Q^T = \sum_{i=1}^{n} \lambda_i \mathbf{v}_i \mathbf{v}_i^T\) | \(Tx = \sum_{n=1}^{\infty} \lambda_n \langle x, e_n \rangle \, e_n\) |
| Truncation | Low-rank approximation (truncated SVD) | Finite-rank approximation (\(\lVert T - T_k \rVert \leq \lvert \lambda_{k+1} \rvert\)) |
| Zero eigenvalue | Indicates rank deficiency | Always in \(\sigma(T)\) if \(\dim(\mathcal{H}) = \infty\) |
Proof Sketch
The proof proceeds by an iterative maximization argument that parallels the
Rayleigh quotient construction from finite-dimensional linear algebra.
Proof Sketch:
Step 1: Find the first eigenvector.
Since \(T\) is self-adjoint, \(\|T\| = \sup_{\|x\|=1} |\langle Tx, x \rangle|\).
By weak compactness
of the unit ball (the space is reflexive) and compactness of \(T\), there exists
\(e_1\) with \(\|e_1\| = 1\) achieving
\(|\langle Te_1, e_1 \rangle| = \|T\|\). Set \(\lambda_1 = \langle Te_1, e_1 \rangle\).
A variational argument shows \(Te_1 = \lambda_1 e_1\).
Step 2: Restrict and repeat.
The subspace \(\{e_1\}^\perp\) is invariant under \(T\) (because \(T\) is self-adjoint).
The restriction \(T|_{\{e_1\}^\perp}\) is still compact and self-adjoint,
so we repeat Step 1 to find \(e_2\) with \(|\lambda_2| = \|T|_{\{e_1\}^\perp}\| \leq |\lambda_1|\).
Step 3: Convergence to zero.
The sequence \(\{|\lambda_n|\}\) is nonincreasing and bounded below by 0, so it converges.
If it converged to some \(\epsilon > 0\), the orthonormal sequence \(\{e_n\}\) would satisfy
\(\|Te_m - Te_n\|^2 = \lambda_m^2 + \lambda_n^2 \geq 2\epsilon^2\), contradicting
the compactness of \(T\) (which requires \(\{Te_n\}\) to have a convergent subsequence).
Hence \(\lambda_n \to 0\). \(\square\)
Notice that Step 1 uses the Rayleigh quotient characterization, Step 2 uses the
orthogonal complement (Hilbert space structure), and Step 3 uses compactness
to force the eigenvalues to decay. Each ingredient is essential.
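The iterative argument is also an algorithm. A sketch under simplifying assumptions (a small symmetric matrix with well-separated eigenvalue magnitudes, and power iteration standing in for the variational step):

```python
import numpy as np

# Build a symmetric matrix with known eigenvalues 5, -3, 2, 1, 0.5, 0.1.
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
A = Q @ np.diag([5.0, -3.0, 2.0, 1.0, 0.5, 0.1]) @ Q.T

def top_eigenpair(M, iters=2000):
    """Step 1: find a unit vector maximizing |<Mx, x>| (power iteration)."""
    v = np.ones(M.shape[0]) / np.sqrt(M.shape[0])
    for _ in range(iters):
        w = M @ v
        v = w / np.linalg.norm(w)
    return v @ M @ v, v                  # Rayleigh quotient and eigenvector

lams = []
M = A.copy()
for _ in range(3):                       # Step 2: restrict and repeat
    lam, v = top_eigenpair(M)
    lams.append(lam)
    M = M - lam * np.outer(v, v)         # deflate: restrict to {v}^perp

print(np.round(lams, 8))                 # ~ [5, -3, 2], ordered by |lambda|
```

Deflation (subtracting \(\lambda v v^T\)) is the matrix version of restricting to the orthogonal complement in Step 2.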
The Singular Value Decomposition in Infinite Dimensions
Just as the finite-dimensional SVD
extends the Spectral Theorem from symmetric to general matrices,
there is an infinite-dimensional SVD for compact operators.
Theorem: Singular Value Decomposition (Compact Operators)
Let \(T \in \mathcal{B}(\mathcal{H})\) be a compact operator (not necessarily self-adjoint).
Then there exist orthonormal sequences \(\{e_n\}\) and \(\{f_n\}\) and a nonincreasing
sequence of nonnegative reals \(\sigma_1 \geq \sigma_2 \geq \cdots \geq 0\)
(the singular values) such that
\[
Tx = \sum_{n=1}^{\infty} \sigma_n \langle x, e_n \rangle \, f_n.
\]
The singular values are the eigenvalues of \((T^*T)^{1/2}\), and \(\sigma_n \to 0\).
This is the precise infinite-dimensional analogue of \(A = U\Sigma V^T\).
The \(\{e_n\}\) are the "right singular vectors," the \(\{f_n\}\) are the
"left singular vectors," and the \(\sigma_n\) decay to zero because \(T\) is compact.
Functional PCA & The Path to Kernels
The Spectral Theorem is not merely an abstract result — it is the mathematical engine
behind several of the most important algorithms in machine learning and statistics.
In this final section, we connect the theory to three concrete applications.
Application 1: The Karhunen-Loève Expansion (Functional PCA)
In PCA, we performed
eigendecomposition on the covariance matrix \(\Sigma \in \mathbb{R}^{n \times n}\)
of a finite-dimensional random vector. But what if our data consists of
random functions — e.g., time series, curves, or images viewed as functions
on a domain?
Let \(X(t)\) be a centered stochastic process on \([a,b]\) with covariance function
\(C(s, t) = \mathbb{E}[X(s) X(t)]\). The covariance operator
\(\mathcal{C} : L^2([a,b]) \to L^2([a,b])\) is defined by
\[
(\mathcal{C}f)(s) = \int_a^b C(s, t)\, f(t)\, dt.
\]
This is a Hilbert-Schmidt integral operator with a symmetric, positive semi-definite kernel,
so it is compact and self-adjoint. The Spectral Theorem applies directly.
Theorem: Karhunen-Loève Expansion
Let \(X(t)\) be a centered, square-integrable stochastic process on \([a,b]\) with
continuous covariance function \(C(s,t)\). Let \(\{\lambda_n, e_n\}\) be the eigenvalues
and eigenfunctions of the covariance operator \(\mathcal{C}\). Then
\[
X(t) = \sum_{n=1}^{\infty} Z_n \, e_n(t),
\]
where \(Z_n = \langle X, e_n \rangle = \int_a^b X(t) e_n(t)\, dt\) are uncorrelated
random variables with \(\mathbb{E}[Z_n] = 0\) and \(\text{Var}(Z_n) = \lambda_n\).
The convergence is in \(L^2\).
This is literally infinite-dimensional PCA. The eigenfunctions
\(\{e_n\}\) are the principal components (the directions of maximum variance),
the eigenvalues \(\lambda_n\) give the variance in each direction,
and the random coefficients \(Z_n\) are the "scores."
The decay \(\lambda_n \to 0\) guarantees that truncating after \(k\) terms
captures most of the variance — the continuous analogue of the scree plot.
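The Karhunen-Loève theorem can be verified for Brownian motion, whose covariance \(C(s,t) = \min(s,t)\) has known eigenpairs \(\lambda_n = 1/((n-\tfrac{1}{2})^2\pi^2)\), \(e_n(t) = \sqrt{2}\sin((n-\tfrac{1}{2})\pi t)\); a Monte Carlo sketch (grid size and path count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n_t, n_paths = 400, 8000
h = 1.0 / n_t
t = (np.arange(n_t) + 1) * h                 # right-endpoint grid on (0, 1]

# Brownian paths: cumulative sums of independent N(0, h) increments.
X = np.cumsum(rng.standard_normal((n_paths, n_t)) * np.sqrt(h), axis=1)

ratios = []
for k in (1, 2, 3):
    e_k = np.sqrt(2) * np.sin((k - 0.5) * np.pi * t)   # eigenfunction
    Z = h * (X @ e_k)                        # scores Z_k = <X, e_k>
    lam = 1.0 / (((k - 0.5) * np.pi) ** 2)   # eigenvalue lambda_k
    ratios.append(Z.var() / lam)
    print(k, round(Z.var(), 4), round(lam, 4))   # empirical ~ theoretical
```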
Application 2: Spectral Clustering via the Continuous Laplacian
In Spectral Clustering,
we used eigenvectors of the discrete graph Laplacian \(\mathbf{L} = \mathbf{D} - \mathbf{W}\)
to embed data into a low-dimensional space and then applied K-means. The continuous
analogue replaces the graph with a smooth domain and the discrete Laplacian
with the Laplace-Beltrami operator \(\Delta\) on a manifold.
The Laplace-Beltrami operator (with appropriate boundary conditions) is self-adjoint
and has compact resolvent, so its eigenfunctions \(\{e_n\}\) form an orthonormal
basis for \(L^2\) on the manifold, with eigenvalues \(0 = \lambda_0 \leq \lambda_1 \leq \cdots \to \infty\).
The first few eigenfunctions — the "low-frequency modes" — capture the large-scale
geometry of the manifold, precisely as the Fiedler vector captures the main partition
of a graph in the discrete case.
From Graphs to Manifolds: The Convergence
As the number of data points \(N \to \infty\) and the graph becomes denser,
the discrete graph Laplacian converges (in a suitable sense) to the
Laplace-Beltrami operator on the underlying data manifold.
The eigenvectors of the Gram matrix converge to the eigenfunctions of
the continuous operator. This is the rigorous justification for spectral clustering:
the discrete algorithm is a finite-sample approximation of a well-defined
continuous operation, and the Spectral Theorem guarantees that the continuous
object has the orthogonal eigenfunction structure we need.
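The simplest instance of this convergence can be computed directly: the Laplacian of a path graph is the standard second-difference matrix, and its scaled low eigenvalues approach those of the Neumann Laplacian \(-d^2/dx^2\) on \([0,1]\), namely \((k\pi)^2\). A sketch:

```python
import numpy as np

n = 500                                      # vertices = grid cells, h = 1/n
W = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)  # path graph
L = np.diag(W.sum(axis=1)) - W               # graph Laplacian D - W

# Scale by 1/h^2 = n^2: the low end of the spectrum converges to the
# Neumann eigenvalues (k pi)^2 of -d^2/dx^2 on [0, 1].
evals = np.sort(np.linalg.eigvalsh(L)) * n ** 2
exact = (np.arange(5) * np.pi) ** 2
print(np.round(evals[:5], 2))
print(np.round(exact, 2))                    # the two lists nearly coincide
```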
Application 3: The Bridge to Mercer's Theorem and RKHS
Consider a continuous, symmetric, positive semi-definite kernel
\(K : [a,b] \times [a,b] \to \mathbb{R}\). The integral operator
\[
(T_K f)(x) = \int_a^b K(x, t)\, f(t)\, dt
\]
is compact, self-adjoint, and positive (all eigenvalues \(\lambda_n \geq 0\)).
The Spectral Theorem yields the eigenfunction expansion
\[
K(x, t) = \sum_{n=1}^{\infty} \lambda_n \, e_n(x) \, e_n(t),
\]
where the series converges in \(L^2\); Mercer's Theorem strengthens this to
absolute and uniform convergence.
This eigenfunction expansion has a profound consequence: the kernel \(K\) defines
an implicit feature map
\[
\Phi(x) = \bigl(\sqrt{\lambda_1}\, e_1(x),\; \sqrt{\lambda_2}\, e_2(x),\; \ldots\bigr) \in \ell^2,
\]
such that \(K(x, t) = \langle \Phi(x), \Phi(t) \rangle_{\ell^2}\).
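A finite-sample version of this feature map can be built from the Gram matrix: eigendecompose \(\mathbf{K}\) and take \(\Phi = V\sqrt{\Lambda}\) row-wise, so that inner products of feature vectors reproduce kernel values exactly. A sketch assuming a Gaussian (RBF) kernel on random points:

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0, 1, size=30))
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)   # RBF kernel (assumption)

evals, evecs = np.linalg.eigh(K)
evals = np.clip(evals, 0, None)          # guard tiny negative round-off
Phi = evecs * np.sqrt(evals)             # row i is the feature vector Phi(x_i)

print(np.allclose(Phi @ Phi.T, K))       # True: kernel = inner products
```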
The kernel "trick" — computing inner products in a high-dimensional feature space
without ever forming \(\Phi\) explicitly — is mathematically grounded in this
spectral decomposition.
This is the content of Mercer's Theorem, which we will state and prove
rigorously in the next chapter
(RKHS & Kernel Methods).
The Spectral Theorem of the present chapter provides the engine;
Mercer's Theorem adds the fuel — positivity of the kernel — to build
the Reproducing Kernel Hilbert Space.
Summary: The Spectral Thread
Let us trace the spectral idea through our entire curriculum:
- Finite-dimensional
(Symmetry & SVD):
\(A = Q\Lambda Q^T\). Discrete eigenvalues, finite eigenvectors. Powers PCA, SVD.
- Discrete infinite
(Graph Laplacians):
\(\mathbf{L} = \mathbf{D} - \mathbf{W}\). Discrete eigenvalues of growing matrices.
Powers Spectral Clustering, Graph Fourier Transform.
- Continuous infinite (this chapter):
\(Tx = \sum \lambda_n \langle x, e_n \rangle e_n\). Countable eigenvalues decaying to 0.
Powers Functional PCA, continuous Laplacians.
- Next:
Mercer's Theorem converts the spectral decomposition of a positive kernel
into an explicit feature space — the mathematical foundation of
SVMs, Gaussian Processes, and Kernel PCA.