Why Fourier Belongs in Hilbert Space
The site has touched Fourier analysis twice already, from two very different vantage points.
In Fourier Series, we
decomposed periodic functions into discrete frequencies and asserted Parseval's identity.
In Fourier Transform,
we generalized to functions on \(\mathbb{R}\) and stated Plancherel's theorem. In both pages,
several foundational results were stated, used, or sketched — some explicitly deferred to "the
forthcoming page on Fourier analysis in Hilbert spaces," and others (notably the classical
Parseval identity for Fourier series) proved directly but awaiting recognition as instances
of a more general Hilbert-space structure. This is that page.
The reason for the deferral was not stylistic. At the time those pages were written, the
site did not yet possess the abstract framework in which their central claims become
transparent corollaries rather than ad hoc calculations. That framework is the theory of
Hilbert spaces
with their
orthonormal bases,
developed in the functional-analysis block, together with the
Riesz-Fischer theorem
that places \(L^2\) inside it. With these tools in hand, the classical \(L^2\) Fourier results
cease to be a separate subject: they become specific instances of Hilbert-space machinery —
orthonormal-basis decomposition, Bessel's inequality, and bounded extension from a dense
subspace — recognized retrospectively in the sections that follow.
Convention. Throughout this page, all Hilbert spaces are taken over
\(\mathbb{C}\), with inner product linear in the first slot and conjugate-linear in the second.
We inherit the mathematical/PDE convention from the Fourier-series and Fourier-transform pages
(positive sign in the coefficient integral and in the forward transform). Explicitly, the
complex Fourier series on \([-L, L]\) takes the form
\(f(x) = \sum_n c_n\, e^{-i n \pi x / L}\) with coefficients
\(c_n = \tfrac{1}{2L}\int_{-L}^{L} f(x)\, e^{+i n \pi x / L}\, dx\), and the Fourier transform
on \(\mathbb{R}\) is \(\hat f(\xi) = \int_{\mathbb{R}} f(x)\, e^{i\xi x}\, dx\) from the
Fourier Transform
page; under this convention, the normalized map
\(\mathcal{U}: f \mapsto (2\pi)^{-1/2}\hat f\) is the operator we will identify as unitary on
\(L^2(\mathbb{R})\).
Fourier Series as an Orthonormal Basis
The Fourier-series page introduced the complex exponential system
\(\{e^{-in\pi x/L}\}_{n \in \mathbb{Z}}\) on \([-L, L]\) and showed by direct
computation that it is orthogonal with respect to the inner product
\(\langle f, g \rangle = \frac{1}{2L}\int_{-L}^{L} f(x)\overline{g(x)}\, dx\).
Normalizing, we obtain the orthonormal system
\[
e_n(x) \;:=\; \frac{1}{\sqrt{2L}}\, e^{-i n \pi x / L}, \qquad n \in \mathbb{Z},
\]
in the Hilbert space \(\mathcal{H} = L^2([-L, L])\) equipped with the unnormalized inner product
\(\langle f, g \rangle = \int_{-L}^{L} f(x)\overline{g(x)}\, dx\). The relationship between this
normalization and the normalized inner product used in the Fourier-series page is a uniform rescaling
by \(2L\), and either choice produces the same orthonormal system after rescaling the basis vectors;
we use the unnormalized inner product here to align with the functional-analysis block.
The system \(\{e_n\}_{n \in \mathbb{Z}}\) is orthonormal — this is the elementary computation
carried out in the Fourier-series page, transported through the rescaling. What is not elementary,
and what was left open at that stage, is whether this orthonormal system is in fact an
orthonormal basis
of \(L^2([-L, L])\): that is, whether the closed linear span of \(\{e_n\}\) is all of \(L^2([-L, L])\).
Once this is established, Parseval's identity for Fourier series becomes a special case of the
abstract
Parseval identity
in Hilbert spaces, and the convergence of partial sums in \(L^2\) — invoked but not proved
in the Fourier-series page — becomes the expansion statement of that same theorem.
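The orthonormality claim can be sanity-checked numerically. The sketch below is an illustration only: the half-length `L = 1.5`, the number of sample points `M`, and the helper names `e` and `inner` are assumptions made for this example. It approximates \(\langle e_m, e_n \rangle\) in the unnormalized inner product by a uniform Riemann sum, which for these exponentials is exact up to rounding as long as \(|m - n| < M\).

```python
import cmath
import math

L = 1.5   # half-length of the interval [-L, L]; arbitrary illustrative value
M = 64    # number of sample points in the Riemann sum (assumption)

def e(n, x):
    """Normalized exponential e_n(x) = (2L)^(-1/2) * exp(-i n pi x / L)."""
    return cmath.exp(-1j * n * math.pi * x / L) / math.sqrt(2 * L)

def inner(m, n):
    """Riemann-sum approximation of <e_m, e_n> = int_{-L}^{L} e_m(x) conj(e_n(x)) dx."""
    h = 2 * L / M
    return h * sum(e(m, -L + k * h) * e(n, -L + k * h).conjugate() for k in range(M))

# Gram matrix entries for -3 <= m, n <= 3, compared with the identity matrix.
gram = {(m, n): inner(m, n) for m in range(-3, 4) for n in range(-3, 4)}
max_defect = max(abs(v - (1.0 if m == n else 0.0)) for (m, n), v in gram.items())
print(max_defect)  # expected to be at the level of floating-point rounding
```

This confirms orthonormality only; it says nothing about completeness, which is the subject of the rest of the section.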
Strategy: from orthogonality to a basis
The decisive characterization of orthonormal bases in Hilbert spaces, equivalent to the
closed-span definition adopted in
Intro to Functional Analysis,
is the following: an orthonormal set \(\{e_n\}\) in \(\mathcal{H}\) is a basis if and only if
the only vector orthogonal to every \(e_n\) is the zero vector. That is, if
\(\langle h, e_n \rangle = 0\) for all \(n\) forces \(h = 0\), then the closed span equals \(\mathcal{H}\),
and conversely. (Each direction is immediate from the orthogonal-decomposition theorem:
if the closed span \(M\) is a proper subspace, then \(M^\perp\) contains a nonzero vector,
contradicting the implication; conversely, any vector orthogonal to all \(e_n\) lies in \(M^\perp\),
which is \(\{0\}\) when \(M = \mathcal{H}\).)
For the trigonometric system, this strategy translates into the following concrete claim: if
\(f \in L^2([-L, L])\) has every Fourier coefficient \(\int_{-L}^{L} f(x)\, e^{+in\pi x/L}\, dx = 0\)
for \(n \in \mathbb{Z}\), then \(f = 0\) almost everywhere. The technical core of the argument lies
in approximating an arbitrary continuous periodic function by trigonometric polynomials, and then
using density of continuous periodic functions in \(L^2\) to conclude.
Trigonometric approximation via the Fejér kernel
For the approximation step we use the classical Fejér kernel construction. For a continuous
\(2L\)-periodic function \(f\), define the partial sums and Fourier coefficients in the normalization of
the
Fourier-series page
(so \(c_n\) carries a factor of \(1/(2L)\); the rescaling that relates these \(c_n\) to the abstract
inner products \(\langle f, e_n \rangle\) will be reinstated at the end of this section):
\[
\begin{align*}
S_N f(x) &\;=\; \sum_{|n| \leq N} c_n\, e^{-i n \pi x / L}, \\\\
c_n &\;=\; \frac{1}{2L} \int_{-L}^{L} f(y)\, e^{+i n \pi y / L}\, dy,
\end{align*}
\]
and define the Cesàro means
\[
\sigma_N f(x) \;=\; \frac{S_0 f(x) + S_1 f(x) + \cdots + S_{N-1} f(x)}{N}.
\]
Substituting the integral expression for \(c_n\) and switching summation with integration
(a finite sum, so the interchange is unconditional) gives
\[
\sigma_N f(x) \;=\; \frac{1}{2L}\int_{-L}^{L} f(y)\, F_N\!\left(\frac{\pi(x-y)}{L}\right) dy,
\]
where the Fejér kernel \(F_N\) is obtained by tracking the same computation
through each partial sum: writing
\(S_n f(x) = \tfrac{1}{2\pi}\int_{-\pi}^{\pi} f(x - Lt/\pi)\, D_n(t)\, dt\) with the
Dirichlet kernel \(D_n(t) = \sum_{k=-n}^{n} e^{ikt}\) and averaging over
\(n = 0, \ldots, N-1\) gives \(F_N = \tfrac{1}{N}(D_0 + D_1 + \cdots + D_{N-1})\),
the Cesàro average of the Dirichlet kernels. This in turn admits the closed form
\[
F_N(t) \;=\; \frac{1}{N}\, \frac{\sin^2(N t / 2)}{\sin^2(t / 2)}, \qquad t \in (-\pi, \pi) \setminus \{0\},
\]
(with the natural limiting value \(F_N(0) = N\), obtained from
\(\lim_{t \to 0} \sin^2(Nt/2)/\sin^2(t/2) = N^2\); this makes \(F_N\) continuous at \(0\)
and the closed form valid on all of \([-\pi, \pi]\)).
To see how this closed form arises, recall the standard geometric-series evaluation
\(D_n(t) = \frac{\sin((n + \tfrac{1}{2}) t)}{\sin(t/2)}\), valid for \(t \not\equiv 0 \pmod{2\pi}\).
Substituting this into \(F_N = \tfrac{1}{N}(D_0 + D_1 + \cdots + D_{N-1})\),
\[
F_N(t) \;=\; \frac{1}{N\, \sin(t/2)} \sum_{n=0}^{N-1} \sin\!\bigl((n + \tfrac{1}{2})\, t\bigr).
\]
The inner sum is a sum of sines in arithmetic progression. Using the product-to-sum identity
\(2 \sin(\alpha) \sin(\beta) = \cos(\alpha - \beta) - \cos(\alpha + \beta)\) with
\(\alpha = (n + \tfrac{1}{2}) t\) and \(\beta = t/2\), each term becomes a telescoping difference
of cosines:
\[
2 \sin(t/2) \sin\!\bigl((n + \tfrac{1}{2}) t\bigr) \;=\; \cos(n t) - \cos\!\bigl((n + 1) t\bigr).
\]
Summing from \(n = 0\) to \(N - 1\) telescopes to \(1 - \cos(N t) = 2 \sin^2(N t / 2)\), and dividing
by \(2 \sin(t/2)\) gives \(\sum_{n=0}^{N-1} \sin((n + \tfrac{1}{2}) t) = \tfrac{\sin^2(N t / 2)}{\sin(t / 2)}\).
Substituting back into the expression for \(F_N\) yields the displayed closed form.
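As a check on the derivation, the closed form can be compared with the Cesàro average of Dirichlet kernels it was derived from. This is a minimal sketch; the function names, the choice \(N = 12\), and the sample points are assumptions made for illustration.

```python
import math

def dirichlet(n, t):
    """D_n(t) = sum_{k=-n}^{n} e^{ikt} = 1 + 2 * sum_{k=1}^{n} cos(k t), a real number."""
    return 1.0 + 2.0 * sum(math.cos(k * t) for k in range(1, n + 1))

def fejer_average(N, t):
    """F_N(t) as the average (D_0 + ... + D_{N-1}) / N."""
    return sum(dirichlet(n, t) for n in range(N)) / N

def fejer_closed(N, t):
    """F_N(t) from the closed form, with the limiting value N at t = 0."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:
        return float(N)
    return math.sin(N * t / 2) ** 2 / (N * s * s)

N = 12
ts = [-3.0 + 0.25 * j for j in range(25)]  # sample points in (-pi, pi), including t = 0
max_gap = max(abs(fejer_average(N, t) - fejer_closed(N, t)) for t in ts)
print(max_gap)  # expected to be at the level of floating-point rounding
```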
Three properties of \(F_N\) drive the approximation argument:
- Positivity: \(F_N(t) \geq 0\) for all \(t\), since for \(t \neq 0\) it is a ratio of squares of real numbers, and \(F_N(0) = N > 0\).
- Unit mass: \(\frac{1}{2\pi}\int_{-\pi}^{\pi} F_N(t)\, dt = 1\) for every \(N\),
inherited from the Dirichlet-kernel computation \(\frac{1}{2\pi}\int_{-\pi}^{\pi} D_n(t)\, dt = 1\)
(only the constant term \(e^{i0t} = 1\) in \(D_n = \sum_{k=-n}^n e^{ikt}\) survives integration);
averaging over \(n = 0, \ldots, N-1\) preserves this value.
- Concentration: for any \(\delta \in (0, \pi)\),
\(\int_{\delta \leq |t| \leq \pi} F_N(t)\, dt \to 0\) as \(N \to \infty\), because \(\sin^2(t/2) \geq c_\delta > 0\)
on \(\{\delta \leq |t| \leq \pi\}\), hence \(F_N(t) \leq 1/(N c_\delta)\) uniformly on that region.
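The three properties above can be observed numerically. The sketch below is illustrative: the midpoint-rule helper, the choice \(\delta = 0.5\), and the grid sizes are assumptions. It evaluates the normalized mass of \(F_N\) and the tail integral over \(\{\delta \leq |t| \leq \pi\}\), and compares the tail with the bound \(2(\pi - \delta)/(N \sin^2(\delta/2))\) that follows from the pointwise estimate in the concentration property.

```python
import math

def fejer(N, t):
    """Closed form of the Fejer kernel, with the limiting value N at t = 0."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:
        return float(N)
    return math.sin(N * t / 2) ** 2 / (N * s * s)

def midpoint(func, a, b, m):
    """Midpoint-rule approximation of the integral of func over [a, b] with m cells."""
    h = (b - a) / m
    return h * sum(func(a + (k + 0.5) * h) for k in range(m))

delta = 0.5
report = {}
for N in (10, 100, 1000):
    # Unit mass: (1 / 2 pi) * integral of F_N over [-pi, pi].
    mass = midpoint(lambda t: fejer(N, t), -math.pi, math.pi, 4000) / (2 * math.pi)
    # Tail mass over delta <= |t| <= pi, using that F_N is even.
    tail = 2 * midpoint(lambda t: fejer(N, t), delta, math.pi, 20000)
    # Bound from F_N(t) <= 1 / (N sin^2(delta / 2)) on the tail region.
    bound = 2 * (math.pi - delta) / (N * math.sin(delta / 2) ** 2)
    report[N] = (mass, tail, bound)
    print(N, mass, tail, bound)
```

The mass stays at 1 while the tail shrinks roughly like \(1/N\), which is the behavior the approximation argument relies on.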
These three properties characterize what is classically called a good kernel or
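approximate identity. From them follows the central approximation result:
Theorem: Fejér's Theorem
If \(f\) is a continuous \(2L\)-periodic function, then \(\sigma_N f \to f\) uniformly on
\(\mathbb{R}\) as \(N \to \infty\). In particular, every continuous \(2L\)-periodic function is a
uniform limit of trigonometric polynomials.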
Proof:
Fix \(\varepsilon > 0\). Because \(f\) is continuous and periodic, it is uniformly continuous on
\(\mathbb{R}\); choose \(\delta \in (0, \pi)\) such that \(|f(x - s) - f(x)| < \varepsilon\)
for all \(x\) whenever \(|s| < \delta L / \pi\). Since \(f\) is \(2L\)-periodic and \(F_N\) is
\(2\pi\)-periodic, the integrand in the convolution representation of \(\sigma_N f\) is
\(2\pi\)-periodic in the variable \(t = \pi(x-y)/L\), so the integration window may be taken
to be \([-\pi, \pi]\). Using the unit-mass property of \(F_N\) to write
\(f(x) = \tfrac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\, F_N(t)\, dt\) and subtracting this from the convolution representation,
\[
\sigma_N f(x) - f(x)
\;=\; \frac{1}{2\pi}\int_{-\pi}^{\pi} \bigl[ f(x - L t / \pi) - f(x) \bigr]\, F_N(t)\, dt.
\]
Split the integral into \(|t| < \delta\) and \(\delta \leq |t| \leq \pi\). On the first region,
\(|f(x - Lt/\pi) - f(x)| < \varepsilon\) for every \(x\) by uniform continuity, so positivity of
\(F_N\) and unit mass give the bound, uniform in \(x\),
\[
\left| \frac{1}{2\pi}\int_{|t| < \delta} \bigl[f(x - Lt/\pi) - f(x)\bigr] F_N(t)\, dt \right|
\;\leq\; \varepsilon.
\]
On the second region, \(|f(x - Lt/\pi) - f(x)| \leq 2 \sup_{\mathbb{R}} |f|\), where
\(\sup_{\mathbb{R}} |f| < \infty\) by continuity on a compact period. Concentration of \(F_N\) gives
\[
\begin{align*}
\left| \frac{1}{2\pi}\int_{\delta \leq |t| \leq \pi}
\bigl[f(x - Lt/\pi) - f(x)\bigr] F_N(t)\, dt \right|
&\;\leq\; \frac{2 \sup_{\mathbb{R}} |f|}{2\pi} \int_{\delta \leq |t| \leq \pi} F_N(t)\, dt \\\\
&\;\longrightarrow\; 0
\end{align*}
\]
as \(N \to \infty\), uniformly in \(x\). Combining the two estimates,
\(\sup_{\mathbb{R}} |\sigma_N f - f| \leq \varepsilon + \eta_N\), where \(\eta_N \to 0\) as
\(N \to \infty\). Since \(\varepsilon\) was arbitrary, \(\sigma_N f \to f\) uniformly. Each
\(\sigma_N f\) is a finite linear combination of the \(e^{-in\pi x/L}\), so it is a trigonometric
polynomial. \(\square\)
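To see the uniform convergence in a concrete case, take \(L = \pi\) and \(f(x) = |x|\) on \([-\pi, \pi]\), extended periodically. This example is an assumption chosen for illustration, not one taken from the earlier pages. A direct integration gives \(c_0 = \pi/2\) and \(c_n = ((-1)^n - 1)/(\pi n^2)\) for \(n \neq 0\), and averaging the partial sums gives \(\sigma_N f(x) = \sum_{|n| < N} (1 - |n|/N)\, c_n\, e^{-inx}\). The sketch below measures the sup-norm error on a grid.

```python
import math

def c(n):
    """Fourier coefficients of f(x) = |x| on [-pi, pi] (L = pi), computed by hand."""
    if n == 0:
        return math.pi / 2
    return ((-1) ** n - 1) / (math.pi * n * n)

def sigma(N, x):
    """Cesaro mean sigma_N f(x); since c_n = c_{-n} is real, the sum reduces to cosines."""
    return c(0) + 2 * sum((1 - n / N) * c(n) * math.cos(n * x) for n in range(1, N))

grid = [-math.pi + 2 * math.pi * j / 200 for j in range(201)]
sup_err = {N: max(abs(sigma(N, x) - abs(x)) for x in grid) for N in (10, 100, 1000)}
for N, err in sup_err.items():
    print(N, err)  # the grid sup error should shrink as N grows
```

The error is measured on a finite grid only, so it is a lower estimate of the true sup norm; the theorem is what guarantees convergence everywhere.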
Fejér's theorem supplies the uniqueness statement that underlies completeness: a continuous
periodic function is determined by its full set of Fourier coefficients. Indeed, if every coefficient
were zero, every Cesàro mean \(\sigma_N f\) would vanish identically, and the uniform convergence
\(\sigma_N f \to f\) would force \(f = 0\).
This observation, combined with the density of continuous periodic functions in \(L^2\), yields the
main theorem of this section.
The trigonometric system is a basis of \(L^2([-L, L])\)
Theorem: Completeness of the Trigonometric System
The orthonormal system \(\{e_n\}_{n \in \mathbb{Z}}\), with
\(e_n(x) = (2L)^{-1/2} e^{-in\pi x/L}\), is an orthonormal basis of \(L^2([-L, L])\).
Equivalently, the only function \(f \in L^2([-L, L])\) with
\(\int_{-L}^{L} f(x)\, e^{+in\pi x/L}\, dx = 0\) for every \(n \in \mathbb{Z}\) is the
zero function (almost everywhere).
Proof:
Two preliminaries. First, the Hilbert space \(L^2([-L, L])\) is separable:
the density argument below produces a countable dense subset (trigonometric polynomials with
rational-complex coefficients), so the abstract orthonormal-basis machinery applies. Second,
the abstract definition of orthonormal basis and the abstract Parseval identity were stated
for \(\mathbb{N}\)-indexed sequences \(\{e_n\}_{n=1}^\infty\), whereas our trigonometric system
is indexed by \(\mathbb{Z}\). Any bijection \(\phi: \mathbb{Z} \to \mathbb{N}\) transfers
\(\{e_n\}_{n \in \mathbb{Z}}\) to a sequence \(\{e_{\phi^{-1}(k)}\}_{k=1}^\infty\) of the
abstract form; since the relevant Parseval series \(\sum_k |\langle f, e_{\phi^{-1}(k)} \rangle|^2\)
consists of non-negative terms, it converges absolutely and its value is independent of \(\phi\),
so we may freely write the result in the original notation \(\sum_{n \in \mathbb{Z}}\).
By the characterization recalled above, it suffices to show: if \(f \in L^2([-L, L])\) satisfies
\(\langle f, e_n \rangle = 0\) for every \(n \in \mathbb{Z}\), then \(f = 0\) almost everywhere.
We extend \(f\) to a \(2L\)-periodic function on \(\mathbb{R}\) by periodicity; the orthogonality
condition states that all Fourier coefficients of the extension vanish.
Reduction to continuous test functions. The continuous \(2L\)-periodic functions
form a
dense
subspace of \(L^2([-L, L])\). We sketch this in three steps. (i) Simple functions —
finite linear combinations of indicator functions of measurable sets of finite measure — are dense
in \(L^2([-L, L])\): writing \(g \in L^2\) as a signed combination of four non-negative
measurable parts \(g = (\Re g)^+ - (\Re g)^- + i\,(\Im g)^+ - i\,(\Im g)^-\) (each part
dominated pointwise by \(|g|\)), each part admits a monotone increasing sequence of
non-negative simple functions converging to it pointwise (standard construction in the
theory of the Lebesgue integral); the resulting complex simple functions \(s_n\) satisfy
\(|s_n| \leq |g|\) pointwise. Since \(|s_n - g|^2 \to 0\) pointwise and is dominated by
\(4|g|^2 \in L^1\) (because \(g \in L^2\)), the
dominated convergence theorem
gives \(\|s_n - g\|_2 \to 0\). (ii) Indicators of measurable sets of finite measure are
approximated in \(L^2\) by indicators of finite unions of open intervals: by outer regularity
of Lebesgue measure, every such set \(E\) is contained in an open set \(U\) with
\(|U \setminus E| < \varepsilon^2 / 2\); decomposing \(U\) as a countable disjoint union of
open intervals \(\bigsqcup_k I_k\) and truncating to a finite union
\(V = I_1 \cup \cdots \cup I_M\) with \(\sum_{k > M} |I_k| < \varepsilon^2 / 2\) gives
\(|V \triangle E| < \varepsilon^2\), hence \(\|\mathbf{1}_V - \mathbf{1}_E\|_2 < \varepsilon\). (iii)
An indicator of an open interval \((a, b) \subset [-L, L]\) is approximated in \(L^2\) by continuous
functions vanishing outside \((a, b)\) and equal to \(1\) on
\([a + \varepsilon, b - \varepsilon]\) (a trapezoidal cutoff with \(0 < \varepsilon < (b - a)/2\), whose
\(L^2\) distance to the indicator is at most \(\sqrt{2\varepsilon}\)); since each such function vanishes
at \(\pm L\), it extends to a continuous \(2L\)-periodic function on \(\mathbb{R}\). Chaining the three
approximations, every \(g \in L^2([-L, L])\) is the \(L^2\)-limit of continuous \(2L\)-periodic
functions. It therefore suffices to show \(\langle f, g \rangle = 0\) for every continuous
\(2L\)-periodic \(g\) on \(\mathbb{R}\) — for then \(f\) is orthogonal to a dense subset of
\(L^2([-L, L])\), forcing \(f = 0\).
From trigonometric polynomials to continuous \(g\) via Fejér.
Fix a continuous \(2L\)-periodic \(g\). By Fejér's theorem above, there exist trigonometric
polynomials \(P_N(x) = \sum_{|n| \leq m_N} a_n^{(N)} e^{-in\pi x / L}\) such that
\(P_N \to g\) uniformly on \([-L, L]\). Uniform convergence on a bounded interval implies
\(L^2\) convergence, since
\[
\int_{-L}^{L} |P_N(x) - g(x)|^2\, dx \;\leq\; 2L\, \|P_N - g\|_\infty^2 \;\to\; 0.
\]
By the
Cauchy-Schwarz inequality,
\(|\langle f, P_N \rangle - \langle f, g \rangle| = |\langle f, P_N - g \rangle| \leq \|f\|_2\, \|P_N - g\|_2 \to 0\),
so \(\langle f, P_N \rangle \to \langle f, g \rangle\). But each \(P_N\) is a finite linear
combination of the \(e^{-in\pi x/L}\), and \(f\) is assumed orthogonal to every such exponential,
so \(\langle f, P_N \rangle = 0\) for every \(N\). Passing to the limit, \(\langle f, g \rangle = 0\).
Since \(g\) was an arbitrary continuous \(2L\)-periodic function, the density argument forces
\(f = 0\) in \(L^2([-L, L])\), which is the same as \(f = 0\) almost everywhere. \(\square\)
Retrospective recognition: classical Parseval as a Hilbert-space corollary
With the orthonormal-basis property established, the
Hilbert-space Parseval identity
applies directly to the trigonometric system. Specializing the abstract statement
\(\|h\|^2 = \sum_n |\langle h, e_n \rangle|^2\) to \(\mathcal{H} = L^2([-L, L])\) and
\(e_n(x) = (2L)^{-1/2} e^{-in\pi x/L}\), the inner product becomes
\[
\begin{align*}
\langle f, e_n \rangle
&\;=\; \int_{-L}^{L} f(x)\, \overline{e_n(x)}\, dx \\\\
&\;=\; \frac{1}{\sqrt{2L}} \int_{-L}^{L} f(x)\, e^{+i n \pi x / L}\, dx \\\\
&\;=\; \sqrt{2L}\, c_n,
\end{align*}
\]
where \(c_n\) are the complex Fourier coefficients in the normalization of the Fourier-series page.
Substituting, the abstract Parseval identity becomes
\[
\begin{align*}
\int_{-L}^{L} |f(x)|^2\, dx
&\;=\; \sum_{n \in \mathbb{Z}} \bigl| \sqrt{2L}\, c_n \bigr|^2 \\\\
&\;=\; 2L \sum_{n \in \mathbb{Z}} |c_n|^2,
\end{align*}
\]
which after dividing by \(2L\) is exactly
Parseval's identity
in its classical complex-exponential form. The expansion statement of the abstract Parseval theorem
similarly produces the \(L^2\)-convergence of partial sums
\(S_N f = \sum_{|n| \leq N} c_n\, e^{-in\pi x/L} \to f\) in \(L^2([-L, L])\) for every
\(f \in L^2([-L, L])\) — settling the mean-square convergence claim of the Fourier-series page,
whose proof was deferred to "general Hilbert-space theory applied to an orthonormal basis."
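A worked instance makes the identity concrete. Take \(L = \pi\) and \(f(x) = x\), an example chosen here for illustration. Integration by parts gives \(c_0 = 0\) and \(c_n = (-1)^n/(in)\) for \(n \neq 0\), so \(|c_n|^2 = 1/n^2\), while \(\tfrac{1}{2L}\int_{-L}^{L} |f|^2\, dx = \pi^2/3\). The sketch below compares partial sums of \(\sum |c_n|^2\) with that value.

```python
import math

L = math.pi
# (1 / 2L) * integral of x^2 over [-L, L] equals L^2 / 3.
lhs = (2 * L ** 3 / 3) / (2 * L)

def c_abs_sq(n):
    """|c_n|^2 for f(x) = x on [-pi, pi]: zero for n = 0 and 1 / n^2 otherwise."""
    return 0.0 if n == 0 else 1.0 / (n * n)

partial = {N: sum(c_abs_sq(n) for n in range(-N, N + 1)) for N in (10, 100, 10000)}
for N, value in partial.items():
    print(N, value, lhs - value)  # the gap is about 2 / N
```

Every partial sum stays below \(\pi^2/3\), which is Bessel's inequality; that the gap closes is exactly what completeness adds.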
What Changed
Before this section, the site possessed:
- The abstract Parseval identity for any orthonormal basis of any separable Hilbert space
(in functional analysis).
- A concrete Parseval-type formula for the trigonometric system on \([-L, L]\), stated
and used (in Fourier series).
What was missing was the link between them: the proof that the concrete trigonometric system
is, in fact, an orthonormal basis. The theorem of this section supplies that link, and the
retrospective recognition above is its first consequence. The earlier concrete formula is
now identified, not by analogy but by direct substitution, as a special case of the abstract
theorem — and the convergence in \(L^2\) of the Fourier-series partial sums is no longer an
independent assertion but a corollary of basis expansion in a Hilbert space.
The Riemann-Lebesgue Lemma
A recurring theme in classical Fourier analysis is that "high-frequency Fourier coefficients are small."
This intuition is made precise by the Riemann-Lebesgue lemma: the Fourier coefficients (for series)
or the Fourier transform (for functions on \(\mathbb{R}\)) of an integrable function decay to zero
at high frequency. The lemma was invoked, but not proved, in the
Fourier series page during the
outline of pointwise (Dirichlet-Jordan) convergence, with a forward-pointer to the present chapter
for the proof. With the Hilbert-space framework now in place, the lemma is an almost immediate
corollary of Bessel's inequality applied to the trigonometric system.
The \(L^2\) statement: a direct corollary of Bessel's inequality
The abstract
Bessel inequality
states that for any orthonormal sequence \(\{e_n\}\) in a Hilbert space and any vector \(x\),
the series \(\sum_{n} |\langle x, e_n \rangle|^2\) converges and is bounded above by \(\|x\|^2\).
Convergence of a non-negative series forces its terms to tend to zero. Specializing to the
Hilbert space \(L^2([-L, L])\) with the orthonormal basis
\(\{e_n(x) = (2L)^{-1/2} e^{-i n \pi x / L}\}_{n \in \mathbb{Z}}\) of the previous section, the
Fourier coefficients of any \(L^2\) function must vanish in the limit:
Theorem: Riemann-Lebesgue Lemma (\(L^2\) version)
If \(f \in L^2([-L, L])\), then its Fourier coefficients
\(c_n = \tfrac{1}{2L} \int_{-L}^{L} f(x)\, e^{+i n \pi x / L}\, dx\) satisfy
\(c_n \to 0\) as \(|n| \to \infty\).
Proof:
Let \(\langle \cdot, \cdot \rangle\) denote the unnormalized inner product
\(\langle f, g \rangle = \int_{-L}^{L} f(x) \overline{g(x)}\, dx\) on \(L^2([-L, L])\).
Computing the inner product of \(f\) against \(e_n(x) = (2L)^{-1/2} e^{-in\pi x/L}\),
\[
\begin{align*}
\langle f, e_n \rangle
&\;=\; \frac{1}{\sqrt{2L}} \int_{-L}^{L} f(x)\, e^{+i n \pi x / L}\, dx \\\\
&\;=\; \sqrt{2L}\, c_n,
\end{align*}
\]
which rearranges to \(|c_n|^2 = (2L)^{-1} |\langle f, e_n \rangle|^2\).
By Bessel's inequality applied to the orthonormal sequence \(\{e_n\}_{n \in \mathbb{Z}}\),
\(\sum_{n \in \mathbb{Z}} |\langle f, e_n \rangle|^2 \leq \|f\|_2^2 < \infty\). Dividing through
by \(2L\), the non-negative series \(\sum |c_n|^2 \leq \|f\|_2^2 / (2L)\) converges, so its
general term tends to zero: \(|c_n|^2 \to 0\), hence \(c_n \to 0\). \(\square\)
Extension to \(L^1\) by density
The classical Riemann-Lebesgue lemma is stated for \(L^1\) functions, which form the largest class
on which the Fourier transform integral converges absolutely. Crucially, \(L^1\) is neither contained
in nor contains \(L^2\) on \(\mathbb{R}\): for instance, \(f(x) = (1 + |x|)^{-1}\) is in \(L^2\) but not
\(L^1\), while \(f(x) = |x|^{-1/2}\mathbf{1}_{[0, 1]}(x)\) is in \(L^1\) but not \(L^2\). The bridge between
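the two is a class of simple test functions lying inside both spaces, such as the continuous compactly
supported functions \(C_c(\mathbb{R})\) or the step functions of bounded support, each of which is dense
in \(L^1(\mathbb{R})\). The strategy is to prove decay on step functions of bounded support by direct
computation and then extend to \(L^1\) by approximation.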
Theorem: Riemann-Lebesgue Lemma (\(L^1\) version)
If \(f \in L^1(\mathbb{R})\), then its Fourier transform
\(\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\, e^{i x \xi}\, dx\) satisfies
\[
\lim_{|\xi| \to \infty} \hat{f}(\xi) \;=\; 0.
\]
Moreover, \(\hat{f}\) is continuous on \(\mathbb{R}\), so \(\hat{f}\) is a continuous function vanishing at infinity.
Proof:
Step 1: continuity of \(\hat{f}\).
For any \(\xi, \xi' \in \mathbb{R}\),
\(|\hat{f}(\xi) - \hat{f}(\xi')| \leq \int |f(x)|\, |e^{i x \xi} - e^{i x \xi'}|\, dx\).
The integrand is bounded by \(2 |f(x)| \in L^1\) and tends pointwise to zero as \(\xi' \to \xi\)
(since \(e^{i x \xi'} \to e^{i x \xi}\) pointwise). By the
dominated convergence theorem,
\(\hat{f}(\xi') \to \hat{f}(\xi)\), so \(\hat{f}\) is continuous on \(\mathbb{R}\). Moreover the trivial bound
\(|\hat{f}(\xi)| \leq \|f\|_1\) shows that \(\hat{f}\) is bounded.
Step 2: decay on indicators of bounded intervals.
For the indicator function of a bounded interval, \(f = \mathbf{1}_{[a, b]}\), a direct computation gives
\[
\begin{align*}
\hat{f}(\xi)
&\;=\; \int_a^b e^{i x \xi}\, dx \\\\
&\;=\; \frac{e^{i b \xi} - e^{i a \xi}}{i \xi}, \qquad \xi \neq 0,
\end{align*}
\]
which is bounded in modulus by \(2 / |\xi|\) and therefore tends to zero as \(|\xi| \to \infty\).
By linearity, the same holds for every finite linear combination of indicators of bounded intervals —
that is, for every step function of bounded support.
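The closed form and the bound \(2/|\xi|\) in Step 2 can be checked against a direct quadrature of the defining integral. This is a sketch under stated assumptions: the interval \([a, b] = [-1, 2]\), the midpoint rule, and the helper names are chosen for illustration.

```python
import cmath

def hat_indicator(a, b, xi):
    """Closed form of int_a^b e^{i x xi} dx, valid for xi != 0."""
    return (cmath.exp(1j * b * xi) - cmath.exp(1j * a * xi)) / (1j * xi)

def hat_numeric(a, b, xi, m=20000):
    """Midpoint-rule approximation of the same integral with m cells."""
    h = (b - a) / m
    return h * sum(cmath.exp(1j * (a + (k + 0.5) * h) * xi) for k in range(m))

a, b = -1.0, 2.0
rows = []
for xi in (1.0, 10.0, 100.0, -1000.0):
    exact = hat_indicator(a, b, xi)
    # Record: frequency, modulus of the transform, the bound 2/|xi|, quadrature gap.
    rows.append((xi, abs(exact), 2 / abs(xi), abs(exact - hat_numeric(a, b, xi))))
for row in rows:
    print(row)
```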
Step 3a: step functions of bounded support are dense in \(L^1(\mathbb{R})\).
We argue in three stages, performing truncation first so that all subsequent approximations take
place inside a fixed bounded interval.
(1) Truncation to bounded support. For \(f \in L^1(\mathbb{R})\), set
\(f_M := f \cdot \mathbf{1}_{[-M, M]}\). The difference \(f - f_M\) vanishes on \([-M, M]\), hence tends to zero pointwise as \(M \to \infty\), and is
dominated by \(|f| \in L^1\), so the
dominated convergence theorem
gives \(\|f - f_M\|_1 \to 0\) as \(M \to \infty\). It therefore suffices to approximate any
\(L^1\) function of bounded support.
(2) Simple-function approximation inside \([-M, M]\).
Write the boundedly supported function as a complex combination of four non-negative measurable
parts. Each part admits a monotone increasing sequence of non-negative simple functions converging
to it pointwise (standard simple-function construction), and the
monotone convergence theorem
upgrades the pointwise convergence to convergence in \(L^1\). Since each part vanishes outside
\([-M, M]\), each approximating simple function may be chosen to vanish there as well, leaving a
simple function \(s = \sum_j c_j \mathbf{1}_{E_j}\) with each \(E_j \subseteq [-M, M]\).
(3) Step-function approximation of indicators inside \([-M, M]\).
For each \(E_j \subseteq [-M, M]\), outer regularity of Lebesgue measure produces an open set
\(U_j \subseteq \mathbb{R}\) with \(E_j \subseteq U_j\) and \(\mu(U_j \setminus E_j)\) as small as
desired. Intersecting with \([-M, M]\), the set \(U_j \cap [-M, M]\) is a countable disjoint union
of bounded open intervals; since their total length is at most \(2M\), the tail can be discarded
with arbitrarily small \(L^1\) cost, leaving a finite disjoint union of bounded open intervals
whose indicator approximates \(\mathbf{1}_{E_j}\) in \(L^1\). Chaining the three stages,
every \(f \in L^1(\mathbb{R})\) is the \(L^1\)-limit of step functions of bounded support.
Step 3b: \(\varepsilon\)-extension.
Given \(f \in L^1(\mathbb{R})\) and \(\varepsilon > 0\), choose a step function \(g\) of bounded
support with \(\|f - g\|_1 < \varepsilon / 2\). Then
\[
\begin{align*}
|\hat{f}(\xi)|
&\;\leq\; |\hat{f}(\xi) - \hat{g}(\xi)| + |\hat{g}(\xi)| \\\\
&\;\leq\; \|f - g\|_1 + |\hat{g}(\xi)| \\\\
&\;\leq\; \tfrac{\varepsilon}{2} + |\hat{g}(\xi)|,
\end{align*}
\]
where the bound on the difference uses
\(|\hat{f}(\xi) - \hat{g}(\xi)| \leq \int |f - g|\, dx = \|f - g\|_1\). By Step 2,
\(|\hat{g}(\xi)| < \varepsilon / 2\) for \(|\xi|\) sufficiently large, hence \(|\hat{f}(\xi)| < \varepsilon\)
for all sufficiently large \(|\xi|\). Since \(\varepsilon\) was arbitrary, \(\hat{f}(\xi) \to 0\). \(\square\)
What was Discharged
In the Fourier-series page, the proof outline of the Dirichlet-Jordan pointwise convergence theorem
used the Riemann-Lebesgue lemma to control a tail integral and explicitly deferred its proof to
"the forthcoming page on Fourier analysis in Hilbert spaces." Both the \(L^2\) form (used directly
in the tail estimate) and its \(L^1\) extension (the standard form of the lemma) are now established.
The proof of the \(L^2\) form reveals what kind of theorem it actually is: not a theorem about oscillatory
integrals but a theorem about orthonormal sequences. Once one views Fourier coefficients as inner
products against an orthonormal sequence, their decay is a one-line consequence of the convergence
of the Bessel series. The \(L^1\) form then follows from an explicit computation on step functions and density.
Plancherel: The Fourier Transform as a Unitary
The Fourier-transform page proved Plancherel's identity
\(\|f\|_2^2 = \tfrac{1}{2\pi}\|\hat{f}\|_2^2\) on \(L^1 \cap L^2(\mathbb{R})\), using Schwartz functions
and Fubini. Two structural questions were left open and, in fact, explicitly flagged for the present chapter:
- The integral defining the Fourier transform requires \(f \in L^1\) for absolute convergence —
but the Plancherel identity is naturally a statement about \(L^2\) norms.
How is the Fourier transform defined on a function \(f \in L^2(\mathbb{R})\) that is not
in \(L^1\), so that its integral definition fails?
- An isometry is a one-sided structure: it preserves norms but need not be surjective. The
Fourier transform on \(L^2\) is in fact surjective: every \(L^2\) function arises as the
Fourier transform of some \(L^2\) function. What is the cleanest way to see this, and
what is its structural meaning?
The unified answer to both questions, and the structural punchline of this chapter, is that the
normalized Fourier transform \(\mathcal{U} := \mathcal{F}/\sqrt{2\pi}\) extends to a
unitary operator on \(L^2(\mathbb{R})\) — meaning a bijective isometry whose inverse
coincides with its adjoint. Once this is established, the Fourier transform on \(L^2\) is no longer
tied to a convergent integral: it is an abstract Hilbert-space isomorphism that agrees with the integral on \(L^1 \cap L^2\).
Step 1: extending the Fourier transform to \(L^2(\mathbb{R})\)
The natural domain of the classical Fourier transform integral is \(L^1(\mathbb{R})\); the natural
target of an "energy-preserving" theory is \(L^2(\mathbb{R})\). The bridge is the intersection
\(L^1 \cap L^2(\mathbb{R})\), on which both the integral definition and Plancherel's identity make sense.
Crucially, \(L^1 \cap L^2\) is a
dense
subspace of \(L^2(\mathbb{R})\): it contains \(C_c(\mathbb{R})\), which is itself dense in
\(L^2(\mathbb{R})\) by the same chain of approximations used in the proof of trigonometric ONB
completeness above (simple functions in \(L^2\) via dominated convergence, indicators of measurable
sets via outer regularity, indicators of intervals via trapezoidal cutoffs). We use this density
to extend the Fourier transform from its initial domain \(L^1 \cap L^2\) to the entirety of
\(L^2(\mathbb{R})\).
On \(L^1 \cap L^2\), set \(\mathcal{U} f := \tfrac{1}{\sqrt{2\pi}}\hat{f}\). The classical
Plancherel identity,
proved on the Schwartz class and extended to \(L^1 \cap L^2\) by Schwartz density together with
norm continuity, gives \(\|f\|_2^2 = \tfrac{1}{2\pi}\|\hat{f}\|_2^2\) for every \(f \in L^1 \cap L^2\),
which rewrites as \(\|\mathcal{U} f\|_2 = \|f\|_2\). Thus \(\mathcal{U}\) is an isometry on
\(L^1 \cap L^2\) with values in \(L^2\). Density of \(L^1 \cap L^2\) in \(L^2(\mathbb{R})\) together
with completeness of \(L^2\) allows us to extend \(\mathcal{U}\) by continuity, giving a unique
bounded linear operator
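on all of \(L^2\):
Theorem: Extension to \(L^2(\mathbb{R})\)
There is a unique bounded linear operator \(\mathcal{U} : L^2(\mathbb{R}) \to L^2(\mathbb{R})\) with
\(\mathcal{U} f = \tfrac{1}{\sqrt{2\pi}}\hat{f}\) for every \(f \in L^1 \cap L^2(\mathbb{R})\); it satisfies
\(\|\mathcal{U} f\|_2 = \|f\|_2\) for every \(f \in L^2(\mathbb{R})\).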
Proof:
Given \(f \in L^2(\mathbb{R})\), choose any sequence \(\{f_n\} \subset L^1 \cap L^2\) with
\(f_n \to f\) in \(L^2\) (possible by density). The classical Plancherel identity on
\(L^1 \cap L^2\) gives \(\|\mathcal{U} f_n - \mathcal{U} f_m\|_2 = \|f_n - f_m\|_2\) for all
\(n, m\), since \(f_n - f_m \in L^1 \cap L^2\) and \(\mathcal{U}\) is linear on this subspace.
The Cauchy property of \(\{f_n\}\) in \(L^2\) therefore transfers to \(\{\mathcal{U} f_n\}\).
By
completeness of \(L^2\),
there exists \(\mathcal{U} f \in L^2(\mathbb{R})\) with \(\mathcal{U} f_n \to \mathcal{U} f\)
in \(L^2\).
Well-definedness. If \(\{f_n'\}\) is another sequence in \(L^1 \cap L^2\) with
\(f_n' \to f\), then \(\|f_n - f_n'\|_2 \to 0\), so by isometry on \(L^1 \cap L^2\),
\(\|\mathcal{U} f_n - \mathcal{U} f_n'\|_2 \to 0\), forcing the two limit candidates to agree.
Linearity, boundedness, isometry. Linearity passes to the limit termwise.
Boundedness follows from the isometric estimate \(\|\mathcal{U} f\|_2 = \lim \|\mathcal{U} f_n\|_2
= \lim \|f_n\|_2 = \|f\|_2\), proving \(\|\mathcal{U} f\|_2 = \|f\|_2\) for every
\(f \in L^2(\mathbb{R})\).
Compatibility with the integral definition. If \(f \in L^1 \cap L^2\), choosing
the constant sequence \(f_n = f\) gives \(\mathcal{U} f = \tfrac{1}{\sqrt{2\pi}}\hat{f}\)
immediately from the construction. Uniqueness: any bounded linear operator agreeing with
\(\mathcal{U}\) on the dense subspace \(L^1 \cap L^2\) must agree with the limit defined above
by continuity, so the extension is the unique one. \(\square\)
The operator \(\mathcal{U}\) constructed above is the Fourier-Plancherel transform
on \(L^2(\mathbb{R})\). On \(L^1 \cap L^2\) it coincides with the normalized integral; on the rest
of \(L^2\), where the integral may fail to converge absolutely, \(\mathcal{U}\) is defined via the
\(L^2\)-limit construction above. We will continue to write \(\hat{f}\) for \(\sqrt{2\pi}\, \mathcal{U} f\)
with this understanding, recovering the classical normalization of the
Fourier transform.
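A standard example of a function in \(L^2(\mathbb{R})\) but not in \(L^1(\mathbb{R})\) is \(f(x) = \sin(x)/x\); it is chosen here for illustration and does not appear in the earlier pages. Its truncations \(f_R = f \cdot \mathbf{1}_{[-R, R]}\) lie in \(L^1 \cap L^2\), and by the isometry property \(\|\mathcal{U} f_{2R} - \mathcal{U} f_R\|_2 = \|f_{2R} - f_R\|_2\). The sketch below, with an assumed midpoint-rule helper and grid density, shows the \(L^1\) norms of the truncations growing while the \(L^2\) gaps shrink, which is the Cauchy behavior the construction uses.

```python
import math

def sinc(x):
    """sin(x) / x with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def midpoint(func, a, b, m):
    """Midpoint-rule approximation of the integral of func over [a, b] with m cells."""
    h = (b - a) / m
    return h * sum(func(a + (k + 0.5) * h) for k in range(m))

rows = []
for R in (10.0, 100.0, 1000.0):
    m = int(20 * R)  # about 20 sample points per unit length (assumption)
    # ||f_R||_1 = 2 * integral of |sin x / x| over [0, R]; grows like log R.
    l1_norm = 2 * midpoint(lambda x: abs(sinc(x)), 0.0, R, m)
    # ||f_{2R} - f_R||_2 = sqrt(2 * integral of (sin x / x)^2 over [R, 2R]) <= R^(-1/2).
    l2_gap = math.sqrt(2 * midpoint(lambda x: sinc(x) ** 2, R, 2 * R, m))
    rows.append((R, l1_norm, l2_gap))
for row in rows:
    print(row)
```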
Step 2: surjectivity via Fourier inversion
Isometry alone does not yield a bijection on an infinite-dimensional Hilbert space — for instance,
on the sequence space \(\ell^2\) of square-summable sequences, the right-shift operator
\((x_1, x_2, \ldots) \mapsto (0, x_1, x_2, \ldots)\) is an isometry but not surjective. To upgrade
\(\mathcal{U}\) from an isometry to a unitary, we must show that its image is all of
\(L^2(\mathbb{R})\). This is exactly where the classical
Fourier inversion formula
enters.
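The right-shift example mentioned above can be made concrete on finitely supported sequences. This is a toy sketch with assumed helper names: the shift preserves the \(\ell^2\) norm, yet every output has first coordinate \(0\), so a sequence such as \((1, 0, 0, \ldots)\) is never attained.

```python
import math

def right_shift(x):
    """Right shift on finitely supported sequences: (x1, x2, ...) -> (0, x1, x2, ...)."""
    return [0.0] + list(x)

def norm(x):
    """l^2 norm of a finitely supported sequence."""
    return math.sqrt(sum(abs(v) ** 2 for v in x))

x = [3.0, -4.0, 12.0]
y = right_shift(x)
print(norm(x), norm(y))  # the two norms agree
print(y[0])              # always 0.0, so the range misses (1, 0, 0, ...)
```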
Define the "inverse" candidate operator \(\mathcal{V}\) on \(L^1 \cap L^2\) by
\[
(\mathcal{V} g)(x) \;=\; \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(\xi)\, e^{-i x \xi}\, d\xi,
\]
differing from \(\mathcal{U}\) only by the sign in the exponent. Since
\(\mathcal{V} g = \overline{\mathcal{U}\, \overline{g}}\), the operator \(\mathcal{V}\) is isometric on
\(L^1 \cap L^2\) because \(\mathcal{U}\) is, and the density argument of Step 1 (with \(e^{i x \xi}\)
replaced by \(e^{-i x \xi}\) throughout) extends \(\mathcal{V}\) to
a bounded linear isometry \(\mathcal{V} : L^2(\mathbb{R}) \to L^2(\mathbb{R})\). For a Schwartz
function \(f \in \mathcal{S}(\mathbb{R})\), composing the operators directly,
\[
\begin{align*}
(\mathcal{V} \mathcal{U} f)(x)
&\;=\; \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (\mathcal{U} f)(\xi)\, e^{-i x \xi}\, d\xi \\\\
&\;=\; \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(\xi)\, e^{-i x \xi}\, d\xi \\\\
&\;=\; f(x),
\end{align*}
\]
where the second equality uses \(\mathcal{U} f = \hat{f}/\sqrt{2\pi}\) and the third is the classical
Fourier inversion formula
applied to \(f\). The reverse identity \(\mathcal{U} \mathcal{V} g = g\) on Schwartz \(g\) follows by
relating \(\mathcal{V}\) to \(\mathcal{U}\) and a reflection. Substituting \(\eta = -\xi\) in the
definition of \(\mathcal{V}\),
\[
(\mathcal{V} g)(x) \;=\; \frac{1}{\sqrt{2\pi}}\int g(-\eta)\, e^{i x \eta}\, d\eta
\;=\; (\mathcal{U} \tilde g)(x), \qquad \text{where } \tilde g(\eta) := g(-\eta).
\]
Applying \(\mathcal{U}\) to both sides gives \(\mathcal{U}\mathcal{V} g = \mathcal{U}^2 \tilde g\). Now the corollary
\(\mathcal{F}\{\mathcal{F}\{f\}\}(x) = 2\pi\, f(-x)\) stated with the Fourier-transform definition
becomes, after dividing by \(2\pi\) (one factor of \(\sqrt{2\pi}\) for each application of \(\mathcal{U}\)),
\(\mathcal{U}^2 f(x) = f(-x)\), so \(\mathcal{U}^2 \tilde g(x) = \tilde g(-x) = g(x)\).
Hence \(\mathcal{U}\mathcal{V} g = g\) on the Schwartz class. The Schwartz class is dense in
\(L^2(\mathbb{R})\) (it contains \(C_c^\infty\), which is itself dense in \(L^2\)), and
\(\mathcal{V}\mathcal{U}\) and \(\mathcal{U}\mathcal{V}\) are bounded operators that agree with the
identity on it, so by continuity both identities hold on all of \(L^2(\mathbb{R})\):
\(\mathcal{V} \mathcal{U} = \mathcal{U} \mathcal{V} = I\) on \(L^2(\mathbb{R})\). In particular,
\(\mathcal{U}\) is surjective with two-sided inverse \(\mathcal{V}\).
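The two identities on the Schwartz class can be checked numerically. The sketch below is a minimal illustration, assuming a shifted Gaussian as the test function and plain Riemann sums on truncated grids as stand-ins for the integrals. The grid sizes, helper names, and sample points are arbitrary choices made for this example. It approximates \(\mathcal{U} f\) on a frequency grid, then applies the \(e^{-i x \xi}\) kernel to test \(\mathcal{V}\mathcal{U} f = f\) and the \(e^{+i x \xi}\) kernel to test \(\mathcal{U}^2 f(x) = f(-x)\).

```python
import cmath
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def make_grid(lo, hi, step):
    """Equally spaced points lo, lo + step, ..., hi."""
    n = round((hi - lo) / step)
    return [lo + k * step for k in range(n + 1)]

def kernel_sum(samples, grid, step, point, sign):
    """Riemann sum for (1/sqrt(2*pi)) * integral samples(t) * exp(sign * i * point * t) dt.

    sign = +1 models U (the page's forward convention); sign = -1 models V.
    """
    total = sum(s * cmath.exp(sign * 1j * point * t) for s, t in zip(samples, grid))
    return total * step / SQRT_2PI

def f(x):
    # A shifted Gaussian: Schwartz, and deliberately not even, so that
    # f(x) and f(-x) are genuinely different functions.
    return math.exp(-0.5 * (x - 1.0) ** 2)

step = 0.05
xs = make_grid(-12.0, 12.0, step)    # spatial grid
xis = make_grid(-10.0, 10.0, step)   # frequency grid

f_samples = [f(x) for x in xs]
Uf = [kernel_sum(f_samples, xs, step, xi, +1) for xi in xis]  # U f sampled on xis

test_points = [-2.0, -0.5, 0.0, 1.0, 2.5]
# V U f should reproduce f.
inversion_error = max(abs(kernel_sum(Uf, xis, step, x, -1) - f(x)) for x in test_points)
# U U f should reproduce the reflection x -> f(-x).
reflection_error = max(abs(kernel_sum(Uf, xis, step, x, +1) - f(-x)) for x in test_points)

print("max |V U f - f|        :", inversion_error)
print("max |U U f - f(-x)|    :", reflection_error)
```

Both errors come out at the level of floating-point rounding for this test function, because Riemann sums converge extremely fast for rapidly decaying smooth integrands. For less regular functions the quadrature error would dominate and a finer grid would be needed.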
Step 3: the adjoint and the unitarity
The final ingredient is the identification of \(\mathcal{V}\) with the Hilbert-space adjoint
\(\mathcal{U}^*\) of \(\mathcal{U}\). For Schwartz functions \(f, g \in \mathcal{S}(\mathbb{R})\),
we compute \(\langle \mathcal{U} f, g \rangle\) by applying the polarized form of the classical
Plancherel identity
— \(\int f\, \overline{h}\, dx = \tfrac{1}{2\pi}\int \hat{f}\, \overline{\hat{h}}\, d\xi\) —
to the pair \((f, \mathcal{V} g)\). The Fourier transform of \(\mathcal{V} g\) is computable directly:
comparing definitions with the
Fourier-transform page
shows \(\mathcal{V} g = \sqrt{2\pi}\, \mathcal{F}^{-1}(g)\), hence
\(\mathcal{F}(\mathcal{V} g) = \sqrt{2\pi}\, \mathcal{F}\mathcal{F}^{-1}(g) = \sqrt{2\pi}\, g\) by
Fourier inversion on the Schwartz class. Therefore
\[
\begin{align*}
\langle f,\, \mathcal{V} g \rangle
&\;=\; \int f(x)\, \overline{(\mathcal{V} g)(x)}\, dx \\\\
&\;=\; \frac{1}{2\pi} \int \hat{f}(\xi)\, \overline{\sqrt{2\pi}\, g(\xi)}\, d\xi \\\\
&\;=\; \frac{1}{\sqrt{2\pi}} \int \hat{f}(\xi)\, \overline{g(\xi)}\, d\xi \\\\
&\;=\; \int (\mathcal{U} f)(\xi)\, \overline{g(\xi)}\, d\xi \\\\
&\;=\; \langle \mathcal{U} f,\, g \rangle,
\end{align*}
\]
where the second equality is classical Plancherel and the fourth uses
\(\mathcal{U} f = \hat{f}/\sqrt{2\pi}\) by definition. Because \(\mathcal{U}\) and \(\mathcal{V}\) are
bounded, both sides are continuous in \(f\) and \(g\) for the \(L^2\) norm, so the identity
\(\langle \mathcal{U} f, g \rangle = \langle f, \mathcal{V} g \rangle\) extends by density from the
Schwartz class to all \(f, g \in L^2(\mathbb{R})\). This is precisely the relation that uniquely
determines the adjoint, \(\langle \mathcal{U} f, g \rangle = \langle f, \mathcal{U}^* g \rangle\) for all
\(f, g\), so \(\mathcal{V} = \mathcal{U}^*\).
Combined with Step 2, \(\mathcal{U}^* \mathcal{U} = \mathcal{U} \mathcal{U}^* = I\), which is exactly
the condition that \(\mathcal{U}\) be a unitary operator on \(L^2(\mathbb{R})\).
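The adjoint relation \(\langle \mathcal{U} f, g \rangle = \langle f, \mathcal{V} g \rangle\) has a transparent discrete counterpart, sketched below under illustrative assumptions: the two test functions, the common grid for \(x\) and \(\xi\), and the helper names are choices made for this example only. On a shared grid, the Riemann-sum matrix for \(\mathcal{V}\) is the conjugate transpose of the one for \(\mathcal{U}\), so the two inner products agree up to rounding. This mirrors the continuum statement that the kernel \(e^{-i x \xi}\) is the complex conjugate of \(e^{i x \xi}\).

```python
import cmath
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def make_grid(lo, hi, step):
    """Equally spaced points lo, lo + step, ..., hi."""
    n = round((hi - lo) / step)
    return [lo + k * step for k in range(n + 1)]

def apply_kernel(samples, grid, step, sign):
    """Riemann sum for (1/sqrt(2*pi)) * integral samples(t) * exp(sign * i * s * t) dt,
    evaluated at every s on the same grid (sign = +1 models U, sign = -1 models V)."""
    return [
        sum(v * cmath.exp(sign * 1j * s * t) for v, t in zip(samples, grid)) * step / SQRT_2PI
        for s in grid
    ]

def inner(u, v, step):
    """<u, v> = integral of u * conj(v): linear in the first slot, as on this page."""
    return sum(a * b.conjugate() for a, b in zip(u, v)) * step

step = 0.05
grid = make_grid(-10.0, 10.0, step)   # used for both x and xi

# f is a shifted Gaussian in x; g is a complex-valued Schwartz function in xi.
f_samples = [complex(math.exp(-0.5 * (x - 1.0) ** 2)) for x in grid]
g_samples = [(1.0 + 2.0j * xi) * math.exp(-xi * xi) for xi in grid]

lhs = inner(apply_kernel(f_samples, grid, step, +1), g_samples, step)  # <U f, g>
rhs = inner(f_samples, apply_kernel(g_samples, grid, step, -1), step)  # <f, V g>

print("<U f, g> :", lhs)
print("<f, V g> :", rhs)
print("difference:", abs(lhs - rhs))
```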
Theorem: Plancherel — The Fourier Transform as a Unitary
The normalized Fourier transform
\(\mathcal{U} : L^2(\mathbb{R}) \to L^2(\mathbb{R})\),
\(\mathcal{U} = \mathcal{F}/\sqrt{2\pi}\), extending the classical Fourier transform
from \(L^1 \cap L^2\) by density, is a unitary operator: it is bijective,
preserves the \(L^2\) inner product
\[
\langle \mathcal{U} f, \mathcal{U} g \rangle \;=\; \langle f, g \rangle
\qquad \text{for all } f, g \in L^2(\mathbb{R}),
\]
and satisfies \(\mathcal{U}^{-1} = \mathcal{U}^*\), the Hilbert-space adjoint. On the dense
subspace \(L^1 \cap L^2\), the adjoint admits the integral representation
\((\mathcal{U}^* g)(x) = (1/\sqrt{2\pi}) \int_{-\infty}^{\infty} g(\xi)\, e^{-i x \xi}\, d\xi\),
differing from \(\mathcal{U}\) only by the sign of the exponent.
Proof:
The four conclusions correspond to the three steps above together with a final polarization
argument. Step 1 (Fourier-Plancherel transform on \(L^2\)) gave the existence of \(\mathcal{U}\)
as a bounded linear operator with \(\|\mathcal{U} f\|_2 = \|f\|_2\). Step 2 (surjectivity via
Fourier inversion) gave bijectivity, with \(\mathcal{V}\) as two-sided inverse:
\(\mathcal{V}\mathcal{U} = \mathcal{U}\mathcal{V} = I\). Step 3 (adjoint identification)
established \(\mathcal{V} = \mathcal{U}^*\), so \(\mathcal{U}^{-1} = \mathcal{V} = \mathcal{U}^*\)
and the integral representation on \(L^1 \cap L^2\) is the defining formula for \(\mathcal{V}\).
Finally, preservation of the inner product follows from norm-preservation by polarization: the
identity \(\langle f, g \rangle = \tfrac{1}{4}\sum_{k=0}^3 i^k \|f + i^k g\|^2\) (valid in any
complex inner-product space under the first-slot-linear convention) determines
\(\langle f, g \rangle\) from norms alone, and each norm term is preserved by \(\mathcal{U}\).
\(\square\)
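The polarization identity used in the last step can be verified directly in \(\mathbb{C}^n\), which carries the same first-slot-linear inner product. The vectors and function names below are arbitrary illustrations.

```python
def inner(u, v):
    """<u, v> = sum of u_j * conj(v_j): linear in the first slot, conjugate-linear in the second."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def norm_squared(u):
    """||u||^2 = <u, u>, which is real and nonnegative."""
    return inner(u, u).real

def polarization(u, v):
    """(1/4) * sum over k = 0..3 of i^k * ||u + i^k v||^2, computed from norms alone."""
    powers_of_i = [1, 1j, -1, -1j]
    total = 0
    for ik in powers_of_i:
        shifted = [a + ik * b for a, b in zip(u, v)]
        total += ik * norm_squared(shifted)
    return total / 4

u = [1 + 2j, -1j, 3.0]
v = [2 - 1j, 4.0, 1 + 1j]

direct = inner(u, v)            # computed from the definition
from_norms = polarization(u, v) # reconstructed from four norms

print(direct, from_norms)
```

Since `polarization` touches the vectors only through norms of linear combinations, any linear map that preserves norms preserves its output, which is the argument applied to \(\mathcal{U}\) in the proof.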
What the Unitarity Buys
The classical Plancherel identity \(\|f\|_2^2 = \tfrac{1}{2\pi}\|\hat{f}\|_2^2\) is a statement
about a single function: integrating the modulus-squared of either \(f\) or \(\hat{f}\)
gives the same answer, up to a constant. The unitary reformulation upgrades this to a
statement about the entire structure of \(L^2(\mathbb{R})\): the Fourier transform is an
isomorphism of Hilbert spaces from \(L^2(\mathbb{R})\) (as a space of "spatial"
functions) to \(L^2(\mathbb{R})\) (as a space of "frequency" functions), and the two spaces
are not merely isomorphic abstractly — they are isomorphic via this specific concrete map. Every
Hilbert-space construction one performs on functions has a frequency-domain mirror: orthogonal
decompositions of \(L^2\) correspond to orthogonal decompositions of the frequency space;
self-adjoint operators commuting with translations correspond to multiplication operators in
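frequency (the spectral theorem for translation-invariant operators); and a probability density
\(p \in L^2(\mathbb{R})\) has, under the sign convention of this page, characteristic function \(\hat{p}\), which therefore lies in \(L^2\) with \(\|\hat{p}\|_2^2 = 2\pi\, \|p\|_2^2\). This is the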
viewpoint that makes the Fourier transform the prototype, in spectral theory and in quantum
mechanics, of "diagonalizing a self-adjoint family of operators by a unitary change of basis."
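As one concrete instance of this mirror, here is a short worked computation under the sign convention of this page. The symbols \(\tau_a\) (translation by \(a \in \mathbb{R}\)) and \(M_a\) (multiplication by a phase) are notation introduced only for this illustration. For \(f \in L^1 \cap L^2\) and \((\tau_a f)(x) = f(x - a)\), the substitution \(y = x - a\) gives
\[
(\mathcal{U} \tau_a f)(\xi)
\;=\; \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x - a)\, e^{i \xi x}\, dx
\;=\; e^{i \xi a}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(y)\, e^{i \xi y}\, dy
\;=\; e^{i \xi a}\, (\mathcal{U} f)(\xi).
\]
Both sides are bounded in \(f\), so by density \(\mathcal{U} \tau_a = M_a\, \mathcal{U}\) on all of \(L^2(\mathbb{R})\), where \((M_a h)(\xi) = e^{i \xi a}\, h(\xi)\). Equivalently, \(\mathcal{U}\, \tau_a\, \mathcal{U}^* = M_a\): the single unitary \(\mathcal{U}\) carries every translation to a multiplication operator.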
With this, the program announced in the introduction is complete. The three claims of classical
\(L^2\) Fourier theory that were stated, used, or sketched in earlier pages —
Parseval's identity for Fourier series, the decay of Fourier coefficients, and the unitary nature
of the Fourier transform on \(L^2(\mathbb{R})\) — have each been identified, by direct proof, as a
specific instance of Hilbert-space machinery: respectively, the Parseval identity applied to a
concrete orthonormal basis, the Bessel inequality applied to that same basis, and the bounded-extension
construction applied to an isometric operator on a dense subspace. What was previously a chapter of
classical analysis is now a chapter of operator theory.