From Functions to Equivalence Classes
Let \((\Omega, \mathcal{F}, \mu)\) be a measure space.
In Intro to Functional Analysis,
we defined the \(L^p\) space as the collection of measurable functions \(f\) with
\(\int |f|^p \, d\mu < \infty\), equipped with the quantity
\[
\|f\|_p \;=\; \left( \int_\Omega |f|^p \, d\mu \right)^{1/p}.
\]
For this to be a valid norm, three axioms must hold:
(i) positive definiteness: \(\|f\|_p \geq 0\), with
\(\|f\|_p = 0 \iff f = 0\) (the "\(\Leftarrow\)" direction is immediate; the
"\(\Rightarrow\)" direction is the non-trivial content);
(ii) absolute homogeneity: \(\|\alpha f\|_p = |\alpha| \cdot \|f\|_p\);
(iii) triangle inequality: \(\|f + g\|_p \leq \|f\|_p + \|g\|_p\).
Non-negativity and axiom (ii) are straightforward: the former from the non-negativity
of the integrand \(|f|^p\), the latter from the linearity of the integral. The two
remaining difficulties — the definiteness half of axiom (i), and the
triangle inequality — drive the structure of this entire chapter. We address
definiteness first.
Convention. Throughout this chapter, \(\mathbb{F}\) denotes the
scalar field, either \(\mathbb{R}\) or \(\mathbb{C}\). Measurable functions
\(f : \Omega \to \mathbb{F}\) are understood as
\((\mathcal{F}, \mathcal{B}(\mathbb{F}))\)-measurable, where
\(\mathcal{B}(\mathbb{F})\) is the Borel \(\sigma\)-algebra on \(\mathbb{F}\).
Results are stated in generality over \(\mathbb{F}\); the real and complex cases
differ only where phase/sign arguments appear (notably the equality conditions
for Hölder and Minkowski).
The Seminorm Problem
Suppose \(\|f\|_p = 0\). Then \(\int |f|^p \, d\mu = 0\). Since \(f\) is
measurable, so is the composition \(|f|^p\) (the absolute value is continuous
on \(\mathbb{F}\) and \(t \mapsto t^p\) is continuous on \([0, \infty)\), both
are Borel measurable, and measurability is preserved under composition), and
\(|f|^p \geq 0\) pointwise. By
Theorem: Zero Integral Implies Vanishing Almost Everywhere,
\(|f(x)|^p = 0\) for almost every \(x \in \Omega\). Since \(p \geq 1\), the map
\(t \mapsto t^p\) is strictly increasing on \([0, \infty)\), so \(t^p = 0 \iff t = 0\);
hence \(|f(x)| = 0\) a.e., i.e., \(f(x) = 0\) except on a set of
measure zero.
But this does not mean \(f\) is the zero function. It means only that \(f = 0\)
almost everywhere (a.e.).
For a concrete example, recall the Dirichlet function from Lebesgue Integration:
\[
f(x) = \chi_{\mathbb{Q}}(x) =
\begin{cases}
1 & \text{if } x \in \mathbb{Q}, \\
0 & \text{if } x \notin \mathbb{Q}.
\end{cases}
\]
Considering \(\mathbb{R}\) with Lebesgue measure, since \(\mu(\mathbb{Q}) = 0\),
we have \(\|f\|_p = 0\) for every \(1 \leq p < \infty\),
yet \(f\) is not the zero function — it equals \(1\) at every rational point. The quantity
\(\|\cdot\|_p\) therefore fails to distinguish \(f\) from the zero function.
In the language of normed space theory, \(\|\cdot\|_p\) is a seminorm,
not a norm: \(\|f\|_p = 0\) can hold even though \(f\) is not the zero function.
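The same failure can be seen on a toy model. The sketch below is only an illustration under an assumption not made in the text: it replaces \((\Omega, \mathcal{F}, \mu)\) by four points with weights `mu[i]`, one of which is a null atom standing in for \(\mathbb{Q}\). The helper name `seminorm` is hypothetical.

```python
# Finite stand-in for a measure space: points 0..3 with weights mu[i] = mu({i}).
# Point 2 has weight 0 (a null atom), playing the role of the rationals.
mu = [0.5, 1.0, 0.0, 2.0]

def seminorm(f, mu, p):
    """Discrete form of (integral of |f|^p d mu)^(1/p)."""
    return sum(abs(x) ** p * w for x, w in zip(f, mu)) ** (1.0 / p)

f = [0.0, 0.0, 1.0, 0.0]      # nonzero only on the null atom
zero = [0.0, 0.0, 0.0, 0.0]   # the zero function

print(seminorm(f, mu, 2))     # 0.0
print(f == zero)              # False: f is not the zero function
```

The quantity vanishes on `f` although `f` and `zero` differ as functions, which is exactly the seminorm behaviour described above.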
The Equivalence Relation
The resolution is a standard algebraic maneuver: we quotient out the
ambiguity. We declare two measurable functions to be "the same" if they differ only
on a negligible set.
Definition: Equality Almost Everywhere
Let \(f, g : \Omega \to \mathbb{F}\) (\(\mathbb{F} = \mathbb{R}\) or \(\mathbb{C}\))
be measurable functions. We say \(f\) and \(g\) are
equal almost everywhere, written \(f = g\) a.e., if
\[
\mu\bigl(\{x \in \Omega : f(x) \neq g(x)\}\bigr) = 0.
\]
The relation \(f \sim g \iff f = g\) a.e. is an equivalence relation on
the set of measurable functions.
Verification:
We verify the three axioms of an equivalence relation. A preliminary remark:
for measurable \(f, g : \Omega \to \mathbb{F}\), the difference \(f - g\) is
measurable (componentwise in the complex case). Thus
\(\{x : f(x) = g(x)\} = (f - g)^{-1}(\{0\})\); since \(f - g\) is measurable
(i.e., the preimage of every Borel set lies in \(\mathcal{F}\)) and
\(\{0\} \subset \mathbb{F}\) is closed hence Borel, this preimage is in
\(\mathcal{F}\), and so is its complement \(\{x : f(x) \neq g(x)\}\).
Reflexivity: \(\{x : f(x) \neq f(x)\} = \emptyset\), which has measure zero.
Symmetry: The sets \(\{f \neq g\}\) and \(\{g \neq f\}\) are
literally equal, so \(f \sim g \iff g \sim f\).
Transitivity: Suppose \(f = g\) a.e. and \(g = h\) a.e.
Let \(N_1 = \{x : f(x) \neq g(x)\}\) and \(N_2 = \{x : g(x) \neq h(x)\}\);
both are null by hypothesis and measurable by the preliminary remark
(applied to each pair). The set \(\{x : f(x) \neq h(x)\}\) is likewise
measurable by the remark (applied to \(f\) and \(h\)), and if
\(f(x) \neq h(x)\), then either \(f(x) \neq g(x)\) or \(g(x) \neq h(x)\), so
\(\{x : f(x) \neq h(x)\} \subseteq N_1 \cup N_2\). By monotonicity and
subadditivity of \(\mu\),
\(\mu(\{x : f(x) \neq h(x)\}) \leq \mu(N_1) + \mu(N_2) = 0\).
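The transitivity argument can be replayed on a finite weighted point set. This is a minimal sketch under that assumption (it is not a general measure space), and the helper names `differing_set`, `measure`, and `ae_equal` are hypothetical.

```python
def differing_set(f, g):
    """Indices x where f(x) != g(x)."""
    return {i for i, (a, b) in enumerate(zip(f, g)) if a != b}

def measure(S, mu):
    """mu(S) for a set S of indices, with mu[i] the weight of point i."""
    return sum(mu[i] for i in S)

def ae_equal(f, g, mu):
    """f = g a.e.: the set where they differ has measure zero."""
    return measure(differing_set(f, g), mu) == 0

mu = [1.0, 0.0, 2.0, 0.0]   # points 1 and 3 are null
f = [1, 5, 2, 0]
g = [1, 7, 2, 0]            # differs from f only at the null point 1
h = [1, 7, 2, 9]            # differs from g only at the null point 3

N1 = differing_set(f, g)
N2 = differing_set(g, h)
# The inclusion used in the proof: {f != h} is contained in N1 union N2.
print(differing_set(f, h) <= N1 | N2)   # True
print(ae_equal(f, h, mu))               # True
```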
The Formal Definition of \(L^p\)
Definition: The \(L^p\) Space (Rigorous)
Let \((\Omega, \mathcal{F}, \mu)\) be a measure space and \(1 \leq p < \infty\). Define
\[
\mathscr{L}^p(\Omega, \mathcal{F}, \mu) \;=\;
\bigl\{\, f : \Omega \to \mathbb{F} \;\big|\; f \text{ is measurable and }
\int_\Omega |f|^p \, d\mu < \infty \,\bigr\}.
\]
The \(L^p\) space is the quotient
\[
L^p(\Omega, \mathcal{F}, \mu) \;=\; \mathscr{L}^p(\Omega, \mathcal{F}, \mu) \,\big/\!\sim
\]
where \(f \sim g \iff f = g\) a.e. Each element of \(L^p\) is an
equivalence class \([f]\) of functions that agree almost everywhere.
Following universal convention, we write \(f \in L^p\) rather than \([f] \in L^p\),
understanding that "\(f\)" refers to the equivalence class and not to any particular
representative. This notational abuse is harmless because all quantities we care
about — the norm \(\|f\|_p\), integrals \(\int fg \, d\mu\), and convergence statements —
are invariant under modification on sets of measure zero.
When we write \(f = 0\) in \(L^p\), we mean \(f(x) = 0\) for \(\mu\)-almost every \(x\).
Definition: The \(L^\infty\) Space
Let \((\Omega, \mathcal{F}, \mu)\) be a measure space. A measurable
function \(f : \Omega \to \mathbb{F}\) is essentially bounded
if there exists a constant \(C \geq 0\) such that \(|f(x)| \leq C\) for
a.e. \(x\). The essential supremum is
\[
\|f\|_\infty \;=\; \inf\{C \geq 0 : |f(x)| \leq C \text{ for a.e. } x\}.
\]
The \(L^\infty\) space is the set of equivalence classes
(under a.e.-equality) of essentially bounded measurable functions, equipped
with the quantity \(\|\cdot\|_\infty\); we verify below that this is indeed
a norm.
The infimum in the definition of \(\|f\|_\infty\) is not just an infimum — it is
actually attained. This small fact is what makes \(\|\cdot\|_\infty\) well-behaved
as a norm and is used repeatedly in the sequel.
Lemma: Essential Supremum Is Attained
If \(f \in L^\infty\), then \(|f(x)| \leq \|f\|_\infty\) for almost every \(x\).
In particular, the infimum in the definition of \(\|f\|_\infty\) is achieved.
Proof:
Since \(f \in L^\infty\), the set of admissible bounds is non-empty; it is
also bounded below by \(0\), so its infimum \(\|f\|_\infty\) lies in
\([0, \infty)\). By the defining property of the infimum, for each
\(n \in \mathbb{N}\) there exists an admissible \(C_n\) with
\(\|f\|_\infty \leq C_n < \|f\|_\infty + 1/n\); for this \(C_n\), the set
\(E_n = \{x : |f(x)| > C_n\}\) has measure zero. Let
\(E = \bigcup_{n=1}^\infty E_n\). Since each \(E_n\) is null and \(\mu\) is
\(\sigma\)-subadditive (a direct consequence of
countable additivity,
via disjointification),
\(\mu(E) \leq \sum_{n=1}^\infty \mu(E_n) = 0\). For \(x \notin E\), we have
\(|f(x)| \leq C_n < \|f\|_\infty + 1/n\) for every \(n\); letting
\(n \to \infty\) gives \(|f(x)| \leq \|f\|_\infty\). This proves the first
claim; the second is its immediate corollary, since \(C = \|f\|_\infty\) is
itself an admissible bound.
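On a finite weighted point set the essential supremum is simply the largest value of \(|f|\) over points of positive weight, and the lemma can be checked directly. The sketch below assumes such a finite model (where attainment is of course trivial); `ess_sup` is a hypothetical helper name.

```python
def ess_sup(f, mu):
    """Max of |f| over points with positive weight; values on null points are ignored."""
    return max((abs(x) for x, w in zip(f, mu) if w > 0), default=0.0)

mu = [1.0, 0.0, 0.5]
f = [3.0, 100.0, -4.0]    # the value 100 sits on a null point

C = ess_sup(f, mu)
# Measure of the exceptional set {x : |f(x)| > C}; the lemma says it is zero.
exceptional = sum(w for x, w in zip(f, mu) if abs(x) > C)

print(C)             # 4.0
print(exceptional)   # 0.0
```

Note that the ordinary supremum of \(|f|\) here is 100, while the essential supremum ignores the null point.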
With this lemma in hand, the three norm axioms for \(\|\cdot\|_\infty\) are
immediate. Non-negativity is clear from the definition.
Definiteness on equivalence classes: if \(\|f\|_\infty = 0\), then
\(|f(x)| \leq 0\) a.e., so \(f = 0\) a.e. — i.e., \([f]\) is the zero class.
Absolute homogeneity: for \(\alpha = 0\) both sides vanish; for
\(\alpha \neq 0\), the lemma gives \(|f(x)| \leq \|f\|_\infty\) a.e., so
\(|\alpha f(x)| \leq |\alpha| \|f\|_\infty\) a.e., showing
\(\|\alpha f\|_\infty \leq |\alpha| \|f\|_\infty\); applying the same argument
to \(\alpha^{-1}(\alpha f) = f\) yields the reverse inequality.
Triangle inequality: since \(|f(x)| \leq \|f\|_\infty\) and
\(|g(x)| \leq \|g\|_\infty\) a.e., we have
\(|f(x) + g(x)| \leq \|f\|_\infty + \|g\|_\infty\) a.e., which forces
\(\|f + g\|_\infty \leq \|f\|_\infty + \|g\|_\infty\). So \((L^\infty, \|\cdot\|_\infty)\)
is a normed space; its completeness is established alongside the
\(1 \leq p < \infty\) case — though by a genuinely simpler argument —
when we prove the Riesz-Fischer theorem in the next chapter.
Connection to Sequence Spaces
The sequence spaces \(\ell^p\) are a special case of \(L^p\). If we take \(\Omega = \mathbb{N}\),
\(\mathcal{F} = 2^{\mathbb{N}}\) (all subsets), and \(\mu\) = counting measure
(\(\mu(\{n\}) = 1\) for each \(n\)), then \(L^p(\mathbb{N}, \mu)\) is exactly \(\ell^p\).
In this setting, the equivalence class issue is trivial — since every singleton
\(\{n\}\) has positive measure, two sequences are equal a.e. if and only if they
are identical. Every theorem we prove for \(L^p\) in this chapter therefore
specializes to \(\ell^p\) automatically.
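As a small illustration of this specialization, the sketch below evaluates the weighted \(L^p\) formula with all weights equal to 1 on a finitely supported sequence. The truncation to finitely many terms and the helper name `lp_norm` are assumptions made for the example.

```python
import math

def lp_norm(f, mu, p):
    """Weighted L^p norm on a finite point set; p = math.inf gives the essential supremum."""
    if math.isinf(p):
        return max((abs(x) for x, w in zip(f, mu) if w > 0), default=0.0)
    return sum(abs(x) ** p * w for x, w in zip(f, mu)) ** (1.0 / p)

x = [3.0, -4.0, 0.0]          # a finitely supported sequence
counting = [1.0] * len(x)     # counting measure: every singleton has mass 1

print(lp_norm(x, counting, 1))          # 7.0
print(lp_norm(x, counting, 2))          # 5.0
print(lp_norm(x, counting, math.inf))   # 4.0
```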
With the quotient construction in hand, the seminorm \(\|\cdot\|_p\) on \(\mathscr{L}^p\)
descends to a genuine norm on \(L^p\): if \(\|[f]\|_p = 0\), then \(f = 0\) a.e.,
which means \([f]\) is the zero element of the quotient space.
The remaining norm axiom — the triangle inequality \(\|f + g\|_p \leq \|f\|_p + \|g\|_p\) —
is the content of Minkowski's inequality, which requires
Hölder's inequality as an intermediate step.
We turn to these now.
Young's & Hölder's Inequality
The proof chain begins with an elementary inequality about real numbers,
which we then "integrate" to obtain the central inequality of \(L^p\) theory.
Hölder Conjugates
Definition: Hölder Conjugate
For an exponent \(1 < p < \infty\), the Hölder conjugate
\(q\) is the unique real number satisfying
\[
\frac{1}{p} + \frac{1}{q} = 1,
\qquad \text{equivalently} \quad q = \frac{p}{p - 1}.
\]
Since \(p > 1\), \(p - 1 > 0\) gives \(q > 0\); and \(p > p - 1\) gives
\(q > 1\), while \(p < \infty\) gives \(q < \infty\). Thus \(1 < q < \infty\).
The extreme cases are defined by convention so that the relation
\(1/p + 1/q = 1\) remains formally valid with the extended-real convention
\(1/\infty = 0\): the conjugate of \(p = 1\) is \(q = \infty\), and the
conjugate of \(p = \infty\) is \(q = 1\). The relation is symmetric: the
conjugate of \(q\) is \(p\).
The unique self-conjugate case is \(p = q = 2\) — the setting of
Hilbert spaces
and the
Cauchy-Schwarz inequality.
The algebraic identity \((p - 1)q = p\), which follows immediately from
\(q = p/(p-1)\), will appear repeatedly in the proofs below.
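The conjugate relation and the identity \((p - 1)q = p\) are easy to sanity-check numerically. The function name `conjugate` below is a hypothetical helper, and the endpoint conventions are encoded with `math.inf`.

```python
import math

def conjugate(p):
    """Hölder conjugate q of p, with the conventions 1 <-> infinity."""
    if p == 1:
        return math.inf
    if math.isinf(p):
        return 1.0
    return p / (p - 1)

for p in (1.5, 2.0, 3.0, 4.0):
    q = conjugate(p)
    assert math.isclose(1 / p + 1 / q, 1.0)   # defining relation
    assert math.isclose((p - 1) * q, p)       # identity used in later proofs
    assert math.isclose(conjugate(q), p)      # symmetry

print(conjugate(2.0))   # 2.0, the self-conjugate case
```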
Young's Inequality
Theorem: Young's Inequality
Let \(1 < p < \infty\) and let \(q\) be its Hölder conjugate. For all
\(a, b \geq 0\),
\[
ab \;\leq\; \frac{a^p}{p} + \frac{b^q}{q}.
\]
Equality holds if and only if \(a^p = b^q\).
Proof:
The argument is a direct specialization of the weighted AM-GM inequality
with weights \((1/p, 1/q)\). If \(a = 0\) or \(b = 0\), the left side is \(0\)
while the right side is non-negative, so the inequality holds trivially
(see the equality analysis below). Assume \(a, b > 0\).
The key observation is that the logarithm \(t \mapsto \log t\) is
strictly concave on \((0, \infty)\): its second derivative
\((\log t)'' = -1/t^2\) is strictly negative. By the definition of strict
concavity, for any \(\lambda \in (0, 1)\) and \(u, v > 0\) with \(u \neq v\),
\[
\log\bigl(\lambda u + (1 - \lambda) v\bigr) \;>\; \lambda \log u + (1 - \lambda) \log v,
\]
with equality when \(u = v\). By the logarithm rules
\(\lambda \log u + (1 - \lambda) \log v = \log\bigl(u^\lambda v^{1-\lambda}\bigr)\).
Since \(\log\) is strictly increasing, we obtain the
weighted AM-GM inequality
\[
u^\lambda \, v^{1 - \lambda} \;\leq\; \lambda u + (1 - \lambda) v,
\]
with equality if and only if \(u = v\).
Now set \(\lambda = 1/p\) (so that \(1 - \lambda = 1/q\) and
\(\lambda \in (0, 1)\) since \(1 < p < \infty\)), \(u = a^p\), and \(v = b^q\):
\[
(a^p)^{1/p} (b^q)^{1/q} \;\leq\; \frac{a^p}{p} + \frac{b^q}{q}.
\]
Simplifying the left side, \((a^p)^{1/p} (b^q)^{1/q} = a \cdot b\), which gives
\(ab \leq \frac{a^p}{p} + \frac{b^q}{q}\).
Equality condition. For \(a, b > 0\): the weighted AM-GM step
used strict concavity of \(\log\) with \(\lambda = 1/p \in (0, 1)\), so equality
holds iff \(u = v\), i.e., \(a^p = b^q\).
For the degenerate cases: if \(a = 0\) and \(b > 0\), the inequality reads
\(0 \leq b^q/q\) which is strict (since \(b^q/q > 0\)), and correspondingly
\(a^p = 0 \neq b^q\); symmetrically for \(b = 0, a > 0\). If \(a = b = 0\),
both sides are zero and \(a^p = 0 = b^q\). Hence in every case, equality
holds if and only if \(a^p = b^q\).
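A quick numerical check of the inequality and of its equality case, offered as a sketch: `young_gap` is a hypothetical helper returning the right side minus the left side, and the sample values are arbitrary.

```python
def young_gap(a, b, p):
    """a^p/p + b^q/q - a*b for the conjugate exponent q = p/(p-1); should be >= 0."""
    q = p / (p - 1)
    return a ** p / p + b ** q / q - a * b

p = 3.0
samples = (0.0, 0.5, 1.0, 2.0)
gaps = [young_gap(a, b, p) for a in samples for b in samples]
print(min(gaps) >= -1e-12)            # True, up to rounding

# Equality case: choose b with b^q = a^p, i.e. b = a^(p-1), using (p-1)q = p.
a = 1.7
b = a ** (p - 1)
print(abs(young_gap(a, b, p)) < 1e-9)  # True: the gap vanishes

# A pair with a^p != b^q gives a strictly positive gap.
print(young_gap(1.0, 2.0, p) > 0)      # True
```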
Young's inequality is a pointwise statement about real numbers.
Its power emerges when we "integrate both sides" — which is exactly what happens
in the proof of Hölder's inequality.
Hölder's Inequality
Theorem: Hölder's Inequality
Let \(1 \leq p \leq \infty\) and let \(q\) be its Hölder conjugate.
If \(f \in L^p\) and \(g \in L^q\), then the product \(fg\) is measurable,
\(fg \in L^1\), and
\[
\|fg\|_1 \;=\; \int_\Omega |f g| \, d\mu
\;\leq\; \|f\|_p \, \|g\|_q.
\]
Proof:
Measurability of \(fg\) is standard: the product of measurable functions is
measurable (in the complex case, by considering real and imaginary parts).
We establish the inequality below; since \(\|f\|_p, \|g\|_q < \infty\) by
hypothesis, its right-hand side is finite, which gives the integrability
conclusion \(fg \in L^1\).
Case \(p = 1, q = \infty\) (the case \(p = \infty, q = 1\)
follows by interchanging the roles of \(f\) and \(g\)):
If \(\|f\|_1 = 0\), then by
Theorem: Zero Integral Implies Vanishing Almost Everywhere,
\(|f| = 0\) a.e., so \(fg = 0\) a.e. and both sides are zero.
If \(\|g\|_\infty = 0\), then by
Lemma: Essential Supremum Is Attained,
\(|g(x)| \leq 0\) a.e., so \(g = 0\) a.e. and again \(fg = 0\) a.e.
Otherwise, by the same lemma,
\(|g(x)| \leq \|g\|_\infty\) for a.e. \(x\). Using multiplicativity of the
absolute value, \(|f(x) g(x)| = |f(x)| \cdot |g(x)| \leq |f(x)| \cdot \|g\|_\infty\)
for a.e. \(x\). Integrating both sides over \(\Omega\) (using monotonicity
of the integral for a.e. inequality, and linearity to pull out the constant
\(\|g\|_\infty\)),
\[
\int_\Omega |fg| \, d\mu
\;\leq\; \int_\Omega |f| \cdot \|g\|_\infty \, d\mu
\;=\; \|g\|_\infty \int_\Omega |f| \, d\mu
\;=\; \|f\|_1 \|g\|_\infty.
\]
Case \(1 < p < \infty\):
If \(\|f\|_p = 0\) or \(\|g\|_q = 0\), then \(f = 0\) a.e. or \(g = 0\) a.e.,
so \(fg = 0\) a.e. and both sides are zero. Assume \(\|f\|_p > 0\) and \(\|g\|_q > 0\).
Normalization. The strategy is to rescale \(f\) and \(g\) to
have unit norm, reducing the integral inequality to a pointwise application
of Young's inequality. Define
\[
\tilde{f} = \frac{|f|}{\|f\|_p}, \qquad
\tilde{g} = \frac{|g|}{\|g\|_q}.
\]
By direct computation,
\(\|\tilde f\|_p^p = \int |\tilde f|^p \, d\mu = \int |f|^p / \|f\|_p^p \, d\mu
= \|f\|_p^p / \|f\|_p^p = 1\), so \(\|\tilde f\|_p = 1\); similarly
\(\|\tilde g\|_q = 1\). Since \(f, g\) take values in \(\mathbb{F}\), we have
\(\tilde{f}(x), \tilde{g}(x) \in [0, \infty)\) everywhere, and
Young's inequality
applies pointwise:
\[
\tilde{f}(x) \, \tilde{g}(x)
\;\leq\; \frac{\tilde{f}(x)^p}{p} + \frac{\tilde{g}(x)^q}{q}
\quad \text{for every } x.
\]
Integrating this pointwise inequality over \(\Omega\) (by monotonicity of
the integral) and using linearity,
\[
\int_\Omega \tilde{f} \, \tilde{g} \, d\mu
\;\leq\; \frac{1}{p} \int_\Omega \tilde{f}^p \, d\mu
+ \frac{1}{q} \int_\Omega \tilde{g}^q \, d\mu
\;=\; \frac{1}{p} + \frac{1}{q} \;=\; 1.
\]
Substituting back \(\tilde{f} = |f|/\|f\|_p\) and \(\tilde{g} = |g|/\|g\|_q\):
\[
\frac{1}{\|f\|_p \, \|g\|_q} \int_\Omega |f g| \, d\mu \;\leq\; 1,
\]
which gives \(\int |fg| \, d\mu \leq \|f\|_p \, \|g\|_q\) as claimed.
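The normalization argument can be traced numerically on a finite weighted point set. The following is a minimal sketch under that assumption; the random data, the seed, and the helper name `norm` are illustrative choices, not part of the proof.

```python
import random

def norm(f, mu, p):
    """Weighted L^p norm on a finite point set with weights mu."""
    return sum(abs(x) ** p * w for x, w in zip(f, mu)) ** (1.0 / p)

random.seed(0)
n = 50
mu = [random.uniform(0.0, 2.0) for _ in range(n)]
f = [random.uniform(-1.0, 1.0) for _ in range(n)]
g = [random.uniform(-1.0, 1.0) for _ in range(n)]

p = 3.0
q = p / (p - 1)

lhs = sum(abs(a * b) * w for a, b, w in zip(f, g, mu))   # integral of |fg|
rhs = norm(f, mu, p) * norm(g, mu, q)                    # ||f||_p ||g||_q

# Normalized functions from the proof; their product integrates to at most 1.
ft = [abs(a) / norm(f, mu, p) for a in f]
gt = [abs(b) / norm(g, mu, q) for b in g]
normalized = sum(a * b * w for a, b, w in zip(ft, gt, mu))

print(lhs <= rhs)                          # True
print(normalized <= 1.0 + 1e-12)           # True
print(abs(normalized - lhs / rhs) < 1e-9)  # True: substituting back recovers lhs/rhs
```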
Equality Conditions and Special Cases
Tracing through the proof reveals when Hölder's inequality is sharp.
Equality Condition (\(1 < p < \infty\)):
Assume first that \(\|f\|_p > 0\) and \(\|g\|_q > 0\) (the degenerate
cases \(f = 0\) or \(g = 0\) a.e. give equality trivially, with both
sides zero). Using the normalized \(\tilde f = |f|/\|f\|_p\) and
\(\tilde g = |g|/\|g\|_q\) from the main proof, define the pointwise gap
\[
\Delta(x) \;=\; \frac{\tilde f(x)^p}{p} + \frac{\tilde g(x)^q}{q}
- \tilde f(x)\tilde g(x).
\]
Young's inequality, applied pointwise as in the main proof, gives
\(\Delta(x) \geq 0\) for every \(x\); moreover \(\Delta\) is measurable and
integrable, since each of its three terms is. Hence
Theorem: Zero Integral Implies Vanishing Almost Everywhere
(which requires a pointwise nonnegative integrand) applies to \(\Delta\) directly.
The main proof's integration step rewritten in equation form reads
\(\int \tilde f \, \tilde g \, d\mu = \tfrac{1}{p} + \tfrac{1}{q} - \int \Delta \, d\mu
= 1 - \int \Delta \, d\mu\);
equality in Hölder's inequality, that is \(\int \tilde f \tilde g \, d\mu = 1\), is therefore
equivalent to \(\int \Delta \, d\mu = 0\).
By the zero-integral theorem just invoked,
\(\Delta = 0\) a.e. By the equality case of
Young's inequality,
\(\Delta(x) = 0\) forces \(\tilde f(x)^p = \tilde g(x)^q\) for a.e. \(x\).
Unwinding the normalization,
\[
\frac{|f(x)|^p}{\|f\|_p^p} \;=\; \frac{|g(x)|^q}{\|g\|_q^q}
\quad \text{for a.e. } x,
\]
so \(|f|^p\) and \(|g|^q\) are proportional a.e.: there exists a constant
\(c = \|f\|_p^p / \|g\|_q^q > 0\) with \(|f|^p = c \, |g|^q\) a.e.
Conversely, suppose \(|f|^p = c |g|^q\) a.e. for some \(c > 0\). Integrating
both sides gives \(\|f\|_p^p = c \, \|g\|_q^q\), so \(c\) is forced to equal
\(\|f\|_p^p / \|g\|_q^q\); then
\(\tilde f^p = \tilde g^q\) a.e., so \(\Delta = 0\) a.e. and \(\int \Delta \, d\mu = 0\).
The main proof's integration identity then gives
\(\int \tilde f \, \tilde g \, d\mu = 1 - 0 = 1\); unwinding the
normalization gives \(\int |fg| \, d\mu = \|f\|_p \, \|g\|_q\), i.e., equality
in Hölder's inequality.
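The equality criterion can be observed on a finite weighted point set: if \(|f|^p = c\,|g|^q\), the two sides of Hölder's inequality agree up to rounding, and otherwise there is a visible gap. This is a sketch with arbitrarily chosen data; `norm` and `holder_sides` are hypothetical helper names.

```python
import math

def norm(f, mu, p):
    """Weighted L^p norm on a finite point set with weights mu."""
    return sum(abs(x) ** p * w for x, w in zip(f, mu)) ** (1.0 / p)

def holder_sides(f, g, mu, p, q):
    """Return (integral of |fg|, ||f||_p * ||g||_q)."""
    lhs = sum(abs(a * b) * w for a, b, w in zip(f, g, mu))
    return lhs, norm(f, mu, p) * norm(g, mu, q)

mu = [0.5, 1.0, 2.0, 0.25]
g = [1.0, -2.0, 0.5, 3.0]
p, q, c = 3.0, 1.5, 2.0

# Build f with |f|^p = c |g|^q, i.e. |f| = c^(1/p) |g|^(q/p).
f = [c ** (1 / p) * abs(b) ** (q / p) for b in g]
lhs, rhs = holder_sides(f, g, mu, p, q)
print(math.isclose(lhs, rhs, rel_tol=1e-9))   # True: equality

# A function whose p-th power is not proportional to |g|^q: strict inequality.
f2 = [1.0, 1.0, 1.0, 1.0]
lhs2, rhs2 = holder_sides(f2, g, mu, p, q)
print(lhs2 < rhs2)                            # True
```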
When \(p = q = 2\), Hölder's inequality reduces to:
\[
\int_\Omega |fg| \, d\mu \;\leq\; \|f\|_2 \, \|g\|_2.
\]
For the standard \(L^2\) inner product
\(\langle f, g \rangle = \int f \, \overline{g} \, d\mu\), the triangle
inequality for integrals gives
\(|\langle f, g \rangle| \leq \int |fg| \, d\mu\), so Hölder's inequality
immediately implies the
Cauchy-Schwarz inequality
\(|\langle f, g \rangle| \leq \|f\|_2 \|g\|_2\); the finite-dimensional bound
\(|\mathbf{u} \cdot \mathbf{v}| \leq \|\mathbf{u}\| \, \|\mathbf{v}\|\) familiar from
Hilbert spaces
is the special case of counting measure on a finite set. Hölder's inequality
thus extends Cauchy-Schwarz — with \(p = q = 2\) as its self-conjugate, most
symmetric case — to the full family of conjugate exponent pairs.
One Direction of \(L^p\) Duality
Hölder's inequality immediately settles "one half" of the duality claim from
Dual Spaces.
For any fixed \(g \in L^q\), the map
\[
\varphi_g : L^p \to \mathbb{F}, \qquad \varphi_g(f) = \int_\Omega f \, g \, d\mu
\]
is a bounded linear functional
on \(L^p\). (That \(L^p\) is indeed a vector space — i.e.,
\(\alpha f_1 + \beta f_2 \in L^p\) for \(f_1, f_2 \in L^p\) and \(\alpha, \beta \in \mathbb{F}\) —
will be established in the next section via Minkowski's inequality; we state
the duality result here because Hölder's inequality is its main analytic
ingredient.)
It is well-defined on equivalence classes: if \(f \sim f'\) (i.e., \(f = f'\) a.e.),
then \(fg = f'g\) a.e., so \(\int fg \, d\mu = \int f'g \, d\mu\). Linearity in \(f\)
follows from linearity of the integral:
\(\varphi_g(\alpha f_1 + \beta f_2) = \alpha \varphi_g(f_1) + \beta \varphi_g(f_2)\).
Boundedness follows from Hölder's inequality:
\(|\varphi_g(f)| \leq \int |fg| \, d\mu \leq \|f\|_p \, \|g\|_q\), so
\(\|\varphi_g\|_{(L^p)^*} = \sup_{\|f\|_p = 1} |\varphi_g(f)| \leq \|g\|_q\).
This bound is in fact sharp. We construct, for each case, a
normalized \(f\) witnessing (or approaching) the equality
\(|\varphi_g(f)| = \|g\|_q\).
Case \(1 < p < \infty\): for \(g \neq 0\), define the "equalizer"
\[
f_0(x) \;=\; \|g\|_q^{\,1-q} \,|g(x)|^{q-1} \, \overline{\operatorname{sgn} g(x)}
\]
where \(\operatorname{sgn} z = z/|z|\) for \(z \neq 0\) (the complex "phase";
\(\pm 1\) or \(0\) in the real case) and \(\operatorname{sgn} 0 = 0\). Each factor
is measurable: \(\|g\|_q^{1-q}\) is a constant; \(|g|^{q-1}\) is the composition
of the measurable function \(|g|\) with the Borel map \(t \mapsto t^{q-1}\) on
\([0, \infty)\); and \(\operatorname{sgn} g\) is the composition of \(g\) with
the Borel map \(z \mapsto \operatorname{sgn} z\) on \(\mathbb{F}\) (continuous on
\(\{z \neq 0\}\), extended by \(\operatorname{sgn} 0 = 0\)). Hence \(f_0\) is
measurable. Using the algebraic identity \(p(q - 1) = q\) (equivalently,
\(p(1 - q) = -q\)),
\[
|f_0|^p
\;=\; \|g\|_q^{p(1-q)} \, |g|^{p(q-1)}
\;=\; \|g\|_q^{-q} \, |g|^q,
\]
so \(\int |f_0|^p \, d\mu = \|g\|_q^{-q} \int |g|^q \, d\mu = \|g\|_q^{-q} \cdot \|g\|_q^q = 1\);
in particular \(f_0 \in L^p\) with \(\|f_0\|_p = 1\). And
\(f_0(x) g(x) = \|g\|_q^{1-q} |g(x)|^q \geq 0\), so
\(\varphi_g(f_0) = \|g\|_q^{1-q} \int |g|^q \, d\mu
= \|g\|_q^{1-q} \cdot \|g\|_q^q = \|g\|_q\).
Case \(p = \infty, q = 1\): for \(g \neq 0\), take
\(f_0 = \overline{\operatorname{sgn} g}\). Then \(|f_0| \leq 1\) pointwise with
\(|f_0| = 1\) on \(\{g \neq 0\}\), so \(\|f_0\|_\infty = 1\); and
\(f_0 g = |g|\), so
\(\varphi_g(f_0) = \int |g| \, d\mu = \|g\|_1\).
Case \(p = 1, q = \infty\) (assuming \(\mu\) is
\(\sigma\)-finite):
the bound \(\|g\|_\infty\) need not be attained by a single \(f\), but it is
approached. For any \(\varepsilon > 0\), the set
\(E_\varepsilon = \{x : |g(x)| > \|g\|_\infty - \varepsilon\}\) has positive
measure by the definition of \(\|g\|_\infty\) as an infimum; by \(\sigma\)-finiteness,
choose \(F_\varepsilon \subseteq E_\varepsilon\) with \(0 < \mu(F_\varepsilon) < \infty\).
Set \(f_\varepsilon = \mu(F_\varepsilon)^{-1} \chi_{F_\varepsilon} \, \overline{\operatorname{sgn} g}\).
Then \(\|f_\varepsilon\|_1 = 1\) and
\[
\varphi_g(f_\varepsilon)
\;=\; \mu(F_\varepsilon)^{-1} \int_{F_\varepsilon} |g| \, d\mu
\;\geq\; \|g\|_\infty - \varepsilon.
\]
Letting \(\varepsilon \to 0^+\), \(\sup_{\|f\|_1 = 1} |\varphi_g(f)| \geq \|g\|_\infty\).
(Without \(\sigma\)-finiteness, this lower bound can fail, so that
\(\|\varphi_g\|_{(L^1)^*}\) is strictly smaller than \(\|g\|_\infty\), and the
full duality \((L^1)^* \cong L^\infty\) breaks down; the exact failure is
treated in Conway's functional analysis text, among other standard references.)
Combining the three cases with the upper bound from Hölder's inequality,
\(\|\varphi_g\|_{(L^p)^*} = \|g\|_q\) for all \(1 \leq p \leq \infty\)
(with \(\sigma\)-finiteness for \(p = 1\)), so the embedding
\(L^q \hookrightarrow (L^p)^*\) given by \(g \mapsto \varphi_g\) is
isometric. Every element of \(L^q\) gives rise to a
continuous functional on \(L^p\), with no loss of norm.
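The equalizer from the case \(1 < p < \infty\) can be checked numerically, including the complex phase. The sketch below assumes a finite weighted point set and uses arbitrary sample data; `norm` and `sgn` are hypothetical helpers mirroring the definitions in the text.

```python
def norm(f, mu, p):
    """Weighted L^p norm on a finite point set with weights mu."""
    return sum(abs(x) ** p * w for x, w in zip(f, mu)) ** (1.0 / p)

def sgn(z):
    """Complex phase z/|z|, with sgn(0) = 0."""
    z = complex(z)
    return z / abs(z) if z != 0 else 0j

mu = [0.5, 1.0, 2.0, 0.25]
g = [1 + 1j, -2.0, 0.0, 3j]     # complex-valued, with one zero
p, q = 3.0, 1.5                 # conjugate exponents

ng = norm(g, mu, q)
# f0 = ||g||_q^(1-q) |g|^(q-1) conj(sgn g)
f0 = [ng ** (1 - q) * abs(z) ** (q - 1) * sgn(z).conjugate() for z in g]

phi = sum(a * b * w for a, b, w in zip(f0, g, mu))   # phi_g(f0)

print(abs(norm(f0, mu, p) - 1.0) < 1e-9)   # True: ||f0||_p = 1
print(abs(phi - ng) < 1e-9)                # True: phi_g(f0) = ||g||_q
```

Because \(f_0\) has unit norm and \(\varphi_g(f_0) = \|g\|_q\), the Hölder upper bound is attained for this choice of data.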
The converse — that every continuous functional on \(L^p\) arises from
some \(g \in L^q\) (for \(1 \leq p < \infty\), with \(\sigma\)-finiteness assumed
when \(p = 1\)) — is substantially harder and relies on the
Radon-Nikodym theorem from measure theory, whose treatment
lies beyond the scope of this chapter; it is developed in Durrett's probability
text and Conway's functional analysis text, among other standard references.
Minkowski's Inequality & the \(L^p\) Norm
We now use Hölder's inequality to prove the triangle inequality for \(\|\cdot\|_p\),
completing the verification that \(L^p\) is a normed space.
Theorem: Minkowski's Inequality
Let \(1 \leq p \leq \infty\). If \(f, g \in L^p\), then \(f + g \in L^p\) and
\[
\|f + g\|_p \;\leq\; \|f\|_p + \|g\|_p.
\]
Proof:
Throughout, \(f + g\) is measurable as the sum of measurable functions
(componentwise in the complex case), so \(|f + g|\) and its powers are
measurable as well. The proof splits into three cases: the endpoints
\(p = 1\) and \(p = \infty\) follow from the pointwise triangle inequality
and standard properties of integration; the main case \(1 < p < \infty\)
uses Hölder's inequality.
Case \(p = 1\):
By the pointwise triangle inequality, \(|f(x) + g(x)| \leq |f(x)| + |g(x)|\) for every \(x\).
Integrating this pointwise inequality over \(\Omega\) (using monotonicity and
linearity of the integral, with \(f + g\) measurable as established above),
\[
\int_\Omega |f + g| \, d\mu
\;\leq\; \int_\Omega |f| \, d\mu + \int_\Omega |g| \, d\mu
\;=\; \|f\|_1 + \|g\|_1 \;<\; \infty.
\]
Hence \(f + g \in L^1\), and by the definition of the \(L^1\) norm,
\(\|f+g\|_1 = \int |f+g| \, d\mu \leq \|f\|_1 + \|g\|_1\).
Case \(p = \infty\):
By Lemma: Essential Supremum Is Attained,
\(|f(x)| \leq \|f\|_\infty\) and \(|g(x)| \leq \|g\|_\infty\) for a.e. \(x\),
whence \(|f(x) + g(x)| \leq \|f\|_\infty + \|g\|_\infty < \infty\) for a.e. \(x\).
This shows \(f + g\) is essentially bounded, i.e., \(f + g \in L^\infty\).
The constant \(\|f\|_\infty + \|g\|_\infty\) is an admissible
a.e.-bound for \(|f + g|\), so by the definition of \(\|f + g\|_\infty\) as
the infimum of such bounds, \(\|f + g\|_\infty \leq \|f\|_\infty + \|g\|_\infty\).
Case \(1 < p < \infty\):
First, observe that \(f + g \in L^p\): pointwise,
\(|f(x) + g(x)| \leq |f(x)| + |g(x)| \leq 2\max(|f(x)|, |g(x)|)\),
and since \(t \mapsto t^p\) is monotonically increasing on \([0, \infty)\)
for \(p > 0\),
\[
|f + g|^p
\;\leq\; \bigl(2 \max(|f|, |g|)\bigr)^p
\;=\; 2^p \max(|f|, |g|)^p
\;=\; 2^p \max(|f|^p, |g|^p)
\;\leq\; 2^p (|f|^p + |g|^p).
\]
Integrating,
\(\int |f+g|^p \, d\mu \leq 2^p(\|f\|_p^p + \|g\|_p^p) < \infty\),
so \(f + g \in L^p\) and \(\|f+g\|_p = \bigl(\int |f+g|^p \, d\mu\bigr)^{1/p}\) is finite.
Now assume \(\|f + g\|_p > 0\) (otherwise the inequality is trivial).
We begin by splitting \(|f + g|^p\): since \(|f+g|^{p-1} \geq 0\),
multiplying the pointwise triangle inequality \(|f+g| \leq |f| + |g|\) by
\(|f+g|^{p-1}\) preserves its direction, giving
\[
|f + g|^p \;=\; |f + g|^{p-1} \cdot |f + g|
\;\leq\; |f + g|^{p-1} |f| \;+\; |f + g|^{p-1} |g|.
\]
We now apply Hölder's inequality
to each term on the right. Each of \(|f+g|^{p-1}\), \(|f|\), \(|g|\) is
non-negative and measurable: \(|f+g|^{p-1}\) is the composition of the
measurable function \(|f+g|\) with the Borel map \(t \mapsto t^{p-1}\) on
\([0, \infty)\), and similarly for \(|f|, |g|\). We verify the relevant
\(L^p\) memberships: \(|f| \in L^p\) directly (since \(\int |f|^p \, d\mu = \|f\|_p^p < \infty\)),
similarly \(|g| \in L^p\), and for \(|f+g|^{p-1} \in L^q\),
\[
\int_\Omega \bigl(|f + g|^{p-1}\bigr)^q \, d\mu
\;=\; \int_\Omega |f + g|^{(p-1)q} \, d\mu
\;=\; \int_\Omega |f + g|^{p} \, d\mu
\;=\; \|f + g\|_p^p,
\]
where we used the identity \((p - 1)q = p\). Therefore
\(\bigl\| |f+g|^{p-1} \bigr\|_q = \|f+g\|_p^{p/q}\).
Applying Hölder's inequality with non-negative factors (so
\(|(\cdot)(\cdot)| = (\cdot)(\cdot)\)):
\[
\int |f + g|^{p-1} |f| \, d\mu
\;\leq\; \bigl\||f+g|^{p-1}\bigr\|_q \cdot \|f\|_p
\;=\; \|f+g\|_p^{p/q} \cdot \|f\|_p,
\]
and analogously \(\int |f+g|^{p-1} |g| \, d\mu \leq \|f+g\|_p^{p/q} \cdot \|g\|_p\).
Adding these two bounds and using linearity of the integral to combine the
left-hand sides,
\[
\|f + g\|_p^p
\;=\; \int |f+g|^p \, d\mu
\;\leq\; \int |f+g|^{p-1} |f| \, d\mu + \int |f+g|^{p-1} |g| \, d\mu
\;\leq\; \|f+g\|_p^{p/q} \bigl(\|f\|_p + \|g\|_p\bigr).
\]
Since \(\|f+g\|_p^{p/q}\) is strictly positive (by assumption) and finite
(established above), we may divide both sides by it:
\[
\|f + g\|_p^{p - p/q} \;\leq\; \|f\|_p + \|g\|_p.
\]
Since \(p - p/q = p(1 - 1/q) = p \cdot (1/p) = 1\), the left side is simply
\(\|f + g\|_p\), completing the proof.
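Two steps of this proof lend themselves to a numerical sanity check: the norm identity \(\bigl\||f+g|^{p-1}\bigr\|_q = \|f+g\|_p^{p/q}\) and the final inequality. The sketch below assumes a finite weighted point set with randomly generated data; the seed and the helper name `norm` are illustrative.

```python
import math
import random

def norm(f, mu, p):
    """Weighted L^p norm on a finite point set with weights mu."""
    return sum(abs(x) ** p * w for x, w in zip(f, mu)) ** (1.0 / p)

random.seed(1)
n = 40
mu = [random.uniform(0.0, 2.0) for _ in range(n)]
f = [random.uniform(-1.0, 1.0) for _ in range(n)]
g = [random.uniform(-1.0, 1.0) for _ in range(n)]

p = 2.5
q = p / (p - 1)

s = [a + b for a, b in zip(f, g)]               # f + g
lhs = norm(s, mu, p)
rhs = norm(f, mu, p) + norm(g, mu, p)

# The factor handed to Hölder's inequality, and its L^q norm.
w = [abs(x) ** (p - 1) for x in s]
identity_ok = math.isclose(norm(w, mu, q), lhs ** (p / q), rel_tol=1e-9)

print(identity_ok)   # True: uses (p - 1) q = p
print(lhs <= rhs)    # True: Minkowski's inequality
```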
Equality Condition for Minkowski (\(1 < p < \infty\))
Equality \(\|f + g\|_p = \|f\|_p + \|g\|_p\) holds if and only if \(f\) and
\(g\) are non-negatively proportional a.e. — that is, either
\(f = 0\) a.e., or \(g = 0\) a.e., or there exists a constant \(c > 0\) such
that \(f = c \, g\) a.e. (equivalently, \(g = c^{-1} f\) a.e.).
Proof:
Excluding the trivial cases where \(f = 0\) or \(g = 0\) a.e. (in which
equality clearly holds), assume \(\|f\|_p, \|g\|_p > 0\) and suppose that
equality \(\|f + g\|_p = \|f\|_p + \|g\|_p\) holds. Then
\(\|f + g\|_p > 0\), so the main case of the proof above applies.
Tracing the main proof, equality in Minkowski requires equality in
both ingredients used:
- The pointwise triangle inequality \(|f + g| \leq |f| + |g|\) must hold
with equality a.e., i.e., \(|f(x) + g(x)| = |f(x)| + |g(x)|\) for a.e. \(x\).
Equality in this triangle inequality occurs iff \(f(x), g(x)\) lie on a
common non-negative ray from the origin — explicitly: for
\(\mathbb{F} = \mathbb{R}\), either one of \(f(x), g(x)\) is zero or they
share the same sign; for \(\mathbb{F} = \mathbb{C}\), either one is zero
or they have the same complex argument (phase).
- Each of the two Hölder applications must hold with equality. By the
equality condition for Hölder's inequality,
applied with exponents \((p, q)\) to the factors \(|f+g|^{p-1}\) (in \(L^q\))
and \(|f|\) (in \(L^p\)), there exists \(\alpha > 0\) with
\(|f|^p = \alpha \cdot \bigl(|f+g|^{p-1}\bigr)^q = \alpha |f+g|^{(p-1)q}
= \alpha |f+g|^p\) a.e., using \((p-1)q = p\). Similarly, from the second
Hölder application there exists \(\beta > 0\) with
\(|g|^p = \beta |f+g|^p\) a.e.
On the set \(\{|f+g| = 0\}\), both relations give \(|f|^p = 0 = |g|^p\) a.e.,
so \(f = g = 0\) a.e. on this set; the desired proportionality
\(|f|^p = (\alpha/\beta) |g|^p\) then holds trivially there.
On \(\{|f+g| > 0\}\), we have \(|g|^p = \beta |f+g|^p > 0\), so the
ratio \(|f|^p / |g|^p = \alpha/\beta\) is well-defined.
Setting \(k = \alpha/\beta > 0\), we conclude
\(|f|^p = k \, |g|^p\) a.e., i.e., \(|f| = k^{1/p} |g|\) a.e.
Combining the phase condition from the first bullet (common phase a.e.) with
the modulus relation just derived (\(|f| = c \, |g|\) a.e., where
\(c = k^{1/p} > 0\)), we obtain \(f = c \, g\) a.e. Indeed, working modulo
a null set: at points where \(g(x) = 0\), the modulus relation gives
\(|f(x)| = 0\), so \(f(x) = 0 = c \cdot g(x)\); at points where
\(g(x) \neq 0\), the phase condition forces \(f(x)\) to have the same phase as
\(g(x)\), and the modulus relation fixes \(|f(x)| = c \, |g(x)|\), so
\(f(x) = c \, g(x)\).
Conversely, if \(f = c \, g\) a.e. with \(c > 0\), then \(f + g = (c + 1) g\)
a.e., so by absolute homogeneity \(\|f + g\|_p = (c + 1) \|g\|_p\), and
\(\|f\|_p + \|g\|_p = c \|g\|_p + \|g\|_p = (c + 1) \|g\|_p = \|f + g\|_p\).
Geometrically, Minkowski's inequality is strict whenever \(f\) and \(g\) point
in genuinely different "directions" in \(L^p\) — a manifestation of the
strict convexity of the \(L^p\) norm for \(1 < p < \infty\).
(At the endpoints \(p = 1\) and \(p = \infty\), strict convexity fails and
equality can occur in many more configurations.)
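The contrast between proportional and non-proportional pairs shows up clearly in a small computation. This is a sketch on a finite weighted point set with hand-picked data; `norm` and `minkowski_gap` are hypothetical helper names.

```python
def norm(f, mu, p):
    """Weighted L^p norm on a finite point set with weights mu."""
    return sum(abs(x) ** p * w for x, w in zip(f, mu)) ** (1.0 / p)

def minkowski_gap(f, g, mu, p):
    """||f||_p + ||g||_p - ||f + g||_p, which Minkowski says is >= 0."""
    s = [a + b for a, b in zip(f, g)]
    return norm(f, mu, p) + norm(g, mu, p) - norm(s, mu, p)

mu = [1.0, 0.5, 2.0]
g = [1.0, -2.0, 0.5]
p = 3.0

# Positively proportional pair f = c g: the gap vanishes up to rounding.
c = 2.5
f = [c * x for x in g]
gap_prop = minkowski_gap(f, g, mu, p)
print(abs(gap_prop) < 1e-9)   # True

# Same moduli as g but opposite sign at one point: the gap is strictly positive.
f2 = [1.0, 2.0, 0.5]
gap_other = minkowski_gap(f2, g, mu, p)
print(gap_other > 0)          # True
```

The second pair satisfies \(|f_2| = |g|\) everywhere, so the modulus relation alone is not enough; the common-phase condition is what fails.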
\(L^p\) Is a Normed Space
Before listing the norm axioms, we record that
\(L^p\) is a vector space over \(\mathbb{F}\): for \(f, g \in L^p\)
and \(\alpha, \beta \in \mathbb{F}\), we have \(\alpha f + \beta g \in L^p\).
Closure under addition is part of Minkowski's inequality: the membership
\(f + g \in L^p\) was established in each of the three cases
(\(p = 1\), \(p = \infty\), and \(1 < p < \infty\)) of the proof above; closure
under scalar multiplication is immediate from
\(\int |\alpha f|^p \, d\mu = |\alpha|^p \int |f|^p \, d\mu < \infty\) (and analogously
\(\|\alpha f\|_\infty = |\alpha| \|f\|_\infty < \infty\)). The quotient construction
of \(L^p\) as \(\mathscr{L}^p / \sim\) preserves these operations, since
modifications on null sets are compatible with pointwise addition and scalar
multiplication.
We now summarize the complete verification of the norm axioms
for \(\|\cdot\|_p\) on \(L^p(\Omega, \mathcal{F}, \mu)\) in the case
\(1 \leq p < \infty\); the parallel verification for the \(p = \infty\) case
was completed immediately after
Lemma: Essential Supremum Is Attained.
- Positive definiteness: \(\|f\|_p \geq 0\), and
\(\|f\|_p = 0 \iff f = 0\) in \(L^p\) (i.e., \(f = 0\) a.e.).
This is where the equivalence class construction is essential
— without it, \(\|\cdot\|_p\) would only be a seminorm.
- Absolute homogeneity:
\(\|\alpha f\|_p = |\alpha| \cdot \|f\|_p\) for all \(\alpha \in \mathbb{F}\).
This follows immediately from \(\int |\alpha f|^p = |\alpha|^p \int |f|^p\).
- Triangle inequality:
\(\|f + g\|_p \leq \|f\|_p + \|g\|_p\).
This is Minkowski's inequality, proven above.
Therefore \(\bigl(L^p(\Omega, \mathcal{F}, \mu),\, \|\cdot\|_p\bigr)\) is a
normed vector space
for every \(1 \leq p \leq \infty\). This completes our construction of \(L^p\) as a normed structure.
The deepest question, however, remains open: is this normed space
complete? Does every Cauchy sequence in \(L^p\)
converge to a limit that is itself in \(L^p\)? An affirmative answer — the
Riesz–Fischer theorem — elevates \(L^p\) from a normed space to a
Banach space, and provides the
convergence theory that makes \(L^p\) the natural setting for probability, signal processing,
and quantum mechanics. We develop this in the next chapter,
\(L^p\) Completeness & Convergence.