\(L^p\) Spaces — Construction & Inequalities


Introduction

In Intro to Functional Analysis, we introduced the \(L^p\) spaces as the most important examples of Banach spaces in analysis and machine learning. That introduction left three structural claims unproven: that \(\|\cdot\|_p\) separates distinct elements (positive-definiteness), that it obeys the triangle inequality, and that the resulting normed space is complete. The first two together make \(L^p\) a normed space; the third upgrades it to a Banach space. Throughout this chapter we treat the full range \(1 \leq p \leq \infty\): the \(L^\infty\) space, with its essential supremum norm, is defined in parallel with the finite-exponent case, and the main inequalities (Hölder, Minkowski) are stated and proved for all \(1 \leq p \leq \infty\), with the endpoints handled as separate cases within each proof. In Dual Spaces, we further relied on Hölder's inequality, also without proof, to justify the pairing \(\varphi_g(f) = \int fg \, d\mu\) in the duality table.

This chapter settles the debt in full:

  1. \(\|\cdot\|_p\) is genuinely a norm — not merely a seminorm — once we pass to equivalence classes of functions equal almost everywhere.
  2. The triangle inequality for this norm (Minkowski's inequality) follows from Hölder's inequality, which in turn follows from the elementary Young's inequality.
  3. The resulting normed space is complete: the Riesz-Fischer theorem (proved in the next chapter, \(L^p\) Completeness & Convergence) guarantees that every Cauchy sequence in \(L^p\) converges in \(L^p\).

The chain of implications is: Young → Hölder → Minkowski (triangle inequality) → Riesz-Fischer (completeness) → Banach space. Each link depends logically on the preceding one. This chapter establishes the first three links (up to Minkowski and the normed-space structure); the completeness step and the Banach-space conclusion are taken up in the next chapter.

From Functions to Equivalence Classes

Let \((\Omega, \mathcal{F}, \mu)\) be a measure space. In Intro to Functional Analysis, we defined the \(L^p\) space as the collection of measurable functions \(f\) with \(\int |f|^p \, d\mu < \infty\), equipped with the quantity \[ \|f\|_p \;=\; \left( \int_\Omega |f|^p \, d\mu \right)^{1/p}. \] For this to be a valid norm, three axioms must hold: (i) positive definiteness: \(\|f\|_p \geq 0\), with \(\|f\|_p = 0 \iff f = 0\) (the "\(\Leftarrow\)" direction is immediate; the "\(\Rightarrow\)" direction is the non-trivial content); (ii) absolute homogeneity: \(\|\alpha f\|_p = |\alpha| \cdot \|f\|_p\); (iii) triangle inequality: \(\|f + g\|_p \leq \|f\|_p + \|g\|_p\). Non-negativity and axiom (ii) are straightforward: the former from the non-negativity of the integrand \(|f|^p\), the latter from the linearity of the integral. The two remaining difficulties — the definiteness half of axiom (i), and the triangle inequality — drive the structure of this entire chapter. We address definiteness first.

Convention. Throughout this chapter, \(\mathbb{F}\) denotes the scalar field, either \(\mathbb{R}\) or \(\mathbb{C}\). Measurable functions \(f : \Omega \to \mathbb{F}\) are understood as \((\mathcal{F}, \mathcal{B}(\mathbb{F}))\)-measurable, where \(\mathcal{B}(\mathbb{F})\) is the Borel \(\sigma\)-algebra on \(\mathbb{F}\). Results are stated in generality over \(\mathbb{F}\); the real and complex cases differ only where phase/sign arguments appear (notably the equality conditions for Hölder and Minkowski).

The Seminorm Problem

Suppose \(\|f\|_p = 0\). Then \(\int |f|^p \, d\mu = 0\). Since \(f\) is measurable, so is the composition \(|f|^p\) (the absolute value is continuous on \(\mathbb{F}\) and \(t \mapsto t^p\) is continuous on \([0, \infty)\), both are Borel measurable, and measurability is preserved under composition), and \(|f|^p \geq 0\) pointwise. By Theorem: Zero Integral Implies Vanishing Almost Everywhere, \(|f(x)|^p = 0\) for almost every \(x \in \Omega\). Since \(p \geq 1\), the map \(t \mapsto t^p\) is strictly increasing on \([0, \infty)\), so \(t^p = 0 \iff t = 0\); hence \(|f(x)| = 0\) a.e., i.e., \(f(x) = 0\) except on a set of measure zero. But this does not mean \(f\) is the zero function. It means only that \(f = 0\) almost everywhere (a.e.).

For a concrete example, recall the Dirichlet function from Lebesgue Integration: \[ f(x) = \chi_{\mathbb{Q}}(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \notin \mathbb{Q}. \end{cases} \] On \(\mathbb{R}\) with Lebesgue measure, \(\mu(\mathbb{Q}) = 0\), so \(\|f\|_p = 0\) for every \(1 \leq p < \infty\), yet \(f\) is not the zero function: it equals \(1\) at every rational point. The quantity \(\|\cdot\|_p\) therefore fails to distinguish \(f\) from the zero function. In the language of normed space theory, \(\|\cdot\|_p\) is a seminorm on the space of functions, not a norm: \(\|f\|_p = 0\) can hold even when \(f \neq 0\).

The Equivalence Relation

The resolution is a standard algebraic maneuver: we quotient out the ambiguity. We declare two measurable functions to be "the same" if they differ only on a negligible set.

Definition: Equality Almost Everywhere

Let \(f, g : \Omega \to \mathbb{F}\) (\(\mathbb{F} = \mathbb{R}\) or \(\mathbb{C}\)) be measurable functions. We say \(f\) and \(g\) are equal almost everywhere, written \(f = g\) a.e., if \[ \mu\bigl(\{x \in \Omega : f(x) \neq g(x)\}\bigr) = 0. \] The relation \(f \sim g \iff f = g\) a.e. is an equivalence relation on the set of measurable functions.

Verification:

We verify the three axioms of an equivalence relation. A preliminary remark: for measurable \(f, g : \Omega \to \mathbb{F}\), the difference \(f - g\) is measurable (componentwise in the complex case). Thus \(\{x : f(x) = g(x)\} = (f - g)^{-1}(\{0\})\); since \(f - g\) is measurable (i.e., the preimage of every Borel set lies in \(\mathcal{F}\)) and \(\{0\} \subset \mathbb{F}\) is closed hence Borel, this preimage is in \(\mathcal{F}\), and so is its complement \(\{x : f(x) \neq g(x)\}\).

Reflexivity: \(\{x : f(x) \neq f(x)\} = \emptyset\), which has measure zero.

Symmetry: The sets \(\{f \neq g\}\) and \(\{g \neq f\}\) are literally equal, so \(f \sim g \iff g \sim f\).

Transitivity: Suppose \(f = g\) a.e. and \(g = h\) a.e. Let \(N_1 = \{x : f(x) \neq g(x)\}\) and \(N_2 = \{x : g(x) \neq h(x)\}\); both are null by hypothesis and measurable by the preliminary remark (applied to each pair). The set \(\{x : f(x) \neq h(x)\}\) is likewise measurable by the remark (applied to \(f\) and \(h\)), and if \(f(x) \neq h(x)\), then either \(f(x) \neq g(x)\) or \(g(x) \neq h(x)\), so \(\{x : f(x) \neq h(x)\} \subseteq N_1 \cup N_2\). By monotonicity and subadditivity of \(\mu\), \(\mu(\{x : f(x) \neq h(x)\}) \leq \mu(N_1) + \mu(N_2) = 0\).
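The seminorm problem and its resolution can be illustrated on a finite toy measure space, where integrals become weighted sums. The sketch below is only an illustration under that assumption: the three-point space, the weights `mu`, and the helper names `p_norm` and `equal_ae` are hypothetical and do not come from the text above.

```python
def p_norm(f, mu, p):
    """(sum_x |f(x)|^p * mu({x}))^(1/p) on a finite measure space, 1 <= p < inf."""
    return sum(abs(fx) ** p * m for fx, m in zip(f, mu)) ** (1.0 / p)

def equal_ae(f, g, mu):
    """f ~ g iff the set of points where f and g differ has total mass 0."""
    return sum(m for fx, gx, m in zip(f, g, mu) if fx != gx) == 0

# Hypothetical three-point space; the middle point is a null set (mass 0).
mu = [1.0, 0.0, 2.0]
f = [0.0, 1.0, 0.0]      # nonzero only on the null point
zero = [0.0, 0.0, 0.0]

print(p_norm(f, mu, 2))        # the seminorm of f vanishes
print(f == zero)               # but f is not the zero function
print(equal_ae(f, zero, mu))   # f and 0 lie in the same equivalence class
```

Here `f` plays the role of the Dirichlet function: it is nonzero only on a null set, so the seminorm cannot see it, while the relation `equal_ae` identifies it with the zero function.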

The Formal Definition of \(L^p\)

Definition: The \(L^p\) Space (Rigorous)

Let \((\Omega, \mathcal{F}, \mu)\) be a measure space and \(1 \leq p < \infty\). Define \[ \mathscr{L}^p(\Omega, \mathcal{F}, \mu) \;=\; \bigl\{\, f : \Omega \to \mathbb{F} \;\big|\; f \text{ is measurable and } \int_\Omega |f|^p \, d\mu < \infty \,\bigr\}. \] The \(L^p\) space is the quotient \[ L^p(\Omega, \mathcal{F}, \mu) \;=\; \mathscr{L}^p(\Omega, \mathcal{F}, \mu) \,\big/\!\sim \] where \(f \sim g \iff f = g\) a.e. Each element of \(L^p\) is an equivalence class \([f]\) of functions that agree almost everywhere.

Following universal convention, we write \(f \in L^p\) rather than \([f] \in L^p\), understanding that "\(f\)" refers to the equivalence class and not to any particular representative. This notational abuse is harmless because all quantities we care about — the norm \(\|f\|_p\), integrals \(\int fg \, d\mu\), and convergence statements — are invariant under modification on sets of measure zero. When we write \(f = 0\) in \(L^p\), we mean \(f(x) = 0\) for \(\mu\)-almost every \(x\).

Definition: The \(L^\infty\) Space

Let \((\Omega, \mathcal{F}, \mu)\) be a measure space. A measurable function \(f : \Omega \to \mathbb{F}\) is essentially bounded if there exists a constant \(C \geq 0\) such that \(|f(x)| \leq C\) for a.e. \(x\). The essential supremum is \[ \|f\|_\infty \;=\; \inf\{C \geq 0 : |f(x)| \leq C \text{ for a.e. } x\}. \] The \(L^\infty\) space is the set of equivalence classes (under a.e.-equality) of essentially bounded measurable functions, equipped with the quantity \(\|\cdot\|_\infty\); we verify below that this is indeed a norm.

The infimum in the definition of \(\|f\|_\infty\) is not just an infimum — it is actually attained. This small fact is what makes \(\|\cdot\|_\infty\) well-behaved as a norm and is used repeatedly in the sequel.

Lemma: Essential Supremum Is Attained

If \(f \in L^\infty\), then \(|f(x)| \leq \|f\|_\infty\) for almost every \(x\). In particular, the infimum in the definition of \(\|f\|_\infty\) is achieved.

Proof:

Since \(f \in L^\infty\), the set of admissible bounds is non-empty; it is also bounded below by \(0\), so its infimum \(\|f\|_\infty\) lies in \([0, \infty)\). By the defining property of the infimum, for each \(n \in \mathbb{N}\) there exists an admissible \(C_n\) with \(\|f\|_\infty \leq C_n < \|f\|_\infty + 1/n\); for this \(C_n\), the set \(E_n = \{x : |f(x)| > C_n\}\) has measure zero. Let \(E = \bigcup_{n=1}^\infty E_n\). Since each \(E_n\) is null and \(\mu\) is \(\sigma\)-subadditive (a direct consequence of countable additivity, via disjointification), \(\mu(E) \leq \sum_{n=1}^\infty \mu(E_n) = 0\). For \(x \notin E\), we have \(|f(x)| \leq C_n < \|f\|_\infty + 1/n\) for every \(n\); letting \(n \to \infty\) gives \(|f(x)| \leq \|f\|_\infty\). This proves the first claim; the second is its immediate corollary, since \(C = \|f\|_\infty\) is itself an admissible bound.

With this lemma in hand, the three norm axioms for \(\|\cdot\|_\infty\) are immediate. Non-negativity is clear from the definition. Definiteness on equivalence classes: if \(\|f\|_\infty = 0\), then \(|f(x)| \leq 0\) a.e., so \(f = 0\) a.e. — i.e., \([f]\) is the zero class. Absolute homogeneity: for \(\alpha = 0\) both sides vanish; for \(\alpha \neq 0\), the lemma gives \(|f(x)| \leq \|f\|_\infty\) a.e., so \(|\alpha f(x)| \leq |\alpha| \|f\|_\infty\) a.e., showing \(\|\alpha f\|_\infty \leq |\alpha| \|f\|_\infty\); applying the same argument to \(\alpha^{-1}(\alpha f) = f\) yields the reverse inequality. Triangle inequality: since \(|f(x)| \leq \|f\|_\infty\) and \(|g(x)| \leq \|g\|_\infty\) a.e., we have \(|f(x) + g(x)| \leq \|f\|_\infty + \|g\|_\infty\) a.e., which forces \(\|f + g\|_\infty \leq \|f\|_\infty + \|g\|_\infty\). So \((L^\infty, \|\cdot\|_\infty)\) is a normed space; its completeness is established alongside the \(1 \leq p < \infty\) case — though by a genuinely simpler argument — when we prove the Riesz-Fischer theorem in the next chapter.
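On a finite weighted space the essential supremum reduces to a maximum over points of positive mass, which makes the norm axioms easy to check by hand. The following is a minimal sketch under that assumption; the function name `ess_sup` and the sample data are hypothetical.

```python
def ess_sup(f, mu):
    """Smallest C with |f(x)| <= C off a null set. On a finite space this is
    the max of |f| over points of positive mass (0.0 if there are none)."""
    return max((abs(fx) for fx, m in zip(f, mu) if m > 0), default=0.0)

# Hypothetical three-point space; the middle point has mass 0.
mu = [1.0, 0.0, 2.0]
f = [3.0, 100.0, -4.0]   # the value 100 sits on the null point and is ignored
g = [1.0, -7.0, 2.5]

f_plus_g = [fx + gx for fx, gx in zip(f, g)]
print(ess_sup(f, mu), max(abs(fx) for fx in f))            # essential sup vs. plain sup
print(ess_sup(f_plus_g, mu), ess_sup(f, mu) + ess_sup(g, mu))  # triangle inequality
```

The plain supremum sees the value on the null point; the essential supremum does not, which is exactly why it is well-defined on equivalence classes.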

Connection to Sequence Spaces

The sequence spaces \(\ell^p\) are a special case of \(L^p\). If we take \(\Omega = \mathbb{N}\), \(\mathcal{F} = 2^{\mathbb{N}}\) (all subsets), and \(\mu\) = counting measure (\(\mu(\{n\}) = 1\) for each \(n\)), then \(L^p(\mathbb{N}, \mu)\) is exactly \(\ell^p\). In this setting, the equivalence class issue is trivial — since every singleton \(\{n\}\) has positive measure, two sequences are equal a.e. if and only if they are identical. Every theorem we prove for \(L^p\) in this chapter therefore specializes to \(\ell^p\) automatically.
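As a small illustration of the counting-measure case, the integral \(\int |f|^p \, d\mu\) becomes a plain sum. The sketch below handles only finitely supported sequences (an assumption made so the sum is finite by construction), and the name `lp_norm` is hypothetical.

```python
def lp_norm(x, p):
    """l^p norm of a finitely supported sequence, 1 <= p < inf.
    Counting measure gives every index mass 1, so the integral is a sum."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [3.0, -4.0]
norm_2 = lp_norm(x, 2)   # sqrt(9 + 16)
norm_1 = lp_norm(x, 1)   # 3 + 4
print(norm_1, norm_2)
```

Because no index is a null set, any nonzero entry contributes to the norm, so the norm vanishes only for the zero sequence.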

With the quotient construction in hand, the seminorm \(\|\cdot\|_p\) on \(\mathscr{L}^p\) descends to a genuine norm on \(L^p\): if \(\|[f]\|_p = 0\), then \(f = 0\) a.e., which means \([f]\) is the zero element of the quotient space. The remaining norm axiom — the triangle inequality \(\|f + g\|_p \leq \|f\|_p + \|g\|_p\) — is the content of Minkowski's inequality, which requires Hölder's inequality as an intermediate step. We turn to these now.

Young's & Hölder's Inequality

The proof chain begins with an elementary inequality about real numbers, which we then "integrate" to obtain the central inequality of \(L^p\) theory.

Hölder Conjugates

Definition: Hölder Conjugate

For an exponent \(1 < p < \infty\), the Hölder conjugate \(q\) is the unique real number satisfying \[ \frac{1}{p} + \frac{1}{q} = 1, \qquad \text{equivalently} \quad q = \frac{p}{p - 1}. \] Since \(p > 1\), \(p - 1 > 0\) gives \(q > 0\); and \(p > p - 1\) gives \(q > 1\), while \(p < \infty\) gives \(q < \infty\). Thus \(1 < q < \infty\). The extreme cases are defined by convention so that the relation \(1/p + 1/q = 1\) remains formally valid with the extended-real convention \(1/\infty = 0\): the conjugate of \(p = 1\) is \(q = \infty\), and the conjugate of \(p = \infty\) is \(q = 1\). The relation is symmetric: the conjugate of \(q\) is \(p\).

The unique self-conjugate case is \(p = q = 2\) — the setting of Hilbert spaces and the Cauchy-Schwarz inequality. The algebraic identity \((p - 1)q = p\), which follows immediately from \(q = p/(p-1)\), will appear repeatedly in the proofs below.
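The conjugate relation, including the endpoint conventions, can be written as a small helper. This is a sketch only; the function name `conjugate` is an assumption, and `math.inf` stands in for the exponent \(\infty\).

```python
import math

def conjugate(p):
    """Hölder conjugate q with 1/p + 1/q = 1, using the convention 1/inf = 0."""
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= inf")
    if p == 1:
        return math.inf
    if math.isinf(p):
        return 1.0
    return p / (p - 1)

for p in (1, 1.5, 2, 3, math.inf):
    print(p, conjugate(p))
```

For finite \(p > 1\), the identity \((p - 1)q = p\) can be checked directly from the returned value, up to floating-point rounding.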

Young's Inequality

Theorem: Young's Inequality

Let \(1 < p < \infty\) and let \(q\) be its Hölder conjugate. For all \(a, b \geq 0\), \[ ab \;\leq\; \frac{a^p}{p} + \frac{b^q}{q}. \] Equality holds if and only if \(a^p = b^q\).

Proof:

The argument is a direct specialization of the weighted AM-GM inequality with weights \((1/p, 1/q)\). If \(a = 0\) or \(b = 0\), the left side is \(0\) and the right side is non-negative, so the inequality holds trivially (see the equality analysis below). Assume \(a, b > 0\).

The key observation is that the logarithm \(t \mapsto \log t\) is strictly concave on \((0, \infty)\): its second derivative \((\log t)'' = -1/t^2\) is strictly negative. By the definition of strict concavity, for any \(\lambda \in (0, 1)\) and \(u, v > 0\) with \(u \neq v\), \[ \log\bigl(\lambda u + (1 - \lambda) v\bigr) \;>\; \lambda \log u + (1 - \lambda) \log v, \] with equality when \(u = v\). By the logarithm rules \(\lambda \log u + (1 - \lambda) \log v = \log\bigl(u^\lambda v^{1-\lambda}\bigr)\). Since \(\log\) is strictly increasing, we obtain the weighted AM-GM inequality \[ u^\lambda \, v^{1 - \lambda} \;\leq\; \lambda u + (1 - \lambda) v, \] with equality if and only if \(u = v\).

Now set \(\lambda = 1/p\) (so that \(1 - \lambda = 1/q\) and \(\lambda \in (0, 1)\) since \(1 < p < \infty\)), \(u = a^p\), and \(v = b^q\): \[ (a^p)^{1/p} (b^q)^{1/q} \;\leq\; \frac{a^p}{p} + \frac{b^q}{q}. \] Simplifying the left side, \((a^p)^{1/p} (b^q)^{1/q} = a \cdot b\), which gives \(ab \leq \frac{a^p}{p} + \frac{b^q}{q}\).

Equality condition. For \(a, b > 0\): the weighted AM-GM step used strict concavity of \(\log\) with \(\lambda = 1/p \in (0, 1)\), so equality holds iff \(u = v\), i.e., \(a^p = b^q\). For the degenerate cases: if \(a = 0\) and \(b > 0\), the inequality reads \(0 \leq b^q/q\) which is strict (since \(b^q/q > 0\)), and correspondingly \(a^p = 0 \neq b^q\); symmetrically for \(b = 0, a > 0\). If \(a = b = 0\), both sides are zero and \(a^p = 0 = b^q\). Hence in every case, equality holds if and only if \(a^p = b^q\).
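A numerical spot check of Young's inequality and its equality case can be sketched as follows. It is not a proof, only a sanity check on random samples; the helper name `young_gap`, the sampling ranges, and the tolerance are assumptions chosen for illustration.

```python
import random

def young_gap(a, b, p):
    """Right side minus left side of Young's inequality: a^p/p + b^q/q - ab."""
    q = p / (p - 1)
    return a ** p / p + b ** q / q - a * b

random.seed(0)
gaps = [
    young_gap(random.uniform(0, 5), random.uniform(0, 5), random.uniform(1.1, 6))
    for _ in range(1000)
]
min_gap = min(gaps)   # should be non-negative up to rounding error

# Equality case a^p = b^q: choose b = a^(p/q) = a^(p-1).
p, a = 3.0, 2.0
b = a ** (p - 1)
equality_gap = young_gap(a, b, p)   # should vanish up to rounding error
print(min_gap, equality_gap)
```

The choice \(b = a^{p-1}\) uses the identity \((p - 1)q = p\), which gives \(b^q = a^{(p-1)q} = a^p\).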

Young's inequality is a pointwise statement about real numbers. Its power emerges when we "integrate both sides" — which is exactly what happens in the proof of Hölder's inequality.

Hölder's Inequality

Theorem: Hölder's Inequality

Let \(1 \leq p \leq \infty\) and let \(q\) be its Hölder conjugate. If \(f \in L^p\) and \(g \in L^q\), then the product \(fg\) is measurable, \(fg \in L^1\), and \[ \|fg\|_1 \;=\; \int_\Omega |f g| \, d\mu \;\leq\; \|f\|_p \, \|g\|_q. \]

Proof:

Measurability of \(fg\) is standard: the product of measurable functions is measurable (in the complex case, by considering real and imaginary parts). We establish the inequality below; since \(\|f\|_p, \|g\|_q < \infty\) by hypothesis, its right-hand side is finite, which gives the integrability conclusion \(fg \in L^1\).

Case \(p = 1, q = \infty\) (the case \(p = \infty, q = 1\) follows by interchanging the roles of \(f\) and \(g\)): If \(\|f\|_1 = 0\), then by Theorem: Zero Integral Implies Vanishing Almost Everywhere, \(|f| = 0\) a.e., so \(fg = 0\) a.e. and both sides are zero. If \(\|g\|_\infty = 0\), then by Lemma: Essential Supremum Is Attained, \(|g(x)| \leq 0\) a.e., so \(g = 0\) a.e. and again \(fg = 0\) a.e. Otherwise, by the same lemma, \(|g(x)| \leq \|g\|_\infty\) for a.e. \(x\). Using multiplicativity of the absolute value, \(|f(x) g(x)| = |f(x)| \cdot |g(x)| \leq |f(x)| \cdot \|g\|_\infty\) for a.e. \(x\). Integrating both sides over \(\Omega\) (using monotonicity of the integral for a.e. inequality, and linearity to pull out the constant \(\|g\|_\infty\)), \[ \int_\Omega |fg| \, d\mu \;\leq\; \int_\Omega |f| \cdot \|g\|_\infty \, d\mu \;=\; \|g\|_\infty \int_\Omega |f| \, d\mu \;=\; \|f\|_1 \|g\|_\infty. \]

Case \(1 < p < \infty\): If \(\|f\|_p = 0\) or \(\|g\|_q = 0\), then \(f = 0\) a.e. or \(g = 0\) a.e., so \(fg = 0\) a.e. and both sides are zero. Assume \(\|f\|_p > 0\) and \(\|g\|_q > 0\).

Normalization. The strategy is to rescale \(f\) and \(g\) to have unit norm, reducing the integral inequality to a pointwise application of Young's inequality. Define \[ \tilde{f} = \frac{|f|}{\|f\|_p}, \qquad \tilde{g} = \frac{|g|}{\|g\|_q}. \] By direct computation, \(\|\tilde f\|_p^p = \int |\tilde f|^p \, d\mu = \int |f|^p / \|f\|_p^p \, d\mu = \|f\|_p^p / \|f\|_p^p = 1\), so \(\|\tilde f\|_p = 1\); similarly \(\|\tilde g\|_q = 1\). Since \(f, g\) take values in \(\mathbb{F}\), we have \(\tilde{f}(x), \tilde{g}(x) \in [0, \infty)\) everywhere, and Young's inequality applies pointwise: \[ \tilde{f}(x) \, \tilde{g}(x) \;\leq\; \frac{\tilde{f}(x)^p}{p} + \frac{\tilde{g}(x)^q}{q} \quad \text{for every } x. \]

Integrating this pointwise inequality over \(\Omega\) (by monotonicity of the integral) and using linearity, \[ \int_\Omega \tilde{f} \, \tilde{g} \, d\mu \;\leq\; \frac{1}{p} \int_\Omega \tilde{f}^p \, d\mu + \frac{1}{q} \int_\Omega \tilde{g}^q \, d\mu \;=\; \frac{1}{p} + \frac{1}{q} \;=\; 1. \]

Substituting back \(\tilde{f} = |f|/\|f\|_p\) and \(\tilde{g} = |g|/\|g\|_q\): \[ \frac{1}{\|f\|_p \, \|g\|_q} \int_\Omega |f g| \, d\mu \;\leq\; 1, \] which gives \(\int |fg| \, d\mu \leq \|f\|_p \, \|g\|_q\) as claimed.
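The inequality just proved can be spot-checked on a finite weighted measure space, where both sides are weighted sums. This is a sketch under that assumption, restricted to \(1 < p < \infty\) and real-valued data; the names `p_norm` and `holder_sides`, the random data, and the tolerance are hypothetical.

```python
import random

def p_norm(f, mu, p):
    """(sum_x |f(x)|^p * mu({x}))^(1/p) on a finite measure space."""
    return sum(abs(fx) ** p * m for fx, m in zip(f, mu)) ** (1.0 / p)

def holder_sides(f, g, mu, p):
    """Return (integral of |fg|, ||f||_p * ||g||_q) for the conjugate pair (p, q)."""
    q = p / (p - 1)
    lhs = sum(abs(fx * gx) * m for fx, gx, m in zip(f, g, mu))
    rhs = p_norm(f, mu, p) * p_norm(g, mu, q)
    return lhs, rhs

random.seed(1)
n = 50
mu = [random.uniform(0.0, 2.0) for _ in range(n)]   # arbitrary non-negative weights
f = [random.gauss(0, 1) for _ in range(n)]
g = [random.gauss(0, 1) for _ in range(n)]

results = {p: holder_sides(f, g, mu, p) for p in (1.5, 2.0, 4.0)}
for p, (lhs, rhs) in results.items():
    print(p, lhs, rhs)
```

For \(p = 2\) the right-hand side is the Cauchy-Schwarz bound; for the other exponents the bound changes but the left-hand side does not.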

Equality Conditions and Special Cases

Tracing through the proof reveals when Hölder's inequality is sharp.

Equality Condition (\(1 < p < \infty\)):

Assume first that \(\|f\|_p > 0\) and \(\|g\|_q > 0\) (the degenerate cases \(f = 0\) or \(g = 0\) a.e. give equality trivially, with both sides zero). Using the normalized \(\tilde f = |f|/\|f\|_p\) and \(\tilde g = |g|/\|g\|_q\) from the main proof, define the pointwise gap \[ \Delta(x) \;=\; \frac{\tilde f(x)^p}{p} + \frac{\tilde g(x)^q}{q} - \tilde f(x)\tilde g(x). \] Young's inequality, applied pointwise as in the main proof, gives \(\Delta(x) \geq 0\) for every \(x\). Moreover, \(\Delta\) is measurable and integrable, being a linear combination of the integrable functions \(\tilde f^p\), \(\tilde g^q\), and \(\tilde f \tilde g\) (the last by Hölder's inequality). Written as an identity, the integration step of the main proof reads \(\int \tilde f \, \tilde g \, d\mu = \tfrac{1}{p} + \tfrac{1}{q} - \int \Delta \, d\mu = 1 - \int \Delta \, d\mu\); equality in Hölder's inequality, i.e. \(\int \tilde f \tilde g \, d\mu = 1\), is therefore equivalent to \(\int \Delta \, d\mu = 0\). Since \(\Delta \geq 0\) pointwise, Theorem: Zero Integral Implies Vanishing Almost Everywhere gives \(\Delta = 0\) a.e. By the equality case of Young's inequality, \(\Delta(x) = 0\) forces \(\tilde f(x)^p = \tilde g(x)^q\) for a.e. \(x\). Unwinding the normalization, \[ \frac{|f(x)|^p}{\|f\|_p^p} \;=\; \frac{|g(x)|^q}{\|g\|_q^q} \quad \text{for a.e. } x, \] so \(|f|^p\) and \(|g|^q\) are proportional a.e.: with the constant \(c = \|f\|_p^p / \|g\|_q^q > 0\) we have \(|f|^p = c \, |g|^q\) a.e. Conversely, suppose \(|f|^p = c |g|^q\) a.e. for some \(c > 0\). Integrating both sides gives \(\|f\|_p^p = c \, \|g\|_q^q\), so \(c\) is forced to equal \(\|f\|_p^p / \|g\|_q^q\); then \(\tilde f^p = \tilde g^q\) a.e., so \(\Delta = 0\) a.e. and \(\int \Delta \, d\mu = 0\). The integration identity then gives \(\int \tilde f \, \tilde g \, d\mu = 1 - 0 = 1\); unwinding the normalization gives \(\int |fg| \, d\mu = \|f\|_p \, \|g\|_q\), i.e., equality in Hölder's inequality.
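The equality condition can be checked numerically by constructing \(f\) from \(g\) so that \(|f|^p = c\,|g|^q\) holds at every point. The sketch below assumes a finite weighted measure space and real data; the names `p_norm`, `lhs`, `rhs` and the particular values of \(p\) and \(c\) are hypothetical.

```python
import math
import random

def p_norm(f, mu, p):
    """(sum_x |f(x)|^p * mu({x}))^(1/p) on a finite measure space."""
    return sum(abs(fx) ** p * m for fx, m in zip(f, mu)) ** (1.0 / p)

random.seed(4)
n = 30
mu = [random.uniform(0.1, 1.0) for _ in range(n)]
g = [random.gauss(0, 1) for _ in range(n)]

p, c = 3.0, 2.0
q = p / (p - 1)
# Define f >= 0 pointwise by |f|^p = c * |g|^q.
f = [(c * abs(gx) ** q) ** (1.0 / p) for gx in g]

lhs = sum(abs(fx * gx) * m for fx, gx, m in zip(f, g, mu))   # integral of |fg|
rhs = p_norm(f, mu, p) * p_norm(g, mu, q)                    # ||f||_p * ||g||_q
print(lhs, rhs)
```

Up to rounding error the two printed numbers agree, in contrast with generic random data, where the inequality is typically strict.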

When \(p = q = 2\), Hölder's inequality reduces to: \[ \int_\Omega |fg| \, d\mu \;\leq\; \|f\|_2 \, \|g\|_2. \] For the standard \(L^2\) inner product \(\langle f, g \rangle = \int f \, \overline{g} \, d\mu\), the triangle inequality for integrals gives \(|\langle f, g \rangle| \leq \int |fg| \, d\mu\), so Hölder's inequality immediately implies the Cauchy-Schwarz inequality \(|\langle f, g \rangle| \leq \|f\|_2 \|g\|_2\); the finite-dimensional bound \(|\mathbf{u} \cdot \mathbf{v}| \leq \|\mathbf{u}\| \, \|\mathbf{v}\|\) on Hilbert spaces is the special case of counting measure on a finite set. Hölder's inequality thus extends Cauchy-Schwarz — with \(p = q = 2\) as its self-conjugate, most symmetric case — to the full family of conjugate exponent pairs.

One Direction of \(L^p\) Duality

Hölder's inequality immediately settles "one half" of the duality claim from Dual Spaces. For any fixed \(g \in L^q\), the map \[ \varphi_g : L^p \to \mathbb{F}, \qquad \varphi_g(f) = \int_\Omega f \, g \, d\mu \] is a bounded linear functional on \(L^p\). (That \(L^p\) is indeed a vector space — i.e., \(\alpha f_1 + \beta f_2 \in L^p\) for \(f_1, f_2 \in L^p\) and \(\alpha, \beta \in \mathbb{F}\) — will be established in the next section via Minkowski's inequality; we state the duality result here because Hölder's inequality is its main analytic ingredient.) It is well-defined on equivalence classes: if \(f \sim f'\) (i.e., \(f = f'\) a.e.), then \(fg = f'g\) a.e., so \(\int fg \, d\mu = \int f'g \, d\mu\). Linearity in \(f\) follows from linearity of the integral: \(\varphi_g(\alpha f_1 + \beta f_2) = \alpha \varphi_g(f_1) + \beta \varphi_g(f_2)\). Boundedness is Hölder: \(\|\varphi_g\|_{(L^p)^*} = \sup_{\|f\|_p = 1} |\varphi_g(f)| \leq \|g\|_q\).

This bound is in fact sharp. We construct, for each case, a normalized \(f\) witnessing (or approaching) the equality \(|\varphi_g(f)| = \|g\|_q\).

Case \(1 < p < \infty\): for \(g \neq 0\), define the "equalizer" \[ f_0(x) \;=\; \|g\|_q^{\,1-q} \,|g(x)|^{q-1} \, \overline{\operatorname{sgn} g(x)} \] where \(\operatorname{sgn} z = z/|z|\) for \(z \neq 0\) (the complex "phase"; \(\pm 1\) or \(0\) in the real case) and \(\operatorname{sgn} 0 = 0\). Each factor is measurable: \(\|g\|_q^{1-q}\) is a constant; \(|g|^{q-1}\) is the composition of the measurable function \(|g|\) with the Borel map \(t \mapsto t^{q-1}\) on \([0, \infty)\); and \(\operatorname{sgn} g\) is the composition of \(g\) with the Borel map \(z \mapsto \operatorname{sgn} z\) on \(\mathbb{F}\) (continuous on \(\{z \neq 0\}\), extended by \(\operatorname{sgn} 0 = 0\)). Hence \(f_0\) is measurable. Using the algebraic identity \(p(q - 1) = q\) (equivalently, \(p(1 - q) = -q\)), \[ |f_0|^p \;=\; \|g\|_q^{p(1-q)} \, |g|^{p(q-1)} \;=\; \|g\|_q^{-q} \, |g|^q, \] so \(\int |f_0|^p \, d\mu = \|g\|_q^{-q} \int |g|^q \, d\mu = \|g\|_q^{-q} \cdot \|g\|_q^q = 1\); in particular \(f_0 \in L^p\) with \(\|f_0\|_p = 1\). And \(f_0(x) g(x) = \|g\|_q^{1-q} |g(x)|^q \geq 0\), so \(\varphi_g(f_0) = \|g\|_q^{1-q} \int |g|^q \, d\mu = \|g\|_q^{1-q} \cdot \|g\|_q^q = \|g\|_q\).

Case \(p = \infty, q = 1\): for \(g \neq 0\), take \(f_0 = \overline{\operatorname{sgn} g}\). Then \(|f_0| \leq 1\) pointwise with \(|f_0| = 1\) on \(\{g \neq 0\}\), so \(\|f_0\|_\infty = 1\); and \(f_0 g = |g|\), so \(\varphi_g(f_0) = \int |g| \, d\mu = \|g\|_1\).

Case \(p = 1, q = \infty\) (assuming \(\mu\) is \(\sigma\)-finite): the bound \(\|g\|_\infty\) need not be attained by a single \(f\), but it is approached. For any \(\varepsilon > 0\), the set \(E_\varepsilon = \{x : |g(x)| > \|g\|_\infty - \varepsilon\}\) has positive measure by the definition of \(\|g\|_\infty\) as an infimum; by \(\sigma\)-finiteness, choose \(F_\varepsilon \subseteq E_\varepsilon\) with \(0 < \mu(F_\varepsilon) < \infty\). Set \(f_\varepsilon = \mu(F_\varepsilon)^{-1} \chi_{F_\varepsilon} \, \overline{\operatorname{sgn} g}\). Then \(\|f_\varepsilon\|_1 = 1\) and \[ \varphi_g(f_\varepsilon) \;=\; \mu(F_\varepsilon)^{-1} \int_{F_\varepsilon} |g| \, d\mu \;\geq\; \|g\|_\infty - \varepsilon. \] Letting \(\varepsilon \to 0^+\), \(\sup_{\|f\|_1 = 1} |\varphi_g(f)| \geq \|g\|_\infty\). (Without \(\sigma\)-finiteness, the equality \(\|\varphi_g\|_{(L^1)^*} = \|g\|_\infty\) can fail, and the full duality \((L^1)^* \cong L^\infty\) breaks down; the exact failure is treated in Conway's functional analysis text, among other standard references.)

Combining the three cases with the upper bound from Hölder's inequality, \(\|\varphi_g\|_{(L^p)^*} = \|g\|_q\) for all \(1 \leq p \leq \infty\) (with \(\sigma\)-finiteness for \(p = 1\)), so the embedding \(L^q \hookrightarrow (L^p)^*\) given by \(g \mapsto \varphi_g\) is isometric. Every element of \(L^q\) gives rise to a continuous functional on \(L^p\), with no loss of norm.
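For the case \(1 < p < \infty\), the equalizer construction can be verified numerically, including the role of the complex conjugate of the phase. The sketch assumes a finite weighted measure space and complex-valued data; the names `p_norm` and `equalizer` and the random sample are hypothetical.

```python
import random

def p_norm(f, mu, p):
    """(sum_x |f(x)|^p * mu({x}))^(1/p) on a finite measure space."""
    return sum(abs(fx) ** p * m for fx, m in zip(f, mu)) ** (1.0 / p)

def equalizer(g, mu, q):
    """f_0 = ||g||_q^(1-q) * |g|^(q-1) * conj(sgn g), with sgn 0 = 0."""
    norm_g = p_norm(g, mu, q)

    def sgn(z):
        return z / abs(z) if z != 0 else 0.0

    return [
        norm_g ** (1 - q) * abs(gx) ** (q - 1) * complex(sgn(gx)).conjugate()
        for gx in g
    ]

random.seed(3)
n = 20
mu = [random.uniform(0.1, 1.0) for _ in range(n)]
g = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

p = 3.0
q = p / (p - 1)
f0 = equalizer(g, mu, q)

norm_f0 = p_norm(f0, mu, p)                                   # expected to be 1
pairing = sum(fx * gx * m for fx, gx, m in zip(f0, g, mu))    # phi_g(f_0)
norm_g = p_norm(g, mu, q)
print(norm_f0, pairing, norm_g)
```

The conjugated phase cancels the phase of \(g\) pointwise, so the pairing is real and non-negative up to rounding and matches \(\|g\|_q\).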

The converse — that every continuous functional on \(L^p\) arises from some \(g \in L^q\) (for \(1 \leq p < \infty\), with \(\sigma\)-finiteness assumed when \(p = 1\)) — is substantially harder and relies on the Radon-Nikodym theorem from measure theory, whose treatment lies beyond the scope of this chapter; it is developed in Durrett's probability text and Conway's functional analysis text, among other standard references.

Minkowski's Inequality & the \(L^p\) Norm

We now use Hölder's inequality to prove the triangle inequality for \(\|\cdot\|_p\), completing the verification that \(L^p\) is a normed space.

Theorem: Minkowski's Inequality

Let \(1 \leq p \leq \infty\). If \(f, g \in L^p\), then \(f + g \in L^p\) and \[ \|f + g\|_p \;\leq\; \|f\|_p + \|g\|_p. \]

Proof:

Throughout, \(f + g\) is measurable as the sum of measurable functions (componentwise in the complex case), so \(|f + g|\) and its powers are measurable as well. The proof splits into three cases: the endpoints \(p = 1\) and \(p = \infty\) follow from the pointwise triangle inequality and standard properties of integration, while the main case \(1 < p < \infty\) uses Hölder's inequality.

Case \(p = 1\): By the pointwise triangle inequality, \(|f(x) + g(x)| \leq |f(x)| + |g(x)|\) for every \(x\). Integrating this pointwise inequality over \(\Omega\) (using monotonicity and linearity of the integral, with \(f + g\) measurable as established above), \[ \int_\Omega |f + g| \, d\mu \;\leq\; \int_\Omega |f| \, d\mu + \int_\Omega |g| \, d\mu \;=\; \|f\|_1 + \|g\|_1 \;<\; \infty. \] Hence \(f + g \in L^1\), and by the definition of the \(L^1\) norm, \(\|f+g\|_1 = \int |f+g| \, d\mu \leq \|f\|_1 + \|g\|_1\).

Case \(p = \infty\): By Lemma: Essential Supremum Is Attained, \(|f(x)| \leq \|f\|_\infty\) and \(|g(x)| \leq \|g\|_\infty\) for a.e. \(x\), whence \(|f(x) + g(x)| \leq \|f\|_\infty + \|g\|_\infty < \infty\) for a.e. \(x\). This shows \(f + g\) is essentially bounded, i.e., \(f + g \in L^\infty\). The constant \(\|f\|_\infty + \|g\|_\infty\) is an admissible a.e.-bound for \(|f + g|\), so by the definition of \(\|f + g\|_\infty\) as the infimum of such bounds, \(\|f + g\|_\infty \leq \|f\|_\infty + \|g\|_\infty\).

Case \(1 < p < \infty\): First, observe that \(f + g \in L^p\): pointwise, \(|f(x) + g(x)| \leq |f(x)| + |g(x)| \leq 2\max(|f(x)|, |g(x)|)\), and since \(t \mapsto t^p\) is monotonically increasing on \([0, \infty)\) for \(p > 0\), \[ |f + g|^p \;\leq\; \bigl(2 \max(|f|, |g|)\bigr)^p \;=\; 2^p \max(|f|, |g|)^p \;=\; 2^p \max(|f|^p, |g|^p) \;\leq\; 2^p (|f|^p + |g|^p). \] Integrating, \(\int |f+g|^p \, d\mu \leq 2^p(\|f\|_p^p + \|g\|_p^p) < \infty\), so \(f + g \in L^p\) and \(\|f+g\|_p = \bigl(\int |f+g|^p \, d\mu\bigr)^{1/p}\) is finite.

Now assume \(\|f + g\|_p > 0\) (otherwise the inequality is trivial). We begin by splitting \(|f + g|^p\): since \(|f+g|^{p-1} \geq 0\), multiplying the pointwise triangle inequality \(|f+g| \leq |f| + |g|\) by \(|f+g|^{p-1}\) preserves its direction, giving \[ |f + g|^p \;=\; |f + g|^{p-1} \cdot |f + g| \;\leq\; |f + g|^{p-1} |f| \;+\; |f + g|^{p-1} |g|. \]

We now apply Hölder's inequality to each term on the right. Each of \(|f+g|^{p-1}\), \(|f|\), \(|g|\) is non-negative and measurable: \(|f+g|^{p-1}\) is the composition of the measurable function \(|f+g|\) with the Borel map \(t \mapsto t^{p-1}\) on \([0, \infty)\), and similarly for \(|f|, |g|\). We verify the relevant \(L^p\) memberships: \(|f| \in L^p\) directly (since \(\int |f|^p \, d\mu = \|f\|_p^p < \infty\)), similarly \(|g| \in L^p\), and for \(|f+g|^{p-1} \in L^q\), \[ \int_\Omega \bigl(|f + g|^{p-1}\bigr)^q \, d\mu \;=\; \int_\Omega |f + g|^{(p-1)q} \, d\mu \;=\; \int_\Omega |f + g|^{p} \, d\mu \;=\; \|f + g\|_p^p, \] where we used the identity \((p - 1)q = p\). Therefore \(\bigl\| |f+g|^{p-1} \bigr\|_q = \|f+g\|_p^{p/q}\).

Applying Hölder's inequality with non-negative factors (so \(|(\cdot)(\cdot)| = (\cdot)(\cdot)\)): \[ \int |f + g|^{p-1} |f| \, d\mu \;\leq\; \bigl\||f+g|^{p-1}\bigr\|_q \cdot \|f\|_p \;=\; \|f+g\|_p^{p/q} \cdot \|f\|_p, \] and analogously \(\int |f+g|^{p-1} |g| \, d\mu \leq \|f+g\|_p^{p/q} \cdot \|g\|_p\). Adding these two bounds and using linearity of the integral to combine the left-hand sides, \[ \|f + g\|_p^p \;=\; \int |f+g|^p \, d\mu \;\leq\; \int |f+g|^{p-1} |f| \, d\mu + \int |f+g|^{p-1} |g| \, d\mu \;\leq\; \|f+g\|_p^{p/q} \bigl(\|f\|_p + \|g\|_p\bigr). \]

Since \(\|f+g\|_p^{p/q}\) is strictly positive (by assumption) and finite (established above), we may divide both sides by it: \[ \|f + g\|_p^{p - p/q} \;\leq\; \|f\|_p + \|g\|_p. \] Since \(p - p/q = p(1 - 1/q) = p \cdot (1/p) = 1\), the left side is simply \(\|f + g\|_p\), completing the proof.
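Minkowski's inequality can be spot-checked on a finite weighted measure space for several finite exponents. This is a sanity check on one random sample, not a proof; the helper name `p_norm`, the data, and the tolerance are assumptions.

```python
import random

def p_norm(f, mu, p):
    """(sum_x |f(x)|^p * mu({x}))^(1/p) on a finite measure space."""
    return sum(abs(fx) ** p * m for fx, m in zip(f, mu)) ** (1.0 / p)

random.seed(2)
n = 40
mu = [random.uniform(0.0, 2.0) for _ in range(n)]
f = [random.gauss(0, 1) for _ in range(n)]
g = [random.gauss(0, 1) for _ in range(n)]
f_plus_g = [a + b for a, b in zip(f, g)]

violations = []   # exponents where the triangle inequality fails beyond rounding
for p in (1.0, 1.5, 2.0, 5.0):
    lhs = p_norm(f_plus_g, mu, p)
    rhs = p_norm(f, mu, p) + p_norm(g, mu, p)
    print(p, lhs, rhs)
    if lhs > rhs + 1e-12:
        violations.append(p)
```

The list `violations` is expected to stay empty for every exponent \(p \geq 1\) in the loop.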

Equality Condition for Minkowski (\(1 < p < \infty\))

Equality \(\|f + g\|_p = \|f\|_p + \|g\|_p\) holds if and only if \(f\) and \(g\) are non-negatively proportional a.e. — that is, either \(f = 0\) a.e., or \(g = 0\) a.e., or there exists a constant \(c > 0\) such that \(f = c \, g\) a.e. (equivalently, \(g = c^{-1} f\) a.e.).

Proof:

Excluding the trivial cases where \(f = 0\) or \(g = 0\) a.e., assume \(\|f\|_p, \|g\|_p > 0\). Then by the just-proved Minkowski inequality, \(\|f + g\|_p \leq \|f\|_p + \|g\|_p\). Moreover, under our equality hypothesis \(\|f + g\|_p = \|f\|_p + \|g\|_p\), we have \(\|f + g\|_p > 0\) (since \(\|f\|_p, \|g\|_p > 0\) by assumption). Tracing the main proof, equality in Minkowski requires equality in both ingredients used:

  1. The pointwise triangle inequality \(|f + g| \leq |f| + |g|\) must hold with equality a.e., i.e., \(|f(x) + g(x)| = |f(x)| + |g(x)|\) for a.e. \(x\). (Strictly, the main proof forces this only on \(\{f + g \neq 0\}\), where the factor \(|f+g|^{p-1}\) is positive; on the complementary set, item 2 below gives \(f = g = 0\) a.e., so equality holds there as well.) Equality in this triangle inequality occurs iff \(f(x), g(x)\) lie on a common ray from the origin. Explicitly: for \(\mathbb{F} = \mathbb{R}\), either one of \(f(x), g(x)\) is zero or they share the same sign; for \(\mathbb{F} = \mathbb{C}\), either one is zero or they have the same complex argument (phase).
  2. Each of the two Hölder applications must hold with equality. By the equality condition for Hölder's inequality, applied with exponents \((p, q)\) to the factors \(|f+g|^{p-1}\) (in \(L^q\)) and \(|f|\) (in \(L^p\)), there exists \(\alpha > 0\) with \(|f|^p = \alpha \cdot \bigl(|f+g|^{p-1}\bigr)^q = \alpha |f+g|^{(p-1)q} = \alpha |f+g|^p\) a.e., using \((p-1)q = p\). Similarly, from the second Hölder application there exists \(\beta > 0\) with \(|g|^p = \beta |f+g|^p\) a.e. On the set \(\{|f+g| = 0\}\), both relations give \(|f|^p = 0 = |g|^p\) a.e., so \(f = g = 0\) a.e. on this set; the desired proportionality \(|f|^p = (\alpha/\beta) |g|^p\) then holds trivially there. On \(\{|f+g| > 0\}\), we have \(|g|^p = \beta |f+g|^p > 0\), so the ratio \(|f|^p / |g|^p = \alpha/\beta\) is well-defined. Setting \(k = \alpha/\beta > 0\), we conclude \(|f|^p = k \, |g|^p\) a.e., i.e., \(|f| = k^{1/p} |g|\) a.e.

Combining item 1 (common phase a.e.) with item 2 (\(|f| = c \, |g|\) a.e., where \(c = k^{1/p} > 0\)), we obtain \(f = c \, g\) a.e. Indeed, working modulo a null set: at points where \(g(x) = 0\), item 2 gives \(|f(x)| = 0\), so \(f(x) = 0 = c \cdot g(x)\); at points where \(g(x) \neq 0\), item 1 forces \(f(x)\) to have the same phase as \(g(x)\), and item 2 fixes \(|f(x)| = c \, |g(x)|\), so \(f(x) = c \, g(x)\).

Conversely, if \(f = c \, g\) a.e. with \(c > 0\), then \(f + g = (c + 1) g\) a.e., so by absolute homogeneity \(\|f + g\|_p = (c + 1) \|g\|_p\), and \(\|f\|_p + \|g\|_p = c \|g\|_p + \|g\|_p = (c + 1) \|g\|_p = \|f + g\|_p\).
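The equality condition can be seen on a small example: a positive multiple of \(g\) gives equality, while a function with the same modulus but a different sign at one point gives a strict inequality. The sketch assumes a three-point weighted measure space; all names and values are hypothetical.

```python
import math

def p_norm(f, mu, p):
    """(sum_x |f(x)|^p * mu({x}))^(1/p) on a finite measure space."""
    return sum(abs(fx) ** p * m for fx, m in zip(f, mu)) ** (1.0 / p)

mu = [0.5, 1.0, 2.0]
g = [1.0, -2.0, 3.0]
p = 3.0

# Proportional case: f = c * g with c > 0.
c = 2.5
f = [c * gx for gx in g]
lhs_prop = p_norm([a + b for a, b in zip(f, g)], mu, p)
rhs_prop = p_norm(f, mu, p) + p_norm(g, mu, p)

# Non-proportional case: h has the same modulus as g but opposite sign at one point.
h = [1.0, 2.0, 3.0]
lhs_h = p_norm([a + b for a, b in zip(h, g)], mu, p)
rhs_h = p_norm(h, mu, p) + p_norm(g, mu, p)

print(lhs_prop, rhs_prop)   # equal up to rounding
print(lhs_h, rhs_h)         # strict inequality
```

In the second case \(|h| = |g|\) everywhere, so the modulus condition holds with \(c = 1\); it is the phase condition that fails at the middle point, and that alone makes the inequality strict.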

Geometrically, Minkowski's inequality is strict whenever \(f\) and \(g\) point in genuinely different "directions" in \(L^p\) — a manifestation of the strict convexity of the \(L^p\) norm for \(1 < p < \infty\). (At the endpoints \(p = 1\) and \(p = \infty\), strict convexity fails and equality can occur in many more configurations.)

\(L^p\) Is a Normed Space

Before listing the norm axioms, we record that \(L^p\) is a vector space over \(\mathbb{F}\): for \(f, g \in L^p\) and \(\alpha, \beta \in \mathbb{F}\), we have \(\alpha f + \beta g \in L^p\). Closure under addition follows from Minkowski's inequality, case by case (\(p = 1\), \(p = \infty\), and \(1 < p < \infty\) were each established as the first step of the corresponding case in the proof above); closure under scalar multiplication is immediate from \(\int |\alpha f|^p \, d\mu = |\alpha|^p \int |f|^p \, d\mu < \infty\) (and analogously \(\|\alpha f\|_\infty = |\alpha| \|f\|_\infty < \infty\)). The quotient construction of \(L^p\) as \(\mathscr{L}^p / \sim\) preserves these operations, since modifications on null sets are compatible with pointwise addition and scalar multiplication.

We now summarize the complete verification of the norm axioms for \(\|\cdot\|_p\) on \(L^p(\Omega, \mathcal{F}, \mu)\) in the case \(1 \leq p < \infty\); the parallel verification for the \(p = \infty\) case was completed immediately after Lemma: Essential Supremum Is Attained.

  1. Positive definiteness: \(\|f\|_p \geq 0\), and \(\|f\|_p = 0 \iff f = 0\) in \(L^p\) (i.e., \(f = 0\) a.e.). This is where the equivalence class construction is essential — without it, \(\|\cdot\|_p\) would only be a seminorm.
  2. Absolute homogeneity: \(\|\alpha f\|_p = |\alpha| \cdot \|f\|_p\) for all \(\alpha \in \mathbb{F}\). This follows immediately from \(\int |\alpha f|^p \, d\mu = |\alpha|^p \int |f|^p \, d\mu\).
  3. Triangle inequality: \(\|f + g\|_p \leq \|f\|_p + \|g\|_p\). This is Minkowski's inequality, proven above.

Therefore \(\bigl(L^p(\Omega, \mathcal{F}, \mu),\, \|\cdot\|_p\bigr)\) is a normed vector space for every \(1 \leq p \leq \infty\). This completes our construction of \(L^p\) as a normed structure.

The deepest question, however, remains open: is this normed space complete? Does every Cauchy sequence in \(L^p\) converge to a limit that is itself in \(L^p\)? An affirmative answer, the Riesz-Fischer theorem, elevates \(L^p\) from a normed space to a Banach space and provides the convergence theory that makes \(L^p\) the natural setting for probability, signal processing, and quantum mechanics. We develop this in the next chapter, \(L^p\) Completeness & Convergence.