Geometric Hahn-Banach: Separation of Convex Sets

Continuity, Hyperplanes, and the Gauge

The extension form of the Hahn-Banach theorem produced functionals. Given a bounded functional on a subspace, it manufactured a norm-preserving extension to the whole space. That is an analytic statement about domination by a sublinear bound. There is a second face of the same theorem, entirely geometric, which speaks not of extending functionals but of separating sets.

A nonzero continuous linear functional \(f\) cuts a real topological vector space into two open pieces, \(\{f \lt \alpha\}\) and \(\{f \gt \alpha\}\), divided by the hyperplane \(\{f = \alpha\}\). To say that two convex sets can be separated is to say that some such functional places them on opposite sides of a hyperplane. The whole development rests on converting geometry into analysis and then feeding the result to the extension theorem. The geometry is an open convex set, the analysis a sublinear functional.

We work throughout over a scalar field \(\mathbb{F}\), either \(\mathbb{R}\) or \(\mathbb{C}\), and specialize to \(\mathbb{R}\) where the geometry demands it. Two preliminary results carry the entire section: a criterion for when a linear functional on a topological vector space is continuous, and the construction that extracts a sublinear functional from an open convex set. Both adapt their normed-space ancestors with only the topology of the space available.

Continuity of Linear Functionals

On a normed space, a linear functional is continuous exactly when it is bounded. Without a norm the criterion must be phrased topologically, and the list of equivalent conditions becomes longer and sharper. Continuity is detected at a single point, or by the closedness of the kernel alone.

Theorem: Continuity of a Linear Functional

Let \(\mathcal{X}\) be a topological vector space over \(\mathbb{F}\) and let \(f : \mathcal{X} \to \mathbb{F}\) be a nonzero linear functional. The following are equivalent.

(a) \(f\) is continuous.
(b) \(f\) is continuous at \(0\).
(c) \(f\) is continuous at some point.
(d) \(\ker f\) is closed.
(e) \(x \mapsto |f(x)|\) is a continuous seminorm.

Proof

(a) \(\Rightarrow\) (b) \(\Rightarrow\) (c).
Continuity everywhere implies continuity at \(0\), which is one particular point. Both implications are immediate.

(c) \(\Rightarrow\) (a).
Suppose \(f\) is continuous at a point \(x_0\). Given any \(x_1\) and \(\varepsilon \gt 0\), continuity at \(x_0\) furnishes an open neighborhood \(W\) of \(0\) with \(|f(x_0 + w) - f(x_0)| \lt \varepsilon\) for all \(w \in W\). By linearity this says \(|f(w)| \lt \varepsilon\) for \(w \in W\). Since translation \(x \mapsto x + (x_1 - x_0)\) is a homeomorphism, \(x_1 + W\) is a neighborhood of \(x_1\), and for \(x = x_1 + w\) in it, \[ |f(x) - f(x_1)| = |f(w)| \lt \varepsilon . \] Thus \(f\) is continuous at the arbitrary point \(x_1\), hence continuous.

(a) \(\Rightarrow\) (d).
If \(f\) is continuous then \(\ker f = f^{-1}(\{0\})\) is the preimage of the closed set \(\{0\} \subseteq \mathbb{F}\), hence closed.

(d) \(\Rightarrow\) (b).
Assume \(\ker f\) is closed. Because \(f \neq 0\), pick \(x_1\) with \(f(x_1) = 1\). Since \(x_1 \notin \ker f\) and \(\ker f\) is closed, its complement is an open neighborhood of \(x_1\). Translating, there is an open neighborhood \(N\) of \(0\) with \((x_1 + N) \cap \ker f = \varnothing\). By continuity of scalar multiplication at \((0,0)\), the neighborhood \(N\) contains a balanced open neighborhood \(U\) of \(0\). Explicitly, there are \(\delta \gt 0\) and an open neighborhood \(W\) of \(0\) with \(\alpha W \subseteq N\) for \(|\alpha| \leq \delta\), and \(U = \bigcup_{0 \lt |\alpha| \leq \delta} \alpha W\) is balanced, open (each \(\alpha W\) with \(\alpha \neq 0\) is open), and contained in \(N\). Then \((x_1 + U) \cap \ker f = \varnothing\) as well, since \(U \subseteq N\).

We claim \(|f(u)| \lt 1\) for all \(u \in U\). If instead some \(u \in U\) had \(|f(u)| \geq 1\), then \(\lambda := f(u)\) satisfies \(|\lambda| \geq 1\), so \(-\lambda^{-1} u \in U\) by balancedness, and \[ f\bigl(x_1 - \lambda^{-1} u\bigr) = 1 - \lambda^{-1} f(u) = 1 - 1 = 0, \] placing \(x_1 - \lambda^{-1} u \in (x_1 + U) \cap \ker f\), a contradiction. Hence \(|f(u)| \lt 1\) on \(U\). For any \(\varepsilon \gt 0\) the balanced neighborhood \(\varepsilon U\) then satisfies \(|f| \lt \varepsilon\), so \(f\) is continuous at \(0\).

(b) \(\Leftrightarrow\) (e).
The map \(p(x) = |f(x)|\) is absolutely homogeneous, \(p(\alpha x) = |\alpha|\,|f(x)|\), and subadditive by the triangle inequality in \(\mathbb{F}\). It is therefore a seminorm. A seminorm is continuous exactly when it is continuous at \(0\), because \(|p(x) - p(x_0)| \leq p(x - x_0)\). Continuity of \(p\) at \(0\) is the statement that for every \(\varepsilon \gt 0\) there is a neighborhood of \(0\) on which \(|f| \lt \varepsilon\), which is precisely continuity of \(f\) at \(0\). The equivalence follows.

The crux is the implication from a closed kernel to continuity. It rests on a structural dichotomy peculiar to hyperplanes. The kernel of a nonzero functional is a maximal proper subspace, so its closure has nowhere to grow except to the whole space.

Theorem: A Hyperplane Is Closed or Dense

Let \(\mathcal{X}\) be a topological vector space over \(\mathbb{F}\) and let \(f : \mathcal{X} \to \mathbb{F}\) be a nonzero linear functional. Then \(\ker f\) is either closed or dense in \(\mathcal{X}\). It is closed if and only if \(f\) is continuous.

Proof

Write \(\mathcal{M} = \ker f\). Because \(f\) is linear and surjective onto \(\mathbb{F}\), the subspace \(\mathcal{M}\) has codimension one. Fixing \(x_1\) with \(f(x_1) = 1\), every \(x\) decomposes uniquely as \(x = \bigl(x - f(x) x_1\bigr) + f(x) x_1\) with the first summand in \(\mathcal{M}\), so \(\mathcal{X} = \mathcal{M} \oplus \mathbb{F} x_1\). Consequently \(\mathcal{M}\) is a maximal proper subspace. If a subspace \(\mathcal{N}\) strictly contains \(\mathcal{M}\), it contains some \(x\) with \(f(x) \neq 0\). Writing \(x = m + f(x) x_1\) with \(m = x - f(x) x_1 \in \mathcal{M} \subseteq \mathcal{N}\), the difference \(f(x) x_1 = x - m\) lies in \(\mathcal{N}\), so \(x_1 \in \mathcal{N}\) and hence \(\mathcal{N} = \mathcal{M} \oplus \mathbb{F} x_1 = \mathcal{X}\).

Now the closure \(\overline{\mathcal{M}}\) is again a subspace, because the closure of a linear subspace in a topological vector space is a subspace. Continuity of addition and scalar multiplication carries the subspace relations \(\overline{\mathcal{M}} + \overline{\mathcal{M}} \subseteq \overline{\mathcal{M}}\) and \(\alpha \overline{\mathcal{M}} \subseteq \overline{\mathcal{M}}\) over from \(\mathcal{M}\) by taking closures of continuous images, using the product-topology identity \(\overline{\mathcal{M} \times \mathcal{M}} = \overline{\mathcal{M}} \times \overline{\mathcal{M}}\). Since \(\mathcal{M} \subseteq \overline{\mathcal{M}} \subseteq \mathcal{X}\) and \(\mathcal{M}\) is maximal proper, either \(\overline{\mathcal{M}} = \mathcal{M}\) (so \(\mathcal{M}\) is closed) or \(\overline{\mathcal{M}} = \mathcal{X}\) (so \(\mathcal{M}\) is dense). No third possibility exists.

The final equivalence is the content of conditions (a) and (d) of the preceding theorem. Closedness of \(\ker f\) is equivalent to continuity of \(f\).
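
Example: A Discontinuous Functional with Dense Kernel

The dichotomy can be seen concretely in a standard example, sketched here under assumptions chosen purely for illustration (the space and functional below are not used elsewhere in the development). Let \(c_{00}\) be the space of finitely supported real sequences with the supremum norm, with unit vectors \(e_n\), and let \(f(x) = \sum_n x_n\). The vectors \(u_N = e_1 + \cdots + e_N\) satisfy \[ \|u_N\|_\infty = 1, \qquad f(u_N) = N , \] so \(f\) is unbounded and therefore not continuous. By the theorem its kernel must be dense, and this can be checked directly. Given \(x \in c_{00}\) with \(f(x) = c\) and support in \(\{1, \ldots, M\}\), put \[ y_N = x - \frac{c}{N} \sum_{j = M+1}^{M+N} e_j . \] Then \(f(y_N) = c - N \cdot \tfrac{c}{N} = 0\), so \(y_N \in \ker f\), while \(\|x - y_N\|_\infty = |c|/N \to 0\). Every point of \(c_{00}\) is thus a limit of kernel points.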

The Gauge of an Open Convex Set

We now perform the conversion of geometry into analysis. From an open convex set containing the origin we read off a functional that measures, in each direction, how far one must scale the set to swallow a given point. When the set is also balanced this functional is a seminorm, the Minkowski functional already constructed in that symmetric case. Dropping balancedness costs absolute homogeneity but keeps positive homogeneity and subadditivity. The result is a sublinear functional, exactly the kind of bound the Hahn-Banach extension theorem accepts.

Theorem: The Gauge of an Open Convex Neighborhood

Let \(\mathcal{X}\) be a topological vector space and let \(G\) be an open convex subset containing the origin. Then the gauge of \(G\), \[ q(x) = \inf\{\, t \geq 0 : x \in tG \,\}, \] is a non-negative continuous sublinear functional on \(\mathcal{X}\), and \[ G = \{\, x : q(x) \lt 1 \,\}. \]

Proof

Finiteness and non-negativity.
Every open set containing \(0\) is absorbing. Scalar multiplication \(t \mapsto tx\) is continuous and sends \(t = 0\) into the open set \(G\), so for each \(x\) there is \(\varepsilon \gt 0\) with \(tx \in G\) for \(0 \leq t \lt \varepsilon\). Hence \(x \in s G\) for \(s = 1/t\) large, the defining set \(\{t \geq 0 : x \in tG\}\) is nonempty, and \(0 \leq q(x) \lt \infty\).

Positive homogeneity.
Fix \(x\) and a scalar \(\alpha \gt 0\). Substituting \(s = t/\alpha\), \[ \begin{align*} q(\alpha x) &= \inf\{\, t \geq 0 : \alpha x \in tG \,\} \\\\ &= \inf\{\, t \geq 0 : x \in (t/\alpha) G \,\} \\\\ &= \alpha \,\inf\{\, s \geq 0 : x \in sG \,\} = \alpha\, q(x). \end{align*} \] For \(\alpha = 0\) both sides vanish, using \(0 \in G\) so that \(q(0) = 0\). Absolute homogeneity is not claimed. Since \(G\) need not be balanced, \(q(-x)\) and \(q(x)\) may differ.

Subadditivity.
Let \(x, y \in \mathcal{X}\) and \(\delta \gt 0\). Choose \(s, u \gt 0\) with \(q(x) \leq s \lt q(x) + \delta\), \(x \in sG\), and \(q(y) \leq u \lt q(y) + \delta\), \(y \in uG\). Such positive \(s, u\) exist by the definition of the infimum. Then \(x/s, y/u \in G\), and convexity of \(G\) gives \[ \frac{x + y}{s + u} = \frac{s}{s+u}\cdot\frac{x}{s} + \frac{u}{s+u}\cdot\frac{y}{u} \in G , \] since the coefficients are non-negative and sum to \(1\). Hence \(x + y \in (s+u) G\), so \(q(x+y) \leq s + u \lt q(x) + q(y) + 2\delta\). Letting \(\delta \to 0\) yields \(q(x+y) \leq q(x) + q(y)\). With positive homogeneity, \(q\) is sublinear.

The unit set is \(G\).
Suppose \(q(x) \lt 1\). Then there is \(t\) with \(q(x) \leq t \lt 1\) and \(x \in tG\), say \(x = t g\) with \(g \in G\). Since \(0 \in G\) and \(G\) is convex, the segment from \(0\) to \(g\) lies in \(G\), and \(x = tg = (1-t)\cdot 0 + t\cdot g \in G\). Thus \(\{q \lt 1\} \subseteq G\). Conversely let \(x \in G\). Because \(G\) is open, it is absorbing at the point \(x\). There is \(\varepsilon \gt 0\) with \(x + tx \in G\) for \(0 \leq t \lt \varepsilon\). Fixing one such \(t \gt 0\), \((1+t) x \in G\), so \(x \in (1+t)^{-1} G\) and \(q(x) \leq (1+t)^{-1} \lt 1\). Hence \(G \subseteq \{q \lt 1\}\), and the two sets coincide.

Continuity.
Subadditivity and positive homogeneity give, for any \(x, h\), \(q(x + h) \leq q(x) + q(h)\) and \(q(x) \leq q(x + h) + q(-h)\), so \[ -q(-h) \leq q(x + h) - q(x) \leq q(h). \] Thus it suffices to control \(q(h)\) and \(q(-h)\) for \(h\) near \(0\). Given \(\varepsilon \gt 0\), the set \(\varepsilon G\) is an open neighborhood of \(0\), and for \(h \in \varepsilon G\) we have \(q(h) \lt \varepsilon\) by the unit-set identity applied to \(\varepsilon^{-1} h \in G\). Likewise \(-\varepsilon G\) is an open neighborhood of \(0\) on which \(q(-h) \lt \varepsilon\). On the intersection \(\varepsilon G \cap (-\varepsilon G)\), both bounds hold, so \(|q(x+h) - q(x)| \lt \varepsilon\). Hence \(q\) is continuous.

The gauge is the precise generalization of the Minkowski functional to sets that need not be symmetric about the origin. When \(G\) is in addition balanced, \(q(-x) = q(x)\) is restored and \(q\) becomes a seminorm, recovering the symmetric construction. The asymmetry permitted here is exactly what lets a single open convex set, translated so as to contain the origin but otherwise arbitrary, generate the dominating functional that the separation arguments will require.
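
Example: The Gauge of an Asymmetric Interval

A one-dimensional illustration, with the interval chosen here purely for concreteness: take \(\mathcal{X} = \mathbb{R}\) and \(G = (-1, 2)\), which is open, convex, and contains \(0\) but is not balanced. For \(t \gt 0\) we have \(tG = (-t, 2t)\), so \(x \in tG\) exactly when \(t \gt x/2\) and \(t \gt -x\). Taking the infimum, \[ q(x) = \max\{\, x/2,\ -x \,\} , \] that is, \(q(x) = x/2\) for \(x \geq 0\) and \(q(x) = -x\) for \(x \lt 0\). Then \(\{q \lt 1\} = (-1, 2) = G\), in agreement with the theorem, while \(q(1) = 1/2\) and \(q(-1) = 1\) show that absolute homogeneity fails. Subadditivity can be strict: \(q(1 + (-1)) = 0 \lt 3/2 = q(1) + q(-1)\).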

Separating a Point from an Open Convex Set

The first geometric consequence is the prototype for everything that follows. A point lying outside an open convex set can be peeled away from it by a closed hyperplane. The proof is the conversion principle in action. The gauge turns the open convex set into a sublinear functional. A one-dimensional functional is defined on the line through the excluded point and dominated by that gauge. The Hahn-Banach extension theorem spreads it to the whole space without losing domination, and the resulting hyperplane misses the set. We carry out the real case in full, then reduce the complex case to it.

Theorem: A Point Outside an Open Convex Set Lies on a Closed Hyperplane Missing It

Let \(\mathcal{X}\) be a topological vector space over \(\mathbb{F}\) and let \(G\) be a nonempty open convex subset with \(0 \notin G\). Then there is a closed hyperplane \(\mathcal{M} = \ker f\), with \(f : \mathcal{X} \to \mathbb{F}\) a nonzero continuous linear functional, such that \(\mathcal{M} \cap G = \varnothing\).

Proof

Case 1: \(\mathbb{F} = \mathbb{R}\).
Pick any \(x_0 \in G\) and set \(H = x_0 - G\). Then \(H\) is open (translation and reflection are homeomorphisms), convex (the image of the convex \(G\) under the affine map \(x \mapsto x_0 - x\)), and contains \(0 = x_0 - x_0\). Let \(q\) be its gauge, a non-negative continuous sublinear functional with \(H = \{x : q(x) \lt 1\}\). Since \(0 \notin G\), we have \(x_0 = x_0 - 0 \notin x_0 - G = H\). Equivalently \(q(x_0) \geq 1\).

On the one-dimensional subspace \(\mathcal{Y} = \mathbb{R} x_0\) define \(f_0(\alpha x_0) = \alpha\, q(x_0)\). We verify \(f_0 \leq q\) on \(\mathcal{Y}\). For \(\alpha \geq 0\), positive homogeneity gives \(f_0(\alpha x_0) = \alpha\, q(x_0) = q(\alpha x_0)\). For \(\alpha \lt 0\), the left side \(f_0(\alpha x_0) = \alpha\, q(x_0) \leq \alpha \lt 0\) (using \(q(x_0) \geq 1\)) is negative, while the right side satisfies \(q(\alpha x_0) \geq 0\). Hence \(f_0(\alpha x_0) \leq q(\alpha x_0)\) holds in this case too. Thus \(f_0 \leq q\) on \(\mathcal{Y}\).

By the Hahn-Banach theorem there is a linear functional \(f : \mathcal{X} \to \mathbb{R}\) with \(f|_{\mathcal{Y}} = f_0\) and \(f \leq q\) on all of \(\mathcal{X}\). It is nonzero, since \(f(x_0) = q(x_0) \geq 1\). It is also continuous. The bounds \(f \leq q\) and \(f(-x) \leq q(-x)\) give \(-q(-x) \leq f(x) \leq q(x)\), and as \(q\) is continuous with \(q(0) = 0\), the functional \(f\) is squeezed to continuity at \(0\), hence continuous by the continuity criterion.

Set \(\mathcal{M} = \ker f\), a closed hyperplane. To see \(\mathcal{M} \cap G = \varnothing\), it is cleaner to show \(f \gt 0\) on all of \(G\). Take \(x \in G\). Then \(x_0 - x \in x_0 - G = H\), so \(q(x_0 - x) \lt 1\), and therefore \[ f(x_0) - f(x) = f(x_0 - x) \leq q(x_0 - x) \lt 1 . \] Hence \(f(x) \gt f(x_0) - 1 = q(x_0) - 1 \geq 0\) for every \(x \in G\). Thus \(f \gt 0\) on \(G\), so no point of \(G\) lies in \(\mathcal{M} = \{f = 0\}\), giving \(\mathcal{M} \cap G = \varnothing\).

Case 2: \(\mathbb{F} = \mathbb{C}\).
View \(\mathcal{X}\) as a real topological vector space by restricting scalars to \(\mathbb{R}\). The set \(G\) is still open, convex, and avoids \(0\). Case 1 produces a continuous \(\mathbb{R}\)-linear functional \(f : \mathcal{X} \to \mathbb{R}\) with \(G \cap \ker f = \varnothing\). Define \(F(x) = f(x) - i\,f(ix)\). By the recovery lemma, \(F\) is \(\mathbb{C}\)-linear with \(f = \operatorname{Re} F\), and \(F\) is continuous because \(f\) and \(x \mapsto f(ix)\) are. Now \(F(x) = 0\) forces \(\operatorname{Re} F(x) = f(x) = 0\), so \(\ker F \subseteq \ker f\), whence \(\ker F \cap G = \varnothing\). Taking \(\mathcal{M} = \ker F\) completes the complex case.
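
Example: A Disk Missing the Origin

To see the real-case construction run, consider (as an illustrative choice, not part of the argument) \(\mathcal{X} = \mathbb{R}^2\) with points written \(v = (v_1, v_2)\), and let \(G\) be the open unit disk centered at \((2, 0)\), so that \(0 \notin G\). Taking \(x_0 = (2, 0)\), the set \(H = x_0 - G\) is the open unit disk centered at the origin, whose gauge is the Euclidean norm, \(q(v) = \|v\|_2\), with \(q(x_0) = 2 \geq 1\). On \(\mathcal{Y} = \mathbb{R} x_0\) the proof sets \(f_0(\alpha x_0) = 2\alpha\). One extension dominated by \(q\) is the first coordinate functional, \[ f(v) = v_1, \qquad f(v) \leq \sqrt{v_1^2 + v_2^2} = q(v), \qquad f(x_0) = 2 . \] Every point of \(G\) has \(v_1 \gt 1\), matching the bound \(f \gt f(x_0) - 1 = 1\) obtained in the proof, and the closed hyperplane \(\ker f = \{\, v : v_1 = 0 \,\}\) misses \(G\).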

The same engine separates an open convex set not only from the origin but from an arbitrary affine object, a translate of a subspace. The reduction is to quotient out the subspace and apply the theorem in the quotient.

Affine Subspaces and the Quotient Reduction

Recall that a linear subspace \(\mathcal{Y} \subseteq \mathcal{X}\) carried along by a single translate forms an affine subspace: a set of the form \(x_1 + \mathcal{Y}\). When \(\mathcal{X}\) is a topological vector space and \(\mathcal{Y}\) is closed, the quotient \(\mathcal{X}/\mathcal{Y}\), with the natural map \(Q : \mathcal{X} \to \mathcal{X}/\mathcal{Y}\), is itself a topological vector space, and \(Q\) is continuous and open. Continuity is the defining property of the quotient topology. Openness follows because for an open \(U \subseteq \mathcal{X}\) the saturation \(Q^{-1}(Q(U)) = U + \mathcal{Y} = \bigcup_{y \in \mathcal{Y}} (U + y)\) is a union of translates of \(U\), hence open, so \(Q(U)\) is open by definition of the quotient topology. We use only these two facts.

Corollary: Separating an Affine Subspace from an Open Convex Set

Let \(\mathcal{X}\) be a topological vector space and let \(G\) be a nonempty open convex subset. If \(\mathcal{A}\) is an affine subspace of \(\mathcal{X}\) with \(\mathcal{A} \cap G = \varnothing\), then there is a closed affine hyperplane \(\mathcal{M}\) with \(\mathcal{A} \subseteq \mathcal{M}\) and \(\mathcal{M} \cap G = \varnothing\).

Proof

Since \(G\) is open and \(\mathcal{A} \cap G = \varnothing\), the closure also satisfies \(\overline{\mathcal{A}} \cap G = \varnothing\). Any \(x \in \overline{\mathcal{A}} \cap G\) would have the open neighborhood \(G\) meeting \(\mathcal{A}\), contradicting disjointness. As the separating hyperplane we build is closed, separating \(\overline{\mathcal{A}}\) yields the same conclusion for \(\mathcal{A} \subseteq \overline{\mathcal{A}}\), so we may replace \(\mathcal{A}\) by \(\overline{\mathcal{A}}\) and assume it closed. Fix \(x_1 \in \mathcal{A}\) and replace \((\mathcal{A}, G)\) by \((\mathcal{A} - x_1, G - x_1)\). This translation preserves openness, convexity, disjointness, and the conclusion, so we may assume \(\mathcal{A}\) is a closed linear subspace \(\mathcal{Y}\) containing \(0\). Let \(Q : \mathcal{X} \to \mathcal{X}/\mathcal{Y}\) be the quotient map. Then \(Q(G)\) is open (since \(Q\) is open) and convex (the image of a convex set under the linear map \(Q\)), and \(0 \notin Q(G)\). If \(0 = Q(g)\) for some \(g \in G\) then \(g \in \mathcal{Y} = \mathcal{A}\), contradicting \(\mathcal{A} \cap G = \varnothing\).

By the preceding theorem applied in \(\mathcal{X}/\mathcal{Y}\), there is a nonzero continuous linear functional \(g\) on \(\mathcal{X}/\mathcal{Y}\) with \(\ker g \cap Q(G) = \varnothing\). Set \(f = g \circ Q\), a nonzero continuous linear functional on \(\mathcal{X}\) (nonzero because \(Q\) is surjective, continuous as a composition of continuous maps), and \(\mathcal{M} = \ker f = Q^{-1}(\ker g)\). Since \(\mathcal{Y} = \ker Q \subseteq \mathcal{M}\), we have \(\mathcal{A} = \mathcal{Y} \subseteq \mathcal{M}\). If some \(x \in \mathcal{M} \cap G\), then \(Q(x) \in \ker g \cap Q(G) = \varnothing\), impossible. Hence \(\mathcal{M} \cap G = \varnothing\), and \(\mathcal{M}\), being the kernel of a nonzero continuous functional, is a closed hyperplane. Undoing the initial translation by \(x_1\) turns it into the closed affine hyperplane \(x_1 + \mathcal{M}\), which contains the original \(\mathcal{A}\) and misses the original \(G\).
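
Example: A Line and a Ball in \(\mathbb{R}^3\)

A small illustration of the quotient reduction, with all specific sets chosen here for concreteness. In \(\mathcal{X} = \mathbb{R}^3\) let \(\mathcal{A} = \mathcal{Y} = \{\, (t, 0, 0) : t \in \mathbb{R} \,\}\) and let \(G\) be the open unit ball centered at \((0, 2, 0)\), so \(\mathcal{A} \cap G = \varnothing\). The quotient \(\mathcal{X}/\mathcal{Y}\) may be identified with \(\mathbb{R}^2\) through \(Q(v_1, v_2, v_3) = (v_2, v_3)\), and \(Q(G)\) is the open unit disk centered at \((2, 0)\), which omits the origin. The functional \(g(w_1, w_2) = w_1\) on \(\mathbb{R}^2\) satisfies \(g \gt 1\) on \(Q(G)\), so \(\ker g \cap Q(G) = \varnothing\). Pulling back, \[ f = g \circ Q, \qquad f(v_1, v_2, v_3) = v_2, \qquad \mathcal{M} = \ker f = \{\, v : v_2 = 0 \,\} . \] The plane \(\mathcal{M}\) contains the line \(\mathcal{A}\) and misses \(G\), since \(v_2 \gt 1\) throughout \(G\).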

Half-Spaces and Separated Sets

A nonzero continuous real-linear functional splits the space into the two open regions on either side of a level hyperplane. Naming these regions fixes the vocabulary in which every separation statement is phrased.

Definition: Open and Closed Half-Spaces

Let \(\mathcal{X}\) be a real topological vector space. A subset \(S \subseteq \mathcal{X}\) is an open half-space if there is a nonzero continuous linear functional \(f : \mathcal{X} \to \mathbb{R}\) and a scalar \(\alpha\) with \[ S = \{\, x \in \mathcal{X} : f(x) \gt \alpha \,\} , \] and a closed half-space if \(S = \{\, x : f(x) \geq \alpha \,\}\) for some such \(f\) and \(\alpha\). The hyperplane \(\{f = \alpha\}\) is the boundary of the half-space. Replacing \(f\) and \(\alpha\) by \(-f\) and \(-\alpha\) shows that \(\{f \lt \alpha\}\) and \(\{f \leq \alpha\}\) are half-spaces as well.

Definition: Separated and Strictly Separated Sets

Two subsets \(A\) and \(B\) of a real topological vector space are strictly separated if they are contained in disjoint open half-spaces, and separated if they are contained in two closed half-spaces whose intersection is a closed affine hyperplane. Equivalently, \(A\) and \(B\) are separated by a nonzero continuous linear functional \(f\) and scalar \(\alpha\) when \[ f(a) \leq \alpha \leq f(b) \quad \text{for all } a \in A,\ b \in B, \] and strictly separated when there are two scalars \(\alpha_1 \leq \alpha_2\) with \(f(a) \lt \alpha_1\) and \(f(b) \gt \alpha_2\) for all \(a \in A\), \(b \in B\), placing the sets in the disjoint open half-spaces \(\{f \lt \alpha_1\}\) and \(\{f \gt \alpha_2\}\). The case \(\alpha_1 = \alpha_2\) is allowed. A positive gap \(\alpha_1 \lt \alpha_2\) is a stronger conclusion, available under compactness.

The geometric force of the definition is visible already. When \(f\) is a nonzero continuous real-linear functional, its kernel hyperplane disconnects the space into the two connected components \(\{f \gt 0\}\) and \(\{f \lt 0\}\). To separate two sets is to certify that one such functional sees them on opposite sides. The theorems of the next section establish when this certificate exists.

The Separation Theorems

We now reach the results that justify the name. The point-separation theorem is upgraded to separate two convex sets, then sharpened so that disjoint closed convex sets, one of them compact, are pried fully apart with a gap between them. The progression is governed by how much openness or compactness is available. An open set lets the gauge construction run, while a compact set can be fattened into an open one without colliding with its closed companion.

Functionals and Half-Spaces

Two technical facts make the half-space language interchangeable with the functional language. A half-space and its boundary behave as expected under closure and interior, and a continuous functional maps an open convex set to an open interval. The second is what later forces separating inequalities to be strict.

Proposition: Half-Spaces and the Functional Criterion

Let \(\mathcal{X}\) be a real topological vector space.

(a) The closure of an open half-space \(\{f \gt \alpha\}\) is the closed half-space \(\{f \geq \alpha\}\), and the interior of \(\{f \geq \alpha\}\) is \(\{f \gt \alpha\}\), provided \(f\) is a nonzero continuous linear functional.

(b) If \(f : \mathcal{X} \to \mathbb{R}\) is a nonzero continuous linear functional and \(A\) is open and convex, then \(f(A)\) is an open interval. In particular \(f(A)\) contains none of its endpoints.

Proof

(a).
Because \(f\) is nonzero, pick \(e\) with \(f(e) = 1\). The map \(t \mapsto x + t e\) is continuous, and \(f(x + te) = f(x) + t\), so along this line \(f\) takes every real value near \(f(x)\). If \(f(x) = \alpha\), then for \(t \gt 0\) the point \(x + te\) has \(f \gt \alpha\), and \(x + te \to x\) as \(t \to 0^+\). Hence every point of \(\{f = \alpha\}\) is a limit of points of \(\{f \gt \alpha\}\), giving \(\overline{\{f \gt \alpha\}} \supseteq \{f \geq \alpha\}\). The reverse inclusion holds because \(\{f \geq \alpha\} = f^{-1}([\alpha, \infty))\) is closed and contains \(\{f \gt \alpha\}\). The interior statement is dual. The set \(\{f \gt \alpha\} = f^{-1}((\alpha, \infty))\) is open and contained in \(\{f \geq \alpha\}\), and any point with \(f(x) = \alpha\) has points \(x - te\) (\(t \gt 0\)) arbitrarily close with \(f \lt \alpha\), so it is not interior.

(b).
Since \(A\) is convex and \(f\) is linear, \(f(A)\) is convex in \(\mathbb{R}\), hence an interval. It remains to show \(f(A)\) is open. Let \(\beta = f(a) \in f(A)\) with \(a \in A\), and choose \(e\) with \(f(e) = 1\). As \(A\) is open, there is \(\varepsilon \gt 0\) with \(a \pm te \in A\) for \(0 \leq t \lt \varepsilon\) (openness gives a neighborhood of \(a\), and \(t \mapsto a + te\) is continuous through \(t = 0\)). Then \(f(a \pm te) = \beta \pm t\) ranges over \((\beta - \varepsilon, \beta + \varepsilon)\), so this whole interval lies in \(f(A)\). Thus every point of \(f(A)\) is interior, and \(f(A)\) is open.
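
Example: Open versus Closed Disk

For a quick check of part (b), take (as an illustration only) \(\mathcal{X} = \mathbb{R}^2\), \(f(v_1, v_2) = v_1\), and \(A\) the open unit disk. Then \[ f(A) = (-1, 1), \] an open interval whose endpoints \(\pm 1\) are not attained, because the only points of the plane with \(|v_1| = 1\) and \(v_1^2 + v_2^2 \leq 1\) are \((\pm 1, 0)\), which lie outside \(A\). For the closed unit disk the image is \([-1, 1]\), with both endpoints attained, so it is the openness of \(A\) that rules out equality.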

Part (b) is the engine behind strict separation. As soon as a separating functional is applied to an open convex set, the resulting interval cannot touch the separating value, which forces a strict inequality. With this in hand the main separation theorem follows by reducing two sets to one through their difference.

Separation of Disjoint Convex Sets

Theorem: Separation of Disjoint Convex Sets

Let \(\mathcal{X}\) be a real topological vector space and let \(A\) and \(B\) be disjoint convex subsets with \(A\) open. Then there is a continuous linear functional \(f : \mathcal{X} \to \mathbb{R}\) and a real scalar \(\alpha\) with \[ f(a) \lt \alpha \leq f(b) \quad \text{for all } a \in A,\ b \in B . \] If \(B\) is also open, then \(A\) and \(B\) are strictly separated.

Proof

Form the difference set \(G = A - B = \{\, a - b : a \in A,\ b \in B \,\}\). It is convex. A convex combination \((1-t)(a_1 - b_1) + t(a_2 - b_2)\) equals \(\bigl((1-t)a_1 + t a_2\bigr) - \bigl((1-t)b_1 + t b_2\bigr)\), a difference of points of \(A\) and \(B\). It is open, being the union \(G = \bigcup_{b \in B}(A - b)\) of translates of the open set \(A\). Moreover \(0 \notin G\). If \(0 = a - b\) then \(a = b \in A \cap B\), contradicting disjointness.

By the point-separation theorem, applied to the open convex set \(G\) avoiding \(0\), there is a nonzero continuous linear functional \(f_1\) with \(\ker f_1 \cap G = \varnothing\). Since \(f_1(G)\) is a convex subset of \(\mathbb{R}\) that omits \(0\), the functional \(f_1\) has a fixed sign on \(G\), and replacing \(f_1\) by \(-f_1\) if necessary we may take \(f_1 \gt 0\) on \(G\). Set \(f = -f_1\), matching the stated orientation, so \(f \lt 0\) on \(G\). Then for all \(a \in A\) and \(b \in B\), \[ f(a) - f(b) = f(a - b) \lt 0, \quad \text{that is} \quad f(a) \lt f(b) . \] Hence \(\sup_{a \in A} f(a) \leq \inf_{b \in B} f(b)\). Pick \(\alpha\) between them, so that \(f(a) \leq \alpha\) for all \(a \in A\) and \(f(b) \geq \alpha\) for all \(b \in B\).

Since \(A\) is open and \(f \neq 0\), part (b) of the preceding proposition shows \(f(A)\) is an open interval, so it cannot contain its supremum. Therefore \(f(a) \lt \alpha\) strictly for every \(a \in A\), giving \(f(a) \lt \alpha \leq f(b)\). If \(B\) is also open, the infimum of \(f(B)\) cannot be attained either, and the same argument shows \(f(b) \gt \alpha\) strictly for every \(b \in B\). Then \(A\) and \(B\) lie in the disjoint open half-spaces \(\{f \lt \alpha\}\) and \(\{f \gt \alpha\}\) and are strictly separated.
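
Example: A Disk and a Closed Half-Plane

The following instance, chosen here for illustration, shows that the non-strict inequality on the side of \(B\) cannot be improved in general. In \(\mathbb{R}^2\), with points written \(v = (v_1, v_2)\), let \(A\) be the open unit disk and \(B = \{\, v : v_1 \geq 1 \,\}\). The sets are disjoint and convex, and \(A\) is open. The difference set is \[ G = A - B = \{\, v : v_1 \lt 0 \,\} , \] since \(a_1 - b_1\) ranges over all negative reals and \(a_2 - b_2\) over all of \(\mathbb{R}\). The functional \(f(v) = v_1\) is negative on \(G\), and \[ \sup_{a \in A} f(a) = 1 = \inf_{b \in B} f(b), \] so \(\alpha = 1\) and \(f(a) \lt 1 \leq f(b)\). Equality \(f(b) = 1\) holds at \(b = (1, 0) \in B\), so for this \(f\) the inequality on \(B\) is not strict.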

The hypothesis that one set be open is essential to the construction. It is what makes \(G = A - B\) open and what sharpens the inequality on its side. When neither set is open we cannot run the gauge, and a different lever is needed, namely compactness. The next lemma supplies it, fattening a compact set into an open neighborhood that still avoids a given closed companion.

Fattening a Compact Set

In a topological vector space, a compact set sitting inside an open set can be enlarged by a fixed neighborhood of the origin and still remain inside. This is the topological-vector-space refinement of the tube lemma, and it is where the additive structure interacts with compactness.

Lemma: A Compact Set Has a Fattening Inside Any Open Superset

Let \(\mathcal{X}\) be a topological vector space, let \(K \subseteq \mathcal{X}\) be compact, and let \(V \subseteq \mathcal{X}\) be open with \(K \subseteq V\). Then there is an open neighborhood \(U\) of \(0\) such that \[ K + U \subseteq V . \]

Proof

Fix \(x \in K\). Then \(x \in V\), and \(V - x\) is an open neighborhood of \(0\). Continuity of addition at \((0,0)\) furnishes an open neighborhood \(W_x\) of \(0\) with \(W_x + W_x \subseteq V - x\), that is, \(x + W_x + W_x \subseteq V\). The translates \(\{x + W_x : x \in K\}\) form an open cover of the compact set \(K\), so finitely many suffice. There are \(x_1, \ldots, x_n \in K\) with \(K \subseteq \bigcup_{j=1}^n (x_j + W_{x_j})\). Put \[ U = \bigcap_{j=1}^n W_{x_j} , \] an open neighborhood of \(0\) as a finite intersection of such.

Let \(y \in K\) and \(u \in U\). Then \(y \in x_j + W_{x_j}\) for some \(j\), so \(y = x_j + w\) with \(w \in W_{x_j}\), and \(u \in U \subseteq W_{x_j}\). Therefore \[ y + u = x_j + w + u \in x_j + W_{x_j} + W_{x_j} \subseteq V . \] Hence \(K + U \subseteq V\).

The compactness of \(K\) is indispensable. The cover by translated neighborhoods admits a finite subcover only because \(K\) is compact, and it is the finite intersection that produces a single \(U\) working uniformly across \(K\). Phrased through nets, the failure for a merely closed \(K\) is exactly that a net escaping to infinity along \(K\) has no convergent behavior to anchor the uniform neighborhood. Compactness restores the cluster point that the finite subcover encodes.
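
Example: Failure for a Closed, Non-Compact Set

To see what goes wrong without compactness, consider the following sketch in \(\mathbb{R}^2\) with its usual topology (the sets are our choice for illustration). Let \(K = \{\, (s, 0) : s \in \mathbb{R} \,\}\) be the horizontal axis, which is closed but not compact, and let \[ V = \Bigl\{\, (s, t) : |t| \lt \frac{1}{1 + s^2} \,\Bigr\} , \] an open set containing \(K\). Any neighborhood \(U\) of \(0\) contains a point \((0, r)\) with \(r \gt 0\), so \(K + U\) contains \((s, r)\) for every \(s\). Once \(s\) is large enough that \(1/(1 + s^2) \leq r\), the point \((s, r)\) lies outside \(V\). Hence no single \(U\) satisfies \(K + U \subseteq V\), and it is exactly the finite-subcover step of the proof that is unavailable for this \(K\).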

Strict Separation with a Compact Set

Combining the fattening lemma with the convex-separation theorem yields the result used most often in practice. In a locally convex space, two disjoint closed convex sets can be strictly separated as soon as one of them is compact. Local convexity is what guarantees the fattening can be taken convex, so that the separation theorem applies to the enlarged sets.

Theorem: Strict Separation of a Compact and a Closed Convex Set

Let \(\mathcal{X}\) be a real locally convex space and let \(A\) and \(B\) be disjoint closed convex subsets of \(\mathcal{X}\). If \(B\) is compact, then \(A\) and \(B\) are strictly separated. More precisely, there are a continuous linear functional \(f\) and scalars \(\alpha_1 \lt \alpha_2\) with \(f(a) \leq \alpha_1 \lt \alpha_2 \leq f(b)\) for all \(a \in A\), \(b \in B\).

Proof

Since \(A\) and \(B\) are disjoint and \(A\) is closed, \(B\) is a compact subset of the open set \(\mathcal{X} \setminus A\). By the fattening lemma, there is an open neighborhood \(U_1\) of \(0\) with \(B + U_1 \subseteq \mathcal{X} \setminus A\), that is, \((B + U_1) \cap A = \varnothing\). Because \(\mathcal{X}\) is locally convex, there is a continuous seminorm \(p\) with \(\{x : p(x) \lt 1\} \subseteq U_1\). Set \(U = \{x : p(x) \lt \tfrac{1}{2}\}\), an open convex balanced neighborhood of \(0\) (a sublevel set of a seminorm) with \(U + U \subseteq U_1\).

Consider \(A + U\) and \(B + U\). Each is open (a union of translates of \(U\)) and convex (a sum of convex sets is convex), and they contain \(A\) and \(B\) respectively. They are disjoint. If \(a + u = b + u'\) with \(a \in A\), \(b \in B\), \(u, u' \in U\), then \(a = b + (u' - u) \in B + (U - U) \subseteq B + U_1\), because \(U\) is balanced and therefore \(U - U \subseteq U + U \subseteq U_1\). This places \(a \in (B + U_1) \cap A\), which is empty. Hence \((A + U) \cap (B + U) = \varnothing\).

Now \(A + U\) and \(B + U\) are disjoint open convex sets, so by the convex-separation theorem they are strictly separated by some continuous linear functional \(f\). There is \(\alpha\) with \(f \lt \alpha\) on \(A + U\) and \(f \gt \alpha\) on \(B + U\). Restricting to \(A \subseteq A + U\) and \(B \subseteq B + U\) gives \(f \lt \alpha\) on \(A\) and \(f \gt \alpha\) on \(B\). Since \(f(B)\) is the continuous image of the compact set \(B\), it is compact in \(\mathbb{R}\) and attains its minimum, so \(\alpha_2 := \min_{b \in B} f(b) \gt \alpha\). Setting \(\alpha_1 := \alpha\) gives \(f(a) \lt \alpha_1 \lt \alpha_2 \leq f(b)\) for all \(a \in A\), \(b \in B\), the asserted strict separation with a gap.

Compactness of one set cannot be dropped. In the plane, the closed convex region on or below the \(x\)-axis and the closed convex region on or above the branch of the hyperbola \(y = 1/x\) with \(x \gt 0\) are disjoint and closed, yet they approach each other arbitrarily closely as \(x \to \infty\). No positive gap separates them, and no line strictly separates the two. The gap in the conclusion is precisely what compactness buys.

Convex Sets as Intersections of Half-Spaces

The separation theorems have a structural payoff that closes the circle opened in the first section. There we said a functional cuts the space into half-spaces. We can now run that statement backwards. A closed convex set is exactly the intersection of all the closed half-spaces that contain it. It is carved out of the space by the continuous linear functionals that stay on one side of it. This is the geometric counterpart of the analytic fact that a closed subspace is the common kernel of the functionals vanishing on it, and it is the precise sense in which the limit of a weakly convergent sequence cannot escape the closed convex hull of its terms.

Separating a Point from a Closed Convex Set

Specializing the compact-versus-closed theorem to a single point gives the workhorse corollary. Any point outside a closed convex set is strictly separated from it. A single point is compact, so the hypotheses are met with no further effort.

Corollary: A Point Outside a Closed Convex Set Is Strictly Separated From It

Let \(\mathcal{X}\) be a real locally convex space, let \(A\) be a closed convex subset, and let \(x \notin A\). Then \(x\) is strictly separated from \(A\). There are a continuous linear functional \(f\) and scalars \(\alpha_1 \lt \alpha_2\) with \(f(a) \leq \alpha_1 \lt \alpha_2 \leq f(x)\) for all \(a \in A\).

Proof

The singleton \(B = \{x\}\) is convex, closed, and compact, and \(A \cap B = \varnothing\) because \(x \notin A\). The strict-separation theorem applied to \(A\) and \(B\) yields a continuous linear functional \(f\) and scalars \(\alpha_1 \lt \alpha_2\) with \(f(a) \leq \alpha_1\) for all \(a \in A\) and \(\alpha_2 \leq f(x)\).
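For a concrete picture, the following sketch works in the Euclidean plane with \(A\) the closed unit disk. It builds the separating functional from the nearest-point projection, which is a Hilbert-space shortcut and not the argument used in the proof above. The function names `project_onto_unit_disk` and `separating_functional` are hypothetical.

```python
import math

def project_onto_unit_disk(x):
    """Nearest point of the closed unit disk to x in the Euclidean norm."""
    r = math.hypot(x[0], x[1])
    return x if r <= 1.0 else (x[0] / r, x[1] / r)

def separating_functional(x):
    """Return (f, alpha1, alpha2) with f(a) <= alpha1 < alpha2 <= f(x) on the disk.

    Assumes x lies outside the closed unit disk.
    """
    p = project_onto_unit_disk(x)
    d = (x[0] - p[0], x[1] - p[1])  # normal direction of the separating line
    if d == (0.0, 0.0):
        raise ValueError("x lies in the disk; there is nothing to separate")

    def f(y):
        # f(y) = <d, y>, a continuous linear functional on the plane
        return d[0] * y[0] + d[1] * y[1]

    # f is maximized over the disk at p, and f(x) = f(p) + |d|^2 > f(p).
    return f, f(p), f(x)

f, alpha1, alpha2 = separating_functional((3.0, 4.0))
print(alpha1, alpha2)
```

For the point \((3, 4)\) the projection is \((0.6, 0.8)\), and the two scalars come out as approximately \(4\) and \(20\), so the disk and the point sit on opposite sides of a strip of positive width.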

The Closed Convex Hull as an Intersection of Half-Spaces

A closed half-space is itself closed and convex, so any intersection of closed half-spaces is closed and convex. The separation corollary supplies the converse. Every closed convex set arises this way, and applied to the smallest closed convex set containing a given set, the corollary identifies the closed convex hull as the intersection of all closed half-spaces containing the set.

Theorem: The Closed Convex Hull Is an Intersection of Closed Half-Spaces

Let \(\mathcal{X}\) be a real locally convex space and let \(A \subseteq \mathcal{X}\). Then the closed convex hull \(\overline{\operatorname{co}}(A)\) equals the intersection of all closed half-spaces containing \(A\): \[ \overline{\operatorname{co}}(A) = \bigcap\{\, H : H \text{ a closed half-space},\ A \subseteq H \,\} . \] In particular, every closed convex set is the intersection of the closed half-spaces containing it.

Proof

Let \(\mathcal{H}\) be the collection of all closed half-spaces containing \(A\), and write \(P = \bigcap\{H : H \in \mathcal{H}\}\).

\(\overline{\operatorname{co}}(A) \subseteq P\).
Each \(H \in \mathcal{H}\) is closed and convex and contains \(A\), so it contains the smallest closed convex set containing \(A\), namely \(\overline{\operatorname{co}}(A)\). Intersecting over all \(H \in \mathcal{H}\) gives \(\overline{\operatorname{co}}(A) \subseteq P\).

\(P \subseteq \overline{\operatorname{co}}(A)\).
We prove the contrapositive. If \(x \notin \overline{\operatorname{co}}(A)\), then \(x \notin P\). The set \(\overline{\operatorname{co}}(A)\) is closed and convex, and \(x\) lies outside it, so by the preceding corollary there are a continuous linear functional \(f\) and scalars \(\alpha_1 \lt \alpha_2\) with \(f(y) \leq \alpha_1\) for all \(y \in \overline{\operatorname{co}}(A)\) and \(f(x) \geq \alpha_2\). The closed half-space \(H = \{\, y : f(y) \leq \alpha_1 \,\}\) then contains \(A \subseteq \overline{\operatorname{co}}(A)\), so \(H \in \mathcal{H}\). But \(f(x) \geq \alpha_2 \gt \alpha_1\) means \(x \notin H\), hence \(x \notin P\).

The two inclusions give \(\overline{\operatorname{co}}(A) = P\). Taking \(A\) itself closed and convex makes \(\overline{\operatorname{co}}(A) = A\), so \(A\) is the intersection of the closed half-spaces containing it.
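A finite-dimensional sketch of the theorem: the unit square in the plane is the closed convex hull of its four corners, and it is cut out by four closed half-spaces. In general the theorem needs the whole family of half-spaces containing \(A\); that four suffice here is special to polygons. The names `HALF_SPACES`, `in_intersection`, and `excluding_half_space` are illustrative assumptions.

```python
# Each closed half-space is {y : a[0]*y[0] + a[1]*y[1] <= c}, stored as (a, c).
HALF_SPACES = [
    ((-1.0, 0.0), 0.0),  # -y0 <= 0, i.e. y0 >= 0
    ((1.0, 0.0), 1.0),   #  y0 <= 1
    ((0.0, -1.0), 0.0),  # -y1 <= 0, i.e. y1 >= 0
    ((0.0, 1.0), 1.0),   #  y1 <= 1
]

# The set A whose closed convex hull is the unit square.
CORNERS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def in_half_space(y, half_space):
    """True if y satisfies the inequality defining the half-space."""
    a, c = half_space
    return a[0] * y[0] + a[1] * y[1] <= c

def in_intersection(y):
    """True if y lies in every listed half-space, i.e. in the square."""
    return all(in_half_space(y, hs) for hs in HALF_SPACES)

def excluding_half_space(y):
    """Return a listed half-space that contains A but not y, or None."""
    for hs in HALF_SPACES:
        if not in_half_space(y, hs):
            return hs
    return None

print(in_intersection((0.5, 0.5)), excluding_half_space((1.5, 0.5)))
```

A point outside the square is witnessed by a single half-space that contains every corner but excludes the point, which is exactly the role played by \(H\) in the second inclusion of the proof.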

This is the geometric form of Mazur's principle. A point in the closure of the convex hull of a set cannot be told apart from that set by any continuous linear functional kept on one side. Equivalently, if some functional strictly separates a point from the set, that point lies outside the closed convex hull. The same dichotomy, read on a subspace rather than a general convex set, recovers the analytic characterization of the closed span.

Corollary: The Closed Span Is an Intersection of Closed Hyperplanes

Let \(\mathcal{X}\) be a real locally convex space and let \(A \subseteq \mathcal{X}\). Then the closed linear span of \(A\) equals the intersection of all closed hyperplanes \(\{f = 0\}\) containing \(A\), as \(f\) ranges over the nonzero continuous linear functionals vanishing on \(A\). If no such functional exists, the intersection is read as all of \(\mathcal{X}\).

Proof

Let \(\mathcal{Y} = \overline{\operatorname{span}}(A)\), a closed linear subspace, and let \(P\) be the intersection of all closed hyperplanes \(\ker f\) containing \(A\) with \(f\) a continuous linear functional. If \(f\) is continuous and vanishes on \(A\), it vanishes on the span of \(A\) by linearity and on its closure \(\mathcal{Y}\) by continuity. Hence \(\mathcal{Y} \subseteq \ker f\) for each such \(f\), giving \(\mathcal{Y} \subseteq P\).

Conversely, suppose \(x \notin \mathcal{Y}\). Then \(\mathcal{Y}\) is a closed convex set not containing \(x\), so by the separation corollary there are a continuous linear functional \(g\) and scalars \(\alpha_1 \lt \alpha_2\) with \(g(y) \leq \alpha_1 \lt \alpha_2 \leq g(x)\) for all \(y \in \mathcal{Y}\). A linear functional bounded above on the subspace \(\mathcal{Y}\) must vanish on it. If \(g(y_0) \neq 0\) for some \(y_0 \in \mathcal{Y}\), then \(t y_0 \in \mathcal{Y}\) for every real \(t\), and \(g(t y_0) = t\, g(y_0)\) is unbounded above as \(t \to +\infty\) when \(g(y_0) \gt 0\) and as \(t \to -\infty\) when \(g(y_0) \lt 0\), contradicting the bound \(g \leq \alpha_1\). Hence \(g\) vanishes on \(\mathcal{Y} \supseteq A\). Since \(0 \in \mathcal{Y}\), the bound gives \(0 = g(0) \leq \alpha_1\), so \(g(x) \geq \alpha_2 \gt \alpha_1 \geq 0\) and \(g(x) \neq 0\). In particular \(g\) is nonzero, so \(\ker g\) is a closed hyperplane containing \(A\), and \(x \notin \ker g\) gives \(x \notin P\). Thus \(P \subseteq \mathcal{Y}\), and the two inclusions give \(P = \mathcal{Y}\).
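The corollary can be checked by hand in \(\mathbb{R}^3\). The sketch below takes \(A\) to be two independent vectors, so the span is a plane and every functional vanishing on \(A\) is a multiple of the one given by the cross product. A single hyperplane suffices only because of this low-dimensional setup. The vectors and the names `cross`, `dot`, and `g` are chosen for illustration.

```python
def cross(a, b):
    """Cross product in R^3; the result is orthogonal to both a and b."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dot(a, b):
    """Standard inner product on R^3."""
    return sum(ai * bi for ai, bi in zip(a, b))

# The set A: two linearly independent vectors spanning a plane.
A1 = (1, 0, 1)
A2 = (0, 1, 1)

# Coefficients of a functional vanishing on A (unique up to scaling here).
NORMAL = cross(A1, A2)

def g(y):
    """The linear functional y -> <NORMAL, y>; its kernel is span(A)."""
    return dot(NORMAL, y)

in_span = (1, 1, 2)      # equals A1 + A2, so it lies in the span
not_in_span = (0, 0, 1)  # lies off the plane

print(NORMAL, g(in_span), g(not_in_span))
```

The functional `g` vanishes on both generators and on their sum, while it is nonzero at the point off the plane, so its kernel excludes that point just as \(\ker g\) excludes \(x\) in the proof.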

The Complex Case

Every complex locally convex space is also a real locally convex space, obtained by restricting scalar multiplication to real scalars. The topology and convexity are untouched. The separation theorems therefore apply to the underlying real space, and the real functionals they produce are promoted to complex functionals by the recovery lemma. The separating inequalities are then phrased through the real part, since that is what the real theory controls.

Theorem: Strict Separation in a Complex Locally Convex Space

Let \(\mathcal{X}\) be a complex locally convex space and let \(A\) and \(B\) be disjoint closed convex subsets with \(B\) compact. Then there are a continuous linear functional \(f : \mathcal{X} \to \mathbb{C}\), a real scalar \(\alpha\), and an \(\varepsilon \gt 0\) such that \[ \operatorname{Re} f(a) \leq \alpha \lt \alpha + \varepsilon \leq \operatorname{Re} f(b) \quad \text{for all } a \in A,\ b \in B . \]

Proof

Regard \(\mathcal{X}\) as a real locally convex space. The sets \(A\) and \(B\) remain disjoint, closed, and convex, and \(B\) remains compact. By the real strict-separation theorem there are a continuous \(\mathbb{R}\)-linear functional \(u : \mathcal{X} \to \mathbb{R}\) and scalars \(\alpha_1 \lt \alpha_2\) with \(u(a) \leq \alpha_1\) for all \(a \in A\) and \(u(b) \geq \alpha_2\) for all \(b \in B\). Put \(\alpha = \alpha_1\) and \(\varepsilon = \alpha_2 - \alpha_1 \gt 0\), so that \(u(a) \leq \alpha\) for all \(a \in A\) and \(u(b) \geq \alpha + \varepsilon\) for all \(b \in B\).

Define \(f(x) = u(x) - i\, u(ix)\). By the recovery lemma, \(f\) is \(\mathbb{C}\)-linear with \(u = \operatorname{Re} f\), and \(f\) is continuous because \(u\) and \(x \mapsto u(ix)\) are continuous. Substituting \(u = \operatorname{Re} f\) into the inequalities above gives \[ \operatorname{Re} f(a) \leq \alpha \lt \alpha + \varepsilon \leq \operatorname{Re} f(b) \] for all \(a \in A\) and \(b \in B\), as claimed.
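The recovery formula can be sanity-checked numerically on \(\mathbb{C}^2\). The sketch assumes a real-linear functional of the form \(u(z) = \operatorname{Re}(c_1 z_1 + c_2 z_2)\) with arbitrarily chosen coefficients `C`, rebuilds \(f\) from \(u\) alone, and compares it with the complex-linear map \(z \mapsto c_1 z_1 + c_2 z_2\). All names are illustrative.

```python
# Arbitrary complex coefficients; they are an assumption for this illustration.
C = (2 - 1j, 0.5 + 3j)

def u(z):
    """A real-linear functional on C^2: the real part of sum(c_k * z_k)."""
    return sum(c * zk for c, zk in zip(C, z)).real

def times_i(z):
    """Multiply a vector of C^2 by the scalar i."""
    return tuple(1j * zk for zk in z)

def f(z):
    """Recovery formula f(z) = u(z) - i * u(i z), built from u only."""
    return complex(u(z), -u(times_i(z)))

z = (1 + 2j, -3 + 0.5j)
direct = sum(c * zk for c, zk in zip(C, z))

print(f(z), direct)            # the two values agree up to rounding
print(f(times_i(z)), 1j * f(z))  # complex homogeneity: f(i z) = i f(z)
```

The real part of `f(z)` is `u(z)` by construction, which is why the separating inequalities obtained for \(u\) transfer verbatim to \(\operatorname{Re} f\).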

With this the geometric theory is complete in both scalar fields. A continuous linear functional is the basic instrument of separation. Over \(\mathbb{R}\) it acts directly, and over \(\mathbb{C}\) it acts through its real part, which is the real functional the separation theorems actually build. The closed convex sets of a locally convex space are thus exactly the sets cut out by these functionals, the intersections of the closed half-spaces that contain them. The existence of the functionals, the deep input that makes the whole geometry possible, traces back through the gauge construction to the single extension principle from which this development began.
