Approximating Continuous Functions
The embedding theorem placed every manifold inside a Euclidean space. The second application of
Sard's circle of ideas concerns approximation: a continuous function can be approximated
arbitrarily closely by a smooth one, and a continuous map between manifolds is homotopic to a
smooth one. The first claim,
for functions valued in a Euclidean space, is elementary and uses only a partition of unity; we
prove it here, and it becomes the analytic core of everything that follows. The passage to maps
between manifolds is deferred to the final section, because it requires a geometric construction —
the tubular neighborhood — built in between.
The right notion of approximation allows the tolerance to vary from point to point, which is what
makes the statement strong enough to be useful on noncompact manifolds. Given a positive
continuous function \(\delta : M \to \mathbb{R}\), two maps \(F, \widetilde{F} : M \to
\mathbb{R}^k\) are \(\delta\)-close if \(\lvert F(x) - \widetilde{F}(x) \rvert
< \delta(x)\) for every \(x \in M\). A uniform tolerance is the special case of a constant
\(\delta\); allowing \(\delta\) to shrink toward zero at infinity forces the approximation to be
as tight as we like where the manifold runs off to its ends.
Theorem (Whitney Approximation Theorem for Functions)
Let \(M\) be a smooth manifold with or without boundary, and let \(F : M \to \mathbb{R}^k\) be
a continuous function. For any positive continuous function \(\delta : M \to \mathbb{R}\),
there is a smooth function \(\widetilde{F} : M \to \mathbb{R}^k\) that is \(\delta\)-close to
\(F\). If \(F\) is already smooth on a closed subset \(A \subseteq M\), then \(\widetilde{F}\)
can be chosen to equal \(F\) on \(A\).
Proof.
If \(F\) is smooth on the closed set \(A\), the
extension lemma
provides a smooth function \(F_0 : M \to \mathbb{R}^k\) agreeing with \(F\) on \(A\); let
\[
U_0 = \{\, y \in M : \lvert F_0(y) - F(y) \rvert < \delta(y) \,\},
\]
an open set containing \(A\), since \(F_0\) and \(F\) agree there. (If there is no such \(A\),
take \(U_0 = A = \varnothing\) and \(F_0 \equiv 0\).)
Away from \(A\) we approximate locally by constants and patch. We claim there are countably
many points \(\{x_i\}\) in \(M \setminus A\) and neighborhoods \(U_i\) of \(x_i\) covering
\(M \setminus A\), such that
\[
\lvert F(y) - F(x_i) \rvert < \delta(y) \qquad \text{for all } y \in U_i.
\]
Indeed, around any \(x \in M \setminus A\), continuity of \(F\) and \(\delta\) lets us choose a
neighborhood \(U_x \subseteq M \setminus A\) small enough that \(\delta(y) > \tfrac12
\delta(x)\) and \(\lvert F(y) - F(x) \rvert < \tfrac12 \delta(x)\) hold on it; then for
\(y \in U_x\),
\[
\lvert F(y) - F(x) \rvert < \tfrac12 \delta(x) < \delta(y).
\]
The collection \(\{U_x\}\) covers \(M \setminus A\); since \(M \setminus A\) is second-countable,
it has a countable subcover, and this gives the \(\{x_i\}\) and \(\{U_i\}\) with the displayed
property.
Now take a smooth
partition of unity
\(\{\varphi_0, \varphi_i\}\) subordinate to the open cover \(\{U_0, U_i\}\) of \(M\) — these
sets do cover \(M\), since \(U_0 \supseteq A\) and the \(U_i\) cover \(M \setminus A\) — and
define
\[
\widetilde{F}(y) = \varphi_0(y) F_0(y) + \sum_{i \ge 1} \varphi_i(y)\, F(x_i).
\]
This is smooth: \(F_0\) is smooth, the values \(F(x_i)\) are constants, and the partition of
unity is smooth and locally finite. On \(A\) every \(\varphi_i\) with \(i \ge 1\) vanishes —
their supports miss \(A\) — so \(\widetilde{F} = \varphi_0 F_0 = F_0 = F\) there.
Finally we estimate the error. Since \(\sum_{i \ge 0} \varphi_i \equiv 1\), we may write
\(F(y) = \big(\varphi_0(y) + \sum_{i \ge 1} \varphi_i(y)\big) F(y)\) and subtract:
\[
\big\lvert \widetilde{F}(y) - F(y) \big\rvert
= \Big\lvert \varphi_0(y)\big(F_0(y) - F(y)\big)
+ \sum_{i \ge 1} \varphi_i(y)\big(F(x_i) - F(y)\big) \Big\rvert .
\]
Each surviving term is controlled: where \(\varphi_0(y) \ne 0\) we have \(y \in U_0\), so
\(\lvert F_0(y) - F(y) \rvert < \delta(y)\); and where \(\varphi_i(y) \ne 0\) we have
\(y \in U_i\), so \(\lvert F(x_i) - F(y) \rvert < \delta(y)\) by the local estimate above. The triangle
inequality, together with \(\sum_{i \ge 0} \varphi_i(y) = 1\) (so that at least one weight is
positive and the inequality stays strict), gives
\[
\big\lvert \widetilde{F}(y) - F(y) \big\rvert
< \Big( \varphi_0(y) + \sum_{i \ge 1} \varphi_i(y) \Big)\, \delta(y) = \delta(y),
\]
so \(\widetilde{F}\) is \(\delta\)-close to \(F\). \(\blacksquare\)
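The construction in the proof can be tried numerically. The following is a minimal sketch, not part of the argument, for \(M = [-1, 1]\), \(A = \varnothing\), and \(F(x) = \lvert x \rvert\). The tolerance `delta`, the spacing `h`, the support radius `r`, and every function name are choices made for this illustration. Because this \(F\) is 1-Lipschitz and `r` is smaller than the minimum of `delta`, each constant \(F(x_i)\) is \(\delta\)-close to \(F\) on the support of its bump, which is the local estimate used in the proof.

```python
import math

def bump(t):
    """Smooth bump: positive on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def F(x):
    """Continuous, but not smooth at 0."""
    return abs(x)

def delta(x):
    """Positive continuous tolerance (illustrative choice), never below 0.05."""
    return 0.05 + 0.1 * x * x

h = 0.02  # spacing of the centers x_i (assumption of this sketch)
r = 0.04  # support radius of each bump; r < min(delta) and F is 1-Lipschitz
centers = [-1.0 + i * h for i in range(101)]

def F_tilde(y):
    """Sum of phi_i(y) * F(x_i), where phi_i = psi_i / sum_j psi_j."""
    weights = [bump((y - c) / r) for c in centers]
    total = sum(weights)  # positive for every y in [-1, 1]
    return sum(w * F(c) for w, c in zip(weights, centers)) / total

# Smallest value of delta(y) - |F_tilde(y) - F(y)| over a sample of points.
samples = [-1.0 + k / 500.0 for k in range(1001)]
worst_margin = min(delta(y) - abs(F_tilde(y) - F(y)) for y in samples)
```

A positive `worst_margin` says the sampled error stays strictly below the tolerance. Normalizing the bumps by their sum is one standard way to build a partition of unity on an interval; the proof itself only needs that some subordinate partition exists.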
A small reformulation of the theorem will be needed twice in what follows: a positive continuous
function can always be underestimated by a positive smooth one. It is the device that
converts a merely continuous tolerance into one we can differentiate.
Corollary: A Smaller Positive Smooth Function
If \(M\) is a smooth manifold with or without boundary and \(\delta : M \to \mathbb{R}\) is a
positive continuous function, then there is a positive smooth function \(e : M \to
\mathbb{R}\) with \(0 < e(x) < \delta(x)\) for all \(x \in M\).
Proof.
Apply the theorem to the continuous function \(\tfrac12 \delta\) with tolerance \(\tfrac12
\delta\): there is a smooth \(e\) with \(\lvert e(x) - \tfrac12 \delta(x) \rvert < \tfrac12
\delta(x)\) for all \(x\), and this inequality is exactly \(0 < e(x) < \delta(x)\).
\(\blacksquare\)
The Normal Bundle and Tubular Neighborhoods
The approximation theorem just proved smooths out functions valued in a Euclidean space. To smooth
a continuous map valued in a manifold \(M\), we will first embed \(M\) in some \(\mathbb{R}^n\),
approximate in \(\mathbb{R}^n\), and then push the approximation back onto \(M\). The last step
needs a smooth map that sends nearby points of \(\mathbb{R}^n\) back to \(M\) — a retraction of a
neighborhood of \(M\) onto \(M\). Constructing it is the geometric heart of this section, and the
construction runs through the normal directions to \(M\).
Throughout, \(M \subseteq \mathbb{R}^n\) is an embedded \(m\)-dimensional submanifold. The
identification of \(T_x\mathbb{R}^n\) with \(\mathbb{R}^n\) carries the Euclidean dot product onto
each tangent space, and we use it to speak of orthogonality.
Definition: Normal Space and Normal Bundle
For \(x \in M\), the normal space to \(M\) at \(x\) is the orthogonal
complement of the
tangent space
inside \(T_x\mathbb{R}^n \cong \mathbb{R}^n\),
\[
N_x M = (T_x M)^{\perp} \subseteq \mathbb{R}^n,
\]
an \((n - m)\)-dimensional subspace. The normal bundle of \(M\) is the set of
all normal vectors at all points,
\[
NM = \{\, (x, v) \in \mathbb{R}^n \times \mathbb{R}^n : x \in M,\ v \in N_x M \,\}
\subseteq T\mathbb{R}^n,
\]
viewed as a subset of
the tangent bundle
\(T\mathbb{R}^n \cong \mathbb{R}^n \times \mathbb{R}^n\), with natural projection
\(\pi_{NM} : NM \to M\) sending \((x, v)\) to \(x\).
The normal bundle is itself a smooth manifold, and of exactly the ambient dimension — the \(m\)
directions along \(M\) and the \(n - m\) normal directions together restore all \(n\). This is the
fact that lets us treat it like the domain of a chart.
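As a small concrete case, take \(M\) to be the parabola \(y = x^2\) in \(\mathbb{R}^2\), so \(m = 1\) and \(n = 2\). The sketch below is an illustration only; the parametrization by `t` and the helper names are assumptions of the example. It writes down a tangent vector and a unit normal vector at each point and lets one check that they are orthogonal, so that the pair `(t, s)` parametrizes \(NM\) by two numbers, matching the ambient dimension.

```python
import math

def point(t):
    """Point of the parabola y = x^2 with first coordinate t."""
    return (t, t * t)

def tangent(t):
    """A vector spanning the tangent line at point(t)."""
    return (1.0, 2.0 * t)

def unit_normal(t):
    """A unit vector spanning the normal line at point(t)."""
    s = math.sqrt(1.0 + 4.0 * t * t)
    return (-2.0 * t / s, 1.0 / s)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def normal_bundle_element(t, s):
    """The element (x, v) of NM with base point point(t) and v = s * unit_normal(t)."""
    n = unit_normal(t)
    return (point(t), (s * n[0], s * n[1]))
```

For example, `normal_bundle_element(0.0, 0.5)` is the pair consisting of the vertex `(0.0, 0.0)` and the vertical vector `(0.0, 0.5)`.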
Theorem: The Normal Bundle Is an Embedded Submanifold
If \(M \subseteq \mathbb{R}^n\) is an embedded \(m\)-dimensional submanifold, then \(NM\) is an
embedded \(n\)-dimensional submanifold of \(T\mathbb{R}^n \cong \mathbb{R}^n \times
\mathbb{R}^n\).
Proof.
Fix \(x_0 \in M\) and a
slice chart
\((U, \varphi)\) for \(M\) centered at \(x_0\), with coordinate functions \((u^1, \dots,
u^n)\) so that \(M \cap U\) is cut out by \(u^{m+1} = \dots = u^n = 0\). At each point
\(x \in U\), the vectors
\[
E_j\big|_x = (d\varphi_x)^{-1}\Big(\tfrac{\partial}{\partial u^j}\Big|_{\varphi(x)}\Big),
\qquad j = 1, \dots, n,
\]
form a basis for \(T_x\mathbb{R}^n\); writing \(E_j|_x = E_j^i(x)\, \partial/\partial x^i\),
each component \(E_j^i\) is a partial derivative of \(\varphi^{-1}\), hence a smooth function
of \(x\). For \(x \in M \cap U\), the first \(m\) of these vectors span \(T_x M\). Define
\[
\Phi : U \times \mathbb{R}^n \to \widehat{U} \times \mathbb{R}^n,
\qquad
\Phi(x, v) = \big(u^1(x), \dots, u^n(x),\, v \cdot E_1|_x, \dots, v \cdot E_n|_x\big),
\]
where \(\widehat{U} = \varphi(U)\). Its total derivative at \((x, v)\) has block form
\[
D\Phi_{(x, v)} =
\begin{pmatrix}
\dfrac{\partial u^i}{\partial x^j}(x) & 0 \\
* & E_j^i(x)
\end{pmatrix},
\]
whose diagonal blocks are invertible — the upper one because \(\varphi\) is a diffeomorphism,
the lower one because the \(E_j\) form a basis — so \(\Phi\) is a local diffeomorphism. It is
injective on \(U \times \mathbb{R}^n\): if \(\Phi(x, v) = \Phi(x', v')\), the first \(n\)
coordinates force \(x = x'\) since \(\varphi\) is injective, and then \(v \cdot E_i|_x = v'
\cdot E_i|_x\) for all \(i\) forces \(v = v'\) since the \(E_i\) are a basis. Thus \(\Phi\) is a
diffeomorphism onto its image, a smooth coordinate chart on \(U \times \mathbb{R}^n\).
In these coordinates \(NM\) is a slice. A pair \((x, v)\) lies in \(NM\) exactly when
\(x \in M \cap U\), so \(u^{m+1}(x) = \dots = u^n(x) = 0\), and \(v \perp T_x M\), so the
components \(v \cdot E_1|_x = \dots = v \cdot E_m|_x = 0\). These are \(n\) coordinate
functions of \(\Phi\) set to zero, exhibiting \(NM\) locally as the slice where they vanish.
Hence \(\Phi\) is a slice chart for \(NM\), and since such charts exist around every point,
\(NM\) is an embedded \(n\)-dimensional submanifold. \(\blacksquare\)
The normal bundle organizes the directions in which one leaves \(M\). Adding a normal vector to its
base point pushes off \(M\) along that direction, and doing so for all small normal vectors sweeps
out a neighborhood of \(M\). The map that performs this is addition.
Definition: Tubular Neighborhood
Let \(E : NM \to \mathbb{R}^n\) be the smooth map \(E(x, v) = x + v\), the restriction to
\(NM\) of the addition map \(\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n\). A
tubular neighborhood of \(M\) is a neighborhood \(U\) of \(M\) in
\(\mathbb{R}^n\) that is the diffeomorphic image under \(E\) of an open subset \(V \subseteq
NM\) of the form
\[
V = \{\, (x, v) \in NM : \lvert v \rvert < \delta(x) \,\}
\]
for some positive continuous function \(\delta : M \to \mathbb{R}\).
A tubular neighborhood is a thickening of \(M\) by a variable radius, fibered by the normal disks.
Its existence is the central construction of the section.
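For the unit circle in \(\mathbb{R}^2\) everything is explicit: the normal space at \(x = (\cos\theta, \sin\theta)\) is the line spanned by \(x\), so a normal vector can be written \(v = s\,x\). The sketch below is a hedged illustration (the names `E`, `E_inverse` and the coordinates `theta`, `s` are conventions of this example). It writes out the addition map and an inverse that is valid on the set \(\lvert s \rvert < 1\), which corresponds to the constant radius function \(\delta \equiv 1\).

```python
import math

def E(theta, s):
    """Addition map x + v with x = (cos theta, sin theta) and v = s * x."""
    return ((1.0 + s) * math.cos(theta), (1.0 + s) * math.sin(theta))

def E_inverse(y):
    """Inverse of E on the punctured disk 0 < |y| < 2, returned as (theta, s).

    theta is taken in (-pi, pi], and s = |y| - 1 lies in (-1, 1).
    """
    return (math.atan2(y[1], y[0]), math.hypot(y[0], y[1]) - 1.0)

# Round trip through the tube for one inward and one outward normal vector.
theta_in, s_in = E_inverse(E(0.7, -0.5))
theta_out, s_out = E_inverse(E(0.7, 0.3))
```

At \(s = -1\) every normal line reaches the origin, so `E` stops being injective there; this is why the radius in this example cannot exceed 1.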
Theorem (Tubular Neighborhood Theorem)
Every embedded submanifold of \(\mathbb{R}^n\) has a tubular neighborhood.
Proof.
Let \(M_0 = \{(x, 0) : x \in M\} \subseteq NM\) be the zero section. We first show that \(E\)
is a local diffeomorphism near every point of \(M_0\). By the
inverse function theorem,
it suffices that \(dE_{(x, 0)}\) be bijective. Two restrictions of \(E\) reveal its
differential. Restricted to the zero section, \(E\) is the obvious diffeomorphism
\(M_0 \to M\), so \(dE_{(x,0)}\) carries \(T_{(x,0)} M_0\) isomorphically onto \(T_x M\);
restricted to the fiber \(N_x M\), the map \(E\) is the affine map \(w \mapsto x + w\), so
\(dE_{(x,0)}\) carries \(T_{(x,0)}(N_x M)\) isomorphically onto \(N_x M\). Since
\(T_x\mathbb{R}^n = T_x M \oplus N_x M\), the differential \(dE_{(x,0)}\) is surjective, hence
bijective by equality of dimensions. So \(E\) restricts to a diffeomorphism on a neighborhood
of \((x, 0)\), which — because \(NM\) is embedded in \(\mathbb{R}^n \times \mathbb{R}^n\) and
carries the Euclidean metric — we may take of the form \(V_\delta(x) = \{(x', v') \in NM :
\lvert x - x' \rvert < \delta,\ \lvert v' \rvert < \delta\}\) for some \(\delta > 0\).
It remains to find a single open set of the form
\(V = \{(x, v) : \lvert v \rvert < \delta(x)\}\) on which \(E\) is a global
diffeomorphism. For each \(x \in M\), let \(\rho(x)\) be the supremum of all \(\delta \le 1\)
for which \(E\) is a diffeomorphism from \(V_\delta(x)\) onto its image; the previous paragraph
makes \(\rho(x)\) positive. Comparing the neighborhoods around two nearby base points \(x, x'\)
— a ball about \(x'\) of radius \(\delta\) is contained in one about \(x\) of radius \(\delta +
\lvert x - x' \rvert\) — gives the estimate \(\rho(x) - \rho(x') \le \lvert x - x' \rvert\),
and by symmetry \(\rho\) is continuous (indeed \(1\)-Lipschitz). Set
\(V = \{(x, v) \in NM : \lvert v \rvert < \tfrac12 \rho(x)\}\), an open set of the required
form with the positive continuous radius function \(\tfrac12\rho\).
On \(V\) the map \(E\) is injective. Suppose \(E(x, v) = E(x', v')\) with both points in
\(V\), and say \(\rho(x') \le \rho(x)\). From \(x + v = x' + v'\) we get \(x - x' = v' - v\), so
\[
\lvert x - x' \rvert = \lvert v' - v \rvert \le \lvert v \rvert + \lvert v' \rvert
< \tfrac12\rho(x) + \tfrac12\rho(x') \le \rho(x),
\]
using \(\lvert v \rvert < \tfrac12\rho(x)\), \(\lvert v' \rvert < \tfrac12\rho(x')\), and
\(\rho(x') \le \rho(x)\). Together with \(\lvert v \rvert, \lvert v' \rvert < \rho(x)\), this
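places both \((x, v)\) and \((x', v')\) inside \(V_{\rho(x)}(x)\). That set is the increasing
union of the sets \(V_\delta(x)\) with \(\delta < \rho(x)\), on each of which \(E\) is injective by
the definition of \(\rho(x)\), so \(E\) is injective on it; hence \((x, v) = (x', v')\). Being an injective local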
diffeomorphism, \(E\) is a diffeomorphism from \(V\) onto the open set \(U = E(V)\) by the
criterion for a bijective local diffeomorphism.
Thus \(U\) is a tubular neighborhood of \(M\), with radius function \(\tfrac12\rho\).
\(\blacksquare\)
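The need to control the radius can be seen on the parabola \(y = x^2\). The following sketch is an illustration under assumptions of its own: the parametrization by `t` and the helper names are not from the text, and it does not compute the function \(\rho\) of the proof. It shows that normal vectors of length \(\tfrac12\sqrt{1 + 4t^2}\) based at the mirror-image points \((t, t^2)\) and \((-t, t^2)\) are sent by \(E\) to the same point of the axis, so \(E\) is not injective on any set of normal vectors that contains both.

```python
import math

def base(t):
    """Point of the parabola y = x^2 with first coordinate t."""
    return (t, t * t)

def unit_normal(t):
    """Unit normal at base(t), pointing toward the axis of the parabola."""
    s = math.sqrt(1.0 + 4.0 * t * t)
    return (-2.0 * t / s, 1.0 / s)

def E(t, length):
    """Addition map applied to the normal vector length * unit_normal(t) at base(t)."""
    x, n = base(t), unit_normal(t)
    return (x[0] + length * n[0], x[1] + length * n[1])

def collision_length(t):
    """Length at which the normal line from base(t) meets the axis x = 0."""
    return 0.5 * math.sqrt(1.0 + 4.0 * t * t)

t = 0.3
left = E(-t, collision_length(-t))
right = E(t, collision_length(t))  # same point as `left`, namely (0, t^2 + 1/2)
```

As `t` tends to 0 the collision length tends to \(\tfrac12\), so in this example no radius function exceeding \(\tfrac12\) near the vertex can work.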
The payoff is the retraction we set out to build. A tubular neighborhood comes with a canonical map
back onto \(M\): undo the diffeomorphism \(E\) and forget the normal vector.
Proposition: Tubular Neighborhoods Retract onto the Submanifold
Let \(M \subseteq \mathbb{R}^n\) be an embedded submanifold with tubular neighborhood \(U\).
Then there is a smooth map \(r : U \to M\) that is both a retraction — meaning
\(r|_M\) is the identity — and a
submersion.
Proof.
Write \(U = E(V)\) with \(E : V \to U\) a diffeomorphism, and set \(r = \pi_{NM} \circ
E^{-1}\), the composition of the inverse diffeomorphism with the bundle projection
\(\pi_{NM} : NM \to M\). It is smooth as a composition of smooth maps. For \(x \in M\) we have
\(E^{-1}(x) = (x, 0)\), so \(r(x) = \pi_{NM}(x, 0) = x\), making \(r\) a retraction. To see
that \(r\) is a submersion it is enough that \(\pi_{NM}\) be one, since \(E^{-1}\) is a
diffeomorphism. Restricting the chart \(\Phi\) built above to the slice \(NM\) gives coordinates
\((u^1, \dots, u^m, v \cdot E_{m+1}|_x, \dots, v \cdot E_n|_x)\) on \(NM\), while \(M \cap U\)
carries \((u^1, \dots, u^m)\) as its chart. In these coordinates \(\pi_{NM}\) is the projection
onto the first \(m\) coordinates, whose differential is surjective. Hence \(\pi_{NM}\), and
therefore \(r\), is a submersion. \(\blacksquare\)
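For the unit circle in \(\mathbb{R}^2\) the retraction can be written out as the composition used in the proof. This is a sketch for that one example, with hypothetical helper names: `E_inverse` splits a point \(y\) with \(0 < \lvert y \rvert < 2\) into its base point and normal vector, and `projection` forgets the normal vector.

```python
import math

def E_inverse(y):
    """Write y (with 0 < |y| < 2) as x + v, x on the unit circle, v normal at x."""
    n = math.hypot(y[0], y[1])
    x = (y[0] / n, y[1] / n)
    v = (y[0] - x[0], y[1] - x[1])
    return (x, v)

def projection(element):
    """Bundle projection NM -> M: keep the base point, drop the normal vector."""
    x, _v = element
    return x

def r(y):
    """Retraction of the tube onto the circle, r = projection o E_inverse."""
    return projection(E_inverse(y))

on_circle = (math.cos(1.2), math.sin(1.2))
off_circle = (1.1 * on_circle[0], 1.1 * on_circle[1])
fixed = r(on_circle)     # points of M are left where they are
pulled = r(off_circle)   # a nearby point is sent to its base point on M
```

In this example `r` is just radial normalization \(y \mapsto y / \lvert y \rvert\); for a general submanifold no such closed formula is available.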
Approximating Maps Between Manifolds
Now the two halves combine. To smooth a continuous map \(F : N \to M\) into a manifold, embed
\(M\) in \(\mathbb{R}^n\) by the Whitney theorem, approximate \(F\) by a smooth map into
\(\mathbb{R}^n\) using the approximation theorem for functions, and retract the result back onto
\(M\) through the tubular neighborhood. The retraction is what keeps the smoothed map valued in
\(M\) rather than drifting into the ambient space, and the straight-line path from \(F\) to its
approximation, pushed down by the retraction, supplies a homotopy for free.
Theorem (Whitney Approximation Theorem)
Let \(N\) be a smooth manifold with or without boundary, let \(M\) be a smooth manifold
without boundary, and let \(F : N \to M\) be a continuous map. Then \(F\) is homotopic to a
smooth map; if \(F\) is already smooth on a closed subset \(A \subseteq N\), the homotopy can
be taken relative to \(A\).
Proof.
By the
Whitney embedding theorem
we may regard \(M\) as a properly embedded submanifold of some \(\mathbb{R}^n\). Let \(U\) be a
tubular neighborhood
of \(M\) and \(r : U \to M\) the associated
smooth retraction.
For \(x \in M\) set
\[
\delta(x) = \sup\{\, \varepsilon \le 1 : B_\varepsilon(x) \subseteq U \,\},
\]
a positive function, continuous by a triangle-inequality argument like the one for the
tubular neighborhood theorem. Then \(\widetilde\delta = \delta \circ F : N \to \mathbb{R}\) is
positive and continuous, so by the
approximation theorem for functions
there is a smooth map \(\widetilde{F} : N \to \mathbb{R}^n\) that is \(\widetilde\delta\)-close
to \(F\) and equal to \(F\) on \(A\).
Define \(H : N \times I \to M\) by
\[
H(p, t) = r\big((1 - t) F(p) + t \widetilde{F}(p)\big).
\]
This is well defined: for each \(p\), the bound \(\lvert \widetilde{F}(p) - F(p) \rvert <
\widetilde\delta(p) = \delta(F(p))\) means \(\widetilde{F}(p)\) lies in the ball
\(B_{\delta(F(p))}(F(p)) \subseteq U\), and since that ball is convex the entire segment from
\(F(p)\) to \(\widetilde{F}(p)\) lies in \(U\), where \(r\) is defined. Then \(H(p, 0) = r(F(p))
= F(p)\), because \(F(p) \in M\) and \(r\) is a retraction, while \(H(p, 1) = r(\widetilde{F}(p))\)
is smooth in \(p\) as a composition of smooth maps. Thus \(H\) is a homotopy from \(F\) to the
smooth map \(r \circ \widetilde{F}\). On \(A\) we have \(\widetilde{F} = F\), so the segment is
constant and \(H(p, t) = r(F(p)) = F(p)\) throughout, making the homotopy relative to \(A\).
\(\blacksquare\)
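The homotopy in the proof can be sketched for a map into the unit circle. In the example below, \(F(p) = (\cos\lvert p \rvert, \sin\lvert p \rvert)\) is continuous but has a corner at \(p = 0\); `F_tilde` is a bump-weighted average of values of \(F\), which is smooth but takes values slightly inside the circle; and `retract` is radial normalization. The grid `CENTERS`, the radius `RADIUS`, the restriction to \(p \in [-2, 2]\), and all names are assumptions of this illustration, and a constant tolerance of 0.1 stands in for \(\delta \circ F\).

```python
import math

def bump(t):
    """Smooth bump: positive on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def F(p):
    """Continuous map R -> unit circle, not smooth at p = 0."""
    return (math.cos(abs(p)), math.sin(abs(p)))

RADIUS = 0.1  # support radius of the bumps (illustrative)
CENTERS = [-2.2 + 0.05 * i for i in range(89)]  # covers [-2, 2] with margin

def F_tilde(p):
    """Smooth map into R^2: a convex combination of nearby values of F."""
    weights = [bump((p - c) / RADIUS) for c in CENTERS]
    total = sum(weights)  # positive for p in [-2, 2]
    x = sum(w * F(c)[0] for w, c in zip(weights, CENTERS)) / total
    y = sum(w * F(c)[1] for w, c in zip(weights, CENTERS)) / total
    return (x, y)

def retract(y):
    """Retraction of a neighborhood of the circle onto the circle."""
    n = math.hypot(y[0], y[1])
    return (y[0] / n, y[1] / n)

def H(p, t):
    """Straight-line path from F(p) to F_tilde(p), pushed back onto the circle."""
    a, b = F(p), F_tilde(p)
    return retract(((1.0 - t) * a[0] + t * b[0], (1.0 - t) * a[1] + t * b[1]))
```

Here `H(p, 0.0)` returns `F(p)` up to rounding, and `H(p, 1.0)` is the smooth map `retract(F_tilde(p))`. Since \(F\) is 1-Lipschitz, `F_tilde(p)` stays within `RADIUS` of `F(p)`, so the segment never passes through the origin and `retract` is defined along it.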
Replacing the codomain \(\mathbb{R}^k\) of the function theorem by a manifold turns approximation
into homotopy, and it also upgrades the extension lemma: a smooth map defined on a closed set
extends smoothly as soon as it extends continuously.
Corollary (Extension Lemma for Smooth Maps)
Let \(N\) be a smooth manifold with or without boundary, \(M\) a smooth manifold without
boundary, and \(A \subseteq N\) a closed subset. A smooth map \(f : A \to M\) has a smooth
extension to \(N\) if and only if it has a continuous extension to \(N\).
Proof.
A smooth extension is in particular continuous, giving one direction. Conversely, if
\(F : N \to M\) is a continuous extension of \(f\), then \(F\) is smooth on the closed set
\(A\), so the approximation theorem produces a homotopy relative to \(A\) from \(F\) to a
smooth map \(\widetilde{F}\). Being relative to \(A\), this homotopy fixes \(A\), so
\(\widetilde{F}\) agrees with \(F\), hence with \(f\), on \(A\) — a smooth extension.
\(\blacksquare\)
The approximation theorem also lets us upgrade homotopies themselves from continuous to smooth, a
refinement used whenever a homotopy-theoretic argument needs to be carried out with calculus. A
homotopy \(H : N \times I \to M\) is a smooth homotopy if it extends to a smooth
map on a neighborhood of \(N \times I\) in \(N \times \mathbb{R}\); two maps are
smoothly homotopic when such a homotopy connects them.
Lemma: Smooth Homotopy Is an Equivalence Relation
On the set of smooth maps from \(N\) to \(M\), smooth homotopy is an equivalence relation.
Proof.
Reflexivity and symmetry are immediate. For transitivity, suppose \(H_1\) is a smooth homotopy
from \(F\) to \(G\) and \(H_2\) from \(G\) to \(K\). Naively concatenating them at \(t =
\tfrac12\) would generally produce only a continuous map, with a corner in the time variable.
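We remove the corner by reparametrizing time with a smooth nondecreasing function
\(\varphi : [0, 1] \to [0, 2]\) that is \(0\) near \(0\), is \(2\) near \(1\), and is identically
\(1\) on a neighborhood of \(\tfrac12\); monotonicity ensures \(\varphi(t) \in [0, 1]\) for
\(t \le \tfrac12\) and \(\varphi(t) \in [1, 2]\) for \(t \ge \tfrac12\), so both cases below are
defined. Define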
\[
H(x, t) =
\begin{cases}
H_1\big(x, \varphi(t)\big), & t \in [0, \tfrac12], \\
H_2\big(x, \varphi(t) - 1\big), & t \in [\tfrac12, 1].
\end{cases}
\]
Near \(t = \tfrac12\) both pieces equal the constant-in-\(t\) map \(G\) (since \(\varphi
\equiv 1\) there), so they agree to all orders and \(H\) is smooth; it is a smooth homotopy
from \(F\) to \(K\). \(\blacksquare\)
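One way to produce such a reparametrization is to add two smooth steps. The sketch below is an illustration, not the only possible choice: the breakpoints 0.1, 0.4, 0.6, 0.9 and the names `smooth_step`, `phi`, `concatenate` are assumptions of this example. The resulting `phi` is 0 on \([0, 0.1]\), 1 on \([0.4, 0.6]\), and 2 on \([0.9, 1]\), and it is nondecreasing.

```python
import math

def f(t):
    """Smooth on R, zero for t <= 0, positive for t > 0."""
    return math.exp(-1.0 / t) if t > 0.0 else 0.0

def smooth_step(t):
    """Smooth, nondecreasing, 0 for t <= 0 and 1 for t >= 1."""
    return f(t) / (f(t) + f(1.0 - t))

def phi(t):
    """Time reparametrization [0, 1] -> [0, 2], constant near 0, 1/2 and 1."""
    return smooth_step((t - 0.1) / 0.3) + smooth_step((t - 0.6) / 0.3)

def concatenate(H1, H2):
    """Concatenation of two homotopies using phi, as in the proof."""
    def H(x, t):
        if t <= 0.5:
            return H1(x, phi(t))
        return H2(x, phi(t) - 1.0)
    return H

# Example on the real line: straight-line homotopies x -> x^2 -> x^3.
H1 = lambda x, s: (1.0 - s) * x + s * x ** 2
H2 = lambda x, s: (1.0 - s) * x ** 2 + s * x ** 3
H = concatenate(H1, H2)
```

With these choices `H(2.0, 0.0)` is 2.0, `H(2.0, t)` is 4.0 for every `t` in \([0.4, 0.6]\), and `H(2.0, 1.0)` is 8.0, so the two pieces meet along a whole interval of times rather than at a single corner.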
Finally, the distinction between continuous and smooth homotopy collapses for smooth maps: any two
smooth maps that are homotopic at all are smoothly homotopic.
Theorem: Homotopic Smooth Maps Are Smoothly Homotopic
Let \(N\) be a smooth manifold with or without boundary, \(M\) a smooth manifold without
boundary, and \(F, G : N \to M\) smooth maps. If \(F\) and \(G\) are homotopic, they are
smoothly homotopic; if they are homotopic relative to a closed subset \(A \subseteq N\), they
are smoothly homotopic relative to \(A\).
Proof.
Let \(H : N \times I \to M\) be a homotopy from \(F\) to \(G\). Extend it to \(N \times
\mathbb{R}\) by holding it constant outside \([0, 1]\) in the time variable, setting
\(\overline{H}(x, t) = H(x, 0)\) for \(t \le 0\) and \(\overline{H}(x, t) = H(x, 1)\) for
\(t \ge 1\); this \(\overline{H}\) is continuous by the
gluing lemma
and is already smooth on the closed set \(N \times \{0\} \cup N \times \{1\}\), where it equals
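the smooth maps \(F\) and \(G\). Since \(N \times \mathbb{R}\) is a smooth manifold with or
without boundary (it has boundary exactly when \(N\) does), the approximation theorem provides a
smooth map \(N \times \mathbb{R} \to M\) agreeing with \(\overline{H}\) on that closed set;
restricting it to \(N \times I\) gives a smooth homotopy from \(F\) to \(G\). For the relative
version, enlarge the closed set to \(N \times \{0\} \cup N \times \{1\} \cup A \times I\): there
\(\overline{H}(x, t) = F(x)\) for \(x \in A\), so \(\overline{H}\) is still smooth on it and the
same argument applies.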
\(\blacksquare\)
A word on the hypothesis that \(M\) has no boundary, which has been standing since the
approximation theorem. The retraction onto \(M\) is built from a tubular neighborhood, and a
manifold with boundary has no two-sided collar of normal directions along its boundary, so the
construction breaks down there; the analogues of these theorems for a target with boundary require
different tools and are taken up later in the manifold series.
Embed, surround, retract: the manifold hypothesis completed
These results close a circle begun several pages ago. The
embedding theorem
placed an abstract data manifold inside a Euclidean space; the tubular neighborhood theorem
surrounds that manifold with a tube of ambient points; and the retraction \(r : U \to M\) sends
every point of the tube back to the base point of the normal fiber through it. This map is
the geometric prototype of denoising: a noisy observation, lying off the data manifold but
within its tube, is carried back along its normal direction to a clean point on the manifold.
A learned denoiser estimates a probabilistic version of this — a conditional expectation rather
than an exact projection — but the retraction captures its geometric essence, the collapse of
the normal directions. The
manifold hypothesis
thereby acquires not just a geometry — data on a low-dimensional surface in a high-dimensional
space — but a dynamics: embed the manifold, surround it with a tube,
retract noise onto it. The normal bundle that carries the construction is a basic
example of a vector bundle, the kind of structure on which the manifold series builds its later
geometry, and the normal directions it organizes are exactly the directions a model perturbs
when it moves a point off a learned manifold or projects one back onto it.