Tangent Vectors

Tangent Vectors

At its core, differential calculus is the art of linear approximation: to differentiate is to replace a map, near a point, by the linear map that best matches it. A linear map needs a vector space for its domain — the displacements it acts on — and in \(\mathbb{R}^n\) that space is simply \(\mathbb{R}^n\) itself, supplied for free by the ambient coordinates. On a manifold this free supply is gone: a point of an abstract manifold is not a point of any \(\mathbb{R}^N\), and the naive picture of a tangent vector as a little arrow sticking out into the surrounding space has nothing to stick out into. Before we can linearize anything, we must first manufacture, intrinsically and out of the smooth structure alone, the vector space \(T_pM\) on which the linearization will act. The strategy is the one that recurs throughout the subject: identify what a tangent vector does, promote that action to a definition, and discard the scaffolding that made the action visible in the first place.

The geometric picture in Euclidean space

In \(\mathbb{R}^n\) the scaffolding is concrete. Fix a point \(a \in \mathbb{R}^n\) and attach to it a copy of \(\mathbb{R}^n\), the geometric tangent space \(\mathbb{R}^n_a = \{a\} \times \mathbb{R}^n\), whose elements we write \(v_a = (a, v)\) and picture as the arrow \(v\) drawn with its tail at \(a\). This is an honest \(n\)-dimensional real vector space, a translated copy of \(\mathbb{R}^n\), and it carries all the linear-algebraic structure we already have. Its defect is exactly its virtue: it is built from the ambient \(\mathbb{R}^n\) into which the point \(a\) is embedded, and an abstract manifold offers no such ambient room. To migrate the notion onto a manifold we must first reformulate it without reference to the surrounding space.

The reformulation comes from asking what an arrow \(v_a\) lets us compute. Given a smooth function \(f\) defined near \(a\), the arrow specifies a directional derivative: the instantaneous rate of change of \(f\) as one moves away from \(a\) in the direction \(v\). Writing \(D_v\big|_a\) for this operation,

\[ \begin{align*} D_v\big|_a f &= \frac{d}{dt}\bigg|_{t=0} f(a + tv) \\\\ &= v^i \, \frac{\partial f}{\partial x^i}(a), \end{align*} \]

where the second equality is the chain rule applied to the curve \(t \mapsto a + tv\), and the repeated index \(i\) is summed in the Einstein convention. The operator \(D_v\big|_a\) sends a smooth function to a real number, and it has two structural features that survive without any mention of \(\mathbb{R}^n_a\): it is \(\mathbb{R}\)-linear in \(f\), and it obeys the product rule

\[ D_v\big|_a (fg) = f(a)\, D_v\big|_a g + g(a)\, D_v\big|_a f . \]

These two properties — linearity and the Leibniz rule, the latter evaluated at the single point \(a\) — refer only to the values of functions and their derivatives at \(a\). They make no use of the arrow's tail, the ambient coordinates, or the embedding. They are precisely the data that transplant to a manifold, and so they become the definition.
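The two surviving properties can be checked numerically. The following is a minimal sketch, not part of the original development: it approximates \(D_v\big|_a\) by a central difference, and the test functions `f` and `g`, the point `a`, and the vector `v` are arbitrary choices made only for illustration.

```python
import math

def directional_derivative(f, a, v, h=1e-6):
    """Approximate D_v|_a f = d/dt f(a + t v) at t = 0 by a central difference."""
    plus = [ai + h * vi for ai, vi in zip(a, v)]
    minus = [ai - h * vi for ai, vi in zip(a, v)]
    return (f(plus) - f(minus)) / (2 * h)

# Hypothetical test functions on R^2 (illustrative choices only).
f = lambda x: x[0] ** 2 * x[1]
g = lambda x: math.sin(x[0]) + x[1]
fg = lambda x: f(x) * g(x)

a = [1.0, 2.0]
v = [3.0, -1.0]

# Coordinate formula v^i * (df/dx^i)(a), with the partials of f computed by hand.
grad_f = [2 * a[0] * a[1], a[0] ** 2]
coordinate_formula = sum(vi * gi for vi, gi in zip(v, grad_f))

# Leibniz rule at the single point a: D(fg) = f(a) Dg + g(a) Df.
lhs = directional_derivative(fg, a, v)
rhs = f(a) * directional_derivative(g, a, v) + g(a) * directional_derivative(f, a, v)

print(directional_derivative(f, a, v), coordinate_formula)
print(lhs, rhs)
```

Both printed pairs should agree up to finite-difference error, which is the numerical face of the chain-rule formula and the product rule displayed above.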

Derivations as the intrinsic definition

We isolate the surviving structure and elevate it to a definition, first on \(\mathbb{R}^n\) and then, verbatim, on a manifold. The only ingredient required is the algebra \(C^\infty(M)\) of smooth real-valued functions, an object already at our disposal.

Definition: Derivation at a Point

Let \(M\) be a smooth manifold (with or without boundary) and \(p \in M\). A linear map \(v : C^\infty(M) \to \mathbb{R}\) is a derivation at \(p\) if it satisfies the Leibniz rule \[ v(fg) = f(p)\, v g + g(p)\, v f \quad \text{for all } f, g \in C^\infty(M). \]

The phrase "at \(p\)" is carried entirely by the appearance of \(f(p)\) and \(g(p)\) in the Leibniz rule: a single derivation pins all of its weight to one point. Specializing the definition to \(M = \mathbb{R}^n\) recovers a derivation at \(a\) in the Euclidean setting, of which every directional-derivative operator \(D_v\big|_a\) is an example. We will see shortly that on \(\mathbb{R}^n\) these are the only examples, which is what licenses the whole construction.

Definition: Tangent Space and Tangent Vectors

The set of all derivations of \(C^\infty(M)\) at \(p\) is a real vector space under the pointwise operations \((v + w)f = vf + wf\) and \((cv)f = c\,(vf)\); it is called the tangent space to \(M\) at \(p\) and denoted \(T_pM\). An element of \(T_pM\) is a tangent vector at \(p\).

That the sum and scalar multiple of derivations are again derivations is an immediate check against the Leibniz rule, which is linear in \(v\); we take the vector-space axioms for granted as inherited from the target \(\mathbb{R}\). What is not yet clear is that \(T_pM\) has the right size — that it is \(n\)-dimensional rather than enormous or trivial — and establishing this occupies the remainder of the section in the model case \(M = \mathbb{R}^n\), from which the general case will follow once charts are available.

A Lie Algebra Is a Tangent Space

This definition quietly settles a debt left open in our study of matrix groups. There the Lie algebra \(\mathfrak{g}\) of a group \(G\) was introduced as the set of velocity vectors of smooth curves through the identity \(I\) — a description that presupposed a notion of "tangent direction at \(I\)" we had not yet made precise. With \(T_pM\) now defined for an arbitrary manifold, the gap closes: \(\mathfrak{g}\) is nothing other than the tangent space \(T_I G\) at the identity element. The Lie algebra of a matrix group is a tangent space, and the bracket it carries is extra structure laid on top of this underlying linear object. The quantitative side of this identification — pinning down which concrete matrices constitute \(T_I G\) — waits until we can differentiate maps between manifolds; the velocity-vector half of the picture is completed when we study the velocities of curves directly.

The two definitions agree in \(\mathbb{R}^n\)

We now prove that on \(\mathbb{R}^n\) the abstract derivations are exactly the directional derivatives, so that the intrinsic tangent space \(T_a\mathbb{R}^n\) is canonically the geometric one \(\mathbb{R}^n_a\). The argument rests on a small lemma recording two elementary properties every derivation possesses, which we will reuse on manifolds without change.

Lemma: Elementary Properties of Derivations

Let \(v\) be a derivation at \(p \in M\), and let \(f, g \in C^\infty(M)\).

(a) If \(f\) is constant, then \(vf = 0\).

(b) If \(f(p) = g(p) = 0\), then \(v(fg) = 0\).

Proof:

For (a) it suffices, by linearity, to treat the constant function \(f_1 \equiv 1\). Applying the Leibniz rule to the product \(f_1 f_1 = f_1\) gives \[ \begin{align*} v f_1 &= v(f_1 f_1) = f_1(p)\, v f_1 + f_1(p)\, v f_1 \\\\ &= 2\, v f_1, \end{align*} \] whence \(v f_1 = 0\). A general constant \(f \equiv c\) is \(c\, f_1\), so \(vf = c\, v f_1 = 0\) by linearity.

For (b), the Leibniz rule applied to \(fg\) reads \(v(fg) = f(p)\, vg + g(p)\, vf\), and both coefficients \(f(p)\) and \(g(p)\) vanish by hypothesis, so \(v(fg) = 0\).

The same statement and the same proof hold word for word when \(M = \mathbb{R}^n\); we will invoke it in both settings. With the lemma in hand we can identify the tangent space of Euclidean space.

Proposition: The Tangent Space of Euclidean Space

For each \(a \in \mathbb{R}^n\), the map \[ \begin{align*} \mathbb{R}^n_a &\longrightarrow T_a\mathbb{R}^n, \\\\ v_a &\longmapsto D_v\big|_a, \end{align*} \] is an isomorphism of vector spaces. Consequently the \(n\) partial-derivative operators \(\partial/\partial x^1\big|_a, \dots, \partial/\partial x^n\big|_a\) form a basis for \(T_a\mathbb{R}^n\), and \(\dim T_a\mathbb{R}^n = n\).

Proof:

The map is linear in \(v\) by inspection of the formula \(D_v\big|_a = v^i \,\partial/\partial x^i\big|_a\), so it remains to prove that it is injective and surjective.

Injectivity. Suppose \(D_v\big|_a = 0\). Applying the operator to the \(j\)-th coordinate function \(x^j\) and using \(\partial x^j/\partial x^i = \delta^j_i\) gives \[ \begin{align*} 0 = D_v\big|_a (x^j) &= v^i \,\frac{\partial x^j}{\partial x^i}(a) \\\\ &= v^i \delta^j_i = v^j \end{align*} \] for every \(j\), so \(v = 0\) and the map is injective.

Surjectivity. Let \(w \in T_a\mathbb{R}^n\) be any derivation; we produce an arrow inducing it. Set \(v^i := w(x^i)\) and let \(v = (v^1, \dots, v^n)\). For an arbitrary \(f \in C^\infty(\mathbb{R}^n)\), the first-order Taylor expansion with integral remainder about \(a\) gives \[ \begin{align*} f(x) = f(a) &+ \frac{\partial f}{\partial x^i}(a)\,(x^i - a^i) \\\\ &+ \sum_{i,j} (x^i - a^i)(x^j - a^j)\, g_{ij}(x), \end{align*} \] with each \(g_{ij}\) smooth on all of \(\mathbb{R}^n\) (the integral remainder is defined globally because \(\mathbb{R}^n\) is convex), so that \(w\) may be applied to every term. Apply \(w\) and read off the three groups of terms. The constant \(f(a)\) is killed by part (a) of the lemma. Each remainder term is a product of two factors \((x^i - a^i)\) and \((x^j - a^j)\,g_{ij}\), both of which vanish at \(a\); by part (b) the derivation annihilates it. Only the linear terms survive, and since \(w\) is linear with \(w(a^i) = 0\) for the constants \(a^i\), the term \(w(x^i - a^i)\) collapses to \(w(x^i) = v^i\), giving \[ \begin{align*} w f &= \frac{\partial f}{\partial x^i}(a)\, w(x^i - a^i) \\\\ &= \frac{\partial f}{\partial x^i}(a)\, w(x^i) \\\\ &= \frac{\partial f}{\partial x^i}(a)\, v^i = D_v\big|_a f . \end{align*} \] As \(f\) was arbitrary, \(w = D_v\big|_a\), proving surjectivity.

The map is therefore a linear isomorphism. It carries the standard basis arrow \((e_i)_a\) to the operator \(D_{e_i}\big|_a = \partial/\partial x^i\big|_a\), so these \(n\) operators are the image of a basis under an isomorphism and hence themselves a basis of \(T_a\mathbb{R}^n\). In particular \(\dim T_a\mathbb{R}^n = n\).
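The surjectivity argument is effectively a recipe: feed an unknown derivation the coordinate functions and read off its components. A rough sketch of that recipe follows, under the assumption that the derivation is handed to us as an opaque callable; the helper names `make_derivation` and `components`, and all numerical values, are hypothetical.

```python
def make_derivation(a, v, h=1e-6):
    """Return w = D_v|_a as an opaque callable on functions (central difference)."""
    def w(f):
        plus = [ai + h * vi for ai, vi in zip(a, v)]
        minus = [ai - h * vi for ai, vi in zip(a, v)]
        return (f(plus) - f(minus)) / (2 * h)
    return w

def components(w, n):
    """Recover v^i = w(x^i) by applying w to the coordinate functions x^i."""
    return [w(lambda x, i=i: x[i]) for i in range(n)]

a = [0.5, -1.0, 2.0]
hidden_v = [1.0, 4.0, -2.0]          # pretend this arrow is unknown to us
w = make_derivation(a, hidden_v)

recovered = components(w, 3)

# Check w f = v^i (df/dx^i)(a) on a hypothetical test function.
f = lambda x: x[0] * x[1] + x[2] ** 2
grad_f_at_a = [a[1], a[0], 2 * a[2]]  # partial derivatives computed by hand
via_components = sum(vi * gi for vi, gi in zip(recovered, grad_f_at_a))

print(recovered)
print(w(f), via_components)
```

The recovered list should match `hidden_v` up to rounding, and the two values on the last line should agree, mirroring the conclusion \(w = D_v\big|_a\).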

The proposition is the linchpin of the entire theory. It certifies that the abstract definition, for all its austerity, reproduces exactly the familiar vector space of arrows in the one case where we can check by hand, and it does so intrinsically, phrasing the answer in terms of derivations rather than the ambient copy \(\mathbb{R}^n_a\) that motivated them. On a general manifold there is no \(\mathbb{R}^n_a\) to compare against, but there are charts, and a chart will let us transport this Euclidean verdict to any point of any manifold once we know how derivations behave under smooth maps. That mechanism, the differential, is the business of the next section, where the lemma above, already stated and proved for an arbitrary manifold, will be reused without change and \(\dim T_pM = n\) will follow.

The Differential of a Smooth Map

A tangent space sitting in isolation at a single point is inert; what makes tangent vectors useful is that smooth maps carry them. If \(F : M \to N\) is smooth and \(p \in M\), we want to turn a tangent vector at \(p\) into a tangent vector at \(F(p)\) — to linearize \(F\) at \(p\) in a way that needs no coordinates. The mechanism is forced on us by the derivation definition: a tangent vector is something that differentiates functions, and the only functions available near \(F(p)\) become functions near \(p\) the moment we precompose with \(F\). Reading the construction off that observation gives the central operation of the chapter.

The differential of a smooth map

Definition: The Differential (Pushforward)

Let \(F : M \to N\) be a smooth map and \(p \in M\). The differential of \(F\) at \(p\) is the linear map \[ dF_p : T_pM \longrightarrow T_{F(p)}N \] defined, for \(v \in T_pM\) and \(f \in C^\infty(N)\), by \[ dF_p(v)(f) = v(f \circ F). \]

For the definition to make sense, \(dF_p(v)\) must actually be a tangent vector at \(F(p)\) — a derivation of \(C^\infty(N)\). It is linear in \(f\) because precomposition \(f \mapsto f \circ F\) is linear and \(v\) is linear. For the Leibniz rule, the smooth function \((fg) \circ F\) equals \((f \circ F)(g \circ F)\) pointwise, so applying the derivation \(v\) and evaluating the products at \(p\) gives \[ \begin{align*} dF_p(v)(fg) &= v\bigl((f\circ F)(g\circ F)\bigr) \\\\ &= (f\circ F)(p)\, v(g\circ F) + (g\circ F)(p)\, v(f\circ F) \\\\ &= f(F(p))\, dF_p(v)(g) + g(F(p))\, dF_p(v)(f). \end{align*} \] Thus \(dF_p(v)\) is a derivation at \(F(p)\), and \(dF_p\) is itself linear in \(v\) because \(v \mapsto v(f\circ F)\) is. The differential is the coordinate-free linearization of \(F\): it records, intrinsically, the first-order behavior of \(F\) at \(p\).
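The defining formula \(dF_p(v)(f) = v(f \circ F)\) translates almost literally into code. The sketch below is an illustration only: it models a tangent vector on \(\mathbb{R}^n\) as a finite-difference directional derivative, and the names `derivation_at` and `pushforward`, together with the polar-to-Cartesian map `F`, are assumptions introduced here.

```python
import math

def derivation_at(p, v, h=1e-6):
    """A tangent vector at p in R^n, modeled as f -> d/dt f(p + t v) at t = 0."""
    def act(f):
        plus = [pi + h * vi for pi, vi in zip(p, v)]
        minus = [pi - h * vi for pi, vi in zip(p, v)]
        return (f(plus) - f(minus)) / (2 * h)
    return act

def pushforward(F, v):
    """dF_p(v): the derivation f -> v(f o F)."""
    return lambda f: v(lambda x: f(F(x)))

# Hypothetical smooth map F : R^2 -> R^2, (r, theta) -> (r cos theta, r sin theta).
F = lambda x: [x[0] * math.cos(x[1]), x[0] * math.sin(x[1])]

p = [2.0, 0.0]
v = derivation_at(p, [1.0, 3.0])
dFv = pushforward(F, v)

# Components of dF_p(v) are its values on the target coordinate functions.
comps = [dFv(lambda y: y[0]), dFv(lambda y: y[1])]
print(comps)
```

Note that `pushforward` never inspects coordinates of the target: it only precomposes with `F`, which is exactly why the definition is coordinate-free.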

The Differential Is the Jacobian, Without Coordinates

A reader who has computed gradients in a neural network has already met \(dF_p\) in disguise. There the linearization of a smooth layer \(F\) at an input is the Jacobian matrix, the array of partial derivatives that forward-mode automatic differentiation pushes tangent vectors through — the operation the autodiff literature calls a Jacobian–vector product. The differential is exactly this object stripped of its basis: \(dF_p\) is the linear map that the Jacobian represents once coordinates are chosen, and the forward pass of a Jacobian–vector product is the numerical shadow of \(v \mapsto dF_p(v)\). Keeping the types straight is worth the effort. A tangent vector \(v\) and its pushforward \(dF_p(v)\) are vectors; the number \(dF_p(v)(f)\) obtained by feeding a function to that pushforward is a scalar. The Jacobian is a matrix; its action on a component vector is again a vector. We will see in the next section precisely which matrix \(dF_p\) becomes in coordinates.
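To make the comparison concrete without relying on any particular autodiff library, here is a small finite-difference sketch. It contrasts forming the Jacobian matrix and multiplying by a vector with computing the directional derivative of \(F\) directly; the function names `jacobian` and `jvp` and the toy "layer" `F` are hypothetical, and a real forward-mode system computes the same quantity exactly rather than by differencing.

```python
import math

def jacobian(F, p, h=1e-6):
    """Finite-difference Jacobian: rows index outputs, columns index inputs."""
    n, m = len(p), len(F(p))
    cols = []
    for i in range(n):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        Fp, Fm = F(plus), F(minus)
        cols.append([(Fp[j] - Fm[j]) / (2 * h) for j in range(m)])
    return [[cols[i][j] for i in range(n)] for j in range(m)]

def jvp(F, p, v, h=1e-6):
    """Directional derivative of F at p along v, with no matrix ever formed."""
    plus = [pi + h * vi for pi, vi in zip(p, v)]
    minus = [pi - h * vi for pi, vi in zip(p, v)]
    return [(a - b) / (2 * h) for a, b in zip(F(plus), F(minus))]

# Hypothetical smooth "layer" R^2 -> R^2.
F = lambda x: [math.tanh(x[0] + 2 * x[1]), x[0] * x[1]]
p = [0.3, -0.2]
v = [1.0, 0.5]

J = jacobian(F, p)
matrix_times_vector = [sum(J[j][i] * v[i] for i in range(len(v))) for j in range(len(J))]
direct = jvp(F, p, v)

print(matrix_times_vector)
print(direct)
```

The two printed vectors should agree to within finite-difference error: the matrix is one representation of the linear map, and the directional derivative is the map itself applied to `v`.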

Functorial properties and locality

The differential behaves exactly as a derivative should under composition, identity, and inversion. We collect the four properties in one proposition; each is routinely left as an exercise in the standard references, but the site closes every such gap inline, and each part falls directly out of the definition.

Proposition: Properties of the Differential

Let \(F : M \to N\) and \(G : N \to P\) be smooth maps and \(p \in M\).

(a) \(dF_p : T_pM \to T_{F(p)}N\) is linear.

(b) \(d(G \circ F)_p = dG_{F(p)} \circ dF_p\).

(c) \(d(\mathrm{Id}_M)_p = \mathrm{Id}_{T_pM}\).

(d) If \(F\) is a diffeomorphism, then \(dF_p\) is an isomorphism, with inverse \((dF_p)^{-1} = d(F^{-1})_{F(p)}\).

Proof:

Part (a) was checked above. For (b), let \(v \in T_pM\) and \(f \in C^\infty(P)\). Unwinding the definition twice and using associativity of composition, \[ \begin{align*} d(G\circ F)_p(v)(f) &= v\bigl(f \circ (G\circ F)\bigr) \\\\ &= v\bigl((f\circ G) \circ F\bigr) \\\\ &= dF_p(v)(f\circ G) \\\\ &= dG_{F(p)}\bigl(dF_p(v)\bigr)(f). \end{align*} \] Since this holds for every \(f\), the two differentials agree.

For (c), \(d(\mathrm{Id}_M)_p(v)(f) = v(f \circ \mathrm{Id}_M) = vf\), so the differential of the identity is the identity. Part (d) follows by applying (b) to \(F^{-1} \circ F = \mathrm{Id}_M\) and \(F \circ F^{-1} = \mathrm{Id}_N\): the chain rule and (c) give \(d(F^{-1})_{F(p)} \circ dF_p = \mathrm{Id}_{T_pM}\) and \(dF_p \circ d(F^{-1})_{F(p)} = \mathrm{Id}_{T_{F(p)}N}\), so \(dF_p\) is invertible with the stated inverse.
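Part (b) can be sanity-checked numerically in the Euclidean case, anticipating the coordinate description of the differential as a matrix. The following sketch compares a finite-difference Jacobian of \(G \circ F\) with the product of the Jacobians of \(G\) and \(F\); the maps `F` and `G`, the point `p`, and the helpers `jacobian` and `matmul` are illustrative assumptions, not part of the text.

```python
import math

def jacobian(F, p, h=1e-6):
    """Finite-difference Jacobian: rows index outputs, columns index inputs."""
    n, m = len(p), len(F(p))
    cols = []
    for i in range(n):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        Fp, Fm = F(plus), F(minus)
        cols.append([(Fp[j] - Fm[j]) / (2 * h) for j in range(m)])
    return [[cols[i][j] for i in range(n)] for j in range(m)]

def matmul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical smooth maps F : R^2 -> R^2 and G : R^2 -> R^3.
F = lambda x: [x[0] * x[1], x[0] + math.sin(x[1])]
G = lambda y: [y[0] ** 2 + y[1], math.exp(y[0]) * y[1], y[1]]
GF = lambda x: G(F(x))

p = [1.0, 0.5]
lhs = jacobian(GF, p)                              # d(G o F)_p
rhs = matmul(jacobian(G, F(p)), jacobian(F, p))    # dG_{F(p)} o dF_p

for row_l, row_r in zip(lhs, rhs):
    print(row_l, row_r)
```

Note that the Jacobian of `G` is evaluated at `F(p)`, not at `p`, matching the subscript in \(dG_{F(p)} \circ dF_p\).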

A subtle point lurks in the definition: a tangent vector is defined as a derivation of global smooth functions \(C^\infty(M)\), yet differentiation is an inherently local operation. The next proposition reconciles the two, showing that a tangent vector cannot tell apart two functions that agree near \(p\), even if they differ wildly elsewhere. This is what allows tangent vectors to be computed in a single chart.

Proposition: Tangent Vectors Act Locally

Let \(p \in M\) and \(v \in T_pM\). If \(f, g \in C^\infty(M)\) agree on some neighborhood of \(p\), then \(vf = vg\).

Proof:

Let \(h = f - g\), so \(h\) vanishes on a neighborhood \(U\) of \(p\); by linearity it suffices to show \(vh = 0\). Choose a smooth bump function \(\psi \in C^\infty(M)\) that equals \(1\) on a neighborhood of \(p\) and is supported in \(U\). On the support of \(\psi\) the function \(h\) vanishes, so \(\psi h \equiv 0\) on all of \(M\), and hence \(v(\psi h) = 0\). On the other hand \(\psi(p) = 1\) and \(h(p) = 0\), so the Leibniz rule gives \[ 0 = v(\psi h) = \psi(p)\, vh + h(p)\, v\psi = vh. \] Therefore \(vh = 0\), and \(vf = vg\).

Locality has an immediate structural payoff. It lets us regard a tangent vector to an open subset as a tangent vector to the whole manifold, removing any anxiety about whether \(T_pM\) depends on functions defined far from \(p\).

Proposition: The Tangent Space to an Open Submanifold

Let \(M\) be a smooth manifold with or without boundary, let \(U \subseteq M\) be an open subset, and let \(\iota : U \hookrightarrow M\) be the inclusion. For every \(p \in U\), the differential \(d\iota_p : T_pU \to T_pM\) is an isomorphism.

Proof:

Choose a neighborhood \(B\) of \(p\) with \(\bar B \subseteq U\).

Injectivity. Suppose \(v \in T_pU\) and \(d\iota_p(v) = 0\). Let \(f \in C^\infty(U)\) be arbitrary. By the extension lemma there is \(\tilde f \in C^\infty(M)\) with \(\tilde f = f\) on \(\bar B\). Then \(f\) and \(\tilde f|_U\) agree on a neighborhood of \(p\), so by the previous proposition \[ \begin{align*} vf &= v\bigl(\tilde f|_U\bigr) = v\bigl(\tilde f \circ \iota\bigr) \\\\ &= d\iota_p(v)\,\tilde f = 0. \end{align*} \] As \(f\) was arbitrary, \(v = 0\), so \(d\iota_p\) is injective.

Surjectivity. Let \(w \in T_pM\). Define \(v : C^\infty(U) \to \mathbb{R}\) by \(vf = w\tilde f\), where \(\tilde f \in C^\infty(M)\) is any smooth function agreeing with \(f\) on \(\bar B\). By locality (applied in \(M\)) the value \(w\tilde f\) is independent of the choice of extension, so \(v\) is well defined, and it is routine to check that \(v\) is a derivation of \(C^\infty(U)\) at \(p\). For any \(g \in C^\infty(M)\), \[ d\iota_p(v)\,g = v(g \circ \iota) = w\,\widetilde{g\circ\iota} = wg, \] where the last two equalities hold because \(g\circ\iota\), its extension \(\widetilde{g\circ\iota}\), and \(g\) all agree on \(\bar B\). Hence \(d\iota_p(v) = w\), and \(d\iota_p\) is surjective.

From now on we use this isomorphism to identify \(T_pU\) with \(T_pM\) for any point \(p\) of an open subset \(U \subseteq M\), suppressing \(d\iota_p\) from the notation. The identification is canonical, independent of any choices, and it amounts to the observation that tangent vectors are local objects. With locality and the open-set identification in hand, we can finally measure the size of an arbitrary tangent space.

The dimension of a tangent space

On \(\mathbb{R}^n\) we proved directly that the tangent space is \(n\)-dimensional. A chart transports this verdict to any manifold: a chart is a diffeomorphism onto an open subset of \(\mathbb{R}^n\) (or of the half-space \(\mathbb{H}^n\) at a boundary point), and the differential of a diffeomorphism is an isomorphism. We state the result for manifolds with and without boundary together, since the boundary case requires only one extra lemma.

Proposition: Dimension of the Tangent Space

If \(M\) is an \(n\)-dimensional smooth manifold with or without boundary, then for every \(p \in M\) the tangent space \(T_pM\) is an \(n\)-dimensional vector space. In particular this holds at boundary points: the tangent space at a boundary point is \(n\)-dimensional, not \((n-1)\)-dimensional.

Proof:

Suppose first that \(p\) is an interior point, and let \((U, \varphi)\) be a smooth chart with \(p \in U\), so that \(\varphi : U \to \widehat U\) is a diffeomorphism onto an open subset \(\widehat U \subseteq \mathbb{R}^n\). By the open-submanifold identification we may replace \(T_pM\) by \(T_pU\), and part (d) of the Properties of the Differential makes \(d\varphi_p : T_pU \to T_{\varphi(p)}\widehat U\) an isomorphism. Identifying \(T_{\varphi(p)}\widehat U\) with \(T_{\varphi(p)}\mathbb{R}^n\) once more by the open-submanifold proposition, the Euclidean computation gives \(\dim T_{\varphi(p)}\mathbb{R}^n = n\), so \(\dim T_pM = n\).

If \(p\) is a boundary point, the chart \(\varphi\) maps a neighborhood of \(p\) diffeomorphically onto an open subset of the half-space \(\mathbb{H}^n\), which is not open in \(\mathbb{R}^n\), so the open-submanifold identification of \(T_{\varphi(p)}\mathbb{H}^n\) with \(T_{\varphi(p)}\mathbb{R}^n\) is no longer available directly. The half-space lemma below supplies the missing isomorphism, and the same chain of identifications yields \(\dim T_pM = n\).

The lemma the boundary case rests on relates the tangent space of the half-space to that of the ambient Euclidean space at a boundary point.

Lemma: The Tangent Space of a Half-Space at a Boundary Point

Let \(\iota : \mathbb{H}^n \hookrightarrow \mathbb{R}^n\) be the inclusion. For any \(a \in \partial\mathbb{H}^n\), the differential \(d\iota_a : T_a\mathbb{H}^n \to T_a\mathbb{R}^n\) is an isomorphism.

Proof:

Injectivity. Suppose \(d\iota_a(v) = 0\). Let \(f : \mathbb{H}^n \to \mathbb{R}\) be smooth, and let \(\tilde f\) be any extension of \(f\) to a smooth function on all of \(\mathbb{R}^n\), which exists by the extension lemma. Then \(\tilde f \circ \iota = f\), so \[ vf = v\bigl(\tilde f \circ \iota\bigr) = d\iota_a(v)\,\tilde f = 0. \] Since this holds for every \(f\), we conclude \(v = 0\) and \(d\iota_a\) is injective.

Surjectivity. Let \(w \in T_a\mathbb{R}^n\) be arbitrary, and define \(v : C^\infty(\mathbb{H}^n) \to \mathbb{R}\) by \(vf = w\tilde f\), where \(\tilde f\) is any smooth extension of \(f\). Writing \(w = w^i\,\partial/\partial x^i|_a\) in the standard basis for \(T_a\mathbb{R}^n\), this reads \[ vf = w^i\,\frac{\partial \tilde f}{\partial x^i}(a). \] This is independent of the choice of \(\tilde f\): by continuity, the partial derivatives of \(\tilde f\) at the boundary point \(a\) are determined by the values of \(f\) on \(\mathbb{H}^n\) alone, since every derivative at \(a\) is a limit of difference quotients that can be taken from within \(\mathbb{H}^n\). That \(v\) is a derivation at \(a\) follows from the corresponding properties of \(w\): linearity is immediate from \(v(c_1 f_1 + c_2 f_2) = w(c_1 \tilde f_1 + c_2 \tilde f_2) = c_1\, w\tilde f_1 + c_2\, w\tilde f_2\), using that \(c_1 \tilde f_1 + c_2 \tilde f_2\) extends \(c_1 f_1 + c_2 f_2\); and the Leibniz rule holds because \(\tilde f \tilde g\) extends \(fg\), so that \[ v(fg) = w(\tilde f \tilde g) = \tilde f(a)\, w\tilde g + \tilde g(a)\, w\tilde f = f(a)\, vg + g(a)\, vf, \] the last equality using \(\tilde f(a) = f(a)\) and \(\tilde g(a) = g(a)\). Finally, every \(g \in C^\infty(\mathbb{R}^n)\) is itself a smooth extension of \(g \circ \iota\), so \(d\iota_a(v)\,g = v(g \circ \iota) = wg\); hence \(w = d\iota_a(v)\), and \(d\iota_a\) is surjective.

We henceforth identify \(T_a\mathbb{H}^n\) with \(T_a\mathbb{R}^n\) at boundary points exactly as we identified the tangent space to an open subset with that of the whole manifold, and we make no notational distinction between a tangent vector to the half-space and its image in \(T_a\mathbb{R}^n\). This is what closes the boundary case of the dimension proposition above.

Tangent spaces of vector spaces and products

Two special manifolds have tangent spaces simple enough to identify outright. The first is a finite-dimensional vector space, where the tangent space at every point is canonically the space itself — the precise sense in which "the tangent space to a flat space is flat."

Proposition: The Tangent Space of a Vector Space

Let \(V\) be a finite-dimensional real vector space, regarded as a smooth manifold with its standard smooth structure. For each \(a \in V\) there is a canonical isomorphism \[ V \longrightarrow T_aV, \qquad v \longmapsto D_v\big|_a, \quad D_v\big|_a f = \frac{d}{dt}\bigg|_{t=0} f(a + tv). \] Moreover, for any linear map \(L : V \to W\) between finite-dimensional vector spaces and any \(a \in V\), the differential \(dL_a : T_aV \to T_{La}W\) is, under these identifications, the map \(L\) itself: the square relating \(V \to W\) by \(L\) to \(T_aV \to T_{La}W\) by \(dL_a\) commutes.

Proof:

Choosing a basis identifies \(V\) with \(\mathbb{R}^n\), under which the displayed map becomes the isomorphism \(\mathbb{R}^n_a \to T_a\mathbb{R}^n\) already established; that the result does not depend on the basis is the content of the commuting square, which we now verify. For a linear \(L\), the directional-derivative characterization gives, for any \(f \in C^\infty(W)\), \[ \begin{align*} dL_a\bigl(D_v\big|_a\bigr) f &= D_v\big|_a (f \circ L) \\\\ &= \frac{d}{dt}\bigg|_{t=0} f\bigl(L(a + tv)\bigr) \\\\ &= \frac{d}{dt}\bigg|_{t=0} f\bigl(La + t\,Lv\bigr) \\\\ &= D_{Lv}\big|_{La} f, \end{align*} \] using linearity of \(L\) in the third step. Thus \(dL_a\) carries \(D_v|_a\) to \(D_{Lv}|_{La}\), which is exactly the statement that the square commutes — that \(dL_a\) is \(L\) read through the identifications.

Which Matrices Form \(T_I G\)

We can now make good on the qualitative claim that a Lie algebra is a tangent space. The general linear group \(GL(n, \mathbb{R})\) is an open subset of the vector space \(M(n, \mathbb{R})\) of all real \(n \times n\) matrices, so the open-submanifold identification gives \(T_I GL(n, \mathbb{R}) \cong T_I M(n, \mathbb{R})\), and the proposition just proved identifies the latter with \(M(n, \mathbb{R})\) itself. The tangent space to the general linear group at the identity is therefore the entire space of matrices, \[ T_I GL(n, \mathbb{R}) \cong M(n, \mathbb{R}). \] For a matrix Lie group \(G \subseteq GL(n, \mathbb{R})\), the Lie algebra \(\mathfrak{g}\) introduced earlier as a set of matrices is precisely the subspace of \(M(n, \mathbb{R})\) realized as \(T_I G\): the abstract "tangent direction at the identity" is a concrete matrix. The remaining half of the picture — that these matrices are exactly the velocities \(\gamma'(0)\) of curves through the identity, the form in which \(\mathfrak{g}\) was originally defined — is completed once we have velocity vectors of curves in hand.
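As a small sanity check on the identification \(T_I GL(n, \mathbb{R}) \cong M(n, \mathbb{R})\), the following worked computation shows a matrix \(A\) acting as a derivation at \(I\). It is an illustration added here, with the test function \(\det\) and the case \(n = 2\) chosen only for convenience. Since \(\det\) is continuous and \(\det I = 1\), the line \(t \mapsto I + tA\) stays inside \(GL(2, \mathbb{R})\) for small \(|t|\), so \(D_A\big|_I\) may be evaluated along it: \[ \begin{align*} \det(I + tA) &= 1 + t\,\operatorname{tr}A + t^2 \det A, \\\\ D_A\big|_I \det &= \frac{d}{dt}\bigg|_{t=0} \det(I + tA) = \operatorname{tr}A . \end{align*} \] Under the identification, the tangent vector corresponding to the matrix \(A\) differentiates the determinant to the trace of \(A\).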

The second special case is a product of manifolds, whose tangent space decomposes as a direct sum of the factors' tangent spaces — the manifold-level expression of the fact that a curve in a product is a tuple of curves in the factors.

Proposition: The Tangent Space of a Product

Let \(M_1, \dots, M_k\) be smooth manifolds, and for each \(j\) let \(\pi_j : M_1 \times \cdots \times M_k \to M_j\) be the projection. For any point \(p = (p_1, \dots, p_k)\), the map \[ \alpha : T_p(M_1 \times \cdots \times M_k) \longrightarrow T_{p_1}M_1 \oplus \cdots \oplus T_{p_k}M_k, \quad v \longmapsto \bigl(d(\pi_1)_p(v), \dots, d(\pi_k)_p(v)\bigr), \] is an isomorphism.

Proof Sketch:

It suffices to treat \(k = 2\); the general case follows by induction. The projections \(\pi_1, \pi_2\) and the inclusions \(\iota_1 : M_1 \to M_1 \times M_2\), \(x \mapsto (x, p_2)\) and \(\iota_2 : M_2 \to M_1 \times M_2\), \(y \mapsto (p_1, y)\) are smooth, and \(\pi_i \circ \iota_j\) is the identity when \(i = j\) and constant when \(i \neq j\). Define \[ \beta : T_{p_1}M_1 \oplus T_{p_2}M_2 \longrightarrow T_p(M_1 \times M_2), \quad (v_1, v_2) \longmapsto d(\iota_1)_{p_1}(v_1) + d(\iota_2)_{p_2}(v_2). \] Applying \(\alpha\) and using the chain rule together with the fact that the differential of a constant map is zero shows \(\alpha \circ \beta = \mathrm{Id}\). For \(\beta \circ \alpha = \mathrm{Id}\), one verifies that any \(v \in T_p(M_1 \times M_2)\) acts on a function \(f\) through its values along the two slices \(\iota_1, \iota_2\) — a consequence of the product chart, in which \(f\) is differentiated coordinate by coordinate. Hence \(\alpha\) is an isomorphism with inverse \(\beta\).

These identifications are the everyday currency of computation: a tangent vector to \(GL(n, \mathbb{R})\) is a matrix, a tangent vector to \(\mathbb{R}^n\) is an arrow, and a tangent vector to a product is a tuple. What remains is to make all of this explicit in coordinates, where the differential acquires its familiar matrix face and the abstract apparatus becomes a calculus one can carry out by hand.

Computations in Coordinates

The theory is now complete but, as it stands, hopelessly abstract: we know \(T_pM\) is an \(n\)-dimensional space of derivations, yet we have no concrete basis to compute with. A chart repairs this at once. Each chart carries the standard basis of Euclidean tangent space back to \(M\), producing a basis of \(T_pM\) made of partial derivatives, and once we have a basis everything reduces to bookkeeping with components.

Coordinate vectors as a basis

Let \((U, \varphi)\) be a smooth chart with coordinate functions \((x^1, \dots, x^n)\), and let \(p \in U\) with \(\widehat p = \varphi(p)\). The chart is a diffeomorphism onto its image, so by the properties of the differential \(d\varphi_p : T_pM \to T_{\widehat p}\mathbb{R}^n\) is an isomorphism, where we have used the open-submanifold identification to regard \(\varphi\) as a map into \(\mathbb{R}^n\). Pulling back the standard basis \(\partial/\partial x^i|_{\widehat p}\) of \(T_{\widehat p}\mathbb{R}^n\) gives a basis of \(T_pM\).

Definition: Coordinate Vectors and Components

With notation as above, the coordinate vectors at \(p\) are the tangent vectors \[ \frac{\partial}{\partial x^i}\bigg|_p := (d\varphi_p)^{-1}\!\left( \frac{\partial}{\partial x^i}\bigg|_{\widehat p} \right), \qquad i = 1, \dots, n. \] They act on a smooth function \(f\) through its coordinate representation \(\widehat f = f \circ \varphi^{-1}\) by \[ \frac{\partial}{\partial x^i}\bigg|_p f = \frac{\partial \widehat f}{\partial x^i}(\widehat p). \] The coordinate vectors form a basis for \(T_pM\), and every \(v \in T_pM\) is written \(v = v^i\,\partial/\partial x^i|_p\) with components recovered by \(v^j = v(x^j)\). At a boundary point the chart maps into \(\mathbb{H}^n\) and \(\partial/\partial x^n|_p\) is a one-sided derivative, but the formulas are unchanged.

The action formula is just the definition unwound: since \((d\varphi_p)^{-1} = d(\varphi^{-1})_{\widehat p}\), the definition of the differential gives \(\partial/\partial x^i|_p f = \partial/\partial x^i|_{\widehat p}\,(f \circ \varphi^{-1})\), which is the ordinary Euclidean partial derivative of \(\widehat f\) at \(\widehat p\). The component formula \(v^j = v(x^j)\) follows by applying the expansion \(v = v^i\,\partial/\partial x^i|_p\) to the coordinate function \(x^j\) and using \(\partial/\partial x^i|_p\, x^j = \delta^j_i\). Together with the dimension count of the previous section, these observations establish the working vocabulary of the rest of the subject: \(T_pM\) is \(n\)-dimensional, the coordinate vectors of any chart are a basis, and a tangent vector is determined by its \(n\) components \(v^j = v(x^j)\).
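A worked example may help fix the action formula. The setting is an assumption made for illustration and does not appear above: polar coordinates \((r, \theta)\) on the open half-plane \(U = \{(x, y) : x > 0\} \subseteq \mathbb{R}^2\), with \(\varphi^{-1}(r, \theta) = (r\cos\theta, r\sin\theta)\). For \(f \in C^\infty(U)\) the coordinate representation is \(\widehat f(r, \theta) = f(r\cos\theta, r\sin\theta)\), and the ordinary chain rule gives \[ \begin{align*} \frac{\partial}{\partial r}\bigg|_p f &= \frac{\partial \widehat f}{\partial r}(\widehat p) = \cos\theta\, \frac{\partial f}{\partial x}(p) + \sin\theta\, \frac{\partial f}{\partial y}(p), \\\\ \frac{\partial}{\partial \theta}\bigg|_p f &= \frac{\partial \widehat f}{\partial \theta}(\widehat p) = -r\sin\theta\, \frac{\partial f}{\partial x}(p) + r\cos\theta\, \frac{\partial f}{\partial y}(p). \end{align*} \] Reading off coefficients, \(\partial/\partial r|_p = \cos\theta\, \partial/\partial x|_p + \sin\theta\, \partial/\partial y|_p\) and \(\partial/\partial\theta|_p = -r\sin\theta\, \partial/\partial x|_p + r\cos\theta\, \partial/\partial y|_p\). The component formula agrees: applying \(\partial/\partial\theta|_p\) to the coordinate functions \(x = r\cos\theta\) and \(y = r\sin\theta\) returns \(-r\sin\theta\) and \(r\cos\theta\).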

Change of coordinates

A single tangent vector has different components in different charts, and we need the rule converting between them. Suppose \(p\) lies in the domains of two charts, with coordinates \((x^i)\) and \((\tilde x^j)\); write \(\partial/\partial\tilde x^j|_p\) for the coordinate vectors of the second chart. The transition map expresses the new coordinates as smooth functions of the old, and differentiating it gives the change-of-basis rule.

Proposition: Change of Coordinate Vectors and Components

Let \((x^i)\) and \((\tilde x^j)\) be two smooth coordinate systems near \(p\), with \(\widehat p\) the representation of \(p\) in the \((x^i)\) chart. The coordinate vectors transform by \[ \frac{\partial}{\partial x^i}\bigg|_p = \frac{\partial \tilde x^j}{\partial x^i}(\widehat p)\, \frac{\partial}{\partial \tilde x^j}\bigg|_p, \] and the components of a vector \(v = v^i\,\partial/\partial x^i|_p = \tilde v^j\,\partial/\partial\tilde x^j|_p\) transform by \[ \tilde v^j = \frac{\partial \tilde x^j}{\partial x^i}(\widehat p)\, v^i. \]

Proof:

Apply the coordinate vector \(\partial/\partial x^i|_p\) to the coordinate function \(\tilde x^j\) of the second chart: by the action formula it yields \(\partial \tilde x^j/\partial x^i(\widehat p)\), and by the component formula applied in the \((\tilde x^j)\) chart, \(\tilde v^j = v(\tilde x^j)\), this number is the \(\partial/\partial\tilde x^j|_p\)-component of \(\partial/\partial x^i|_p\). This is the first display. Substituting it into \(v = v^i\,\partial/\partial x^i|_p\) and collecting the coefficient of \(\partial/\partial\tilde x^j|_p\) gives the component rule.

A Coordinate Vector Depends on the Whole Coordinate System

It is tempting to read \(\partial/\partial x^i|_p\) as differentiation along the \(x^i\)-axis, depending only on the single function \(x^i\). It does not: the coordinate vector depends on the entire coordinate system, because it differentiates with respect to \(x^i\) while holding all the other coordinates fixed, and changing those other coordinates changes the direction. A sharp illustration on \(\mathbb{R}^2\) takes the standard coordinates \((x, y)\) and the new coordinates \[ \tilde x = x, \qquad \tilde y = y + x^3, \] which are global smooth coordinates, since the inverse \((x, y) = (\tilde x,\, \tilde y - \tilde x^3)\) is also smooth. At the point \(p\) with standard coordinates \((x, y) = (1, 0)\) we have \(\partial \tilde x/\partial x = 1\) and \(\partial \tilde y/\partial x = 3x^2 = 3 \neq 0\), so the change-of-basis rule gives \[ \frac{\partial}{\partial x}\bigg|_p = \frac{\partial}{\partial \tilde x}\bigg|_p + 3\,\frac{\partial}{\partial \tilde y}\bigg|_p \neq \frac{\partial}{\partial \tilde x}\bigg|_p, \] even though the coordinate functions \(x\) and \(\tilde x\) are identically equal. The first coordinate function is the same in both systems; the first coordinate vector is not.
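
A quick numerical check makes the distinction tangible. The sketch below applies both first coordinate vectors to the same function, the coordinate function \(\tilde y\), written once in each chart. The helper `partial` and the variable names are illustrative assumptions, and derivatives are approximated by central differences.

```python
def partial(g, point, i, h=1e-6):
    """Central-difference approximation of the i-th partial derivative of g at point."""
    up = list(point)
    dn = list(point)
    up[i] += h
    dn[i] -= h
    return (g(*up) - g(*dn)) / (2 * h)

# The function y~ = y + x^3, represented in each coordinate system.
f_std = lambda x, y: y + x ** 3     # in standard coordinates (x, y)
f_new = lambda xt, yt: yt           # in the new coordinates (x~, y~)

p_std = (1.0, 0.0)                  # p in standard coordinates
p_new = (1.0, 0.0 + 1.0 ** 3)       # the same point in the new coordinates

# d/dx|_p holds y fixed; d/dx~|_p holds y~ fixed.
d_dx = partial(f_std, p_std, 0)     # close to 3
d_dxt = partial(f_new, p_new, 0)    # 0, since y~ is held fixed
print(d_dx, d_dxt)
```

The two results differ because the two operators hold different second coordinates fixed, even though both differentiate with respect to the same first coordinate function.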

The Differential as a Jacobian

We promised that the differential, defined abstractly to be coordinate-free, would reduce to the Jacobian once coordinates are chosen. We now redeem that promise. The payoff is conceptual as much as computational: the coordinate-free definition was cooked up precisely so that the Jacobian — an array that obviously depends on the chosen coordinates — would acquire a meaning independent of them.

Proposition: The Differential in Coordinates Is the Jacobian

Let \(F : M \to N\) be smooth, let \((U, \varphi)\) and \((V, \psi)\) be charts containing \(p\) and \(F(p)\) with coordinates \((x^i)\) and \((y^j)\), and let \(\widehat F = \psi \circ F \circ \varphi^{-1}\) be the coordinate representation of \(F\). Then \[ dF_p\!\left( \frac{\partial}{\partial x^i}\bigg|_p \right) = \frac{\partial \widehat F^j}{\partial x^i}(\widehat p)\, \frac{\partial}{\partial y^j}\bigg|_{F(p)}. \] In other words, the matrix of \(dF_p\) with respect to the coordinate bases is the Jacobian matrix of \(\widehat F\) at \(\widehat p\), with \(j\) indexing rows (the target coordinates) and \(i\) indexing columns (the source coordinates).

Proof:

Let \(f \in C^\infty(N)\). Applying the differential and then the action formula in the source chart, \[ \begin{align*} dF_p\!\left( \frac{\partial}{\partial x^i}\bigg|_p \right) f &= \frac{\partial}{\partial x^i}\bigg|_p (f \circ F) \\\\ &= \frac{\partial \, \widehat{(f\circ F)}}{\partial x^i}(\widehat p) = \frac{\partial (\widehat f \circ \widehat F)}{\partial x^i}(\widehat p), \end{align*} \] where \(\widehat f = f \circ \psi^{-1}\) and the last equality uses \(\widehat{f\circ F} = \widehat f \circ \widehat F\). The ordinary chain rule in \(\mathbb{R}^n\) expands the right-hand side as \(\dfrac{\partial \widehat F^j}{\partial x^i}(\widehat p)\, \dfrac{\partial \widehat f}{\partial y^j}(\widehat F(\widehat p))\), which is exactly the coordinate action of the right-hand vector in the statement. As \(f\) was arbitrary, the two tangent vectors agree.

The matrix \(\bigl(\partial \widehat F^j/\partial x^i(\widehat p)\bigr)\) appearing here is precisely the Jacobian matrix of the coordinate representation \(\widehat F\). The Jacobian we met earlier as the matrix of partial derivatives is thus revealed to have been the shadow, in a particular pair of charts, of the coordinate-free linear map \(dF_p\) — the same identification the insight box above drew between pushforwards and Jacobian–vector products, now made exact.
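
The identification can be tested numerically on a small example. The sketch below is an assumption-laden illustration, not part of the development: the map `F_hat` from \(\mathbb{R}^2\) to \(\mathbb{R}^3\), the test function `f`, the point, and the helper names are all hypothetical, and partial derivatives are approximated by central differences. It compares the defining action \(dF_p(v)f = v(f \circ F)\) with the action of the vector whose components are the Jacobian applied to the components of \(v\).

```python
def partial(g, point, i, h=1e-6):
    """Central-difference approximation of the i-th partial derivative of g at point."""
    up = list(point)
    dn = list(point)
    up[i] += h
    dn[i] -= h
    return (g(*up) - g(*dn)) / (2 * h)

def F_hat(x1, x2):
    """Hypothetical coordinate representation of a smooth map R^2 -> R^3."""
    return (x1 * x2, x1 + x2 * x2, x1 * x1)

def f(y1, y2, y3):
    """Hypothetical smooth test function on the target."""
    return y1 * y2 + y3 * y3

def jacobian(F, point, m):
    """m x n matrix of partials: row j is the target index, column i the source index."""
    n = len(point)
    return [
        [partial(lambda *a, j=j: F(*a)[j], point, i) for i in range(n)]
        for j in range(m)
    ]

p_hat = (1.0, 2.0)
v = [3.0, -1.0]                                  # components of v in the source chart

# Left side: dF_p(v) f = v(f o F), computed in the source chart.
f_after_F = lambda x1, x2: f(*F_hat(x1, x2))
lhs = sum(v[i] * partial(f_after_F, p_hat, i) for i in range(2))

# Right side: push components forward with the Jacobian, then act on f at F(p).
J = jacobian(F_hat, p_hat, 3)
w = [sum(J[j][i] * v[i] for i in range(2)) for j in range(3)]
q_hat = F_hat(*p_hat)
rhs = sum(w[j] * partial(f, q_hat, j) for j in range(3))
print(w, lhs, rhs)
```

Agreement of `lhs` and `rhs` up to finite-difference error is exactly the chain-rule step in the proof, carried out for one choice of \(f\) and \(v\).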

A concrete computation shows the change-of-coordinates machinery and the Jacobian formula working together. We convert a tangent vector given in polar coordinates into standard coordinates.

Example: Polar to Cartesian Coordinates

The transition map between polar and standard coordinates on suitable open subsets of the plane is \((x, y) = (r\cos\theta,\, r\sin\theta)\). Let \(p\) be the point of \(\mathbb{R}^2\) with polar representation \(\widehat p = (r, \theta) = (2, \pi/2)\), and let \(v \in T_p\mathbb{R}^2\) be the tangent vector with polar representation \[ v = 3\,\frac{\partial}{\partial r}\bigg|_p - \frac{\partial}{\partial \theta}\bigg|_p . \] Applying the change-of-basis rule to the polar coordinate vectors, with the transition derivatives evaluated at \(\widehat p\) so that \(\cos(\pi/2) = 0\), \(\sin(\pi/2) = 1\), \(r = 2\), \[ \begin{align*} \frac{\partial}{\partial r}\bigg|_p &= \cos\!\Big(\tfrac{\pi}{2}\Big)\frac{\partial}{\partial x}\bigg|_p + \sin\!\Big(\tfrac{\pi}{2}\Big)\frac{\partial}{\partial y}\bigg|_p = \frac{\partial}{\partial y}\bigg|_p, \\\\ \frac{\partial}{\partial \theta}\bigg|_p &= -2\sin\!\Big(\tfrac{\pi}{2}\Big)\frac{\partial}{\partial x}\bigg|_p + 2\cos\!\Big(\tfrac{\pi}{2}\Big)\frac{\partial}{\partial y}\bigg|_p = -2\,\frac{\partial}{\partial x}\bigg|_p . \end{align*} \] Substituting these into the polar expression for \(v\) gives its standard-coordinate representation \[ v = 3\,\frac{\partial}{\partial y}\bigg|_p + 2\,\frac{\partial}{\partial x}\bigg|_p . \]
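
The same conversion can be reproduced as a matrix-vector product, using the Jacobian of the transition map \((r, \theta) \mapsto (r\cos\theta,\, r\sin\theta)\) written out analytically. This is a minimal sketch; the function names `polar_to_cartesian_jacobian` and `apply` are illustrative choices rather than anything defined in the text.

```python
import math

def polar_to_cartesian_jacobian(r, theta):
    """Jacobian of (x, y) = (r cos(theta), r sin(theta)).

    Rows are the target coordinates (x, y); columns are the source coordinates (r, theta).
    """
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]

def apply(matrix, components):
    """Matrix-vector product: components in the new basis from components in the old."""
    return [sum(row[i] * components[i] for i in range(len(components))) for row in matrix]

p_hat = (2.0, math.pi / 2)      # polar representation of p
v_polar = [3.0, -1.0]           # v = 3 d/dr - d/dtheta
v_cartesian = apply(polar_to_cartesian_jacobian(*p_hat), v_polar)
print(v_cartesian)              # approximately [2.0, 3.0]
```

Up to floating-point rounding in \(\cos(\pi/2)\), the output matches the hand computation: the \(\partial/\partial x\) component is 2 and the \(\partial/\partial y\) component is 3.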

With coordinate bases, the change-of-coordinates rule, and the Jacobian identification in place, the tangent space has become a working instrument: every tangent vector is a list of components, every smooth map acts by its Jacobian, and every change of chart is governed by a single transformation law. These are the computational foundations on which the tangent bundle — the assembly of all the tangent spaces into a single smooth manifold — is built.