Vector Spaces

Vector Space Null Space Column Space Basis Coordinate Systems Rank

Vector Spaces

Definition: Vector Space

A real vector space is a nonempty set \(V\) of vectors on which are defined addition and scalar multiplication, satisfying the following eight axioms.

For all \(\mathbf{u}, \mathbf{v}, \mathbf{w} \in V\) and scalars \(c, d \in \mathbb{R}\):

  1. Commutativity of vector addition: \(\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}\).
  2. Associativity of vector addition: \((\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})\).
  3. Existence of an additive identity: there exists \(\mathbf{0} \in V\) such that \(\mathbf{u} + \mathbf{0} = \mathbf{u}\).
  4. Existence of an additive inverse: for every \(\mathbf{u} \in V\) there exists \(-\mathbf{u} \in V\) such that \(\mathbf{u} + (-\mathbf{u}) = \mathbf{0}\).
  5. Existence of a multiplicative identity: \(1\mathbf{u} = \mathbf{u}\).
  6. Distributivity of scalar multiplication over vector addition: \(c(\mathbf{u} + \mathbf{v}) = c\mathbf{u} + c\mathbf{v}\).
  7. Distributivity of scalar multiplication over scalar addition: \((c + d)\mathbf{u} = c\mathbf{u} + d\mathbf{u}\).
  8. Compatibility of scalar multiplication: \(c(d\mathbf{u}) = (cd)\mathbf{u}\).

These axioms abstract the essential algebraic properties of \(\mathbb{R}^n\). By identifying which properties are fundamental, we can recognize vector space structure in many mathematical objects beyond \(\mathbb{R}^n\) — including spaces of functions, polynomials, matrices, and solutions to differential equations. Any set satisfying these eight axioms behaves algebraically like \(\mathbb{R}^n\), allowing us to extend linear algebra techniques to these more general settings.

From now on, the term "vector space" refers to a real vector space. If the context requires a complex vector space, where scalars \(c, d \in \mathbb{C}\), this will be explicitly mentioned. The general case — replacing \(\mathbb{R}\) by an arbitrary field — is developed in the upcoming section on abstract algebra, where fields and field-theoretic constructions are introduced.

The most immediate example of a vector space is \(\mathbb{R}^n\) for \(n \geq 1\). This is why the concept is introduced at this point in the curriculum — it generalizes the familiar arithmetic of \(\mathbb{R}^n\) to a broader setting.

Several immediate consequences follow from the axioms. For every \(\mathbf{u} \in V\) and every scalar \(c\): \[ 0\mathbf{u} = \mathbf{0}, \quad c\mathbf{0} = \mathbf{0}, \quad -\mathbf{u} = (-1)\mathbf{u}, \quad c\mathbf{u} = \mathbf{0} \Longrightarrow c = 0 \text{ or } \mathbf{u} = \mathbf{0}. \] As a representative derivation, we prove the first: \(0\mathbf{u} = \mathbf{0}\). By axiom 7 (distributivity over scalar addition), \(0\mathbf{u} = (0 + 0)\mathbf{u} = 0\mathbf{u} + 0\mathbf{u}\). Adding the additive inverse \(-(0\mathbf{u})\) (axiom 4) to both sides and using associativity (axiom 2), \[ \mathbf{0} = 0\mathbf{u} + \bigl(-(0\mathbf{u})\bigr) = \bigl(0\mathbf{u} + 0\mathbf{u}\bigr) + \bigl(-(0\mathbf{u})\bigr) = 0\mathbf{u} + \bigl(0\mathbf{u} + (-(0\mathbf{u}))\bigr) = 0\mathbf{u} + \mathbf{0} = 0\mathbf{u}. \] The remaining three consequences follow by analogous short arguments (for \(c\mathbf{0} = \mathbf{0}\) use axiom 6; for \(-\mathbf{u} = (-1)\mathbf{u}\) combine the previous result with axiom 5 and 7; for the last, argue by contradiction assuming \(c \neq 0\) and multiply by \(c^{-1}\)).

Definition: Vector Subspace

A subspace of a vector space \(V\) is a subset \(H \subseteq V\) satisfying:

  1. The zero vector of \(V\) lies in \(H\): \(\mathbf{0} \in H\).
  2. \(H\) is closed under vector addition: for all \(\mathbf{u}, \mathbf{v} \in H\), \(\mathbf{u} + \mathbf{v} \in H\).
  3. \(H\) is closed under scalar multiplication: for all \(\mathbf{u} \in H\) and every scalar \(c\), \(c\mathbf{u} \in H\).
Warning:

For example, the vector space \(\mathbb{R}^2\) is not a subspace of \(\mathbb{R}^3\), because \(\mathbb{R}^2\) is not even a subset of \(\mathbb{R}^3\): vectors in \(\mathbb{R}^2\) have two entries, those in \(\mathbb{R}^3\) have three.

Recall from the previous section the linear combination of vectors and the span of a set: \(\operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p\}\) denotes the set of all linear combinations of \(\mathbf{v}_1, \ldots, \mathbf{v}_p\). Since linear combinations involve only vector addition and scalar multiplication, and subspaces are closed under these operations, any span is automatically a subspace. We record this formally:

Theorem: Span is a Subspace

Let \(V\) be a vector space and \(\mathbf{v}_1, \ldots, \mathbf{v}_p \in V\). Then \(\operatorname{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}\) is a subspace of \(V\).

Proof:

Write \(H = \operatorname{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}\). We verify the three subspace conditions.

(1) \(\mathbf{0} \in H\). Using the consequence \(0\mathbf{v}_i = \mathbf{0}\) of the axioms (noted at the start of this section), \[ \mathbf{0} = 0\mathbf{v}_1 + 0\mathbf{v}_2 + \cdots + 0\mathbf{v}_p, \] so \(\mathbf{0}\) is a linear combination of the \(\mathbf{v}_i\) with all weights zero. Hence \(\mathbf{0} \in H\).

(2) Closure under addition. Let \(\mathbf{u}, \mathbf{w} \in H\), so \[ \mathbf{u} = \sum_{i=1}^p c_i \mathbf{v}_i, \qquad \mathbf{w} = \sum_{i=1}^p d_i \mathbf{v}_i \] for some scalars \(c_i, d_i\). By commutativity and associativity of vector addition (axioms 1–2) and distributivity of scalar multiplication over scalar addition (axiom 7), \[ \mathbf{u} + \mathbf{w} = \sum_{i=1}^p c_i \mathbf{v}_i + \sum_{i=1}^p d_i \mathbf{v}_i = \sum_{i=1}^p (c_i + d_i)\mathbf{v}_i, \] which is again a linear combination of the \(\mathbf{v}_i\). Hence \(\mathbf{u} + \mathbf{w} \in H\).

(3) Closure under scalar multiplication. Let \(\mathbf{u} = \sum c_i \mathbf{v}_i \in H\) and let \(c\) be any scalar. By distributivity of scalar multiplication over vector addition (axiom 6) and compatibility of scalar multiplication (axiom 8), \[ c \mathbf{u} = c \sum_{i=1}^p c_i \mathbf{v}_i = \sum_{i=1}^p (c c_i) \mathbf{v}_i, \] again a linear combination of the \(\mathbf{v}_i\). Hence \(c\mathbf{u} \in H\).

All three subspace conditions hold, so \(H\) is a subspace of \(V\).

Null Space

Having established the general concept of vector spaces and subspaces, we now examine two fundamental subspaces associated with any matrix \(A\): the null space and the column space. These subspaces encode essential information about the solutions to \(A\mathbf{x} = \mathbf{b}\) and about the linear transformation \(\mathbf{x} \mapsto A\mathbf{x}\).

Definition: Null Space

The null space of \(A \in \mathbb{R}^{m \times n}\) is the set of all solutions to the homogeneous equation \(A\mathbf{x} = \mathbf{0}\): \[ \operatorname{Nul} A = \{\mathbf{x} \in \mathbb{R}^n \mid A\mathbf{x} = \mathbf{0} \in \mathbb{R}^m\}. \] Equivalently, \(\operatorname{Nul} A\) is the set of vectors in \(\mathbb{R}^n\) mapped to the zero vector of \(\mathbb{R}^m\) by the linear transformation \(\mathbf{x} \mapsto A\mathbf{x}\).

More generally, for a linear transformation \(T : V \to W\) between vector spaces (not necessarily a matrix transformation), the null space is called the kernel of \(T\) and denoted \(\operatorname{Ker} T\): \[ \operatorname{Ker} T = \{\mathbf{u} \in V \mid T(\mathbf{u}) = \mathbf{0} \in W\}. \]

Theorem: Null Space is a Subspace

For every \(A \in \mathbb{R}^{m \times n}\), \(\operatorname{Nul} A\) is a subspace of \(\mathbb{R}^n\).

Proof:

We verify the three subspace conditions for \(H = \operatorname{Nul} A \subseteq \mathbb{R}^n\). Along the way we use the fact that matrix-vector multiplication distributes over vector addition and scalar multiplication — i.e., \(A(\mathbf{u} + \mathbf{v}) = A\mathbf{u} + A\mathbf{v}\) and \(A(c\mathbf{u}) = c(A\mathbf{u})\) — which follows from the entrywise definition of \(A\mathbf{x}\) (see the Linear Transformation page for the verification).

(1) \(\mathbf{0} \in \operatorname{Nul} A\). Computing entry-wise, \(A\mathbf{0}\) has \(i\)-th entry \(\sum_{j=1}^n a_{ij} \cdot 0 = 0\), so \(A\mathbf{0} = \mathbf{0}\). Hence \(\mathbf{0} \in \operatorname{Nul} A\).

(2) Closure under addition. Let \(\mathbf{u}, \mathbf{v} \in \operatorname{Nul} A\), so \(A\mathbf{u} = \mathbf{0}\) and \(A\mathbf{v} = \mathbf{0}\). Then \[ A(\mathbf{u} + \mathbf{v}) = A\mathbf{u} + A\mathbf{v} = \mathbf{0} + \mathbf{0} = \mathbf{0}, \] so \(\mathbf{u} + \mathbf{v} \in \operatorname{Nul} A\).

(3) Closure under scalar multiplication. Let \(\mathbf{u} \in \operatorname{Nul} A\) and let \(c\) be a scalar. Then \[ A(c\mathbf{u}) = c(A\mathbf{u}) = c\mathbf{0} = \mathbf{0}, \] so \(c\mathbf{u} \in \operatorname{Nul} A\).

All three conditions hold. The identical argument with \(A\) replaced by a general linear transformation \(T : V \to W\) shows that \(\operatorname{Ker} T\) is a subspace of \(V\).

Example:

Consider the augmented matrix for \(A\mathbf{x} = \mathbf{0}\) and its reduced row echelon form: \[ \begin{align*} &\begin{bmatrix} 2 & 4 & 6 & 8 & 0 \\ 1 & 3 & 5 & 7 & 0 \end{bmatrix} \\\\ &\xrightarrow{\operatorname{rref}} \begin{bmatrix} 1 & 0 & -1 & -2 & 0 \\ 0 & 1 & 2 & 3 & 0 \end{bmatrix}. \end{align*} \] Here \(x_3\) and \(x_4\) are free variables, and the solution can be written as \[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = x_3 \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix}. \] Thus a spanning set for \(\operatorname{Nul} A\) is \[ \left\{ \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix} \right\}. \] Every linear combination of these two vectors lies in \(\operatorname{Nul} A\).

Column Space

While the null space identifies vectors mapped to zero, the column space identifies all possible outputs of the linear transformation \(\mathbf{x} \mapsto A\mathbf{x}\). Together, the null space and column space give complementary perspectives on the behavior of \(A\): one captures loss of information, the other captures reach.

Definition: Column Space and Row Space

The column space of \(A \in \mathbb{R}^{m \times n}\) is the set of all linear combinations of the columns of \(A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}\): \[ \operatorname{Col} A = \operatorname{Span}\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\} = \{A\mathbf{x} \mid \mathbf{x} \in \mathbb{R}^n\} \subseteq \mathbb{R}^m. \] Equivalently, \(\operatorname{Col} A\) is the range of the linear transformation \(\mathbf{x} \mapsto A\mathbf{x}\).

The row space of \(A\), denoted \(\operatorname{Row} A\), is the column space of \(A^T\): \[ \operatorname{Row} A = \operatorname{Col} A^T \subseteq \mathbb{R}^n. \] It is the span of the row vectors of \(A\) (viewed as vectors in \(\mathbb{R}^n\)).

Theorem: Column and Row Spaces are Subspaces

For every \(A \in \mathbb{R}^{m \times n}\), \(\operatorname{Col} A\) is a subspace of \(\mathbb{R}^m\) and \(\operatorname{Row} A\) is a subspace of \(\mathbb{R}^n\).

Proof:

By definition, \(\operatorname{Col} A = \operatorname{Span}\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}\) where \(\mathbf{a}_i \in \mathbb{R}^m\). The Span is a Subspace theorem above, applied in \(V = \mathbb{R}^m\), immediately gives that \(\operatorname{Col} A\) is a subspace of \(\mathbb{R}^m\). The row-space claim follows by the same argument applied to \(A^T \in \mathbb{R}^{n \times m}\).

Basis

When describing a subspace, we want a spanning set that contains no redundant vectors — that is, a set where each vector contributes essential information. This leads to the notion of a basis, the most economical way to describe a subspace.

Definition: Basis

Let \(H\) be a subspace of a vector space \(V\). An indexed set of vectors \[ \mathcal{B} = \{\mathbf{b}_1, \ldots, \mathbf{b}_p\} \subseteq V \] is a basis for \(H\) if

  1. \(\mathcal{B}\) is linearly independent, and
  2. \(H = \operatorname{Span}\{\mathbf{b}_1, \ldots, \mathbf{b}_p\}\).

For example, the columns of the identity matrix \(I_n\) form the standard basis for \(\mathbb{R}^n\). More generally, the columns of any \(n \times n\) invertible matrix form a basis for \(\mathbb{R}^n\), since by the Invertible Matrix Theorem those columns are simultaneously linearly independent and span \(\mathbb{R}^n\).

The spanning set we found earlier for \(\operatorname{Nul} A\), \[ \left\{ \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix} \right\}, \] is in fact a basis for \(\operatorname{Nul} A\) — linearly independent because each basis vector has a \(1\) in its "own" free-variable coordinate (\(x_3\) or \(x_4\)) and \(0\) in every other free-variable coordinate, so any linear relation among them forces every coefficient to be \(0\).

For a basis of \(\operatorname{Col} A\), we need a linearly independent spanning set with no redundancy. Consider \[ \begin{align*} A &= \begin{bmatrix} 2 & 4 & 6 & 8 \\ 1 & 3 & 5 & 7 \end{bmatrix} \\\\ &\xrightarrow{\operatorname{rref}} \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \end{bmatrix} = B. \end{align*} \] In \(A\), columns \(\mathbf{a}_3\) and \(\mathbf{a}_4\) are linear combinations of earlier columns: \[ \mathbf{a}_3 = -\mathbf{a}_1 + 2\mathbf{a}_2, \qquad \mathbf{a}_4 = -2\mathbf{a}_1 + 3\mathbf{a}_2. \] The pivot columns of \(B\) are linearly independent, and since \(A\) is row-equivalent to \(B\), the pivot columns of \(A\) — namely \(\mathbf{a}_1\) and \(\mathbf{a}_2\) — form a basis for \(\operatorname{Col} A\): \[ \left\{ \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \; \begin{bmatrix} 4 \\ 3 \end{bmatrix} \right\}. \]

The representations of \(\mathbf{a}_3\) and \(\mathbf{a}_4\) in terms of \(\mathbf{a}_1\) and \(\mathbf{a}_2\) are unique. This is a special case of the following general principle.

Theorem: Unique Representation in a Basis

Let \(\mathcal{B} = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}\) be a basis for a vector space \(V\). Then for every \(\mathbf{x} \in V\) there exists a unique tuple of scalars \((c_1, \ldots, c_n)\) such that \[ \mathbf{x} = c_1 \mathbf{b}_1 + \cdots + c_n \mathbf{b}_n. \]

Proof:

Existence. Since \(\mathcal{B}\) spans \(V\), there exist scalars \(c_1, \ldots, c_n\) with \[ \mathbf{x} = c_1 \mathbf{b}_1 + \cdots + c_n \mathbf{b}_n. \tag{1} \]

Uniqueness. Suppose also \[ \mathbf{x} = d_1 \mathbf{b}_1 + \cdots + d_n \mathbf{b}_n. \tag{2} \] Subtracting (2) from (1), \[ \mathbf{0} = (c_1 - d_1)\mathbf{b}_1 + \cdots + (c_n - d_n)\mathbf{b}_n. \] By linear independence of \(\mathcal{B}\), each coefficient vanishes: \(c_i - d_i = 0\), so \(c_i = d_i\) for all \(i\). The representation is therefore unique.

Coordinate Systems

The uniqueness guaranteed by the Unique Representation theorem means that once we fix a basis \(\mathcal{B}\) for a vector space \(V\), each vector \(\mathbf{x} \in V\) corresponds to exactly one tuple of scalars. This establishes a coordinate system on \(V\) relative to \(\mathcal{B}\), and turns the abstract vector space \(V\) into something we can manipulate with arithmetic in \(\mathbb{R}^n\).

Definition: Coordinate Vector

Let \(\mathcal{B} = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}\) be a basis for a vector space \(V\), and let \(\mathbf{x} \in V\). The \(\mathcal{B}\)-coordinates of \(\mathbf{x}\) are the unique scalars \(c_1, \ldots, c_n\) with \[ \mathbf{x} = c_1 \mathbf{b}_1 + \cdots + c_n \mathbf{b}_n. \] The vector \[ [\mathbf{x}]_{\mathcal{B}} = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \in \mathbb{R}^n \] is called the coordinate vector of \(\mathbf{x}\) relative to \(\mathcal{B}\), and the assignment \(\mathbf{x} \mapsto [\mathbf{x}]_{\mathcal{B}}\) is the coordinate mapping.

Theorem: The Coordinate Mapping is a Linear Isomorphism

Let \(\mathcal{B} = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}\) be a basis for a vector space \(V\). The coordinate mapping \[ \varphi : V \to \mathbb{R}^n, \qquad \varphi(\mathbf{x}) = [\mathbf{x}]_{\mathcal{B}} \] is linear, one-to-one, and onto — in other words, a linear isomorphism between \(V\) and \(\mathbb{R}^n\).

Proof:

By the Unique Representation theorem above, \(\varphi\) is a well-defined function: each \(\mathbf{x} \in V\) has exactly one \(\mathcal{B}\)-coordinate tuple. We verify the three required properties.

Linearity. Let \(\mathbf{x}, \mathbf{y} \in V\) and let \(c\) be a scalar. Write \[ \mathbf{x} = \sum_{i=1}^n c_i \mathbf{b}_i, \qquad \mathbf{y} = \sum_{i=1}^n d_i \mathbf{b}_i, \] so \(\varphi(\mathbf{x}) = (c_1, \ldots, c_n)^T\) and \(\varphi(\mathbf{y}) = (d_1, \ldots, d_n)^T\). By the vector space axioms (commutativity, associativity, and distributivity over scalar addition, exactly as in the Span is a Subspace proof), \[ \mathbf{x} + \mathbf{y} = \sum_{i=1}^n (c_i + d_i) \mathbf{b}_i. \] By uniqueness of basis representation, this sum is the \(\mathcal{B}\)-representation of \(\mathbf{x} + \mathbf{y}\), so \[ \varphi(\mathbf{x} + \mathbf{y}) = (c_1 + d_1, \ldots, c_n + d_n)^T = \varphi(\mathbf{x}) + \varphi(\mathbf{y}). \] Similarly, by distributivity over vector addition and compatibility (axioms 6 and 8), \[ c\mathbf{x} = c \sum_{i=1}^n c_i \mathbf{b}_i = \sum_{i=1}^n (c c_i) \mathbf{b}_i, \] so \(\varphi(c\mathbf{x}) = (c c_1, \ldots, c c_n)^T = c\, \varphi(\mathbf{x})\). Hence \(\varphi\) is linear.

One-to-one. Suppose \(\varphi(\mathbf{x}) = \varphi(\mathbf{y})\). Writing \(\mathbf{x} = \sum c_i \mathbf{b}_i\) and \(\mathbf{y} = \sum d_i \mathbf{b}_i\), the hypothesis says \(c_i = d_i\) for each \(i\), hence \(\mathbf{x} = \mathbf{y}\). (Equivalently: by linearity, it suffices to check that \(\varphi(\mathbf{x}) = \mathbf{0}\) forces \(\mathbf{x} = \mathbf{0}\); but \(\varphi(\mathbf{x}) = \mathbf{0}\) means all \(c_i = 0\), hence \(\mathbf{x} = \sum 0 \cdot \mathbf{b}_i = \mathbf{0}\) by the consequence \(0\mathbf{b}_i = \mathbf{0}\) noted earlier.)

Onto. Given any \((c_1, \ldots, c_n)^T \in \mathbb{R}^n\), the vector \(\mathbf{x} := \sum_{i=1}^n c_i \mathbf{b}_i \in V\) satisfies \(\varphi(\mathbf{x}) = (c_1, \ldots, c_n)^T\). So every element of \(\mathbb{R}^n\) is in the image of \(\varphi\).

The notion of isomorphism is developed systematically in the upcoming abstract algebra series; here it captures the idea that \(V\) and \(\mathbb{R}^n\) are "the same" vector space up to relabeling of vectors. The practical consequence is that we can analyze any finite-dimensional vector space via the familiar \(\mathbb{R}^n\), with all vector-space operations preserved by the coordinate mapping.

Dimension and Rank

The dimension of a subspace is a fundamental invariant that measures its "size." For matrix-associated subspaces, this invariant has special significance and is captured by the notion of rank.

Definition: Dimension

The dimension of a vector space \(V\), written \(\dim V\), is the number of vectors in a basis for \(V\). A vector space is finite-dimensional if it has a basis with finitely many vectors, and infinite-dimensional otherwise.

For this definition to be sensible, any two bases of the same vector space must contain the same number of vectors — a non-trivial theorem proved below.

Theorem: Invariance of Dimension

Let \(V\) be a vector space that has a finite basis \(\mathcal{B} = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}\). Then every basis of \(V\) has exactly \(n\) vectors.

The core of the proof is a classical counting lemma: in a space spanned by \(n\) vectors, any collection of \(n + 1\) or more vectors must be linearly dependent. We isolate this as a separate statement because it is of independent interest — it is the germ of most of linear algebra's dimension arguments.

Lemma: Spanning Set Bound

Let \(V\) be a vector space spanned by \(n\) vectors \(\mathbf{v}_1, \ldots, \mathbf{v}_n\). Then any set of more than \(n\) vectors in \(V\) is linearly dependent.

Proof of Lemma:

Let \(\mathbf{w}_1, \ldots, \mathbf{w}_{n+1} \in V\); we show this set is linearly dependent. (If we are given more than \(n + 1\) vectors, the same argument applies to any \(n + 1\) of them, and linear dependence of a subset implies linear dependence of the whole set.)

Since \(V = \operatorname{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}\), each \(\mathbf{w}_j\) admits an expansion \[ \mathbf{w}_j = \sum_{i=1}^n a_{ij} \mathbf{v}_i \qquad (j = 1, \ldots, n+1) \] for some scalars \(a_{ij}\). Arrange the coefficients into the \(n \times (n+1)\) matrix \(A = (a_{ij})\).

Consider the homogeneous system \(A\mathbf{c} = \mathbf{0}\) in the unknown \(\mathbf{c} = (c_1, \ldots, c_{n+1})^T \in \mathbb{R}^{n+1}\). This is a system of \(n\) equations in \(n + 1\) unknowns. In any row echelon form of \(A\), there are at most \(n\) pivots (one per row), hence at most \(n\) basic variables — leaving at least one free variable among the \(n + 1\) unknowns. Setting the free variable to \(1\) (and solving for the basic variables) produces a nontrivial solution \(\mathbf{c} \neq \mathbf{0}\) with \(A\mathbf{c} = \mathbf{0}\).

Using this \(\mathbf{c}\), compute \[ \sum_{j=1}^{n+1} c_j \mathbf{w}_j = \sum_{j=1}^{n+1} c_j \sum_{i=1}^n a_{ij} \mathbf{v}_i = \sum_{i=1}^n \Bigg(\sum_{j=1}^{n+1} a_{ij} c_j \Bigg) \mathbf{v}_i = \sum_{i=1}^n (A\mathbf{c})_i\, \mathbf{v}_i = \sum_{i=1}^n 0 \cdot \mathbf{v}_i = \mathbf{0}. \] (The interchange of sums uses commutativity and associativity of addition, and the extraction of coefficients uses distributivity — axioms 1, 2, and 6–8.) This exhibits a nontrivial linear combination of the \(\mathbf{w}_j\) equal to \(\mathbf{0}\), so \(\{\mathbf{w}_1, \ldots, \mathbf{w}_{n+1}\}\) is linearly dependent.

Proof of Invariance of Dimension:

Let \(\mathcal{B}' = \{\mathbf{b}'_1, \ldots, \mathbf{b}'_m\}\) be another basis of \(V\). Since \(\mathcal{B} = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}\) spans \(V\) with \(n\) vectors, and \(\mathcal{B}'\) is a linearly independent set in \(V\), the Lemma forbids \(\mathcal{B}'\) from having more than \(n\) elements: \(m \leq n\).

By symmetry (swapping the roles of \(\mathcal{B}\) and \(\mathcal{B}'\)): \(\mathcal{B}'\) spans \(V\) with \(m\) vectors, and \(\mathcal{B}\) is linearly independent, so \(n \leq m\).

Combining, \(m = n\). The cardinality of any basis for \(V\) equals \(n\), so \(\dim V\) is well-defined.

Definition: Rank

The rank of a matrix \(A\) is the dimension of its column space: \[ \operatorname{rank} A = \dim(\operatorname{Col} A). \] Equivalently (as we will see in the Rank–Nullity proof), \(\operatorname{rank} A\) is the number of pivot columns in any row echelon form of \(A\).

Two immediate companion quantities: since \(\operatorname{Row} A = \operatorname{Col} A^T\), we have \(\dim(\operatorname{Row} A) = \operatorname{rank} A^T\). And \(\dim(\operatorname{Nul} A)\) equals the number of free variables in \(A\mathbf{x} = \mathbf{0}\) — equivalently, the number of non-pivot columns of \(A\). The relationship between rank and nullity is captured by the following fundamental theorem.

Theorem: Rank–Nullity

For every \(m \times n\) matrix \(A\), \[ \operatorname{rank} A + \dim(\operatorname{Nul} A) = n. \]

Proof:

Let \(R\) be the reduced row echelon form of \(A\), and let \(r\) be the number of pivot columns of \(R\) (equivalently, the number of leading \(1\)'s in \(R\)). We claim \(\operatorname{rank} A = r\) and \(\dim(\operatorname{Nul} A) = n - r\); the theorem then follows.

Throughout, we use a key invariant of row reduction: the homogeneous equation \(A\mathbf{x} = \mathbf{0}\) has exactly the same solutions as \(R\mathbf{x} = \mathbf{0}\), since each elementary row operation is performed by left-multiplication by an invertible elementary matrix (see the Elementary Matrices section). In particular, a linear relation \(\sum_j c_j \mathbf{a}_j = \mathbf{0}\) among the columns of \(A\) holds if and only if the same relation \(\sum_j c_j \mathbf{r}_j = \mathbf{0}\) holds among the columns of \(R\), because both are statements of the form \(A\mathbf{c} = \mathbf{0}\) and \(R\mathbf{c} = \mathbf{0}\) with \(\mathbf{c} = (c_1, \ldots, c_n)^T\).

Step 1: the pivot columns of \(A\) form a basis for \(\operatorname{Col} A\), so \(\dim(\operatorname{Col} A) = r\).

Let \(j_1 < j_2 < \cdots < j_r\) be the indices of the pivot columns, and let \(j_{r+1} < \cdots < j_n\) be the indices of the non-pivot columns. In \(R\), the pivot columns are precisely the standard basis vectors \(\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_r \in \mathbb{R}^m\) (this is exactly the RREF condition: each leading \(1\) sits in a column whose other entries are all \(0\)). These are clearly linearly independent in \(\mathbb{R}^m\). By the invariant above, the pivot columns \(\mathbf{a}_{j_1}, \ldots, \mathbf{a}_{j_r}\) of \(A\) are also linearly independent.

Moreover, each non-pivot column \(\mathbf{r}_{j_k}\) of \(R\) (for \(k > r\)) is a linear combination of the preceding pivot columns of \(R\): its entries in rows \(1, \ldots, r\) give the coefficients, and all later rows of \(R\) are zero. Writing this relation as \(\mathbf{r}_{j_k} = \sum_{\ell=1}^r \alpha_\ell \mathbf{e}_\ell\) and translating back to \(A\) by the same invariant (applied to the relation \(\sum \alpha_\ell \mathbf{r}_{j_\ell} - \mathbf{r}_{j_k} = \mathbf{0}\)), we obtain \(\mathbf{a}_{j_k} = \sum_{\ell=1}^r \alpha_\ell \mathbf{a}_{j_\ell}\). Hence every column of \(A\) — pivot or not — is a linear combination of the pivot columns of \(A\), so \(\{\mathbf{a}_{j_1}, \ldots, \mathbf{a}_{j_r}\}\) spans \(\operatorname{Col} A\).

The pivot columns of \(A\) are therefore a basis for \(\operatorname{Col} A\). By the Invariance of Dimension theorem — which guarantees every basis of a finite-dimensional space has the same cardinality — we conclude \(\operatorname{rank} A = \dim(\operatorname{Col} A) = r\).

Step 2: \(\dim(\operatorname{Nul} A) = n - r\).

Write the \(n - r\) non-pivot columns of \(R\) as free variables \(x_{j_{r+1}}, \ldots, x_{j_n}\). Solving \(R\mathbf{x} = \mathbf{0}\) expresses each basic variable \(x_{j_\ell}\) (for \(1 \leq \ell \leq r\)) as a linear combination of the free variables. Writing the general solution in parametric vector form, \[ \mathbf{x} = x_{j_{r+1}} \mathbf{v}_{r+1} + x_{j_{r+2}} \mathbf{v}_{r+2} + \cdots + x_{j_n} \mathbf{v}_n, \] where each \(\mathbf{v}_k \in \mathbb{R}^n\) is obtained by setting the free variable \(x_{j_k}\) to \(1\) and all other free variables to \(0\), then reading off the resulting basic coordinates. By the invariant, this is also the general solution of \(A\mathbf{x} = \mathbf{0}\), so \(\{\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}\) spans \(\operatorname{Nul} A\).

To see linear independence: in \(\mathbf{v}_k\), the coordinate at position \(j_k\) (the free variable we set to \(1\)) equals \(1\), while at position \(j_{k'}\) for \(k' \neq k\) (the other free variables set to \(0\)) it equals \(0\). Suppose \(\sum_{k=r+1}^n t_k \mathbf{v}_k = \mathbf{0}\). Reading the coordinate at position \(j_k\) gives \(t_k \cdot 1 + \sum_{k' \neq k} t_{k'} \cdot 0 = t_k\), so \(t_k = 0\) for each \(k\).

The vectors \(\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\) thus form a basis for \(\operatorname{Nul} A\); combined with Invariance of Dimension, this yields \(\dim(\operatorname{Nul} A) = n - r\).

Combining Steps 1 and 2, \(\operatorname{rank} A + \dim(\operatorname{Nul} A) = r + (n - r) = n\).