Prerequisites

  • chapter-06-subspaces-span-independence
  • chapter-14-row-space-and-rank-nullity

Learning Objectives

  • Define a basis as a set that is simultaneously linearly independent and spanning, and explain why both conditions are necessary.
  • Define the dimension of a vector space as the number of vectors in any basis, and justify that every basis has the same size.
  • Compute the coordinate vector of a vector relative to a given basis by solving a linear system, and verify by reconstruction.
  • Explain why the same vector has different coordinate lists in different bases, using a plane in three-dimensional space as the anchor.
  • Distinguish the standard basis from an arbitrary basis and state when coordinates equal the raw entries of a vector.
  • Argue that every finitely generated vector space has a basis, and find a basis for spaces such as polynomials and matrices.

Dimension, Basis, and Coordinates: How Many Numbers Do You Need?

Learning paths. Math majors — read everything, especially the replacement-theorem proof in §15.7 (that all bases have the same size) and the Math-Major Sidebars on the coordinate isomorphism and on infinite-dimensional spaces. CS / Data Science — focus on the Geometric Intuition callouts, the coordinate computations in §15.5 and §15.8, the numpy verifications, and the "how many numbers" reading of dimension; the deepest proof is optional but the coordinate machinery is essential. Physics / Engineering — focus on the geometry of a plane in three-space, the degrees-of-freedom reading of dimension, and the worked coordinate examples; the replacement theorem is the one proof worth your time. This chapter assumes span and linear independence from Chapter 6 and the rank ideas of Chapter 14.

15.1 How many numbers does it take to name a point?

Here is a question so basic it sounds like it can't have a deep answer: when you write down a vector, how many numbers do you actually need? In the plane you write two, like $(3, 5)$. In space you write three. On a line, one. The pattern feels obvious — until you notice that the same point can be named by completely different lists of numbers, and that the count itself is a property worth proving, not assuming.

Picture a flat sheet of paper held at an angle in the air. It is a genuinely two-dimensional thing: an ant walking on it has exactly two independent directions to move. Yet that sheet lives in three-dimensional space, and every point on it carries a clumsy three-number address $(x, y, z)$ inherited from the room around it. Three numbers to describe a two-dimensional surface feels wasteful — and it is. If you laid down two rulers on the sheet itself, you could name any point with just two numbers. The sheet doesn't need the third number; the room does.

This chapter is about turning that intuition into mathematics. We will make three ideas precise, and they are the load-bearing concepts for everything that follows. A basis is the minimal set of rulers — the smallest collection of direction-vectors that can reach every point in a space without any redundancy. The dimension is how many rulers you need, and we will prove the count is the same no matter which valid set of rulers you pick. And the coordinate vector is the actual list of numbers a basis assigns to a vector — the answer to "how far along each ruler?"

The Key Insight — A vector is not its list of numbers. The list is the vector's address relative to a chosen basis. Change the basis and the same geometric arrow gets a different address — the way the same building has one address in street numbers and a different one in GPS coordinates. The arrow never moved; we re-measured it.

The throughline of this whole book is that a matrix is a transformation, and that the same transformation looks like different matrices in different coordinate systems. You cannot understand that sentence until you understand coordinates — and coordinates require a basis. So Chapter 15 is the hinge: it is where the abstract span-and-independence machinery of Chapter 6 finally becomes a usable tool, and it sets up Chapter 16, where we will watch the matrix of a transformation morph as we change the basis underneath it.

Keep one concrete picture in your pocket for the entire chapter: a plane through the origin in $\mathbb{R}^3$. It is the simplest example where the gap between "numbers the room uses" (three) and "numbers the space needs" (two) is visible, and we will return to it again and again.

What does "how many numbers do you need" actually mean?

It is tempting to answer "however many entries the vector has" — but that answer is about how the vector is written, not about the space it lives in. A point on our tilted plane is written with three entries because the plane sits in $\mathbb{R}^3$, yet the plane is two-dimensional. The honest version of the question is: what is the fewest numbers from which the whole space can be reconstructed, given an agreed-on set of building blocks? That fewest count is the dimension, and the building blocks are a basis. Everything in this chapter is an unpacking of that one sentence.

15.2 What is a basis, and why does it need two conditions?

Recall from Chapter 6 two properties a set of vectors might have. A set spans a space $V$ if every vector in $V$ is some linear combination of the set — the set reaches everything. A set is linearly independent if no vector in it is a linear combination of the others — there is no redundancy. Spanning is about being enough; independence is about being not too much. A basis is the Goldilocks set that is both at once.

Definition (basis). A basis for a vector space $V$ is a set of vectors $\{\mathbf{b}_1, \mathbf{b}_2, \dots, \mathbf{b}_n\}$ that (1) is linearly independent and (2) spans $V$. Both conditions are required. We will also fix the order of the vectors, treating a basis as an ordered list, because that is what makes coordinates a well-defined list rather than an unordered bag.

Why insist on both? Because each condition rules out a different kind of failure, and a "minimal measuring stick" needs both ruled out.

Geometric Intuition — Think of a basis as a set of measuring rulers laid down at the origin, one per independent direction. Spanning means the rulers reach everywhere: stand at any point in the space and you can get there by walking some distance along each ruler. Independence means no ruler is wasted: you can't reproduce one ruler's direction by combining the others, so dropping any ruler would leave a direction you can no longer measure. A spanning-but-dependent set has a redundant ruler (two rulers pointing the same way); an independent-but-non-spanning set is missing a ruler (a direction you can't reach). A basis has exactly one ruler per direction — no more, no less.

Let's see both failure modes concretely in $\mathbb{R}^2$.

  • Too few (independent but not spanning). The single vector $\{(1, 0)\}$ is independent (a one-element set with a nonzero vector always is), but it does not span $\mathbb{R}^2$ — you can only reach the horizontal axis. You can measure horizontal displacement but you have no ruler for vertical. Not a basis.
  • Too many (spanning but dependent). The set $\{(1,0), (0,1), (1,1)\}$ spans $\mathbb{R}^2$ — in fact the first two already do — but it is dependent, since $(1,1) = (1,0) + (0,1)$. The third vector is a redundant ruler. Coordinates would not be unique: the point $(1,1)$ could be reached as "one of the third ruler" or as "one of the first plus one of the second." Not a basis.
  • Just right. The set $\{(1,0), (0,1)\}$ is independent and spanning. It is a basis — the famous standard basis of $\mathbb{R}^2$, which we'll write $\{\mathbf{e}_1, \mathbf{e}_2\}$.

Common Pitfall: "A basis is just any spanning set" or "a basis is just any independent set." Neither alone is enough, and the two errors have opposite consequences. A spanning set that is dependent gives you non-unique coordinates (more than one recipe reaches the same point). An independent set that doesn't span leaves vectors you cannot name at all. You need both conditions precisely so that every vector has exactly one coordinate list: existence (from spanning) and uniqueness (from independence). We make this airtight in §15.4.
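The three cases above can be checked numerically. The sketch below is one possible test, not an official toolkit routine: it stacks the candidate vectors as columns and uses the matrix rank (Chapter 14) to decide independence and spanning. The helper name `classify` is hypothetical.

```python
import numpy as np

def classify(vectors, n):
    """Return (independent, spanning) for a list of length-n vectors.

    Hypothetical helper: rank == number of vectors means independent,
    rank == n means the vectors span R^n.
    """
    A = np.column_stack(vectors)          # candidate vectors as columns
    rank = np.linalg.matrix_rank(A)
    independent = bool(rank == len(vectors))
    spanning = bool(rank == n)
    return independent, spanning

too_few    = classify([(1, 0)], 2)                  # independent, not spanning
too_many   = classify([(1, 0), (0, 1), (1, 1)], 2)  # spanning, not independent
just_right = classify([(1, 0), (0, 1)], 2)          # both hold: a basis

for name, result in [("too few", too_few), ("too many", too_many), ("just right", just_right)]:
    print(f"{name:10s} independent={result[0]}  spanning={result[1]}")
```

Only the last set passes both tests, which is the rank-based version of "exactly one ruler per direction."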

What is the standard basis, and why is it "standard"?

The standard basis of $\mathbb{R}^n$ is $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$, where $\mathbf{e}_i$ is the vector with a $1$ in position $i$ and $0$ everywhere else. In $\mathbb{R}^3$ these are $\mathbf{e}_1 = (1,0,0)$, $\mathbf{e}_2 = (0,1,0)$, $\mathbf{e}_3 = (0,0,1)$. It is "standard" for one reason that will become the punchline of §15.6: when you measure a vector against the standard basis, the coordinates are the entries you already wrote down. The vector $(3, 5, -2)$ is exactly $3\mathbf{e}_1 + 5\mathbf{e}_2 - 2\mathbf{e}_3$, so its standard coordinates are $(3, 5, -2)$ — no computation needed. The standard basis is the one where the address and the arrow look identical, which is exactly why it hides the distinction this chapter is built to expose.

Check Your Understanding — Is $\{(2, 1), (4, 2)\}$ a basis for $\mathbb{R}^2$? Why or why not?

Answer: No. The second vector is exactly $2$ times the first, so the set is linearly dependent (it fails the independence condition). Geometrically both rulers point in the same direction, along the line $y = x/2$, so the set spans only that line, not all of $\mathbb{R}^2$. It fails both conditions at once: dependent, and not spanning. A basis of $\mathbb{R}^2$ needs two vectors pointing in genuinely different directions.

Two faces of "minimal": a basis from above and from below

The phrase "minimal measuring stick" hides a precise double meaning that is worth making explicit, because it gives you two independent ways to recognize a basis. A basis is at once a maximal independent set (independent, but so full that you cannot add any vector of the space without creating a dependence) and a minimal spanning set (spanning, but so lean that you cannot remove any vector without losing the ability to reach everything). The same Goldilocks set is "as large as independence allows" and "as small as spanning allows" simultaneously, and that coincidence is exactly what makes it special.

Here is why a basis is a maximal independent set. Suppose $\{\mathbf{b}_1, \dots, \mathbf{b}_n\}$ is a basis and you try to add any vector $\mathbf{w}$ from the space. Because the basis spans, $\mathbf{w}$ is already some combination of the $\mathbf{b}_i$ — so the enlarged set $\{\mathbf{b}_1, \dots, \mathbf{b}_n, \mathbf{w}\}$ contains a vector ($\mathbf{w}$) that is a combination of the others, making it dependent. You cannot grow a basis and stay independent. Conversely, why is a basis a minimal spanning set? If you remove some $\mathbf{b}_k$, the remaining vectors can no longer produce $\mathbf{b}_k$ (independence guarantees $\mathbf{b}_k$ was not a combination of the others), so the shrunken set fails to span — it has lost the direction $\mathbf{b}_k$ supplied. You cannot shrink a basis and keep spanning. A basis sits exactly at the boundary between "too small to span" and "too big to stay independent."
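Both directions of this argument can be illustrated with a rank computation. The sketch below assumes an arbitrarily chosen basis of $\mathbb{R}^3$ and an arbitrary extra vector (the specific numbers are illustrative): appending a vector of the space leaves the rank unchanged, so the enlarged set is dependent, while deleting a basis vector lowers the rank, so the shrunken set no longer spans.

```python
import numpy as np

# An illustrative basis of R^3 (assumed for this sketch), stored as columns.
B = np.column_stack([(1., 0, 0), (1., 1, 0), (1., 1, 1)])
n = B.shape[0]
assert np.linalg.matrix_rank(B) == n         # 3 independent vectors in R^3

# Maximal independent: append any w from the space.
w = np.array([2., -1., 5.])
grown = np.column_stack([B, w])
rank_grown = np.linalg.matrix_rank(grown)    # rank stays 3 with 4 vectors: dependent

# Minimal spanning: delete one basis vector (here the middle column).
shrunk = np.delete(B, 1, axis=1)
rank_shrunk = np.linalg.matrix_rank(shrunk)  # rank 2 < 3: no longer spans R^3

print("grown: ", grown.shape[1], "vectors, rank", rank_grown)
print("shrunk:", shrunk.shape[1], "vectors, rank", rank_shrunk)
```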

The Key Insight: A basis is the sweet spot where two opposite pressures meet. Push for fewer vectors (minimal spanning) and push for more vectors (maximal independent), and they balance at the same set. A space has many bases, but every one of them sits at this same balance point. This is why a basis is the right notion of "coordinate system": it is the smallest list that still names every vector (no missing directions) and the largest list with no redundancy (no ambiguous names). Either characterization can be used as a test: to confirm a candidate is a basis, you may check "independent and can't be extended" or "spanning and can't be reduced," whichever is easier in context.

Historical Note: The modern axiomatic notions of basis and dimension crystallized with Giuseppe Peano's 1888 Calcolo geometrico, the first place vector spaces were defined by axioms much as we state them today. Peano was building on Hermann Grassmann's earlier (1844) Ausdehnungslehre, whose ideas were so far ahead of their notation that they went largely unread for decades. The exchange argument behind "all bases have the same size" is usually credited to Ernst Steinitz (around 1913). The word dimension is far older than the algebra, of course: it long meant the everyday "length, width, height" count of independent directions, which is precisely the degrees-of-freedom intuition the linear-algebra definition makes rigorous.

15.3 How do you choose a basis for a plane in $\mathbb{R}^3$? (the anchor)

Now the picture we promised to carry through the chapter. Let $P$ be the plane through the origin in $\mathbb{R}^3$ consisting of all combinations of two specific vectors: $$\mathbf{b}_1 = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}, \qquad \mathbf{b}_2 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \qquad P = \operatorname{span}\{\mathbf{b}_1, \mathbf{b}_2\}.$$

Is $\{\mathbf{b}_1, \mathbf{b}_2\}$ a basis for the plane $P$? Check the two conditions. By construction $\{\mathbf{b}_1, \mathbf{b}_2\}$ spans $P$ — we defined $P$ as their span, so reaching everything in $P$ is automatic. For independence, ask whether one is a scalar multiple of the other: is there a $c$ with $\mathbf{b}_2 = c\,\mathbf{b}_1$? That would require $1 = 2c$ and $1 = c$ simultaneously, which is impossible. So the two vectors are independent, and $\{\mathbf{b}_1, \mathbf{b}_2\}$ is a basis for $P$. The plane is two-dimensional — it needs exactly two rulers — even though every point on it is written with three coordinates inherited from $\mathbb{R}^3$.

Geometric Intuition — Picture $\mathbf{b}_1$ and $\mathbf{b}_2$ as two arrows pinned at the origin, not parallel, both lying in the tilted sheet $P$. They span a parallelogram, and sliding integer and fractional copies of that parallelogram tiles the entire infinite plane — every point of $P$ is "so far along $\mathbf{b}_1$, so far along $\mathbf{b}_2$." Those two distances are the point's coordinates within the plane. The ambient $z$-axis of the room is irrelevant to a creature living on the sheet; it has its own private two-number grid.

Here is the move that makes the chapter click. Take a specific point in the plane — say $$\mathbf{v} = 3\,\mathbf{b}_1 - 2\,\mathbf{b}_2 = 3\begin{bmatrix}2\\1\\0\end{bmatrix} - 2\begin{bmatrix}1\\1\\1\end{bmatrix} = \begin{bmatrix}4\\1\\-2\end{bmatrix}.$$ In the room's standard coordinates, this point is $(4, 1, -2)$ — three numbers. But measured against the plane's own basis $\{\mathbf{b}_1, \mathbf{b}_2\}$, the very same point is $(3, -2)$ — two numbers, because we built it as $3$ of the first ruler minus $2$ of the second. One arrow, two different addresses, and the shorter address is the honest one for a creature living in the plane. That two-number address $(3, -2)$ is what we will call the coordinate vector of $\mathbf{v}$ relative to the basis $\{\mathbf{b}_1, \mathbf{b}_2\}$, and §15.5 is about how to find it when it isn't handed to you.
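The two addresses of $\mathbf{v}$ can be checked numerically. Because the matrix with columns $\mathbf{b}_1, \mathbf{b}_2$ is $3 \times 2$ rather than square, this sketch solves $B\mathbf{c} = \mathbf{v}$ with `np.linalg.lstsq` and then confirms that the fit is exact. That choice of solver is an assumption of the sketch, not the chapter's method; §15.5 treats the square case.

```python
import numpy as np

b1 = np.array([2., 1., 0.])
b2 = np.array([1., 1., 1.])
B = np.column_stack([b1, b2])               # 3x2: plane basis as columns

v = 3 * b1 - 2 * b2                         # the room's address: three numbers
c, *_ = np.linalg.lstsq(B, v, rcond=None)   # the plane's address: two numbers

print("standard coordinates:", v)
print("plane coordinates:   ", c)
print("exact reconstruction:", np.allclose(B @ c, v))
```

The reconstruction check matters here: a least-squares solver always returns some answer, and only an exact rebuild shows that the vector really lies in the plane.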

Real-World Application — This is exactly how a 3D modeling or game engine handles a textured surface. A polygon floating in world space is flat, so the engine assigns it a private 2D coordinate system — UV coordinates — and stores the texture and the per-vertex data in those two numbers, not in the three world-space numbers. The plane's basis is the UV frame; world space is just where the polygon happens to be hung. The same idea, in higher dimensions, underlies dimensionality reduction: a dataset that lives in a 1000-dimensional space but really varies along only a few directions can be re-coordinatized against a small basis of those directions, replacing a thousand numbers per data point with a handful.

What if the vector isn't in the plane at all?

Coordinates relative to a basis of $P$ only make sense for vectors that actually live in $P$. Ask for the coordinates of a vector that sticks out of the plane and the existence half of §15.4 fails — there simply is no recipe $c_1\mathbf{b}_1 + c_2\mathbf{b}_2$ that reaches it, because the only points two rulers in $P$ can reach are points of $P$. This is not a defect; it is the spanning condition doing its job, and it is worth seeing concretely because it is exactly the error your toolkit function must detect.

Try $\mathbf{w} = (1, 0, 0)$ against our plane basis $\{\mathbf{b}_1, \mathbf{b}_2\}$. The system $B\mathbf{c} = \mathbf{w}$ is now three equations in two unknowns: $$\begin{cases} 2c_1 + c_2 = 1 \\ c_1 + c_2 = 0 \\ \phantom{2c_1 +{}} c_2 = 0 \end{cases}$$ The third equation forces $c_2 = 0$; then the second forces $c_1 = 0$; but then the first reads $0 = 1$, a contradiction. The system is inconsistent — no coordinates exist, because $\mathbf{w}$ is not in the plane. We can confirm geometrically: the plane's normal direction is $\mathbf{n} = \mathbf{b}_1 \times \mathbf{b}_2 = (1, -2, 1)$, and a vector lies in $P$ exactly when it is perpendicular to $\mathbf{n}$. Here $\mathbf{n}\cdot\mathbf{w} = 1 \neq 0$, so $\mathbf{w}$ pokes out of the plane and has no $\{\mathbf{b}_1, \mathbf{b}_2\}$-coordinates at all. (Both $\mathbf{b}_1$ and $\mathbf{b}_2$ satisfy $\mathbf{n}\cdot\mathbf{b}_i = 0$, as they must.) The closest we could come is to project $\mathbf{w}$ onto $P$ and coordinatize the projection — but projection is the least-squares story of Chapters 17 and 19, not coordinates. For now the lesson is sharp: a coordinate vector exists only for vectors inside the space the basis spans, and asking otherwise produces an inconsistent system, not a wrong number.
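One way to detect this failure in code is to compare the rank of $B$ with the rank of the augmented matrix $[B \mid \mathbf{w}]$: the system is consistent exactly when the two ranks agree. The sketch below applies that test, and the normal-vector test, to the vectors from this section. The function name `in_span` is hypothetical and stands in for whatever check your own toolkit uses.

```python
import numpy as np

def in_span(B, w):
    """True when w is a linear combination of the columns of B.

    Hypothetical helper: B c = w is consistent exactly when appending w
    as an extra column does not raise the rank.
    """
    return bool(np.linalg.matrix_rank(np.column_stack([B, w]))
                == np.linalg.matrix_rank(B))

b1 = np.array([2., 1., 0.])
b2 = np.array([1., 1., 1.])
B = np.column_stack([b1, b2])

w = np.array([1., 0., 0.])           # sticks out of the plane
v = np.array([4., 1., -2.])          # lies in the plane (3*b1 - 2*b2)

n = np.cross(b1, b2)                 # normal to the plane
print("normal n =", n)
print("w in P?", in_span(B, w), "  n.w =", n @ w)
print("v in P?", in_span(B, v), "  n.v =", n @ v)
```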

What if I want a different basis for the same plane?

Nothing forces the choice $\{\mathbf{b}_1, \mathbf{b}_2\}$. Any two independent vectors lying in $P$ form a basis for $P$. We could, for instance, run Gram–Schmidt (a procedure you'll meet properly in Chapter 20) to produce an orthonormal basis $\{\mathbf{q}_1, \mathbf{q}_2\}$ of the same plane — two perpendicular unit rulers instead of two skew ones. The plane is unchanged; only the rulers differ. And because the rulers differ, the same point $\mathbf{v} = (4, 1, -2)$ now reads as different coordinates against $\{\mathbf{q}_1, \mathbf{q}_2\}$ than it did against $\{\mathbf{b}_1, \mathbf{b}_2\}$. We will compute both lists in §15.8 and confront the discrepancy head-on. The lesson is already visible, though: the plane has a fixed dimension (two), but no privileged basis — and the coordinate list of a vector is meaningful only once you have said which basis you measured against.
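As a preview, here is a minimal sketch of that idea. It orthonormalizes $\mathbf{b}_1, \mathbf{b}_2$ with two Gram–Schmidt steps written out by hand (the full procedure and its justification belong to the later chapter), then reads $\mathbf{v}$ against the new rulers. The dot-product shortcut used for the new coordinates is valid only because $\mathbf{q}_1, \mathbf{q}_2$ are orthonormal.

```python
import numpy as np

b1 = np.array([2., 1., 0.])
b2 = np.array([1., 1., 1.])
v  = np.array([4., 1., -2.])             # = 3*b1 - 2*b2

# Two Gram-Schmidt steps: normalize b1, then strip b1's direction out of b2.
q1 = b1 / np.linalg.norm(b1)
u2 = b2 - (q1 @ b2) * q1
q2 = u2 / np.linalg.norm(u2)

Q = np.column_stack([q1, q2])            # orthonormal basis of the same plane
coords_q = Q.T @ v                       # dot products work for orthonormal rulers

print("q-coordinates:", coords_q)        # differs from (3, -2): different rulers
print("reconstructs v:", np.allclose(Q @ coords_q, v))
print("Q^T Q = I:", np.allclose(Q.T @ Q, np.eye(2)))
```

Both coordinate lists rebuild the same arrow; only the rulers, and therefore the numbers, have changed.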

15.4 Why does a basis give every vector exactly one coordinate list?

We claimed coordinates are unique once you fix a basis. That claim deserves a proof, because uniqueness is precisely what makes a coordinate vector a legitimate, unambiguous thing — and the proof is short and reveals exactly where each of the two basis conditions earns its keep.

Theorem (existence and uniqueness of coordinates). Let $\{\mathbf{b}_1, \dots, \mathbf{b}_n\}$ be a basis for a vector space $V$. Then every vector $\mathbf{v} \in V$ can be written as a linear combination $\mathbf{v} = c_1\mathbf{b}_1 + \dots + c_n\mathbf{b}_n$ in exactly one way. The scalars $c_1, \dots, c_n$ are uniquely determined by $\mathbf{v}$ and the (ordered) basis.

Why we care. This is the theorem that licenses the entire idea of a "coordinate vector." Without it, the list $(c_1, \dots, c_n)$ would be ambiguous, and naming a vector by its coordinates would be meaningless. With it, coordinates become a faithful renaming: fix the basis, and vectors and coordinate-lists are in perfect one-to-one correspondence.

Key idea. Spanning guarantees at least one representation exists; independence guarantees at most one. Two combinations that produce the same vector must, by independence, have been the same combination all along.

Proof. Existence. Because $\{\mathbf{b}_1, \dots, \mathbf{b}_n\}$ spans $V$, the vector $\mathbf{v}$ is some linear combination of the $\mathbf{b}_i$ — that is literally what spanning means. So at least one list of scalars $(c_1, \dots, c_n)$ with $\mathbf{v} = \sum_i c_i \mathbf{b}_i$ exists.

Uniqueness. Suppose $\mathbf{v}$ had two representations, $$\mathbf{v} = c_1\mathbf{b}_1 + \dots + c_n\mathbf{b}_n \qquad \text{and} \qquad \mathbf{v} = d_1\mathbf{b}_1 + \dots + d_n\mathbf{b}_n.$$ Subtract the second from the first. The left side becomes $\mathbf{v} - \mathbf{v} = \mathbf{0}$, and the right side collects into $$\mathbf{0} = (c_1 - d_1)\mathbf{b}_1 + (c_2 - d_2)\mathbf{b}_2 + \dots + (c_n - d_n)\mathbf{b}_n.$$ Now invoke independence. By definition, the only way a linear combination of independent vectors can equal the zero vector is if every coefficient is zero. Therefore $c_i - d_i = 0$ for each $i$, i.e. $c_i = d_i$ for all $i$. The two supposedly different representations were identical. $\blacksquare$

What this means. Read the proof as a clean division of labor between the two basis conditions: spanning gives existence, independence gives uniqueness. Drop spanning and some vectors get no coordinates; drop independence and some vectors get many coordinates. The Goldilocks definition of §15.2 was engineered to make exactly-one-coordinate-list true — that is why a basis is defined the way it is. With this in hand we can finally name the object.
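To see the uniqueness half fail, the sketch below revisits the redundant set $\{(1,0), (0,1), (1,1)\}$ from §15.2. Two different coefficient lists reach the same point, and their difference is a nonzero combination that sums to the zero vector, which is exactly what independence forbids.

```python
import numpy as np

# Spanning but dependent: three "rulers" in R^2, stored as columns.
A = np.column_stack([(1., 0.), (0., 1.), (1., 1.)])
target = np.array([1., 1.])

recipe_1 = np.array([1., 1., 0.])    # one of the first plus one of the second
recipe_2 = np.array([0., 0., 1.])    # one of the third

print(A @ recipe_1, A @ recipe_2)    # both reach the same target point

# Subtracting the recipes gives a nontrivial combination equal to 0,
# mirroring the subtraction step in the proof.
diff = recipe_1 - recipe_2
print("difference:", diff, "-> A @ diff =", A @ diff)
```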

Definition (coordinate vector). Given an ordered basis $\mathcal{B} = \{\mathbf{b}_1, \dots, \mathbf{b}_n\}$ of $V$ and a vector $\mathbf{v} \in V$ with the (unique) expansion $\mathbf{v} = c_1\mathbf{b}_1 + \dots + c_n\mathbf{b}_n$, the coordinate vector of $\mathbf{v}$ relative to $\mathcal{B}$ is the column of scalars $$[\mathbf{v}]_{\mathcal{B}} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} \in \mathbb{R}^n.$$ The bracket-subscript notation $[\,\cdot\,]_{\mathcal{B}}$ reads "coordinates relative to $\mathcal{B}$."

Math-Major Sidebar: coordinates are an isomorphism. The map $\mathbf{v} \mapsto [\mathbf{v}]_{\mathcal{B}}$ is a linear isomorphism from $V$ onto $\mathbb{R}^n$. It is linear ($[\mathbf{u} + \mathbf{v}]_{\mathcal{B}} = [\mathbf{u}]_{\mathcal{B}} + [\mathbf{v}]_{\mathcal{B}}$ and $[c\mathbf{v}]_{\mathcal{B}} = c[\mathbf{v}]_{\mathcal{B}}$, which you can check directly from uniqueness). It is one-to-one: two vectors with the same coordinate list $(c_1, \dots, c_n)$ are both equal to $c_1\mathbf{b}_1 + \dots + c_n\mathbf{b}_n$, hence equal to each other. It is onto: any list $(c_1, \dots, c_n)$ is the coordinate vector of $\mathbf{v} = c_1\mathbf{b}_1 + \dots + c_n\mathbf{b}_n$, by uniqueness of the expansion. The deep consequence: every $n$-dimensional real vector space is "the same as" $\mathbb{R}^n$ once you fix a basis. An abstract space of polynomials of degree at most $2$, or of $2\times 2$ matrices, is structurally just $\mathbb{R}^3$ or $\mathbb{R}^4$ wearing a costume. Choosing a basis is choosing the costume. This is why linear algebra over $\mathbb{R}^n$ is not a special case but the whole subject in disguise.
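A quick numerical spot check of the linearity claims, using an arbitrarily chosen basis of $\mathbb{R}^2$ and arbitrary test vectors (all of these specific numbers are assumptions of the sketch). A passing check on a few vectors is evidence, not a proof; the proof is the uniqueness argument of this section.

```python
import numpy as np

B = np.column_stack([(1., 1.), (1., -1.)])   # an illustrative basis of R^2

def coords(x):
    """Coordinate map x -> [x]_B, computed by solving B c = x."""
    return np.linalg.solve(B, x)

u = np.array([5., 1.])
w = np.array([-2., 7.])
k = 3.5

additive    = np.allclose(coords(u + w), coords(u) + coords(w))
homogeneous = np.allclose(coords(k * u), k * coords(u))
round_trip  = np.allclose(B @ coords(u), u)  # rebuilding inverts the map

print(additive, homogeneous, round_trip)
```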

15.5 How do you actually compute a coordinate vector?

In the anchor we built $\mathbf{v}$ from the basis, so its coordinates were obvious. Usually you are handed a vector in standard coordinates and a basis, and you must find the coordinate list. The recipe is one you already own from Chapter 4: solving a linear system.

Suppose $\mathcal{B} = \{\mathbf{b}_1, \dots, \mathbf{b}_n\}$ is a basis for $\mathbb{R}^n$ and you want $[\mathbf{v}]_{\mathcal{B}}$. By definition you need scalars $c_1, \dots, c_n$ with $$c_1 \mathbf{b}_1 + c_2 \mathbf{b}_2 + \dots + c_n \mathbf{b}_n = \mathbf{v}.$$ Stack the basis vectors as the columns of a matrix $B = [\,\mathbf{b}_1 \mid \mathbf{b}_2 \mid \dots \mid \mathbf{b}_n\,]$. Then the displayed equation is exactly the matrix–vector equation $$B\,\mathbf{c} = \mathbf{v}, \qquad \text{where } \mathbf{c} = [\mathbf{v}]_{\mathcal{B}}.$$ So finding coordinates is solving $B\mathbf{c} = \mathbf{v}$. Because the columns of $B$ form a basis, they are independent and there are $n$ of them in $\mathbb{R}^n$, so $B$ is square and invertible (we proved invertibility ↔ independent columns back in Chapter 9). A unique solution always exists — which is the computational face of the existence-and-uniqueness theorem we just proved.

The Key Insight — To re-express a vector in a new basis, put the basis vectors in the columns of $B$ and solve $B\mathbf{c} = \mathbf{v}$. The coordinate vector is $\mathbf{c} = B^{-1}\mathbf{v}$. The matrix $B^{-1}$ is the machine that converts standard coordinates into $\mathcal{B}$-coordinates; $B$ itself runs the conversion the other way, rebuilding the standard vector from its coordinate list. We will name $B^{-1}$ the change-of-coordinates matrix and study it as a transformation in Chapter 16.

Worked example: coordinates in $\mathbb{R}^3$

Let the basis be $$\mathbf{b}_1 = \begin{bmatrix}1\\0\\0\end{bmatrix}, \quad \mathbf{b}_2 = \begin{bmatrix}1\\1\\0\end{bmatrix}, \quad \mathbf{b}_3 = \begin{bmatrix}1\\1\\1\end{bmatrix}, \qquad \mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3\},$$ and find the coordinates of $\mathbf{v} = (4, 3, 2)$ relative to $\mathcal{B}$. First, is $\mathcal{B}$ even a basis? Stack the vectors as columns and note the matrix is upper-triangular with $1$'s on the diagonal, so $\det(B) = 1 \neq 0$ — the columns are independent and (being three independent vectors in $\mathbb{R}^3$) they span, so yes, $\mathcal{B}$ is a basis. (We will see in §15.6 why three independent vectors in $\mathbb{R}^3$ automatically span.)

Now solve $B\mathbf{c} = \mathbf{v}$: $$\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} 4 \\ 3 \\ 2 \end{bmatrix}.$$ This system is already upper-triangular, so back-substitution (Chapter 4) is immediate. The bottom row says $c_3 = 2$. The middle row says $c_2 + c_3 = 3$, so $c_2 = 3 - 2 = 1$. The top row says $c_1 + c_2 + c_3 = 4$, so $c_1 = 4 - 1 - 2 = 1$. Therefore $$[\mathbf{v}]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}.$$ Verify by reconstruction — the single most reliable habit in coordinate work. Rebuild $\mathbf{v}$ from the coordinates: $1\cdot\mathbf{b}_1 + 1\cdot\mathbf{b}_2 + 2\cdot\mathbf{b}_3 = (1,0,0) + (1,1,0) + (2,2,2) = (4, 3, 2) = \mathbf{v}$. It checks. Notice that $\mathbf{v}$ has standard coordinates $(4,3,2)$ but $\mathcal{B}$-coordinates $(1,1,2)$ — same arrow, different address, exactly as promised.

numpy verification

Let's confirm the hand computation and, while we are here, peek at the change-of-coordinates matrix $B^{-1}$.

# Coordinates of v relative to a basis B: solve B c = v.
import numpy as np
b1, b2, b3 = np.array([1.,0,0]), np.array([1.,1,0]), np.array([1.,1,1])
B = np.column_stack([b1, b2, b3])      # basis vectors as COLUMNS
v = np.array([4., 3, 2])
c = np.linalg.solve(B, v)              # the coordinate vector [v]_B
print("coords [v]_B =", c)             # -> [1. 1. 2.]
print("reconstruct  =", B @ c)         # -> [4. 3. 2.]  (matches v)
print("B^-1 =\n", np.linalg.inv(B))    # the standard -> B change-of-coords matrix

Output:

coords [v]_B = [1. 1. 2.]
reconstruct  = [4. 3. 2.]
B^-1 =
 [[ 1. -1.  0.]
 [ 0.  1. -1.]
 [ 0.  0.  1.]]

The coordinate vector is $(1, 1, 2)$, matching the hand result, and reconstructing through $B$ returns the original $\mathbf{v}$. The matrix $B^{-1}$ shown is the operator that turns any standard vector into its $\mathcal{B}$-coordinates — check that $B^{-1}\mathbf{v} = (1,1,2)$ yourself by reading off its action on $(4,3,2)$.

Computational Note — Use np.linalg.solve(B, v) rather than np.linalg.inv(B) @ v whenever you only need the coordinates of one vector. Solving is both faster and numerically more accurate — forming the explicit inverse first can amplify floating-point error and is wasted work. Compute $B^{-1}$ explicitly only when you will reuse it to convert many vectors, which is exactly the change-of-basis situation of Chapter 16.
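A sketch of the many-vectors situation, with made-up vectors beyond the first one: `np.linalg.solve` accepts a matrix of right-hand sides, so a batch of vectors stored as columns can be converted in one call without forming the inverse. The explicit inverse gives the same coordinates up to rounding.

```python
import numpy as np

# Basis from the worked example, as columns.
B = np.column_stack([(1., 0, 0), (1., 1, 0), (1., 1, 1)])

# Three vectors to convert, one per COLUMN (the last two are illustrative).
V = np.column_stack([(4., 3, 2), (1., 0, 0), (0., 2, -1)])

coords_solve = np.linalg.solve(B, V)      # solves B C = V for all columns at once
coords_inv   = np.linalg.inv(B) @ V       # same result via the explicit inverse

print(coords_solve)
print("agree:", np.allclose(coords_solve, coords_inv))
print("reconstruct:", np.allclose(B @ coords_solve, V))
```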

FAQ — Why are the basis vectors the columns of $B$ and not the rows? Because the equation we are solving is $c_1\mathbf{b}_1 + \dots + c_n\mathbf{b}_n = \mathbf{v}$, and a matrix times a column vector is precisely a linear combination of the matrix's columns, weighted by the entries of that vector (this is the column picture from Chapter 7). Stacking the $\mathbf{b}_i$ as columns makes $B\mathbf{c}$ literally equal to that combination. If you stacked them as rows you'd be computing dot products instead, which answers a different question.
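The column picture can be confirmed directly. In the sketch below, `B_cols @ c` matches the explicit combination $c_1\mathbf{b}_1 + c_2\mathbf{b}_2 + c_3\mathbf{b}_3$, while stacking the same vectors as rows produces the three dot products $\mathbf{b}_i \cdot \mathbf{c}$, a different quantity. The variable names are illustrative.

```python
import numpy as np

b1, b2, b3 = np.array([1., 0, 0]), np.array([1., 1, 0]), np.array([1., 1, 1])
c = np.array([1., 1., 2.])                     # [v]_B from the worked example

B_cols = np.column_stack([b1, b2, b3])         # basis vectors as columns
B_rows = np.vstack([b1, b2, b3])               # same vectors as rows

combo = c[0] * b1 + c[1] * b2 + c[2] * b3      # the linear combination itself
print("columns:", B_cols @ c)                  # equals combo, i.e. v
print("rows:   ", B_rows @ c)                  # dot products b_i . c instead
```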

A second worked example: one vector, two grids of $\mathbb{R}^2$

The $\mathbb{R}^3$ example moved fast because the matrix was triangular. Let's do a fully general $\mathbb{R}^2$ computation slowly, because $\mathbb{R}^2$ is where you can draw the grids and see what coordinates mean — and because re-coordinatizing the same vector against two different bases is the exact rehearsal for Chapter 16. Take the vector $\mathbf{v} = (5, 1)$, written in standard coordinates, and read it against two different bases of $\mathbb{R}^2$.

Basis 1: $\mathcal{B} = \{(1,1),\ (1,-1)\}$. These are the two diagonal directions — northeast and southeast. To find $[\mathbf{v}]_{\mathcal{B}}$, solve $B\mathbf{c} = \mathbf{v}$: $$\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \end{bmatrix}.$$ The two scalar equations are $c_1 + c_2 = 5$ and $c_1 - c_2 = 1$. Add them: $2c_1 = 6$, so $c_1 = 3$. Subtract: $2c_2 = 4$, so $c_2 = 2$. Thus $[\mathbf{v}]_{\mathcal{B}} = (3, 2)$. Reconstruct to check: $3(1,1) + 2(1,-1) = (3,3) + (2,-2) = (5, 1) = \mathbf{v}$. ✓

Basis 2: $\mathcal{C} = \{(2,1),\ (1,3)\}$. Two skew directions. Solve $C\mathbf{c} = \mathbf{v}$: $$\begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \end{bmatrix}.$$ From the second equation $c_1 = 1 - 3c_2$; substitute into the first: $2(1 - 3c_2) + c_2 = 5$, i.e. $2 - 5c_2 = 5$, so $c_2 = -3/5 = -0.6$ and then $c_1 = 1 - 3(-0.6) = 2.8$. Thus $[\mathbf{v}]_{\mathcal{C}} = (2.8,\ -0.6)$. Reconstruct: $2.8(2,1) + (-0.6)(1,3) = (5.6, 2.8) + (-0.6, -1.8) = (5, 1) = \mathbf{v}$. ✓

One vector $\mathbf{v} = (5,1)$, three honest coordinate lists so far: $(5,1)$ in the standard basis, $(3,2)$ in $\mathcal{B}$, and $(2.8, -0.6)$ in $\mathcal{C}$. Each list reconstructs the identical arrow. The numpy below confirms both new computations at once.

# v=(5,1) in two non-standard bases of R^2; both reconstruct to v.
import numpy as np
v = np.array([5., 1.])
B = np.column_stack([[1., 1.], [1., -1.]])     # basis {(1,1),(1,-1)}
C = np.column_stack([[2., 1.], [1.,  3.]])     # basis {(2,1),(1,3)}
cB, cC = np.linalg.solve(B, v), np.linalg.solve(C, v)
print("[v]_B =", cB, " reconstruct:", B @ cB)  # -> [3. 2.]   [5. 1.]
print("[v]_C =", cC, " reconstruct:", C @ cC)  # -> [2.8 -0.6] [5. 1.]

Output:

[v]_B = [3. 2.]  reconstruct: [5. 1.]
[v]_C = [ 2.8 -0.6]  reconstruct: [5. 1.]

Both lists reconstruct $(5,1)$ exactly, matching the hand computations.

Geometric Intuition (Figure 15.1). Picture three grids laid over the same plane. The standard grid is the familiar graph paper with horizontal and vertical lines; $\mathbf{v}=(5,1)$ sits "5 right, 1 up." The basis $\mathcal{B}$ grid is rotated 45° — its lines run northeast and southeast — and on that grid the very same dot reads "3 along the NE ruler, 2 along the SE ruler." The basis $\mathcal{C}$ grid is a sheared parallelogram grid, and the dot reads $(2.8, -0.6)$ there. The dot never moved; we changed the graph paper underneath it. This is precisely the re-gridding picture the recurring 2D visualizer from Chapter 1 will animate in Chapter 16, where the new basis vectors become the new $\hat{\imath}$ and $\hat{\jmath}$.

Figure 15.1 — The vector $(5,1)$ overlaid with three coordinate grids (standard, the rotated $\{(1,1),(1,-1)\}$, and the sheared $\{(2,1),(1,3)\}$). The same point reads as three different coordinate pairs, one per grid; the alt-text: a single black dot with three differently-oriented grids drawn through the origin, each grid labeling the dot with its own pair of numbers.

Common Pitfall — "Coordinates in a non-orthonormal basis are found by dot products / projections." This shortcut is only valid when the basis is orthonormal (as in §15.8). For a general basis like $\mathcal{C} = \{(2,1),(1,3)\}$, you must solve the system $C\mathbf{c} = \mathbf{v}$ — taking dot products $\mathbf{v}\cdot(2,1)$ and $\mathbf{v}\cdot(1,3)$ gives the wrong answer, because the basis vectors are not perpendicular and don't have unit length. Reserve the dot-product shortcut for orthonormal bases; otherwise, solve.
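The pitfall is easy to see numerically. The short sketch below reuses $\mathbf{v} = (5,1)$ and the basis $\mathcal{C}$ from this section; the variable names `wrong` and `right` are ours, chosen only for illustration. The dot products come out as $11$ and $8$, and using them as coordinates rebuilds $(30, 35)$ rather than $\mathbf{v}$.

```python
import numpy as np

v = np.array([5., 1.])
C = np.column_stack([[2., 1.], [1., 3.]])   # columns are the basis vectors (2,1), (1,3)

wrong = C.T @ v                  # dot products v.(2,1) and v.(1,3): NOT the coordinates
right = np.linalg.solve(C, v)    # the actual coordinate vector [v]_C

print("dot products  :", wrong, " rebuilds", C @ wrong)
print("solve C c = v :", right, " rebuilds", C @ right)
```

Only the solved coefficients reconstruct $\mathbf{v}$; the dot-product list does not.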

15.6 What is dimension, and is the count actually well-defined?

We have used the word "dimension" informally — the plane is "two-dimensional," $\mathbb{R}^3$ is "three-dimensional." Now make it a definition, and confront the hidden assumption inside it.

Definition (dimension). The dimension of a vector space $V$, written $\dim(V)$, is the number of vectors in a basis for $V$. A space with a finite basis is called finite-dimensional.

This definition contains a landmine. It says "a basis," but we already saw a space has many bases — the plane $P$ had the basis $\{\mathbf{b}_1, \mathbf{b}_2\}$ and could equally use an orthonormal basis $\{\mathbf{q}_1, \mathbf{q}_2\}$, and $\mathbb{R}^2$ has the standard basis and the basis $\{(1,1),(1,-1)\}$. For "dimension" to mean anything, we need a guarantee: every basis of the same space has the same number of vectors. If one basis of $\mathbb{R}^2$ had two vectors and another had three, "dimension" would be incoherent. The next section proves the guarantee. For now, accept it and reap the consequences:

  • $\dim(\mathbb{R}^n) = n$. The standard basis has $n$ vectors, so every basis does.
  • $\dim(P) = 2$ for our plane, since $\{\mathbf{b}_1, \mathbf{b}_2\}$ is a basis with two vectors. A plane through the origin in any $\mathbb{R}^n$ is two-dimensional.
  • A line through the origin has dimension $1$; the trivial space $\{\mathbf{0}\}$ has dimension $0$ (its basis is the empty set — zero rulers, because there is nothing to measure).

Dimension is the precise version of "how many numbers do you need." It is an intrinsic property of the space — it does not depend on how the space happens to be embedded or written down. Our plane is two-dimensional whether you describe its points with three coordinates (in $\mathbb{R}^3$) or two (in its own basis); the intrinsic answer is two.

Geometric Intuition — Dimension counts degrees of freedom: the number of independent directions you can move while staying inside the space. On a line you have one (forward/back). On a plane you have two. In space, three. A creature confined to our tilted plane $P$ has exactly two degrees of freedom no matter that it floats in a three-dimensional room — and "two" is its dimension. This degrees-of-freedom reading is how engineers and physicists hear the word, and §15.10's case study cashes it out for robots and molecules.

Dimension also connects directly to the rank machinery of Chapter 14. There you learned that for a matrix $A$, the rank is the dimension of the column space $C(A)$ and equally of the row space $C(A^{\mathsf{T}})$, while the nullity is $\dim N(A)$, and rank–nullity says $\operatorname{rank}(A) + \dim N(A) = n$. Every one of those is now literally a count of basis vectors for a subspace. Rank–nullity is, at heart, a statement about dimensions — a balanced accounting of how many basis vectors each subspace needs.
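As a quick numerical reading of that accounting, the sketch below counts basis vectors on both sides of rank–nullity. The matrix is our choice for illustration (its rows are the three vectors used again in §15.9), and extracting a null-space basis from the SVD is one convenient numerical route, not the row-reduction method of Chapter 14.

```python
import numpy as np

A = np.array([[1., 2, 1, 3],
              [2., 4, 0, 4],
              [3., 6, 1, 7]])        # 3 x 4, so n = 4 columns
n = A.shape[1]
rank = np.linalg.matrix_rank(A)      # dimension of the column space (and row space)

# Null-space basis from the SVD: the right singular vectors past the rank.
_, _, Vt = np.linalg.svd(A)
N = Vt[rank:].T                      # columns of N form a basis of N(A)

print("rank:", rank, " nullity:", N.shape[1], " n:", n)
print("A @ N is zero:", np.allclose(A @ N, 0))
```

The number of columns of `N` is the nullity, so the printed counts should satisfy rank + nullity = n.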

The Key Insight — Two facts make dimension a powerful shortcut once you trust it. (1) In an $n$-dimensional space, any set of $n$ independent vectors is automatically a basis (independence at the full count forces spanning), and any set of $n$ spanning vectors is automatically independent. So at the magic number $n$, you only have to check one of the two basis conditions. (2) Any set with more than $n$ vectors in an $n$-dimensional space must be dependent, and any set with fewer than $n$ cannot span. Counting alone often settles whether something can be a basis — before you compute anything.

This is why, in §15.5, three independent vectors in $\mathbb{R}^3$ were guaranteed to span: three is the dimension, and at the full count independence upgrades itself to "basis" for free. You verified independence (via $\det \neq 0$) and got spanning thrown in.

Check Your Understanding — Can four vectors ever be a basis for $\mathbb{R}^3$? Can two?

Answer — Neither. $\dim(\mathbb{R}^3) = 3$, so every basis has exactly three vectors. Four vectors in $\mathbb{R}^3$ are necessarily dependent (too many — more than the dimension forces a redundancy), so they fail independence. Two vectors cannot span $\mathbb{R}^3$ (too few — they reach at most a plane), so they fail spanning. Only a set of exactly three independent vectors can be a basis of $\mathbb{R}^3$.
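The counting argument can be checked mechanically. In the sketch below, `is_basis_of_Rn` is a hypothetical helper (not part of the chapter's toolkit) that declares a list a basis of $\mathbb{R}^n$ exactly when it has $n$ vectors and rank $n$; the three test lists are made up for illustration.

```python
import numpy as np

def is_basis_of_Rn(vectors, n):
    """True when the list is a basis of R^n: exactly n vectors and rank n."""
    A = np.column_stack(vectors)
    return A.shape == (n, n) and bool(np.linalg.matrix_rank(A) == n)

e1, e2, e3 = np.eye(3)                          # rows of the identity
four = [e1, e2, e3, np.array([1., 1., 1.])]     # too many: must be dependent
two = [e1, e2]                                  # too few: cannot span
three = [e1, e1 + e2, e1 + e2 + e3]             # right count, and independent

print(np.linalg.matrix_rank(np.column_stack(four)))   # rank is capped at 3
print(np.linalg.matrix_rank(np.column_stack(two)))    # rank 2, short of 3
print(is_basis_of_Rn(four, 3), is_basis_of_Rn(two, 3), is_basis_of_Rn(three, 3))
```

The rank can never exceed 3 in $\mathbb{R}^3$, which is the numerical face of "more than $n$ vectors must be dependent."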

15.7 Why does every basis have the same size?

This is the theorem that makes "dimension" well-defined, and it is the most important proof of the chapter. Everything above secretly leaned on it. The result is sometimes called the dimension theorem or attributed to the Steinitz exchange (replacement) lemma; whatever the name, the engine is a single idea about trading vectors.

Theorem (invariance of dimension). If a vector space $V$ has a basis with $n$ vectors, then every basis of $V$ has exactly $n$ vectors.

Why we care. Without this, the definition of dimension in §15.6 is meaningless — "the number of vectors in a basis" presupposes the number doesn't depend on which basis. This theorem is what lets us speak of the dimension as a single number attached to the space, and it is what justifies every counting shortcut in the previous section.

Key idea. A spanning set can never be smaller than an independent set in the same space. The reason is a replacement process: given a spanning set and an independent set, you can swap the independent vectors in one at a time, each time kicking out a spanning vector and keeping the set spanning. You never run out of spanning vectors to kick out before you've placed all the independent ones — which forces "independent count $\leq$ spanning count." Apply that twice and the two basis sizes squeeze together.

Proof. We prove the crux lemma and then finish in one line.

Replacement Lemma. Suppose $\{\mathbf{w}_1, \dots, \mathbf{w}_m\}$ spans $V$ and $\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is linearly independent in $V$. We claim $k \leq m$ — there can be no more independent vectors than spanning vectors.

Argue by feeding the $\mathbf{u}$'s into the spanning set one at a time. Start with the spanning list $S_0 = (\mathbf{w}_1, \dots, \mathbf{w}_m)$.

Step 1. Because $S_0$ spans, $\mathbf{u}_1$ is a combination $\mathbf{u}_1 = a_1\mathbf{w}_1 + \dots + a_m\mathbf{w}_m$. Since $\mathbf{u}_1 \neq \mathbf{0}$ (it belongs to an independent set, which can't contain $\mathbf{0}$), at least one coefficient $a_j$ is nonzero. Solve that equation for the corresponding $\mathbf{w}_j$: it equals a combination of $\mathbf{u}_1$ and the other $\mathbf{w}$'s. So we may replace $\mathbf{w}_j$ by $\mathbf{u}_1$ and the new list $S_1 = (\mathbf{u}_1, \mathbf{w}_1, \dots, \widehat{\mathbf{w}_j}, \dots, \mathbf{w}_m)$ — with $\mathbf{w}_j$ removed — still spans $V$ (anything the old list reached, the new one reaches, because we can recover the discarded $\mathbf{w}_j$ from $\mathbf{u}_1$ and the survivors).

Step $t$. Inductively, suppose after $t-1$ steps the $\mathbf{u}_1, \dots, \mathbf{u}_{t-1}$ have replaced $t-1$ of the $\mathbf{w}$'s and the resulting list of $m$ vectors still spans. Express $\mathbf{u}_t$ in that spanning list: $$\mathbf{u}_t = \big(\text{combination of } \mathbf{u}_1, \dots, \mathbf{u}_{t-1}\big) + \big(\text{combination of the remaining } \mathbf{w}\text{'s}\big).$$ The coefficients on the remaining $\mathbf{w}$'s cannot all be zero — if they were, $\mathbf{u}_t$ would be a combination of $\mathbf{u}_1, \dots, \mathbf{u}_{t-1}$ alone, contradicting the independence of the $\mathbf{u}$'s. So some surviving $\mathbf{w}$ has a nonzero coefficient; solve for it and replace it by $\mathbf{u}_t$. The list, still of size $m$, still spans.

The crucial observation: at every step there is always a $\mathbf{w}$ left to remove. If we had run out of $\mathbf{w}$'s — i.e. if all $m$ of them had already been replaced before we placed all $k$ of the $\mathbf{u}$'s — then the spanning list would consist entirely of $\mathbf{u}$'s, and the next $\mathbf{u}_t$ would be a combination of earlier $\mathbf{u}$'s, again contradicting independence. So the process places all $k$ vectors $\mathbf{u}_1, \dots, \mathbf{u}_k$ without exhausting the $m$ slots, which is only possible if $k \leq m$. The lemma is proved.

Finishing the theorem. Let $\mathcal{B} = \{\mathbf{b}_1, \dots, \mathbf{b}_n\}$ and $\mathcal{C} = \{\mathbf{c}_1, \dots, \mathbf{c}_p\}$ be two bases of $V$. Each is both spanning and independent. Apply the lemma once with $\mathcal{C}$ independent and $\mathcal{B}$ spanning: $p \leq n$. Apply it again with $\mathcal{B}$ independent and $\mathcal{C}$ spanning: $n \leq p$. Together $n = p$. Every basis has the same size. $\blacksquare$

What this means. The size of a basis is not an accident of which basis you grabbed — it is a rigid invariant of the space, forced by the simple fact that you can't pack more independent directions into a space than it takes spanning vectors to fill it. That invariant is the dimension. Geometrically: a plane is two-dimensional, full stop; no clever choice of rulers can describe it with one ruler or require three. The number of degrees of freedom is built into the space, and the replacement argument is the precise reason why.
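The replacement process in the proof can be acted out numerically. The following is a minimal sketch under stated assumptions: the function name `exchange`, the tolerance, and the example vectors are ours; it uses `np.linalg.lstsq` to express each $\mathbf{u}_t$ in the current spanning list, and it swaps out the surviving $\mathbf{w}$ with the largest coefficient (the proof only needs a nonzero one).

```python
import numpy as np

def exchange(W, U, tol=1e-10):
    """Swap each independent u into the spanning list W, one at a time.

    W: list of vectors spanning R^n.  U: list of linearly independent vectors.
    Returns a list with len(W) vectors that contains every u and still spans.
    """
    current = [np.asarray(w, dtype=float) for w in W]
    placed = 0                                    # current[:placed] are u's already seated
    for u in U:
        u = np.asarray(u, dtype=float)
        S = np.column_stack(current)
        a, *_ = np.linalg.lstsq(S, u, rcond=None)          # write u = S a
        # Only surviving w's (positions >= placed) are candidates for removal.
        survivors = [j for j in range(placed, len(current)) if abs(a[j]) > tol]
        if not survivors:
            raise ValueError("no w left to replace: U is dependent or longer than W")
        j = max(survivors, key=lambda idx: abs(a[idx]))    # a w with nonzero coefficient
        current.pop(j)                                     # kick out w_j
        current.insert(placed, u)                          # seat u in the list
        placed += 1
    return current

W = [np.array([1., 0, 0]), np.array([0., 1, 0]),
     np.array([0., 0, 1]), np.array([1., 1, 0])]           # m = 4 spanning vectors of R^3
U = [np.array([1., 1, 1]), np.array([1., -1, 0])]          # k = 2 independent vectors
S = exchange(W, U)
print(len(S), np.linalg.matrix_rank(np.column_stack(S)))   # still 4 vectors, still rank 3
```

If you hand it more independent-looking vectors than there are slots, the `ValueError` fires at the step where the proof derives its contradiction, which is the code-level meaning of $k \leq m$.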

Common Pitfall — "A bigger basis describes the space in more detail." There is no such thing as a bigger basis. By this theorem every basis of a given space has exactly the same number of vectors. A list with extra vectors isn't a "more detailed basis" — it's not a basis at all, because it must be dependent. Likewise a shorter list isn't a "coarser basis"; it simply fails to span. The dimension is a hard ceiling and a hard floor simultaneously.

15.8 The same vector, two bases: confronting the discrepancy

Return to the anchor plane $P$ and the point $\mathbf{v} = (4, 1, -2)$, which we found has coordinates $(3, -2)$ relative to $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2\}$ with $\mathbf{b}_1 = (2,1,0)$, $\mathbf{b}_2 = (1,1,1)$. Now measure the same point against a different basis of the same plane, and watch the coordinate list change.

Build an orthonormal basis $\mathcal{Q} = \{\mathbf{q}_1, \mathbf{q}_2\}$ of $P$ by Gram–Schmidt (Chapter 20 — here we just use the output). Normalizing $\mathbf{b}_1$ to get $\mathbf{q}_1$, then subtracting from $\mathbf{b}_2$ its component along $\mathbf{q}_1$ and normalizing what remains, gives, to four decimals, $$\mathbf{q}_1 \approx (0.8944,\ 0.4472,\ 0), \qquad \mathbf{q}_2 \approx (-0.1826,\ 0.3651,\ 0.9129),$$ two perpendicular unit vectors lying in $P$. Because the basis is orthonormal, finding coordinates is even easier than solving a system — each coordinate is just a dot product (a fact we will prove in Chapter 19, that projecting onto a unit vector reads off the coordinate). So $$[\mathbf{v}]_{\mathcal{Q}} = \big(\mathbf{v}\cdot\mathbf{q}_1,\ \mathbf{v}\cdot\mathbf{q}_2\big) \approx (4.0249,\ -2.1909).$$

The same arrow now reads $(4.0249, -2.1909)$ instead of $(3, -2)$. Different rulers, different numbers, identical point. Let's confirm both with numpy.

# Same plane point v, measured against TWO different bases of the plane.
import numpy as np
b1, b2 = np.array([2.,1,0]), np.array([1.,1,1])
v = 3*b1 - 2*b2                                  # the point (4, 1, -2)

# Basis 1: the skew basis {b1, b2}. Solve via least squares (v lies in the plane).
B = np.column_stack([b1, b2])
cB, *_ = np.linalg.lstsq(B, v, rcond=None)
print("coords in {b1,b2}:", np.round(cB, 4))    # -> [ 3. -2.]

# Basis 2: an orthonormal basis {q1, q2} of the SAME plane (Gram-Schmidt).
q1 = b1/np.linalg.norm(b1)
q2 = b2 - (b2 @ q1)*q1;  q2 = q2/np.linalg.norm(q2)
cQ = np.array([v @ q1, v @ q2])                  # orthonormal coords = dot products
print("coords in {q1,q2}:", np.round(cQ, 4))     # -> [ 4.0249 -2.1909]
print("reconstruct B :", np.round(B @ cB, 4))     # -> [ 4.  1. -2.]
print("reconstruct Q :", np.round(cQ[0]*q1 + cQ[1]*q2, 4))  # -> [ 4.  1. -2.]
print("|v|=", round(np.linalg.norm(v),4), " |cQ|=", round(np.linalg.norm(cQ),4))

Output:

coords in {b1,b2}: [ 3. -2.]
coords in {q1,q2}: [ 4.0249 -2.1909]
reconstruct B : [ 4.  1. -2.]
reconstruct Q : [ 4.  1. -2.]
|v|= 4.5826  |cQ|= 4.5826

Both coordinate lists reconstruct the identical world-space point $(4, 1, -2)$ — the two reconstruct lines agree exactly. Yet the lists $(3, -2)$ and $(4.0249, -2.1909)$ are different, because the rulers are different. There is one elegant bonus visible in the last line: in the orthonormal basis, the length of the coordinate vector equals the length of $\mathbf{v}$ itself ($\lVert \mathbf{v}\rVert = \lVert[\mathbf{v}]_{\mathcal{Q}}\rVert \approx 4.5826$). That is not true for the skew basis $\mathcal{B}$ — orthonormal bases preserve lengths and angles, which is exactly why we will work so hard to build them in Part IV.
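To see the contrast with the skew basis, the small sketch below compares the length of $\mathbf{v}$ with the length of its coordinate list $(3,-2)$ in $\{\mathbf{b}_1, \mathbf{b}_2\}$. The last line is an extra illustration we add here, not a result from the text above: since $\mathbf{v} = B\mathbf{c}$, the true length can be recovered from the coordinates as $\sqrt{\mathbf{c}^{\mathsf{T}} B^{\mathsf{T}} B\, \mathbf{c}}$.

```python
import numpy as np

b1, b2 = np.array([2., 1, 0]), np.array([1., 1, 1])
B = np.column_stack([b1, b2])
cB = np.array([3., -2.])               # coordinates of v in the skew basis {b1, b2}
v = B @ cB                             # the point (4, 1, -2)

print(np.linalg.norm(v))               # sqrt(21), about 4.5826
print(np.linalg.norm(cB))              # sqrt(13), about 3.6056: not the length of v
print(np.sqrt(cB @ (B.T @ B) @ cB))    # weighting by B^T B recovers sqrt(21)
```

For an orthonormal basis the matrix $B^{\mathsf{T}}B$ is the identity, which is why no correction was needed for $[\mathbf{v}]_{\mathcal{Q}}$.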

How do you solve $B\mathbf{c} = \mathbf{v}$ when $B$ isn't square?

There is a subtlety the $\mathbb{R}^2$ and $\mathbb{R}^3$ examples hid: when the space is a subspace like our plane, the matrix $B$ of basis vectors is not square. The plane's basis $\{\mathbf{b}_1, \mathbf{b}_2\}$ gives a $3\times 2$ matrix $B$ — three rows (the ambient dimension), two columns (the basis size). So $B\mathbf{c} = \mathbf{v}$ is three equations in two unknowns. How can two unknowns satisfy three equations? Only because the equations are not independent when $\mathbf{v}$ genuinely lies in the plane — one of them is automatically implied by the others, exactly the redundancy that signals $\mathbf{v}\in P$.

Let's see it. Recompute the coordinates of $\mathbf{v} = (4, 1, -2)$ in $\{\mathbf{b}_1, \mathbf{b}_2\}$ the honest way, by solving the $3\times 2$ system without knowing the answer in advance: $$\begin{cases} 2c_1 + c_2 = 4 \\ c_1 + c_2 = 1 \\ \phantom{2c_1 +{}} c_2 = -2 \end{cases}$$ Use the last two equations first — the cheapest pair. The third says $c_2 = -2$. Substitute into the second: $c_1 + (-2) = 1$, so $c_1 = 3$. We now have a candidate $\mathbf{c} = (3, -2)$ from two of the three equations. The remaining equation (here, the first) is the consistency check: does $2c_1 + c_2 = 4$ hold? Plug in: $2(3) + (-2) = 6 - 2 = 4$. ✓ The first equation was redundant — it carried no new information, precisely because $\mathbf{v}$ lies in the plane spanned by $\mathbf{b}_1, \mathbf{b}_2$. Had $\mathbf{v}$ been off the plane, that leftover equation would have clashed (as it did for $\mathbf{w} = (1,0,0)$ in §15.3), and the system would be inconsistent.

This is the general pattern for coordinates in a $k$-dimensional subspace of $\mathbb{R}^m$ with $k < m$: the system $B\mathbf{c} = \mathbf{v}$ has $m$ equations and $k$ unknowns, and it is consistent if and only if $\mathbf{v}$ is in the subspace — in which case $m - k$ of the equations are redundant and the remaining $k$ pin down the unique coordinate vector. Your toolkit coordinates function will solve the square sub-system and then verify by reconstruction through the full $B$, which is exactly this redundant-equation check done in one line. (In floating point you would test that the reconstruction matches $\mathbf{v}$ to within a tolerance, since exact equality rarely holds — see the Computational Note in §15.5.)
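Here is one way the "solve, then verify by reconstruction" pattern might look with numpy. It is a sketch under assumptions: `subspace_coordinates` is a hypothetical name, the tolerance is arbitrary, and it uses `np.linalg.lstsq` on the full tall matrix instead of picking out a square sub-system as the toolkit function does.

```python
import numpy as np

def subspace_coordinates(v, basis, tol=1e-10):
    """Coordinates of v in a basis of a subspace; raise if v is off the subspace."""
    B = np.column_stack(basis)                    # m x k matrix with k <= m
    c, *_ = np.linalg.lstsq(B, v, rcond=None)     # best-fit coefficients
    if not np.allclose(B @ c, v, atol=tol):       # reconstruction is the consistency check
        raise ValueError("v is not in the span of the basis")
    return c

b1, b2 = np.array([2., 1, 0]), np.array([1., 1, 1])
print(subspace_coordinates(np.array([4., 1, -2]), [b1, b2]))   # v lies on the plane
try:
    subspace_coordinates(np.array([1., 0, 0]), [b1, b2])       # w = (1,0,0) is off the plane
except ValueError as err:
    print("rejected:", err)
```

Least squares always returns some coefficient list, so the reconstruction test is what separates "coordinates of $\mathbf{v}$" from "coordinates of the nearest point in the plane."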

FAQ — Why does a $3\times 2$ system have a unique solution when usually "more equations than unknowns" means no solution? Because the right-hand side is special. A system with more equations than unknowns is overdetermined and generically has no solution — the equations over-constrain. But here $\mathbf{v}$ was chosen to lie in the column space of $B$ (the plane $P$ is literally $C(B)$, from Chapter 13), and for right-hand sides inside the column space a solution always exists. Independence of the two columns then makes it unique. "Overdetermined" describes the shape of the system; whether a solution exists depends on whether $\mathbf{v}$ is reachable — the column-space question from Chapter 13.

Real-World Application — This basis-dependence of coordinates is the heart of changing representations in quantum mechanics. A quantum state is a single abstract vector, but its coordinate list depends entirely on which basis of measurements you choose — the position basis, the momentum basis, or an energy eigenbasis. The numbers (the "amplitudes") differ in each, yet they all describe the same state, exactly as $(3,-2)$ and $(4.0249, -2.1909)$ describe the same point of our plane. Physicists call switching between these lists a change of representation; it is the change-of-basis idea of Chapter 16 wearing physics notation.

FAQ — If the coordinates change with the basis, what stays the same? The vector itself — the geometric arrow, the actual point in the plane. Also invariant are coordinate-free quantities: the vector's length and the angle between two vectors are the same no matter which orthonormal basis you measure in (we'll prove this in Chapter 21). What changes is only the bookkeeping: the list of numbers you record. The art of advanced linear algebra is choosing the basis in which the bookkeeping is simplest — diagonal, say (Part V) — without ever changing the underlying object.

15.9 Does every space have a basis, and how do you find one?

We have assumed bases exist. For the spaces in this book, they do — and constructing one is something you already know how to do.

Theorem (existence of a basis). Every finitely generated vector space — one spanned by some finite set of vectors — has a basis. Concretely: any finite spanning set can be pruned down to a basis, and any independent set can be extended up to a basis.

The idea is the two-sided squeeze you've already felt. Prune: start with a spanning set; if it is dependent, some vector is a combination of the others, so throw it out — the smaller set still spans. Repeat until no vector is removable; what remains is independent and still spanning — a basis. Extend: start with an independent set; if it doesn't span, some vector lies outside its span, so add that vector — the larger set is still independent (the new vector wasn't reachable, so it can't be a combination of the old ones). Repeat until the set spans. Both processes terminate because each step changes the count by exactly one and the count is trapped: pruning can only shrink a finite list, and by the Replacement Lemma of §15.7 an independent set can never have more vectors than a finite spanning set.

In practice, finding a basis for a concrete subspace is exactly the row-reduction skill of Chapters 4 and 14. To get a basis for the span of a pile of vectors, stack them and row reduce: the pivot columns of the original matrix give a basis for the column space, and the nonzero rows of the RREF give a basis for the row space (Chapter 14 spelled out which is which). Finding a basis is not a new technique — it is row reduction, read through the lens of this chapter.
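As a numerical stand-in for pruning, the sketch below walks through a list and keeps a vector only when it raises the rank of what has been kept so far. This is a greedy variant we use for illustration (it builds the kept list forward instead of deleting from the full list, and it relies on `np.linalg.matrix_rank` rather than exact row reduction); `prune_to_basis` is a hypothetical name.

```python
import numpy as np

def prune_to_basis(vectors):
    """Keep a vector only if it adds a new independent direction."""
    kept = []
    for vec in vectors:
        trial = kept + [np.asarray(vec, dtype=float)]
        if np.linalg.matrix_rank(np.column_stack(trial)) == len(trial):
            kept = trial          # vec is outside the span of the kept vectors
        # otherwise vec is a combination of the kept vectors, so it is dropped
    return kept

pile = [np.array([1., 2, 1, 3]),
        np.array([2., 4, 0, 4]),
        np.array([3., 6, 1, 7])]  # third vector = first + second
basis = prune_to_basis(pile)
print(len(basis))                 # 2 vectors survive
```

The survivors depend on the order of the list, but by §15.7 their number does not.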

Bases beyond $\mathbb{R}^n$: a finite-dimensional space in disguise

The whole apparatus works in any vector space, not just $\mathbb{R}^n$ — and this is where the abstraction of Chapter 5 finally pays a concrete dividend. Consider $P_2$, the space of polynomials of degree at most $2$: all expressions $a_0 + a_1 t + a_2 t^2$. This is a genuine vector space (you can add polynomials and scale them). A natural basis is $$\{1,\ t,\ t^2\},$$ the monomial basis. It spans (every degree-$\leq 2$ polynomial is a combination of $1, t, t^2$ by definition) and is independent (the only way $a_0 + a_1 t + a_2 t^2$ is the zero polynomial — zero for all $t$ — is $a_0 = a_1 = a_2 = 0$). Three basis vectors, so $\dim(P_2) = 3$.

Coordinates work just as before. The polynomial $p(t) = 2 - 3t + t^2$ has coordinate vector $$[p]_{\{1, t, t^2\}} = \begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix},$$ read straight off the coefficients. By the coordinate isomorphism of §15.4, $P_2$ is $\mathbb{R}^3$ in costume: every operation on quadratics — adding them, scaling them, even differentiating them — becomes an operation on their three-number coordinate vectors. The same is true of the space of $2\times 2$ matrices, which has the four-element basis $\left\{\begin{psmallmatrix}1&0\\0&0\end{psmallmatrix}, \begin{psmallmatrix}0&1\\0&0\end{psmallmatrix}, \begin{psmallmatrix}0&0\\1&0\end{psmallmatrix}, \begin{psmallmatrix}0&0\\0&1\end{psmallmatrix}\right\}$ and therefore dimension $4$ — it is secretly $\mathbb{R}^4$.
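To make "differentiating becomes an operation on coordinate vectors" concrete, the sketch below writes $d/dt$ on $P_2$ as a $3\times 3$ matrix in the monomial basis. The matrix name `D` and the second polynomial `q` are our illustrative choices.

```python
import numpy as np

# Differentiation on P_2 in the basis {1, t, t^2}. Column j holds the coordinates
# of the derivative of the j-th basis polynomial: 1 -> 0, t -> 1, t^2 -> 2t.
D = np.array([[0., 1, 0],
              [0., 0, 2],
              [0., 0, 0]])

p = np.array([2., -3, 1])       # p(t) = 2 - 3t + t^2
q = np.array([1., 0, 4])        # q(t) = 1 + 4t^2, an arbitrary second polynomial

print(D @ p)                    # coordinates of p'(t) = -3 + 2t
print(p + q)                    # coordinates of (p + q)(t) = 3 - 3t + 5t^2
```

Calculus on quadratics has turned into matrix arithmetic on three-number lists, which is the coordinate isomorphism at work.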

What is the dimension of a subspace, and how do you find it?

The same recipe answers a question that comes up constantly in data science and engineering: given a pile of vectors, what is the dimension of the space they span, and what is a basis for it? The pile may contain hidden redundancy — some vectors may be combinations of others — and the dimension counts only the genuinely independent directions. This is the rank idea of Chapter 14 wearing its dimension hat: the dimension of a span equals the rank of the matrix whose rows (or columns) are those vectors.

Take the three vectors from Chapter 14's running example, now read as a spanning set in $\mathbb{R}^4$: $$\mathbf{a}_1 = (1,2,1,3),\quad \mathbf{a}_2 = (2,4,0,4),\quad \mathbf{a}_3 = (3,6,1,7),\qquad U = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\}.$$ Three vectors, but is $\dim U = 3$? Stack them as rows and row reduce (Chapter 4). The third vector is exactly $\mathbf{a}_1 + \mathbf{a}_2$, so it collapses, and the RREF has two nonzero rows. Hence $\dim U = 2$ — the three vectors span only a plane in $\mathbb{R}^4$, and a basis for $U$ is the two nonzero RREF rows (or, equivalently, any two of the original $\mathbf{a}_i$, since any two are independent). Counting the vectors you were handed (three) overcounts; the dimension is the number that survive the row reduction.

# Dimension of a span = rank of the stacked matrix.
import numpy as np
A = np.array([[1.,2,1,3],
              [2.,4,0,4],
              [3.,6,1,7]])           # rows are the spanning vectors
print("number of vectors :", A.shape[0])              # -> 3
print("dim of their span  :", np.linalg.matrix_rank(A))  # -> 2 (a3 = a1 + a2)

Output:

number of vectors : 3
dim of their span  : 2

So np.linalg.matrix_rank is, read correctly, a dimension calculator: it reports how many independent directions a set of vectors actually contributes. There is also a clean accounting law for how the dimensions of two subspaces combine, worth knowing because it surfaces in graphics (intersecting planes) and in control theory.

Theorem (the dimension formula). For finite-dimensional subspaces $U$ and $W$ of a vector space, $$\dim(U + W) = \dim U + \dim W - \dim(U \cap W),$$ where $U + W$ is the span of everything in $U$ together with everything in $W$, and $U \cap W$ is their intersection. Sanity check: in $\mathbb{R}^3$ take $U$ the $xy$-plane and $W$ the $yz$-plane, each of dimension $2$. They overlap in the $y$-axis (dimension $1$) and together span all of $\mathbb{R}^3$ (dimension $3$). The formula reads $3 = 2 + 2 - 1$. ✓ It is the exact subspace analogue of inclusion–exclusion for sets, and the $-\dim(U\cap W)$ term is the cost of double-counting the shared directions.
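The sanity check can be reproduced numerically. In this sketch (variable names are ours), the intersection is found by solving $U\mathbf{x} = W\mathbf{y}$, that is, by taking the null space of the block matrix $[\,U \mid -W\,]$ via the SVD and mapping it back through $U$. This assumes the columns of `U` and of `W` are each independent.

```python
import numpy as np

U = np.column_stack([[1., 0, 0], [0., 1, 0]])    # basis of the xy-plane
W = np.column_stack([[0., 1, 0], [0., 0, 1]])    # basis of the yz-plane

dim_U, dim_W = np.linalg.matrix_rank(U), np.linalg.matrix_rank(W)
dim_sum = np.linalg.matrix_rank(np.hstack([U, W]))     # dim(U + W)

# Intersection: vectors of the form U x = W y, i.e. [U | -W] (x, y) = 0.
M = np.hstack([U, -W])
_, s, Vt = np.linalg.svd(M)
null = Vt[np.sum(s > 1e-10):].T                  # null-space basis of M, as columns
inter = U @ null[:U.shape[1]]                    # the x-part, mapped back into R^3
dim_int = np.linalg.matrix_rank(inter)

print(dim_sum, dim_U, dim_W, dim_int)            # the four dimensions in the formula
print(inter.ravel())                             # a multiple of (0, 1, 0): the y-axis
```

The four printed counts should satisfy $\dim(U+W) = \dim U + \dim W - \dim(U\cap W)$.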

Real-World Application — In machine learning, the columns of a feature matrix are the measured attributes of your data, and their span is the set of patterns the model can express. If two features are perfectly correlated — say "height in inches" and "height in centimeters" — they are linearly dependent, so they add a redundant ruler: the dimension of the feature span is smaller than the number of columns. This redundancy (collinearity) makes the system $B\mathbf{c} = \mathbf{v}$ for fitting coefficients have non-unique solutions, which destabilizes regression — the same failure of uniqueness we traced to dependence in §15.4. Detecting it is exactly computing the rank, i.e. the dimension of the feature span; fixing it is choosing an independent sub-basis of features. This is the precise foundation under dimensionality reduction, which replaces a wide, redundant feature set with a small basis of new features that captures the same span.
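A toy version of the inches-and-centimeters redundancy is sketched below. The numbers are made up purely for illustration; the point is only that the rank comes out smaller than the number of columns until the duplicate ruler is dropped.

```python
import numpy as np

inches = np.array([60., 65., 70., 72.])       # made-up heights, illustration only
cm = 2.54 * inches                            # the same measurement in other units
weight = np.array([50., 61., 72., 80.])       # a made-up, genuinely different feature

X = np.column_stack([inches, cm, weight])
print("columns:", X.shape[1], " rank:", np.linalg.matrix_rank(X))

X_fixed = np.column_stack([inches, weight])   # drop the redundant ruler
print("columns:", X_fixed.shape[1], " rank:", np.linalg.matrix_rank(X_fixed))
```

After dropping one of the two height columns, the rank equals the column count, so the remaining features form an independent set.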

Math-Major Sidebar — the infinite-dimensional caveat. The pruning/extending argument needs the space to be finitely generated. Spaces like $C[0,1]$ (all continuous functions on $[0,1]$) or the space of all polynomials are infinite-dimensional: no finite set spans them. The space of all polynomials still has an explicit basis, the infinite list $\{1, t, t^2, \dots\}$, because every polynomial is a finite combination of monomials. For $C[0,1]$ the situation is worse: it has a basis only in a set-theoretic sense (a Hamel basis, whose existence requires the axiom of choice and is wildly non-constructive), and such a basis is useless in practice. The functional analyst's fix is to relax "basis" to allow infinite combinations — Fourier series (Chapter 22) express a function as an infinite sum over an orthonormal basis of $\sin$'s and $\cos$'s. That is a Schauder / Hilbert basis, a genuinely different and richer notion. For everything finite-dimensional in this book, the clean theorem above is all you need.

Build Your Toolkit — Implement coordinates(v, basis) in toolkit/coordinates.py. It takes a vector v (a list of numbers) and a basis (a list of basis vectors), forms the matrix $B$ whose columns are the basis vectors, and returns the coordinate vector $\mathbf{c} = [\mathbf{v}]_{\mathcal{B}}$ by solving $B\mathbf{c} = \mathbf{v}$ — reusing your gaussian_elimination(A, b) from Chapter 4's toolkit/linear_systems.py rather than calling numpy. Then verify by reconstruction: rebuild $B\mathbf{c}$ from the returned coordinates and confirm it equals v (and raise a clear error if the system is inconsistent — that means v is not in the span of the basis). Check the whole thing against np.linalg.solve(np.column_stack(basis), v) in toolkit/tests/. Sketch:

```python
# toolkit/coordinates.py
from .linear_systems import gaussian_elimination  # reuse Chapter 4

def coordinates(v, basis):
    """Coordinate vector of v relative to basis (basis vectors as columns)."""
    B = [[basis[j][i] for j in range(len(basis))]   # column j = basis[j]
         for i in range(len(v))]                    # row i of B
    c = gaussian_elimination(B, v)                  # solve B c = v
    # verify by reconstruction: sum c_j * basis[j] should equal v
    ...
    return c
```

15.10 What's the payoff, and where does this go next?

Step back and see what these three ideas bought you. A basis is a minimal, non-redundant set of rulers for a space — independent so nothing is wasted, spanning so everything is reachable. Dimension is the number of rulers, a hard invariant of the space (every basis has the same size, by the replacement theorem) that counts the genuine degrees of freedom. And a coordinate vector is the list of numbers a basis assigns to a vector — guaranteed to exist and be unique once the basis is fixed, computed by solving $B\mathbf{c} = \mathbf{v}$, and different for different bases even though the underlying vector never moves.

It is worth pausing on how much these three ideas tightened the loose language of earlier parts. In Chapter 6 "span" and "independence" were properties a set might have; here they fused into the single notion of a basis, and we proved that the fusion forces exactly-one-coordinate-list per vector. In Chapter 14 "rank" was a count of pivots; here it became the dimension of a subspace — a count of basis vectors — so rank–nullity is revealed as pure dimensional bookkeeping. And the vague "how many numbers do you need" of §15.1 became a theorem-backed invariant: the dimension, the same for every basis, equal to the degrees of freedom, computable as a rank. The abstraction did not pile up jargon; it collapsed several earlier ideas into one clean count and one clean object, which is what good abstraction always does.

That last clause is the threshold concept of the chapter and the seed of the next one: a vector is not its numbers; the numbers are an address relative to a chosen basis. Once you internalize that, the central theme of the book becomes sayable. A linear transformation is a single geometric object, but the matrix that represents it is its coordinate-bookkeeping — and changing the basis changes the matrix while leaving the transformation untouched. Chapter 16 makes this precise: it builds the change-of-coordinates matrix (which we already glimpsed as $B^{-1}$ here), shows how the matrix of a transformation transforms under a change of basis, and uses the recurring 2D visualizer to show a transformation re-gridded into new coordinates. That is the doorway to similarity and, ultimately, to diagonalization in Part V — where we will hunt for the one magic basis in which a transformation's matrix is as simple as possible.

Geometric Intuition — Hold the whole chapter in one picture: a tilted plane in space, with two different grids drawn on it. The plane is fixed (that's the space). The grids are different (those are two bases). A single dot on the plane is fixed (that's the vector), but its grid-coordinates differ between the two grids (those are its coordinate vectors). The number of grid axes is the same for both grids — always two (that's the dimension). Master that single image and you have mastered dimension, basis, and coordinates.

Check Your Understanding — The space of $2 \times 2$ matrices, the space $P_3$ of polynomials of degree $\leq 3$, and $\mathbb{R}^4$ all have something in common. What, and why does it matter?

Answer — All three have dimension 4 ($2\times 2$ matrices need four entries; $P_3 = \{a_0 + a_1 t + a_2 t^2 + a_3 t^3\}$ has the four-element basis $\{1, t, t^2, t^3\}$; $\mathbb{R}^4$ obviously has dimension four). By the coordinate isomorphism of §15.4, all three are structurally the same space — each is $\mathbb{R}^4$ once you fix a basis. It matters because any algorithm or theorem you prove for $\mathbb{R}^4$ transfers verbatim to polynomials of degree at most $3$ and to $2\times 2$ matrices, just by translating through coordinate vectors. Dimension, not the dressing, is what determines a finite-dimensional space up to isomorphism.