50 min read

> Learning paths. Math majors — read everything, especially the two proofs in §6.7 and §6.8 and the Math-Major Sidebar on dependence over different fields. CS / Data Science — focus on the Geometric Intuition callouts, the span pictures, the numpy...

Prerequisites

  • chapter-05-vector-spaces

Learning Objectives

  • Define a subspace and test a subset of R^n for closure under addition and scalar multiplication, recognizing that every subspace must contain the origin.
  • Compute and visualize the span of a set of vectors as a line, a plane, or all of space, and describe span as the set of all reachable linear combinations.
  • Decide whether a set of vectors is linearly independent or dependent, both by hand (row reduction to a pivot in every column) and with numpy's matrix_rank.
  • Prove that the span of any set of vectors is a subspace, and that a set is dependent if and only if some vector is a linear combination of the others.
  • Explain why independence plus spanning is the pair of conditions that defines a basis, and connect dependence to redundancy and the null space.
  • Implement independent(vectors) from scratch and verify it against np.linalg.matrix_rank.

Subspaces, Span, and Linear Independence: The Geometry of Solution Sets

Learning paths. Math majors — read everything, especially the two proofs in §6.7 and §6.8 and the Math-Major Sidebar on dependence over different fields. CS / Data Science — focus on the Geometric Intuition callouts, the span pictures, the numpy rank checks, and the collinearity application; the deepest proofs are optional. Physics / Engineering — focus on the geometry of spans as lines and planes through the origin, the degrees-of-freedom reading of independence, and keep a concrete picture (a plane in $\mathbb{R}^3$) in your head throughout. This chapter assumes the vector-space axioms from Chapter 5 and the row reduction of Chapter 4.

6.1 What does the solution set of a system actually look like?

In Chapter 5 you took the great abstract leap: a vector is anything that obeys the eight axioms, so polynomials, functions, and matrices are vectors too. That chapter asked you to let go of the arrow. This chapter hands the arrow back, because now we want to see the shapes that linear algebra builds, and the best way to see them is geometrically. We are going to answer three questions that turn out to be the same question wearing three hats: which vectors can I reach by combining a given set? what does that reachable region look like? and is any vector in my set doing redundant work?

Start with something you already met in Chapter 3: the solution set of a system of linear equations. When you solved a system there, the answer was never just "a point" — sometimes it was a whole line of solutions, sometimes a plane, sometimes nothing at all. Picture the homogeneous system $A\mathbf{x} = \mathbf{0}$, the one whose right-hand side is the zero vector. Its solutions always include $\mathbf{x} = \mathbf{0}$ (plug it in: $A\mathbf{0} = \mathbf{0}$, always true). But often there are more — a whole line through the origin, or a whole plane through the origin, of vectors that $A$ sends to zero. Those solution sets have a shape, and the shape is never random.

Geometric Intuition — The solution set of a homogeneous system $A\mathbf{x} = \mathbf{0}$ is always a flat object that passes through the origin: a single point (just $\mathbf{0}$), a line through the origin, a plane through the origin, or some higher-dimensional flat. It is never a curve, never an off-center line, never a sphere. Linear equations produce flat, origin-anchored solution sets — and naming exactly these objects is the goal of this chapter.

Why through the origin, and why flat? Because of superposition, the same linearity that organized Chapter 1. If $\mathbf{x}_1$ and $\mathbf{x}_2$ both solve $A\mathbf{x} = \mathbf{0}$, then so does any combination $c\mathbf{x}_1 + d\mathbf{x}_2$, because $A(c\mathbf{x}_1 + d\mathbf{x}_2) = cA\mathbf{x}_1 + dA\mathbf{x}_2 = c\mathbf{0} + d\mathbf{0} = \mathbf{0}$. The solution set is closed under taking linear combinations: combine two solutions and you get a third. That single property — closure under linear combination — is the definition of the most important kind of set in this book, the subspace, and it is where we begin.

This chapter is the bridge between the abstract vector space of Chapter 5 and the structural machinery of Part III. By its end you will have three precise tools — subspace, span, and linear independence — and you will see how they snap together to point straight at the idea of a basis, the minimal set of arrows that builds an entire space. We will only point at the basis here; Chapter 15 builds it fully, and Chapters 13–14 use these ideas to define the four fundamental subspaces that organize all of linear algebra. Think of this chapter as assembling the parts.

6.2 What is a subspace, and why must it contain the origin?

A subspace is a subset of a vector space that is itself a vector space, using the same addition and scalar multiplication. That definition is correct but unhelpful — it sounds like you would have to re-check all eight axioms from Chapter 5 every time. You don't. Almost all of the axioms (associativity, commutativity, distributivity) are inherited automatically from the parent space: if addition is commutative for all vectors in $\mathbb{R}^3$, it is certainly commutative for the vectors in any subset. The only things that can go wrong are whether the subset is self-contained under the two operations, and whether it contains the zero vector. That gives a clean three-part test.

Let $V$ be a vector space (think $\mathbb{R}^n$) and let $W$ be a subset of $V$. Then $W$ is a subspace of $V$ exactly when all three of these hold:

  1. It contains the zero vector: $\mathbf{0} \in W$.
  2. It is closed under addition: if $\mathbf{u} \in W$ and $\mathbf{v} \in W$, then $\mathbf{u} + \mathbf{v} \in W$.
  3. It is closed under scalar multiplication: if $\mathbf{v} \in W$ and $c$ is any scalar, then $c\mathbf{v} \in W$.

In words: a subspace is a subset you cannot escape by adding its members together or by scaling them. Whatever linear combinations you form from vectors inside $W$, you land back inside $W$. This is the same closure property we just saw for the solution set of $A\mathbf{x} = \mathbf{0}$ — which is our first hint that solution sets of homogeneous systems are subspaces, a fact we will nail down in Chapter 13.

The Key Insight — A subspace is a subset that is closed under linear combinations. Take any vectors inside it, scale them however you like, add them up — you never leave. Conditions 2 and 3 together are just "closed under linear combination," and condition 1 is forced by them anyway (set $c = 0$ in condition 3 to get $\mathbf{0} \in W$, as long as $W$ is nonempty).

Notice the parenthetical: condition 1 is technically a consequence of condition 3, because scaling any vector $\mathbf{v} \in W$ by $c = 0$ produces $\mathbf{0}$, which must therefore be in $W$. So why state it separately? Two reasons. First, it is the fastest disqualifier in all of linear algebra: if a set does not contain the origin, it is not a subspace, and you can stop checking immediately. Second, it guards against the empty set — the empty set is vacuously closed under both operations but is not a subspace, and requiring $\mathbf{0} \in W$ rules it out.

Common Pitfall"Any line is a subspace of $\mathbb{R}^2$." No — only lines through the origin are subspaces. Consider the line $y = x + 1$. The point $(0, 1)$ is on it, but the point $(0, 0)$ is not, so it fails condition 1 instantly. Worse, take two points on it, say $(0,1)$ and $(1,2)$; their sum $(1,3)$ is not on the line ($3 \neq 1 + 1$), so it also fails closure under addition. A line that misses the origin is the geometric cousin of the affine map $y = mx + b$ from Chapter 1: linear-looking, but not linear. Subspaces always pass through the origin. This single fact catches more student errors than any other in the chapter — when in doubt, check the origin first.

Let's catalog the subspaces of the spaces you can actually picture. In $\mathbb{R}^2$ (the plane), there are exactly three kinds of subspace: the zero subspace $\{\mathbf{0}\}$ (just the origin, a single point), every line through the origin, and the whole plane $\mathbb{R}^2$ itself. There is nothing else — no off-center lines, no half-planes, no curves. In $\mathbb{R}^3$ (3D space) there are four kinds: the zero subspace, every line through the origin, every plane through the origin, and all of $\mathbb{R}^3$. The pattern is already visible: subspaces come in dimensions $0, 1, 2, \dots, n$, and each is a flat object anchored at the origin. That word dimension is doing real work, and Chapter 15 makes it rigorous; for now, read it as "how many independent directions the flat extends in."

Check Your Understanding — Is the set $W = \{(x, y) : x \ge 0,\ y \ge 0\}$ (the first quadrant, including its edges) a subspace of $\mathbb{R}^2$?

Answer

No. It contains the origin (good) and is closed under addition (the sum of two vectors with non-negative components has non-negative components). But it fails closure under scalar multiplication: take $\mathbf{v} = (1, 1) \in W$ and the scalar $c = -1$. Then $c\mathbf{v} = (-1, -1)$, which is in the third quadrant, not $W$. A subspace must be closed under multiplication by every scalar, including negatives — so it can never be a "half" of space. Geometrically: a subspace stretches out symmetrically in both directions from the origin, which a quadrant does not.

The non-negative-quadrant example teaches the deep point: a subspace must extend infinitely in both directions along every direction it contains, because scaling by a negative number flips a vector to the opposite side of the origin and you must stay inside. This is why the only subspaces are points, full lines, full planes, and so on — symmetric, unbounded, origin-centered flats.

6.3 What is the span of a set of vectors?

We have been describing subspaces from the outside ("a line through the origin"). Now we build them from the inside, starting from a handful of vectors. This is the constructive heart of the chapter, and it is where the phrase span of a set of vectors earns its central place.

Given vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$, a linear combination of them is any vector of the form $$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \dots + c_k\mathbf{v}_k,$$ where the $c_i$ are scalars (real numbers). You are scaling each $\mathbf{v}_i$ by some amount and adding the results — exactly the tip-to-tail combining from Part I. The span of the set is the collection of all such linear combinations, as the scalars range over every possible value: $$\operatorname{span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\} = \{\, c_1\mathbf{v}_1 + \dots + c_k\mathbf{v}_k : c_1, \dots, c_k \in \mathbb{R} \,\}.$$ Read it as the set of every destination you can reach by combining the given vectors. If a vector is reachable, it is in the span; if no choice of scalars produces it, it is not.

The Key Insight — The span of a set of vectors is everything you can build from them by scaling and adding — the complete set of reachable combinations. The vectors are the raw materials; the span is the entire structure you can construct. And here is the payoff that ties this section to the last: the span of any set of vectors is always a subspace (we prove this in §6.7). Span is the machine that manufactures subspaces.

Let's see what spans look like, picture first, because the geometry is gorgeous and immediate.

The span of one nonzero vector is a line through the origin

Take a single nonzero vector $\mathbf{v}$ in the plane, say $\mathbf{v} = (2, 1)$. Its span is $\{c\,(2,1) : c \in \mathbb{R}\}$ — every scalar multiple of $\mathbf{v}$. As $c$ runs from $-\infty$ to $\infty$, the tip of $c\mathbf{v}$ traces out the entire line through the origin in the direction of $\mathbf{v}$. Positive $c$ goes one way, negative $c$ goes the other, $c = 0$ gives the origin. One vector spans a line through the origin.

# The span of one vector is a line through the origin.
import numpy as np
import matplotlib.pyplot as plt
v = np.array([2.0, 1.0])
t = np.linspace(-3, 3, 100)            # the scalars c, sampled
line = np.outer(t, v)                  # each row is c*v for some c
fig, ax = plt.subplots(figsize=(5, 5))
ax.plot(line[:, 0], line[:, 1], "C0-", lw=2, label=r"span$\{\mathbf{v}\}$ (a line)")
ax.arrow(0, 0, *v, color="C3", width=0.05, length_includes_head=True)  # v itself
ax.axhline(0, color="gray", lw=0.5); ax.axvline(0, color="gray", lw=0.5)
ax.set_aspect("equal"); ax.grid(True, alpha=0.3); ax.legend()
plt.show()

Figure 6.1 — The span of a single vector. The arrow is $\mathbf{v} = (2,1)$; the blue line is the set of all scalar multiples $c\mathbf{v}$, running through the origin in both directions. Alt-text: a straight line through the origin with slope one-half, with a single arrow lying along it, illustrating that one vector spans a line.

The span of two independent vectors is a plane through the origin

Now take two vectors in $\mathbb{R}^3$ that point in genuinely different directions — say $\mathbf{v}_1 = (1, 0, 0)$ and $\mathbf{v}_2 = (0, 1, 1)$. Their span is $\{s\,\mathbf{v}_1 + t\,\mathbf{v}_2 : s, t \in \mathbb{R}\}$. As $s$ and $t$ range over all reals, you sweep out a whole flat sheet — a plane through the origin. Every point on that plane is some recipe "$s$ of the first plus $t$ of the second"; no point off the plane is reachable. Two vectors that are not parallel span a plane.

# The span of two non-parallel vectors in R^3 is a plane through the origin.
import numpy as np
import matplotlib.pyplot as plt
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 1.0])
s, t = np.meshgrid(np.linspace(-2, 2, 10), np.linspace(-2, 2, 10))
P = s[..., None] * v1 + t[..., None] * v2     # all combinations s*v1 + t*v2
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(P[..., 0], P[..., 1], P[..., 2], alpha=0.4, color="C0")
ax.quiver(0, 0, 0, *v1, color="C3", lw=2)     # v1
ax.quiver(0, 0, 0, *v2, color="C2", lw=2)     # v2
ax.set_title(r"span$\{\mathbf{v}_1, \mathbf{v}_2\}$ — a plane through the origin")
plt.show()

Figure 6.2 — The span of two non-parallel vectors. The two arrows $\mathbf{v}_1$ and $\mathbf{v}_2$ lie in a tilted plane; every linear combination $s\mathbf{v}_1 + t\mathbf{v}_2$ fills out that plane, which passes through the origin. Alt-text: two arrows in three-dimensional space lying in a shaded tilted plane that contains the origin, illustrating that two independent vectors span a plane.

Three independent vectors span all of $\mathbb{R}^3$

Add a third vector that points out of that plane — say $\mathbf{v}_3 = (0, 0, 1)$, which is not in the span of $\mathbf{v}_1$ and $\mathbf{v}_2$ above (no combination $s(1,0,0) + t(0,1,1)$ can ever have its last two coordinates differ, yet $(0,0,1)$ has them as $0$ and $1$ — we will verify this carefully in §6.4). With three vectors that each reach a genuinely new direction, the span fills all of $\mathbb{R}^3$: every point in 3D space is reachable. Three "independent" vectors span the whole space.

So the span grows with the number of genuinely new directions the vectors supply: one direction gives a line, two give a plane, three give all of $\mathbb{R}^3$. But notice the slippery word genuinely new. What if the third vector had been $\mathbf{v}_3 = \mathbf{v}_1 + \mathbf{v}_2$ — already a combination of the first two? Then it adds nothing: it already lives in the plane, so the span stays a plane even though we now have three vectors. The number of vectors is not what matters; the number of independent directions is. That observation is the doorway to the chapter's third and subtlest idea.

Common Pitfall"More vectors means a bigger span." Not necessarily. Adding a vector that is already in the span changes nothing — $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2\} = \operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_1 + \mathbf{v}_2\}$, because the third vector is reachable from the first two. The span only grows when you add a vector that points out of the current span. This is exactly the difference between a redundant vector and an independent one — the topic of §6.5.

6.4 How do I check whether a vector is in the span of others?

"Is $\mathbf{b}$ in $\operatorname{span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$?" is not a vague geometric musing — it is a concrete computation you already know how to do. The vector $\mathbf{b}$ is in the span exactly when there exist scalars $c_1, \dots, c_k$ with $$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \dots + c_k\mathbf{v}_k = \mathbf{b}.$$ Stack the $\mathbf{v}_i$ as the columns of a matrix $A$ and collect the unknown scalars into a vector $\mathbf{c}$; the question becomes whether the linear system $$A\mathbf{c} = \mathbf{b}$$ has a solution. Membership in a span is precisely the solvability of a linear system — and you solve linear systems by Gaussian elimination (Chapter 4). If row reduction yields a consistent system, $\mathbf{b}$ is in the span (and the solution $\mathbf{c}$ is the recipe); if it forces an impossible equation like $0 = 1$, then $\mathbf{b}$ is unreachable.

Let's make it concrete. Is $\mathbf{b} = (4, 5, 6)$ in the span of $\mathbf{v}_1 = (1, 2, 3)$ and $\mathbf{v}_2 = (2, 1, 0)$? We need $c_1(1,2,3) + c_2(2,1,0) = (4,5,6)$, i.e. the system $$\begin{aligned} c_1 + 2c_2 &= 4 \\ 2c_1 + c_2 &= 5 \\ 3c_1 + 0c_2 &= 6. \end{aligned}$$ The third equation gives $c_1 = 2$ immediately. Substituting into the first: $2 + 2c_2 = 4$, so $c_2 = 1$. Check the second: $2(2) + 1 = 5$. ✓ All three equations agree, so $(4,5,6) = 2\mathbf{v}_1 + 1\mathbf{v}_2$ — it is in the span, and we even know the recipe. Let's confirm with numpy.

# Is b in span{v1, v2}? Solve A c = b, where v1, v2 are the columns of A.
import numpy as np
v1 = np.array([1, 2, 3])
v2 = np.array([2, 1, 0])
A = np.column_stack([v1, v2]).astype(float)     # 3x2: columns are v1, v2
b = np.array([4.0, 5.0, 6.0])
c, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print("recipe c =", np.round(c, 6))             # [2. 1.]
print("A @ c   =", A @ c)                        # [4. 5. 6.]  -> matches b exactly
print("in span?", np.allclose(A @ c, b))         # True

The output c = [2. 1.] confirms the hand result, and A @ c reproduces $\mathbf{b}$ exactly, so the answer is yes. (We used lstsq, least-squares, because $A$ is not square; when an exact solution exists, lstsq finds it, and np.allclose(A @ c, b) confirms there was no leftover error. Least squares as projection is the subject of Chapter 19 — here it is just a convenient solver.)

Geometric Intuition — Asking "is $\mathbf{b}$ in the span of these vectors?" is asking "does $\mathbf{b}$ lie on the line/plane/flat that the vectors sweep out?" If $\mathbf{b}$ pokes out of that flat even slightly, no combination can reach it and the system $A\mathbf{c} = \mathbf{b}$ is inconsistent. The span is a geometric net; membership asks whether $\mathbf{b}$ got caught in it.

A vector that pokes out: the deferred check from §6.3

Now we can settle the question we left open in §6.3, where we claimed that $\mathbf{v}_3 = (0,0,1)$ points out of the plane $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2\}$ with $\mathbf{v}_1 = (1,0,0)$ and $\mathbf{v}_2 = (0,1,1)$ — that adding it is what grows a plane into all of $\mathbb{R}^3$. "Points out of the plane" means exactly "is not in the span," and we now have the tool to prove it: $\mathbf{v}_3$ is in the span if and only if the system $s\mathbf{v}_1 + t\mathbf{v}_2 = \mathbf{v}_3$ has a solution. Writing the three component equations, $$\begin{aligned} s + 0t &= 0 \\ 0s + t &= 0 \\ 0s + t &= 1, \end{aligned}$$ the second equation forces $t = 0$, but the third equation demands $t = 1$. No single value of $t$ can be both $0$ and $1$, so the system is inconsistent — there is no recipe, and $\mathbf{v}_3$ is genuinely off the plane. Geometrically, $\mathbf{v}_1$ and $\mathbf{v}_2$ between them can only build vectors whose last two coordinates are equal (the second coordinate comes entirely from $t\mathbf{v}_2$, and so does the third), but $(0,0,1)$ violates that constraint. The contradiction $0 = 1$ is the algebra's way of saying "you are reaching for a point the net does not cover."

# Is v3=(0,0,1) in span{v1=(1,0,0), v2=(0,1,1)}?  Look for a recipe, then check it.
import numpy as np
v1 = np.array([1.0, 0.0, 0.0]); v2 = np.array([0.0, 1.0, 1.0]); v3 = np.array([0.0, 0.0, 1.0])
A = np.column_stack([v1, v2])                     # 3x2: the plane's two generators
c, *_ = np.linalg.lstsq(A, v3, rcond=None)        # best (s, t) least-squares can offer
print("best recipe (s,t) =", np.round(c, 6))      # [0.  0.5]
print("A @ c             =", np.round(A @ c, 6))   # [0.  0.5 0.5]  -> NOT (0,0,1)
print("v3 in span?       =", np.allclose(A @ c, v3))            # False
print("rank{v1,v2}    =", np.linalg.matrix_rank(A))            # 2  (a plane)
print("rank{v1,v2,v3} =", np.linalg.matrix_rank(np.column_stack([v1, v2, v3])))  # 3 (all of R^3)

The closest lstsq can come is $(s,t) = (0, 0.5)$, which builds $(0, 0.5, 0.5)$ — the foot of the perpendicular from $\mathbf{v}_3$ onto the plane, not $\mathbf{v}_3$ itself — so membership fails. The last two lines tell the whole growth story numerically: $\{\mathbf{v}_1, \mathbf{v}_2\}$ has rank $2$ (a plane), and throwing in $\mathbf{v}_3$ lifts the rank to $3$ (all of $\mathbb{R}^3$). That jump from $2$ to $3$ is the precise meaning of "$\mathbf{v}_3$ supplies a genuinely new direction." When the rank does not jump — as it would for $\mathbf{v}_3 = \mathbf{v}_1 + \mathbf{v}_2$, which sits inside the plane — the new vector was redundant and the span stayed put. A vector enlarges a span exactly when it raises the rank, and that is the cleanest computational summary of the line-versus-plane-versus-space hierarchy we have been drawing by hand.

Check Your Understanding — Is $\mathbf{b} = (3, 1, 5)$ in $\operatorname{span}\{(1,0,1), (0,1,1)\}$?

Answer

No. Solving $s(1,0,1) + t(0,1,1) = (3,1,5)$ gives $s = 3$ (first coordinate) and $t = 1$ (second coordinate), and the third coordinate checks: $s + t = 3 + 1 = 4 \neq 5$. The third equation fails — $4 \neq 5$ — and $\mathbf{b}$ is not in the span; it pokes off the plane by one unit in the vertical direction. The lesson: always verify every coordinate. Two of the three equations pinned $s$ and $t$ down completely, and the leftover equation is the consistency test that membership can quietly fail. (Adding $\mathbf{b}$ would raise the rank from $2$ to $3$.)

6.5 What does it mean for vectors to be linearly independent?

We arrive at the chapter's most important and most slippery idea, the one the SEO gods and every linear-algebra exam care about: linear independence explained properly, geometry first. Here is the plain-English version before any symbols.

Geometric Intuition — A set of vectors is linearly independent when no vector is redundant — none of them lies in the span of the others, so each one contributes a genuinely new direction. Two vectors are independent when they are not parallel (neither is a scalar multiple of the other). Three vectors are independent when none lies in the plane spanned by the other two. Dependence is redundancy; independence is "every vector pulls its weight."

Now the precise definition, which looks abstract but says exactly that. The vectors $\mathbf{v}_1, \dots, \mathbf{v}_k$ are linearly independent if the only way to combine them to get the zero vector is to use all-zero coefficients: $$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \dots + c_k\mathbf{v}_k = \mathbf{0} \quad\Longrightarrow\quad c_1 = c_2 = \dots = c_k = 0.$$ The all-zero combination $c_1 = \dots = c_k = 0$ always gives $\mathbf{0}$ — that is the trivial combination, and it is free for any set of vectors. Independence says it is the only combination that reaches $\mathbf{0}$. If, on the other hand, there is some nontrivial combination (at least one $c_i \neq 0$) that produces $\mathbf{0}$, the vectors are linearly dependent.

Why does "the only way to reach zero is trivially" capture "no vector is redundant"? Because the two statements are logically equivalent, and proving that equivalence is one of the two theorems of this chapter (§6.8). For now, feel the connection through an example. Suppose $\mathbf{v}_3 = 2\mathbf{v}_1 + \mathbf{v}_2$ — the third vector is redundant, a combination of the first two. Rearrange: $2\mathbf{v}_1 + \mathbf{v}_2 - \mathbf{v}_3 = \mathbf{0}$. That is a nontrivial combination (the coefficients $2, 1, -1$ are not all zero) that hits the zero vector. So redundancy produces a nontrivial way to reach zero — which is exactly the definition of dependence. The two ideas are the same fact stated from two angles.

Let's test independence the way you actually will: with row reduction, then with numpy. Take the three vectors $\mathbf{v}_1 = (1,2,3)$, $\mathbf{v}_2 = (2,1,0)$, $\mathbf{v}_3 = (4,5,6)$. We ask whether $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 = \mathbf{0}$ forces all $c_i = 0$. Stack the vectors as columns of a matrix $A$; the equation is the homogeneous system $A\mathbf{c} = \mathbf{0}$. Independence holds exactly when this system has only the trivial solution $\mathbf{c} = \mathbf{0}$ — equivalently, when row reduction produces a pivot in every column (no free variables). If there is a free variable, there are infinitely many solutions, including nontrivial ones, and the set is dependent.

The Key Insight — To test independence, put the vectors as columns of $A$ and row reduce. A pivot in every column $\Leftrightarrow$ no free variables $\Leftrightarrow$ only the trivial solution to $A\mathbf{c}=\mathbf{0}$ $\Leftrightarrow$ the vectors are linearly independent. The number of pivots is the rank of $A$ (Chapter 4), so independence is the statement $\operatorname{rank}(A) = $ (number of vectors). This is the bridge from the geometric idea to a one-line numpy check.

For our example, recall from §6.4 that $(4,5,6) = 2(1,2,3) + 1(2,1,0)$ — so $\mathbf{v}_3 = 2\mathbf{v}_1 + \mathbf{v}_2$ is redundant, and we expect dependence. Reducing $A = [\mathbf{v}_1\ \mathbf{v}_2\ \mathbf{v}_3]$ will leave the third column without a pivot. Let's verify both the rank and the actual dependency relation in numpy.

# Independence test via rank, plus the explicit dependency relation.
import numpy as np
from scipy.linalg import null_space
v1 = np.array([1, 2, 3]); v2 = np.array([2, 1, 0]); v3 = np.array([4, 5, 6])
A = np.column_stack([v1, v2, v3]).astype(float)    # 3x3: vectors as columns
print("rank =", np.linalg.matrix_rank(A))          # 2  (< 3 columns -> dependent)
ns = null_space(A)                                  # nontrivial solutions of A c = 0
print("null space dimension =", ns.shape[1])        # 1  (a whole line of solutions)
rel = ns[:, 0] / ns[:, 0][2] * -1                  # rescale so the v3 coefficient is -1
print("dependency relation c =", np.round(rel, 4)) # [ 2.  1. -1.]

The rank is $2$, not $3$, so the three vectors are dependent: one of them is redundant. The null space is one-dimensional — an entire line of nontrivial coefficient vectors $\mathbf{c}$ solve $A\mathbf{c} = \mathbf{0}$ — and the printed relation $\mathbf{c} = (2, 1, -1)$ reads off as $2\mathbf{v}_1 + 1\mathbf{v}_2 - 1\mathbf{v}_3 = \mathbf{0}$, i.e. $\mathbf{v}_3 = 2\mathbf{v}_1 + \mathbf{v}_2$. The algebra found the redundancy we already knew geometrically. (The null_space returns a unit vector, so we rescaled it to make the dependency human-readable; the direction is what carries the meaning, not the length.)

The same test by hand: counting pivots

numpy gave the answer in one line, but you should be able to do this with a pencil, because the by-hand version is where the pivot in every column idea becomes tangible. Let's row reduce the matrix $A = [\mathbf{v}_1\ \mathbf{v}_2\ \mathbf{v}_3]$ for the same dependent example, using the elimination procedure from Chapter 4. We are reducing the coefficient matrix alone — there is no augmented right-hand side, because the system $A\mathbf{c} = \mathbf{0}$ has the zero vector on the right and row operations never change a column of zeros.

Starting from $$A = \begin{bmatrix} 1 & 2 & 4 \\ 2 & 1 & 5 \\ 3 & 0 & 6 \end{bmatrix},$$ the first pivot is the $1$ in the top-left corner. Clear the rest of column 1: replace row 2 with (row 2 $- 2\times$ row 1) and row 3 with (row 3 $- 3\times$ row 1): $$\begin{bmatrix} 1 & 2 & 4 \\ 0 & -3 & -3 \\ 0 & -6 & -6 \end{bmatrix}.$$ The second pivot is the $-3$ in position (2,2). Clear below it: replace row 3 with (row 3 $- 2\times$ row 2): $$\begin{bmatrix} 1 & 2 & 4 \\ 0 & -3 & -3 \\ 0 & 0 & 0 \end{bmatrix}.$$ We have reached echelon form. There are pivots in columns 1 and 2, but column 3 has no pivot — its variable $c_3$ is free. A free variable means infinitely many solutions to $A\mathbf{c} = \mathbf{0}$, including nontrivial ones, so the vectors are dependent. Two pivots for three columns: $\operatorname{rank} = 2 < 3$, exactly what matrix_rank reported. If you continue to reduced row-echelon form, the matrix becomes $\begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$, and the third column $(2, 1)$ literally spells out the dependency: $\mathbf{v}_3 = 2\mathbf{v}_1 + 1\mathbf{v}_2$. The pivot columns (1 and 2) mark the independent vectors; the non-pivot column (3) marks the redundant one and tells you how to build it. This is the whole algorithm, and it is exactly what your toolkit function will automate in §6.10.

Computational Note — Why use np.linalg.matrix_rank instead of checking the determinant for independence? For a square matrix, $\det(A) = 0$ does signal dependence (Chapter 11). But the vectors you test are often not square — five vectors in $\mathbb{R}^3$, or three in $\mathbb{R}^5$ — and a non-square matrix has no determinant. matrix_rank works for any shape and is the right general tool. Under the hood it uses the SVD (Chapter 30) and counts singular values above a numerical tolerance, which is far more robust to floating-point error than checking whether a determinant is "exactly" zero. We will say more about that tolerance when we build the toolkit function.

Contrast: an independent set

For contrast, take $\mathbf{v}_1 = (1,1,1)$, $\mathbf{v}_2 = (0,1,1)$, $\mathbf{v}_3 = (0,0,1)$. Geometrically these climb out of each other — the first reaches into all three coordinates, the second adds a direction the first lacks, the third adds yet another. We expect independence, hence rank $3$.

# An independent set in R^3: full rank.
import numpy as np
B = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 1, 1]], dtype=float)   # columns are (1,1,1), (0,1,1), (0,0,1)
print("rank =", np.linalg.matrix_rank(B))   # 3  == number of vectors -> independent
print("det  =", round(float(np.linalg.det(B)), 6))  # 1.0 (nonzero, square case)

Rank $3$ equals the number of vectors, so they are independent; since the matrix is square, the nonzero determinant ($1.0$) confirms it. These three vectors are independent and they span $\mathbb{R}^3$ (three independent vectors in a 3-dimensional space always span it, as Chapter 15 will prove) — which makes them a basis, the punchline we are building toward.

Check Your Understanding — Are the vectors $(1, 2)$ and $(3, 6)$ in $\mathbb{R}^2$ linearly independent?

Answer

No, they are dependent. Notice $(3, 6) = 3\,(1, 2)$ — the second is just three times the first, so they are parallel and point along the same line. A nontrivial combination reaching zero is $3\,(1,2) - 1\,(3,6) = (0,0)$, with coefficients $3, -1$ (not both zero). Their span is a single line through the origin, not the whole plane. As a rank check: stacking them as columns gives a matrix of rank $1$, not $2$. Two vectors are independent iff neither is a scalar multiple of the other — iff they are not parallel.

6.6 Why is "spans $\mathbb{R}^3$" different from "is independent"?

Students conflate two genuinely different questions, and untangling them is the conceptual core of the chapter. Spanning asks: do these vectors reach everything? Independence asks: is any of them wasted? These pull in opposite directions, and a basis is the sweet spot where both hold at once.

Common PitfallConfusing "spans $\mathbb{R}^3$" with "is independent." They are different, and often opposite, conditions. A set can span without being independent: the four vectors $(1,0,0), (0,1,0), (0,0,1), (1,1,1)$ span all of $\mathbb{R}^3$ (the first three already do), but they are dependent — the fourth is the sum of the first three, hence redundant. A set can be independent without spanning: the two vectors $(1,0,0), (0,1,0)$ are independent (not parallel), but they only span the $xy$-plane, not all of $\mathbb{R}^3$ — they miss the vertical direction entirely. Spanning is about coverage (reach everything); independence is about economy (no waste). Do not assume one from the other.

There is a clean numerical relationship that organizes all of this. Put $k$ vectors from $\mathbb{R}^n$ as the columns of an $n \times k$ matrix $A$, and let $r = \operatorname{rank}(A)$ be the number of pivots. Then:

  • The vectors are independent $\iff r = k$ (a pivot in every column; no column is redundant).
  • The vectors span $\mathbb{R}^n$ $\iff r = n$ (a pivot in every row; every direction is covered).
  • They are a basis of $\mathbb{R}^n$ $\iff r = k = n$ (both at once; necessarily a square, full-rank matrix).

This little table quietly proves a fact you may have sensed: you need exactly $n$ vectors for a basis of $\mathbb{R}^n$. Fewer than $n$ vectors can be independent but cannot span (not enough pivots to fill every row); more than $n$ vectors can span but cannot be independent (more columns than rows forces a free variable). Only $k = n$ can satisfy both — and then independence and spanning become equivalent, each implying the other. Chapter 15 elevates this observation into the definition of dimension; here, savor that the rank $r$ controls everything.

Real-World ApplicationColor and the RGB basis (computer graphics). A digital display builds every color by mixing three primaries — red $(1,0,0)$, green $(0,1,0)$, blue $(0,0,1)$ — in varying intensities. These three "color vectors" are independent (no primary can be made from the other two) and they span the full color space your monitor can produce, so they form a basis: every displayable color is one unique combination $r\,\mathbf{R} + g\,\mathbf{G} + b\,\mathbf{B}$. If a manufacturer instead chose red, green, and yellow — and yellow is just red-plus-green — the three would be dependent, the "basis" would collapse to a 2D plane, and the display could never produce blue. The whole theory of color reproduction is a story about choosing an independent, spanning set. We unpack this in this chapter's graphics case study.

Let's confirm the span-without-independence claim from the pitfall in code, since the numbers make the abstract relationship vivid.

# "Spans R^3 but is NOT independent": four vectors, rank 3.
import numpy as np
A = np.column_stack([[1, 0, 0],
                     [0, 1, 0],
                     [0, 0, 1],
                     [1, 1, 1]]).astype(float)   # 3x4: four vectors in R^3
print("shape:", A.shape)                          # (3, 4): 4 vectors of length 3
print("rank =", np.linalg.matrix_rank(A))         # 3 == n  -> spans R^3
print("# vectors =", A.shape[1])                  # 4  != rank -> NOT independent
# rank 3 == 3 rows (spans), but rank 3 < 4 columns (dependent): exactly the pitfall.

Rank $3$ equals the number of rows ($n = 3$), so the set spans $\mathbb{R}^3$; but rank $3$ is less than the number of columns ($k = 4$), so the set is dependent. One matrix, two different comparisons, two different answers — that is the whole distinction in a single computation. Four vectors will always be dependent in $\mathbb{R}^3$, because a $3 \times 4$ matrix can have at most $3$ pivots for $4$ columns, guaranteeing a free variable. There is simply not enough room.

The mirror-image case — independent but not spanning — comes from too few vectors. Two vectors in $\mathbb{R}^3$ can be independent, but two pivots cannot fill three rows, so they cannot reach every direction.

# "Independent but does NOT span R^3": two vectors, rank 2 in a 3D space.
import numpy as np
A = np.column_stack([[1, 0, 0],
                     [0, 1, 0]]).astype(float)   # 3x2: two vectors in R^3
print("rank =", np.linalg.matrix_rank(A))         # 2 == # vectors -> independent
print("# vectors =", A.shape[1], " n =", A.shape[0])  # 2 vectors, ambient dim 3
# rank 2 == 2 columns (independent), but rank 2 < 3 rows (does NOT span R^3).

Here rank equals the number of columns (so independent) but is less than the number of rows (so it misses a whole direction — these two vectors span only the $xy$-plane). Compare the two snippets side by side: in the first, columns outnumber the rank (dependent, spanning); in the second, rows outnumber the rank (independent, not spanning). A basis is the one configuration where rank equals both counts at once.

6.7 What about the zero vector, a single vector, and other edge cases?

A few small cases trip people up precisely because they feel too simple to bother checking. Pinning them down sharpens the definitions and saves you from confident mistakes on exams and in code.

Does the zero vector matter? Enormously — in a bad way. Any set that contains the zero vector is automatically dependent. Here is why: if $\mathbf{v}_1 = \mathbf{0}$, then the combination $1\cdot\mathbf{v}_1 + 0\cdot\mathbf{v}_2 + \dots + 0\cdot\mathbf{v}_k = \mathbf{0}$ is nontrivial (its first coefficient is $1 \neq 0$) yet reaches the zero vector. So the very presence of $\mathbf{0}$ creates a nontrivial relation. Intuitively, the zero vector contributes no direction at all — it is the ultimate redundant vector — so it can never belong to an independent set or a basis.

What about a single vector? A one-vector set $\{\mathbf{v}\}$ is independent iff $\mathbf{v} \neq \mathbf{0}$. If $\mathbf{v}$ is nonzero, the only scalar with $c\mathbf{v} = \mathbf{0}$ is $c = 0$ (you cannot scale a nonzero arrow down to nothing without multiplying by zero), so the set is independent. If $\mathbf{v} = \mathbf{0}$, then $c\mathbf{0} = \mathbf{0}$ for every $c$, so nontrivial relations abound and the set is dependent — consistent with the rule above.

What about the empty set? By convention, the empty set $\varnothing$ is linearly independent (there is no nontrivial relation because there are no vectors to combine) and its span is the zero subspace $\{\mathbf{0}\}$. This sounds like a logician's joke, but it makes the theorems of Chapter 15 come out clean: the empty set is the basis of the zero subspace, which therefore has dimension $0$.

Can I have infinitely many vectors? Yes — span and independence are defined for infinite sets too, with the understanding that every linear combination uses only finitely many of them at a time. You will not need infinite sets until the function spaces of Chapter 34, but the definitions already accommodate them, which is part of why we phrased independence in terms of combinations rather than matrices.

Check Your Understanding — Are the three vectors $(0,0,0)$, $(1,2,3)$, $(4,5,6)$ linearly independent?

Answer

No, dependent — and you do not even need to look at the second and third vectors. The set contains the zero vector, so the relation $1\cdot(0,0,0) + 0\cdot(1,2,3) + 0\cdot(4,5,6) = \mathbf{0}$ is nontrivial. Any set containing $\mathbf{0}$ is dependent, full stop. (As a rank check, the matrix has a zero column, so it can have at most rank $2 < 3$.)

Historical Note — The modern, axiomatic definition of linear independence and basis crystallized surprisingly late. Giuseppe Peano gave an axiomatic treatment of vector spaces in 1888, but the now-standard language was popularized through the abstract-algebra revolution of the 1920s–30s, with Hermann Weyl and the influential textbooks of the era fixing the terminology [verify]. The underlying idea is far older: in solving linear systems, eighteenth- and nineteenth-century mathematicians (Euler, Gauss) repeatedly confronted "superfluous" equations that added no new information — dependent rows, in today's terms — long before anyone named the phenomenon. As with the word "matrix" (Chapter 1), the computational practice came first and the clean abstract concept was distilled out of it decades later.

6.8 PROOF: the span of a set of vectors is a subspace

We have asserted twice that span manufactures subspaces. Now we earn it. This is the chapter's first formal proof, and it is a model of how subspace arguments go — short, mechanical once you see the pattern, and worth internalizing because you will run this exact verification dozens of times in Part III.

Why we care. Span is how we build subspaces from a handful of generators, and the four fundamental subspaces of Chapter 13 (column space, null space, row space, left null space) are each defined as a span or a solution set. If span did not always produce a subspace, that entire framework would be built on sand. This theorem is the foundation stone.

Key idea. A linear combination of linear combinations is still a linear combination — so the set of all linear combinations is automatically closed under addition and scaling. That one sentence is the whole proof; the rest is bookkeeping.

Theorem. Let $\mathbf{v}_1, \dots, \mathbf{v}_k$ be any vectors in a vector space $V$. Then $W = \operatorname{span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is a subspace of $V$.

Proof. We verify the three subspace conditions from §6.2.

Condition 1 — $\mathbf{0} \in W$. Choose all scalars equal to zero: $0\,\mathbf{v}_1 + 0\,\mathbf{v}_2 + \dots + 0\,\mathbf{v}_k = \mathbf{0}$. This is a (trivial) linear combination of the $\mathbf{v}_i$, so $\mathbf{0}$ belongs to $W$. ✓

Condition 2 — closure under addition. Take any two elements $\mathbf{u}, \mathbf{w} \in W$. By definition of span, each is some linear combination of the generators: $$\mathbf{u} = a_1\mathbf{v}_1 + \dots + a_k\mathbf{v}_k, \qquad \mathbf{w} = b_1\mathbf{v}_1 + \dots + b_k\mathbf{v}_k,$$ for some scalars $a_i, b_i$. Add them, grouping by each $\mathbf{v}_i$ (using the commutativity and distributivity that hold in $V$ by Chapter 5): $$\mathbf{u} + \mathbf{w} = (a_1 + b_1)\mathbf{v}_1 + (a_2 + b_2)\mathbf{v}_2 + \dots + (a_k + b_k)\mathbf{v}_k.$$ The right-hand side is again a linear combination of $\mathbf{v}_1, \dots, \mathbf{v}_k$ — with new coefficients $a_i + b_i$ — so $\mathbf{u} + \mathbf{w} \in W$. ✓

Condition 3 — closure under scalar multiplication. Take any $\mathbf{u} = a_1\mathbf{v}_1 + \dots + a_k\mathbf{v}_k \in W$ and any scalar $c$. Then, distributing $c$ across the sum, $$c\,\mathbf{u} = c(a_1\mathbf{v}_1 + \dots + a_k\mathbf{v}_k) = (c a_1)\mathbf{v}_1 + (c a_2)\mathbf{v}_2 + \dots + (c a_k)\mathbf{v}_k.$$ This is once more a linear combination of the generators — with coefficients $c a_i$ — so $c\,\mathbf{u} \in W$. ✓

All three conditions hold, so $W$ is a subspace of $V$. $\blacksquare$

What this means. Geometrically, the theorem says the set of all combinations of some arrows is always a flat, origin-anchored object — a line, plane, or higher flat through the origin, never anything bent or off-center. Algebraically, it licenses a powerful habit: to prove something is a subspace, you can exhibit it as a span (find generators), and you are instantly done — no need to re-verify closure by hand. Conversely, every subspace of $\mathbb{R}^n$ turns out to be the span of some finite set of vectors (Chapter 15 proves this, via the existence of a basis). Span and subspace are, in the finite-dimensional world, two names for the same thing: the subspaces are exactly the spans.

Math-Major Sidebar (optional) — The proof used nothing about $\mathbb{R}^n$ specifically; it used only the vector-space axioms (distributivity, commutativity of $+$). So the theorem holds in any vector space over any field — the polynomials, the function spaces, and the matrix spaces of Chapter 5 all included. The span of three polynomials is a subspace of the polynomial space; the span of two functions is a subspace of a function space. This generality is the payoff Chapter 5 promised: prove it once, abstractly, and it holds everywhere. The same argument, with $k = \infty$ and a finiteness caveat on the sums, underlies infinite-dimensional spaces in functional analysis (Chapter 34's inner-product spaces are a first step).

A companion fact: enlarging a spanning set keeps it spanning

The pitfall in §6.3 — "more vectors means a bigger span" — warned that adding vectors need not grow a span. There is a matching positive statement worth proving, because it is the formal backbone of the pruning procedure that extracts a basis: if a set already spans, you can throw extra vectors in and it still spans. The span can only stay the same or grow; it never shrinks when you add generators. Phrased carefully:

Theorem (supersets of spanning sets span). Let $S = \{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ span a subspace $W$, and let $S' = \{\mathbf{v}_1, \dots, \mathbf{v}_k, \mathbf{w}_1, \dots, \mathbf{w}_m\}$ be any superset of $S$ formed by adjoining further vectors $\mathbf{w}_1, \dots, \mathbf{w}_m$ drawn from $W$. Then $S'$ also spans $W$: $\operatorname{span}(S') = W$.

Key idea. Every combination you could already make from $S$ is still a combination of $S'$ — just put coefficient $0$ on each new vector. Adding generators can only add recipes, never remove them.

Proof. We show the two spans are equal by showing each contains the other.

$\operatorname{span}(S) \subseteq \operatorname{span}(S')$. Take any $\mathbf{u} \in \operatorname{span}(S)$, say $\mathbf{u} = c_1\mathbf{v}_1 + \dots + c_k\mathbf{v}_k$. Write the same vector as a combination of all of $S'$ by attaching zero coefficients on the new vectors: $$\mathbf{u} = c_1\mathbf{v}_1 + \dots + c_k\mathbf{v}_k + 0\cdot\mathbf{w}_1 + \dots + 0\cdot\mathbf{w}_m.$$ This is a linear combination of $S'$, so $\mathbf{u} \in \operatorname{span}(S')$. Since $\operatorname{span}(S) = W$, this already gives $W \subseteq \operatorname{span}(S')$.

$\operatorname{span}(S') \subseteq W$. By §6.8, $\operatorname{span}(S') = W$ requires also that no new vector escaped $W$. But each adjoined $\mathbf{w}_j$ was drawn from $W$, and $W$ is a subspace, hence closed under linear combinations; so every combination of the $\mathbf{v}_i$ and $\mathbf{w}_j$ lands back in $W$. Thus $\operatorname{span}(S') \subseteq W$.

Both inclusions hold, so $\operatorname{span}(S') = W$. $\blacksquare$

What this means. Spanning is monotone: enlarging a generating set never costs you coverage. Combine this with §6.9 and the whole basis-extraction algorithm appears in outline. Start with any spanning set. If it is independent, you are done — it is a basis. If it is dependent, §6.9 hands you a redundant vector; delete it, and the present theorem's logic (run in reverse) guarantees the smaller set still spans, because the deleted vector was reachable from the rest. Repeat until independence. You are guaranteed to land on a basis, and you never lose the span along the way. The two proofs of this chapter are precisely the two levers — "you can always drop a redundant vector" and "extra vectors never hurt spanning" — that make a basis reachable from either side.

6.9 PROOF: a set is dependent if and only if some vector is a combination of the others

The second proof closes the gap we have been leaning on intuitively: that the abstract definition of dependence ("a nontrivial combination gives $\mathbf{0}$") means the same thing as the intuitive one ("some vector is redundant"). This is an if and only if, so we prove both directions.

Why we care. The whole geometric story of this chapter — independence as "no redundancy," dependence as "a wasted vector" — rests on these two phrasings being equivalent. Without this theorem, the picture and the algebra are two unconnected ideas; with it, they are provably one. It is also the engine behind "throwing away redundant vectors to extract a basis," a procedure you will use constantly from Chapter 15 on.

Key idea. Dependence gives you a nontrivial relation $\sum c_i\mathbf{v}_i = \mathbf{0}$; pick any vector with a nonzero coefficient and solve for it, expressing it via the others. The reverse runs the same algebra backward.

Theorem. Let $\mathbf{v}_1, \dots, \mathbf{v}_k$ be vectors in a vector space $V$, with $k \ge 2$. Then the set is linearly dependent if and only if at least one of the vectors can be written as a linear combination of the others.

Proof. We prove the two implications separately.

($\Rightarrow$) Dependent implies some vector is a combination of the others. Suppose the set is dependent. By definition there exist scalars $c_1, \dots, c_k$, not all zero, with $$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \dots + c_k\mathbf{v}_k = \mathbf{0}.$$ Since the coefficients are not all zero, pick an index $j$ with $c_j \neq 0$. Move every other term to the right-hand side and divide through by $c_j$ (legal precisely because $c_j \neq 0$): $$c_j\mathbf{v}_j = -\sum_{i \neq j} c_i\mathbf{v}_i \quad\Longrightarrow\quad \mathbf{v}_j = \sum_{i \neq j} \left(-\frac{c_i}{c_j}\right)\mathbf{v}_i.$$ The right-hand side is a linear combination of the other vectors, so $\mathbf{v}_j$ is expressible through them — it is redundant. ✓

($\Leftarrow$) Some vector is a combination of the others implies dependent. Conversely, suppose some vector, say $\mathbf{v}_j$, is a combination of the rest: $$\mathbf{v}_j = \sum_{i \neq j} d_i\mathbf{v}_i$$ for some scalars $d_i$. Move $\mathbf{v}_j$ to the right (equivalently, subtract it from both sides) to get a combination equal to $\mathbf{0}$: $$\sum_{i \neq j} d_i\mathbf{v}_i \;-\; 1\cdot\mathbf{v}_j = \mathbf{0}.$$ This is a linear combination of all the vectors that equals $\mathbf{0}$, and its coefficient on $\mathbf{v}_j$ is $-1 \neq 0$. So at least one coefficient is nonzero: the combination is nontrivial. By definition, the set is dependent. ✓

Both directions hold, so the two notions are equivalent. $\blacksquare$

What this means. This is the theorem that lets you switch freely between the computational definition (used by row reduction and numpy: "only the trivial combination reaches $\mathbf{0}$") and the geometric one (used by your intuition: "no vector lies in the span of the others"). They are the same condition. The proof also reveals which vector is redundant — any one with a nonzero coefficient in the relation can be expressed via the others and deleted without shrinking the span. That is exactly how, given a dependent spanning set, you prune it down to an independent spanning set — a basis. Notice one subtlety the proof respects: it does not say every vector is a combination of the others, only that at least one is. In the dependent set $\{(1,0), (2,0), (0,1)\}$, the vector $(0,1)$ is not a combination of the other two (they only reach the $x$-axis), yet the set is dependent because $(2,0) = 2(1,0)$. Dependence needs only one redundant vector, not universal redundancy.

Warning

— The definition of independence is a statement about every coefficient, and the order of quantifiers matters. "Independent" means: for all coefficient choices, if the combination is $\mathbf{0}$ then all coefficients are $0$. A common error is to test only one or two convenient combinations, find they give $\mathbf{0}$ only trivially, and declare independence. That proves nothing — you must show no nontrivial combination works, which is why row reduction (which finds all solutions of $A\mathbf{c} = \mathbf{0}$ at once) is the correct test, not spot-checking.

6.10 How do span and independence point toward a basis?

Step back and see the architecture you have assembled. Three ideas, each refining the last:

  • A subspace is a flat region through the origin — closed under linear combination (§6.2).
  • The span of a set is the subspace you build from those vectors — all their combinations (§6.3, proved in §6.7).
  • Independence is the quality control: it certifies that none of your building vectors is redundant (§6.5).

Put the last two together and you get the most important definition in the next part of the book. A basis of a subspace $W$ is a set of vectors that is both independent and spans $W$ — a generating set with no waste. It reaches every vector in $W$ (spanning) using the fewest possible vectors (independence). It is the "just right" set: drop a vector and you lose coverage; add a vector and you create redundancy.

The Key InsightBasis = independence + spanning. Spanning guarantees the set is big enough to reach everything; independence guarantees it is small enough to have no redundancy. A basis is the unique sweet spot between too few vectors (can't span) and too many (can't be independent). Every vector in the subspace is then a unique combination of the basis vectors — and the number of basis vectors is the dimension. This is the doorway from Part I into the structural heart of linear algebra.

We will not develop the basis here — that is Chapter 15, where we prove that every basis of a given subspace has the same number of vectors (so dimension is well-defined) and that every vector has unique coordinates in a basis. But you now hold both keys. When you read "$\mathbb{R}^3$ has dimension $3$," it means: you need exactly three independent vectors to span it — no fewer (they wouldn't reach everything) and no more (they couldn't stay independent). The standard basis $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ is the obvious choice, but the RGB primaries, or the climbing vectors $(1,1,1), (0,1,1), (0,0,1)$ from §6.5, work equally well. Same dimension, different coordinate systems — a theme that explodes into "change of basis" in Chapter 16.

Here is the precise sense in which independence makes a basis useful: it forces uniqueness of coordinates. If $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is independent, then every vector in their span is hit by exactly one combination — there is no ambiguity about "how much of each" to use. The argument is a one-liner that shows off both proofs of this chapter. Suppose a vector $\mathbf{w}$ had two recipes, $\mathbf{w} = \sum a_i\mathbf{v}_i = \sum b_i\mathbf{v}_i$. Subtract them: $\sum (a_i - b_i)\mathbf{v}_i = \mathbf{0}$. By independence, the only combination reaching $\mathbf{0}$ is the trivial one, so every $a_i - b_i = 0$, i.e. $a_i = b_i$ — the two recipes were the same all along. Dependence destroys this: with a redundant vector, you can shuffle weight between recipes and get the same $\mathbf{w}$ many ways, so "coordinates" become meaningless. Independence is exactly the condition that makes coordinates well-defined — which is why a basis, not just any spanning set, is what we coordinatize with.

Let's see unique coordinates in action. Take the basis $\mathbf{v}_1 = (1,1,1)$, $\mathbf{v}_2 = (0,1,1)$, $\mathbf{v}_3 = (0,0,1)$ and ask for the coordinates of $\mathbf{w} = (5, -2, 4)$ — the unique $(c_1, c_2, c_3)$ with $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 = \mathbf{w}$.

# Unique coordinates of w in the basis {v1, v2, v3}: solve B c = w.
import numpy as np
B = np.column_stack([[1, 1, 1],
                     [0, 1, 1],
                     [0, 0, 1]]).astype(float)   # basis vectors as columns
w = np.array([5.0, -2.0, 4.0])
c = np.linalg.solve(B, w)                          # B is square & full-rank -> unique c
print("coordinates c =", c)                        # [ 5. -7.  6.]
print("reconstruct  =", B @ c)                     # [ 5. -2.  4.]  == w

The unique coordinates are $(5, -7, 6)$: indeed $5(1,1,1) - 7(0,1,1) + 6(0,0,1) = (5, -2, 4)$. Because the basis is independent, np.linalg.solve returns exactly one answer — there is no other combination that works. Change the basis and the same vector $\mathbf{w}$ gets different coordinates (in the standard basis its coordinates are just $(5, -2, 4)$ itself). The vector is one fixed arrow in space; its coordinates are a description relative to a chosen basis. Holding those two ideas apart — the invariant vector versus its basis-dependent coordinates — is the conceptual leap of Chapter 16, and it all rests on the independence you learned to test here.

Real-World ApplicationDegrees of freedom in robotics (engineering). A robot arm's configuration is described by its joint angles; the number of independent joint motions is the arm's degrees of freedom, and it equals the dimension of the space of reachable configurations. Engineers analyze the Jacobian matrix of the arm — its columns are the velocity contributions of each joint — and ask whether those columns are linearly independent. When they are, every desired end-effector motion is achievable; when they become dependent (a singular configuration, like an arm stretched perfectly straight), the arm loses a degree of freedom and cannot move in some direction no matter how the joints spin. Detecting these singularities is exactly a rank/independence test on the Jacobian — the same matrix_rank check from §6.5, applied to keep robots from locking up.

This degrees-of-freedom reading echoes far beyond robots. The same idea governs the superposition of states in quantum mechanics, where a quantum state is a linear combination of independent basis states (the qubit's $|0\rangle$ and $|1\rangle$ are an independent, spanning set — a basis — for a single-qubit space), and the dimension counts the distinguishable configurations. And in machine learning, the dimensionality of feature spaces in machine learning is governed by how many independent features you have: redundant (dependent) features inflate the apparent dimension without adding information, which is the collinearity problem our data-science case study dissects. Across these wildly different fields, the same arithmetic — count the independent directions — measures the true size of the space.

Worked application: redundant features and collinearity in data

Let's make that data-science claim concrete, because it is where independence quietly decides whether a model can even be fit. Imagine a tiny dataset of five people, with three measured features arranged as the columns of a design matrix $X$: height in centimeters, height in inches, and weight in kilograms. The catch is that inches are not a new measurement — they are centimeters divided by $2.54$. The "height in inches" column is therefore a scalar multiple of the "height in centimeters" column, and the two are linearly dependent. The third feature, weight, is genuinely different. So we have three columns but only two independent directions of information.

# Three features, but two are the same measurement in different units -> collinear.
import numpy as np
height_cm = np.array([150., 160., 170., 180., 190.])
height_in = height_cm / 2.54           # NOT new information: a scalar multiple of column 1
weight    = np.array([55., 62., 70., 80., 95.])
X = np.column_stack([height_cm, height_in, weight])   # 5x3 design matrix
print("columns =", X.shape[1])                  # 3 features
print("rank(X) =", np.linalg.matrix_rank(X))    # 2  -> only 2 independent features

The matrix has three columns but rank $2$: the feature space the data actually occupies is two-dimensional, not three. This is collinearity (or multicollinearity when several features conspire), and it is the same dependence you have tested all chapter, now wearing a data-science hat — one column is a redundant vector lying in the span of the others. The consequence is concrete and severe. Linear regression and many other models must solve a system built from $X$, and that system asks for the coordinates of the outcome in terms of the feature columns. But §6.10 just taught us that coordinates are unique only when the vectors are independent. With a dependent column, infinitely many coefficient vectors fit the data equally well — you can pour weight into the centimeter coefficient and pull it back out of the inch coefficient with no visible effect — so the fitted coefficients become unstable and uninterpretable, the exact failure that destroys uniqueness in §6.10. The cure is to detect the rank deficiency (a matrix_rank check, or its softer cousin the condition number of Chapter 38) and drop or combine the redundant features until the remaining columns are independent. Independence is not an abstract nicety here; it is the precise condition under which a model's coefficients mean anything at all.

Real-World ApplicationDummy-variable traps in statistics. The same trap springs in regression whenever a categorical variable is one-hot encoded and an intercept column of all ones is kept. The category columns sum to the all-ones column — a dependence relation — so the design matrix loses rank and the coefficients are not identifiable. Statisticians call this the "dummy-variable trap," and the standard fix (drop one category as a baseline) is nothing but pruning a dependent set down to an independent one, exactly the §6.9 procedure applied to columns of data.

6.11 Build your own toolkit: testing independence from scratch

You now know the algorithm cold: to decide whether a list of vectors is independent, stack them as the columns of a matrix, row reduce to echelon form, and check whether every column received a pivot. A pivot in every column means no free variables, means only the trivial combination reaches $\mathbf{0}$, means independence. This is the perfect toolkit contribution because it reuses the row-reduction routine you already wrote in Chapter 4 and turns it toward a brand-new question.

Build Your Toolkit — Add independent(vectors) to toolkit/vectors.py (the module you started in Chapter 2). It takes a list of vectors (each a list or tuple of equal length) and returns True if they are linearly independent, False otherwise — implemented from scratch, no numpy in the body. The recipe: (1) arrange the vectors as the columns of a matrix; (2) run your row_reduce from toolkit/linear_systems.py (Chapter 4) to reduce it toward echelon form, tracking which columns get a pivot; (3) return True exactly when the number of pivot columns equals the number of vectors (a pivot in every column). ```python

toolkit/vectors.py (sketch — reuse your Chapter 4 row-reduction)

def independent(vectors): """True iff the given vectors are linearly independent. Vectors as columns -> reduce -> a pivot in EVERY column?""" # 1. Build the matrix with the vectors as its columns. # 2. Forward-eliminate (your row_reduce), counting pivot columns. # 3. return (number_of_pivots == len(vectors)) ... `` A subtle but important detail: comparing floats to *exactly* zero is unsafe (round-off can leave a tiny $10^{-17}$ where a true zero belongs). Treat any entry with absolute value below a small tolerance, say1e-9, as zero when hunting for a pivot — this mirrors whatnp.linalg.matrix_rankdoes internally. **Verify your function againstnp.linalg.matrix_rank:** for any list of vectors,independent(vectors)should equalnp.linalg.matrix_rank(np.array(vectors).T) == len(vectors). Test it on independent sets (e.g. $(1,1,1),(0,1,1),(0,0,1)$), dependent sets (e.g. $(1,2,3),(2,1,0),(4,5,6)$), parallel pairs, and "too many vectors" cases (four vectors in $\mathbb{R}^3$ must come backFalse`).

The verification harness lives in toolkit/tests/ and checks independent against numpy across a battery of random and hand-picked cases; if your implementation and matrix_rank ever disagree, the most likely culprit is the zero-tolerance, not the logic. Building this function cements the chapter's central computational fact — independence is a pivot in every column — into code you wrote yourself.

Computational Note — Your from-scratch independent and np.linalg.matrix_rank can rarely disagree on a knife's-edge case where a set is just barely dependent (a vector that is a combination of the others only up to round-off). numpy's rank uses a principled SVD-based tolerance scaled by the largest singular value and the matrix size; a naive fixed 1e-9 threshold is coarser. For the exact-integer and simple-rational vectors you will test, the two agree perfectly. For messy floating-point data — say, near-collinear feature columns in a real dataset — prefer numpy's rank, and treat "rank deficiency" as a matter of degree (how close to dependent) rather than a hard yes/no. Chapter 38 (numerical linear algebra) takes this seriously; the condition number there measures exactly this near-dependence.

6.12 Summary: three ideas that become a basis

This chapter took the abstraction of Chapter 5 and gave it shape. You learned to recognize a subspace — a flat region closed under linear combination, always through the origin — and to disqualify impostors instantly by checking the origin. You learned that the span of a set of vectors is everything reachable by scaling and adding them, that it is always a subspace (you proved it), and that it looks like a line, a plane, or all of space depending on how many independent directions the vectors supply. You learned linear independence as the precise statement that no vector is redundant — equivalently, that only the trivial combination reaches $\mathbf{0}$, equivalently, that row reduction gives a pivot in every column — and you proved that this matches the geometric idea of "no vector lies in the span of the others." And you saw the punchline forming: a basis is a set that is both independent and spanning, the minimal generating set, with the number of basis vectors being the dimension.

The recurring themes of this book all surfaced. Geometry and algebra are two views of one object: a span is a geometric flat and the solution set of a system; independence is "no redundant arrow" and "a pivot in every column." Computation validates theory: every hand result here was confirmed by numpy's matrix_rank, and you built independent from scratch to check against it. And the connective-tissue theme — that the same idea reappears everywhere — showed up as RGB color bases in graphics, degrees of freedom in robotics, qubit states in quantum mechanics, and feature dimensionality in machine learning, all governed by the single arithmetic of counting independent directions.

In Chapter 7 the book makes its central turn: we stop treating matrices as grids for systems and reveal them as functions that transform space — the idea seeded back in Chapter 1, now developed in full. The span and independence you just mastered become the column space and rank of those transformations. And in Chapter 15, the basis we just glimpsed gets built properly, with dimension proven well-defined. The flat shapes you learned to see here — lines and planes through the origin — are about to become the four fundamental subspaces (Chapters 13–14) that organize the entire subject. Keep the picture of the spanning plane in your head; you will see it on nearly every page from here on.