Key Takeaways — Chapter 20: Gram-Schmidt and QR Decomposition
The big ideas
-
Gram-Schmidt is repeated projection. It is not a new technique but the orthogonal projection of Chapter 19 applied vector by vector: for each new vector, subtract its shadow on everything already orthogonalized, and the perpendicular remainder is a fresh orthogonal direction. Normalize, and move on. The entire chapter rests on this one reused idea.
-
The condition is linear independence. Gram-Schmidt turns any linearly independent set into an orthonormal basis for the same span. Fed a dependent set, it produces a zero remainder at the first redundant vector and cannot normalize — the failure pinpoints the dependence. State this condition; the theorem is false without it.
-
Every subspace has an orthonormal basis. Because Gram-Schmidt is constructive, it proves that any finite-dimensional subspace of $\mathbb{R}^m$ has an orthonormal basis — and hands you one. This existence guarantee silently underwrites the Spectral Theorem (Chapter 27) and the SVD (Chapter 30).
-
QR is Gram-Schmidt in matrix form. Stacking the orthonormal vectors as the columns of $Q$ and recording the lengths and dot products in $R$ gives the factorization $A = QR$, with $Q$ having orthonormal columns ($Q^{\mathsf{T}}Q = I$) and $R$ upper-triangular. The triangularity comes from processing vectors in order — each one rebuilt from only the earlier directions.
-
QR is the stable way to do least squares. Substituting $A = QR$ collapses the normal equations to the triangular system $R\hat{\mathbf{x}} = Q^{\mathsf{T}}\mathbf{b}$, solved by back substitution. It never forms the ill-conditioned $A^{\mathsf{T}}A$ from Chapter 17, so it keeps the significant digits that squaring the condition number would destroy.
-
How you compute matters, not just what you compute. Classical and modified Gram-Schmidt are identical on paper but worlds apart in floating point: classical loses orthogonality on nearly-dependent data, modified does not. This is a first, concrete encounter with numerical stability (Chapter 38).
Skills you gained
- Run Gram-Schmidt by hand on a small basis, producing an orthogonal then orthonormal set, and verify orthonormality by checking $Q^{\mathsf{T}}Q = I$.
- Read the QR factorization off a Gram-Schmidt computation: diagonal of $R$ from remainder lengths, above-diagonal from dot products.
- Reconcile a hand QR with
np.linalg.qrby fixing the sign convention (force a positive $R$-diagonal). - Solve least squares the stable way via $R\hat{\mathbf{x}} = Q^{\mathsf{T}}\mathbf{b}$, and explain why it beats the normal equations on correlated data.
- Distinguish reduced from full QR and know which numpy returns by default.
- Distinguish classical from modified Gram-Schmidt and explain the stability difference.
Terms to know
Gram-Schmidt process · orthonormal basis · orthogonal set · normalization · QR decomposition (factorization) · upper-triangular matrix · orthonormal columns ($Q^{\mathsf{T}}Q = I$) · reduced vs. full QR · sign convention (positive $R$-diagonal) · back substitution · classical vs. modified Gram-Schmidt · numerical stability · loss of orthogonality · condition number (its squaring under $A^{\mathsf{T}}A$).
How this connects
-
Backward. It is the direct sequel to Chapter 19 (orthogonal projection) — Gram-Schmidt is projection repeated — and it repairs the numerical weakness of the normal equations from Chapter 17. It uses the dot product and norm of Chapter 18 and the back substitution of Chapter 4, and it answers the "does an orthonormal basis exist?" question left open by Chapter 15. The column space spanned by $Q$ is the $C(A)$ of Chapter 13.
-
Forward. The orthonormal $Q$ is the protagonist of Chapter 21 (orthogonal matrices and rotations) and, in complex form, the unitary quantum gates of Chapter 27 and beyond. Reversing the QR factors and iterating is the QR algorithm for eigenvalues (Chapters 24, 29) that will compute PageRank at scale. The existence of orthonormal bases underwrites the Spectral Theorem (Chapter 27) and the SVD (Chapter 30). The inner-product viewpoint opens onto Chapter 22 (Fourier series as projection onto orthogonal functions) and Chapter 34 (abstract inner product spaces). The stability theme is completed in Chapter 38.
Themes revisited
This chapter is the clearest illustration yet of two of the book's recurring themes. Geometry and algebra are one object: orthogonality is simultaneously a picture (right angles), an algebraic identity ($Q^{\mathsf{T}}Q = I$), and a computational gift (triangular systems, stable least squares). And linear algebra is the most applied branch of pure mathematics: the single idea of repeated projection fits regression lines, builds camera frames, stabilizes audio filters, generates orthogonal polynomials, and — iterated — computes eigenvalues. Learn it once, use it everywhere.
Build Your Toolkit — this chapter's contribution
Add gram_schmidt(vectors) and qr_decompose(A) to toolkit/gram_schmidt.py, implemented from scratch (no numpy in the implementation), reusing dot/norm from toolkit/vectors.py and the projection idea from toolkit/projection.py. Use the modified ordering for stability, raise a clear error on a near-zero remainder, and verify against np.linalg.qr after reconciling the sign convention.