Annotated Bibliography

These are the sources this book learned from and the ones you should read next. Each entry says what the work is, who it is for, and how it relates to our geometry-first, application-driven approach. Editions and dates are given where we are confident; anything we could not verify is flagged [verify] so you can confirm before citing it in your own work. Free and openly licensed resources are marked (free).


Foundational textbooks

Gilbert Strang, Introduction to Linear Algebra (Wellesley–Cambridge Press, 6th ed., 2023 [verify]). The spiritual parent of this book. Strang teaches linear algebra through the four fundamental subspaces and the geometry of $A\mathbf{x} = \mathbf{b}$, with an applied, engineer's sensibility. If you read one other linear algebra book, read this one. Our notation for the four subspaces — $C(A)$, $N(A)$, $C(A^{\mathsf{T}})$, $N(A^{\mathsf{T}})$ — is his.

Gilbert Strang, MIT 18.06 Linear Algebra (MIT OpenCourseWare). (free) The legendary video lecture course, freely available with problem sets and exams. Watching Strang derive elimination and the SVD on the blackboard is the closest thing to the "draw the picture first" ethos we have tried to capture in print. Start here if you prefer lectures to reading. URL: ocw.mit.edu (course 18.06) [verify].

Sheldon Axler, Linear Algebra Done Right (Springer, 4th ed., 2024; 3rd ed. 2015). (free, 4th ed. open access) The rigorous, proof-first counterpoint to Strang. Axler famously develops the theory without determinants, building eigenvalues from the structure of operators on finite-dimensional spaces instead. Read it after this book if you are a math major who wants the abstract inner product space and linear transformation material (our Chapters 34–35) done with full care. The 4th edition is openly available online.

David C. Lay, Steven R. Lay, and Judi J. McDonald, Linear Algebra and Its Applications (Pearson, 6th ed., 2021 [verify]). One of the most widely adopted texts for the US undergraduate course. Gentler and more example-driven than Strang, with abundant drill problems and a clean treatment of the Invertible Matrix Theorem and diagonalization. An excellent source of extra practice at the ⭐⭐ computational tier.

Jim Hefferon, Linear Algebra (4th ed., 2020 [verify]). (free) A complete, genuinely free (Creative Commons) textbook with full solutions to every exercise. Notably thorough on the theory of vector spaces, bases, and change of basis (our Chapters 5–6, 15–16). The companion answer key makes it ideal for self-study alongside this book. Available at hefferon.net.

Stephen Boyd and Lieven Vandenberghe, Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares (VMLS) (Cambridge University Press, 2018). (free) The book to read if you came for the applications. VMLS builds everything around vectors, matrices, and least squares, with a relentless focus on data, signals, and optimization, and it comes with companion guides in Julia and Python for the computation. It pairs perfectly with our applied chapters (12, 17, 29, 31–33). Freely downloadable at stanford.edu/~boyd/vmls [verify].

Numerical and advanced references

Lloyd N. Trefethen and David Bau III, Numerical Linear Algebra (SIAM, 1997). The most beautiful book on the computational side of the subject, organized around the SVD, QR, and the idea of conditioning and stability. Its perspective informs our Chapter 38. Concise, elegant, and surprisingly readable for a graduate text; the place to go after you wonder "but what does the computer actually do?"

Gene H. Golub and Charles F. Van Loan, Matrix Computations (Johns Hopkins University Press, 4th ed., 2013). The encyclopedic reference for matrix algorithms — factorizations, eigenvalue methods, iterative solvers, the lot. Not a book to read cover to cover, but the definitive place to look up how any decomposition is actually computed and analyzed. For researchers and serious practitioners of numerical linear algebra.

Roger A. Horn and Charles R. Johnson, Matrix Analysis (Cambridge University Press, 2nd ed., 2012). The rigorous reference for the theory of matrices: eigenvalue inequalities, the spectral theorem, positive definiteness, norms, and canonical forms (including Jordan normal form, our Chapter 36) in full generality. Dense and authoritative; the mathematician's matrix handbook.

Visual and online resources

3Blue1Brown (Grant Sanderson), Essence of Linear Algebra (YouTube, 2016). (free) A short series of stunning animated videos that build the same intuition this book is built on: a matrix is a transformation of space, the determinant is the factor by which that transformation scales areas (or volumes), and eigenvectors are the directions that don't get knocked off their span. If a concept here feels abstract, there is almost certainly a 3Blue1Brown video that makes it move. We recommend watching the series before or alongside Parts I–V.

Khan Academy, Linear Algebra. (free) Free, methodical video walkthroughs of the computational mechanics — row reduction, matrix multiplication, the dot product — with practice exercises. Best as a remedial or reinforcement resource for the ⭐⭐ hand-computation skills.

Founding papers

Carl Eckart and Gale Young, "The approximation of one matrix by another of lower rank," Psychometrika 1(3):211–218 (1936). The paper that proved the Eckart–Young theorem: keeping only the $k$ largest singular values of the SVD gives the best rank-$k$ approximation of a matrix, measured in the Frobenius norm. It is the theorem behind every use of the SVD for compression, denoising, and dimensionality reduction in our Chapters 31–32.
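
If you want to see the Eckart–Young statement numerically before reading the paper, the following is a minimal sketch using NumPy. The helper name `truncated_svd`, the random test matrix, and the choice `k = 2` are our own illustrative assumptions, not anything taken from the paper. It checks that the Frobenius-norm error of the truncated SVD equals the root sum of squares of the discarded singular values, and that an arbitrary rank-$k$ competitor does no better.

```python
import numpy as np

def truncated_svd(A, k):
    """Rank-k matrix built from the k largest singular triples of A (hypothetical helper)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first k columns of U by the first k singular values, then recombine.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5))   # illustrative test matrix (an assumption)
k = 2

A_k = truncated_svd(A, k)
s = np.linalg.svd(A, compute_uv=False)

# Frobenius error of the truncation ...
err = np.linalg.norm(A - A_k, "fro")
# ... which the theorem says equals the norm of the discarded singular values.
tail = np.sqrt(np.sum(s[k:] ** 2))

# An arbitrary rank-k competitor, for comparison only (one sample is not a proof).
B = rng.standard_normal((6, k)) @ rng.standard_normal((k, 5))
err_B = np.linalg.norm(A - B, "fro")

print(f"truncated SVD error: {err:.4f}  (tail of singular values: {tail:.4f})")
print(f"random rank-{k} competitor error: {err_B:.4f}")
```

A single random competitor only illustrates the inequality; the theorem itself covers every matrix of rank at most $k$.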

Karl Pearson, "On lines and planes of closest fit to systems of points in space," Philosophical Magazine 2(11):559–572 (1901). The origin of principal component analysis (Chapter 32): Pearson framed PCA as finding the line or plane minimizing perpendicular distance to a cloud of points — an orthogonal projection problem — decades before the matrix machinery existed. (Harold Hotelling gave the modern variance-maximization formulation in 1933 [verify].)
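
The link between Pearson's closest-fit line and the variance-maximizing direction can be checked in a few lines. This is a sketch under our own assumptions: the synthetic 2-D point cloud, the helper names `perp_sq_dist` and `proj_variance`, and the half-degree grid of candidate directions are all illustrative. For centered data, squared perpendicular distance plus squared projection length is constant, so minimizing one is the same as maximizing the other, and both are achieved by the first right singular vector.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical point cloud, stretched along one direction.
X = rng.standard_normal((200, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])
Xc = X - X.mean(axis=0)           # center the cloud

# First right singular vector of the centered data.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
v = Vt[0]

def perp_sq_dist(Xc, u):
    """Sum of squared perpendicular distances to the line through the origin along unit vector u."""
    proj = np.outer(Xc @ u, u)
    return np.sum((Xc - proj) ** 2)

def proj_variance(Xc, u):
    """Sum of squared projection lengths onto unit vector u (unnormalized variance)."""
    return np.sum((Xc @ u) ** 2)

# Brute-force scan over unit directions at half-degree spacing.
angles = np.linspace(0.0, np.pi, 361)
dirs = np.column_stack([np.cos(angles), np.sin(angles)])
best_fit = min(dirs, key=lambda u: perp_sq_dist(Xc, u))   # Pearson's criterion
best_var = max(dirs, key=lambda u: proj_variance(Xc, u))  # variance criterion

total = np.sum(Xc ** 2)
print("Pythagoras holds:", np.isclose(perp_sq_dist(Xc, v) + proj_variance(Xc, v), total))
print("alignment of scan winners with v:", abs(best_fit @ v), abs(best_var @ v))
```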

Sergey Brin and Lawrence Page, "The anatomy of a large-scale hypertextual Web search engine," Computer Networks and ISDN Systems 30(1–7):107–117 (1998). The paper that introduced Google and PageRank (Chapter 29). It models the web as a giant column-stochastic matrix and ranks pages by its dominant eigenvector, computed by power iteration — possibly the most lucrative eigenvector problem in history. (A companion technical report, Page, Brin, Motwani & Winograd, "The PageRank Citation Ranking," Stanford 1999, gives the damping-factor details. [verify])
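
For readers who want to try the idea on a toy graph, here is a minimal power-iteration sketch. It is not the paper's implementation: the function name `pagerank`, the four-page graph `toy_links`, the default damping value of 0.85, and the choice to treat a page with no outgoing links as linking uniformly to every page are all assumptions made for illustration.

```python
import numpy as np

def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration on a damped column-stochastic link matrix.

    links[j] lists the pages that page j links to (hypothetical input format).
    Returns a vector of ranks that sums to 1.
    """
    n = len(links)
    M = np.zeros((n, n))
    for j, outs in enumerate(links):
        if outs:
            M[outs, j] = 1.0 / len(outs)   # split page j's vote among its targets
        else:
            M[:, j] = 1.0 / n              # assumption: dangling page jumps uniformly
    # Damped matrix: follow a link with probability `damping`, else jump anywhere.
    G = damping * M + (1.0 - damping) / n
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_next = G @ r
        if np.linalg.norm(r_next - r, 1) < tol:
            return r_next
        r = r_next
    return r

# Toy web: page 0 links to 1 and 2, page 1 to 2, page 2 to 0, page 3 to 2.
toy_links = [[1, 2], [2], [0], [2]]
ranks = pagerank(toy_links)
print(ranks)
```

Because every column of the damped matrix sums to 1 and every entry is positive, the iteration keeps the ranks summing to 1 and settles on the dominant eigenvector regardless of the starting vector.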


A note on using these sources. This book is deliberately a bridge: more geometric and applied than Axler, more proof-aware than VMLS, less encyclopedic than Golub & Van Loan. When a chapter's further-reading.md points you to "Strang Ch. X" or "Axler §Y," it is sending you to the source that does that one topic best — not asking you to abandon this one. Read across them; the subject rewards seeing the same idea from several angles.