Key Takeaways — Chapter 38: Numerical Linear Algebra
The big ideas
-
Exact arithmetic is a fantasy; real computers use finite precision. IEEE 754 double precision stores about 16 significant decimal digits ($\varepsilon_{\text{mach}} = 2^{-52} \approx 2.22\times 10^{-16}$). Every operation rounds, introducing a relative error of at most $\tfrac12\varepsilon_{\text{mach}}$. One operation is harmless; millions can interact and amplify.
-
Conditioning and stability are different things — respect both. Conditioning is a property of the problem: how much the true answer moves when the input is perturbed. Stability is a property of the algorithm: whether it adds more error than the problem inherently demands. A hopelessly ill-conditioned problem cannot be saved by any algorithm; a well-conditioned problem can still be ruined by an unstable one. This is the chapter's threshold concept.
-
The condition number $\kappa(A) = \sigma_{\max}/\sigma_{\min}$ is the master diagnostic. Computed from the SVD (Chapter 30), it is the worst-case factor by which input error is amplified into output error. The perturbation bound $\dfrac{\lVert\Delta\mathbf{x}\rVert}{\lVert\mathbf{x}\rVert} \le \kappa(A)\dfrac{\lVert\Delta\mathbf{b}\rVert}{\lVert\mathbf{b}\rVert}$ makes this precise.
-
Rule of thumb: you lose about $\log_{10}\kappa(A)$ digits. Starting from ~16 in double precision, a matrix with $\kappa = 10^{10}$ leaves you ~6 trustworthy digits. When $\kappa(A)$ nears $1/\varepsilon_{\text{mach}} \approx 4.5\times 10^{15}$, your answer may have zero correct digits — as the Hilbert matrix at $n=12$ demonstrates.
-
A small residual does NOT mean a small error. The residual $\lVert A\hat{\mathbf{x}}-\mathbf{b}\rVert$ certifies you solved a nearby problem; the condition number decides whether the nearby problem has a nearby answer. They are linked by (relative error) $\le \kappa(A) \times$ (relative residual). Always compute $\kappa(A)$ before trusting a solution.
-
The forward-error master formula: forward error $\lesssim \kappa(A) \times \varepsilon_{\text{mach}}$. Conditioning sets the toll; a backward-stable algorithm pays only $\varepsilon_{\text{mach}}$ more, an unstable one pays much more.
Skills you gained
- Compute machine epsilon and explain why
0.1 + 0.2 != 0.3and why floats must never be compared with==. - Compute $\kappa(A)$ from the SVD and verify it against
np.linalg.cond. - Use the perturbation bound to predict how many digits of a linear-system solution you can trust.
- Diagnose an ill-conditioned matrix (Hilbert, Vandermonde, near-collinear design matrices, stiff stiffness matrices) and predict the accuracy loss.
- Explain why we pivot in LU (Chapter 10), why QR/SVD beat the normal equations for least squares (Chapters 17, 20, 30), and why
solvebeatsinv @ b. - Recognize catastrophic cancellation and reformulate algebraically to avoid it.
- Choose between direct and iterative methods, and relate iterative convergence to the spectral radius and to $\sqrt{\kappa}$.
Terms to know
floating-point, machine epsilon ($\varepsilon_{\text{mach}}$), rounding error, condition number ($\kappa(A)$), ill-conditioned, numerical stability, backward stability, forward/backward error, residual vs. error, catastrophic cancellation, Hilbert matrix, pivoting (partial pivoting), growth factor, normal equations (and why $\kappa(A^{\mathsf{T}}A) = \kappa(A)^2$), iterative method, Jacobi method, Gauss–Seidel method, spectral radius (convergence criterion), conjugate gradient, Krylov subspace, preconditioner, LAPACK / BLAS.
How this connects to the rest of the book
- Chapter 30 (SVD) supplies the definition: $\kappa(A) = \sigma_{\max}/\sigma_{\min}$. The condition number is a direct reading of the singular values you already know.
- Chapter 10 (LU) is where pivoting was introduced; this chapter explains it is what makes LU backward stable.
- Chapters 17, 20 (regression, QR) flagged that the normal equations carry a hidden numerical cost; this chapter reveals it as the squared condition number $\kappa(A)^2$, with QR/SVD as the cure.
- Chapters 23, 25, 29 (eigenvalues, diagonalization, power method) return as the theory of iterative convergence: the error obeys $\mathbf{e}^{(k)} = T^k\mathbf{e}^{(0)}$, and the spectral radius $\rho(T)$ (largest eigenvalue magnitude) decides whether and how fast it converges.
- Chapter 28 (positive definite matrices) is the setting for conjugate gradient.
- Chapter 37 (matrix exponential / ODEs) showed eigenvalues as the destiny of a dynamical system; here the same eigenvalues set the speed of an iterative solver.
Recurring themes revisited
This chapter is the sharpest possible statement of the book's third theme, computation validates theory and theory guides computation: the proofs of every earlier chapter guarantee the right answer in exact arithmetic, but it takes numerical insight — conditioning, stability, and an honest reckoning with finite precision — to extract a trustworthy answer from a finite machine. The condition number, born from the SVD, is the precise quantity where the theoretical and computational views of linear algebra meet.
Looking ahead
Part VIII brings everything home. The capstone (Chapter 39) drives your from-scratch toolkit on a real application, where the conditioning and stability ideas here become the difference between a demo that works and one that quietly produces nonsense — your condition_number, hilbert_demo, and compare_lstsq functions are exactly the instruments that tell you whether to trust your own results. Chapter 40 looks ahead to where linear algebra goes next: tensors, multilinear algebra, and functional analysis, where these same questions of approximation and stability return in infinite-dimensional dress.