Chapter 16 — Key Takeaways

DataField.Dev

The big ideas

A matrix is a representation, not the thing itself. A linear transformation is a coordinate-free object; a matrix is its shadow in one chosen basis. Change the basis and the matrix changes, but the transformation does not. This is recurring theme #1 of the book, and this chapter is where it stops being a slogan and becomes a theorem (§16.7).
The change-of-basis matrix $P$ has the new basis vectors as its columns (written in the old coordinates). It converts new coordinates to old: $[\mathbf{v}]_{\text{old}} = P[\mathbf{v}]_{\text{new}}$. It is always invertible, because the vectors of a basis are linearly independent, so $P$ is square with independent columns.
Its inverse converts the other way. To go into the new basis, multiply by the inverse: $$[\mathbf{v}]_{\text{new}} = P^{-1}[\mathbf{v}]_{\text{old}}.$$ The correctness test is the round trip: converting old $\to$ new $\to$ old must return the original, because $PP^{-1} = I$.
The matrix of a transformation changes by similarity. If $A$ represents a transformation in the old basis, the same transformation in the new basis is $$B = P^{-1}AP,$$ read right-to-left as "go to old coordinates ($P$), act ($A$), come back to new ($P^{-1}$)." Matrices related this way are similar; conjugation is the operation $A \mapsto P^{-1}AP$.
Similar matrices share their invariants. Trace, determinant, rank, characteristic polynomial, and eigenvalues are all basis-independent — they are properties of the transformation, not the coordinate system. Individual entries, symmetry, and being diagonal are basis-dependent.
The right basis simplifies; the wrong basis does not. Re-gridding $\begin{bmatrix}2&1\\1&2\end{bmatrix}$ into its aligned basis $\{(1,1),(-1,1)\}$ gave the diagonal $\begin{bmatrix}3&0\\0&1\end{bmatrix}$, but re-gridding a clean projection into a skewed basis made it messier. Only a basis aligned with the transformation, an eigenbasis when one exists, makes its matrix diagonal.
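The ideas above can be checked numerically on the chapter's running example. The following is a minimal NumPy sketch, not the book's own code: the test vector `v_old` and all variable names are illustrative assumptions. It builds $P$ from the aligned basis, converts a vector in both directions, and forms $B = P^{-1}AP$.

```python
import numpy as np

# Matrix of the transformation in the old (standard) basis, from the running example.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Change-of-basis matrix: the new basis vectors (1,1) and (-1,1) as columns.
P = np.array([[1.0, -1.0],
              [1.0,  1.0]])
P_inv = np.linalg.inv(P)

# Coordinates: P takes new -> old, P^{-1} takes old -> new.
v_old = np.array([3.0, 5.0])   # arbitrary test vector (an assumption for illustration)
v_new = P_inv @ v_old          # into the new basis
round_trip = P @ v_new         # and back again

# Similarity: the same transformation written in the new basis.
B = P_inv @ A @ P

print("new coordinates:", v_new)
print("round trip ok:  ", np.allclose(round_trip, v_old))
print("B =\n", B)
print("trace A, trace B:", np.trace(A), np.trace(B))
print("det A, det B:    ", np.linalg.det(A), np.linalg.det(B))
```

Up to floating-point rounding, `B` should come out diagonal, and its trace and determinant should match those of `A`, since both are similarity invariants.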

Skills you should now have

Build $P$ from a stated new basis, and compute $P^{-1}$ (by hand for $2\times 2$/$3\times 3$, with numpy otherwise).
Convert a coordinate vector between any two bases, routing through the standard basis when neither is standard: $[\mathbf{v}]_{\mathcal{W}} = P_{\mathcal{W}}^{-1}P_{\mathcal{U}}[\mathbf{v}]_{\mathcal{U}}$.
Run and interpret the round-trip check (old $\to$ new $\to$ old must return the original vector) as the basic test of a correct change of basis.
Derive and compute $B = P^{-1}AP$, and verify your work by checking that trace and determinant match $A$.
Diagnose the $P$-versus-$P^{-1}$ direction error (test on a basis vector: $\mathbf{b}_1$ must have new coordinates $(1,0,\dots)$).
Re-grid a transformation in the recurring 2D visualizer and explain why the picture is the same transformation.
Implement change_basis_matrix and a coordinate converter from scratch and verify the round trip against numpy.
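Several of these skills fit in one short from-scratch sketch. The name change_basis_matrix comes from the list above, but its signature here is an assumption, and the helpers `solve`, `to_old`, `to_new`, and `convert` are hypothetical names introduced only for illustration. The sketch uses exact fractions and Gauss-Jordan elimination rather than an explicit inverse, so the round trip can be checked with exact equality.

```python
from fractions import Fraction

def change_basis_matrix(basis):
    """Return P (list of rows) whose columns are the new basis vectors in old coordinates."""
    n = len(basis)
    if any(len(b) != n for b in basis):
        raise ValueError("need n vectors of length n")
    return [[Fraction(basis[j][i]) for j in range(n)] for i in range(n)]

def solve(P, v):
    """Solve P x = v exactly by Gauss-Jordan elimination on the augmented matrix."""
    n = len(P)
    M = [row[:] + [Fraction(v[i])] for i, row in enumerate(P)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("vectors are dependent, so they do not form a basis")
        M[col], M[pivot] = M[pivot], M[col]
        piv = M[col][col]
        M[col] = [x / piv for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def to_old(P, v_new):
    """[v]_old = P [v]_new."""
    n = len(P)
    return [sum(P[i][j] * v_new[j] for j in range(n)) for i in range(n)]

def to_new(P, v_old):
    """[v]_new = P^{-1} [v]_old, computed by solving P x = [v]_old."""
    return solve(P, v_old)

def convert(basis_u, basis_w, v_u):
    """[v]_W = P_W^{-1} P_U [v]_U, routed through the standard basis."""
    return to_new(change_basis_matrix(basis_w),
                  to_old(change_basis_matrix(basis_u), v_u))

basis = [(1, 1), (-1, 1)]
P = change_basis_matrix(basis)
v_old = [3, 5]                            # arbitrary test vector (assumption)
v_new = to_new(P, v_old)
round_trip_ok = to_old(P, v_new) == v_old
direction_ok = to_new(P, basis[0]) == [1, 0]   # b_1 must have new coordinates (1, 0)
print(v_new, round_trip_ok, direction_ok)
```

The direction test matters because the round trip alone still passes if $P$ and $P^{-1}$ are swapped consistently; only the check on $\mathbf{b}_1$ catches that mistake. To compare against NumPy, the same coordinates could be obtained with `numpy.linalg.solve` applied to $P$ and the old coordinate vector.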

Terms to know

change of basis, change-of-basis matrix ($P$), coordinate vector, old / new coordinates, similar matrices, similarity ($B = P^{-1}AP$), conjugation, invariant (basis-independent), basis-dependent, trace invariance, determinant invariance, eigenbasis (preview), transition matrix (a synonym for the change-of-basis matrix).

How this connects to the rest of the book

Back to Part III. Change of basis completes the conceptual arc of Chapters 13–15: having learned what a matrix reaches, destroys, and how many independent directions it carries, you now know that the matrix itself was only ever one description among many. Rank — which you met as a count of pivots — is a similarity invariant precisely because it is the dimension of the (basis-free) image.
Forward to the heart of the book (Part V). This chapter is the doorway to eigenvalues. Chapter 23 names the special directions a transformation merely scales (eigenvectors) and the scaling factors (eigenvalues): exactly the directions $(1,1)$, $(-1,1)$ and the factors $3$, $1$ that re-gridded $\begin{bmatrix}2&1\\1&2\end{bmatrix}$ into diagonal form. Chapter 24 computes them via the characteristic polynomial (a similarity invariant). Chapter 25, Diagonalization, is the direct payoff of this chapter: it is the systematic act of choosing the eigenbasis as your new basis so that $P^{-1}AP$ comes out diagonal (call it $D$), which rearranges to $A = PDP^{-1}$. That factorization reduces powers to $A^k = PD^kP^{-1}$, where $D^k$ just raises each diagonal entry to the $k$-th power, and makes matrix exponentials and long-run dynamics similarly easy to compute. Everything in Chapter 25 is this chapter's grammar applied to the best possible basis.
Further forward. The spectral theorem (Chapter 27) guarantees that, whenever the matrix is real and symmetric, the diagonalizing change of basis can be chosen orthogonal (a rotation, possibly combined with a reflection); this is the fact that powers PCA (Case Study 1). The SVD (Chapter 30) generalizes to every matrix by allowing two different orthonormal bases, one for input and one for output. And Jordan normal form (Chapter 36) is the cleanest representative under similarity when no eigenbasis exists: the "almost diagonal" canonical form that completes the classification this chapter opened.
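As a preview of those later chapters, the sketch below checks two of the claims above on the running example, assuming NumPy. The exponent `k = 5` and the variable names are illustrative choices, not taken from the book. It compares a direct matrix power against $PD^kP^{-1}$, then uses `numpy.linalg.eigh`, which is intended for symmetric matrices, to confirm that the eigenvector matrix it returns is orthogonal.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenbasis from the running example as columns of P; D holds the scaling factors.
P = np.array([[1.0, -1.0],
              [1.0,  1.0]])
D = np.diag([3.0, 1.0])
P_inv = np.linalg.inv(P)

# Powers through the diagonal form: only the diagonal entries are raised to k.
k = 5  # illustrative exponent (assumption)
A_k_direct = np.linalg.matrix_power(A, k)
A_k_diag = P @ np.diag(np.diag(D) ** k) @ P_inv
print("powers agree:", np.allclose(A_k_direct, A_k_diag))

# Symmetric case: eigh returns eigenvalues in ascending order and
# orthonormal eigenvectors as the columns of Q.
eigenvalues, Q = np.linalg.eigh(A)
is_orthogonal = np.allclose(Q.T @ Q, np.eye(2))
D_from_Q = Q.T @ A @ Q   # Q^{-1} = Q^T for an orthogonal Q
print("Q orthogonal:", is_orthogonal)
print("Q^T A Q diagonal:", np.allclose(D_from_Q, np.diag(eigenvalues)))
```

Note that $P$ here is not orthogonal, because its columns have length $\sqrt{2}$; normalizing them gives an orthonormal eigenbasis of the kind `eigh` returns, up to the sign and order of the columns.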

The one image to keep

The recurring visualizer showing the same stretch of the plane two ways: a dense, cross-term-laden matrix $\begin{bmatrix}2&1\\1&2\end{bmatrix}$ on the standard square grid, and the clean diagonal $\begin{bmatrix}3&0\\0&1\end{bmatrix}$ on the aligned diamond grid. The transformation never moved. We changed the graph paper — and the right graph paper revealed its true, simple nature. That is change of basis, and it is the engine of everything to come.