Chapter 30 Exercises — The Singular Value Decomposition

Work these in order; the tiers build. ⭐ checks the concept, ⭐⭐ asks you to compute by hand, ⭐⭐⭐ asks for a proof (A track) or code (C track), and ⭐⭐⭐⭐ pushes into applications. Problems marked [proof] want a rigorous argument; [code] want a short numpy script with output. Throughout, the SVD is $A = U\Sigma V^{\mathsf{T}}$ with singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge 0$. Remember the conventions of §30.4: the singular values are unique, but the singular vectors carry paired sign and (for repeated singular values) rotational freedom, and np.linalg.svd returns $V^{\mathsf{T}}$, not $V$.


Tier 1 — ⭐ Conceptual

30.1. In one sentence, state the rotate–stretch–rotate picture: what does each of $V^{\mathsf{T}}$, $\Sigma$, and $U$ do to a vector as $A = U\Sigma V^{\mathsf{T}}$ acts on it?

30.2. True or false, with a reason for each: (a) every matrix has an SVD; (b) every matrix has an eigen-decomposition; (c) singular values can be negative; (d) singular values can be zero.

30.3. What shape are $U$, $\Sigma$, and $V$ for a $7\times 4$ matrix $A$ (full SVD)? What changes in the reduced SVD?

30.4. A matrix $A$ has singular values $\{6, 4, 0, 0\}$. What is its rank? If $A$ is $5\times 4$, what are the dimensions of its column space, row space, null space, and left null space?

30.5. Explain in your own words why the SVD needs two orthogonal matrices $U$ and $V$, whereas diagonalization $A = PDP^{-1}$ uses only one change of basis $P$.

30.6. Which numpy call returns the SVD, and what exactly are its three outputs? Name a common bug that comes from forgetting what the third output is.

30.7. For which special class of matrices do the singular values equal the eigenvalues, and the left and right singular vectors coincide? Why does this matter for PCA (Chapter 32)?

Tier 2 — ⭐⭐ Computation (by hand)

30.8. Let $A = \begin{psmallmatrix} 3 & 0 \\ 0 & -2 \end{psmallmatrix}$. Find $A^{\mathsf{T}}A$, its eigenvalues, the singular values, and a full SVD by hand. (Watch the sign: the singular value is $2$, not $-2$ — where does the minus sign go?)

30.9. Let $A = \begin{psmallmatrix} 0 & 2 \\ 3 & 0 \end{psmallmatrix}$. Compute $A^{\mathsf{T}}A$ and its eigenvalues, the singular values $\sigma_1, \sigma_2$, and the matrices $U, \Sigma, V$. Verify $U\Sigma V^{\mathsf{T}} = A$.

30.10. For $A = \begin{psmallmatrix} 1 & 1 \\ 0 & 1 \end{psmallmatrix}$ (the shear from §30.6), compute $A^{\mathsf{T}}A$ and find the exact singular values $\sigma = \sqrt{\frac{3\pm\sqrt5}{2}}$. Confirm numerically that $\sigma_1 \approx 1.618$ and $\sigma_2 \approx 0.618$, and note these are not the eigenvalues of $A$ (which are both $1$). What does this say about the difference between singular values and eigenvalues?

30.11. A matrix $A$ has SVD with $\sigma_1 = 5$, $\sigma_2 = 3$, $\sigma_3 = 1$. Compute (a) the operator norm $\lVert A\rVert_2$, (b) the Frobenius norm $\lVert A\rVert_F$, (c) the condition number $\kappa(A)$. If $A$ is $3\times 3$, what is $\lvert\det A\rvert$?

30.12. For the matrix $A = \begin{psmallmatrix} 2 & 0 \\ 0 & 1 \\ 0 & 0 \end{psmallmatrix}$ (a $3\times 2$ matrix), write down the singular values and a full SVD by inspection. Identify orthonormal bases for all four fundamental subspaces.

30.13. Given the rank-1 matrix $A = \mathbf{u}\mathbf{v}^{\mathsf{T}}$ where $\mathbf{u} = (1, 2, 2)^{\mathsf{T}}$ and $\mathbf{v} = (3, 4)^{\mathsf{T}}$, find its single nonzero singular value $\sigma_1$ without forming $A^{\mathsf{T}}A$. (Hint: $\sigma_1 = \lVert\mathbf{u}\rVert\,\lVert\mathbf{v}\rVert$ — explain why.)

30.14. Compute the pseudoinverse $\Sigma^{+}$ of the rectangular $\Sigma = \begin{psmallmatrix} 4 & 0 & 0 \\ 0 & 2 & 0 \end{psmallmatrix}$ (a $2\times 3$ matrix). What shape is $\Sigma^{+}$, and what are its nonzero entries?

Tier 3 — ⭐⭐⭐ Proof (A) / Coding (C)

30.15. [proof] Prove that the eigenvalues of $A^{\mathsf{T}}A$ are all non-negative, for any real matrix $A$. (This is step (b) of the existence proof, §30.5 — reconstruct it: show $\lambda_i = \lVert A\mathbf{v}_i\rVert^2 \ge 0$.)

30.16. [proof] Prove that $\lVert A\rVert_F = \sqrt{\sum_i \sigma_i^2}$, using the fact that the Frobenius norm is unchanged by multiplication by an orthogonal matrix. State clearly where orthogonality is used.

30.17. [proof] Prove that for a square matrix $A$, $\lvert\det(A)\rvert = \prod_i \sigma_i$. (Hint: take determinants of $A = U\Sigma V^{\mathsf{T}}$ and use $\lvert\det U\rvert = \lvert\det V\rvert = 1$ for orthogonal matrices, Chapter 21.)

30.18. [proof] Show that the top singular value satisfies $\sigma_1 = \max_{\lVert\mathbf{x}\rVert=1}\lVert A\mathbf{x}\rVert$, and that the maximizing $\mathbf{x}$ is $\mathbf{v}_1$. (Hint: write $\mathbf{x}$ in the orthonormal basis $\{\mathbf{v}_i\}$, expand $\lVert A\mathbf{x}\rVert^2 = \sum_i \sigma_i^2 c_i^2$, and maximize subject to $\sum_i c_i^2 = 1$.)

30.19. [code] Write a numpy script that computes the SVD of $A = \begin{psmallmatrix} 4 & 11 & 14 \\ 8 & 7 & -2 \end{psmallmatrix}$ and prints the singular values, $U$, and $V$. Verify $U\Sigma V^{\mathsf{T}} = A$ and that $U, V$ have orthonormal columns. Then reconstruct $A$ keeping only the largest singular value and report the Frobenius error.

30.20. [proof] Prove that $A^{\mathsf{T}}A$ and $AA^{\mathsf{T}}$ have the same nonzero eigenvalues, for any matrix $A$. (Hint: if $A^{\mathsf{T}}A\mathbf{v} = \lambda\mathbf{v}$ with $\lambda \ne 0$, show $A\mathbf{v}$ is an eigenvector of $AA^{\mathsf{T}}$ with the same eigenvalue. Why is $A\mathbf{v} \ne \mathbf{0}$?)

30.21. [code] Implement the four-step svd_from_scratch(A) of §30.12 (via $A^{\mathsf{T}}A$, then $\mathbf{u}_i = A\mathbf{v}_i/\sigma_i$). Test it on three matrices of different shapes (a square, a tall, and a wide one) and verify against np.linalg.svd — comparing singular values and the reconstruction, not the raw singular vectors. Explain in a comment why you compare the reconstruction rather than $U$ and $V$ entrywise.

30.22. [code] Write a function numerical_rank(A, tol) that returns the number of singular values of $A$ exceeding tol. Build a matrix that is mathematically rank-3 but numerically rank-2 (e.g. with singular values $\{10, 5, 10^{-12}\}$ by construction) and show that numerical_rank and np.linalg.matrix_rank agree on it. Why is this more robust than counting pivots?

Tier 4 — ⭐⭐⭐⭐ Application

30.23. [code] (Least squares via the pseudoinverse.) You have noisy measurements $(x_i, y_i)$ and want the best line $y = c_0 + c_1 x$. Generate 20 points from $y = 2 + 3x + \text{noise}$, build the design matrix $A$, and solve for $(c_0, c_1)$ three ways: (a) the normal equations $A^{\mathsf{T}}A\mathbf{c} = A^{\mathsf{T}}\mathbf{y}$; (b) the SVD-based pseudoinverse $\mathbf{c} = A^{+}\mathbf{y}$ built from $V\Sigma^{+}U^{\mathsf{T}}$; (c) np.linalg.lstsq. Confirm all three agree, and explain (referencing §30.10) why method (b) is the most numerically reliable for ill-conditioned $A$.

30.24. [code] (Image compression preview, Chapter 31.) Load a small grayscale image as a matrix (or build one, e.g. np.add.outer(np.sin(...), np.cos(...))), compute its SVD, and plot the singular values on a log scale. Reconstruct the image keeping the top $k = 5, 20, 50$ singular values and display the three reconstructions. At what $k$ does the image become visually indistinguishable from the original? Report the fraction $\sum_{i\le k}\sigma_i^2 / \sum_i \sigma_i^2$ of "energy" retained at each $k$.

30.25. [code] (Latent semantic analysis preview.) Build a $6\times 6$ term-document matrix where documents 1–3 share one vocabulary and documents 4–6 share another (see Case Study 30.1). Compute its SVD, report the singular values, and project the documents onto the top 2 right singular vectors. Show that the documents cluster into two groups in this 2D "latent semantic" space. What do the two largest singular values represent?

30.26. [proof/conceptual] (Polar decomposition.) Show that any square matrix factors as $A = QP$, where $Q = UV^{\mathsf{T}}$ is orthogonal and $P = V\Sigma V^{\mathsf{T}}$ is symmetric positive semidefinite. (This is the matrix analogue of writing a complex number as $re^{i\theta}$ — a "rotation" $Q$ times a "stretch" $P$.) Verify on $A = \begin{psmallmatrix} 2 & 2 \\ -1 & 1 \end{psmallmatrix}$ that $Q$ is orthogonal and $P$ is symmetric, using its SVD from §30.3.


Hints and selected answers

  • 30.2. (a) True — §30.5, the SVD exists for every matrix. (b) False — defective matrices and matrices with too few eigenvectors are not diagonalizable (Chapter 25). (c) False — $\sigma_i \ge 0$ always. (d) True — a zero singular value signals rank deficiency (§30.7).
  • 30.8. $A^{\mathsf{T}}A = \begin{psmallmatrix}9&0\\0&4\end{psmallmatrix}$, eigenvalues $9, 4$, singular values $3, 2$. The minus sign of the $-2$ entry goes into a singular vector: e.g. $U = \begin{psmallmatrix}1&0\\0&-1\end{psmallmatrix}$, $\Sigma = \operatorname{diag}(3,2)$, $V = I$ (or any paired-sign variant).
  • 30.10. $A^{\mathsf{T}}A = \begin{psmallmatrix}1&1\\1&2\end{psmallmatrix}$, eigenvalues $\frac{3\pm\sqrt5}{2}$; singular values are the golden ratio $\varphi \approx 1.618$ and $1/\varphi \approx 0.618$. The eigenvalues of $A$ are both $1$, so they completely fail to report the genuine stretching — singular values measure stretch, eigenvalues measure invariant directions.
  • 30.11. $\lVert A\rVert_2 = 5$; $\lVert A\rVert_F = \sqrt{25+9+1} = \sqrt{35} \approx 5.916$; $\kappa = 5/1 = 5$; $\lvert\det A\rvert = 5\cdot 3\cdot 1 = 15$.
  • 30.13. $\sigma_1 = \lVert\mathbf{u}\rVert\lVert\mathbf{v}\rVert = 3\cdot 5 = 15$. A rank-1 matrix $\mathbf{u}\mathbf{v}^{\mathsf{T}}$ has exactly one nonzero singular value, equal to the product of the lengths, with left singular vector $\mathbf{u}/\lVert\mathbf{u}\rVert$ and right singular vector $\mathbf{v}/\lVert\mathbf{v}\rVert$.
  • 30.14. $\Sigma^{+}$ is $3\times 2$ (transposed shape) with diagonal entries $\tfrac14, \tfrac12$ and zeros elsewhere.
  • 30.20 and 30.26 are excellent preparation for Chapters 32 and 37; the polar decomposition (30.26) reappears in continuum mechanics and computer animation as the cleanest way to separate rotation from deformation.