Chapter 28 Exercises — Positive Definite Matrices and Quadratic Forms
Work the tiers in order; each builds on the last. ⭐ checks understanding, ⭐⭐ asks for hand computation, ⭐⭐⭐ asks for a proof (A) or code (C), and ⭐⭐⭐⭐ is an open-ended application. Problems marked [code] want a short numpy snippet; [proof] wants a rigorous argument in the §10 four-part style. Throughout, $A$ denotes a real symmetric matrix unless stated otherwise, and "positive definite" is abbreviated PD, "positive semidefinite" PSD.
Tier 1 — ⭐ Conceptual
28.1. In one sentence each, describe the surface (bowl, dome, saddle, or trough) of a quadratic form $\mathbf{x}^{\mathsf{T}}A\mathbf{x}$ that is (a) positive definite, (b) negative definite, (c) indefinite, (d) positive semidefinite but not definite.
28.2. True or false, with a one-line reason: a symmetric matrix with all positive entries is positive definite.
28.3. Why must $A$ be symmetric for the phrase "positive definite" to be applied? What part of a non-symmetric $M$ does the quadratic form $\mathbf{x}^{\mathsf{T}}M\mathbf{x}$ actually "see"?
28.4. State the three equivalent tests for positive definiteness. Which one is usually fastest to do by hand on a $2\times 2$ matrix, and which is the most numerically robust in code?
28.5. A symmetric matrix has eigenvalues $\{4, 0, 2\}$. Classify it (PD, PSD, indefinite, …). What does the zero eigenvalue mean geometrically for the surface?
28.6. The contour lines (level sets) of a quadratic form are observed to be a family of hyperbolas. What is the definiteness of the matrix? What contour shape would you see instead if it were positive definite?
28.7. Explain in one or two sentences why every covariance matrix is positive semidefinite. What would a negative eigenvalue of a covariance matrix mean, and why is that impossible?
28.8. State the Cholesky factorization and the exact condition under which it exists. Why is "try to Cholesky-factor it" a good practical test for positive definiteness?
28.9. At a critical point of a smooth function, the Hessian is found to be indefinite. Is the point a max, a min, or a saddle? Connect your answer to the surface shape of the corresponding quadratic form.
Tier 2 — ⭐⭐ Hand computation
28.10. Write the symmetric matrix $A$ for each quadratic form: (a) $q = 5x^2 + 4xy + 2y^2$; (b) $q = x^2 - 6xy + 9y^2$; (c) $q = 3x^2 + 8xy - y^2$. (Remember to halve the cross-term coefficient.)
28.11. Classify $A = \begin{psmallmatrix}3 & 1\\ 1 & 3\end{psmallmatrix}$ using all three tests (eigenvalues, pivots, leading principal minors). Confirm they agree, and note that the eigenvalues and pivots are different numbers with the same signs.
28.12. Classify $A = \begin{psmallmatrix}1 & 2\\ 2 & 1\end{psmallmatrix}$ by finding a single vector that makes the form negative, and by computing its eigenvalues. Which definiteness category is it?
28.13. Use Sylvester's criterion to determine whether $A = \begin{psmallmatrix}2 & -1 & 0\\ -1 & 2 & -1\\ 0 & -1 & 2\end{psmallmatrix}$ is positive definite. Compute $\Delta_1, \Delta_2, \Delta_3$ by hand. (This is the $3\times3$ second-difference matrix — a three-spring chain.)
28.14. Find the values of the parameter $c$ for which $A = \begin{psmallmatrix}2 & c\\ c & 8\end{psmallmatrix}$ is positive definite. (Hint: apply Sylvester's criterion and solve the inequalities.)
28.15. For $A = \begin{psmallmatrix}4 & 0\\ 0 & 1\end{psmallmatrix}$, write the equation of the level set $\mathbf{x}^{\mathsf{T}}A\mathbf{x} = 4$ and identify it as an ellipse. Give the half-lengths of its two axes and state which axis is longer and why.
28.16. Compute the Cholesky factor $L$ of $A = \begin{psmallmatrix}9 & 3\\ 3 & 5\end{psmallmatrix}$ by hand using the recurrence $L_{11} = \sqrt{a_{11}}$, $L_{21} = a_{21}/L_{11}$, $L_{22} = \sqrt{a_{22} - L_{21}^2}$. Verify $LL^{\mathsf{T}} = A$.
28.17. The function $f(x,y) = x^2 + xy + y^2 - 3x$ has one critical point. Find it (set the gradient to zero), compute the Hessian, and classify the critical point using a definiteness test.
28.18. A symmetric matrix has pivots $\{3, -2\}$ from elimination (no row swaps). Without computing eigenvalues, state its definiteness and explain how you know.
Tier 3 — ⭐⭐⭐ Proof (A) / Coding (C)
28.19. [proof] Prove that a symmetric matrix $A$ is positive definite if and only if all its eigenvalues are positive. Use the spectral factorization $A = QDQ^{\mathsf{T}}$ and the substitution $\mathbf{y} = Q^{\mathsf{T}}\mathbf{x}$, in the four-part proof style.
28.20. [proof] Prove that for any matrix $B$ (not necessarily square or symmetric), the matrix $A = B^{\mathsf{T}}B$ is symmetric and positive semidefinite. Then prove $A$ is positive definite if and only if $B$ has linearly independent columns. (This is why $A^{\mathsf{T}}A$ in least squares and the covariance matrix are PSD.)
28.21. [proof] Prove that if $A$ is symmetric and $S$ is skew-symmetric ($S^{\mathsf{T}} = -S$), then $\mathbf{x}^{\mathsf{T}}S\mathbf{x} = 0$ for all $\mathbf{x}$, and conclude that $\mathbf{x}^{\mathsf{T}}M\mathbf{x} = \mathbf{x}^{\mathsf{T}}\big(\tfrac12(M+M^{\mathsf{T}})\big)\mathbf{x}$ for any square $M$. (This justifies always symmetrizing.)
28.22. [proof] Prove the pivot–minor identity $d_k = \Delta_k/\Delta_{k-1}$ for a symmetric matrix admitting elimination without row swaps, and use it to deduce Sylvester's criterion from the pivot test. State the conditions you assume.
28.23. [proof] Prove that if $A$ is positive definite then so is $A^{-1}$. (Hint: relate the eigenvalues of $A^{-1}$ to those of $A$, using the spectral theorem.) Then prove that the diagonal entries $a_{ii}$ of a positive definite matrix are all positive (Hint: test the form on a standard basis vector $\mathbf{e}_i$).
28.24. [code] Write classify(A) in numpy that returns one of the strings "positive definite", "positive semidefinite", "negative definite", "negative semidefinite", or "indefinite" by inspecting the sign pattern of np.linalg.eigvalsh(A) (use a small tolerance for "zero"). Test it on $I$, $-I$, $\begin{psmallmatrix}1&0\\0&0\end{psmallmatrix}$, $\begin{psmallmatrix}1&0\\0&-1\end{psmallmatrix}$, and $\begin{psmallmatrix}2&-1\\-1&2\end{psmallmatrix}$, printing each verdict.
28.25. [code] Write is_positive_definite(A, tol=1e-12) that first checks symmetry (np.allclose(A, A.T, atol=tol)) and then attempts np.linalg.cholesky(A) inside a try/except, returning True only if it succeeds. Cross-check it against (np.linalg.eigvalsh(A) > 0).all() on five random symmetric matrices built as $B + B^{\mathsf{T}} + cI$ for varying $c$, and report any disagreements.
28.26. [code] Verify the pivot–minor identity numerically: for a random $4\times4$ symmetric positive definite matrix (build one as $B^{\mathsf{T}}B + I$), compute the leading principal minors $\Delta_1,\dots,\Delta_4$ and the pivots from scipy.linalg.ldl, and confirm $d_k \approx \Delta_k/\Delta_{k-1}$ to within $10^{-8}$.
28.27. [code] Reproduce the chapter's anchor figure (Figure 28.1) for a matrix of your choice that is positive definite, plotting both the 3D surface and the contour ellipses with the eigenvectors overlaid. Then change one entry to make the matrix indefinite and re-plot; confirm the surface becomes a saddle and the contours become hyperbolas.
Tier 4 — ⭐⭐⭐⭐ Application
28.28. [code] Mahalanobis distance. Given a $2\times 2$ covariance matrix $\Sigma = \begin{psmallmatrix}4 & 2\\ 2 & 3\end{psmallmatrix}$ and a mean $\boldsymbol\mu = (0,0)$, the Mahalanobis distance of a point $\mathbf{x}$ is $\sqrt{(\mathbf{x}-\boldsymbol\mu)^{\mathsf{T}}\Sigma^{-1}(\mathbf{x}-\boldsymbol\mu)}$. (a) Confirm $\Sigma$ is positive definite. (b) Compute the Mahalanobis distance of the points $(2, 0)$ and $(0, 2)$ and explain, using the eigenvectors of $\Sigma$, why they differ even though both are Euclidean-distance 2 from the mean. (c) Plot the unit Mahalanobis circle (the set of points at distance 1) and confirm it is an ellipse.
28.29. [code] Gradient descent and conditioning. Minimize $f(\mathbf{x}) = \tfrac12\mathbf{x}^{\mathsf{T}}A\mathbf{x}$ by gradient descent for two positive definite matrices: $A_1 = \begin{psmallmatrix}1 & 0\\ 0 & 1\end{psmallmatrix}$ (a round bowl) and $A_2 = \begin{psmallmatrix}10 & 0\\ 0 & 1\end{psmallmatrix}$ (an elongated bowl). Use a fixed step size and the same start point. Count the iterations each needs to reach the minimum within $10^{-6}$, and relate the difference to the condition number $\lambda_{\max}/\lambda_{\min}$. Explain in two sentences why feature normalization helps.
28.30. [proof + code] Energy and stability of a spring chain. The stiffness matrix of two masses joined by three identical springs (constant $k=1$) is $K = \begin{psmallmatrix}2 & -1\\ -1 & 2\end{psmallmatrix}$. (a) [proof] Prove $K$ is positive definite, and explain why this is exactly the statement that the equilibrium is stable. (b) [code] Compute the eigenvalues (squared natural frequencies) and eigenvectors (normal modes) of $K$, and describe the two modes physically (in-phase vs. out-of-phase oscillation). (c) Add a fourth scenario by changing the middle spring to a negative stiffness and show the matrix becomes indefinite — a runaway instability.
28.31. [open-ended] PCA preview. A dataset of two correlated features has covariance matrix $\Sigma = \begin{psmallmatrix}3 & 1\\ 1 & 3\end{psmallmatrix}$. Argue that $\Sigma$ is positive definite, find its eigenvectors and eigenvalues, and explain — referencing the data ellipse — which direction is the first principal component and how much of the total variance it captures (the fraction $\lambda_1/(\lambda_1+\lambda_2)$). Connect this to the level-set ellipse of Figure 28.1, and forward-link to Chapter 32. No heavy code required, but you may include a snippet that samples points from this covariance and plots the cloud with its principal axes.