Chapter 28 — Key Takeaways

DataField.Dev

The one idea

A symmetric matrix is positive definite precisely when its quadratic form $\mathbf{x}^{\mathsf{T}}A\mathbf{x}$ is an upward-opening bowl with a unique minimum — and because $A$ is symmetric, the spectral theorem makes that single geometric fact equivalent to "all eigenvalues positive." From this one equivalence flows everything: three agreeing tests, contour ellipses with eigenvector axes, the second-derivative test of optimization, the positive semidefinite covariance matrices of statistics, and the Cholesky factorization. Positive definiteness is where eigenvalues become curvature — a thing you can see, optimize against, and build structures on.

The big ideas, in order

The quadratic form (state symmetry!). $\mathbf{x}^{\mathsf{T}}A\mathbf{x}$ with $A = A^{\mathsf{T}}$ is the simplest nonlinear function a matrix produces; in two variables it is a surface over the plane. Symmetry is mandatory and free: only the symmetric part $\tfrac12(M+M^{\mathsf{T}})$ of any matrix affects the form. Halve the cross-term coefficient when building $A$: the form $3x^2 + 4xy + 5y^2$ has $A = \begin{pmatrix} 3 & 2 \\ 2 & 5 \end{pmatrix}$.
The four definiteness types are four surfaces. Positive definite = bowl ($\mathbf{x}^{\mathsf{T}}A\mathbf{x} > 0$ for all $\mathbf{x}\ne\mathbf{0}$); negative definite = dome; indefinite = saddle; positive semidefinite = flat-bottomed trough (a zero eigenvalue is a flat direction = the null space).
Eigenvalues decide the shape (the spectral picture). Substituting $A = QDQ^{\mathsf{T}}$ and $\mathbf{y} = Q^{\mathsf{T}}\mathbf{x}$ turns the form into $\sum \lambda_i y_i^2$ — a sum of squares weighted by the eigenvalues. The signs of the eigenvalues are the directions of curving: all positive ⇒ bowl ⇒ positive definite.
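Three tests, one truth. Positive definite $\iff$ all eigenvalues $> 0$ $\iff$ all pivots $> 0$ ($A=LDL^{\mathsf{T}}$) $\iff$ all leading principal minors $> 0$ (Sylvester's criterion). The eigenvalue and pivot tests agree because each writes the form as a sum of squares, and by Sylvester's law of inertia congruence preserves the sign pattern (not the values). The minor test agrees with the pivot test because, without row exchanges, the $k$-th leading principal minor is the product of the first $k$ pivots.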
Level sets are ellipses. The contours $\mathbf{x}^{\mathsf{T}}A\mathbf{x} = c$ of a positive definite form are nested ellipses; their axes point along the eigenvectors, with half-lengths $\propto 1/\sqrt{\lambda}$ (steep, large-$\lambda$ direction = short axis). Ellipses ⇒ definite; hyperbolas ⇒ indefinite; parallel lines ⇒ semidefinite.
Optimization is the definiteness test in disguise. Near a critical point, any smooth function is approximated to second order by a constant plus a quadratic form built from the symmetric Hessian; the multivariable second-derivative test is the definiteness classification (PD Hessian = local min, ND Hessian = local max, indefinite = saddle, merely semidefinite = inconclusive). Convex = positive semidefinite Hessian everywhere = no bad local minima.
Covariance matrices are positive semidefinite. $\Sigma = \tfrac1N B^{\mathsf{T}}B$ gives $\mathbf{w}^{\mathsf{T}}\Sigma\mathbf{w} = \tfrac1N\lVert B\mathbf{w}\rVert^2 \ge 0$ — the variance of projected data, never negative. A zero eigenvalue = a direction of zero variance = a redundant feature.
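Cholesky $A = LL^{\mathsf{T}}$. For symmetric $A$, it exists with positive-diagonal $L$ if and only if $A$ is positive definite, so a symmetry check followed by a successful np.linalg.cholesky is the fast, robust definiteness test, cheaper than eigenvalues. (np.linalg.cholesky does not verify symmetry itself; it reads only the lower triangle.)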
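
The agreement of the three tests, and the Cholesky check, can be compared numerically. The following is a minimal sketch, assuming NumPy is available. The matrix `A` is an illustrative choice rather than one taken from the chapter, and recovering pivots as ratios of consecutive leading minors assumes no leading minor is zero.

```python
import numpy as np

# Illustrative symmetric matrix (an assumption, not from the chapter):
# the form 2x^2 - 2xy + 2y^2, with the cross-term coefficient -2 halved to -1.
A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])

# Test 1: eigenvalues (eigvalsh assumes the input is symmetric).
eigenvalues = np.linalg.eigvalsh(A)

# Test 2: leading principal minors (Sylvester's criterion).
minors = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

# Test 3: pivots, recovered as ratios of consecutive leading minors.
# This shortcut is only valid when every leading minor is nonzero.
pivots = [minors[0]] + [minors[k] / minors[k - 1] for k in range(1, len(minors))]

# Cholesky: np.linalg.cholesky raises LinAlgError when the factorization fails.
try:
    L = np.linalg.cholesky(A)
    cholesky_ok = True
except np.linalg.LinAlgError:
    L = None
    cholesky_ok = False

print("eigenvalues:", eigenvalues)
print("leading minors:", minors)
print("pivots:", pivots)
print("Cholesky succeeded:", cholesky_ok)
```

For this matrix every test reports positive values (eigenvalues near 1 and 3, minors near 2 and 3, pivots near 2 and 1.5) and the factorization succeeds. The three lists differ in value but not in sign, which is the point of Sylvester's law of inertia.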

Skills you gained

Read a quadratic form as a surface and classify a symmetric matrix as PD / PSD / negative definite / indefinite.
Run all three definiteness tests by hand and know why they must agree.
Sketch the contour ellipses of a form: orientation from eigenvectors, aspect ratio from eigenvalues.
Classify the critical points of a multivariable function via the Hessian's definiteness.
Explain why covariance matrices and $A^{\mathsf{T}}A$ are positive semidefinite, and what a zero eigenvalue means for data.
Compute and use the Cholesky factorization, and test positive definiteness in code robustly (is_positive_definite).
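
The classification skills listed above can be practised in code. The sketch below classifies a symmetric matrix by the signs of its eigenvalues and applies it to the Hessians of three simple functions at the origin. The function name `classify_definiteness`, the absolute tolerance `tol`, and the example functions are all assumptions made for illustration; an absolute tolerance is scale-dependent and may need adjusting for matrices with very large or very small entries.

```python
import numpy as np

def classify_definiteness(A, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues.

    Hypothetical helper for illustration; `tol` treats tiny eigenvalues as zero.
    """
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        raise ValueError("matrix must be symmetric")
    lam = np.linalg.eigvalsh(A)
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam < -tol):
        return "negative definite"
    if np.any(lam > tol) and np.any(lam < -tol):
        return "indefinite"
    if np.all(lam >= -tol):
        return "positive semidefinite"  # the zero matrix also lands here
    return "negative semidefinite"

# Hessians at the critical point (0, 0) of three example functions.
hessian_bowl = np.array([[2.0, 1.0], [1.0, 2.0]])     # f = x^2 + xy + y^2
hessian_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])  # f = x^2 - y^2
hessian_trough = np.array([[2.0, 0.0], [0.0, 0.0]])   # f = x^2

print(classify_definiteness(hessian_bowl))    # local minimum
print(classify_definiteness(hessian_saddle))  # saddle
print(classify_definiteness(hessian_trough))  # flat direction along y
```

The first Hessian is positive definite, the second indefinite, and the third positive semidefinite, matching the bowl, saddle, and trough pictures.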

Terms to know

quadratic form · positive definite / semidefinite · negative definite / semidefinite · indefinite · definiteness test · eigenvalue test · pivot · leading principal minor · Sylvester's criterion · Sylvester's law of inertia · congruence · Rayleigh quotient · level set · Hessian matrix · second-derivative test · convexity · condition number · covariance matrix · Mahalanobis distance · Cholesky factorization · energy / stiffness matrix

How this connects to the book's themes

Eigenvalues reveal what a matrix really does. Here "what it does" is curve space — the eigenvalues are the curvatures of the surface along its principal axes, and their signs are the shape. This is the chapter where the abstract spectrum of Part V becomes the tangible geometry of a bowl.
Geometry and algebra are two views of one object. "Upward bowl" (geometry) and "all eigenvalues positive" / "all pivots positive" / "all leading minors positive" (three pieces of algebra) are the same statement, fused by the spectral theorem and Sylvester's law of inertia.
Linear algebra is the most applied branch of pure mathematics. One idea — positive definiteness — is simultaneously the second-derivative test in optimization, the stability of an equilibrium in physics, the well-posedness of a covariance in statistics, and the existence of a Cholesky factor in numerical computing.
Toolkit contribution. You added toolkit/positive_definite.py with is_positive_definite(A) (symmetry check + from-scratch Cholesky attempt), verified against np.linalg.cholesky and np.linalg.eigvalsh — joining lu.py from Chapter 10 and feeding pca.py in Chapter 32.
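
For reference, a function of the kind described (a symmetry check followed by a from-scratch Cholesky attempt) might look roughly like the sketch below. This is not the book's toolkit/positive_definite.py, whose contents are not reproduced here; the tolerance `tol` and the choice to return False for non-square or non-symmetric input are assumptions.

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    """Return True if A is symmetric and a Cholesky factorization succeeds.

    Illustrative sketch only; `tol` is an assumed threshold below which a
    pivot is treated as non-positive.
    """
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        return False
    if not np.allclose(A, A.T):
        return False  # Cholesky alone would not notice asymmetry
    n = A.shape[0]
    L = np.zeros_like(A)
    for j in range(n):
        # Diagonal entry: what remains of A[j, j] after earlier columns.
        d = A[j, j] - np.dot(L[j, :j], L[j, :j])
        if d <= tol:
            return False  # non-positive pivot, so A is not positive definite
        L[j, j] = np.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - np.dot(L[i, :j], L[j, :j])) / L[j, j]
    return True

# Cross-check against NumPy on a positive definite example.
A = np.array([[2.0, -1.0], [-1.0, 2.0]])
print(is_positive_definite(A), np.all(np.linalg.eigvalsh(A) > 0))
```

Stopping at the first non-positive pivot is what makes the Cholesky attempt cheap: no eigenvalues are ever computed.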

Where this leads (forward references)

This chapter is the bridge from the spectral theorem to the back half of the book:

Chapter 30 (The Singular Value Decomposition). Where a positive definite symmetric matrix factors as $A = QDQ^{\mathsf{T}}$ with positive eigenvalues, every matrix factors as $A = U\Sigma V^{\mathsf{T}}$ with non-negative singular values — and those singular values are exactly the square roots of the eigenvalues of the positive semidefinite matrix $A^{\mathsf{T}}A$ from §28.7. The "square root of a matrix" intuition of the Cholesky factor previews the SVD's rotate–stretch–rotate. Positive (semi)definiteness is the doorway to the SVD.
Chapter 32 (Principal Component Analysis). PCA is the eigen-decomposition of the positive semidefinite covariance matrix of §28.7: the principal components are its eigenvectors, the variances are its (non-negative) eigenvalues, and the data ellipsoid is the contour ellipse of Figure 28.1. The zero-eigenvalue "flat directions" of §28.3.1 are the redundant dimensions PCA discards. Everything PCA does is this chapter applied to data.
Chapter 33 (Machine Learning). The loss-bowl and condition-number story of Case Study 28.2 — positive definite Hessians, convexity, conditioning — is the geometric foundation of how models train.
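
The links to Chapters 30 and 32 can be previewed numerically. The sketch below builds a covariance matrix from synthetic data with one deliberately redundant feature, then checks that its eigenvalues are non-negative, that one of them is (numerically) zero, and that they equal the squared singular values of the centered data matrix divided by $N$. The data, the seed, and the direction `w` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed; synthetic data only
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] + X[:, 1]      # third feature is redundant by construction

B = X - X.mean(axis=0)           # centered data matrix
N = B.shape[0]
Sigma = (B.T @ B) / N            # covariance as (1/N) B^T B

eigenvalues = np.linalg.eigvalsh(Sigma)               # ascending order
singular_values = np.linalg.svd(B, compute_uv=False)  # descending order

# Variance of the data projected onto a direction w equals w^T Sigma w.
w = np.array([1.0, -2.0, 0.5])
projected_variance = np.mean((B @ w) ** 2)

print("eigenvalues of Sigma:", eigenvalues)
print("singular values squared / N:", np.sort(singular_values ** 2 / N))
print("w^T Sigma w:", w @ Sigma @ w, "projected variance:", projected_variance)
```

The smallest eigenvalue is zero up to rounding error, which is the flat direction created by the redundant feature. In floating point it may come out as a tiny negative number, so comparisons against zero should use a tolerance.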

The marble you rolled into a bowl at the start of this chapter is rolling, it turns out, straight toward the heart of modern data science.