Chapter 31 — Further Reading

DataField.Dev

Each entry below is annotated with what it adds and where it maps. The two reference textbooks this book is measured against (continuity §8) are cross-walked first.

Mapping to the Two Reference Texts

Stewart, J. (2021). Calculus: Early Transcendentals (9th ed.), Cengage.
§14.7 — Maximum and Minimum Values. Direct match for Sections 31.2–31.5 of this chapter: critical points, the second-derivative test with discriminant $D$, and absolute extrema on closed regions (Stewart's "closed and bounded set" method). Includes the least-squares-line derivation that anchors our Case Study 2.
§14.8 — Lagrange Multipliers. Direct match for Sections 31.8–31.9: single- and two-constraint optimization, with the production-maximization example mirroring our Section 31.10 utility problem.
Strength: large, well-graded exercise sets. Use for additional drill on the Hessian and Lagrange mechanics.
Strang, G., & Herman, E. (2016). Calculus, Volume 3, OpenStax (free online).
§4.7 — Maxima/Minima Problems. Free counterpart to Stewart §14.7 and our Sections 31.2–31.5; includes the discriminant test and a worked least-squares example.
§4.8 — Lagrange Multipliers. Free counterpart to Stewart §14.8 and our Sections 31.8–31.9, including a two-constraint example.
Strength: openly licensed, with applied problems. The best zero-cost source for extra practice.
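The discriminant test that both texts drill can be sketched in a few lines of Python. This is a minimal illustration, not taken from either text: the function $f(x, y) = x^4 + y^4 - 4xy + 1$ is chosen here only as an example, its second partials are entered by hand, and the helper names `classify_critical_point` and `second_partials` are hypothetical.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Second-derivative test from the second partials at a critical point."""
    D = fxx * fyy - fxy ** 2  # discriminant D = f_xx * f_yy - f_xy^2
    if D > 0:
        # D > 0 forces f_xx != 0, so its sign decides min versus max.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D == 0: the test gives no information


# Illustrative function (an assumption): f(x, y) = x**4 + y**4 - 4*x*y + 1.
# Its gradient (4x^3 - 4y, 4y^3 - 4x) vanishes at (0, 0), (1, 1), (-1, -1).
def second_partials(x, y):
    """Return (f_xx, f_yy, f_xy) for the illustrative function."""
    return 12 * x ** 2, 12 * y ** 2, -4


critical_points = [(0, 0), (1, 1), (-1, -1)]
results = {
    pt: classify_critical_point(*second_partials(*pt)) for pt in critical_points
}
for pt, label in results.items():
    print(pt, label)
```

Working the same three points by hand first, then checking against the script, is a quick way to catch sign errors in $D$.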

Going Deeper — Multivariable Calculus

Marsden, J. E., & Tromba, A. J. (2011). Vector Calculus (6th ed.), W. H. Freeman. A geometry-forward treatment; its derivation of the second-derivative test from the Taylor quadratic form (our Section 31.4 "Why the test works") is especially clear. Read it if you want the picture behind the algebra.
Apostol, T. M. (1969). Calculus, Volume II, Wiley. Rigorous and proof-complete; states the eigenvalue/definiteness version of the test (our Section 31.6) in full generality. For readers who took the Math Major Sidebars seriously.
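The completing-the-square step that links the Taylor quadratic form to the discriminant can be written out as follows, at a critical point where $f_{xx} \neq 0$. The displacement symbols $h$ and $k$ are notation assumed here; the chapter and the texts above may use different letters.

```latex
\begin{aligned}
Q(h,k) &= f_{xx}h^2 + 2f_{xy}hk + f_{yy}k^2 \\
       &= f_{xx}\left(h + \frac{f_{xy}}{f_{xx}}\,k\right)^2 + \frac{D}{f_{xx}}\,k^2,
\qquad D = f_{xx}f_{yy} - f_{xy}^2 .
\end{aligned}
```

If $D > 0$, both terms carry the sign of $f_{xx}$, so $Q$ is positive for every nonzero $(h,k)$ when $f_{xx} > 0$ (local minimum) and negative when $f_{xx} < 0$ (local maximum). If $D < 0$, the two terms have opposite signs and $Q$ changes sign (saddle). In the eigenvalue language, $D$ is the determinant of the Hessian, which equals the product of its two eigenvalues, so $D > 0$ says the eigenvalues share a sign.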

Optimization Theory

Boyd, S., & Vandenberghe, L. (2004). Convex Optimization, Cambridge (free online). The definitive reference for the convexity ideas of Section 31.12. Develops duality and the KKT conditions that generalize Lagrange multipliers to inequality constraints. The single best next step if Section 31.12 intrigued you.
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization (2nd ed.), Springer. The algorithmic companion to Section 31.7: Newton and quasi-Newton methods (BFGS, L-BFGS), and why the cost of forming and factoring the Hessian drives practical method choice. Cited in the scipy.optimize documentation for several of its solvers.
Bertsekas, D. P. (1996). Constrained Optimization and Lagrange Multiplier Methods, Athena Scientific. A deep dive into Lagrange-multiplier theory and its sensitivity/shadow-price interpretation (Section 31.10).
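The Newton iteration that Nocedal and Wright analyze can be sketched in pure Python for two variables. This is a bare illustration under stated assumptions, not their algorithm as published: the test function $f(x, y) = x^4 + y^4 - 4xy + 1$, the starting point, and the names `grad`, `hess`, and `newton_minimize` are all chosen here, and there is no line search or safeguard against an indefinite Hessian.

```python
import math


def grad(x, y):
    """Gradient of the illustrative f(x, y) = x**4 + y**4 - 4*x*y + 1."""
    return 4 * x ** 3 - 4 * y, 4 * y ** 3 - 4 * x


def hess(x, y):
    """Hessian entries (f_xx, f_xy, f_yy) of the same function."""
    return 12 * x ** 2, -4.0, 12 * y ** 2


def newton_minimize(x, y, tol=1e-10, max_iter=50):
    """Plain Newton iteration: solve H p = -g, then step to (x, y) + p."""
    steps = 0
    while steps < max_iter:
        gx, gy = grad(x, y)
        if math.hypot(gx, gy) < tol:
            break  # gradient is numerically zero: stop
        fxx, fxy, fyy = hess(x, y)
        det = fxx * fyy - fxy ** 2
        # 2x2 solve by Cramer's rule; for n variables this becomes an
        # n-by-n linear solve, the cost quasi-Newton methods try to avoid.
        px = (-gx * fyy + gy * fxy) / det
        py = (-gy * fxx + gx * fxy) / det
        x, y = x + px, y + py
        steps += 1
    return x, y, steps


x_star, y_star, steps = newton_minimize(1.5, 1.2)
print(x_star, y_star, steps)
```

From this starting point the iterates approach the local minimum at $(1, 1)$. For real problems, a library routine such as `scipy.optimize.minimize` is the more sensible choice, since it adds the safeguards omitted here.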

Economics — Case Study 1 (Utility Maximization)

Varian, H. R. (2014). Intermediate Microeconomics (9th ed.), Norton. Accessible derivation of the consumer's problem and the equal-marginal-utility-per-dollar condition; the gentlest entry point to the economics behind Case Study 1.
Mas-Colell, A., Whinston, M. D., & Green, J. R. (1995). Microeconomic Theory, Oxford. The graduate standard; treats $\lambda$ as a first-class economic object via the envelope theorem. For the rigorous version of "$\lambda$ = marginal utility of income."
Simon, C. P., & Blume, L. (1994). Mathematics for Economists, Norton. Bridges the calculus and the economics; excellent on second-order conditions for constrained optima — the piece our case study handles by appeal to convexity.
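The two conditions these texts emphasize, equal marginal utility per dollar and $\lambda$ as the marginal utility of income, can be checked numerically for a Cobb–Douglas consumer. The sketch below assumes the utility $U = x^a y^b$ with budget $p_x x + p_y y = m$ and uses the standard closed-form demands; the parameter values and function names are made up for illustration.

```python
def cobb_douglas_demand(a, b, px, py, m):
    """Closed-form maximizer of U = x**a * y**b subject to px*x + py*y = m."""
    x = a / (a + b) * m / px
    y = b / (a + b) * m / py
    return x, y


def utility(x, y, a, b):
    return x ** a * y ** b


# Hypothetical parameter values, chosen only for illustration.
a, b, px, py, m = 0.5, 0.5, 2.0, 4.0, 100.0

x, y = cobb_douglas_demand(a, b, px, py, m)
u = utility(x, y, a, b)

# Marginal utility per dollar for each good; the two agree at the optimum.
mu_x_per_dollar = (a * u / x) / px
mu_y_per_dollar = (b * u / y) / py
lam = mu_x_per_dollar  # the Lagrange multiplier


def indirect_utility(income):
    """Utility achieved at the optimal bundle for a given income."""
    return utility(*cobb_douglas_demand(a, b, px, py, income), a, b)


# Central difference: how fast optimal utility rises with income.
h = 1e-4
dV_dm = (indirect_utility(m + h) - indirect_utility(m - h)) / (2 * h)

print(x, y, lam, dV_dm)
```

The agreement between `lam` and `dV_dm` is the envelope-theorem statement in miniature: the multiplier measures how much the optimal value improves when the constraint is relaxed by one unit.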

Statistics & Machine Learning — Case Study 2 (Least Squares / MLE)

Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning, Springer (free PDF), Ch. 3. Scales the two-parameter least-squares fit of Case Study 2 to many predictors: normal equations in matrix form, projection geometry, and regularization (where the ridge penalty parameter plays the role of a Lagrange multiplier on a constraint on the size of the coefficients, tying back to Section 31.12).
Bishop, C. M. (2006). Pattern Recognition and Machine Learning, Springer, Ch. 1. The cleanest exposition of "least squares = maximum likelihood under Gaussian noise," the equivalence proved in our Case Study 2.
Wasserman, L. (2004). All of Statistics, Springer. A fast, broad tour of MLE across distributions; good for the practice recommendation below.
Casella, G., & Berger, R. L. (2002). Statistical Inference (2nd ed.), Duxbury. The rigorous reference for maximum-likelihood theory, including the constrained-MLE problems (probabilities summing to one) solved with Lagrange multipliers in Section 31.11.
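The equivalence that Bishop presents, least squares as maximum likelihood under Gaussian noise, comes down to one line of algebra. The sketch below assumes the model $y_i = a + b x_i + \varepsilon_i$ with independent $\varepsilon_i \sim N(0, \sigma^2)$; the symbols $a$, $b$, and $\sigma^2$ are chosen here and may differ from the notation in Case Study 2.

```latex
\ell(a, b, \sigma^2)
  = -\frac{n}{2}\log\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - a - b x_i\right)^2 .
```

For any fixed $\sigma^2 > 0$, the first term does not depend on $(a, b)$ and the second is a negative multiple of the sum of squared residuals, so maximizing $\ell$ over $(a, b)$ is the same problem as minimizing $\sum_i (y_i - a - b x_i)^2$.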

Advanced — Calculus of Variations

Gelfand, I. M., & Fomin, S. V. (1963/2000). Calculus of Variations, Dover. Where Lagrange multipliers go infinite-dimensional: optimizing over functions rather than points. The natural sequel to Section 31.8 for physics-minded readers (least-action principles).

Practice Recommendations

Derive the Cobb–Douglas demand functions (Section 31.10, Exercise E2), then vary prices, income, and exponents to feel how the optimum and $\lambda$ respond. This builds the economic intuition behind Case Study 1.
Prove the second-derivative test from the Taylor quadratic form (Exercise F1). The completion-of-the-square computation in Section 31.4 is short and illuminating.
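Compute MLEs for several distributions (Normal, Poisson, Bernoulli, Exponential), and confirm each is a simple function of the sample: the sample mean for the Normal and Poisson means, the sample proportion for the Bernoulli, and the reciprocal of the sample mean for the Exponential rate. This cements the "estimation = optimization" theme of Section 31.11.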
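For the last recommendation, one way to check a hand-derived MLE is to maximize the log-likelihood numerically and compare. The sketch below is one possible setup, not the chapter's method: the data are made up, the helper `maximize_concave` is a hypothetical ternary search that relies on each log-likelihood being concave on the search interval, and the interval endpoints are arbitrary.

```python
import math


def maximize_concave(f, lo, hi, iters=100):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1  # the maximizer lies to the right of m1
        else:
            hi = m2  # the maximizer lies to the left of m2
    return (lo + hi) / 2


# Made-up samples, for illustration only.
counts = [2, 0, 3, 1, 4, 2]        # Poisson-style counts
waits = [0.5, 1.2, 0.3, 2.0, 1.0]  # Exponential-style waiting times
flips = [1, 0, 1, 1, 0, 1, 0, 1]   # Bernoulli outcomes


def poisson_loglik(lam):
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)


def exponential_loglik(rate):
    return sum(math.log(rate) - rate * t for t in waits)


def bernoulli_loglik(p):
    return sum(math.log(p) if x else math.log(1 - p) for x in flips)


poisson_hat = maximize_concave(poisson_loglik, 1e-6, 20.0)
exponential_hat = maximize_concave(exponential_loglik, 1e-6, 20.0)
bernoulli_hat = maximize_concave(bernoulli_loglik, 1e-6, 1 - 1e-6)

# Numerical maximizer next to the closed-form candidate for each model.
print(poisson_hat, sum(counts) / len(counts))
print(exponential_hat, len(waits) / sum(waits))
print(bernoulli_hat, sum(flips) / len(flips))
```

Each printed pair should agree to several decimal places. A mismatch usually points to an algebra slip in the derivative of the log-likelihood rather than to the search.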

Together these exercises build fluency in the two engines of the chapter — the Hessian test and Lagrange multipliers — which are among the most consequential tools calculus offers to science, engineering, economics, and machine learning.