Part VI — Multivariable Calculus
"In nature there are neither rewards nor punishments — there are consequences." — Robert Green Ingersoll
The world is not one-dimensional. Temperature varies across a room. Wind blows in three dimensions. The cost of a manufactured product depends on dozens of inputs. The output of a neural network depends on millions of parameters. Calculus must generalize to this richer reality. That is the project of Part VI.
The leap from single-variable calculus to multivariable calculus is the largest conceptual jump in the book. New objects appear: surfaces in 3D, vector fields, partial derivatives, gradients, multiple integrals. New ideas appear too. A derivative is best understood as a linear approximation (a fact that was already true in single-variable calculus, but one that becomes visible only in the multivariable setting). The gradient points in the direction of steepest ascent. Optimization in several variables must account for "saddle points" as well as peaks and valleys. Changing coordinates rescales integrals by a "Jacobian" factor.
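As a preview of the linear-approximation idea, the two statements can be set side by side. This is only a sketch: the increments $h$ and $\mathbf{h}$ are notation introduced here for illustration, and both approximations assume $f$ is differentiable at the base point.

$$f(x + h) \approx f(x) + f'(x)\,h$$

$$f(\mathbf{x} + \mathbf{h}) \approx f(\mathbf{x}) + \nabla f(\mathbf{x}) \cdot \mathbf{h}$$

In one variable the linear part is multiplication by the number $f'(x)$. In several variables it is a dot product with the gradient, which is why the "derivative as linear map" viewpoint is hard to miss there.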
This is also where calculus meets modern machine learning, modern physics, and modern engineering most directly. The gradient descent algorithm that trains nearly every neural network in use today is multivariable calculus. Maxwell's equations of electromagnetism, the Schrödinger equation of quantum mechanics, the Navier-Stokes equations of fluid flow, the heat equation, the wave equation: all are written in the language of multivariable calculus.
What This Part Covers
- Chapter 28: Vector-Valued Functions and Space Curves. Functions $\mathbf{r}(t)$ that trace curves in 3D. Velocity, acceleration, arc length, curvature. The bridge from single-variable to multivariable.
- Chapter 29: Functions of Several Variables. Functions $f(x, y)$ and $f(x, y, z)$. Graphs (surfaces), level curves (contour maps), partial derivatives, tangent planes.
- Chapter 30: The Multivariable Chain Rule, Gradient, and Directional Derivatives. The gradient $\nabla f$ (the direction of steepest ascent). Directional derivatives. Full development of the gradient-descent anchor. Gradient descent for training neural networks.
- Chapter 31: Optimization in Several Variables. Critical points, the second derivative test for surfaces, Lagrange multipliers for constrained optimization. Applications: utility maximization, optimal design, maximum likelihood estimation.
- Chapter 32: Multiple Integrals. Double and triple integrals. Iterated integrals via Fubini's theorem. Polar, cylindrical, and spherical coordinates. Climax of the area-under-the-normal-curve anchor: we finally compute $\int_{-\infty}^\infty e^{-x^2}\,dx = \sqrt{\pi}$.
- Chapter 33: Change of Variables and Jacobians. The general substitution rule for multiple integrals. Why polar/cylindrical/spherical work. Probability density transformations.
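To give a sense of where the Gaussian integral is headed, here is a sketch of the standard polar-coordinate argument. The symbol $I$ is introduced here for illustration, and the careful justification of each step (iterated integrals, the change of variables) is left to the chapters themselves. Write $I = \int_{-\infty}^{\infty} e^{-x^2}\,dx$. Then

$$I^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2 + y^2)}\,dx\,dy = \int_0^{2\pi}\int_0^{\infty} e^{-r^2}\,r\,dr\,d\theta = 2\pi \cdot \frac{1}{2} = \pi .$$

Since $I > 0$, this gives $I = \sqrt{\pi}$. The extra factor $r$ in the polar integral is the Jacobian of the coordinate change, and it is exactly what makes the inner integral elementary.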
What You Should Be Able to Do by the End of Part VI
- Visualize a function $f(x, y)$ as a surface and as a contour plot
- Compute partial derivatives and gradients fluently
- Apply the multivariable chain rule
- Find critical points and classify them (max, min, saddle)
- Solve constrained optimization problems via Lagrange multipliers
- Set up and evaluate double and triple integrals in Cartesian, polar, cylindrical, and spherical coordinates
- Implement gradient descent in Python to minimize a function
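As a rough preview of the last item, the following is a minimal sketch of gradient descent in plain Python. It is not the book's implementation: the function name `gradient_descent`, the test function $f(x, y) = (x - 1)^2 + 2(y + 3)^2$, and the step size and iteration count are all assumptions chosen for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=200):
    """Repeatedly step opposite the gradient, starting from `start`."""
    point = list(start)
    for _ in range(steps):
        g = grad(point)
        # Move each coordinate a small distance against its partial derivative.
        point = [p - learning_rate * gi for p, gi in zip(point, g)]
    return point


def grad_f(point):
    """Gradient of the illustrative function f(x, y) = (x - 1)^2 + 2 (y + 3)^2."""
    x, y = point
    return [2.0 * (x - 1.0), 4.0 * (y + 3.0)]


# The minimum of f is at (1, -3); the iterates should approach it.
minimum = gradient_descent(grad_f, start=[0.0, 0.0])
print(minimum)
```

A fixed learning rate works here because the example function is a simple bowl. On harder surfaces the step size usually needs tuning, and too large a value can make the iterates diverge.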
Why This Part Matters
Single-variable calculus is the version most students remember and most textbooks emphasize. But multivariable calculus is the version that matters for almost every modern application. Gradient descent — the engine of deep learning — is multivariable calculus, not single-variable. The integral that defines a probability distribution over $\mathbb{R}^n$ is multivariable. The equations of physics that describe the universe at every scale are multivariable.
The conceptual leap is real, but the payoff is enormous. By the end of Part VI you will understand why training a neural network works (gradient descent on a high-dimensional loss surface), how the normal distribution gets its $\sqrt{2\pi}$ factor (a polar-coordinate trick), and what it means for a function of three variables to have a saddle point (a critical point where the function increases in one direction and decreases in another).
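The saddle-point idea can be checked numerically on a small example. The sketch below uses a function of two variables for simplicity, $f(x, y) = x^2 - y^2$, which is an illustrative choice and not an example taken from the chapters. The helper `second_derivative_test` is a hypothetical name for the usual two-variable test based on $D = f_{xx} f_{yy} - f_{xy}^2$.

```python
def f(x, y):
    """Illustrative surface with a critical point at the origin."""
    return x**2 - y**2


def second_derivative_test(fxx, fyy, fxy):
    """Classify a critical point of f(x, y) from its second partials there."""
    d = fxx * fyy - fxy**2
    if d < 0:
        return "saddle"
    if d > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    return "inconclusive"


step = 0.1
# Moving away from the origin along x raises f; moving along y lowers it.
along_x = f(step, 0.0) - f(0.0, 0.0)
along_y = f(0.0, step) - f(0.0, 0.0)

# Second partials of x^2 - y^2 are constant: f_xx = 2, f_yy = -2, f_xy = 0.
kind = second_derivative_test(fxx=2.0, fyy=-2.0, fxy=0.0)
print(along_x, along_y, kind)
```

The opposite signs of the two changes are the "up in one direction, down in another" behavior, and the negative value of $D$ is how the second derivative test detects it.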
Part VI is also where the book gets visual in a new way. Surfaces in 3D, gradient fields, contour plots, saddle points: these are inherently graphical objects, and the chapters draw heavily on matplotlib's 3D capabilities. The pictures here are not decoration. They are the content.