Prerequisites
- Chapter 9: Applications of Derivatives
Learning Objectives
- Translate an English word problem into an objective function and a constraint.
- Use a constraint to reduce a multivariable objective to a single-variable function.
- Identify the correct domain and find critical points of the objective.
- Apply the closed-interval method and the derivative tests to classify and confirm a global extremum.
- Solve classical optimization problems: maximum area, minimum material, shortest path, economic order quantity, and profit maximization.
- Recognize that the same optimization machinery governs geometry, economics, physics, and biology.
In This Chapter
- 10.1 The Best of All Possible Values
- 10.2 The Anatomy of an Optimization Problem
- 10.3 The Standard Procedure
- 10.4 The Closed-Interval Method
- 10.5 Maximizing Area with a Fixed Perimeter
- 10.6 Minimizing Material: The Can Problem
- 10.7 Shortest Distance: Closest Point on a Curve
- 10.8 Shortest Time: The Lifeguard and Snell's Law
- 10.9 Maximum Volume: The Open Box
- 10.10 Economic Optimization I: Maximizing Profit
- 10.11 Economic Optimization II: The Economic Order Quantity
- 10.12 Biological Optimization: Optimal Foraging
- 10.13 A Cautionary Example: Max Versus Min
- 10.14 An Engineering Optimum: Maximum Power Transfer
- 10.15 Verifying a Global Extremum
- 10.16 Looking Ahead to Several Variables
- 10.17 Computation: Verifying an Optimum with Python
- Looking Ahead
- Reflection
Chapter 10 — Optimization: Finding the Best Answer
10.1 The Best of All Possible Values
Every previous chapter taught you to describe a function — its slope, its concavity, its behavior near a point. This chapter teaches you to use those descriptions to answer the question every quantitative field eventually asks: what is the best we can do?
A farmer wants the largest pen the fencing allows. A canning company wants the least aluminum that still holds the soup. A lifeguard wants the fastest route to a drowning swimmer. A firm wants the production level that earns the most profit. A foraging bird wants the feeding pattern that yields the most calories per hour. These are wildly different scenarios, yet calculus answers all of them with one idea: at a maximum or minimum, the rate of change is zero. The peak of a hill is flat; the bottom of a valley is flat. Find where the derivative vanishes, and you have found a candidate for the best.
This connects directly to the first theme of the book — calculus is the mathematics of change — but read in reverse. Usually we ask how a quantity changes. In optimization we ask where it stops changing, because that is exactly where extreme values live.
The Key Insight. The calculus in an optimization problem is mechanical: find the critical points, classify them, check the endpoints. The genuine difficulty — and the entire skill of this chapter — is the setup: turning an English sentence into one function of one variable. Master the translation, and the derivative work takes thirty seconds.
This chapter is single-variable optimization: one quantity to optimize, one free variable after the constraint is used. When a problem has two genuinely independent variables and a constraint linking them, the elegant tool is the method of Lagrange multipliers, which we develop in Chapter 31. Everything here is the foundation that method generalizes.
10.2 The Anatomy of an Optimization Problem
Every optimization problem has the same three parts, and naming them is half the battle.
- The objective function is the quantity you want to make as large or as small as possible — area, cost, time, profit. This is what you will differentiate.
- The constraint is an equation that ties your variables together — the fixed perimeter, the fixed volume, the geometry of the region. The constraint is what lets you eliminate variables.
- The feasible domain is the set of input values that actually make physical sense — a length cannot be negative, a cut cannot exceed half the sheet. The domain determines which endpoints you must check.
Confuse the objective with the constraint and you will optimize the wrong thing. A reliable tell: the constraint is the quantity that is fixed ("100 ft of fencing," "holds 350 mL"); the objective is the quantity you are free to push ("maximize area," "minimize material").
Geometric Intuition. Picture the objective as a landscape and the constraint as a road painted across it. You are not free to wander the whole landscape — you must stay on the road. Optimization asks for the highest (or lowest) point on the road. In single-variable problems we use the constraint to walk the road itself, parameterizing it by one variable, which flattens the landscape into an ordinary curve $y = f(x)$ whose peaks and valleys we already know how to find.
10.3 The Standard Procedure
Here is the recipe. Follow it in order and optimization problems stop being mysterious.
1. Draw a picture and name the variables. Almost every geometry problem becomes obvious once it is drawn and labeled.
2. Write the objective as a formula, possibly in several variables for now.
3. Write the constraint as an equation relating those variables.
4. Eliminate variables. Solve the constraint for one variable and substitute, until the objective is a function of a single variable.
5. Determine the feasible domain of that single variable.
6. Find the critical points: where the derivative is zero or undefined inside the domain.
7. Classify and confirm. Use the first- or second-derivative test, and, when the domain is a closed interval, evaluate the endpoints. The largest value found is the global maximum; the smallest is the global minimum.
8. Translate back and sanity-check. Report the original quantity, with units, and ask whether the number is plausible.
Steps 1–4 are the setup; steps 5–8 are the calculus and the bookkeeping. The closed-interval method in step 7 deserves its own emphasis, so we turn to it next.
10.4 The Closed-Interval Method
When the objective is continuous on a closed interval $[a, b]$, the Extreme Value Theorem (Chapter 9) guarantees that a global maximum and a global minimum both exist — and it tells you exactly where to look. An extreme value can occur only at a critical point inside the interval or at an endpoint. There are no other possibilities.
So the method is a short, finite checklist:
- Find every critical point of $f$ in $(a, b)$.
- Evaluate $f$ at each critical point and at both endpoints $a$ and $b$.
- The largest of these numbers is the global maximum; the smallest is the global minimum.
No second-derivative test is even required when the domain is closed and bounded: you simply compare a finite list of values. This is the single most reliable optimization technique in Calculus I, because it cannot accidentally report a local extremum as a global one.
Common Pitfall. Many students find the critical point, confirm it is a local maximum with the second-derivative test, and stop — never checking the endpoints. On a closed interval that is a real gap. Consider maximizing $f(x) = x^3 - 3x$ on $[0, 3]$. The only interior critical point is $x = 1$, where $f(1) = -2$ is a local minimum. The maximum on $[0,3]$ lives at the endpoint $x = 3$, where $f(3) = 18$. Skip the endpoints and you would miss the answer entirely. Always evaluate $a$ and $b$.
Check Your Understanding. Find the global maximum and minimum of $f(x) = x^3 - 3x$ on the closed interval $[-2, 2]$.
Answer
$f'(x) = 3x^2 - 3 = 3(x-1)(x+1)$, so the critical points are $x = \pm 1$, both inside $[-2,2]$. Now evaluate the four candidates: $f(-2) = -8 + 6 = -2$, $f(-1) = -1 + 3 = 2$, $f(1) = 1 - 3 = -2$, $f(2) = 8 - 6 = 2$. The global maximum is $2$ (achieved at both $x = -1$ and $x = 2$); the global minimum is $-2$ (at both $x = -2$ and $x = 1$). Notice the extremes are tied between an interior critical point and an endpoint — exactly why you must check all four.
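The checklist translates almost line for line into code. The sketch below is a minimal illustration rather than a general-purpose routine: the helper name `closed_interval_extrema` is hypothetical, and the critical points are supplied by hand from $f'(x) = 3(x-1)(x+1)$ instead of being found automatically. It reruns both the pitfall example on $[0, 3]$ and the exercise on $[-2, 2]$.

```python
def closed_interval_extrema(f, a, b, critical_points):
    """Return ((x_min, f_min), (x_max, f_max)) for f on [a, b].

    Assumes f is continuous on [a, b] and that critical_points lists
    every point where f' is zero or undefined (found by hand).
    """
    # Candidates: both endpoints plus the critical points strictly inside.
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    values = [(x, f(x)) for x in candidates]
    smallest = min(values, key=lambda pair: pair[1])
    largest = max(values, key=lambda pair: pair[1])
    return smallest, largest


def f(x):
    return x**3 - 3*x


crit = [-1, 1]  # roots of f'(x) = 3x^2 - 3

pitfall_min, pitfall_max = closed_interval_extrema(f, 0, 3, crit)
check_min, check_max = closed_interval_extrema(f, -2, 2, crit)

print("on [0, 3]:  min", pitfall_min, " max", pitfall_max)
print("on [-2, 2]: min value", check_min[1], " max value", check_max[1])
```

When two candidates tie, as they do on $[-2, 2]$, `min` and `max` report only the first one they meet, so treat the returned values as reliable and the returned locations as one of possibly several.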
Why a Maximum Forces $f'=0$: Three Levels
The whole chapter rests on one fact — at an interior extremum the derivative is zero — so it is worth seeing at the book's three levels of rigor.
Intuitive. Stand at the top of a smooth hill. Whichever way you face, the ground in front of you is level for an instant; if it sloped up you could climb higher, and if it sloped down you were never at the top. "Level" means zero slope, and zero slope means $f'(x) = 0$. The same picture, flipped, describes the bottom of a valley.
Computational. This gives the working rule you apply in every problem: the candidates for a maximum or minimum are the points where $f'(x) = 0$, the points where $f'(x)$ fails to exist, and (on a closed interval) the endpoints. Collect that finite list, evaluate $f$ at each, and compare. You never have to search the whole interval — calculus has narrowed an infinite hunt to a handful of points.
Formal (Fermat's theorem). Suppose $f$ has a local maximum at an interior point $c$ and $f'(c)$ exists. For small $h > 0$, $f(c+h) \le f(c)$, so the right-hand difference quotient satisfies $\frac{f(c+h)-f(c)}{h} \le 0$; taking the limit gives $f'(c) \le 0$. For small $h < 0$, the same inequality $f(c+h) \le f(c)$ flips the sign of the quotient (we divide by a negative $h$), giving $\frac{f(c+h)-f(c)}{h} \ge 0$ and hence $f'(c) \ge 0$. The only number that is both $\le 0$ and $\ge 0$ is $0$, so $f'(c) = 0$. This is Fermat's theorem (Chapter 9), and notice what it does not claim: it says nothing about endpoints (where only a one-sided quotient exists) or about points where $f'$ is undefined. Those must be checked separately — which is exactly why the closed-interval checklist has three kinds of candidate, not one.
10.5 Maximizing Area with a Fixed Perimeter
Our first full problem is the farmer's pen — the archetype of "fixed resource, maximize size."
Problem. A farmer has $100$ ft of fencing and builds a rectangular pen against a long straight wall, so no fence is needed on the wall side. What dimensions enclose the greatest area?
Setup. Let $x$ be the length of each side perpendicular to the wall and $y$ the side parallel to the wall (the side opposite the wall). Three sides are fenced, so the constraint is
$$2x + y = 100.$$
The objective is the enclosed area
$$A = xy.$$
Eliminate. Solve the constraint for $y = 100 - 2x$ and substitute:
$$A(x) = x(100 - 2x) = 100x - 2x^2.$$
Domain. We need $x \ge 0$ and $y = 100 - 2x \ge 0$, so $x \in [0, 50]$ — a closed interval, perfect for the closed-interval method.
Critical points. $A'(x) = 100 - 4x = 0 \implies x = 25$.
Classify and confirm. Evaluate the three candidates: $A(0) = 0$, $A(50) = 0$ (both degenerate — no pen at all), and $A(25) = 25 \cdot 50 = 1250$. The interior critical point wins. (As a check, $A''(x) = -4 < 0$ everywhere, so the curve is concave down and $x = 25$ is a maximum.) The optimal pen is $25 \text{ ft} \times 50 \text{ ft}$, enclosing $\boxed{1250 \text{ ft}^2}$.
Notice the elegant result: the side parallel to the wall ($y = 50$) is exactly twice each perpendicular side ($x = 25$). The optimal three-sided pen uses half its fence on the long side and half split between the two short sides. That kind of clean structural fact is common in optimization and worth pausing on — it is the algebra revealing a hidden symmetry.
Geometric Intuition. The area function $A(x) = 100x - 2x^2$ is a downward parabola pinned to zero at $x = 0$ and $x = 50$. By symmetry its vertex sits exactly halfway between the roots, at $x = 25$. You could have found the answer with no calculus at all — but the derivative gives the same vertex instantly and, unlike the parabola trick, generalizes to objectives that are not quadratic.
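As a sanity check that uses no calculus at all, one can simply sample the feasible domain and pick the best sample. The sketch below assumes a half-foot grid (an arbitrary resolution chosen for illustration); the names `FENCE`, `pen_area`, and `grid` are hypothetical.

```python
FENCE = 100.0  # feet of fencing, from the problem statement


def pen_area(x):
    """Area of the three-sided pen when each perpendicular side is x feet."""
    y = FENCE - 2 * x  # side parallel to the wall, from 2x + y = 100
    return x * y


# Sample the feasible domain [0, 50] every half foot (an arbitrary resolution).
grid = [0.5 * k for k in range(101)]
best_x = max(grid, key=pen_area)

print(best_x, FENCE - 2 * best_x, pen_area(best_x))  # 25.0 50.0 1250.0
```

A grid scan can only locate the optimum to within the grid spacing. It agrees exactly here because the true maximizer happens to sit on a grid point; the derivative gives the answer without that luck.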
10.6 Minimizing Material: The Can Problem
The mirror image of "maximize size for fixed resource" is "minimize resource for fixed size." This is the canning industry's daily question.
Problem. A cylindrical can must hold $350 \text{ cm}^3$ (roughly the volume of a standard soda can). What radius and height minimize the aluminum used, that is, minimize the total surface area?
Setup. Let $r$ be the radius and $h$ the height. The constraint fixes the volume:
$$\pi r^2 h = 350.$$
The objective is the surface area of the closed cylinder — two circular ends plus the wrapped side:
$$A = \underbrace{2\pi r^2}_{\text{top + bottom}} + \underbrace{2\pi r h}_{\text{side}}.$$
Eliminate. Solve the constraint for $h = \dfrac{350}{\pi r^2}$ and substitute:
$$A(r) = 2\pi r^2 + 2\pi r \cdot \frac{350}{\pi r^2} = 2\pi r^2 + \frac{700}{r}.$$
Domain. Here $r > 0$ with no upper bound, an open interval. So instead of checking endpoints we examine the behavior as $r \to 0^+$ (the $700/r$ term blows up to $+\infty$) and as $r \to \infty$ (the $2\pi r^2$ term blows up to $+\infty$). The surface area is huge at both ends, so a global minimum exists somewhere in the interior and must occur at a critical point; if there is only one critical point, it is the global minimum.
Critical points.
$$A'(r) = 4\pi r - \frac{700}{r^2} = 0 \implies 4\pi r^3 = 700 \implies r^3 = \frac{175}{\pi} \implies r = \left(\frac{175}{\pi}\right)^{1/3} \approx 3.82 \text{ cm}.$$
Confirm. $A''(r) = 4\pi + \dfrac{1400}{r^3} > 0$ for all $r > 0$, so $A$ is concave up everywhere and the critical point is a minimum — consistent with the boundary analysis.
The payoff. The corresponding height is
$$h = \frac{350}{\pi r^2}.$$
Use the critical-point relation $\pi r^3 = 175$, so $\pi r^2 = 175/r$, giving
$$h = \frac{350}{175/r} = 2r.$$
The material-minimizing can has height equal to its diameter, so its side profile is a square: a clean optimum, derived from a single derivative.
Real-World Application — Why real cans are taller (manufacturing economics). Actual soda cans are noticeably taller than $h = 2r$. The pure-material optimum ignores everything except aluminum area: it does not account for the thicker, more expensive top lid versus the thin wall, the seams, the cost of shipping a squat shape, or how a hand grips the can. When engineers add those real costs to the objective, the optimum shifts toward a taller cylinder. The lesson is not that the calculus is wrong — it is that the objective function must capture every cost that matters. Garbage in, optimum out.
Check Your Understanding. Suppose the can is open-topped (a cup): only one circular end plus the side. With the same volume constraint $\pi r^2 h = 350$, set up the surface-area objective $A(r)$ and find the critical-point equation. (You need not finish the arithmetic.)
Answer
Now $A = \pi r^2 + 2\pi r h$. Substituting $h = 350/(\pi r^2)$ gives $A(r) = \pi r^2 + 700/r$. Then $A'(r) = 2\pi r - 700/r^2 = 0 \implies r^3 = 350/\pi$, so $r = (350/\pi)^{1/3} \approx 4.81$ cm. Dropping the top lid lets the can be wider. Working through to $h$ gives $h = 350/(\pi r^2) = \pi r^3/(\pi r^2) = r$ for the open can: height equals radius, not diameter.
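The open-can answer is easy to check numerically. The sketch below is a minimal illustration with hypothetical names (`cup_area`, `cup_area_prime`); it evaluates the hand-computed derivative at the claimed critical radius and confirms that the height there equals the radius.

```python
import math

VOLUME = 350.0  # cm^3, from the problem statement


def cup_area(r):
    """Surface area of an open-topped can: one circular end plus the side."""
    h = VOLUME / (math.pi * r**2)
    return math.pi * r**2 + 2 * math.pi * r * h


def cup_area_prime(r):
    """Hand-computed derivative A'(r) = 2*pi*r - 700/r^2."""
    return 2 * math.pi * r - 2 * VOLUME / r**2


r_star = (VOLUME / math.pi) ** (1 / 3)   # solves A'(r) = 0
h_star = VOLUME / (math.pi * r_star**2)  # height from the volume constraint

print("r* =", round(r_star, 4), " h* =", round(h_star, 4))
print("A'(r*) =", cup_area_prime(r_star))  # zero up to rounding error
```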
10.7 Shortest Distance: Closest Point on a Curve
Distance problems introduce a powerful trick: optimizing distance is the same as optimizing squared distance, and the square is far easier to differentiate.
Problem. Find the point on the parabola $y = x^2$ that is closest to the point $(0, 5)$.
Setup. A point on the parabola is $(x, x^2)$. Its distance to $(0,5)$ is
$$D(x) = \sqrt{(x - 0)^2 + (x^2 - 5)^2} = \sqrt{x^2 + (x^2 - 5)^2}.$$
Simplify. Because $\sqrt{\;\cdot\;}$ is an increasing function, $D$ and $D^2$ are minimized at the same $x$. So minimize the cleaner objective
$$f(x) = D^2 = x^2 + (x^2 - 5)^2 = x^2 + x^4 - 10x^2 + 25 = x^4 - 9x^2 + 25.$$
Critical points.
$$f'(x) = 4x^3 - 18x = 2x(2x^2 - 9) = 0 \implies x = 0 \quad\text{or}\quad x = \pm\frac{3}{\sqrt{2}}.$$
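Classify. The domain is all of $\mathbb{R}$, and $f(x) \to +\infty$ as $x \to \pm\infty$, so a global minimum exists and must occur at a critical point. Compare values. At $x = 0$: $f(0) = 25$. At $x = \pm 3/\sqrt{2}$ we have $x^2 = 9/2$, $x^4 = 81/4$, so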
$$f\!\left(\pm\frac{3}{\sqrt 2}\right) = \frac{81}{4} - 9\cdot\frac{9}{2} + 25 = \frac{81}{4} - \frac{162}{4} + \frac{100}{4} = \frac{19}{4} = 4.75.$$
The minimum squared distance is $19/4$, so the minimum distance is $D = \sqrt{19}/2 \approx 2.18$, achieved at the two symmetric points $\left(\pm\tfrac{3}{\sqrt 2}, \tfrac{9}{2}\right)$. The point $x = 0$ (the vertex, directly below the target) is a local maximum of the distance, since $f''(0) = -18 < 0$: it is farther away, at distance $5$, than every nearby point on the curve.
Geometric Intuition. The closest point is not the one directly beneath $(0,5)$. It lies off to the side, and there is a beautiful reason. At the true closest point, the line segment from $(0,5)$ meets the parabola perpendicular to the curve. If the segment were not perpendicular, you could slide along the curve and shorten it. Setting $f'(x) = 0$ is precisely the algebraic enforcement of that perpendicularity — the calculus and the geometry are saying the same thing in two languages, our second recurring theme made literal.
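The perpendicularity claim can be tested directly: two lines are perpendicular when their slopes multiply to $-1$. The sketch below is a minimal check with hypothetical names (`TARGET`, `squared_distance`); it compares the tangent slope of $y = x^2$ at the critical point with the slope of the segment back to $(0, 5)$.

```python
import math

TARGET = (0.0, 5.0)  # the fixed point from the problem


def squared_distance(x):
    """f(x) = D^2 for the point (x, x^2) on the parabola."""
    return x**2 + (x**2 - TARGET[1]) ** 2


x_star = 3 / math.sqrt(2)      # positive critical point of f
point = (x_star, x_star**2)

tangent_slope = 2 * x_star     # derivative of y = x^2 at x_star
segment_slope = (point[1] - TARGET[1]) / (point[0] - TARGET[0])

print("distance      =", math.sqrt(squared_distance(x_star)))
print("slope product =", tangent_slope * segment_slope)  # -1 up to rounding
```

By symmetry the same check works at $x = -3/\sqrt{2}$, where both slopes change sign and the product is unchanged.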
Common Pitfall. Students often try to differentiate $D(x) = \sqrt{x^2 + (x^2-5)^2}$ directly, drowning in chain-rule fractions. Minimize $D^2$ instead. The square has the same minimizer and a polynomial derivative. This "minimize the square" move works whenever a nonnegative objective is a single square root, such as a distance or a magnitude, and it should be your reflex there. (It does not help with a sum of square roots, like the travel time in §10.8.)
10.8 Shortest Time: The Lifeguard and Snell's Law
Sometimes the quantity to minimize is not distance but time — and the two differ whenever speed changes along the path.
Problem. A lifeguard stands on the beach at point $A$, set back a distance $a$ from the waterline. A swimmer is in distress at point $B$, a distance $b$ out into the water, with the horizontal separation between $A$ and $B$ equal to $L$ along the shore. The lifeguard runs on sand at speed $v_1$ and swims at the slower speed $v_2$. At what point $P$ along the waterline should the lifeguard enter the water to reach $B$ in the least time?
Setup. Let $x$ be the horizontal distance from the foot of $A$ to the entry point $P$. Running covers a hypotenuse of $\sqrt{a^2 + x^2}$ on land; swimming covers $\sqrt{b^2 + (L - x)^2}$ in water. Since time $=$ distance$/$speed, the objective is
$$T(x) = \frac{\sqrt{a^2 + x^2}}{v_1} + \frac{\sqrt{b^2 + (L - x)^2}}{v_2}, \qquad x \in [0, L].$$
Critical points. Differentiate each term with the chain rule:
$$T'(x) = \frac{x}{v_1\sqrt{a^2 + x^2}} - \frac{L - x}{v_2\sqrt{b^2 + (L - x)^2}} = 0.$$
Now read the two fractions geometrically. The first, $\dfrac{x}{\sqrt{a^2 + x^2}}$, is $\sin\theta_1$, the sine of the angle the running leg makes with the line perpendicular to the shore. The second is $\sin\theta_2$ for the swimming leg. So the optimality condition $T'(x) = 0$ becomes
$$\boxed{\dfrac{\sin\theta_1}{v_1} = \dfrac{\sin\theta_2}{v_2}.}$$
This is Snell's law of refraction — the exact law light obeys when it crosses from one medium into another where its speed changes. The lifeguard, minimizing travel time, bends their path at the waterline in precisely the way a light ray bends at a glass surface. Light, it turns out, always takes the path of least time (Fermat's principle), so a photon entering water "solves" the lifeguard's optimization problem automatically.
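The equation $T'(x) = 0$ rarely solves in closed form, but it is easy to solve numerically. The sketch below uses made-up values for $a$, $b$, $L$, $v_1$, and $v_2$ (they are assumptions for illustration, not data from the problem) and finds the entry point by bisection, then checks that the two sides of Snell's law agree there.

```python
import math

# Hypothetical values, chosen only for illustration (meters, meters/second).
a, b, L = 30.0, 40.0, 100.0
v1, v2 = 7.0, 2.0


def travel_time_prime(x):
    """T'(x) from the derivation: running term minus swimming term."""
    run = x / (v1 * math.hypot(a, x))
    swim = (L - x) / (v2 * math.hypot(b, L - x))
    return run - swim


# T'(0) < 0 and T'(L) > 0, and T' is increasing, so bisection finds the root.
lo, hi = 0.0, L
for _ in range(100):
    mid = (lo + hi) / 2
    if travel_time_prime(mid) < 0:
        lo = mid
    else:
        hi = mid
x_star = (lo + hi) / 2

sin1 = x_star / math.hypot(a, x_star)            # sine of the running angle
sin2 = (L - x_star) / math.hypot(b, L - x_star)  # sine of the swimming angle
print("entry point x* =", round(x_star, 2))
print("sin(theta1)/v1 =", sin1 / v1, " sin(theta2)/v2 =", sin2 / v2)
```

Because running is faster than swimming in this example, the entry point lands well past the midpoint of the shoreline: the lifeguard stays on sand for most of the horizontal distance.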
The Key Insight. Optimization is universal. A lifeguard minimizing time, a photon obeying Fermat's principle, a firm minimizing cost, and an ant minimizing foraging effort are all solving the same mathematical problem — find where the derivative of a cost function vanishes. Different vocabulary, identical calculus. This is the fifth theme of the book: calculus appears in every quantitative field because the underlying structure is shared.
10.9 Maximum Volume: The Open Box
A classic that rewards careful domain analysis.
Problem. From a $12$-inch square sheet of cardboard, cut an equal square of side $x$ from each corner, then fold up the four flaps to form an open-topped box. What cut size $x$ maximizes the volume?
Setup. After cutting corners of side $x$ and folding, the base is a square of side $12 - 2x$ and the height is $x$. The volume is
$$V(x) = x(12 - 2x)^2.$$
Domain. We need $x \ge 0$ and $12 - 2x \ge 0$, so $x \in [0, 6]$ — closed and bounded.
Critical points. Expand $V(x) = x(144 - 48x + 4x^2) = 4x^3 - 48x^2 + 144x$, so
$$V'(x) = 12x^2 - 96x + 144 = 12(x^2 - 8x + 12) = 12(x - 2)(x - 6) = 0 \implies x = 2 \text{ or } x = 6.$$
Closed-interval method. Evaluate all candidates: $V(0) = 0$, $V(6) = 0$ (the base has shrunk to nothing), and
$$V(2) = 2 \cdot (12 - 4)^2 = 2 \cdot 64 = 128 \text{ in}^3.$$
The maximum volume is $\boxed{128 \text{ in}^3}$, achieved with $2$-inch corner cuts, giving an $8 \times 8 \times 2$ box. Note how the endpoint candidate $x = 6$ is a critical point that is not the answer — another reminder that finding $V' = 0$ only produces candidates, never conclusions.
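Because $V$ rises and then falls on $[0, 6]$, a derivative-free search can also locate the peak. The sketch below uses ternary search, a technique not used elsewhere in this chapter and swapped in here purely as an independent check; the function names and the iteration count are assumptions.

```python
def box_volume(x, side=12.0):
    """Volume of the open box cut from a square sheet with the given side."""
    return x * (side - 2 * x) ** 2


def ternary_search_max(f, lo, hi, iterations=200):
    """Locate the maximizer of a function that rises then falls on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1   # the peak cannot lie to the left of m1
        else:
            hi = m2   # the peak cannot lie to the right of m2
    return (lo + hi) / 2


x_best = ternary_search_max(box_volume, 0.0, 6.0)
print(round(x_best, 4), round(box_volume(x_best), 4))
```

The search relies on the rise-then-fall shape, which here follows from the sign of $V'(x) = 12(x-2)(x-6)$ on $[0, 6]$. On an objective with several peaks it could settle on the wrong one.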
10.10 Economic Optimization I: Maximizing Profit
Economics is the field where optimization is most explicit, because rational actors are defined by what they maximize. The central tool is marginal analysis — the derivative of a total quantity.
Problem. A firm sells $x$ units at a price that falls as it floods the market, giving total revenue $R(x) = 100x - x^2$. Its cost to produce $x$ units is $C(x) = 10x + 100$ (a fixed cost of $100$ plus $10$ per unit). What production level maximizes profit?
Setup. Profit is revenue minus cost:
$$P(x) = R(x) - C(x) = (100x - x^2) - (10x + 100) = 90x - x^2 - 100.$$
Critical points. $P'(x) = 90 - 2x = 0 \implies x = 45$.
Confirm. $P''(x) = -2 < 0$, so the profit curve is concave down and $x = 45$ is the global maximum. The maximum profit is
$$P(45) = 90(45) - 45^2 - 100 = 4050 - 2025 - 100 = 1925.$$
The economic interpretation. The condition $P'(x) = 0$ means $R'(x) - C'(x) = 0$, that is
$$R'(x) = C'(x), \qquad \text{i.e.} \qquad \textbf{marginal revenue} = \textbf{marginal cost}.$$
Here $R'(x) = 100 - 2x$ and $C'(x) = 10$. At $x = 45$: $R'(45) = 100 - 90 = 10 = C'(45)$. ✓ This is the most famous rule in microeconomics: produce up to the point where the revenue from the next unit exactly equals the cost of that unit. Beyond it, each extra unit costs more than it earns and profit falls; before it, you are leaving money on the table. The derivative of "total" is "marginal," and setting marginals equal is just $P' = 0$ in disguise.
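The marginal rule can be watched in action by tabulating profit and the gap between marginal revenue and marginal cost on either side of the optimum. This is a minimal sketch using the $R$ and $C$ from the problem; the function names are hypothetical.

```python
def revenue(x):
    return 100 * x - x**2


def cost(x):
    return 10 * x + 100


def profit(x):
    return revenue(x) - cost(x)


def marginal_revenue(x):
    return 100 - 2 * x   # R'(x)


def marginal_cost(x):
    return 10            # C'(x)


# Columns: units, profit, marginal revenue minus marginal cost.
for units in (44, 45, 46):
    print(units, profit(units), marginal_revenue(units) - marginal_cost(units))
# 44 1924 2
# 45 1925 0
# 46 1924 -2
```

The gap is positive below $x = 45$ (the next unit still adds profit), zero at $x = 45$, and negative above it.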
Real-World Application — Marginal pricing in practice (economics). Airlines, electricity markets, and cloud-computing providers all set output where marginal revenue meets marginal cost, recomputed continuously as conditions change. An airline's marginal cost of one more passenger on an already-scheduled flight is nearly zero, which is why last-minute seats are sometimes sold cheaply — any price above marginal cost adds profit. The entire theory of competitive supply is the equation $R'(x) = C'(x)$ applied across a market.
10.11 Economic Optimization II: The Economic Order Quantity
A second economic classic shows optimization balancing two competing costs.
Problem. A retailer sells $D$ units of a product per year at a steady rate. Each time it places an order it pays a fixed ordering cost $K$ (paperwork, shipping setup), regardless of order size. Holding inventory costs $h$ per unit per year (warehouse space, spoilage, tied-up capital). If it orders $Q$ units at a time, what order size minimizes total annual cost?
Setup. With demand $D$ and order size $Q$, the firm places $D/Q$ orders per year, costing $K \cdot D/Q$ in ordering. Because inventory is drawn down steadily from $Q$ to $0$ and refilled, the average inventory on hand is $Q/2$, costing $h \cdot Q/2$ to hold. The total annual cost is
$$T(Q) = \frac{DK}{Q} + \frac{hQ}{2}, \qquad Q > 0.$$
Critical points.
$$T'(Q) = -\frac{DK}{Q^2} + \frac{h}{2} = 0 \implies Q^2 = \frac{2DK}{h} \implies Q^* = \sqrt{\frac{2DK}{h}}.$$
Confirm. $T''(Q) = \dfrac{2DK}{Q^3} > 0$ for all $Q > 0$, so the cost curve is convex and $Q^*$ is the global minimum. As $Q \to 0^+$ the ordering cost explodes, and as $Q \to \infty$ the holding cost explodes — the interior critical point is squeezed between two infinities, just as in the can problem.
This is the celebrated Economic Order Quantity (EOQ) formula. It captures a real tension: order in large batches to save on ordering costs, but not so large that holding costs swamp the savings. The square root is the precise balance point.
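A short computation makes the balance concrete. The sketch below uses a hypothetical retailer (the values of $D$, $K$, and $h$ are invented for illustration) and shows that at $Q^*$ the annual ordering cost and the annual holding cost are equal, which follows from substituting $Q^* = \sqrt{2DK/h}$ into each term of $T(Q)$: both become $\sqrt{DKh/2}$.

```python
import math


def eoq(demand, order_cost, holding_cost):
    """Economic order quantity Q* = sqrt(2DK/h)."""
    return math.sqrt(2 * demand * order_cost / holding_cost)


def annual_cost(q, demand, order_cost, holding_cost):
    """Return the two components of T(Q) = DK/Q + hQ/2."""
    ordering = demand * order_cost / q
    holding = holding_cost * q / 2
    return ordering, holding


# Hypothetical retailer: 1200 units/year, $50 per order, $3 per unit per year.
D, K, h = 1200, 50, 3
q_star = eoq(D, K, h)
ordering, holding = annual_cost(q_star, D, K, h)
print(q_star, ordering, holding)  # 200.0 300.0 300.0
```

Ordering 150 or 250 units at a time instead raises the total from 600 to 625 or 615 respectively, so the cost curve is fairly flat near $Q^*$.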
Historical Note. The EOQ formula was derived by Ford W. Harris in 1913 while he worked at Westinghouse, and popularized by consultant R. H. Wilson in the 1930s (it is still sometimes called the Wilson formula). Over a century later, with all the sophistication of modern supply-chain software, the square-root EOQ remains the baseline against which inventory policies are measured — a single derivative that has saved industry uncountable dollars.
10.12 Biological Optimization: Optimal Foraging
Optimization is not only for conscious decision-makers. Evolution is a relentless optimizer, and behavioral ecology models animals as if they maximize energy intake — because natural selection rewards those that do.
Optimal foraging theory studies how animals decide where and how long to feed. Consider a bird feeding from a berry bush. At first berries are plentiful and the bird gathers them quickly, but as the bush is depleted the rate of return drops — a textbook case of diminishing returns. Travel to a fresh bush costs time. When should the bird give up the current bush and move on?
The answer is the marginal value theorem (Charnov, 1976): a forager should leave the current patch when its instantaneous rate of energy gain drops to the average rate of gain for the whole environment (including travel time). Formally, if $g(t)$ is the cumulative energy gained after spending time $t$ in a patch and $\tau$ is the average travel time between patches, the optimal residence time $t^*$ satisfies
$$g'(t^*) = \frac{g(t^*)}{\tau + t^*}.$$
The left side is the marginal gain rate (the derivative — slope of the gain curve right now); the right side is the long-run average rate (total gain divided by total time, including travel). Leave exactly when the marginal rate falls to the average rate. Stay longer and you do worse than the environment's average; leave sooner and you abandon easy calories.
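To see the condition produce a number, one needs a specific gain curve. The sketch below assumes a saturating curve $g(t) = G(1 - e^{-kt})$, which is a common modeling choice but is not given in the text, and all three parameter values are made up. It solves $g'(t^*) = g(t^*)/(\tau + t^*)$ by bisection and checks that the long-run average rate really does peak there.

```python
import math

# Hypothetical patch: gain saturates at G calories with rate constant k per
# minute, and travel between patches takes tau minutes. All values are made up.
G, k, tau = 100.0, 0.5, 2.0


def gain(t):
    return G * (1 - math.exp(-k * t))      # assumed gain curve g(t)


def gain_rate(t):
    return G * k * math.exp(-k * t)        # g'(t)


def average_rate(t):
    return gain(t) / (tau + t)             # total gain / total time


def mvt_gap(t):
    """Marginal rate minus long-run average rate; zero at the optimum."""
    return gain_rate(t) - average_rate(t)


# mvt_gap is positive at t = 0 and negative for large t, so bisect.
lo, hi = 0.0, 50.0
for _ in range(100):
    mid = (lo + hi) / 2
    if mvt_gap(mid) > 0:
        lo = mid
    else:
        hi = mid
t_star = (lo + hi) / 2

print("t* =", round(t_star, 3), " average rate =", round(average_rate(t_star), 3))
```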
Real-World Application — Foraging without a calculator (biology). Bumblebees, starlings, and even foraging bacteria have been shown experimentally to follow the marginal value theorem strikingly well. No animal computes a derivative consciously — evolution does the optimization, selecting over generations for behaviors that happen to satisfy $g'(t^*) = g(t^*)/(\tau + t^*)$. The calculus describes a strategy that natural selection discovered long before humans wrote it down. The same equation guides human decisions too: when to leave a depleting fishing ground, or how long to keep clicking through search results before trying a new query.
10.13 A Cautionary Example: Max Versus Min
Optimization punishes the careless. The single most common failure is finding a critical point and assuming it is the kind of extremum you wanted. Always verify.
Problem. Among all rectangles of fixed perimeter $P$, which has the largest diagonal?
Setup. Let the sides be $a$ and $b$ with $2a + 2b = P$, so $b = \tfrac{P}{2} - a$ and $a \in (0, P/2)$. The diagonal is $d = \sqrt{a^2 + b^2}$; as in §10.7, work with the cleaner $d^2$, which has the same extremizers:
$$f(a) = a^2 + \left(\frac{P}{2} - a\right)^2.$$
Critical points. Expand: $f(a) = a^2 + \tfrac{P^2}{4} - aP + a^2 = 2a^2 - aP + \tfrac{P^2}{4}$, so
$$f'(a) = 4a - P = 0 \implies a = \frac{P}{4}.$$
Classify — and beware. $f''(a) = 4 > 0$, so $a = P/4$ is a minimum of the diagonal, not a maximum. At $a = P/4$ we get $b = P/4$ too: the rectangle is a square, and the square has the shortest diagonal of all rectangles with that perimeter. The largest diagonal does not occur at an interior critical point at all — it is approached at the degenerate boundary, as the rectangle collapses to a thin sliver ($a \to 0$ or $a \to P/2$) and the diagonal stretches toward $P/2$.
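A few numbers make the point vivid. The sketch below picks a hypothetical perimeter $P = 20$ (an assumption for illustration) and evaluates the diagonal at the square and at increasingly thin rectangles.

```python
import math

P = 20.0  # hypothetical perimeter, chosen for illustration


def diagonal(a):
    """Diagonal of the rectangle with sides a and P/2 - a."""
    b = P / 2 - a
    return math.hypot(a, b)


# The square (a = P/4 = 5) first, then thinner and thinner rectangles.
for a in (5.0, 2.0, 0.5, 0.01):
    print(a, round(diagonal(a), 4))
```

The printed diagonals increase down the list and creep toward $P/2 = 10$ without reaching it, so the square gives the smallest value and no rectangle attains the largest.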
Warning. Setting the derivative to zero finds where the objective is flat — but flat points are minima, maxima, or neither, and the boundary may beat them all. A critical point is a candidate, never an answer. Confirm its type with the second-derivative test, and on a bounded domain always compare against the endpoints. The square here is the right critical point for the wrong question: it answers "shortest diagonal," not "longest." Read what you actually computed.
10.14 An Engineering Optimum: Maximum Power Transfer
A staple of electrical engineering shows the quotient rule earning its keep.
Problem. A power source with fixed electromotive force $V$ and fixed internal resistance $\rho$ drives a load resistor $R$. The power delivered to the load is
$$P(R) = \frac{V^2 R}{(\rho + R)^2}.$$
What load resistance $R$ extracts the most power from the source?
Critical points. Differentiate with the quotient rule, treating $V$ and $\rho$ as constants:
$$P'(R) = V^2 \cdot \frac{(\rho + R)^2 - R \cdot 2(\rho + R)}{(\rho + R)^4} = V^2 \cdot \frac{(\rho + R) - 2R}{(\rho + R)^3} = \frac{V^2(\rho - R)}{(\rho + R)^3}.$$
Setting $P'(R) = 0$ requires $\rho - R = 0$, so $R = \rho$.
Confirm. For $R < \rho$ the numerator $\rho - R > 0$ so $P' > 0$ (power rising); for $R > \rho$ we have $P' < 0$ (power falling). The function rises then falls, so $R = \rho$ is the global maximum on $(0, \infty)$. The maximum power transfer theorem states exactly this: a source delivers the most power when the load resistance matches the source's internal resistance. Audio engineers matching speaker impedance and radio engineers matching antenna impedance use this result constantly.
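The rise-then-fall behavior shows up immediately in a table. The sketch below assumes a hypothetical 12-volt source with 4 ohms of internal resistance (both values invented for illustration) and prints the load power and its derivative at several loads. At the matched load, substituting $R = \rho$ into $P(R)$ gives $V^2/(4\rho)$.

```python
V, rho = 12.0, 4.0  # hypothetical source: 12 volts, 4 ohms internal resistance


def load_power(R):
    """P(R) = V^2 R / (rho + R)^2."""
    return V**2 * R / (rho + R) ** 2


def load_power_prime(R):
    """P'(R) = V^2 (rho - R) / (rho + R)^3, from the quotient rule."""
    return V**2 * (rho - R) / (rho + R) ** 3


# Columns: load resistance, delivered power, derivative of power.
for R in (1.0, 2.0, 4.0, 8.0, 16.0):
    print(R, round(load_power(R), 4), round(load_power_prime(R), 4))

print("peak power:", load_power(rho), "=", V**2 / (4 * rho))  # 9.0 = 9.0
```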
10.15 Verifying a Global Extremum
You have now seen every tool for confirming that a critical point is truly the best answer, not merely a local one. Collect them:
- Closed, bounded domain $[a,b]$: use the closed-interval method. Compare $f$ at every critical point and both endpoints; the largest and smallest values are global. This is the gold standard — it cannot fail.
- Open or unbounded domain: examine the limits at the boundary. If $f \to +\infty$ at both ends (the can, the EOQ), the lone interior critical point is the global minimum; the symmetric situation gives a global maximum.
- Single critical point on an interval: if a function that is continuous on an interval has exactly one critical point there and it is a local maximum, it is automatically the global maximum on that interval (and likewise for minima). There is nowhere else for a competing extreme to hide.
- Constant concavity: if $f'' > 0$ everywhere the function is convex and any critical point is the global minimum; if $f'' < 0$ everywhere it is concave and any critical point is the global maximum.
The recurring mistake is reporting a local extremum as global without one of these arguments. Make the global justification an explicit step, not an afterthought.
10.16 Looking Ahead to Several Variables
Every problem in this chapter had, after the constraint was applied, a single free variable. But many real optimizations resist that reduction — you want the dimensions of a box minimizing cost where length, width, and height trade off in ways no single constraint collapses to one variable.
For a function $f(x, y)$ of two variables, a critical point is where all partial derivatives vanish, written $\nabla f = \mathbf{0}$ (the gradient is zero), and classifying it requires the second-derivative test built from the Hessian matrix. When the variables are tied by a constraint that cannot be cleanly solved, the method of Lagrange multipliers turns "optimize on the road" into a system of equations directly. Both tools live in Chapter 31, and the geometric picture there — gradients of the objective and constraint becoming parallel at the optimum — is the multivariable echo of the perpendicularity we saw in the closest-point problem (§10.7). The single-variable instincts you are building now transfer intact; you will simply have more directions to set to zero.
Computational Note. When the objective is messy, the geometry is hard to draw, or the critical-point equation has no closed-form solution, hand the problem to a numerical optimizer. Many engineering and machine-learning objectives have thousands of variables — far beyond hand computation — yet rest on exactly the principle of this chapter: descend until the derivative (the gradient) is zero. This is the gradient descent anchor example, introduced in Chapter 6 and reaching its full multivariable, machine-learning form in Chapter 30.
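In one variable, "descend until the derivative is zero" fits in a dozen lines. The sketch below applies it to the can objective of §10.6 using the hand-computed $A'(r) = 4\pi r - 700/r^2$. The step size, starting radius, and stopping tolerance are assumptions tuned for this particular objective, not recommended defaults, and the function names are hypothetical.

```python
import math


def area_prime(r):
    """A'(r) = 4*pi*r - 700/r^2 for the closed can of section 10.6."""
    return 4 * math.pi * r - 700 / r**2


def gradient_descent(grad, start, step=0.01, tol=1e-10, max_iter=10_000):
    """Step against the derivative until it is (numerically) zero."""
    x = start
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            break
        x -= step * g   # move downhill: opposite the sign of the slope
    return x


r_numeric = gradient_descent(area_prime, start=2.0)
r_exact = (175 / math.pi) ** (1 / 3)   # critical point found by hand
print(r_numeric, r_exact)
```

With a much larger step the iterates can overshoot and fail to settle, which is one reason production optimizers choose the step size adaptively.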
10.17 Computation: Verifying an Optimum with Python
Following the book's three-tier pattern — analytic, then by hand, then machine check — we confirm the can result of §10.6 two ways: symbolically with sympy and numerically with scipy.
# Verify the minimum-surface-area can two ways: by symbolic calculus and by
# numerical optimization. Both should agree with r = (175/pi)**(1/3) ≈ 3.82 cm.
import numpy as np
import sympy as sp
from scipy.optimize import minimize_scalar
# --- Tier 3a: symbolic (sympy), reproducing the hand derivation exactly ---
r = sp.symbols('r', positive=True)
A = 2*sp.pi*r**2 + 700/r                  # surface area after eliminating h
crit = sp.solve(sp.diff(A, r), r)         # solve A'(r) = 0
r_exact = crit[0]                         # the only positive critical point
print("symbolic r* =", r_exact, "=", float(r_exact))  # (175/pi)**(1/3) = 3.8191...
# --- Tier 3b: numerical (scipy), optimizing without any calculus by hand ---
def area(radius: float) -> float:
    h = 350 / (np.pi * radius**2)         # height fixed by the volume constraint
    return 2*np.pi*radius**2 + 2*np.pi*radius*h
result = minimize_scalar(area, bracket=(1, 10))
print("numeric r* =", round(result.x, 4))             # 3.8191
print("optimal h =", round(350/(np.pi*result.x**2), 4),
      " (should equal 2r =", round(2*result.x, 4), ")")  # h = 7.6382 = 2r
The symbolic solver returns the exact $(175/\pi)^{1/3}$, the numerical optimizer finds the same $r \approx 3.82$ cm with no derivative computed by us, and the printed check confirms $h = 2r$. Three independent routes (pencil, symbolic algebra, and numerical search) converge on one answer. That is the fourth theme of the book in action: hand computation builds the understanding; the machine builds the power and the confidence.
Add to Your Modeling Portfolio. Add an optimization step to your running model: pick the parameter value that makes your system best by some explicit criterion.
- Biology: Use the marginal value theorem to find a forager's optimal patch-residence time $t^*$ given a gain curve $g(t)$ and travel time $\tau$; interpret what "best" means for fitness.
- Economics: Find the profit-maximizing output where marginal revenue equals marginal cost, or the EOQ that minimizes your inventory cost; report the optimal quantity and the cost saved versus a naive choice.
- Physics: Find the launch angle or path that extremizes range or travel time (a least-time or least-action problem), and identify the conserved structure at the optimum.
- Data Science: Frame a simple loss function (e.g., squared error for a one-parameter model) and minimize it analytically by setting the derivative to zero. This is the seed of the gradient descent you will scale up in Chapter 30.
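As one possible starting point for the Data Science item, the sketch below fits a one-parameter model $y \approx w x$ by minimizing the squared error $L(w) = \sum_i (y_i - w x_i)^2$. Setting $L'(w) = -2 \sum_i x_i (y_i - w x_i) = 0$ gives $w^* = \sum_i x_i y_i / \sum_i x_i^2$. The data points are made up for illustration and the function names are hypothetical.

```python
# Made-up data for illustration only; replace with your own measurements.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

def loss(w: float) -> float:
    """Squared error of the one-parameter model y = w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys))

def loss_derivative(w: float) -> float:
    """L'(w) = -2 * sum(x * (y - w * x)), differentiated by hand."""
    return sum(-2 * x * (y - w * x) for x, y in zip(xs, ys))

# Closed-form minimizer from setting L'(w) = 0.
w_star = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

print("w* =", w_star)
print("L'(w*) =", loss_derivative(w_star))
print("loss at w*, and at w* +/- 0.1:",
      loss(w_star), loss(w_star + 0.1), loss(w_star - 0.1))
```

Since $L''(w) = 2 \sum_i x_i^2 > 0$, the loss is convex in $w$, so the constant-concavity argument of §10.15 makes this critical point the global minimum.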
Looking Ahead
You can now translate a wide class of real-world problems into single-variable calculus and extract the best answer with confidence. Chapter 11 turns to linear approximation and Newton's method — using the derivative not to find extrema but to approximate function values and to solve equations that have no algebraic solution. Newton's method will reappear inside optimization algorithms themselves, hunting for the points where a derivative is zero. After that, Part 3 leaves differentiation behind and takes up the inverse art of integration, beginning with antiderivatives in Chapter 12 and culminating in the Fundamental Theorem of Calculus in Chapter 14.
Reflection
Optimization is where calculus stops describing and starts deciding. A farmer's fence, a soda can's shape, a lifeguard's route, a firm's output, a bird's feeding schedule — all bend to the same three-line procedure: write the objective, use the constraint to reach one variable, set the derivative to zero and check the boundary. The deepest lesson is not any single formula but the discipline of translation: nature and commerce hand you English, and your job is to render it as a function you can differentiate. The calculus, as promised, is the easy part. The judgment of what to optimize, what is fixed, and what is plausible — that is the craft you carry into every quantitative field you will ever touch.