Chapter 11 — Key Takeaways

The whole chapter rests on one picture: up close, a smooth curve looks like its tangent line. Every result below is that picture put to work.


Linear Approximation (§11.2–11.3)

  • The linearization of $f$ at $a$ is $$L(x) = f(a) + f'(a)(x - a), \qquad f(x) \approx L(x) \text{ for } x \text{ near } a.$$
  • $L$ is just the tangent line written as an evaluable function: it passes through $(a, f(a))$ with slope $f'(a)$.
  • Recipe: choose an anchor $a$ near your target where $f(a)$ and $f'(a)$ are easy; evaluate $L$ at the target.
  • Famous special cases, all linearizations at $0$: $\ln(1+x)\approx x$, $\sin\theta\approx\theta$, $e^x\approx 1+x$, $(1+x)^k \approx 1 + kx$.
  • The approximation degrades as you leave the anchor; pick a closer anchor when the target is far.

Differentials (§11.4)

  • For $y = f(x)$, the differential is $dy = f'(x)\,dx$ — the change in $y$ predicted by the tangent line.
  • Contrast with the actual change $\Delta y = f(x+\Delta x) - f(x)$, which rides the curve. The fundamental approximation: $$\Delta y \approx dy = f'(x)\,\Delta x \qquad (\Delta x \text{ small}).$$
  • The gap $\Delta y - dy$ shrinks quadratically as the step shrinks — why the linear approximation is so good for small steps.
  • Differentials turn a messy difference (like $5.02^3 - 5^3$) into a single multiplication ($dV = 3s^2\,ds$).

Error Propagation (§11.5)

  • A measurement uncertainty $\Delta x$ propagates through $y = f(x)$ as $$\Delta y \approx |f'(x)|\,\Delta x.$$ The derivative is the amplification factor.
  • Relative error $\Delta y / y$ and percentage error ($\times 100\%$) are what scientists usually report — dimensionless and comparable.
  • Power-law rule: for $y = x^n$, $$\frac{\Delta y}{y} \approx |n|\,\frac{\Delta x}{x}.$$ A cube ($n=3$) triples relative error; a square root ($n=\tfrac12$) halves it.
  • For a product/quotient of powers, each input's relative error is weighted by its exponent and the (worst-case) errors add: e.g. $g = 4\pi^2\ell\,T^{-2}$ gives $\Delta g/g \approx \Delta\ell/\ell + 2\,\Delta T/T$.

Newton's Method (§11.6–11.8)

  • To solve $f(x) = 0$, replace $f$ by its tangent line at $x_n$, solve the linear equation, and iterate: $$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$
  • Geometry: slide down the tangent line to the $x$-axis; that intercept is the next guess.
  • Special cases collapse to classic algorithms: $f(x)=x^2-S$ gives the Babylonian $x_{n+1}=\tfrac12(x_n + S/x_n)$; $f(x)=1/x-d$ gives the division-free $x_{n+1}=x_n(2-d\,x_n)$.

Quadratic Convergence

  • Near a simple root ($f'(r)\neq 0$), the error satisfies $$e_{n+1} \approx \frac{f''(r)}{2 f'(r)}\,e_n^2.$$
  • The new error is proportional to the square of the old: the number of correct digits roughly doubles each step. Three or four steps reach machine precision from a decent start.

Failure Modes (§11.9)

Failure Cause
Bad starting guess Tangent line points away from the root; no global sense of direction.
Near-zero derivative $f'(x_n)\approx 0$ Step $f/f'$ blows up; a flat spot flings the guess far away.
Cycling Iterates fall into a repeating loop (e.g. $x^3-2x+2$ from $x_0=0$ cycles $0\to1\to0$).
Multiple (repeated) root $f'(r)=0$ at the root degrades convergence from quadratic to linear.
  • Newton's method gives no warning when it fails: always plug the answer back into $f$, cap the iterations, and prefer a bracketing method when reliability matters (§11.12).

Reliable Cousins (§11.12)

  • Bisection: needs only continuity and a sign-change bracket (Intermediate Value Theorem, Chapter 4); never fails but converges only linearly.
  • Secant method: Newton with a finite-difference stand-in for $f'$; superlinear (order $\approx 1.618$), no derivative needed.
  • Production root-finders blend these for robustness plus speed.

The Error Bound and the Road Forward (§11.11)

  • $|f(x) - L(x)| \le \dfrac{M}{2}(x-a)^2$ with $M = \max|f''|$ near $a$: the error is quadratic in distance from the anchor and proportional to curvature.
  • Keeping more terms gives the quadratic approximation and then the Taylor series — linear approximation is the first-order Taylor polynomial (Chapter 23).

Common Errors to Avoid

  • Anchoring too far from the target. Approximating $\sqrt 7$ from $a=4$ is poor; use $a=9$. Always check the target is near $a$.
  • Confusing $dy$ with $\Delta y$. $dy$ is the tangent-line (linear) change; $\Delta y$ is the true change. They agree only approximately, for small steps.
  • Forgetting the exponent weighting in error propagation. A squared quantity contributes twice its relative error, not once.
  • Trusting Newton blindly. Verify the output, cap iterations, and watch for $f'\approx 0$, cycling, or repeated roots.
  • Expecting quadratic convergence at a double root. There convergence is only linear.
  • Dropping the thin space in differentials. Write $dy = f'(x)\,dx$ and $\int f(x)\,dx$ (continuity §2).

Connections

  • Backward: the tangent line and differentiability come from Chapter 6; the critical-point / optimization use of Newton on $g'$ connects to Chapter 10; bisection rests on the Intermediate Value Theorem from Chapter 4.
  • Forward — Taylor series (Chapter 23): the linearization is the first-order Taylor polynomial; higher orders give the quadratic approximation and the full series.
  • Forward — tangent plane (Chapter 29): in two variables the tangent line becomes a tangent plane, $L(x,y)=f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$, and differentials become $dz = f_x\,dx + f_y\,dy$.
  • Forward — differential notation (Chapter 12): the $dx$ introduced here becomes the integration element $\int f(x)\,dx$ that opens Part III.
  • Anchor example — gradient descent / optimization (Chapters 6 → 30): "Newton's method for optimization," $x_{n+1}=x_n-g'(x_n)/g''(x_n)$, is the curvature-aware cousin of gradient descent.
  • Anchor example — Euler's formula (Chapter 24): the linearizations $\sin\theta\approx\theta$, $\cos\theta\approx 1$ are the opening terms of the series that link exponentials and trigonometry.