Case Study 2 — Computing the Bell Curve: The Normal CDF as a Taylor Series

Field: Statistics and computational statistics (the normal-curve anchor of this book)

Calculus used: term-by-term integration of the Maclaurin series of $e^{-x^2/2}$ (Section 23.7)


Open any statistics textbook to the back cover and you will find a "z-table": a dense grid of numbers giving the area under the bell curve to the left of each value $z$. Those numbers underlie every normal-based confidence interval, every z-test p-value, every claim that such a result is "statistically significant." For most of a statistics course they appear as if handed down on tablets: you look up $0.8413$ for $z = 1$ and move on. But where do they come from? No one can integrate the bell curve in closed form, because its antiderivative cannot be expressed in terms of elementary functions. This case study tells the story of how those table values can be manufactured, and the answer is the climax of the anchor this book has followed since Chapter 13: you compute them by turning the integrand into an infinite polynomial and integrating it term by term.

The function with no antiderivative

The standard normal cumulative distribution function is

$$\Phi(z) = \int_{-\infty}^z \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,dx,$$

the probability that a standard normal random variable falls at or below $z$. The Fundamental Theorem of Calculus (Chapter 14) tells us this area is a net change in some antiderivative of the bell-curve density $\phi(x) = \tfrac{1}{\sqrt{2\pi}}e^{-x^2/2}$. The trouble, proved by Liouville in the 1830s and mentioned back in Chapter 14, is that no elementary function has derivative $e^{-x^2/2}$. There is no finite formula in polynomials, exponentials, logarithms, and trig functions that equals $\int e^{-x^2/2}\,dx$. For nine chapters this gap stood open: the Fundamental Theorem applied perfectly, yet we simply could not write down the antiderivative to evaluate it.

Taylor series close the gap by refusing to look for an antiderivative at all. Instead we expand the integrand as a power series — which we can integrate, because every term is just a power of $x$.

Expanding and integrating term by term

Start from the exponential series of Section 23.4 and substitute $u = -x^2/2$:

$$e^{-x^2/2} = \sum_{n=0}^\infty \frac{(-x^2/2)^n}{n!} = \sum_{n=0}^\infty \frac{(-1)^n x^{2n}}{2^n\, n!} = 1 - \frac{x^2}{2} + \frac{x^4}{8} - \frac{x^6}{48} + \cdots$$

This series has radius of convergence $\infty$: it converges for every $x$, because we have only substituted the polynomial $u = -x^2/2$ into the series for $e^u$, which converges for every $u$. By the term-by-term integration license of Section 23.3, we may integrate inside the sum, replacing each power $x^{2n}$ by $x^{2n+1}/(2n+1)$:

$$\int_0^z e^{-x^2/2}\,dx = \sum_{n=0}^\infty \frac{(-1)^n z^{2n+1}}{2^n\, n!\,(2n+1)}.$$

Dividing by $\sqrt{2\pi}$ and adding the left-half area of $\tfrac12$ gives the CDF:

$$\boxed{\;\Phi(z) = \frac12 + \frac{1}{\sqrt{2\pi}}\sum_{n=0}^\infty \frac{(-1)^n z^{2n+1}}{2^n\, n!\,(2n+1)}\;}$$

There it is: an explicit series for the "impossible" area, valid for every $z$ and computable to any precision by truncating after enough terms. It is not a closed form, but it is every bit as usable. The function that defied antidifferentiation in elementary terms was, all along, just an infinite polynomial.
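The boxed formula translates almost line for line into code. The following is a minimal sketch, not a production routine: the function name `phi_series` and the default of 40 terms are illustrative choices rather than anything prescribed by the text, and the truncation is only trustworthy for moderate $z$.

```python
import math

def phi_series(z: float, terms: int = 40) -> float:
    """Approximate the standard normal CDF by truncating the boxed series.

    `terms` is an illustrative default; it is ample for |z| up to about 3.
    """
    total = 0.0
    for n in range(terms):
        # n-th term of the series: (-1)^n z^(2n+1) / (2^n n! (2n+1))
        denominator = 2 ** n * math.factorial(n) * (2 * n + 1)
        total += (-1) ** n * z ** (2 * n + 1) / denominator
    return 0.5 + total / math.sqrt(2 * math.pi)

for z in (0.0, 0.5, 1.0, 1.96):
    print(f"z = {z:4.2f}   Phi(z) ~ {phi_series(z):.5f}")
```

Computing each denominator with exact integers keeps the sketch close to the formula; a tighter implementation would update each term from the previous one instead of recomputing the factorial.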

Watching it converge: the $z = 1$ table value by hand

Let us manufacture one z-table entry ourselves. For $z = 1$, exactly one standard deviation above the mean, the series in the boxed formula is, term by term,

$$1 - \frac{1}{2\cdot 1\cdot 3} + \frac{1}{4\cdot 2\cdot 5} - \frac{1}{8\cdot 6\cdot 7} + \frac{1}{16\cdot 24\cdot 9} - \cdots = 1 - 0.16667 + 0.02500 - 0.00298 + 0.00029 - \cdots$$

Adding these five rounded terms gives $0.85564$ (or $0.85565$ if the terms are not rounded first). Because the series alternates with steadily shrinking terms, the truncation error is bounded by the first omitted term, $1/(32\cdot 120\cdot 11) \approx 0.00002$, so the sum is already good to four places. Multiply by $1/\sqrt{2\pi} \approx 0.39894$:

$$\Phi(1) - \tfrac12 \approx 0.39894 \times 0.85564 \approx 0.34135, \qquad \text{so } \Phi(1) \approx 0.84135.$$

That agrees with the $0.8413$ printed in every z-table (the exact value is $0.84134\ldots$). We built it from scratch with five terms of a Taylor series and a hand calculation: no table, no black box.

# Five-term partial sum for Phi(1), reproducing the hand calculation above.
# Hand values: series sum = 0.85564, Phi(1) = 0.5 + 0.39894 * 0.85564 = 0.84135 (z-table: 0.8413)
import math

inv_sqrt_2pi = 1 / math.sqrt(2 * math.pi)   # 0.39894...
# Terms (-1)^n / (2^n * n! * (2n + 1)) of the series at z = 1, for n = 0..4
five_terms = [(-1) ** n / (2 ** n * math.factorial(n) * (2 * n + 1)) for n in range(5)]
phi_1 = 0.5 + inv_sqrt_2pi * sum(five_terms)
print(f"series sum = {sum(five_terms):.5f}, Phi(1) = {phi_1:.5f}")

The same series at $z = 0.5$ begins $0.5 - \tfrac{0.125}{6} + \tfrac{0.03125}{40} - \cdots$. These three terms already give $0.47995$, within $0.00003$ of the full sum, and $0.39894 \times 0.47995 \approx 0.1915$. That is the probability that a standard normal lands in $[0, 0.5]$, again matching the table to four digits.
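The alternating-series bound used in these hand calculations doubles as a stopping rule: keep adding terms until the next one is smaller than the error you can tolerate. The sketch below assumes $0 \le z < \sqrt{6}$, the range in which the terms shrink from the very first one. The name `series_sum` and the tolerance `5e-5` are hypothetical, chosen so that the loop stops where the hand calculations did.

```python
import math

def series_sum(z: float, tol: float = 5e-5):
    """Sum the series for the integral of exp(-x^2/2) from 0 to z.

    Stops as soon as the next term is below `tol` and returns
    (partial_sum, terms_used, first_omitted_term). Assumes 0 <= z < sqrt(6),
    so the terms decrease from n = 0 and the alternating-series bound
    applies to every partial sum.
    """
    total, n = 0.0, 0
    while True:
        term = z ** (2 * n + 1) / (2 ** n * math.factorial(n) * (2 * n + 1))
        if term < tol:
            return total, n, term      # `term` bounds the truncation error
        total += (-1) ** n * term
        n += 1

for z in (0.5, 1.0):
    s, used, bound = series_sum(z)
    phi = 0.5 + s / math.sqrt(2 * math.pi)
    print(f"z = {z}: {used} terms, sum = {s:.5f}, error <= {bound:.1e}, Phi = {phi:.4f}")
```

Tightening `tol` buys more digits at the cost of a few extra terms, which is the trade the hand calculation made implicitly.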

The famous 68–95–99.7 rule, derived

The single most-quoted fact in statistics — that about 68%, 95%, and 99.7% of a normal population lies within one, two, and three standard deviations of the mean — is nothing but three evaluations of this series, doubled and reflected:

| Within | Probability $= 2\Phi(k) - 1$ | Value |
|---|---|---|
| $1\sigma$ | $2(0.84134) - 1$ | $0.6827$ |
| $2\sigma$ | $2(0.97725) - 1$ | $0.9545$ |
| $3\sigma$ | $2(0.99865) - 1$ | $0.9973$ |

Every one of these numbers — the backbone of statistical intuition, the basis of control charts and Six Sigma tolerances — is delivered by integrating a Taylor series term by term. Without this machinery, those figures would be unknowable except by laborious numerical quadrature.
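The three rows can be reproduced directly, because $2\Phi(k) - 1 = \sqrt{2/\pi}$ times the series evaluated at $z = k$. The sketch below is illustrative only: the name `within_k_sigma` and the 60-term default are assumptions, and each term is updated from the previous one so that no factorial is ever formed.

```python
import math

def within_k_sigma(k: float, terms: int = 60) -> float:
    """P(|Z| <= k) = 2*Phi(k) - 1 = sqrt(2/pi) * (series evaluated at z = k)."""
    power = k            # holds k^(2n+1) / (2^n n!), starting at n = 0
    total = 0.0
    for n in range(terms):
        total += (-1) ** n * power / (2 * n + 1)
        power *= k * k / (2 * (n + 1))   # advance to n + 1 without factorials
    return math.sqrt(2 / math.pi) * total

for k in (1, 2, 3):
    print(f"within {k} sigma: {within_k_sigma(k):.4f}")
```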

Where the simple series breaks, and what replaces it

The term-by-term series is superb for moderate $z$ but, as the Computational Note in Section 23.7 warns, it degrades for large $z$. The individual terms $z^{2n+1}/(2^n n! (2n+1))$ grow large before the factorial finally overwhelms them (at $z = 6$ the largest is about $10^6$), so the partial sums involve enormous alternating quantities that nearly cancel. In floating-point arithmetic that near-cancellation destroys precision: the dreaded catastrophic cancellation. For $z \gtrsim 5$ or $6$ the naive series can no longer reliably resolve the tiny tail probability $1 - \Phi(z)$, and for tail areas it is useless.
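The cancellation is easy to observe. The sketch below lists the largest term of the series for several values of $z$ and compares the tail probability obtained from the naive sum with the one from Python's standard-library `math.erfc`. The helper names `series_terms` and `naive_upper_tail` and the 200-term cutoff are illustrative assumptions; the comparison is meant as a demonstration in double precision, not a benchmark.

```python
import math

def series_terms(z: float, count: int):
    """Yield the signed terms (-1)^n z^(2n+1) / (2^n n! (2n+1)) for n < count."""
    power = z                      # z^(2n+1) / (2^n n!), starting at n = 0
    for n in range(count):
        yield (-1) ** n * power / (2 * n + 1)
        power *= z * z / (2 * (n + 1))

def naive_upper_tail(z: float, count: int = 200) -> float:
    """1 - Phi(z) from the Maclaurin series: 0.5 - sum / sqrt(2 pi)."""
    return 0.5 - sum(series_terms(z, count)) / math.sqrt(2 * math.pi)

for z in (1.0, 3.0, 6.0, 8.0):
    largest = max(abs(t) for t in series_terms(z, 200))
    reference = 0.5 * math.erfc(z / math.sqrt(2))   # library value of 1 - Phi(z)
    print(f"z = {z}: largest term {largest:.2e}, "
          f"series tail {naive_upper_tail(z):.3e}, library tail {reference:.3e}")
```

Each addition carries a rounding error of roughly the largest term times $2^{-53}$, so once that product approaches the size of the tail itself, the digits of the series answer stop meaning anything.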

Production libraries handle the tail differently. They work with the closely related error function $\operatorname{erf}(z) = \tfrac{2}{\sqrt\pi}\int_0^z e^{-t^2}\,dt$, which is tied to the normal CDF by $\Phi(z) = \tfrac12\bigl(1 + \operatorname{erf}(z/\sqrt2)\bigr)$. The classical strategy uses the Maclaurin series only near the origin and switches to an asymptotic expansion of the tail (a series that diverges if summed forever but gives excellent accuracy when truncated early) for large arguments; modern implementations often replace both with fitted rational approximations on each region, as in the Cody paper in the reading list. Two different series, each ruling its own region: the convergent Taylor series near the center, the asymptotic series far out. This pairing is a recurring pattern across numerical computing, and the bell curve is its canonical illustration.
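Both ideas can be tried with nothing beyond the Python standard library. The sketch below uses `math.erf` and `math.erfc` for the erf relation and, as an assumption not derived in this case study, the standard leading terms of the tail expansion $1 - \Phi(z) \sim \frac{\phi(z)}{z}\bigl(1 - \frac{1}{z^2} + \frac{3}{z^4} - \frac{15}{z^6} + \cdots\bigr)$. The function names and the four-term truncation are hypothetical choices.

```python
import math

def phi_erf(z: float) -> float:
    """Normal CDF via the error function: Phi(z) = (1 + erf(z / sqrt 2)) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def upper_tail_erfc(z: float) -> float:
    """1 - Phi(z) via erfc, avoiding the subtraction of two nearly equal numbers."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def upper_tail_asymptotic(z: float, terms: int = 4) -> float:
    """Truncated asymptotic expansion of 1 - Phi(z), intended for large z > 0."""
    density = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(2 * k + 1) / (z * z)   # 1, -1/z^2, 3/z^4, -15/z^6, ...
    return density / z * total

for z in (2.0, 4.0, 6.0):
    print(f"z = {z}: erfc tail {upper_tail_erfc(z):.4e}, "
          f"asymptotic tail {upper_tail_asymptotic(z):.4e}")
```

The asymptotic value is poor at $z = 2$ and close to the library value at $z = 6$, which is the division of labor described above: the expansion earns its keep only far out in the tail.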

Connections to the textbook

  • Section 23.7 — the anchor payoff, where $\int e^{-x^2}\,dx$ and $\Phi(z)$ are computed by term-by-term integration; this case study is its full worked narrative.
  • Section 23.3 — the license to integrate a power series term-by-term inside its radius of convergence.
  • Chapter 13 — where the bell curve and the "area = probability" idea were introduced, opening this anchor.
  • Chapter 14 — the Fundamental Theorem, plus the note that $e^{-x^2/2}$ has no elementary antiderivative (Liouville).
  • Chapter 22 — the alternating-series error estimate used to bound the truncation above.

Discussion questions

  1. The series for $e^{-x^2/2}$ converges for all $x$, yet the series for $\Phi(z)$ becomes numerically unreliable for large $z$. Distinguish mathematical convergence from numerical reliability in one or two sentences.
  2. Why does the alternating structure of the series make error bounding so easy (Chapter 22)? State the bound you would quote after truncating at the $z^9$ term.
  3. A p-value of $0.05$ in a two-sided z-test corresponds to $z \approx 1.96$. Outline how you would use the series of this case study to verify that $\Phi(1.96) \approx 0.975$.
  4. No elementary antiderivative of $e^{-x^2/2}$ exists, yet we computed its integral to four digits by hand. In what sense have we "solved" the integral, and in what sense have we not?
  5. Mathematicians give the unsolvable integral a name — the error function — and study it as a special function. Why is naming-and-tabulating a legitimate mathematical response to "no closed form"?

A short annotated reading list

  • Stewart, Calculus: Early Transcendentals (9th ed.), §11.10. Works the term-by-term integration of $e^{-x^2}$ as a headline example of why Taylor series matter; the direct parallel to Section 23.7.
  • OpenStax Calculus Volume 2, §6.4 (Working with Taylor Series). Free; derives the error-function series exactly as we did and discusses convergence speed.
  • Abramowitz, M., & Stegun, I. (1965). Handbook of Mathematical Functions. Dover — or its modern successor, the NIST Digital Library of Mathematical Functions, https://dlmf.nist.gov/. The classic source for the erf series and the asymptotic tail expansion mentioned above.
  • Cody, W. J. (1969). "Rational Chebyshev approximations for the error function." Mathematics of Computation, 23, 631–637. Shows what numerical libraries do once the naive Taylor series runs out of accuracy: piecewise rational approximations of the kind that underlie routines such as scipy.special.erf.

The area under the bell curve cannot be written as an elementary formula, but it can be written as a series. Every normal-based confidence interval you will ever report rests on integrating an infinite polynomial, one harmless power of $x$ at a time.