Key Takeaways: Sampling Distributions and the Central Limit Theorem

One-Sentence Summary

The Central Limit Theorem guarantees that the sampling distribution of the mean is approximately normal — regardless of the population's shape — with mean $\mu$ and standard error $\sigma / \sqrt{n}$, making it the single theorem that bridges probability to inference and enables confidence intervals, hypothesis tests, and every other tool you'll learn in the rest of this course.

Core Concepts at a Glance

| Concept | Definition | Why It Matters |
|---|---|---|
| Sampling distribution | The distribution of a statistic across all possible samples of a given size | Tells us how much a sample statistic varies from sample to sample |
| Sampling variability | The natural variation of statistics from sample to sample | Why different samples give different answers — the reason we need inference |
| Central Limit Theorem | The sampling distribution of $\bar{x}$ is approximately normal for large $n$ | Makes inference possible regardless of population shape |
| Standard error | The standard deviation of the sampling distribution ($\sigma / \sqrt{n}$) | Quantifies the precision of sample estimates |

Three Distributions: Don't Confuse Them

| Distribution | What It Describes | Shape | Spread |
|---|---|---|---|
| Population | Individual values in the entire population | Any shape (could be skewed, bimodal, etc.) | $\sigma$ |
| Sample | Individual values in one particular sample | Resembles the population | $s \approx \sigma$ |
| Sampling distribution | The statistic $\bar{x}$ across many samples | Approximately normal (CLT!) | $\text{SE} = \sigma / \sqrt{n}$ |

The key insight: individual values can follow any distribution, but their averages follow a normal distribution. This is the CLT.

The Central Limit Theorem: Three Guarantees

$$\bar{X} \stackrel{\text{approx.}}{\sim} N\!\left(\mu, \frac{\sigma^2}{n}\right)$$

| Guarantee | What It Says | In Words |
|---|---|---|
| Shape | Sampling distribution of $\bar{x}$ is approximately normal | Normality emerges from averaging, even from non-normal populations |
| Center | $\mu_{\bar{x}} = \mu$ | Sample means are centered on the true value (unbiased) |
| Spread | $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ | Larger samples give more precise estimates |
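All three guarantees can be checked empirically. The sketch below (the population choice, sample size, and seed are illustrative assumptions, not from the chapter) draws many samples from a strongly skewed population and compares the center and spread of the sample means against $\mu$ and $\sigma / \sqrt{n}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative skewed population: exponential with mean 10 (its SD is also 10)
mu, sigma, n = 10, 10, 40

# Draw 20,000 samples of size n and keep each sample's mean
means = rng.exponential(scale=mu, size=(20_000, n)).mean(axis=1)

print(f"center of sample means: {means.mean():.3f}  (CLT predicts mu = {mu})")
print(f"spread of sample means: {means.std():.3f}  (CLT predicts sigma/sqrt(n) = {sigma / np.sqrt(n):.3f})")
```

A histogram of `means` comes out bell-shaped even though the population is strongly right-skewed — that is the shape guarantee in action.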

When Is $n$ "Large Enough"?

| Population Shape | Minimum $n$ for CLT |
|---|---|
| Already normal | Any $n$ (even $n = 1$) |
| Roughly symmetric | $n \geq 15$ |
| Moderately skewed | $n \geq 30$ |
| Strongly skewed | $n \geq 50$ or more |
| Heavy tails / extreme outliers | Larger still |

Rule of thumb: $n \geq 30$ works for most practical situations.

Standard Error Formulas

For the Mean

$$\boxed{\text{SE}_{\bar{x}} = \frac{\sigma}{\sqrt{n}}}$$

When $\sigma$ is unknown (the usual case), estimate it:

$$\widehat{\text{SE}}_{\bar{x}} = \frac{s}{\sqrt{n}}$$
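As a quick sketch (the data here is simulated purely for illustration), the plug-in estimate uses the sample SD computed with the $n - 1$ divisor:

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.normal(loc=100, scale=15, size=50)  # hypothetical data; sigma is unknown in practice

s = sample.std(ddof=1)             # sample SD (ddof=1 gives the n-1 divisor)
se_hat = s / np.sqrt(len(sample))  # estimated standard error of the mean
print(f"s = {s:.2f}, estimated SE = {se_hat:.2f}")
```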

For a Proportion

$$\boxed{\text{SE}_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}}$$

CLT conditions for proportions: $np \geq 10$ and $n(1-p) \geq 10$

Standardization

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \qquad \text{or} \qquad z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$
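Using the same numbers as the Python quick reference in this chapter ($\mu = 126$, $\sigma = 18$, $n = 100$), standardization tells us how unusual a sample mean of 130 would be:

```python
import numpy as np
from scipy import stats

mu, sigma, n = 126, 18, 100     # values from the quick reference
x_bar = 130                     # hypothetical observed sample mean

se = sigma / np.sqrt(n)         # standard error = 1.8
z = (x_bar - mu) / se           # standardized sample mean
p_upper = 1 - stats.norm.cdf(z) # P(X_bar > 130) under the CLT

print(f"z = {z:.2f}, P(X_bar > {x_bar}) = {p_upper:.4f}")
```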

The Diminishing Returns of Larger Samples

$$\text{SE} = \frac{\sigma}{\sqrt{n}}$$

| To improve SE by... | You must multiply $n$ by... |
|---|---|
| 2× (cut in half) | 4 |
| 3× (cut to one-third) | 9 |
| 4× (cut to one-quarter) | 16 |
| 10× (cut to one-tenth) | 100 |

The rule: To halve the standard error, quadruple the sample size.

```
      SE
      ▲
  σ   │ ╲
      │   ╲
      │     ╲__
      │        ╲___
      │            ╲_____
      │                  ╲__________
  0   ┼───────────────────────────── n
      0                      ►
        Steep drop          Diminishing returns
        at first            at large n
```
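The quadrupling rule behind this curve can be verified in a couple of lines (the $\sigma = 18$ here is just an illustrative value):

```python
import numpy as np

sigma = 18                      # illustrative population SD
for n in [25, 100, 400, 1600]:
    print(f"n = {n:5d}  SE = {sigma / np.sqrt(n):.3f}")
# Each 4x jump in n halves the SE: 3.6 -> 1.8 -> 0.9 -> 0.45
```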

Law of Large Numbers vs. CLT

| | Law of Large Numbers | Central Limit Theorem |
|---|---|---|
| What it says | $\bar{x} \to \mu$ as $n \to \infty$ | The distribution of $\bar{x}$ is approximately normal for large $n$ |
| Focus | Where $\bar{x}$ goes (the center) | How $\bar{x}$ is distributed (shape + spread) |
| Key quantity | $\bar{x}$ gets close to $\mu$ | $\text{SE} = \sigma / \sqrt{n}$ |
| From chapter | Ch. 8 | Ch. 11 (this chapter) |

Both explain why statistics works: the LLN says sample means aim for the truth; the CLT says they're predictably distributed around the truth.
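A small simulation (the distribution choice and seed are assumed for illustration) shows the two ideas side by side:

```python
import numpy as np

rng = np.random.default_rng(2)

# LLN: one running mean of Uniform(0, 1) draws homes in on mu = 0.5
draws = rng.uniform(size=100_000)
running_mean = draws.cumsum() / np.arange(1, draws.size + 1)
print(f"mean after 100 draws: {running_mean[99]:.4f}, after 100,000: {running_mean[-1]:.4f}")

# CLT: across many samples of fixed size n, the means spread as sigma/sqrt(n)
n = 50
means = rng.uniform(size=(10_000, n)).mean(axis=1)
sigma = np.sqrt(1 / 12)  # SD of Uniform(0, 1)
print(f"empirical SD of means: {means.std():.4f}  vs  sigma/sqrt(n): {sigma / np.sqrt(n):.4f}")
```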

CLT Conditions and Cautions

Requirements

  1. Random sample (or random assignment)
  2. Independence (10% condition: $n \leq 0.10 \times N$)
  3. Sufficiently large $n$ (depends on population shape)

What the CLT Does NOT Fix

| Problem | Why CLT Can't Help |
|---|---|
| Biased sampling | CLT centers the sampling distribution on the biased target, not the true population value |
| Dependent observations | SE formula assumes independence; correlated data has larger true variability |
| Populations without finite variance | CLT requires $\sigma < \infty$ |
| Very small samples from skewed populations | The normal approximation is poor |

Worked Example Summary: Sam's Daria Analysis

| Quantity | Value |
|---|---|
| Null assumption | $p = 0.31$ (no improvement) |
| Sample size | $n = 65$ attempts |
| Observed proportion | $\hat{p} = 0.38$ |
| Standard error | $\sqrt{0.31 \times 0.69 / 65} \approx 0.057$ |
| $z$-score | $(0.38 - 0.31) / 0.057 \approx 1.23$ |
| $P(Z > 1.23)$ | $\approx 0.109$ (11%) |
| Conclusion | Not conclusive — 11% chance of seeing this by luck |
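These numbers can be reproduced directly (small differences in the last digit come from rounding the SE to 0.057 before computing $z$):

```python
import numpy as np
from scipy import stats

p0, n, p_hat = 0.31, 65, 0.38    # null proportion, attempts, observed proportion

se = np.sqrt(p0 * (1 - p0) / n)  # standard error under the null
z = (p_hat - p0) / se            # standardized distance from the null
p_value = 1 - stats.norm.cdf(z)  # P(Z > z), the chance of seeing this by luck

print(f"SE = {se:.3f}, z = {z:.2f}, P = {p_value:.3f}")
```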

Python Quick Reference

```python
import numpy as np
from scipy import stats

# --- Standard error calculations ---
sigma = 18    # Population SD (or use sample SD s)
n = 100       # Sample size
se = sigma / np.sqrt(n)

# Standard error for a proportion
p = 0.35
se_p = np.sqrt(p * (1 - p) / n)

# --- CLT probabilities ---
# P(x_bar > 130) for mu = 126, using the SE computed above
mu = 126
prob = 1 - stats.norm.cdf(130, loc=mu, scale=se)

# --- CLT simulation ---
population = np.random.exponential(scale=10, size=100_000)
sample_means = [np.random.choice(population, size=30).mean()
                for _ in range(10_000)]
# sample_means will be approximately normal!

# --- Verify CLT conditions for proportions ---
print(f"np = {n * p:.1f} >= 10? {n * p >= 10}")
print(f"n(1-p) = {n * (1-p):.1f} >= 10? {n * (1-p) >= 10}")
```

Common Misconceptions

| Misconception | Reality |
|---|---|
| "The CLT says large samples are normally distributed" | The CLT says sample means have an approximately normal distribution — the sample data retains the population's shape |
| "n = 30 is always enough" | It's a guideline, not a rule; very skewed or heavy-tailed populations need more |
| "Standard error and standard deviation are the same thing" | SD measures spread of individual values; SE measures spread of sample means |
| "Larger samples are always worth the cost" | Diminishing returns: quadruple the sample to halve the SE |
| "The CLT eliminates the need for random sampling" | The CLT tells you the shape of the sampling distribution; random sampling tells you whether the center is correct |
| "I can't use the CLT if I don't know $\sigma$" | In practice, use $s$ (sample SD) as an estimate; works well for $n \geq 30$ |

The One Thing to Remember

If you forget everything else from this chapter, remember this:

The Central Limit Theorem is why statistics works. It guarantees that sample means are approximately normally distributed — regardless of the population's shape — with a spread that shrinks as $\sigma / \sqrt{n}$. This means every sample mean you compute comes from a known, predictable distribution. That predictability is what makes confidence intervals possible (Chapter 12), hypothesis tests valid (Chapter 13), and A/B testing reliable (Chapter 16). Without the CLT, we'd compute sample means and have no idea how close they were to the truth. With it, we know the pattern of uncertainty — and that knowledge transforms data from a pile of numbers into a basis for decisions.

Key Terms

| Term | Definition |
|---|---|
| Sampling distribution | The distribution of a statistic (like $\bar{x}$ or $\hat{p}$) computed from all possible random samples of the same size from the same population |
| Central Limit Theorem (CLT) | The theorem stating that the sampling distribution of $\bar{x}$ is approximately normal for large $n$, regardless of population shape, with mean $\mu$ and SD $\sigma / \sqrt{n}$ |
| Standard error (SE) | The standard deviation of the sampling distribution; measures how much a statistic typically varies from sample to sample; $\text{SE}_{\bar{x}} = \sigma / \sqrt{n}$ |
| Sampling variability | The natural variation in a statistic that occurs because different random samples contain different individuals |
| Sample size ($n$) | The number of observations in a sample; larger $n$ reduces standard error and improves the CLT approximation |
| Law of large numbers (revisited) | As $n$ increases, $\bar{x}$ converges to $\mu$; related to but distinct from the CLT |
| Sampling distribution of the mean | The distribution of $\bar{x}$ across all possible samples; central object of the CLT |
| Sampling distribution of the proportion | The distribution of $\hat{p}$ across all possible samples; approximately $N(p, p(1-p)/n)$ for large $n$ |