Key Takeaways: Sampling Distributions and the Central Limit Theorem
One-Sentence Summary
The Central Limit Theorem guarantees that the sampling distribution of the mean is approximately normal — regardless of the population's shape — with mean $\mu$ and standard error $\sigma / \sqrt{n}$, making it the single theorem that bridges probability to inference and enables confidence intervals, hypothesis tests, and every other tool you'll learn in the rest of this course.
Core Concepts at a Glance
| Concept | Definition | Why It Matters |
|---|---|---|
| Sampling distribution | The distribution of a statistic across all possible samples of a given size | Tells us how much a sample statistic varies from sample to sample |
| Sampling variability | The natural variation of statistics from sample to sample | Why different samples give different answers — the reason we need inference |
| Central Limit Theorem | The sampling distribution of $\bar{x}$ is approximately normal for large $n$ | Makes inference possible regardless of population shape |
| Standard error | The standard deviation of the sampling distribution ($\sigma / \sqrt{n}$) | Quantifies the precision of sample estimates |
Three Distributions: Don't Confuse Them
| Distribution | What It Describes | Shape | Spread |
|---|---|---|---|
| Population | Individual values in the entire population | Any shape (could be skewed, bimodal, etc.) | $\sigma$ |
| Sample | Individual values in one particular sample | Resembles the population | $s \approx \sigma$ |
| Sampling distribution | The statistic $\bar{x}$ across many samples | Approximately normal (CLT!) | $\text{SE} = \sigma / \sqrt{n}$ |
The key insight: individual values can follow any distribution, but their averages follow a normal distribution. This is the CLT.
The Central Limit Theorem: Three Guarantees
$$\bar{X} \stackrel{\text{approx.}}{\sim} N\!\left(\mu, \frac{\sigma^2}{n}\right)$$
| Guarantee | What It Says | In Words |
|---|---|---|
| Shape | Sampling distribution of $\bar{x}$ is approximately normal | Normality emerges from averaging, even from non-normal populations |
| Center | $\mu_{\bar{x}} = \mu$ | Sample means are centered on the true value (unbiased) |
| Spread | $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ | Larger samples give more precise estimates |
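All three guarantees can be checked directly by simulation. The sketch below uses illustrative choices (an exponential population with $\mu = \sigma = 10$, $n = 50$, and a fixed seed — none of these come from the chapter) and compares the simulated center, spread, and shape against the CLT's predictions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Illustrative skewed population: exponential with mu = sigma = 10
mu, sigma, n = 10, 10, 50

# Draw 20,000 samples of size n and record each sample mean
sample_means = rng.exponential(scale=mu, size=(20_000, n)).mean(axis=1)

print(f"center:   {sample_means.mean():.2f}  (CLT predicts {mu})")
print(f"spread:   {sample_means.std():.3f}  (CLT predicts {sigma / np.sqrt(n):.3f})")
print(f"skewness: {stats.skew(sample_means):.2f}  (population skewness is 2)")
```

The skewness of the sample means comes out far below the population's skewness of 2 — normality emerging from averaging, exactly as the Shape guarantee claims.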
When Is $n$ "Large Enough"?
| Population Shape | Minimum $n$ for CLT |
|---|---|
| Already normal | Any $n$ (even 1) |
| Roughly symmetric | $n \geq 15$ |
| Moderately skewed | $n \geq 30$ |
| Strongly skewed | $n \geq 50$ or more |
| Heavy tails / extreme outliers | Larger still |
Rule of thumb: $n \geq 30$ works for most practical situations.
Standard Error Formulas
For the Mean
$$\boxed{\text{SE}_{\bar{x}} = \frac{\sigma}{\sqrt{n}}}$$
When $\sigma$ is unknown (the usual case), estimate it:
$$\widehat{\text{SE}}_{\bar{x}} = \frac{s}{\sqrt{n}}$$
For a Proportion
$$\boxed{\text{SE}_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}}$$
CLT conditions for proportions: $np \geq 10$ and $n(1-p) \geq 10$
Standardization
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \qquad \text{or} \qquad z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$
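Both standardizations are one line of arithmetic each. A minimal sketch — all numbers here are illustrative, not from the chapter:

```python
import numpy as np
from scipy import stats

# z for a sample mean (illustrative: mu = 126, sigma = 18, n = 100)
mu, sigma, n = 126, 18, 100
x_bar = 130
z_mean = (x_bar - mu) / (sigma / np.sqrt(n))      # 4 / 1.8 ≈ 2.22

# z for a sample proportion (illustrative: p = 0.35, n = 200)
p, m = 0.35, 200
p_hat = 0.40
z_prop = (p_hat - p) / np.sqrt(p * (1 - p) / m)

# Either z converts to a tail probability the same way
print(f"z_mean = {z_mean:.2f}, tail prob = {1 - stats.norm.cdf(z_mean):.4f}")
print(f"z_prop = {z_prop:.2f}")
```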
The Diminishing Returns of Larger Samples
$$\text{SE} = \frac{\sigma}{\sqrt{n}}$$
| To improve SE by... | You must multiply $n$ by... |
|---|---|
| 2× (cut in half) | 4 |
| 3× (cut to one-third) | 9 |
| 4× (cut to one-quarter) | 16 |
| 10× (cut to one-tenth) | 100 |
The rule: To halve the standard error, quadruple the sample size.
```
SE
 ▲
σ┤╲
 │ ╲
 │  ╲__
 │     ╲___
 │         ╲_____
 │               ╲__________
0┼─────────────────────────────► n
   Steep drop        Diminishing returns
   at first          at large n
```
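The table above translates directly into code. A quick sketch (the population SD and starting sample size are arbitrary illustrative values):

```python
import numpy as np

sigma, n0 = 20, 25             # illustrative population SD and starting size
se0 = sigma / np.sqrt(n0)      # baseline SE = 20 / 5 = 4.0

for factor in [4, 9, 16, 100]:
    n = n0 * factor
    se = sigma / np.sqrt(n)
    print(f"n x {factor:3d} -> n = {n:5d}, SE = {se:.2f} ({se0 / se:.0f}x better)")
```

Multiplying $n$ by 4 cuts the SE exactly in half; multiplying by 100 only buys a 10× improvement.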
Law of Large Numbers vs. CLT
| | Law of Large Numbers | Central Limit Theorem |
|---|---|---|
| What it says | $\bar{x} \to \mu$ as $n \to \infty$ | The distribution of $\bar{x}$ is approximately normal for large $n$ |
| Focus | Where $\bar{x}$ goes (the center) | How $\bar{x}$ is distributed (shape + spread) |
| Key quantity | $\bar{x}$ gets close to $\mu$ | $\text{SE} = \sigma / \sqrt{n}$ |
| From Chapter | Ch.8 | Ch.11 (this chapter) |
Both explain why statistics works: the LLN says sample means aim for the truth; the CLT says they're predictably distributed around the truth.
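The distinction shows up cleanly in a simulation sketch (population and seed are illustrative): the LLN is about one running mean converging; the CLT is about the distribution of many means.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 10  # illustrative exponential population: mean = SD = 10

# LLN: the running mean of ONE growing sample homes in on mu
draws = rng.exponential(scale=mu, size=100_000)
running_mean = np.cumsum(draws) / np.arange(1, draws.size + 1)
print(f"running mean at n = 100,000: {running_mean[-1]:.3f}")

# CLT: across MANY samples of size 100, the means are ~ N(mu, sigma^2/100)
means = rng.exponential(scale=mu, size=(10_000, 100)).mean(axis=1)
print(f"SD of sample means: {means.std():.3f}  (CLT predicts {mu / 10:.1f})")
```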
CLT Conditions and Cautions
Requirements
- Random sample (or random assignment)
- Independence (10% condition: $n \leq 0.10 \times N$)
- Sufficiently large $n$ (depends on population shape)
What the CLT Does NOT Fix
| Problem | Why CLT Can't Help |
|---|---|
| Biased sampling | CLT centers the sampling distribution on the biased target, not the true population value |
| Dependent observations | SE formula assumes independence; correlated data has larger true variability |
| Populations without finite variance | CLT requires $\sigma < \infty$ |
| Very small samples from skewed populations | The normal approximation is poor |
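The "no finite variance" row can be demonstrated. The standard Cauchy distribution has no finite variance, and the mean of $n$ Cauchy draws is itself standard Cauchy — so averaging never tightens the distribution. A sketch (simulation sizes and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# The mean of n standard Cauchy draws is again standard Cauchy, so the
# interquartile range of sample means stays near 2 no matter how big n is.
iqrs = []
for n in [10, 1000]:
    means = rng.standard_cauchy(size=(20_000, n)).mean(axis=1)
    q25, q75 = np.percentile(means, [25, 75])
    iqrs.append(q75 - q25)
    print(f"n = {n:4d}: IQR of sample means = {q75 - q25:.2f}")
```

Contrast this with a finite-variance population, where the same experiment would show the IQR shrinking by a factor of 10 between $n = 10$ and $n = 1000$.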
Worked Example Summary: Sam's Daria Analysis
| Quantity | Value |
|---|---|
| Null assumption | $p = 0.31$ (no improvement) |
| Sample size | $n = 65$ attempts |
| Observed proportion | $\hat{p} = 0.38$ |
| Standard error | $\sqrt{0.31 \times 0.69 / 65} \approx 0.057$ |
| $z$-score | $(0.38 - 0.31) / 0.057 \approx 1.23$ |
| $P(Z > 1.23)$ | $\approx 0.109$ (11%) |
| Conclusion | Not conclusive — 11% chance of seeing this by luck |
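The table can be reproduced in a few lines. This is a sketch of the same arithmetic; small differences from the table's values come from carrying full precision in the SE rather than rounding to 0.057:

```python
import numpy as np
from scipy import stats

# Null hypothesis: success rate is still p = 0.31
p0, n = 0.31, 65
p_hat = 0.38

# Check the CLT conditions for a proportion first
assert n * p0 >= 10 and n * (1 - p0) >= 10

se = np.sqrt(p0 * (1 - p0) / n)     # ≈ 0.057
z = (p_hat - p0) / se               # ≈ 1.22 (1.23 with the rounded SE)
p_value = 1 - stats.norm.cdf(z)     # ≈ 0.11

print(f"SE = {se:.3f}, z = {z:.2f}, P(Z > z) = {p_value:.3f}")
```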
Python Quick Reference

```python
import numpy as np
from scipy import stats

# --- Standard error calculations ---
sigma = 18   # population SD (or use the sample SD s)
n = 100      # sample size
se = sigma / np.sqrt(n)

# Standard error for a proportion
p = 0.35
se_p = np.sqrt(p * (1 - p) / n)

# --- CLT probabilities ---
# P(x_bar > 130) when the population mean is mu = 126
mu = 126
prob = 1 - stats.norm.cdf(130, loc=mu, scale=se)

# --- CLT simulation ---
population = np.random.exponential(scale=10, size=100_000)
sample_means = [np.random.choice(population, size=30).mean()
                for _ in range(10_000)]
# sample_means will be approximately normal!

# --- Verify CLT conditions for proportions ---
print(f"np = {n * p:.1f} >= 10? {n * p >= 10}")
print(f"n(1-p) = {n * (1-p):.1f} >= 10? {n * (1-p) >= 10}")
```
Common Misconceptions
| Misconception | Reality |
|---|---|
| "The CLT says large samples are normally distributed" | The CLT says sample means have an approximately normal distribution — the sample data retains the population's shape |
| "n = 30 is always enough" | It's a guideline, not a rule; very skewed or heavy-tailed populations need more |
| "Standard error and standard deviation are the same thing" | SD measures spread of individual values; SE measures spread of sample means |
| "Larger samples are always worth the cost" | Diminishing returns: quadruple the sample to halve the SE |
| "The CLT eliminates the need for random sampling" | The CLT tells you the shape of the sampling distribution; random sampling tells you whether the center is correct |
| "I can't use the CLT if I don't know $\sigma$" | In practice, use $s$ (sample SD) as an estimate; works well for $n \geq 30$ |
The One Thing to Remember
If you forget everything else from this chapter, remember this:
The Central Limit Theorem is why statistics works. It guarantees that sample means are approximately normally distributed — regardless of the population's shape — with a spread that shrinks as $\sigma / \sqrt{n}$. This means every sample mean you compute comes from a known, predictable distribution. That predictability is what makes confidence intervals possible (Chapter 12), hypothesis tests valid (Chapter 13), and A/B testing reliable (Chapter 16). Without the CLT, we'd compute sample means and have no idea how close they were to the truth. With it, we know the pattern of uncertainty — and that knowledge transforms data from a pile of numbers into a basis for decisions.
Key Terms
| Term | Definition |
|---|---|
| Sampling distribution | The distribution of a statistic (like $\bar{x}$ or $\hat{p}$) computed from all possible random samples of the same size from the same population |
| Central Limit Theorem (CLT) | The theorem stating that the sampling distribution of $\bar{x}$ is approximately normal for large $n$, regardless of population shape, with mean $\mu$ and SD $\sigma / \sqrt{n}$ |
| Standard error (SE) | The standard deviation of the sampling distribution; measures how much a statistic typically varies from sample to sample; $\text{SE}_{\bar{x}} = \sigma / \sqrt{n}$ |
| Sampling variability | The natural variation in a statistic that occurs because different random samples contain different individuals |
| Sample size ($n$) | The number of observations in a sample; larger $n$ reduces standard error and improves CLT approximation |
| Law of large numbers (revisited) | As $n$ increases, $\bar{x}$ converges to $\mu$; related to but distinct from the CLT |
| Sampling distribution of the mean | The distribution of $\bar{x}$ across all possible samples; central object of the CLT |
| Sampling distribution of the proportion | The distribution of $\hat{p}$ across all possible samples; approximately $N(p, p(1-p)/n)$ for large $n$ |