Key Takeaways: Sampling Distributions and the Central Limit Theorem

One-Sentence Summary

The Central Limit Theorem guarantees that the sampling distribution of the mean is approximately normal — regardless of the population's shape — with mean $\mu$ and standard error $\sigma / \sqrt{n}$, making it the single theorem that bridges probability to inference and enables confidence intervals, hypothesis tests, and every other tool you'll learn in the rest of this course.

Core Concepts at a Glance

| Concept | Definition | Why It Matters |
|---|---|---|
| Sampling distribution | The distribution of a statistic across all possible samples of a given size | Tells us how much a sample statistic varies from sample to sample |
| Sampling variability | The natural variation of statistics from sample to sample | Why different samples give different answers — the reason we need inference |
| Central Limit Theorem | The sampling distribution of $\bar{x}$ is approximately normal for large $n$ | Makes inference possible regardless of population shape |
| Standard error | The standard deviation of the sampling distribution ($\sigma / \sqrt{n}$) | Quantifies the precision of sample estimates |

Three Distributions: Don't Confuse Them

| Distribution | What It Describes | Shape | Spread |
|---|---|---|---|
| Population | Individual values in the entire population | Any shape (could be skewed, bimodal, etc.) | $\sigma$ |
| Sample | Individual values in one particular sample | Resembles the population | $s \approx \sigma$ |
| Sampling distribution | The statistic $\bar{x}$ across many samples | Approximately normal (CLT!) | $\text{SE} = \sigma / \sqrt{n}$ |

The key insight: individual values can follow any distribution, but their averages follow a normal distribution. This is the CLT.

The Central Limit Theorem: Three Guarantees

$$\bar{X} \stackrel{\text{approx.}}{\sim} N\!\left(\mu, \frac{\sigma^2}{n}\right)$$

| Guarantee | What It Says | In Words |
|---|---|---|
| Shape | Sampling distribution of $\bar{x}$ is approximately normal | Normality emerges from averaging, even from non-normal populations |
| Center | $\mu_{\bar{x}} = \mu$ | Sample means are centered on the true value (unbiased) |
| Spread | $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ | Larger samples give more precise estimates |
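All three guarantees can be checked empirically. The sketch below (the population choice, sample size, and seed are illustrative assumptions, not from the chapter) draws many samples from a strongly skewed population and compares the center and spread of the sample means against $\mu$ and $\sigma / \sqrt{n}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative skewed population: exponential with mean 10 (its SD is also 10)
mu, sigma, n = 10, 10, 40

# Draw 20,000 samples of size n and keep each sample's mean
means = rng.exponential(scale=mu, size=(20_000, n)).mean(axis=1)

print(f"center of sample means: {means.mean():.3f}  (CLT predicts mu = {mu})")
print(f"spread of sample means: {means.std():.3f}  (CLT predicts sigma/sqrt(n) = {sigma / np.sqrt(n):.3f})")
```

A histogram of `means` comes out bell-shaped even though the population is strongly right-skewed — that is the shape guarantee in action.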

When Is $n$ "Large Enough"?

| Population Shape | Minimum $n$ for CLT |
|---|---|
| Already normal | Any $n$ (even $n = 1$) |
| Roughly symmetric | $n \geq 15$ |
| Moderately skewed | $n \geq 30$ |
| Strongly skewed | $n \geq 50$ or more |
| Heavy tails / extreme outliers | Larger still |

Rule of thumb: $n \geq 30$ works for most practical situations.

Standard Error Formulas

For the Mean

$$\boxed{\text{SE}_{\bar{x}} = \frac{\sigma}{\sqrt{n}}}$$

When $\sigma$ is unknown (the usual case), estimate it:

$$\widehat{\text{SE}}_{\bar{x}} = \frac{s}{\sqrt{n}}$$
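As a quick sketch (the data here is simulated purely for illustration), the plug-in estimate uses the sample SD computed with the $n - 1$ divisor:

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.normal(loc=100, scale=15, size=50)  # hypothetical data; sigma is unknown in practice

s = sample.std(ddof=1)             # sample SD (ddof=1 gives the n-1 divisor)
se_hat = s / np.sqrt(len(sample))  # estimated standard error of the mean
print(f"s = {s:.2f}, estimated SE = {se_hat:.2f}")
```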

For a Proportion

$$\boxed{\text{SE}_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}}$$

CLT conditions for proportions: $np \geq 10$ and $n(1-p) \geq 10$

Standardization

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \qquad \text{or} \qquad z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$
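Using the same numbers as the Python quick reference in this chapter ($\mu = 126$, $\sigma = 18$, $n = 100$), standardization tells us how unusual a sample mean of 130 would be:

```python
import numpy as np
from scipy import stats

mu, sigma, n = 126, 18, 100     # values from the quick reference
x_bar = 130                     # hypothetical observed sample mean

se = sigma / np.sqrt(n)         # standard error = 1.8
z = (x_bar - mu) / se           # standardized sample mean
p_upper = 1 - stats.norm.cdf(z) # P(X_bar > 130) under the CLT

print(f"z = {z:.2f}, P(X_bar > {x_bar}) = {p_upper:.4f}")
```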

The Diminishing Returns of Larger Samples

$$\text{SE} = \frac{\sigma}{\sqrt{n}}$$

| To improve SE by... | You must multiply $n$ by... |
|---|---|
| 2× (cut in half) | 4 |
| 3× (cut to one-third) | 9 |
| 4× (cut to one-quarter) | 16 |
| 10× (cut to one-tenth) | 100 |

The rule: To halve the standard error, quadruple the sample size.

```
      SE
      ▲
  σ   │ ╲
      │   ╲
      │     ╲__
      │        ╲___
      │            ╲_____
      │                  ╲__________
  0   ┼───────────────────────────── n
      0                      ►
        Steep drop          Diminishing returns
        at first            at large n
```
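The quadrupling rule behind this curve can be verified in a couple of lines (the $\sigma = 18$ here is just an illustrative value):

```python
import numpy as np

sigma = 18                      # illustrative population SD
for n in [25, 100, 400, 1600]:
    print(f"n = {n:5d}  SE = {sigma / np.sqrt(n):.3f}")
# Each 4x jump in n halves the SE: 3.6 -> 1.8 -> 0.9 -> 0.45
```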

Law of Large Numbers vs. CLT

| | Law of Large Numbers | Central Limit Theorem |
|---|---|---|
| What it says | $\bar{x} \to \mu$ as $n \to \infty$ | The distribution of $\bar{x}$ is approximately normal for large $n$ |
| Focus | Where $\bar{x}$ goes (the center) | How $\bar{x}$ is distributed (shape + spread) |
| Key quantity | $\bar{x}$ gets close to $\mu$ | $\text{SE} = \sigma / \sqrt{n}$ |
| From chapter | Ch. 8 | Ch. 11 (this chapter) |

Both explain why statistics works: the LLN says sample means aim for the truth; the CLT says they're predictably distributed around the truth.
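A small simulation (the distribution choice and seed are assumed for illustration) shows the two ideas side by side:

```python
import numpy as np

rng = np.random.default_rng(2)

# LLN: one running mean of Uniform(0, 1) draws homes in on mu = 0.5
draws = rng.uniform(size=100_000)
running_mean = draws.cumsum() / np.arange(1, draws.size + 1)
print(f"mean after 100 draws: {running_mean[99]:.4f}, after 100,000: {running_mean[-1]:.4f}")

# CLT: across many samples of fixed size n, the means spread as sigma/sqrt(n)
n = 50
means = rng.uniform(size=(10_000, n)).mean(axis=1)
sigma = np.sqrt(1 / 12)  # SD of Uniform(0, 1)
print(f"empirical SD of means: {means.std():.4f}  vs  sigma/sqrt(n): {sigma / np.sqrt(n):.4f}")
```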

CLT Conditions and Cautions

Requirements

  1. Random sample (or random assignment)
  2. Independence (10% condition: $n \leq 0.10 \times N$)
  3. Sufficiently large $n$ (depends on population shape)

What the CLT Does NOT Fix

| Problem | Why CLT Can't Help |
|---|---|
| Biased sampling | CLT centers the sampling distribution on the biased target, not the true population value |
| Dependent observations | SE formula assumes independence; correlated data has larger true variability |
| Populations without finite variance | CLT requires $\sigma < \infty$ |
| Very small samples from skewed populations | The normal approximation is poor |

Worked Example Summary: Sam's Daria Analysis

| Quantity | Value |
|---|---|
| Null assumption | $p = 0.31$ (no improvement) |
| Sample size | $n = 65$ attempts |
| Observed proportion | $\hat{p} = 0.38$ |
| Standard error | $\sqrt{0.31 \times 0.69 / 65} \approx 0.057$ |
| $z$-score | $(0.38 - 0.31) / 0.057 \approx 1.23$ |
| $P(Z > 1.23)$ | $\approx 0.109$ (11%) |
| Conclusion | Not conclusive — 11% chance of seeing this by luck |
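These numbers can be reproduced directly (small differences in the last digit come from rounding the SE to 0.057 before computing $z$):

```python
import numpy as np
from scipy import stats

p0, n, p_hat = 0.31, 65, 0.38    # null proportion, attempts, observed proportion

se = np.sqrt(p0 * (1 - p0) / n)  # standard error under the null
z = (p_hat - p0) / se            # standardized distance from the null
p_value = 1 - stats.norm.cdf(z)  # P(Z > z), the chance of seeing this by luck

print(f"SE = {se:.3f}, z = {z:.2f}, P = {p_value:.3f}")
```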

Python Quick Reference

```python
import numpy as np
from scipy import stats

# --- Standard error calculations ---
sigma = 18    # Population SD (or use sample SD s)
n = 100       # Sample size
se = sigma / np.sqrt(n)

# Standard error for a proportion
p = 0.35
se_p = np.sqrt(p * (1 - p) / n)

# --- CLT probabilities ---
# P(x_bar > 130) for mu = 126, using the SE computed above
mu = 126
prob = 1 - stats.norm.cdf(130, loc=mu, scale=se)

# --- CLT simulation ---
population = np.random.exponential(scale=10, size=100_000)
sample_means = [np.random.choice(population, size=30).mean()
                for _ in range(10_000)]
# sample_means will be approximately normal!

# --- Verify CLT conditions for proportions ---
print(f"np = {n * p:.1f} >= 10? {n * p >= 10}")
print(f"n(1-p) = {n * (1-p):.1f} >= 10? {n * (1-p) >= 10}")
```

Common Misconceptions

| Misconception | Reality |
|---|---|
| "The CLT says large samples are normally distributed" | The CLT says sample means have an approximately normal distribution — the sample data retains the population's shape |
| "n = 30 is always enough" | It's a guideline, not a rule; very skewed or heavy-tailed populations need more |
| "Standard error and standard deviation are the same thing" | SD measures spread of individual values; SE measures spread of sample means |
| "Larger samples are always worth the cost" | Diminishing returns: quadruple the sample to halve the SE |
| "The CLT eliminates the need for random sampling" | The CLT tells you the shape of the sampling distribution; random sampling tells you whether the center is correct |
| "I can't use the CLT if I don't know $\sigma$" | In practice, use $s$ (sample SD) as an estimate; works well for $n \geq 30$ |

The One Thing to Remember

If you forget everything else from this chapter, remember this:

The Central Limit Theorem is why statistics works. It guarantees that sample means are approximately normally distributed — regardless of the population's shape — with a spread that shrinks as $\sigma / \sqrt{n}$. This means every sample mean you compute comes from a known, predictable distribution. That predictability is what makes confidence intervals possible (Chapter 12), hypothesis tests valid (Chapter 13), and A/B testing reliable (Chapter 16). Without the CLT, we'd compute sample means and have no idea how close they were to the truth. With it, we know the pattern of uncertainty — and that knowledge transforms data from a pile of numbers into a basis for decisions.

Key Terms

| Term | Definition |
|---|---|
| Sampling distribution | The distribution of a statistic (like $\bar{x}$ or $\hat{p}$) computed from all possible random samples of the same size from the same population |
| Central Limit Theorem (CLT) | The theorem stating that the sampling distribution of $\bar{x}$ is approximately normal for large $n$, regardless of population shape, with mean $\mu$ and SD $\sigma / \sqrt{n}$ |
| Standard error (SE) | The standard deviation of the sampling distribution; measures how much a statistic typically varies from sample to sample; $\text{SE}_{\bar{x}} = \sigma / \sqrt{n}$ |
| Sampling variability | The natural variation in a statistic that occurs because different random samples contain different individuals |
| Sample size ($n$) | The number of observations in a sample; larger $n$ reduces standard error and improves the CLT approximation |
| Law of large numbers (revisited) | As $n$ increases, $\bar{x}$ converges to $\mu$; related to but distinct from the CLT |
| Sampling distribution of the mean | The distribution of $\bar{x}$ across all possible samples; central object of the CLT |
| Sampling distribution of the proportion | The distribution of $\hat{p}$ across all possible samples; approximately $N(p, p(1-p)/n)$ for large $n$ |