Key Takeaways: Nonparametric Methods

One-Sentence Summary

Nonparametric methods — including the sign test, Wilcoxon rank-sum (Mann-Whitney U), Wilcoxon signed-rank, and Kruskal-Wallis tests — replace raw data values with ranks to compare groups without requiring normality, making them ideal for ordinal data, small samples, and datasets with heavy outliers, at a cost of only about 5% efficiency when normality actually holds.

Core Concepts at a Glance

| Concept | Definition | Why It Matters |
|---|---|---|
| Nonparametric test | A hypothesis test that does not assume a specific probability distribution (e.g., normality) for the population | Broadens your toolkit to handle ordinal data, small non-normal samples, and outlier-heavy data |
| Rank-based methods | Tests that convert raw data values to ranks (1st, 2nd, 3rd, ...) before analysis | Ranks are unaffected by outliers and don't require equal intervals — the key insight behind most nonparametric tests |
| Distribution-free | A synonym for nonparametric: the test works regardless of the population distribution's shape | Removes the most common roadblock to valid inference — the normality assumption |
| Power tradeoff | Nonparametric tests lose ~5% efficiency vs. parametric tests when normality holds, but can be more powerful when it doesn't | The "insurance premium" for robustness is remarkably small — and sometimes you get more power, not less |

The Nonparametric Toolkit

When to Use Each Test

| Scenario | Parametric Test | Nonparametric Alternative | Use Nonparametric When... |
|---|---|---|---|
| Two independent groups | Two-sample t-test (Ch.16) | Wilcoxon rank-sum / Mann-Whitney U | Small $n$ + non-normal; ordinal data; heavy outliers |
| Paired / matched data | Paired t-test (Ch.16) | Wilcoxon signed-rank test | Small $n$ + non-normal differences; ordinal data |
| Paired data (simplest) | Paired t-test (Ch.16) | Sign test | Very small $n$; only the direction of each difference is needed |
| Three or more groups | One-way ANOVA (Ch.20) | Kruskal-Wallis test | Small $n$ per group + non-normal; ordinal data |

Information Used by Each Test

| Test | What It Uses | What It Ignores |
|---|---|---|
| Sign test | Direction of differences (+/−) | Magnitude of differences |
| Wilcoxon signed-rank | Direction AND rank of differences | Exact size of differences |
| Mann-Whitney U | Relative ordering (ranks) of all observations | Exact values |
| Kruskal-Wallis | Mean ranks across $k$ groups | Exact values |
| t-test / ANOVA | Exact values (means, variances) | Nothing — uses all information |

More information used = more power (when assumptions hold). Less information used = more robustness (when assumptions fail).

The Ranking Procedure

  1. Combine all observations from all groups into a single list
  2. Sort from smallest to largest
  3. Assign ranks 1, 2, 3, ... from smallest to largest
  4. Handle ties with average ranks (midranks)
  5. Return each rank to its original group

Quick check: Sum of all ranks = $N(N+1)/2$
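The ranking steps above can be sketched with `scipy.stats.rankdata`, which assigns midranks to ties by default. The two small groups below are made-up values for illustration:

```python
import numpy as np
from scipy import stats

# Steps 1-2: combine all observations from both (hypothetical) groups
group_a = [12.1, 15.3, 9.8, 15.3]
group_b = [14.0, 9.8, 20.5]
combined = np.array(group_a + group_b)

# Steps 3-4: rank 1..N from smallest to largest; ties get average ranks
ranks = stats.rankdata(combined)   # method='average' is the default

# Quick check: ranks must sum to N(N+1)/2 regardless of ties
N = len(combined)
assert ranks.sum() == N * (N + 1) / 2

# Step 5: return each rank to its original group
ranks_a, ranks_b = ranks[:len(group_a)], ranks[len(group_a):]
print(ranks_a)   # the two 9.8s share midrank 1.5; the two 15.3s share 5.5
```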

Test Procedures

Sign Test (Paired Data)

  1. Compute differences $d_i$
  2. Drop zeros
  3. Count positives ($n^+$) and negatives ($n^-$)
  4. Under $H_0$: $n^+ \sim \text{Binomial}(n, 0.5)$
  5. p-value from binomial distribution
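A minimal sketch of these five steps, using invented before/after measurements for ten paired subjects:

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements (e.g., before/after a treatment)
before = np.array([140, 135, 150, 142, 138, 145, 160, 132, 148, 139])
after  = np.array([135, 136, 144, 142, 130, 140, 155, 133, 141, 134])

d = after - before
d = d[d != 0]                  # step 2: drop zero differences
n_pos = int((d > 0).sum())     # step 3: count positive signs
n = len(d)

# Steps 4-5: under H0 each sign is a fair coin flip, so n+ ~ Binomial(n, 0.5)
result = stats.binomtest(n_pos, n, 0.5, alternative='two-sided')
print(n_pos, n, result.pvalue)
```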

Mann-Whitney U (Two Independent Groups)

  1. Combine and rank all $N = n_1 + n_2$ observations
  2. Compute rank sums: $W_1$, $W_2$
  3. Compute: $U_1 = W_1 - n_1(n_1+1)/2$, $U_2 = W_2 - n_2(n_2+1)/2$
  4. Check: $U_1 + U_2 = n_1 \times n_2$
  5. p-value from Mann-Whitney distribution or normal approximation
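The rank-sum arithmetic in steps 1-4 can be checked by hand against `scipy.stats.mannwhitneyu` (whose statistic is $U_1$, the U for the first sample). Sample values are illustrative:

```python
import numpy as np
from scipy import stats

# Two hypothetical independent samples
g1 = np.array([3.1, 4.5, 2.8, 5.0, 3.9])
g2 = np.array([4.2, 6.1, 5.5, 4.8])

# Step 1: combine and rank all N = n1 + n2 observations
combined = np.concatenate([g1, g2])
ranks = stats.rankdata(combined)

# Step 2: rank sums per group
n1, n2 = len(g1), len(g2)
W1, W2 = ranks[:n1].sum(), ranks[n1:].sum()

# Step 3: U statistics from the rank sums
U1 = W1 - n1 * (n1 + 1) / 2
U2 = W2 - n2 * (n2 + 1) / 2
assert U1 + U2 == n1 * n2      # step 4 consistency check

# Step 5: scipy computes the same U1 plus a p-value
U_scipy, p = stats.mannwhitneyu(g1, g2, alternative='two-sided')
print(U1, U2, U_scipy, p)
```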

Wilcoxon Signed-Rank (Paired Data)

  1. Compute differences $d_i$
  2. Drop zeros
  3. Rank absolute differences $|d_i|$
  4. Apply signs to ranks
  5. Compute $W^+ = \sum$ positive ranks, $W^- = \sum$ negative ranks
  6. Check: $W^+ + W^- = n(n+1)/2$
  7. Test statistic: $W = \min(W^+, W^-)$
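These steps can be carried out by hand and compared with `scipy.stats.wilcoxon`, which reports the same $\min(W^+, W^-)$ for a two-sided test. The differences below are hypothetical (zeros already dropped):

```python
import numpy as np
from scipy import stats

# Hypothetical nonzero paired differences
d = np.array([4, -2, 6, -1, 8, 3, 5, -7])

# Steps 3-4: rank |d|, then attach the signs back via boolean masks
ranks = stats.rankdata(np.abs(d))

# Step 5: sums of positive and negative ranks
W_plus = ranks[d > 0].sum()
W_minus = ranks[d < 0].sum()

# Step 6: consistency check
n = len(d)
assert W_plus + W_minus == n * (n + 1) / 2

# Step 7: test statistic
W = min(W_plus, W_minus)

stat, p = stats.wilcoxon(d)    # same statistic, plus a p-value
print(W_plus, W_minus, W, stat, p)
```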

Kruskal-Wallis (Three or More Groups)

  1. Combine and rank all $N$ observations
  2. Compute mean rank $\bar{R}_i$ for each group
  3. Test statistic: $H = \frac{12}{N(N+1)} \sum n_i (\bar{R}_i - \bar{R})^2$
  4. Under $H_0$: $H \sim \chi^2(k-1)$
  5. If significant: pairwise Mann-Whitney U with Bonferroni correction
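Steps 1-3 can be computed directly from the formula and compared with `scipy.stats.kruskal`; with no ties the two agree exactly. The three groups are invented for illustration:

```python
import numpy as np
from scipy import stats

# Three hypothetical groups
groups = [np.array([6.4, 7.1, 5.9]),
          np.array([8.2, 9.0, 7.8, 8.5]),
          np.array([5.1, 6.0, 5.5])]

# Step 1: combine and rank all N observations
combined = np.concatenate(groups)
ranks = stats.rankdata(combined)
N = len(combined)

# Step 2: mean rank per group (split the ranks back out)
sizes = [len(g) for g in groups]
splits = np.split(ranks, np.cumsum(sizes)[:-1])
mean_ranks = [r.mean() for r in splits]
grand_mean = (N + 1) / 2       # overall mean rank

# Step 3: H statistic from the formula
H = 12 / (N * (N + 1)) * sum(n_i * (rb - grand_mean) ** 2
                             for n_i, rb in zip(sizes, mean_ranks))

# Steps 4-5: scipy gives the same H plus a chi-square p-value
H_scipy, p = stats.kruskal(*groups)
print(H, H_scipy, p)
```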

Key Python Code

```python
from scipy import stats

# Sign test (via binomial test)
p_value = stats.binomtest(n_positive, n_total, 0.5,
                          alternative='two-sided').pvalue

# Mann-Whitney U test
stat, p_value = stats.mannwhitneyu(group1, group2,
                                   alternative='two-sided')

# Wilcoxon signed-rank test
stat, p_value = stats.wilcoxon(x, y,
                               alternative='two-sided')

# Kruskal-Wallis test
H_stat, p_value = stats.kruskal(group1, group2, group3)
```

Excel: Limited Support

Excel's Data Analysis ToolPak does not include nonparametric tests. Options:

- Manual calculation using `RANK.AVG()`
- Free add-in: Real Statistics Resource Pack
- Recommended: use Python for nonparametric analysis

The Power Tradeoff

| Condition | Which Test Wins? |
|---|---|
| Data are normal | Parametric — but only by ~5% |
| Data are heavy-tailed or skewed | Nonparametric — sometimes dramatically |
| Data are ordinal | Nonparametric — parametric is technically inappropriate |
| Large samples ($n > 30$) | Parametric — CLT makes normality less critical |
| Small samples ($n < 15$) | Nonparametric — can't assess or rely on normality |

Asymptotic Relative Efficiency (normal data): ~0.955 for all three rank-based tests vs. their parametric counterparts. That's a 4.5% efficiency loss — roughly equivalent to needing one extra observation per group of 20.
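One way to see the tradeoff directly is a small Monte Carlo sketch comparing the two-sample t-test with Mann-Whitney U. The sample size, shift, replication count, and choice of a $t_2$ distribution for "heavy-tailed" are all arbitrary illustration choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, shift, reps = 20, 0.8, 500   # arbitrary simulation settings

def estimate_power(draw):
    """Fraction of simulated datasets where each test rejects at alpha=0.05."""
    rejections = {'t': 0, 'mann-whitney': 0}
    for _ in range(reps):
        x, y = draw(n), draw(n) + shift   # true shift, so H0 is false
        if stats.ttest_ind(x, y).pvalue < 0.05:
            rejections['t'] += 1
        if stats.mannwhitneyu(x, y, alternative='two-sided').pvalue < 0.05:
            rejections['mann-whitney'] += 1
    return {k: v / reps for k, v in rejections.items()}

# Normal data: t-test should edge out Mann-Whitney, but only slightly
normal_power = estimate_power(lambda n: rng.standard_normal(n))
# Heavy-tailed data (t with 2 df): the rank test typically pulls ahead
heavy_power = estimate_power(lambda n: rng.standard_t(2, n))
print("normal:", normal_power)
print("heavy-tailed:", heavy_power)
```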

Common Mistakes

| Mistake | Correction |
|---|---|
| "Nonparametric = assumption-free" | Independence is still required, and similar distribution shapes are needed for a median interpretation. |
| "Always use nonparametric to be safe" | When assumptions hold, parametric tests are more powerful. Match the method to your data. |
| "Mann-Whitney tests medians" | It tests stochastic dominance — whether one group's values tend to be larger. It compares medians only when distributions have similar shapes. |
| Running pairwise tests after a non-significant Kruskal-Wallis | Same rule as ANOVA: no post-hoc tests after a non-significant omnibus test. |
| Choosing the test that gives the smaller p-value | Choose based on data type and assumptions, not results. |

Decision Flowchart Summary

Is your data ordinal (1-5 scale, rankings)?
  → YES: Use nonparametric
  → NO: Continue

Is n < 15 per group AND data clearly non-normal?
  → YES: Use nonparametric
  → NO: Continue

Are there heavy outliers that distort means?
  → YES: Use nonparametric
  → NO: Parametric tests are likely fine

Still unsure?
  → Run BOTH. If they agree, report the one matching
    your data type. If they disagree, investigate why.

Connections

| Connection | Details |
|---|---|
| Ch.10 (Normality assessment) | QQ-plots and the Shapiro-Wilk test help you decide whether normality holds — and therefore whether to go nonparametric |
| Ch.15 (One-sample t-test) | The Wilcoxon signed-rank test on deviations from a hypothesized median is the nonparametric analogue |
| Ch.16 (Two-sample t-test) | Mann-Whitney U is the direct nonparametric alternative; paired t-test → Wilcoxon signed-rank |
| Ch.18 (Bootstrap) | Bootstrap is another assumption-light approach; nonparametric tests use ranks, bootstrap uses resampling |
| Ch.20 (ANOVA) | Kruskal-Wallis is the nonparametric ANOVA; both compare groups, but K-W uses ranks |
| Ch.22 (Regression) | Spearman's rank correlation is the nonparametric alternative to Pearson's $r$ — same ranking logic |