Key Takeaways: Nonparametric Methods
One-Sentence Summary
Nonparametric methods — including the sign test, Wilcoxon rank-sum (Mann-Whitney U), Wilcoxon signed-rank, and Kruskal-Wallis tests — replace raw data values with ranks to compare groups without requiring normality, making them ideal for ordinal data, small samples, and datasets with heavy outliers, at a cost of only about 5% efficiency when normality actually holds.
Core Concepts at a Glance
| Concept | Definition | Why It Matters |
| --- | --- | --- |
| Nonparametric test | A hypothesis test that does not assume a specific probability distribution (e.g., normality) for the population | Broadens your toolkit to handle ordinal data, small non-normal samples, and outlier-heavy data |
| Rank-based methods | Tests that convert raw data values to ranks (1st, 2nd, 3rd, ...) before analysis | Ranks are unaffected by outliers and don't require equal intervals — the key insight behind most nonparametric tests |
| Distribution-free | A synonym for nonparametric: the test works regardless of the population distribution's shape | Removes the most common roadblock to valid inference — the normality assumption |
| Power tradeoff | Nonparametric tests lose ~5% efficiency vs. parametric tests when normality holds, but can be more powerful when it doesn't | The "insurance premium" for robustness is remarkably small — and sometimes you get more power, not less |
When to Use Each Test
| Scenario | Parametric Test | Nonparametric Alternative | Use Nonparametric When... |
| --- | --- | --- | --- |
| Two independent groups | Two-sample t-test (Ch.16) | Wilcoxon rank-sum / Mann-Whitney U | Small $n$ + non-normal; ordinal data; heavy outliers |
| Paired / matched data | Paired t-test (Ch.16) | Wilcoxon signed-rank test | Small $n$ + non-normal differences; ordinal data |
| Paired data (simplest) | Paired t-test (Ch.16) | Sign test | Very small $n$; only the direction of each difference is known |
| Three or more groups | One-way ANOVA (Ch.20) | Kruskal-Wallis test | Small $n$ per group + non-normal; ordinal data |
| Test | What It Uses | What It Ignores |
| --- | --- | --- |
| Sign test | Direction of differences (+/-) | Magnitude of differences |
| Wilcoxon signed-rank | Direction AND rank of differences | Exact size of differences |
| Mann-Whitney U | Relative ordering (ranks) of all observations | Exact values |
| Kruskal-Wallis | Mean ranks across $k$ groups | Exact values |
| t-test / ANOVA | Exact values (means, variances) | Nothing — uses all information |
More information used = more power (when assumptions hold). Less information used = more robustness (when assumptions fail).
The Ranking Procedure
- Combine all observations from all groups into a single list
- Sort from smallest to largest
- Assign ranks 1, 2, 3, ... from smallest to largest
- Handle ties with average ranks (midranks)
- Return each rank to its original group
Quick check: Sum of all ranks = $N(N+1)/2$
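The procedure above, including the midrank convention for ties and the rank-sum check, is a minimal sketch in Python using scipy's `rankdata` (the data values are made up for illustration):

```python
import numpy as np
from scipy import stats

# Pooled observations from all groups combined (illustrative values)
data = np.array([3.1, 4.5, 4.5, 2.0, 5.2, 4.5, 3.1])

# rankdata assigns average ranks (midranks) to ties by default
ranks = stats.rankdata(data)
# The two 3.1s share rank 2.5; the three 4.5s share rank 5

# Quick check: ranks always sum to N(N+1)/2, even with midranks
N = len(data)
assert ranks.sum() == N * (N + 1) / 2
```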
Test Procedures
Sign Test (Paired Data)
- Compute differences $d_i$
- Drop zeros
- Count positives ($n^+$) and negatives ($n^-$)
- Under $H_0$: $n^+ \sim \text{Binomial}(n, 0.5)$
- p-value from binomial distribution
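These steps can be sketched directly with scipy's `binomtest` (the paired differences below are hypothetical):

```python
from scipy import stats

# Hypothetical paired differences (after - before)
diffs = [1.2, -0.5, 0.8, 0.0, 2.1, 0.4, -0.3, 1.0]

# Drop zeros, then count signs
nonzero = [d for d in diffs if d != 0]
n_pos = sum(d > 0 for d in nonzero)
n = len(nonzero)

# Under H0, n_pos ~ Binomial(n, 0.5); p-value from the binomial distribution
result = stats.binomtest(n_pos, n, 0.5, alternative='two-sided')
print(n_pos, n, result.pvalue)
```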
Mann-Whitney U (Two Independent Groups)
- Combine and rank all $N = n_1 + n_2$ observations
- Compute rank sums: $W_1$, $W_2$
- Compute: $U_1 = W_1 - n_1(n_1+1)/2$, $U_2 = W_2 - n_2(n_2+1)/2$
- Check: $U_1 + U_2 = n_1 \times n_2$
- p-value from Mann-Whitney distribution or normal approximation
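The rank-sum arithmetic and the $U_1 + U_2 = n_1 n_2$ check can be verified by hand against scipy's built-in test (sample values are made up):

```python
import numpy as np
from scipy import stats

# Two small hypothetical samples
group1 = np.array([12.0, 15.5, 9.8, 14.1])
group2 = np.array([18.2, 16.7, 13.3, 19.0, 17.5])
n1, n2 = len(group1), len(group2)

# Rank the pooled N = n1 + n2 observations
pooled = np.concatenate([group1, group2])
ranks = stats.rankdata(pooled)
W1, W2 = ranks[:n1].sum(), ranks[n1:].sum()

# U statistics from the rank sums
U1 = W1 - n1 * (n1 + 1) / 2
U2 = W2 - n2 * (n2 + 1) / 2
assert U1 + U2 == n1 * n2  # the sanity check from the steps above

# scipy reports the U statistic for the first sample
U_scipy, p = stats.mannwhitneyu(group1, group2, alternative='two-sided')
assert U1 == U_scipy
```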
Wilcoxon Signed-Rank (Paired Data)
- Compute differences $d_i$
- Drop zeros
- Rank absolute differences $|d_i|$
- Apply signs to ranks
- Compute $W^+ = \sum$ positive ranks, $W^- = \sum$ negative ranks
- Check: $W^+ + W^- = n(n+1)/2$
- Test statistic: $W = \min(W^+, W^-)$
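A short sketch of this procedure, checked against `stats.wilcoxon` (hypothetical differences, already zero-free):

```python
import numpy as np
from scipy import stats

# Hypothetical paired differences (no zeros, no ties in |d|)
d = np.array([1.2, -0.5, 0.8, 2.1, 0.4, -0.3, 1.0])

# Rank the absolute differences, then re-attach the signs
ranks = stats.rankdata(np.abs(d))
W_pos = ranks[d > 0].sum()
W_neg = ranks[d < 0].sum()

n = len(d)
assert W_pos + W_neg == n * (n + 1) / 2  # the check from the steps above

W = min(W_pos, W_neg)  # test statistic

# scipy's two-sided statistic is the same smaller rank sum
stat, p = stats.wilcoxon(d)
assert stat == W
```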
Kruskal-Wallis (Three or More Groups)
- Combine and rank all $N$ observations
- Compute mean rank $\bar{R}_i$ for each group
- Test statistic: $H = \frac{12}{N(N+1)} \sum n_i (\bar{R}_i - \bar{R})^2$
- Under $H_0$: $H \sim \chi^2(k-1)$
- If significant: pairwise Mann-Whitney U with Bonferroni correction
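$H$ can be computed directly from the mean ranks and checked against `stats.kruskal` (made-up, tie-free data; with ties scipy additionally applies a tie correction):

```python
import numpy as np
from scipy import stats

# Three small hypothetical groups (no ties)
groups = [np.array([6.1, 7.3, 5.8]),
          np.array([8.4, 9.1, 7.9]),
          np.array([5.2, 6.7, 8.0])]

pooled = np.concatenate(groups)
N = len(pooled)
ranks = stats.rankdata(pooled)

# Slice the pooled ranks back into groups and take mean ranks
sizes = [len(g) for g in groups]
splits = np.split(ranks, np.cumsum(sizes)[:-1])
R_bar = [r.mean() for r in splits]

# H = 12/(N(N+1)) * sum n_i (R_bar_i - R_bar)^2, overall mean rank = (N+1)/2
grand = (N + 1) / 2
H = 12 / (N * (N + 1)) * sum(n_i * (rb - grand) ** 2
                             for n_i, rb in zip(sizes, R_bar))

# With no ties this matches scipy exactly
H_scipy, p = stats.kruskal(*groups)
assert np.isclose(H, H_scipy)
```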
Key Python Code
```python
from scipy import stats

# Sign test (via binomial test)
p_value = stats.binomtest(n_positive, n_total, 0.5,
                          alternative='two-sided').pvalue

# Mann-Whitney U test
stat, p_value = stats.mannwhitneyu(group1, group2,
                                   alternative='two-sided')

# Wilcoxon signed-rank test
stat, p_value = stats.wilcoxon(x, y, alternative='two-sided')

# Kruskal-Wallis test
H_stat, p_value = stats.kruskal(group1, group2, group3)
```
Excel: Limited Support
Excel's Data Analysis ToolPak does not include nonparametric tests. Options:
- Manual calculation using RANK.AVG()
- Free add-in: Real Statistics Resource Pack
- Recommended: Use Python for nonparametric analysis
The Power Tradeoff
| Condition | Which Test Wins? |
| --- | --- |
| Data are normal | Parametric — but only by ~5% |
| Data are heavy-tailed or skewed | Nonparametric — sometimes dramatically |
| Data are ordinal | Nonparametric — parametric is technically inappropriate |
| Large samples ($n > 30$) | Parametric — CLT makes normality less critical |
| Small samples ($n < 15$) | Nonparametric — can't assess or rely on normality |
Asymptotic Relative Efficiency (normal data): ~0.955 for all three rank-based tests vs. their parametric counterparts. That's a 4.5% efficiency loss — roughly equivalent to needing one extra observation per group of 20.
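A small Monte Carlo sketch makes the tradeoff concrete. The seed, sample size, shift, and repetition count below are arbitrary choices for illustration, not values from the text:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def power(sampler, test_pvalue, shift, n=20, reps=1000, alpha=0.05):
    """Fraction of simulated two-sample datasets where the test rejects H0."""
    hits = 0
    for _ in range(reps):
        x = sampler(n)
        y = sampler(n) + shift
        hits += test_pvalue(x, y) < alpha
    return hits / reps

t_p  = lambda x, y: stats.ttest_ind(x, y).pvalue
mw_p = lambda x, y: stats.mannwhitneyu(x, y, alternative='two-sided').pvalue

# Normal data: the t-test should win, but only slightly (~95.5% ARE)
normal = lambda n: rng.normal(size=n)
pn_t, pn_mw = power(normal, t_p, 1.0), power(normal, mw_p, 1.0)

# Heavy-tailed data (t with 2 df): Mann-Whitney often wins clearly
heavy = lambda n: rng.standard_t(df=2, size=n)
ph_t, ph_mw = power(heavy, t_p, 1.0), power(heavy, mw_p, 1.0)

print(pn_t, pn_mw, ph_t, ph_mw)
```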
Common Mistakes
| Mistake | Correction |
| --- | --- |
| "Nonparametric = assumption-free" | Independence is still required. Similar distribution shapes are needed for median interpretation. |
| "Always use nonparametric to be safe" | When assumptions hold, parametric tests are more powerful. Match the method to your data. |
| "Mann-Whitney tests medians" | It tests stochastic dominance — whether one group's values tend to be larger. It compares medians only when distributions have similar shapes. |
| Running pairwise tests after a non-significant Kruskal-Wallis | Same rule as ANOVA: no post-hoc tests after a non-significant omnibus test. |
| Choosing the test that gives the smaller p-value | Choose based on data type and assumptions, not results. |
Decision Flowchart Summary
Is your data ordinal (1-5 scale, rankings)?
→ YES: Use nonparametric
→ NO: Continue
Is n < 15 per group AND data clearly non-normal?
→ YES: Use nonparametric
→ NO: Continue
Are there heavy outliers that distort means?
→ YES: Use nonparametric
→ NO: Parametric tests are likely fine
Still unsure?
→ Run BOTH. If they agree, report the one matching
your data type. If they disagree, investigate why.
Connections
| Connection | Details |
| --- | --- |
| Ch.10 (Normality assessment) | QQ-plots and the Shapiro-Wilk test help you decide whether normality holds — and therefore whether to go nonparametric |
| Ch.15 (One-sample t-test) | The Wilcoxon signed-rank test on deviations from a hypothesized median is the nonparametric analogue |
| Ch.16 (Two-sample t-test) | Mann-Whitney U is the direct nonparametric alternative; paired t-test → Wilcoxon signed-rank |
| Ch.18 (Bootstrap) | Bootstrap is another assumption-light approach; nonparametric tests use ranks, bootstrap uses resampling |
| Ch.20 (ANOVA) | Kruskal-Wallis is the nonparametric ANOVA; both compare groups, but K-W uses ranks |
| Ch.22 (Regression) | Spearman's rank correlation is the nonparametric alternative to Pearson's $r$ — same ranking logic |