Rank-Based Statistics
Parametric tests like the t-test and ANOVA are powerful, but they come with a string attached: the normality assumption.
If your data is skewed, has outliers, or is ordinal (like a 1-5 star rating), using a t-test can lead to false conclusions.
Rank-based tests are the solution. Instead of analyzing the raw values, we analyze their ranks.
1. The Transformation: Values → Ranks
The core idea is simple:
- Combine all data from all groups.
- Sort the data from smallest to largest.
- Assign a rank (1, 2, 3…) to each value.
- If values are tied, assign the average rank (e.g., if 5th and 6th values are equal, both get rank 5.5).
- Perform the test on the ranks, not the values.
This transformation makes the test robust to outliers. A value of 1,000,000 is just “Rank N”, same as if it were 100 (provided it’s still the largest).
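The transformation above can be sketched directly with `scipy.stats.rankdata`, which assigns average ranks to ties by default (the values here are illustrative):

```python
from scipy.stats import rankdata

# Pooled data from all groups, including a tie (14 appears twice)
# and a huge outlier
values = [12, 15, 14, 11, 14, 1_000_000]

# method='average' (the default) gives tied values the mean of their ranks
ranks = rankdata(values)
print(ranks.tolist())  # [2.0, 5.0, 3.5, 1.0, 3.5, 6.0]
```

Note that 1,000,000 simply receives rank 6.0, exactly the rank it would get if it were 16, which is why the transformation neutralizes outliers.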
2. Interactive: The Rank-Sum Racer
This visualizer demonstrates the Mann-Whitney U Test logic.
- We have two groups: Group A (Blue) and Group B (Green).
- Drag the points along the line.
- Watch how their Ranks change relative to each other.
- The U statistic, U = min(UA, UB), is computed from the rank sums RA (Group A) and RB (Group B). Lower U means more separation; U = 0 means the two groups do not overlap at all.
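The same rank-sum logic can be checked offline. A minimal sketch (group values are illustrative) showing that complete separation drives U to zero:

```python
from scipy.stats import rankdata

def u_statistic(group_a, group_b):
    """U = min(U_A, U_B), computed from the pooled ranks."""
    ranks = rankdata(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    r_a = ranks[:n_a].sum()             # rank sum of Group A
    u_a = r_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)

# Fully separated groups: every A value below every B value
print(u_statistic([1, 2, 3], [10, 11, 12]))   # 0.0
# Interleaved groups: U climbs toward its maximum, n_a * n_b / 2
print(u_statistic([1, 10, 3], [2, 11, 4]))    # 3.0
```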
3. The “Big Three” Non-Parametric Tests
Here is your cheat sheet for choosing the right test.
| Scenario | Parametric Test (Normal) | Non-Parametric Test (Any Distribution) |
|---|---|---|
| 2 Independent Groups | Independent t-test | Mann-Whitney U Test |
| 2 Paired Groups | Paired t-test | Wilcoxon Signed-Rank Test |
| 3+ Groups | One-way ANOVA | Kruskal-Wallis H Test |
1. Mann-Whitney U Test
Used to test if two independent populations have the same distribution.
- Null Hypothesis (H0): The distributions of both populations are identical.
- Alternative (H1): One population tends to have larger values than the other.
2. Wilcoxon Signed-Rank Test
Used for paired data (e.g., Before vs After). It looks at the differences between pairs.
- It ranks the absolute differences between pairs, then compares the rank sums for positive and negative differences.
- It tests whether the differences are symmetric around zero, commonly summarized as testing whether the median difference is zero.
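A minimal sketch with SciPy's `wilcoxon` (the before/after numbers are made up for illustration):

```python
from scipy import stats

# Paired measurements, e.g. a score before and after an intervention
before = [14.2, 15.1, 13.8, 16.0, 14.9, 15.5, 14.1, 16.2]
after  = [13.9, 14.6, 13.1, 15.1, 13.8, 14.2, 12.6, 14.5]

# Ranks the absolute differences; here every difference is positive,
# so the statistic (the smaller signed-rank sum) is 0
stat, p_val = stats.wilcoxon(before, after)
print(f"W = {stat}, p-value = {p_val:.4f}")
```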
3. Kruskal-Wallis Test
An extension of Mann-Whitney for more than two groups.
- It ranks all data together.
- If the null is true, the average rank for each group should be roughly the same.
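The test itself is a one-liner in SciPy; the three groups below are illustrative:

```python
from scipy import stats

group_a = [12, 15, 14, 11, 45]
group_b = [22, 24, 25, 28, 26]
group_c = [29, 30, 31, 33, 35]

# Ranks all 15 values together, then compares average ranks per group
h_stat, p_val = stats.kruskal(group_a, group_b, group_c)
print(f"H = {h_stat:.2f}, p-value = {p_val:.4f}")  # H = 6.50 for this data
```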
4. Implementation Examples
Python (SciPy)
We use scipy.stats for these tests.
```python
from scipy import stats

# Example data (small samples, non-normal)
group_a = [12, 15, 14, 11, 45]  # note the outlier, 45
group_b = [22, 24, 25, 28, 26]

# 1. Mann-Whitney U Test
u_stat, p_val = stats.mannwhitneyu(group_a, group_b, alternative='two-sided')
print(f"Mann-Whitney U statistic: {u_stat}")
print(f"P-value: {p_val:.4f}")

if p_val < 0.05:
    print("Result: Significant difference between groups.")
else:
    print("Result: No significant difference.")
```
Java
Calculating the Mann-Whitney U statistic manually means pooling both groups, sorting the combined values, and summing the ranks for one group.
```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class RankData implements Comparable<RankData> {
    double value;
    String group;
    double rank;

    public RankData(double value, String group) {
        this.value = value;
        this.group = group;
    }

    @Override
    public int compareTo(RankData o) {
        return Double.compare(this.value, o.value);
    }
}

public class MannWhitney {
    public static void main(String[] args) {
        double[] groupA = {12, 15, 14, 11, 45};
        double[] groupB = {22, 24, 25, 28, 26};

        List<RankData> combined = new ArrayList<>();
        for (double v : groupA) combined.add(new RankData(v, "A"));
        for (double v : groupB) combined.add(new RankData(v, "B"));
        Collections.sort(combined);

        // Assign ranks (simplified, no tie handling for brevity)
        double sumRankA = 0;
        for (int i = 0; i < combined.size(); i++) {
            combined.get(i).rank = i + 1;
            if (combined.get(i).group.equals("A")) {
                sumRankA += combined.get(i).rank;
            }
        }

        int nA = groupA.length;
        // U = R - (n(n+1))/2
        double uA = sumRankA - (nA * (nA + 1)) / 2.0;
        // We usually take min(uA, uB), where uB = nA*nB - uA
        int nB = groupB.length;
        double uB = (nA * nB) - uA;
        double u = Math.min(uA, uB);
        System.out.println("Mann-Whitney U statistic: " + u);
    }
}
```
Go
```go
package main

import (
	"fmt"
	"sort"
)

type RankData struct {
	Value float64
	Group string
	Rank  float64
}

func main() {
	groupA := []float64{12, 15, 14, 11, 45}
	groupB := []float64{22, 24, 25, 28, 26}

	var combined []RankData
	for _, v := range groupA {
		combined = append(combined, RankData{Value: v, Group: "A"})
	}
	for _, v := range groupB {
		combined = append(combined, RankData{Value: v, Group: "B"})
	}

	// Sort by value
	sort.Slice(combined, func(i, j int) bool {
		return combined[i].Value < combined[j].Value
	})

	// Assign ranks and sum Group A's ranks (no tie handling for brevity)
	sumRankA := 0.0
	for i := range combined {
		rank := float64(i + 1)
		combined[i].Rank = rank
		if combined[i].Group == "A" {
			sumRankA += rank
		}
	}

	nA := float64(len(groupA))
	nB := float64(len(groupB))
	uA := sumRankA - nA*(nA+1)/2.0
	uB := nA*nB - uA

	u := uA
	if uB < uA {
		u = uB
	}
	fmt.Printf("Mann-Whitney U statistic: %.1f\n", u)
}
```
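The Java and Go sketches stop at the U statistic. To turn a hand-computed U into a p-value, a common route (reasonable for samples of roughly 10+ per group with few ties) is the normal approximation; a Python sketch, where the mean and variance formulas for U under H0 are the only assumptions beyond the text:

```python
import math
from scipy import stats

group_a = [12, 15, 14, 11, 45]
group_b = [22, 24, 25, 28, 26]
n_a, n_b = len(group_a), len(group_b)

u, _ = stats.mannwhitneyu(group_a, group_b, alternative='two-sided')

# Under H0, U is approximately normal with:
mu = n_a * n_b / 2                                   # mean
sigma = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)  # std. dev. (no tie correction)
z = (u - mu) / sigma
p_approx = 2 * stats.norm.sf(abs(z))

print(f"U = {u}, z = {z:.3f}, approximate p = {p_approx:.4f}")
```

With samples this small the approximation is rough, so the p-value SciPy reports by default (an exact method for small, tie-free samples) will differ somewhat.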
> [!IMPORTANT]
> Power Trade-off: Non-parametric tests are generally less powerful than parametric tests if the data is actually normal. This means they are less likely to detect a real effect when one exists. Only use them when the assumptions of parametric tests are violated.