Paired-Sample Non-Parametric Tests¶

Paired-Sample Wilcoxon Signed-Rank Test¶

The Paired-Sample Wilcoxon Signed-Rank Test is a non-parametric test used to determine whether the median difference between paired observations is significantly different from zero. It serves as a non-parametric alternative to the paired t-test when the data do not meet the normality assumption.

Key Features¶

Purpose: Test whether the median difference between paired observations is zero.
Null Hypothesis (\(H_0\)): The median of the differences between the paired samples is zero.
Alternative Hypothesis (\(H_a\)):
- Two-tailed: The median difference is not zero.
- One-tailed: The median difference is either greater than or less than zero.
Data Requirements:
- Data must be paired and continuous or ordinal.
- The differences between pairs should be symmetrically distributed.

Test Procedure¶

Step 1: Calculate Differences

Compute the difference between paired observations:

\[d_i = X_i - Y_i\]

Ignore pairs where \(d_i = 0\) (these are excluded from the test).

Step 2: Rank the Absolute Differences

Take the absolute values \(|d_i|\) and rank them in ascending order, assigning tied ranks if necessary.

Step 3: Assign Signs to Ranks

Assign the sign of each difference to its corresponding rank.

Step 4: Compute the Test Statistic

Sum the ranks of the positive differences: \(W^+\)
Sum the ranks of the negative differences: \(W^-\)
The test statistic is: \(W = \min(W^+, W^-)\)

Step 5: Determine Significance

Compare \(W\) to a critical value from the Wilcoxon Signed-Rank Test table for the given sample size and significance level (\(\alpha\)).
For larger samples (\(n > 20\)), use the normal approximation:

\[Z = \frac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}\]

Worked Example: Teaching Method Improvement¶

Scenario: A researcher wants to test whether a new teaching method improves test scores. Ten students take a test before and after using the new method. Test if there is a significant improvement in scores.

Data:

Student	Before	After	Difference (\(d_i\))
1	70	72	2
2	68	69	1
3	75	78	3
4	80	85	5
5	72	75	3
6	74	76	2
7	69	70	1
8	77	79	2
9	73	74	1
10	76	80	4

Step 1 — Differences: All differences are positive: \(d = [2, 1, 3, 5, 3, 2, 1, 2, 1, 4]\)

Step 2 — Rank Absolute Differences:

\[|d| = [2, 1, 3, 5, 3, 2, 1, 2, 1, 4]\]

Sorted: \([1, 1, 1, 2, 2, 2, 3, 3, 4, 5]\)

Tied ranks: \([2, 2, 2, 5, 5, 5, 7.5, 7.5, 9, 10]\)

Assigned to original order: \([5, 2, 7.5, 10, 7.5, 5, 2, 5, 2, 9]\)

Step 3 — Assign Signs: Since all differences are positive, all ranks are positive.

Step 4 — Compute \(W^+\) and \(W^-\):

\(W^+ = 5 + 2 + 7.5 + 10 + 7.5 + 5 + 2 + 5 + 2 + 9 = 55\)
\(W^- = 0\)
\(W = \min(55, 0) = 0\)

Step 5 — Determine Significance:

Using a Wilcoxon table for \(n = 10\), \(\alpha = 0.05\) (two-tailed), the critical value is 8. Since \(W = 0 < 8\), we reject \(H_0\).

Conclusion: There is significant evidence that the new teaching method improves test scores.

Python Implementation¶

import numpy as np
from scipy.stats import wilcoxon

# Data: before and after scores
before = np.array([70, 68, 75, 80, 72, 74, 69, 77, 73, 76])
after = np.array([72, 69, 78, 85, 75, 76, 70, 79, 74, 80])

# Perform Wilcoxon Signed-Rank Test
stat, p_value = wilcoxon(after, before)

print(f"Test Statistic: {stat}")
print(f"P-value: {p_value}")

# Interpretation
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: Significant improvement in scores.")
else:
    print("Fail to reject the null hypothesis: No significant improvement in scores.")

Paired Sign Test¶

When the symmetry assumption of the Wilcoxon Signed-Rank test is violated, the Paired Sign Test can be used. It only considers the signs of the paired differences, ignoring both magnitude and ranks.

Procedure¶

Compute differences \(d_i = X_i - Y_i\).
Count the number of positive (\(n_+\)) and negative (\(n_-\)) differences. Ignore ties (\(d_i = 0\)).
Under \(H_0\), each difference is equally likely to be positive or negative, so \(n_+ \sim \text{Binomial}(n, 0.5)\).
Compute p-value using the binomial distribution.

from scipy.stats import binom
import numpy as np

before = np.array([70, 68, 75, 80, 72, 74, 69, 77, 73, 76])
after = np.array([72, 69, 78, 85, 75, 76, 70, 79, 74, 80])

differences = after - before
n_plus = np.sum(differences > 0)
n_minus = np.sum(differences < 0)
n = n_plus + n_minus
W = min(n_plus, n_minus)

p_value = 2 * binom.cdf(W, n, 0.5)  # Two-tailed

print(f"n+ = {n_plus}, n- = {n_minus}")
print(f"P-value: {p_value:.4f}")

Comparison: Paired t-Test vs Non-Parametric Alternatives¶

Feature	Paired t-Test	Paired Wilcoxon	Paired Sign Test
Assumption	Normal differences	Symmetric differences	None
Tests	Mean difference	Median difference	Median difference
Uses	Raw differences	Ranks of differences	Signs only
Power	Highest (when normal)	Moderate	Lowest
Robustness	Sensitive to outliers	Moderate	Very robust

Guideline for Choosing¶

Normal differences: Use the paired t-test for maximum power.
Symmetric but non-normal differences: Use the paired Wilcoxon signed-rank test.
No assumptions met (skewed, ordinal): Use the paired sign test.

Advantages and Limitations¶

Advantages:

Does not assume normality.
Robust to outliers.
Works with ordinal data.

Limitations:

Assumes symmetry of the distribution of differences (Wilcoxon only).
Less powerful than parametric tests when normality holds.
Sign test discards magnitude information, reducing power further.

Student	Before	After	Difference (\(d_i\))
1	70	72	2
2	68	69	1
3	75	78	3
4	80	85	5
5	72	75	3
6	74	76	2
7	69	70	1
8	77	79	2
9	73	74	1
10	76	80	4

Student	Before	After	Difference (\(d_i\))
1	70	72	2
2	68	69	1
3	75	78	3
4	80	85	5
5	72	75	3
6	74	76	2
7	69	70	1
8	77	79	2
9	73	74	1
10	76	80	4

Student	Before	After	Difference (\(d_i\))
1	70	72	2
2	68	69	1
3	75	78	3
4	80	85	5
5	72	75	3
6	74	76	2
7	69	70	1
8	77	79	2
9	73	74	1
10	76	80	4