ANOVA
When comparing means across more than two groups, running multiple pairwise t-tests inflates the overall Type I error rate. For example, with five groups there are ten pairwise comparisons, and even at \(\alpha = 0.05\) per test the probability of at least one false rejection would be \(1 - 0.95^{10} \approx 0.40\) if the tests were independent. Analysis of Variance (ANOVA) avoids this by testing the equality of all group means in a single F-test at level \(\alpha\).
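The inflation can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the ten tests are independent, which pairwise t-tests on shared groups are not, so the first figure is only a rough guide; the Bonferroni bound holds under any dependence.

```python
from math import comb

alpha = 0.05        # per-test significance level
k = 5               # number of groups
m = comb(k, 2)      # number of pairwise comparisons

# Chance of at least one false rejection if the m tests were independent
fwer_independent = 1 - (1 - alpha) ** m

# Bonferroni (union) bound, valid under any dependence between the tests
fwer_upper_bound = min(1.0, m * alpha)

print(f"pairwise comparisons:   {m}")
print(f"FWER if independent:    {fwer_independent:.3f}")
print(f"Bonferroni upper bound: {fwer_upper_bound:.3f}")
```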
Mental Model
ANOVA compares the variance between group means to the variance within groups. If the groups truly share the same mean, between-group variance should be similar to within-group variance and the F-ratio clusters near 1. A large F-ratio means the group means are more spread out than random noise would explain -- evidence that at least one group differs.
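A small simulation can make this intuition concrete. The sketch below is illustrative only: the group means, standard deviation, sample size, repetition count, and seed are all arbitrary choices, and `mean_f` is a hypothetical helper, not part of SciPy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n_reps, n = 2000, 30             # simulated data sets, observations per group

def mean_f(means):
    """Average F-ratio over n_reps simulated data sets with the given group means."""
    f_values = []
    for _ in range(n_reps):
        groups = [rng.normal(mu, 10, n) for mu in means]
        f_values.append(stats.f_oneway(*groups).statistic)
    return float(np.mean(f_values))

mean_f_null = mean_f([50, 50, 50])    # all groups share one mean
mean_f_shift = mean_f([50, 50, 60])   # one group mean is shifted

print(f"mean F, equal means:      {mean_f_null:.2f}")
print(f"mean F, one mean shifted: {mean_f_shift:.2f}")
```

With equal means the average F-ratio should sit close to 1, while the shifted case should give a much larger value.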
One-Way ANOVA Model
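The one-way ANOVA model assumes \(k\) independent groups, where observation \(j\) in group \(i\) follows

\[
X_{ij} = \mu_i + \varepsilon_{ij}, \qquad i = 1, \dots, k, \quad j = 1, \dots, n_i,
\]

with \(\varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2)\), where \(\mu_i\) is the mean of group \(i\) and \(n_i\) is its sample size. The total sample size is \(N = \sum_{i=1}^{k} n_i\).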
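The hypotheses are

\[
H_0: \mu_1 = \mu_2 = \cdots = \mu_k \qquad \text{versus} \qquad H_1: \mu_i \neq \mu_j \text{ for at least one pair } (i, j).
\]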
Sum of Squares Decomposition
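ANOVA partitions the total variability into between-group and within-group components. Define the grand mean \(\bar{X} = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij}\) and each group mean \(\bar{X}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}\). The decomposition is

\[
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2}_{\text{SST}} = \underbrace{\sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X})^2}_{\text{SSB}} + \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2}_{\text{SSW}}
\]

where SST is the total sum of squares, SSB is the between-group sum of squares, and SSW is the within-group sum of squares.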
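The decomposition can be checked numerically. This is a minimal sketch using NumPy and the three treatment groups from the SciPy example in this document; the variable names are illustrative.

```python
import numpy as np

groups = [
    np.array([23.1, 25.3, 24.8, 22.9, 26.1]),
    np.array([28.4, 30.1, 27.6, 29.8, 31.2]),
    np.array([33.5, 35.2, 34.1, 32.8, 36.0]),
]
all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()

# Total: squared deviations of every observation from the grand mean
sst = np.sum((all_obs - grand_mean) ** 2)
# Between: squared deviations of group means from the grand mean, weighted by n_i
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within: squared deviations of observations from their own group mean
ssw = sum(np.sum((g - g.mean()) ** 2) for g in groups)

print(f"SST = {sst:.4f}")
print(f"SSB + SSW = {ssb + ssw:.4f}")
```

The two printed values should agree up to floating-point rounding.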
F-Statistic
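The mean squares are

\[
\text{MSB} = \frac{\text{SSB}}{k - 1}, \qquad \text{MSW} = \frac{\text{SSW}}{N - k}.
\]

The F-statistic is the ratio of the between-group mean square to the within-group mean square:

\[
F = \frac{\text{MSB}}{\text{MSW}}.
\]

Under \(H_0\), this statistic follows an \(F\)-distribution with degrees of freedom \(k - 1\) and \(N - k\):

\[
F \sim F_{k-1,\, N-k}.
\]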
Large values of \(F\) indicate that the between-group variability is large relative to the within-group variability, providing evidence against \(H_0\).
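To connect the formulas to software output, the statistic and its p-value can be computed by hand and compared with scipy.stats.f_oneway. This sketch reuses the treatment-group data from the SciPy example in this document; the p-value is the upper-tail probability of the \(F_{k-1,\,N-k}\) distribution, obtained here with scipy.stats.f.sf.

```python
import numpy as np
from scipy import stats

groups = [
    np.array([23.1, 25.3, 24.8, 22.9, 26.1]),
    np.array([28.4, 30.1, 27.6, 29.8, 31.2]),
    np.array([33.5, 35.2, 34.1, 32.8, 36.0]),
]
k = len(groups)
n_total = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ssw = sum(np.sum((g - g.mean()) ** 2) for g in groups)

msb = ssb / (k - 1)          # between-group mean square
msw = ssw / (n_total - k)    # within-group mean square
f_manual = msb / msw
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)  # P(F >= f_manual) under H0

f_scipy, p_scipy = stats.f_oneway(*groups)
print(f"manual: F={f_manual:.4f}, p={p_manual:.3g}")
print(f"scipy:  F={f_scipy:.4f}, p={p_scipy:.3g}")
```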
Assumptions
One-way ANOVA requires three assumptions:
- Independence: observations are independent both within and across groups.
- Normality: each group is drawn from a normal distribution. ANOVA is moderately robust to departures from normality, especially with large sample sizes.
- Homoscedasticity: all groups share the same variance \(\sigma^2\). Check this assumption with Levene's or Bartlett's test before running ANOVA.
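The normality and equal-variance assumptions can be screened with standard SciPy tests; independence has to be judged from the study design. The sketch below is one possible workflow, not a prescribed one: applying the Shapiro-Wilk test to pooled residuals is an assumption of this example, and with only five observations per group these tests have little power.

```python
import numpy as np
from scipy import stats

groups = [
    np.array([23.1, 25.3, 24.8, 22.9, 26.1]),
    np.array([28.4, 30.1, 27.6, 29.8, 31.2]),
    np.array([33.5, 35.2, 34.1, 32.8, 36.0]),
]

# Normality: Shapiro-Wilk on residuals (observation minus its group mean)
residuals = np.concatenate([g - g.mean() for g in groups])
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Equal variances: Levene (median-centered by default) and Bartlett
levene_stat, levene_p = stats.levene(*groups)
bartlett_stat, bartlett_p = stats.bartlett(*groups)

print(f"Shapiro-Wilk: W={shapiro_stat:.4f}, p={shapiro_p:.4f}")
print(f"Levene:       W={levene_stat:.4f}, p={levene_p:.4f}")
print(f"Bartlett:     T={bartlett_stat:.4f}, p={bartlett_p:.4f}")
```

Small p-values flag a possible violation; large p-values do not prove that the assumption holds.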
Violation of Equal Variances
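When the equal-variance assumption fails, the standard F-test can produce misleading p-values. Use a method that does not assume equal variances, such as Welch's ANOVA or the Alexander-Govern test (scipy.stats.alexandergovern), or a non-parametric alternative such as the Kruskal-Wallis test (scipy.stats.kruskal).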
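As a rough illustration, the sketch below runs the Alexander-Govern and Kruskal-Wallis tests next to the standard F-test on simulated groups with deliberately unequal spreads. The means, standard deviations, sample sizes, and seed are arbitrary assumptions made for this example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)    # arbitrary seed
g1 = rng.normal(50, 2, 20)        # small spread
g2 = rng.normal(50, 10, 20)       # medium spread
g3 = rng.normal(55, 20, 20)       # large spread, shifted mean

classic = stats.f_oneway(g1, g2, g3)      # assumes equal variances
ag = stats.alexandergovern(g1, g2, g3)    # does not assume equal variances
kw = stats.kruskal(g1, g2, g3)            # rank-based, non-parametric

print(f"F-test:           stat={classic.statistic:.4f}, p={classic.pvalue:.4f}")
print(f"Alexander-Govern: stat={ag.statistic:.4f}, p={ag.pvalue:.4f}")
print(f"Kruskal-Wallis:   stat={kw.statistic:.4f}, p={kw.pvalue:.4f}")
```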
SciPy Implementation
The scipy.stats.f_oneway function computes the one-way ANOVA F-test:
```python
from scipy import stats

# Three treatment groups
group_a = [23.1, 25.3, 24.8, 22.9, 26.1]
group_b = [28.4, 30.1, 27.6, 29.8, 31.2]
group_c = [33.5, 35.2, 34.1, 32.8, 36.0]

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F-statistic: {f_stat:.4f}")
print(f"p-value: {p_value:.6f}")
```
The function returns the F-statistic and the corresponding p-value. Reject \(H_0\) when the p-value is below the chosen significance level \(\alpha\).
ANOVA Table
Results are typically organized in an ANOVA table:
| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between groups | SSB | \(k - 1\) | MSB | \(F = \text{MSB}/\text{MSW}\) |
| Within groups | SSW | \(N - k\) | MSW | |
| Total | SST | \(N - 1\) | | |
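The table can be assembled directly from the data. The following is a minimal sketch: `anova_table` is a hypothetical helper written for this example, not a SciPy function, and the data are the treatment groups from the SciPy example in this document.

```python
import numpy as np
from scipy import stats

def anova_table(*groups):
    """Return the one-way ANOVA table as a list of row dicts (hypothetical helper)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = np.concatenate(groups).mean()
    ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)
    msb, msw = ssb / (k - 1), ssw / (n_total - k)
    return [
        {"source": "Between groups", "ss": ssb, "df": k - 1, "ms": msb, "f": msb / msw},
        {"source": "Within groups", "ss": ssw, "df": n_total - k, "ms": msw, "f": None},
        {"source": "Total", "ss": ssb + ssw, "df": n_total - 1, "ms": None, "f": None},
    ]

group_a = [23.1, 25.3, 24.8, 22.9, 26.1]
group_b = [28.4, 30.1, 27.6, 29.8, 31.2]
group_c = [33.5, 35.2, 34.1, 32.8, 36.0]

rows = anova_table(group_a, group_b, group_c)
print(f"{'Source':<16}{'SS':>10}{'df':>5}{'MS':>10}{'F':>10}")
for row in rows:
    ms = "" if row["ms"] is None else f"{row['ms']:.3f}"
    f_ratio = "" if row["f"] is None else f"{row['f']:.3f}"
    print(f"{row['source']:<16}{row['ss']:>10.3f}{row['df']:>5}{ms:>10}{f_ratio:>10}")

# Cross-check the F entry against SciPy
f_check = stats.f_oneway(group_a, group_b, group_c).statistic
```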
Summary
ANOVA tests whether the means of multiple groups are equal by comparing between-group and within-group variability through the F-statistic. The key requirements are independence, normality, and equal variances across groups. In SciPy, scipy.stats.f_oneway provides a direct implementation for the one-way case.
Exercises
Exercise 1. Three fertilizers are tested on plant growth (cm): A = [20, 22, 19, 24, 21], B = [28, 30, 27, 29, 31], C = [25, 23, 26, 24, 27]. Perform a one-way ANOVA and compute eta-squared as the effect size.
Solution to Exercise 1
```python
import numpy as np
from scipy import stats

a = [20, 22, 19, 24, 21]
b = [28, 30, 27, 29, 31]
c = [25, 23, 26, 24, 27]

# One-way ANOVA F-test
f_stat, p_val = stats.f_oneway(a, b, c)

# Eta-squared: share of total variability explained by group membership
all_data = np.concatenate([a, b, c])
grand_mean = np.mean(all_data)
ss_b = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in [a, b, c])
ss_t = np.sum((all_data - grand_mean) ** 2)
eta_sq = ss_b / ss_t

print(f"F={f_stat:.4f}, p={p_val:.6f}")
print(f"Eta-squared: {eta_sq:.4f}")
```
Exercise 2. Verify the equal-variance assumption for the three groups in Exercise 1 using Levene's test before running ANOVA. If the assumption fails, suggest an alternative.
Solution to Exercise 2
```python
from scipy import stats

a = [20, 22, 19, 24, 21]
b = [28, 30, 27, 29, 31]
c = [25, 23, 26, 24, 27]

# Levene's test: H0 is that all groups have equal variances
w_stat, p_val = stats.levene(a, b, c)
print(f"Levene's test: W={w_stat:.4f}, p={p_val:.4f}")

if p_val < 0.05:
    print("Unequal variances: use Kruskal-Wallis instead")
else:
    print("Equal variances: ANOVA is appropriate")
```
Exercise 3. Generate three groups from \(N(50, 10^2)\) with \(n = 30\) each (so the null hypothesis is true). Run ANOVA 1000 times and verify the false positive rate is approximately 5% at \(\alpha = 0.05\).
Solution to Exercise 3
```python
import numpy as np
from scipy import stats

np.random.seed(42)
rejections = 0

for _ in range(1000):
    # All three groups come from the same distribution, so H0 is true
    g1 = np.random.normal(50, 10, 30)
    g2 = np.random.normal(50, 10, 30)
    g3 = np.random.normal(50, 10, 30)
    _, p = stats.f_oneway(g1, g2, g3)
    if p < 0.05:
        rejections += 1  # a false positive

print(f"False positive rate: {rejections/1000:.3f} (expected ~0.05)")
```