Confidence Interval ↔ Hypothesis Test Duality¶
The Duality Principle¶
There is a deep connection between confidence intervals and hypothesis tests. A \((1 - \alpha) \times 100\%\) confidence interval and a hypothesis test at significance level \(\alpha\) are two sides of the same coin:
A two-sided hypothesis test at level \(\alpha\) rejects \(H_0: \theta = \theta_0\) if and only if \(\theta_0\) falls outside the \((1-\alpha) \times 100\%\) confidence interval for \(\theta\).
This duality means that you can perform a hypothesis test by examining a confidence interval, and vice versa.
How the Duality Works¶
From Confidence Interval to Hypothesis Test¶
Given a \((1 - \alpha) \times 100\%\) confidence interval \((L, U)\) for a parameter \(\theta\):
- If \(\theta_0 \in (L, U)\): Fail to reject \(H_0: \theta = \theta_0\) at significance level \(\alpha\).
- If \(\theta_0 \notin (L, U)\): Reject \(H_0: \theta = \theta_0\) at significance level \(\alpha\).
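Under this rule, the interval alone determines the test decision. A minimal sketch (the helper name and the interval values below are illustrative, not from the text):

```python
def reject_via_ci(theta_0, ci_lower, ci_upper):
    """Reject H0: theta = theta_0 at level alpha iff theta_0 falls
    outside the (1 - alpha) * 100% confidence interval."""
    return theta_0 < ci_lower or theta_0 > ci_upper

# Example: a 95% CI of (48.08, 55.92) for a mean.
print(reject_via_ci(50.0, 48.08, 55.92))  # False: 50 is inside, fail to reject
print(reject_via_ci(60.0, 48.08, 55.92))  # True: 60 is outside, reject
```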
From Hypothesis Test to Confidence Interval¶
A \((1 - \alpha) \times 100\%\) confidence interval is the set of all values \(\theta_0\) for which the hypothesis test \(H_0: \theta = \theta_0\) would not be rejected at significance level \(\alpha\).
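This inversion can be checked numerically: collect every candidate value \(\mu_0\) that a level-\(\alpha\) z-test fails to reject, and compare the resulting set against the closed-form interval (the sample numbers below are illustrative):

```python
import numpy as np
from scipy import stats

# Illustrative data: sample mean, known sigma, sample size
x_bar, sigma, n, alpha = 52.0, 10.0, 25, 0.05
se = sigma / np.sqrt(n)

# Invert the two-sided z-test over a grid of candidate values mu_0:
# keep every mu_0 that the level-alpha test fails to reject.
grid = np.linspace(x_bar - 5 * se, x_bar + 5 * se, 10001)
p_values = 2 * stats.norm.sf(np.abs((x_bar - grid) / se))
accepted = grid[p_values > alpha]

# The accepted set recovers the closed-form CI x_bar +/- z_{alpha/2} * se.
z_crit = stats.norm.ppf(1 - alpha / 2)
print(accepted.min(), accepted.max())            # approx. the CI endpoints
print(x_bar - z_crit * se, x_bar + z_crit * se)
```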
Examples¶
Example 1: One-Sample Mean¶
For a one-sample z-test of \(H_0: \mu = \mu_0\) vs \(H_a: \mu \neq \mu_0\):
- Test: Reject \(H_0\) if \(|z| > z_{\alpha/2}\), where \(z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}\).
- CI: \(\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\)
The test rejects \(H_0\) if and only if \(\mu_0\) lies outside the confidence interval.
Proof of equivalence: the test fails to reject \(H_0\) exactly when

\[ |z| = \left| \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \right| \leq z_{\alpha/2} \iff |\bar{x} - \mu_0| \leq z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \iff \bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \leq \mu_0 \leq \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}, \]

that is, exactly when \(\mu_0\) lies inside the confidence interval.
Example 2: Two Varieties of Pears¶
Yuna compares the caloric content of Bosc and Anjou pears. The 99% confidence interval for \(\mu_{\text{Bosc}} - \mu_{\text{Anjou}}\) is \(4 \pm 6.44 = (-2.44, 10.44)\).
Testing \(H_0: \mu_{\text{Bosc}} = \mu_{\text{Anjou}}\) (i.e., \(\mu_{\text{Bosc}} - \mu_{\text{Anjou}} = 0\)) at \(\alpha = 0.01\):
Since \(0 \in (-2.44, 10.44)\), we fail to reject \(H_0\). There is not enough evidence to conclude the caloric contents differ.
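The same conclusion follows directly from the interval's form: with point estimate 4 and margin of error 6.44, the 99% CI contains 0 exactly when \(|\text{estimate}| \leq \text{margin}\). A quick check:

```python
# Numbers from Example 2: estimate and margin of error for mu_Bosc - mu_Anjou.
estimate, margin = 4.0, 6.44
ci_lower, ci_upper = estimate - margin, estimate + margin

# H0: difference = 0 at alpha = 0.01 is rejected iff 0 lies outside the
# 99% CI, which is equivalent to |estimate| > margin.
reject = not (ci_lower <= 0 <= ci_upper)
print(reject)  # False: fail to reject H0
```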
Example 3: In-person vs Online Classes¶
A 95% confidence interval for \(p_{\text{in\_person}} - p_{\text{online}}\) is \((-0.04, 0.14)\).
Testing \(H_0: p_{\text{in\_person}} = p_{\text{online}}\) at \(\alpha = 0.05\):
Since \(0 \in (-0.04, 0.14)\), we fail to reject \(H_0\). There is no statistically significant difference in passing rates at the 5% level.
One-Sided Tests and Confidence Intervals¶
The duality extends to one-sided tests using one-sided confidence intervals (confidence bounds):
- Upper confidence bound: the one-sided interval \(\theta < U\) at confidence level \(1 - \alpha\) corresponds to the test \(H_0: \theta \geq \theta_0\) vs \(H_a: \theta < \theta_0\); the test rejects \(H_0\) exactly when \(\theta_0 > U\).
- Lower confidence bound: the one-sided interval \(\theta > L\) at confidence level \(1 - \alpha\) corresponds to the test \(H_0: \theta \leq \theta_0\) vs \(H_a: \theta > \theta_0\); the test rejects \(H_0\) exactly when \(\theta_0 < L\).
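As a sketch of the one-sided case (with illustrative numbers, not taken from the text), the one-sided z-test of \(H_0: \mu \geq \mu_0\) rejects exactly when \(\mu_0\) exceeds the upper confidence bound:

```python
import numpy as np
from scipy import stats

# Illustrative data: test H0: mu >= mu_0 vs Ha: mu < mu_0.
x_bar, mu_0, sigma, n, alpha = 48.0, 50.0, 10.0, 25, 0.05
se = sigma / np.sqrt(n)
z_alpha = stats.norm.ppf(1 - alpha)

# One-sided test: reject when z < -z_alpha.
z = (x_bar - mu_0) / se
reject_test = z < -z_alpha

# Upper confidence bound: mu < U with confidence 1 - alpha.
U = x_bar + z_alpha * se
reject_bound = mu_0 > U  # reject exactly when mu_0 exceeds the upper bound

print(reject_test, reject_bound)  # the two decisions always agree
```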
Python Illustration¶
```python
import numpy as np
from scipy import stats

# Sample data
x_bar = 52
mu_0 = 50
sigma = 10
n = 25
alpha = 0.05

# Hypothesis test approach
z = (x_bar - mu_0) / (sigma / np.sqrt(n))
p_value = 2 * stats.norm.sf(abs(z))
reject_test = p_value <= alpha

# Confidence interval approach
z_crit = stats.norm.ppf(1 - alpha / 2)
ci_lower = x_bar - z_crit * sigma / np.sqrt(n)
ci_upper = x_bar + z_crit * sigma / np.sqrt(n)
reject_ci = mu_0 < ci_lower or mu_0 > ci_upper

print(f"Test: z = {z:.4f}, p-value = {p_value:.4f}, Reject = {reject_test}")
print(f"CI: ({ci_lower:.4f}, {ci_upper:.4f}), mu_0 outside CI = {reject_ci}")
print(f"Both methods agree: {reject_test == reject_ci}")
```
Key Takeaways¶
- Confidence intervals and hypothesis tests provide equivalent information for two-sided tests.
- Confidence intervals are often more informative because they show the range of plausible values, not just a binary reject/fail-to-reject decision.
- When reporting results, it is good practice to report both the p-value and the confidence interval.
- The duality holds exactly for two-sided tests; one-sided tests correspond to one-sided confidence bounds.