CI for p₁ − p₂¶
Two-Sample Proportion Confidence Interval¶
In many practical situations we compare a proportion across two populations: for instance, the share of voters who support a policy in two regions, or the defect rates of two production lines.
Formula (Wald)¶
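Let \(p_1\) and \(p_2\) be the population proportions for two independent groups. The Wald confidence interval for \(p_1 - p_2\) at confidence level \(1 - \alpha\) is
\[
(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}},
\]
where \(\hat{p}_1 = x_1/n_1\) and \(\hat{p}_2 = x_2/n_2\) are the sample proportions (\(x_i\) successes in \(n_i\) trials) and \(z_{\alpha/2}\) is the upper \(\alpha/2\) quantile of the standard normal distribution.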
Conditions for Validity¶
For the normal approximation to hold:
- \(n_1\hat{p}_1 \ge 5\) and \(n_1(1 - \hat{p}_1) \ge 5\),
- \(n_2\hat{p}_2 \ge 5\) and \(n_2(1 - \hat{p}_2) \ge 5\).
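Because \(n_i\hat{p}_i = x_i\), these conditions amount to requiring at least 5 observed successes and 5 observed failures in each sample. A minimal sketch of that check follows; the function name `normal_approx_ok` and its `threshold` parameter are illustrative choices, not part of any library.

```python
def normal_approx_ok(x1, n1, x2, n2, threshold=5):
    """Return True if each sample has at least `threshold` successes and failures."""
    # n * p_hat equals the success count; n * (1 - p_hat) equals the failure count.
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(count >= threshold for count in counts)


print(normal_approx_ok(120, 200, 130, 250))  # all four counts are large
print(normal_approx_ok(2, 30, 15, 40))       # only 2 successes in sample 1
```

When the check fails, one of the alternative methods is usually a safer choice than the Wald interval.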
Alternative Methods¶
| Method | Description | When to Use |
|---|---|---|
| Wald | \((\hat{p}_1 - \hat{p}_2) \pm z \cdot \text{SE}\) | Large \(n\), proportions not near 0 or 1 |
| Newcombe (Wilson-based) | Wilson CI per group, then combine; simplest (conservative) form: \([L_1 - U_2,\; U_1 - L_2]\) | Recommended default |
| Clopper–Pearson combined | Exact CI per group, then combine | Small \(n\), regulatory settings |
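The Wilson-based row can be sketched in code as follows. The helper names `wilson_ci` and `wilson_diff_ci` and the `hybrid` flag are assumptions made for this illustration. With `hybrid=False` the limits are combined as \([L_1 - U_2,\; U_1 - L_2]\), as in the table; with `hybrid=True` the same Wilson limits are combined by the square-and-add rule of Newcombe's hybrid score interval, which gives a narrower interval.

```python
import numpy as np
from scipy.stats import norm


def wilson_ci(x, n, confidence_level=0.95):
    """Wilson score interval for a single proportion."""
    z = norm.ppf(1 - (1 - confidence_level) / 2)
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def wilson_diff_ci(x1, n1, x2, n2, confidence_level=0.95, hybrid=False):
    """Interval for p1 - p2 built from per-group Wilson limits."""
    l1, u1 = wilson_ci(x1, n1, confidence_level)
    l2, u2 = wilson_ci(x2, n2, confidence_level)
    if not hybrid:
        # Simple combination from the table: [L1 - U2, U1 - L2].
        return l1 - u2, u1 - l2
    # Square-and-add combination (Newcombe's hybrid score interval).
    p1_hat, p2_hat = x1 / n1, x2 / n2
    diff = p1_hat - p2_hat
    lower = diff - np.sqrt((p1_hat - l1) ** 2 + (u2 - p2_hat) ** 2)
    upper = diff + np.sqrt((u1 - p1_hat) ** 2 + (p2_hat - l2) ** 2)
    return lower, upper


simple = wilson_diff_ci(120, 200, 130, 250)
hybrid = wilson_diff_ci(120, 200, 130, 250, hybrid=True)
print(f"simple: ({simple[0]:.4f}, {simple[1]:.4f})")
print(f"hybrid: ({hybrid[0]:.4f}, {hybrid[1]:.4f})")
```

The simple combination always contains the hybrid interval, so it is the more conservative of the two.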
Python Code¶
import numpy as np
import scipy.stats as stats
n1, n2 = 200, 250
x1, x2 = 120, 130
confidence_level = 0.95
p1 = x1 / n1
p2 = x2 / n2
standard_error = np.sqrt((p1 * (1 - p1) / n1) + (p2 * (1 - p2) / n2))
z_critical = stats.norm.ppf(1 - (1 - confidence_level) / 2)
margin_of_error = z_critical * standard_error
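difference = p1 - p2
confidence_interval = (difference - margin_of_error, difference + margin_of_error)
lower, upper = confidence_interval
# Format the endpoints explicitly so the output does not depend on NumPy's scalar repr.
print(f"{confidence_level:.0%} CI: ({lower:.4f}, {upper:.4f})")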
Examples¶
Example 1: 95% CI for Difference in Proportions¶
Sample 1: \(n_1 = 200\), \(x_1 = 120\) successes. Sample 2: \(n_2 = 250\), \(x_2 = 130\) successes.
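Solution.
The sample proportions are \(\hat{p}_1 = 120/200 = 0.60\) and \(\hat{p}_2 = 130/250 = 0.52\), so \(\hat{p}_1 - \hat{p}_2 = 0.08\). The standard error is
\[
\text{SE} = \sqrt{\frac{0.60 \cdot 0.40}{200} + \frac{0.52 \cdot 0.48}{250}} \approx 0.0469,
\]
and with \(z_{0.025} \approx 1.96\) the margin of error is \(1.96 \times 0.0469 \approx 0.0919\).
We are 95% confident that the true difference \(p_1 - p_2\) lies between \(-0.0119\) and \(0.1719\). Since the interval includes zero, the difference is not statistically significant at the 5% level.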
Example 2: New High School Construction¶
Duncan compares support for a new high school in north and south parts of the city.
| Support? | North | South |
|---|---|---|
| Yes | 54 | 77 |
| No | 66 | 63 |
| Total | 120 | 140 |
Construct a 90% CI for \(p_N - p_S\).
Solution.
import numpy as np
from scipy import stats
n_1 = 120 # north
n_2 = 140 # south
p_1_hat = 54 / n_1
p_2_hat = 77 / n_2
confidence_level = 0.90
alpha = 1 - confidence_level
z_star = -stats.norm().ppf(alpha / 2)
margin_of_error = z_star * np.sqrt(
    p_1_hat * (1 - p_1_hat) / n_1 + p_2_hat * (1 - p_2_hat) / n_2
)
print(f"90% CI: {p_1_hat - p_2_hat:.4f} ± {margin_of_error:.4f}")
With \(\hat{p}_N = 0.45\) and \(\hat{p}_S = 0.55\), the result is \(-0.1000 \pm 0.1018\), that is, the interval \((-0.2018,\; 0.0018)\). The interval barely includes zero, so at the 10% significance level the data do not show a difference in support between north and south.
Simulation: Difference of Two Proportions CI Coverage¶
#!/usr/bin/env python3
"""
Difference of two proportions CI simulation: Newcombe, Wald, Clopper-Pearson.
"""
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm, beta

rng_seed = None  # set to an integer for reproducible draws
n_simulations = 100
n1, n2 = 50, 40
p1_true, p2_true = 0.60, 0.50
alpha = 0.05
method = "newcombe"  # 'newcombe' | 'wald' | 'cp'


def main():
    if rng_seed is not None:
        np.random.seed(rng_seed)
    delta_true = p1_true - p2_true
    z = norm.ppf(1 - alpha / 2.0)
    lowers = np.empty(n_simulations)
    uppers = np.empty(n_simulations)
    centers = np.empty(n_simulations)

    for i in range(n_simulations):
        # Draw one sample from each population and compute the point estimate.
        k1 = np.random.binomial(n1, p1_true)
        k2 = np.random.binomial(n2, p2_true)
        p1hat, p2hat = k1 / n1, k2 / n2
        centers[i] = p1hat - p2hat

        if method == "wald":
            se = np.sqrt(p1hat * (1 - p1hat) / n1 + p2hat * (1 - p2hat) / n2)
            lo, hi = centers[i] - z * se, centers[i] + z * se
        elif method == "newcombe":
            # Wilson score interval per group, combined as [L1 - U2, U1 - L2].
            denom1 = 1 + z**2 / n1
            center1 = (p1hat + z**2 / (2 * n1)) / denom1
            half1 = z * np.sqrt(p1hat * (1 - p1hat) / n1 + z**2 / (4 * n1**2)) / denom1
            L1, U1 = center1 - half1, center1 + half1
            denom2 = 1 + z**2 / n2
            center2 = (p2hat + z**2 / (2 * n2)) / denom2
            half2 = z * np.sqrt(p2hat * (1 - p2hat) / n2 + z**2 / (4 * n2**2)) / denom2
            L2, U2 = center2 - half2, center2 + half2
            lo, hi = L1 - U2, U1 - L2
        elif method == "cp":
            # Clopper-Pearson (exact) interval per group via beta quantiles.
            L1 = 0.0 if k1 == 0 else beta.ppf(alpha / 2.0, k1, n1 - k1 + 1)
            U1 = 1.0 if k1 == n1 else beta.ppf(1 - alpha / 2.0, k1 + 1, n1 - k1)
            L2 = 0.0 if k2 == 0 else beta.ppf(alpha / 2.0, k2, n2 - k2 + 1)
            U2 = 1.0 if k2 == n2 else beta.ppf(1 - alpha / 2.0, k2 + 1, n2 - k2)
            lo, hi = L1 - U2, U1 - L2
        else:
            raise ValueError(f"unknown method: {method!r}")

        # A difference of proportions always lies in [-1, 1].
        lowers[i] = max(-1.0, lo)
        uppers[i] = min(1.0, hi)

    covered = (lowers <= delta_true) & (delta_true <= uppers)
    n_fail = int((~covered).sum())
    coverage_pct = 100.0 * covered.mean()

    # Draw each interval as a horizontal line; intervals that miss the true value are red.
    fig, ax = plt.subplots(figsize=(12, 12))
    for i in range(n_simulations):
        color = "k" if covered[i] else "r"
        ax.plot([lowers[i], uppers[i]], [i, i], lw=2, color=color)
        ax.plot(centers[i], i, marker="o", ms=3, color=color)
    ax.axvline(delta_true, linestyle="--", linewidth=1.5, color="r")
    ax.set_title(
        f"{n_simulations} Δ=p1−p2 CIs ({method.title()}) | n1={n1}, n2={n2}, "
        f"CL={int((1 - alpha) * 100)}% | Fail={n_fail} (Coverage ≈ {coverage_pct:.1f}%)")
    ax.set_yticks([])
    for sp in ["left", "right", "top"]:
        ax.spines[sp].set_visible(False)
    ax.set_xlabel("Δ = p1 − p2")
    plt.tight_layout()
    plt.show()


if __name__ == "__main__":
    main()
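A plot of 100 intervals gives only a noisy picture of coverage. As a complement, the sketch below estimates Wald coverage numerically from many more replications, without plotting. The function name `wald_coverage`, the replication count, and the fixed seed are assumptions chosen for this illustration.

```python
import numpy as np
from scipy.stats import norm


def wald_coverage(n1, n2, p1, p2, alpha=0.05, n_sim=20_000, seed=0):
    """Estimate how often the Wald interval for p1 - p2 contains the true difference."""
    rng = np.random.default_rng(seed)
    z = norm.ppf(1 - alpha / 2)
    # Simulate all replications at once instead of looping.
    p1_hat = rng.binomial(n1, p1, size=n_sim) / n1
    p2_hat = rng.binomial(n2, p2, size=n_sim) / n2
    diff = p1_hat - p2_hat
    se = np.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    delta_true = p1 - p2
    covered = (diff - z * se <= delta_true) & (delta_true <= diff + z * se)
    return covered.mean()


coverage = wald_coverage(50, 40, 0.60, 0.50)
print(f"Estimated Wald coverage: {coverage:.3f}")
```

An estimate noticeably below the nominal 95% would indicate that the Wald interval is too narrow for these sample sizes.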
Key Points¶
- The Wald confidence interval for \(p_1 - p_2\) uses the normal approximation to the binomial distribution and assumes large sample sizes.
- The width depends on the sample proportions, sample sizes, and confidence level.
- If the confidence interval includes zero, the difference between the two proportions is not statistically significant at the corresponding significance level (\(\alpha = 1 - \text{confidence level}\)).
- The Newcombe (Wilson-based) method is recommended as the default for better coverage, especially with moderate sample sizes.
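To illustrate how the width responds to sample size and confidence level, the sketch below evaluates the Wald width \(2 z \cdot \text{SE}\) over a small grid. The function name `wald_width`, the equal group sizes, and the fixed sample proportions 0.60 and 0.52 are assumptions made for this illustration.

```python
import numpy as np
from scipy.stats import norm


def wald_width(p1_hat, p2_hat, n1, n2, confidence_level):
    """Total width (upper minus lower) of the Wald interval for p1 - p2."""
    z = norm.ppf(1 - (1 - confidence_level) / 2)
    se = np.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return 2 * z * se


for n in (50, 200, 800):
    for confidence_level in (0.90, 0.95, 0.99):
        width = wald_width(0.60, 0.52, n, n, confidence_level)
        print(f"n = {n:3d} per group, {confidence_level:.0%} CI width = {width:.4f}")
```

Quadrupling both sample sizes halves the width, while raising the confidence level widens the interval.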
Exercise¶
Exercise: 95% CI for Difference Between Two Proportions¶
In two independent samples, 150 out of 200 people prefer a brand in sample 1, and 120 out of 180 prefer the same brand in sample 2. Construct a 95% CI for \(p_1 - p_2\).
Solution.
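One way to work the exercise is with the Wald interval, which is reasonable here because each sample has well over 5 successes and 5 failures. The sketch below applies it to the exercise data; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import norm

x1, n1 = 150, 200  # sample 1: successes, sample size
x2, n2 = 120, 180  # sample 2: successes, sample size
confidence_level = 0.95

p1_hat = x1 / n1
p2_hat = x2 / n2
diff = p1_hat - p2_hat

# Unpooled standard error of the difference in sample proportions.
se = np.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z = norm.ppf(1 - (1 - confidence_level) / 2)

lower, upper = diff - z * se, diff + z * se
print(f"{confidence_level:.0%} CI: ({lower:.4f}, {upper:.4f})")
```

The interval is approximately \((-0.008,\; 0.175)\). It includes zero, so the difference in preference is not statistically significant at the 5% level.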