CI for p₁ − p₂¶

Two-Sample Proportion Confidence Interval¶

In many practical situations, we want to compare a proportion across two populations: for instance, the share of voters who support a policy in two different regions, or the defect rates of two production lines.

Formula (Wald)¶

Let \(p_1\) and \(p_2\) be the population proportions for two independent groups. The \(100(1-\alpha)\%\) Wald confidence interval for \(p_1 - p_2\) is

\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \times \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \]

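where \(\hat{p}_1 = x_1/n_1\) and \(\hat{p}_2 = x_2/n_2\) are the sample proportions and \(z_{\alpha/2}\) is the upper \(\alpha/2\) critical value of the standard normal distribution (about 1.96 for 95% confidence).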

Conditions for Validity¶

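For the normal approximation to be reasonable, a common rule of thumb requires at least 5 observed successes and 5 observed failures in each sample: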

\(n_1\hat{p}_1 \ge 5\) and \(n_1(1 - \hat{p}_1) \ge 5\),
\(n_2\hat{p}_2 \ge 5\) and \(n_2(1 - \hat{p}_2) \ge 5\).
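
As a rough illustration, the conditions above can be checked mechanically before computing a Wald interval. The helper name `wald_conditions_ok` and its `threshold` parameter are hypothetical, introduced only for this sketch. It relies on the fact that \(n\hat{p}\) is the observed success count and \(n(1 - \hat{p})\) is the observed failure count.

```python
def wald_conditions_ok(x1, n1, x2, n2, threshold=5):
    """Return True if both samples have at least `threshold` successes and failures.

    Hypothetical helper: n * p_hat equals the success count x, and
    n * (1 - p_hat) equals the failure count n - x.
    """
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(count >= threshold for count in counts)


# Counts from Example 1 on this page: every cell is far above 5.
print(wald_conditions_ok(120, 200, 130, 250))  # True
# A small sample with only 2 successes fails the check.
print(wald_conditions_ok(2, 30, 10, 30))       # False
```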

Alternative Methods¶

Method	Description	When to Use
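Wald	\(\hat\Delta \pm z \cdot \text{SE}\), with \(\hat\Delta = \hat{p}_1 - \hat{p}_2\)	Large \(n\), proportions not near 0 or 1
Newcombe (Wilson-based)	Wilson CI \([L_i, U_i]\) per group, then square-and-add: \(\left[\hat\Delta - \sqrt{(\hat{p}_1 - L_1)^2 + (U_2 - \hat{p}_2)^2},\; \hat\Delta + \sqrt{(U_1 - \hat{p}_1)^2 + (\hat{p}_2 - L_2)^2}\right]\)	Recommended default
Clopper–Pearson combined	Exact CI \([L_i, U_i]\) per group, then combine: \([L_1 - U_2,\; U_1 - L_2]\)	Small \(n\), regulatory settings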
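
A minimal sketch of the Newcombe interval as it is usually defined: the hybrid score method, which combines the two Wilson intervals by a square-and-add rule rather than by subtracting opposite limits. The function names `wilson_interval` and `newcombe_interval` are assumptions made for this illustration, and only the Python standard library is used.

```python
import math
from statistics import NormalDist


def wilson_interval(x, n, z):
    """Wilson score interval for a single proportion x / n."""
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def newcombe_interval(x1, n1, x2, n2, confidence_level=0.95):
    """Newcombe hybrid score interval for p1 - p2 (square-and-add of Wilson limits)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    p1_hat, p2_hat = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, z)
    l2, u2 = wilson_interval(x2, n2, z)
    diff = p1_hat - p2_hat
    # Each limit uses the distances from the point estimates to the Wilson limits.
    lower = diff - math.sqrt((p1_hat - l1) ** 2 + (u2 - p2_hat) ** 2)
    upper = diff + math.sqrt((u1 - p1_hat) ** 2 + (p2_hat - l2) ** 2)
    return lower, upper


# Same counts as Example 1 on this page.
lower, upper = newcombe_interval(120, 200, 130, 250)
print(f"Newcombe 95% CI: ({lower:.4f}, {upper:.4f})")
```

Unlike the Wald interval, the result is not forced to be symmetric around \(\hat{p}_1 - \hat{p}_2\), and it always stays within \([-1, 1]\).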

Python Code¶

import numpy as np
import scipy.stats as stats

n1, n2 = 200, 250
x1, x2 = 120, 130
confidence_level = 0.95

p1 = x1 / n1
p2 = x2 / n2

standard_error = np.sqrt((p1 * (1 - p1) / n1) + (p2 * (1 - p2) / n2))
z_critical = stats.norm.ppf(1 - (1 - confidence_level) / 2)
margin_of_error = z_critical * standard_error

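diff = p1 - p2
confidence_interval = (diff - margin_of_error, diff + margin_of_error)
# Format the bounds explicitly so NumPy scalar reprs do not leak into the output.
print(f"{confidence_level:.0%} CI: ({confidence_interval[0]:.4f}, {confidence_interval[1]:.4f})")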

Examples¶

Example 1: 95% CI for Difference in Proportions¶

Sample 1: \(n_1 = 200\), \(x_1 = 120\) successes. Sample 2: \(n_2 = 250\), \(x_2 = 130\) successes.

Solution.

\[ \hat{p}_1 = \frac{120}{200} = 0.60, \qquad \hat{p}_2 = \frac{130}{250} = 0.52, \qquad \hat{p}_1 - \hat{p}_2 = 0.08 \]

\[ \text{SE} = \sqrt{\frac{0.60 \times 0.40}{200} + \frac{0.52 \times 0.48}{250}} \approx \sqrt{0.0012 + 0.000998} \approx \sqrt{0.002198} \approx 0.0469 \]

\[ \text{ME} = 1.96 \times 0.0469 \approx 0.0919 \]

\[ 0.08 \pm 0.0919 \;\Longrightarrow\; \boxed{(-0.0119,\ 0.1719)} \]

We are 95% confident that the true difference \(p_1 - p_2\) lies between \(-0.0119\) and \(0.1719\). Since the interval includes zero, the data do not show a statistically significant difference at the 5% significance level.

Example 2: New High School Construction¶

Duncan compares support for a new high school in the north and south parts of the city.

Support?	North	South
Yes	54	77
No	66	63
Total	120	140

Construct a 90% CI for \(p_N - p_S\).

Solution.

import numpy as np
from scipy import stats

n_1 = 120  # north
n_2 = 140  # south
p_1_hat = 54 / n_1
p_2_hat = 77 / n_2

confidence_level = 0.90
alpha = 1 - confidence_level
z_star = -stats.norm().ppf(alpha / 2)
margin_of_error = z_star * np.sqrt(
    p_1_hat * (1 - p_1_hat) / n_1 + p_2_hat * (1 - p_2_hat) / n_2
)
print(f"90% CI: {p_1_hat - p_2_hat:.4f} ± {margin_of_error:.4f}")
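
For reference, the same calculation can be written out by hand. The rounded values below are computed from the table above and use \(z_{0.05} \approx 1.645\).

\[ \hat{p}_N = \frac{54}{120} = 0.45, \qquad \hat{p}_S = \frac{77}{140} = 0.55, \qquad \hat{p}_N - \hat{p}_S = -0.10 \]

\[ \text{SE} = \sqrt{\frac{0.45 \times 0.55}{120} + \frac{0.55 \times 0.45}{140}} \approx \sqrt{0.002063 + 0.001768} \approx 0.0619 \]

\[ \text{ME} = 1.645 \times 0.0619 \approx 0.1018 \]

\[ -0.10 \pm 0.1018 \;\Longrightarrow\; (-0.2018,\ 0.0018) \]

The interval barely includes zero, so at the 10% significance level the data do not establish a difference in support between the north and the south.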

Simulation: Difference of Two Proportions CI Coverage¶

#!/usr/bin/env python3
"""
Difference of two proportions CI simulation: Newcombe, Wald, Clopper-Pearson.
"""

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm, beta

rng_seed = None
n_simulations = 100
n1, n2 = 50, 40
p1_true, p2_true = 0.60, 0.50
alpha = 0.05
method = "newcombe"  # 'newcombe' | 'wald' | 'cp'


def main():
    if rng_seed is not None:
        np.random.seed(rng_seed)

    delta_true = p1_true - p2_true
    z = norm.ppf(1 - alpha / 2.0)
    lowers = np.empty(n_simulations)
    uppers = np.empty(n_simulations)
    centers = np.empty(n_simulations)

    for i in range(n_simulations):
        k1 = np.random.binomial(n1, p1_true)
        k2 = np.random.binomial(n2, p2_true)
        p1hat, p2hat = k1 / n1, k2 / n2
        centers[i] = p1hat - p2hat

        if method == "wald":
            se = np.sqrt(p1hat * (1 - p1hat) / n1 + p2hat * (1 - p2hat) / n2)
            lo, hi = centers[i] - z * se, centers[i] + z * se
        elif method == "newcombe":
            denom1 = 1 + z**2 / n1
            center1 = (p1hat + z**2 / (2 * n1)) / denom1
            half1 = z * np.sqrt(p1hat * (1 - p1hat) / n1 + z**2 / (4 * n1**2)) / denom1
            L1, U1 = center1 - half1, center1 + half1
            denom2 = 1 + z**2 / n2
            center2 = (p2hat + z**2 / (2 * n2)) / denom2
            half2 = z * np.sqrt(p2hat * (1 - p2hat) / n2 + z**2 / (4 * n2**2)) / denom2
            L2, U2 = center2 - half2, center2 + half2
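            # Newcombe's hybrid score limits: square-and-add the distances
            # from each point estimate to its Wilson limits.
            lo = centers[i] - np.sqrt((p1hat - L1) ** 2 + (U2 - p2hat) ** 2)
            hi = centers[i] + np.sqrt((U1 - p1hat) ** 2 + (p2hat - L2) ** 2)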
        elif method == "cp":
            L1 = 0.0 if k1 == 0 else beta.ppf(alpha / 2.0, k1, n1 - k1 + 1)
            U1 = 1.0 if k1 == n1 else beta.ppf(1 - alpha / 2.0, k1 + 1, n1 - k1)
            L2 = 0.0 if k2 == 0 else beta.ppf(alpha / 2.0, k2, n2 - k2 + 1)
            U2 = 1.0 if k2 == n2 else beta.ppf(1 - alpha / 2.0, k2 + 1, n2 - k2)
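            # Combine the two exact intervals by subtracting opposite limits.
            lo, hi = L1 - U2, U1 - L2
        else:
            raise ValueError(f"unknown method: {method!r}")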

        lowers[i] = max(-1.0, lo)
        uppers[i] = min(1.0, hi)

    covered = (lowers <= delta_true) & (delta_true <= uppers)
    n_fail = int((~covered).sum())
    coverage_pct = 100.0 * covered.mean()

    fig, ax = plt.subplots(figsize=(12, 12))
    for i in range(n_simulations):
        color = "k" if covered[i] else "r"
        ax.plot([lowers[i], uppers[i]], [i, i], lw=2, color=color)
        ax.plot(centers[i], i, marker="o", ms=3, color=color)
    ax.axvline(delta_true, linestyle="--", linewidth=1.5, color="r")
    ax.set_title(
        f"{n_simulations} Δ=p1−p2 CIs ({method.title()}) | n1={n1}, n2={n2}, "
        f"CL={int((1 - alpha) * 100)}% | Fail={n_fail} (Coverage ≈ {coverage_pct:.1f}%)")
    ax.set_yticks([])
    for sp in ["left", "right", "top"]:
        ax.spines[sp].set_visible(False)
    ax.set_xlabel("Δ = p1 − p2")
    plt.tight_layout()
    plt.show()


if __name__ == "__main__":
    main()
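
With only 100 simulated intervals, the coverage percentage in the plot title is a noisy estimate (its standard error is roughly two percentage points). The sketch below estimates Wald coverage from many more draws without plotting. The function name `wald_coverage`, the fixed seed, and the choice of 20,000 simulations are assumptions made for illustration.

```python
from statistics import NormalDist

import numpy as np


def wald_coverage(n1, n2, p1, p2, alpha=0.05, n_simulations=20_000, seed=0):
    """Estimate the fraction of Wald intervals that contain the true p1 - p2."""
    rng = np.random.default_rng(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Draw all simulated success counts at once instead of looping.
    p1_hat = rng.binomial(n1, p1, size=n_simulations) / n1
    p2_hat = rng.binomial(n2, p2, size=n_simulations) / n2
    diff = p1_hat - p2_hat
    se = np.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    delta_true = p1 - p2
    covered = (diff - z * se <= delta_true) & (delta_true <= diff + z * se)
    return covered.mean()


# Same sample sizes and true proportions as the simulation script above.
coverage = wald_coverage(n1=50, n2=40, p1=0.60, p2=0.50)
print(f"Estimated Wald coverage: {coverage:.3f}")
```

Swapping in the limits of another method inside the function would give a comparable estimate for that method.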

Key Points¶

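The Wald confidence interval for \(p_1 - p_2\) relies on the normal approximation to the binomial distribution, so it needs large samples with proportions not too close to 0 or 1.
The width depends on the sample proportions, sample sizes, and confidence level: it shrinks as the sample sizes grow and widens as the confidence level increases.
If the confidence interval includes zero, the difference between the two proportions is not statistically significant at the corresponding significance level \(\alpha\) (for example, 5% for a 95% interval).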
The Newcombe (Wilson-based) method is recommended as the default for better coverage, especially with moderate sample sizes.
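
To make the dependence of the width on sample size concrete, here is a small sketch that evaluates the Wald margin of error for equal group sizes. The helper `wald_margin` and the chosen sample sizes are hypothetical; the proportions 0.60 and 0.52 are borrowed from Example 1.

```python
import math
from statistics import NormalDist


def wald_margin(p1_hat, p2_hat, n, confidence_level=0.95):
    """Wald margin of error for p1 - p2 when both groups have n observations."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z * math.sqrt((p1_hat * (1 - p1_hat) + p2_hat * (1 - p2_hat)) / n)


for n in (50, 200, 800):
    print(f"n = {n:>3} per group: margin of error = {wald_margin(0.60, 0.52, n):.4f}")
```

Because the margin scales as \(1/\sqrt{n}\), quadrupling the per-group sample size halves it.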

Exercise¶

Exercise: 95% CI for Difference Between Two Proportions¶

In two independent samples, 150 out of 200 prefer a brand in sample 1, and 120 out of 180 prefer the same brand in sample 2. Construct a 95% CI for \(p_1 - p_2\).

Solution.

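\[ \hat{p}_1 = \frac{150}{200} = 0.75, \qquad \hat{p}_2 = \frac{120}{180} \approx 0.6667, \qquad \hat{p}_1 - \hat{p}_2 \approx 0.0833 \]

\[ \text{SE} = \sqrt{\frac{0.75 \times 0.25}{200} + \frac{0.6667 \times 0.3333}{180}} \approx \sqrt{0.000938 + 0.001235} \approx 0.0466 \]

\[ \text{ME} = 1.96 \times 0.0466 \approx 0.0913 \]

\[ 0.0833 \pm 0.0913 \;\Longrightarrow\; \boxed{(-0.008,\ 0.175)} \]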