Marginal and Conditional Distributions¶
Overview¶
Given a joint distribution of two random variables, the marginal distribution recovers the distribution of each variable individually, while the conditional distribution describes one variable given a specific value of the other. These concepts are essential for Bayesian reasoning, regression, and understanding dependence.
Marginal Distributions¶
Discrete Case¶
From the joint PMF \(p_{X,Y}(x,y)\), the marginal PMFs are obtained by summing over the other variable:

\[p_X(x) = \sum_y p_{X,Y}(x, y), \qquad p_Y(y) = \sum_x p_{X,Y}(x, y)\]
Continuous Case¶
From the joint PDF \(f_{X,Y}(x,y)\), the marginal PDFs are:

\[f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx\]
Intuition: Marginalizing "integrates out" the other variable, projecting the joint distribution onto a single axis.
Conditional Distributions¶
Discrete Case¶
The conditional PMF of \(Y\) given \(X = x\) is:

\[p_{Y|X}(y \mid x) = \frac{p_{X,Y}(x, y)}{p_X(x)}, \qquad \text{provided } p_X(x) > 0\]
Continuous Case¶
The conditional PDF of \(Y\) given \(X = x\) is:

\[f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}, \qquad \text{provided } f_X(x) > 0\]
Conditional Expectation¶

The expected value of \(Y\) under the conditional distribution:

\[E[Y \mid X = x] = \sum_y y\, p_{Y|X}(y \mid x) \ \text{(discrete)}, \qquad E[Y \mid X = x] = \int_{-\infty}^{\infty} y\, f_{Y|X}(y \mid x)\, dy \ \text{(continuous)}\]
Conditional Variance¶

The variance of \(Y\) under the conditional distribution:

\[\text{Var}(Y \mid X = x) = E[Y^2 \mid X = x] - \left(E[Y \mid X = x]\right)^2\]
Fundamental Relationships¶
Multiplication Rule¶
The joint distribution can always be factored as:

\[f_{X,Y}(x, y) = f_{Y|X}(y \mid x)\, f_X(x) = f_{X|Y}(x \mid y)\, f_Y(y)\]
Law of Total Expectation¶

\[E[Y] = E\big[E[Y \mid X]\big]\]

The overall mean of \(Y\) is the average of the conditional means \(E[Y \mid X = x]\), weighted by the distribution of \(X\).
Law of Total Variance (Eve's Law)¶

\[\text{Var}(Y) = E\big[\text{Var}(Y \mid X)\big] + \text{Var}\big(E[Y \mid X]\big)\]
The total variance decomposes into the mean of conditional variances (unexplained variance) plus the variance of conditional means (explained variance).
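This decomposition can be checked numerically on a small joint PMF (the same table used in the discrete worked example below); a quick NumPy sketch:

```python
import numpy as np

# Joint PMF over X (rows) and Y (columns), each taking values 0, 1, 2
pmf = np.array([[0.10, 0.15, 0.05],
                [0.10, 0.25, 0.10],
                [0.05, 0.10, 0.10]])
y = np.array([0, 1, 2])

p_X = pmf.sum(axis=1)                 # marginal of X
cond = pmf / p_X[:, None]             # row x holds P(Y | X=x)

E_Y_given_X = cond @ y                # E[Y | X=x] for each x
Var_Y_given_X = cond @ y**2 - E_Y_given_X**2

# Eve's law: Var(Y) = E[Var(Y|X)] + Var(E[Y|X])
within = p_X @ Var_Y_given_X                          # mean of conditional variances
between = p_X @ E_Y_given_X**2 - (p_X @ E_Y_given_X)**2  # variance of conditional means

p_Y = pmf.sum(axis=0)
Var_Y = p_Y @ y**2 - (p_Y @ y)**2
print(within + between, Var_Y)        # both ≈ 0.5
```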
Bayes' Theorem for Distributions¶
Combining the multiplication rule and marginal distributions yields Bayes' theorem:

\[f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x)\, f_X(x)}{f_Y(y)}\]
This is the foundation of Bayesian inference: update the prior \(f_X(x)\) with the likelihood \(f_{Y|X}(y \mid x)\) to obtain the posterior \(f_{X|Y}(x \mid y)\).
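As a concrete discrete illustration (using the joint PMF from the worked example below), the posterior over \(X\) given an observed \(Y\) follows directly from this formula:

```python
import numpy as np

# Joint PMF over X (rows) and Y (columns), each taking values 0, 1, 2
pmf = np.array([[0.10, 0.15, 0.05],
                [0.10, 0.25, 0.10],
                [0.05, 0.10, 0.10]])

p_X = pmf.sum(axis=1)          # prior p_X(x)
p_Y = pmf.sum(axis=0)          # evidence p_Y(y)
lik = pmf / p_X[:, None]       # likelihood p_{Y|X}(y | x)

# Posterior p_{X|Y}(x | y) via Bayes' theorem, for observed y = 1
y_obs = 1
posterior = lik[:, y_obs] * p_X / p_Y[y_obs]
print(posterior)               # ≈ [0.3, 0.5, 0.2]
```

Note that the posterior is just column \(y\) of the joint table renormalized to sum to one, which is exactly what dividing by the marginal \(p_Y(y)\) accomplishes.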
Worked Example: Discrete¶
Problem: Using the joint PMF:
| | \(Y=0\) | \(Y=1\) | \(Y=2\) | \(p_X(x)\) |
|---|---|---|---|---|
| \(X=0\) | 0.10 | 0.15 | 0.05 | 0.30 |
| \(X=1\) | 0.10 | 0.25 | 0.10 | 0.45 |
| \(X=2\) | 0.05 | 0.10 | 0.10 | 0.25 |
| \(p_Y(y)\) | 0.25 | 0.50 | 0.25 | 1.00 |
Find \(P(Y = 1 \mid X = 1)\) and \(E[Y \mid X = 1]\).
Solution:

\[P(Y = 1 \mid X = 1) = \frac{p_{X,Y}(1, 1)}{p_X(1)} = \frac{0.25}{0.45} = \frac{5}{9} \approx 0.556\]

\[E[Y \mid X = 1] = \frac{0 \cdot 0.10 + 1 \cdot 0.25 + 2 \cdot 0.10}{0.45} = \frac{0.45}{0.45} = 1\]
Worked Example: Continuous¶
Problem: Let \(f_{X,Y}(x,y) = 2\) for \(0 \leq x \leq y \leq 1\). Find \(f_X(x)\), \(f_{Y|X}(y \mid x)\), and \(E[Y \mid X = x]\).
Solution:
Marginal of \(X\):

\[f_X(x) = \int_x^1 2\, dy = 2(1 - x), \qquad 0 \leq x \leq 1\]
Conditional PDF of \(Y\) given \(X = x\):

\[f_{Y|X}(y \mid x) = \frac{2}{2(1 - x)} = \frac{1}{1 - x}, \qquad x \leq y \leq 1\]
This is \(\text{Uniform}(x, 1)\).
Conditional expectation:

\[E[Y \mid X = x] = \int_x^1 \frac{y}{1 - x}\, dy = \frac{1 - x^2}{2(1 - x)} = \frac{1 + x}{2}\]
Verification via Law of Total Expectation:

\[E[Y] = \int_0^1 \frac{1 + x}{2} \cdot 2(1 - x)\, dx = \int_0^1 (1 - x^2)\, dx = \frac{2}{3}\]

which matches the direct computation \(E[Y] = \int_0^1 \int_0^y 2y\, dx\, dy = \int_0^1 2y^2\, dy = \frac{2}{3}\).
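These integrals can also be checked symbolically; a sketch using sympy (assuming it is available):

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)

# Marginal: f_X(x) = integral of 2 dy over y in [x, 1]
f_X = sp.integrate(2, (y, x, 1))

# Conditional mean: E[Y | X=x] = integral of y * f(x,y)/f_X(x) dy
E_Y_given_x = sp.integrate(y * 2 / f_X, (y, x, 1))

# Law of Total Expectation: E[Y] = integral of E[Y|X=x] * f_X(x) dx
E_Y = sp.integrate(E_Y_given_x * f_X, (x, 0, 1))
print(sp.simplify(f_X), sp.simplify(E_Y_given_x), E_Y)
```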
Python: Marginal and Conditional Distributions¶
Discrete Marginals and Conditionals¶
import numpy as np
import pandas as pd
pmf = np.array([
    [0.10, 0.15, 0.05],
    [0.10, 0.25, 0.10],
    [0.05, 0.10, 0.10]
])
# Marginals
p_X = pmf.sum(axis=1)
p_Y = pmf.sum(axis=0)
print("Marginal of X:", p_X)
print("Marginal of Y:", p_Y)
# Conditional P(Y | X=1)
x_val = 1
cond_Y_given_X1 = pmf[x_val, :] / p_X[x_val]
print(f"\nP(Y|X={x_val}):", cond_Y_given_X1)
# Conditional expectation E[Y | X=1]
y_vals = np.array([0, 1, 2])
E_Y_given_X1 = np.sum(y_vals * cond_Y_given_X1)
print(f"E[Y|X={x_val}] = {E_Y_given_X1:.4f}")
Continuous Marginals via Integration¶
import numpy as np
from scipy import integrate
# f(x,y) = 2 for 0 <= x <= y <= 1
def joint_pdf(x, y):
    return 2.0 if 0 <= x <= y <= 1 else 0.0
# Marginal f_X(x) = integral of f(x,y) dy from x to 1
def marginal_X(x):
    result, _ = integrate.quad(lambda y: joint_pdf(x, y), x, 1)
    return result
# E[Y | X=x] via conditional
def E_Y_given_X(x):
    fx = marginal_X(x)
    if fx == 0:
        return 0.0
    result, _ = integrate.quad(lambda y: y * joint_pdf(x, y) / fx, x, 1)
    return result
# Verify Law of Total Expectation
E_Y, _ = integrate.quad(lambda x: E_Y_given_X(x) * marginal_X(x), 0, 1)
print(f"E[Y] via Law of Total Expectation: {E_Y:.4f}") # Should be 2/3
Visualizing Conditional Distributions¶
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
np.random.seed(42)
mean = [0, 0]
cov = [[1, 0.8], [0.8, 1]]
samples = np.random.multivariate_normal(mean, cov, 100_000)
fig, ax = plt.subplots(figsize=(12, 3))
# Conditional distributions of Y given X ≈ -1, 0, 1
for x_cond in [-1, 0, 1]:
    mask = np.abs(samples[:, 0] - x_cond) < 0.1
    ax.hist(samples[mask, 1], bins=50, density=True, alpha=0.4,
            label=f'Y | X≈{x_cond}')
ax.spines[['top', 'right']].set_visible(False)
ax.set_xlabel('Y')
ax.legend()
plt.show()
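For the standard bivariate normal used above, these conditional distributions are known in closed form: \(Y \mid X = x \sim N(\rho x,\, 1 - \rho^2)\), so the histograms shift with \(x\) while keeping the same spread. A quick check with scipy:

```python
import numpy as np
from scipy import stats

rho = 0.8  # correlation used in the simulation above

# For a standard bivariate normal, Y | X = x ~ Normal(rho * x, 1 - rho**2)
for x_cond in [-1, 0, 1]:
    cond = stats.norm(loc=rho * x_cond, scale=np.sqrt(1 - rho**2))
    print(f"Y | X={x_cond}: mean={cond.mean():.2f}, sd={cond.std():.2f}")
```

The conditional standard deviation \(\sqrt{1 - 0.8^2} = 0.6\) is the same for every \(x\), which is why the three histograms have identical widths.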
Key Takeaways¶
- Marginal distributions are obtained by summing or integrating the joint distribution over the other variable.
- Conditional distributions describe one variable given a known value of another, computed as the joint divided by the marginal.
- The Law of Total Expectation and Law of Total Variance connect marginal and conditional moments.
- The multiplication rule \(f_{X,Y} = f_{Y|X} \cdot f_X\) provides the foundation for Bayes' theorem and Bayesian inference.
- Conditional expectation \(E[Y \mid X]\) is itself a random variable (a function of \(X\)) and represents the best prediction of \(Y\) given \(X\).