
Moment Generating Functions

Overview

The moment generating function (MGF) encodes the moments of a random variable in a single function. When it exists, it gives a convenient way to compute moments, prove limit theorems, and characterize distributions: if two random variables have MGFs that agree on an open interval around 0, they have the same distribution.


Definition

The moment generating function of a random variable \(X\) is defined as:

\[ M_X(t) = E\left[e^{tX}\right] = \begin{cases} \displaystyle\sum_x e^{tx} \cdot P(X = x), & \text{discrete} \\[10pt] \displaystyle\int_{-\infty}^{\infty} e^{tx} f(x) \, dx, & \text{continuous} \end{cases} \]

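The MGF is said to exist if \(M_X(t)\) is finite for all \(t\) in some open interval \((-h, h)\) containing 0. Not every distribution has one: for the Cauchy distribution, for example, \(E[e^{tX}]\) is infinite for every \(t \neq 0\).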
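To make the discrete case of the definition concrete, here is a minimal sketch that evaluates \(E[e^{tX}]\) by direct summation for a Binomial\((n, p)\) variable and compares it with the standard closed form \((1 - p + pe^t)^n\). The helper name `mgf_discrete` and the parameter values are illustrative assumptions, not part of any library.

```python
import math

def mgf_discrete(t, pmf):
    """E[e^{tX}] for a finite discrete distribution given as {value: probability}."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

# Binomial(n, p) pmf built from the binomial coefficient (illustrative parameters)
n, p = 5, 0.3
binom_pmf = {k: math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

for t in (-1.0, 0.0, 0.5):
    direct = mgf_discrete(t, binom_pmf)        # sum over x of e^{tx} P(X = x)
    closed = (1 - p + p * math.exp(t)) ** n    # closed-form Binomial MGF
    print(f"t = {t:+.1f}: direct sum = {direct:.6f}, closed form = {closed:.6f}")
```

At \(t = 0\) both values equal 1, since \(M_X(0) = E[1] = 1\) for every distribution.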


Why "Moment Generating"?

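Expanding \(e^{tX}\) in its Taylor series and exchanging expectation and summation (valid when the MGF exists on an open interval around 0) reveals the connection to moments: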

\[ M_X(t) = E\left[e^{tX}\right] = E\left[\sum_{k=0}^{\infty} \frac{(tX)^k}{k!}\right] = \sum_{k=0}^{\infty} \frac{t^k}{k!} E[X^k] \]

Taking derivatives and evaluating at \(t = 0\) extracts individual moments:

\[ M_X^{(n)}(0) = \frac{d^n}{dt^n} M_X(t) \bigg|_{t=0} = E[X^n] \]

Specifically:

\[ \begin{aligned} M_X'(0) &= E[X] \\ M_X''(0) &= E[X^2] \\ \text{Var}(X) &= M_X''(0) - \left[M_X'(0)\right]^2 \end{aligned} \]
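As a worked illustration of these formulas, take \(X \sim \text{Poisson}(\lambda)\), whose MGF is \(M_X(t) = \exp\left(\lambda(e^t - 1)\right)\). Differentiating twice with the chain and product rules gives:

\[ \begin{aligned} M_X'(t) &= \lambda e^t M_X(t) & M_X'(0) &= \lambda = E[X] \\ M_X''(t) &= \left(\lambda e^t + \lambda^2 e^{2t}\right) M_X(t) & M_X''(0) &= \lambda + \lambda^2 = E[X^2] \\ \text{Var}(X) &= (\lambda + \lambda^2) - \lambda^2 = \lambda \end{aligned} \]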

Key Properties

Uniqueness

If the MGFs of \(X\) and \(Y\) both exist and agree on an open interval \((-h, h)\) around 0, then \(X\) and \(Y\) have the same distribution:

\[ M_X(t) = M_Y(t) \ \text{for all } t \in (-h, h) \quad \Longrightarrow \quad X \stackrel{d}{=} Y \]

Consequently, recognizing the form of an MGF is enough to identify the distribution.

Linear Transformation

For \(Y = aX + b\):

\[ M_Y(t) = e^{bt} M_X(at) \]
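The identity follows in one line from the definition, for every \(t\) at which \(M_X(at)\) is finite:

\[ M_Y(t) = E\left[e^{t(aX + b)}\right] = E\left[e^{bt} e^{(at)X}\right] = e^{bt} E\left[e^{(at)X}\right] = e^{bt} M_X(at) \]

For example, if \(Z \sim N(0, 1)\) with \(M_Z(t) = e^{t^2/2}\) and \(X = \mu + \sigma Z\), then \(M_X(t) = e^{\mu t} M_Z(\sigma t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)\).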

Sum of Independent Variables

If \(X \perp\!\!\!\perp Y\):

\[ M_{X+Y}(t) = M_X(t) \cdot M_Y(t) \]

This extends to \(n\) mutually independent variables: for \(S_n = X_1 + \cdots + X_n\), \(M_{S_n}(t) = \prod_{i=1}^n M_{X_i}(t)\).
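The product rule can be checked exactly on a small example. The sketch below builds the distribution of the sum of two independent fair dice by enumeration and compares its MGF with the square of a single die's MGF. The helper `mgf_discrete` and the choice of dice are illustrative assumptions.

```python
import math
from itertools import product

def mgf_discrete(t, pmf):
    """E[e^{tX}] for a finite discrete distribution given as {value: probability}."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

# pmf of one fair six-sided die
die = {face: 1 / 6 for face in range(1, 7)}

# Exact pmf of the sum of two independent dice: independence means
# P(X = x, Y = y) = P(X = x) * P(Y = y), accumulated over pairs with the same sum
two_dice = {}
for (x, px), (y, py) in product(die.items(), repeat=2):
    two_dice[x + y] = two_dice.get(x + y, 0.0) + px * py

for t in (-0.5, 0.2, 1.0):
    lhs = mgf_discrete(t, two_dice)      # MGF of X + Y
    rhs = mgf_discrete(t, die) ** 2      # M_X(t) * M_Y(t)
    print(f"t = {t:+.1f}: M_(X+Y)(t) = {lhs:.6f}, M_X(t) * M_Y(t) = {rhs:.6f}")
```

The two columns agree up to floating-point rounding; without independence the joint probabilities would not factor and the identity would generally fail.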


Common MGFs

| Distribution | \(M_X(t)\) | Valid for | Parameters |
|---|---|---|---|
| Bernoulli\((p)\) | \(1 - p + pe^t\) | all \(t\) | \(p \in (0,1)\) |
| Binomial\((n, p)\) | \((1 - p + pe^t)^n\) | all \(t\) | \(n \in \mathbb{N},\ p \in (0,1)\) |
| Poisson\((\lambda)\) | \(\exp\left(\lambda(e^t - 1)\right)\) | all \(t\) | \(\lambda > 0\) |
| Geometric\((p)\) | \(\dfrac{pe^t}{1 - (1-p)e^t}\) | \(t < -\ln(1-p)\) | \(p \in (0,1)\) |
| Exponential\((\lambda)\) | \(\dfrac{\lambda}{\lambda - t}\) | \(t < \lambda\) | \(\lambda > 0\) |
| Normal\((\mu, \sigma^2)\) | \(\exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)\) | all \(t\) | \(\mu \in \mathbb{R},\ \sigma^2 > 0\) |

The Geometric entry uses the number-of-trials convention, with support \(\{1, 2, \dots\}\).
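For the continuous case, an entry can be sanity-checked by numerical integration. The sketch below approximates \(\int_0^\infty e^{tx} \lambda e^{-\lambda x} \, dx\) for the Exponential\((\lambda)\) distribution with a midpoint rule and compares it with \(\lambda / (\lambda - t)\). The function name, the truncation point `upper`, and the step count are assumptions chosen for illustration; the truncation becomes inaccurate as \(t\) approaches \(\lambda\).

```python
import math

def mgf_exponential_numeric(t, lam, upper=40.0, steps=200_000):
    """Midpoint-rule approximation of the integral of e^{tx} * lam * e^{-lam*x} over [0, upper]."""
    if t >= lam:
        raise ValueError("the integral diverges for t >= lam")
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h                      # midpoint of the i-th subinterval
        total += math.exp(t * x) * lam * math.exp(-lam * x)
    return total * h

lam = 2.0
for t in (-1.0, 0.5, 1.5):
    numeric = mgf_exponential_numeric(t, lam)
    closed = lam / (lam - t)                   # closed-form Exponential MGF
    print(f"t = {t:+.1f}: numeric = {numeric:.6f}, closed form = {closed:.6f}")
```

Calling the function with \(t \ge \lambda\) raises an error, mirroring the restriction \(t < \lambda\) under which the MGF is finite.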

Examples

Example: MGF of the Normal Distribution

For \(X \sim N(\mu, \sigma^2)\), the MGF is:

\[ M_X(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right) \]

Extracting moments:

\[ \begin{aligned} M_X'(t) &= \left(\mu + \sigma^2 t\right) M_X(t) \\ M_X'(0) &= \mu = E[X] \\[6pt] M_X''(t) &= \left(\sigma^2 + (\mu + \sigma^2 t)^2\right) M_X(t) \\ M_X''(0) &= \sigma^2 + \mu^2 = E[X^2] \\[6pt] \text{Var}(X) &= (\sigma^2 + \mu^2) - \mu^2 = \sigma^2 \end{aligned} \]

Example: Sum of Independent Normals

If \(X_1 \sim N(\mu_1, \sigma_1^2)\) and \(X_2 \sim N(\mu_2, \sigma_2^2)\) are independent:

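\[ \begin{aligned} M_{X_1 + X_2}(t) &= M_{X_1}(t) \cdot M_{X_2}(t) \\ &= \exp\left(\mu_1 t + \frac{\sigma_1^2 t^2}{2}\right) \exp\left(\mu_2 t + \frac{\sigma_2^2 t^2}{2}\right) \\ &= \exp\left((\mu_1 + \mu_2)t + \frac{(\sigma_1^2 + \sigma_2^2)t^2}{2}\right) \end{aligned} \]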

By uniqueness, \(X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)\).

Example: Proving the CLT (Sketch)

For i.i.d. \(X_i\) with mean \(\mu\) and variance \(\sigma^2 > 0\), let \(\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i\) and \(Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}\). The MGF of \(Z_n\) satisfies:

\[ M_{Z_n}(t) = \left[M_{\frac{X_i - \mu}{\sigma}}\left(\frac{t}{\sqrt{n}}\right)\right]^n \to e^{t^2/2} \quad \text{as } n \to \infty \]

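The limit comes from the expansion \(M_Y(s) = 1 + \frac{s^2}{2} + o(s^2)\) for the standardized variable \(Y = \frac{X_i - \mu}{\sigma}\), which has mean 0 and variance 1; substituting \(s = t/\sqrt{n}\) gives \(\left[1 + \frac{t^2}{2n} + o\left(\frac{1}{n}\right)\right]^n \to e^{t^2/2}\). Since \(e^{t^2/2}\) is the MGF of \(N(0,1)\), the continuity theorem for MGFs yields convergence in distribution of \(Z_n\) to \(N(0,1)\). This sketch assumes the MGF of \(X_i\) exists near 0; the general CLT, which needs only a finite variance, is proved the same way using characteristic functions.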
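The convergence can be observed numerically in a case where everything is available in closed form. The sketch below assumes \(X_i \sim \text{Exponential}(1)\), so \(\mu = \sigma = 1\) and the standardized variable \(Y = X_i - 1\) has MGF \(e^{-s}/(1 - s)\) for \(s < 1\). The function names and the choice of distribution are illustrative assumptions.

```python
import math

def mgf_standardized_exp(s):
    """MGF of Y = X - 1 for X ~ Exponential(1): e^{-s} / (1 - s), finite for s < 1."""
    if s >= 1:
        raise ValueError("MGF is infinite for s >= 1")
    return math.exp(-s) / (1 - s)

def mgf_zn(t, n):
    """MGF of Z_n, the standardized mean of n i.i.d. Exponential(1) variables."""
    return mgf_standardized_exp(t / math.sqrt(n)) ** n

t = 1.0
limit = math.exp(t**2 / 2)   # MGF of N(0, 1) evaluated at t
for n in (4, 100, 10_000, 1_000_000):
    value = mgf_zn(t, n)
    print(f"n = {n:>9,}: M_Zn({t}) = {value:.6f}, gap to limit = {value - limit:.6f}")
```

The sample sizes must satisfy \(n > t^2\) so that \(t/\sqrt{n}\) stays inside the region where the Exponential MGF is finite.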


Python Exploration

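import numpy as np

def mgf_normal(t, mu, sigma2):
    """MGF of Normal(mu, sigma2)."""
    return np.exp(mu * t + sigma2 * t**2 / 2)

def central_diff(f, x0, order, h=1e-3):
    """Central finite-difference estimate of the first or second derivative of f at x0."""
    if order == 1:
        return (f(x0 + h) - f(x0 - h)) / (2 * h)
    if order == 2:
        return (f(x0 + h) - 2 * f(x0) + f(x0 - h)) / h**2
    raise ValueError("order must be 1 or 2")

# Extract moments via numerical differentiation of the MGF at t = 0
mu, sigma2 = 3.0, 4.0

def mgf(t):
    return mgf_normal(t, mu, sigma2)

E_X = central_diff(mgf, 0.0, order=1)    # M'(0)  = E[X]
E_X2 = central_diff(mgf, 0.0, order=2)   # M''(0) = E[X^2]
Var_X = E_X2 - E_X**2

print(f"E[X] = {E_X:.4f} (theoretical: {mu})")
print(f"E[X²] = {E_X2:.4f} (theoretical: {sigma2 + mu**2})")
print(f"Var(X) = {Var_X:.4f} (theoretical: {sigma2})")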

# Compare the MGFs of several distributions on one plot
import numpy as np
import matplotlib.pyplot as plt

def plot_mgf_comparison():
    """Plot MGFs of several distributions."""
    t = np.linspace(-1.5, 1.5, 300)

    fig, ax = plt.subplots(figsize=(12, 4))

    # Normal(0, 1)
    ax.plot(t, np.exp(t**2 / 2), label='N(0, 1)', lw=2)

    # Exponential(1)
    t_exp = t[t < 1]
    ax.plot(t_exp, 1 / (1 - t_exp), label='Exp(1)', lw=2)

    # Poisson(3)
    lam = 3
    ax.plot(t, np.exp(lam * (np.exp(t) - 1)), label='Poisson(3)', lw=2)

    # Bernoulli(0.5)
    p = 0.5
    ax.plot(t, 1 - p + p * np.exp(t), label='Bernoulli(0.5)', lw=2)

    ax.set_xlabel('t')
    ax.set_ylabel('M_X(t)')
    ax.set_title('Moment Generating Functions')
    ax.set_ylim(0, 15)
    ax.legend()
    ax.spines[['top', 'right']].set_visible(False)
    plt.tight_layout()
    plt.show()

plot_mgf_comparison()

# Check the sum-of-normals result by simulation
import numpy as np

def verify_sum_of_normals(n_simulations=100_000):
    """Verify that sum of independent normals is normal via simulation."""
    np.random.seed(42)
    mu1, sigma1 = 2, 3
    mu2, sigma2 = 5, 4

    X1 = np.random.normal(mu1, sigma1, n_simulations)
    X2 = np.random.normal(mu2, sigma2, n_simulations)
    S = X1 + X2

    print(f"E[X1+X2] = {S.mean():.4f} (theoretical: {mu1 + mu2})")
    print(f"Var(X1+X2) = {S.var():.4f} (theoretical: {sigma1**2 + sigma2**2})")

verify_sum_of_normals()

Key Takeaways

  • The MGF \(M_X(t) = E[e^{tX}]\) encodes all moments: the \(n\)-th derivative at 0 gives \(E[X^n]\).
  • If two distributions have the same MGF (in a neighborhood of 0), they are identical.
  • For independent variables, the MGF of the sum is the product of individual MGFs.
  • MGFs provide elegant proofs for results like the distribution of sums of normals and the Central Limit Theorem.