Skip to content

Statistical Distributions Visualization

This document provides practical examples for visualizing probability distributions, including PDFs, CDFs, and comparisons across distribution families.

Mental Model

Visualizing distributions means plotting their PDF (bell curve shape), CDF (cumulative probability from 0 to 1), or both. Use scipy.stats to compute the theoretical curves and Matplotlib to draw them. Overlaying multiple distributions on one Axes with ax.plot() and a legend makes parameter effects (e.g., changing mean or variance) immediately visible.

When to Use Which Distribution

The shape of a distribution encodes its generating mechanism:

Distribution Shape Use when
Normal Symmetric bell Sums of many small effects (CLT)
Exponential Decaying right tail Waiting times between events
Uniform Flat No prior information, equal likelihood
Poisson Discrete, right-skewed Counting rare events in fixed intervals
Binomial Discrete, symmetric for large \(n\) Fixed number of yes/no trials
Gamma Flexible right-skewed Sum of exponential waiting times

Many distributions are related: Gamma generalizes Exponential (\(k=1\)), Chi-squared is Gamma with specific parameters, and Poisson approximates Binomial for large \(n\) and small \(p\). Overlaying related distributions on one plot makes these connections visible.

Unifying Idea: Everything Is Density

All continuous distributions define a density function \(f(x)\) — the same concept that appears throughout this book:

  • Histograms → empirical density (counting)
  • KDE → smoothed density (estimation)
  • 2D density plots → density in two dimensions
  • Distributions on this page → theoretical density (closed-form)

Visualizing distributions is visualizing density from the theoretical side, just as histograms and KDE visualize it from the data side.

Setup

python import matplotlib.pyplot as plt import numpy as np from scipy import stats


Normal Distribution

1. Standard Normal

```python x = np.linspace(-4, 4, 200)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

PDF

axes[0].plot(x, stats.norm.pdf(x), 'b-', linewidth=2) axes[0].fill_between(x, stats.norm.pdf(x), alpha=0.3) axes[0].set_title('Standard Normal PDF') axes[0].set_xlabel('x') axes[0].set_ylabel('f(x)') axes[0].grid(alpha=0.3)

CDF

axes[1].plot(x, stats.norm.cdf(x), 'r-', linewidth=2) axes[1].set_title('Standard Normal CDF') axes[1].set_xlabel('x') axes[1].set_ylabel('F(x)') axes[1].grid(alpha=0.3)

plt.tight_layout() plt.show() ```

2. Varying Parameters

```python x = np.linspace(-10, 10, 200)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

Varying mean

for mu in [-2, 0, 2]: axes[0].plot(x, stats.norm(mu, 1).pdf(x), label=f'μ={mu}') axes[0].set_title('Effect of Mean (σ=1)') axes[0].legend() axes[0].grid(alpha=0.3)

Varying std

for sigma in [0.5, 1, 2]: axes[1].plot(x, stats.norm(0, sigma).pdf(x), label=f'σ={sigma}') axes[1].set_title('Effect of Std Dev (μ=0)') axes[1].legend() axes[1].grid(alpha=0.3)

plt.tight_layout() plt.show() ```

3. Normal Distribution Dashboard

```python mu, sigma = 2, 1.5 x = np.linspace(-4, 8, 200) rv = stats.norm(mu, sigma)

fig, axes = plt.subplots(2, 2, figsize=(12, 10))

PDF

axes[0, 0].plot(x, rv.pdf(x), 'b-', linewidth=2) axes[0, 0].fill_between(x, rv.pdf(x), alpha=0.3) axes[0, 0].axvline(mu, color='red', linestyle='--', label=f'μ={mu}') axes[0, 0].set_title('Probability Density Function') axes[0, 0].legend() axes[0, 0].grid(alpha=0.3)

CDF

axes[0, 1].plot(x, rv.cdf(x), 'g-', linewidth=2) axes[0, 1].axhline(0.5, color='gray', linestyle=':', alpha=0.7) axes[0, 1].axvline(mu, color='red', linestyle='--') axes[0, 1].set_title('Cumulative Distribution Function') axes[0, 1].grid(alpha=0.3)

Histogram + PDF

np.random.seed(42) samples = rv.rvs(1000) axes[1, 0].hist(samples, bins=30, density=True, alpha=0.7, label='Samples') axes[1, 0].plot(x, rv.pdf(x), 'r-', linewidth=2, label='PDF') axes[1, 0].set_title('Sample Histogram vs PDF') axes[1, 0].legend() axes[1, 0].grid(alpha=0.3)

Q-Q Plot

stats.probplot(samples, dist="norm", plot=axes[1, 1]) axes[1, 1].set_title('Q-Q Plot') axes[1, 1].grid(alpha=0.3)

plt.suptitle(f'Normal Distribution (μ={mu}, σ={sigma})', fontsize=14, fontweight='bold') plt.tight_layout() plt.show() ```


Exponential Distribution

1. Basic Visualization

```python x = np.linspace(0, 8, 200)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

for lam in [0.5, 1, 2]: rv = stats.expon(scale=1/lam) axes[0].plot(x, rv.pdf(x), label=f'λ={lam}') axes[1].plot(x, rv.cdf(x), label=f'λ={lam}')

axes[0].set_title('Exponential PDF') axes[0].legend() axes[0].grid(alpha=0.3)

axes[1].set_title('Exponential CDF') axes[1].legend() axes[1].grid(alpha=0.3)

plt.tight_layout() plt.show() ```


Gamma Distribution

```python x = np.linspace(0, 20, 200)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

Varying shape (k)

for k in [1, 2, 5, 9]: rv = stats.gamma(a=k, scale=1) axes[0].plot(x, rv.pdf(x), label=f'k={k}, θ=1')

axes[0].set_title('Gamma PDF: Varying Shape') axes[0].legend() axes[0].grid(alpha=0.3)

Varying scale (θ)

for theta in [0.5, 1, 2]: rv = stats.gamma(a=3, scale=theta) axes[1].plot(x, rv.pdf(x), label=f'k=3, θ={theta}')

axes[1].set_title('Gamma PDF: Varying Scale') axes[1].legend() axes[1].grid(alpha=0.3)

plt.tight_layout() plt.show() ```


Beta Distribution

```python x = np.linspace(0, 1, 200)

fig, ax = plt.subplots(figsize=(10, 6))

params = [(0.5, 0.5), (2, 2), (2, 5), (5, 2), (1, 3)] colors = plt.cm.viridis(np.linspace(0, 1, len(params)))

for (a, b), color in zip(params, colors): rv = stats.beta(a, b) ax.plot(x, rv.pdf(x), color=color, linewidth=2, label=f'α={a}, β={b}')

ax.set_title('Beta Distribution') ax.set_xlabel('x') ax.set_ylabel('Density') ax.legend() ax.grid(alpha=0.3) plt.show() ```


Student's t-Distribution

```python x = np.linspace(-5, 5, 200)

fig, ax = plt.subplots(figsize=(10, 6))

Normal for reference

ax.plot(x, stats.norm.pdf(x), 'k--', linewidth=2, label='Normal')

t-distributions with various df

for df in [1, 2, 5, 30]: ax.plot(x, stats.t(df).pdf(x), linewidth=2, label=f't (df={df})')

ax.set_title("Student's t-Distribution") ax.set_xlabel('x') ax.set_ylabel('Density') ax.legend() ax.grid(alpha=0.3) plt.show() ```


Chi-Square Distribution

```python x = np.linspace(0, 30, 200)

fig, ax = plt.subplots(figsize=(10, 6))

for df in [1, 2, 3, 5, 10]: ax.plot(x, stats.chi2(df).pdf(x), linewidth=2, label=f'df={df}')

ax.set_title('Chi-Square Distribution') ax.set_xlabel('x') ax.set_ylabel('Density') ax.legend() ax.grid(alpha=0.3) ax.set_ylim(0, 0.5) plt.show() ```


Discrete Distributions

1. Binomial Distribution

```python n = 20 x = np.arange(0, n + 1)

fig, ax = plt.subplots(figsize=(10, 6))

for p in [0.2, 0.5, 0.7]: pmf = stats.binom(n, p).pmf(x) ax.bar(x + (p - 0.5) * 0.25, pmf, width=0.25, alpha=0.7, label=f'p={p}')

ax.set_title(f'Binomial Distribution (n={n})') ax.set_xlabel('k') ax.set_ylabel('P(X = k)') ax.legend() ax.grid(alpha=0.3, axis='y') plt.show() ```

2. Poisson Distribution

```python x = np.arange(0, 20)

fig, ax = plt.subplots(figsize=(10, 6))

for lam in [1, 4, 10]: pmf = stats.poisson(lam).pmf(x) ax.plot(x, pmf, 'o-', linewidth=2, markersize=6, label=f'λ={lam}')

ax.set_title('Poisson Distribution') ax.set_xlabel('k') ax.set_ylabel('P(X = k)') ax.legend() ax.grid(alpha=0.3) plt.show() ```

3. Geometric Distribution

```python x = np.arange(1, 15)

fig, ax = plt.subplots(figsize=(10, 6))

for p in [0.2, 0.5, 0.8]: pmf = stats.geom(p).pmf(x) ax.bar(x + (p - 0.5) * 0.25, pmf, width=0.25, alpha=0.7, label=f'p={p}')

ax.set_title('Geometric Distribution') ax.set_xlabel('k (number of trials)') ax.set_ylabel('P(X = k)') ax.legend() ax.grid(alpha=0.3, axis='y') plt.show() ```


Distribution Comparisons

1. Normal vs t-Distribution

```python x = np.linspace(-5, 5, 200)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

PDF comparison

axes[0].plot(x, stats.norm.pdf(x), 'b-', linewidth=2, label='Normal') axes[0].plot(x, stats.t(5).pdf(x), 'r-', linewidth=2, label='t (df=5)') axes[0].fill_between(x, stats.norm.pdf(x), stats.t(5).pdf(x), alpha=0.3) axes[0].set_title('PDF Comparison') axes[0].legend() axes[0].grid(alpha=0.3)

Tail comparison

x_tail = np.linspace(2, 5, 100) axes[1].plot(x_tail, stats.norm.pdf(x_tail), 'b-', linewidth=2, label='Normal') axes[1].plot(x_tail, stats.t(5).pdf(x_tail), 'r-', linewidth=2, label='t (df=5)') axes[1].set_title('Right Tail Comparison') axes[1].legend() axes[1].grid(alpha=0.3)

plt.suptitle('Normal vs t-Distribution', fontsize=14, fontweight='bold') plt.tight_layout() plt.show() ```

2. Exponential Family

```python x = np.linspace(0, 10, 200)

fig, ax = plt.subplots(figsize=(10, 6))

Exponential

ax.plot(x, stats.expon(scale=2).pdf(x), label='Exponential (λ=0.5)')

Gamma

ax.plot(x, stats.gamma(a=2, scale=1).pdf(x), label='Gamma (k=2, θ=1)')

Chi-square

ax.plot(x, stats.chi2(4).pdf(x), label='Chi-square (df=4)')

ax.set_title('Exponential Family Distributions') ax.set_xlabel('x') ax.set_ylabel('Density') ax.legend() ax.grid(alpha=0.3) plt.show() ```


Bivariate Distributions

1. Bivariate Normal

```python from scipy import stats

x = np.linspace(-3, 3, 100) y = np.linspace(-3, 3, 100) X, Y = np.meshgrid(x, y) pos = np.dstack((X, Y))

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

rhos = [-0.7, 0, 0.7]

for ax, rho in zip(axes, rhos): rv = stats.multivariate_normal([0, 0], [[1, rho], [rho, 1]]) Z = rv.pdf(pos) cf = ax.contourf(X, Y, Z, levels=15, cmap='Blues') ax.contour(X, Y, Z, levels=8, colors='navy', linewidths=0.5) ax.set_title(f'ρ = {rho}') ax.set_aspect('equal') plt.colorbar(cf, ax=ax)

plt.suptitle('Bivariate Normal Distribution', fontsize=14, fontweight='bold') plt.tight_layout() plt.show() ```

2. Bivariate Normal with Marginals

```python from mpl_toolkits.axes_grid1 import make_axes_locatable

np.random.seed(42) rho = 0.6 mean = [0, 0] cov = [[1, rho], [rho, 1]] data = np.random.multivariate_normal(mean, cov, 500)

fig, ax_main = plt.subplots(figsize=(8, 8)) divider = make_axes_locatable(ax_main) ax_top = divider.append_axes("top", 1.2, pad=0.1, sharex=ax_main) ax_right = divider.append_axes("right", 1.2, pad=0.1, sharey=ax_main)

Main scatter

ax_main.scatter(data[:, 0], data[:, 1], alpha=0.5, s=20) ax_main.set_xlabel('X') ax_main.set_ylabel('Y')

Marginals

ax_top.hist(data[:, 0], bins=30, density=True, alpha=0.7) x_range = np.linspace(-4, 4, 100) ax_top.plot(x_range, stats.norm.pdf(x_range), 'r-', linewidth=2) plt.setp(ax_top.get_xticklabels(), visible=False)

ax_right.hist(data[:, 1], bins=30, density=True, alpha=0.7, orientation='horizontal') ax_right.plot(stats.norm.pdf(x_range), x_range, 'r-', linewidth=2) plt.setp(ax_right.get_yticklabels(), visible=False)

plt.suptitle(f'Bivariate Normal (ρ={rho}) with Marginals', fontsize=13, y=1.02) plt.show() ```


Complete Overview

```python fig, axes = plt.subplots(3, 3, figsize=(15, 12))

Normal

x = np.linspace(-4, 4, 200) axes[0, 0].plot(x, stats.norm.pdf(x), 'b-', linewidth=2) axes[0, 0].fill_between(x, stats.norm.pdf(x), alpha=0.3) axes[0, 0].set_title('Normal')

Exponential

x = np.linspace(0, 6, 200) axes[0, 1].plot(x, stats.expon.pdf(x), 'g-', linewidth=2) axes[0, 1].fill_between(x, stats.expon.pdf(x), alpha=0.3) axes[0, 1].set_title('Exponential')

Uniform

x = np.linspace(-0.5, 1.5, 200) axes[0, 2].plot(x, stats.uniform.pdf(x), 'r-', linewidth=2) axes[0, 2].fill_between(x, stats.uniform.pdf(x), alpha=0.3) axes[0, 2].set_title('Uniform')

Gamma

x = np.linspace(0, 15, 200) axes[1, 0].plot(x, stats.gamma(a=3).pdf(x), 'purple', linewidth=2) axes[1, 0].fill_between(x, stats.gamma(a=3).pdf(x), alpha=0.3, color='purple') axes[1, 0].set_title('Gamma (k=3)')

Beta

x = np.linspace(0, 1, 200) axes[1, 1].plot(x, stats.beta(2, 5).pdf(x), 'orange', linewidth=2) axes[1, 1].fill_between(x, stats.beta(2, 5).pdf(x), alpha=0.3, color='orange') axes[1, 1].set_title('Beta (α=2, β=5)')

Chi-square

x = np.linspace(0, 20, 200) axes[1, 2].plot(x, stats.chi2(5).pdf(x), 'brown', linewidth=2) axes[1, 2].fill_between(x, stats.chi2(5).pdf(x), alpha=0.3, color='brown') axes[1, 2].set_title('Chi-square (df=5)')

Binomial

x = np.arange(0, 21) axes[2, 0].bar(x, stats.binom(20, 0.5).pmf(x), color='steelblue', alpha=0.7) axes[2, 0].set_title('Binomial (n=20, p=0.5)')

Poisson

x = np.arange(0, 15) axes[2, 1].bar(x, stats.poisson(5).pmf(x), color='seagreen', alpha=0.7) axes[2, 1].set_title('Poisson (λ=5)')

Geometric

x = np.arange(1, 12) axes[2, 2].bar(x, stats.geom(0.3).pmf(x), color='coral', alpha=0.7) axes[2, 2].set_title('Geometric (p=0.3)')

for ax in axes.flat: ax.grid(alpha=0.3)

plt.suptitle('Common Probability Distributions', fontsize=14, fontweight='bold') plt.tight_layout() plt.show() ```


Publication-Quality Figure

```python fig, axes = plt.subplots(2, 2, figsize=(12, 10))

Normal with shaded regions

x = np.linspace(-4, 4, 200) rv = stats.norm() axes[0, 0].plot(x, rv.pdf(x), 'steelblue', linewidth=2) axes[0, 0].fill_between(x, rv.pdf(x), where=(x >= -1) & (x <= 1), alpha=0.4, color='steelblue') axes[0, 0].fill_between(x, rv.pdf(x), where=(x >= -2) & (x <= 2), alpha=0.2, color='steelblue') axes[0, 0].axvline(-1, color='gray', linestyle=':', alpha=0.7) axes[0, 0].axvline(1, color='gray', linestyle=':', alpha=0.7) axes[0, 0].set_title('Standard Normal with σ Regions', fontsize=12) axes[0, 0].set_xlabel('\(x\)') axes[0, 0].set_ylabel('\(f(x)\)')

t-distribution comparison

axes[0, 1].plot(x, rv.pdf(x), 'b-', linewidth=2, label='Normal') for df, color in [(3, 'orange'), (10, 'green')]: axes[0, 1].plot(x, stats.t(df).pdf(x), color=color, linewidth=2, label=f't (df={df})') axes[0, 1].set_title('Normal vs t-Distribution', fontsize=12) axes[0, 1].legend() axes[0, 1].set_xlabel('\(x\)') axes[0, 1].set_ylabel('\(f(x)\)')

Gamma family

x = np.linspace(0, 15, 200) for k, color in [(1, 'red'), (2, 'green'), (5, 'blue')]: axes[1, 0].plot(x, stats.gamma(a=k).pdf(x), color=color, linewidth=2, label=f'k={k}') axes[1, 0].set_title('Gamma Distribution Family', fontsize=12) axes[1, 0].legend() axes[1, 0].set_xlabel('\(x\)') axes[1, 0].set_ylabel('\(f(x)\)')

Beta distribution

x = np.linspace(0, 1, 200) params = [(2, 2), (2, 5), (5, 2)] colors = ['blue', 'green', 'red'] for (a, b), color in zip(params, colors): axes[1, 1].plot(x, stats.beta(a, b).pdf(x), color=color, linewidth=2, label=f'α={a}, β={b}') axes[1, 1].set_title('Beta Distribution', fontsize=12) axes[1, 1].legend() axes[1, 1].set_xlabel('\(x\)') axes[1, 1].set_ylabel('\(f(x)\)')

for ax in axes.flat: ax.grid(alpha=0.3) ax.tick_params(labelsize=10)

plt.suptitle('Probability Distribution Examples', fontsize=14, fontweight='bold') plt.tight_layout() plt.show() ```


Summary Table

Distribution scipy.stats Parameters Support
Normal norm(loc, scale) μ, σ (-∞, ∞)
Exponential expon(scale=1/λ) λ [0, ∞)
Gamma gamma(a, scale) k, θ [0, ∞)
Beta beta(a, b) α, β [0, 1]
t t(df) df (-∞, ∞)
Chi-square chi2(df) df [0, ∞)
Binomial binom(n, p) n, p {0,...,n}
Poisson poisson(mu) λ {0,1,2,...}
Geometric geom(p) p {1,2,3,...}

Exercises

Exercise 1. Write code that plots the probability density function (PDF) of a standard normal distribution \(N(0, 1)\) and shades the area for \(|x| > 1.96\) (the 95% confidence region tails).

Solution to Exercise 1

```python import matplotlib.pyplot as plt import numpy as np from scipy import stats

x = np.linspace(-4, 4, 500) y = stats.norm.pdf(x)

fig, ax = plt.subplots(figsize=(10, 5)) ax.plot(x, y, 'b-', lw=2)

x_left = x[x < -1.96] x_right = x[x > 1.96] ax.fill_between(x_left, stats.norm.pdf(x_left), alpha=0.4, color='red') ax.fill_between(x_right, stats.norm.pdf(x_right), alpha=0.4, color='red')

ax.set_xlabel('\(x\)') ax.set_ylabel('Density') ax.set_title('Standard Normal PDF with 95% Confidence Tails') plt.show() ```


Exercise 2. Create a figure with 2x2 subplots showing the PDFs of four distributions: Normal(0, 1), Exponential(1), Uniform(0, 1), and Chi-squared(3). Label each subplot with the distribution name.

Solution to Exercise 2

```python import matplotlib.pyplot as plt import numpy as np from scipy import stats

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

x1 = np.linspace(-4, 4, 200) axes[0, 0].plot(x1, stats.norm.pdf(x1), 'b-', lw=2) axes[0, 0].set_title('Normal(0, 1)') axes[0, 0].grid(True, alpha=0.3)

x2 = np.linspace(0, 6, 200) axes[0, 1].plot(x2, stats.expon.pdf(x2), 'r-', lw=2) axes[0, 1].set_title('Exponential(1)') axes[0, 1].grid(True, alpha=0.3)

x3 = np.linspace(-0.5, 1.5, 200) axes[1, 0].plot(x3, stats.uniform.pdf(x3), 'g-', lw=2) axes[1, 0].set_title('Uniform(0, 1)') axes[1, 0].grid(True, alpha=0.3)

x4 = np.linspace(0, 12, 200) axes[1, 1].plot(x4, stats.chi2.pdf(x4, df=3), 'm-', lw=2) axes[1, 1].set_title('Chi-squared(df=3)') axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout() plt.show() ```


Exercise 3. Write code that generates 10000 samples from a normal distribution, plots a histogram with density=True, and overlays the theoretical PDF curve. Include a legend distinguishing the histogram from the theoretical curve.

Solution to Exercise 3

```python import matplotlib.pyplot as plt import numpy as np from scipy import stats

np.random.seed(42) samples = np.random.randn(10000)

fig, ax = plt.subplots(figsize=(10, 5)) ax.hist(samples, bins=50, density=True, alpha=0.7, label='Histogram')

x = np.linspace(-4, 4, 200) ax.plot(x, stats.norm.pdf(x), 'r-', lw=2, label='Theoretical PDF')

ax.set_xlabel('\(x\)') ax.set_ylabel('Density') ax.set_title('Histogram vs Theoretical Normal PDF') ax.legend() plt.show() ```


Exercise 4. Create a plot comparing three normal distributions with different parameters: \(N(0, 1)\), \(N(0, 2)\), and \(N(2, 1)\). Use different colors and line styles for each, and add a legend.

Solution to Exercise 4

```python import matplotlib.pyplot as plt import numpy as np from scipy import stats

x = np.linspace(-6, 8, 500)

fig, ax = plt.subplots(figsize=(10, 5)) ax.plot(x, stats.norm(0, 1).pdf(x), 'b-', lw=2, label='\(N(0, 1)\)') ax.plot(x, stats.norm(0, 2).pdf(x), 'r--', lw=2, label='\(N(0, 2)\)') ax.plot(x, stats.norm(2, 1).pdf(x), 'g-.', lw=2, label='\(N(2, 1)\)')

ax.set_xlabel('\(x\)') ax.set_ylabel('Density') ax.set_title('Comparison of Normal Distributions') ax.legend() ax.grid(True, alpha=0.3) plt.show() ```