Statistical Distributions Visualization

This document provides practical examples for visualizing probability distributions, including PDFs, CDFs, and comparisons across distribution families.

Mental Model

Visualizing a distribution means plotting its PDF (the density curve, which is bell-shaped for the normal), its CDF (cumulative probability rising from 0 to 1), or both. Use scipy.stats to compute the theoretical curves and Matplotlib to draw them. Overlaying several distributions on one Axes with ax.plot() and a legend makes the effect of a parameter (for example, the mean or the variance) immediately visible.
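As a small numerical companion to this mental model, the sketch below checks that the CDF is the running area under the PDF. The grid limits and the hand-rolled trapezoid rule are illustrative choices, not part of the original examples.

```python
import numpy as np
from scipy import stats

# Grid wide enough that almost no probability mass lies to the left of it
x = np.linspace(-6, 6, 2001)
pdf = stats.norm.pdf(x)
cdf = stats.norm.cdf(x)

# Running trapezoidal integral of the PDF, starting at the left edge of the grid
dx = np.diff(x)
running_area = np.concatenate(([0.0], np.cumsum(0.5 * (pdf[1:] + pdf[:-1]) * dx)))

# The two curves should agree up to discretization error
max_gap = np.max(np.abs(running_area - cdf))
print(f"largest |running area - CDF| on the grid: {max_gap:.2e}")
```

Plotting `running_area` and `cdf` on the same Axes should give two curves that are visually indistinguishable.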

When to Use Which Distribution

The shape of a distribution encodes its generating mechanism:

| Distribution | Shape | Use when |
|---|---|---|
| Normal | Symmetric bell | Sums of many small effects (CLT) |
| Exponential | Decaying right tail | Waiting times between events |
| Uniform | Flat | No prior information, equal likelihood |
| Poisson | Discrete, right-skewed | Counting rare events in fixed intervals |
| Binomial | Discrete, symmetric when \(p = 0.5\), approximately bell-shaped for large \(n\) | Fixed number of yes/no trials |
| Gamma | Flexible right-skewed | Sum of exponential waiting times |

Many distributions are related: Gamma generalizes Exponential (\(k=1\)), Chi-squared with \(\nu\) degrees of freedom is Gamma with \(k=\nu/2\) and \(\theta=2\), and Poisson with \(\lambda = np\) approximates Binomial for large \(n\) and small \(p\). Overlaying related distributions on one plot makes these connections visible.
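These relationships can also be checked numerically before plotting anything. The following is a minimal sketch; the specific parameter values (scale 2, 4 degrees of freedom, \(n = 1000\), \(p = 0.003\)) are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

x = np.linspace(0.01, 10, 200)

# Gamma with shape k=1 coincides with the Exponential of the same scale
gamma_vs_expon = np.max(np.abs(
    stats.gamma(a=1, scale=2).pdf(x) - stats.expon(scale=2).pdf(x)))

# Chi-squared with df degrees of freedom is Gamma(k=df/2, theta=2)
df = 4
chi2_vs_gamma = np.max(np.abs(
    stats.chi2(df).pdf(x) - stats.gamma(a=df / 2, scale=2).pdf(x)))

# Poisson(n*p) approximates Binomial(n, p) when n is large and p is small
n, p = 1000, 0.003
k = np.arange(0, 16)
binom_vs_poisson = np.max(np.abs(
    stats.binom(n, p).pmf(k) - stats.poisson(n * p).pmf(k)))

print(gamma_vs_expon, chi2_vs_gamma, binom_vs_poisson)
```

The first two differences should sit at floating-point noise because the distributions are identical; the third is small but nonzero because the Poisson is only an approximation.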

Unifying Idea: Everything Is Density

Every continuous distribution on this page is defined by a density function \(f(x)\), the same concept that appears throughout this book (the discrete distributions use a probability mass function instead):

Histograms → empirical density (counting)
KDE → smoothed density (estimation)
2D density plots → density in two dimensions
Distributions on this page → theoretical density (closed-form)

Visualizing distributions is visualizing density from the theoretical side, just as histograms and KDE visualize it from the data side.
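To make the three views of density concrete, the sketch below evaluates a histogram, a Gaussian KDE, and the closed-form PDF at the same points. The sample size, bin count, and seed are assumptions chosen for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
samples = rng.normal(size=5000)

# Empirical density: bar heights are scaled so the total bar area is 1
heights, edges = np.histogram(samples, bins=40, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
hist_area = np.sum(heights * np.diff(edges))

# Smoothed density (KDE) and theoretical density at the bin centers
kde_vals = stats.gaussian_kde(samples)(centers)
pdf_vals = stats.norm.pdf(centers)

hist_err = np.max(np.abs(heights - pdf_vals))
kde_err = np.max(np.abs(kde_vals - pdf_vals))
print(f"histogram area: {hist_area:.3f}")
print(f"max histogram error: {hist_err:.3f}, max KDE error: {kde_err:.3f}")
```

All three arrays live on the same vertical scale, which is why they can be overlaid on a single Axes without rescaling.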

Setup

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
```

Normal Distribution

1. Standard Normal

```python
x = np.linspace(-4, 4, 200)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# PDF
axes[0].plot(x, stats.norm.pdf(x), 'b-', linewidth=2)
axes[0].fill_between(x, stats.norm.pdf(x), alpha=0.3)
axes[0].set_title('Standard Normal PDF')
axes[0].set_xlabel('x')
axes[0].set_ylabel('f(x)')
axes[0].grid(alpha=0.3)

# CDF
axes[1].plot(x, stats.norm.cdf(x), 'r-', linewidth=2)
axes[1].set_title('Standard Normal CDF')
axes[1].set_xlabel('x')
axes[1].set_ylabel('F(x)')
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()
```

2. Varying Parameters

```python
x = np.linspace(-10, 10, 200)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Varying mean
for mu in [-2, 0, 2]:
    axes[0].plot(x, stats.norm(mu, 1).pdf(x), label=f'μ={mu}')
axes[0].set_title('Effect of Mean (σ=1)')
axes[0].legend()
axes[0].grid(alpha=0.3)

# Varying std
for sigma in [0.5, 1, 2]:
    axes[1].plot(x, stats.norm(0, sigma).pdf(x), label=f'σ={sigma}')
axes[1].set_title('Effect of Std Dev (μ=0)')
axes[1].legend()
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()
```

3. Normal Distribution Dashboard

```python
mu, sigma = 2, 1.5
x = np.linspace(-4, 8, 200)
rv = stats.norm(mu, sigma)

fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# PDF
axes[0, 0].plot(x, rv.pdf(x), 'b-', linewidth=2)
axes[0, 0].fill_between(x, rv.pdf(x), alpha=0.3)
axes[0, 0].axvline(mu, color='red', linestyle='--', label=f'μ={mu}')
axes[0, 0].set_title('Probability Density Function')
axes[0, 0].legend()
axes[0, 0].grid(alpha=0.3)

# CDF
axes[0, 1].plot(x, rv.cdf(x), 'g-', linewidth=2)
axes[0, 1].axhline(0.5, color='gray', linestyle=':', alpha=0.7)
axes[0, 1].axvline(mu, color='red', linestyle='--')
axes[0, 1].set_title('Cumulative Distribution Function')
axes[0, 1].grid(alpha=0.3)

# Histogram + PDF
np.random.seed(42)
samples = rv.rvs(1000)
axes[1, 0].hist(samples, bins=30, density=True, alpha=0.7, label='Samples')
axes[1, 0].plot(x, rv.pdf(x), 'r-', linewidth=2, label='PDF')
axes[1, 0].set_title('Sample Histogram vs PDF')
axes[1, 0].legend()
axes[1, 0].grid(alpha=0.3)

# Q-Q Plot
stats.probplot(samples, dist="norm", plot=axes[1, 1])
axes[1, 1].set_title('Q-Q Plot')
axes[1, 1].grid(alpha=0.3)

plt.suptitle(f'Normal Distribution (μ={mu}, σ={sigma})', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
```
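The Q-Q panel can also be read numerically. As a hedged sketch: when `stats.probplot` is called without `plot=`, it returns the quantile pairs together with a least-squares line, and for normal data the slope and intercept of that line should land near σ and μ. The seed and sample size below are arbitrary.

```python
from scipy import stats

mu, sigma = 2, 1.5
samples = stats.norm(mu, sigma).rvs(1000, random_state=42)

# With the default fit=True, probplot returns the quantile pairs and the fitted line
(theoretical_q, ordered_samples), (slope, intercept, r) = stats.probplot(
    samples, dist="norm")

# For normally distributed data: slope ~ sigma, intercept ~ mu, r close to 1
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.4f}")
```

A value of `r` well below 1, or a visibly curved point cloud in the plot, would suggest that the samples are not well described by a normal distribution.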

Exponential Distribution

1. Basic Visualization

```python
x = np.linspace(0, 8, 200)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

for lam in [0.5, 1, 2]:
    rv = stats.expon(scale=1/lam)  # scipy parameterizes by scale = 1/λ
    axes[0].plot(x, rv.pdf(x), label=f'λ={lam}')
    axes[1].plot(x, rv.cdf(x), label=f'λ={lam}')

axes[0].set_title('Exponential PDF')
axes[0].legend()
axes[0].grid(alpha=0.3)

axes[1].set_title('Exponential CDF')
axes[1].legend()
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()
```

Gamma Distribution

```python
x = np.linspace(0, 20, 200)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Varying shape (k)
for k in [1, 2, 5, 9]:
    rv = stats.gamma(a=k, scale=1)
    axes[0].plot(x, rv.pdf(x), label=f'k={k}, θ=1')

axes[0].set_title('Gamma PDF: Varying Shape')
axes[0].legend()
axes[0].grid(alpha=0.3)

# Varying scale (θ)
for theta in [0.5, 1, 2]:
    rv = stats.gamma(a=3, scale=theta)
    axes[1].plot(x, rv.pdf(x), label=f'k=3, θ={theta}')

axes[1].set_title('Gamma PDF: Varying Scale')
axes[1].legend()
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()
```

Beta Distribution

```python
x = np.linspace(0, 1, 200)

fig, ax = plt.subplots(figsize=(10, 6))

params = [(0.5, 0.5), (2, 2), (2, 5), (5, 2), (1, 3)]
colors = plt.cm.viridis(np.linspace(0, 1, len(params)))

for (a, b), color in zip(params, colors):
    rv = stats.beta(a, b)
    ax.plot(x, rv.pdf(x), color=color, linewidth=2, label=f'α={a}, β={b}')

ax.set_title('Beta Distribution')
ax.set_xlabel('x')
ax.set_ylabel('Density')
ax.legend()
ax.grid(alpha=0.3)
plt.show()
```

Student's t-Distribution

```python
x = np.linspace(-5, 5, 200)

fig, ax = plt.subplots(figsize=(10, 6))

# Normal for reference
ax.plot(x, stats.norm.pdf(x), 'k--', linewidth=2, label='Normal')

# t-distributions with various df
for df in [1, 2, 5, 30]:
    ax.plot(x, stats.t(df).pdf(x), linewidth=2, label=f't (df={df})')

ax.set_title("Student's t-Distribution")
ax.set_xlabel('x')
ax.set_ylabel('Density')
ax.legend()
ax.grid(alpha=0.3)
plt.show()
```

Chi-Square Distribution

```python
x = np.linspace(0, 30, 200)

fig, ax = plt.subplots(figsize=(10, 6))

for df in [1, 2, 3, 5, 10]:
    ax.plot(x, stats.chi2(df).pdf(x), linewidth=2, label=f'df={df}')

ax.set_title('Chi-Square Distribution')
ax.set_xlabel('x')
ax.set_ylabel('Density')
ax.legend()
ax.grid(alpha=0.3)
ax.set_ylim(0, 0.5)
plt.show()
```

Discrete Distributions

1. Binomial Distribution

```python
n = 20
x = np.arange(0, n + 1)
width = 0.25

fig, ax = plt.subplots(figsize=(10, 6))

# Shift each series by one bar width so the grouped bars sit side by side
# instead of overlapping
for i, p in enumerate([0.2, 0.5, 0.7]):
    pmf = stats.binom(n, p).pmf(x)
    ax.bar(x + (i - 1) * width, pmf, width=width, alpha=0.7, label=f'p={p}')

ax.set_title(f'Binomial Distribution (n={n})')
ax.set_xlabel('k')
ax.set_ylabel('P(X = k)')
ax.legend()
ax.grid(alpha=0.3, axis='y')
plt.show()
```

2. Poisson Distribution

```python
x = np.arange(0, 20)

fig, ax = plt.subplots(figsize=(10, 6))

for lam in [1, 4, 10]:
    pmf = stats.poisson(lam).pmf(x)
    ax.plot(x, pmf, 'o-', linewidth=2, markersize=6, label=f'λ={lam}')

ax.set_title('Poisson Distribution')
ax.set_xlabel('k')
ax.set_ylabel('P(X = k)')
ax.legend()
ax.grid(alpha=0.3)
plt.show()
```

3. Geometric Distribution

```python
x = np.arange(1, 15)
width = 0.25

fig, ax = plt.subplots(figsize=(10, 6))

# Shift each series by one bar width so the grouped bars sit side by side
# instead of overlapping
for i, p in enumerate([0.2, 0.5, 0.8]):
    pmf = stats.geom(p).pmf(x)
    ax.bar(x + (i - 1) * width, pmf, width=width, alpha=0.7, label=f'p={p}')

ax.set_title('Geometric Distribution')
ax.set_xlabel('k (number of trials)')
ax.set_ylabel('P(X = k)')
ax.legend()
ax.grid(alpha=0.3, axis='y')
plt.show()
```

Distribution Comparisons

1. Normal vs t-Distribution

```python
x = np.linspace(-5, 5, 200)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# PDF comparison
axes[0].plot(x, stats.norm.pdf(x), 'b-', linewidth=2, label='Normal')
axes[0].plot(x, stats.t(5).pdf(x), 'r-', linewidth=2, label='t (df=5)')
axes[0].fill_between(x, stats.norm.pdf(x), stats.t(5).pdf(x), alpha=0.3)
axes[0].set_title('PDF Comparison')
axes[0].legend()
axes[0].grid(alpha=0.3)

# Tail comparison
x_tail = np.linspace(2, 5, 100)
axes[1].plot(x_tail, stats.norm.pdf(x_tail), 'b-', linewidth=2, label='Normal')
axes[1].plot(x_tail, stats.t(5).pdf(x_tail), 'r-', linewidth=2, label='t (df=5)')
axes[1].set_title('Right Tail Comparison')
axes[1].legend()
axes[1].grid(alpha=0.3)

plt.suptitle('Normal vs t-Distribution', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
```
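The heavier right tail of the t-distribution can be quantified with the survival function `sf`, which gives \(P(X > x)\). The thresholds below are illustrative choices; the sketch simply compares tail probabilities for the same two distributions used in the plot.

```python
import numpy as np
from scipy import stats

thresholds = np.array([2.0, 3.0, 4.0])

normal_tail = stats.norm.sf(thresholds)  # P(X > x) under the standard normal
t_tail = stats.t(5).sf(thresholds)       # P(T > x) under t with df=5

# How many times more tail mass the t-distribution carries at each threshold
ratio = t_tail / normal_tail

for x0, pn, pt, rt in zip(thresholds, normal_tail, t_tail, ratio):
    print(f"x > {x0}: normal={pn:.2e}, t(5)={pt:.2e}, ratio={rt:.1f}")
```

The ratio grows as the threshold moves outward, which is the numerical counterpart of the widening gap between the two curves in the right-hand panel.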

2. Exponential Family

```python
x = np.linspace(0, 10, 200)

fig, ax = plt.subplots(figsize=(10, 6))

# Exponential with rate λ=0.5 (scale = 1/λ = 2)
ax.plot(x, stats.expon(scale=2).pdf(x), label='Exponential (λ=0.5)')

# Gamma
ax.plot(x, stats.gamma(a=2, scale=1).pdf(x), label='Gamma (k=2, θ=1)')

# Chi-square (df=4 is the same curve as Gamma with k=2, θ=2)
ax.plot(x, stats.chi2(4).pdf(x), label='Chi-square (df=4)')

ax.set_title('Exponential Family Distributions')
ax.set_xlabel('x')
ax.set_ylabel('Density')
ax.legend()
ax.grid(alpha=0.3)
plt.show()
```

Bivariate Distributions

1. Bivariate Normal

```python
from scipy import stats

x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
pos = np.dstack((X, Y))

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

rhos = [-0.7, 0, 0.7]

for ax, rho in zip(axes, rhos):
    rv = stats.multivariate_normal([0, 0], [[1, rho], [rho, 1]])
    Z = rv.pdf(pos)
    cf = ax.contourf(X, Y, Z, levels=15, cmap='Blues')
    ax.contour(X, Y, Z, levels=8, colors='navy', linewidths=0.5)
    ax.set_title(f'ρ = {rho}')
    ax.set_aspect('equal')
    plt.colorbar(cf, ax=ax)

plt.suptitle('Bivariate Normal Distribution', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
```

2. Bivariate Normal with Marginals

```python
from mpl_toolkits.axes_grid1 import make_axes_locatable

np.random.seed(42)
rho = 0.6
mean = [0, 0]
cov = [[1, rho], [rho, 1]]
data = np.random.multivariate_normal(mean, cov, 500)

fig, ax_main = plt.subplots(figsize=(8, 8))
divider = make_axes_locatable(ax_main)
ax_top = divider.append_axes("top", 1.2, pad=0.1, sharex=ax_main)
ax_right = divider.append_axes("right", 1.2, pad=0.1, sharey=ax_main)

# Main scatter
ax_main.scatter(data[:, 0], data[:, 1], alpha=0.5, s=20)
ax_main.set_xlabel('X')
ax_main.set_ylabel('Y')

# Marginals
ax_top.hist(data[:, 0], bins=30, density=True, alpha=0.7)
x_range = np.linspace(-4, 4, 100)
ax_top.plot(x_range, stats.norm.pdf(x_range), 'r-', linewidth=2)
plt.setp(ax_top.get_xticklabels(), visible=False)

ax_right.hist(data[:, 1], bins=30, density=True, alpha=0.7, orientation='horizontal')
ax_right.plot(stats.norm.pdf(x_range), x_range, 'r-', linewidth=2)
plt.setp(ax_right.get_yticklabels(), visible=False)

plt.suptitle(f'Bivariate Normal (ρ={rho}) with Marginals', fontsize=13, y=1.02)
plt.show()
```
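The red marginal curves above are standard normal PDFs because, with unit variances on the diagonal of the covariance matrix, each coordinate of the bivariate normal is itself \(N(0, 1)\) regardless of ρ. A quick sanity check, sketched here with a larger sample and NumPy's `default_rng` (both illustrative choices), is to compare sample statistics with the parameters:

```python
import numpy as np

rng = np.random.default_rng(42)
rho = 0.6
cov = [[1, rho], [rho, 1]]
data = rng.multivariate_normal([0, 0], cov, size=5000)

# The sample correlation should be close to rho
sample_rho = np.corrcoef(data[:, 0], data[:, 1])[0, 1]

# Each marginal should have mean near 0 and standard deviation near 1
marginal_mean = data.mean(axis=0)
marginal_std = data.std(axis=0, ddof=1)

print(f"sample correlation: {sample_rho:.3f}")
print(f"marginal means: {marginal_mean}, marginal stds: {marginal_std}")
```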

Distribution Gallery

Complete Overview

```python
fig, axes = plt.subplots(3, 3, figsize=(15, 12))

# Normal
x = np.linspace(-4, 4, 200)
axes[0, 0].plot(x, stats.norm.pdf(x), 'b-', linewidth=2)
axes[0, 0].fill_between(x, stats.norm.pdf(x), alpha=0.3)
axes[0, 0].set_title('Normal')

# Exponential
x = np.linspace(0, 6, 200)
axes[0, 1].plot(x, stats.expon.pdf(x), 'g-', linewidth=2)
axes[0, 1].fill_between(x, stats.expon.pdf(x), alpha=0.3)
axes[0, 1].set_title('Exponential')

# Uniform
x = np.linspace(-0.5, 1.5, 200)
axes[0, 2].plot(x, stats.uniform.pdf(x), 'r-', linewidth=2)
axes[0, 2].fill_between(x, stats.uniform.pdf(x), alpha=0.3)
axes[0, 2].set_title('Uniform')

# Gamma
x = np.linspace(0, 15, 200)
axes[1, 0].plot(x, stats.gamma(a=3).pdf(x), 'purple', linewidth=2)
axes[1, 0].fill_between(x, stats.gamma(a=3).pdf(x), alpha=0.3, color='purple')
axes[1, 0].set_title('Gamma (k=3)')

# Beta
x = np.linspace(0, 1, 200)
axes[1, 1].plot(x, stats.beta(2, 5).pdf(x), 'orange', linewidth=2)
axes[1, 1].fill_between(x, stats.beta(2, 5).pdf(x), alpha=0.3, color='orange')
axes[1, 1].set_title('Beta (α=2, β=5)')

# Chi-square
x = np.linspace(0, 20, 200)
axes[1, 2].plot(x, stats.chi2(5).pdf(x), 'brown', linewidth=2)
axes[1, 2].fill_between(x, stats.chi2(5).pdf(x), alpha=0.3, color='brown')
axes[1, 2].set_title('Chi-square (df=5)')

# Binomial
x = np.arange(0, 21)
axes[2, 0].bar(x, stats.binom(20, 0.5).pmf(x), color='steelblue', alpha=0.7)
axes[2, 0].set_title('Binomial (n=20, p=0.5)')

# Poisson
x = np.arange(0, 15)
axes[2, 1].bar(x, stats.poisson(5).pmf(x), color='seagreen', alpha=0.7)
axes[2, 1].set_title('Poisson (λ=5)')

# Geometric
x = np.arange(1, 12)
axes[2, 2].bar(x, stats.geom(0.3).pmf(x), color='coral', alpha=0.7)
axes[2, 2].set_title('Geometric (p=0.3)')

for ax in axes.flat:
    ax.grid(alpha=0.3)

plt.suptitle('Common Probability Distributions', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
```

Publication-Quality Figure

```python
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Normal with shaded regions
x = np.linspace(-4, 4, 200)
rv = stats.norm()
axes[0, 0].plot(x, rv.pdf(x), 'steelblue', linewidth=2)
axes[0, 0].fill_between(x, rv.pdf(x), where=(x >= -1) & (x <= 1), alpha=0.4, color='steelblue')
axes[0, 0].fill_between(x, rv.pdf(x), where=(x >= -2) & (x <= 2), alpha=0.2, color='steelblue')
axes[0, 0].axvline(-1, color='gray', linestyle=':', alpha=0.7)
axes[0, 0].axvline(1, color='gray', linestyle=':', alpha=0.7)
axes[0, 0].set_title('Standard Normal with σ Regions', fontsize=12)
axes[0, 0].set_xlabel(r'$x$')  # raw strings with $...$ render as math text
axes[0, 0].set_ylabel(r'$f(x)$')

# t-distribution comparison
axes[0, 1].plot(x, rv.pdf(x), 'b-', linewidth=2, label='Normal')
for df, color in [(3, 'orange'), (10, 'green')]:
    axes[0, 1].plot(x, stats.t(df).pdf(x), color=color, linewidth=2, label=f't (df={df})')
axes[0, 1].set_title('Normal vs t-Distribution', fontsize=12)
axes[0, 1].legend()
axes[0, 1].set_xlabel(r'$x$')
axes[0, 1].set_ylabel(r'$f(x)$')

# Gamma family
x = np.linspace(0, 15, 200)
for k, color in [(1, 'red'), (2, 'green'), (5, 'blue')]:
    axes[1, 0].plot(x, stats.gamma(a=k).pdf(x), color=color, linewidth=2, label=f'k={k}')
axes[1, 0].set_title('Gamma Distribution Family', fontsize=12)
axes[1, 0].legend()
axes[1, 0].set_xlabel(r'$x$')
axes[1, 0].set_ylabel(r'$f(x)$')

# Beta distribution
x = np.linspace(0, 1, 200)
params = [(2, 2), (2, 5), (5, 2)]
colors = ['blue', 'green', 'red']
for (a, b), color in zip(params, colors):
    axes[1, 1].plot(x, stats.beta(a, b).pdf(x), color=color, linewidth=2, label=f'α={a}, β={b}')
axes[1, 1].set_title('Beta Distribution', fontsize=12)
axes[1, 1].legend()
axes[1, 1].set_xlabel(r'$x$')
axes[1, 1].set_ylabel(r'$f(x)$')

for ax in axes.flat:
    ax.grid(alpha=0.3)
    ax.tick_params(labelsize=10)

plt.suptitle('Probability Distribution Examples', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
```

Summary Table

| Distribution | scipy.stats | Parameters | Support |
|---|---|---|---|
| Normal | `norm(loc, scale)` | μ, σ | (-∞, ∞) |
| Exponential | `expon(scale=1/λ)` | λ | [0, ∞) |
| Gamma | `gamma(a, scale)` | k, θ | [0, ∞) |
| Beta | `beta(a, b)` | α, β | [0, 1] |
| t | `t(df)` | df | (-∞, ∞) |
| Chi-square | `chi2(df)` | df | [0, ∞) |
| Binomial | `binom(n, p)` | n, p | {0,...,n} |
| Poisson | `poisson(mu)` | λ | {0,1,2,...} |
| Geometric | `geom(p)` | p | {1,2,3,...} |
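Because scipy's argument names do not always match textbook symbols, it can help to confirm a parameter mapping against a known moment before plotting. The sketch below does this for four rows of the table; the numeric values of λ, k, θ, and p are arbitrary.

```python
import numpy as np
from scipy import stats

lam, k, theta, p = 2.0, 3.0, 0.5, 0.25

# Exponential with rate λ has mean 1/λ, so scipy's scale must be 1/λ
expon_mean = stats.expon(scale=1 / lam).mean()

# Gamma with shape k and scale θ has mean k·θ
gamma_mean = stats.gamma(a=k, scale=theta).mean()

# Poisson's single shape argument (named mu in scipy) is the rate λ
poisson_mean = stats.poisson(lam).mean()

# Geometric counts trials up to and including the first success: support starts at 1
geom_mean = stats.geom(p).mean()
geom_pmf_at_0 = stats.geom(p).pmf(0)
geom_pmf_at_1 = stats.geom(p).pmf(1)

print(expon_mean, gamma_mean, poisson_mean, geom_mean, geom_pmf_at_0, geom_pmf_at_1)
```

If a mean does not match the textbook formula, the most likely cause is passing a rate where scipy expects a scale, or the reverse.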

Exercises

Exercise 1. Write code that plots the probability density function (PDF) of a standard normal distribution \(N(0, 1)\) and shades the area for \(|x| > 1.96\) (the two tails outside the central 95% region, each with probability 0.025).

Solution to Exercise 1

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

x = np.linspace(-4, 4, 500)
y = stats.norm.pdf(x)

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, y, 'b-', lw=2)

# Shade the two tails beyond the critical values ±1.96
x_left = x[x < -1.96]
x_right = x[x > 1.96]
ax.fill_between(x_left, stats.norm.pdf(x_left), alpha=0.4, color='red')
ax.fill_between(x_right, stats.norm.pdf(x_right), alpha=0.4, color='red')

ax.set_xlabel(r'$x$')
ax.set_ylabel('Density')
ax.set_title('Standard Normal PDF with the Tails Beyond |x| = 1.96 Shaded')
plt.show()
```

Exercise 2. Create a figure with 2x2 subplots showing the PDFs of four distributions: Normal(0, 1), Exponential(1), Uniform(0, 1), and Chi-squared(3). Label each subplot with the distribution name.

Solution to Exercise 2

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Normal(0, 1)
x1 = np.linspace(-4, 4, 200)
axes[0, 0].plot(x1, stats.norm.pdf(x1), 'b-', lw=2)
axes[0, 0].set_title('Normal(0, 1)')
axes[0, 0].grid(True, alpha=0.3)

# Exponential(1)
x2 = np.linspace(0, 6, 200)
axes[0, 1].plot(x2, stats.expon.pdf(x2), 'r-', lw=2)
axes[0, 1].set_title('Exponential(1)')
axes[0, 1].grid(True, alpha=0.3)

# Uniform(0, 1)
x3 = np.linspace(-0.5, 1.5, 200)
axes[1, 0].plot(x3, stats.uniform.pdf(x3), 'g-', lw=2)
axes[1, 0].set_title('Uniform(0, 1)')
axes[1, 0].grid(True, alpha=0.3)

# Chi-squared(3)
x4 = np.linspace(0, 12, 200)
axes[1, 1].plot(x4, stats.chi2.pdf(x4, df=3), 'm-', lw=2)
axes[1, 1].set_title('Chi-squared(df=3)')
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
```

Exercise 3. Write code that generates 10000 samples from a normal distribution, plots a histogram with density=True, and overlays the theoretical PDF curve. Include a legend distinguishing the histogram from the theoretical curve.

Solution to Exercise 3

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

np.random.seed(42)
samples = np.random.randn(10000)

fig, ax = plt.subplots(figsize=(10, 5))
# density=True rescales the bars so they are comparable to the PDF
ax.hist(samples, bins=50, density=True, alpha=0.7, label='Histogram')

x = np.linspace(-4, 4, 200)
ax.plot(x, stats.norm.pdf(x), 'r-', lw=2, label='Theoretical PDF')

ax.set_xlabel(r'$x$')
ax.set_ylabel('Density')
ax.set_title('Histogram vs Theoretical Normal PDF')
ax.legend()
plt.show()
```

Exercise 4. Create a plot comparing three normal distributions with different parameters: \(N(0, 1)\), \(N(0, 2)\), and \(N(2, 1)\), where the second parameter is the standard deviation \(\sigma\) (matching scipy's `norm(loc, scale)`). Use different colors and line styles for each, and add a legend.

Solution to Exercise 4

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

x = np.linspace(-6, 8, 500)

fig, ax = plt.subplots(figsize=(10, 5))
# stats.norm(loc, scale): the second argument is the standard deviation
ax.plot(x, stats.norm(0, 1).pdf(x), 'b-', lw=2, label=r'$N(0, 1)$')
ax.plot(x, stats.norm(0, 2).pdf(x), 'r--', lw=2, label=r'$N(0, 2)$')
ax.plot(x, stats.norm(2, 1).pdf(x), 'g-.', lw=2, label=r'$N(2, 1)$')

ax.set_xlabel(r'$x$')
ax.set_ylabel('Density')
ax.set_title('Comparison of Normal Distributions')
ax.legend()
ax.grid(True, alpha=0.3)
plt.show()
```