Skip to content

Normal Distributions

NumPy provides multiple functions for generating samples from normal (Gaussian) distributions.

Mental Model

np.random.randn draws from the standard normal \(\mathcal{N}(0,1)\); scale and shift with sigma * randn(...) + mu to get any normal distribution. Alternatively, np.random.normal(mu, sigma, size) does it in one call. The normal distribution appears everywhere because the Central Limit Theorem guarantees that sums of many independent variables converge to it.

np.random.randn

Generates samples from the standard normal distribution \(\mathcal{N}(0, 1)\).

1. Basic Usage

```python import numpy as np import matplotlib.pyplot as plt from scipy import stats

def main(): np.random.seed(0)

n_samples = 10_000
data = np.random.randn(n_samples)

fig, ax = plt.subplots(figsize=(12, 3))

_, bins, _ = ax.hist(data, bins=100, density=True, alpha=0.3, label='Histogram')

pdf = stats.norm().pdf(bins)
ax.plot(bins, pdf, '--r', linewidth=2, label='Standard Normal PDF')

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.legend()
plt.show()

if name == "main": main() ```

2. Shape Argument

Pass dimensions as separate arguments.

```python import numpy as np

def main(): np.random.seed(42)

# 1D array
a = np.random.randn(5)
print(f"1D: {a.shape}")

# 2D array
b = np.random.randn(3, 4)
print(f"2D: {b.shape}")

# 3D array
c = np.random.randn(2, 3, 4)
print(f"3D: {c.shape}")

if name == "main": main() ```

3. Quick Sampling

Use randn for quick standard normal samples with positional shape.

np.random.standard_normal

Alternative syntax for standard normal samples using size keyword.

1. Size Keyword

```python import numpy as np import matplotlib.pyplot as plt from scipy import stats

def main(): np.random.seed(0)

n_samples = 10_000
data = np.random.standard_normal(size=(n_samples,))

fig, ax = plt.subplots(figsize=(12, 3))

_, bins, _ = ax.hist(data, bins=100, density=True, alpha=0.3, label='Histogram')

pdf = stats.norm().pdf(bins)
ax.plot(bins, pdf, '--r', linewidth=2, label='Standard Normal PDF')

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.legend()
plt.show()

if name == "main": main() ```

2. Difference from randn

Uses size keyword tuple instead of positional dimension arguments.

```python import numpy as np

def main(): np.random.seed(42)

# randn: positional arguments
a = np.random.randn(3, 4)

# standard_normal: size keyword
b = np.random.standard_normal(size=(3, 4))

print(f"randn shape: {a.shape}")
print(f"standard_normal shape: {b.shape}")

if name == "main": main() ```

3. Equivalent Results

Both produce standard normal samples; choice is stylistic.

np.random.normal

Generates samples from a general normal distribution \(\mathcal{N}(\mu, \sigma^2)\).

1. Parameters

```python import numpy as np import matplotlib.pyplot as plt from scipy import stats

def main(): np.random.seed(0)

loc = 5      # mean (μ)
scale = 2    # standard deviation (σ)
n_samples = 10_000

data = np.random.normal(loc=loc, scale=scale, size=(n_samples,))

fig, ax = plt.subplots(figsize=(12, 3))

_, bins, _ = ax.hist(data, bins=100, density=True, alpha=0.3, label='Histogram')

pdf = stats.norm(loc=loc, scale=scale).pdf(bins)
ax.plot(bins, pdf, '--r', linewidth=2, label=f'N({loc}, {scale}²) PDF')

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.legend()
plt.show()

if name == "main": main() ```

2. Scaling Relation

\(X \sim \mathcal{N}(\mu, \sigma^2)\) is equivalent to \(X = \mu + \sigma Z\) where \(Z \sim \mathcal{N}(0, 1)\).

```python import numpy as np

def main(): np.random.seed(42)

mu, sigma = 5, 2
n = 10_000

# Method 1: np.random.normal
x1 = np.random.normal(loc=mu, scale=sigma, size=n)

# Method 2: transform standard normal
np.random.seed(42)
z = np.random.randn(n)
x2 = mu + sigma * z

print(f"Method 1 mean: {x1.mean():.4f}")
print(f"Method 2 mean: {x2.mean():.4f}")

if name == "main": main() ```

3. Use for Custom Mean/Std

Use normal when you need to specify mean and standard deviation.

scipy.stats.norm.rvs

The scipy.stats alternative for normal sampling.

1. Basic Usage

```python import numpy as np import matplotlib.pyplot as plt from scipy import stats

def main(): np.random.seed(0)

n_samples = 10_000
data = stats.norm(loc=0, scale=1).rvs(n_samples)

fig, ax = plt.subplots(figsize=(12, 3))

_, bins, _ = ax.hist(data, bins=100, density=True, alpha=0.3, label='Histogram')

pdf = stats.norm().pdf(bins)
ax.plot(bins, pdf, '--r', linewidth=2, label='Standard Normal PDF')

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_xlabel('Value')
ax.set_ylabel('Density')
ax.legend()
plt.show()

if name == "main": main() ```

2. Distribution Object

Create a frozen distribution for repeated use.

```python import numpy as np from scipy import stats

def main(): np.random.seed(42)

# Create distribution object
dist = stats.norm(loc=10, scale=3)

# Sample
samples = dist.rvs(size=5)
print(f"Samples: {samples}")

# Also get PDF, CDF, etc.
print(f"PDF at 10: {dist.pdf(10):.4f}")
print(f"CDF at 10: {dist.cdf(10):.4f}")

if name == "main": main() ```

3. When to Use

Use stats.norm when you also need PDF, CDF, quantiles, or other distribution methods.

Method Comparison

1. All Four Methods

```python import numpy as np from scipy import stats

def main(): np.random.seed(0) n = 5

print("Standard Normal N(0,1) - 4 equivalent methods:")
print()

np.random.seed(42)
print(f"np.random.randn({n}):")
print(f"  {np.random.randn(n)}")

np.random.seed(42)
print(f"np.random.standard_normal(size=({n},)):")
print(f"  {np.random.standard_normal(size=(n,))}")

np.random.seed(42)
print(f"np.random.normal(0, 1, size={n}):")
print(f"  {np.random.normal(0, 1, size=n)}")

np.random.seed(42)
print(f"stats.norm(0, 1).rvs({n}):")
print(f"  {stats.norm(0, 1).rvs(n)}")

if name == "main": main() ```

2. Summary Table

Function Standard Normal General Normal Shape Syntax
randn Positional args
standard_normal size= keyword
normal size= keyword
stats.norm.rvs Positional or size=

3. Recommendations

  • Quick standard normal: randn
  • Custom mean/std: normal
  • Need PDF/CDF too: stats.norm

Multivariate Normal

Generates samples from a multivariate normal distribution.

1. Covariance Matrix

```python import numpy as np import matplotlib.pyplot as plt from scipy import stats

def main(): np.random.seed(42)

mean = [0, 0]
cov = [[1, 0.8], [0.8, 1]]

x = np.random.multivariate_normal(mean, cov, size=1000)
print(f"Shape: {x.shape}")

fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(x[:, 0], x[:, 1], alpha=0.3)
ax.set_xlabel('X1')
ax.set_ylabel('X2')
ax.set_title('Bivariate Normal (ρ=0.8)')
ax.set_aspect('equal')
plt.show()

if name == "main": main() ```

2. Correlation Structure

The covariance matrix determines the shape and orientation.

```python import numpy as np import matplotlib.pyplot as plt

def main(): np.random.seed(42)

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

correlations = [-0.8, 0, 0.8]

for ax, rho in zip(axes, correlations):
    cov = [[1, rho], [rho, 1]]
    x = np.random.multivariate_normal([0, 0], cov, size=500)
    ax.scatter(x[:, 0], x[:, 1], alpha=0.3)
    ax.set_title(f'ρ = {rho}')
    ax.set_xlim(-4, 4)
    ax.set_ylim(-4, 4)
    ax.set_aspect('equal')

plt.tight_layout()
plt.show()

if name == "main": main() ```

3. Higher Dimensions

```python import numpy as np

def main(): np.random.seed(42)

# 4D multivariate normal
mean = [0, 0, 0, 0]
cov = np.eye(4)  # independent components

samples = np.random.multivariate_normal(mean, cov, size=1000)
print(f"Shape: {samples.shape}")
print(f"Sample mean: {samples.mean(axis=0)}")

if name == "main": main() ```

Chi-Square Distribution

A distribution derived from squared normal random variables.

1. Degrees of Freedom

```python import numpy as np import matplotlib.pyplot as plt from scipy import stats

def main(): np.random.seed(0)

df = 5
data = np.random.chisquare(df=df, size=10_000)

fig, ax = plt.subplots(figsize=(10, 4))

_, bins, _ = ax.hist(data, bins=100, density=True, alpha=0.3)

pdf = stats.chi2(df).pdf(bins)
ax.plot(bins, pdf, 'r-', linewidth=2, label=f'χ²({df}) PDF')

ax.set_xlabel('Value')
ax.set_ylabel('Density')
ax.legend()
plt.show()

if name == "main": main() ```

2. Relation to Normal

\(\(\chi^2_k = \sum_{i=1}^{k} Z_i^2\)\) where \(Z_i \sim \mathcal{N}(0, 1)\).

```python import numpy as np

def main(): np.random.seed(42)

k = 5
n_samples = 10_000

# Method 1: np.random.chisquare
chi2_direct = np.random.chisquare(df=k, size=n_samples)

# Method 2: sum of squared normals
z = np.random.randn(n_samples, k)
chi2_manual = (z ** 2).sum(axis=1)

print(f"Direct mean: {chi2_direct.mean():.2f} (expected: {k})")
print(f"Manual mean: {chi2_manual.mean():.2f} (expected: {k})")

if name == "main": main() ```

3. Varying df

```python import numpy as np import matplotlib.pyplot as plt from scipy import stats

def main(): x = np.linspace(0, 30, 200)

fig, ax = plt.subplots(figsize=(10, 4))

for df in [2, 5, 10, 15]:
    pdf = stats.chi2(df).pdf(x)
    ax.plot(x, pdf, linewidth=2, label=f'df={df}')

ax.set_xlabel('x')
ax.set_ylabel('f(x)')
ax.set_title('Chi-Square Distributions')
ax.legend()
plt.show()

if name == "main": main() ```


Exercises

Exercise 1. Generate 10,000 samples from a normal distribution with mean 100 and standard deviation 15. Verify the sample mean and std are close to the true parameters.

Solution to Exercise 1

python import numpy as np rng = np.random.default_rng(42) samples = rng.normal(100, 15, 10000) print(f"Mean: {samples.mean():.1f}") # ~100 print(f"Std: {samples.std():.1f}") # ~15


Exercise 2. Generate standard normal samples and verify that approximately 68% fall within one standard deviation of the mean.

Solution to Exercise 2

python import numpy as np rng = np.random.default_rng(42) samples = rng.standard_normal(10000) within_1std = np.sum(np.abs(samples) <= 1) / len(samples) print(f"Within 1 std: {within_1std:.2%}") # ~68%


Exercise 3. Generate a 2D array of shape (1000, 3) from a standard normal. Compute the mean and std of each column.

Solution to Exercise 3

python import numpy as np rng = np.random.default_rng(42) data = rng.standard_normal((1000, 3)) print("Means:", data.mean(axis=0)) print("Stds:", data.std(axis=0))


Exercise 4. Use the Box-Muller transform to generate normal samples from uniform samples: \(Z = \sqrt{-2\ln U_1}\cos(2\pi U_2)\). Compare with rng.standard_normal.

Solution to Exercise 4

python import numpy as np rng = np.random.default_rng(42) n = 10000 u1 = rng.uniform(0, 1, n) u2 = rng.uniform(0, 1, n) z = np.sqrt(-2 * np.log(u1)) * np.cos(2 * np.pi * u2) print(f"Box-Muller mean: {z.mean():.3f}, std: {z.std():.3f}")