Normal Distributions

NumPy provides multiple functions for generating samples from normal (Gaussian) distributions.

Mental Model

`np.random.randn` draws from the standard normal \(\mathcal{N}(0,1)\); scale and shift with `sigma * randn(...) + mu` to get any normal distribution \(\mathcal{N}(\mu, \sigma^2)\). Alternatively, `np.random.normal(mu, sigma, size)` does it in one call. The normal distribution appears so often because the Central Limit Theorem states that suitably normalized sums of many independent, identically distributed variables with finite variance converge in distribution to it.
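As a rough illustration of the Central Limit Theorem claim above, the sketch below sums independent uniform variables and standardizes the result. The helper name `standardized_uniform_sums`, the choice of 30 terms, and the use of `np.random.default_rng` are assumptions made for this example only.

```python
import numpy as np


def standardized_uniform_sums(n_terms, n_samples, seed=0):
    """Sum n_terms Uniform(0, 1) draws per sample, then standardize.

    Hypothetical helper for illustration; not part of NumPy.
    """
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 1.0, size=(n_samples, n_terms))
    sums = u.sum(axis=1)
    # Uniform(0, 1) has mean 1/2 and variance 1/12
    mean = n_terms * 0.5
    std = np.sqrt(n_terms / 12.0)
    return (sums - mean) / std


z = standardized_uniform_sums(n_terms=30, n_samples=50_000)
within_one = np.mean(np.abs(z) <= 1)
print(f"mean={z.mean():.3f}, std={z.std():.3f}, within 1 std={within_one:.3f}")
```

If the sums behave approximately like a standard normal, the mean should be near 0, the standard deviation near 1, and roughly 68% of values should fall within one standard deviation.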

np.random.randn

Generates samples from the standard normal distribution \(\mathcal{N}(0, 1)\).

1. Basic Usage

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def main():
    np.random.seed(0)

    n_samples = 10_000
    data = np.random.randn(n_samples)

    fig, ax = plt.subplots(figsize=(12, 3))

    # density=True normalizes the histogram so it is comparable to the PDF
    _, bins, _ = ax.hist(data, bins=100, density=True, alpha=0.3, label='Histogram')

    # Evaluate the standard normal PDF at the bin edges
    pdf = stats.norm().pdf(bins)
    ax.plot(bins, pdf, '--r', linewidth=2, label='Standard Normal PDF')

    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.legend()
    plt.show()


if __name__ == "__main__":
    main()
```

2. Shape Argument

Pass dimensions as separate arguments.

```python
import numpy as np


def main():
    np.random.seed(42)

    # 1D array
    a = np.random.randn(5)
    print(f"1D: {a.shape}")

    # 2D array
    b = np.random.randn(3, 4)
    print(f"2D: {b.shape}")

    # 3D array
    c = np.random.randn(2, 3, 4)
    print(f"3D: {c.shape}")


if __name__ == "__main__":
    main()
```

3. Quick Sampling

Use `randn` when you want standard normal samples quickly and are happy to pass the shape as positional arguments.
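A short sketch of what the positional shape convention means in practice, including the case with no arguments and the common mistake of passing a tuple. The variable names (`scalar`, `matrix`, `tuple_accepted`) are illustrative only.

```python
import numpy as np

np.random.seed(0)

scalar = np.random.randn()        # no arguments: a single float
matrix = np.random.randn(2, 3)    # dimensions as separate positional arguments

print(isinstance(scalar, float))
print(matrix.shape)

# Passing a shape tuple is a common mistake; randn expects integers
try:
    np.random.randn((2, 3))
    tuple_accepted = True
except TypeError:
    tuple_accepted = False
print(f"tuple accepted: {tuple_accepted}")
```

When the shape already lives in a tuple, unpacking it with `np.random.randn(*shape)` is one way to keep using `randn`.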

np.random.standard_normal

An alternative interface for standard normal samples that takes the shape through the `size` keyword.

1. Size Keyword

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def main():
    np.random.seed(0)

    n_samples = 10_000
    data = np.random.standard_normal(size=(n_samples,))

    fig, ax = plt.subplots(figsize=(12, 3))

    _, bins, _ = ax.hist(data, bins=100, density=True, alpha=0.3, label='Histogram')

    pdf = stats.norm().pdf(bins)
    ax.plot(bins, pdf, '--r', linewidth=2, label='Standard Normal PDF')

    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.legend()
    plt.show()


if __name__ == "__main__":
    main()
```

2. Difference from randn

`standard_normal` takes the shape as a single `size` argument (an int or a tuple), whereas `randn` takes each dimension as a separate positional argument.

```python
import numpy as np


def main():
    np.random.seed(42)

    # randn: positional arguments
    a = np.random.randn(3, 4)

    # standard_normal: size keyword
    b = np.random.standard_normal(size=(3, 4))

    print(f"randn shape: {a.shape}")
    print(f"standard_normal shape: {b.shape}")


if __name__ == "__main__":
    main()
```

3. Equivalent Results

Both functions draw from the same standard normal distribution; the choice between them is stylistic.

np.random.normal

Generates samples from a general normal distribution \(\mathcal{N}(\mu, \sigma^2)\).

1. Parameters

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def main():
    np.random.seed(0)

    loc = 5      # mean (μ)
    scale = 2    # standard deviation (σ)
    n_samples = 10_000

    data = np.random.normal(loc=loc, scale=scale, size=(n_samples,))

    fig, ax = plt.subplots(figsize=(12, 3))

    _, bins, _ = ax.hist(data, bins=100, density=True, alpha=0.3, label='Histogram')

    # Overlay the PDF of the same distribution for comparison
    pdf = stats.norm(loc=loc, scale=scale).pdf(bins)
    ax.plot(bins, pdf, '--r', linewidth=2, label=f'N({loc}, {scale}²) PDF')

    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.legend()
    plt.show()


if __name__ == "__main__":
    main()
```

2. Scaling Relation

\(X \sim \mathcal{N}(\mu, \sigma^2)\) is equivalent to \(X = \mu + \sigma Z\) where \(Z \sim \mathcal{N}(0, 1)\).

```python
import numpy as np


def main():
    np.random.seed(42)

    mu, sigma = 5, 2
    n = 10_000

    # Method 1: np.random.normal
    x1 = np.random.normal(loc=mu, scale=sigma, size=n)

    # Method 2: transform standard normal
    # Reseed so both methods start from the same generator state
    np.random.seed(42)
    z = np.random.randn(n)
    x2 = mu + sigma * z

    print(f"Method 1 mean: {x1.mean():.4f}")
    print(f"Method 2 mean: {x2.mean():.4f}")


if __name__ == "__main__":
    main()
```

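3. Use for Custom Mean/Std

Use `normal` when you need to specify the mean (`loc`) and standard deviation (`scale`) directly. Note that `scale` is the standard deviation \(\sigma\), not the variance \(\sigma^2\).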
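`loc` and `scale` also accept arrays, which are broadcast against `size`. The sketch below draws three columns with different means and standard deviations in one call; the specific values are made up for illustration.

```python
import numpy as np

np.random.seed(0)

# One mean and one standard deviation per column (illustrative values)
means = np.array([0.0, 10.0, 100.0])
stds = np.array([1.0, 2.0, 5.0])

# loc and scale broadcast against size=(n_rows, 3)
samples = np.random.normal(loc=means, scale=stds, size=(20_000, 3))

print("column means:", samples.mean(axis=0).round(2))
print("column stds: ", samples.std(axis=0).round(2))
```

Each column's sample mean and standard deviation should land close to the corresponding entries of `means` and `stds`.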

scipy.stats.norm.rvs

The scipy.stats alternative for normal sampling.

1. Basic Usage

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def main():
    np.random.seed(0)

    n_samples = 10_000
    data = stats.norm(loc=0, scale=1).rvs(n_samples)

    fig, ax = plt.subplots(figsize=(12, 3))

    _, bins, _ = ax.hist(data, bins=100, density=True, alpha=0.3, label='Histogram')

    pdf = stats.norm().pdf(bins)
    ax.plot(bins, pdf, '--r', linewidth=2, label='Standard Normal PDF')

    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.set_xlabel('Value')
    ax.set_ylabel('Density')
    ax.legend()
    plt.show()


if __name__ == "__main__":
    main()
```

2. Distribution Object

Create a frozen distribution for repeated use.

```python
import numpy as np
from scipy import stats


def main():
    np.random.seed(42)

    # Create distribution object
    dist = stats.norm(loc=10, scale=3)

    # Sample
    samples = dist.rvs(size=5)
    print(f"Samples: {samples}")

    # Also get PDF, CDF, etc.
    print(f"PDF at 10: {dist.pdf(10):.4f}")
    print(f"CDF at 10: {dist.cdf(10):.4f}")


if __name__ == "__main__":
    main()
```

3. When to Use

Use `stats.norm` when you also need the PDF, CDF, quantiles, or other distribution methods alongside sampling.
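A brief sketch of the extra methods a frozen `stats.norm` object offers beyond sampling. The parameters (`loc=10`, `scale=3`) mirror the earlier example; passing a `default_rng` generator as `random_state` is one option for reproducibility, shown here as an assumption about preferred style rather than a requirement.

```python
import numpy as np
from scipy import stats

dist = stats.norm(loc=10, scale=3)

# Quantile (inverse CDF): the value below which 97.5% of the mass lies
q975 = dist.ppf(0.975)

# Probability of falling within one standard deviation of the mean
p_within = dist.cdf(13) - dist.cdf(7)

# Sampling with an explicit generator instead of the global seed
rng = np.random.default_rng(42)
samples = dist.rvs(size=5, random_state=rng)

print(f"97.5% quantile: {q975:.3f}")
print(f"P(7 <= X <= 13): {p_within:.4f}")
print(f"samples shape: {samples.shape}")
```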

Method Comparison

1. All Four Methods

```python
import numpy as np
from scipy import stats


def main():
    np.random.seed(0)
    n = 5

    print("Standard Normal N(0,1) - 4 equivalent methods:")
    print()

    # Reseed before each call so every method starts from the same state
    np.random.seed(42)
    print(f"np.random.randn({n}):")
    print(f"  {np.random.randn(n)}")

    np.random.seed(42)
    print(f"np.random.standard_normal(size=({n},)):")
    print(f"  {np.random.standard_normal(size=(n,))}")

    np.random.seed(42)
    print(f"np.random.normal(0, 1, size={n}):")
    print(f"  {np.random.normal(0, 1, size=n)}")

    np.random.seed(42)
    print(f"stats.norm(0, 1).rvs({n}):")
    print(f"  {stats.norm(0, 1).rvs(n)}")


if __name__ == "__main__":
    main()
```

2. Summary Table

| Function | Standard Normal | General Normal | Shape Syntax |
|---|---|---|---|
| `randn` | ✓ | ✗ | Positional args |
| `standard_normal` | ✓ | ✗ | `size=` keyword |
| `normal` | ✓ | ✓ | `size=` keyword |
| `stats.norm.rvs` | ✓ | ✓ | Positional or `size=` |

3. Recommendations

- Quick standard normal: `randn`
- Custom mean/std: `normal`
- Need PDF/CDF too: `stats.norm`
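The functions above belong to NumPy's legacy global-state interface. The exercise solutions on this page use the newer `Generator` interface via `np.random.default_rng`, which the NumPy documentation recommends for new code. A minimal sketch of the equivalent calls, with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(42)

z = rng.standard_normal(size=(3, 4))           # standard normal, shape via size
x = rng.normal(loc=5, scale=2, size=10_000)    # general normal

print(z.shape)
print(f"mean={x.mean():.2f}, std={x.std():.2f}")

# Generator objects expose standard_normal and normal, but not randn
print(hasattr(rng, "randn"))
```

Because a `Generator` carries its own state, separate parts of a program can hold separate generators instead of sharing one global seed.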

Multivariate Normal

Generates samples from a multivariate normal distribution.

1. Covariance Matrix

```python
import numpy as np
import matplotlib.pyplot as plt


def main():
    np.random.seed(42)

    mean = [0, 0]
    # Unit variances with covariance 0.8, so the correlation is also 0.8
    cov = [[1, 0.8], [0.8, 1]]

    x = np.random.multivariate_normal(mean, cov, size=1000)
    print(f"Shape: {x.shape}")

    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(x[:, 0], x[:, 1], alpha=0.3)
    ax.set_xlabel('X1')
    ax.set_ylabel('X2')
    ax.set_title('Bivariate Normal (ρ=0.8)')
    ax.set_aspect('equal')
    plt.show()


if __name__ == "__main__":
    main()
```

2. Correlation Structure

The covariance matrix determines the shape and orientation.

```python
import numpy as np
import matplotlib.pyplot as plt


def main():
    np.random.seed(42)

    fig, axes = plt.subplots(1, 3, figsize=(15, 5))

    correlations = [-0.8, 0, 0.8]

    for ax, rho in zip(axes, correlations):
        cov = [[1, rho], [rho, 1]]
        x = np.random.multivariate_normal([0, 0], cov, size=500)
        ax.scatter(x[:, 0], x[:, 1], alpha=0.3)
        ax.set_title(f'ρ = {rho}')
        ax.set_xlim(-4, 4)
        ax.set_ylim(-4, 4)
        ax.set_aspect('equal')

    plt.tight_layout()
    plt.show()


if __name__ == "__main__":
    main()
```
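One way to see why the covariance matrix controls the shape is to build correlated samples by hand from independent standard normals using a Cholesky factor. This is a sketch of the general idea, not necessarily how `np.random.multivariate_normal` is implemented internally; the variable names and sample size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])

# Cholesky factor: cov = L @ L.T with L lower triangular
L = np.linalg.cholesky(cov)

# Independent standard normals, one row per sample
z = rng.standard_normal(size=(20_000, 2))

# Each row x = L @ z_row has covariance L @ I @ L.T = cov
x = z @ L.T

sample_cov = np.cov(x, rowvar=False)
print(np.round(sample_cov, 2))
```

The sample covariance of `x` should be close to `cov`, which is the multivariate analogue of the scalar relation \(X = \mu + \sigma Z\).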

3. Higher Dimensions

```python
import numpy as np


def main():
    np.random.seed(42)

    # 4D multivariate normal
    mean = [0, 0, 0, 0]
    cov = np.eye(4)  # independent components

    samples = np.random.multivariate_normal(mean, cov, size=1000)
    print(f"Shape: {samples.shape}")
    print(f"Sample mean: {samples.mean(axis=0)}")


if __name__ == "__main__":
    main()
```

Chi-Square Distribution

The distribution of a sum of squared independent standard normal random variables.

1. Degrees of Freedom

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def main():
    np.random.seed(0)

    df = 5  # degrees of freedom
    data = np.random.chisquare(df=df, size=10_000)

    fig, ax = plt.subplots(figsize=(10, 4))

    _, bins, _ = ax.hist(data, bins=100, density=True, alpha=0.3)

    pdf = stats.chi2(df).pdf(bins)
    ax.plot(bins, pdf, 'r-', linewidth=2, label=f'χ²({df}) PDF')

    ax.set_xlabel('Value')
    ax.set_ylabel('Density')
    ax.legend()
    plt.show()


if __name__ == "__main__":
    main()
```

2. Relation to Normal

\[\chi^2_k = \sum_{i=1}^{k} Z_i^2\]

where \(Z_i \sim \mathcal{N}(0, 1)\) are independent.

```python
import numpy as np


def main():
    np.random.seed(42)

    k = 5
    n_samples = 10_000

    # Method 1: np.random.chisquare
    chi2_direct = np.random.chisquare(df=k, size=n_samples)

    # Method 2: sum of squared normals (k independent draws per sample)
    z = np.random.randn(n_samples, k)
    chi2_manual = (z ** 2).sum(axis=1)

    print(f"Direct mean: {chi2_direct.mean():.2f} (expected: {k})")
    print(f"Manual mean: {chi2_manual.mean():.2f} (expected: {k})")


if __name__ == "__main__":
    main()
```
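The same construction can be used to check the variance as well as the mean. Since \(\operatorname{Var}(Z^2) = 2\) for a standard normal \(Z\), a sum of \(k\) independent squares has variance \(2k\). A small sketch under the same assumptions (\(k = 5\); the sample size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)

k = 5
n_samples = 50_000

# Sum of k squared independent standard normals per sample
z = rng.standard_normal(size=(n_samples, k))
chi2_manual = (z ** 2).sum(axis=1)

print(f"mean: {chi2_manual.mean():.2f} (theory: {k})")
print(f"var:  {chi2_manual.var():.2f} (theory: {2 * k})")
```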

3. Varying df

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def main():
    x = np.linspace(0, 30, 200)

    fig, ax = plt.subplots(figsize=(10, 4))

    for df in [2, 5, 10, 15]:
        pdf = stats.chi2(df).pdf(x)
        ax.plot(x, pdf, linewidth=2, label=f'df={df}')

    ax.set_xlabel('x')
    ax.set_ylabel('f(x)')
    ax.set_title('Chi-Square Distributions')
    ax.legend()
    plt.show()


if __name__ == "__main__":
    main()
```

Exercises

Exercise 1. Generate 10,000 samples from a normal distribution with mean 100 and standard deviation 15. Verify the sample mean and std are close to the true parameters.

Solution to Exercise 1

```python
import numpy as np

rng = np.random.default_rng(42)
samples = rng.normal(100, 15, 10000)
print(f"Mean: {samples.mean():.1f}")  # ~100
print(f"Std: {samples.std():.1f}")    # ~15
```

Exercise 2. Generate standard normal samples and verify that approximately 68% fall within one standard deviation of the mean.

Solution to Exercise 2

```python
import numpy as np

rng = np.random.default_rng(42)
samples = rng.standard_normal(10000)
within_1std = np.sum(np.abs(samples) <= 1) / len(samples)
print(f"Within 1 std: {within_1std:.2%}")  # ~68%
```
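For reference, the exact probabilities behind the "approximately 68%" figure can be computed from the standard normal CDF as \(P(|Z| \le k) = \Phi(k) - \Phi(-k)\). The sketch below uses `scipy.stats.norm` for this; the dictionary name `exact` is illustrative.

```python
from scipy import stats

std_normal = stats.norm()

# P(|Z| <= k) = Phi(k) - Phi(-k) for k = 1, 2, 3
exact = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, p in exact.items():
    print(f"P(|Z| <= {k}) = {p:.4f}")
```

The simulated fraction from the solution above should be close to the \(k = 1\) value, with the gap shrinking as the sample size grows.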

Exercise 3. Generate a 2D array of shape (1000, 3) from a standard normal. Compute the mean and std of each column.

Solution to Exercise 3

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.standard_normal((1000, 3))
print("Means:", data.mean(axis=0))
print("Stds:", data.std(axis=0))
```

Exercise 4. Use the Box-Muller transform to generate normal samples from uniform samples: \(Z = \sqrt{-2\ln U_1}\cos(2\pi U_2)\). Compare with rng.standard_normal.

Solution to Exercise 4

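```python
import numpy as np

rng = np.random.default_rng(42)
n = 10000
u1 = 1.0 - rng.random(n)  # uniform on (0, 1], avoids log(0)
u2 = rng.uniform(0, 1, n)
z = np.sqrt(-2 * np.log(u1)) * np.cos(2 * np.pi * u2)
print(f"Box-Muller mean: {z.mean():.3f}, std: {z.std():.3f}")

# Reference samples from NumPy's own generator for comparison
ref = rng.standard_normal(n)
print(f"standard_normal mean: {ref.mean():.3f}, std: {ref.std():.3f}")
```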