Skip to content

Moments and Skewness

Moments describe the shape of a distribution, with mean (1st moment), variance (2nd), skewness (3rd), and kurtosis (4th) quantifying center, spread, asymmetry, and tail heaviness respectively.

Mental Model

Each successive moment peels back another layer of a distribution's shape. The mean tells you where it sits, the variance how wide it spreads, the skewness which direction it leans, and the kurtosis how heavy its tails are. Together, these four numbers sketch a surprisingly complete portrait of any distribution.


Moments Definition

1. Raw Moments

E[X^n]:

```python from scipy import stats import numpy as np

data = np.random.normal(0, 1, 1000)

First raw moment = mean

m1 = np.mean(data) print(f"1st moment (mean): {m1:.3f}")

Second raw moment

m2 = np.mean(data**2) print(f"2nd moment: {m2:.3f}")

Third raw moment

m3 = np.mean(data**3) print(f"3rd moment: {m3:.3f}") ```

2. Central Moments

E[(X - μ)^n]:

```python mean = np.mean(data)

Second central moment = variance

c2 = np.mean((data - mean)**2) print(f"Variance: {c2:.3f}")

Third central moment

c3 = np.mean((data - mean)**3) print(f"3rd central moment: {c3:.3f}")

Using scipy

print(stats.moment(data, moment=2)) # Variance print(stats.moment(data, moment=3)) # 3rd central ```

3. Standardized Moments

E[(X - μ)^n / σ^n]:

```python std = np.std(data, ddof=1)

Skewness (3rd standardized moment)

skew = np.mean(((data - mean) / std)**3) print(f"Skewness: {skew:.3f}")

Using scipy

print(stats.skew(data, bias=False)) ```

4. Relationship

```python

Raw → Central

μ₂' = E[X²] = Var(X) + (E[X])²

μ₃' = E[X³] = μ₃ + 3μ(Var) + μ³

Central → Standardized

γ₃ = μ₃ / σ³ (skewness)

γ₄ = μ₄ / σ⁴ - 3 (excess kurtosis)

```

5. Moment Generating Function

```python

MGF: M(t) = E[e^(tX)]

Derivatives at t=0 give moments

For normal distribution

t = 0.5 norm_mgf = np.exp(mean * t + 0.5 * std2 * t2) print(f"Normal MGF at t={t}: {norm_mgf:.3f}") ```

6. Cumulants

```python

Cumulant generating function

K(t) = log(M(t))

First cumulant = mean

Second cumulant = variance

Third cumulant = third central moment

```

7. Sample vs Population

```python

Sample moments (unbiased)

sample_var = np.var(data, ddof=1) # Divide by n-1

Population moments (biased)

pop_var = np.var(data, ddof=0) # Divide by n

print(f"Sample variance: {sample_var:.3f}") print(f"Population variance: {pop_var:.3f}") ```


Skewness

1. Definition

Measure of asymmetry:

```python

Skewness = E[(X - μ)³] / σ³

skewness = stats.skew(data, bias=False) print(f"Skewness: {skewness:.3f}")

Interpretation:

0: Symmetric (normal)

>0: Right-skewed (positive, long right tail)

<0: Left-skewed (negative, long left tail)

```

2. Positive Skew

```python

Generate right-skewed data

right_skew = np.random.exponential(scale=1, size=1000) print(f"Skewness: {stats.skew(right_skew):.3f}") # ~2.0

Visualization

plt.hist(right_skew, bins=50, density=True, alpha=0.7) plt.axvline(np.mean(right_skew), color='r', label='Mean') plt.axvline(np.median(right_skew), color='g', label='Median') plt.legend() plt.title('Positive Skew: Mean > Median') plt.show()

Mean pulled toward tail

print(f"Mean: {np.mean(right_skew):.3f}") print(f"Median: {np.median(right_skew):.3f}") # Mean > Median ```

3. Negative Skew

```python

Generate left-skewed data

left_skew = -np.random.exponential(scale=1, size=1000) print(f"Skewness: {stats.skew(left_skew):.3f}") # ~-2.0

Mean < Median for left skew

```

4. Skewness Test

```python

Test if skewness significantly differs from 0

skewtest_stat, p_value = stats.skewtest(data) print(f"Skew test: statistic={skewtest_stat:.3f}, p={p_value:.4f}")

p < 0.05: Significant skewness (reject symmetry)

```

5. Pearson Skewness Coefficients

```python

Mode skewness

mean = np.mean(data) mode = stats.mode(data, keepdims=True).mode[0] std = np.std(data, ddof=1) pearson_mode = (mean - mode) / std

Median skewness (more robust)

median = np.median(data) pearson_median = 3 * (mean - median) / std

print(f"Pearson mode skewness: {pearson_mode:.3f}") print(f"Pearson median skewness: {pearson_median:.3f}") ```

6. Quantile Skewness

```python

Based on quartiles (very robust)

q1, q2, q3 = np.percentile(data, [25, 50, 75]) bowley_skew = (q3 + q1 - 2*q2) / (q3 - q1) print(f"Bowley skewness: {bowley_skew:.3f}")

Ranges from -1 to +1

```

7. Transformations

```python

Box-Cox transformation to reduce skewness

from scipy import stats

data_positive = np.abs(data) + 1 # Must be positive transformed, lambda_param = stats.boxcox(data_positive)

print(f"Original skewness: {stats.skew(data_positive):.3f}") print(f"Transformed skewness: {stats.skew(transformed):.3f}") print(f"Lambda parameter: {lambda_param:.3f}") ```


Kurtosis

1. Definition

Measure of tail heaviness:

```python

Excess kurtosis = E[(X - μ)⁴] / σ⁴ - 3

kurtosis = stats.kurtosis(data, bias=False) print(f"Excess kurtosis: {kurtosis:.3f}")

Interpretation:

0: Normal (mesokurtic)

>0: Heavy tails (leptokurtic)

<0: Light tails (platykurtic)

```

2. Leptokurtic (Heavy Tails)

```python

t-distribution has heavy tails

heavy_tail = stats.t.rvs(df=5, size=1000) print(f"Kurtosis: {stats.kurtosis(heavy_tail):.3f}") # Positive

More extreme values than normal

```

3. Platykurtic (Light Tails)

```python

Uniform distribution has light tails

light_tail = np.random.uniform(-1, 1, 1000) print(f"Kurtosis: {stats.kurtosis(light_tail):.3f}") # ~-1.2

Fewer extreme values than normal

```

4. Kurtosis Test

```python

Test if kurtosis significantly differs from 0 (normal)

kurttest_stat, p_value = stats.kurtosistest(data) print(f"Kurt test: statistic={kurttest_stat:.3f}, p={p_value:.4f}")

p < 0.05: Significant deviation from normal kurtosis

```

5. Sample Kurtosis Bias

```python

Unbiased vs biased estimator

kurt_unbiased = stats.kurtosis(data, bias=False) # Recommended kurt_biased = stats.kurtosis(data, bias=True)

print(f"Unbiased: {kurt_unbiased:.3f}") print(f"Biased: {kurt_biased:.3f}") ```

6. Outlier Sensitivity

```python

Kurtosis very sensitive to outliers

data_clean = np.random.normal(0, 1, 100) data_outlier = np.concatenate([data_clean, [10, -10]])

print(f"Clean kurtosis: {stats.kurtosis(data_clean):.3f}") print(f"With outliers: {stats.kurtosis(data_outlier):.3f}") # Much higher ```

7. Robust Alternatives

```python

L-kurtosis (based on L-moments, more robust)

Not in scipy, but conceptually important

Or use quantile-based measure

q = np.percentile(data, [10, 25, 75, 90]) robust_kurt = (q[3] - q[0]) / (q[2] - q[1]) print(f"Robust kurtosis proxy: {robust_kurt:.3f}") ```


Normal Test

1. Jarque-Bera Test

```python

Tests both skewness and kurtosis

jb_stat, p_value = stats.jarque_bera(data) print(f"Jarque-Bera: statistic={jb_stat:.3f}, p={p_value:.4f}")

H0: Data is normally distributed

p < 0.05: Reject normality

```

2. Anderson-Darling Test

```python

More powerful than Jarque-Bera

result = stats.anderson(data, dist='norm') print(f"A-D statistic: {result.statistic:.3f}")

Compare to critical values

for i, sl in enumerate(result.significance_level): print(f"{sl}%: {result.critical_values[i]:.3f}") ```

3. Shapiro-Wilk Test

```python

Most powerful test for normality (small samples)

sw_stat, p_value = stats.shapiro(data) print(f"Shapiro-Wilk: statistic={sw_stat:.4f}, p={p_value:.4f}")

W close to 1 indicates normality

```

4. Kolmogorov-Smirnov Test

```python

General distribution test

ks_stat, p_value = stats.kstest(data, 'norm', args=(np.mean(data), np.std(data))) print(f"K-S test: statistic={ks_stat:.4f}, p={p_value:.4f}") ```

5. D'Agostino-Pearson Test

```python

Combines skew and kurtosis tests

k2, p_value = stats.normaltest(data) print(f"D'Agostino: statistic={k2:.3f}, p={p_value:.4f}")

Equivalent to combining skewtest and kurtosistest

```

6. Q-Q Plot

```python

Visual normality check

stats.probplot(data, dist="norm", plot=plt) plt.title("Q-Q Plot") plt.show()

Points on line → Normal

S-curve → Heavy tails

Reverse S → Light tails

```

7. Omnibus Test

```python

Not directly in scipy, but can be constructed

skew = stats.skew(data) kurt = stats.kurtosis(data) n = len(data)

Jarque-Bera statistic

jb = n/6 * (skew2 + kurt2/4) p_value = 1 - stats.chi2.cdf(jb, df=2) print(f"Omnibus: JB={jb:.3f}, p={p_value:.4f}") ```


Practical Applications

1. Financial Returns

```python

Stock returns often have fat tails

returns = np.random.standard_t(df=5, size=1000) * 0.02 # 2% daily vol

print(f"Skewness: {stats.skew(returns):.3f}") print(f"Kurtosis: {stats.kurtosis(returns):.3f}") # Positive → Fat tails

Implications: Normal models underestimate risk

```

2. Income Distribution

```python

Income typically right-skewed

income = np.random.lognormal(mean=10.5, sigma=0.5, size=1000)

print(f"Skewness: {stats.skew(income):.3f}") # Strongly positive print(f"Mean: ${np.mean(income):.0f}") print(f"Median: ${np.median(income):.0f}") # Median < Mean ```

3. Outlier Detection

```python

High kurtosis suggests outliers

if stats.kurtosis(data) > 3: print("Warning: Heavy tails detected") # Consider robust statistics or outlier removal ```

4. Transformation Selection

```python

Choose transformation based on skewness

if stats.skew(data) > 1: # Strong right skew → log transform transformed = np.log(data + 1) elif stats.skew(data) < -1: # Strong left skew → exponential transform transformed = np.exp(data) else: # Mild skew → square root or Box-Cox transformed, _ = stats.boxcox(data + abs(data.min()) + 1) ```

5. Model Selection

```python

Normal vs t-distribution based on kurtosis

if abs(stats.kurtosis(data)) < 1: model = stats.norm print("Using normal distribution") else: # Fit t-distribution for heavy tails df, loc, scale = stats.t.fit(data) model = stats.t(df=df, loc=loc, scale=scale) print(f"Using t-distribution (df={df:.1f})") ```

6. Risk Measures

```python

VaR underestimated if kurtosis ignored

mean, std = np.mean(returns), np.std(returns)

Normal assumption

var_normal = stats.norm.ppf(0.05, loc=mean, scale=std)

Empirical (accounts for actual distribution)

var_empirical = np.percentile(returns, 5)

print(f"VaR (normal): {var_normal:.4f}") print(f"VaR (empirical): {var_empirical:.4f}") ```

7. Power Analysis

```python

Sample size for detecting skewness/kurtosis

from statsmodels.stats.power import normal_power_het

Example: Detect skewness of 0.5 with power 0.8

Requires large sample size

```


Summary

Moments:

  • 1st: Mean (location)
  • 2nd: Variance (spread)
  • 3rd: Skewness (asymmetry)
  • 4th: Kurtosis (tail heaviness)

Skewness:

  • 0: Symmetric
  • 0: Right-skewed (mean > median)

  • <0: Left-skewed (mean < median)

Kurtosis (excess):

  • 0: Normal tails
  • 0: Heavy tails (more extremes)

  • <0: Light tails (fewer extremes)

Key insight: Higher moments reveal distribution shape beyond mean and variance, with skewness indicating asymmetry (important for risk assessment) and kurtosis measuring tail thickness (critical for extreme event modeling).


Exercises

Exercise 1. Compute the skewness and excess kurtosis of 1000 samples from (a) a standard normal, (b) an exponential with \(\lambda = 1\), and (c) a uniform on \([0, 1]\). Use scipy.stats.skew() and scipy.stats.kurtosis().

Solution to Exercise 1
import numpy as np
from scipy import stats

np.random.seed(42)
for name, data in [("Normal", np.random.normal(size=1000)),
                   ("Exponential", np.random.exponential(size=1000)),
                   ("Uniform", np.random.uniform(size=1000))]:
    print(f"{name:12s}: skew={stats.skew(data):.4f}, kurtosis={stats.kurtosis(data):.4f}")

Exercise 2. For 500 samples from a \(\chi^2(5)\) distribution, compute the first four raw moments using scipy.stats.moment() (for central moments) and manual computation. Compare with theoretical moments.

Solution to Exercise 2
import numpy as np
from scipy import stats as st

np.random.seed(42)
data = st.chi2.rvs(df=5, size=500)
for k in range(1, 5):
    cm = st.moment(data, moment=k)
    print(f"Central moment {k}: {cm:.4f}")

Exercise 3. Apply the Jarque-Bera test (scipy.stats.jarque_bera()) to samples from a normal and a lognormal distribution (500 samples each). Print the test statistics and p-values to demonstrate which sample departs from normality.

Solution to Exercise 3
import numpy as np
from scipy import stats

np.random.seed(42)
normal_data = np.random.normal(size=500)
lognormal_data = np.random.lognormal(size=500)

for name, data in [("Normal", normal_data), ("Lognormal", lognormal_data)]:
    jb_stat, p_val = stats.jarque_bera(data)
    print(f"{name:10s}: JB={jb_stat:.4f}, p={p_val:.4f}")