t-Distribution¶
The t-distribution (Student's t-distribution) arises when estimating the mean of a normally distributed population using a small sample with unknown variance. It has heavier tails than the standard normal distribution, which accounts for the additional uncertainty introduced by estimating the variance from the data. The t-distribution is the basis for t-tests and confidence intervals in classical statistics.
Mathematical Definition¶
If \(Z \sim N(0, 1)\) and \(V \sim \chi^2(\nu)\) are independent, then the random variable
follows a t-distribution with \(\nu\) degrees of freedom, written \(T \sim t(\nu)\).
The probability density function is:
where \(\Gamma(\cdot)\) is the gamma function and \(\nu > 0\) is the degrees of freedom parameter.
Usage in scipy.stats¶
The scipy.stats.t distribution object takes the degrees of freedom df (\(\nu\)):
import scipy.stats as stats
import numpy as np
import matplotlib.pyplot as plt
df = 5
a = stats.t(df)
print(f"Mean: {a.mean():.4f}") # 0 (for ν > 1)
print(f"Variance: {a.var():.4f}") # ν/(ν-2) = 5/3
x = np.linspace(-5, 5, 200)
y_pdf = a.pdf(x)
y_cdf = a.cdf(x)
plt.plot(x, y_pdf, label='PDF')
plt.plot(x, y_cdf, label='CDF')
plt.legend()
plt.title(f't-Distribution (ν={df})')
plt.xlabel('x')
plt.ylabel('Density / Probability')
plt.show()
Comparison with the Normal Distribution¶
The t-distribution is symmetric about zero like the standard normal, but it has heavier tails. As \(\nu\) increases, the t-distribution converges to the standard normal:
import scipy.stats as stats
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-5, 5, 200)
plt.plot(x, stats.norm.pdf(x), 'k--', linewidth=2, label='N(0,1)')
for df in [1, 3, 5, 10, 30]:
plt.plot(x, stats.t(df).pdf(x), label=f'ν={df}')
plt.legend()
plt.title('t-Distribution PDFs vs Standard Normal')
plt.xlabel('x')
plt.ylabel('f(x)')
plt.show()
The case \(\nu = 1\) is the Cauchy distribution, which has such heavy tails that neither the mean nor the variance exists.
Key Properties¶
- Mean: \(E[T] = 0\) for \(\nu > 1\) (undefined for \(\nu \le 1\))
- Variance: \(\text{Var}(T) = \dfrac{\nu}{\nu - 2}\) for \(\nu > 2\) (infinite for \(1 < \nu \le 2\))
- Symmetry: The distribution is symmetric about 0
- Heavy tails: For any finite \(\nu\), the tails decay as a power law \(|x|^{-(\nu+1)}\), slower than the exponential decay of the normal
- Convergence: As \(\nu \to \infty\), \(t(\nu) \to N(0, 1)\)
Special Cases¶
- \(\nu = 1\): Cauchy distribution (no finite moments)
- \(\nu = \infty\): standard normal distribution
Parameters in scipy.stats¶
| Parameter | Symbol | scipy.stats keyword |
Default |
|---|---|---|---|
| Degrees of freedom | \(\nu\) | df |
(required) |
| Location | — | loc |
0 |
| Scale | — | scale |
1 |
Applications in Hypothesis Testing¶
The t-distribution is used whenever a test statistic involves an estimated standard deviation:
- One-sample t-test: Testing \(H_0\colon \mu = \mu_0\) using \(T = (\bar{X} - \mu_0)/(S/\sqrt{n})\), which follows \(t(n-1)\) under \(H_0\)
- Two-sample t-test: Comparing means of two independent groups
- Confidence intervals: A 95% confidence interval for the mean is \(\bar{X} \pm t_{0.025,\,n-1} \cdot S/\sqrt{n}\)
Critical values are obtained using the PPF:
alpha = 0.05
df = 20
t_critical = stats.t(df).ppf(1 - alpha / 2)
print(f"Two-sided critical value: ±{t_critical:.4f}")
Financial Applications¶
In finance, the t-distribution is used in modeling asset returns when normal tails are too thin to capture extreme events. The Student-t copula models joint tail dependence between assets. Risk measures such as Value at Risk use the t-distribution for more conservative tail estimates, and GARCH models with t-distributed innovations provide better fits to financial return series.
Summary¶
The t-distribution arises from the ratio of a standard normal to a scaled chi-square variable and has heavier tails than the normal. In scipy.stats, use stats.t(df) to create a frozen distribution for computing PDFs, CDFs, critical values, and generating random samples.