t-Distribution¶

The t-distribution (Student's t-distribution) arises when estimating the mean of a normally distributed population using a small sample with unknown variance. It has heavier tails than the standard normal distribution, which accounts for the additional uncertainty introduced by estimating the variance from the data. The t-distribution is the basis for t-tests and confidence intervals in classical statistics.

Mathematical Definition¶

If \(Z \sim N(0, 1)\) and \(V \sim \chi^2(\nu)\) are independent, then the random variable

\[ T = \frac{Z}{\sqrt{V/\nu}} \]

follows a t-distribution with \(\nu\) degrees of freedom, written \(T \sim t(\nu)\).

The probability density function is:

\[ f(x) = \frac{\Gamma\!\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu\pi}\;\Gamma\!\left(\frac{\nu}{2}\right)} \left(1 + \frac{x^2}{\nu}\right)^{-(\nu + 1)/2} \]

where \(\Gamma(\cdot)\) is the gamma function and \(\nu > 0\) is the degrees of freedom parameter.

Usage in scipy.stats¶

The scipy.stats.t distribution object takes the degrees of freedom df (\(\nu\)):

import scipy.stats as stats
import numpy as np
import matplotlib.pyplot as plt

df = 5
a = stats.t(df)

print(f"Mean: {a.mean():.4f}")        # 0 (for ν > 1)
print(f"Variance: {a.var():.4f}")      # ν/(ν-2) = 5/3

x = np.linspace(-5, 5, 200)
y_pdf = a.pdf(x)
y_cdf = a.cdf(x)

plt.plot(x, y_pdf, label='PDF')
plt.plot(x, y_cdf, label='CDF')
plt.legend()
plt.title(f't-Distribution (ν={df})')
plt.xlabel('x')
plt.ylabel('Density / Probability')
plt.show()

Comparison with the Normal Distribution¶

The t-distribution is symmetric about zero like the standard normal, but it has heavier tails. As \(\nu\) increases, the t-distribution converges to the standard normal:

import scipy.stats as stats
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)
plt.plot(x, stats.norm.pdf(x), 'k--', linewidth=2, label='N(0,1)')
for df in [1, 3, 5, 10, 30]:
    plt.plot(x, stats.t(df).pdf(x), label=f'ν={df}')

plt.legend()
plt.title('t-Distribution PDFs vs Standard Normal')
plt.xlabel('x')
plt.ylabel('f(x)')
plt.show()

The case \(\nu = 1\) is the Cauchy distribution, which has such heavy tails that neither the mean nor the variance exists.

Key Properties¶

Mean: \(E[T] = 0\) for \(\nu > 1\) (undefined for \(\nu \le 1\))
Variance: \(\text{Var}(T) = \dfrac{\nu}{\nu - 2}\) for \(\nu > 2\) (infinite for \(1 < \nu \le 2\))
Symmetry: The distribution is symmetric about 0
Heavy tails: For any finite \(\nu\), the tails decay as a power law \(|x|^{-(\nu+1)}\), slower than the exponential decay of the normal
Convergence: As \(\nu \to \infty\), \(t(\nu) \to N(0, 1)\)

Special Cases¶

\(\nu = 1\): Cauchy distribution (no finite moments)
\(\nu = \infty\): standard normal distribution

Parameters in scipy.stats¶

Parameter	Symbol	`scipy.stats` keyword	Default
Degrees of freedom	\(\nu\)	`df`	(required)
Location	—	`loc`	0
Scale	—	`scale`	1

Applications in Hypothesis Testing¶

The t-distribution is used whenever a test statistic involves an estimated standard deviation:

One-sample t-test: Testing \(H_0\colon \mu = \mu_0\) using \(T = (\bar{X} - \mu_0)/(S/\sqrt{n})\), which follows \(t(n-1)\) under \(H_0\)
Two-sample t-test: Comparing means of two independent groups
Confidence intervals: A 95% confidence interval for the mean is \(\bar{X} \pm t_{0.025,\,n-1} \cdot S/\sqrt{n}\)

Critical values are obtained using the PPF:

alpha = 0.05
df = 20
t_critical = stats.t(df).ppf(1 - alpha / 2)
print(f"Two-sided critical value: ±{t_critical:.4f}")

Financial Applications¶

In finance, the t-distribution is used in modeling asset returns when normal tails are too thin to capture extreme events. The Student-t copula models joint tail dependence between assets. Risk measures such as Value at Risk use the t-distribution for more conservative tail estimates, and GARCH models with t-distributed innovations provide better fits to financial return series.

Summary¶

The t-distribution arises from the ratio of a standard normal to a scaled chi-square variable and has heavier tails than the normal. In scipy.stats, use stats.t(df) to create a frozen distribution for computing PDFs, CDFs, critical values, and generating random samples.