Chapter 18: Term Structure and Short-Rate Models

This chapter develops the theory and practice of fixed-income modeling from first principles. Starting from the interest rate market and its instruments, we construct yield curves via bootstrapping and interpolation, then build continuous-time short-rate models---Vasicek, Cox--Ingersoll--Ross, Hull--White, and Black--Karasinski---that price bonds and interest rate derivatives within a unified no-arbitrage framework. The chapter connects market observables to model dynamics, derives closed-form bond pricing formulas through both the market price of risk and risk-neutral measure approaches, and compares tractability, calibration fit, and practical suitability across the model landscape.

Key Concepts

Interest Rate Market and Instruments

The fixed-income market is organized around discount factors \(P(t,T)\), the price at time \(t\) of one unit of currency paid at \(T\). Zero-coupon bond prices determine all other quantities: the continuously compounded zero rate \(R(t,T) = -\ln P(t,T)/(T-t)\), the simply compounded spot rate \(L(t,T) = (1/P(t,T) - 1)/(T-t)\), and the instantaneous forward rate \(f(t,T) = -\partial_T \ln P(t,T)\). Bond prices are quoted in 1/32 increments in US Treasury markets, with the dirty price (clean plus accrued interest) determining the actual cash flow. A coupon bond with face value \(N\) paying coupon \(c_i\) per unit of face value at \(T_i\) prices as \(CB(t) = N\sum_{i} c_i\, P(t,T_i) + N\,P(t,T_n)\), following from value additivity and no-arbitrage discounting of each cashflow. A floating-rate note paying Libor coupons at \(T_{m+1},\dots,T_n\) satisfies \(FRN(t) = N\,P(t,T_m)\) for \(t \le T_m\), so it is worth par at the reset date \(T_m\): by the telescoping property the coupons are worth \(N\sum_{i=m+1}^{n} (P(t,T_{i-1}) - P(t,T_i)) = N(P(t,T_m) - P(t,T_n))\), and the principal adds \(N\,P(t,T_n)\). A forward rate agreement (FRA) exchanging fixed rate \(K\) for floating rate \(L(T_{k-1},T_k)\) has value \(V_{\text{FRA}}(t) = N\tau_k(l_k(t) - K)P(t,T_k)\) where \(l_k(t) = (P(t,T_{k-1})/P(t,T_k) - 1)/\tau_k\) is the forward Libor rate, which is a martingale under the \(T_k\)-forward measure obtained via change of numeraire from the money-market account to the \(T_k\)-bond. An interest rate swap (IRS) decomposes as a portfolio of FRAs: \(\text{IRS}^{\text{Payer}}(t) = N\sum_{k=m+1}^n (l_k(t) - K)\tau_k P(t,T_k) = NA_{m,n}(t)(S_{m,n}(t) - K) = N(P(t,T_m) - P(t,T_n)) - NK\sum_{k=m+1}^n\tau_k P(t,T_k)\), with the par swap rate \(S_{m,n}(t) = (P(t,T_m) - P(t,T_n))/A_{m,n}(t)\) determined by the annuity factor \(A_{m,n}(t) = \sum_{k=m+1}^{n} \tau_k P(t,T_k)\). The swap rate can also be expressed in terms of the forward Libor rates: \(S_{m,n}(t) = (1 - \prod_{j=m+1}^n (1+\tau_j l_j(t))^{-1})/(\sum_{k=m+1}^n \tau_k \prod_{j=m+1}^k (1+\tau_j l_j(t))^{-1})\), linking the swap market directly to the forward rate curve.
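The swap identities above can be checked numerically. The following is a minimal sketch, assuming a single curve and a hypothetical flat 3% continuously compounded curve with annual accruals; the function names are illustrative and not taken from any library.

```python
import math

def annuity(dfs, taus):
    """Annuity factor A = sum_k tau_k * P(t, T_k) over the payment dates."""
    return sum(tau * p for tau, p in zip(taus, dfs))

def par_swap_rate(p_start, dfs, taus):
    """S = (P(t, T_m) - P(t, T_n)) / A, with p_start = P(t, T_m)."""
    return (p_start - dfs[-1]) / annuity(dfs, taus)

def payer_swap_value(notional, fixed_rate, p_start, dfs, taus):
    """Value the payer swap as a sum of FRAs: N * sum_k (l_k - K) * tau_k * P_k."""
    value = 0.0
    prev = p_start
    for tau, p in zip(taus, dfs):
        forward_libor = (prev / p - 1.0) / tau  # l_k(t)
        value += notional * tau * (forward_libor - fixed_rate) * p
        prev = p
    return value

# Hypothetical inputs (assumptions for illustration only)
pay_times = [1.0, 2.0, 3.0]
taus = [1.0, 1.0, 1.0]
dfs = [math.exp(-0.03 * t) for t in pay_times]
p_start = 1.0  # spot-starting swap: P(t, T_m) = 1

swap_rate = par_swap_rate(p_start, dfs, taus)
value_at_par = payer_swap_value(1_000_000, swap_rate, p_start, dfs, taus)
value_off_market = payer_swap_value(1.0, 0.02, p_start, dfs, taus)
```

Valuing at the par rate should return zero up to rounding, and the off-market value should agree with \(A_{m,n}(t)(S_{m,n}(t) - K)\).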

Yield Curve Construction

The yield curve is built from market instruments via bootstrapping, a sequential procedure in three stages. Stage 1 (deposits): short-term deposit rates directly imply discount factors \(P(0,T) = 1/(1 + R_d\cdot\delta)\) where \(\delta\) is the day count fraction (typically ACT/360). Stage 2 (futures): an interest rate futures price \(F^{\text{fut}}\) implies the futures rate \(R_{\text{fut}} = (100-F^{\text{fut}})/100\). Because daily margining pushes futures rates above forward rates, a convexity adjustment is subtracted to obtain the forward rate, for example \(F(0;T_1,T_2) \approx R_{\text{fut}} - \frac{1}{2}\sigma^2 T_1 T_2\) in the Ho--Lee approximation with continuously compounded rates; then \(P(0,T_2) = P(0,T_1)/(1 + F(0;T_1,T_2)(T_2-T_1))\) extends the curve sequentially. Stage 3 (swaps): par swap rates determine longer discount factors from \(P(0,T_n) = (1 - S_n\sum_{i=1}^{n-1}\delta_i P(0,T_i))/(1 + S_n\delta_n)\), using all previously determined discount factors. Verification by repricing all input instruments within bid--ask tolerance \(\max_i|V_i^{\text{model}} - V_i^{\text{market}}| < \epsilon\) ensures curve consistency. The modern multi-curve framework separates the discounting curve (OIS) from projection curves (tenor-specific Libor/SOFR), reflecting post-2008 basis spreads: the OIS curve is bootstrapped first via \(P^{\text{OIS}}(0,T_n) = (1 - S_n^{\text{OIS}}\sum_{i=1}^{n-1}\delta_i P^{\text{OIS}}(0,T_i))/(1 + S_n^{\text{OIS}}\delta_n)\), then each tenor curve is bootstrapped using OIS for discounting.
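The swap stage and the repricing check can be sketched in a few lines. This is an illustrative example only: the par rates are hypothetical quotes, accruals are annual, and the first node is treated as a one-period swap, which reproduces the deposit formula \(1/(1 + R_d\cdot\delta)\).

```python
def bootstrap_discount_factors(par_rates, deltas):
    """Solve P_n = (1 - S_n * sum_{i<n} delta_i P_i) / (1 + S_n * delta_n) node by node."""
    dfs = []
    for s_n, delta_n in zip(par_rates, deltas):
        # zip truncates at len(dfs), so only already-solved nodes enter the sum
        annuity_prev = sum(d * p for d, p in zip(deltas, dfs))
        dfs.append((1.0 - s_n * annuity_prev) / (1.0 + s_n * delta_n))
    return dfs

def implied_par_rate(dfs, deltas, n):
    """Par rate of the n-period swap recovered from the bootstrapped curve."""
    annuity = sum(d * p for d, p in zip(deltas[:n], dfs[:n]))
    return (1.0 - dfs[n - 1]) / annuity

# Hypothetical market quotes (assumptions for illustration only)
par_rates = [0.030, 0.032, 0.034, 0.035]
deltas = [1.0, 1.0, 1.0, 1.0]

dfs = bootstrap_discount_factors(par_rates, deltas)

# Verification step: reprice every input instrument from the finished curve
max_repricing_error = max(
    abs(implied_par_rate(dfs, deltas, n) - par_rates[n - 1])
    for n in range(1, len(par_rates) + 1)
)
```

A production curve would mix instrument types, apply real day count fractions, and interpolate between nodes; the sketch isolates only the sequential solve.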

Interpolation Methods

Interpolation between bootstrap nodes profoundly affects forward rates and hedging. The choice of what to interpolate (discount factors, log-discount factors, zero rates, or forward rates) matters as much as the interpolation method itself. Interpolating log-discounts \(\log P(0,T)\) ensures positivity \(P(0,T) > 0\) by construction. Linear interpolation on zero rates \(z(0,T) = z_i + (T-T_i)(z_{i+1}-z_i)/(T_{i+1}-T_i)\) produces forwards \(f(0,T) = z(0,T) + T\,\partial_T z(0,T)\) that are linear within each interval but jump at the nodes, where the slope of \(z\) changes; linear interpolation on log-discounts produces piecewise constant (stepped) forward rates, also with jumps at nodes. Cubic spline interpolation on zero rates \(z(0,T) = a_i + b_i(T-T_i) + c_i(T-T_i)^2 + d_i(T-T_i)^3\) with continuity of \(z\), \(z'\), \(z''\) at interior nodes gives smooth forwards but may introduce spurious oscillations, negative forward rates, and non-local sensitivity (a dense Jacobian that complicates hedges). Monotone convex interpolation (Hagan--West) on the forward curve preserves positivity and locality: it interpolates \(g(T) = -\log P(0,T) = \int_0^T f(0,u)\,du\) ensuring \(g'(T) = f(0,T) > 0\) with monotone forwards in each segment and changes localized to adjacent intervals. Advanced methods include tension splines with a tension parameter \(\tau\) controlling the smoothness-fidelity trade-off, rational interpolation, and machine learning approaches. Impact on hedging: local methods (linear, monotone convex) produce sparse Jacobians \(\partial P(0,T_j)/\partial z_i = 0\) unless \(T_j\) is near \(T_i\), yielding cleaner bucket sensitivities; global methods (cubic spline) produce dense Jacobians and more complex hedge ratios.
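The effect of the interpolated quantity on forwards can be seen directly. The sketch below compares linear interpolation on log-discounts with linear interpolation on zero rates on a hypothetical three-node curve, and estimates the instantaneous forward by a central difference; the node values and helper names are assumptions for illustration.

```python
import math
from bisect import bisect_right

def _segment(times, t):
    """Index i of the interval [times[i], times[i+1]] containing t (clamped)."""
    i = bisect_right(times, t) - 1
    return min(max(i, 0), len(times) - 2)

def df_loglinear(times, dfs, t):
    """Linear interpolation on log P(0,T)."""
    i = _segment(times, t)
    w = (t - times[i]) / (times[i + 1] - times[i])
    return math.exp((1.0 - w) * math.log(dfs[i]) + w * math.log(dfs[i + 1]))

def df_linear_zero(times, dfs, t):
    """Linear interpolation on the zero rate z(0,T) = -log P(0,T) / T."""
    i = _segment(times, t)
    z_lo = -math.log(dfs[i]) / times[i]
    z_hi = -math.log(dfs[i + 1]) / times[i + 1]
    w = (t - times[i]) / (times[i + 1] - times[i])
    return math.exp(-((1.0 - w) * z_lo + w * z_hi) * t)

def inst_forward(df_func, times, dfs, t, h=1e-5):
    """f(0,t) = -d/dT log P(0,T), estimated by a central difference."""
    up = math.log(df_func(times, dfs, t + h))
    down = math.log(df_func(times, dfs, t - h))
    return -(up - down) / (2.0 * h)

# Hypothetical nodes: zero rates of 2%, 3%, 3.5% at 1, 2, 3 years
times = [1.0, 2.0, 3.0]
zeros = [0.02, 0.03, 0.035]
dfs = [math.exp(-z * t) for z, t in zip(zeros, times)]

stepped = [inst_forward(df_loglinear, times, dfs, t) for t in (1.2, 1.8, 2.5)]
left_of_node = inst_forward(df_linear_zero, times, dfs, 1.999)
right_of_node = inst_forward(df_linear_zero, times, dfs, 2.001)
```

On these inputs the log-linear forwards are constant within each interval, while the linear-zero forward drops by roughly one percentage point across the 2-year node.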

No-Arbitrage Relations

No-arbitrage requires discount factors to be positive, \(P(0,T) > 0\), and, when cash can be held at no cost, non-increasing in maturity, \(P(0,T_1) \ge P(0,T_2)\) for \(T_1 < T_2\) (equivalently non-negative forward rates). Convexity in maturity is a useful shape diagnostic rather than a consequence of no-arbitrage alone: \(\partial_T^2 P(0,T) = (f(0,T)^2 - \partial_T f(0,T))\,P(0,T)\) turns negative wherever forwards rise steeply. The replication argument establishes forward rates: two strategies for obtaining 1 unit at time \(T_2\), buying a \(T_2\)-bond for \(P(0,T_2)\) versus buying \(1/(1 + F(0;T_1,T_2)(T_2-T_1))\) units of the \(T_1\)-bond and reinvesting the proceeds at the locked-in forward rate, must cost the same, giving \(P(0,T_1)/P(0,T_2) = 1 + F(0;T_1,T_2)(T_2-T_1)\). Chaining this relation over a grid \(0 = T_0 < T_1 < \dots < T_n = T\) with \(P(0,T_0) = 1\) yields the multiplicative decomposition \(P(0,T) = \prod_{i=1}^n P(0,T_i)/P(0,T_{i-1})\) underlying bootstrapping. The par yield \(y_n = (1 - P(0,T_n))/\sum_i\delta_i P(0,T_i)\) is the coupon rate making a bond trade at par. A calendar-spread violation occurs when \(P(0,T_1)/P(0,T_2) < 1\) for \(T_1 < T_2\) (a negative forward rate); it is the discount-curve analogue of a volatility surface whose total variance \(w = T\sigma_{\text{impl}}^2\) fails to increase with maturity. A butterfly-type violation of the convexity diagnostic occurs when \(P(0,T_2) > \lambda P(0,T_1) + (1-\lambda)P(0,T_3)\) for \(\lambda = (T_3-T_2)/(T_3-T_1)\). For stochastic rates, Jensen's inequality produces the convexity effect \(P(0,T) = \mathbb{E}^{\mathbb{Q}}[e^{-\int_0^T r_s\,ds}] \ge e^{-\mathbb{E}^{\mathbb{Q}}[\int_0^T r_s\,ds]}\).
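The positivity and forward-rate conditions translate into a simple curve check. The sketch below is an assumption-laden illustration: the function names and the two sample curves are hypothetical, and the check treats negative forwards as violations, which presumes a regime where rates are expected to be non-negative.

```python
def simple_forward_rates(times, dfs):
    """F(0;T_{i-1},T_i) = (P_{i-1}/P_i - 1)/(T_i - T_{i-1}), with T_0 = 0 and P_0 = 1."""
    grid_t = [0.0] + list(times)
    grid_p = [1.0] + list(dfs)
    return [
        (grid_p[i] / grid_p[i + 1] - 1.0) / (grid_t[i + 1] - grid_t[i])
        for i in range(len(times))
    ]

def curve_violations(times, dfs):
    """List the nodes where positivity or forward-rate non-negativity fails."""
    issues = [f"non-positive discount factor at T={t}" for t, p in zip(times, dfs) if p <= 0.0]
    if issues:
        return issues  # forwards are undefined without positive discount factors
    for t, fwd in zip(times, simple_forward_rates(times, dfs)):
        if fwd < 0.0:
            issues.append(f"negative forward rate ending at T={t}")
    return issues

# Hypothetical curves (assumptions for illustration only)
times = [1.0, 2.0, 3.0]
clean_curve = [0.97, 0.94, 0.90]
kinked_curve = [0.98, 0.99, 0.95]  # P(0,1) < P(0,2): negative forward over [1, 2]

clean_report = curve_violations(times, clean_curve)
kinked_report = curve_violations(times, kinked_curve)

# Multiplicative decomposition: the chained ratios recover P(0, T_n)
chained = 1.0
for t_prev, t_next, fwd in zip([0.0] + times, times, simple_forward_rates(times, clean_curve)):
    chained /= 1.0 + fwd * (t_next - t_prev)
```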

Discount Factors and Zero Rates

Multiple compounding conventions coexist: continuous compounding \(P(0,T) = e^{-R(0,T)\cdot T}\) with \(R(0,T) = -\ln P(0,T)/T\); simple compounding \(P(0,T) = 1/(1 + L(0,T)\cdot T)\); and annual compounding \(P(0,T) = 1/(1+y)^T\). Conversions between conventions are exact, but a quoted rate is only meaningful together with its compounding and day count convention. Forward rates decompose future borrowing costs: the simply compounded forward rate \(F(t;T_1,T_2) = (P(t,T_1)/P(t,T_2) - 1)/(T_2-T_1)\); the continuously compounded forward rate \(f^c(t;T_1,T_2) = (\ln P(t,T_1) - \ln P(t,T_2))/(T_2-T_1)\); and the instantaneous forward rate \(f(t,T) = -\partial_T\ln P(t,T)\) obtained in the limit \(T_2 \to T_1\). The short rate is \(r_t = f(t,t) = \lim_{T \to t^+}f(t,T)\). Term structure shapes (normal or upward-sloping, inverted, flat, and humped) reflect market expectations, risk premia, and monetary policy. Day count conventions (ACT/360, ACT/365, 30/360) affect rate calculations in practice.
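Because every convention maps to the same discount factor, conversion goes through \(P(0,T)\). A minimal sketch follows, assuming year fractions are already computed; the function names and convention labels are illustrative.

```python
import math

def df_from_rate(rate, t, compounding):
    """Discount factor P(0,t) implied by a rate under the given convention."""
    if compounding == "continuous":
        return math.exp(-rate * t)
    if compounding == "simple":
        return 1.0 / (1.0 + rate * t)
    if compounding == "annual":
        return (1.0 + rate) ** (-t)
    raise ValueError(f"unknown compounding convention: {compounding}")

def rate_from_df(p, t, compounding):
    """Inverse of df_from_rate."""
    if compounding == "continuous":
        return -math.log(p) / t
    if compounding == "simple":
        return (1.0 / p - 1.0) / t
    if compounding == "annual":
        return p ** (-1.0 / t) - 1.0
    raise ValueError(f"unknown compounding convention: {compounding}")

def convert_rate(rate, t, source, target):
    """Convert between conventions via the common discount factor."""
    return rate_from_df(df_from_rate(rate, t, source), t, target)

# Worked example: 5% annually compounded over 2 years
as_continuous = convert_rate(0.05, 2.0, "annual", "continuous")  # ln(1.05)
as_simple = convert_rate(0.05, 2.0, "annual", "simple")          # (1.05**2 - 1) / 2
```

The same 2-year discount factor corresponds to about 4.879% continuous, 5% annual, and 5.125% simple.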

General Short-Rate Framework

A short-rate model specifies the instantaneous rate \(r_t\) as a diffusion \(dr_t = \mu(t,r_t)\,dt + \sigma(t,r_t)\,dW_t\) under the risk-neutral measure \(\mathbb{Q}\). The money-market account \(B_t = \exp(\int_0^t r_s\,ds)\) satisfies \(dB_t = r_t B_t\,dt\) and serves as the natural numeraire for risk-neutral pricing. Bond prices satisfy the fundamental pricing equation \(P(t,T) = \mathbb{E}^{\mathbb{Q}}_t[\exp(-\int_t^T r_s\,ds)]\) with properties: \(P(T,T) = 1\), \(P(t,T) > 0\), and \(P(t,T)/B_t\) is a \(\mathbb{Q}\)-martingale. Substituting \(P = F(t,r)\) into the Ito formula and imposing the martingale condition yields the bond pricing PDE

\[\frac{\partial P}{\partial t} + \mu^{\mathbb{Q}}(t,r)\frac{\partial P}{\partial r} + \frac{1}{2}\sigma^2(t,r)\frac{\partial^2 P}{\partial r^2} = rP\]

with terminal condition \(P(T,T) = 1\). Boundary conditions depend on the model: \(P \to 0\) as \(r \to +\infty\) (heavy discounting); at \(r = 0\) the CIR model requires reflection; Gaussian models accommodate \(r < 0\). The connection between physical and risk-neutral dynamics involves the market price of risk \(\lambda(t,r) = (\mu^{\mathbb{P}} - \mu^{\mathbb{Q}})/\sigma\), related via the Girsanov transformation \(dW_t^{\mathbb{Q}} = dW_t^{\mathbb{P}} + \lambda(t,r_t)\,dt\). The volatility \(\sigma(t,r)\) is the same under both measures. Models are classified by volatility structure (Gaussian: \(\sigma(t)\), square-root: \(\sigma\sqrt{r}\), log-normal: \(\sigma r\), CEV: \(\sigma r^\gamma\)), analytical tractability (affine with closed-form \(P\) versus non-affine requiring numerical methods), and number of factors (one-factor for limited dynamics, multi-factor for richer curve dynamics including steepening/flattening). Short-rate models correspond to Markovian HJM models with specific volatility structures, where \(r_t = f(t,t)\) is the limit of the forward rate curve.
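The martingale step behind the bond pricing PDE above can be written out explicitly. As a sketch, assuming \(P(t,r)\) is smooth enough for the Ito formula and using \(dB_t = r_t B_t\,dt\), the discounted bond price satisfies

\[d\!\left(\frac{P}{B_t}\right) = \frac{1}{B_t}\left[\left(\frac{\partial P}{\partial t} + \mu^{\mathbb{Q}}(t,r)\frac{\partial P}{\partial r} + \frac{1}{2}\sigma^2(t,r)\frac{\partial^2 P}{\partial r^2} - rP\right)dt + \sigma(t,r)\frac{\partial P}{\partial r}\,dW_t^{\mathbb{Q}}\right]\]

A \(\mathbb{Q}\)-martingale has zero drift, so the \(dt\) bracket must vanish, which is the PDE. Under \(\mathbb{P}\), substituting \(dW_t^{\mathbb{Q}} = dW_t^{\mathbb{P}} + \lambda\,dt\) shows that the bond earns the excess return \(\lambda\,\sigma\,\partial_r P / P\) over \(r\) per unit time.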

The Vasicek Model

The Vasicek (1977) model specifies \(dr_t = \kappa(\theta - r_t)\,dt + \sigma\,dW_t^{\mathbb{Q}}\), an Ornstein--Uhlenbeck process with mean-reversion speed \(\kappa > 0\), long-run mean \(\theta\), and constant volatility \(\sigma > 0\). The mean-reversion dynamics create an automatic restoring force: \(\mathbb{E}[dr_t \mid r_t] = \kappa(\theta - r_t)\,dt\) is positive when \(r_t < \theta\) and negative when \(r_t > \theta\), with half-life \(\ln 2/\kappa\). The OU deviation \(y_t = r_t - \theta\) satisfies \(dy_t = -\kappa y_t\,dt + \sigma\,dW_t\). The explicit solution via the integrating factor \(e^{\kappa t}\) is

\[r_t = \theta + (r_0 - \theta)e^{-\kappa t} + \sigma e^{-\kappa t}\int_0^t e^{\kappa s}\,dW_s\]

so \(r_t \mid r_0 \sim \mathcal{N}\!\left(\theta + (r_0 - \theta)e^{-\kappa t},\; \frac{\sigma^2}{2\kappa}(1 - e^{-2\kappa t})\right)\). The stationary distribution is \(\mathcal{N}(\theta, \sigma^2/(2\kappa))\). Bond prices are derived through two equivalent approaches: the market price of risk method (constructing a risk-free portfolio of two bonds with different maturities, imposing no-arbitrage to identify the maturity-independent Sharpe ratio \(\lambda_t = (\mu(t,r,T) - r)/\nu(t,r,T)\)) and the risk-neutral measure method (applying Girsanov's theorem and requiring the discounted bond price to be a martingale). Both yield the same PDE \(f_t + \frac{1}{2}\sigma^2 f_{rr} + [\kappa(\theta - r) - \sigma\lambda]f_r - rf = 0\) when \(\kappa(\theta - r)\) denotes the physical drift and \(\lambda\) is constant, so that the risk-neutral long-run mean is \(\theta - \sigma\lambda/\kappa\). With parameters quoted directly under \(\mathbb{Q}\), as in the specification above, the \(\sigma\lambda\) term is already absorbed into \(\theta\), and the exponential-affine solution is \(P(t,T) = A(\tau)\,e^{-B(\tau)\,r_t}\) where \(\tau = T - t\),

\[B(\tau) = \frac{1 - e^{-\kappa\tau}}{\kappa}, \qquad \ln A(\tau) = \left(\theta - \frac{\sigma^2}{2\kappa^2}\right)(B(\tau) - \tau) - \frac{\sigma^2}{4\kappa}B(\tau)^2\]

The Gaussian distribution permits negative rates, a theoretical deficiency, but it is also what delivers closed-form prices for zero-coupon bond options. The Jamshidian decomposition then reduces coupon bond options to portfolios of zero-coupon bond options by exploiting the monotonicity of \(P(t,T) = A(\tau)e^{-B(\tau)r}\) in \(r\). Caplet and swaption formulas follow from the log-normal distribution of forward bond prices under the appropriate forward measure. Exact simulation uses \(r_{t+\Delta t} = e^{-\kappa\Delta t}r_t + (1 - e^{-\kappa\Delta t})\theta + \sigma\sqrt{(1 - e^{-2\kappa\Delta t})/(2\kappa)}\;Z\) with \(Z \sim \mathcal{N}(0,1)\). Yield curve shapes produced by Vasicek include normal, inverted, and humped, depending on where \(r_0\) sits relative to \(\theta\) and to the long-maturity yield \(\theta - \sigma^2/(2\kappa^2)\). Calibration targets the initial term structure (via least-squares fit to observed bond prices) and interest rate option prices.
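The closed-form bond price and the exact transition can be combined into a short sanity check. This is a minimal sketch with hypothetical risk-neutral parameters; it uses only the formulas stated in this section, and the function names are illustrative.

```python
import math
import random

def vasicek_bond_price(r, tau, kappa, theta, sigma):
    """P = A(tau) * exp(-B(tau) * r) with the A and B given in this section."""
    b = (1.0 - math.exp(-kappa * tau)) / kappa
    ln_a = (theta - sigma**2 / (2.0 * kappa**2)) * (b - tau) - sigma**2 / (4.0 * kappa) * b**2
    return math.exp(ln_a - b * r)

def vasicek_exact_step(r, dt, kappa, theta, sigma, z):
    """Sample r_{t+dt} from its exact Gaussian transition, given z ~ N(0,1)."""
    decay = math.exp(-kappa * dt)
    std = sigma * math.sqrt((1.0 - decay**2) / (2.0 * kappa))
    return decay * r + (1.0 - decay) * theta + std * z

# Hypothetical parameters (assumptions for illustration only)
kappa, theta, sigma, r0 = 0.5, 0.04, 0.01, 0.03
horizon, n_steps, n_paths = 2.0, 8, 20000
dt = horizon / n_steps
rng = random.Random(42)

terminal_rates = []
for _ in range(n_paths):
    r = r0
    for _ in range(n_steps):
        r = vasicek_exact_step(r, dt, kappa, theta, sigma, rng.gauss(0.0, 1.0))
    terminal_rates.append(r)

sample_mean = sum(terminal_rates) / n_paths
theory_mean = theta + (r0 - theta) * math.exp(-kappa * horizon)
price_5y = vasicek_bond_price(r0, 5.0, kappa, theta, sigma)
```

Because each step samples the exact transition, the step size affects only how finely the path is observed, not the distribution of the terminal rate.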

The Cox--Ingersoll--Ross Model

The CIR (1985) model replaces constant volatility with a square-root diffusion: \(dr_t = \kappa(\theta - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t^{\mathbb{Q}}\). The Feller condition \(2\kappa\theta \geq \sigma^2\) guarantees that \(r_t > 0\) for all \(t\), preventing negative rates; when violated, \(r_t\) can touch zero but is immediately reflected. The transition density is a scaled non-central chi-squared distribution with degrees of freedom \(\nu = 4\kappa\theta/\sigma^2\) and non-centrality parameter proportional to \(r_t e^{-\kappa\tau}\). Bond prices retain the affine form \(P(t,T) = A(\tau)\,e^{-B(\tau)\,r_t}\) but with

\[B(\tau) = \frac{2(e^{\gamma\tau} - 1)}{(\gamma + \kappa)(e^{\gamma\tau} - 1) + 2\gamma}, \qquad \gamma = \sqrt{\kappa^2 + 2\sigma^2}\]

where \(A(\tau)\) and \(B(\tau)\) solve Riccati ODEs arising from the affine structure. The auxiliary parameter \(\gamma\) exceeds \(\kappa\) whenever \(\sigma > 0\) and sets the rate at which \(B(\tau)\) approaches its limit \(2/(\gamma + \kappa)\). Bond options are priced via the non-central chi-squared distribution. Caplet and swaption formulas are available in semi-closed form. Compared with Vasicek, CIR achieves rate positivity at the cost of more complex analytics (non-central \(\chi^2\) rather than Gaussian). The level-dependent volatility \(\sigma\sqrt{r}\) produces higher rate volatility when rates are high, matching empirical observations. Euler discretization requires care because a discretized step can go negative and make \(\sqrt{r}\) undefined: common fixes are truncation \(r^+ = \max(r,0)\) or reflection, while exact simulation from the non-central chi-squared transition distribution (the same sampling step used for the variance in the Broadie--Kaya algorithm) is preferred. The change of measure from \(\mathbb{P}\) to \(\mathbb{Q}\) preserves the CIR form but modifies mean-reversion parameters. Yield curve dynamics under CIR are richer than Vasicek due to the state-dependent volatility, and calibration identifies parameters from both the initial curve shape and rate option prices.
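The bond formula and exact sampling can be sketched with the standard library alone. The example below makes several labeled assumptions: the parameters are hypothetical, \(A(\tau)\) is taken in its standard closed form \(A(\tau) = \left[2\gamma e^{(\kappa+\gamma)\tau/2}/((\gamma+\kappa)(e^{\gamma\tau}-1)+2\gamma)\right]^{2\kappa\theta/\sigma^2}\), and the non-central chi-squared draw uses a Poisson mixture of gamma variates with a simple Knuth-style Poisson sampler that is only suitable for small means.

```python
import math
import random

def cir_bond_price(r, tau, kappa, theta, sigma):
    """P = A(tau) * exp(-B(tau) * r) for the CIR model."""
    gamma = math.sqrt(kappa**2 + 2.0 * sigma**2)
    growth = math.exp(gamma * tau) - 1.0
    denom = (gamma + kappa) * growth + 2.0 * gamma
    b = 2.0 * growth / denom
    ln_a = (2.0 * kappa * theta / sigma**2) * math.log(
        2.0 * gamma * math.exp(0.5 * (kappa + gamma) * tau) / denom
    )
    return math.exp(ln_a - b * r)

def _poisson(mean, rng):
    """Knuth's multiplication method; adequate only for small means."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def cir_exact_step(r, dt, kappa, theta, sigma, rng):
    """r_{t+dt} = c * chi2'(dof, ncp), sampled as a Poisson mixture of gammas."""
    decay = math.exp(-kappa * dt)
    c = sigma**2 * (1.0 - decay) / (4.0 * kappa)
    dof = 4.0 * kappa * theta / sigma**2
    ncp = r * decay / c
    n = _poisson(0.5 * ncp, rng)
    # chi-squared with (dof + 2n) degrees of freedom = Gamma(shape=dof/2 + n, scale=2)
    return c * rng.gammavariate(0.5 * dof + n, 2.0)

# Hypothetical parameters satisfying the Feller condition 2*kappa*theta >= sigma^2
kappa, theta, sigma, r0, dt = 0.5, 0.04, 0.1, 0.03, 1.0
rng = random.Random(7)
samples = [cir_exact_step(r0, dt, kappa, theta, sigma, rng) for _ in range(20000)]

sample_mean = sum(samples) / len(samples)
theory_mean = theta + (r0 - theta) * math.exp(-kappa * dt)
```

Every draw is non-negative by construction, which is the practical advantage over a truncated or reflected Euler step.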

The Hull--White Model

The Hull--White (1990) extension \(dr_t = (\theta(t) - \kappa\, r_t)\,dt + \sigma\,dW_t\) introduces a time-dependent drift \(\theta(t)\) chosen to exactly match the initial term structure: \(\theta(t) = \partial_t f^M(0,t) + \kappa f^M(0,t) + \frac{\sigma^2}{2\kappa}(1 - e^{-2\kappa t})\) where \(f^M(0,t)\) is the market instantaneous forward rate. This ensures \(P^{\text{model}}(0,T) = P^{\text{market}}(0,T)\) for all \(T\), eliminating the static calibration error that plagues time-homogeneous models. Closed-form pricing of caps, floors, and swaptions is preserved, making Hull--White the standard production model for vanilla rate derivatives. The two-factor G2++ extension adds a second OU factor \(r_t = x_t + y_t + \phi(t)\) with \(dx_t = -a\,x_t\,dt + \sigma_1\,dW_t^1\) and \(dy_t = -b\,y_t\,dt + \sigma_2\,dW_t^2\), providing richer correlation structure and humped volatility for more accurate swaption calibration.
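The drift formula can be checked against a known consequence of the fit: with this \(\theta(t)\), the risk-neutral mean of the short rate is \(\mathbb{E}^{\mathbb{Q}}[r_t] = f^M(0,t) + \frac{\sigma^2}{2\kappa^2}(1 - e^{-\kappa t})^2\). The sketch below assumes a hypothetical smooth market forward curve, differentiates it by central differences, and integrates the mean ODE \(m'(t) = \theta(t) - \kappa\,m(t)\) with a classical Runge--Kutta scheme; all names and parameter values are illustrative.

```python
import math

def market_forward(t):
    """Hypothetical market forward curve f^M(0,t) (an assumption, not market data)."""
    return 0.02 + 0.02 * (1.0 - math.exp(-0.3 * t))

def hull_white_theta(fwd, t, kappa, sigma, h=1e-5):
    """theta(t) = d/dt f^M(0,t) + kappa f^M(0,t) + sigma^2/(2 kappa) (1 - exp(-2 kappa t))."""
    dfwd_dt = (fwd(t + h) - fwd(t - h)) / (2.0 * h)
    return dfwd_dt + kappa * fwd(t) + sigma**2 / (2.0 * kappa) * (1.0 - math.exp(-2.0 * kappa * t))

def expected_short_rate(fwd, t_end, kappa, sigma, n_steps=2000):
    """Integrate m'(t) = theta(t) - kappa m(t), m(0) = f^M(0,0), with RK4."""
    def rhs(t, m):
        return hull_white_theta(fwd, t, kappa, sigma) - kappa * m

    dt = t_end / n_steps
    t, m = 0.0, fwd(0.0)
    for _ in range(n_steps):
        k1 = rhs(t, m)
        k2 = rhs(t + 0.5 * dt, m + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, m + 0.5 * dt * k2)
        k4 = rhs(t + dt, m + dt * k3)
        m += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return m

kappa, sigma, t_end = 0.1, 0.01, 5.0
numeric_mean = expected_short_rate(market_forward, t_end, kappa, sigma)
closed_form_mean = market_forward(t_end) + sigma**2 / (2.0 * kappa**2) * (1.0 - math.exp(-kappa * t_end))**2
```

In practice \(f^M(0,t)\) comes from an interpolated curve, so the smoothness of the interpolation method directly controls how well behaved \(\partial_t f^M(0,t)\) and hence \(\theta(t)\) are.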

The Black--Karasinski Model

The Black--Karasinski (1991) model specifies \(d\ln r_t = \kappa(t)(\theta(t) - \ln r_t)\,dt + \sigma(t)\,dW_t\), making the short rate log-normally distributed and strictly positive by construction. The logarithmic transformation destroys the affine structure: bond prices \(P(t,T) = \mathbb{E}_t^{\mathbb{Q}}[\exp(-\int_t^T r_s\,ds)]\) have no closed-form solution because the short rate \(r_t = e^{x_t}\) is not affine in the Gaussian state \(x_t = \ln r_t\). Pricing therefore requires numerical methods: trinomial trees (with mean-reversion-adjusted branching probabilities that account for the drift toward \(\theta(t)\)) provide the standard implementation, while Monte Carlo simulation with log-Euler discretization \(\ln r_{t+\Delta t} = \ln r_t + \kappa(t)(\theta(t) - \ln r_t)\Delta t + \sigma(t)\sqrt{\Delta t}\,Z\) offers flexibility for path-dependent products. Time-dependent parameters \(\kappa(t)\), \(\theta(t)\), \(\sigma(t)\) enable calibration to both the initial term structure and cap volatility smiles. Despite the computational cost relative to affine models, Black--Karasinski is widely used in practice for its realistic rate distribution, particularly when the log-normal assumption better matches market dynamics than the Gaussian (Hull--White) alternative. Comparison with Hull--White reveals the fundamental trade-off: Hull--White offers analytical tractability and exact curve fit but permits negative rates; Black--Karasinski ensures positivity and captures log-normal cap smiles but requires numerical pricing for all derivatives.
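The log-Euler scheme above leads to a simple Monte Carlo bond pricer. The sketch below assumes constant \(\kappa\), \(\theta\), \(\sigma\) (a simplification of the time-dependent model), approximates \(\int_0^T r_s\,ds\) by the trapezoidal rule, and uses hypothetical parameter values; it is meant to show the mechanics, not a calibrated implementation.

```python
import math
import random

def bk_bond_price_mc(r0, maturity, kappa, theta, sigma, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of E[exp(-int_0^T r_s ds)] under log-Euler dynamics.

    theta is the long-run level of ln r, not of r itself.
    """
    rng = random.Random(seed)
    dt = maturity / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = math.log(r0)  # x_t = ln r_t is the simulated state
        integral = 0.0
        for _ in range(n_steps):
            r_old = math.exp(x)
            x += kappa * (theta - x) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
            integral += 0.5 * (r_old + math.exp(x)) * dt  # trapezoidal rule
        total += math.exp(-integral)
    return total / n_paths

# Hypothetical parameters (assumptions for illustration only)
price = bk_bond_price_mc(
    r0=0.03, maturity=5.0, kappa=0.2, theta=math.log(0.04),
    sigma=0.25, n_steps=50, n_paths=2000,
)

# Deterministic limit: with sigma = 0 and ln r0 = theta the rate stays at r0
flat_price = bk_bond_price_mc(
    r0=0.03, maturity=5.0, kappa=0.2, theta=math.log(0.03),
    sigma=0.0, n_steps=50, n_paths=10,
)
```

Since every simulated rate is \(e^{x}\) with \(x\) real, positivity holds on every path, and the estimated price lies strictly between 0 and 1.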

Affine Term Structure Models

The Vasicek and CIR models are special cases of the affine class: if the short rate is affine in the state vector \(X_t \in \mathbb{R}^n\), \(r_t = \rho_0 + \rho_1^\top X_t\), and the drift and squared diffusion of \(X_t\) are affine in \(X_t\), that is, \(\mu(X) = K_0 + K_1 X\) and \((\sigma\sigma^\top)(X) = H_0 + H_1 X\), then bond prices take the form \(P(t,T) = \exp(A(T-t) - B(T-t)^\top X_t)\) where \(A\) and \(B\) satisfy a system of Riccati ODEs \(B'(\tau) = K_1^\top B(\tau) - \frac{1}{2}B(\tau)^\top H_1 B(\tau) + \rho_1\) and \(A'(\tau) = -K_0^\top B(\tau) + \frac{1}{2}B(\tau)^\top H_0 B(\tau) - \rho_0\) with \(A(0) = 0\) and \(B(0) = 0\). For Vasicek (\(K_0 = \kappa\theta\), \(K_1 = -\kappa\), \(H_0 = \sigma^2\), \(H_1 = 0\), \(\rho_0 = 0\), \(\rho_1 = 1\)) these reduce to \(B' = 1 - \kappa B\) and \(A' = -\kappa\theta B + \frac{1}{2}\sigma^2 B^2\), consistent with the closed form of the Vasicek section, where \(A\) here plays the role of \(\ln A\) there. This structure yields semi-analytic pricing for a wide range of derivatives via Fourier transform methods. The Dai--Singleton \(A_m(n)\) classification organizes \(n\)-factor affine models by the number \(m\) of state variables driving square-root components: \(A_0(n)\) models (purely Gaussian, e.g., multi-factor Vasicek) through \(A_n(n)\) models (all square-root, e.g., multi-factor CIR). Multi-factor extensions capture richer yield curve dynamics including level, slope, and curvature movements that one-factor models cannot reproduce.
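For a one-factor model the Riccati system can be integrated numerically and compared with the Vasicek closed form. The sketch below uses the convention \(P = \exp(A - B\,x)\) with \(A(0) = B(0) = 0\); the sign of the \(A\) equation is chosen so that the result reproduces \(\ln A(\tau)\) from the Vasicek section, which the comparison at the end verifies. The solver name, step count, and parameter values are assumptions for illustration.

```python
import math

def solve_riccati(tau, k0, k1, h0, h1, rho0, rho1, n_steps=2000):
    """RK4 integration of the scalar affine system
        B' = k1*B - 0.5*h1*B^2 + rho1
        A' = -k0*B + 0.5*h0*B^2 - rho0
    for P = exp(A - B*x), starting from A(0) = B(0) = 0.
    """
    def rhs(b):
        db = k1 * b - 0.5 * h1 * b * b + rho1
        da = -k0 * b + 0.5 * h0 * b * b - rho0
        return da, db

    dt = tau / n_steps
    a = b = 0.0
    for _ in range(n_steps):
        da1, db1 = rhs(b)
        da2, db2 = rhs(b + 0.5 * dt * db1)
        da3, db3 = rhs(b + 0.5 * dt * db2)
        da4, db4 = rhs(b + dt * db3)
        a += dt * (da1 + 2.0 * da2 + 2.0 * da3 + da4) / 6.0
        b += dt * (db1 + 2.0 * db2 + 2.0 * db3 + db4) / 6.0
    return a, b

# Vasicek as an affine model: drift kappa*theta - kappa*r, squared diffusion sigma^2
kappa, theta, sigma, tau = 0.5, 0.04, 0.01, 5.0
a_num, b_num = solve_riccati(tau, k0=kappa * theta, k1=-kappa, h0=sigma**2, h1=0.0, rho0=0.0, rho1=1.0)

# Closed form from the Vasicek section
b_exact = (1.0 - math.exp(-kappa * tau)) / kappa
a_exact = (theta - sigma**2 / (2.0 * kappa**2)) * (b_exact - tau) - sigma**2 / (4.0 * kappa) * b_exact**2
```

Setting `h0=0.0` and `h1=sigma**2` instead gives the CIR case, where the numerical \(B\) can be compared with the CIR formula for \(B(\tau)\).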

Model Comparison and Selection

The choice among short-rate models involves trade-offs across multiple dimensions. Analytical tractability: Vasicek and CIR offer full closed-form bond and option prices; Hull--White preserves these while fitting the curve; Black--Karasinski sacrifices analytics for distributional realism. Rate distribution: Vasicek/Hull--White permit negative rates (Gaussian); CIR ensures positivity via the Feller condition (\(2\kappa\theta \ge \sigma^2\)); Black--Karasinski ensures positivity (log-normal). Volatility structure: Vasicek has constant (level-independent) volatility; CIR has square-root (level-dependent) volatility matching the empirical observation that rate volatility increases with rate level; Black--Karasinski has proportional (log-normal) volatility. Calibration: time-homogeneous models (Vasicek, CIR) cannot match the initial curve exactly; Hull--White and Black--Karasinski fit perfectly via \(\theta(t)\). Numerical methods: Vasicek and CIR admit exact simulation; Hull--White supports efficient tree construction; Black--Karasinski requires trinomial trees or Monte Carlo. Practical guidance: Vasicek for pedagogical clarity and rapid prototyping; CIR when rate positivity and level-dependent volatility matter; Hull--White as the standard production model for vanilla rate derivatives; Black--Karasinski when log-normal dynamics or cap smile calibration are priorities. Monte Carlo simulation under each model requires appropriate discretization schemes: exact Gaussian sampling for Vasicek/Hull--White, non-central chi-squared sampling for CIR, and log-Euler for Black--Karasinski.

Role in the Book

The term structure models developed here build on the stochastic calculus (Chapter 3), Feynman--Kac theory (Chapter 5), and PDE methods (Chapter 6) from earlier chapters. The bond pricing PDE parallels the Black--Scholes equation with the short rate replacing the stock price, and the market price of risk derivation mirrors the hedging argument for equity options. The CIR variance process reappears as the Heston model's variance dynamics (Chapter 16), with the same Feller condition, non-central chi-squared distribution, and Riccati ODE structure. Calibration techniques from Chapter 17 apply directly to fitting these models to market data, including regularization for parameter stability and filtering for time-varying parameters. The HJM framework (Chapter 19) generalizes the short-rate approach by modeling the entire forward curve, with short-rate models corresponding to Markovian HJM specifications.