No Closed-Form Bond Prices¶
In the Vasicek, Hull-White, and CIR models, zero-coupon bond prices can be written as explicit functions of the model parameters and the current short rate. The Black-Karasinski model offers no such luxury. As established in the non-affine structure section, the \(r\ln r\) drift and \(r^2\) diffusion terms in the BK dynamics prevent the bond pricing PDE from admitting an exponential-affine solution. This section examines the numerical methods required to compute BK bond prices, compares their accuracy and computational cost, and presents asymptotic approximations that provide useful intuition even without exact formulas.
The bond pricing problem¶
Under the risk-neutral measure, the BK zero-coupon bond price is
where \(r_s = e^{x_s}\) and \(x_s\) solves \(dx_s = [\theta(s) - ax_s]\,dt + \sigma\,dW_s\). The expectation cannot be evaluated in closed form because the integral \(\int_t^T e^{x_s}\,ds\) couples the exponential nonlinearity of \(r = e^x\) with the stochastic path of \(x_s\).
In the log-rate variable \(x = \ln r\), the bond pricing PDE is
The term \(e^x g\) --- the discounting term --- is the fundamental obstruction. There is no transformation that linearizes \(e^x\) and simultaneously maintains the Gaussian structure of the \(x\)-dynamics.
Numerical method 1: trinomial tree¶
The most widely used method for BK bond pricing is the trinomial tree (Hull-White tree adapted for log-normal rates). The tree discretizes both time and the log-rate space.
Construction¶
- Build a tree for \(x = \ln r\) with time steps \(\Delta t\) and log-rate steps \(\Delta x = \sigma\sqrt{3\Delta t}\)
- At each node, compute branching probabilities (up, middle, down) that match the conditional mean and variance of \(x_{t+\Delta t}\)
- Calibrate \(\theta(t_k)\) at each time step to match the observed discount factor \(P^{\text{mkt}}(0, t_k)\)
Bond pricing on the tree¶
To price a bond maturing at \(T\):
- Set all terminal nodes at time \(T\) to value \(1\)
- Roll backward through the tree: at each node \((t_k, x_j)\), the bond price is the discounted expected value
where \(p_u, p_m, p_d\) are the branching probabilities and \(e^{x_j}\) is the short rate at node \(j\).
Convergence: \(O(\Delta t)\) for standard trees, \(O(\Delta t^2)\) with Richardson extrapolation.
Numerical method 2: finite differences¶
The PDE in the log-rate variable can be solved by standard finite difference methods.
Crank-Nicolson scheme¶
Discretize the spatial domain \(x \in [x_{\min}, x_{\max}]\) with \(N_x\) points and time with \(N_t\) steps. The Crank-Nicolson (implicit-explicit average) scheme for the BK PDE is
where \(\mathcal{L}^k\) is the spatial operator at time step \(k\):
Boundary conditions: At \(x_{\min}\) (very low rates), \(P \approx 1\); at \(x_{\max}\) (very high rates), \(P \approx 0\).
Convergence: \(O(\Delta t^2 + \Delta x^2)\) for Crank-Nicolson, unconditionally stable.
Finite differences vs trees
Finite differences offer higher-order convergence and systematic grid refinement. Trees are simpler to implement and naturally handle the calibration of \(\theta(t)\) through forward induction. For production systems, finite differences with adaptive grids are often preferred; for teaching and prototyping, trees are more transparent.
Numerical method 3: Monte Carlo¶
Monte Carlo simulation generates paths of \(x_t = \ln r_t\) (which is Gaussian, making simulation straightforward) and averages the discounted payoffs.
Path simulation¶
Since \(x_t\) follows an OU process, exact simulation is available:
where \(Z \sim \mathcal{N}(0,1)\). The short rate is then \(r_t = e^{x_t}\).
Bond price estimator¶
Convergence: \(O(1/\sqrt{M})\) statistical error plus \(O(\Delta t^2)\) discretization error (with trapezoidal rule).
Asymptotic approximations¶
Although no exact formula exists, useful approximations can be derived.
Short-maturity expansion¶
For small \(\tau = T - t\), expand \(P(t,T)\) in powers of \(\tau\):
The first two terms match the general short-rate expansion \(P \approx e^{-r_t\tau}\). The third term introduces model-specific corrections from the drift and volatility.
Log-normal approximation¶
For moderate maturities, one can approximate the bond price by assuming the integral \(\int_t^T r_s\,ds\) is approximately log-normal. If \(Y = \int_t^T r_s\,ds\) has mean \(\mu_Y\) and variance \(\sigma_Y^2\), then
The moments \(\mu_Y\) and \(\sigma_Y^2\) can be computed from the conditional moments of the CIR or BK process. This approximation captures the convexity correction but is only accurate for moderate maturities and volatilities.
Computational cost comparison¶
| Method | Cost per bond price | Accuracy | Calibration |
|---|---|---|---|
| Affine formula (Vasicek/CIR) | \(O(1)\) | Exact | Analytical |
| Trinomial tree | \(O(N_t \cdot N_x)\) | \(O(\Delta t)\) | Forward induction |
| Finite differences | \(O(N_t \cdot N_x)\) | \(O(\Delta t^2 + \Delta x^2)\) | Iterative |
| Monte Carlo | \(O(M \cdot N_t)\) | \(O(1/\sqrt{M})\) | Iterative |
The computational cost of BK bond pricing is orders of magnitude higher than for affine models. A single bond price that takes nanoseconds with the CIR formula requires milliseconds on a tree. This cost multiplies for derivative pricing (which requires many bond prices) and calibration (which requires many derivative prices).
The speed penalty is real
For a cap with 40 caplets, each requiring a bond price, and calibration requiring 100 iterations of 40 cap prices: the total is \(\sim 4000\) bond prices per calibration. With trees this takes seconds; with CIR formulas, microseconds. For real-time risk management, this speed difference matters.
Summary¶
The Black-Karasinski model requires numerical methods for bond pricing because the non-affine structure prevents closed-form solutions. The three standard approaches --- trinomial trees, finite differences, and Monte Carlo --- offer different tradeoffs between accuracy, speed, and ease of calibration. Trees are the most common in practice due to their natural integration with the forward-induction calibration of \(\theta(t)\). Finite differences provide higher-order convergence, while Monte Carlo handles path-dependent products. Asymptotic approximations offer quick estimates for short maturities but lack the accuracy needed for precision pricing. The computational cost of BK bond pricing, while manageable with modern hardware, is orders of magnitude higher than for affine models and represents the practical price of non-negativity and exact term structure fitting.
Exercises¶
Exercise 1. Starting from the bond pricing PDE in the log-rate variable,
attempt the affine ansatz \(g(t,x) = \exp(A(t) + B(t)\,x)\) and show that it fails. Specifically, substitute the ansatz and demonstrate that the resulting equation cannot be separated into terms depending only on \(t\) and terms depending only on \(x\).
Solution to Exercise 1
We try the affine ansatz \(g(t,x) = \exp(A(t) + B(t)\,x)\) in the PDE
Computing the partial derivatives of the ansatz:
Substituting into the PDE and dividing by \(g\) (which is positive):
Collecting terms:
For this equation to hold for all \(x\), each functionally independent term must vanish separately. The functions \(1\), \(x\), and \(e^x\) are linearly independent over any interval in \(x\). Therefore we would need:
- Coefficient of \(e^x\): \(-1 = 0\) --- a contradiction
This is immediate and definitive. The discounting term \(e^x g\) produces an \(e^x\) that is not in the span of \(\{1, x\}\) and cannot be absorbed by any choice of \(A(t)\) and \(B(t)\). The affine ansatz fails because the exponential discounting rate \(e^x\) is incommensurate with the linear state variable \(x\) used in the ansatz. In an affine model, the discounting term would be \(x \cdot g\) (linear in the state), which integrates naturally with the ansatz.
Exercise 2. On a BK trinomial tree with \(\Delta t = 0.5\) year, the log-rate at a node is \(x_j = \ln(0.06)\), and the three successor node bond values are \(g_{j+1} = 0.945\), \(g_j = 0.950\), \(g_{j-1} = 0.955\), with branching probabilities \(p_u = 0.1667\), \(p_m = 0.6667\), \(p_d = 0.1667\). Compute the bond price \(g(t_k, x_j)\) using the backward induction formula.
Solution to Exercise 2
The backward induction formula is
Given values: \(x_j = \ln(0.06)\), \(\Delta t = 0.5\), \(g_{j+1} = 0.945\), \(g_j = 0.950\), \(g_{j-1} = 0.955\), \(p_u = p_d = 0.1667\), \(p_m = 0.6667\).
Step 1: Compute the expected continuation value:
Step 2: Compute the one-period discount factor. The short rate at this node is \(r = e^{x_j} = 0.06\):
Step 3: Compute the bond price:
The bond price at this node is approximately \(0.9220\).
Exercise 3. Write out the Crank-Nicolson finite difference equation for the interior node \(j\) at time step \(k \to k+1\), using the spatial operator \(\mathcal{L}^k\) defined in the text. If \(\Delta x = 0.05\), \(\Delta t = 0.01\), \(a = 0.10\), \(\sigma = 0.20\), \(\theta(t_k) = -0.25\), and \(x_j = \ln(0.05)\), compute the coefficients of \(g_{j-1}^{k+1}\), \(g_j^{k+1}\), and \(g_{j+1}^{k+1}\) on the implicit side of the scheme.
Solution to Exercise 3
The Crank-Nicolson scheme is
The spatial operator at time step \(k\) is
With \(\Delta x = 0.05\), \(\Delta t = 0.01\), \(a = 0.10\), \(\sigma = 0.20\), \(\theta_k = -0.25\), \(x_j = \ln(0.05) \approx -2.9957\):
Compute the drift coefficient:
Compute the auxiliary coefficients:
The operator acting on node \(j\) gives coefficients for \(g_{j-1}\), \(g_j\), \(g_{j+1}\):
Rearranging Crank-Nicolson to isolate the implicit (time \(k+1\)) side:
The implicit-side coefficients for \(g_{j-1}^{k+1}\), \(g_j^{k+1}\), \(g_{j+1}^{k+1}\) are:
The implicit-side equation at node \(j\) is
Exercise 4. Using the short-maturity expansion
compute \(P(t,T)\) for \(r_t = 5\%\), \(\tau = 0.25\) year, \(\theta(t) = -0.25\), \(a = 0.10\), and \(\sigma = 0.20\). Compare this with the naive approximation \(e^{-r_t \tau}\) and quantify the difference.
Solution to Exercise 4
With \(r_t = 0.05\), \(\tau = 0.25\), \(\theta(t) = -0.25\), \(a = 0.10\), \(\sigma = 0.20\):
Short-maturity expansion:
Computing the bracketed terms:
Now assembling the expansion:
Naive approximation:
Difference:
The model-specific correction adds approximately 1.1 basis points to the bond price. This correction arises from the drift and volatility terms in the BK dynamics and represents the convexity adjustment for the stochastic path of rates over the short period \([t, T]\). For a 3-month maturity, the difference is small, validating the use of the naive approximation \(e^{-r_t\tau}\) as a quick estimate.
Exercise 5. A cap with 40 quarterly caplets must be calibrated. Each calibration iteration requires pricing all 40 caplets, and convergence takes 100 iterations. If one trinomial tree bond price takes \(1\) ms and one affine-model bond price takes \(1\) ns, compute the total calibration time for both the BK model and an affine model (e.g., CIR). Express the ratio.
Solution to Exercise 5
A cap with 40 quarterly caplets requires pricing 40 caplet payoffs. Each caplet pricing involves a bond price computation (for the LIBOR rate at the reset date) and the backward induction of the caplet payoff. Each calibration iteration prices all 40 caplets, and convergence requires 100 iterations.
Total number of bond price evaluations: \(40 \times 100 = 4{,}000\).
BK model (trinomial tree): Each bond price takes 1 ms.
Affine model (e.g., CIR): Each bond price takes 1 ns.
Ratio:
The BK calibration is approximately one million times slower than the affine model calibration. In absolute terms, the BK calibration takes 4 seconds, which is acceptable for a daily calibration but would be prohibitive if repeated thousands of times (e.g., in a nested simulation for CVA). The affine model calibration at 4 microseconds is essentially instantaneous and can be run in real time.
Exercise 6. In the log-normal approximation, \(P(t,T) \approx \exp(-\mu_Y + \frac{1}{2}\sigma_Y^2)\) where \(Y = \int_t^T r_s\,ds\). Show that this approximation is exact when \(Y\) is Gaussian (as would be the case if \(r_s\) were a Gaussian process). For the BK model, explain why \(Y\) is not Gaussian and discuss the sign of the error introduced by this approximation for typical parameter values.
Solution to Exercise 6
The approximation \(P(t,T) \approx \exp(-\mu_Y + \frac{1}{2}\sigma_Y^2)\) is the moment-generating function of \(Y\) evaluated at \(\lambda = 1\), under the assumption that \(Y\) is Gaussian.
Exactness for Gaussian \(Y\): If \(Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2)\), then by the standard Gaussian MGF:
This is exact, not an approximation. In the Vasicek/Hull-White model, \(r_s\) is Gaussian, so \(Y = \int_t^T r_s\,ds\) (an integral of a Gaussian process) is Gaussian, and the formula gives the exact bond price.
Why \(Y\) is not Gaussian in BK: In the BK model, \(r_s = e^{x_s}\) where \(x_s\) is Gaussian. The integral \(Y = \int_t^T e^{x_s}\,ds\) is a sum of log-normal random variables. The sum (integral) of log-normal random variables is not log-normal, and certainly not Gaussian. The distribution of \(Y\) has:
- A heavier right tail than Gaussian (inherited from the log-normal \(r_s\))
- Right skewness (since \(e^{x_s}\) is right-skewed)
- A support bounded below (since \(Y > 0\))
Sign of the error: The log-normal approximation uses only the first two moments (\(\mu_Y\), \(\sigma_Y^2\)) and assumes Gaussian tails. For the bond price \(\mathbb{E}[e^{-Y}]\), the function \(e^{-Y}\) is convex and decreasing. The true distribution of \(Y\) has a heavier right tail than the Gaussian with the same mean and variance. Paths with very large \(Y\) (high integrated rates) contribute negligibly to \(\mathbb{E}[e^{-Y}]\) (since \(e^{-Y} \approx 0\) for large \(Y\)), while the heavier right tail of the true distribution moves probability mass away from moderate values of \(Y\) where \(e^{-Y}\) is significant. The net effect is that the Gaussian approximation typically overestimates the bond price (underestimates the integrated rate's impact), because it underweights the heavy-tailed scenarios. The error is small for short maturities and moderate volatilities but grows with \(\sigma\) and \(T\).
Exercise 7. Compare the boundary conditions for the finite difference PDE solver: \(g \approx 1\) at \(x_{\min}\) (very low rates) and \(g \approx 0\) at \(x_{\max}\) (very high rates). Justify these conditions economically. What happens to the bond price as \(r_t \to 0\)? What happens as \(r_t \to \infty\)? How should \(x_{\min}\) and \(x_{\max}\) be chosen relative to the model parameters \(a\), \(\sigma\), and \(\theta(t)\) to ensure negligible truncation error?
Solution to Exercise 7
Boundary condition at \(x_{\min}\) (\(g \approx 1\)): When \(x = x_{\min}\) is very negative, the short rate is \(r = e^{x_{\min}} \approx 0\). With near-zero interest rates, there is essentially no discounting between \(t\) and \(T\):
Economically, a zero interest rate means money has no time value, so a bond paying $1 at maturity is worth $1 today. Furthermore, with strong mean reversion, even if the rate starts very low, it will revert toward the mean, but the discounting effect during the period when \(r\) is near zero contributes negligibly to \(\int r_s\,ds\).
Boundary condition at \(x_{\max}\) (\(g \approx 0\)): When \(x = x_{\max}\) is very large, the short rate \(r = e^{x_{\max}}\) is enormous. The discount factor \(\exp(-\int_t^T r_s\,ds)\) is dominated by the initial high rate:
For very high \(r_t\), this integral is huge, and \(e^{-\int r_s\,ds} \approx 0\). Economically, extremely high interest rates mean money has enormous time value, so a future payment of $1 is worth essentially nothing today.
Choosing \(x_{\min}\) and \(x_{\max}\): The boundaries should be placed where the probability of the log rate reaching them is negligible. The long-run distribution of \(x_t\) is approximately \(\mathcal{N}(\theta/a, \sigma^2/(2a))\), with standard deviation \(\sigma/\sqrt{2a}\). A standard choice is
where \(n_{\text{sd}}\) is the number of standard deviations (typically 4--6). With \(n_{\text{sd}} = 5\), the probability of reaching the boundary is less than \(3 \times 10^{-7}\) per time step, ensuring negligible truncation error. If \(\theta(t)\) varies, use the extremes of \(\theta(t)/a\) over the time horizon and add the appropriate volatility buffer. The grid should also be wide enough that the boundary condition errors (from approximating \(g\) as exactly 0 or 1 at the boundary) do not propagate significantly into the interior solution region.