Tree Construction for Hull-White¶
Trinomial trees provide a lattice-based numerical method for pricing interest rate derivatives, particularly those with early exercise features where Monte Carlo methods are less natural. The Hull-White trinomial tree, introduced by Hull and White (1994), discretizes the short rate process on a recombining grid. The construction proceeds in two stages: first, build a symmetric tree for a zero-mean Ornstein-Uhlenbeck process \(x_t\); second, shift each time slice to fit the initial yield curve. This section covers the first stage --- the tree geometry, node spacing, and branching probabilities.
Zero-Mean Process¶
The Hull-White short rate \(r_t\) satisfies \(dr_t = [\theta(t) - \lambda r_t]\,dt + \sigma\,dW_t\). To separate the time-dependent drift from the stochastic dynamics, define the zero-mean process
\[
x_t = r_t - \alpha(t),
\]
where \(\alpha(t)\) absorbs \(\theta(t)\) and will be determined later to fit the initial curve. The process \(x_t\) satisfies
\[
dx_t = -\lambda x_t\,dt + \sigma\,dW_t, \qquad x_0 = 0.
\]
The tree is first constructed for \(x_t\), then shifted by \(\alpha(t_i)\) at each time step \(t_i\) so that \(r_{ij} = \alpha_i + j\,\Delta x\) at node \((i, j)\).
Node Spacing¶
The spatial step \(\Delta x\) is chosen so that the trinomial tree matches the variance of \(x_t\) over one time step. The conditional variance of \(x_{t+\Delta t}\) given \(x_t\) is
\[
\operatorname{Var}[x_{t+\Delta t} \mid x_t] = \frac{\sigma^2}{2\lambda}\left(1 - e^{-2\lambda\Delta t}\right) \approx \sigma^2\,\Delta t
\]
for small \(\Delta t\). The standard choice for the spatial step is
\[
\Delta x = \sigma\sqrt{3\,\Delta t}.
\]
This ensures that the branching probabilities remain positive and well-behaved. The factor \(\sqrt{3}\) arises from matching the second moment of a trinomial distribution with three equally-spaced outcomes.
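As a quick numerical check (a standalone sketch, not code from this text), the variance-matching role of \(\Delta x\) can be verified at the central node, where the branching probabilities reduce to the symmetric triple \(1/6, 2/3, 1/6\):

```python
import math

def dx_step(sigma: float, dt: float) -> float:
    """Spatial step: Delta x = sigma * sqrt(3 * dt)."""
    return sigma * math.sqrt(3.0 * dt)

sigma, dt = 0.01, 1.0
dx = dx_step(sigma, dt)

# At j = 0 the normal-branching probabilities reduce to 1/6, 2/3, 1/6.
p_u, p_m, p_d = 1/6, 2/3, 1/6

# Second moment of the trinomial step about its (zero) mean:
second_moment = p_u * dx**2 + p_m * 0.0**2 + p_d * (-dx)**2

# It reproduces the continuous-time variance sigma^2 * dt.
assert abs(second_moment - sigma**2 * dt) < 1e-15
```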
Branching Geometry¶
At each node \((i, j)\), the process can move to one of three nodes at time \(t_{i+1}\). The branching pattern depends on the position of node \(j\) relative to the center of the grid.
Normal branching (used at most nodes): from node \(j\), the process moves to nodes \(j+1\) (up), \(j\) (middle), or \(j-1\) (down).
Up branching (used near the upper boundary): from node \(j\), the process moves to nodes \(j\) (up), \(j-1\) (middle), or \(j-2\) (down). This pattern prevents the tree from growing unboundedly.
Down branching (used near the lower boundary): from node \(j\), the process moves to nodes \(j+2\) (up), \(j+1\) (middle), or \(j\) (down).
The boundary between normal and non-normal branching occurs at
\[
j_{\max} = \text{smallest integer greater than } \frac{0.1844}{\lambda\,\Delta t},
\]
where the constant \(0.1844\) is chosen to ensure positive probabilities. Non-normal branching is used at the boundary nodes \(j = \pm j_{\max}\); the tree is truncated there and never extends beyond \(|j| = j_{\max}\).
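A small helper (an illustrative sketch; `compute_jmax` is not a library function) computes \(j_{\max}\) as the smallest integer strictly greater than \(0.1844/(\lambda\Delta t)\):

```python
import math

def compute_jmax(lam: float, dt: float) -> int:
    """Smallest integer strictly greater than 0.1844 / (lam * dt)."""
    return math.floor(0.1844 / (lam * dt)) + 1

# Example: lam = 0.1, dt = 1 gives 0.1844 / 0.1 = 1.844, so jmax = 2.
jmax = compute_jmax(0.1, 1.0)
```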
Branching Probabilities¶
The branching probabilities are determined by matching the first two moments and the normalization condition of the continuous-time process over one time step \(\Delta t\).
For a node at position \(j\) (representing \(x = j\,\Delta x\)), the conditional mean and variance of \(x_{t+\Delta t}\) are
\[
\mathbb{E}[x_{t+\Delta t} \mid x_t = j\,\Delta x] = j\,\Delta x\,e^{-\lambda \Delta t},
\qquad
\operatorname{Var}[x_{t+\Delta t} \mid x_t] = \frac{\sigma^2}{2\lambda}\left(1 - e^{-2\lambda \Delta t}\right).
\]
Define \(M = j\,\Delta x\,e^{-\lambda \Delta t}\) and \(V = \frac{\sigma^2}{2\lambda}(1 - e^{-2\lambda \Delta t})\).
Normal branching (\(j \to j+1, j, j-1\)): the probabilities \(p_u, p_m, p_d\) satisfy
\[
p_u + p_m + p_d = 1,
\qquad
(p_u - p_d)\,\Delta x = M - j\,\Delta x,
\qquad
(p_u + p_d)\,\Delta x^2 = V + (M - j\,\Delta x)^2.
\]
Defining \(\hat{\jmath} = j\,e^{-\lambda \Delta t}\) (the mean position in grid units) and \(\delta = \hat{\jmath} - j\) (the mean displacement relative to the current node, in grid units), the explicit formulas for normal branching are
\[
p_u = \frac{1}{2}\left(\frac{V}{\Delta x^2} + \delta^2 + \delta\right),
\qquad
p_m = 1 - \frac{V}{\Delta x^2} - \delta^2,
\qquad
p_d = \frac{1}{2}\left(\frac{V}{\Delta x^2} + \delta^2 - \delta\right).
\]
In practice, using the small-\(\Delta t\) approximations \(e^{-\lambda\Delta t} \approx 1 - \lambda\Delta t\) and \(V \approx \sigma^2\Delta t\) (so that \(V/\Delta x^2 \approx 1/3\) and \(\delta \approx -j\lambda\Delta t\)), the normal branching probabilities become
\[
p_u = \frac{1}{6} + \frac{j^2\lambda^2\Delta t^2 - j\lambda\Delta t}{2},
\qquad
p_m = \frac{2}{3} - j^2\lambda^2\Delta t^2,
\qquad
p_d = \frac{1}{6} + \frac{j^2\lambda^2\Delta t^2 + j\lambda\Delta t}{2}.
\]
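The small-\(\Delta t\) formulas translate directly into code (a sketch; `normal_probs` is an illustrative name, not from any library):

```python
def normal_probs(j: int, lam: float, dt: float) -> tuple[float, float, float]:
    """Normal-branching probabilities (p_u, p_m, p_d) at node index j,
    using the small-dt approximation with a = j * lam * dt."""
    a = j * lam * dt
    p_u = 1/6 + (a * a - a) / 2
    p_m = 2/3 - a * a
    p_d = 1/6 + (a * a + a) / 2
    return p_u, p_m, p_d

# The three probabilities sum to one for any j (normalization holds exactly).
for j in range(-10, 11):
    p_u, p_m, p_d = normal_probs(j, 0.1, 0.25)
    assert abs(p_u + p_m + p_d - 1.0) < 1e-12
```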
Probability Positivity
The probabilities must satisfy \(0 \leq p_u, p_m, p_d \leq 1\). For normal branching, this holds when \(j^2\lambda^2\Delta t^2 < 2/3\), i.e. when \(|j|\lambda\Delta t < \sqrt{2/3} \approx 0.816\). The truncation level \(j_{\max}\) lies well inside this region; at the boundary nodes \(j = \pm j_{\max}\), non-normal branching must be used.
Up branching (\(j \to j, j-1, j-2\)): with \(a = j\lambda\Delta t\) in the small-\(\Delta t\) approximation, the moment-matching equations give
\[
p_u = \frac{7}{6} + \frac{a^2 - 3a}{2},
\qquad
p_m = -\frac{1}{3} - a^2 + 2a,
\qquad
p_d = \frac{1}{6} + \frac{a^2 - a}{2}.
\]
Down branching (\(j \to j+2, j+1, j\)): by symmetry with up branching (with \(a = j\lambda\Delta t\), which is negative at the lower boundary),
\[
p_u = \frac{1}{6} + \frac{a^2 + a}{2},
\qquad
p_m = -\frac{1}{3} - a^2 - 2a,
\qquad
p_d = \frac{7}{6} + \frac{a^2 + 3a}{2}.
\]
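The two boundary patterns can be coded the same way (a sketch with illustrative names; each triple is ordered from the highest destination node to the lowest):

```python
def up_probs(j: int, lam: float, dt: float) -> tuple[float, float, float]:
    """Up-branching probabilities for the pattern j -> (j, j-1, j-2),
    used at the upper boundary j = +j_max."""
    a = j * lam * dt
    return (7/6 + (a * a - 3 * a) / 2,
            -1/3 - a * a + 2 * a,
            1/6 + (a * a - a) / 2)

def down_probs(j: int, lam: float, dt: float) -> tuple[float, float, float]:
    """Down-branching probabilities for the pattern j -> (j+2, j+1, j),
    used at the lower boundary j = -j_max."""
    a = j * lam * dt
    return (1/6 + (a * a + a) / 2,
            -1/3 - a * a - 2 * a,
            7/6 + (a * a + 3 * a) / 2)

# Both triples are normalized, and the two boundaries mirror each other.
pu, pm, pd = up_probs(2, 0.1, 1.0)
qu, qm, qd = down_probs(-2, 0.1, 1.0)
assert abs(pu + pm + pd - 1.0) < 1e-12
assert abs(qu + qm + qd - 1.0) < 1e-12
assert abs(pu - qd) < 1e-12 and abs(pm - qm) < 1e-12 and abs(pd - qu) < 1e-12
```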
Recombining Property¶
The tree is recombining because the spatial grid is uniform: a sequence of up-middle-down moves returns to the same node regardless of the order. At time step \(i\), the node index \(j\) ranges from \(-j_{\max,i}\) to \(j_{\max,i}\), where \(j_{\max,i}\) grows initially (as the process diffuses) but is bounded by the non-normal branching boundaries.
The total number of nodes at time step \(i\) is at most \(2j_{\max} + 1\), independent of \(i\) after the initial growth phase. This keeps the tree size manageable: \(O(N \cdot j_{\max})\) total nodes for \(N\) time steps.
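The node-count bound can be sketched as follows (an illustration assuming the tree widens by one index level per side per step until the cap; `nodes_at_step` is a hypothetical helper):

```python
def nodes_at_step(i: int, jmax: int) -> int:
    """Number of nodes at time step i: the width grows by one index
    per side each step until it is capped at j_max."""
    return 2 * min(i, jmax) + 1

# With jmax = 2 the widths are 1, 3, 5, 5, 5, ... -- bounded after step 2.
widths = [nodes_at_step(i, 2) for i in range(6)]
assert widths == [1, 3, 5, 5, 5, 5]

# Total node count over N steps is O(N * jmax).
total = sum(nodes_at_step(i, 2) for i in range(100))
```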
Visualization of the Trinomial Tree¶
The tree structure between two consecutive time slices is (diagonal branches omitted for clarity):

      j_max      ●───●    (up branching: to j, j-1, j-2)
    j_max-1      ●───●    (normal branching)
        ...       ...
   -j_max+1      ●───●    (normal branching)
     -j_max      ●───●    (down branching: to j+2, j+1, j)
                t_i  t_{i+1}
Each node connects to three successor nodes, with the branching pattern (normal, up, or down) determined by the node's position \(j\).
Small Tree Construction
Consider \(\lambda = 0.1\), \(\sigma = 0.01\), \(\Delta t = 1\). Then:
- \(\Delta x = 0.01\sqrt{3} \approx 0.01732\)
- \(0.1844/(\lambda\Delta t) = 1.844\), so \(j_{\max} = 2\) (the smallest integer greater than \(1.844\))
- After the initial growth phase, nodes are at \(j \in \{-2, -1, 0, 1, 2\}\)
- For \(j = 0\) (normal branching): \(p_u = p_d = 1/6\), \(p_m = 2/3\)
- For \(j = 1\) (normal branching): \(a = j\lambda\Delta t = 0.1\), giving \(p_u = \frac{1}{6} + \frac{0.01 - 0.1}{2} \approx 0.1217\), \(p_m = \frac{2}{3} - 0.01 \approx 0.6567\), \(p_d = \frac{1}{6} + \frac{0.01 + 0.1}{2} \approx 0.2217\)
- For \(j = 2\) (up branching at the boundary): \(a = 0.2\), giving \(p_u \approx 0.8867\), \(p_m \approx 0.0267\), \(p_d \approx 0.0867\)
The short rate at node \((i, j)\) is \(r_{ij} = \alpha_i + j \cdot 0.01732\), where \(\alpha_i\) is determined by calibration to the initial yield curve (covered in the next section).
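The numbers in this example can be reproduced with a few lines (a standalone sketch using the small-\(\Delta t\) formulas; variable names are illustrative):

```python
import math

lam, sigma, dt = 0.1, 0.01, 1.0

dx = sigma * math.sqrt(3 * dt)                 # spatial step, ~0.01732
jmax = math.floor(0.1844 / (lam * dt)) + 1     # smallest integer > 1.844

def normal_probs(j):
    a = j * lam * dt
    return (1/6 + (a*a - a)/2, 2/3 - a*a, 1/6 + (a*a + a)/2)

p0 = normal_probs(0)   # central node: (1/6, 2/3, 1/6)
p1 = normal_probs(1)   # interior node j = 1
```

Here `jmax` comes out to 2 and `p1` is approximately (0.1217, 0.6567, 0.2217); note the up and down probabilities at \(j = 1\) are not equal, because the mean-reversion drift term \(j\lambda\Delta t\) enters with opposite signs.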
Summary¶
The Hull-White trinomial tree construction discretizes the zero-mean OU process \(x_t\) on a recombining grid with spacing \(\Delta x = \sigma\sqrt{3\Delta t}\). Three branching patterns (normal, up, down) ensure positive probabilities everywhere: the tree is truncated at \(\pm j_{\max}\), and non-normal branching is used at the boundary nodes. The branching probabilities are determined by matching the conditional mean and variance of the continuous-time process. The resulting tree has \(O(N \cdot j_{\max})\) nodes total, where \(j_{\max}\) is set by the mean-reversion parameter. The next section covers calibrating the shift \(\alpha_i\) at each time slice to fit the initial yield curve.
Exercises¶
Exercise 1. For \(\lambda = 0.05\), \(\sigma = 0.01\), and \(\Delta t = 0.5\), compute \(\Delta x\), \(j_{\max}\), and the total number of nodes at a single time step. How does \(j_{\max}\) change if \(\lambda\) is doubled to \(0.10\)?
Solution to Exercise 1
With \(\lambda = 0.05\), \(\sigma = 0.01\), and \(\Delta t = 0.5\):
Spatial step:
\[
\Delta x = \sigma\sqrt{3\Delta t} = 0.01\sqrt{1.5} \approx 0.01225.
\]
Maximum node index: \(\lambda\Delta t = 0.025\), so \(0.1844/0.025 = 7.376\) and
\[
j_{\max} = 8 \quad (\text{the smallest integer greater than } 7.376).
\]
Total number of nodes at a single time step:
\[
2 j_{\max} + 1 = 2(8) + 1 = 17.
\]
If \(\lambda\) is doubled to \(0.10\), then \(\lambda\Delta t = 0.05\), \(0.1844/0.05 = 3.688\), and
\[
j_{\max} = 4.
\]
The total number of nodes at a single time step decreases to \(2(4) + 1 = 9\). Doubling the mean-reversion speed roughly halves \(j_{\max}\), because stronger mean reversion pulls the process back toward zero more aggressively, requiring fewer grid points to capture the distribution.
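The arithmetic can be checked with a short script (a sketch; `tree_width` is an illustrative helper, not a library function):

```python
import math

def tree_width(lam: float, sigma: float, dt: float):
    """Return (dx, jmax, nodes_per_step) for the Hull-White tree."""
    dx = sigma * math.sqrt(3 * dt)
    jmax = math.floor(0.1844 / (lam * dt)) + 1   # smallest integer > threshold
    return dx, jmax, 2 * jmax + 1

dx, jmax, nodes = tree_width(0.05, 0.01, 0.5)
# dx ~ 0.01225, jmax = 8, nodes = 17

dx2, jmax2, nodes2 = tree_width(0.10, 0.01, 0.5)
# doubling lambda: jmax = 4, nodes = 9
```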
Exercise 2. Verify that the normal branching probabilities \(p_u = \frac{1}{6} + \frac{j^2\lambda^2\Delta t^2 - j\lambda\Delta t}{2}\), \(p_m = \frac{2}{3} - j^2\lambda^2\Delta t^2\), \(p_d = \frac{1}{6} + \frac{j^2\lambda^2\Delta t^2 + j\lambda\Delta t}{2}\) sum to \(1\) for all \(j\). Then show that \(p_m\) becomes negative when \(j^2\lambda^2\Delta t^2 > 2/3\), confirming the need for non-normal branching.
Solution to Exercise 2
Summing the three probabilities:
\[
p_u + p_m + p_d
= \left(\frac{1}{6} + \frac{j^2\lambda^2\Delta t^2 - j\lambda\Delta t}{2}\right)
+ \left(\frac{2}{3} - j^2\lambda^2\Delta t^2\right)
+ \left(\frac{1}{6} + \frac{j^2\lambda^2\Delta t^2 + j\lambda\Delta t}{2}\right).
\]
Collecting constant terms: \(\frac{1}{6} + \frac{2}{3} + \frac{1}{6} = \frac{1 + 4 + 1}{6} = 1\).
Collecting \(j^2\lambda^2\Delta t^2\) terms: \(\frac{1}{2} - 1 + \frac{1}{2} = 0\).
Collecting \(j\lambda\Delta t\) terms: \(-\frac{1}{2} + \frac{1}{2} = 0\).
Therefore \(p_u + p_m + p_d = 1\) for all \(j\).
For the middle probability, \(p_m = \frac{2}{3} - j^2\lambda^2\Delta t^2\). This becomes negative when
\[
j^2\lambda^2\Delta t^2 > \frac{2}{3},
\quad\text{i.e.}\quad
|j|\,\lambda\Delta t > \sqrt{2/3} \approx 0.816.
\]
When \(p_m < 0\), the probability triple is no longer valid, confirming that normal branching cannot be used for arbitrarily large \(|j|\); truncating the tree at \(\pm j_{\max}\) with non-normal branching at the boundary keeps every node well inside the valid region.
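A numerical sweep confirms both claims (a standalone sketch; `normal_probs` here takes \(a = j\lambda\Delta t\) directly):

```python
# Normal-branching probabilities in the small-dt approximation.
def normal_probs(a: float):
    """Probabilities (p_u, p_m, p_d) as functions of a = j * lam * dt."""
    return (1/6 + (a*a - a)/2, 2/3 - a*a, 1/6 + (a*a + a)/2)

crossed = None
for k in range(0, 2001):
    a = k / 1000.0
    p_u, p_m, p_d = normal_probs(a)
    # The three probabilities always sum to 1...
    assert abs(p_u + p_m + p_d - 1.0) < 1e-12
    # ...but p_m first turns negative just past a = sqrt(2/3) ~ 0.8165.
    if p_m < 0 and crossed is None:
        crossed = a

assert 0.816 < crossed < 0.818
```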
Exercise 3. The spatial step \(\Delta x = \sigma\sqrt{3\Delta t}\) is chosen to match the variance of the continuous-time process. Derive this choice by requiring that the second moment of the trinomial distribution \(p_u(\Delta x)^2 + p_m(0)^2 + p_d(-\Delta x)^2\) equals \(\sigma^2\Delta t\) when \(j = 0\).
Solution to Exercise 3
At node \(j = 0\) with normal branching, the conditional mean is zero (since \(M = 0 \cdot e^{-\lambda\Delta t} = 0\)). By symmetry at \(j = 0\), the probabilities simplify to \(p_u = p_d = 1/6\) and \(p_m = 2/3\).
The second moment of the trinomial distribution about the mean (which is zero) is
\[
p_u(\Delta x)^2 + p_m(0)^2 + p_d(-\Delta x)^2
= \frac{1}{6}\Delta x^2 + 0 + \frac{1}{6}\Delta x^2
= \frac{\Delta x^2}{3}.
\]
Setting this equal to the continuous-time variance \(\sigma^2\Delta t\):
\[
\frac{\Delta x^2}{3} = \sigma^2\Delta t.
\]
Solving for \(\Delta x\):
\[
\Delta x = \sigma\sqrt{3\Delta t}.
\]
This confirms that the factor \(\sqrt{3}\) arises from matching the variance of a trinomial distribution with three equally-spaced outcomes at the central node.
Exercise 4. For the up branching pattern (\(j \to j, j-1, j-2\)), derive the probabilities by solving the three moment-matching equations: normalization, mean-matching, and variance-matching. Verify that the formulas given in the text are correct.
Solution to Exercise 4
For up branching, node \(j\) connects to \(j\), \(j-1\), \(j-2\) at the next time step. Let \(p_u\), \(p_m\), \(p_d\) be the probabilities of moving to \(j\), \(j-1\), \(j-2\) respectively. Define \(M = j\,e^{-\lambda\Delta t}\Delta x\) and \(V = \sigma^2\Delta t\) (using the small-\(\Delta t\) approximations).
Normalization:
\[
p_u + p_m + p_d = 1.
\]
Mean matching (the expected position must equal \(M\)):
\[
p_u\,j\,\Delta x + p_m\,(j-1)\,\Delta x + p_d\,(j-2)\,\Delta x = M.
\]
Dividing by \(\Delta x\) and setting \(\hat\jmath = j\,e^{-\lambda\Delta t} \approx j(1 - \lambda\Delta t)\):
\[
p_u\,j + p_m\,(j-1) + p_d\,(j-2) = \hat\jmath.
\]
Variance matching (second moment about the mean, in grid units, with \(a = j\lambda\Delta t\) so that \(\hat\jmath \approx j - a\)):
\[
p_u\,a^2 + p_m\,(a-1)^2 + p_d\,(a-2)^2 = \frac{V}{\Delta x^2}.
\]
Substituting \(\Delta x^2 = 3\sigma^2\Delta t\) gives \(V/\Delta x^2 = 1/3\).
Solving the \(3 \times 3\) linear system by substitution:
From the mean equation: \(p_u\,j + p_m(j-1) + p_d(j-2) = j - a\), so using normalization: \(j - p_m - 2p_d = j - a\), giving \(p_m + 2p_d = a\).
Expanding the variance equation and applying normalization and the mean condition gives \(p_m + 4p_d = \frac{1}{3} + a^2\). Subtracting the two conditions yields \(2p_d = \frac{1}{3} + a^2 - a\), and back-substituting:
\[
p_u = \frac{7}{6} + \frac{a^2 - 3a}{2},
\qquad
p_m = -\frac{1}{3} - a^2 + 2a,
\qquad
p_d = \frac{1}{6} + \frac{a^2 - a}{2}.
\]
These match the formulas given in the text.
Exercise 5. Explain why the tree is recombining. Show that after an up-middle-down sequence and a down-middle-up sequence, the process returns to the same node. What advantage does the recombining property provide for computational complexity?
Solution to Exercise 5
The tree is recombining because the spatial grid is uniform with spacing \(\Delta x\), so a move up increases the node index by 1 and a move down decreases it by 1 (under normal branching). The final node index depends only on the total number of up and down moves, not on their order.
Up-middle-down sequence: Starting at node \(j\), an up move goes to \(j+1\), a middle move stays at \(j+1\), a down move goes to \(j+1-1 = j\). Final position: \(j\).
Down-middle-up sequence: Starting at node \(j\), a down move goes to \(j-1\), a middle move stays at \(j-1\), an up move goes to \(j-1+1 = j\). Final position: \(j\).
Both sequences end at the same node \(j\), confirming that the tree recombines.
Computational advantage: A recombining tree has \(O(j_{\max})\) nodes per time step, giving \(O(N \cdot j_{\max})\) total nodes for \(N\) time steps. A non-recombining tree has \(3^i\) nodes at time step \(i\) (each node spawning 3 successors), giving \(O(3^N)\) total nodes --- exponential in the number of steps. For \(N = 100\), this is the difference between roughly \(10^3\) nodes (recombining) and \(10^{47}\) nodes (non-recombining).
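The node-count comparison is easy to sanity-check numerically (a sketch; helper names are illustrative):

```python
# Recombining: at most 2*jmax + 1 nodes per step after the growth phase.
def recombining_total(n_steps: int, jmax: int) -> int:
    return sum(2 * min(i, jmax) + 1 for i in range(n_steps))

# Non-recombining: 3**i nodes at step i (every node spawns 3 children).
def nonrecombining_total(n_steps: int) -> int:
    return sum(3**i for i in range(n_steps))

n = 20
small = recombining_total(n, jmax=5)
huge = nonrecombining_total(n)

assert small < 250
assert huge > 10**9   # already astronomically larger at n = 20
```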
Exercise 6. The boundary constant \(0.1844\) in the definition of \(j_{\max}\) ensures positive probabilities. Derive this constant by finding the smallest value of \(j\lambda\Delta t\) at which all of the non-normal branching probabilities at the boundary nodes are nonnegative.
Solution to Exercise 6
For normal branching, the critical probability is \(p_m = \frac{2}{3} - j^2\lambda^2\Delta t^2\): setting \(p_m = 0\) gives \(|j|\lambda\Delta t = \sqrt{2/3} \approx 0.816\), so normal branching is valid for \(|j|\lambda\Delta t \leq 0.816\). The side probabilities never bind: for \(p_u = \frac{1}{6} + \frac{a^2 - a}{2}\) with \(a = j\lambda\Delta t\), the equation \(a^2 - a + \frac{1}{3} = 0\) has discriminant \(1 - \frac{4}{3} < 0\), so \(p_u > 0\) for all real \(a\), and the same argument applies to \(p_d\).
The binding constraint therefore comes from the boundary nodes. At \(j = j_{\max}\) the up-branching pattern is used, with middle probability \(p_m = -\frac{1}{3} - a^2 + 2a\) where \(a = j_{\max}\lambda\Delta t\). Requiring \(p_m \geq 0\):
\[
a^2 - 2a + \frac{1}{3} \leq 0
\quad\Longleftrightarrow\quad
1 - \sqrt{2/3} \;\leq\; a \;\leq\; 1 + \sqrt{2/3}.
\]
Since \(1 - \sqrt{2/3} \approx 0.1835\), positive probabilities at the boundary require \(j_{\max}\lambda\Delta t \geq 1 - \sqrt{2/3}\); combined with the normal-branching condition, any \(j_{\max}\) with \(j_{\max}\lambda\Delta t\) between roughly \(0.184\) and \(0.816\) works. The constant \(0.1844\) is a slightly conservative rounding of \(1 - \sqrt{2/3}\), and taking the smallest admissible integer \(j_{\max}\) keeps the tree as narrow as possible.
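The two thresholds can be located numerically (a sketch consistent with the derivation above; function names are illustrative):

```python
import math

def pm_normal(a: float) -> float:
    """Middle probability for normal branching, a = j * lam * dt."""
    return 2/3 - a*a

def pm_up(a: float) -> float:
    """Middle probability for up branching at the upper boundary."""
    return -1/3 - a*a + 2*a

lower = 1 - math.sqrt(2/3)   # ~0.1835: up-branching p_m >= 0 from here on
upper = math.sqrt(2/3)       # ~0.8165: normal-branching p_m >= 0 up to here

# p_m vanishes exactly at each threshold (up to floating-point error).
assert abs(pm_up(lower)) < 1e-12
assert abs(pm_normal(upper)) < 1e-12

# Any a in [lower, upper] gives valid probabilities for both patterns,
# which is why j_max * lam * dt is chosen just above 0.184.
a = 0.2
assert pm_up(a) > 0 and pm_normal(a) > 0
```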
Exercise 7. Compare the trinomial tree with a binomial tree for the Hull-White model. A binomial tree has only two successors per node. Explain why a binomial tree cannot simultaneously match the mean and variance of the OU process while maintaining positive probabilities and a recombining structure.
Solution to Exercise 7
A binomial tree has two successors per node: up and down. At each node, we have two probabilities \(p_u\) and \(p_d = 1 - p_u\), giving two free parameters (one probability and the spatial step \(\Delta x\)). We need to match three conditions:
- Normalization: \(p_u + p_d = 1\) (automatically satisfied with one probability)
- Mean matching: \(p_u(j+1)\Delta x + p_d(j-1)\Delta x = j\,e^{-\lambda\Delta t}\Delta x\)
- Variance matching: requires a specific relationship between \(\Delta x\) and \(\Delta t\)
The mean-matching equation gives \(p_u - p_d = j(e^{-\lambda\Delta t} - 1)\), so \(p_u = \frac{1}{2} + \frac{j(e^{-\lambda\Delta t} - 1)}{2}\). For the variance to match, we need \(\Delta x = \sigma\sqrt{\Delta t}\) (not \(\sigma\sqrt{3\Delta t}\)).
The problem arises because with only two parameters (\(p_u\) and \(\Delta x\)), we can match the mean and variance, but the probability \(p_u\) depends on \(j\). For large \(|j|\), the mean-reversion drift \(j(e^{-\lambda\Delta t} - 1)\) becomes large, and \(p_u\) falls outside \([0, 1]\).
In a trinomial tree, the third branch provides an extra degree of freedom, and the three branching patterns (normal, up, down) provide additional flexibility to keep all probabilities positive. The ability to switch branching patterns is the key mechanism that the binomial tree lacks. While a non-recombining binomial tree could handle this by adjusting step sizes, it would lose the \(O(N)\)-nodes-per-step property that makes lattice methods efficient.
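The failure mode can be demonstrated numerically (a sketch; `binom_pu` is an illustrative name):

```python
import math

lam, dt = 0.1, 1.0

def binom_pu(j: int) -> float:
    """Up-probability of a recombining binomial tree matching the OU mean:
    p_u - p_d = j * (exp(-lam*dt) - 1) with p_u + p_d = 1."""
    return 0.5 + j * (math.exp(-lam * dt) - 1.0) / 2.0

# Near the center the probabilities are fine...
assert 0.0 <= binom_pu(0) <= 1.0
assert 0.0 <= binom_pu(3) <= 1.0

# ...but far from the center the mean-reversion drift pushes p_u out of [0, 1].
assert binom_pu(-15) > 1.0
assert binom_pu(15) < 0.0
```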