PDE Formulation¶
Stochastic volatility models also admit a partial differential equation (PDE) formulation for option pricing. The PDE perspective is useful for theoretical analysis, boundary conditions, American options, and products not well-suited to Fourier methods.
Risk-Neutral Pricing PDE¶
Derivation from No-Arbitrage¶
Let \(V(t, S, v)\) be the price at time \(t\) of a European option when the spot price is \(S\) and instantaneous variance is \(v\).
By delta-vega hedging arguments (in the presence of a volatility-traded asset) or by the Feynman–Kac theorem, \(V\) satisfies:
with terminal condition:
where \(\Phi(S)\) is the payoff (e.g., \((S-K)^+\) for a call).
The Infinitesimal Generator¶
For a general two-factor stochastic volatility model:
the generator is:
Heston PDE¶
Specific Form¶
For the Heston model with \(a(v) = \kappa(\theta - v)\) and \(b(v) = \xi\sqrt{v}\):
Log-Price Transformation¶
Setting \(x = \log S\):
This form has constant coefficients in the second-order \(x\) terms, simplifying numerical treatment.
Boundary Conditions¶
In Asset Price (S)¶
As \(S \to 0\): - Call: \(V \to 0\) - Put: \(V \to Ke^{-r(T-t)}\)
As \(S \to \infty\): - Call: \(V \sim Se^{-q(T-t)} - Ke^{-r(T-t)}\) - Put: \(V \to 0\)
In Variance (v)¶
As \(v \to 0\):
The PDE degenerates (diffusion coefficient vanishes). Treatment depends on the Feller condition:
- Feller satisfied (\(2\kappa\theta \geq \xi^2\)): Boundary is unattainable; no explicit condition needed
- Feller violated: Boundary is attainable; specify behavior (typically: smooth continuation)
A common approach: use the limiting PDE as \(v \to 0\):
As \(v \to \infty\):
Option value becomes approximately linear in \(\sqrt{v}\). Common choices: - Neumann: \(\frac{\partial V}{\partial v} = 0\) - Linear extrapolation - Far boundary placed sufficiently far
Terminal Condition¶
For a European call: \(\Phi(S) = (S - K)^+\)
Numerical Methods¶
Finite Difference Discretization¶
Discretize the \((x, v)\) domain on a grid:
Denote \(V_{i,j}^n \approx V(t_n, x_i, v_j)\).
Standard Finite Differences¶
First derivatives:
Second derivatives:
Mixed derivative:
Time Stepping¶
Explicit scheme: Unstable for fine grids; CFL condition required
Implicit scheme: Unconditionally stable but requires solving linear systems
Crank–Nicolson: Second-order in time, but can produce oscillations
ADI (Alternating Direction Implicit): Preferred for 2D problems
ADI Schemes¶
Douglas–Rachford ADI¶
Split the operator: \(\mathcal{L} = \mathcal{L}_x + \mathcal{L}_v + \mathcal{L}_{xv}\)
Step 1 (implicit in \(x\)):
Step 2 (implicit in \(v\)):
Each step involves only tridiagonal systems (fast to solve).
Hundsdorfer–Verwer ADI¶
More sophisticated splitting with better stability:
with \(\theta = 1/2\) for second-order accuracy.
Mixed Derivative Treatment¶
The cross-derivative \(\frac{\partial^2 V}{\partial x \partial v}\) complicates ADI:
Option 1: Include in explicit part (may limit stability)
Option 2: Craig–Sneyd scheme (modified ADI for mixed terms)
Option 3: Coordinate transformation to remove mixed term
Coordinate Transformations¶
Removing the Mixed Derivative¶
The transformation:
can simplify the PDE by reducing the mixed derivative coefficient.
Non-Uniform Grids¶
Concentrate grid points where \(V\) varies most: - Near the strike (\(S \approx K\)) - Near \(v = 0\) (boundary layer) - Near \(v = v_0\) (current variance)
Common transformations:
American Options¶
Free Boundary Problem¶
American options require:
The early exercise boundary \(S^*(t, v)\) separates continuation and exercise regions.
Complementarity Formulation¶
Numerical Approaches¶
Projected SOR: After each time step, enforce \(V \geq \Phi\)
Penalty method: Replace constraint with penalty term
Policy iteration: Iterate between solving PDE and updating exercise boundary
When PDE Methods Are Preferred¶
Advantages¶
- American options: Natural handling of early exercise
- Barriers: Boundary conditions straightforward
- No moment restrictions: Unlike Fourier methods
- Greeks: Finite differences give Greeks directly
- Path-dependent (some): Certain structures fit PDE framework
Disadvantages¶
- Computational cost: \(O(N_x \cdot N_v \cdot N_t)\) per price
- Curse of dimensionality: Multi-asset/multi-factor is expensive
- Boundary treatment: Requires care at \(v = 0\) and \(v = \infty\)
- Implementation complexity: ADI schemes are intricate
Comparison with Fourier Methods¶
| Aspect | PDE | Fourier |
|---|---|---|
| European vanilla | Slower | Faster |
| American | Natural | Difficult |
| Barriers | Natural | Difficult |
| Greeks | Direct (FD) | Differentiation of CF |
| Calibration | Slow | Fast |
| Implementation | Complex | Moderate |
Key Takeaways¶
- Stochastic volatility pricing can be expressed as a two-dimensional PDE
- The Heston PDE involves mixed derivatives requiring careful numerical treatment
- Boundary conditions at \(v = 0\) depend on the Feller condition
- ADI schemes (Douglas–Rachford, Hundsdorfer–Verwer) are standard for efficiency
- PDE methods excel for American options and barriers
- For European calibration, Fourier methods are usually preferred
Further Reading¶
- Wilmott, P. (2006). Paul Wilmott on Quantitative Finance, 2nd ed. Wiley.
- in 't Hout, K. & Foulon, S. (2010). ADI finite difference schemes for option pricing in the Heston model with correlation. International Journal of Numerical Analysis and Modeling.
- Ikonen, S. & Toivanen, J. (2008). Efficient numerical methods for pricing American options under stochastic volatility. Numerical Methods for Partial Differential Equations.
- Haentjens, T. & in 't Hout, K. (2012). Alternating direction implicit finite difference schemes for the Heston–Hull–White PDE. Journal of Computational Finance.
- Glasserman, P. (2003). Monte Carlo Methods in Financial Engineering. Springer.
Exercises¶
Exercise 1. Starting from the Heston SDE system under \(\mathbb{Q}\), apply the Feynman-Kac theorem to derive the pricing PDE for a European option \(V(t, S, v)\). Identify each term in the PDE and explain its financial interpretation (e.g., which terms correspond to the asset drift, variance mean reversion, the hedging cost, etc.).
Solution to Exercise 1
Under \(\mathbb{Q}\), the Heston SDE system is
with \(d\langle W^S, W^v\rangle_t = \rho\,dt\). By the Feynman-Kac theorem, the price \(V(t,S,v) = e^{-r(T-t)}\mathbb{E}^{\mathbb{Q}}[\Phi(S_T) \mid S_t = S, v_t = v]\) satisfies
where \(\mathcal{L}\) is the infinitesimal generator of \((S_t, v_t)\). Applying Ito's lemma to \(V(t, S_t, v_t)\) and collecting terms:
Financial interpretation of each term:
- \((r-q)S\frac{\partial V}{\partial S}\): Risk-neutral drift of the asset (growth at \(r-q\) after dividend yield)
- \(\kappa(\theta - v)\frac{\partial V}{\partial v}\): Mean reversion of variance toward \(\theta\) at speed \(\kappa\)
- \(\frac{1}{2}vS^2\frac{\partial^2 V}{\partial S^2}\): Gamma effect from asset diffusion (convexity in \(S\))
- \(\rho\xi vS\frac{\partial^2 V}{\partial S\partial v}\): Cross-gamma from correlation between asset and variance shocks
- \(\frac{1}{2}\xi^2 v\frac{\partial^2 V}{\partial v^2}\): Vol-of-vol effect (convexity in \(v\))
- \(-rV\): Discounting at the risk-free rate
Exercise 2. Perform the log-price transformation \(x = \log S\) on the Heston PDE. Show explicitly how the terms \(S\frac{\partial V}{\partial S}\) and \(S^2\frac{\partial^2 V}{\partial S^2}\) transform. Verify that the resulting PDE has coefficients that are at most linear in \(v\) (confirming the affine structure).
Solution to Exercise 2
Set \(x = \log S\), so \(S = e^x\). Let \(U(t, x, v) = V(t, e^x, v)\). By the chain rule:
For the second derivative:
so
Substituting into the Heston PDE:
Combining the first-order \(x\) terms:
All coefficients are either constant or linear in \(v\): the \(\frac{\partial U}{\partial x}\) coefficient is \((r-q-v/2)\), the \(\frac{\partial^2 U}{\partial x^2}\) coefficient is \(v/2\), the cross-derivative coefficient is \(\rho\xi v\), and the \(\frac{\partial^2 U}{\partial v^2}\) coefficient is \(\xi^2 v/2\). This confirms the affine structure in \(v\).
Exercise 3. At the boundary \(v = 0\), the Heston PDE degenerates because the diffusion coefficients involving \(v\) vanish. Write the limiting PDE as \(v \to 0\) and show it reduces to
Explain why this is a first-order PDE in \(v\) (no second-order \(v\) derivatives) and what this means for the numerical treatment of the \(v = 0\) boundary.
Solution to Exercise 3
In the Heston PDE (in original \(S\) coordinates), the terms involving \(v\) as a multiplicative factor are:
As \(v \to 0\), all three terms vanish. The drift term \(\kappa(\theta - v)\frac{\partial V}{\partial v} \to \kappa\theta\frac{\partial V}{\partial v}\), which remains nonzero (since \(\theta > 0\)). The surviving terms give the limiting PDE:
This is a first-order PDE in \(v\) because there are no second-order derivatives in \(v\) (or mixed derivatives). The \(\kappa\theta\frac{\partial V}{\partial v}\) term acts as a transport term pushing variance away from zero (since \(\kappa\theta > 0\), it advects in the positive \(v\) direction).
Numerical implications: Because the PDE degenerates at \(v = 0\), standard second-order finite difference stencils cannot be applied at that boundary. Instead, we use the first-order limiting PDE as a boundary condition. If the Feller condition \(2\kappa\theta \geq \xi^2\) holds, the variance process never reaches zero, so the boundary is unattainable and the limiting PDE is only needed formally. If the Feller condition is violated, variance can reach zero, and the first-order boundary PDE must be imposed explicitly, requiring a one-sided (upwind) discretization in the \(v\)-direction at \(v = 0\).
Exercise 4. For a finite difference grid with \(N_x = 200\) points in the log-price direction and \(N_v = 100\) points in the variance direction, compute the total number of unknowns per time step. If \(N_t = 500\) time steps are used, estimate the total number of floating-point operations for: (a) an explicit scheme (\(O(N_x \cdot N_v)\) per step); (b) an implicit scheme requiring solution of a banded linear system. Compare with the cost of a Fourier method using \(N_u = 256\) integration points.
Solution to Exercise 4
The grid has \(N_x = 200\) points in log-price and \(N_v = 100\) points in variance, giving \(N_x \cdot N_v = 20{,}000\) unknowns per time step.
(a) Explicit scheme: Each time step requires \(O(N_x \cdot N_v)\) operations (one stencil evaluation per grid point). Over \(N_t = 500\) steps:
(b) Implicit scheme: A fully implicit scheme requires solving a linear system of size \(20{,}000 \times 20{,}000\) per time step. Using ADI splitting, each sub-step involves tridiagonal solves: \(N_v\) tridiagonal systems of size \(N_x\) (cost \(O(N_x)\) each) plus \(N_x\) tridiagonal systems of size \(N_v\). Per time step: \(O(N_v \cdot N_x + N_x \cdot N_v) = O(2 N_x N_v)\). Over \(N_t\) steps:
Without ADI (direct sparse solve), the cost is higher: \(O(N_t \cdot (N_x N_v)^{3/2})\) for a banded system, which is \(O(500 \cdot 20000^{1.5}) \approx O(1.4 \times 10^9)\).
(c) Fourier method: With \(N_u = 256\) integration points, each requiring one CF evaluation, the cost is \(O(256)\) for a single strike. For \(N\) strikes via FFT: \(O(N_u \log N_u) = O(256 \times 8) \approx O(2000)\).
The Fourier method is roughly 4--5 orders of magnitude faster than PDE methods for European option pricing.
Exercise 5. The mixed derivative \(\frac{\partial^2 V}{\partial x \partial v}\) is approximated by the four-point stencil
Show that this stencil has truncation error \(O((\Delta x)^2 + (\Delta v)^2)\). When \(\rho\) is close to \(\pm 1\), explain why the coefficient of the mixed derivative becomes large and can cause stability issues. How does the coordinate transformation \(y = x - \rho v\) help?
Solution to Exercise 5
Truncation error: Taylor-expand \(V_{i\pm 1, j\pm 1}\) around \((x_i, v_j)\):
Computing the stencil combination:
Dividing by \(4\Delta x\Delta v\) gives
confirming second-order accuracy.
Stability when \(|\rho| \approx 1\): The coefficient of the mixed derivative in the PDE is \(\rho\xi v\). When \(|\rho|\) is close to 1, this coefficient becomes large relative to the pure second derivatives. The four-point stencil connects diagonal grid neighbors, and the resulting finite difference matrix can lose diagonal dominance, producing negative coefficients in the stencil. This can cause non-physical oscillations or instability, particularly in explicit schemes.
Coordinate transformation: The substitution \(y = x - \rho v\), \(w = v\) transforms the cross-derivative term. Since \(\frac{\partial}{\partial x} = \frac{\partial}{\partial y}\) and \(\frac{\partial}{\partial v} = -\rho\frac{\partial}{\partial y} + \frac{\partial}{\partial w}\), the mixed derivative \(\rho\xi v\frac{\partial^2 V}{\partial x\partial v}\) is absorbed into pure second-order terms in \(y\) and \(w\), eliminating or reducing the mixed derivative coefficient. This restores diagonal dominance and standard ADI applicability.
Exercise 6. For an American put option under Heston dynamics, the free boundary problem requires
Explain why the early exercise boundary \(S^*(t, v)\) is now a surface (function of both \(t\) and \(v\)) rather than a curve as in Black-Scholes. How does increasing \(v\) affect \(S^*(t, v)\)? (Hint: consider the option holder's incentive to exercise when volatility is high vs. low.)
Solution to Exercise 6
In Black-Scholes, the early exercise boundary \(S^*(t)\) is a function of time only because the volatility is deterministic. Under Heston dynamics, the instantaneous variance \(v\) is stochastic and affects the option value, so the exercise decision depends on \(v\) as well. The boundary becomes a surface \(S^*(t, v)\) in the \((t, v)\) plane.
Effect of increasing \(v\) on \(S^*(t,v)\): Consider an American put holder at time \(t\) with stock price \(S\) and current variance \(v\). The value of holding (continuing) the option increases with \(v\), because higher variance means a greater chance of the stock falling further, increasing the expected payoff. The exercise value \((K - S)^+\) is independent of \(v\). Therefore, for higher \(v\), the holder demands a lower stock price before exercising (continuation is more valuable). This means \(S^*(t, v)\) is decreasing in \(v\): as variance increases, the critical stock price below which exercise occurs drops.
Intuitively, when volatility is high, there is more "optionality" to benefit from further price moves, making early exercise less attractive. When volatility is very low, the option behaves nearly like an intrinsic-value instrument, and early exercise becomes optimal at stock prices closer to the strike.
Exercise 7. Compare PDE and Fourier methods for the following pricing problems, and recommend the better approach for each:
(a) Pricing ATM European calls at 50 different strikes for calibration.
(b) Pricing an American put with \(T = 1\) year under Heston dynamics.
(c) Computing delta and gamma of a European call.
(d) Pricing a down-and-out barrier call with barrier at \(B = 0.8 S_0\).
Justify each recommendation based on computational cost, accuracy, and ease of implementation.
Solution to Exercise 7
(a) Pricing ATM European calls at 50 different strikes for calibration: Fourier methods (COS or Carr-Madan FFT) are strongly preferred. The FFT computes prices at all strikes simultaneously in \(O(N\log N)\), and COS achieves \(O(N \cdot N_K)\) where \(N_K = 50\) is the number of strikes. A PDE solve gives one price surface but costs \(O(N_x \cdot N_v \cdot N_t) \sim O(10^7)\), and calibration requires many such solves (iterating over parameters). Fourier methods are orders of magnitude faster for this task.
(b) Pricing an American put with \(T = 1\) year under Heston dynamics: PDE methods are preferred. The early exercise constraint \(V \geq (K-S)^+\) is naturally handled via projected SOR or penalty methods within the PDE framework. Fourier methods are designed for European payoffs and do not directly accommodate the free boundary. While extensions exist (e.g., COS for Bermudan options), PDE methods remain the standard, robust approach.
(c) Computing delta and gamma of a European call: PDE methods have a slight edge, since finite differences on the \((S, v)\) grid provide Greeks directly as by-products (e.g., \(\Delta \approx \frac{V_{i+1,j} - V_{i-1,j}}{2\Delta S}\), \(\Gamma \approx \frac{V_{i+1,j} - 2V_{i,j} + V_{i-1,j}}{(\Delta S)^2}\)). However, Fourier methods can also deliver Greeks by differentiating the pricing integral or using finite difference bumps on the input parameters. For a single option, either method works; for calibration with Greeks, Fourier is faster.
(d) Pricing a down-and-out barrier call with barrier at \(B = 0.8\,S_0\): PDE methods are preferred. The barrier condition \(V(t, B, v) = 0\) is implemented as a Dirichlet boundary condition on the PDE grid, which is straightforward. Fourier methods require either image-charge techniques or conditional CFs that are model-specific and less general. PDE methods handle barriers naturally and accurately (especially with grid points placed at the barrier).