Gyongy's Theorem and Markovian Projection
Every stochastic volatility model — Heston, SABR, Bergomi, or any other — produces a set of marginal distributions for the asset price \(S_t\) at each time \(t\). A natural question arises: does there exist a simpler, one-factor local volatility model that reproduces exactly the same marginal distributions? Gyongy's theorem (1986) answers this question affirmatively and provides an explicit formula for the local volatility function in terms of conditional expectations. This result establishes a deep bridge between local and stochastic volatility models, explains why local volatility can perfectly calibrate to vanilla option prices, and reveals the precise sense in which local volatility is a "projection" of more complex dynamics onto Markovian dynamics.
Learning Objectives
After completing this section, you should be able to:
- State Gyongy's theorem and explain the conditional expectation formula for local volatility
- Define Markovian projection and explain why it preserves marginal distributions
- Derive the local volatility function from a given stochastic volatility model
- Explain the difference between matching marginals and matching path-dependent dynamics
- Describe the implications for exotic option pricing and model selection
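Motivation
The Calibration Puzzle
Consider a stochastic volatility model where the asset price and its volatility evolve jointly, written here with generic volatility coefficients \(\alpha\) and \(\beta\):
\[ dS_t = (r - q) S_t \, dt + \sigma_t S_t \, dW_t^{(1)}, \qquad d\sigma_t = \alpha(\sigma_t, t) \, dt + \beta(\sigma_t, t) \, dW_t^{(2)} \]
with \(d\langle W^{(1)}, W^{(2)} \rangle_t = \rho \, dt\). This model has stochastic volatility \(\sigma_t\) driven by its own Brownian motion.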
Now consider a local volatility model:
\[ dS_t = (r - q) S_t \, dt + \sigma_{\text{loc}}(S_t, t) \, S_t \, dW_t \]
Both models can be calibrated to the same set of vanilla European option prices. But the stochastic volatility model has two factors (spot and vol), while the local volatility model has only one factor (spot). How can a one-factor model reproduce the prices generated by a two-factor model?
The answer is that vanilla option prices depend only on the marginal distribution of \(S_T\) at each maturity, not on the path or the joint dynamics of \((S_t, \sigma_t)\). Gyongy's theorem shows that for every multi-factor model, there exists a one-factor local volatility model with the same marginal distributions.
Intuition
Imagine all the possible paths of \(S_t\) in the stochastic volatility model that pass through a specific point \(S_t = K\) at time \(t\). Along each such path, the instantaneous volatility \(\sigma_t\) takes a different value (because \(\sigma_t\) is random). The squared local volatility \(\sigma_{\text{loc}}^2(K, t)\) is the average of the instantaneous variances \(\sigma_t^2\), taken over the paths that sit at level \(K\) at time \(t\):
\[ \sigma_{\text{loc}}^2(K, t) = \mathbb{E}\left[\sigma_t^2 \mid S_t = K\right] \]
This conditional expectation "projects" the stochastic volatility onto a deterministic function of spot and time, collapsing the multi-factor dynamics into a single factor while preserving the marginal distributions.
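As a small worked illustration (the numbers are hypothetical and not taken from any calibrated model), suppose that among the paths with \(S_t = K\), 75% are in a low-volatility state with \(\sigma_t = 0.20\) and 25% are in a high-volatility state with \(\sigma_t = 0.40\). The projection averages variances rather than volatilities:

\[ \sigma_{\text{loc}}^2(K, t) = 0.75 \times 0.20^2 + 0.25 \times 0.40^2 = 0.03 + 0.04 = 0.07, \qquad \sigma_{\text{loc}}(K, t) = \sqrt{0.07} \approx 0.265 \]

The result exceeds the conditional mean volatility \(0.75 \times 0.20 + 0.25 \times 0.40 = 0.25\), as Jensen's inequality requires.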
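Statement of the Theorem
General Form
Theorem 13.4.1 (Gyongy, 1986) Let \((X_t)_{t \geq 0}\) be an Ito process satisfying:
\[ dX_t = \mu(X_t, Y_t, t) \, dt + \sigma(X_t, Y_t, t) \, dW_t \]
where \(Y_t\) is an additional state variable (possibly multidimensional) and \(W_t\) is a (possibly multidimensional) Brownian motion. Assume that the coefficients satisfy standard regularity conditions (Lipschitz, linear growth) ensuring existence and uniqueness of strong solutions.
Then there exists a Markov process \((\bar{X}_t)_{t \geq 0}\) satisfying:
\[ d\bar{X}_t = \bar{\mu}(\bar{X}_t, t) \, dt + \bar{\sigma}(\bar{X}_t, t) \, d\bar{W}_t \]
such that:
\[ \text{Law}(\bar{X}_t) = \text{Law}(X_t) \quad \text{for every } t \geq 0 \]
where the coefficients of the Markov process are the conditional expectations:
\[ \bar{\mu}(x, t) = \mathbb{E}\left[\mu(X_t, Y_t, t) \mid X_t = x\right], \qquad \bar{\sigma}^2(x, t) = \mathbb{E}\left[\sigma^2(X_t, Y_t, t) \mid X_t = x\right] \]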
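Application to Finance
For the stochastic volatility model with \((r - q)S_t\) drift, the drift of \(S_t\) is already Markovian (it depends only on \(S_t\)). The conditional expectation of the drift is:
\[ \mathbb{E}\left[(r - q) S_t \mid S_t = K\right] = (r - q) K \]
So the drift of the projected process is unchanged. The key projection is on the diffusion coefficient:
\[ \sigma_{\text{loc}}^2(K, t) \, K^2 = \mathbb{E}\left[\sigma_t^2 S_t^2 \mid S_t = K\right] = K^2 \, \mathbb{E}\left[\sigma_t^2 \mid S_t = K\right] \]
which gives the fundamental formula:
\[ \sigma_{\text{loc}}^2(K, t) = \mathbb{E}\left[\sigma_t^2 \mid S_t = K\right] \]
This says the local variance (the square of the local volatility) at spot level \(K\) and time \(t\) equals the risk-neutral conditional expectation of instantaneous variance, given that the spot price is at \(K\) at time \(t\).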
Proof Sketch
Key Idea
The proof proceeds by showing that the one-dimensional marginals of \(X_t\) and \(\bar{X}_t\) satisfy the same Fokker-Planck equation, and hence must be equal.
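Step 1: Fokker-Planck for \(X_t\). Let \(p(x, t)\) be the density of \(X_t\). The marginal density satisfies:
\[ \frac{\partial p}{\partial t}(x, t) = -\frac{\partial}{\partial x}\left[\bar{\mu}(x, t) \, p(x, t)\right] + \frac{1}{2} \frac{\partial^2}{\partial x^2}\left[\bar{\sigma}^2(x, t) \, p(x, t)\right] \]
where \(\bar{\mu}(x, t) = \mathbb{E}[\mu \mid X_t = x]\) and \(\bar{\sigma}^2(x, t) = \mathbb{E}[\sigma^2 \mid X_t = x]\) arise from integrating the joint Fokker-Planck equation over the \(Y\) variable.
To see this, the joint density \(p(x, y, t)\) of \((X_t, Y_t)\) satisfies a two-dimensional Fokker-Planck equation. Integrating over \(y\) (the terms containing derivatives in \(y\) integrate to zero, assuming the joint density decays at the boundary):
\[ \frac{\partial}{\partial t} \int p(x, y, t) \, dy = -\frac{\partial}{\partial x} \int \mu(x, y, t) \, p(x, y, t) \, dy + \frac{1}{2} \frac{\partial^2}{\partial x^2} \int \sigma^2(x, y, t) \, p(x, y, t) \, dy \]
Recognizing that \(\int p(x, y, t) \, dy = p(x, t)\) and:
\[ \int \mu(x, y, t) \, p(x, y, t) \, dy = \mathbb{E}\left[\mu \mid X_t = x\right] p(x, t) = \bar{\mu}(x, t) \, p(x, t) \]
and similarly for \(\sigma^2\), we obtain the desired equation.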
Step 2: Fokker-Planck for \(\bar{X}_t\). The density \(\bar{p}(x, t)\) of the Markov process \(\bar{X}_t\) satisfies the same Fokker-Planck equation by construction (since \(\bar{X}_t\) has drift \(\bar{\mu}\) and diffusion \(\bar{\sigma}\)).
Step 3: Uniqueness. Both \(p\) and \(\bar{p}\) satisfy the same parabolic PDE with the same initial condition \(p(x, 0) = \bar{p}(x, 0) = \delta(x - x_0)\). Under regularity conditions on \(\bar{\mu}\) and \(\bar{\sigma}\), the solution is unique, so \(p(x, t) = \bar{p}(x, t)\) for all \(x\) and \(t\). \(\square\)
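The marginal-matching claim can be checked numerically on a toy case. The sketch below is an illustration under stated assumptions, not part of the original proof: it takes \(X_t = \sigma W_t\), where \(\sigma\) is drawn once at time zero from two hypothetical regimes (0.1 and 0.3, equally likely), so that \(X_t\) is a mixture of Gaussians and \(\mathbb{E}[\sigma^2 \mid X_t = x]\) is a posterior-weighted average. The function names `projected_variance` and `simulate_projection` are made up for this example, and the Euler scheme and step count are arbitrary choices.

```python
import numpy as np

def projected_variance(x, t, sigmas, weights):
    """E[sigma^2 | X_t = x] when X_t = sigma * W_t and sigma is drawn once at t = 0."""
    # Log of weight_i * N(x; 0, sigma_i^2 t), dropping terms common to all regimes.
    log_post = (np.log(weights) - np.log(sigmas)
                - x[:, None] ** 2 / (2.0 * sigmas ** 2 * t))
    log_post -= log_post.max(axis=1, keepdims=True)  # stabilise before exponentiating
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)          # posterior regime probabilities
    return post @ sigmas ** 2

def simulate_projection(sigmas, weights, T=1.0, n_steps=200, n_paths=100_000, seed=0):
    """Euler scheme for dX = sqrt(projected_variance(X, t)) dW with X_0 = 0."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.zeros(n_paths)
    for k in range(n_steps):
        t = max(k * dt, 1e-12)  # avoid dividing by zero at the first step
        var = projected_variance(x, t, sigmas, weights)
        x += np.sqrt(var * dt) * rng.standard_normal(n_paths)
    return x

# Hypothetical two-regime example (assumed values, chosen for illustration only).
sigmas = np.array([0.1, 0.3])
weights = np.array([0.5, 0.5])
x_T = simulate_projection(sigmas, weights)

# Exact moments of the original mixture model at T = 1.
m2_exact = np.sum(weights * sigmas ** 2)
m4_exact = 3.0 * np.sum(weights * sigmas ** 4)
kurt_exact = m4_exact / m2_exact ** 2

# Monte Carlo moments of the one-factor Markovian projection.
m2_sim = np.mean(x_T ** 2)
kurt_sim = np.mean(x_T ** 4) / m2_sim ** 2

print("second moment:", m2_exact, m2_sim)
print("kurtosis:     ", kurt_exact, kurt_sim)
```

Up to discretisation and sampling error, the one-factor process should reproduce both the variance and the excess kurtosis of the mixture, even though a single path of the projected process never "knows" which regime it is in.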
What the Proof Does NOT Show
Gyongy's theorem proves equality of one-dimensional marginals at each time \(t\). It does NOT prove equality of:
- Joint distributions \((X_{t_1}, X_{t_2})\) at multiple times
- Path-space distributions (the law of the entire trajectory)
- Expectations of path-dependent functionals
The original process and its Markovian projection can have very different path-dependent behavior while agreeing on all marginals.
The Mimicking Theorem
Stronger Version
A refinement of Gyongy's theorem, sometimes called the mimicking theorem (Brunick and Shreve, 2013), establishes existence under weaker conditions and provides additional structure.
Theorem 13.4.2 (Mimicking Theorem) Under the conditions of Theorem 13.4.1, the Markov process \(\bar{X}_t\) with:
\[ \bar{\mu}(x, t) = \mathbb{E}\left[\mu(X_t, Y_t, t) \mid X_t = x\right], \qquad \bar{\sigma}^2(x, t) = \mathbb{E}\left[\sigma^2(X_t, Y_t, t) \mid X_t = x\right] \]
not only matches the marginal laws of \(X_t\) but is also the unique Markov diffusion (in the class of continuous Markov processes) with this property.
Uniqueness follows from the one-to-one relationship between marginal distributions and the local volatility function established by Dupire's formula: given all marginal distributions (equivalently, all European call prices), there is exactly one local volatility function.
Implications for Financial Modeling
Why Local Volatility Calibrates Perfectly
Gyongy's theorem explains the remarkable calibration power of local volatility:
- Any arbitrage-free set of European option prices is generated by some process for \(S_t\)
- By Gyongy, there exists a local volatility model matching all marginal distributions
- European option prices depend only on marginal distributions (\(C(K,T) = e^{-rT}\mathbb{E}[(S_T - K)^+]\))
- Therefore, the local volatility model matches all European option prices
This is why Dupire's formula produces a valid local volatility function whenever the option surface is smooth, arbitrage-free, and generated by a continuous Ito process: it is computing the Gyongy projection.
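Local Volatility as Conditional Average
For the Heston model with instantaneous variance \(v_t\):
\[ dS_t = (r - q) S_t \, dt + \sqrt{v_t} \, S_t \, dW_t^{(1)}, \qquad dv_t = \kappa(\theta - v_t) \, dt + \xi \sqrt{v_t} \, dW_t^{(2)} \]
the local volatility is:
\[ \sigma_{\text{loc}}^2(K, t) = \mathbb{E}\left[v_t \mid S_t = K\right] \]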
This conditional expectation depends on the entire joint dynamics of \((S_t, v_t)\) and generally does not admit a closed-form expression. It can be computed by:
- Monte Carlo: Simulate the Heston model, bin paths by the value of \(S_t\) near \(K\), and average \(v_t\) within each bin
- PDE methods: Solve the joint Fokker-Planck equation for the density of \((S_t, v_t)\), then integrate \(v\) against the conditional density \(p(v \mid S_t = K)\)
- Dupire applied to Heston prices: Compute Heston call prices analytically, then apply Dupire's formula
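The Leverage Function
In stochastic local volatility (SLV) models, the dynamics combine both local and stochastic volatility:
\[ dS_t = (r - q) S_t \, dt + L(S_t, t) \sqrt{v_t} \, S_t \, dW_t^{(1)} \]
where \(L(S_t, t)\) is the leverage function. By Gyongy's theorem:
\[ \sigma_{\text{loc}}^2(K, t) = L^2(K, t) \, \mathbb{E}\left[v_t \mid S_t = K\right] \]
Solving for \(L\):
\[ L(K, t) = \frac{\sigma_{\text{loc}}(K, t)}{\sqrt{\mathbb{E}\left[v_t \mid S_t = K\right]}} \]
The squared leverage function is the ratio of the target local variance (from market data) to the conditional expectation of the stochastic variance. Calibrating the SLV model reduces to computing this ratio. Because the conditional expectation is taken under the SLV dynamics, which themselves depend on \(L\), this is in practice a fixed-point problem.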
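The ratio itself is a one-line computation once both ingredients are available on a common grid. The helper below is a minimal sketch: the name `leverage` and the `floor` safeguard against empty or near-zero conditional variances are assumptions made for illustration, and the inputs are taken as given rather than calibrated.

```python
import numpy as np

def leverage(sigma_loc, cond_var, floor=1e-8):
    """Leverage L = sigma_loc / sqrt(E[v_t | S_t = K]), elementwise on a grid.

    sigma_loc : target local volatility (not variance) at each grid point
    cond_var  : estimate of E[v_t | S_t = K] at the same grid points
    floor     : hypothetical lower bound that guards against division by zero
    """
    sigma_loc = np.asarray(sigma_loc, dtype=float)
    cond_var = np.maximum(np.asarray(cond_var, dtype=float), floor)
    return sigma_loc / np.sqrt(cond_var)

# Single point: local volatility 25%, conditional variance 0.04 (i.e. 20% volatility).
single = leverage(0.25, 0.04)

# Small hypothetical strike grid at one maturity.
grid = leverage([0.30, 0.25, 0.22], [0.05, 0.04, 0.04])
print(single, grid)
```

Values above 1 indicate grid points where the stochastic component alone produces too little variance and must be scaled up; values below 1 indicate the opposite.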
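Marginals vs Paths
What Gyongy Preserves
For any function \(f\) for which the expectation exists and any single time \(t\):
\[ \mathbb{E}\left[f(S_t)\right] = \mathbb{E}\left[f(\bar{S}_t)\right] \]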
This immediately implies identical pricing for:
- European calls and puts at any \((K, T)\)
- Digital options (binary payoffs)
- Power options (\(f(S) = S^n\))
- Any payoff depending only on \(S_T\) at a single maturity
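What Gyongy Does NOT Preserve
For path-dependent functionals, the two models generally disagree:
\[ \mathbb{E}\left[F(S_{t_1}, \ldots, S_{t_n})\right] \neq \mathbb{E}\left[F(\bar{S}_{t_1}, \ldots, \bar{S}_{t_n})\right] \quad \text{in general} \]
How prices compare across products: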
| Product | Depends On | LV vs SV Agreement |
|---|---|---|
| European call | \(S_T\) marginal | Exact match |
| Barrier option | Path maximum/minimum | Different |
| Asian option | \(\frac{1}{n}\sum S_{t_i}\) | Different |
| Cliquet | Forward returns \(S_{t_{i+1}}/S_{t_i}\) | Different |
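| Variance swap (fair strike) | \(\sum (\ln S_{t_i}/S_{t_{i-1}})^2\) | Nearly identical (exact under continuous monitoring) |
| Option on realized variance, volatility swap | Distribution of realized variance | Different |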
| Forward-starting option | Future smile | Different |
The discrepancies arise because in the local volatility model the instantaneous volatility is a deterministic function of spot and time, with no independent source of randomness, while stochastic volatility models have genuine randomness in the volatility process. This randomness affects:
- The correlation between successive returns (persistence of high/low vol regimes)
- The distribution of realized variance (which has heavier tails under SV)
- The forward smile (which is flatter under LV than under SV)
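The effect on the distribution of realized variance can be seen in a toy case. The sketch below is an illustration only: it assumes the process \(X_t = \sigma W_t\) with \(\sigma\) drawn once from two hypothetical regimes (0.1 and 0.3, equally likely), builds its Markovian projection from the posterior-weighted variance, and compares the realized variance \(\int_0^T \sigma_t^2 \, dt\) in the two models. The helper name `projected_variance` and all parameter values are made up for the example.

```python
import numpy as np

def projected_variance(x, t, sigmas, weights):
    """E[sigma^2 | X_t = x] for X_t = sigma * W_t with sigma drawn once at t = 0."""
    log_post = (np.log(weights) - np.log(sigmas)
                - x[:, None] ** 2 / (2.0 * sigmas ** 2 * t))
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stabilisation
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)
    return post @ sigmas ** 2

sigmas = np.array([0.1, 0.3])    # hypothetical volatility regimes
weights = np.array([0.5, 0.5])   # hypothetical regime probabilities
n_paths, n_steps, T = 50_000, 200, 1.0
dt = T / n_steps
rng = np.random.default_rng(1)

# Local volatility (projected) model: accumulate realized variance along each path.
x = np.zeros(n_paths)
rv_lv = np.zeros(n_paths)
for k in range(n_steps):
    t = max(k * dt, 1e-12)       # avoid dividing by zero at the first step
    var = projected_variance(x, t, sigmas, weights)
    rv_lv += var * dt
    x += np.sqrt(var * dt) * rng.standard_normal(n_paths)

# Original model: realized variance is sigma^2 * T, a two-point distribution.
rv_sv = rng.choice(sigmas ** 2, size=n_paths, p=weights) * T

print("mean realized variance:", rv_sv.mean(), rv_lv.mean())
print("std  realized variance:", rv_sv.std(), rv_lv.std())
```

In this example the two means should agree up to simulation error, while the spread of realized variance is visibly smaller under the projected model, which is the kind of gap that payoffs nonlinear in realized variance pick up.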
The Forward Smile Problem
Gyongy's theorem guarantees that local volatility matches today's smile (spot smile). It says nothing about future smiles. The forward smile — the implied volatility surface conditional on the spot being at some level at a future time — is systematically different under local and stochastic volatility. This is the most important practical limitation of local volatility for exotic pricing.
Computing the Gyongy Projection
Monte Carlo Method
Given a stochastic volatility model, compute the local volatility surface by simulation:
- Simulate \(N\) paths of \((S_t, \sigma_t)\) on a time grid \(\{t_0, t_1, \ldots, t_M\}\)
- At each time \(t_k\), sort paths into bins by \(S_{t_k}\) values
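- For each bin \(B_j\) centered at \(S = K_j\), compute the average instantaneous variance over the paths \(i = 1, \ldots, N\) that fall in the bin:
\[ \hat{\sigma}_{\text{loc}}^2(K_j, t_k) = \frac{\sum_{i=1}^{N} \big(\sigma_{t_k}^{(i)}\big)^2 \, \mathbf{1}\{S_{t_k}^{(i)} \in B_j\}}{\sum_{i=1}^{N} \mathbf{1}\{S_{t_k}^{(i)} \in B_j\}} \]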
This particle method is straightforward but noisy, especially in the tails where few paths visit.
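The steps above can be sketched for the Heston model as follows. This is a minimal illustration, not a production implementation: the parameter values are hypothetical, the full-truncation Euler scheme is one discretisation choice among several, and quantile-based bins are used here (an assumption of this sketch) so that every bin holds the same number of paths. The function name `heston_conditional_variance` is made up for the example.

```python
import numpy as np

def heston_conditional_variance(s0=100.0, v0=0.04, r=0.02, q=0.0, kappa=1.5,
                                theta=0.04, xi=0.5, rho=-0.7, T=1.0,
                                n_steps=100, n_paths=100_000, n_bins=10, seed=0):
    """Estimate E[v_T | S_T in bin] by simulation and binning (hypothetical parameters)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    log_s = np.full(n_paths, np.log(s0))
    v = np.full(n_paths, v0)
    for _ in range(n_steps):
        z1 = rng.standard_normal(n_paths)
        z2 = rho * z1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n_paths)
        v_pos = np.maximum(v, 0.0)  # full truncation: use max(v, 0) in the coefficients
        log_s += (r - q - 0.5 * v_pos) * dt + np.sqrt(v_pos * dt) * z1
        v += kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos * dt) * z2
    s = np.exp(log_s)
    v = np.maximum(v, 0.0)

    # Quantile bins in spot, so each bin contains roughly n_paths / n_bins paths.
    edges = np.quantile(s, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, s, side="right") - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    cond_var = np.bincount(idx, weights=v, minlength=n_bins) / counts
    centers = np.bincount(idx, weights=s, minlength=n_bins) / counts
    return centers, cond_var, counts, v.mean()

centers, cond_var, counts, mean_var = heston_conditional_variance()
for k_j, var_j in zip(centers, cond_var):
    print(f"S ~ {k_j:8.2f}   local vol estimate ~ {np.sqrt(var_j):.4f}")
```

Fixed-width bins or kernel regression would be reasonable alternatives. Quantile bins trade resolution in the tails for lower noise per bin.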
Via Dupire's Formula
An alternative avoids conditional expectations entirely:
- Compute European call prices \(C(K, T)\) from the stochastic volatility model (e.g., via characteristic function for Heston)
- Apply Dupire's formula to obtain \(\sigma_{\text{loc}}(K, T)\)
This is mathematically equivalent to the conditional expectation approach (by Gyongy's theorem) but is often more practical because:
- Semi-analytic call prices are available for many SV models
- Dupire's formula avoids the binning and noise of Monte Carlo
- The resulting surface is deterministic and smooth (given smooth input prices)
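The second step can be sketched with central finite differences. To keep the example self-contained and checkable, Black-Scholes prices with a flat 20% volatility stand in for the stochastic volatility model's call prices (an assumption of this sketch, since the local volatility recovered from a flat surface is known to be the flat volatility itself). The function names, bump sizes `dk` and `dt`, and market parameters are all hypothetical.

```python
import math

def bs_call(s0, k, t, r, q, sigma):
    """Black-Scholes call price, used here only as a stand-in pricing function."""
    d1 = (math.log(s0 / k) + (r - q + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * math.exp(-q * t) * cdf(d1) - k * math.exp(-r * t) * cdf(d2)

def dupire_local_vol(call, k, t, r, q, dk=0.5, dt=1e-3):
    """Dupire local volatility from a call-price function call(k, t).

    Uses central differences for dC/dT, dC/dK and d2C/dK2; dk and dt are
    hypothetical bump sizes chosen for this illustration.
    """
    c = call(k, t)
    c_up, c_dn = call(k + dk, t), call(k - dk, t)
    c_t = (call(k, t + dt) - call(k, t - dt)) / (2.0 * dt)
    c_k = (c_up - c_dn) / (2.0 * dk)
    c_kk = (c_up - 2.0 * c + c_dn) / dk ** 2
    local_var = (c_t + (r - q) * k * c_k + q * c) / (0.5 * k ** 2 * c_kk)
    return math.sqrt(local_var)

# Assumed market parameters for the illustration.
s0, r, q, flat_vol = 100.0, 0.03, 0.01, 0.20
price = lambda k, t: bs_call(s0, k, t, r, q, flat_vol)

recovered = {k: dupire_local_vol(price, k, 1.0, r, q) for k in (80.0, 100.0, 120.0)}
print(recovered)
```

For a real stochastic volatility model, `price` would be replaced by that model's semi-analytic call pricer. Far from the money the second strike derivative becomes small, so the bump sizes then need more care than this sketch gives them.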
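Connection to Dupire's Formula
Dupire as Gyongy
Dupire's formula can be understood as the computational realization of Gyongy's projection:
\[ \frac{\dfrac{\partial C}{\partial T} + (r - q) K \dfrac{\partial C}{\partial K} + q C}{\dfrac{1}{2} K^2 \dfrac{\partial^2 C}{\partial K^2}} = \mathbb{E}\left[\sigma_T^2 \mid S_T = K\right] \]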
The left side (Dupire) computes the local volatility from observable prices. The right side (Gyongy) gives its theoretical interpretation as a conditional expectation. The equality holds for any underlying model — the Dupire formula automatically computes the Gyongy projection of whatever dynamics generated the market prices.
Consistency Check
This interpretation provides a useful consistency check: if one calibrates a stochastic volatility model to market data and then computes the Gyongy projection, the result should match the Dupire local volatility surface computed directly from market prices:
\[ \mathbb{E}^{\text{SV}}\left[\sigma_t^2 \mid S_t = K\right] = \sigma_{\text{Dupire}}^2(K, t) \quad \text{for all } (K, t) \]
Equality holds if and only if the stochastic volatility model reproduces every market vanilla price. Discrepancies indicate calibration error.
Summary
Gyongy's theorem provides the foundational link between local and stochastic volatility:
- The conditional expectation formula \(\sigma_{\text{loc}}^2(K, t) = \mathbb{E}[\sigma_t^2 \mid S_t = K]\) defines local volatility as the Markovian projection of instantaneous variance
- Marginal matching: The local volatility model reproduces all one-dimensional marginals (and hence all vanilla option prices) of any underlying multi-factor model
- Uniqueness: Under the regularity conditions above, the Markovian projection is unique, so there is exactly one local volatility function consistent with a given set of marginal distributions
- Path dependence lost: While marginals match, the path-dependent behavior differs systematically, so the two models assign different prices to exotic options
- Dupire = Gyongy: Dupire's formula is the computational implementation of Gyongy's conditional expectation, connecting observable prices to the theoretical projection
- Bridge to SLV: The leverage function in stochastic local volatility models is calibrated using the Gyongy ratio \(L^2 = \sigma_{\text{loc}}^2 / \mathbb{E}[v_t \mid S_t = K]\)
Exercises
Exercise 1. State Gyongy's theorem in your own words. Why does the theorem guarantee that a one-factor local volatility model can reproduce all European option prices generated by a multi-factor stochastic volatility model?
Solution to Exercise 1
Gyongy's theorem states that for any Ito process \(X_t\) driven by possibly multi-factor dynamics (including hidden state variables \(Y_t\)), there exists a one-dimensional Markov diffusion \(\bar{X}_t\) whose drift and diffusion coefficients are conditional expectations of the original coefficients:
\[ \bar{\mu}(x, t) = \mathbb{E}\left[\mu(X_t, Y_t, t) \mid X_t = x\right], \qquad \bar{\sigma}^2(x, t) = \mathbb{E}\left[\sigma^2(X_t, Y_t, t) \mid X_t = x\right] \]
and \(\bar{X}_t\) has the same marginal distribution as \(X_t\) at every time \(t\).
Why this guarantees matching European prices: European option prices depend only on the marginal distribution of \(S_T\) at the single maturity \(T\):
\[ C(K, T) = e^{-rT} \, \mathbb{E}\left[(S_T - K)^+\right] \]
Since Gyongy's theorem guarantees \(\text{Law}(\bar{S}_T) = \text{Law}(S_T)\) for all \(T\), the expectation is the same under both models:
\[ e^{-rT} \, \mathbb{E}\left[(\bar{S}_T - K)^+\right] = e^{-rT} \, \mathbb{E}\left[(S_T - K)^+\right] \]
Therefore the one-factor local volatility model produces the same European call (and put) prices at every strike and maturity as the multi-factor stochastic volatility model.
Exercise 2. In the Heston model with \(dS_t = (r - q)S_t \, dt + \sqrt{v_t} S_t \, dW_t^{(1)}\) and \(dv_t = \kappa(\theta - v_t) \, dt + \xi\sqrt{v_t} \, dW_t^{(2)}\), the Gyongy projection gives \(\sigma_{\text{loc}}^2(K, t) = \mathbb{E}[v_t \mid S_t = K]\). Suppose at time \(t = 1\) the joint density of \((S_t, v_t)\) concentrates near two scenarios: \(S_t = 90\) with \(v_t \approx 0.06\) and \(S_t = 110\) with \(v_t \approx 0.03\). Estimate \(\sigma_{\text{loc}}(90, 1)\) and \(\sigma_{\text{loc}}(110, 1)\). Does this produce a downward-sloping local volatility smile?
Solution to Exercise 2
Given that the joint density concentrates near two scenarios at \(t = 1\):
- Scenario A: \(S_t = 90\) with \(v_t \approx 0.06\)
- Scenario B: \(S_t = 110\) with \(v_t \approx 0.03\)
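The Gyongy projection gives:
\[ \sigma_{\text{loc}}^2(K, t) = \mathbb{E}\left[v_t \mid S_t = K\right] \]
At \(K = 90\): The paths reaching \(S_t = 90\) predominantly come from Scenario A, so:
\[ \sigma_{\text{loc}}^2(90, 1) \approx 0.06, \qquad \sigma_{\text{loc}}(90, 1) \approx \sqrt{0.06} \approx 0.245 \]
At \(K = 110\): The paths reaching \(S_t = 110\) predominantly come from Scenario B, so:
\[ \sigma_{\text{loc}}^2(110, 1) \approx 0.03, \qquad \sigma_{\text{loc}}(110, 1) \approx \sqrt{0.03} \approx 0.173 \]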
Yes, this produces a downward-sloping local volatility smile. The local volatility decreases from \(0.245\) at \(K = 90\) to \(0.173\) at \(K = 110\), consistent with the typical equity skew. The negative correlation between spot and variance in the Heston model (low spot \(\leftrightarrow\) high variance) is captured by the conditional expectation, producing higher local volatility at lower strikes.
Exercise 3. Consider a stochastic local volatility (SLV) model \(dS_t = (r - q)S_t \, dt + L(S_t, t)\sqrt{v_t}S_t \, dW_t^{(1)}\). The leverage function satisfies \(L^2(K, t) = \sigma_{\text{loc}}^2(K, t) / \mathbb{E}[v_t \mid S_t = K]\). If \(\sigma_{\text{loc}}(K, t) = 0.25\) and \(\mathbb{E}[v_t \mid S_t = K] = 0.04\), compute \(L(K, t)\). What does it mean if \(L(K, t) > 1\)?
Solution to Exercise 3
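The leverage function satisfies:
\[ L^2(K, t) = \frac{\sigma_{\text{loc}}^2(K, t)}{\mathbb{E}\left[v_t \mid S_t = K\right]} \]
Given \(\sigma_{\text{loc}}(K, t) = 0.25\) and \(\mathbb{E}[v_t \mid S_t = K] = 0.04\):
\[ L^2(K, t) = \frac{0.25^2}{0.04} = \frac{0.0625}{0.04} = 1.5625, \qquad L(K, t) = \sqrt{1.5625} = 1.25 \]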
Interpretation of \(L(K, t) > 1\): The leverage function greater than 1 means the stochastic volatility component alone (with \(L = 1\)) is insufficient to generate the level of local volatility observed in the market at this strike-time point. The SLV model must amplify the stochastic volatility by a factor of \(L = 1.25\) to match the market-implied local volatility. This typically occurs when the pure stochastic volatility model underestimates the skew or the level of volatility at certain strikes, and the leverage function compensates by scaling up the effective diffusion coefficient.
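Exercise 4. Explain why Gyongy's theorem preserves single-maturity option prices but fails for path-dependent products. Give a concrete example of two models (one local volatility, one stochastic volatility) that agree on all European call prices but disagree on the price of a payoff written on realized variance, and explain why the fair strike of a continuously monitored variance swap is nevertheless the same in both models.
Solution to Exercise 4
Why marginals match but path-dependent products differ:
Gyongy's theorem guarantees \(\text{Law}(S_t) = \text{Law}(\bar{S}_t)\) at each individual time \(t\), which is sufficient for any payoff that depends on \(S_T\) at a single maturity. However, the theorem does not guarantee equality of joint distributions at multiple times, i.e., \(\text{Law}(S_{t_1}, S_{t_2}) \neq \text{Law}(\bar{S}_{t_1}, \bar{S}_{t_2})\) in general. Path-dependent products depend on the joint distribution across multiple times or on the entire trajectory.
Concrete example with realized variance. Consider the Heston model and its Gyongy-projected local volatility model, both calibrated to the same European option surface.
The fair variance swap strike is \(K_{\text{var}} = \mathbb{E}\left[\frac{1}{T}\int_0^T \sigma_t^2 \, dt\right]\). It is linear in the instantaneous variance, so the tower property and marginal matching give the same value in both models:
\[ \mathbb{E}\left[\int_0^T v_t \, dt\right] = \int_0^T \mathbb{E}\big[\mathbb{E}[v_t \mid S_t]\big] \, dt = \int_0^T \mathbb{E}\left[\sigma_{\text{loc}}^2(S_t, t)\right] dt = \int_0^T \mathbb{E}\left[\sigma_{\text{loc}}^2(\bar{S}_t, t)\right] dt \]
This agrees with the replication of a continuously monitored variance swap by a log contract, which is a European payoff. What differs is the distribution of realized variance around this common mean:
- Under the Heston model: The realized variance \(\int_0^T v_t \, dt\) has a distribution with significant right-tail weight due to the stochastic nature of \(v_t\). The variance of realized variance is positive, driven by the vol-of-vol parameter \(\xi > 0\).
- Under local volatility: The instantaneous variance is \(\sigma_{\text{loc}}^2(S_t, t)\), a deterministic function of spot. While \(\int_0^T \sigma_{\text{loc}}^2(S_t, t) \, dt\) is random (through \(S_t\)), the randomness comes only from one source (the spot), producing a thinner-tailed distribution of realized variance.
Payoffs that are nonlinear in realized variance, such as a call on realized variance \(\left(\frac{1}{T}\int_0^T \sigma_t^2 \, dt - K\right)^+\) or a volatility swap paying \(\sqrt{\frac{1}{T}\int_0^T \sigma_t^2 \, dt}\), will therefore generally be priced differently, because they depend on the autocorrelation structure of volatility along paths, which is fundamentally different in the two models. A discretely monitored variance swap depends on the joint law of consecutive fixings, so its strike can differ slightly between the models, but the gap vanishes as the monitoring frequency increases.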
Exercise 5. The proof sketch of Gyongy's theorem shows that the marginal density \(p(x, t)\) of the original process satisfies the same Fokker-Planck equation as the Markovian projection \(\bar{p}(x, t)\). The key step involves integrating the joint Fokker-Planck equation over the hidden variable \(Y\). Show that:
\[ \int \mu(x, y, t) \, p(x, y, t) \, dy = \mathbb{E}\left[\mu(X_t, Y_t, t) \mid X_t = x\right] p(x, t) \]
and explain why uniqueness of the Fokker-Planck solution guarantees \(p = \bar{p}\).
Solution to Exercise 5
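The joint density of \((X_t, Y_t)\) is \(p(x, y, t)\). The marginal density of \(X_t\) is:
\[ p(x, t) = \int p(x, y, t) \, dy \]
By the definition of conditional expectation:
\[ \mathbb{E}\left[\mu(X_t, Y_t, t) \mid X_t = x\right] = \int \mu(x, y, t) \, \frac{p(x, y, t)}{p(x, t)} \, dy \]
Therefore:
\[ \int \mu(x, y, t) \, p(x, y, t) \, dy = \mathbb{E}\left[\mu(X_t, Y_t, t) \mid X_t = x\right] p(x, t) = \bar{\mu}(x, t) \, p(x, t) \]
This identity follows directly from the definition of conditional expectation: the numerator is the joint expectation weighted by the joint density, and dividing by the marginal density gives the conditional expectation.
Why uniqueness guarantees \(p = \bar{p}\): After integrating the joint Fokker-Planck equation over \(y\), the marginal density \(p(x, t)\) satisfies:
\[ \frac{\partial p}{\partial t}(x, t) = -\frac{\partial}{\partial x}\left[\bar{\mu}(x, t) \, p(x, t)\right] + \frac{1}{2} \frac{\partial^2}{\partial x^2}\left[\bar{\sigma}^2(x, t) \, p(x, t)\right] \]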
The density \(\bar{p}(x, t)\) of the Markov process \(\bar{X}_t\) satisfies exactly the same PDE (by construction of \(\bar{X}_t\) with drift \(\bar{\mu}\) and diffusion \(\bar{\sigma}\)), with the same initial condition \(\bar{p}(x, 0) = \delta(x - x_0) = p(x, 0)\). Under the regularity conditions (uniform ellipticity and Holder continuity of \(\bar{\sigma}\)), the parabolic PDE theory guarantees a unique solution. Since both \(p\) and \(\bar{p}\) solve the same PDE with the same initial data, \(p(x, t) = \bar{p}(x, t)\) for all \(x\) and \(t\).
Exercise 6. Two methods for computing the Gyongy projection are: (a) Monte Carlo with conditional averaging, and (b) Dupire's formula applied to semi-analytic call prices. Compare these two approaches in terms of accuracy, computational cost, and smoothness of the resulting local volatility surface. In what situation would you prefer method (a) over method (b)?
Solution to Exercise 6
Method (a): Monte Carlo with conditional averaging.
- Accuracy: Subject to statistical noise from finite sample size. The conditional expectation \(\mathbb{E}[\sigma_t^2 \mid S_t = K]\) requires binning paths by spot level, and bins in the tails contain few paths, leading to high variance estimates.
- Computational cost: Requires simulating the full multi-factor model (\(N\) paths \(\times\) \(M\) time steps), then sorting and binning. Cost scales as \(O(NM)\) plus overhead for binning. For smooth results, \(N\) must be large (e.g., \(10^5\) to \(10^6\)).
- Smoothness: The raw output is noisy and requires additional smoothing (kernel regression, spline fitting, etc.), introducing bias-variance tradeoffs.
Method (b): Dupire's formula applied to semi-analytic prices.
- Accuracy: High, since semi-analytic call prices (e.g., via Fourier inversion for Heston) can be computed to high precision. The main source of error is the numerical differentiation in the Dupire formula.
- Computational cost: Requires evaluating the characteristic function at a grid of \((K, T)\) points. For models with closed-form characteristic functions (Heston, Bates, etc.), this is fast: \(O(N_K \times N_T)\) Fourier integrals, or a single FFT per maturity covering all strikes at once.
- Smoothness: The resulting surface is smooth provided the input prices are smooth and the numerical differentiation is done carefully (e.g., with appropriate finite difference stencils or analytic derivatives).
When to prefer method (a): Method (a) is preferred when the stochastic volatility model does not have a semi-analytic characteristic function or Fourier-invertible call prices. Examples include path-dependent volatility models, models with jumps in both spot and volatility without tractable transforms, or general non-Markovian models where no PDE or Fourier approach is available. In such cases, Monte Carlo simulation of the full model is the only feasible approach, and conditional averaging is the natural way to extract the Gyongy projection.