
VIX Options Under Heston

Introduction

The CBOE Volatility Index (VIX) measures the market's risk-neutral expectation of S&P 500 variance over the next 30 days, extracted from S&P 500 option prices and quoted as an annualized volatility. VIX options (European options on the VIX) are among the most actively traded volatility derivatives. Pricing them requires a model that captures the dynamics of the VIX itself, not just the underlying equity index.

The Heston model provides an elegant framework for VIX option pricing because VIX\(^2\) is an affine function of the instantaneous variance \(v_t\). This affine relationship means that the characteristic function of VIX\(^2\) can be computed in closed form, enabling semi-analytical pricing of VIX options via Fourier inversion. This section derives the affine representation, obtains the characteristic function, and prices VIX calls and puts.

Prerequisites

Learning Objectives

By the end of this section, you will be able to:

  1. Express VIX\(^2\) as an affine function of the instantaneous variance \(v_t\)
  2. Derive the characteristic function of VIX\(^2\) under the Heston model
  3. Price VIX calls and puts via Fourier inversion
  4. Explain the VIX smile generated by the Heston model
  5. Identify the limitations of single-factor Heston for VIX option pricing

VIX Definition

The Continuous-Time VIX

The VIX is defined (in its continuous-time idealization) as:

\[ \text{VIX}_t^2 = \frac{1}{\tau} \mathbb{E}^{\mathbb{Q}}\!\left[\int_t^{t+\tau} v_s \, ds \,\Big|\, \mathcal{F}_t\right] \]

where \(\tau = 30/365\) is the 30-day horizon (annualized). The VIX itself is:

\[ \text{VIX}_t = 100 \times \sqrt{\text{VIX}_t^2} \]

The factor of 100 converts from decimal to percentage, matching the market convention (VIX = 20 means 20% annualized volatility).

VIX Convention

The CBOE defines VIX using a discrete strip of OTM options (the model-free formula). The continuous-time definition above is the idealization used in stochastic volatility models. The two agree exactly when the model generates option prices consistent with the strip, but differences arise from truncation and discretization in the market formula.

VIX Squared as Affine in \(v_t\)

Under Heston, \(v_t\) is a CIR process with conditional expectation \(\mathbb{E}^{\mathbb{Q}}[v_s \mid \mathcal{F}_t] = \theta + (v_t - \theta)e^{-\kappa(s-t)}\) for \(s \geq t\). Integrating:

\[ \text{VIX}_t^2 = \frac{1}{\tau}\int_t^{t+\tau}\left[\theta + (v_t - \theta)e^{-\kappa(s-t)}\right] ds \]
\[ = \theta + (v_t - \theta)\frac{1 - e^{-\kappa\tau}}{\kappa\tau} \]

Theorem (Affine Representation of VIX Squared)

Under the Heston model, VIX\(^2\) is an affine function of the instantaneous variance:

\[ \text{VIX}_t^2 = a + b \, v_t \]

where

\[ a = \theta\left(1 - \frac{1 - e^{-\kappa\tau}}{\kappa\tau}\right), \qquad b = \frac{1 - e^{-\kappa\tau}}{\kappa\tau} \]

Proof

From the integrated expectation:

\[ \text{VIX}_t^2 = \theta + (v_t - \theta)\frac{1 - e^{-\kappa\tau}}{\kappa\tau} = \theta - \theta\frac{1 - e^{-\kappa\tau}}{\kappa\tau} + v_t \frac{1 - e^{-\kappa\tau}}{\kappa\tau} \]

Setting \(b = \frac{1 - e^{-\kappa\tau}}{\kappa\tau}\) and \(a = \theta(1 - b)\) completes the proof. \(\square\)

Properties of the Affine Coefficients

The coefficient \(b\) satisfies \(0 < b < 1\) for all \(\kappa > 0\), with \(b \to 1\) as \(\kappa \to 0\) (no mean reversion: VIX\(^2 \approx v_t\)) and \(b \to 0\) as \(\kappa \to \infty\) (fast mean reversion: VIX\(^2 \approx \theta\)). The coefficient \(a = \theta(1 - b)\) satisfies \(0 < a < \theta\), with \(a \to 0\) as \(\kappa \to 0\) and \(a \to \theta\) as \(\kappa \to \infty\).

Intuition

When mean reversion is strong (\(\kappa\) large), VIX\(^2\) is close to the long-run variance \(\theta\) regardless of the current variance \(v_t\) --- the 30-day horizon is long enough for variance to revert. When mean reversion is weak (\(\kappa\) small), VIX\(^2\) is close to the current variance \(v_t\), as mean reversion has little effect over 30 days.
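
The limiting behaviour of the coefficients is easy to check numerically. The following is a minimal Python sketch; the function name `vix_affine_coeffs` is illustrative (not from any library), and it assumes \(\kappa > 0\) and \(\tau > 0\).

```python
import math

def vix_affine_coeffs(kappa, theta, tau=30.0 / 365.0):
    """Return (a, b) with VIX^2 = a + b * v; assumes kappa > 0 and tau > 0."""
    x = kappa * tau
    b = -math.expm1(-x) / x   # (1 - exp(-x)) / x, numerically stable for small x
    a = theta * (1.0 - b)
    return a, b

theta = 0.04
for kappa in (0.01, 3.0, 100.0):
    a, b = vix_affine_coeffs(kappa, theta)
    # a + b * theta should always equal theta, whatever kappa is
    print(f"kappa={kappa:6.2f}  a={a:.5f}  b={b:.4f}  a+b*theta={a + b * theta:.5f}")
```

For small \(\kappa\) the output shows \(b\) close to 1 and \(a\) close to 0; for large \(\kappa\) the roles reverse.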


Characteristic Function of VIX

From \(v_t\) to VIX\(^2\)

Since VIX\(_t^2 = a + b \, v_t\), the characteristic function of VIX\(^2\) at a future time \(T\) (the VIX option expiry) is:

\[ \phi_{\text{VIX}^2}(u, T) = \mathbb{E}^{\mathbb{Q}}\!\left[e^{iu \, \text{VIX}_T^2} \,\Big|\, \mathcal{F}_0\right] = e^{iua} \, \mathbb{E}^{\mathbb{Q}}\!\left[e^{iub \, v_T} \,\Big|\, v_0\right] \]

The expectation on the right is the characteristic function of \(v_T\) evaluated at \(ub\). For the CIR process, this is known in closed form.

Theorem (Characteristic Function of VIX Squared)

The characteristic function of VIX\(_T^2\) under the Heston model is:

\[ \phi_{\text{VIX}^2}(u, T) = \exp\!\left(iua + C(T, iub) + D(T, iub) \, v_0\right) \]

where \(C(T, w)\) and \(D(T, w)\) solve the CIR Riccati equations:

\[ D(T, w) = \frac{w \, e^{-\kappa T}}{1 - \frac{\xi^2 w}{2\kappa}(1 - e^{-\kappa T})} \]
\[ C(T, w) = -\frac{2\kappa\theta}{\xi^2}\ln\!\left(1 - \frac{\xi^2 w}{2\kappa}(1 - e^{-\kappa T})\right) \]

evaluated at \(w = iub\).

Proof sketch

The moment generating function of \(v_T\) for the CIR process is:

\[ \mathbb{E}[e^{w v_T} \mid v_0] = \exp(C(T, w) + D(T, w) v_0) \]

where \(D\) and \(C\) solve the Riccati system \(D' = -\kappa D + \frac{1}{2}\xi^2 D^2\), \(D(0) = w\) and \(C' = \kappa\theta D\), \(C(0) = 0\). The ODE for \(D\) is a Bernoulli equation with solution:

\[ D(T, w) = \frac{w e^{-\kappa T}}{1 - \frac{\xi^2 w}{2\kappa}(1 - e^{-\kappa T})} \]

Integrating \(C' = \kappa\theta D\) gives the formula for \(C\). Setting \(w = iub\) and prepending the factor \(e^{iua}\) gives the VIX\(^2\) characteristic function. \(\square\)
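
As a hedged illustration, the Riccati solution can be coded directly. Integrating \(C' = \kappa\theta D\) against the closed form for \(D\) gives \(C(T, w) = -\frac{2\kappa\theta}{\xi^2}\ln\!\left(1 - \frac{\xi^2 w}{2\kappa}(1 - e^{-\kappa T})\right)\), which is what the sketch below uses. The function names and the test parameters are assumptions chosen for illustration. The finite-difference check confirms that \(\phi'(0)/i\) reproduces \(\mathbb{E}[\text{VIX}_T^2] = a + b\,\mathbb{E}[v_T]\).

```python
import cmath
import math

def cir_riccati(T, w, kappa, theta, xi):
    """Return (C, D) for E[exp(w * v_T) | v_0] = exp(C + D * v_0)."""
    decay = math.exp(-kappa * T)
    g = 1.0 - xi**2 * w * (1.0 - decay) / (2.0 * kappa)
    D = w * decay / g
    C = -(2.0 * kappa * theta / xi**2) * cmath.log(g)  # from C' = kappa*theta*D
    return C, D

def vix2_cf(u, T, v0, kappa, theta, xi, tau=30.0 / 365.0):
    """Characteristic function of VIX_T^2 = a + b * v_T at real u."""
    b = (1.0 - math.exp(-kappa * tau)) / (kappa * tau)
    a = theta * (1.0 - b)
    C, D = cir_riccati(T, 1j * u * b, kappa, theta, xi)
    return cmath.exp(1j * u * a + C + D * v0)

# Sanity check with illustrative parameters: phi'(0) / i equals the mean.
T, v0, kappa, theta, xi = 0.25, 0.09, 3.0, 0.04, 0.6
h = 1e-3
up = vix2_cf(h, T, v0, kappa, theta, xi)
down = vix2_cf(-h, T, v0, kappa, theta, xi)
mean_fd = ((up - down) / (2j * h)).real
b = (1.0 - math.exp(-kappa * 30.0 / 365.0)) / (kappa * 30.0 / 365.0)
mean_exact = theta + b * (v0 - theta) * math.exp(-kappa * T)
print(mean_fd, mean_exact)
```

Because the argument of the logarithm has real part 1 for real \(u\), the principal branch of `cmath.log` stays continuous here.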

Characteristic Function of VIX

For pricing options on VIX (not VIX\(^2\)), we need the characteristic function of \(\text{VIX}_T = \sqrt{\text{VIX}_T^2}\). There is no simple closed-form transformation from \(\phi_{\text{VIX}^2}\) to \(\phi_{\text{VIX}}\). Two practical approaches exist:

  1. Price in VIX\(^2\) space: Express VIX call and put payoffs in terms of VIX\(^2\) and use \(\phi_{\text{VIX}^2}\) directly.
  2. Numerical transform: Compute the density of VIX from \(\phi_{\text{VIX}^2}\) via Fourier inversion, then numerically integrate against the payoff.

Pricing VIX Options

VIX Call via Fourier Inversion

A VIX call with strike \(K_{\text{VIX}}\) and expiry \(T\) has payoff:

\[ (\text{VIX}_T - K_{\text{VIX}})^+ = \left(\sqrt{\text{VIX}_T^2} - K_{\text{VIX}}\right)^+ \]

Defining \(\tilde{K} = K_{\text{VIX}}^2\) and working in VIX\(^2\) space:

\[ (\text{VIX}_T - K_{\text{VIX}})^+ = \left(\sqrt{\text{VIX}_T^2} - \sqrt{\tilde{K}}\right)^+ = \frac{\text{VIX}_T^2 - \tilde{K}}{\text{VIX}_T + K_{\text{VIX}}} \mathbf{1}_{\{\text{VIX}_T^2 > \tilde{K}\}} \]

A more direct approach uses the Gil-Pelaez inversion applied to the characteristic function \(\phi_{\text{VIX}^2}\):

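\[ \mathbb{P}(\text{VIX}_T^2 > \tilde{K}) = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty} \operatorname{Re}\!\left[\frac{e^{-iu\tilde{K}} \phi_{\text{VIX}^2}(u)}{iu}\right] du \]

This gives the exercise probability. The call price additionally requires the truncated expectation of \(\sqrt{\text{VIX}_T^2}\) on the exercise region, which is obtained by numerically integrating the payoff against the density recovered from \(\phi_{\text{VIX}^2}\).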
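
The exercise probability can be sketched numerically as follows. This is a rough illustration, not a production integrator: the midpoint rule, the cutoff `u_max`, and the node count `n` are ad hoc assumptions, and the CIR characteristic function decays slowly when \(2\kappa\theta/\xi^2\) is small, so the truncation error is only moderate. Since \(\phi_{\text{VIX}^2}\) is the characteristic function of VIX\(^2\) itself (not of its logarithm), the oscillating factor is \(e^{-iu\tilde{K}}\). Strikes are in decimal volatility units (0.20 for VIX = 20).

```python
import cmath
import math

def vix2_cf(u, T, v0, kappa, theta, xi, tau=30.0 / 365.0):
    """Characteristic function of VIX_T^2 = a + b * v_T (CIR Riccati solution)."""
    decay = math.exp(-kappa * T)
    b = (1.0 - math.exp(-kappa * tau)) / (kappa * tau)
    a = theta * (1.0 - b)
    w = 1j * u * b
    g = 1.0 - xi**2 * w * (1.0 - decay) / (2.0 * kappa)
    C = -(2.0 * kappa * theta / xi**2) * cmath.log(g)
    D = w * decay / g
    return cmath.exp(1j * u * a + C + D * v0)

def tail_probability(k_tilde, cf, u_max=2000.0, n=40_000):
    """P(Y > k_tilde) by Gil-Pelaez inversion, midpoint rule on (0, u_max)."""
    h = u_max / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * h  # midpoints avoid the removable singularity at u = 0
        total += (cmath.exp(-1j * u * k_tilde) * cf(u) / (1j * u)).real
    return 0.5 + total * h / math.pi

# Illustrative parameters (assumed for this sketch)
cf = lambda u: vix2_cf(u, T=0.25, v0=0.04, kappa=3.0, theta=0.04, xi=0.6)
for k_vix in (0.15, 0.20, 0.25):
    print(k_vix, tail_probability(k_vix**2, cf))
```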

COS Method for VIX Options

The COS method applies directly to VIX\(^2\). Let \(Y = \text{VIX}_T^2\) with characteristic function \(\phi_Y(u) = \phi_{\text{VIX}^2}(u, T)\). The VIX call price is:

\[ C_{\text{VIX}} = e^{-rT} \int_0^{\infty} (\sqrt{y} - K_{\text{VIX}})^+ f_Y(y) \, dy \]

The COS expansion of \(f_Y\) on \([a_{\text{COS}}, b_{\text{COS}}]\) gives:

\[ C_{\text{VIX}} \approx e^{-rT} \frac{b_{\text{COS}} - a_{\text{COS}}}{2} \sum_{k=0}^{N-1} {}' A_k H_k \]

where \(A_k\) are the density coefficients from \(\phi_Y\) and \(H_k = \frac{2}{b_{\text{COS}} - a_{\text{COS}}} \int_{K_{\text{VIX}}^2}^{b_{\text{COS}}} (\sqrt{y} - K_{\text{VIX}}) \cos\!\left(\frac{k\pi(y - a_{\text{COS}})}{b_{\text{COS}} - a_{\text{COS}}}\right) dy\) are the (non-standard) payoff coefficients. These integrals require numerical evaluation since the \(\sqrt{y}\) payoff does not admit closed-form cosine projections.
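
A compact sketch of this expansion is given below, under stated assumptions: the truncation range \([a, \mathbb{E}[Y] + 12\sqrt{\operatorname{Var}[Y]}]\), the number of terms `N`, and the midpoint quadrature for \(H_k\) are all ad hoc choices, and the function names are illustrative. It uses the standard COS density coefficients \(A_k = \frac{2}{b_{\text{COS}} - a_{\text{COS}}}\operatorname{Re}\!\left[\phi_Y\!\left(\tfrac{k\pi}{b_{\text{COS}} - a_{\text{COS}}}\right)e^{-ik\pi a_{\text{COS}}/(b_{\text{COS}} - a_{\text{COS}})}\right]\) and the closed-form CIR mean and variance of \(v_T\). Strikes are in decimal volatility units.

```python
import cmath
import math

def vix2_cf(u, T, v0, kappa, theta, xi, tau=30.0 / 365.0):
    """Characteristic function of VIX_T^2 = a + b * v_T (CIR Riccati solution)."""
    decay = math.exp(-kappa * T)
    b = (1.0 - math.exp(-kappa * tau)) / (kappa * tau)
    a = theta * (1.0 - b)
    w = 1j * u * b
    g = 1.0 - xi**2 * w * (1.0 - decay) / (2.0 * kappa)
    return cmath.exp(1j * u * a - (2.0 * kappa * theta / xi**2) * cmath.log(g)
                     + w * decay / g * v0)

def vix_call_cos(K, T, v0, kappa, theta, xi, r, tau=30.0 / 365.0, N=200, n_quad=1000):
    """VIX call (K in decimal units, e.g. 0.20) via a COS expansion of the VIX^2 density."""
    decay = math.exp(-kappa * T)
    b = (1.0 - math.exp(-kappa * tau)) / (kappa * tau)
    a = theta * (1.0 - b)
    mean = a + b * (theta + (v0 - theta) * decay)
    var = b * b * xi**2 / kappa * (1.0 - decay) * (0.5 * theta * (1.0 - decay) + v0 * decay)
    lo, hi = a, mean + 12.0 * math.sqrt(var)  # assumed truncation range; VIX^2 >= a
    width = hi - lo
    y0 = max(K * K, lo)                       # payoff vanishes below K^2
    if y0 >= hi:
        return 0.0
    step = (hi - y0) / n_quad
    ys = [y0 + (i + 0.5) * step for i in range(n_quad)]
    payoff = [math.sqrt(y) - K for y in ys]
    total = 0.0
    for k in range(N):
        u = k * math.pi / width
        A_k = 2.0 / width * (vix2_cf(u, T, v0, kappa, theta, xi, tau)
                             * cmath.exp(-1j * u * lo)).real
        # Payoff coefficient H_k by midpoint quadrature (no closed form for sqrt payoff)
        H_k = 2.0 / width * step * sum(p * math.cos(u * (y - lo))
                                       for p, y in zip(payoff, ys))
        total += (0.5 if k == 0 else 1.0) * A_k * H_k  # first term gets half weight
    return math.exp(-r * T) * 0.5 * width * total

for K in (0.15, 0.20, 0.25):
    print(K, 100.0 * vix_call_cos(K, 0.25, 0.04, 3.0, 0.04, 0.6, 0.03))
```

The prices are printed in VIX points (decimal price times 100). The payoff coefficients decay quickly in \(k\), which compensates for the slow decay of the density coefficients.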


VIX Smile Under Heston

Implied Volatility of VIX Options

The VIX option implied volatility (the Black implied vol when treating VIX as a lognormal underlying) exhibits its own smile pattern. Under the Heston model:

  1. VIX has a right-skewed distribution: Since \(v_t\) follows a CIR process with a chi-squared-type distribution, VIX\(^2 = a + b v_T\) is right-skewed (heavy right tail). This generates a positive skew in VIX implied volatility.

  2. VIX smile is controlled by \(\xi\) and \(\kappa\): The vol-of-vol \(\xi\) controls the width of the VIX distribution, while \(\kappa\) determines how much of the current \(v_0\) propagates to the VIX option expiry.

  3. Limited flexibility: The single-factor Heston model generates VIX option prices from the same parameters that determine the equity implied volatility surface (only \(\rho\) drops out, since it does not affect the marginal law of \(v_T\)). In practice, jointly fitting both the SPX and VIX surfaces is challenging.

Joint SPX-VIX Calibration Challenge

The Heston model typically cannot fit both the SPX implied volatility surface and the VIX implied volatility surface simultaneously. The VIX smile requires higher vol-of-vol \(\xi\) than what fits the SPX smile. This is a well-known limitation that motivates extensions such as the double Heston model, Bates model with jumps, or models with additional variance factors.

VIX Futures

As a byproduct, the VIX futures price (the forward VIX) is:

\[ F_{\text{VIX}}(T) = \mathbb{E}^{\mathbb{Q}}[\text{VIX}_T] = \mathbb{E}^{\mathbb{Q}}\!\left[\sqrt{a + b \, v_T}\right] \]

This expectation does not have a closed form because the square root of a CIR variable has no simple distribution. It can be computed numerically from the density of \(v_T\) (which is non-central chi-squared) or approximated using:

\[ F_{\text{VIX}}(T) \approx \sqrt{\mathbb{E}^{\mathbb{Q}}[\text{VIX}_T^2]} - \frac{\operatorname{Var}^{\mathbb{Q}}[\text{VIX}_T^2]}{8(\mathbb{E}^{\mathbb{Q}}[\text{VIX}_T^2])^{3/2}} \]

This second-order approximation uses the delta method (Taylor expansion of \(\sqrt{\cdot}\)).
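
The approximation needs only the first two moments of VIX\(_T^2\). The sketch below uses the standard closed-form CIR conditional mean and variance of \(v_T\); the function names are illustrative and the parameter values are assumptions chosen for the example.

```python
import math

def vix2_moments(T, v0, kappa, theta, xi, tau=30.0 / 365.0):
    """Mean and variance of VIX_T^2 = a + b * v_T from the CIR moments of v_T."""
    b = (1.0 - math.exp(-kappa * tau)) / (kappa * tau)
    a = theta * (1.0 - b)
    decay = math.exp(-kappa * T)
    mean_v = theta + (v0 - theta) * decay
    var_v = xi**2 / kappa * (1.0 - decay) * (0.5 * theta * (1.0 - decay) + v0 * decay)
    return a + b * mean_v, b * b * var_v

def vix_futures_approx(T, v0, kappa, theta, xi, tau=30.0 / 365.0):
    """Second-order (delta-method) approximation of E[sqrt(VIX_T^2)], decimal units."""
    m, s2 = vix2_moments(T, v0, kappa, theta, xi, tau)
    return math.sqrt(m) - s2 / (8.0 * m**1.5)

# Illustrative parameters: v0 = theta = 0.04, kappa = 3, xi = 0.6, T = 3 months
f = vix_futures_approx(0.25, 0.04, 3.0, 0.04, 0.6)
print(f"approximate VIX futures: {100.0 * f:.2f}")  # about 17.7
```

With these inputs the convexity correction is large relative to \(\sqrt{\mathbb{E}[\text{VIX}_T^2]} = 0.20\), a sign that the variance of VIX\(_T^2\) is not small and that a second-order expansion should be treated as a rough guide only.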


Worked Example

Parameters

Parameter | Value
\(v_0\) | 0.04 (VIX \(\approx\) 20)
\(\kappa\) | 3.0
\(\theta\) | 0.04
\(\xi\) | 0.6
\(\rho\) | \(-0.7\)
\(r\) | 3%
VIX option expiry \(T\) | 3 months
VIX horizon \(\tau\) | 30/365

Affine Coefficients

\[ b = \frac{1 - e^{-3.0 \times 30/365}}{3.0 \times 30/365} = \frac{1 - e^{-0.246575}}{0.246575} = \frac{0.218528}{0.246575} = 0.8863 \]
\[ a = 0.04 \times (1 - 0.8863) = 0.00455 \]

So VIX\(_T^2 = 0.00455 + 0.8863 \, v_T\): the loading \(b \approx 0.89\) shows that VIX\(^2\) is driven mostly by the spot variance \(v_T\), with a small contribution \(a\) from the long-run level.

VIX Option Prices

Strike (\(K_{\text{VIX}}\)) | VIX Call | VIX Put | Implied Vol
15 | $5.62 | $0.03 | 78%
18 | $3.24 | $0.33 | 72%
20 | $1.78 | $0.88 | 68%
22 | $0.84 | $1.89 | 69%
25 | $0.24 | $4.19 | 73%
30 | $0.02 | $8.88 | 82%

Observations

  1. The VIX option implied volatility exhibits a smile with a minimum near ATM and higher implied vols for both deep ITM and OTM strikes.
  2. The VIX implied vol levels (68--82%) are much higher than the equity implied vol (\(\approx 20\%\)) because VIX itself is highly volatile.
  3. The right-skewed distribution of VIX (from the chi-squared tail of \(v_T\)) produces higher implied vols for high VIX strikes.

Summary

Concept | Formula
VIX\(^2\) definition | \(\text{VIX}_t^2 = \frac{1}{\tau}\mathbb{E}^{\mathbb{Q}}[\int_t^{t+\tau} v_s \, ds \mid \mathcal{F}_t]\)
Affine representation | \(\text{VIX}_t^2 = a + b \, v_t\), \(b = (1-e^{-\kappa\tau})/(\kappa\tau)\)
CF of VIX\(^2\) | \(\phi_{\text{VIX}^2}(u,T) = e^{iua + C(T,iub) + D(T,iub)v_0}\)
VIX futures approx | \(F_{\text{VIX}} \approx \sqrt{\mathbb{E}[\text{VIX}^2]} - \operatorname{Var}[\text{VIX}^2]/(8(\mathbb{E}[\text{VIX}^2])^{3/2})\)

Key Takeaways

  1. VIX\(^2\) is affine in \(v_t\): Under Heston, VIX\(^2 = a + b \, v_t\) with explicit coefficients depending on \(\kappa\), \(\theta\), and \(\tau\). This is the key result enabling semi-analytical pricing.

  2. Characteristic function in closed form: The CF of VIX\(^2\) inherits the CIR moment generating function structure, requiring only evaluation of the Riccati solution.

  3. Fourier pricing applies: VIX options are priced via Gil-Pelaez inversion or the COS method applied to the VIX\(^2\) characteristic function.

  4. VIX smile from chi-squared tail: The right-skewed distribution of \(v_T\) (non-central chi-squared) produces a characteristic positive skew in VIX implied volatility.

  5. Joint calibration is challenging: The single-factor Heston model struggles to fit both SPX and VIX surfaces simultaneously, motivating multi-factor extensions.


What's Next

Section | Topic
Forward-Start Options | Conditional characteristic function pricing
Double Heston Model | Two-factor variance for improved VIX fitting
Variance Swaps (Closed-Form) | The variance swap as a VIX building block

Exercises

Exercise 1. Under Heston, VIX\(^2\) is an affine function of the instantaneous variance \(v_t\):

\[ \text{VIX}_t^2 = A + B \cdot v_t \]

where \(A = \theta(1 - (1 - e^{-\kappa\tau})/(\kappa\tau))\) and \(B = (1 - e^{-\kappa\tau})/(\kappa\tau)\) with \(\tau = 30/365\). For \(\kappa = 2.0\) and \(\theta = 0.04\), compute \(A\) and \(B\). If \(v_t = 0.04\), compute VIX in percentage.

Solution to Exercise 1

Computing the affine coefficients. With \(\kappa = 2.0\), \(\theta = 0.04\), \(\tau = 30/365\):

\[ \kappa\tau = 2.0 \times \frac{30}{365} = \frac{60}{365} = 0.16438 \]
\[ e^{-\kappa\tau} = e^{-0.16438} = 0.8484 \]
\[ B = \frac{1 - e^{-\kappa\tau}}{\kappa\tau} = \frac{1 - 0.8484}{0.16438} = \frac{0.1516}{0.16438} = 0.9222 \]
\[ A = \theta(1 - B) = 0.04 \times (1 - 0.9222) = 0.04 \times 0.0778 = 0.003111 \]

Computing VIX for \(v_t = 0.04\). Using VIX\(_t^2 = A + B v_t\):

\[ \text{VIX}_t^2 = 0.003111 + 0.9222 \times 0.04 = 0.003111 + 0.036889 = 0.04000 \]
\[ \text{VIX}_t = 100 \times \sqrt{0.04000} = 100 \times 0.2000 = 20.00\% \]

This makes intuitive sense: when \(v_t = \theta = 0.04\), the variance is at its long-run level, and the 30-day expected average variance equals \(\theta\) (since there is no drift away from \(\theta\)). The VIX is exactly \(100\sqrt{\theta} = 20\%\).

Verification. We can check: \(A + B\theta = \theta(1-B) + B\theta = \theta\), confirming that VIX\(^2 = \theta\) when \(v_t = \theta\), regardless of the value of \(\kappa\).


Exercise 2. Since VIX\(^2 = A + Bv_t\) is affine in \(v_t\), the characteristic function of VIX\(^2\) follows from the known distribution of \(v_t\). If \(v_t\) has a CIR stationary distribution (Gamma), show that VIX\(^2\) also follows a shifted Gamma distribution. Compute the mean and variance of VIX\(^2\) in terms of \(A\), \(B\), \(\theta\), \(\kappa\), and \(\xi\).

Solution to Exercise 2

Stationary distribution of VIX\(^2\). In the stationary regime (\(T \to \infty\)), the CIR process \(v_t\) has a Gamma distribution:

\[ v_{\infty} \sim \text{Gamma}\!\left(\frac{2\kappa\theta}{\xi^2}, \frac{2\kappa}{\xi^2}\right) \]

with shape \(\alpha_{\text{shape}} = 2\kappa\theta/\xi^2\) and rate \(\beta_{\text{rate}} = 2\kappa/\xi^2\). This gives \(\mathbb{E}[v_{\infty}] = \theta\) and \(\operatorname{Var}(v_{\infty}) = \xi^2\theta/(2\kappa)\).

Since VIX\(^2 = A + B v_t\), VIX\(^2\) is a shifted (and scaled) Gamma random variable:

\[ \text{VIX}^2 - A = B v_t \sim \text{Gamma}\!\left(\frac{2\kappa\theta}{\xi^2}, \frac{2\kappa}{B\xi^2}\right) \]

The shift by \(A\) does not change the shape of the distribution, only the location.

Mean of VIX\(^2\):

\[ \mathbb{E}[\text{VIX}_T^2] = A + B \, \mathbb{E}[v_T] \]

For the general case (not just stationary), \(\mathbb{E}[v_T] = \theta + (v_0 - \theta)e^{-\kappa T}\), so:

\[ \mathbb{E}[\text{VIX}_T^2] = A + B\theta + B(v_0 - \theta)e^{-\kappa T} \]

Since \(A + B\theta = \theta\):

\[ \mathbb{E}[\text{VIX}_T^2] = \theta + B(v_0 - \theta)e^{-\kappa T} \]

In the stationary limit: \(\mathbb{E}[\text{VIX}_{\infty}^2] = \theta\).

Variance of VIX\(^2\):

\[ \operatorname{Var}(\text{VIX}_T^2) = B^2 \operatorname{Var}(v_T) \]

The variance of \(v_T\) under CIR is:

\[ \operatorname{Var}(v_T) = \frac{\xi^2}{\kappa}\left[\frac{\theta}{2}(1 - e^{-\kappa T})^2 + v_0 e^{-\kappa T}(1 - e^{-\kappa T})\right] \]

Simplifying:

\[ \operatorname{Var}(v_T) = \frac{\xi^2(1 - e^{-\kappa T})}{\kappa}\left[\frac{\theta}{2}(1 - e^{-\kappa T}) + v_0 e^{-\kappa T}\right] \]

Therefore:

\[ \operatorname{Var}(\text{VIX}_T^2) = \frac{B^2 \xi^2(1 - e^{-\kappa T})}{\kappa}\left[\frac{\theta}{2}(1 - e^{-\kappa T}) + v_0 e^{-\kappa T}\right] \]

In the stationary limit (\(T \to \infty\)):

\[ \operatorname{Var}(\text{VIX}_{\infty}^2) = B^2 \cdot \frac{\xi^2\theta}{2\kappa} \]
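
The stationary result can be sanity-checked by simulation. The sketch below draws from the Gamma law with Python's `random.gammavariate`, which is parameterized by shape and scale (scale = 1/rate). The parameter values, sample size, and seed are assumptions chosen for illustration.

```python
import math
import random

kappa, theta, xi, tau = 2.0, 0.04, 0.6, 30.0 / 365.0
B = (1.0 - math.exp(-kappa * tau)) / (kappa * tau)
A = theta * (1.0 - B)

shape = 2.0 * kappa * theta / xi**2
rate = 2.0 * kappa / xi**2

rng = random.Random(0)
n = 200_000
# VIX^2 = A + B * v with v ~ Gamma(shape, rate): a shifted, scaled Gamma variable
samples = [A + B * rng.gammavariate(shape, 1.0 / rate) for _ in range(n)]

mc_mean = math.fsum(samples) / n
mc_var = math.fsum((s - mc_mean) ** 2 for s in samples) / n
exact_var = B * B * xi**2 * theta / (2.0 * kappa)

print(f"mean: MC {mc_mean:.5f}  vs  theta {theta:.5f}")
print(f"var : MC {mc_var:.6f}  vs  B^2 xi^2 theta / (2 kappa) {exact_var:.6f}")
```

Every sample is at least \(A\), which is the shift of the distribution; the Monte Carlo mean and variance agree with the formulas up to sampling noise.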

Exercise 3. VIX call options have payoff \(({\text{VIX}_T - K_{\text{VIX}}})^+\) where \(K_{\text{VIX}}\) is the VIX strike (in percentage). To price this via Fourier inversion, we need the CF of VIX\(_T\) (not VIX\(_T^2\)). Describe the challenge: VIX\(_T = \sqrt{A + Bv_T}\) is a nonlinear function of \(v_T\). Explain how to price VIX options by: (a) computing the CF of \(v_T\) (which is known for CIR), (b) changing variables from VIX to \(v_T\), and (c) applying Gil-Pelaez inversion.

Solution to Exercise 3

The challenge of pricing VIX options. VIX options have payoff \((\text{VIX}_T - K_{\text{VIX}})^+\) where \(\text{VIX}_T = \sqrt{A + Bv_T}\). The characteristic function of VIX\(_T^2 = A + Bv_T\) is available in closed form (from the CIR MGF), but the characteristic function of \(\text{VIX}_T = \sqrt{A + Bv_T}\) is not available in closed form because the square root is a nonlinear transformation.

Approach (a): CF of \(v_T\). The CIR process has a known characteristic function (equivalently, moment generating function):

\[ \mathbb{E}[e^{iwv_T} \mid v_0] = \exp(C(T, iw) + D(T, iw)v_0) \]

where \(C\) and \(D\) are the Riccati solutions given in the text. This is a closed-form expression, easily evaluated for any real \(w\) (and for complex \(w\) wherever the moment generating function is finite).

Approach (b): Change of variables from VIX to \(v_T\). The VIX call price is:

\[ C_{\text{VIX}} = e^{-rT}\mathbb{E}[(\sqrt{A + Bv_T} - K_{\text{VIX}})^+] \]

Since \(\text{VIX} = \sqrt{A + Bv_T}\) is a monotone increasing function of \(v_T\), the event \(\{\text{VIX}_T > K_{\text{VIX}}\}\) is equivalent to \(\{v_T > v^*\}\) where:

\[ v^* = \frac{K_{\text{VIX}}^2 - A}{B} \]

Therefore:

\[ C_{\text{VIX}} = e^{-rT}\int_{v^*}^{\infty} (\sqrt{A + Bv} - K_{\text{VIX}}) f_{v_T}(v) \, dv \]

where \(f_{v_T}\) is the (non-central chi-squared) density of \(v_T\).

Approach (c): Gil--Pelaez inversion. To evaluate the integral above, we can use the Gil--Pelaez framework. The probability \(\mathbb{P}(v_T > v^*)\) can be computed as:

\[ \mathbb{P}(v_T > v^*) = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty} \operatorname{Re}\!\left[\frac{e^{-iu v^*}\phi_{v_T}(u)}{iu}\right] du \]

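where \(\phi_{v_T}(u) = \mathbb{E}[e^{iuv_T}]\) is the CIR characteristic function. For the full call price, we also need the truncated first moment \(\mathbb{E}[\sqrt{A + Bv_T} \mathbf{1}_{\{v_T > v^*\}}]\), since \(C_{\text{VIX}} = e^{-rT}\left(\mathbb{E}[\sqrt{A + Bv_T}\,\mathbf{1}_{\{v_T > v^*\}}] - K_{\text{VIX}}\,\mathbb{P}(v_T > v^*)\right)\). This can be computed by: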

  1. Fourier inversion of the density: First recover \(f_{v_T}(v)\) via inverse Fourier transform of \(\phi_{v_T}\), then numerically integrate \(\int_{v^*}^{\infty}\sqrt{A + Bv}\,f_{v_T}(v)\,dv\).

  2. Direct numerical integration: Since the non-central chi-squared density has well-known computational forms (series of Poisson-weighted central chi-squared densities), one can directly evaluate the integral numerically without Fourier methods.

  3. COS method on \(v_T\): Apply the COS expansion to \(f_{v_T}\) over \([0, v_{\max}]\), with payoff coefficients \(H_k = \int_{v^*}^{v_{\max}}(\sqrt{A+Bv} - K_{\text{VIX}})\cos(\cdots)\,dv\) computed numerically.

The most efficient approach in practice is method 2 (direct integration against the non-central chi-squared density), as it avoids the Fourier inversion step entirely and exploits the known distributional form of \(v_T\).
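
Method 2 can be sketched with the standard library alone. This is an illustration under stated assumptions: it uses the standard representation \(v_T = c\,\chi'^2(d, \lambda)\) with \(c = \xi^2(1 - e^{-\kappa T})/(4\kappa)\), \(d = 4\kappa\theta/\xi^2\), \(\lambda = v_0 e^{-\kappa T}/c\), a truncated Poisson mixture for the density (adequate for moderate \(\lambda\), and requiring \(v_0 > 0\)), and a plain midpoint rule with an ad hoc upper limit. Function names are hypothetical, and strikes are in decimal volatility units.

```python
import math

def cir_density(v, T, v0, kappa, theta, xi, n_terms=60):
    """Density of v_T at v > 0: scaled non-central chi-squared as a Poisson mixture."""
    decay = math.exp(-kappa * T)
    c = xi**2 * (1.0 - decay) / (4.0 * kappa)  # scale factor
    d = 4.0 * kappa * theta / xi**2            # degrees of freedom
    half_lam = 0.5 * v0 * decay / c            # half the non-centrality (needs v0 > 0)
    x = v / c
    total = 0.0
    for j in range(n_terms):
        half_k = 0.5 * d + j
        log_pois = -half_lam + j * math.log(half_lam) - math.lgamma(j + 1.0)
        log_chi2 = ((half_k - 1.0) * math.log(x) - 0.5 * x
                    - half_k * math.log(2.0) - math.lgamma(half_k))
        total += math.exp(log_pois + log_chi2)
    return total / c

def vix_call_direct(K, T, v0, kappa, theta, xi, r, tau=30.0 / 365.0, n=1000):
    """VIX call by midpoint integration of the payoff against the density of v_T."""
    b = (1.0 - math.exp(-kappa * tau)) / (kappa * tau)
    a = theta * (1.0 - b)
    v_star = max((K * K - a) / b, 0.0)
    decay = math.exp(-kappa * T)
    mean = theta + (v0 - theta) * decay
    var = xi**2 / kappa * (1.0 - decay) * (0.5 * theta * (1.0 - decay) + v0 * decay)
    v_max = mean + 15.0 * math.sqrt(var)       # assumed upper integration limit
    if v_star >= v_max:
        return 0.0
    h = (v_max - v_star) / n
    total = 0.0
    for i in range(n):
        v = v_star + (i + 0.5) * h
        total += (math.sqrt(a + b * v) - K) * cir_density(v, T, v0, kappa, theta, xi)
    return math.exp(-r * T) * total * h

for K in (0.15, 0.20, 0.25):
    print(K, 100.0 * vix_call_direct(K, 0.25, 0.04, 3.0, 0.04, 0.6, 0.03))
```

The printed values are in VIX points. A production implementation would more likely rely on a library non-central chi-squared density and adaptive quadrature.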


Exercise 4. The Heston model generates a VIX smile: implied volatilities of VIX options vary with the VIX strike. Explain why the VIX smile is upward-sloping (OTM VIX calls have higher implied vol than ATM). Hint: VIX\(^2\) is affine in \(v_t\), and \(v_t\) follows a CIR process whose distribution is right-skewed. The positive skewness of VIX translates to an upward-sloping smile.

Solution to Exercise 4

Why the VIX smile is upward-sloping.

The VIX implied volatility smile refers to the Black (lognormal) implied volatility of VIX options as a function of the VIX strike \(K_{\text{VIX}}\). An upward-sloping smile means OTM VIX calls (high strikes) have higher implied volatility than ATM VIX options.

Step 1: Distribution of VIX. Under Heston, VIX\(^2 = A + Bv_T\), where \(v_T\) follows a CIR process. The distribution of \(v_T\) is non-central chi-squared, which is:

  • Supported on \([0, \infty)\): hence VIX\(^2 \geq A > 0\), so VIX is bounded below by \(\sqrt{A}\)
  • Right-skewed: The chi-squared family has positive skewness. Large values of \(v_T\) are possible but rare, creating a heavy right tail
  • Skewness coefficient: For the stationary (Gamma) distribution of the CIR process, the skewness is \(2\xi/\sqrt{2\kappa\theta} > 0\), confirming positive skewness

Since VIX \(= \sqrt{A + Bv_T}\), VIX inherits the right skew of \(v_T\) (the square root is a concave function that compresses the right tail somewhat, but the skewness remains positive).

Step 2: Right skew implies upward-sloping implied vol. When a distribution is right-skewed relative to the lognormal:

  • The right tail is heavier than lognormal: \(\mathbb{P}(\text{VIX} > K)\) decays slower than the lognormal tail for large \(K\)
  • OTM calls (high \(K_{\text{VIX}}\)) are worth more than the lognormal model predicts
  • To match these higher prices, the Black implied volatility must be increased for high strikes
  • This creates the upward slope in the VIX implied volatility smile

Step 3: Contrast with SPX smile. The SPX implied volatility smile is typically downward-sloping (negative skew) because:

  • SPX returns are left-skewed (leverage effect from \(\rho < 0\))
  • OTM puts (low strikes) are expensive

The VIX smile has the opposite direction because VIX is derived from the variance process, not the stock price. The variance process has a right-skewed distribution regardless of \(\rho\) (which affects only the stock-variance correlation, not the marginal distribution of \(v_T\)).

Step 4: Role of \(\xi\). The vol-of-vol \(\xi\) controls the magnitude of the skewness. Higher \(\xi\) means more dispersed \(v_T\), heavier right tail, and a steeper upward slope in the VIX smile.


Exercise 5. A key limitation of single-factor Heston for VIX options is that it cannot fit both the SPX implied volatility surface and the VIX smile simultaneously. Explain why: the five Heston parameters are fully determined by the SPX surface calibration, leaving no degrees of freedom to fit VIX options. If the SPX-calibrated parameters give VIX at-the-money implied volatility of 80%, but the market shows 100%, what does this discrepancy reveal about the model?

Solution to Exercise 5

Joint SPX-VIX calibration failure.

The constraint. The single-factor Heston model has five parameters: \(v_0, \kappa, \theta, \xi, \rho\). When calibrating to the SPX implied volatility surface (a rich set of option prices across strikes and maturities), all five parameters are effectively determined. The calibration minimizes some loss function:

\[ (\hat{v}_0, \hat{\kappa}, \hat{\theta}, \hat{\xi}, \hat{\rho}) = \arg\min \sum_{i} \left(\sigma_{\text{Heston}}^{\text{impl}}(K_i, T_i) - \sigma_{\text{market}}^{\text{impl}}(K_i, T_i)\right)^2 \]

Once these parameters are fixed, the VIX option prices are fully determined --- there are no free parameters left to fit the VIX smile.

Why the fit fails. The VIX smile under Heston is controlled primarily by \(\xi\) (vol-of-vol) and \(\kappa\) (mean reversion):

  • \(\xi\) determines the width of the VIX distribution (and hence the VIX smile curvature)
  • \(\kappa\) determines how much of \(v_0\) propagates to the VIX option expiry

The SPX surface also depends on \(\xi\) and \(\kappa\), but through different mechanisms:

  • \(\xi\) controls the SPX smile curvature (kurtosis of returns)
  • \(\kappa\) controls the term structure of the SPX smile

In practice, the \(\xi\) required to fit the SPX surface is too small to generate enough VIX smile curvature. The SPX-calibrated Heston model produces VIX options with:

  • Too low ATM VIX implied volatility (e.g., 80% vs. market 100%)
  • Too flat VIX smile

What the discrepancy reveals. If the SPX-calibrated Heston gives VIX ATM vol of 80% but the market shows 100%, this means:

  1. The variance process has more randomness than single-factor CIR can capture. The actual variance dynamics may involve jumps, multiple factors, or non-constant vol-of-vol.

  2. The VIX smile contains information beyond what is in the SPX surface. The VIX market prices imply a heavier-tailed variance distribution than the SPX-calibrated CIR process.

  3. Model risk. Using SPX-calibrated Heston to price VIX derivatives will systematically underprice OTM VIX calls and ITM VIX puts relative to the market, so quotes taken from the model can be exploited by counterparties who recognize the limitation.

This discrepancy is one of the main motivations for multi-factor extensions (double Heston, Bates with variance jumps, rough volatility models) that decouple the SPX and VIX fitting channels.


Exercise 6. Compute the Heston-implied VIX for \(v_0 = 0.04\) and \(v_0 = 0.09\) (the latter representing a market stress scenario). Using \(\kappa = 2.0\), \(\theta = 0.04\), \(\tau = 30/365\), compute VIX in both cases. Then compute the VIX "delta" with respect to \(v_t\): \(\partial\text{VIX}/\partial v_t = 100 \cdot B / (2\sqrt{A + Bv_t})\). Is the VIX more sensitive to variance changes when \(v_t\) is high or low?

Solution to Exercise 6

Computing VIX for two variance levels. Using \(\kappa = 2.0\), \(\theta = 0.04\), \(\tau = 30/365\):

From Exercise 1: \(A = 0.003111\) and \(B = 0.9222\).

Case 1: \(v_t = 0.04\).

\[ \text{VIX}_t^2 = 0.003111 + 0.9222 \times 0.04 = 0.04000 \]
\[ \text{VIX}_t = 100\sqrt{0.04000} = 20.00 \]

Case 2: \(v_t = 0.09\).

\[ \text{VIX}_t^2 = 0.003111 + 0.9222 \times 0.09 = 0.003111 + 0.08300 = 0.08611 \]
\[ \text{VIX}_t = 100\sqrt{0.08611} = 29.34 \]

VIX delta. The sensitivity of VIX to \(v_t\) is:

\[ \frac{\partial \text{VIX}}{\partial v_t} = 100 \cdot \frac{B}{2\sqrt{A + Bv_t}} \]

Case 1: \(v_t = 0.04\).

\[ \frac{\partial \text{VIX}}{\partial v_t} = 100 \times \frac{0.9222}{2\sqrt{0.04000}} = 100 \times \frac{0.9222}{0.4000} = 100 \times 2.306 = 230.6 \]

Case 2: \(v_t = 0.09\).

\[ \frac{\partial \text{VIX}}{\partial v_t} = 100 \times \frac{0.9222}{2\sqrt{0.08611}} = 100 \times \frac{0.9222}{0.5869} = 100 \times 1.571 = 157.1 \]

Comparison. The VIX delta at \(v_t = 0.04\) (230.6) is significantly higher than at \(v_t = 0.09\) (157.1). This means:

  • VIX is more sensitive to variance changes when \(v_t\) is low.
  • This is a consequence of the square root: \(\frac{\partial}{\partial v}\sqrt{v} = \frac{1}{2\sqrt{v}}\), which is a decreasing function of \(v\).

Interpretation. In calm markets (\(v_t\) low, VIX \(\approx\) 20), a given absolute increase in instantaneous variance moves VIX by more points than in stressed markets (\(v_t\) high, VIX \(\approx\) 29). Because the starting level is also lower, the move is larger in percentage terms as well, consistent with the empirical observation that VIX moves are proportionally larger from low levels than from high levels. To first order, for a small change \(\Delta v\):

  • At \(v_t = 0.04\): \(\Delta \text{VIX} \approx 230.6 \times \Delta v\)
  • At \(v_t = 0.09\): \(\Delta \text{VIX} \approx 157.1 \times \Delta v\)

This asymmetric sensitivity has implications for VIX option hedging: the delta hedge of VIX options must be adjusted more frequently when VIX is low because the VIX delta is larger and changes faster.
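
The levels and deltas above can be reproduced with a few lines of Python. The function name is illustrative, and the finite-difference bump size is an arbitrary choice used only to cross-check the analytic delta.

```python
import math

def vix_level_and_delta(v, kappa, theta, tau=30.0 / 365.0):
    """Return (VIX in percent, dVIX/dv) under VIX^2 = A + B * v."""
    B = (1.0 - math.exp(-kappa * tau)) / (kappa * tau)
    A = theta * (1.0 - B)
    root = math.sqrt(A + B * v)
    return 100.0 * root, 100.0 * B / (2.0 * root)

kappa, theta = 2.0, 0.04
for v in (0.04, 0.09):
    vix, delta = vix_level_and_delta(v, kappa, theta)
    # Central finite difference as an independent check of the analytic delta
    bump = 1e-6
    fd = (vix_level_and_delta(v + bump, kappa, theta)[0]
          - vix_level_and_delta(v - bump, kappa, theta)[0]) / (2.0 * bump)
    print(f"v={v:.2f}  VIX={vix:.2f}  delta={delta:.1f}  finite-diff={fd:.1f}")
```

Small differences from the hand calculation in the last digit come from rounding \(A\) and \(B\) to four significant figures there.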


Exercise 7. The double Heston model adds a second variance factor \(v_t^{(2)}\) so that VIX\(^2 = A_1 + B_1 v_t^{(1)} + A_2 + B_2 v_t^{(2)}\). Explain how this extra factor provides additional degrees of freedom to calibrate VIX options. If the first factor has \(\kappa_1 = 2.0\) (slow mean-reversion) and the second has \(\kappa_2 = 10.0\) (fast mean-reversion), which factor dominates short-dated VIX options and which dominates long-dated ones?

Solution to Exercise 7

Double Heston model and VIX options.

VIX\(^2\) under double Heston. In the double Heston model, the total instantaneous variance is \(v_t = v_t^{(1)} + v_t^{(2)}\), where each factor follows an independent CIR process:

\[ dv_t^{(i)} = \kappa_i(\theta_i - v_t^{(i)})dt + \xi_i\sqrt{v_t^{(i)}}dW_t^{(2,i)}, \quad i = 1, 2 \]

The VIX\(^2\) becomes:

\[ \text{VIX}_t^2 = A_1 + B_1 v_t^{(1)} + A_2 + B_2 v_t^{(2)} \]

where \(A_i = \theta_i(1 - B_i)\) and \(B_i = (1 - e^{-\kappa_i\tau})/(\kappa_i\tau)\) for each factor.

Additional degrees of freedom. The single-factor Heston has four parameters affecting VIX options: \(v_0, \kappa, \theta, \xi\) (\(\rho\) does not affect VIX\(^2\) since it only enters the stock-variance correlation). The double Heston model has, for VIX purposes, eight relevant parameters: \(v_0^{(1)}, \kappa_1, \theta_1, \xi_1, v_0^{(2)}, \kappa_2, \theta_2, \xi_2\).

These additional parameters allow independent control of:

  1. The level of VIX: \(\mathbb{E}[\text{VIX}^2] = A_1 + B_1\mathbb{E}[v_T^{(1)}] + A_2 + B_2\mathbb{E}[v_T^{(2)}]\)
  2. The volatility of VIX: \(\operatorname{Var}(\text{VIX}^2) = B_1^2\operatorname{Var}(v_T^{(1)}) + B_2^2\operatorname{Var}(v_T^{(2)})\) (cross terms vanish by independence)
  3. The skewness of VIX: The sum of two independent gamma-type variables has different skewness than a single one
  4. The term structure of VIX smile: Different \(\kappa_1, \kappa_2\) produce different decay rates

This richer parameterization allows the double Heston model to fit both the SPX surface (through the combined effect of both factors plus their correlations \(\rho_1, \rho_2\) with the stock) and the VIX smile (through the individual \(\xi_i\) and \(\kappa_i\) parameters) simultaneously.

Short-dated versus long-dated VIX options. With \(\kappa_1 = 2.0\) (slow mean reversion) and \(\kappa_2 = 10.0\) (fast mean reversion):

Short-dated VIX options (expiry \(T\) small, say 1--2 months):

  • The fast factor \(v_t^{(2)}\) dominates. With \(\kappa_2 = 10\), the conditional variance of \(v_T^{(2)}\) builds up to its stationary level \(\xi_2^2\theta_2/(2\kappa_2)\) within a few multiples of \(1/\kappa_2 = 0.1\) years, so almost all of its randomness is already present at short expiries, whereas the slow factor has had little time to diffuse.
  • The coefficient \(B_2 = (1 - e^{-\kappa_2\tau})/(\kappa_2\tau) \approx 0.68\) for \(\tau = 30/365\) (compared with \(B_1 \approx 0.92\)), so the fast factor still loads substantially onto VIX\(^2\), although less than one-for-one.
  • Because the fast factor's variance saturates quickly instead of growing with \(T\), its contribution per unit time, and hence its effect on implied volatility, is largest for small \(T\).

Long-dated VIX options (expiry \(T\) large, say 6--12 months):

  • The slow factor \(v_t^{(1)}\) dominates. With \(\kappa_1 = 2.0\), the variance of \(v_T^{(1)}\) keeps accumulating over a time scale of \(1/\kappa_1 = 0.5\) years and remains significant at long horizons.
  • The fast factor \(v_T^{(2)}\) has reached its stationary distribution well before the VIX option expiry. The \(v_0^{(2)}\)-dependent part of its variance, proportional to \(e^{-\kappa_2 T}(1 - e^{-\kappa_2 T})\), is negligible for large \(T\), and the total stays capped at \(\xi_2^2\theta_2/(2\kappa_2)\), so it adds no further randomness as \(T\) grows.
  • The VIX smile at long maturities is therefore controlled primarily by \(\xi_1\) and \(\kappa_1\).

This separation of time scales is the key mechanism: the fast factor captures the steep short-dated VIX smile (driven by short-term variance uncertainty), while the slow factor captures the persistent long-dated VIX smile (driven by long-term variance uncertainty). The single-factor Heston cannot achieve this separation, which is why it struggles with the VIX term structure.
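
The time-scale separation can be illustrated numerically by computing each factor's contribution \(B_i^2 \operatorname{Var}(v_T^{(i)})\) to \(\operatorname{Var}(\text{VIX}_T^2)\) across expiries. The \(\kappa_i\) come from the exercise; all other parameter values below (\(v_0^{(i)}\), \(\theta_i\), \(\xi_i\)) are hypothetical, chosen only to make the effect visible.

```python
import math

TAU = 30.0 / 365.0

def loading(kappa, tau=TAU):
    """Affine loading B_i of factor i on VIX^2."""
    return (1.0 - math.exp(-kappa * tau)) / (kappa * tau)

def cir_variance(T, v0, kappa, theta, xi):
    """Conditional variance of a CIR factor at horizon T."""
    decay = math.exp(-kappa * T)
    return xi**2 / kappa * (1.0 - decay) * (0.5 * theta * (1.0 - decay) + v0 * decay)

# Hypothetical factor parameters (only the kappas are taken from the exercise)
slow = dict(v0=0.02, kappa=2.0, theta=0.02, xi=0.4)
fast = dict(v0=0.02, kappa=10.0, theta=0.02, xi=0.9)

def contribution(T, p):
    """Factor contribution B_i^2 * Var(v_T^(i)) to Var(VIX_T^2)."""
    return loading(p["kappa"]) ** 2 * cir_variance(T, **p)

def fast_share(T):
    """Fraction of Var(VIX_T^2) attributable to the fast factor."""
    f, s = contribution(T, fast), contribution(T, slow)
    return f / (f + s)

print(f"B_slow={loading(2.0):.3f}  B_fast={loading(10.0):.3f}")
for T in (1.0 / 12.0, 0.25, 0.5, 1.0):
    print(f"T={T:.3f}  fast share of Var(VIX^2) = {fast_share(T):.2f}")
```

With these inputs the fast factor accounts for most of the variance at a one-month expiry and a minority at one year, because its contribution stops growing after a few multiples of \(1/\kappa_2\) while the slow factor's keeps building.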