Correlation Structure
The correlations between components of a multidimensional affine process are encoded in the off-diagonal elements of the instantaneous covariance matrix \(a(x) = a_0 + \sum_i \alpha_i x^{(i)}\). Because this covariance is itself an affine function of the state, the instantaneous correlation between components is generally state-dependent --- a feature that captures the leverage effect in equity markets and the time-varying comovement of yields across maturities. However, the affine structure also imposes significant constraints: not every desired correlation pattern is achievable while maintaining admissibility. This section develops the correlation structure of affine processes, derives the key limitations, and illustrates the trade-offs with the Heston model and multi-factor interest rate models.
Learning Objectives
By the end of this section, you will be able to:
- Compute instantaneous correlations from the affine covariance matrix \(a(x)\)
- Identify which correlation specifications are compatible with admissibility constraints
- Derive the leverage effect parameter \(\rho\) in the Heston model from the affine covariance structure
- Explain why affine models have limited ability to generate time-varying correlations
- Assess the implications of correlation constraints for model calibration
Motivation
Why Correlation Matters
In equity derivatives, the correlation between asset returns and volatility --- the leverage effect --- drives the implied volatility skew. In fixed income, the correlations between short-rate factors, long-rate factors, and volatility factors determine the shape and dynamics of the yield curve. In credit, the correlation between default intensities of different obligors determines portfolio loss distributions.
Modeling these correlations correctly is essential for accurate pricing and hedging. The affine framework provides a tractable way to incorporate correlations, but the admissibility conditions constrain which correlation structures are achievable. Understanding these constraints is crucial for choosing the right model specification.
Instantaneous Correlation
Derivation from the Covariance Matrix
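For a \(d\)-dimensional affine diffusion with instantaneous covariance \(a(x) = a_0 + \sum_{i=1}^d \alpha_i x^{(i)}\), the instantaneous covariance between the \(k\)-th and \(l\)-th components is
\[a_{kl}(x) = (a_0)_{kl} + \sum_{i=1}^d (\alpha_i)_{kl}\, x^{(i)}.\]
The instantaneous correlation is
\[\rho_{kl}(x) = \frac{a_{kl}(x)}{\sqrt{a_{kk}(x)\, a_{ll}(x)}},\]
defined wherever \(a_{kk}(x)\, a_{ll}(x) > 0\).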
Proposition: State-Dependent Correlation in Affine Models
If both \(a_{kk}(x)\) and \(a_{ll}(x)\) depend on the state through different components, the instantaneous correlation \(\rho_{kl}(x)\) is a nonlinear function of \(x\), even though the covariance matrix is affine in \(x\).
This nonlinearity arises because correlation is a ratio involving square roots of diagonal entries. An affine covariance does not imply an affine correlation.
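The following is a minimal numerical sketch of this point, not a reference implementation. It evaluates \(\rho_{kl}(x) = a_{kl}(x)/\sqrt{a_{kk}(x)\,a_{ll}(x)}\) for an affine covariance stored as nested lists. The function names and all parameter values are hypothetical choices made for illustration.

```python
import math

def affine_cov(a0, alphas, x):
    """Return a(x) = a0 + sum_i alphas[i] * x[i] as a nested list."""
    d = len(a0)
    a = [row[:] for row in a0]  # copy so a0 is not modified
    for alpha_i, x_i in zip(alphas, x):
        for k in range(d):
            for l in range(d):
                a[k][l] += alpha_i[k][l] * x_i
    return a

def inst_corr(a0, alphas, x, k, l):
    """Instantaneous correlation a_kl(x) / sqrt(a_kk(x) * a_ll(x))."""
    a = affine_cov(a0, alphas, x)
    return a[k][l] / math.sqrt(a[k][k] * a[l][l])

# Hypothetical two-dimensional example: component 0 is CIR, component 1 is Gaussian.
zero = [[0.0, 0.0], [0.0, 0.0]]
alpha_1 = [[0.09, 0.075], [0.075, 0.25]]   # assumed values, positive semi-definite
a0_pure = zero                              # no constant variance term
a0_mixed = [[0.0, 0.0], [0.0, 0.01]]        # constant variance on the Gaussian component

for v in (0.01, 0.04, 0.25):
    x = [v, 0.0]  # the Gaussian state does not enter the covariance here
    print(v,
          inst_corr(a0_pure, [alpha_1, zero], x, 0, 1),
          inst_corr(a0_mixed, [alpha_1, zero], x, 0, 1))
```

With `a0_pure` the printed correlation is the same at every state. With `a0_mixed` it varies with the CIR state even though every entry of the covariance is affine.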
Constant vs. State-Dependent Correlation
Case 1: Both components Gaussian with constant diffusion (\(k, l \in J\), with no \(\alpha_i\) loading on these components). Then \(a_{kk}(x) = (a_0)_{kk}\) and \(a_{ll}(x) = (a_0)_{ll}\) (constant), and \(a_{kl}(x) = (a_0)_{kl}\) (constant). The correlation \(\rho_{kl}\) is constant.
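Case 2: One CIR, one Gaussian (\(k \in I\), \(l \in J\)), with the diffusion of the Gaussian component driven only by \(x^{(k)}\). Then \(a_{kk}(x) = (\alpha_k)_{kk} x^{(k)}\), \(a_{ll}(x) = (\alpha_k)_{ll} x^{(k)}\), and \(a_{kl}(x) = (\alpha_k)_{kl} x^{(k)}\). The correlation is
\[\rho_{kl}(x) = \frac{(\alpha_k)_{kl}\, x^{(k)}}{\sqrt{(\alpha_k)_{kk}\, x^{(k)} \cdot (\alpha_k)_{ll}\, x^{(k)}}}.\]
For \(x^{(k)} > 0\), the factor \(x^{(k)}\) cancels in numerator and denominator, giving a constant correlation:
\[\rho_{kl} = \frac{(\alpha_k)_{kl}}{\sqrt{(\alpha_k)_{kk}\,(\alpha_k)_{ll}}}.\]
This is the situation in the Heston model, where the correlation \(\rho\) between returns and variance is a fixed parameter. If the Gaussian component also carries a constant variance term \((a_0)_{ll} > 0\), the cancellation fails and the correlation depends on \(x^{(k)}\).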
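Case 3: Two CIR components driven by different state variables (\(k, l \in I\), \(k \neq l\)). Formally, the covariance \(a_{kl}(x)\) could depend on \(x^{(k)}\) and \(x^{(l)}\) through \((\alpha_k)_{kl} x^{(k)} + (\alpha_l)_{kl} x^{(l)}\). Under the admissibility conditions, however, \((\alpha_k)_{ll} = 0\) and \((\alpha_l)_{kk} = 0\), so positive semi-definiteness of \(\alpha_k\) and \(\alpha_l\) forces both cross-coefficients to vanish, and the instantaneous correlation between two CIR components is zero (Exercise 2 works through the argument). Genuinely state-dependent correlation therefore arises when a component's variance mixes a constant term with a state-dependent term, or when two Gaussian components share a state-dependent diffusion term.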
The Heston Leverage Effect
Covariance Structure
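In the Heston model with state \(X_t = (V_t, \log S_t)^\top\), the covariance matrix is
\[a(V_t) = \alpha_1 V_t = V_t \begin{pmatrix} \xi^2 & \rho\xi \\ \rho\xi & 1 \end{pmatrix}.\]
The instantaneous covariance between \(d\log S_t\) and \(dV_t\) is \(\rho\xi\,V_t\,dt\), and the correlation is
\[\operatorname{Corr}(d\log S_t, dV_t) = \frac{\rho\xi V_t}{\sqrt{V_t}\cdot\xi\sqrt{V_t}} = \rho.\]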
The correlation parameter \(\rho\) in the Heston model is therefore the instantaneous correlation between the return \(d\log S_t\) and the variance change \(dV_t\).
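As a quick numerical check of this cancellation, the sketch below evaluates the Heston covariance \(V\begin{pmatrix} \xi^2 & \rho\xi \\ \rho\xi & 1 \end{pmatrix}\) (state ordering \((V, \log S)\)) at several variance levels. The function names and the parameter values \(\rho = -0.7\), \(\xi = 0.4\) are illustrative assumptions, not calibrated values.

```python
import math

def heston_cov(v, rho, xi):
    """Covariance a(V) = V * [[xi^2, rho*xi], [rho*xi, 1]] for the state (V, log S)."""
    return [[xi * xi * v, rho * xi * v],
            [rho * xi * v, v]]

def heston_corr(v, rho, xi):
    """Off-diagonal entry divided by the square root of the product of the diagonals."""
    a = heston_cov(v, rho, xi)
    return a[0][1] / math.sqrt(a[0][0] * a[1][1])

rho, xi = -0.7, 0.4            # hypothetical parameter values
levels = (0.01, 0.04, 0.25)    # hypothetical variance levels
covs = [heston_cov(v, rho, xi)[0][1] for v in levels]
corrs = [heston_corr(v, rho, xi) for v in levels]

for v, c, r in zip(levels, covs, corrs):
    print(f"V={v}: covariance rate={c:.4f}, correlation={r:.4f}")
```

The covariance rate scales linearly with \(V\), while the correlation stays at \(\rho\) for every \(V > 0\).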
Financial Interpretation of Leverage
Empirically, \(\rho < 0\) for equity markets (typically \(\rho \in [-0.9, -0.5]\)): when stock prices fall, volatility tends to rise. This negative correlation generates the implied volatility skew: out-of-the-money puts trade at higher implied volatilities than out-of-the-money calls.
Setting \(\rho = 0\) recovers a model in which the Brownian motions driving returns and variance are independent, producing a symmetric implied volatility smile rather than a skew.
Admissibility Constraint on Correlation
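The matrix \(\alpha_1 = \begin{pmatrix} \xi^2 & \rho\xi \\ \rho\xi & 1 \end{pmatrix}\) must be positive semi-definite, requiring
\[\det(\alpha_1) = \xi^2 - \rho^2\xi^2 = \xi^2(1 - \rho^2) \geq 0.\]
For \(\xi > 0\) this gives \(|\rho| \leq 1\), which is the natural constraint on a correlation parameter. The admissibility condition automatically enforces this.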
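The determinant condition can be checked directly. The sketch below assumes the Heston matrix \(\alpha_1 = \begin{pmatrix} \xi^2 & \rho\xi \\ \rho\xi & 1 \end{pmatrix}\) and tests positive semi-definiteness of a symmetric \(2 \times 2\) matrix through its diagonal entries and determinant. The helper names and the numerical tolerance are assumptions made for illustration.

```python
def is_psd_2x2(m, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff both diagonal entries and the determinant are >= 0."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return m[0][0] >= -tol and m[1][1] >= -tol and det >= -tol

def heston_alpha1(rho, xi):
    """Heston alpha_1 for the state ordering (V, log S)."""
    return [[xi * xi, rho * xi], [rho * xi, 1.0]]

# Hypothetical values: the last one violates |rho| <= 1 and should fail the check.
for rho in (-0.7, 1.0, 1.2):
    print(rho, is_psd_2x2(heston_alpha1(rho, 0.4)))
```

The boundary case \(|\rho| = 1\) passes because the determinant is exactly zero there, which corresponds to returns and variance being driven by a single Brownian motion.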
Correlation Constraints in Multi-Factor Models
The \(A_m(d)\) Constraints
For a general \(A_m(d)\) model, the correlation structure is constrained by the positive semi-definiteness of \(a(x)\) for all \(x \in D\). This imposes restrictions beyond the simple \(|\rho| \leq 1\) condition.
Proposition: Correlation Constraints in \(A_2(3)\)
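For a three-factor model with two CIR components (\(V_1, V_2 \in \mathbb{R}_+\)) and one Gaussian component (\(r \in \mathbb{R}\)), the covariance matrix is
\[a(V_1, V_2) = a_0 + \alpha_1 V_1 + \alpha_2 V_2,\]
where \(a_0\), \(\alpha_1\), and \(\alpha_2\) are symmetric \(3 \times 3\) matrices.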
The conditions for \(a(x) \succeq 0\) for all \(V_1, V_2 \geq 0\) require:
- \(a_0 \succeq 0\)
- \(\alpha_1 \succeq 0\) and \(\alpha_2 \succeq 0\)
- No cancellation: \(a_0 + \alpha_1 V_1 + \alpha_2 V_2 \succeq 0\) must hold even at the boundary where \(V_1 = 0\) or \(V_2 = 0\)
These constraints are more restrictive than requiring \(a(x) \succeq 0\) at a single point.
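A rough way to probe these conditions numerically is to test \(a(x) \succeq 0\) on a grid of \((V_1, V_2)\) values, using the fact that a symmetric matrix is positive semi-definite exactly when all of its principal minors are nonnegative. The sketch below does this with the standard library only. The state ordering \((V_1, V_2, r)\), the helper names, and every parameter value are hypothetical. Passing a finite grid is a necessary check, not a proof of admissibility.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def is_psd(m, tol=1e-12):
    """Symmetric m is PSD iff every principal minor is nonnegative (up to tol)."""
    n = len(m)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [[m[i][j] for j in idx] for i in idx]
            if det(sub) < -tol:
                return False
    return True

def cov(a0, alpha1, alpha2, v1, v2):
    """a(V1, V2) = a0 + alpha1 * V1 + alpha2 * V2."""
    n = len(a0)
    return [[a0[i][j] + alpha1[i][j] * v1 + alpha2[i][j] * v2 for j in range(n)]
            for i in range(n)]

def admissible_on_grid(a0, alpha1, alpha2, grid):
    """Check PSD of a(V1, V2) at every grid point, including the boundary V = 0."""
    return all(is_psd(cov(a0, alpha1, alpha2, v1, v2)) for v1 in grid for v2 in grid)

# Hypothetical parameters for the state (V1, V2, r).
a0 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.01]]
alpha1 = [[0.09, 0.0, 0.15], [0.0, 0.0, 0.0], [0.15, 0.0, 0.25]]
alpha2 = [[0.0, 0.0, 0.0], [0.0, 0.04, -0.06], [0.0, -0.06, 0.09]]
bad_alpha1 = [[0.09, 0.0, 0.20], [0.0, 0.0, 0.0], [0.20, 0.0, 0.25]]  # cross term too large
grid = [0.0, 0.01, 0.1, 1.0, 10.0]

print(admissible_on_grid(a0, alpha1, alpha2, grid))
print(admissible_on_grid(a0, bad_alpha1, alpha2, grid))
```

The second parameter set fails because its \((V_1, r)\) cross term exceeds the bound set by the diagonal entries of \(\alpha_1\), so \(a(x)\) loses positive semi-definiteness once \(V_1\) is large enough.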
Off-Diagonal Constraints
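Between two CIR components \(X^{(k)}\) and \(X^{(l)}\) with \(k, l \in I\), the admissibility condition (A4) from the admissibility section requires \((\alpha_i)_{kk} = 0\) for \(i \neq k\). This means the variance of the \(k\)-th component is driven only by \(x^{(k)}\) itself. Formally, the covariance between \(k\) and \(l\) could still depend on both \(x^{(k)}\) and \(x^{(l)}\):
\[a_{kl}(x) = (\alpha_k)_{kl}\, x^{(k)} + (\alpha_l)_{kl}\, x^{(l)}.\]
The positive semi-definiteness requirement constrains \((\alpha_k)_{kl}\) and \((\alpha_l)_{kl}\) jointly with the diagonal terms. Because \((\alpha_k)_{ll} = 0\), the \(2 \times 2\) minor condition \((\alpha_k)_{kl}^2 \leq (\alpha_k)_{kk}(\alpha_k)_{ll} = 0\) forces \((\alpha_k)_{kl} = 0\), and the same argument applies to \((\alpha_l)_{kl}\).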
Limitations of the Affine Correlation Structure
What Affine Models Cannot Do
Fixed Correlation Topology
In the standard affine framework, the instantaneous correlation between Gaussian components is constant (determined by \(a_0\)), and the correlation between a CIR component and any other component is at most a simple function of the CIR state variables. This means:
- Stochastic correlation between returns and volatility (beyond the fixed \(\rho\)) is not available in the basic Heston model
- Regime-dependent correlations (e.g., correlations increasing during crises) require additional state variables
- Arbitrary time-varying correlations cannot be achieved without breaking the affine structure
Workarounds
Several extensions address these limitations while preserving partial tractability:
- Multi-factor stochastic volatility: Using multiple CIR factors with different correlations to returns creates an effective time-varying correlation through the mix of active factors
- Wishart affine processes: Extending the state space to include matrix-valued processes (Bru, 1991; Gourieroux and Sufana, 2003) allows the entire covariance matrix to be stochastic and affine
- Regime-switching affine models: Different parameter sets (including correlations) in different regimes, with affine structure within each regime
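To illustrate the first workaround, the sketch below assumes a two-factor stochastic volatility specification that is not spelled out in the text: the return has diffusion \(\sqrt{V_1}\,dW^{(1)} + \sqrt{V_2}\,dW^{(2)}\), each variance factor has diffusion \(\xi_i\sqrt{V_i}\,dZ^{(i)}\), \(dW^{(i)}\) and \(dZ^{(i)}\) have correlation \(\rho_i\), and all other pairs are independent. Under these assumptions the instantaneous correlation between the return and the total variance \(V_1 + V_2\) depends on the current mix of factors. All names and values are hypothetical.

```python
import math

def effective_corr(v1, v2, rho1, rho2, xi1, xi2):
    """Correlation between d log S and d(V1 + V2) under the assumed two-factor model."""
    cov = rho1 * xi1 * v1 + rho2 * xi2 * v2      # covariance rate
    var_ret = v1 + v2                             # variance rate of the return
    var_tot = xi1 ** 2 * v1 + xi2 ** 2 * v2       # variance rate of V1 + V2
    return cov / math.sqrt(var_ret * var_tot)

# Hypothetical parameters: factor 1 is strongly negatively correlated, factor 2 weakly.
rho1, rho2, xi1, xi2 = -0.9, -0.2, 0.5, 0.3
for v1, v2 in ((0.04, 0.0), (0.04, 0.04), (0.0, 0.04)):
    print(v1, v2, effective_corr(v1, v2, rho1, rho2, xi1, xi2))
```

When only one factor is active the effective correlation equals that factor's \(\rho_i\). In between, it moves with the relative size of \(V_1\) and \(V_2\), even though each \(\rho_i\) is a fixed parameter.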
Example: Two-Factor Interest Rate Model
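Consider an \(A_1(2)\) model with state \((V_t, r_t)\) where \(V_t \geq 0\) is a volatility factor and \(r_t \in \mathbb{R}\) is the short rate. One specification of the diffusion terms consistent with the matrices below, with independent Brownian motions \(W^{(1)}\) and \(W^{(2)}\) and affine drifts left unspecified, is
\[dV_t = (\cdots)\,dt + \xi\sqrt{V_t}\,dW_t^{(1)}, \qquad dr_t = (\cdots)\,dt + \eta\sqrt{V_t}\,dW_t^{(1)} + \sigma_r\,dW_t^{(2)}.\]
With a constant diffusion coefficient \(\sigma_r\) in the short rate, the covariance matrix is
\[a(V_t) = \begin{pmatrix} \xi^2 V_t & \eta\xi V_t \\ \eta\xi V_t & \sigma_r^2 + \eta^2 V_t \end{pmatrix} = a_0 + \alpha_1 V_t.\]
Here \(a_0 = \operatorname{diag}(0, \sigma_r^2)\) and \(\alpha_1 = \begin{pmatrix} \xi^2 & \eta\xi \\ \eta\xi & \eta^2 \end{pmatrix}\).
The instantaneous correlation between \(dV_t\) and \(dr_t\) is, for \(\xi > 0\) and \(V_t > 0\),
\[\rho_{Vr}(V_t) = \frac{\eta\xi V_t}{\sqrt{\xi^2 V_t\,(\sigma_r^2 + \eta^2 V_t)}} = \frac{\eta\sqrt{V_t}}{\sqrt{\sigma_r^2 + \eta^2 V_t}}.\]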
This correlation is state-dependent: its magnitude increases with \(V_t\) (the correlation approaches \(\eta/|\eta| = \operatorname{sign}(\eta)\) as \(V_t \to \infty\)) and it vanishes as \(V_t \to 0\) (because the volatility factor contributes no noise at the boundary). This is a genuine stochastic correlation effect within the affine framework, arising because the Gaussian component has both a constant diffusion (\(\sigma_r\)) and a state-dependent diffusion (\(\eta\sqrt{V_t}\)).
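The limiting behaviour can be tabulated with a short sketch. It evaluates the closed form \(\eta\sqrt{V}/\sqrt{\sigma_r^2 + \eta^2 V}\), which follows from \(a_0 = \operatorname{diag}(0, \sigma_r^2)\) and the \(\alpha_1\) given above when \(\xi > 0\). The function name and the values of \(\eta\) and \(\sigma_r\) are assumptions chosen for illustration.

```python
import math

def corr_v_r(v, eta, sigma_r):
    """Instantaneous correlation between dV and dr: eta*sqrt(V) / sqrt(sigma_r^2 + eta^2 * V)."""
    return eta * math.sqrt(v) / math.sqrt(sigma_r ** 2 + eta ** 2 * v)

eta, sigma_r = 0.5, 0.1   # hypothetical parameter values
for v in (0.0, 0.01, 0.04, 0.25, 1.0, 100.0):
    print(f"V={v}: rho={corr_v_r(v, eta, sigma_r):.4f}, "
          f"rho with -eta={corr_v_r(v, -eta, sigma_r):.4f}")
```

Flipping the sign of \(\eta\) mirrors the curve, so the correlation tends to \(-1\) rather than \(+1\) for large \(V\).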
Summary
The correlation structure of multidimensional affine processes is encoded in the off-diagonal elements of the affine covariance matrix \(a(x) = a_0 + \sum_i \alpha_i x^{(i)}\). For two Gaussian components with constant diffusion, the correlation is constant; for a CIR-Gaussian pair with diffusion driven only by the CIR component, the correlation simplifies to a constant parameter (as in Heston's \(\rho\)). State-dependent correlations arise when both constant and state-dependent diffusion terms contribute to the same component. The admissibility conditions, particularly positive semi-definiteness of \(a(x)\) for all \(x \in D\), constrain the achievable correlation patterns and impose restrictions on model parameters. The affine framework cannot generate arbitrary time-varying correlations, motivating extensions such as Wishart processes and multi-factor specifications that increase the effective dimensionality of the correlation structure.
Further Reading
- Heston, S. (1993). "A Closed-Form Solution for Options with Stochastic Volatility." Review of Financial Studies, 6(2), 327--343.
- Dai, Q. and Singleton, K. (2000). "Specification Analysis of Affine Term Structure Models." Journal of Finance, 55(5), 1943--1978.
- Gourieroux, C. and Sufana, R. (2003). "Wishart Quadratic Term Structure Models." Working Paper, CREST.
- Filipovic, D. (2009). Term-Structure Models: A Graduate Course. Springer.
Exercises
Exercise 1. For the Heston model with covariance matrix \(a(V) = V \begin{pmatrix} 1 & \rho\xi \\ \rho\xi & \xi^2 \end{pmatrix}\), compute the instantaneous correlation \(\operatorname{Corr}(d\log S_t, dV_t) = \rho\xi\sqrt{V}/(\sqrt{V}\cdot\xi\sqrt{V}) = \rho\). Verify that the correlation is state-independent and equals \(\rho\).
Solution to Exercise 1
The Heston covariance matrix is \(a(V) = \alpha_1 V\) where
\[\alpha_1 = \begin{pmatrix} 1 & \rho\xi \\ \rho\xi & \xi^2 \end{pmatrix}.\]
The individual variances and covariance are:
- \(a_{11}(V) = V\) (variance of \(d\log S_t\) is \(V\,dt\))
- \(a_{22}(V) = \xi^2 V\) (variance of \(dV_t\) is \(\xi^2 V\,dt\))
- \(a_{12}(V) = \rho\xi V\) (covariance is \(\rho\xi V\,dt\))
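The instantaneous correlation is
\[\frac{a_{12}(V)}{\sqrt{a_{11}(V)\,a_{22}(V)}} = \frac{\rho\xi V}{\sqrt{V \cdot \xi^2 V}} = \frac{\rho\xi V}{\xi V} = \rho.\]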
The factor \(V\) cancels exactly in numerator and denominator, confirming that the correlation is state-independent and equals the parameter \(\rho\).
Exercise 2. For a two-factor CIR model on \(\mathbb{R}_+^2\) with independent components (no off-diagonal terms in \(\alpha_1\) or \(\alpha_2\)), show that the instantaneous correlation between \(X_t^{(1)}\) and \(X_t^{(2)}\) is zero. Can a two-factor CIR model ever produce nonzero instantaneous correlation while maintaining admissibility?
Solution to Exercise 2
With independent components, \((\alpha_1)_{12} = (\alpha_1)_{21} = 0\) and \((\alpha_2)_{12} = (\alpha_2)_{21} = 0\). The off-diagonal covariance is
\[a_{12}(x) = (a_0)_{12} + (\alpha_1)_{12}\,x^{(1)} + (\alpha_2)_{12}\,x^{(2)} = (a_0)_{12}.\]
By admissibility condition (A5), \((a_0)_{11} = (a_0)_{22} = 0\) for CIR components. Positive semi-definiteness of \(a_0\) then requires \((a_0)_{12}^2 \leq (a_0)_{11}(a_0)_{22} = 0\), so \((a_0)_{12} = 0\). Hence \(a_{12}(x) = 0\) identically, and the instantaneous correlation is zero.
Can a two-factor CIR model produce nonzero correlation? Yes, but only through state-dependent cross-terms. If \((\alpha_1)_{12} \neq 0\), then \(a_{12}(x) = (\alpha_1)_{12}x^{(1)}\), which is nonzero when \(x^{(1)} > 0\). However, admissibility requires \(\alpha_1 \succeq 0\). Since \((\alpha_1)_{11} = \xi_1^2\) and \((\alpha_1)_{22} = 0\) (by condition A4, the diffusion of the second CIR component cannot depend on \(x^{(1)}\)), positive semi-definiteness forces \((\alpha_1)_{12} = 0\). By symmetry, \((\alpha_2)_{12} = 0\). Therefore, a two-factor CIR model on \(\mathbb{R}_+^2\) with standard admissibility conditions cannot produce nonzero instantaneous correlation between the two components.
Exercise 3. Consider a three-factor model on \(\mathbb{R}_+ \times \mathbb{R}^2\) where the CIR component \(V_t\) drives the diffusion of both Gaussian factors. Write the diffusion matrix \(a(x) = a_0 + \alpha_1 V\) and compute the instantaneous correlation between the two Gaussian factors as a function of \(V_t\). Is this correlation time-varying?
Solution to Exercise 3
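Let the state be \((V_t, Y_t^{(1)}, Y_t^{(2)})^\top \in \mathbb{R}_+ \times \mathbb{R}^2\). The CIR component \(V_t\) drives the diffusion of both Gaussian factors. The diffusion matrix is:
\[a(V) = a_0 + \alpha_1 V,\]
where
\[a_0 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \sigma_1^2 & \sigma_{12} \\ 0 & \sigma_{12} & \sigma_2^2 \end{pmatrix}, \qquad \alpha_1 = \begin{pmatrix} \xi^2 & (\alpha_1)_{12} & (\alpha_1)_{13} \\ (\alpha_1)_{12} & \eta_1^2 & \eta_1\eta_2 \\ (\alpha_1)_{13} & \eta_1\eta_2 & \eta_2^2 \end{pmatrix}.\]
The entries \((\alpha_1)_{12}\) and \((\alpha_1)_{13}\) couple \(V\) to the Gaussian factors and play no role in the correlation computed below.
The covariance between the two Gaussian factors is
\[a_{23}(V) = \sigma_{12} + \eta_1\eta_2 V.\]
Their variances are \(a_{22}(V) = \sigma_1^2 + \eta_1^2 V\) and \(a_{33}(V) = \sigma_2^2 + \eta_2^2 V\). The instantaneous correlation is
\[\rho_{23}(V) = \frac{\sigma_{12} + \eta_1\eta_2 V}{\sqrt{(\sigma_1^2 + \eta_1^2 V)(\sigma_2^2 + \eta_2^2 V)}}.\]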
This correlation is time-varying (stochastic) because it depends on \(V_t\), which is itself a stochastic process. As \(V_t \to 0\), the correlation approaches \(\sigma_{12}/(\sigma_1\sigma_2)\) (the constant Gaussian correlation). As \(V_t \to \infty\), it approaches \(\eta_1\eta_2/(|\eta_1||\eta_2|) = \operatorname{sign}(\eta_1\eta_2)\). The correlation interpolates between these limits as the volatility factor evolves.
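The interpolation between the two limits can be seen numerically. The sketch below evaluates \((\sigma_{12} + \eta_1\eta_2 V)/\sqrt{(\sigma_1^2 + \eta_1^2 V)(\sigma_2^2 + \eta_2^2 V)}\) for hypothetical parameter values in which the constant part has correlation \(0.2\) and \(\eta_1\eta_2 < 0\). The function name and all numbers are assumptions made for illustration.

```python
import math

def corr_gaussian_pair(v, sigma1, sigma2, sigma12, eta1, eta2):
    """Correlation between two Gaussian factors whose diffusion is partly driven by V."""
    cov = sigma12 + eta1 * eta2 * v
    var1 = sigma1 ** 2 + eta1 ** 2 * v
    var2 = sigma2 ** 2 + eta2 ** 2 * v
    return cov / math.sqrt(var1 * var2)

# Hypothetical parameters: constant correlation 0.004 / (0.1 * 0.2) = 0.2, sign(eta1 * eta2) = -1.
params = dict(sigma1=0.1, sigma2=0.2, sigma12=0.004, eta1=0.5, eta2=-0.4)
for v in (0.0, 0.01, 0.04, 0.25, 1.0, 100.0):
    print(f"V={v}: rho_23={corr_gaussian_pair(v, **params):.4f}")
```

In this example the correlation changes sign as \(V\) grows, moving from the constant Gaussian correlation toward \(\operatorname{sign}(\eta_1\eta_2)\).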
Exercise 4. Explain the leverage effect in the context of the Heston model: why does \(\rho < 0\) imply that stock price declines are associated with volatility increases? Derive the covariance \(\operatorname{Cov}(d\log S_t, dV_t) = \rho\xi V_t\,dt\) and interpret the state-dependence on \(V_t\).
Solution to Exercise 4
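In the Heston model, \(d\log S_t\) and \(dV_t\) have diffusion terms \(\sqrt{V_t}\,dW_t^S\) and \(\xi\sqrt{V_t}\,dW_t^V\) with \(d\langle W^S, W^V\rangle_t = \rho\,dt\), so the covariance between \(d\log S_t\) and \(dV_t\) is the \((1,2)\) entry of \(a(V_t)\,dt\):
\[\operatorname{Cov}(d\log S_t, dV_t) = \sqrt{V_t}\cdot\xi\sqrt{V_t}\cdot\rho\,dt = \rho\xi V_t\,dt.\]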
The leverage effect: When \(\rho < 0\), the covariance \(\rho\xi V_t\,dt < 0\), meaning that negative innovations to \(\log S_t\) (price declines) are associated with positive innovations to \(V_t\) (volatility increases). Economically, when the stock price drops, the firm's leverage ratio (debt/equity) increases, making the equity riskier and increasing its volatility. This is the leverage effect.
State-dependence on \(V_t\): The covariance is proportional to \(V_t\), so the leverage coupling is stronger when volatility is already high. In high-volatility environments, a given price decline produces a larger volatility increase than the same decline would in a low-volatility environment. This amplification effect is consistent with the empirical observation that volatility clustering tends to intensify during market stress.
Note that while the covariance is state-dependent (proportional to \(V_t\)), the correlation itself is \(\rho\), a constant. The varying covariance reflects the fact that both variables become more volatile when \(V_t\) is large, but their linear dependence structure (as measured by correlation) remains fixed.
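A small simulation makes the distinction between covariance and correlation concrete. The sketch below draws one-step Euler diffusion increments of \((\log S, V)\) at two fixed variance levels, with drifts omitted, and compares sample covariance and sample correlation. The step size, sample size, seed, and parameter values are all assumptions made for illustration, and the sample statistics are only approximations to \(\rho\xi V\,dt\) and \(\rho\).

```python
import math
import random

def simulate_increments(v, rho, xi, dt, n, seed=0):
    """Draw n Euler diffusion increments of (log S, V) at a fixed variance level v."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second normal with correlation rho to z1.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        d_log_s = math.sqrt(v * dt) * z1       # drift omitted
        d_v = xi * math.sqrt(v * dt) * z2      # drift omitted
        out.append((d_log_s, d_v))
    return out

def sample_cov_corr(pairs):
    """Return the sample covariance and sample correlation of a list of pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs) / (n - 1)
    sxx = sum((p[0] - mx) ** 2 for p in pairs) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in pairs) / (n - 1)
    return sxy, sxy / math.sqrt(sxx * syy)

rho, xi, dt = -0.7, 0.4, 1.0 / 252.0   # hypothetical parameters, daily step
cov_lo, corr_lo = sample_cov_corr(simulate_increments(0.02, rho, xi, dt, 20000))
cov_hi, corr_hi = sample_cov_corr(simulate_increments(0.08, rho, xi, dt, 20000))
print("low variance: ", cov_lo, corr_lo)
print("high variance:", cov_hi, corr_hi)
```

Because both runs reuse the same seed, the only difference between them is the scaling by \(\sqrt{V}\): the covariance grows in proportion to \(V\) while the sample correlation is unchanged.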
Exercise 5. For the Dai-Singleton \(A_1(3)\) model with one CIR factor and two Gaussian factors, what is the maximum number of free correlation parameters? Compare this to the three independent correlations in a general \(3 \times 3\) correlation matrix, and explain why affine models have restricted correlation flexibility.
Solution to Exercise 5
The \(A_1(3)\) model has state \((V_t, Y_t^{(1)}, Y_t^{(2)})^\top\) with \(V_t \in \mathbb{R}_+\) (one CIR factor) and \(Y_t^{(1)}, Y_t^{(2)} \in \mathbb{R}\) (two Gaussian factors).
The correlation structure involves three pairs: \((V, Y^{(1)})\), \((V, Y^{(2)})\), and \((Y^{(1)}, Y^{(2)})\). In a general \(3 \times 3\) correlation matrix, there are 3 independent correlation parameters.
In the affine framework, the covariance is \(a(x) = a_0 + \alpha_1 V\). The free correlation parameters are:
- \((a_0)_{23}\): the constant part of the covariance between the two Gaussian components (1 parameter)
- \((\alpha_1)_{12}\): the \(V\)-dependent covariance between \(V\) and \(Y^{(1)}\) (1 parameter)
- \((\alpha_1)_{13}\): the \(V\)-dependent covariance between \(V\) and \(Y^{(2)}\) (1 parameter)
- \((\alpha_1)_{23}\): the \(V\)-dependent covariance between \(Y^{(1)}\) and \(Y^{(2)}\) (1 parameter)
However, the matrix \(\alpha_1\) must be positive semi-definite, and its rank is constrained. Since \((\alpha_1)_{11} = \xi^2\) and \(\alpha_1 \succeq 0\), the off-diagonal entries are not all independently free --- they are constrained by \(\det(\alpha_1) \geq 0\) and the \(2 \times 2\) minor conditions.
The maximum number of free correlation-related parameters is 3 (from \((\alpha_1)_{12}\), \((\alpha_1)_{13}\), and \((a_0)_{23}\)), but they are jointly constrained by positive semi-definiteness. In contrast, a general \(3 \times 3\) correlation matrix has 3 free parameters with only the constraint that the matrix is positive definite. The affine model restricts these correlations because \((V, Y^{(1)})\) and \((V, Y^{(2)})\) correlations must arise through the single CIR factor, and the Gaussian-Gaussian correlation is the sum of a constant part and a \(V\)-dependent part, both constrained by the rank structure of \(\alpha_1\).
Exercise 6. In calibrating the Heston model to equity options, the correlation parameter \(\rho\) primarily controls the skew of the implied volatility smile. Describe qualitatively how the implied volatility smile changes as \(\rho\) varies from \(-0.9\) to \(0\). Why can't the affine structure generate a purely symmetric smile with \(\rho = 0\) and still match observed market skews?
Solution to Exercise 6
Effect of \(\rho\) on the implied volatility smile:
- \(\rho = -0.9\) (strongly negative): The implied volatility surface exhibits a pronounced negative skew. Out-of-the-money puts (low strikes) have substantially higher implied volatility than out-of-the-money calls (high strikes). This is because price declines are strongly correlated with volatility increases, making large downward moves more likely under the risk-neutral measure.
- \(\rho = -0.5\) (moderately negative): The skew is present but less extreme. The implied volatility curve still slopes downward from left to right, but the difference between low-strike and high-strike implied volatilities is smaller.
- \(\rho = 0\) (zero correlation): The implied volatility smile is approximately symmetric around the at-the-money level. Volatility is stochastic but uncorrelated with returns, so large moves in either direction are equally likely to occur during high-volatility periods. The smile has curvature (due to the vol-of-vol parameter \(\xi\)) but no directional tilt.
Why \(\rho = 0\) cannot match observed skews: Empirical equity implied volatility surfaces display a strong negative skew (the "volatility smirk"), with put options trading at significantly higher implied volatilities than calls. Setting \(\rho = 0\) produces only curvature (a smile) but no skew (no asymmetry). The affine Heston model has no other mechanism to generate skew: the parameter \(\rho\) is the sole driver of asymmetry. Other model features (mean-reversion speed \(\kappa\), vol-of-vol \(\xi\), long-run variance \(\theta\)) control the level and curvature of the smile but not its tilt. Therefore, \(\rho = 0\) is incompatible with observed market data, and calibration to equity index options typically yields \(\rho < 0\).