Monetary Risk Measures¶

Monetary risk measures form the axiomatic foundation for the theory of risk quantification. By requiring only two minimal axioms -- monotonicity and translation invariance -- they provide the broadest framework within which coherent, convex, and more general risk measures are systematically organized. This section develops the hierarchy from monetary through convex to coherent risk measures, with dual representations and key examples at each level.

The Axiomatic Framework¶

Setting¶

Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a probability space. A risk measure is a mapping:

\[ \rho: \mathcal{X} \to \mathbb{R} \cup \{+\infty\} \]

where \(\mathcal{X}\) is a space of financial positions (random variables representing losses). We work with \(\mathcal{X} = L^\infty(\Omega, \mathcal{F}, \mathbb{P})\) (bounded random variables) or \(\mathcal{X} = L^p\) for \(p \ge 1\).

Convention: Positive values of \(X \in \mathcal{X}\) represent losses, so higher \(\rho(X)\) means greater risk.

Monetary Risk Measures¶

Definition

A risk measure \(\rho: \mathcal{X} \to \mathbb{R}\) is monetary if it satisfies:

Monotonicity: If \(X \le Y\) a.s., then \(\rho(X) \le \rho(Y)\)
Translation invariance (cash invariance): \(\rho(X + c) = \rho(X) + c\) for all \(c \in \mathbb{R}\)

Interpretation¶

Monotonicity says that higher losses require more capital. If position \(X\) is dominated by \(Y\) (i.e., \(Y\) produces at least as large a loss in every state), then \(Y\) is at least as risky.

Translation invariance says that adding a certain loss \(c\) increases the risk by exactly \(c\). Equivalently, setting aside cash \(m = \rho(X)\) reduces the risk to zero:

\[ \rho(X - \rho(X)) = \rho(X) - \rho(X) = 0 \]

This gives \(\rho(X)\) the interpretation of the minimum capital requirement: the smallest cash amount that, when added to the position, makes it acceptable.

Acceptance Set¶

The acceptance set of a monetary risk measure is:

\[ \mathcal{A}_\rho = \{X \in \mathcal{X} : \rho(X) \le 0\} \]

Proposition. For a monetary risk measure, \(\mathcal{A}_\rho\) satisfies:

\(\mathcal{A}_\rho\) is non-empty (contains \(X = c\) for sufficiently negative \(c\))
If \(X \in \mathcal{A}_\rho\) and \(Y \le X\) a.s., then \(Y \in \mathcal{A}_\rho\) (monotonicity)
\(\inf\{c : X - c \in \mathcal{A}_\rho\} = \rho(X)\) (reconstruction)

Conversely, any set \(\mathcal{A}\) satisfying these properties defines a monetary risk measure via the reconstruction formula. \(\square\)

Convex Risk Measures¶

Definition¶

Definition

A monetary risk measure \(\rho\) is convex if:

\[ \rho(\lambda X + (1-\lambda)Y) \le \lambda \rho(X) + (1-\lambda) \rho(Y) \quad \forall \lambda \in [0,1] \]

Interpretation¶

Convexity encodes a diversification principle: a mixture of positions is no riskier than the weighted average of individual risks. This is weaker than subadditivity; it allows for position-size-dependent risk but still rewards diversification.

Acceptance Set Characterization¶

\(\rho\) is convex if and only if \(\mathcal{A}_\rho\) is a convex set.

Proof. Suppose \(\rho\) is convex. If \(X, Y \in \mathcal{A}_\rho\) (i.e., \(\rho(X) \le 0\) and \(\rho(Y) \le 0\)), then for \(\lambda \in [0,1]\):

\[ \rho(\lambda X + (1-\lambda)Y) \le \lambda \rho(X) + (1-\lambda)\rho(Y) \le 0 \]

so \(\lambda X + (1-\lambda)Y \in \mathcal{A}_\rho\). The converse follows from the reconstruction formula. \(\square\)

Dual Representation of Convex Risk Measures¶

Theorem (Föllmer-Schied)

A convex risk measure \(\rho\) on \(L^\infty\) that is lower semicontinuous admits the representation:

\[ \rho(X) = \sup_{\mathbb{Q} \in \mathcal{M}_1} \left\{\mathbb{E}^{\mathbb{Q}}[X] - \alpha(\mathbb{Q})\right\} \]

where \(\mathcal{M}_1\) is the set of probability measures absolutely continuous with respect to \(\mathbb{P}\), and \(\alpha: \mathcal{M}_1 \to \mathbb{R} \cup \{+\infty\}\) is the penalty function:

\[ \alpha(\mathbb{Q}) = \sup_{X \in \mathcal{A}_\rho} \mathbb{E}^{\mathbb{Q}}[X] \]

Interpretation: The risk of \(X\) is the worst-case expected loss over all probability models \(\mathbb{Q}\), penalized by \(\alpha(\mathbb{Q})\) which measures the implausibility of \(\mathbb{Q}\). A low penalty means \(\mathbb{Q}\) is considered plausible; a high penalty means \(\mathbb{Q}\) is extreme and contributes less to the supremum.

The Entropic Risk Measure¶

The canonical convex risk measure is the entropic risk measure:

\[ \rho_\gamma(X) = \frac{1}{\gamma} \log \mathbb{E}[e^{\gamma X}] \]

where \(\gamma > 0\) is the risk aversion parameter.

Properties:

Monotone and translation invariant: \(\rho_\gamma(X + c) = \rho_\gamma(X) + c\)
Convex but not positively homogeneous: \(\rho_\gamma(2X) \ne 2\rho_\gamma(X)\) in general
Penalizes tail risk exponentially
Connected to exponential utility: an agent with utility \(U(w) = -e^{-\gamma w}\) prices risk via \(\rho_\gamma\)

Dual representation penalty:

\[ \alpha(\mathbb{Q}) = \frac{1}{\gamma} H(\mathbb{Q} \| \mathbb{P}) \]

where \(H(\mathbb{Q} \| \mathbb{P}) = \mathbb{E}^{\mathbb{Q}}\left[\log \frac{d\mathbb{Q}}{d\mathbb{P}}\right]\) is the relative entropy (Kullback-Leibler divergence). Thus:

\[ \rho_\gamma(X) = \sup_{\mathbb{Q}} \left\{\mathbb{E}^{\mathbb{Q}}[X] - \frac{1}{\gamma} H(\mathbb{Q} \| \mathbb{P})\right\} \]

Coherent Risk Measures¶

Definition¶

Definition

A monetary risk measure \(\rho\) is coherent (Artzner, Delbaen, Eber, Heath, 1999) if it satisfies:

Monotonicity: \(X \le Y\) a.s. \(\Rightarrow\) \(\rho(X) \le \rho(Y)\)
Translation invariance: \(\rho(X + c) = \rho(X) + c\)
Positive homogeneity: \(\rho(\lambda X) = \lambda \rho(X)\) for \(\lambda > 0\)
Subadditivity: \(\rho(X + Y) \le \rho(X) + \rho(Y)\)

Relationship to Convex Risk Measures¶

Proposition. Positive homogeneity + subadditivity \(\Rightarrow\) convexity.

Proof. For \(\lambda \in [0,1]\):

\[ \rho(\lambda X + (1-\lambda)Y) \le \rho(\lambda X) + \rho((1-\lambda)Y) = \lambda \rho(X) + (1-\lambda)\rho(Y) \]

where the inequality uses subadditivity and the equality uses positive homogeneity. \(\square\)

Thus: coherent \(\Rightarrow\) convex \(\Rightarrow\) monetary, and each inclusion is strict.

Acceptance Set Characterization¶

\(\rho\) is coherent if and only if \(\mathcal{A}_\rho\) is a convex cone (i.e., convex and closed under positive scaling).

Dual Representation of Coherent Risk Measures¶

Theorem (Artzner-Delbaen)

A coherent risk measure \(\rho\) on \(L^\infty\) that is lower semicontinuous admits the representation:

\[ \rho(X) = \sup_{\mathbb{Q} \in \mathcal{Q}} \mathbb{E}^{\mathbb{Q}}[X] \]

where \(\mathcal{Q} \subseteq \mathcal{M}_1\) is a convex, closed set of probability measures.

Interpretation: Coherent risk is the worst-case expected loss over a set of plausible scenarios. There is no penalty function -- equivalently, the penalty is zero on \(\mathcal{Q}\) and \(+\infty\) outside.

This is a special case of the convex dual with \(\alpha(\mathbb{Q}) = 0\) for \(\mathbb{Q} \in \mathcal{Q}\) and \(\alpha(\mathbb{Q}) = +\infty\) otherwise.

The Hierarchy of Risk Measures¶

Property	Monetary	Convex	Coherent
Monotonicity	Yes	Yes	Yes
Translation invariance	Yes	Yes	Yes
Convexity	Not required	Yes	Yes (implied)
Positive homogeneity	Not required	Not required	Yes
Subadditivity	Not required	Not required	Yes
Acceptance set	Monotone, nonempty	Convex	Convex cone
Dual representation	Not guaranteed	\(\sup_\mathbb{Q}\{\mathbb{E}^\mathbb{Q}[X] - \alpha(\mathbb{Q})\}\)	\(\sup_{\mathbb{Q} \in \mathcal{Q}} \mathbb{E}^\mathbb{Q}[X]\)

Key Examples¶

Value-at-Risk¶

\[ \text{VaR}_\alpha(X) = \inf\{x : \mathbb{P}(X \le x) \ge \alpha\} \]

Monetary: Yes (monotone and translation invariant)
Convex: No in general (fails convexity for non-elliptical distributions)
Coherent: No (fails subadditivity)
VaR is subadditive for elliptically distributed losses (e.g., multivariate normal)

Expected Shortfall¶

\[ \text{ES}_\alpha(X) = \frac{1}{1-\alpha} \int_\alpha^1 \text{VaR}_u(X) \, du \]

Monetary: Yes
Convex: Yes
Coherent: Yes
Dual set: \(\mathcal{Q} = \left\{\mathbb{Q} : \frac{d\mathbb{Q}}{d\mathbb{P}} \le \frac{1}{1-\alpha}\right\}\)

Entropic Risk Measure¶

\[ \rho_\gamma(X) = \frac{1}{\gamma} \log \mathbb{E}[e^{\gamma X}] \]

Monetary: Yes
Convex: Yes
Coherent: No (fails positive homogeneity)
Penalty: \(\alpha(\mathbb{Q}) = \frac{1}{\gamma} H(\mathbb{Q} \| \mathbb{P})\)

Worst-Case Risk Measure¶

\[ \rho_{\text{max}}(X) = \text{ess\,sup}(X) \]

Monetary: Yes
Convex: Yes
Coherent: Yes (with \(\mathcal{Q} = \mathcal{M}_1\), all probability measures)
Maximally conservative: the worst possible outcome

From Static to Dynamic¶

Monetary risk measures provide the foundation for dynamic (conditional) risk measures. The conditional versions replace the axioms with \(\mathcal{F}_t\)-measurable versions:

Conditional monotonicity: \(X \le Y\) a.s. \(\Rightarrow\) \(\rho_t(X) \le \rho_t(Y)\) a.s.
Conditional translation invariance: \(\rho_t(X + m) = \rho_t(X) + m\) for \(\mathcal{F}_t\)-measurable \(m\)

The conditional dual representation becomes:

\[ \rho_t(X) = \operatorname{ess\,sup}_{\mathbb{Q} \in \mathcal{Q}_t} \mathbb{E}^{\mathbb{Q}}[X \mid \mathcal{F}_t] \]

The time-consistency property \(\rho_s(X) = \rho_s(-\rho_t(X))\) for \(s < t\) connects conditional risk measures to BSDEs and the dynamic programming principle, bridging to the BSDE-based risk measures developed in this chapter.

Robust Representation and Model Uncertainty¶

The dual representation \(\rho(X) = \sup_{\mathbb{Q} \in \mathcal{Q}} \mathbb{E}^{\mathbb{Q}}[X]\) provides a natural framework for model uncertainty (ambiguity):

The set \(\mathcal{Q}\) represents plausible probability models
\(\rho(X)\) is the worst-case expected loss over all plausible models
An agent who is ambiguity-averse uses a coherent risk measure to hedge against model misspecification

This connects risk measurement to distributionally robust optimization:

\[ \min_{\mathbf{w}} \rho(L_{\mathbf{w}}) = \min_{\mathbf{w}} \sup_{\mathbb{Q} \in \mathcal{Q}} \mathbb{E}^{\mathbb{Q}}[L_{\mathbf{w}}] \]

which yields portfolios robust to model uncertainty.

Key Takeaways¶

Monetary risk measures require only monotonicity and translation invariance, forming the broadest axiomatic class
Convex risk measures add the diversification principle through convexity, with a dual representation via penalty functions
Coherent risk measures (Artzner axioms) add positive homogeneity and subadditivity, with a simpler dual representation as worst-case expectations
The hierarchy is strict: coherent \(\subsetneq\) convex \(\subsetneq\) monetary
VaR is monetary but not convex; ES is coherent; the entropic risk measure is convex but not coherent
The dual representation connects risk measurement to model uncertainty and robust optimization
Conditional versions of these axioms lead to dynamic risk measures and the BSDE framework

Exercises¶

Exercise 1. A monetary risk measure satisfies monotonicity and translation invariance. State each axiom formally. Verify that the worst-case risk measure \(\rho(X) = -\text{ess inf}(X)\) is monetary. Is it also coherent?

Solution to Exercise 1

Formal axioms. A risk measure \(\rho:\mathcal{X}\to\mathbb{R}\) on a space \(\mathcal{X}\) of financial positions (with the loss convention) is monetary if:

Monotonicity: For all \(X, Y \in \mathcal{X}\), if \(X \le Y\) a.s., then \(\rho(X) \le \rho(Y)\).
Translation invariance (cash invariance): For all \(X \in \mathcal{X}\) and \(c \in \mathbb{R}\), \(\rho(X + c) = \rho(X) + c\).

Verification for the worst-case risk measure. Recall that under the loss convention used in this chapter, the worst-case risk measure is

\[ \rho(X) = \operatorname{ess\,sup}(X) \]

(Note: the exercise writes \(\rho(X) = -\operatorname{ess\,inf}(X)\). Under the loss convention where \(X\) represents losses, we have \(-\operatorname{ess\,inf}(X) = \operatorname{ess\,sup}(-X)\). Both formulations define the worst-case measure depending on sign conventions. We verify the axioms for \(\rho(X) = -\operatorname{ess\,inf}(X)\).)

Monotonicity: If \(X \le Y\) a.s., then \(\inf X \ge \inf Y\) does not hold in general, but \(-\operatorname{ess\,inf}(X) \le -\operatorname{ess\,inf}(Y)\) needs care with the sign convention. Under the convention that \(X\) represents gains (positive = good), \(\rho(X) = -\operatorname{ess\,inf}(X)\) is the worst-case measure for gains, and monotonicity in that convention reads: \(X \ge Y\) a.s. \(\Rightarrow\) \(\rho(X) \le \rho(Y)\). Equivalently, using the loss convention of this chapter with \(\rho(X) = \operatorname{ess\,sup}(X)\):

If \(X \le Y\) a.s., then \(\operatorname{ess\,sup}(X) \le \operatorname{ess\,sup}(Y)\), since the supremum operator preserves order. Hence monotonicity holds.

Translation invariance: For any \(c \in \mathbb{R}\),

\[ \rho(X + c) = \operatorname{ess\,sup}(X + c) = \operatorname{ess\,sup}(X) + c = \rho(X) + c \]

so translation invariance holds. Thus \(\rho\) is monetary.

Is it coherent? We verify positive homogeneity and subadditivity.

Positive homogeneity: For \(\lambda > 0\),

\[ \rho(\lambda X) = \operatorname{ess\,sup}(\lambda X) = \lambda \operatorname{ess\,sup}(X) = \lambda \rho(X) \]

Subadditivity: For any \(X, Y\),

\[ \rho(X + Y) = \operatorname{ess\,sup}(X + Y) \le \operatorname{ess\,sup}(X) + \operatorname{ess\,sup}(Y) = \rho(X) + \rho(Y) \]

since the essential supremum is subadditive. Therefore \(\rho\) is coherent. Its dual representation corresponds to \(\mathcal{Q} = \mathcal{M}_1\) (the set of all probability measures absolutely continuous w.r.t. \(\mathbb{P}\)), making it the most conservative coherent risk measure.

Exercise 2. Show that VaR at level \(\alpha\) satisfies monotonicity and translation invariance (hence is monetary) but fails subadditivity. Construct an explicit example with two positions \(X\) and \(Y\) where \(\text{VaR}_\alpha(X + Y) > \text{VaR}_\alpha(X) + \text{VaR}_\alpha(Y)\).

Solution to Exercise 2

Monotonicity of VaR. Let \(\text{VaR}_\alpha(X) = \inf\{x : \mathbb{P}(X \le x) \ge \alpha\}\). If \(X \le Y\) a.s., then for every \(x\),

\[ \mathbb{P}(Y \le x) \le \mathbb{P}(X \le x) \]

because \(\{Y \le x\} \subseteq \{X \le x\}\) a.s. Hence if \(\mathbb{P}(X \le x) \ge \alpha\), it follows that \(\mathbb{P}(Y \le x) \ge \alpha\) is harder to achieve. More precisely, the quantile function \(F_X^{-1}(\alpha) \le F_Y^{-1}(\alpha)\) when \(X \le Y\) a.s., so \(\text{VaR}_\alpha(X) \le \text{VaR}_\alpha(Y)\).

Translation invariance. For any \(c \in \mathbb{R}\),

\[ \text{VaR}_\alpha(X + c) = \inf\{x : \mathbb{P}(X + c \le x) \ge \alpha\} = \inf\{x : \mathbb{P}(X \le x - c) \ge \alpha\} \]

Substituting \(x' = x - c\), this equals \(\inf\{x' + c : \mathbb{P}(X \le x') \ge \alpha\} = \text{VaR}_\alpha(X) + c\).

Hence VaR is monetary.

Failure of subadditivity -- explicit counterexample. Let \(\Omega = \{\omega_1, \omega_2, \ldots, \omega_{100}\}\) with uniform probability \(\mathbb{P}(\omega_i) = 1/100\). Define two positions \(X\) and \(Y\) as follows:

\(X(\omega_i) = 0\) for \(i = 1, \ldots, 99\), and \(X(\omega_{100}) = 100\)
\(Y(\omega_i) = 0\) for \(i = 1, \ldots, 99\), and \(Y(\omega_{99}) = 100\), with the events \(\{X = 100\}\) and \(\{Y = 100\}\) disjoint

More precisely, let \(X\) and \(Y\) be designed so that they each have a single large loss occurring in different states:

\(X(\omega_i) = \begin{cases} 100 & i = 100 \\ 0 & \text{otherwise}\end{cases}\)
\(Y(\omega_i) = \begin{cases} 100 & i = 99 \\ 0 & \text{otherwise}\end{cases}\)

At level \(\alpha = 0.95\):

\[ \text{VaR}_{0.95}(X) = 0, \quad \text{VaR}_{0.95}(Y) = 0 \]

since \(\mathbb{P}(X \le 0) = 99/100 = 0.99 \ge 0.95\) and similarly for \(Y\).

Now consider \(X + Y\):

\[ (X+Y)(\omega_i) = \begin{cases} 100 & i = 99 \text{ or } i = 100 \\ 0 & \text{otherwise}\end{cases} \]

We have \(\mathbb{P}(X + Y \le 0) = 98/100 = 0.98 \ge 0.95\), so \(\text{VaR}_{0.95}(X+Y) = 0\).

This particular example does not violate subadditivity. Let us refine it with more states. Consider \(\alpha = 0.95\) and a sample space with \(n = 20\) equally likely states. Define:

\(X(\omega_i) = \begin{cases} 100 & i = 20 \\ 0 & \text{otherwise}\end{cases}\)
\(Y(\omega_i) = \begin{cases} 100 & i = 19 \\ 0 & \text{otherwise}\end{cases}\)

Then \(\mathbb{P}(X \le 0) = 19/20 = 0.95 \ge 0.95\), so \(\text{VaR}_{0.95}(X) = 0\). Similarly \(\text{VaR}_{0.95}(Y) = 0\).

For \(X + Y\): \(\mathbb{P}(X+Y \le 0) = 18/20 = 0.90 < 0.95\). Hence \(\text{VaR}_{0.95}(X+Y) = 100\).

Therefore:

\[ \text{VaR}_{0.95}(X + Y) = 100 > 0 = \text{VaR}_{0.95}(X) + \text{VaR}_{0.95}(Y) \]

This demonstrates the failure of subadditivity. Diversification (combining \(X\) and \(Y\)) actually increases VaR because the two rare losses, while individually below the quantile threshold, jointly push the combined loss above it.

Exercise 3. A convex risk measure satisfies monotonicity, translation invariance, and convexity: \(\rho(\lambda X + (1-\lambda)Y) \le \lambda\rho(X) + (1-\lambda)\rho(Y)\). Show that every coherent risk measure is convex, but not vice versa. Provide an example of a convex but not coherent risk measure (hint: the entropic risk measure).

Solution to Exercise 3

Every coherent risk measure is convex. A coherent risk measure satisfies positive homogeneity and subadditivity. For \(\lambda \in [0,1]\):

\[ \rho(\lambda X + (1-\lambda)Y) \le \rho(\lambda X) + \rho((1-\lambda)Y) = \lambda\rho(X) + (1-\lambda)\rho(Y) \]

where the inequality uses subadditivity, and the equality uses positive homogeneity (applied with scalars \(\lambda\) and \(1-\lambda\)). This is exactly the convexity condition.

Not every convex risk measure is coherent. The entropic risk measure provides a counterexample. Define:

\[ \rho_\gamma(X) = \frac{1}{\gamma}\log\mathbb{E}[e^{\gamma X}] \]

for \(\gamma > 0\).

Convexity: By Hölder's inequality (or the log-convexity of the exponential moment generating function), for \(\lambda \in [0,1]\):

\[ \mathbb{E}[e^{\gamma(\lambda X + (1-\lambda)Y)}] = \mathbb{E}[e^{\gamma\lambda X} \cdot e^{\gamma(1-\lambda)Y}] \]

By Hölder's inequality with conjugate exponents \(p = 1/\lambda\) and \(q = 1/(1-\lambda)\):

\[ \mathbb{E}[e^{\gamma\lambda X} \cdot e^{\gamma(1-\lambda)Y}] \le \left(\mathbb{E}[e^{\gamma X}]\right)^\lambda \left(\mathbb{E}[e^{\gamma Y}]\right)^{1-\lambda} \]

Taking \(\frac{1}{\gamma}\log\) of both sides:

\[ \rho_\gamma(\lambda X + (1-\lambda)Y) \le \lambda \rho_\gamma(X) + (1-\lambda)\rho_\gamma(Y) \]

So \(\rho_\gamma\) is convex.

Failure of positive homogeneity: Consider a constant \(\lambda > 0\) with \(\lambda \ne 1\). For any non-degenerate random variable \(X\):

\[ \rho_\gamma(\lambda X) = \frac{1}{\gamma}\log\mathbb{E}[e^{\gamma\lambda X}] = \frac{1}{\gamma}\log\mathbb{E}[e^{(\gamma\lambda) X}] \]

If positive homogeneity held, we would need \(\rho_\gamma(\lambda X) = \lambda \rho_\gamma(X)\), i.e.,

\[ \frac{1}{\gamma}\log\mathbb{E}[e^{\gamma\lambda X}] = \frac{\lambda}{\gamma}\log\mathbb{E}[e^{\gamma X}] \]

This fails in general. For example, let \(X \sim N(0,1)\) and compute:

\[ \rho_\gamma(\lambda X) = \frac{1}{\gamma}\log e^{\gamma^2\lambda^2/2} = \frac{\gamma\lambda^2}{2} \]

\[ \lambda\rho_\gamma(X) = \lambda \cdot \frac{\gamma}{2} = \frac{\gamma\lambda}{2} \]

These are equal only when \(\lambda = 1\). Since positive homogeneity fails, \(\rho_\gamma\) is not coherent.

Exercise 4. The dual representation of a coherent risk measure is \(\rho(X) = \sup_{\mathbb{Q} \in \mathcal{Q}} \mathbb{E}^{\mathbb{Q}}[-X]\) for some set of probability measures \(\mathcal{Q}\). For Expected Shortfall at level \(\alpha\), describe the set \(\mathcal{Q}\) explicitly. Why does this representation have the interpretation of "worst-case expected loss"?

Solution to Exercise 4

Dual representation of Expected Shortfall. The coherent dual representation is:

\[ \text{ES}_\alpha(X) = \sup_{\mathbb{Q} \in \mathcal{Q}} \mathbb{E}^{\mathbb{Q}}[-X] \]

For Expected Shortfall at level \(\alpha\), the set \(\mathcal{Q}\) is:

\[ \mathcal{Q} = \left\{\mathbb{Q} \ll \mathbb{P} \;\bigg|\; \frac{d\mathbb{Q}}{d\mathbb{P}} \le \frac{1}{1-\alpha}\right\} \]

That is, \(\mathcal{Q}\) consists of all probability measures \(\mathbb{Q}\) absolutely continuous with respect to \(\mathbb{P}\) whose Radon-Nikodym derivative is bounded above by \(\frac{1}{1-\alpha}\).

Why this set? The constraint \(\frac{d\mathbb{Q}}{d\mathbb{P}} \le \frac{1}{1-\alpha}\) means that \(\mathbb{Q}\) can place at most \(\frac{1}{1-\alpha}\) times as much weight on any event as \(\mathbb{P}\) does. The extremal measure in \(\mathcal{Q}\) that achieves the supremum concentrates all its extra weight on the worst \((1-\alpha)\) fraction of outcomes. Specifically, the maximizing \(\mathbb{Q}^*\) has:

\[ \frac{d\mathbb{Q}^*}{d\mathbb{P}} = \frac{1}{1-\alpha} \mathbf{1}_{\{X \ge \text{VaR}_\alpha(X)\}} \]

(with an appropriate adjustment at the boundary if the distribution has an atom at \(\text{VaR}_\alpha\)). Under this \(\mathbb{Q}^*\):

\[ \mathbb{E}^{\mathbb{Q}^*}[X] = \frac{1}{1-\alpha}\mathbb{E}[X \cdot \mathbf{1}_{\{X \ge \text{VaR}_\alpha(X)\}}] = \text{ES}_\alpha(X) \]

Worst-case expected loss interpretation. The representation \(\text{ES}_\alpha(X) = \sup_{\mathbb{Q} \in \mathcal{Q}} \mathbb{E}^{\mathbb{Q}}[X]\) says that ES is the worst-case expected loss over all probability models in \(\mathcal{Q}\). Each \(\mathbb{Q} \in \mathcal{Q}\) can be viewed as a "stress scenario" that reweights the probability of different outcomes. The constraint \(\frac{d\mathbb{Q}}{d\mathbb{P}} \le \frac{1}{1-\alpha}\) limits how extreme these stress scenarios can be -- the adversary cannot place arbitrarily large weight on bad outcomes. The worst-case adversary concentrates attention on the tail, yielding the average loss in the worst \((1-\alpha)\) fraction of scenarios, which is precisely the definition of Expected Shortfall.

Exercise 5. Verify that the entropic risk measure \(\rho_\theta(X) = \frac{1}{\theta}\ln\mathbb{E}[e^{-\theta X}]\) satisfies monotonicity and translation invariance. Show that it is convex but not positively homogeneous (hence not coherent). What role does the parameter \(\theta\) play as a risk aversion coefficient?

Solution to Exercise 5

We verify each property for \(\rho_\theta(X) = \frac{1}{\theta}\ln\mathbb{E}[e^{-\theta X}]\).

Note on convention. The exercise uses \(\rho_\theta(X) = \frac{1}{\theta}\ln\mathbb{E}[e^{-\theta X}]\) with a negative sign in the exponent. This corresponds to a gain convention where \(X\) represents gains. Under the loss convention used earlier in the chapter (\(\rho_\gamma(X) = \frac{1}{\gamma}\log\mathbb{E}[e^{\gamma X}]\)), the formulas are equivalent with appropriate sign adjustments. We verify the properties as stated.

Monotonicity. If \(X \le Y\) a.s., then \(-\theta X \ge -\theta Y\) a.s. (for \(\theta > 0\)), so \(e^{-\theta X} \ge e^{-\theta Y}\) a.s. Hence \(\mathbb{E}[e^{-\theta X}] \ge \mathbb{E}[e^{-\theta Y}]\), and since \(\ln\) is increasing and \(1/\theta > 0\):

\[ \rho_\theta(X) = \frac{1}{\theta}\ln\mathbb{E}[e^{-\theta X}] \ge \frac{1}{\theta}\ln\mathbb{E}[e^{-\theta Y}] = \rho_\theta(Y) \]

Under the gain convention, this says higher gains lead to lower risk, which is the correct direction: \(X \le Y \Rightarrow \rho_\theta(X) \ge \rho_\theta(Y)\).

Translation invariance. For \(c \in \mathbb{R}\):

\[ \rho_\theta(X + c) = \frac{1}{\theta}\ln\mathbb{E}[e^{-\theta(X+c)}] = \frac{1}{\theta}\ln\left(e^{-\theta c}\mathbb{E}[e^{-\theta X}]\right) = \frac{1}{\theta}\left(-\theta c + \ln\mathbb{E}[e^{-\theta X}]\right) = \rho_\theta(X) - c \]

This is translation invariance under the gain convention: adding a sure gain \(c\) reduces risk by \(c\).

Convexity. For \(\lambda \in [0,1]\), by Hölder's inequality with exponents \(1/\lambda\) and \(1/(1-\lambda)\):

\[ \mathbb{E}[e^{-\theta(\lambda X + (1-\lambda)Y)}] = \mathbb{E}\left[(e^{-\theta X})^\lambda (e^{-\theta Y})^{1-\lambda}\right] \le \left(\mathbb{E}[e^{-\theta X}]\right)^\lambda \left(\mathbb{E}[e^{-\theta Y}]\right)^{1-\lambda} \]

Taking \(\frac{1}{\theta}\ln\):

\[ \rho_\theta(\lambda X + (1-\lambda)Y) \le \lambda\,\rho_\theta(X) + (1-\lambda)\,\rho_\theta(Y) \]

Failure of positive homogeneity. Let \(X \sim N(0,1)\) and \(\lambda = 2\). Then:

\[ \rho_\theta(2X) = \frac{1}{\theta}\ln\mathbb{E}[e^{-2\theta X}] = \frac{1}{\theta}\ln e^{2\theta^2} = 2\theta \]

\[ 2\,\rho_\theta(X) = \frac{2}{\theta}\ln\mathbb{E}[e^{-\theta X}] = \frac{2}{\theta}\ln e^{\theta^2/2} = \theta \]

Since \(2\theta \ne \theta\) for \(\theta > 0\), positive homogeneity fails. Therefore \(\rho_\theta\) is convex but not coherent.

Role of \(\theta\) as risk aversion. The parameter \(\theta > 0\) controls the degree of risk aversion:

As \(\theta \to 0^+\), a Taylor expansion gives \(\rho_\theta(X) \to -\mathbb{E}[X]\) (risk-neutral assessment).
As \(\theta \to \infty\), \(\rho_\theta(X) \to -\operatorname{ess\,inf}(X)\) (worst-case assessment).
For intermediate \(\theta\), the exponential weighting \(e^{-\theta X}\) penalizes large losses (small \(X\)) more heavily, with the penalty increasing in \(\theta\).

The entropic risk measure is directly linked to exponential utility: an agent with utility \(U(w) = -e^{-\theta w}\) computes their certainty equivalent via \(\rho_\theta\), so \(\theta\) is literally the Arrow-Pratt coefficient of absolute risk aversion.

Exercise 6. Explain the hierarchy: monetary \(\supset\) convex \(\supset\) coherent risk measures. For each level, provide a representative example and state which additional axioms are required. Why did the Artzner et al. (1999) coherence axioms become the standard framework for regulatory risk measurement?

Solution to Exercise 6

The hierarchy. The three classes form a strict chain of inclusions:

\[ \text{Coherent} \subsetneq \text{Convex} \subsetneq \text{Monetary} \]

Level 1: Monetary risk measures. Axioms required: (i) monotonicity, (ii) translation invariance. Representative example: Value-at-Risk \(\text{VaR}_\alpha(X) = \inf\{x : \mathbb{P}(X \le x) \ge \alpha\}\). VaR satisfies monotonicity and translation invariance but fails convexity (and hence subadditivity) as shown in Exercise 2.

Level 2: Convex risk measures. Additional axiom beyond monetary: (iii) convexity: \(\rho(\lambda X + (1-\lambda)Y) \le \lambda\rho(X) + (1-\lambda)\rho(Y)\). Representative example: Entropic risk measure \(\rho_\gamma(X) = \frac{1}{\gamma}\log\mathbb{E}[e^{\gamma X}]\). It satisfies monotonicity, translation invariance, and convexity, but fails positive homogeneity (hence is not coherent), as shown in Exercise 5.

Level 3: Coherent risk measures. Additional axioms beyond monetary: (iii) positive homogeneity: \(\rho(\lambda X) = \lambda\rho(X)\) for \(\lambda > 0\), and (iv) subadditivity: \(\rho(X+Y) \le \rho(X) + \rho(Y)\). (These together imply convexity.) Representative example: Expected Shortfall \(\text{ES}_\alpha(X) = \frac{1}{1-\alpha}\int_\alpha^1 \text{VaR}_u(X)\,du\).

Why each inclusion is strict:

VaR is monetary but not convex \(\Rightarrow\) monetary \(\supsetneq\) convex.
The entropic risk measure is convex but not coherent \(\Rightarrow\) convex \(\supsetneq\) coherent.

Why coherence became the regulatory standard. The Artzner et al. (1999) axioms became the standard for several reasons:

Subadditivity captures diversification. Regulators want risk measures that reward diversification: the risk of a combined portfolio should not exceed the sum of individual risks. VaR's failure of subadditivity means it can penalize diversification, which is economically perverse.
Positive homogeneity ensures proportionality. Doubling a position should double its risk. This makes capital requirements scale linearly with position size, which is intuitive and operationally simple.
Dual representation provides robustness. The representation \(\rho(X) = \sup_{\mathbb{Q} \in \mathcal{Q}} \mathbb{E}^{\mathbb{Q}}[X]\) means coherent risk measures can be interpreted as worst-case expected losses under model uncertainty. This aligns with the regulatory goal of conservatism and robustness.
Basel III/IV adopted ES. Partly motivated by the theoretical advantages of coherence, the Basel Committee replaced VaR with Expected Shortfall as the primary market risk measure in the Fundamental Review of the Trading Book (FRTB), explicitly citing the subadditivity failure of VaR.
Mathematical tractability. Coherent risk measures have cleaner dual representations (no penalty function), which simplifies optimization, capital allocation (via the Euler allocation), and hedging computations.

Monetary Risk Measures¶

The Axiomatic Framework¶

Setting¶

Monetary Risk Measures¶

Interpretation¶

Acceptance Set¶

Convex Risk Measures¶

Definition¶

Interpretation¶

Acceptance Set Characterization¶

Dual Representation of Convex Risk Measures¶

The Entropic Risk Measure¶

Coherent Risk Measures¶

Definition¶

Relationship to Convex Risk Measures¶

Acceptance Set Characterization¶

Dual Representation of Coherent Risk Measures¶

The Hierarchy of Risk Measures¶

Key Examples¶

Value-at-Risk¶

Expected Shortfall¶

Entropic Risk Measure¶

Worst-Case Risk Measure¶

From Static to Dynamic¶

Robust Representation and Model Uncertainty¶

Key Takeaways¶

Further Reading¶

Exercises¶