Long-Time Trajectorial Large Deviations and Importance Sampling for Affine Stochastic Volatility Models

Zorana Grbac; David Krief; Peter Tankov

doi:10.1017/apr.2020.58

Long-Time Trajectorial Large Deviations and Importance Sampling for Affine Stochastic Volatility Models

Part of: Limit theorems Mathematical finance

Published online by Cambridge University Press: 17 March 2021

Zorana Grbac ,

David Krief and

Peter Tankov

Show author details

Zorana Grbac*: Affiliation:
Université de Paris
David Krief*: Affiliation:
Université de Paris
Peter Tankov*: Affiliation:
ENSAE, Institut Polytechnique de Paris
*: *Postal address: 5 rue Thomas Mann, 75013Paris, France.
*Postal address: 5 rue Thomas Mann, 75013Paris, France.
****Postal address: 5 avenue Henry Le Chatelier, 91120Palaiseau, France. Email address: peter.tankov@ensae.fr

Article contents

Abstract
Introduction
Model description
Large deviations theory
Trajectorial large deviations for ASV models
Variance reduction
Numerical examples
References

Rights & Permissions

Abstract

We establish a pathwise large deviation principle for affine stochastic volatility models introduced by Keller-Ressel (2011), and present an application to variance reduction for Monte Carlo computation of prices of path-dependent options in these models, extending the method developed by Genin and Tankov (2020) for exponential Lévy models. To this end, we apply an exponentially affine change of measure and use Varadhan’s lemma, in the fashion of Guasoni and Robertson (2008) and Robertson (2010), to approximate the problem of finding the measure that minimizes the variance of the Monte Carlo estimator. We test the method on the Heston model with and without jumps to demonstrate its numerical efficiency.

Keywords

Large deviations Monte Carlo methods importance sampling affine stochastic volatility

MSC classification

Primary: 60F10: Large deviations

Secondary: 91G60: Numerical methods (including Monte Carlo methods)

Type: Original Article
Information: Advances in Applied Probability , Volume 53 , Issue 1 , March 2021 , pp. 220 - 250

DOI: https://doi.org/10.1017/apr.2020.58 [Opens in a new window]
Copyright: © The Author(s), 2021. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

The aim of this paper is to develop efficient importance sampling estimators for prices of path-dependent options in affine stochastic volatility (ASV) models of asset prices. To this end, we establish pathwise large deviation results for these models, which are of independent interest.

An ASV model, studied by Keller-Ressel in [Reference Keller-Ressel14], is a two-dimensional affine process (X,V) on $\mathbb R\times\mathbb R_+$ with special properties, where X models the logarithm of the stock price and V its instantaneous variance. This class includes many well studied and widely used models, such as Heston’s stochastic volatility model [Reference Heston12], the model of Bates [Reference Bates3], the stochastic volatility model of Barndorff-Nielsen and Shephard [Reference Barndorff-Nielsen and Shephard2], and time-changed Lévy models with independent affine time change. European options in ASV models may be priced by Fourier transform, but for path-dependent options explicit formulas are in general not available and Monte Carlo is often the method of choice. At the same time, Monte Carlo simulation of such processes is difficult and time-consuming: the convergence rates of discretization schemes are often low because of the irregular nature of the coefficients of the corresponding stochastic differential equations. To accelerate Monte Carlo simulation, it is thus important to develop efficient variance-reduction algorithms for these models.

In this paper, we therefore develop an importance sampling algorithm for ASV models. Let $(\Omega, \mathcal{F}, \mathbb P)$ be a given probability space, and denote by $\mathbb E [{\cdot}]$ the expectation with respect to $\mathbb P$. The importance sampling method is based on the following identity, valid for any probability measure $\mathbb Q$, with respect to which $\mathbb P$ is absolutely continuous. Let P be a deterministic function of a random trajectory S; then

\begin{equation*}\mathbb E[P(S)] = \mathbb E^{\mathbb Q}\left[\frac{d\mathbb P}{d\mathbb Q}P(S)\right].\end{equation*}

This allows one to define the importance sampling estimator

\begin{equation*}\widehat P^{\mathbb Q}_N \,:\!=\, \frac{1}{N} \sum_{j=1}^N \left[\frac{d\mathbb P}{d\mathbb Q}\right]^{(j)} P\big(S^{(j)}_{\mathbb Q}\big),\end{equation*}

where the $S^{(j)}_{\mathbb Q}$ are independent and identically distributed sample trajectories of S under the measure $\mathbb Q$. For efficient variance reduction, one then needs to find a probability measure $\mathbb Q$ such that S is easy to simulate under $\mathbb Q$ and the variance

\begin{equation*}\mathbb{V}\text{ar}_{\mathbb Q} \left[ P(S)\frac{d\mathbb P}{d\mathbb Q} \right]\end{equation*}

is considerably smaller than the original variance $\mathbb{V}\text{ar}_{\mathbb P} \left[P(S) \right]$.

In this paper, following the work of Genin and Tankov [Reference Genin and Tankov9] in the context of Lévy processes, we define the probability $\mathbb Q$ using the exponentially affine measure change

\begin{equation*}\frac{d\mathbb P_\Theta}{d\mathbb P} = \frac{e^{\int_{[0, T]} X_t \cdot \Theta(dt)}}{\mathbb E\big[e^{\int_{[0, T]} X_t \cdot \Theta(dt)}\big]},\end{equation*}

where X is the first component of the ASV model (the logarithm of stock price) and $\Theta$ is a (deterministic) bounded signed measure on [0, T]. Such a choice is justified by several considerations. First, under the new measure the characteristic function of X remains analytically tractable; moreover, if $\Theta$ is supported by a finite number of points, X is piecewise affine under $\mathbb P_\Theta$ and is relatively easy to simulate. Second, for a class of payoffs possessing the concavity property, such a choice leads to asymptotically optimal variance reduction (see Theorem 9). For other payoffs it may be necessary to use more strongly path-dependent measure changes (e.g., with stochastic $\Theta$), which can be obtained by approximating the solution of the Hamilton–Jacobi–Bellman equation resulting from the minimization of the variance; see for example [Reference Dupuis and Wang6, Reference Dupuis and Wang7]. However, such schemes may lead to higher computational complexity.

The optimal choice of $\Theta$ should minimize the variance of the estimator under $\mathbb P_\Theta$,

\begin{equation*}\mathbb{V}\text{ar}_{\mathbb P_\Theta} \left( P(S) \frac{d\mathbb P}{d\mathbb P_\Theta}\right) = \mathbb E \left[P^2(S)\frac{d\mathbb P}{d\mathbb P_\Theta}\right] - \mathbb E\left[P(S)\right]^2.\end{equation*}

The computation of this variance is in general as difficult as the computation of the option price itself. Following [Reference Glasserman, Heidelberger and Shahabuddin10, Reference Guasoni and Robertson11, Reference Robertson16] and more recently [Reference Genin and Tankov9], we propose to compute the variance reduction measure $\Theta^*$ by minimizing the proxy for the variance computed using the theory of large deviations.

To this end, we establish a pathwise large deviation principle (LDP) for ASV models. A one-dimensional LDP for $X_t/t$ as $t\to \infty$ where X is the first component of an ASV model has been proven by Jacquier et al. [Reference Jacquier, Keller-Ressel and Mijatović13]. In this paper, we first establish an equivalent result for multiple dates and then we use the Dawson–Gärtner theorem to extend it to the trajectorial setting, in the spirit of the pathwise LDP principles of Léonard [Reference Léonard15], but in the weaker topology of pointwise convergence.

The rest of the paper is structured as follows. In Section 2, we describe the model and recall certain useful properties of ASV processes. In Section 3, we recall some general results of large deviations theory. In Section 4, we prove an LDP for the trajectories of ASV processes. In Section 5, we develop the variance reduction method, using an asymptotically optimal change of measure obtained via the LDP shown in Section 4. In Section 6, we test the method numerically on several examples of options, some of which are path-dependent, in the Heston model with and without jumps.

2. Model description

In this paper, we model the price of the underlying asset $(S_t)_{t \ge 0}$ of an option as $S_t = S_0 \, e^{X_t}$, where we model $(X_t)_{t \ge 0}$ as an ASV process. We recall the definition and some properties of ASV models from [Reference Keller-Ressel14] and [Reference Duffie, Filipovic and Schachermayer5].

Definition 1. An ASV model $(X_t,V_t)_{t \ge 0}$, is a stochastically continuous, time-homogeneous Markov process such that $\left(e^{X_t}\right)_{t \ge 0}$ is a martingale and

(1)

\begin{equation}\mathbb E\left( e^{u X_{t} + w V_{t}}\middle|X_0=x,V_0=v\right) = e^{\phi(t,u,w) + \psi(t,u,w) \, v + u \, x},\end{equation}

for all $(t,u,w) \in \mathbb R_+ \!\!\times \mathbb C^2$.

Proposition 1. The functions $\phi$ and $\psi$ satisfy generalized Riccati equations

(2a)

\begin{align}\partial_{t} \phi(t,u,w) & = F(u,\psi(t,u,w)) , && \phi(0,u,w) = 0, \end{align}

(2b)

\begin{align}\partial_{t} \psi(t,u,w) & = R(u,\psi(t,u,w)) , && \psi(0,u,w) = w ,\end{align}

where F and R have the Lévy–Khintchine forms

\begin{align*}F(u,w) & =\left(\begin{array}{c@{\quad}c}u & w\end{array}\right)\cdot \frac{a}{2} \cdot\left(\begin{array}{c}u \\ w\end{array}\right)+ b \cdot\left(\begin{array}{c}u \\ w\end{array}\right) \\ & \quad + \int_{D\backslash\{0\}}\left(e^{xu+yw}-1-w_F(x,y) \cdot \left(\begin{array}{c}u \\ w\end{array}\right)\right) m(dx,dy) , \\ R(u,w) & =\left(\begin{array}{c@{\quad}c}u & w\end{array}\right)\cdot \frac{\alpha}{2} \cdot\left(\begin{array}{c}u \\ w\end{array}\right)+ \beta \cdot\left(\begin{array}{c}u \\ w\end{array}\right) \\ & \quad + \int_{D\backslash\{0\}}\left(e^{xu+yw}-1-w_R(x,y) \cdot \left(\begin{array}{c}u \\w\end{array}\right)\right) \mu(dx,dy) ,\end{align*}

where $D = \mathbb R \times \mathbb R_+$,

\begin{equation*}w_F(x,y) =\left(\begin{array}{c}\frac{x}{1+x^2} \\[7pt]0\end{array}\right)\qquad {and} \qquad w_R(x,y) =\left(\begin{array}{c}\frac{x}{1+x^2} \\[7pt] \frac{y}{1+y^2}\end{array}\right),\end{equation*}

and $(a,\alpha, b, \beta, m, \mu)$ satisfy the following conditions:

• $a,\alpha$ are positive semidefinite 2 $\times$2-matrices where $a_{12}=a_{21}=a_{22}=0$;
• $b \in D$ and $\beta \in \mathbb R^2$;
• m and $\mu$ are Lévy measures on D, and $\int_{D\backslash \{0\}} ((x^2+y) \wedge 1) \, m(dx,dy) < \infty$.

Define the function

\begin{equation*}\chi(u) = \left. \partial_{w} R(u,w) \right|_{w=0} = \alpha_{12} u + \beta_2 + \int_{D\backslash\{0\}} y \left(e^{xu}- \frac{1}{1+y^2}\right) \mu(dx,dy) .\end{equation*}

In the rest of the paper, we shall impose the following standing assumption to ensure nondegeneracy of the model (dependence of the law of $(X_t)_{t \ge 0}$ on $V_0$), the martingale property of $(S_t)_{t\geq 0}$, where $S_t = S_0 \, e^{X_t}$, and the existence of the long-time limit of the functions $\phi$ and $\psi$; see [Reference Keller-Ressel14, Corollary 2.7 and Theorem 3.4].

Assumption 1. (Nondegeneracy and martingale property.) The functions F, R, and $\chi$ are such that there exists $u \in \mathbb R$ such that $R(u,0) \neq 0$, $F(1,0)=R(1,0)=0$, $\chi(0) <0$, and $\chi(1) < 0$.

In the following theorem, we compile several results of [Reference Keller-Ressel14] which describe the behaviour of the solutions to the equations given in (2) as $t \rightarrow\infty$. Throughout the paper we will use the functions w(u) and $\tilde w(u)$, denoting respectively the unique stable equilibrium and the unstable equilibrium (if there is one) of (2b), which are defined below in Parts (1) and (2) of Theorem 2, as well as the function h(u) defined in Part (6) (Equation (5)).

Theorem 1. The following statements hold:

(1) There exist a maximal interval $I \supseteq [0, 1]$ and a unique function $w\in C(I)\cap C^1({I}^{^{\!\kern-1pt\circ}}\kern-1pt)$ such that
\begin{equation*}R(u,w(u)) = 0 \quad {for\ all}\quad u\in I,\end{equation*}
$w(0)=w(1)=0$, $w(u)<0$ for all $u\in (0,1)$, $w(u)>0$ for all $u\in I\setminus [0, 1]$, and
\begin{equation*}\partial_w R(u,w(u))<0\quad {for\ all\ u\in} {I}^{^{\!\kern-1pt\circ}}.\end{equation*}

In other words, the function w(u) is the unique stable equilibrium of (2b).
(2) For $u \in I$, (2b) admits at most one other equilibrium $\tilde{w}(u)$, which is unstable.
(3) For $u \in \mathbb R \backslash I$, (2b) does not have any equilibrium.

We denote by $\mathcal{B}(u)$ the basin of attraction of the stable solution w(u) of (2b), and by $J=\{u\in I\,:\,F(u,w(u)) < \infty\}$ the domain of $u \mapsto F(u,w(u))$. We then have the following:
(4) J is an interval such that $[0, 1] \subseteq J \subseteq I$.
(5) For $u \in I$, $w \in \mathcal B(u)$, and $\Delta t > 0$, we have
(3) \begin{equation}\psi\left(\frac{\Delta t}{\epsilon},\, u,\, w \right) \underset{\epsilon \rightarrow 0}{\longrightarrow} w(u) . \end{equation}
(6) For $u \in J$, $w \in \mathcal B(u)$, and $\Delta t > 0$,
(4) \begin{equation}\epsilon\phi\left(\frac{\Delta t}{\epsilon},\, u,\, w \right) \underset{\epsilon \rightarrow 0}{\longrightarrow} \Delta t \, h\left(u\right)\! ,\end{equation}
where
(5) \begin{equation}h(u) \,:\!=\, F(u,w(u)) = \lim_{\epsilon \rightarrow 0} \epsilon \log \mathbb E\left[e^{u X_{1/\epsilon}}\right].\end{equation}
(7) For every $u \in I$, $0 \in \mathcal B(u)$.

Definition 2. A convex function $f\,:\, \mathbb R^n \rightarrow \mathbb R \cup \{\infty\}$ with effective domain $D_f$ is essentially smooth if

(i) $D^\circ_f$ is non-empty;
(ii) f is differentiable in $D^\circ_f$;
(iii) f is steep; that is, for any sequence $(u_n)_{n\in \mathbb N} \subset D_f^\circ$ that converges to a point in the boundary of $D_f$,
\begin{equation*}\lim_{n \rightarrow \infty} ||\nabla f(u_n)|| = \infty .\end{equation*}

In the rest of the paper, to establish large deviations results, we shall make the following assumptions on the model.

Assumption 2. The function h defined in (5) satisfies the following properties:

1. There exists $u < 0$ such that $h(u) < \infty$.
2. The map $u \mapsto h(u)$ is essentially smooth.

Jacquier et al. [Reference Jacquier, Keller-Ressel and Mijatović13] provide the following set of sufficient conditions for Assumption 2 to be satisfied.

Proposition 2. ([Reference Jacquier, Keller-Ressel and Mijatović13, Corollary 8].) Let (X, V) be an ASV model satisfying Assumption 1. Suppose that either of the following conditions holds:

(i) The Lévy measure $\mu$ of R has exponential moments of all orders, F is steep, and $(0,0),(1,0) \in D_F^\circ$.
(ii) (X,V) is a diffusion.

Then the function h is well defined for every $u \in \mathbb R$, with effective domain J. Moreover, h is essentially smooth, and $\{0, 1\} \subset J^\circ$.

We now discuss the form of the basin of attraction of the unique stable solution of (2b).

Lemma 1. ([Reference Keller-Ressel14, Lemma 2.2].)

(a) F and R are proper closed convex functions on $\mathbb R^2$.
(b) F and R are analytic in the interior of their effective domain.
(c) Let U be a one-dimensional affine subspace of $\mathbb R^2$. Then $F|U$ is either a strictly convex or an affine function. The same holds for $R|U$.
(d) If $(u, w) \in D_F$, then also $(u,\eta) \in D_F$ for all $\eta \le w$. The same holds for R.

Lemma 2. Let $f \,:\, \mathbb R \rightarrow \mathbb R \cup \{+\infty\}$ be a convex function with either two zeros $w < \tilde{w}$, or a single zero w. In the latter case, we let $\tilde{w} = \infty$. Assume that there exists $y \in (w,\tilde{w})$ such that $f(y) < 0$. Then for every $x \in D_f$,

\begin{equation*}\begin{cases}f(x) > 0 \qquad & {if} \ x < w \quad \text{or} \quad \tilde{w} < x ; \\[4pt] f(x) < 0 \qquad & {if} \ x \in (w,\tilde{w}) .\end{cases}\end{equation*}

Proof. By convexity, for every $x \in D_f$ such that $x < w$,

\begin{equation*}\frac{y-w}{y-x} \, f(x) + \frac{w-x}{y-x} \, f(y) \ge f(w) = 0,\end{equation*}

and therefore

\begin{equation*}f(x) \ge -\frac{w-x}{y-w} \, f(y) > 0.\end{equation*}

Furthermore, for every $x \in (w,y]$,

\begin{equation*}f(x) \le \frac{y-x}{y-w} \, f(w) + \frac{x-w}{y-w} \, f(y) < 0 .\end{equation*}

Let $s = \sup\{x \in D_f \,:\, f(x) < 0 \}$. If f is continuous in s, then $\tilde{w} = s$ and for every $x > \tilde{w}$ in $D_f$,

\begin{equation*}f(x) \ge -\frac{\tilde{w}-x}{y-\tilde{w}} \, f(y) > 0.\end{equation*}

If f is discontinuous in s, however, then by convexity, $f(x) = +\infty$ for $x > s$.

Proposition 3. Let $u \in I$ and let w(u) denote the stable equilibrium of (2b), defined in Theorem 1(1). Then the basin of attraction of w(u) is $\mathcal B(u) = ({-}\infty, \tilde{w}(u)) \cap D_{R(u,\cdot)}$, where $\tilde{w}(u)$ is the unstable equilibrium of (2b), defined in Theorem 1(2), and $\tilde{w}(u) = \infty$ when (2b) admits only one equilibrium.

Proof. By Lemma 1, $w \mapsto R(u,w)$ is convex. Since w(u) is a stable equilibrium, the hypotheses of Lemma 2 are satisfied. Therefore, $R(u, w) > 0$ for every $w < w(u)$, whereas $R(u, w) < 0$ for every $w \in D_{R(u,\cdot)}$ such that $ w(u) < w < \tilde{w}(u)$. This implies that the solution of

(6)

\begin{equation}\partial_{t} \psi(t,u,w) = R(u,\psi(t,u,w)) , \qquad \psi(0,u,w) = w\end{equation}

converges to w(u) for every $w \in ({-}\infty, \tilde{w}(u)) \cap D_{R(u,\cdot)}$, whereas, if $w > \tilde{w}$, the solution of (6) diverges to $\infty$.

3. Large deviations theory

In this section, we recall some useful classical results from the theory of large deviations. We refer the reader to Dembo and Zeitouni [Reference Dembo and Zeitouni4] for the proofs and for a broader overview of the theory.

Theorem 2. (Gärtner–Ellis.) Let $\left(X^\epsilon\right)_{\epsilon \in ]0,1]}$ be a family of random vectors in $\mathbb R^n$ with associated measure $\mu_\epsilon$. Assume that for each $\lambda \in \mathbb R^n$, the logarithmic moment generating function, defined as the limit

\begin{equation*}\Lambda(\lambda) \,:\!=\, \lim_{\epsilon \rightarrow 0} \epsilon \log \mathbb E\left[ e^{\frac{\left\langle {\lambda,X^\epsilon} \right\rangle}{\epsilon}}\right],\end{equation*}

exists as an extended real number. Assume also that 0 belongs to the interior of $D_\Lambda\,:\!=\,\{\lambda \in \mathbb R^n \,:\, \Lambda(\lambda) < \infty\}$. Define

\begin{equation*}\Lambda^*(x) = \sup_{\lambda\in \mathbb R^n} \left\langle {\lambda,x} \right\rangle - \Lambda(\lambda), \qquad x\in \mathbb R^n.\end{equation*}

Then the following hold:

(a) For any closed set F,
\begin{equation*}\limsup_{\epsilon \rightarrow 0} \epsilon \, \log \mu_\epsilon(F) \le -\inf_{x \in F} \Lambda^*(x) .\end{equation*}
(b) For any open set G,
\begin{equation*}\liminf_{\epsilon \rightarrow 0} \epsilon \, \log \mu_\epsilon(G) \ge-\inf_{x \in G \cap \mathcal F} \Lambda^*(x) ,\end{equation*}
where $\mathcal F$ is the set of exposed points of $\Lambda^*$, whose exposing hyperplane belongs to the interior of $D_\Lambda$.
(c) If $\Lambda$ is an essentially smooth, lower semicontinuous function, then $\mu_\epsilon$ satisfies an LDP with good rate function $\Lambda^*$.

Remark 1. In our paper, the random variable $X^\epsilon$ will correspond to the value of the ASV process X computed at time $1/\epsilon$, and the limiting log-Laplace transform $\Lambda(\lambda)$ from the Gärtner–Ellis theorem therefore coincides with the function h defined in 5.

Definition 3. A partially ordered set $(\mathcal P, {\le})$ is called right-filtering if for every $i,j \in \mathcal P$, there exists $k \in \mathcal P$ such that $i \le k$ and $j \le k$.

Definition 4. A projective system $(\mathcal Y_j, p_{ij})_{i \le j \in \mathcal P}$ on a partially ordered right-filtering set $(\mathcal P,{\le})$ is a family of Hausdorff topological spaces $(\mathcal Y_j)_{j\in \mathcal P}$ and continuous maps $p_{ij}\,:\, \mathcal Y_j \rightarrow \mathcal Y_i$ such that $p_{ik}=p_{ij}\circ p_{jk}$ whenever $i \le j \le k$.

Definition 5. Let $(\mathcal Y_j, p_{ij})_{i \le j \in \mathcal P}$ be a projective system on a partially ordered right-filtering set $(\mathcal P,{\le})$. The projective limit of $(\mathcal Y_j, p_{ij})_{i \le j \in \mathcal P}$, denoted $\smash{\mathcal X = \underset{\longleftarrow}{\lim} \mathcal Y_j}$, is the subset of the topological space ${\mathcal Y = \prod_{j \in \mathcal P} \mathcal Y_j}$ consisting of all the elements $x = (y_j)_{j \in \mathcal P}$ for which $y_i = p_{ij}(y_j)$ whenever $i \le j$, equipped with the topology induced by $\mathcal Y$. The projective limit of closed subsets $F_j \subseteq \mathcal Y_j$ is defined in the same way and denoted by $F = \underset{\longleftarrow}{\lim} F_j$.

Remark 2. The canonical projections of $\mathcal X$, i.e. the restrictions $p_j\,:\, \mathcal X \rightarrow \mathcal Y_j$ of the coordinate maps from $\mathcal{Y}$ to $\mathcal Y_j$, are continuous.

Theorem 3. (Dawson–Gärtner.) Let $(\mathcal Y_j, p_{ij})_{i \le j \in \mathcal P}$ be a projective system on a partially ordered right-filtering set $(\mathcal P,{\le})$, and let $(\mu_\epsilon)$ be a family of probabilities on $\smash{\mathcal X=\underset{\longleftarrow}{\lim} \mathcal Y_j}$, such that for any $j \in \mathcal P$, the Borel probability $\mu_\epsilon \circ p_j^{-1}$ on $\mathcal Y_j$ satisfies the LDP with good rate function $\Lambda^*_j$. Then $\mu_\epsilon$ satisfies the LDP with good rate function

\begin{equation*}\Lambda^*(x) = \sup_{j \in \mathcal P} \Lambda^*_{j}(p_j(x)).\end{equation*}

Theorem 4. (Varadhan’s lemma, version of [Reference Guasoni and Robertson11].) Let $\mathcal X$ be a regular Hausdorff space, and let $(X^\epsilon)_{\epsilon \in ]0,1\,]}$ be a family of $\mathcal X$-valued random variables whose laws $\mu_\epsilon$ satisfy an LDP with rate function $\Lambda^*$. If $\varphi\,:\, \mathcal X \rightarrow \mathbb R \cup\{-\infty\}$ is a function such that the set $\{\varphi>-\infty\}$ is open, and $\varphi$ is continuous on this set and satisfies

(7)

\begin{align}\limsup_{\epsilon \rightarrow 0} \epsilon\,\log \mathbb E\left[\exp\left(\frac{\gamma\,\varphi(X^\epsilon)}{\epsilon}\right) \right] < \infty\end{align}

for some $\gamma > 1$, then

\begin{equation*}\lim_{\epsilon\rightarrow 0} \epsilon \,\log\mathbb E\left[ \exp\left( \frac{\varphi(X^\epsilon)}{\epsilon} \right) \right] = \sup_{x\in \mathcal X} \{ \varphi(x)-\Lambda^*(x) \}.\end{equation*}

4. Trajectorial large deviations for ASV models

In this section, we prove a trajectorial LDP for $(X_t)$ when the time horizon is large. Define, for $\epsilon \in (0,1]$ and $0 \le t \le T$, the scaling $X_t^\epsilon=\epsilon X_{t/\epsilon}$. We proceed by first proving an LDP for $X_t^\epsilon$ in finite dimension, which we extend, in a second step, to the whole trajectory of $(X_t^\epsilon)_{0 \le t \le T}$.

4.1. Finite-dimensional LDP

Let $\tau=\{0<t_1 < \ldots < t_n = t\}$, with the convention $t_0=0$, and define

\begin{equation*}\Lambda_{\epsilon,\tau}(\theta) \,:\!=\, \log\mathbb E\left[e^{\sum_{k=1}^{n} \theta_k X_{t_k}^\epsilon}\right]\end{equation*}

for $\theta \in \mathbb R^n$. We start by formulating our main technical assumption. Recall that w(u) and $\tilde w(u)$ denote, respectively, the stable and unstable equilibrium points of the Riccati equation (2b), defined in Parts (1) and (2) of Theorem 1.

Assumption 3. One of the following conditions is satisfied:

(1) The interval support of F is $J=[u_-,u_+]$, and $w(u_-) = w(u_+)$.
(2) For every $u \in \mathbb R$, $\tilde{w}({\cdot}) = \infty$, i.e, the generalized Riccati equations have only one (stable) equilibrium.

The above assumption may be seen as rather restrictive; however, there are important models in which it is satisfied, such as the Heston model, with or without jumps, when there is no correlation between the Brownian motions driving the asset price and the volatility (see Remark 7). In the following lemma we state a consequence of Assumption 3 which will be used hereafter.

Lemma 3. Let Assumption 3 hold. For every $u_1,u_2 \in I$, $\tilde{w}(u_1) \ge w(u_2)$.

Proof. If Assumption 3(2) holds, then the result is obvious. Assume then that it is Assumption 3(1) that holds. Since $u \mapsto w(u)$ is convex and $u \mapsto\tilde{w}(u)$ is concave [Reference Keller-Ressel14, Lemma 3.3], for every $u_1,u_2 \in I$ we have

\begin{equation*}\tilde{w}(u_1) \ge \frac{u_+-u_1}{u_+-u_-} \tilde{w}(u_-) + \frac{u_1-u_-}{u_+-u_-} \tilde{w}(u_+) \ge w(u_-) ,\end{equation*}

while

\begin{equation*}w(u_2) \le \frac{u_+-u_2}{u_+-u_-} w(u_-) + \frac{u_2-u_-}{u_+-u_-} w(u_+) = w(u_-) .\end{equation*}

Therefore $\tilde{w}(u_1) \ge w(u_2)$ for every $u_1,u_2 \in I$.

As a first step towards applying Theorem 3, we prove the following result.

Theorem 5. Let $\theta \in \mathbb R^n$. If Assumption 3 holds, then

\begin{equation*}\Lambda_{\tau}(\theta)\,:\!=\, \lim_{\epsilon \rightarrow 0} \epsilon \,\Lambda_{\epsilon,\tau}(\theta/\epsilon) =\begin{cases}\sum_{j=1}^n (t_{j}-t_{j-1}) \, h\left({\bar\theta}_j\right) \quad & if\ {\bar\theta}_j \in J \quad \forall j, \\[5pt] \infty & { otherwise},\end{cases}\:\end{equation*}

where ${\bar\theta}_j \,:\!=\, \sum_{k=j}^n \theta_k$, and h is as defined in (5).

Proof. Since Assumption 3 holds, by Lemma 3 we have that $w({\bar\theta}_{j+1}) \in \mathcal B({\bar\theta}_j)$ for every j. Assume first that ${\bar\theta}_j \in J$ for every j. Using the Markov property and Equation (1), we obtain

\begin{align*}\Lambda_\tau(\theta)& = \lim_{\epsilon \rightarrow 0} \epsilon \log \Big(\mathbb E\Big[e^{\sum_{j=1}^n \theta_j X_{t_j/\epsilon}}\Big]\Big) \\ & = \lim_{\epsilon \rightarrow 0} \epsilon \log \Big(\mathbb E\Big[e^{\sum_{j=1}^{n-1}\theta_j X_{t_j/\epsilon}}\, \mathbb E\Big(e^{{\bar\theta}_n X_{t_n/\epsilon}} | X_{t_{n-1}/\epsilon},V_{t_{n-1}/\epsilon}\Big)\Big]\Big)\\ & = \lim_{\epsilon \rightarrow 0} \epsilon \, \phi\Big(\frac{t_{n}-t_{n-1}}{\epsilon},\, {\bar\theta}_n,\, 0 \Big) \\ & \quad + \epsilon \log \bigg(\mathbb E\bigg[e^{\sum_{j=1}^{n-2} \theta_j X_{t_j/\epsilon} + {\bar\theta}_{n-1} X_{t_{n-1}/\epsilon} + \psi\big(\frac{t_{n}-t_{n-1}}{\epsilon},\, {\bar\theta}_n,\, 0 \big)V_{t_{n-1}/\epsilon}}\bigg]\bigg) .\end{align*}

Since ${\bar\theta}_n \in J$ and $0 \in \mathcal B({\bar\theta}_n)$, Equations (3) and (4) apply, and

\begin{align*}\Lambda_\tau(\theta)& = \lim_{\epsilon \rightarrow 0} \epsilon \log \left(\mathbb E\left[e^{\sum_{j=1}^{n-2} \theta_j X_{t_j/\epsilon} + {\bar\theta}_{n-1} X_{t_{n-1}/\epsilon} + \psi\left(\frac{t_{n}-t_{n-1}}{\epsilon},\, {\bar\theta}_n,\, 0 \right)V_{t_{n-1}/\epsilon}}\right]\right) \\ & \quad + (t_{n}-t_{n-1})\, h({\bar\theta}_n) .\end{align*}

Using the fact that ${\bar\theta}_j \in J$ and $w({\bar\theta}_{j+1}) \in \mathcal B({\bar\theta}_j)$ for every j, we can iterate the procedure to obtain

(8)

\begin{align}\Lambda_\tau(\theta)& = \sum_{j=1}^n (t_{j}-t_{j-1}) \, h\left({\bar\theta}_j\right) + \lim_{\epsilon \rightarrow 0} \epsilon \,\psi\left(\frac{t_{1}-t_{0}}{\epsilon},\, {\bar\theta}_1,\, w\left({\bar\theta}_2 \right) \right)V_{0} + \epsilon \sum_{k=1}^n \theta_k X_0 \notag\\& = \sum_{j=1}^n (t_{j}-t_{j-1}) \, h\left({\bar\theta}_j\right) .\end{align}

Assume now that there exists k such that ${\bar\theta}_k \not\in J$. Without loss of generality, we take the largest such k. Following the same procedure, we find

\begin{align*}\Lambda_\tau(\theta)& = \lim_{\epsilon \rightarrow 0} \epsilon \log \left(\mathbb E\left[e^{\sum_{j=1}^{k-2} \theta_j X_{t_j/\epsilon} + {\bar\theta}_{k-1} \, X_{t_{k-1}/\epsilon} + \psi\left(\frac{t_{k}-t_{k-1}}{\epsilon},{\bar\theta}_{k}, w({\bar\theta}_{k+1})\right) \, V_{t_{k-1}/\epsilon}}\right]\right) \\ & \quad + \epsilon \,\phi\left(\frac{t_{k}-t_{k-1}}{\epsilon},{\bar\theta}_{k}, w({\bar\theta}_{k+1}) \right) + \sum_{j=k+1}^n (t_{j}-t_{j-1})\, h({\bar\theta}_j) .\end{align*}

Noting that $\phi(\cdot, u, w)$ explodes in finite time for $u \not\in J$ then finishes the proof.

We now proceed to the finite-dimensional large deviations result. Let us define $J^n \,:\!=\,D_{\Lambda_\tau} $.

Theorem 6. Let $(X_t^\epsilon)_{t \ge 0,\: \epsilon \in (0,1]}$ and $\tau=\{t_1,\ldots,t_n\}$ as previously. Let Assumptions 2 and 3 hold. Then $(X_{t_1}^\epsilon,\ldots,X_{t_n}^\epsilon)$ satisfies an LDP on $\mathbb R^n$ with good rate function

\begin{equation*}\Lambda_\tau^*(x) = \sup_{{\bar\theta} \in J^n} \left\{ \sum_{j=1}^n {\bar\theta}_j (x_j-x_{j-1}) - \sum_{j=1}^n (t_{j}-t_{j-1}) \, h\left({\bar\theta}_j\right) \right\}\! ,\end{equation*}

where h is as defined in (5).

Proof. By Assumption 2(1), there exists $u \in J$ such that $u < 0$, which implies that $[u,1] \subset J$ and therefore 0 is in the interior of $J^n$. Theorem 5 implies that the limit

\begin{equation*}\Lambda_{\tau}(\theta)= \lim_{\epsilon \rightarrow 0} \epsilon \,\Lambda_{\epsilon,\tau}(\theta/\epsilon) =\begin{cases}\sum_{j=1}^n (t_{j}-t_{j-1}) \, h\left({\bar\theta}_j\right) \quad & \text{ if } {\bar\theta}_j \in J \quad \forall j, \\[4pt] \infty & \text{ otherwise,}\end{cases}\:\end{equation*}

where ${\bar\theta}_j \,:\!=\, \sum_{k=j}^n \theta_k$, exists as an extended real number.

Since, by Assumption 2(2), h is essentially smooth and lower semicontinuous, we have that $\Lambda_\tau$ is as well. Theorem 2 then applies and $(X_{t_1}^\epsilon,\ldots,X_{t_n}^\epsilon)$ satisfies an LDP, on $\mathbb R^n$, with good rate function

\begin{equation*}\Lambda_\tau^*(x) = \sup_{\theta\in \mathbb R^n} \big\{\theta^\top x - \Lambda_\tau(\theta)\big\} .\end{equation*}

Furthermore, with the convention $x_0=0$, and letting $\theta_j = \bar\theta_j -\bar\theta_{j+1}$ for $j=1,\dots,n-1$ and $\theta_n = \bar\theta_{n}$, we have

\begin{align*}\Lambda_\tau^*(x)& = \sup_{\theta\in \mathbb R^n} \big\{\theta^\top x - \Lambda_\tau(\theta)\big\} \\[4pt] & = \sup_{{\bar\theta} \in J^n} \left\{ \sum_{j=1}^n \sum_{k=j}^n \theta_k (x_j-x_{j-1}) - \sum_{j=1}^n (t_{j}-t_{j-1}) \, h\left({\bar\theta}_j\right) \right\} \\[4pt] & = \sup_{{\bar\theta} \in J^n} \left\{ \sum_{j=1}^n {\bar\theta}_j (x_j-x_{j-1}) - \sum_{j=1}^n (t_{j}-t_{j-1}) \, h\left({\bar\theta}_j\right) \right\}\! ,\end{align*}

which finishes the proof.

4.2. Infinite-dimensional LDP

4.2.1. Extension of the LDP

We now extend the LDP to the whole trajectory of $(X_t^\epsilon)_{0 \le t \le T}$ on $\mathcal F([0, T], \: \mathbb R)$ $\,:\!=\, \{x \,:\, [0, T] \rightarrow \mathbb R, \: x_0 = 0 \}$, the set of all functions from [0, T] to $\mathbb R$ that vanish at 0, by proving the following general lemma.

Lemma 4. Let $(\mathcal P,{\le})$ be the partially ordered right-filtering set

\begin{equation*}\mathcal P= \bigcup_{n=1}^{\infty} \{(t_1,\ldots,t_n),\: 0 \le t_1 \le \ldots \le t_n \le T\}\end{equation*}

ordered by inclusion. We consider, on $(\mathcal P,{\le})$, the projective system $(\mathcal Y_j, p_{ij})_{i \le j \in \mathcal P}$ defined by $\mathcal Y_j = \mathbb R^{\#j}$ and $p_{ij}\,:\,\mathcal Y_j \rightarrow \mathcal Y_i$ the natural projection on shared times. Assume that for any $j = \{t_1,\ldots,t_n\}$, the finite-dimensional process $(X_{t_1}^\epsilon,\ldots,X_{t_n}^\epsilon)$ satisfies an LDP with good rate function $\Lambda^*_j$. Then the family $(X_t^\epsilon)_{0\le t \le T}$ satisfies an LDP on $\mathcal X= \mathcal F([0, T], \: \mathbb R)$ equipped with the topology of pointwise convergence, with good rate function

\begin{equation*}\Lambda^*(x) = \sup_{j \in \mathcal P} \Lambda^*_{j}(p_j(x)),\end{equation*}

where $p_{\tau}(x) = (x_{t_1},\ldots,x_{t_n})$ is the canonical projection from $\mathcal X$ to $\mathcal Y_\tau$.

Proof. Let $\mu^\epsilon$ be the probability measure generated by $(X_t^\epsilon)_{0\le t \le T}$ on $\mathcal X$. Then, by hypothesis, for any $j \in \mathcal P$, $\mu^\epsilon \circ p_j^{-1}$ satisfies an LDP with good rate function $\Lambda^*_\tau$. The result then follows from Theorem 3.

Theorem 7. Suppose that Assumptions 2 and 3 hold. Then $(X_t^\epsilon)_{0 \le t \le T}$ satisfies an LDP on $\mathcal F([0, T],\mathbb R)$ equipped with the topology of pointwise convergence, as $\epsilon \rightarrow 0$, with good rate function

\begin{equation*}\Lambda^*(x) = \sup_\tau \Lambda_\tau^*(x),\end{equation*}

where the supremum is taken over the discrete ordered subsets of the form $\tau = \{t_1,\ldots,t_n\} \subset [0, T]$.

Proof. The result is a direct application of Lemma 4.

4.2.2. Calculation of the rate function

We finally calculate the rate function in Theorem 7.

Theorem 8. The rate function of Theorem 7 is

\begin{equation*}\Lambda^*(x) = \int_0^T h^*(\dot{x}_{t}^{ac}) \, dt + \int_0^T \mathcal H\left( \frac{d\nu}{d|\nu|}(t)\right)\, d|\nu| ,\end{equation*}

where

\begin{equation*}h^*(y) \,:\!=\, \sup_{\theta \in J} \left\{\theta y - h(\theta) \right\} , \qquad \mathcal H(y) \,:\!=\, \lim_{\epsilon\rightarrow 0} \epsilon \, h^*(y/\epsilon) ,\end{equation*}

h is as defined in (5), $\dot{x}^{ac}$ is the derivative of the absolutely continuous part of x, $\nu$ is the finite signed measure which is the singular component of dx with respect to the Lebesgue measure, $|\nu|$ is the total variation measure of $\nu$, and $\frac{d\nu}{d|\nu|}$ is the Radon–Nikodym derivative of $\nu$ with respect to its total variation measure.

Proof. By Theorem 6, for every $x \in \mathcal F([0, T],\mathbb R)$,

\begin{align*}\sup_\tau \Lambda_\tau^*(x) & = \sup_\tau \sup_{{\bar\theta} \in J^{\# \tau}} \sum_{j=1}^{\# \tau} {\bar\theta}_j (x_{t_j}-x_{t_{j-1}}) - (t_j-t_{j-1})h({\bar\theta}_j)\\& = \sup_\tau \sup_{\xi \in C([0, T],J)} \sum_{j=1}^{\# \tau} \xi_{t_j} (x_{t_j}-x_{t_{j-1}}) - (t_j-t_{j-1})h(\xi_{t_j}) \\& = \sup_{\xi \in C([0, T],J)} \sup_\tau \sum_{j=1}^{\# \tau} \xi_{t_j} (x_{t_j}-x_{t_{j-1}}) - (t_j-t_{j-1})h(\xi_{t_j}) .\end{align*}

The second line follows from the first line since one can always find a continuous function $\xi$ such that $\xi_{t_i} = \bar\theta_i$ for $i=1,\dots,n$. Since we have assumed that there exists $u<0$ in J, if x has infinite variation we immediately find that $\Lambda^*(x) = \infty$. Assume therefore that x has finite variation. We wish to show that

\begin{align*}\sup_{\xi \in C([0, T],J)} \sup_\tau \sum_{j=1}^{\# \tau} \xi_{t_j} & (x_{t_j}-x_{t_{j-1}}) - (t_j-t_{j-1})h(\xi_{t_j}) \\ & = \sup_{\xi \in C([0, T],J)} \int_0^T \xi_{t} dx_t - \int_0^T h(\xi_{t}) dt .\end{align*}

Notice that

\begin{align*}& \sup_{\xi \in C([0, T],J)} \!\sup_\tau \sum_{j=1}^{\# \tau} \xi_{t_j} (x_{t_j}-x_{t_{j-1}}) - (t_j-t_{j-1})h(\xi_{t_j}) \\ & \quad\ge \: \sup_{\xi \in C([0, T],J)} \!\limsup_\tau \sum_{j=1}^{\# \tau} \xi_{t_j} (x_{t_j}-x_{t_{j-1}}) - (t_j-t_{j-1})h(\xi_{t_j}) \\ & \quad = \sup_{\xi \in C([0, T],J)} \int_0^T \xi_{t} dx_t - \int_0^T h(\xi_{t}) dt .\end{align*}

To prove the other inequality, we use the following construction. Fix $\tau$ and let $\xi \in C([0, T],J)$. Let also $\epsilon > 0$ such that $\epsilon < \min (t_j-t_{j-1})$, and define $\xi^{\epsilon,\tau}$ as

\begin{equation*}\xi^{\epsilon,\tau}_t =\begin{cases}\xi_{t_{j-1}} + \frac{t-t_{j-1}}{\epsilon} \, (\xi_{t_{j}}-\xi_{t_{j-1}}) & \text{ if } t \in [t_{j-1},\,t_{j-1}+\epsilon] , \\[5pt] \xi_{t_{j}} & \text{ if } t \in [t_{j-1}+\epsilon,\, t_j] . \\\end{cases}\end{equation*}

Then

\begin{align*} & \left|\sum_{j=1}^{\# \tau} \xi_{t_j} (x_{t_j}-x_{t_{j-1}}) - (t_j-t_{j-1})h(\xi_{t_j}) - \int_0^T \xi_{t}^{\epsilon,\tau} dx_t + \int_0^T h(\xi_{t}^{\epsilon,\tau}) dt \right| \\[4pt] &\quad = \left|\sum_{j=1}^{\# \tau} (\xi_{t_j} - \xi_{t_{j-1}}) \int_{t_{j-1}}^{t_{j-1}+\epsilon} \left(1-\frac{t-t_{j-1}}{\epsilon}\right) dx_t + \int_{t_{j-1}}^{t_{j-1}+\epsilon} h(\xi_{t}^{\epsilon,\tau})-h(\xi_{t_j}) dt \right| \\[4pt] &\quad \le \sum_{j=1}^{\# \tau} \left|\xi_{t_j} - \xi_{t_{j-1}}\right| \left|\int_{t_{j-1}}^{t_{j-1}+\epsilon} \left(1-\frac{t-t_{j-1}}{\epsilon}\right) dx_t \right| \\ & \qquad + 2 \epsilon \max\left\{ |h(\xi)|\,:\,\xi \in [\xi_{t_{j-1}},\xi_{t_{j}}]\right\} \\ & \quad \le \sum_{j=1}^{\# \tau} \left|\xi_{t_j} - \xi_{t_{j-1}}\right| \, \mu_x\big(]0,\epsilon]\big) + 2 \epsilon \max\left\{ |h(\xi)|\,:\,\xi \in [\xi_{t_{j-1}},\xi_{t_{j}}]\right\} \underset{\epsilon \rightarrow 0}{\rightarrow} 0 ,\end{align*}

where $\mu_x$ is the measure associated with x. Hence

\begin{align*}& \sup_{\xi \in C([0, T],J)} \!\sup_\tau \sum_{j=1}^{\# \tau} \xi_{t_j} (x_{t_j}-x_{t_{j-1}}) - (t_j-t_{j-1})h(\xi_{t_j}) \\ & \quad \le \sup_{\xi \in C([0, T],J)} \int_0^T \xi_{t} dx_t - \int_0^T h(\xi_{t}) dt\end{align*}

and

\begin{equation*}\Lambda^*(x) = \sup_{\xi \in C([0, T],J)} \int_0^T \xi_{t} dx_t - \int_0^T h(\xi_{t}) \, dt .\end{equation*}

We will now use [Reference Rockafellar17, Theorem 5] to obtain the result. Since x has finite variation, the measure $dx_t$ is regular. Using the notation of [Reference Rockafellar17], in our case the multifunction D is the constant multifunction $t \mapsto D(t)=J$. Therefore D is fully lower semicontinuous. Furthermore, since $[0, 1] \subset J$, the interior of D(t) is non-empty. The set [0, T] is compact with no non-empty open sets of measure 0, and for every u in the interior of J, and $V \in [0, T]$ open,

\begin{equation*}\int_V |h(u)| \,dt \le T |h(u)| < \infty.\end{equation*}

[Reference Rockafellar17, Theorem 5] then implies that

\begin{equation*}\sup_{\xi \in C([0, T],J)} \int_0^T \xi_{t} \, dx_t - \int_0^T h(\xi_{t}) \, dt = \int_0^T h^*(\dot{x}_{t}^{ac}) \, dt + \int_0^T \mathcal H\left( \frac{d\nu}{d|\nu|}(t) \right)\, d|\nu| ,\end{equation*}

where

\begin{equation*}h^*(y) = \lim_{\epsilon \rightarrow 0} \sup_{\theta \in J} \left\{\theta y - h(\theta) \right\} , \qquad \mathcal H(y) = \lim_{\epsilon\rightarrow 0} \epsilon \, h^*(y/\epsilon) ,\end{equation*}

and $\dot{x}^{ac}$ and $\nu$ are as defined in the statement of the theorem.

Remark 3. In particular, the proof of Theorem 8 shows that if x does not belong to $V_r$, the set of trajectories $x \,:\, [0, t] \rightarrow \mathbb R$ with bounded variation, then $\Lambda^*(x) = \infty$.

5. Variance reduction

Denote by P(S) the payoff of an option on $(S_t)_{0\le t \le T}$. The price of an option is generally calculated as the expectation $\mathbb E[P(S)]$ under a certain risk-neutral measure $\mathbb P$. For any equivalent measure $\mathbb Q$, the price of the option can be written as

\begin{equation*}\mathbb E[P(S)] = \mathbb E^\mathbb Q\left[P(S) \frac{d\mathbb P}{d\mathbb Q}\right] .\end{equation*}

The variance of P(S) is

\begin{equation*}\mathbb{V}\text{ar}_\mathbb P\left(P(S)\right) = \mathbb E\big[P^2(S)\big]-\mathbb E\left[P(S)\right]^2 ,\end{equation*}

whereas

\begin{align*}\mathbb{V}\text{ar}_\mathbb Q\left( P(S) \frac{d\mathbb P}{d\mathbb Q} \right)& = \mathbb E^\mathbb Q\left[P^2(S) \left(\frac{d\mathbb P}{d\mathbb Q}\right)^2\right]-\left(\mathbb E^\mathbb Q\left[P(S) \frac{d\mathbb P}{d\mathbb Q}\right]\right)^2 \\[4pt] & = \mathbb E\left[P^2(S) \frac{d\mathbb P}{d\mathbb Q}\right]- \mathbb E \left[P(S)\right]^2 .\end{align*}

We can therefore choose $\mathbb Q$ so as to reduce the variance of the random variable whose expectation gives the price of the option.

As discussed in the introduction, we follow [Reference Genin and Tankov9] by considering the class of exponentially affine transforms, defined as follows:

\begin{equation*}\frac{d\mathbb P_\Theta}{d\mathbb P} = \frac{e^{\int_0^T X_t \,\Theta(dt)}}{\mathbb E\big[e^{\int_0^T X_t \,\Theta(dt)}\big]},\end{equation*}

where $\Theta$ belongs to M, the set of signed measures on [0, T]. Setting $H(X)=\log P\left(S_0\,e^{X}\right)$, we write the optimization problem as

(9)

\begin{equation}\inf_{\Theta \in M} \mathbb E\left[ \exp\left( 2H(X) - \int_0^T X_t \,\Theta(dt) + \mathcal G_1(\Theta) \right)\right],\end{equation}

where

\begin{equation*}\mathcal G_\epsilon(\Theta) \,:\!=\, \epsilon\log \mathbb E\big[e^{\frac{1}{\epsilon}\int_0^T X_t^\epsilon \,\Theta(dt)}\big] .\end{equation*}

The optimization problem (9) cannot be solved explicitly. We therefore choose to solve the problem asymptotically using the two following lemmas. Denote by $\, \bar{\!M}$ the set of measures $\Theta \in M$ with support on a finite set of points. We first give a lemma that characterizes the behaviour of $\mathcal G_\epsilon(\Theta)$ as $\epsilon\rightarrow 0$, for $\Theta \in\, \bar{\!M}$, as this will be sufficient for the cases that we will consider in Section 6 (see Proposition 4). We stress the fact that although Lemmas 5 and 6 are proved for $\Theta \in \,\bar{\!M}$, the resulting asymptotic proxy for the variance and the resulting candidate importance sampling measure are well defined, and may be used for any $\Theta\in M$, provided that they lead to a sufficient reduction in the variance of the estimator.

Lemma 5. If Assumption 3 holds, then for any measure $\Theta\in \,\bar{\!M}$ such that $\Theta([t,T]) \in J$ for every $t \in [0, T]$, we have

\begin{equation*}\lim_{\epsilon \rightarrow 0}\mathcal G_\epsilon(\Theta) = \int_0^T h(\Theta([t,T])) \,dt ,\end{equation*}

where h is as defined in (5).

Proof. Denote by $\tau = \{t_1,\ldots,t_n\}$ the support of $\Theta$. We then obtain

\begin{align*}\lim_{\epsilon \rightarrow 0} \epsilon\log \mathbb E\left[e^{\frac{1}{\epsilon}\int_0^T X_t^\epsilon \,\Theta(dt)}\right]& = \lim_{\epsilon \rightarrow 0} \epsilon\log \mathbb E\left[ e^{\frac{1}{\epsilon}\sum_{j=1}^n X_{t_j}^\epsilon \,\Theta(\{t_j\})}\right] \\& = \sum_{j=1}^n (t_j-t_{j-1}) \, h\left(\Theta(\{t_j\})\right) \\& = \int_0^T h(\Theta([t,T])) \,dt\end{align*}

by applying Theorem 5 to $\theta=\left(\Theta(\{t_1\}),\ldots, \Theta(\{t_n\})\right)$.

Next, we give a result that characterizes the behaviour of the variance minimization problem (9) where X has been replaced by $X^\epsilon$ as $\epsilon \rightarrow 0$.

Lemma 6. Let $\Theta \in \,\bar{\!M}$ be such that $- \Theta([t,T]) \in J^\circ$ for every $t \in [0, T]$. Let Assumptions 2 and 3 hold, and assume furthermore that $H\,:\, \mathcal F([0, T],\mathbb R) \rightarrow \mathbb R$ is bounded from above by a constant C and is continuous on D, the set of functions $x \in V_r$ such that $H(x) > -\infty$, with respect to the pointwise convergence topology. Then

\begin{align*}& \lim_{\epsilon \rightarrow 0} \epsilon \log \mathbb E\left[ \exp\left(\frac{2H(X^\epsilon) - \int_0^T X_t^\epsilon \,\Theta(dt) +\mathcal G_\epsilon(\Theta)}{\epsilon} \right)\right] \\[4pt] &\quad = \sup_{x\in D} \left\{ 2H(x) - \int_0^T x_t \Theta(dt) - \Lambda^*(x) \right\} + \int_0^T h(\Theta([t,T])) \,dt .\end{align*}

Proof. First note that, by Lemma 5,

\begin{align*}& \lim_{\epsilon \rightarrow 0} \epsilon \log \mathbb E\left[ \exp\left(\frac{2H(X^\epsilon) - \int_0^T X_t^\epsilon \,\Theta(dt) +\mathcal G_\epsilon(\Theta)}{\epsilon} \right)\right] \\[4pt] & \quad = \lim_{\epsilon \rightarrow 0} \epsilon \log \mathbb E\left[ \exp\left(\frac{2H(X^\epsilon) - \int_0^T \! X_t^\epsilon \,\Theta(dt) }{\epsilon} \right)\right] + \int_0^T h(\Theta([t,T])) \,dt .\end{align*}

We therefore just need to prove that

\begin{equation*}\lim_{\epsilon \rightarrow 0} \epsilon \log \mathbb E\!\left[ \exp\left(\frac{2H(X^\epsilon) - \int_0^T \! X_t^\epsilon \,\Theta(dt) }{\epsilon} \right)\right]\! = \sup_{x\in D} \left\{ 2H(x) - \!\int_0^T x_t \Theta(dt) - \Lambda^*(x) \right\}\!.\end{equation*}

Denote by $\varphi\,: \,\mathcal F([0, T],\mathbb R) \rightarrow \mathbb R$ the function $\varphi(x) = 2H(x) - \int_0^T x_t \,\Theta(dt)$. Since H is assumed to be continuous and $\Theta$ has support on $\tau$, $\varphi$ is continuous. Let us show the integrability condition of Theorem 4. For every $\gamma >0$,

\begin{align*}& \limsup_{\epsilon \rightarrow 0} \epsilon\,\log \mathbb E\left[\exp\left(\frac{\gamma\,\varphi(X^\epsilon)}{\epsilon} \right) \right] \\[3pt] & \quad = \limsup_{\epsilon \rightarrow 0} \epsilon \log \mathbb E\left[ \exp\left(\frac{2\gamma H(X^\epsilon) - \gamma \int_0^T X_t^\epsilon \,\Theta(dt)}{\epsilon} \right)\right] \\[3pt] & \quad \le 2 \gamma C + \limsup_{\epsilon \rightarrow 0} \epsilon\log \mathbb E\left[e^{\frac{1}{\epsilon}\int_0^T X_t^\epsilon \,d({-}\gamma\Theta)_t}\right]\! .\end{align*}

Since $- \Theta([t,T]) \in J^\circ$ for every $t \in [0, T]$, there exists $\gamma > 1$ such that $-\gamma\Theta([t,T])$ remains in J for every t. Therefore Lemma 5 applies and

\begin{equation*}\limsup_{\epsilon \rightarrow 0} \epsilon\,\log \mathbb E\left[\exp\left(\frac{\gamma\,\varphi(X^\epsilon)}{\epsilon} \right) \right] \le 2 \gamma C + \int_0^T h({-}\gamma \, \Theta([t,T])) \,dt < \infty .\end{equation*}

Theorem 4 then applies and yields the result.

In view of Lemma 6 we propose to compute the candidate variance reduction parameter by minimizing over $\Theta \in M$ the expression

(10)

\begin{align}\sup_{x\in V_r} \left\{ 2H(x) - \int_0^T x_t \, \Theta(dt) - \Lambda^*(x) \right\} + \int_0^T h(\Theta([t,T])) \,dt .\end{align}

It is then natural to ask how close the corresponding measure change will be to the optimal one which minimizes the variance of the Monte Carlo estimator over all possible measure changes. Varadhan’s lemma allows us to define a notion of asymptotic optimality which provides a partial answer to this question. Consider a family of importance sampling measures $(\mathbb Q(\varepsilon))_{\varepsilon>0}$. By Jensen’s inequality, for all $\varepsilon>0$,

\begin{align*}\mathbb E^{\mathbb Q(\varepsilon)}\left[\left(e^{\frac{1}{\varepsilon} H(X^\varepsilon)}\frac{d\mathbb P}{d\mathbb Q(\varepsilon)}\right)^2\right] &\geq \mathbb E\left[e^{\frac{1}{\varepsilon} H(X^\varepsilon)}\right]^2 .\end{align*}

A family of importance sampling measure changes is called asymptotically optimal if, for this family, the log-scale decay rates of the above expressions are the same. In other words, the asymptotically optimal measure change does at least as well as any other measure change at the logarithmic scale of large deviations.

Definition 6. Let $(\mathbb Q(\varepsilon))_{\varepsilon>0}$ be a family of importance sampling measure changes. We say that $(\mathbb Q(\varepsilon))_{\varepsilon>0}$ is asymptotically optimal if

\begin{equation*}\lim_{\varepsilon\downarrow 0} \varepsilon \log \mathbb E^{\mathbb Q(\varepsilon)}\left[\left(e^{\frac{1}{\varepsilon} H(X^\varepsilon)}\frac{d\mathbb P}{d\mathbb Q(\varepsilon)}\right)^2\right] =2 \lim_{\varepsilon\downarrow 0} \varepsilon \log \mathbb E\left[e^{\frac{1}{\varepsilon} H(X^\varepsilon)}\right].\end{equation*}

The theorem below follows immediately from Theorem 8 of [Reference Genin and Tankov9] and shows that in the case of concave payoffs, the computation of the minimizer of (10) is simplified and we have asymptotic optimality.

Theorem 9. Let H be concave, and assume that the set $\{ x \in V_r \,:\, H(x) > -\infty \}$ is non-empty and contains a constant element. Assume furthermore that H is continuous on this set with respect to the topology of pointwise convergence, that h is lower semicontinuous with open and bounded effective domain, and that there exists a $\lambda > 0$ such that h is complex-analytic on $\{z \in \mathbb C \,:\, |\text{Im}(z)| < \lambda\}$. Then

(11)

\begin{align}&\inf_{\Theta \in M} \sup_{x\in V_r} \left\{ 2H(x) - \int_0^T x_t \Theta(dt) - \Lambda^*(x) + \int_0^T h(\Theta([t,T])) \,dt \right\}\end{align}

(12)

\begin{align}& \quad = 2 \, \inf_{\Theta \in M} \left\{ \widehat{\!H}(\Theta) + \int_0^T h(\Theta([t,T])) \,dt \right\} ,\end{align}

where

\begin{equation*}\widehat{\!H}(\Theta) = \sup_{x\in V_r} \left\{ H(x) - \int_0^T x_t \, \Theta(dt) \right\}\! .\end{equation*}

Furthermore, if the measure $\Theta^*$ minimizes the left-hand side of the above equation, it also minimizes the right-hand side. Finally, if the payoff functional and the measure $\Theta^*$ satisfy the assumptions of Lemma 6, then the importance sampling measure corresponding to $\Theta^*$ is asymptotically optimal.

Finite-dimensional dependence. A simple example of a functional H which is continuous in the topology of pointwise convergence is the situation where H depends on a path x only through its values at a finite number of points $x_{t_1},\ldots,x_{t_n}$. In this case, our results hold under less stringent assumptions, and we state them as a separate proposition. The asymptotically optimal variance reduction measure is also supported by the points $\{t_1,\dots,t_n\}$. To simplify notation, we denote $x_{t_i}$ by $x_i$ and we introduce a function $H_\tau\,:\,\mathbb{R}^n \to \mathbb R\cup\{-\infty\}$ such that $H(x) = H_{\tau}(x_1,\dots,x_n)$.

Proposition 4. Let $\tau = \{t_1,\ldots,t_n\}$ and let $H(x) = H_\tau(x_1,\dots,x_n)$. Assume that the set $\{x\in \mathbb R^n \,:\, H_\tau(x)>-\infty\}$ is non-empty and that $H_\tau$ is concave and continuous on this set. Let Assumptions 2 and 3 hold, let the effective domain J of h be bounded and contain a neighbourhood of 0, and assume that h(x) is lower semicontinuous on $\overline J$. Define

\begin{equation*}\widehat H_\tau({\bar\theta}) = \sup_{x\in \mathbb R^n} \left\{H_\tau(x) - \sum_{i=1}^n {\bar\theta}_i (x_i - x_{i-1}) \right\}.\end{equation*}

Then there exists ${\bar\theta}^*$ which minimizes

(13)

\begin{align}F({\bar\theta})\,:\!=\,\widehat H_\tau({\bar\theta}) + \sum_{i=1}^n (t_i - t_{i-1})h({\bar\theta}_i),\end{align}

and the family of importance sampling measures defined by

\begin{equation*}\frac{d\mathbb P^{\epsilon}}{d\mathbb P} = \frac{e^{\sum_{i=1}^n (X^\epsilon_{t_i}-X^\epsilon_{t_{i-1}}\!) {\bar\theta}^*_i}}{\mathbb E\big[e^{\sum_{i=1}^n (X^\epsilon_{t_i}-X^\epsilon_{t_{i-1}}\!) {\bar\theta}^*_i}\big]}\end{equation*}

is asymptotically optimal.

Proof. Let $\{{\bar\theta}^{(k)}\}_{k\geq 1}\subseteq J^n$ be a minimizing sequence for (13). Since J is bounded, this sequence has a subsequence $\{{\bar\theta}^{(k_m)}\}_{m\geq 1}$ converging to ${\bar\theta}^* \in \overline J^n$. As a supremum over a family of linear functions, $\widehat H_\tau$ is lower semicontinuous. By the lower semicontinuity of h, it follows that F is lower semicontinuous as well, and ${\bar\theta}^*$ is a minimizer of (13) on $\overline J^n$. Moreover, since $H_\tau$ is bounded and $h(0)=0$, $F(0)<\infty$ and thus also $F({\bar\theta}^*)<\infty$. Assume that ${\bar\theta}^*\in \partial (J^n)$, let $\rho \in (0,1)$, and let k be such that ${\bar\theta}^*_k \in \partial J$. Then $\rho {\bar\theta}^* \in {{J}^{^{\!\kern-1pt\circ}}}^n$, and using the convexity of $\widehat H_\tau$ and h, we have the following estimate:

\begin{align*}F(\rho {\bar\theta}^*) &= F({\bar\theta}^*) + \widehat H_\tau(\rho {\bar\theta}^*) -\widehat H_\tau({\bar\theta}) + \sum_{i=1}^n (t_i - t_{i-1})\{h(\rho {\bar\theta}^*_i)-h({\bar\theta}^*_i)\}\\[4pt] &\quad\leq F({\bar\theta}^* ) + (1-\rho) \{\widehat H_\tau(0) - \widehat H_\tau({\bar\theta}^*)\} \\[4pt] &\quad+(1-\rho)\sum_{i=1,\dots,n; i\neq k} (t_i-t_{i-1})\{h(0) - h({\bar\theta}^*_i)\} - (t_k-t_{k-1})(1-\rho){\bar\theta}_k \nabla h(\rho {\bar\theta}_k) .\end{align*}

To fix the ideas, assume that ${\bar\theta}_k>0$. Then, by the essential smoothness of h, $\nabla h(\rho {\bar\theta}_k) \to +\infty$ as $\rho\to1$, so that there exists $\rho<1 $ with $F(\rho{\bar\theta}^*)<F({\bar\theta}^*)$. This is in contradiction to the assumption that ${\bar\theta}^*$ is the minimizer of F, and therefore ${\bar\theta}^* \in {{J}^{^{\!\kern-1pt\circ}}}^n$. In view of this condition, and Lemma 5, the assumption (7) holds for the function $\phi$ given by

\begin{equation*}x\in \mathbb R^n \mapsto \phi(x)\,:\!=\, 2H_\tau(x) - \sum_{i=1}^n {\bar\theta}^*_i(x_i - x_{i-1}),\end{equation*}

with the convention $x_0 = 0$.

Define the measure $\Theta\,:\!=\, \sum_{i=1}^n \theta_i \delta_{t_i}$, where $\delta_{t_i}$ is the Dirac measure at $t_i$, $\theta_i =\bar\theta^*_i - \bar\theta^*_{j+1}$ for $j=1,\dots,n-1$, and $\theta_n= \bar \theta^*_n$. By Varadhan’s lemma (Theorem 4) applied with the LDP of Theorem 6 and Lemma 5 applied with the measure $\Theta$, it follows that

\begin{align*}&\lim_{\epsilon\to 0} \epsilon \log \mathbb E^{\mathbb P^\epsilon}\left[\left(e^{\frac{1}{\epsilon} H_\tau(X^\epsilon)} \frac{d\mathbb P}{d\mathbb P^\epsilon}\right)^2\right]\\[4pt] & \quad = \lim_{\epsilon\to 0} \epsilon \log \Bigg\{\mathbb{E}\left[\exp\left(\frac{2H_\tau(X^\epsilon) -\sum_{i=1}^n (X^\epsilon_{t_i}-X^\epsilon_{t_{i-1}}) {\bar\theta}^*_i }{\epsilon}\right)\right] \\[4pt] &\qquad\times \mathbb{E}\left[\exp\left(\frac{\sum_{i=1}^n (X^\epsilon_{t_i}-X^\epsilon_{t_{i-1}}) {\bar\theta}^*_i }{\epsilon}\right)\right] \Bigg\}\\[4pt] & \quad = \sup_{x\in \mathbb R^n} \left\{2H_\tau(x) - \sum_{i=1}^n (x_{t_i}-x_{t_{i-1}}) {\bar\theta}^*_i - \Lambda^*_\tau(x)\right\} + \sum_{i=1}^n (t_i - t_{i-1})h({\bar\theta}^*_i)\\[4pt] & \quad = \sup_{x\in \mathbb R^n } \inf_{\bar\xi \in J^n}\left\{ 2H_\tau(x) - \sum_{i=1}^n (x_{t_i}-x_{t_{i-1}}) ({\bar\theta}^*_i +\bar\xi_i) +\sum_{i=1}^n (t_i - t_{i-1})(h(\bar\xi_i)+h({\bar\theta}^*_i)) \right\} .\end{align*}

In view of our assumptions (in particular, the boundedness of J), the infimum and the supremum may be exchanged (see Proposition VI.2.3 in [Reference Ekeland and Temam8]), so that the above is equal to the following:

\begin{align*}& \inf_{\bar\xi \in J^n}\sup_{x\in \mathbb R^n } \left\{ 2H_\tau(x) - \sum_{i=1}^n (x_{t_i}-x_{t_{i-1}}) ({\bar\theta}^*_i +\bar\xi_i) +\sum_{i=1}^n (t_i - t_{i-1})(h(\bar\xi_i) + h({\bar\theta}^*_i) ) \right\}\\[2pt] & \quad = \inf_{\bar\xi \in J^n} \left\{ 2\widehat H_\tau\left(\frac{\bar\xi + {\bar\theta}^*}{2}\right) +\sum_{i=1}^n (t_i - t_{i-1})h(\bar\xi_i) + \sum_{i=1}^n (t_i-t_{i-1}) h({\bar\theta}^*_i) \right\}\\[2pt] & \quad = 2\left\{ \widehat H_\tau\left({\bar\theta}^*\right) + \sum_{i=1}^n (t_i-t_{i-1}) h({\bar\theta}^*_i) \right\} .\end{align*}

The last line follows because, on the one hand, by convexity of h and by definition of ${\bar\theta}^*$,

\begin{align*}&\inf_{\bar\xi \in J^n} \left\{ 2\widehat H_\tau\left(\frac{\bar\xi + {\bar\theta}^*}{2}\right) +\sum_{i=1}^n (t_i - t_{i-1})h(\bar\xi_i) + \sum_{i=1}^n (t_i-t_{i-1}) h({\bar\theta}^*_i) \right\} \\[2pt] &\quad \geq \inf_{\bar\xi \in J^n} \left\{ 2\widehat H_\tau\left(\frac{\bar\xi + {\bar\theta}^*}{2}\right) +2\sum_{i=1}^n (t_i - t_{i-1})h\left(\frac{\bar\xi_i + {\bar\theta}^*_i}{2}\right) \right\} \\[2pt] & \quad = 2\inf_{\bar\xi \in J^n} \left\{ \widehat H_\tau\left(\bar\xi \right) +\sum_{i=1}^n (t_i - t_{i-1})h\left(\bar\xi_i\right) \right\} = F({\bar\theta}^*),\end{align*}

and on the other hand, choosing $\bar\xi = {\bar\theta}^*$, we have that

\begin{align*}& \inf_{\bar\xi \in J^n} \left\{ 2\widehat H_\tau\left(\frac{\bar\xi + {\bar\theta}^*}{2}\right) +\sum_{i=1}^n (t_i - t_{i-1})h(\bar\xi_i) + \sum_{i=1}^n (t_i-t_{i-1}) h({\bar\theta}^*_i) \right\} \\[2pt] & \qquad\qquad\leq 2\widehat H_\tau\left({\bar\theta}^*\right) + 2 \sum_{i=1}^n (t_i-t_{i-1}) h({\bar\theta}^*_i).\end{align*}

By another application of Varadhan’s lemma (here the boundedness of $H_\tau$ suffices to ensure its applicability), exchanging the infimum and the supremum by the same argument as above, we have

\begin{align*}&\lim_{\epsilon\to 0} \epsilon \log \mathbb{E}\left[\exp\left(\frac{H_\tau(X^\epsilon) }{\epsilon}\right)\right] = \sup_{x\in \mathbb R^n} \left\{H_\tau(x) - \Lambda^*_\tau(x)\right\}\\[2pt] & \quad = \sup_{x\in \mathbb R^n} \inf_{{\bar\theta}\in J^n}\left\{H_\tau(x) - \sum_{j=1}^n {\bar\theta}_j (x_j-x_{j-1}) + \sum_{j=1}^n (t_j - t_{j-1})h({\bar\theta}_j)\right\}\\[2pt] & \quad = \inf_{{\bar\theta}\in J^n}\sup_{x\in \mathbb R^n} \left\{H_\tau(x) - \sum_{j=1}^n {\bar\theta}_j (x_j-x_{j-1}) + \sum_{j=1}^n (t_j - t_{j-1})h({\bar\theta}_j)\right\}\\[2pt] & \quad = \inf_{{\bar\theta}\in J^n} \left\{\widehat H_\tau({\bar\theta}) - \sum_{j=1}^n (t_j - t_{j-1})h({\bar\theta}_j)\right\}\\[2pt] & \quad = F({\bar\theta}^*),\end{align*}

which shows the asymptotic optimality of ${\bar\theta}^*$ as per Definition 6.

6. Numerical examples

In this section, we apply the variance reduction method to several examples. We first prove a result for options on the average value of the underlying over a finite set of points.

Proposition 5. Let $\tau=\{t_1,\ldots,t_n\}$ and consider an option with log payoff

(14)

\begin{align}H(x) = \log\Big( K - \frac{S_0}{n} \sum_{j=1}^n e^{x_{j}} \Big)_+ .\end{align}

Then H satisfies the assumptions of Proposition 4, and for any ${\bar\theta} \in \mathbb R^n$,

(15)

\begin{equation} \widehat{\!H}({\bar\theta}) = \log\left( \frac{K}{1 - \sum_{l=1}^n \theta_l} \right) - \sum_{m=1}^n \theta_m \log\left(\frac{-\theta_m\,n\,K/S_0}{1 - \sum_{l=1}^n \theta_l}\right)\! ,\end{equation}

where we use the notation $\theta_j ={\bar\theta}_{j}-{\bar\theta}_{j+1}$ for $j=1,\dots,n-1$ and $\theta_n= \bar\theta_n$.

Proof. Let us first show that the assumptions of Proposition 4 are satisfied. The concavity of H follows from Lemma 10 in [Reference Genin and Tankov9] (see the examples in Section 4 of that paper); the payoff is clearly bounded from above and continuous on the set where it is finite.

Let us now turn to the computation of $\widehat H$. For this payoff,

\begin{equation*}H(x) - \sum_{i=1}^n {\bar\theta}_i (x_i - x_{i-1}) = \log\bigg( K - \frac{S_0}{n} \sum_{j=1}^n e^{x_{j}} \bigg)_+ \!\!\! - \, \sum_{j=1}^n \theta_j x_{j} .\end{equation*}

When the option is out of or at the money, the log payoff is $-\infty$. Assume that x is such that $H(x) > -\infty$, and differentiate with respect to $x_{j}$. We obtain

\begin{equation*}0 = \partial_{x_{j}}\left\{\log\left( K - \frac{S_0}{n} \sum_{l=1}^n e^{x_{l}} \right) - \sum_{l=1}^n x_{l} \theta_l \right\} = \frac{- \frac{S_0}{n} \, e^{x_{j}}}{ K - \frac{S_0}{n} \sum_{l=1}^n e^{x_{l}}} - \theta_j .\end{equation*}

Therefore the x that maximizes $H(x) - \sum_{j=1}^n \theta_j x_{j} $ satisfies

\begin{equation*}\frac{e^{x_{j}}}{\theta_j} = - n\,\frac{K}{S_0} + \sum_{l=1}^n e^{x_{l}} = - n\,\frac{K}{S_0} + \frac{e^{x_{j}}}{\theta_j} \sum_{l=1}^n \theta_l\end{equation*}

for every j. Therefore

\begin{equation*}x_{j} = \log\left(\frac{-\theta_j\,n\,K/S_0}{1 - \sum_{l=1}^n \theta_l}\right) .\end{equation*}

Inserting $x_{j}$ in the value of $H(x) - \sum_{j=1}^n \theta_j x_{j} $, we obtain the result.

6.1. European and Asian put options in the Heston model

Consider the Heston model [Reference Heston12]

(16)

\begin{equation}\begin{aligned}dX_t & = -\frac{V_t}{2} \, dt + \sqrt{V_t} \, dW_t^1 , && X_0 = 0 ,\\ dV_t & = \lambda( \mu - V_t) \, dt + \zeta \sqrt{V_t} \, dW_t^2 , && V_0 > 0 , \\ d\left\langle {W^1,W^2} \right\rangle_t & = \rho \, dt ,\end{aligned}\end{equation}

where $W^1,W^2$ are standard $\mathbb P$-Brownian motions. The Laplace transform of $(X_t,V_t)$ is

\begin{equation*}\mathbb E\left( e^{u X_t + w V_t} \right) = e^{ \phi(t,u,w) + \psi(t,u,w) V_0 + u X_0 } ,\end{equation*}

where $\phi, \psi$ satisfy the Riccati equations

(17)

\begin{equation}\begin{aligned}\partial_{t} \phi(t,u,w) & = F(u, \psi(t,u,w)), \qquad && \phi(0,u,w) = 0, \\\partial_{t} \psi(t,u,w) & = R(u, \psi(t,u,w)), \qquad && \psi(0,u,w) = w,\end{aligned}\end{equation}

for $F(u,w) = \lambda \mu w$ and

\begin{equation*}R(u,w) = \frac{\zeta^2}{2} \, w^2 + \zeta \rho \, u w - \lambda w + \frac{1}{2} (u^2 - u) .\end{equation*}

A standard calculation shows that the solution of the Riccati equations (17) is

(18)

\begin{equation}\begin{aligned}\psi(t,u,w) & = \frac{1}{\zeta} \left(\frac{\lambda}{\zeta} - \rho u \right) - \frac{\gamma}{\zeta^2} \,\frac{\tanh\left(\frac{\gamma}{2} \, t\right) + \eta}{1 + \eta \, \tanh\left(\frac{\gamma}{2} \, t\right)} , \\\phi(t,u,w) & = \mu\,\frac{\lambda}{\zeta}\left(\frac{\lambda}{\zeta} - \rho u \right) t - 2\mu \,\frac{\lambda}{\zeta^2}\,\log\left(\cosh\left(\frac{\gamma}{2} \, t\right) + \eta \,\sinh\left(\frac{\gamma}{2} \, t\right)\right) ,\end{aligned}\end{equation}

where

\begin{equation*}\gamma = \gamma(u) = \zeta \, \sqrt{\left(\frac{\lambda}{\zeta} - \rho u\right)^2 \!\!+ \frac{1}{4} - \!\left(u-\frac{1}{2}\right)^2}\end{equation*}

and

\begin{equation*}\eta = \eta(u,w) = \frac{\lambda - \zeta\rho u - \zeta^2 w}{\gamma(u)}.\end{equation*}

Furthermore, for the Heston model, the function h (the asymptotic Laplace exponent of X; see (4)) is given by

(19)

\begin{equation}h(u) = \left\{\begin{aligned}&\mu\,\frac{\lambda}{\zeta}\left(\frac{\lambda}{\zeta} - \rho u\right) - \mu\,\frac{\lambda}{\zeta^2} \,\gamma(u) && \text{if}\quad \left(\frac{\lambda}{\zeta} - \rho u\right)^2 \!\!+\frac{1}{4} - \!\left(u-\frac{1}{2}\right)^2>0,\\[4pt] &+\infty &&\text{otherwise}.\end{aligned} \right.\end{equation}

Remark 4. The function h is the log-Laplace transform of the normal inverse Gaussian process [Reference Barndorff-Nielsen1] which is complex-analytic on a strip around the real axis.

The following proposition describes the effect of the time-dependent Esscher transform on the dynamics of the Heston model.

Proposition 6. Let $\tau = \{t_1,\ldots,t_n\}$, and let $\mathbb P_{\bar\theta}$ be the measure given by

\begin{equation*}\frac{d\mathbb P_{\bar\theta}}{d\mathbb P} = \frac{e^{\sum_{j=1}^n \bar\theta_j \, (X_{t_j}-X_{t_{j-1}}\!)}}{\mathbb E\big[e^{\sum_{j=1}^n \bar\theta_j \, (X_{t_j}-X_{t_{j-1}}\!)}\big]} .\end{equation*}

Under $\mathbb P_{\bar\theta}$, the dynamics of the $\mathbb P$-Heston process $(X_t,V_t)$ becomes

(20)

\begin{equation}\begin{aligned}dX_t & = \left({\bar\theta}_{n_t} + \zeta \rho \,\Psi\left(\tau_t-t, {\bar\theta}_{n_t}, \ldots, {\bar\theta}_n \right) -\frac{1}{2} \right) V_t \, dt + \sqrt{V_t} \, d\tilde{W}_t^1 , && X_0 = 0 ,\\ dV_t & = \tilde{\lambda}_t\, ( \tilde{\mu}_t - V_t) \, dt + \zeta \sqrt{V_t} \, d\tilde{W}_t^2 , && V_0 = V_0 , \\ d\left\langle {\tilde{W}^1,\tilde{W}^2} \right\rangle_t &= \rho \, dt ,\end{aligned}\end{equation}

where $\tilde{W}$ is a two-dimensional correlated $\mathbb P_{\bar\theta}$-Brownian motion, $n_t = \inf\{k\in \mathbb{N}\,:\, t_k \geq t\}$, $\tau_t = \inf\{s \in \tau \,:\, s \ge t \}$, $\Psi$ is defined iteratively as

\begin{align*}\Psi\left(s, {\bar\theta}_{j}, \ldots, {\bar\theta}_n \right)& = \psi\left(s, {\bar\theta}_{j}, \Psi\left(t_{j+1}-t_j, {\bar\theta}_{j+1}, \ldots, {\bar\theta}_n \right)\right)\! , \\\Psi\left(s \right)& = 0 ,\end{align*}

and

\begin{equation*}\tilde{\lambda}_t = \lambda - \zeta {\bar\theta}_{n_t} \rho - \zeta^2 \,\Psi\left(\tau_t-t, {\bar\theta}_{n_t}, \ldots, {\bar\theta}_n \right) \quad \text{and} \quad \tilde{\mu}_t = \frac{\lambda \mu}{\tilde{\lambda}_t} .\end{equation*}

Proof. We define the function $\Phi$ iteratively by

\begin{align*}\Phi\!\left(s, {\bar\theta}_{j}, \ldots, {\bar\theta}_n \right)& = \phi\!\left(s, {\bar\theta}_{j}, \Psi\left(t_{j+1}-t_j, {\bar\theta}_{j+1}, \ldots, {\bar\theta}_n \right)\right) \\ & \quad + \Phi\!\left(t_{j+1}-t_j, {\bar\theta}_{j+1}, \ldots, {\bar\theta}_n \right)\!, \\ \Phi\!\left(s \right)& = 0.\end{align*}

Let

\begin{equation*}D(t,X_t,V_t) = \left.\frac{d\mathbb P_{\bar\theta}}{d\mathbb P}\right|_{\mathcal F_t}\! .\end{equation*}

Then

\begin{align*}D(t,X_t,V_t)& = \frac{e^{\sum_{j=1}^{n_t-1} \bar\theta_j \, (X_{t_j}-X_{t_{j-1}}\!)}}{\mathbb E\big[e^{\sum_{j=1}^n \bar\theta_j \, (X_{t_j}-X_{t_{j-1}}\!)}\big]} \: \mathbb E\left[e^{\sum_{j=n_t}^{n} \bar\theta_j \, (X_{t_j}-X_{t_{j-1}}\!)} \, \middle|\,\mathcal F_t\right] \\[4pt] & = \frac{e^{\sum_{j=1}^{n_t-1} \bar\theta_j \, (X_{t_j}-X_{t_{j-1}}\!) + \Phi\left(\tau_t-t, {\bar\theta}_{n_t}, \ldots, {\bar\theta}_n \right)}}{e^{\Phi\left(t_1, {\bar\theta}_{1}, \ldots, {\bar\theta}_n \right) + \Psi\left(t_1, {\bar\theta}_{1}, \ldots, {\bar\theta}_n \right) \, V_0 + {\bar\theta}_{1} \, X_{0} }} \: e^{ \Psi\left(\tau_t-t, {\bar\theta}_{n_t}, \ldots, {\bar\theta}_n \right) \, V_t + {\bar\theta}_{n_t} \, X_{t} } .\end{align*}

The dynamics of $D(t,X_t,V_t)$ can then be expressed using Itô’s lemma as

\begin{align*}dD(t,X_t,V_t)& = D(t,X_t,V_t) \left({\bar\theta}_{n_t} dX_t + \Psi\left(\tau_t-t, {\bar\theta}_{n_t}, \ldots, {\bar\theta}_n \right) dV_t\right) + \ldots\, dt\\& = D(t,X_t,V_t) \sqrt{V_t}\left({\bar\theta}_{n_t} dW_t^1 + \zeta \, \Psi\left(\tau_t-t, {\bar\theta}_{n_t}, \ldots, {\bar\theta}_n \right) dW_t^2\right)\! .\end{align*}

By Girsanov’s theorem,

\begin{equation*}d\left(\begin{array}{c}\tilde{W}_t^1 \\[5pt] \tilde{W}_t^2\end{array}\right)=d\left(\begin{array}{c}W_t^1 \\[5pt] W_t^2\end{array}\right)- \sqrt{V_t}\left(\begin{array}{c}{\bar\theta}_{n_t} + \zeta \rho \,\Psi\left(\tau_t-t, {\bar\theta}_{n_t}, \ldots, {\bar\theta}_n \right) \\[5pt] {\bar\theta}_{n_t} \rho + \zeta \,\Psi\left(\tau_t-t, {\bar\theta}_{n_t}, \ldots, {\bar\theta}_n \right)\end{array}\right) dt\end{equation*}

is a two-dimensional Brownian motion under the measure $\mathbb P_\theta$. Replacing W in (16) by $\tilde{W}$ gives the result.

Remark 5. Proposition 6 shows that the time-dependent Esscher transform changes a classical Heston process into a Heston process with time-inhomogeneous drift.

Remark 6. (Asymptotic optimality for the Heston model.) The limiting Laplace exponent of the Heston model is finite and continuous on the bounded interval $J=[u_-,u_+]$, where

\begin{equation*}u_\pm = \frac{\left(\frac{1}{2} - \frac{\lambda}{\zeta}\,\rho \right) \pm \sqrt{\left(\frac{1}{2} - \frac{\lambda}{\zeta}\,\rho \right)^2 + \frac{\lambda^2}{\zeta^2} \, (1 - \rho^2)}}{(1 - \rho^2)} ,\end{equation*}

which contains a neighbourhood of zero. The function w (stable equilibrium of the second Riccati equation) is given by

(21)

\begin{align}w(u) = \frac{(\lambda-u\rho\zeta) - \sqrt{(\lambda-u\rho\zeta)^2 - \zeta^2 (u^2-u)}}{\zeta^2},\end{align}

so that

\begin{equation*}w(u_-) = \frac{1}{\zeta}\left(\frac{\lambda}{\zeta} - \rho u_- \right) \qquad \text{and} \qquad w(u_+) = \frac{1}{\zeta}\left(\frac{\lambda}{\zeta} - \rho u_+ \right)\! .\end{equation*}

We see that Assumption 3 is satisfied and thus asymptotic optimality for Asian and European put options is guaranteed in the Heston model only when $\rho = 0$. Nevertheless, since the actual variance reduction problem is itself unsolvable, our goal is to find a good candidate measure that we can test numerically. The fact that we do not have the full theory to justify it is therefore not problematic.

6.1.1. Numerical results for European put options

In this case, the asymptotically optimal variance reduction measure of Proposition 4 is supported by the single point $\{T\}$ and given (with $\epsilon=1$) by

\begin{equation*}\frac{d\mathbb P_{\bar\theta^*}}{d\mathbb P} = \frac{e^{\bar\theta^* X_{T}}}{\mathbb E\left[e^{\bar \theta^* X_T}\right]}.\end{equation*}

For $\bar \theta\in \mathbb R$,

(22)

\begin{align} \widehat{\!H}({\bar\theta}) + T h({\bar\theta}) = \log\left( \frac{K}{1 - {\bar\theta}} \right) - {\bar\theta} \log\left(\frac{-{\bar\theta}\,K/S_0}{1 - \theta}\right) + T \,\mu\,\frac{\lambda}{\zeta} \left(\frac{\lambda}{\zeta} - \rho \, {\bar\theta} - \frac{\gamma({\bar\theta})}{\zeta} \, \right).\end{align}

In order to obtain ${\bar\theta}^*$, we therefore differentiate (22) with respect to ${\bar\theta}$ and solve the resulting equation numerically.

We simulate $N=10000$ trajectories of the Heston model with parameters $\lambda = 1.15$, $\mu = 0.04$, $\zeta = 0.2$, $\rho = -0.4$ and initial values $V_0 = 0.04$ and $S_0 = 1$, under both $\mathbb P$ (Equation (16)) and $\mathbb P_{\bar\theta^*}$ ((20)), with $n=1$ and $t_1 = T$, using a standard Euler scheme with 200 discretization steps. For the $\mathbb P$-realizations $X^{(i)}$, we calculate the European put price as

\begin{equation*}\frac{1}{N} \sum_{j=1}^N \left(K-S_0 \, e^{X_T^{(i)}}\right)_+ ,\end{equation*}

and for the $\mathbb P_{\bar\theta^*}$-realizations $X^{(i,\bar\theta^*)}$, as

(23)

\begin{equation}\frac{e^{\phi\left(T, \bar\theta^*, 0 \right) + \psi\left(T, \bar\theta^*, 0 \right) \, V_0 }}{N} \sum_{j=1}^N e^{-\theta \, X_{T}^{(i, \bar\theta^*)}} \left(K-S_0 e^{X_T^{(i, \bar\theta^*)}}\right)_+ .\end{equation}

Each time, we compute the $\mathbb P_{\bar\theta^*}$-standard deviation, the variance ratio, and the adjusted variance ratio, i.e. the variance ratio divided by the ratio of simulation time. The latter measures the actual efficiency of the method, given the fact that simulating under the measure change takes in general slightly more time.

Table 1: The variance ratio as a function of the maturity for at-the-money European put options.

Table 2: The variance ratio as a function of the strike for the European put option with maturity $T=1$.

Table 3: The variance ratio as a function of the strike for the European put option with maturity $T=3$.

In Table 1, we fix the strike to the value $K=1$ and let the maturity T vary from $0.25$ to 3, whereas in Tables 2 and 3, we fix maturity to $T=1$ and to $T=3$, while we let the strike K vary between $0.25$ and $1.75$. In each case we calculate the price, the standard error, the (non-adjusted) variance ratio, and the variance ratio adjusted by the ratio of simulation times.

In all cases, we can see that the variance ratio is quite significant for deep out-of-the-money options and less significant, yet still very interesting, when the option is at or in the money. This corresponds to the natural behaviour of variance reduction techniques that involve measure changes, as the measure change increases the exercise probability. Note that the simulation time is only slightly larger when simulating with the measure change, while the time required for the optimization procedure is negligible compared with the simulation time. In Figure 1, we fix the maturity to $T=1.5$ and plot the empirical variance of the estimator (23) as a function of $\bar\theta^*$. Our method provides $\bar\theta^* = -0.457$ as the candidate asymptotically optimal measure change. We can therefore see that our candidate $\bar\theta^*$ is very close to the optimal one.

Figure 1: The variance of the Monte Carlo estimator as a function of $\theta$.

6.1.2. Numerical results for Asian put options

We now consider the case of a (discretized) Asian put option with log payoff (14) and discretization dates $t_j =\frac{j}{n}\,T$. By Proposition 4, the asymptotically optimal variance reduction measure (with $\epsilon=1$) is given by

\begin{equation*}\frac{d\mathbb P_{\bar\theta^*}}{d\mathbb P} = \frac{e^{\sum_{j=1}^n \bar\theta^*_j \, (X_{t_j}-X_{t_{j-1}}\!)}}{\mathbb E\left[e^{\sum_{j=1}^n \bar\theta^*_j \, (X_{t_j}-X_{t_{j-1}}\!)}\right]},\end{equation*}

where $\bar\theta^*$ is computed by minimizing

\begin{equation*}\log\left( \frac{K}{1 - {\bar\theta}_1} \right) - \sum_{m=1}^n ({\bar\theta}_m - {\bar\theta}_{m+1})\log\!\left(\frac{-({\bar\theta}_m \!- {\bar\theta}_{m+1})\,nK/S_0}{1 - {\bar\theta}_1}\right) + \frac{T}{n} \sum_{j=1}^n h\left({\bar\theta}_j\right)\!;\end{equation*}

see Proposition 5 and (19). By differentiating with respect to ${\bar\theta}_j$, we obtain, for $j=2,\ldots,n$,

(24)

\begin{equation}\begin{aligned}0& = \partial_{{\bar\theta}_j}\left\{ \widehat{\!H}({\bar\theta}) + \frac{T}{n}\sum_{m=1}^n h\left({\bar\theta}_m \right) \right\} \\[4pt] & = \frac{T\,h'\left({\bar\theta}_j \right)}{n} - \log\left[-({\bar\theta}_j - {\bar\theta}_{j+1})\right] + \log\left[-({\bar\theta}_{j-1} - {\bar\theta}_{j})\right] ,\end{aligned}\end{equation}

while for $j=1$ we have

(25)

\begin{equation}\begin{aligned}0& = \partial_{{\bar\theta}_1}\left\{ \widehat{\!H}({\bar\theta}) + \frac{T}{n}\sum_{m=1}^n h\left({\bar\theta}_m \right) \right\} \\ & = \log\left( 1 - {\bar\theta}_1 \right) - \log(n\,K/S_0) + \frac{T}{n}\,h'\left({\bar\theta}_1 \right) - \log\left[-({\bar\theta}_1 - {\bar\theta}_{2})\right] .\end{aligned}\end{equation}

Taking the exponential in (24) and (25), we obtain

\begin{align*}{\bar\theta}_2-{\bar\theta}_{1\phantom{-1}} & = \: \: (1\phantom{{\bar\theta}_{n}\:}-{\bar\theta}_{1\phantom{-1}}) \: \: e^{\frac{T}{n}\,h'({\bar\theta}_1)} \cdot \frac{S_0}{n\,K} , \\{\bar\theta}_3-{\bar\theta}_{2\phantom{-1}} & = \: \: ({\bar\theta}_{2\phantom{-1}}-{\bar\theta}_{1\phantom{-1}}) \: \: e^{\frac{T}{n}\,h'({\bar\theta}_2)} , \\\vdots \:\:\qquad & = \qquad \qquad \vdots \\{\bar\theta}_n-{\bar\theta}_{n-1} & = \: \: ({\bar\theta}_{n-1}-{\bar\theta}_{n-2}) \: \: e^{\frac{T}{n} \, h'({\bar\theta}_{n-1})} , \\-{\bar\theta}_{n\phantom{-1}} & = \: \: ({\bar\theta}_{n\phantom{-1}}-{\bar\theta}_{n-1}) \: \: e^{\frac{T}{n} \, h'({\bar\theta}_{n})} .\end{align*}

Finally, let $\mathcal{T}$ be the real-valued function defined by

\begin{equation*}{\bar\theta}_n \mapsto \mathcal{T}({\bar\theta}_n) = (1-{\bar\theta}_{1}) \textrm{e}^{\frac{T}{n}\,h'({\bar\theta}_1)} \cdot \frac{S_0}{n\,K} - {\bar\theta}_2-{\bar\theta}_{1} ,\end{equation*}

where ${\bar\theta}_{n-1} = {\bar\theta}_{n} + {\bar\theta}_{n} \, e^{-\frac{T}{n} \, h'({\bar\theta}_{n})}$, and iteratively,

\begin{equation*}{\bar\theta}_{j-2} = {\bar\theta}_{j-1} - ({\bar\theta}_j-{\bar\theta}_{j-1}) \, e^{-\frac{T}{n} \, h'({\bar\theta}_{j-1})} , \qquad j=n,\ldots,3 .\end{equation*}

Solving numerically the equation $\mathcal T({\bar\theta}_n)=0$, we find the optimal ${\bar\theta}^*$.

As before, we simulate $N=10000$ trajectories of the Heston model with parameters $\lambda = 1.15$, $\mu = 0.04$, $\zeta = 0.2$, $\rho = -0.4$ and initial values $V_0 = 0.04$ and $S_0 = 1$, under both $\mathbb P$ (Equation (16)) and $\mathbb P_{\bar\theta^*}$ (Equation (20)), with $n=200$ and $t_j=\frac{j}{n}\,T$, using a standard Euler scheme with 200 discretization steps. For the $\mathbb P$-realizations $X^{(i)}$, we calculate the Asian put price as

(26)

\begin{equation}\frac{1}{N} \sum_{j=1}^N \left(K-\frac{S_0}{n}\sum_{j=1}^n e^{X_{t_j}^{(i)}}\right)_+ ,\end{equation}

and for the $\mathbb P_{\bar\theta^*}$-realizations $X^{(i, \bar\theta^*)}$, as

(27)

\begin{equation}\frac{e^{\Phi\left(t_1, {\bar\theta^*}_{1}, \ldots, {\bar\theta^*}_n \right) + \Psi\left(t_1, {\bar\theta^*}_{1}, \ldots, {\bar\theta^*}_n \right) \, V_0}}{N} \sum_{j=1}^N e^{-\sum_{j=1}^n \bar\theta^*_j \, (X_{t_j}^{(i,\theta^*)}-X_{t_{j-1}}^{(i,\theta^*)})} \left(K-\frac{S_0}{n}\sum_{j=1}^n e^{X_{t_j}^{(i, \bar\theta^*)}}\right)_+ .\end{equation}

Each time, we compute the $\mathbb P_{\bar\theta^*}$-standard deviation and the adjusted and non-adjusted variance ratios. In Table 4, we fix the maturity to $T = 1.5$ and let the strike K vary between $0.6$ and $1.3$.

The conclusion is the same as for the European put option. Indeed, the variance ratio explodes when the option moves away from the money. Because of the time-dependence of the measure change, the adjusted variance ratio is consistently around 13% below the non-adjusted ratio. The adjusted variance ratio remains very interesting, however, with values above 3 around the money.

Table 4: The variance ratio as a function of the strike for the Asian put option, with $\lambda = 1.15$, $\mu = 0.04$, $\zeta = 0.2$, $\rho = -0.4$, $S_0 = 1$, $V_0 = 0.04$, $T = 1.5$, $N = 10000$, and 200 discretization steps.

6.2. European put options in the Heston model with negative exponential jumps

We now consider the Heston model with negative exponential jumps,

(28)

\begin{equation}\begin{aligned}dX_t & = \left( \delta-\frac{V_t}{2}\right) \, dt + \sqrt{V_t} \, dW_t^1 + dJ_t , && X_0 = 0 ,\\ dV_t & = \lambda( \mu - V_t) \, dt + \zeta \sqrt{V_t} \, dW_t^2 , && V_0 = V_0 , \\ d\left\langle {W^1,W^2} \right\rangle_t & = \rho \, dt ,\end{aligned}\end{equation}

where $W^1,W^2$ are standard $\mathbb P$-Brownian motions and $(J_t)_{t \ge 0}$ is an independent compound Poisson process with constant jump rate r and jump distribution $\text{Neg-}\text{Exp}(\alpha)$; i.e. the Lévy measure of $(J_t)_{t \ge 0}$ is $\nu(dx)=r \, \alpha e^{\alpha x} \mathds{1}_{\{x<0\}} dx$. The martingale condition on $S = S_0 \, e^X$ imposes $\delta=\frac{r}{\alpha+1}$. The Laplace transform of $(X_t,V_t)$ is

\begin{equation*}\mathbb E\left( e^{u X_t + w V_t} \right) = e^{ \phi(t,u,w) + \psi(t,u,w) V_0 + u X_0 } ,\end{equation*}

where $\phi, \psi$ satisfy the Riccati equations

(29)

\begin{equation}\begin{aligned}\partial_{t} \phi(t,u,w) & = F(u, \psi(t,u,w)) , \quad && \phi(0,u,w) = 0 ,\\\partial_{t} \psi(t,u,w) & = R(u, \psi(t,u,w)) , \quad && \psi(0,u,w) = w ,\end{aligned}\end{equation}

for $F(u,w) = \lambda \mu \, w + \tilde{\kappa}(u)$ with

\begin{equation*}\tilde{\kappa}(u) = \frac{r u(u-1)}{(\alpha+1)(\alpha+u)},\end{equation*}

and

\begin{equation*}R(u,w) = \frac{\zeta^2}{2} \, w^2 + \zeta \rho \, u w - \lambda w + \frac{1}{2} (u^2 - u) .\end{equation*}

Again, a standard calculation shows that the solution of the generalized Riccati equations (29) is

(30)

\begin{equation}\begin{aligned}\psi(t,u,w) & = \frac{1}{\zeta} \left(\frac{\lambda}{\zeta} - \rho u \right) - \frac{\gamma}{\zeta^2} \,\frac{\tanh\left(\frac{\gamma}{2} \, t\right) + \eta}{1 + \eta \, \tanh\left(\frac{\gamma}{2} \, t\right)} , \\\phi(t,u,w) & = \mu\,\frac{\lambda}{\zeta}\left(\frac{\lambda}{\zeta} - \rho u \right) t - 2\mu\frac{\lambda}{\zeta^2}\,\log\!\left(\cosh\left(\frac{\gamma}{2} t\right) + \eta \,\sinh\left(\frac{\gamma}{2} t\right)\right) \!+ t \tilde{\kappa}(u) ,\end{aligned}\end{equation}

where

\begin{equation*}\gamma = \gamma(u) = \zeta \, \sqrt{\left(\frac{\lambda}{\zeta} - \rho u\right)^2 \!\!+ \frac{1}{4} - \!\left(u-\frac{1}{2}\right)^2}\end{equation*}

and

\begin{equation*}\eta = \eta(u,w) = \frac{\lambda - \zeta\rho u - \zeta^2 w}{\gamma(u)} .\end{equation*}

Furthermore, for the Heston model with negative jumps, the function h is given by

(31)

\begin{equation}h(u) = \mu\,\frac{\lambda}{\zeta}\left(\frac{\lambda}{\zeta} - \rho u \right) - \mu\,\frac{\lambda}{\zeta^2} \, \gamma(u) + \tilde{\kappa}(u) .\end{equation}

Remark 7. (Asymptotic optimality for the Heston model with jumps.) The limiting Laplace exponent of the Heston model is finite and continuous on the bounded interval $J=[u_-,u_+]$, where

which contains a neighbourhood of zero. The function w has the same form (21) as in the Heston model without jumps, so that

\begin{equation*}w(u_-) = \frac{1}{\zeta}\left(\frac{\lambda}{\zeta} - \rho u_-\right)\textbf{1}_{u_->-\alpha} + w({-}\alpha)\textbf{1}_{u_-=-\alpha} \qquad \text{and} \qquad w(u_+) = \frac{1}{\zeta}\left(\frac{\lambda}{\zeta} - \rho u_+ \right) .\end{equation*}

Thus, Assumption 3 is satisfied and asymptotic optimality holds for Asian and European put options when $\rho=0$ and jumps are sufficiently small, namely

\begin{equation*}\alpha>\sqrt{\frac{1}{4} + \frac{\lambda^2}{\zeta^2}} - \frac{1}{2}.\end{equation*}

Let us now study the effect of the Esscher transform on the dynamics of the Heston model with jumps.

Proposition 7. Let $\mathbb P_{\bar\theta}$ be the measure given by

\begin{equation*}\frac{d\mathbb P_{\bar\theta}}{d\mathbb P} = \frac{e^{{\bar\theta} \, X_{T}}}{\mathbb E\left[e^{{\bar\theta} \, X_{T}}\right]} .\end{equation*}

Under $\mathbb P_{\bar\theta}$, the dynamics of the $\mathbb P$-Heston process with jumps $(X_t,V_t)$ becomes

(32)

\begin{align}dX_t & = \delta dt + \left({\bar\theta} + \zeta \rho \,\psi\left(T-t, {\bar\theta}, 0 \right) -\frac{1}{2} \right) V_t \, dt + \sqrt{V_t} \, d\tilde{W}_t^1 + dJ_t , && X_0 = 0 ,\nonumber\\ dV_t & = \tilde{\lambda}_t\, ( \tilde{\mu}_t - V_t) \, dt + \zeta \sqrt{V_t} \, d\tilde{W}_t^2 , && V_0 = V_0 ,\nonumber\\ d\left\langle {\tilde{W}^1,\tilde{W}^2} \right\rangle_t & = \rho \, dt ,\end{align}

where $\tilde{W}$ is a two-dimensional correlated $\mathbb P_{\bar\theta}$-Brownian motion, $\phi$ and $\psi$ are given in (30),

\begin{equation*}\tilde{\lambda}_t = \lambda - \zeta {\bar\theta} \rho - \zeta^2 \,\psi\left(T-t, {\bar\theta}, 0 \right) \quad \text{and} \quad \tilde{\mu}_t = \frac{\lambda \mu}{\tilde{\lambda}_t} ,\end{equation*}

and $(J_t)_{t \ge 0}$ is a compound Poisson process with jump rate $\frac{r \alpha}{\alpha+{\bar\theta}}$ and jump distribution $\text{Neg-Exp}(\alpha+{\bar\theta})$ under $\mathbb P_{\bar\theta}$.

Proof. Define

\begin{equation*}D(t,X_t,V_t) = \left.\frac{d\mathbb P_{\bar\theta}}{d\mathbb P}\right|_{\mathcal F_t} = \frac{e^{\phi\left(T-t, {\bar\theta}, 0 \right)} }{e^{\phi\left(T, {\bar\theta}, 0 \right) + \psi\left(T, {\bar\theta}, 0 \right) \, V_0 }} \: e^{ \psi\left(T-t, {\bar\theta}, 0 \right) \, V_t + {\bar\theta} \, X_{t} } .\end{equation*}

The dynamics of $D(t,X_t,V_t)$ can then be expressed using Itô’s lemma as

\begin{align*}dD(t,X_t,V_t)& = D(t,X_t,V_t) \left({\bar\theta} dX_t + \psi\left(T-t, {\bar\theta}, 0 \right) dV_t\right) + \ldots\, dt\\[4pt] & = D(t,X_t,V_t) \!\left[\sqrt{V_t}\left({\bar\theta} dW_t^1 \!+\! \zeta \, \psi\left(T\!\!-\!t, {\bar\theta}, 0 \right) dW_t^2\right) \!+\! {\bar\theta}\,(\delta dt \!+\! dJ_t) \right] ,\end{align*}

and Girsanov’s theorem then shows that

\begin{equation*}d\left(\begin{array}{c}\tilde{W}_t^1 \\[4pt] \tilde{W}_t^2\end{array}\right)=d\left(\begin{array}{c}W_t^1 \\[4pt] W_t^2\end{array}\right)- \sqrt{V_t}\left(\begin{array}{c}{\bar\theta} + \zeta \rho \,\psi\left(T-t, {\bar\theta}, 0 \right) \\[4pt] {\bar\theta} \rho + \zeta \,\psi\left(T-t, {\bar\theta}, 0 \right)\end{array}\right) dt\end{equation*}

is a two-dimensional Brownian motion under the measure $P_{\bar\theta}$. Replacing W in (16) by $\tilde{W}$ gives (32). In order to finish the proof, it remains to show that the jump process $(J_t)_{t \ge 0}$ has the desired distribution under $\mathbb P_{\bar\theta}$. Let us calculate the $\mathbb P_{\bar\theta}$-Laplace transform of $J_t$:

\begin{align*}\mathbb E^{\mathbb P_{\bar\theta}}\left[e^{u J_t}\right]& = \frac{\mathbb E\left[e^{u J_t}\,\mathbb E\left[e^{{\bar\theta} X_T}\,\middle|\, \mathcal F_t\right]\right]}{\mathbb E\left[e^{{\bar\theta} X_T}\right]} \\[4pt] & = \frac{e^{\phi\left(T-t, {\bar\theta}, 0 \right)}}{\mathbb E\left[e^{{\bar\theta} X_T}\right]} \mathbb E\left[e^{u J_t+\psi\left(T-t, {\bar\theta}, 0 \right) \, V_t + {\bar\theta} X_t}\right] .\end{align*}

By independence of the jumps,

\begin{equation*}\mathbb E\left[e^{u J_t+\psi\left(T-t, {\bar\theta}, 0 \right) \, V_t + {\bar\theta} X_t}\right]=e^{{\bar\theta}\delta \, t} \, \mathbb E\left[e^{(u+{\bar\theta}) J_t}\right] \, \mathbb E\left[e^{\psi\left(T-t, {\bar\theta}, 0 \right) \, V_t + {\bar\theta} (X_t - \delta \, t -J_t) }\right] ,\end{equation*}

where

\begin{equation*}\mathbb E\left[e^{(u+{\bar\theta}) J_t}\right] = e^{-rt \frac{u+{\bar\theta}}{u + {\bar\theta} + \alpha}}.\end{equation*}

Furthermore, $(X_t - \delta \, t -J_t,\, V_t)_{t \ge 0}$ is a standard Heston process without jumps. Therefore, comparing (18) and (30), we find that

\begin{equation*}\mathbb E\left[e^{\psi\left(T-t, {\bar\theta}, 0 \right) \, V_t + {\bar\theta} (X_t - \delta \, t -J_t) }\right]= e^{\phi(t,{\bar\theta},\psi\left(T-t, {\bar\theta}, 0 \right)) - t \frac{r {\bar\theta}({\bar\theta}-1)}{(\alpha+1)(\alpha+{\bar\theta})} + \psi(t,{\bar\theta},\psi\left(T-t, {\bar\theta}, 0 \right)) \, V_0} .\end{equation*}

Using the fact that $\psi(t,{\bar\theta},\psi\left(T-t, {\bar\theta}, 0 \right)) = \psi\left(T, {\bar\theta}, 0 \right)$ and

\begin{equation*}\phi\left(T-t, {\bar\theta}, 0 \right) + \phi(t,{\bar\theta},\psi\left(T-t, {\bar\theta}, 0 \right)) = \phi\left(T, {\bar\theta}, 0 \right)\end{equation*}

(see Equation (2.1) in [Reference Keller-Ressel14]), we finally obtain

\begin{align*}\mathbb E^{\mathbb P_{\bar\theta}}\left[e^{u J_t}\right]& = e^{{\bar\theta}\delta \, t - rt \frac{u+{\bar\theta}}{u + {\bar\theta} + \alpha} - t \frac{r {\bar\theta}({\bar\theta}-1)}{(\alpha+1)(\alpha+{\bar\theta})}} \\& = e^{{\bar\theta}\frac{r}{\alpha+1} \, t - rt \frac{u+{\bar\theta}}{u + {\bar\theta} + \alpha} - t \frac{r {\bar\theta}({\bar\theta}-1)}{(\alpha+1)(\alpha+{\bar\theta})}} = e^{- \frac{r\alpha}{\alpha+{\bar\theta}} \, t \, \frac{u}{u + (\alpha + {\bar\theta})}},\end{align*}

which is indeed the Laplace transform of a compound Poisson process with jump rate $\frac{r\alpha}{\alpha+{\bar\theta}}$ and $\text{Neg-Exp}(\alpha+{\bar\theta})$-distributed jumps.

6.2.1. Numerical results for the European put option

Similarly to the case of the Heston model without jumps, we find the optimal $\bar\theta^*$ by numerically minimizing

(33)

\begin{align}& \widehat{\!H}({\bar\theta}) + T h({\bar\theta}) \,dt = \log\left( \frac{K}{1 - {\bar\theta}} \right) - {\bar\theta} \log\left(\frac{-{\bar\theta}\,K/S_0}{1 - {\bar\theta}}\right) + T \,\mu\,\frac{\lambda}{\zeta} \left(\frac{\lambda}{\zeta} - \rho {\bar\theta} - \frac{\gamma({\bar\theta})}{\zeta} \, \right) + T \, \tilde{\kappa}({\bar\theta}). \end{align}

We simulate $N=10000$ trajectories of the Heston model with jumps with parameters $\lambda = 1.1$, $\mu = 0.7$, $\zeta = 0.3$, $\rho = -0.5$, $r=2$, $\alpha=3$ and initial values $V_0 = 1.3$ and $S_0 = 1$, under both $\mathbb P$ (Equation (28)) and $\mathbb P_{\bar\theta^*}$ (Equation (32)), using a standard Euler scheme with 200 discretization steps. For the $\mathbb P$-realizations $X^{(i)}$, we calculate the standard Monte Carlo estimator of the European put price, and for the $\mathbb P_{\bar\theta^*}$-realizations $X^{(i,{\bar\theta^*})}$, we use (23) with $\phi$ and $\psi$ given in (30) and compute the same statistics as in the previous examples. In Table 5, we fix the strike to the value $K=1$ and let the maturity T vary from $0.25$ to 3, whereas in Tables 6 and 7, we fix the maturity to $T=1$ and to $T=3$, while we let the strike K vary between $0.25$ and $1.75$.

When adding negative jumps to the Heston model, one can see that the variance ratio diminishes. When the options are out of the money, however, it is still sufficiently high to be interesting to use in applications. In Figure 2, we fix the maturity to $T=1.5$ and again plot the empirical variance of the estimator as a function of ${\bar\theta^*}$ for the Heston model with jumps. The method provides ${\bar\theta^*} = -0.312$ as the candidate asymptotically optimal measure change, which, as in the continuous case, is very close to the optimal one.

Table 5: The variance ratio as a function of the maturity for the European put option in the Heston model with jumps.

Table 6: The variance ratio as a function of the strike for the European put option with maturity $T=1$ in the Heston model with jumps.

Table 7: The variance ratio as a function of the strike for the European put option with maturity $T=3$ in the Heston model with jumps.

Figure 2: The variance of the Monte Carlo estimator as a function of ${\bar\theta^*}$ for the Heston model with jumps.

Acknowledgements

We are grateful to the anonymous reviewer for insightful comments on an earlier version of this paper. The research of P. Tankov was supported by the FIME Research Initiative.

References

Barndorff-Nielsen, O. E. (1997). Processes of normal inverse Gaussian type. Finance Stoch. 2, 41–68.CrossRef Google Scholar

Barndorff-Nielsen, O. E. and Shephard, N. (2001). Non-Gaussian Ornstein–Uhlenbeck-Based models and some of their uses in financial economics. J. R. Statist. Soc. B [Statist. Methodology] 63, 167–241.CrossRef Google Scholar

Bates, D. S. (1996). Jumps and stochastic volatility: exchange rate processes implicit in Deutsche mark options. Rev. Financial Studies 9, 69–107.CrossRef Google Scholar

Dembo, A. and Zeitouni, O. (1998). Large Deviations Techniques and Applications, 2nd edn. Springer, Berlin, Heidelberg.CrossRef Google Scholar

Duffie, D., Filipovic, D. and Schachermayer, W. (2003). Affine processes and applications in finance. Ann. Appl. Prob. 13, 984–1053.Google Scholar

Dupuis, P. and Wang, H. (2004). Importance sampling, large deviations, and differential games. Stochastics 76, 481–508.Google Scholar

Dupuis, P. and Wang, H. (2007). Subsolutions of an Isaacs equation and efficient schemes for importance sampling. Math. Operat. Res. 32, 723–757.CrossRef Google Scholar

Ekeland, I. and Temam, R. (1999) Convex Analysis and Variational Problems. Society for Industrial and Applied Mathematics, Philadelphia, PA.CrossRef Google Scholar

Genin, A. and Tankov, P. (2020). Optimal importance sampling for Lévy processes. Stoch. Process. Appl. 130, 20–46.CrossRef Google Scholar

Glasserman, P., Heidelberger, P. and Shahabuddin, P. (1999). Asymptotically optimal importance sampling and stratification for pricing path-dependent options. Math. Finance 9, 117–152.CrossRef Google Scholar

Guasoni, P. and Robertson, S. (2008). Optimal importance sampling with explicit formulas in continuous time. Finance Stoch. 12, 1–19.CrossRef Google Scholar

Heston, S. (1993). A closed-form solutions for options with stochastic volatility with applications to bond and currency options. Rev. Financial Studies 6, 327–343.CrossRef Google Scholar

Jacquier, A., Keller-Ressel, M. and Mijatović, A. (2013). Large deviations and stochastic volatility with jumps: asymptotic implied volatility for affine models. Stochastics 85, 321–345.CrossRef Google Scholar

Keller-Ressel, M. (2011). Moment explosions and long-term behavior of affine stochastic volatility models. Math. Finance 21, 73–98.CrossRef Google Scholar

Léonard, C. (2000). Large deviations for Poisson random measures and processes with independent increments. Stoch. Process. Appl. 85, 93–121.CrossRef Google Scholar

Robertson, S. (2010). Sample path large deviations and optimal importance sampling for stochastic volatility models. Stoch. Process. Appl. 120, 66–83.CrossRef Google Scholar