1. Introduction
Let $(\Omega, {\mathcal{F}}, {\mathbb{P}})$ be a complete probability space carrying the standard Brownian motion $B= (B_t)_{t \ge 0}$ and assume that $({\mathcal{F}}_t)_{t\ge 0}$ is the augmented natural filtration. Let (Y, Z) be the solution of the forward-backward stochastic differential equation (FBSDE)
Let $(Y^n,Z^n)$ be the solution of the FBSDE if the Brownian motion B is replaced by a scaled random walk $B^n$ given by
where $h= \tfrac{T}{n}$ and $(\varepsilon_i)_{i=1,2, \dots}$ is a sequence of independent and identically distributed (i.i.d.) Rademacher random variables. Then $(Y^n,Z^n)$ solves the discretized FBSDE
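The displays defining the scaled random walk are not reproduced in this excerpt; as a sketch, and consistently with the Skorokhod embedding of Assumption 3.1 below, the standard construction reads

```latex
B^n_t \;=\; \sqrt{h}\,\sum_{i=1}^{\lfloor t/h\rfloor} \varepsilon_i ,
\qquad t \in [0,T], \qquad h=\tfrac{T}{n} .
```

In particular, $B^n$ has independent increments $\pm\sqrt{h}$ on the grid $t_k = kh$, and by Donsker's theorem it converges weakly to B.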
Many authors have investigated the approximation of backward stochastic differential equations (BSDEs) using random walks, analytically as well as numerically (see, for example, [Reference Briand, Delyon and Mémin7], [Reference Jańczak-Borkowska26], [Reference Ma, Protter, San Martín and Torres29], [Reference Martínez, San Martín and Torres31], [Reference Mémin, Peng and Xu32], [Reference Peng and Xu33], [Reference Cheridito and Stadje16]). In 2001, Briand et al. [Reference Briand, Delyon and Mémin7] showed weak convergence of $(Y^n,Z^n)$ to (Y, Z) for a Lipschitz continuous generator f and a terminal condition in $L_2.$ The rate of convergence of this method remained an open problem. Bouchard and Touzi [Reference Bouchard and Touzi6] and Zhang [Reference Zhang42] proposed, instead of random walks, an approach based on the dynamic programming equation, for which they established a rate of convergence. However, this approach requires the computation of conditional expectations, and various methods to approximate these conditional expectations have been developed ([Reference Gobet, Lemor and Warin23], [Reference Crisan, Manolarakis and Touzi17], [Reference Chassagneux and Garcia Trillos12]). Forward methods have also been introduced to approximate (1): a branching diffusion method ([Reference Henry-Labordère, Tan and Touzi24]), a multilevel Picard approximation ([Reference Weinan, Hutzenthaler, Jentzen and Kruse39]), and Wiener chaos expansion ([Reference Briand and Labart8]).
Many extensions of (1) have been considered, among them schemes for reflected BSDEs ([Reference Bally and Pagès3], [Reference Chassagneux and Richou14]), high order schemes ([Reference Chassagneux9], [Reference Chassagneux and Crisan10]), fully-coupled BSDEs ([Reference Delarue and Menozzi18], [Reference Bender and Zhang5]), quadratic BSDEs ([Reference Chassagneux and Richou13]), BSDEs with jumps ([Reference Geiss and Labart21]), and McKean–Vlasov BSDEs ([Reference Alanko1], [Reference Chaudru de Raynal and Garcia Trillos15], [Reference Chassagneux, Crisan and Delarue11]).
The aim of this paper is to study the rate of convergence in $L_2$ of $(Y^n_t,Z^n_t)$ to $(Y_t,Z_t)$ when X satisfies (1). For this, we generate the random walk $B^n$ from the Brownian motion B by Skorokhod embedding. In this case the $L_p$-convergence of $B^n$ to B is of order $h^{\frac{1}{4}}$ for any $p>0.$ The special case $X=B$ has already been studied in [Reference Geiss, Labart and Luoto22], assuming a locally $\alpha$-Hölder continuous terminal function g and a Lipschitz continuous generator. The rate of convergence obtained there is of order $h^{\frac{\alpha}{4}}$ for the $L_2$-norm of $Y^n_t-Y_t,$ and of order $\tfrac{h^{\frac{\alpha}{4}}}{\sqrt{T-t}}$ for the $L_2$-norm of $Z^n_t-Z_t.$
In the present paper, where we assume that X is a solution of the stochastic differential equation (SDE) in (1), rather strong conditions on the smoothness and boundedness of f and g, and also of b and $\sigma$, are needed. In Theorem 3.1, the main result of the paper, we show that the rate of convergence of $(Y^n_t,Z^n_t)$ to $(Y_t,Z_t)$ in $L_2$ is of order $h^{\frac{1}{4}\wedge \frac{\alpha}{2}}$ provided that gʹʹ is locally $\alpha$-Hölder continuous. To the best of our knowledge, these are the first cases in which a convergence rate for the approximation of FBSDEs by random walks has been obtained.
Remark 1.1. For the diffusion setting—in contrast to the case $X=B$—we can derive the rate of convergence of $(Y^n_t,Z^n_t)$ to $(Y_t,Z_t)$ in $L_2$ only under strong smoothness conditions on the coefficients, which include the requirement that gʹʹ is locally $\alpha$-Hölder continuous (see Assumption 2.3 below). These requirements appear to be necessary. This becomes visible in Subsection 2.2.2, where we introduce a discretized Malliavin weight to obtain a representation $\hat Z^n$ for $Z^n.$ While it holds that $\hat Z^n =Z^n$ when $X=B,$ in our case $\hat Z^n$ does not coincide with $Z^n.$ However, one can show that the difference $\hat Z^n_t -Z^n_t$ converges to 0 in $L_2$ as $n \to \infty$ using a Hölder continuity property (see (63) in Remark 4.1) for the space derivative of the generator in (3). For this Hölder continuity property to hold, one needs enough smoothness in space from the solution $u^n$ to the finite difference equation associated to the discretized FBSDE (3). Provided that Assumption 2.3 holds, we show the smoothness properties for $u^n$ in Proposition 4.2, applying methods known for Lévy driven BSDEs.
The paper is organized as follows. Section 2 contains the setting, the main assumptions, and the approximate representation $\hat{Z}^n$ of $Z^n.$ Our main results about the approximation rate for the case of no generator (i.e. $f=0$) and for the general case are in Section 3. One can see that, in contrast to what is known for time discretization schemes, for random walk schemes the Lipschitz generator seems to cause more difficulties than the terminal condition: while in the case $f=0$ we need that gʹ is locally $\alpha$-Hölder continuous, in the case $f\neq0$ this property is required for $g^{\prime\prime}.$ In Section 4 we recall some needed facts about Malliavin weights, the regularity of solutions to BSDEs, and properties of the associated partial differential equations (PDEs). Finally, we sketch how to prove growth and smoothness properties of solutions to the finite difference equation associated to the discretized FBSDE. Section 5 contains technical results which mainly arise from the fact that the construction of the random walk by Skorokhod embedding forces us to compare our processes on different ‘timelines’: one coming from the stopping times of the Skorokhod embedding, the other from the equidistant deterministic times due to the quadratic variation process $[B^n].$
2. Preliminaries
2.1. The SDE and its approximation scheme
We introduce
and its discretized counterpart
where $(\varepsilon_i)_{i=1,2, \dots}$ is a sequence of i.i.d. Rademacher random variables. Letting
it follows that the associated discrete-time random walk $(B^n_{t_k})_{k=0}^n$ is $(\mathcal{G}_k)_{k=0}^n$-adapted. Recall (2) and $h= \tfrac{T}{n}.$ If we extend the sequence $(X^n_{t_k})_{k\ge0}$ to a continuous-time process by setting $X^n_t \coloneqq X^n_{t_k} $ for $t \in [t_k, t_{k+1}),$ then it solves the forward SDE in (3).
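As an illustration, the forward scheme can be simulated directly. The sketch below assumes an Euler-type recursion driven by Rademacher signs; since the exact display of (3) is not reproduced in this excerpt, the placement of the time arguments is an assumption, and the helper name `simulate_Xn` is ours.

```python
import numpy as np

def simulate_Xn(b, sigma, x0, T, n, rng):
    """One path of a random-walk Euler scheme (assumed form of (3)):
    X^n_{t_{k+1}} = X^n_{t_k} + b(t_k, X^n_{t_k}) h
                  + sigma(t_k, X^n_{t_k}) sqrt(h) eps_{k+1}."""
    h = T / n
    eps = rng.choice([-1.0, 1.0], size=n)      # i.i.d. Rademacher signs
    X = np.empty(n + 1)
    X[0] = x0
    for k in range(n):
        X[k + 1] = (X[k] + b(k * h, X[k]) * h
                    + sigma(k * h, X[k]) * np.sqrt(h) * eps[k])
    # return the scheme path and the random walk B^n at t_1, ..., t_n
    return X, np.sqrt(h) * np.cumsum(eps)

rng = np.random.default_rng(1)
# with b = 0 and sigma = 1 the scheme reduces to X^n = x0 + B^n
X, Bn = simulate_Xn(lambda t, x: 0.0, lambda t, x: 1.0, 0.0, 1.0, 100, rng)
```

The piecewise-constant extension in time used in the text corresponds to evaluating X at the largest grid point below t.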
We formulate our first assumptions. Assumption 2.1(ii) will not be used explicitly for our estimates, but it is required for Theorem 4.1 below.
Assumption 2.1. (i) $b, \sigma \in C_b^{0,2}([0,T]\times {\mathbb{R}}),$ in the sense that the derivatives of order $k=0,1,2$ with respect to the space variable are continuous and bounded on $[0,T]\times {\mathbb{R}}$.
(ii) The first and second derivatives of b and $\sigma$ with respect to the space variable are assumed to be $\gamma$-Hölder continuous (for some $\gamma \in (0,1],$ with respect to the parabolic metric $d((t,x),(\bar t, \bar x))=(|t- \bar t| + |x- \bar x|^2)^{\frac{1}{2}})$ on all compact subsets of $[0, T ] \times {\mathbb{R}}$.
(iii) $b, \sigma$ are ${\frac{1}{2}}$-Hölder continuous in time, uniformly in space.
(iv) $\sigma(t,x) \ge \delta >0$ for all (t, x).
Assumption 2.2. (i) g is locally Hölder continuous with order $\alpha \in (0,1]$ and polynomially bounded in the following sense: there exist $p_0 \ge 0$ and $C_g>0$ such that
(6)\begin{align} \forall (x, \bar x) \in {\mathbb{R}}^2, \quad |g(x)-g( \bar x)|\le C_g(1+|x|^{p_0}+ | \bar x|^{p_0})|x- \bar x|^{\alpha}.\end{align}
(ii) The function $(t,x,y,z) \mapsto f (t,x,y,z)$ on $[0,T] \times {\mathbb{R}}^3$ satisfies
(7)\begin{eqnarray} |f(t,x,y,z) - f(\bar t, \bar x, \bar y, \bar z)| \le L_f\Big(\sqrt{|t- \bar t|} + |x- \bar x| + |y- \bar y|+|z- \bar z|\Big). \end{eqnarray}
Notice that (6) implies
for some $K>0.$ From the continuity of f we conclude that
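The two omitted bounds are presumably the standard growth consequences of (6) and (7): taking $\bar x = 0$ in (6), and combining (7) with the value of f at the origin, gives, as a sketch,

```latex
|g(x)| \le K\big(1+|x|^{p_0+\alpha}\big),
\qquad
|f(t,x,y,z)| \le K_f\big(1+|x|+|y|+|z|\big),
```

with $K = K(C_g, g(0))$ and $K_f = K_f\big(L_f, \sup_{t\in[0,T]}|f(t,0,0,0)|\big)$; the constant $K_f$ reappears in Theorem 4.1 below.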
Notation:
• $\|\cdot\|_p\coloneqq \|\cdot\|_{L_p({\mathbb{P}})}$ for $p\ge 1$. For $p=2$ we write simply $\|\cdot\|$.
• If a is a function, C(a) represents a generic constant which depends on a and possibly also on its derivatives.
• ${\mathbb{E}}_{0,x}\coloneqq {\mathbb{E}}(\cdot|X_0=x)$.
• Let $\phi$ be a $C^{0,1}([0,T]\times {\mathbb{R}})$ function. Then $\phi_x$ denotes $\partial_x \phi$, the partial derivative of $\phi$ with respect to x.
2.2. The FBSDE and its approximation scheme
Recall the FBSDE (1) and its approximation (3). The backward equation in (3) can equivalently be written in the form
if one puts $X^n_r\coloneqq X^n_{t_m}$, $Y^n_r\coloneqq Y^n_{t_m}$, and $Z_r^n \coloneqq {Z}_{t_m}^n$ for $r \in [t_m, t_{m+1})$.
Remark 2.1. Equations (3) and (9) do not contain any martingale orthogonal to the random walk $B^n$, since we are in a special case where the orthogonal martingale is zero (see [Reference Briand, Delyon and Mémin7, p. 3] or [Reference Privault34, Proposition 1.7.5]). Indeed, for the symmetric simple random walk $B^n$ the predictable representation property holds; i.e. for any $\mathcal{G}_n$-measurable (see (5)) random variable $\xi= F(\varepsilon_1, \dots, \varepsilon_n)$ there exists a representation
where $c \in {\mathbb{R}}$ and $h_m$ is $\mathcal{G}_{m-1}$-measurable for $m=1,\ldots,n$. To see this, consider
Put $c= {\mathbb{E}} [F(\varepsilon_1, \ldots, \varepsilon_n)].$ Our aim is to determine a $\mathcal{G}_{m-1}$-measurable $h_m$ such that
We define
By the tower property it holds that
hence
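In sketch form (the displays above are omitted in this excerpt), the computation can be reconstructed as follows. Writing ${\mathbb{E}}[F(\varepsilon_1,\dots,\varepsilon_n)\,|\,\mathcal{G}_m] = \phi_m(\varepsilon_1,\dots,\varepsilon_m)$, the tower property gives $\phi_{m-1} = \tfrac12\big(\phi_m(\,\cdot\,,+1)+\phi_m(\,\cdot\,,-1)\big)$, and one may take

```latex
h_m \;=\; \frac{\phi_m(\varepsilon_1,\dots,\varepsilon_{m-1},+1)
              - \phi_m(\varepsilon_1,\dots,\varepsilon_{m-1},-1)}{2\sqrt{h}} ,
```

since then $\phi_m - \phi_{m-1} = h_m\sqrt{h}\,\varepsilon_m = h_m\,(B^n_{t_m}-B^n_{t_{m-1}})$, and summing over $m=1,\dots,n$ yields $\xi = c + \sum_{m=1}^n h_m\,(B^n_{t_m}-B^n_{t_{m-1}})$.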
One can derive an equation for $Z^n = (Z^n_{t_k})_{k=0}^{n-1}$ if one multiplies (9) by ${\varepsilon}_{k+1}$ and takes the conditional expectation with respect to ${\mathcal{G}}_k$, so that
where ${\mathbb{E}}^{{\mathcal{G}}}_k\coloneqq {\mathbb{E}}(\cdot|{\mathcal{G}}_k)$.
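Assuming the one-step form of the backward scheme, $Y^n_{t_k} = Y^n_{t_{k+1}} + h f(t_{k+1},X^n_{t_k},Y^n_{t_k},Z^n_{t_k}) - \sqrt{h}\,Z^n_{t_k}\varepsilon_{k+1}$ (an assumption here, since the displays of (3) and (9) are not reproduced in this excerpt), multiplying by $\varepsilon_{k+1}$ and conditioning on ${\mathcal{G}}_k$ annihilates every ${\mathcal{G}}_k$-measurable term, leaving the sketch

```latex
Z^n_{t_k} \;=\; \frac{1}{\sqrt{h}}\,
{\mathbb{E}}^{{\mathcal{G}}}_{k}\big( Y^n_{t_{k+1}}\,\varepsilon_{k+1} \big),
```

which, after iterating the scheme up to the terminal time, is consistent with the representation (17) below.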
Remark 2.2. For n large enough, the BSDE (3) has a unique solution $({Y}^n,{Z}^n)$ (see [Reference Toldo36, Proposition 1.2]), and $({Y}^n_{t_k}, {Z}^n_{t_k})_{k=0}^{n-1}$ is adapted to the filtration $({\mathcal{G}}_k)_{k=0}^{n-1}.$
2.2.1. Representation for Z
We will use the following representation for Z, due to Ma and Zhang (see [Reference Ma and Zhang30, Theorem 4.2]):
where ${\mathbb{E}}_t \coloneqq {\mathbb{E}}(\cdot|{\mathcal{F}}_t),$ and for all $s \in (t,T]$, we have (cf. Lemma 4.1)
where $\nabla X = (\nabla X_s)_{s \in [0,T]}$ is the variational process; i.e., it solves
with $(X_s)_{s \in [0,T]}$ given in (1).
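For orientation, the omitted displays (12) and (13) are presumably of the standard Ma–Zhang form (a sketch reconstructed from [Reference Ma and Zhang30], not reproduced verbatim from the source): the variational process solves the linearized SDE, and the weight is the averaged stochastic integral

```latex
\nabla X_s = 1 + \int_0^s b_x(r,X_r)\,\nabla X_r\,dr
               + \int_0^s \sigma_x(r,X_r)\,\nabla X_r\,dB_r,
\qquad
N^t_s = \frac{1}{s-t}\int_t^s \frac{\nabla X_r}{\sigma(r,X_r)\,\nabla X_t}\,dB_r .
```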
Remark 2.3. In the following we will assume that gʹʹ exists. In such a case we have the following representation for Z:
2.2.2. Approximation for $Z^n$
In this section we state the discrete counterpart to (11), which, in the general case of a forward process X, does not coincide with $Z^n$ (given by (10)). In contrast to the continuous-time case, where the variational process and the Malliavin derivative are connected by $\tfrac{\nabla X_t}{\nabla X_s} = \tfrac{D_sX_t }{\sigma(s,X_s)}$ ($s \le t$), we cannot expect equality for the corresponding expressions if we use the discretized versions of the processes $(\nabla X_t)_t$ and $(D_s X_t)_{s\le t}$ introduced in (16). This counterpart $\hat{Z}^n$ to Z is a key tool in the proof of the convergence of $Z^n$ to Z. As we will see in the proof of Theorem 3.1, the study of $\|Z^n_{t_k}-Z_{t_k}\|$ goes through the study of $\|Z^n_{t_k}-\hat{Z}^n_{t_k}\|$ and $\|\hat{Z}^n_{t_k}-Z_{t_k}\|$.
Before defining the discretized versions of $(\nabla X_t)_t$ and $(D_s X_t)_{s \le t}$, we briefly introduce the discretized Malliavin derivative. We refer the reader to [Reference Bender and Parczewski4] for more information on this topic.
Definition 2.1. (Definition of $T_{_{m,+}}$, $T_{_{m,-}}$ and $\mathcal{D}^n_m$) For any function $F\,:\,\{-1,1 \}^n \to {\mathbb{R}}$, the mappings $T_{_{m,+}}$ and $T_{_{m,-}}$ are defined by
For any $\xi= F(\varepsilon_1, \dots, \varepsilon_n)$, the discretized Malliavin derivative is defined by
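Concretely, for $\xi = F(\varepsilon_1,\dots,\varepsilon_n)$ the standard definitions (a sketch of the omitted displays; compare [Reference Bender and Parczewski4]) are

```latex
T_{m,\pm}F(\varepsilon_1,\dots,\varepsilon_n)
 \coloneqq F(\varepsilon_1,\dots,\varepsilon_{m-1},\pm 1,\varepsilon_{m+1},\dots,\varepsilon_n),
\qquad
\mathcal{D}^n_m \xi \coloneqq \frac{T_{m,+}F - T_{m,-}F}{2\sqrt{h}}
 (\varepsilon_1,\dots,\varepsilon_n) .
```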
Definition 2.2. (Definition of $\phi_x^{(k,l)}$) Let $\phi$ be a $C^{0,1}([0,T]\times {\mathbb{R}})$ function. We define
If $\mathcal{D}^n_{k} X^n_{t_{\ell-1}}\neq 0$, the second ‘$\coloneqq $’ holds as an identity.
We are now able to define the discretized versions of $(\nabla X_t)_t$ and $(D_s X_t)_{s \le t}$.
Definition 2.3. (Discretized processes $(\nabla X^{n,t_k,x}_{t_m})_{m \in \{k,\dots,n\}}$ and $(\mathcal{D}^n_k X^{n}_{t_m})_{m \in \{k,\dots,n\}}$) For all m in $\{k,\dots,n\}$ we define
Remark 2.4. (i) Although $\nabla X^{n,t_k,X^n_{t_k}}_{t_m}$ is not equal to
\begin{equation*}\frac{\mathcal{D}^n_{k+1}X^n_{t_m} }{\sigma(t_{k+1},X^n_{t_k})},\end{equation*}we can show that the difference of these terms converges in $L_p$ (see Lemma 5.4).
(ii) With the notation introduced above, (10) can be rewritten as
(17)\begin{eqnarray} Z^n_{t_k} &=& {\mathbb{E}}^{{\mathcal{G}}}_{k} \big ( \mathcal{D}^n_{k+1} g(X^n_T) \big) +{\mathbb{E}}^{{\mathcal{G}}}_{k}\Bigg(h\sum_{m=k+1}^{n-1} \mathcal{D}^n_{k+1} f( t_{m+1},X^n_{t_m}, Y^n_{t_m} , Z^n_{t_m}) \Bigg). \end{eqnarray}
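The mechanism behind (17) is a discrete Clark–Ocone formula: the integrand $h_m$ in the predictable representation of Remark 2.1 equals ${\mathbb{E}}(\mathcal{D}^n_m \xi \,|\, \mathcal{G}_{m-1})$. For a toy example this can be checked exactly by enumeration; in the sketch below (all function names are ours) the identity is verified for an arbitrary F on $\{-1,1\}^4$.

```python
from itertools import product

n, h = 4, 0.25                    # toy case: n steps of size h = T/n with T = 1
sqrt_h = h ** 0.5

def F(eps):                       # an arbitrary function of (eps_1, ..., eps_n)
    return (sum(eps) ** 3 + eps[0] * eps[2]) / 7.0

def cond_exp(G, prefix):
    """E[G(eps_1,...,eps_n) | eps_1 = prefix[0], ..., eps_m = prefix[m-1]]:
    average G over all completions of the remaining signs."""
    vals = [G(prefix + tail) for tail in product((-1, 1), repeat=n - len(prefix))]
    return sum(vals) / len(vals)

def D(eps, m):                    # discretized Malliavin derivative of F at level m
    plus = eps[:m - 1] + (1,) + eps[m:]
    minus = eps[:m - 1] + (-1,) + eps[m:]
    return (F(plus) - F(minus)) / (2 * sqrt_h)

c = cond_exp(F, ())               # c = E[F]
max_err = 0.0
for eps in product((-1, 1), repeat=n):
    # predictable integrands h_m = E[D^n_m F | G_{m-1}]; increment is sqrt(h)*eps_m
    rhs = c + sum(cond_exp(lambda e, m=m: D(e, m), eps[:m - 1]) * sqrt_h * eps[m - 1]
                  for m in range(1, n + 1))
    max_err = max(max_err, abs(F(eps) - rhs))
```

Here `max_err` vanishes up to rounding, confirming the representation pathwise.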
In order to define the discrete counterpart to (11), we first define the discrete counterpart to $(N^t_s)_{s \in [t,T]}$ given in (12):
Notice that there is some constant $\widehat \kappa_2>0$ depending on $b,\sigma,T,\delta$ such that
Definition 2.4. (Discrete counterpart to (14).) Let the process $\hat{Z}^n = (\hat{Z}^n_{t_k})_{k=0}^{n-1}$ be defined by
Remark 2.5. In (20) the approximate expression ${\mathbb{E}}^{{\mathcal{G}}}_{k} ({ g(X^n_T) N^{n,t_k}_{t_n} \sigma(t_{k+1},X^n_{t_k}) })$ could also have been used, but since we will assume that gʹʹ exists, we work with the correct term.
The study of the convergence of ${\mathbb{E}}^{{\mathcal{G}}}_{0,x} | Z^n_{t_k} - \hat{Z}^n_{t_k}|^2$ to 0 requires stronger assumptions on the coefficients b, $\sigma$, f, and g.
Assumption 2.3. Assumptions 2.1 and 2.2 hold. Additionally, we assume that all first and second derivatives with respect to the variables x, y, z of b(t, x), $\sigma(t,x)$, and f(t, x, y, z) exist and are bounded Lipschitz functions with respect to these variables, uniformly in time. Moreover, gʹʹ satisfies (6).
Proposition 2.1. If Assumption 2.3 holds, then
where ${\mathbb{E}}^{{\mathcal{G}}}_{0,x}\coloneqq {\mathbb{E}}^{{\mathcal{G}}}(\cdot|X_0=x)$, the function $\hat \Psi$ is defined in (62) below, and $C_{2.1}$ depends on b, $\sigma$, f, g, T, $p_0$, and $\delta$.
Proof. According to [Reference Briand, Delyon and Mémin7, Proposition 5.1] one has the representations
where $u^n$ is the solution of the finite difference equation (44) with terminal condition $u^n(t_n,x)= g(x).$ Notice that by the definition of $\mathcal{D}^n_{m+1}$ in (15) the expression $\mathcal{D}^n_{m+1} u^n(t_{m+1},X^{n}_{t_{m+1}})$ depends in fact on $X^{n}_{t_m}.$ Hence we can put
From (20) and (17) we conclude the following (we use ${\mathbb{E}}\coloneqq {\mathbb{E}}^{{\mathcal{G}}}_{0,x}$ for $\|\cdot \|$):
With the notation introduced in Definition 2.2 applied to $F^n$,
For $A_1$ we use Definition 2.2 again and exploit the fact that
is locally $\alpha$-Hölder continuous according to (63). By Hölder’s inequality and Lemma 5.4 Parts (i) and (iii),
For the estimate of $A_2$ we notice that by our assumptions the $L_4$-norm of $F^{n,(\ell,m+1)}_x $ is bounded by $C \Psi^2(x),$ so that it suffices to estimate
The second expression on the right-hand side of (22) is bounded by $C(b,\sigma,T,\delta)h^{\frac{1}{2}}$ as a consequence of Lemma 5.4 Parts (ii) and (iii). To show that the first expression is also bounded by $C(b,\sigma,T,\delta)h^{\frac{1}{2}}$, we rewrite it using (16) and get
We take the $L_4$-norm of (23) and apply the Burkholder–Davis–Gundy (BDG) inequality and Hölder’s inequality. The second term on the right-hand side of (23) will be used for Gronwall’s lemma, while the first and last terms can be bounded by $ C(b,\sigma,T)h^{\frac{1}{2}},$ using Lemma 5.4(iii). For the last term we also use the Lipschitz continuity of $b_x$ and $\sigma_x$ in space and Lemma 5.4(i).
3. Main results
In order to compute the mean square distance between the solution to (1) and the solution to (3), we construct the random walk $B^n$ from the Brownian motion B by Skorokhod embedding. Let
Then $(B_{\tau_{k}} -B_{\tau_{k-1}})_{k=1}^\infty$ is a sequence of i.i.d. random variables with
which means that $\sqrt{h} {\varepsilon}_k \stackrel{d}{=} B_{\tau_{k}}-B_{\tau_{k-1}}\!.$ We will denote by ${\mathbb{E}}_{\tau_k}\!$ the conditional expectation with respect to ${\mathcal{F}}_{\tau_k} \coloneqq {\mathcal{G}}_k.$ In this case we also use the notation ${\mathcal{X}}_{\tau_k}\coloneqq X^n_{t_k}$ for all $k =0, \dots,n,$ so that (4) turns into
Assumption 3.1. We assume that the random walk $B^n$ in (3) is given by
where the $\tau_k$, $k =1,\ldots,n$, are taken from (24).
Remark 3.1. Note that for $p>0$ there exists a $C(p) >0$ such that for all $k =1, \dots,n$ it holds that
The upper estimate is given in Lemma 5.1. For $p\in [4,\infty)$ the lower estimate follows from [Reference Ankirchner, Kruse and Urusov2, Proposition 5.3]. We get the lower estimate for $p\in (0,4)$ by choosing $0<\theta <1$ and $0<p<p_1$ such that $ \frac{1}{4} = \frac{1-\theta}{p} + \frac{\theta}{ p_1}.$ Then it holds by the log-convexity of $L_p$ norms (see for example [Reference Tao35, Lemma 1.11.5]) that
Since for $t\in [t_k,t_{k+1})$ it holds that $B^n_t= B_{\tau_k}$ and $\|B_t -B_{t_k} \|_p \le C(p) h^\frac{1}{2},$ we have for any $p>0$ that
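The embedding (24) is easy to simulate. The following Monte Carlo sketch (step size, sample count, and the function name `skorokhod_step` are our arbitrary choices) approximates one embedding step on a fine Euler grid and illustrates that the embedded increments are $\pm\sqrt{h}$ with mean exit time close to h.

```python
import numpy as np

def skorokhod_step(rng, sqrt_h, dt=1e-5):
    """First time a Brownian path (Euler approximation with time step dt)
    leaves (-sqrt_h, sqrt_h); returns (exit time, sign of the increment)."""
    b, t = 0.0, 0.0
    while abs(b) < sqrt_h:
        b += np.sqrt(dt) * rng.standard_normal()
        t += dt
    return t, 1.0 if b > 0 else -1.0

rng = np.random.default_rng(0)
h = 0.01
draws = [skorokhod_step(rng, np.sqrt(h)) for _ in range(500)]
mean_tau = float(np.mean([t for t, _ in draws]))    # should be close to h
mean_sign = float(np.mean([s for _, s in draws]))   # should be close to 0
```

The grid discretization introduces a small overshoot bias, so `mean_tau` slightly exceeds h for finite dt.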
Proposition 3.1 states the convergence rate of $(Y^n_v,Z^n_v)$ to $(Y_v,Z_v)$ in $L_2$ when $f=0$, and Theorem 3.1 generalizes this result to any f which satisfies Assumption 2.3.
Proposition 3.1. Let Assumptions 2.1 and 3.1 hold. If $f =0$ and $g \in C^1$ is such that gʹ is a locally $\alpha$-Hölder continuous function in the sense of (6), then for all $0 \le v < T$, we have (for sufficiently large n) that
where $C^y_{3.1}= C(C_g, b, \sigma, T, p_0, \delta)$ and $C^z_{3.1}= C(C_{g'}, b, \sigma, T, p_0, \delta).$
Theorem 3.1. Let Assumptions 2.3 and 3.1 be satisfied. Then for all $v \in [0,T)$ and large enough n, we have
where $C_{3.1}= C(b,\sigma,f,g,T,p_0,\delta)$ and $\hat{\Psi}$ is given in (62).
Remark 3.2. As observed above, the filtration $\mathcal{G}_k$ coincides with $\mathcal{F}_{\tau_k}$, for all $k = 0,\dots,n$. The expectation ${\mathbb{E}}_{0,x}$ appearing in Proposition 3.1 and in Theorem 3.1 is defined on the probability space $(\Omega,\mathcal{F},\mathbb{P})$.
Remark 3.3. In order to avoid too much notation for the dependencies of the constants, if for example only g is mentioned and not $C_g,$ this means that the estimate might depend also on the bounds of the derivatives of g.
From (25) one can see that the convergence rates stated in Proposition 3.1 and Theorem 3.1 are the natural ones for this approach. The results are proved in the next two sections. In both proofs, we will use the following remark.
Remark 3.4. Since the process $(X_t)_{t \ge 0}$ is strong Markov, we can express conditional expectations with the help of an independent copy of B denoted by $\tilde B$. For example, ${\mathbb{E}}_{\tau_k} g(X^n_T) = \tilde {\mathbb{E}} g( \tilde {\mathcal{X}}^{\tau_k,{\mathcal{X}}_{\tau_k}}_{\tau_n})$ for $0 \leq k \leq n$, where
(we define $\tilde \tau_k\coloneqq 0$, $\tilde \tau_j\coloneqq \inf\{ t> \tilde \tau_{j-1}\,:\, |\tilde B_t- \tilde B_{\tilde \tau_{j-1}}| = \sqrt{h} \}$ for $j \ge 1$, and $\tau_n \coloneqq \tau_k +\tilde \tau_{n-k}$ for ${n\ge k}$). In fact, to represent the conditional expectations ${\mathbb{E}}_{t_k}$ and ${\mathbb{E}}_{\tau_k}$, we work here with $\tilde {\mathbb{E}}$ and the Brownian motions Bʹ and Bʹʹ, respectively, given by
3.1. Proof of Proposition 3.1: the approximation rates for the zero generator case
To shorten the notation, we use ${\mathbb{E}} \coloneqq {\mathbb{E}}_{0,x}.$ Let us first deal with the error of Y. If v belongs to $[t_k,t_{k+1})$ we have $Y^n_v= Y^n_{t_k}$. Then
Using Theorem 4.1 we bound $\|Y_v - Y_{t_k}\|$ by
(since $\alpha=1$ can be chosen when g is locally Lipschitz continuous). It remains to bound
By (6) and the Cauchy–Schwarz inequality (with $\Psi_1\coloneqq C_g(1+|\tilde X^{t_k,X_{t_k}}_{t_n}|^{p_0}+| \tilde {\mathcal{X}}^{\tau_k,{{\mathcal{X}}}_{\tau_k}}_{\tau_n}|^{p_0})$),
Finally, we get by Lemma 5.2(v) that
Let us now deal with the error of Z. We use $ \|Z_v - Z^n_v \| \le \|Z_v - Z_{t_k}\| + \|Z_{t_k} - Z^n_{t_k}\|$ and the representation
(see Theorem 4.2), where
For the first term we get by the assumption on g and Lemma 5.2 Parts (i) and (iii) that
We compute the second term using $Z^n_{t_k} $ as given in (17). Hence, with the notation from Definition 2.2,
We insert $\pm \tilde {\mathbb{E}} ( g_x^{(k+1,n+1)} \nabla \tilde X^{t_k,X_{t_k}}_{t_n})$ and get by the Cauchy–Schwarz inequality that
For the estimate of $ \tilde {\mathbb{E}} | \nabla \tilde X^{t_k,X_{t_k}}_{t_n} |^2$ we use Lemma 5.2. Since gʹ satisfies (6) we proceed with
where $\Psi_1\coloneqq C_{g'}(1+|\tilde X^{t_k,X_{t_k}}_{t_n}|^{p_0}+|\vartheta T_{_{k+1,+}} \tilde {\mathcal{X}}^{\tau_k,{\mathcal{X}}_{\tau_k}}_{\tau_n}+ (1-\vartheta) T_{_{k+1,-}}\tilde {\mathcal{X}}^{\tau_k,{\mathcal{X}}_{\tau_k}}_{\tau_n}|^{p_0}). $ For $\tilde {\mathbb{E}} \Psi_1^4$ and
we use Lemma 5.4 and Lemma 5.2(v). For the last term in (29) we notice that
By Lemma 5.2 we have ${\mathbb{E}}\tilde {\mathbb{E}} | \nabla \tilde X^{t_k,X_{t_k}}_{t_n}- \nabla \tilde {\mathcal{X}}^{\tau_k,{\mathcal{X}}_{\tau_k}}_{\tau_n}|^p \le C(b,\sigma,T,p) h^{\frac{p}{4}},$ and by Lemma 5.4,
Consequently, $ \|Z_{t_k} - Z^n_{t_k}\|^2 \le C(C_{g'}, b,\sigma,T,p_0,\delta) \Psi^2(x) h^\frac{\alpha}{2}.$
3.2. Proof of Theorem 3.1: the approximation rates for the general case
Let $u \,:\, [0,T)\times {\mathbb{R}} \to {\mathbb{R}}$ be the solution of the PDE (38) associated to (1). We use the representations $ Y_s = u(s, X_s)$ and $Z_s = \sigma(s, X_s)u_x(s, X_s)$ stated in Theorem 4.2 and define
where Proposition 3.1 provides the estimate for the terminal condition. We decompose the generator term as follows:
We use
and estimate the expressions on the right-hand side. For the function F defined in (30) we use Assumption 2.3 (which implies that (6) holds for $\alpha=1$) to derive by Theorem 4.2 and the mean value theorem that for $x_1, x_2 \in {\mathbb{R}}$ there exists $\xi \in [\min\{x_1,x_2\},\max\{x_1,x_2\}] $ such that
By (7), standard estimates on $(X_s),$ Theorem 4.1(i), and Proposition 4.1 for $p=2$, we immediately get
For the estimate of $d_2$ one exploits
and then uses (31) and Lemma 5.2(v). This gives
For $d_3$ we start with Jensen’s inequality and then continue similarly as above to get
and for the last term we get
This implies
where $C= C(L_f, C^y_{3.1}, C^y_{4.1}, C_{4.2}, c^{2,3}_{4.2},b,\sigma, T,p_0) = C(b,\sigma, f,g, T,p_0,\delta).$
For $\| Z_{t_k} -Z^n_{t_k}\|$ we use the representations (14) and (17), the approximation (20), and Proposition 2.1. Instead of $N^{n,t_k}_{t_n}$ we will use here the notation $N^{n,\tau_k}_{\tau_n}$ to indicate its measurability with respect to the filtration $({\mathcal{F}}_t)$. It holds that
For the terminal condition, Proposition 3.1 provides
We continue with the generator terms and use F defined in (30) to decompose the difference
where $s \in [t_m, t_{m+1})$. For ${\tt t}_1$ we use that ${\mathbb{E}}_{t_k}f(t_m, X_{t_k},Y_{t_k},Z_{t_k}) (N^{t_k}_s -N^{t_k}_{t_m}) =0,$ so that
As before, we rewrite the conditional expectations with the help of the independent copy $\tilde B.$ Then
and
We apply the conditional Hölder inequality, and from the estimates (37) and
we get
since for $0\le t <s \le T$ we have by Theorem 4.1 and Proposition 4.1 that
For the estimate of ${\tt t}_2$, Lemma 5.2, Lemma 5.3, (31), and (37) yield
For ${\tt t}_3$ we use the conditional Hölder inequality, (31), (19), and Lemma 5.2:
The term ${\tt t}_4$ can be estimated as follows:
Finally, for the remaining term of the estimate of $\| Z_{t_k} -Z^n_{t_k}\|, $ we use (35) and (37) to get
Consequently, from (33), (34), and the estimates for the remaining term and for ${\tt t}_1,\ldots,{\tt t}_4$, it follows that
Then we use (32) and the above estimate to get
Consequently, summarizing the dependencies, there is a $ C=C(b,\sigma,f,g,T,p_0,\delta)$ such that
By Theorem 4.1 (note that by Assumption 2.3 on g we have $\alpha=1$) it follows that
while Proposition 4.1 implies that
and hence we have
with $C_{3.1} = C_{3.1}(b,\sigma,f,g,T,p_0,\delta).$
4. Some properties of solutions to BSDEs and their associated PDEs
4.1. Malliavin weights
We use the SDE from (1) started in (t, x),
and recall the Malliavin weight and its properties from [Reference Geiss, Geiss and Gobet20, Subsection 1.1 and Remark 3].
Lemma 4.1. Let $H\,:\, {\mathbb{R}} \to {\mathbb{R}}$ be a polynomially bounded Borel function. If Assumption 2.1 holds and $X^{t,x}$ is given by (36), then setting
implies that $G \in C^{1,2}([0,T)\times {\mathbb{R}} ).$ Specifically, it holds for $0 \le t \le r < T$ that
where $({\mathcal{F}}^t_r)_{r\in [t,T]}$ is the augmented natural filtration of $(B^{t,0}_r)_{r \in [t,T]},$
and $\nabla X^{t,x}_s$ is given in (13). Moreover, for $ q \in (0, \infty)$ there exists a $\kappa_q>0$ such that
and we have
for $1<q,p < \infty$ with $\frac{1}{p} + \frac{1}{q} =1.$
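The omitted moment estimate for the weight is presumably the standard bound (a sketch, reconstructed rather than quoted):

```latex
\big({\mathbb{E}}\,|N^{t,x}_r|^q\big)^{\frac{1}{q}}
 \;\le\; \frac{\kappa_q}{\sqrt{r-t}},
\qquad 0\le t < r \le T ,
```

with $\kappa_q = \kappa_q(b,\sigma,T,\delta)$, matching the constant $\kappa_2$ that appears in Theorem 4.2.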
4.2. Regularity of solutions to BSDEs
The following result originates from [Reference Geiss, Geiss and Gobet20, Theorem 1], where path-dependent cases were also included. We formulate it only for our Markovian setting but use ${\mathbb{P}}_{t,x}$ since we are interested in an estimate for all $(t,x) \in [0,T) \times {\mathbb{R}}.$ A sketch of a proof of this formulation can be found in [Reference Geiss, Labart and Luoto22].
Theorem 4.1. Let Assumptions 2.1 and 2.2 hold. Then for any $p\in [2,\infty)$ the following assertions are true.
(i) There exists a constant $C^y_{4.1} >0$ such that for $0\le t < s \le T$ and $x\in {\mathbb{R}}$,
\begin{eqnarray*} \| Y_s - Y_t\|_{L_p({\mathbb{P}}_{t,x})} \le C^y_{4.1} \Psi(x) \left ( \int_t^s (T-r)^{\alpha -1}dr \right )^{\frac{1}{2}}.\end{eqnarray*}
(ii) There exists a constant $C^z_{4.1} >0$ such that for $0\le t < s<T$ and $x\in {\mathbb{R}},$
\begin{eqnarray*} \| Z_s - Z_t\|_{L_p({\mathbb{P}}_{t,x})} \le C^z_{4.1} \Psi(x) \left ( \int_t^s (T-r)^{\alpha -2}dr \right )^{\frac{1}{2}}.\end{eqnarray*}
The constants $ C^y_{4.1}$ and $C^z_{4.1}$ depend on $ (L_f, K_f, C_g,c^{1,2}_{4.2}, \kappa_q, b,\sigma, T, p_0,p)$, and $\Psi(x)$ is defined in (8).
4.3. Properties of the associated PDE
The theorem below collects properties of the solution to the PDE associated to the FBSDE (1). For a proof see [Reference Zhang43, Theorem 3.2], [Reference Zhang41], and [Reference Geiss, Labart and Luoto22, Theorem 5.4].
Theorem 4.2. Consider the FBSDE (1) and let Assumptions 2.1 and 2.2 hold. Then for the solution u of the associated PDE
we have the following:
(i) $Y_t=u(t,X_t)$ almost surely, where $u(t,x)={\mathbb{E}}_{t,x} \!\left (g(X_T)+\int_t^T\! f(r,X_r,Y_r,Z_r)dr \right )$, and $|u(t,x)|\le c^1_{4.2} \Psi(x)$ for $\Psi$ given in (8), where $c^{1}_{4.2}$ depends on $L_f$, $K_f$, $C_g$, T, and $p_0$, as well as on the bounds and Lipschitz constants of b and $\sigma$.
(ii) (a) $\partial_x u$ exists and is continuous in $[0,T)\times {\mathbb{R}}$.
(b) $Z^{t,x}_s= u_x(s,X_s^{t,x})\sigma(s,X_s^{t,x})$ almost surely.
(c)
\begin{equation*}|u_x(t,x)|\le \frac{ c^2_{4.2} \Psi(x)}{(T-t)^{\frac{1-\alpha}{2}}},\end{equation*}where $c^{2}_{4.2}$ depends on $L_f$, $K_f$, $C_g$, T, $p_0$, and $\kappa_2= \kappa_2(b,\sigma,T,\delta)$, as well as on the bounds and Lipschitz constants of b and $\sigma$, and hence $c^{2}_{4.2} = c^{2}_{4.2}(L_f, K_f, C_g,b,\sigma, T,p_0,\delta).$
(iii) (a) $\partial^2_x u$ exists and is continuous in $[0,T)\times {\mathbb{R}}$.
(b)
\begin{equation*}|\partial^2_x u(t,x)|\le \frac{ c^3_{4.2} \Psi(x)}{(T-t)^{1-\frac{\alpha}{2}}},\end{equation*}where $c^{3}_{4.2}$ depends on $L_f$, $C_g$, T, $p_0$, $\kappa_2= \kappa_2(b,\sigma,T,\delta)$, $C^y_{4.1}$, and $C^z_{4.1}$, as well as on the bounds and Lipschitz constants of b and $\sigma$, and hence $c^{3}_{4.2} = c^{3}_{4.2}(L_f, K_f, C_g,b,\sigma, T,p_0,\delta). $
Using Assumption 2.3, we are now in a position to improve the bound on $\| Z_s - Z_t\|_{L_p({\mathbb{P}}_{t,x})}$ given in Theorem 4.1.
Proposition 4.1. If Assumption 2.3 holds, then there exists a constant $C_{4.1} >0$ such that for $0\le t < s \le T$ and $x\in {\mathbb{R}},$
where $C_{4.1}$ depends on $c^{2,3}_{4.2}$, b, $\sigma$, f, g, T, $p_0$, and p, and hence $C_{4.1}= C_{4.1}(b,\sigma, f,g,T,p_0,p,\delta).$
Proof. From $Z^{t,x}_s= u_x(s,X_s^{t,x})\sigma(s,X_s^{t,x})$ and
we conclude that
It is well-known (see e.g. [Reference El Karoui, Peng and Quenez19]) that the solution $\nabla Y$ of the linear BSDE
can be represented as
where $\Theta_r \coloneqq (r,X_r,Y_r,Z_r)$, $\Gamma^s$ denotes the adjoint process given by
and
where $\tilde B$ denotes an independent copy of B. Notice that $\nabla X^{t,x}_t=1,$ so that
Then, by (39),
Since $(\nabla Y_s, \nabla Z_s)$ is the solution to the linear BSDE (40) with bounded $f_x, f_y, f_z,$ we have that $\|\nabla Y_t\|_{L_{2p}({\mathbb{P}}_{t,x})} \le C(b,\sigma,f,g,T,p).$ Obviously, $ \| X^{t,x}_s -x\|_{L_{2p}({\mathbb{P}}_{t,x})} \le C(b,\sigma,T,p) (s-t)^{\frac{1}{2}}.$ So it remains to show that
We intend to use (41) in the following. There is a certain degree of freedom in how to connect B and $\tilde B$ in order to compute conditional expectations. Here, unlike in (27), we define the processes
as driving Brownian motions for $\tfrac{\nabla Y_s}{\nabla X_s}$ and $\tfrac{\nabla Y_t}{\nabla X_t},$ respectively. This will especially simplify the estimate for $ \tilde {\mathbb{E}} |\tilde \Gamma^{s,X_s}_T -\tilde \Gamma^{t,x}_T|^ q $ below. From the above relations we get the following (with $X_s\coloneqq X^{t,x}_s $):
Since gʹ is Lipschitz continuous and of polynomial growth, we have
by Hölder’s inequality, the $L_q$-boundedness (for any $q >0$) of all the factors, and the estimates for $ \tilde X^{s, X_s}_T - \tilde X^{t,x}_T$ and $\nabla \tilde X^{s,X_s}_T - \nabla \tilde X^{t,x}_T $ from Lemma 5.2. For the $\Gamma$ differences we first apply the inequalities of Hölder and BDG:
Since $f_y$ and $f_z$ are bounded we have $\tilde {\mathbb{E}} | \tilde \Gamma^{s,X_s}_r|^q + \tilde {\mathbb{E}} |\tilde \Gamma^{t,x}_r |^q \le C(f,T,q). $ Similarly to (31), since $f_x, f_y,f_z$ are Lipschitz continuous with respect to the space variables,
so that Lemma 5.2 yields
The same holds for $|f_y(\tilde \Theta^{s,X_s}_r) - f_y(\tilde \Theta^{t,x}_r) |$ and $|f_z(\tilde \Theta^{s,X_s}_r) - f_z(\tilde \Theta^{t,x}_r) |.$ Applying these inequalities and Gronwall’s lemma, we arrive at
for $p> 0.$
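The Lipschitz splitting for the generator derivatives used above takes the following form (a sketch, where $L_f$ is assumed to denote the Lipschitz constant of the derivatives of f in the space variables):

```latex
|f_x(\tilde\Theta^{s,X_s}_r) - f_x(\tilde\Theta^{t,x}_r)|
\le L_f \bigl( |\tilde X^{s,X_s}_r - \tilde X^{t,x}_r|
 + |\tilde Y^{s,X_s}_r - \tilde Y^{t,x}_r|
 + |\tilde Z^{s,X_s}_r - \tilde Z^{t,x}_r| \bigr),
```

and the same bound holds with $f_y$ and $f_z$ in place of $f_x$, which is why Lemma 5.2 applies to each difference.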
For $J_2$ it is enough to realise that the integrand is bounded, which gives $J_2\le C (s-t)$. The estimate for $J_3$ follows similarly to that of $J_1.$
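For the reader's convenience, we record the standard form of the adjoint process used in the proof above (see [Reference El Karoui, Peng and Quenez19]); this is a sketch in generic notation and may differ from the displayed formula in minor details:

```latex
d\Gamma^s_r = \Gamma^s_r \bigl( f_y(\Theta_r)\, dr + f_z(\Theta_r)\, dB_r \bigr),
\qquad \Gamma^s_s = 1,
\quad\text{hence}\quad
\Gamma^s_r = \exp\!\Bigl( \int_s^r f_z(\Theta_u)\, dB_u
 + \int_s^r \bigl( f_y(\Theta_u) - \tfrac{1}{2} f_z^2(\Theta_u) \bigr)\, du \Bigr).
```

In particular, boundedness of $f_y$ and $f_z$ gives $\sup_r \|\Gamma^s_r\|_{L_q({\mathbb{P}})} \le C(f,T,q)$ for every $q>0$, as used for the $J$-terms; moreover, since $\nabla X^{t,x}_t = 1$, the representation identifies $u_x(t,x)$ with $\nabla Y^{t,x}_t$, whence $Z^{t,x}_t = \nabla Y^{t,x}_t\,\sigma(t,x)$.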
4.4. Properties of the solution to the finite difference equation
Recall the definition of $ \mathcal{D}^n_m$ given in (15). By (4),
so that
For the solution to the PDE (38), Theorem 4.2 exhibits the well-known smoothing property: u is differentiable on $[0,T)\times {\mathbb{R}}$ even though g is only Hölder continuous. For the solution $u^n$ to the finite difference equation, by contrast, the following proposition requires from g the same regularity as we want for $u^n.$
Proposition 4.2. Let Assumption 2.3 hold and assume that $u^n$ is a solution of
with terminal condition $u^n(t_n,x)= g(x).$ Then, for sufficiently small h, the map $x \mapsto u^n(t_m,x)$ is $C^2,$ and it holds that
and
uniformly in $ m=0,\dots,n-1$. The constants $C_{u^n\!,1}$, $C_{u^n\!,2}$, and $C_{u^n\!,3}$ depend on the bounds of f, g, b, $\sigma$, and their derivatives, and on T and $p_0$.
Proof. Step 1. From (44), since g is $C^2$ and $f_y$ is bounded, for sufficiently small h we conclude by induction (backwards in time) that $u^n_x(t_m,x)$ exists for $m=0,\ldots,n-1,$ and that
Similarly one can show that $u^n_{xx}(t_m,x)$ exists and solves the equation obtained by differentiating the previous one.
Step 2. As stated in the proof of Proposition 2.1, the finite difference equation (44) is associated with (9) in the sense that the representations (21) hold. We will use that $u^n(t_m,x) =Y_{t_m}^{n,t_m,x}$ and exploit the BSDE
in which we will drop the superscript $t_m,x$ from now on. For $u^n_x(t_m,x)$ we will consider
As in the proof of [Reference Ma and Zhang30, Theorem 3.1], the BSDE (47) can be derived from (46) as a limit of difference quotients with respect to x. Notice that the generator of (47) is random but has the same Lipschitz constant and linear growth bound as f. Assumption 2.3 allows us to find $p_0\ge 0$ and $K>0$ such that
In order to get estimates simultaneously for (46) and (47) we prove the following lemma.
Lemma 4.2. We fix n and assume a BSDE
with $\xi^n = g(X_T^{n,t_m,x}) $ or $\xi^n = g'(X_T^{n,t_m,x}) \partial_x X_T^{n,t_m,x} $, and $\textsf{X}_s\coloneqq X_s^{n,t_m,x}$ or $\textsf{X}_s\coloneqq \partial_x X_s^{n,t_m,x}$, such that $\textsf{f}:\Omega \times [0,T] \times {\mathbb{R}}^3 \to {\mathbb{R}} $ is measurable and satisfies
Then for any $p \ge 2,$
(i) \begin{equation*} {\mathbb{E}} |\textsf{Y}_{t_k}|^p + \frac{\gamma_p}{4} {\mathbb{E}} \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} |\textsf{Z}_{s-}|^2 d[B^n]_s \le C\Psi^p(x)\end{equation*}for $k=m,\ldots,n$ and some $\gamma_p>0$;
(ii) $ {\mathbb{E}} \sup_{t_m < s \le T}|\textsf{Y}_{s-}|^p \le C \Psi^p(x)$; and
(iii) $ {\mathbb{E}} \Big ( \int_{(t_m,T]} |\textsf{Z}_{s-}|^2 d[B^n]_s \Big)^{\frac{p}{2}} \le C\Psi^p(x),$
for some constant $ C=C(b,\sigma, f,g,T,p,p_0)$.
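Here $\Psi$ is the polynomial weight introduced earlier in the paper; consistently with the Gronwall bounds derived in the proof below, it may be read as

```latex
\Psi(x) = 1 + |x|^{p_0+1},
```

so that $\Psi^p(x)$ matches the bounds $C(1+|x|^{(p_0+1)p})$ appearing in parts (i)–(iii). (This reading is our assumption; the precise definition is given earlier in the paper.)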
Proof.
(i) By Itô’s formula (see [Reference Jacod and Shiryaev25, Theorem 4.57]) we get for $p\ge 2$ that
(50)\begin{align} |\textsf{Y}_{t_k}|^p& = | \xi^n|^p - p \int_{(t_k,T]} \textsf{Y}_{s-} |\textsf{Y}_{s-} |^{p-2} \textsf{Z}_{s-} dB^n_s \notag \\[3pt] &\quad + \ p \int_{(t_k,T]} \textsf{Y}_{s-} |\textsf{Y}_{s-} |^{p-2} \textsf{f}(s,\textsf{X}_{s-},\textsf{Y}_{s-}, \textsf{Z}_{s-})d[B^n]_s \notag\\[3pt] &\quad - \sum_{s \in (t_k,T]} [|\textsf{Y}_s|^p - |\textsf{Y}_{s-}|^p - p \textsf{Y}_{s-} |\textsf{Y}_{s-} |^{p-2} (\textsf{Y}_s-\textsf{Y}_{s-})].\end{align}Following the proof of [Reference Kruse and Popier27, Proposition 2] (which is carried out there in the Lévy process setting but can be done also for martingales with jumps, like $B^n$) we can use the estimate\begin{eqnarray*} - \!\sum_{s \in (t_k,T]} [ |\textsf{Y}_s|^p \,{-}\, |\textsf{Y}_{s-}|^p \,{-}\, p \textsf{Y}_{s-} |\textsf{Y}_{s-} |^{p-2} (\textsf{Y}_s\,{-}\,\textsf{Y}_{s-})]\le {-} \gamma_p \!\sum_{s \in (t_k,T]} |\textsf{Y}_{s-}|^{p-2} (\textsf{Y}_s{-}\textsf{Y}_{s-})^2,\end{eqnarray*}where $\gamma_p >0$ is computed in [Reference Yao40, Lemma A4]. 
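The estimate just quoted rests on a convexity inequality of the following form, recorded here as a sketch (the constant $\gamma_p$ is computed in [Reference Yao40, Lemma A4]): for $p\ge 2$ there is $\gamma_p>0$ such that for all $a,b \in {\mathbb{R}}$,

```latex
|b|^p - |a|^p - p\,a\,|a|^{p-2}(b-a) \;\ge\; \gamma_p\,|a|^{p-2}\,(b-a)^2 .
```

Applied with $a = \textsf{Y}_{s-}$ and $b = \textsf{Y}_s$ and summed over the jump times $s \in (t_k,T]$, this gives the displayed bound; for $p=2$ one may take $\gamma_2 = 1$, since both sides then equal $(b-a)^2$.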
Since\begin{equation*} \textsf{Y}_{t_{\ell+1}} - \textsf{Y}_{{t_{\ell+1}}-}=\textsf{f}(t_{\ell+1},\textsf{X}_{t_\ell},\textsf{Y}_{t_\ell}, \textsf{Z}_{t_\ell})h- \textsf{Z}_{t_\ell} \sqrt{h} {\varepsilon}_{\ell+1}\end{equation*}we have\begin{align*}& - \sum_{s \in (t_k,T]} [ |\textsf{Y}_s|^p - |\textsf{Y}_{s-}|^p - p \textsf{Y}_{s-} |\textsf{Y}_{s-} |^{p-2} (\textsf{Y}_s-\textsf{Y}_{s-})] \\[3pt] &\quad\le - \gamma_p \,\sum_{\ell=k}^{n-1} |\textsf{Y}_{t_\ell} |^{p-2} \, \Big (\textsf{f}(t_{\ell+1},\textsf{X}_{t_\ell},\textsf{Y}_{t_\ell}, \textsf{Z}_{t_\ell})h - \textsf{Z}_{t_\ell} \sqrt{h} {\varepsilon}_{\ell+1} \Big)^2 \\[3pt] &\quad= - \gamma_p \, h \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} \, \textsf{f}^2(s,\textsf{X}_{s-},\textsf{Y}_{s-},\textsf{Z}_{s-})d[B^n]_s - \gamma_p \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} |\textsf{Z}_{s-}|^2 d[B^n]_s \\[3pt] &\qquad + 2 \gamma_p \, \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} \, \textsf{f}(s,\textsf{X}_{s-},\textsf{Y}_{s-}, \textsf{Z}_{s-}) \textsf{Z}_{s-}(B^n_s-B^n_{s-}) d[B^n]_s.\end{align*}Hence we get from (50) that\begin{align*} |\textsf{Y}_{t_k}|^p&\le | \xi^n|^p- p \int_{(t_k,T]} \textsf{Y}_{s-} |\textsf{Y}_{s-} |^{p-2} \textsf{Z}_{s-} dB^n_s \\[3pt] &\quad + p\int_{(t_k,T]} \textsf{Y}_{s-} |\textsf{Y}_{s-} |^{p-2} \, \textsf{f}(s,\textsf{X}_{s-},\textsf{Y}_{s-}, \textsf{Z}_{s-})d[B^n]_s \notag\\[3pt] &\quad - \gamma_p \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} \, |\textsf{Z}_{s-}|^2 d[B^n]_s \\[3pt] &\quad + 2 \gamma_p \, \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} \, \textsf{f}(s,\textsf{X}_{s-},\textsf{Y}_{s-}, \textsf{Z}_{s-}) \textsf{Z}_{s-}(B^n_s-B^n_{s-}) d[B^n]_s.\end{align*}From Young’s inequality and (49) we conclude that there is a $ c'= c'(p,K_f,L_f, \gamma_p)>0$ such that\begin{eqnarray*} p|\textsf{Y}_{s-} |^{p-1}\, |\textsf{f}(s,\textsf{X}_{s-},\textsf{Y}_{s-},\textsf{Z}_{s-})|\le \tfrac{\gamma_p}{4} |\textsf{Y}_{s-} |^{p-2} \, |\textsf{Z}_{s-}|^2 + c'(1+| \textsf{X}_{s-}|^p + |\textsf{Y}_{s-} 
|^p),\end{eqnarray*}and for $ \sqrt{h} < \tfrac{1}{8 (L_f + K_f)}$ we find a $c^{\prime\prime} =c^{\prime\prime}(p, L_f, K_f, \gamma_p )>0$ such that\begin{align*} & 2 \gamma_p \, \sqrt{h} |\textsf{Y}_{s-} |^{p-2} \, |\textsf{f}(s,\textsf{X}_{s-},\textsf{Y}_{s-},\textsf{Z}_{s-})| | \textsf{Z}_{s-}| \le \tfrac{ \gamma_p}{4} |\textsf{Y}_{s-} |^{p-2} \, | \textsf{Z}_{s-}|^2\\[3pt] &\quad + c^{\prime\prime} \,(1+|\textsf{X}_{s-}|^p +|\textsf{Y}_{s-}|^p).\end{align*}Then for $c=c'+c^{\prime\prime}$ we have(51)\begin{align} |\textsf{Y}_{t_k}|^p&\le | \xi^n|^p - p \int_{(t_k,T]} \textsf{Y}_{s-} |\textsf{Y}_{s-} |^{p-2}\, \textsf{Z}_{s-} dB^n_s + c\int_{(t_k,T]} 1+| \textsf{X}_{s-}|^p + |\textsf{Y}_{s-} |^p d[B^n]_s \notag\\[3pt] &\quad - \tfrac{\gamma_p}{2} \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} \, |\textsf{Z}_{s-}|^2 d[B^n]_s.\end{align}By standard methods, approximating the terminal condition and the generator by bounded functions, it follows that for any $a>0$,\begin{equation*}{\mathbb{E}} \sup_{t_k \le s\le T} |\textsf{Y}_s|^a < \infty \quad \text{ and } \quad {\mathbb{E}} \bigg (\int_{(t_k,T]} |\textsf{Z}_{s-}|^2 d[B^n]_s \bigg)^{\frac{a}{2}} < \infty.\end{equation*}Hence $ \int_{(t_k,T]} \textsf{Y}_{s-} |\textsf{Y}_{s-} |^{p-2} \textsf{Z}_{s-} dB^n_s$ has expectation zero. 
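For the first of these Young-inequality bounds, assuming that (49) provides the growth bound $|\textsf{f}(s,\textsf{x},\textsf{y},\textsf{z})| \le K_f(1+|\textsf{x}|+|\textsf{y}|) + L_f|\textsf{z}|$ (our reading of (49); the displayed condition may be stated differently), one can argue as follows:

```latex
\begin{aligned}
p\,|\textsf{Y}_{s-}|^{p-1}\,|\textsf{f}(s,\textsf{X}_{s-},\textsf{Y}_{s-},\textsf{Z}_{s-})|
 &\le p K_f\,|\textsf{Y}_{s-}|^{p-1}\bigl(1+|\textsf{X}_{s-}|+|\textsf{Y}_{s-}|\bigr)
   + p L_f\,|\textsf{Y}_{s-}|^{p-1}|\textsf{Z}_{s-}| \\
 &\le c'\bigl(1+|\textsf{X}_{s-}|^p+|\textsf{Y}_{s-}|^p\bigr)
   + \frac{\gamma_p}{4}\,|\textsf{Y}_{s-}|^{p-2}|\textsf{Z}_{s-}|^{2}.
\end{aligned}
```

Here the $K_f$-term is handled by Young's inequality with exponents $\tfrac{p}{p-1}$ and $p$, and the $L_f$-term by $uv\le \tfrac{u^2}{2}+\tfrac{v^2}{2}$ with $u = 2\gamma_p^{-1/2}pL_f|\textsf{Y}_{s-}|^{p/2}$ and $v = \tfrac{\sqrt{\gamma_p}}{2}|\textsf{Y}_{s-}|^{(p-2)/2}|\textsf{Z}_{s-}|$, which contributes $\tfrac{2p^2L_f^2}{\gamma_p}|\textsf{Y}_{s-}|^p$ (absorbed into $c'$) plus $\tfrac{\gamma_p}{8}|\textsf{Y}_{s-}|^{p-2}|\textsf{Z}_{s-}|^2 \le \tfrac{\gamma_p}{4}|\textsf{Y}_{s-}|^{p-2}|\textsf{Z}_{s-}|^2$.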
Taking the expectation in (51) yields(52)\begin{align} &{\mathbb{E}} |\textsf{Y}_{t_k}|^p + \tfrac{\gamma_p}{2} {\mathbb{E}} \int_{(t_k,T]} |\textsf{Y}_{s-}|^{p-2} |\textsf{Z}_{s-}|^2 d[B^n]_s\le {\mathbb{E}}| \xi^n|^p \nonumber\\[3pt] &\quad + c {\mathbb{E}} \int_{(t_k,T]} 1+| \textsf{X}_{s-}|^p + |\textsf{Y}_{s-} |^p d[B^n]_s.\end{align}Since ${\mathbb{E}}| \xi^n|^p$ and $ {\mathbb{E}} \int_{(t_k,T]} 1+| \textsf{X}_{s-}|^p d[B^n]_s$ are polynomially bounded in x, Gronwall’s lemma gives\begin{eqnarray*}&& {} \| \textsf{Y}_{t_k}\|_p \le C(b,\sigma, f,g, T,p,p_0) (1 + |x|^{ p_0+1}), \quad k=m,\ldots,n,\end{eqnarray*}and inserting this into (52) yields\begin{align*} & \bigg ({\mathbb{E}} \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} |\textsf{Z}_{s-}|^2 d[B^n]_s \bigg )^\frac{1}{p} \le C(b,\sigma,f,g,T,p,p_0) (1 + |x|^{ p_0+1}),\\[3pt] &\quad k=m,\ldots,n-1.\end{align*}(ii) From (51) we derive by the inequality of BDG and Young’s inequality that for ${t_m \le t_k\,{\le}\,T}$,
\begin{align*}& {\mathbb{E}} \sup_{t_k < s \le T}|\textsf{Y}_{s-}|^p \\[3pt] &\quad \le {\mathbb{E}}| \xi^n|^p + C(p) {\mathbb{E}} \bigg ( \int_{(t_k,T]} |\textsf{Y}_{s-} |^{2p-2} |\textsf{Z}_{s-} |^2 d[B^n]_s\bigg)^{\frac{1}{2}}\\[3pt] &\qquad +c{\mathbb{E}} \int_{(t_k,T]}1+| \textsf{X}_{s-}|^p + |\textsf{Y}_{s-} |^p d[B^n]_s \notag\\ &\quad \le {\mathbb{E}}| \xi^n|^p + c {\mathbb{E}} \int_{(t_k,T]} 1+| \textsf{X}_{s-}|^p d[B^n]_s \notag\\[2pt] &\qquad + C(p) {\mathbb{E}} \left [ \sup_{t_k < s \le T} |\textsf{Y}_{s-} |^{\frac{p}{2}} \left ( \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} |\textsf{Z}_{s-} |^2 d[B^n]_s\right )^{\frac{1}{2}} \right ]\\[3pt] &\qquad + c {\mathbb{E}} \int_{(t_k,T]} |\textsf{Y}_{s-} |^pd[B^n]_s \\[3pt] &\quad \le {\mathbb{E}}| \xi^n|^p + c {\mathbb{E}} \int_{(t_k,T]} 1+| \textsf{X}_{s-}|^p d[B^n]_s + C(p) {\mathbb{E}} \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} |\textsf{Z}_{s-} |^2 d[B^n]_s \\[3pt] &\qquad + {\mathbb{E}} \sup_{t_k < s \le T} |\textsf{Y}_{s-} |^p( \tfrac{1}{4} + c (T-t_k)).\end{align*}We assume that h is sufficiently small so that we find a $t_k$ with $c (T-t_k) < \tfrac{1}{4}.$ We rearrange the inequality to have $ {\mathbb{E}} \sup_{t_k < s \le T} |\textsf{Y}_{s-} |^p$ on the left-hand side, and from (i) we conclude that\begin{align*} {\mathbb{E}} \sup_{t_k < s \le T}|\textsf{Y}_{s-}|^p &\le 2 {\mathbb{E}} | \xi^n|^p + 2c {\mathbb{E}} \int_{(t_k,T]} 1+| \textsf{X}_{s-}|^p d[B^n]_s \\[3pt] &\quad + 2C(p) {\mathbb{E}} \int_{(t_k,T]} |\textsf{Y}_{s-} |^{p-2} |\textsf{Z}_{s-} |^2 d[B^n]_s \\[3pt] &\le C(b,\sigma, f,g, T,p,p_0) (1+|x|^{(p_0+1)p}). \end{align*}Now we may repeat the above step for ${\mathbb{E}} \sup_{t_\ell < s \le t_k}|\textsf{Y}_{s-}|^p$ with $c (t_k-t_\ell) < \tfrac{1}{4}$ and $\xi^n=\textsf{Y}_T$ replaced by $\textsf{Y}_{t_k},$ and continue doing so until we eventually get the assertion (ii).(iii) We proceed from (48):
\begin{align*} &\sup_{k\le \ell\le n} \Big | \int_{(t_\ell,T]} \textsf{Z}_{s-} dB^n_s \Big|^p\\[3pt] &\quad \le C(p) \bigg ( | \xi^n|^p + \sup_{k\le \ell \le n} |\textsf{Y}_{t_\ell}|^p + \Big |\int_{(t_k,T]} | \textsf{f}(s,\textsf{X}_{s-},\textsf{Y}_{s-}, \textsf{Z}_{s-}) | \,d[B^n]_s\Big |^p \bigg),\end{align*}so that by (49) and the inequalities of BDG and Hölder we have that\begin{align*}& {\mathbb{E}} \bigg ( \int_{(t_k,T]} |\textsf{Z}_{s-}|^2 d[B^n]_s \bigg)^{\frac{p}{2}} \\[2pt] &\quad\le C(p) \bigg ( {\mathbb{E}} | \xi^n|^p + {\mathbb{E}} \sup_{k\le \ell\le n} | \textsf{Y}_{t_\ell}|^p \bigg) +C(p,L_f,K_f) {\mathbb{E}} \bigg(\int_{(t_k,T]} 1+|\textsf{X}_{s-}|+| \textsf{Y}_{s-}|d[B^n]_s\bigg)^p \\[2pt] &\qquad +C(p,L_f,K_f) (T-t_k)^{\frac{p}{2}} {\mathbb{E}} \left (\int_{(t_k,T]} |\textsf{Z}_{s-}|^2d[B^n]_s\right )^{\frac{p}{2}}. \end{align*}Hence for $C(p,L_f,K_f) (T-t_k)^{\frac{p}{2}} <{\frac{1}{2}}$ we derive from the assertion (ii) and from the growth properties of the other terms that(53)\begin{align} {\mathbb{E}} \bigg ( \int_{(t_k,T]} |{\textsf{Z}_{s-}}|^2 d[B^n]_s \bigg)^{\frac{p}{2}} \le C(b,\sigma, f,g,T,p,p_0) (1+|x|^{(p_0+1)p}).\end{align}Repeating this procedure eventually yields (iii).
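Throughout the proof, integrals with respect to $d[B^n]_s$ are finite sums, since the scaled walk has deterministic quadratic variation $[B^n]_{t_k} = kh$ (each jump contributes $(\sqrt{h}\,\varepsilon_i)^2 = h$). A small illustrative script, not part of the paper, checking this and the moments ${\mathbb{E}} B^n_T = 0$ and ${\mathbb{E}}|B^n_T|^2 = T$ by enumerating all sign patterns:

```python
import itertools
import math

def increments(signs, h):
    """Jumps sqrt(h) * eps_i of the scaled random walk for one sign pattern."""
    return [math.sqrt(h) * e for e in signs]

T, n = 1.0, 6
h = T / n

paths = list(itertools.product((-1, 1), repeat=n))

# Quadratic variation [B^n]_T = sum of squared jumps = n * h = T, path by path.
for signs in paths:
    qv = sum(dx * dx for dx in increments(signs, h))
    assert abs(qv - T) < 1e-12

# First and second moments of B^n_T over the 2^n equally likely paths.
mean = sum(sum(increments(s, h)) for s in paths) / len(paths)
second = sum(sum(increments(s, h)) ** 2 for s in paths) / len(paths)
assert abs(mean) < 1e-12 and abs(second - T) < 1e-12
```

The same bookkeeping is what turns, for example, $\int_{(t_k,T]} |\textsf{Z}_{s-}|^2\, d[B^n]_s$ into the Riemann sum $h\sum_{\ell=k}^{n-1} |\textsf{Z}_{t_\ell}|^2$.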
Step 3. Applying Lemma 4.2 to (46) and (47) we see that for all $m =0,\ldots,n$ we have
and
Our next aim is to show that $u^n_{xx}(t_m,x)$ is locally Lipschitz in x. We first show that $u^n_{xx}(t_m,x)$ has polynomial growth. We introduce the BSDE which describes $u^n_{xx}(t_m,x)$, for simplicity writing
and consider
We denote the generator of this BSDE by $\hat f$ and notice that it is of the structure
Here $f_0(\omega, t)$ denotes the integrand of the first integral on the right-hand side of (55), and from the previous results one concludes that ${\mathbb{E}} (\!\int_{(t_m,T]} |f_0(s-\!)| d[B^n]_s)^p < \infty.$ The functions $f_1(t) = (D^{(1,0,0)} f)(t, \cdot) = (\partial_x f)(t, \cdot)$, $f_2(t) = (\partial_y f)(t, \cdot)$, and $f_3(t) = (\partial_z f)(t, \cdot)$ are bounded by our assumptions. We put
Denoting the solution by $(\hat{\textsf{Y}}, \hat{\textsf{Z}})$, we get for $C(f_3 ) (T-t_m) \le \tfrac{1}{2}$ that
Now we derive the polynomial growth ${\mathbb{E}} | \hat \xi^n|^2 \le C \Psi^2(x) $ from the properties of $g'$ and $g''$ and from the fact that ${\mathbb{E}} \sup_{t_m< s \le T} |\partial^j_x X_{s}^{n}|^p$ is bounded for $j=1,2$ under our assumptions. Then the estimate
can be derived from Lemma 4.2 Parts (ii) and (iii), so that Gronwall’s lemma implies
Finally, to show (45), one uses (55) and derives an inequality as in (56), but now for the difference $\partial_x^2 Y_{t_m}^{n,t_m,x}-\partial_x^2 Y_{t_m}^{n,t_m,\bar x}.$
Before proving (45), let us state the following lemma.
Lemma 4.3. Let Assumption 2.3 hold. We have
for some constant $C=C(b,\sigma,f,g,T,p,p_0)$.
Proof of Lemma 4.3. Proof of (58): Introduce $G(t_{k+1}, x) \coloneqq \mathcal{D}^n_{k+1} u^n(t_{k+1}, X^{n,t_k,x}_{t_{k+1}})$. Using the relations (42)–(43) and the bounds (54) and (57) for $u^n_x$ and $u^n_{xx}$, respectively, one obtains
uniformly in $t_{k+1}$. Since $Z^{n,t_m,x}_{t_k} = \mathcal{D}^n_{k+1} u^n(t_{k+1}, X^{n,t_k, \eta}_{t_{k+1}}) = G(t_{k+1}, \eta)$, where $\eta = X^{n,t_m,x}_{t_k}$, the previous bound yields
uniformly for each $t_m \leq t_k < T$. The inequality (58) then follows by applying the Cauchy–Schwarz inequality and standard $L_p$-estimates for the process $X^n$.
Proof of (59): This can be shown similarly to Lemma 4.2(iii), by considering the BSDE for the difference $\partial_x Y_{t_m}^{n,t_m,x}-\partial_x Y_{t_m}^{n,t_m,\bar x}$ instead of (47) itself.
Proof of (60): This can again be shown by repeating the proof of Lemma 4.2(iii), but now for the BSDE (55).
We return to the main proof. By our assumptions we have
where we use $|x-\bar x|^2 \le C(1+|x|^2+|\bar x|^2) |x-\bar x|^{2\alpha}.$ (The term $|x-\bar x|^2$ appears, for example, in the estimate of $(\partial_x X_T^{n,t_m,x })^2 - ( \partial_x X_T^{n,t_m, \bar x })^2$.) To see that
we check the terms with the highest polynomial growth. We have to deal with terms like
and
for example. We bound the first term by using (53) and (58):
We bound the second term by using (53) and (59):
While all the other terms can be easily estimated using the results we have obtained already, for
we need the bound (60).
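The elementary inequality used above to trade $|x-\bar x|^2$ for a Hölder factor can be checked directly: for $\alpha \in (0,1]$,

```latex
|x-\bar x|^{2}
 = |x-\bar x|^{2-2\alpha}\,|x-\bar x|^{2\alpha}
 \le (|x|+|\bar x|)^{2-2\alpha}\,|x-\bar x|^{2\alpha}
 \le C\,(1+|x|^2+|\bar x|^2)\,|x-\bar x|^{2\alpha},
```

since $0 \le 2-2\alpha < 2$.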
The result then follows from Gronwall’s lemma.
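The backward discrete Gronwall estimate invoked here, and several times above, has the following standard form, recorded as a sketch: if $a_n \le A$ and $a_m \le A + Lh\sum_{k=m+1}^{n} a_k$ for $m = n-1,\ldots,0$, then

```latex
a_m \;\le\; A\,(1+Lh)^{\,n-m} \;\le\; A\,e^{LT},
\qquad m = 0,\ldots,n,
```

by backward induction, using $(1+Lh)^{n} \le e^{Lnh} = e^{LT}$.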
Remark 4.1. Under Assumption 2.3 we conclude that by Proposition 4.2 there exists a constant $C = C(b,\sigma, f,g, T,p,p_0) > 0$ such that
uniformly in $m = 0,1, \dots, n-1$, where
In addition, for
we have
uniformly in $m = 0,1, \dots, n-1$. The latter inequality follows from the assumption that the partial derivatives of f are bounded and Lipschitz continuous with respect to the spatial variables, from estimates proved in Proposition 4.2, and from those stated in (61) above.
The calculations show that, in general, Assumption 2.3 cannot be weakened if one requires $ \partial_x F^n(t_{m+1}, x)$ to be locally $\alpha$-Hölder continuous.
5. Technical results and estimates
In this section we collect some facts which are needed for the proofs of our results. We start with properties of the stopping times used to construct a random walk.
Lemma 5.1. (Proposition 11.1 in [Reference Walsh38], Lemma A.1 in [Reference Geiss, Labart and Luoto22].) For all $0 \leq k \leq m \leq n$ and $p > 0$, it holds for $h = \tfrac{T}{n}$ and $\tau_k$ as defined in (24) that
(i) ${\mathbb{E}} \tau_k = kh$;
(ii) ${\mathbb{E}} |\tau_1 |^p \leq C(p) h^p$;
(iii) ${\mathbb{E}} | B_{\tau_k} - B_{t_k}|^{2p} \leq C(p) {\mathbb{E}} |\tau_k - t_k|^p \leq C(p) (t_k h)^{\frac{p}{2}}. $
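A sketch of (iii): writing $\tau_k - t_k = \sum_{i=1}^k (\tau_i - \tau_{i-1} - h)$ as a sum of i.i.d. centered increments (which the construction in (24) provides, with $\tau_0 = 0$), the BDG inequality and Rosenthal's inequality give

```latex
{\mathbb{E}}\,|B_{\tau_k} - B_{t_k}|^{2p} \le C(p)\,{\mathbb{E}}\,|\tau_k - t_k|^{p},
\qquad
{\mathbb{E}}\,|\tau_k - t_k|^{p}
 \le C(p)\Bigl( \bigl(k\,{\mathbb{E}}\,|\tau_1 - h|^{2}\bigr)^{\frac{p}{2}}
 + k\,{\mathbb{E}}\,|\tau_1 - h|^{p} \Bigr)
 \le C(p)\,(t_k h)^{\frac{p}{2}},
```

using (ii) in the form ${\mathbb{E}}\,|\tau_1 - h|^{p} \le C(p) h^p$, together with $k h^2 = t_k h$ and $k\,h^{p} \le (k h^{2})^{\frac{p}{2}}$ for $p \ge 2$.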
The next lemma lists some estimates concerning the diffusion X defined by (28) and its discretization (26), where we assume that B and $\tilde B$ are connected as in (27).
Lemma 5.2. Under Assumption 2.1 on b and $\sigma$, for $p \geq 2$ there exists a constant $C=C(b,\sigma,T,p) >0$ such that the following hold:
(i) $ {\mathbb{E}}\big|X^{s,y}_{T} - X^{t,x}_T\big|^p \leq C ( |y-x|^p + |s-t|^{\frac{p}{2}}), \quad x,y \in {\mathbb{R}}, \, s,t \in [0,T]. $
(ii) $ \tilde {\mathbb{E}} \sup_{\tilde \tau_l \wedge t_{m} \le r \le \tilde \tau_{l+1} \wedge t_{m}} |\tilde X^{t_k,x}_{t_k+r}- \tilde X^{t_k,x}_{t_k+ \tilde \tau_l \wedge t_{m}}|^{p} \le C h^{\frac{p}{4}}, \quad 0\le k\le n, \, 0 \le l \le n-k-1, \, 0 \leq m \le n-k.$
(iii) $ {\mathbb{E}}|\nabla X^{s,y}_T - \nabla X^{t,x}_T |^p \le C( |y-x|^p + |s-t|^{\frac{p}{2}}), \quad x,y \in {\mathbb{R}}, \, s,t \in [0,T]. $
(iv) $ {\mathbb{E}} \sup_{0 \leq l \leq m} \big|\nabla X^{n,t_k,x}_{t_k+t_l}\big|^{p} \leq C, \quad 0 \leq k \leq n, \, 0 \le m \le n-k. $
(v) $\tilde {\mathbb{E}}\big|\tilde X^{t_k,x}_{t_k+ t_m}-\tilde {\mathcal{X}}^{\tau_k,y}_{\tau_k +\tilde \tau_m}\big|^p \le C ( |x - y|^p+ h^{\frac{p}{4}}), \quad 0 \leq k \leq n,\, 0 \leq m \le n-k.$
(vi) $\tilde {\mathbb{E}} | \nabla \tilde X_{t_k+ t_m }^{t_k, x} - \nabla \tilde {\mathcal{X}}_{\tau_k +\tilde \tau_m}^{\tau_k,y}|^p \le C( |x - y |^p+ h^{\frac{p}{4}}), \quad 0 \leq k \leq n, \, 0 \leq m \le n-k.$
Proof.
(i) This estimate is well-known.
(ii) For the stochastic integral we use the inequality of BDG and then, since b and $\sigma $ are bounded, we get by Lemma 5.1(ii) that
\begin{eqnarray*}&& {} \tilde {\mathbb{E}} \sup_{\tilde \tau_l \wedge t_{m} \le r \le \tilde \tau_{l+1} \wedge t_{m}} |\tilde X^{t_k,x}_{t_k+r}- \tilde X^{t_k,x}_{t_k+ \tilde \tau_l \wedge t_{m}}|^{p} \\[3pt] && \quad\le\, C(p)(\|{b}\|^p_{\infty} \tilde{\mathbb{E}} | \tilde \tau_{l+1}- \tilde \tau_l |^p + \|{\sigma}\|^p_{\infty} \tilde{\mathbb{E}} | \tilde \tau_{l+1}- \tilde \tau_l|^{\frac{p}{2}}) \le C(b,\sigma, T,p) \, h^{\frac{p}{2}}.\end{eqnarray*}

(iii) This follows easily because the process $(\nabla X^{s,y}_r)_{r \in [s,T]} $ solves the linear SDE (13) with bounded coefficients.
(iv) The process solves (65). The estimate follows from the inequality of BDG and Gronwall’s lemma.
(v) Recall that from (4) and (26) we have
\begin{equation*}\tilde {\mathcal{X}}^{\tau_k,y}_{\tau_k+\tilde \tau_m} = \tilde X^{n,t_k,y}_{t_k+t_m} = y + \int_{(0, t_m]} b(t_k+ r, \tilde X^{n,t_k,y}_{t_k+r-}) d[\tilde B^n, \tilde B^n]_r + \int_{(0, t_m]} \sigma(t_k+r,\tilde X^{n,t_k,y}_{t_k+r-}) d \tilde B^n_{r},\end{equation*}and $\tilde X^{t_k,x}_{t_k+t_m} $ is given by\begin{equation*}\tilde X^{t_k,x}_{t_k+t_m} = x + \int_0^{t_m} b(t_k+r, \tilde X^{t_k,x}_{t_k+r}) dr + \int_0^{t_m} \sigma(t_k+r,\tilde X^{t_k,y}_{t_k+r}) d \tilde B_r.\end{equation*}To compare the stochastic integrals of the previous two equations we use the relation\begin{equation*} \int_{(0, t_m]} \sigma(t_k+r,\tilde X^{n,t_k,y}_{t_k+r-}) d \tilde B^n_r = \int_0^\infty \sum_{l=0}^{m-1} \sigma(t_{k+l+1},\tilde X^{n,t_k,y}_{t_{k+l}}) {\textbf{1}}_{(\tilde \tau_l, \tilde \tau_{l+1}]}(r) d \tilde B_r.\end{equation*}We define an ‘increasing’ map, $i(r) \coloneqq t_{l+1} $ for $r \in (t_l,t_{l+1}]$, and a ‘decreasing’ map, $d(r) \coloneqq t_{l} $ for $r \in (t_l,t_{l+1}]$, and split the differences as follows (using Assumption 2.1(iii) for the coefficient b):(64)\begin{align} & \tilde {\mathbb{E}}\big|\tilde X^{t_k,x}_{t_k+t_m}- \tilde X^{n,t_k,y}_{t_k+t_m}\big|^p \notag \\[3pt] &\,\, \le C(b,p) \left (\! |x - y|^p + \tilde {\mathbb{E}} \!\int_0^{t_m} |r- i(r)|^{\frac{p}{2}} +| \tilde X^{t_k,x}_{t_k+r}- \tilde X^{t_k,x}_{t_k+d(r)} |^p + | \tilde X^{t_k,x}_{t_k+d(r)}- \tilde X^{n,t_k,y}_{t_k+d(r)} |^p dr \!\right)\notag\\[3pt] &\qquad +C(p) \tilde {\mathbb{E}} | \int_{t_{m} \wedge \tilde \tau_{m}}^{t_{m}} \sigma(t_k+r,\tilde X^{t_k,x}_{t_k+r}) d \tilde B_r|^p \notag \\[3pt] &\qquad + C(p) \tilde {\mathbb{E}} | \int_{t_{m} \wedge \tilde \tau_{m}}^{\tilde \tau_{m}} \sum_{l=0}^{ m-1} \sigma(t_{k+l+1},\tilde X^{n,t_k,y}_{t_{k+l}}){\textbf{1}}_{(\tilde \tau_l, \tilde \tau_{l+1}]}(r) d \tilde B_r|^p\notag \\[3pt] &\qquad + C(p) \tilde {\mathbb{E}} | \int_0^{t_{m} \wedge \tilde \tau_{m}} \!\!\!\! 
\sigma(t_k+r,\tilde X^{t_k,x}_{t_k+r}) - \sum_{l=0}^{m-1} \sigma(t_{k+l+1},\tilde X^{n,t_k,y}_{t_{k+l}}){\textbf{1}}_{(\tilde \tau_l, \tilde \tau_{l+1}]}(r) d \tilde B_r|^p.\end{align}We estimate the terms on the right-hand side as follows: by standard estimates for SDEs with bounded coefficients one has that\begin{eqnarray*} \tilde {\mathbb{E}} \int_0^{t_m} |r- i(r)|^{\frac{p}{2}} +| \tilde X^{t_k,x}_{t_k+r}- \tilde X^{t_k,x}_{t_k+d(r)} |^p dr \le C(b,\sigma,T,p) h^\frac{p}{2}.\end{eqnarray*}By the BDG inequality, the fact that $\sigma$ is bounded, and Lemma 5.1, we conclude that\begin{eqnarray*}&& {} \tilde {\mathbb{E}} \bigg | \int_{t_{m} \wedge \tilde \tau_{m}}^{t_{m}} \sigma(t_k+r,\tilde X^{t_k,x}_{t_k+r}) d \tilde B_r\bigg |^p + \tilde {\mathbb{E}} \bigg | \int_{t_{m} \wedge \tilde \tau_{m}}^{\tilde \tau_{m}} \sum_{l=0}^{m-1} \sigma(t_{k+l+1},\tilde X^{n,t_k,y}_{t_{k+l}}){\textbf{1}}_{(\tilde \tau_l, \tilde \tau_{l+1}]}(r) d \tilde B_r \bigg|^p \\&& \quad\le C(\sigma,p) \|\sigma\|^p_\infty \tilde {\mathbb{E}} |\tilde\tau_{m}-t_{m}|^\frac{p}{2} \le C(\sigma,p) ( t_{m} h)^\frac{p}{4}.\end{eqnarray*}Finally, by the BDG inequality,\begin{eqnarray*} && {} \tilde {\mathbb{E}} \Bigg | \int_0^{t_{m} \wedge \tilde \tau_{m}} \!\!\!\! 
\sigma(t_k+r,\tilde X^{t_k,x}_{t_k+r}) - \sum_{l=0}^{m-1}\sigma(t_{k+l+1},\tilde X^{n,t_k,y}_{t_{k+l}}){\textbf{1}}_{(\tilde \tau_l, \tilde \tau_{l+1}]}(r) d \tilde B_r \Bigg|^p \\[-2pt] &&\quad\le\, C(p) \tilde {\mathbb{E}} \Bigg ( \int_0^{t_{m}} \sum_{l=0}^{m-1}|\sigma(t_k+r,\tilde X^{t_k,x}_{t_k+r})-\sigma(t_{k+l+1},\tilde X^{n,t_k,y}_{t_{k+l}})|^2 {\textbf{1}}_{(\tilde \tau_l, \tilde \tau_{l+1}]}(r) dr\Bigg )^\frac{p}{2} \\[-2pt] &&\quad\le\, C(\sigma,p)\tilde {\mathbb{E}} \Bigg ( \sum_{l=0}^{m-1}\int_{\tilde \tau_l \wedge t_{m}}^{\tilde \tau_{l+1} \wedge t_{m}}\!\!\!\!|\tilde \tau_{l+1}-t_{l+1}|^\frac{p}{2} + |\tilde \tau_l -t_{l+1} |^\frac{p}{2} + |\tilde X^{t_k,x}_{t_k +r}- \tilde X^{t_k,x}_{t_k+\tilde \tau_l \wedge t_{m}}|^p \\&& \qquad +\, |\tilde X^{t_k,x}_{t_k+ \tilde \tau_l \wedge t_{m}}-\tilde X^{n,t_k,y}_{t_{k+l}} |^p dr\Bigg ) \\[-2pt] &&\quad\le\, C(\sigma,T,p) \Bigg ( h^\frac{p}{2} + \max_{1\le l<m} ( \tilde {\mathbb{E}} |\tilde \tau_l -t_l|^p)^{\frac{1}{2}}\\[-2pt] && \qquad +\, \max_{0\le l<m} \Bigg( \tilde {\mathbb{E}} \sup_{\tilde \tau_l \wedge t_{m} \le r \le \tilde \tau_{l+1} \wedge t_{m}} |\tilde X^{t_k,x}_{t_k+r}- \tilde X^{t_k,x}_{t_k+ \tilde \tau_l \wedge t_{m}}|^{2p} \Bigg)^{\frac{1}{2}} \\ && \qquad +\, \tilde {\mathbb{E}} \sum_{l=0}^{m-1} |\tilde X^{t_k,x}_{t_k+\tilde \tau_l \wedge t_{m}} -\tilde X^{n,t_k,y}_{t_{k+l}} |^p (\tilde \tau_{l+1} - \tilde \tau_l ) \Bigg ). 
\end{eqnarray*}Moreover, since $\tilde \tau_{l+1} - \tilde \tau_l$ is independent of $|\tilde X^{t_k,x}_{t_k+ \tilde \tau_l \wedge t_{m}} -\tilde X^{n,t_k,y}_{t_k+t_l} |^p$, by Lemma 5.1(i) we get\begin{eqnarray*}&& {} \tilde {\mathbb{E}} \sum_{l=0}^{m-1} |\tilde X^{t_k,x}_{t_k+ \tilde \tau_l \wedge t_{m}} -\tilde X^{n,t_k,y}_{t_{k+l}} |^p (\tilde \tau_{l+1} - \tilde \tau_l ) \\ &&\quad=\, \tilde {\mathbb{E}} \sum_{l=0}^{m-1} |\tilde X^{t_k,x}_{t_k+\tilde \tau_l \wedge t_{m}} -\tilde X^{n,t_k,y}_{t_{k+l}} |^p (t_{l+1} - t_l )\\ &&\quad\le\, C(T,p) \bigg ( \tilde {\mathbb{E}} \int_0^{t_m} |\tilde X^{t_k,x}_{t_k+d(r)} -\tilde X^{n,t_k,y}_{t_k+d(r)} |^p dr + \max_{0\le l<m} \tilde {\mathbb{E}} |\tilde X^{t_k,x}_{t_k+ \tilde \tau_l \wedge t_{m}}- \tilde X^{t_k,x}_{t_k+t_l}|^p \bigg ). \end{eqnarray*}Using Lemma 5.1(iii), one concludes similarly as in the proof of (ii) that\begin{equation*} \tilde {\mathbb{E}} |\tilde X^{t_k,x}_{t_k+ \tilde \tau_l \wedge t_{m}}- \tilde X^{t_k,x}_{t_k+t_l}|^p\le C(b,\sigma,T,p) h^\frac{p}{4} .\end{equation*}Then (64) combined with the above estimates implies that\begin{eqnarray*}\tilde {\mathbb{E}}\big|\tilde X^{t_k,x}_{t_k+t_m}- \tilde X^{n,t_k,y}_{t_k+t_m}\big|^p\le C(b,\sigma,T,p) \bigg ( |x - y|^p + h^\frac{p}{4} + \tilde {\mathbb{E}} \int_0^{t_m} |\tilde X^{t_k,x}_{t_k+ d(r)} -\tilde X^{n,t_k,y}_{t_k+d(r)} |^p dr \bigg ). \end{eqnarray*}Gronwall’s lemma yields\begin{equation*}\tilde {\mathbb{E}}\big|\tilde X^{t_k,x}_{t_k+t_m}- \tilde X^{n,t_k,y}_{t_k+t_m}\big|^p\le C(b,\sigma,T,p) ( |x - y|^p + h^\frac{p}{4}). \end{equation*}(vi) We have
(65)\begin{align} \nabla \tilde X^{n,t_k, y}_{t_k+ t_m}= 1 &+ \int_{(0,t_m]} b_x (t_k+ r,X^{n, t_k, y}_{t_k+r-}) \nabla \tilde X^{n,t_k, y}_{t_k+r-} d[\tilde B^n,\tilde B^n]_r \! \notag\\&+ \int_{(0,t_m]} \sigma_x(t_k+ r, \tilde X^{n,t_k,y}_{t_k+r-}) \nabla \tilde X^{n,t_k, y}_{t_k+r-} d \tilde B^n_r\end{align}and(66)\begin{eqnarray} \nabla \tilde X^{t_k,x}_{t_k+t_m} = 1 + \int_0^{t_m} b_x(t_k+ r, \tilde X^{t_k,x}_{t_k+r}) \nabla \tilde X^{t_k,x}_{t_k+r}dr + \int_0^{t_m} \sigma_x(t_k+ r,\tilde X^{t_k,x}_{t_k+r}) \nabla \tilde X^{t_k,x}_{t_k+r}d \tilde B_{r}.\qquad \end{eqnarray}
We may proceed similarly to (v), except that this time the coefficients are not bounded but have linear growth. Here one uses that the integrands are bounded in any $L_p({\mathbb{P}}).$
Finally, we estimate the difference between the continuous-time Malliavin weight and its discrete-time counterpart.
Lemma 5.3. Let B and $\tilde B$ be connected via (27). Under Assumption 2.1 it holds that
Proof. For $N^{n,\tau_k}_{\tilde \tau_m}$ and $N^{t_k}_{t_{m}}$ given by (12) and (18), respectively, we introduce the notation
with
By the inequality of BDG,
The assertion then follows from Lemma 5.1 and from the estimates
So it remains to prove these inequalities. We put
and notice that by Assumption 2.1 both expressions are bounded by $ \|{\sigma}\|_{\infty}\delta^{-1}.$ To show (67) let us split $a_{t_k+s}- a^n_{\tau_k+ \tilde \tau_\ell}$ in the following way:
Then
since one can show similarly to Lemma 5.2(ii) that
Notice that $\nabla \tilde X_t^{t_k, X_{t_k}}$ and $\nabla \tilde {\mathcal{X}}_{\tau_m}^{\tau_k,{\mathcal{X}}_{\tau_k}} $ solve the linear SDEs (66) and (65), respectively. Therefore,
For the second term we get
For the third term, Lemma 5.2(vi) implies that
The last term we estimate similarly to the second one:
To see (68), use the estimates (69).
We close this section with estimates concerning the effect of $T_{_{m,{\pm}}}$ and the discretized Malliavin derivative $\mathcal{D}^n_k$ (see Definition 2.1) on $X^n.$
Lemma 5.4. Under Assumption 2.1, and for $p\ge 2,$ we have the following:
(i) ${\mathbb{E}} |X^n_{t_l}- T_{_{m,{\pm}}}X^n_{t_l}|^p \leq C(b,\sigma,T,p) h^{\frac{p}{2}}, \quad 1 \leq l, m \leq n$.
(ii) ${\mathbb{E}}\bigg | \nabla X^{n,t_k,X^n_{t_k}}_{t_m} - \dfrac{\mathcal{D}^n_{k+1}X^n_{t_m} }{\sigma(t_{k+1},X^n_{t_k})} \bigg|^p \le C(b,\sigma,T,p) h^{\frac{p}{2}}, \quad 0 \le k < m \le n$.
(iii) ${\mathbb{E}} |\mathcal{D}^n_k X^n_{t_m} |^p \le C(b,\sigma,T,p), \quad0 \le k \leq m \leq n$.
Proof.
(i) By definition, $T_{_{m, \pm}}X^n_{t_l} = X^n_{t_l}$ for $l \leq m -1, $ and for $l\ge m$ we have
\begin{align*}T_{_{m, \pm}}X^n_{t_l} &= X^n_{t_{m-1}} + b(t_m,X^n_{t_{m-1}})h \pm \sigma(t_m,X^n_{t_{m-1}}) \sqrt{h} \\[3pt] &\quad + \ h \sum_{j=m+1}^{ {l}} b(t_j, T_{_{m,\pm}}X^n_{t_{j-1}})+ \sqrt{h} \sum_{j=m+1}^{ {l}} \sigma(t_{ j}, T_{_{m,\pm}}X^n_{t_{j-1}}){\varepsilon}_j.\end{align*}By the properties of b and $\sigma$, and thanks to the inequality of BDG and Hölder’s inequality, we see that\begin{align*}& {\mathbb{E}}|X^n_{t_l}- T_{_{m, \pm}}X^n_{t_l}|^p\\ & \quad\leq C(p) \Bigg( {\mathbb{E}}\big|\sigma(t_m,X^n_{t_{m-1}}) \sqrt{h}(1 \pm\varepsilon_m)\big|^p+ h^{p} {\mathbb{E}} \Bigg| \sum_{j = m+1 }^l \big(b(t_j, X^n_{t_{j-1}}) - b(t_j, T_{_{m,\pm}}X^n_{t_{j-1}})\big)\Bigg|^p\\ & \quad \quad + h^{\frac{p}{2}} {\mathbb{E}} \Bigg |\sum_{j=m+1}^l \big(\sigma(t_j, X^n_{t_{j-1}}) - \sigma(t_j, T_{_{m,\pm}} X^n_{t_{j-1}})\big)^2 \Bigg |^{\frac{p}{2}} \Bigg )\\ & \quad\leq C(p) \Bigg ( \|{\sigma}\|_{\infty}^p h^{\frac{p}{2}} + h (\|{b_x}\|_{\infty}^p t_{l-m}^{p-1} + \|{\sigma_x}\|_{\infty}^pt_{l-m}^{{\frac{p}{2}}-1}) \sum_{j=m+1}^l {\mathbb{E}}|X^n_{t_{j-1}}- T_{_{m,\pm}}X^n_{t_{j-1}}|^p \Bigg ).\end{align*}It remains to apply Gronwall’s lemma.(ii) By the inequality of BDG and Hölder’s inequality,
\begin{align*}&{\mathbb{E}}\!\left | \nabla X^{n,t_k,X^n_{t_k}}_{t_m} - \frac{ \mathcal{D}^n_{k+1} X^n_{t_{m}}}{\sigma( t_{k+1},X^n_{t_k})} \right |^p\\[3pt] &\quad \le C(p,T) \bigg ( | b_x(t_{k+1}, X^n_{t_k})h + \sigma_x(t_{k+1},X^n_{t_k}) \sqrt{h} {\varepsilon}_{k+1}|^p \notag \\[3pt] & \qquad + h^p \sum_{l=k+2}^{m} {\mathbb{E}} \bigg| b_x(t_l, X^n_{t_{l-1}}) \nabla X^{n,t_k,X^n_{t_k}}_{t_{l-1}} - b_x^{(k+1,l)} \frac{\mathcal{D}^n_{k+1} X^n_{t_{l-1}}}{\sigma( t_{k+1},X^n_{t_k})} \bigg|^p \nonumber\\[3pt]& \qquad + h^{\frac{p}{2}} \!\!\! \sum_{l=k+2}^{m} {\mathbb{E}} \bigg|\sigma_x(t_l, X^n_{t_{l-1}}) \nabla X^{n,t_k,X^n_{t_k}}_{t_{l-1}} -\sigma_x^{(k+1,l)} \frac{\mathcal{D}^n_{k+1} X^n_{t_{l-1}}}{\sigma(t_{k+1},X^n_{t_k})} \bigg|^{p} \bigg ).\end{align*}Since Lemma 5.4(i) yields\begin{align*}{\mathbb{E}} |b_x^{(k+1,l)} - b_x(t_l, X^n_{t_{l-1}})|^{2p} + {\mathbb{E}} |\sigma_x^{(k+1,l)} - \sigma_x(t_l, X^n_{t_{l-1}})|^{2p} \leq C(b,\sigma,T,p) h^{p},\end{align*}and Lemma 5.2 implies that\begin{equation*} {\mathbb{E}} \sup_{k+1 \leq l \leq m} \Big |\nabla X^{n,t_k,X^n_{t_k}}_{t_{l-1}}\Big|^{2p} \leq C(b,\sigma,T,p),\end{equation*}the assertion follows by Gronwall’s lemma.

(iii) This is an immediate consequence of (i).
Acknowledgement
Christel Geiss would like to thank the Erwin Schrödinger Institute, Vienna, where a part of this work was written, for its hospitality and support.