
Two-player zero-sum stochastic differential games with random horizon

Published online by Cambridge University Press:  15 November 2019

M. Ferreira*
Affiliation:
Universidade de Lisboa and Escola Superior de Hotelaria e Turismo, Instituto Politécnico do Porto
D. Pinheiro*
Affiliation:
Brooklyn College and The Graduate Center, City University of New York
S. Pinheiro*
Affiliation:
Queensborough Community College, City University of New York
*Postal address: CEMAPRE, ISEG, Universidade de Lisboa, Rua do Quelhas 6, 1200-781 Lisboa, Portugal. Email address: miguelferreira@esht.ipp.pt
**Postal address: Department of Mathematics, Brooklyn College, City University of New York, 2900 Bedford Avenue, Brooklyn, NY 11210, USA. Email address: dpinheiro@brooklyn.cuny.edu
***Postal address: Department of Mathematics and Computer Science, Queensborough Community College, City University of New York, 222-05, 56th Avenue, Bayside, NY 11364, USA. Email address: scoutopinheiro@qcc.cuny.edu

Abstract

We consider a two-player zero-sum stochastic differential game with a random planning horizon and diffusive state variable dynamics. The random planning horizon is a function of a non-negative continuous random variable, which is assumed to be independent of the Brownian motion driving the state variable dynamics. We study this game using a combination of dynamic programming and viscosity solution techniques. Under some mild assumptions, we prove that the value of the game exists and is the unique viscosity solution of a certain nonlinear partial differential equation of Hamilton–Jacobi–Bellman–Isaacs type.

Type
Original Article
Copyright
© Applied Probability Trust 2019 

1. Introduction

The central object of study in differential game theory is games taking place over a whole interval of time, with decisions being made continuously – a class of problems first addressed by Isaacs [Reference Isaacs21] and later studied in greater detail by Berkovitz and Fleming [Reference Berkovitz and Fleming5] and Friedman [Reference Friedman18, Reference Friedman19]. The aim of the theory is to describe, from a general perspective, the interaction between agents, possibly in conflict, in the most diverse situations, such as armed conflicts, economic competition, and parlor games. One very interesting aspect is that the players' actions both influence and are influenced by the evolution of the state of the system over time, which is determined by a given differential equation.

The key mathematical techniques used to address this class of problems are closely related to those of optimal control theory, namely Pontryagin's maximum principle and Bellman's dynamic programming principle, together with the corresponding Hamilton–Jacobi–Bellman–Isaacs (HJBI) equation. It should be noted, however, that differential games are usually far more complex than optimal control problems. This is not only because, unlike optimal control problems, differential games involve more than one controller or player but, more importantly, because there is no immediately obvious notion of what constitutes a solution of the game. Indeed, over time, multiple proposals were put forward for what should be considered a solution; among these one can list, for instance, minimax, Nash, Stackelberg, open-loop, and closed-loop solutions.

Isaacs successfully set up the framework of differential game theory even though he did not have a mathematically rigorous theory of differential game value. Early definitions of differential game value made use of time discretizations [Reference Friedman18] and were later replaced by the more convenient Elliott–Kalton notion of differential game value [Reference Elliott and Kalton15]. Evans and Souganidis [Reference Evans and Souganidis16] characterized the upper and lower Elliott–Kalton value functions as unique viscosity solutions of the corresponding HJBI partial differential equations (PDEs) by employing the theory of viscosity solutions introduced by Crandall and Lions [Reference Crandall and Lions12]. Also resorting to viscosity solution methods, Souganidis [Reference Souganidis40] showed that the Elliott–Kalton value functions are in fact the same as those defined using time discretizations. The notion of differential game value extends naturally to the set-up of stochastic differential games. Fleming and Souganidis [Reference Fleming and Souganidis17] proved the existence of value for two-player zero-sum stochastic differential games under the assumption that the Isaacs condition holds. Recent developments of the theory have addressed differential games with more general state variable dynamics [Reference Biswas6, Reference Hamadène and Mu20] and payoff functionals [Reference Buckdahn and Li9, Reference Buckdahn, Li and Quincampoix10, Reference Kumar26], as well as alternative control sets [Reference Bayraktar and Yao4, Reference Tang and Hou41] and game formulations [Reference Cardaliaguet and Rainer11, Reference Pham and Zhang35].

To the best of our knowledge, differential games with a random time horizon were first considered by Petrosyan and Murzov [Reference Petrosyan and Murzov32] within the set-up of zero-sum pursuit games with terminal payoffs at a random terminal time. A more general formulation for differential games with a random planning horizon has been developed by Petrosyan and Shevkoplyas in [Reference Petrosyan and Shevkoplyas33, Reference Petrosyan and Shevkoplyas34]. The theory developed in [Reference Petrosyan and Murzov32, Reference Petrosyan and Shevkoplyas33, Reference Petrosyan and Shevkoplyas34] concerns a set-up in which the random time horizon has a probability measure with unbounded support and continuous density. Moreover, the state variable dynamics considered therein are deterministic, given by an ordinary differential equation.

In the present paper we study a two-player zero-sum stochastic differential game (SDG) with a random planning horizon. The planning horizon is assumed to be of the form $\xi=\min\{\tau,T\}$, where $\tau$ is a continuous non-negative random variable whose distribution is common knowledge to the players, and $T>0$ is a deterministic constant. As a consequence, the probability measure of the random planning horizon $\xi$ has support on a bounded interval. Further, its distribution function is, in general, discontinuous at T. The game’s state variable dynamics are given by a stochastic differential equation (SDE) of diffusive type, with the Brownian motion driving the dynamics assumed to be independent of the random variable $\tau$ determining the random planning horizon $\xi$. The game’s payoff functionals depend heavily on the planning horizon $\xi$ in the sense that the running payoff is given as an integral over the random time interval $[0,\xi]$ and the terminal payoff is evaluated at time $\xi$. We handle this issue by transforming the problem under consideration herein into one with a fixed planning horizon. This is achieved by taking the expected value with respect to the distribution of the random variable $\tau$, carefully distinguishing the two complementary cases where $\tau\leq T$ and $\tau> T$. As a result, we obtain a payoff functional which resembles that of a differential game with non-constant discount rate, but with an additional term reflecting the specificity of the random time horizon $\xi=\min\{\tau,T\}$ under consideration. We should remark that this is somewhat connected to the analysis of Marín-Solano and Shevkoplyas [Reference Marín-Solano and Shevkoplyas29], with the key differences being that [Reference Marín-Solano and Shevkoplyas29] concerns differential games with deterministic state variable dynamics and a random time horizon with an absolutely continuous probability measure with a continuous density with unbounded support. 
On the other hand, the main similarity to [Reference Marín-Solano and Shevkoplyas29] is that, at an intermediate step of our analysis, a transformation from a random to a deterministic planning horizon is performed, yielding a payoff functional resembling that of a discounted differential game (with an additional term due to the specific form of $\xi$), with a non-constant discount rate related to a certain family of conditional probabilities. Such a transformation to a deterministic planning horizon admits the following intuitive interpretation: agents plan their actions as if the game would continue until time T, but with a subjective rate of time preference.
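This transformation can be sketched as follows (a heuristic computation: condition on $\tau$ first and apply Fubini's theorem together with the independence assumption; write $G^+(s,t)=\mathbb{P}^\tau(\tau>s\mid\tau>t)$ for the conditional survival function of $\tau$ and $g^-(s,t)=-\frac{\partial}{\partial s}G^+(s,t)$ for the associated conditional density, anticipating the notation of Section 2.3):

\begin{align*}{\mathbb E}\bigg[ \int_t^{\xi} L(s) \, \mathrm{d} s + \Psi(\xi, X(\xi)) \bigg] &= {\mathbb E}\bigg[ \int_t^{T} G^+(s,t) L(s) \, \mathrm{d} s \bigg] \\ &\quad + {\mathbb E}\bigg[ \int_t^{T} g^-(s,t) \Psi(s, X(s)) \, \mathrm{d} s + G^+(T,t) \Psi(T, X(T)) \bigg],\end{align*}

where $L(s)$ abbreviates the running payoff evaluated along the controlled trajectory and the expectations on the right-hand side are taken with respect to the Brownian motion only. Factoring out $G^+(s,t)$ and using the identity $g^-(s,t)=G^+(s,t)g^-(s,s)$ exhibits a fixed-horizon payoff with discount factor $G^+(s,t)$ and an additional running term $g^-(s,s)\Psi(s,X(s))$ collecting the terminal payoff paid at the random time.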

We remark that the current literature on stochastic differential games (SDGs) is mostly focused on games with either a deterministic finite time horizon or an infinite time horizon. Hence, we believe there is value in extending the current literature to include the case of a random planning horizon. Moreover, this extension is not only relevant from a theoretical point of view, but may also contribute to a better understanding of a number of economic and financial applications exhibiting random planning horizons (see e.g. [Reference Blanchet-Scalliet, El Karoui, Jeanblanc and Martellini7], [Reference Bruhn and Steffensen8], [Reference Duarte, Pinheiro, Pinto and Pliska14], [Reference Kwak, Yong and Choi27], [Reference Mousa, Pinheiro and Pinto30], [Reference Pliska and Ye36], [Reference Shen and Wei37], and [Reference Yaari42]).

We extend the strategy introduced by Fleming and Souganidis in [Reference Fleming and Souganidis17] to account for the introduction of the random time horizon into the stochastic differential game formulation. More specifically, we employ a combination of dynamic programming and viscosity solution techniques to prove that the value of the game exists and is the unique viscosity solution of a certain nonlinear partial differential equation of HJBI type. We find this approach to be rather amenable, as we are able to rely on some classical and seminal results, extending only those for which the influence of the random planning horizon, or its distribution, is of relevance. We should remark that the approach developed by Fleming and Souganidis in [Reference Fleming and Souganidis17] relies on an asymmetric formulation of the game under consideration. Indeed, when employing this approach, two subgames are defined, with one player having an information advantage in one of the subgames and the remaining player having a similar advantage in the second subgame. The stronger player uses Elliott–Kalton strategies while the weaker player resorts to open-loop controls. Other examples where a dynamic programming principle is proved by resorting to asymmetric game formulations include, for instance, the papers by Katsoulakis [Reference Katsoulakis25] and Cardaliaguet and Rainer [Reference Cardaliaguet and Rainer11]. A very interesting alternative approach, recently introduced by Sîrbu [Reference Sîrbu38], building on previous related work by Bayraktar and Sîrbu [Reference Bayraktar and Sîrbu1, Reference Bayraktar and Sîrbu2, Reference Bayraktar and Sîrbu3], uses the stochastic Perron's method to show that the values of stochastic differential games formulated symmetrically over appropriately specified elementary feedback strategies are the unique continuous viscosity solutions of the corresponding HJBI equation.
Moreover, using such techniques, a dynamic programming principle can be shown to hold over stopping rules, i.e. stopping times for which the decision to stop is based solely on observation of the state variable, but not over general stopping times on the physical probability space.

This paper is organized as follows. In Section 2, we describe the problem we propose to address and state our main results. Section 3 is concerned with the characterization of the value functions of an auxiliary two-player zero-sum discounted SDG with a deterministic time horizon. We prove our main result in Section 4 and conclude in Section 5.

2. Framework and main results

In this section we formulate the problem under consideration and state our main result.

2.1. Notation and set-up

Let $T>0$ be a deterministic finite time horizon and, for every $t \in [0,T]$ and $s\in [t,T] $, let $\Omega^{\omega}_{t,s}$ be the set of $\mathbb{R}^M$-valued continuous functions on [t, s] taking the value 0 at t, that is,

\begin{equation*}\Omega^{\omega}_{t,s} = \{ \omega \in C ([t,s];\,\mathbb{R}^M ) \colon \omega(t)=0 \} .\end{equation*}

Let $\mathcal{G}_{t,u}^{\omega}$ be the $\sigma$-algebra generated by paths $\omega \in \Omega^{\omega}_{t,s}$ up to some time $u\in [t,s] $ with $\mathbb{G}^{\omega}_{t,s} = \{ \mathcal{G}_{t,u}^{\omega} \colon u \in [t,s]\} $ being the corresponding filtration. When endowed with the Wiener measure $\mathbb{P}^{\omega}_{t,s}$ on $\mathcal{G}_{t,s}^{\omega}$, $\Omega^{\omega}_{t,s}$ becomes a classical Wiener space. Let $B^t= \{ B^t(s) \colon s\in [t,T], B^t(t)=0 \} $ be a Brownian motion on the filtered probability space $ (\Omega^{\omega}_{t,T}, \mathcal{G}_{t,T}^{\omega},\mathbb{G}^{\omega}_{t,T}, \mathbb{P}^{\omega}_{t,T}) $.

Let us introduce the following technical assumptions.

  1. (A1) U and V are compact metric spaces.

  2. (A2) The maps $f \colon [0,T]\times\mathbb{R}^N\times U\times V \rightarrow \mathbb{R}^N $, $\sigma \colon [0,T]\times\mathbb{R}^N\times U\times V \rightarrow \mathbb{R}^{N \times M}$, $\Psi \colon [0,T] \times \mathbb{R}^N \rightarrow \mathbb{R}$, and $L \colon [0,T] \times \mathbb{R}^N \times U \times V \rightarrow \mathbb{R}$ are bounded, uniformly continuous with respect to all their variables, and Lipschitz-continuous with respect to $ (t,x) \in [0,T]\times {\mathbb R}^N$ uniformly in $ (u,v) \in U \times V$.

  3. (A3) $\tau$ is an (absolutely) continuous random variable (with respect to the Lebesgue measure on ${\mathbb R}^+=(0,+\infty)$) defined on a probability space $ (\Omega^\tau, \mathcal{G}^\tau,\mathbb{P}^\tau)$ and has a positive, bounded, and Lipschitz-continuous probability density function defined on ${\mathbb R}^+$.

  4. (A4) For each $t\in [0,T]$, the random variable $\tau$ is independent of the filtration $\mathbb{G}_{t,T}^\omega$ generated by the Brownian motion $B^t({\cdot})$.

For $\skew3\hat{t} \in (t,T)$ and $\omega \in \Omega^\omega_{t,T}$, let

\begin{equation*}\omega^{t,\skew3\hat{t}} = \omega|_{[t,\skew3\hat{t}]} \quad\text{and}\quad \omega^{\skew3\hat{t},T} = \omega|_{[\skew3\hat{t},T]} - \omega(\skew3\hat{t}\kern1.5pt),\end{equation*}

and define $\pi \colon \Omega^\omega_{t,T} \rightarrow \Omega^\omega_{t,\skew3\hat{t}} \times \Omega^\omega_{\skew3\hat{t},T}$ to be the map given by

\begin{equation*} \pi (\omega) = (\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T}) .\end{equation*}

Then, $\pi$ induces the identification

\begin{equation*}\Omega^\omega_{t,T}= \Omega^\omega_{t,\skew3\hat{t}} \times \Omega^\omega_{\skew3\hat{t},T} {,}\end{equation*}

and the inverse of $\pi$ acts on pairs of paths $ (\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T})\in \Omega^\omega_{t,\skew3\hat{t}} \times \Omega^\omega_{\skew3\hat{t},T}$ by concatenation, i.e. $\omega = \pi^{-1}(\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T})\in\Omega^{\omega}_{t,T}$. Finally, note that ${\mathbb{P}}^\omega_{t,T} = {\mathbb{P}}^\omega_{t,\skew3\hat{t}} \otimes {\mathbb{P}}^\omega_{\skew3\hat{t},T}$, where ${\mathbb{P}}^\omega_{t,\skew3\hat{t}}$ and ${\mathbb{P}}^{\omega}_{\skew3\hat{t},T}$ are the Wiener measures on $\Omega^\omega_{t,\skew3\hat{t}}$ and $\Omega^\omega_{\skew3\hat{t},T}$, respectively.

For each $t\in[0,T]$, the probability measure ${\mathbb{P}}^\tau$ of the random variable $\tau$ induces a conditional probability measure on $\Omega_t^\tau=(t,\infty)$ determined by

\begin{equation*}\mathbb{P}_t^\tau(\tau\in A) = \mathbb{P}^\tau(\tau\in A \mid \tau > t ), \quad A\in\mathcal{G}^\tau_{t},\end{equation*}

where $\mathcal{G}^\tau_{t}={\mathcal B}(\Omega_t^\tau)$ denotes the Borel $\sigma$-algebra of $\Omega_t^\tau$. Resorting to assumption (A4) regarding independence between the Brownian motion $B^t$ and the random variable $\tau$, we define $\Omega_{t}$ as the direct product

\begin{equation*}\Omega_t = \Omega_{t,T}^\omega\times\Omega_t^\tau,\end{equation*}

defining accordingly the probability measure

(1) \begin{equation}\mathbb{P}_t={\mathbb{P}}^\omega_{t,T}\otimes {\mathbb{P}}^\tau_t,\end{equation}

the $\sigma$-algebras $\mathcal{G}_{t,s}$, $s\in[t,T]$, as the completion of $\mathcal{G}^\omega_{t,s}\otimes\mathcal{G}^\tau_{t}$ with respect to the measure $\mathbb{P}_t$, and the filtration $\mathbb{G}_{t,T}$ as $\mathbb{G}_{t,T}= \{\mathcal{G}_{t,s} \colon s\in [t,T] \}$.

2.2. A stochastic differential game with a random horizon

Let us define the random horizon $\xi$ as

(2) \begin{equation}\xi = \min\{\tau,T\} {,}\end{equation}

and note that $\xi$ takes values in the interval [0, T] ${\mathbb{P}}_t$-a.s. (almost surely). The two-player zero-sum stochastic differential game with random horizon is defined on the filtered probability space $ (\Omega_t, \mathcal{G}_{t,T}, \mathbb{G}_{t,T},\mathbb{P}_t)$ and consists of the controlled stochastic differential equation

(3) \begin{equation} \begin{aligned}\mathrm{d} X(s) & = f (s, X(s), u(s),v(s)) \, \mathrm{d} s + \sigma (s, X(s),u(s),v(s))\, \mathrm{d} B^t (s), \quad s \geq t {,} \\*X(t) &= x\end{aligned}\end{equation}

and payoff functional

(4) \begin{equation} J(t, x;\, u({\cdot}),v({\cdot})) = {{\mathbb E}}_{{\mathbb{P}}_t}\bigg[ \int_t^{\xi} L (s, X_{t,x}^{u,v}(s), u(s),v(s)) \, \mathrm{d} s + \Psi(\xi, X_{t,x}^{u,v}(\xi)) \bigg],\end{equation}

where $X_{t,x}^{u,v}(s)$, $s\in[t,T]$, denotes the solution of the initial value problem (3) associated with a specific choice of $u({\cdot}),v({\cdot})$. We will refer to the functions L and $\Psi$ determining the payoff functional J as the running payoff and terminal payoff, respectively. Thus, as far as the game is concerned, the payoff functional (4) represents some payoff that a first player (Player I) is trying to minimize (and thus a second player (Player II) seeks to maximize) subject to the state variable dynamics defined by (3) and some constraints of the form $u(s) \in U$ and $v(s) \in V$ for every appropriately defined instant of time $s\geq t$.
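Although the analysis below is purely analytic, the payoff functional (4) is straightforward to approximate numerically. The following sketch is illustrative only: it assumes a scalar state, constant controls, the hypothetical choices $f=u-v$, $\sigma=0.2$, $L=x^2$, $\Psi(s,x)=x$, and an exponentially distributed $\tau$, and estimates $J$ by Euler–Maruyama simulation over the random horizon $\xi=\min\{\tau,T\}$:

```python
import math
import random

def estimate_payoff(t=0.0, x=1.0, T=1.0, lam=0.5, u=0.3, v=0.1,
                    n_paths=2000, n_steps=200, seed=0):
    """Monte Carlo estimate of J(t, x; u, v) in (4) for the illustrative
    choices f = u - v, sigma = 0.2, L(s, x, u, v) = x^2, Psi(s, x) = x,
    constant controls, and tau ~ Exp(lam) independent of the Brownian motion."""
    rng = random.Random(seed)
    sigma = 0.2
    dt = (T - t) / n_steps
    total = 0.0
    for _ in range(n_paths):
        xi = min(t + rng.expovariate(lam), T)  # random horizon xi = min(tau, T)
        n = max(1, math.ceil((xi - t) / dt))   # Euler-Maruyama grid on [t, xi]
        h = (xi - t) / n
        X = x
        running = 0.0
        for _ in range(n):
            running += X ** 2 * h                            # running payoff L = X^2
            X += (u - v) * h + sigma * rng.gauss(0.0, math.sqrt(h))
        total += running + X                                 # terminal payoff Psi = X(xi)
    return total / n_paths

J = estimate_payoff()
print(J)
```

Sampling $\tau$ independently of the Gaussian increments mirrors the independence assumption (A4); the averaging over $\tau$ is precisely the step that the analytical transformation to a deterministic horizon performs in closed form.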

An admissible control process $u({\cdot})$ (resp. $v({\cdot})$) for Player I (resp. II) on [t, T] is a $\mathbb{G}^{\omega}_{t,T}$-progressively measurable process taking values in U (resp. V). The set of all admissible controls for Player I (resp. II) on [t, T] is denoted by $\mathcal{U}(t,T)$ (resp. $\mathcal{V}(t,T)$). We say that two controls $u_1({\cdot}), u_2({\cdot}) \in \mathcal{U}(t,T)$ are the same on [t, s], for some $s\in[t,T]$, and denote it by $u_1({\cdot}) \approx u_2({\cdot})$, if $\mathbb{P}^{\omega}_{t,T} \{u_1({\cdot}) = u_2({\cdot}) \ \text{{a.e. (almost everywhere)}} \ \text{in} \ [t,s]\} = 1$. A similar convention is used for elements of $\mathcal{V}(t,T)$.

An admissible strategy $\alpha$ (resp. $\beta$) for Player I (resp. II) on [t, T] is a mapping $\alpha \colon \mathcal{V}(t,T) \rightarrow \mathcal{U}(t,T)$ (resp. $\beta \colon \mathcal{U}(t,T) \rightarrow \mathcal{V}(t,T)$) such that if $v({\cdot}) \approx \tilde{v}({\cdot})$ (resp. $u({\cdot}) \approx \tilde{u}({\cdot})$) on [t, s] for every $s \in [t,T]$, then $\alpha [v({\cdot})] \approx \alpha [\tilde{v}({\cdot})]$ (resp. $\beta[u({\cdot})] \approx \beta [\tilde{u}({\cdot})]$). The set of all admissible strategies for Player I (resp. II) on [t, T] is denoted by $\mathcal{A}(t,T)$ (resp. $\mathcal{B}(t,T)$).

Let $ (t,x)\in[0,T]\times{\mathbb R}^N$. The lower value function of the stochastic differential game (SDG) with random horizon (3)–(4) is given by

(5) \begin{equation}V^-(t,x)=\inf_{\beta \in \mathcal{B}(t,T)} \sup_{u({\cdot}) \in \mathcal{U}(t,T)} J(t, x;\, u({\cdot}),\beta[u({\cdot})]) {,}\end{equation}

while the corresponding upper value function is

(6) \begin{equation}V^+(t,x)=\sup_{\alpha \in \mathcal{A}(t,T)} \inf_{v({\cdot}) \in \mathcal{V}(t,T)} J(t, x;\, \alpha[v({\cdot})], v({\cdot})) .\end{equation}

We say that the SDG with random horizon (3)–(4) has a value if $V^+(t,x)=V^-(t,x)$, in which case we call this common quantity the value of the game. We also note that this definition is consistent with the standard notion of the value of the game introduced by Elliott and Kalton [Reference Elliott and Kalton15] for differential games with a deterministic horizon.

When choosing the controls at time t, the player who moves first (the maximizing player for the lower game, and the minimizing player for the upper game) is allowed to use the past of the Brownian motion $B^t({\cdot})$ driving (3), while the player with the advantage (Player II for the lower game, Player I for the upper game) is allowed to use both the past of $B^t({\cdot})$ and the other player's control.

2.3. Statement of main results

For all $0 \leq t \leq s$, we let $G^+(s,t)$ and $G^-(s,t)$ denote the conditional probabilities

(7) \begin{equation} \begin{aligned} G^+(s,t) &= \mathbb{P}_t^\tau (\tau > s) = \mathbb{P}^\tau (\tau > s \mid \tau > t) {,} \\ G^-(s,t) &= \mathbb{P}_t^\tau (\tau \leq s) = \mathbb{P}^\tau (\tau \leq s \mid \tau > t) . \end{aligned} \end{equation}

Moreover, note that, for each fixed $t\in[0,T]$, $G^-(s,t)$ is the probability distribution function of a continuous random variable; we let $g^-(s,t)$ denote the corresponding conditional density function

(8) \begin{equation}g^-(s,t) = \dfrac{\mathrm{d}}{\mathrm{d} s} G^-(s,t) .\end{equation}
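As a concrete illustration, suppose for a moment that $\tau$ is exponentially distributed with rate $\lambda>0$ (a particular case of (A3)). The memoryless property then gives

\begin{equation*}G^+(s,t) = {\mathrm e}^{-\lambda(s-t)}, \qquad G^-(s,t) = 1-{\mathrm e}^{-\lambda(s-t)}, \qquad g^-(s,t) = \lambda {\mathrm e}^{-\lambda(s-t)},\end{equation*}

so that $g^-(t,t)=\lambda$ and the non-standard terms appearing in Theorem 1 below reduce to classical discounting at the constant rate $\lambda$.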

In general, the value functions $V^-$ and $V^+$ defined by the variational identities (5) and (6) are not smooth. Nevertheless, they can still be characterized using the language of partial differential equations, relying on the notion of viscosity solutions originally proposed by Crandall and Lions in [Reference Crandall and Lions12] for the case of first-order Hamilton–Jacobi equations. See Appendix A for the definition of viscosity solution used herein. A central step of the proof of our main result is the introduction of an auxiliary SDG, with a non-constant discount rate related to the conditional probabilities (7) and a deterministic time horizon. Indeed, the lower and upper value functions $V^-$ and $V^+$ can be characterized in terms of the value functions of said auxiliary game.

Theorem 1. Assume that (A1)–(A4) hold. Then the lower and upper value functions $V^-$ and $V^+$ are, respectively, the unique viscosity solutions of the Hamilton–Jacobi–Bellman–Isaacs equations

(9) \begin{equation}W_t - g^-(t,t)W + \mathcal{H}^{-}(t,x,W_x,W_{xx}) = 0{,} \quad W(T,x) = \Psi(T,x)\end{equation}

and

(10) \begin{equation}W_t - g^-(t,t)W + \mathcal{H}^{+}(t,x,W_x,W_{xx}) = 0 {,} \quad W(T,x) = \Psi(T,x),\end{equation}

where, for $A \in \mathbb{S}^N$ (the set of symmetric $N \times N$ matrices), $p,x \in {\mathbb R}^N$, and $t\in[0,T]$, we have

\begin{align*}\mathcal{H}^-(t,x,p,A)&= \max_{u \in U}\; \min_{v \in V}\; H(t,x,u,v,p,A), \\*\mathcal{H}^+(t,x,p,A)&= \min_{v \in V}\; \max_{u \in U}\; H(t,x,u,v,p,A){,}\end{align*}

and

\begin{align*}H(t,x,u,v,p,A) &= \mathrm{tr}\,\Big(\dfrac{1}{2}a(t,x,u,v)A\Big)+f(t,x,u,v)p+ L(t,x,u,v)+g^-(t,t) \Psi(t,x) {,}\end{align*}

with $a=\sigma \sigma'$, $\sigma'$ denoting the transpose of $\sigma$.

Note the presence of the additional (non-standard) terms $-g^-(t,t)W(t,x)$ and $g^-(t,t)\Psi(t,x)$, related to the conditional probabilities (7)–(8), in the HJBI equations (9) and (10). Such terms reflect the randomness of the planning horizon: the agents plan their actions as if the horizon were fixed at T, but with a subjective rate of time preference determined by the conditional probabilities (7).

We say that the Isaacs condition holds if, for all $ (t,x,p,A) \in [0,T]\times {\mathbb R}^N \times {\mathbb R}^N \times \mathbb{S}^N$, the following relation holds:

(11) \begin{equation}\mathcal{H}^+ (t,x,p,A) = \mathcal{H}^- (t,x,p,A) .\end{equation}

The next result is then a consequence of combining the Isaacs condition above with the uniqueness of the viscosity solutions of (9) and (10) guaranteed by Theorem 1. In particular, it ensures the existence of a value for the SDG with random horizon (3)–(4) in the sense of Elliott and Kalton [Reference Elliott and Kalton15].

Corollary 1. If the Isaacs condition (11) holds, then the upper and the lower value functions of the SDG with a random horizon (3)–(4) coincide.
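As a standard sufficient condition (not specific to the random-horizon setting), the Isaacs condition holds whenever the controls enter the dynamics and the running payoff in a separated, additive way: if $\sigma=\sigma(t,x)$, $f(t,x,u,v)=f_1(t,x,u)+f_2(t,x,v)$, and $L(t,x,u,v)=L_1(t,x,u)+L_2(t,x,v)$, then

\begin{equation*}H(t,x,u,v,p,A) = H_0(t,x,p,A) + f_1(t,x,u)p + L_1(t,x,u) + f_2(t,x,v)p + L_2(t,x,v),\end{equation*}

where $H_0$ collects the control-independent terms. The maximization over $u$ and the minimization over $v$ then decouple, so that $\min_{v\in V}\max_{u\in U} H = \max_{u\in U}\min_{v\in V} H$ and (11) holds.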

The rest of the paper is devoted to the proof of Theorem 1. The analysis, in Section 3, of a related discounted SDG plays a central role in this endeavor.

3. A discounted stochastic differential game with deterministic horizon

We now momentarily divert our attention to the following problem: an SDG with deterministic horizon $T>0$ and a non-constant discount factor specified by a function $\Theta \colon D(\Theta)\to{\mathbb R}$, where $D(\Theta)=\{(s,t)\in[0,T]^2 \colon s\geq t\}$. Suppose that the following conditions hold.

  1. (D1) $\Theta$ is positive, bounded, and continuously differentiable on $D(\Theta)$.

  2. (D2) For every $ (s,t),(\hat{s},s)\in D(\Theta)$, we have that

    \begin{equation*}\Theta(\hat{s},t)=\Theta(\hat{s},s)\Theta(s,t) .\end{equation*}
  3. (D3) The derivative

    \begin{equation*}\theta(t) = -\dfrac{\mathrm{d}}{\mathrm{d} s}\Theta(s,t)\Big|_{s=t}\end{equation*}
    is positive, bounded, and Lipschitz-continuous on [0, T].
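A canonical family satisfying (D1)–(D3) is obtained by integrating a discount rate: for any positive, bounded, and Lipschitz-continuous function $\theta$ on [0, T], set

\begin{equation*}\Theta(s,t) = \exp\bigg( {-}\int_t^s \theta(r) \, \mathrm{d} r \bigg), \qquad (s,t)\in D(\Theta),\end{equation*}

for which (D2) reduces to the additivity of the integral over adjacent intervals. The instance relevant to the random-horizon game is $\Theta(s,t)=G^+(s,t)$: the conditional survival function of (7) satisfies $G^+(\hat{s},t)=G^+(\hat{s},s)G^+(s,t)$ and corresponds to the rate $\theta(r)=g^-(r,r)$, the hazard rate of $\tau$.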

The two-player zero-sum discounted stochastic differential game is defined on the filtered probability space $ (\Omega_{t,T}^\omega, \mathcal{G}_{t,T}^\omega, \mathbb{G}_{t,T}^\omega,\mathbb{P}_{t,T}^\omega)$ and consists of the controlled stochastic differential equation (3) and the discounted payoff functional

(12) \begin{align}{\mathcal J}(t, x;\, u({\cdot}),v({\cdot})) = {{\mathbb E}}_{{\mathbb{P}}^\omega_{t,T}}\bigg[ \int_t^{T} \Theta(s,t)L (s, X_{t,x}^{u,v}(s), u(s),v(s)) \, \mathrm{d} s + \Theta(T,t)\Psi(T, X_{t,x}^{u,v}(T)) \bigg],\end{align}

where $X_{t,x}^{u,v}(s)$, $s\in[t,T]$, denotes the solution of the initial value problem (3) associated with a specific choice of admissible controls $ (u({\cdot}),v({\cdot})) \in {\mathcal U}(t,T)\times{\mathcal V}(t,T)$.

The lower value function of the discounted SDG determined by (3) and (12) is given by

(13) \begin{equation}W^-(t,x):=\inf_{\beta \in \mathcal{B}(t,T)} \sup_{u({\cdot}) \in \mathcal{U}(t,T)} {\mathcal J}(t, x;\, u({\cdot}),\beta[u({\cdot})]),\end{equation}

while the upper value function of the discounted SDG determined by (3) and (12) is

(14) \begin{equation}W^+(t,x):=\sup_{\alpha \in \mathcal{A}(t,T)} \inf_{v({\cdot}) \in \mathcal{V}(t,T)} {\mathcal J}(t, x;\, \alpha[v({\cdot})], v({\cdot})) .\end{equation}

The goal of this section is to characterize the lower and upper value functions $W^-$ and $W^+$ of the discounted SDG determined by (3) and (12) as, respectively, the unique viscosity solutions of the Hamilton–Jacobi–Bellman–Isaacs equations

(15) \begin{equation}W_t - \theta(t)W + \mathcal{H}^{-}(t,x,W_x,W_{xx}) = 0{,} \quad W(T,x) = \Psi(T,x)\end{equation}

and

(16) \begin{equation}W_t - \theta(t)W + \mathcal{H}^{+}(t,x,W_x,W_{xx}) = 0 {,}\quad W(T,x) = \Psi(T,x),\end{equation}

where, for $A \in \mathbb{S}^N$ (the set of symmetric $N \times N$ matrices), $p,x \in {\mathbb R}^N$, and $t\in[0,T]$, we have

\begin{align*}\mathcal{H}^-(t,x,p,A)&= \max_{u \in U}\; \min_{v \in V}\; H(t,x,u,v,p,A), \\*\mathcal{H}^+(t,x,p,A)&= \min_{v \in V}\; \max_{u \in U}\; H(t,x,u,v,p,A){,}\end{align*}

and

\begin{equation*}H(t,x,u,v,p,A) = \mathrm{tr}\,\Big(\dfrac{1}{2}a(t,x,u,v)A\Big)+f(t,x,u,v)p + L(t,x,u,v)\end{equation*}

with $a=\sigma \sigma'$, $\sigma'$ denoting the transpose of $\sigma$.

We will resort to the concepts of r-strategies and r-lower and r-upper values introduced by Fleming and Souganidis [Reference Fleming and Souganidis17], which, combined with an appropriate discretization procedure, yield the existence and uniqueness of viscosity solutions to the HJBI equations (15) and (16).

3.1. Some preliminary results

Before proceeding, we need to introduce further notation and terminology that will be useful below. Let $ (t,x)\in[0,T]\times{\mathbb R}^N$ be fixed and, for any given $u({\cdot}) \in \mathcal{U}(t,T)$ and $v({\cdot}) \in \mathcal{V}(t,T)$, define

\begin{equation*}\gamma(s,\omega) = (u(s,\omega),v(s,\omega))\end{equation*}

for every $s\geq t$ and $\omega\in\Omega_{t,T}^\omega$. By the definition of the control processes $u({\cdot}) \in \mathcal{U}(t,T)$ and $v({\cdot}) \in \mathcal{V}(t,T)$, it immediately follows that $\gamma({\cdot})$ is $\mathbb{G}_{t,T}^\omega$-progressively measurable. Moreover, by standard results from the theory of stochastic differential equations (see e.g. [Reference Karatzas and Shreve24, Reference Mao28] for further details), the SDE (3) admits a unique solution $X_{t,x}^{u,v}({\cdot})$ on the filtered probability space $ (\Omega_{t,T}^\omega,\mathcal{G}_{t,T}^\omega,\mathbb{G}_{t,T}^\omega,\mathbb{P}_{t,T}^\omega)$ for any fixed $u({\cdot})\in \mathcal{U}(t,T)$ and $v({\cdot})\in \mathcal{V}(t,T)$. In addition, $X_{t,x}^{u,v}({\cdot})$ satisfies

(17) \begin{equation}X^{u,v}_{t,x}(s) = X^{u,v}_{t,x}(\skew3\hat{t}\kern1.5pt) + \int_{\skew3\hat{t}}^s f(r,X_{t,x}^{u,v}(r),{\gamma}(r)) \, \mathrm{d} r + \int_{\skew3\hat{t}}^s \sigma(r,X_{t,x}^{u,v}(r),{\gamma}(r)) \, \mathrm{d} B^t(r),\end{equation}

where $t\leq \skew3\hat{t}\leq s\leq T$. Further, noting that

(18) \begin{equation}B^t(s,\pi^{-1}(\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T})) - B^t(\skew3\hat{t},\pi^{-1}(\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T})) = \omega^{\skew3\hat{t},T}(s),\end{equation}

we obtain that for ${\mathbb{P}}^\omega_{t,\skew3\hat{t}}$-a.e. $\omega^{t,\skew3\hat{t}}\in\Omega^\omega_{t,\skew3\hat{t}}$ the left-hand side of (18) coincides with the standard Brownian motion $B^{\skew3\hat{t}}(s,\omega^{\skew3\hat{t},T})$ on the filtered probability space $ (\Omega^\omega_{\skew3\hat{t},T},{\mathcal G}^\omega_{\skew3\hat{t},T},\mathbb{G}^\omega_{\skew3\hat{t},T},{\mathbb{P}}^\omega_{\skew3\hat{t},T})$.
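The splitting of a path at the intermediate time $\skew3\hat{t}$ and the increment identity (18) can be illustrated numerically. The following is a discrete-time sketch (the grid size, seed, and split point are arbitrary choices): it simulates one Brownian path on a uniform grid, splits it at $\skew3\hat{t}$ into a restriction and a re-based second piece, and checks that the second piece starts at 0 and that concatenation recovers the original path.

```python
import math
import random

# Simulate a Brownian path on [t, T] on a uniform grid, split it at an
# intermediate time t_hat, and check the identity (18):
# omega^{t_hat,T}(s) = B(s) - B(t_hat) is a path starting at 0 at t_hat.
rng = random.Random(42)
t, t_hat, T, n = 0.0, 0.4, 1.0, 1000
dt = (T - t) / n
grid = [t + k * dt for k in range(n + 1)]

# Build the path omega (one sample of B^t) by summing Gaussian increments.
omega = [0.0]
for _ in range(n):
    omega.append(omega[-1] + rng.gauss(0.0, math.sqrt(dt)))

k_hat = min(range(n + 1), key=lambda k: abs(grid[k] - t_hat))  # index of t_hat
omega_1 = omega[: k_hat + 1]                          # omega^{t, t_hat}
omega_2 = [w - omega[k_hat] for w in omega[k_hat:]]   # omega^{t_hat, T}

# The second piece starts at 0 at t_hat, and concatenation recovers omega.
assert omega_2[0] == 0.0
recovered = omega_1 + [omega[k_hat] + w for w in omega_2[1:]]
assert all(abs(a - b) < 1e-12 for a, b in zip(recovered, omega))
print("splitting and concatenation consistent")
```

The re-basing step is the discrete analogue of requiring $\omega^{\skew3\hat{t},T}\in\Omega^\omega_{\skew3\hat{t},T}$, i.e. that the second piece takes the value 0 at $\skew3\hat{t}$.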

Also define

\begin{align*}\tilde{\gamma}(s,\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T}) &= \gamma (s,\pi^{-1}(\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T})), \\ \tilde{X}(s,\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T}) &= X_{t,x}^{u,v} (s,\pi^{-1}(\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T})){,}\end{align*}

and note that the relation

\begin{align*} \tilde{X}(s,\omega^{t,\skew3\hat{t}},\cdot) &= X^{u,v}_{t,x}(\skew3\hat{t}\kern1.5pt) + \int_{\skew3\hat{t}}^s f(r,\tilde{X}(r,\omega^{t,\skew3\hat{t}},\cdot),\tilde{\gamma}(r,\omega^{t,\skew3\hat{t}},\cdot)) \, \mathrm{d} r \notag \\ &\quad\, + \int_{\skew3\hat{t}}^s \sigma(r,\tilde{X}(r,\omega^{t,\skew3\hat{t}},\cdot),\tilde{\gamma}(r,\omega^{t,\skew3\hat{t}},\cdot)) \, \mathrm{d} B^{\skew3\hat{t}}(r)\end{align*}

holds ${\mathbb{P}}^\omega_{t,\skew3\hat{t}}$-a.e. $\omega^{t,\skew3\hat{t}}\in\Omega^\omega_{t,\skew3\hat{t}}$ as a consequence of (17) and the comments following it. Moreover, by uniqueness of solutions of (3), we get that the paths of $\tilde{X}(s,\omega^{t,\skew3\hat{t}},\cdot)$, $s\in[\skew3\hat{t},T]$, coincide with those of (3) with initial condition $ (\skew3\hat{t},X^{u,v}_{t,x}(\skew3\hat{t}\kern1.5pt))$ and controls $ (u(\cdot,\omega^{t,\skew3\hat{t}}),v(\cdot,\omega^{t,\skew3\hat{t}}))$ for ${\mathbb{P}}^\omega_{t,\skew3\hat{t}}$-a.e. $\omega^{t,\skew3\hat{t}}\in\Omega^\omega_{t,\skew3\hat{t}}$. From this point onwards we will also use the notation $X^{u,v}_{t,x}({\cdot})$ to refer to the stochastic process $\tilde{X}({\cdot})$ on the filtered probability space $ (\Omega^\omega_{\skew3\hat{t},T},{\mathcal G}^\omega_{\skew3\hat{t},T},\mathbb{G}^\omega_{\skew3\hat{t},T},{\mathbb{P}}^\omega_{\skew3\hat{t},T})$.

The comments above, together with the fact that

\begin{equation*} {\mathbb E}_{{\mathbb{P}}^\omega_{t,\skew3\hat{t}} \otimes {\mathbb{P}}^\omega_{\skew3\hat{t},T}} [\phi (\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T}) \mid {\mathcal G}_{t,\skew3\hat{t}}^{\omega}] = {\mathbb E}_{{\mathbb{P}}^\omega_{\skew3\hat{t},T}} [\phi (\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T}) ] \quad \text{${\mathbb{P}}^\omega_{t,\skew3\hat{t}}$-a.s.}\end{equation*}

for any bounded and measurable function $\phi \colon \Omega_{t,T}^\omega\to{\mathbb R}$, yield the following technical lemma.

Lemma 1. Suppose that (A1)–(A2) hold and let $X^{u,v}_{t,x}({\cdot})$ denote the solution of (3) with initial condition $ (t,x)\in[0,T]\times{\mathbb R}^N$ and controls $ (u({\cdot}),v({\cdot}))\in{\mathcal U}(t,T)\times{\mathcal V}(t,T)$. For any bounded continuous function $\phi$ and any deterministic $s \in [\skew3\hat{t},T]$,

\[{\mathbb E}_{{\mathbb{P}}^\omega_{t,T}} [\phi (X_{t,x}^{u,v}(s),\gamma (s,\omega ))\mid {\mathcal G}_{t,\skew3\hat{t}}^{\omega}] = {\mathbb E}_{{\mathbb{P}}^\omega_{\skew3\hat{t},T}} [\phi (X_{\skew3\hat{t},X_{t,x}^{u,v}(\skew3\hat{t}\kern1.5pt)}^{u,v}(s),\tilde{\gamma}(s,\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T}))] \quad \text{${\mathbb{P}}^\omega_{t,\skew3\hat{t}}$-a.s.}\]

The next lemma ensures boundedness of the value functions $W^-$ and $W^+$ introduced in (13) and (14), as well as their Lipschitz continuity with respect to x and Hölder continuity with respect to t.

Lemma 2. Suppose that (A1)–(A2) and (D1)–(D2) hold. We have the following.

  (i) For every $u({\cdot}) \in \mathcal{U}(t,T)$, $v({\cdot}) \in \mathcal{V}(t,T)$, $\alpha \in \mathcal{A}(t,T)$, and $\beta \in \mathcal{B}(t,T)$, the discounted payoff functionals

    \begin{equation*} x \to {\mathcal J}(t,x;\, \alpha[v({\cdot})],v({\cdot}))\quad \text{and} \quad x \to {\mathcal J}(t,x;\,u({\cdot}),\beta[u({\cdot})])\end{equation*}
    are bounded and Lipschitz-continuous in x, uniformly in t, $\alpha$, $v({\cdot})$ and t, $\beta$, $u({\cdot})$, respectively.
  (ii) The discounted SDG value functions $W^-$ and $W^+$ in (13) and (14) are bounded and Lipschitz-continuous in x, uniformly in t.

Proof. Let us start by proving item (i) for the discounted payoff functional

(19) \begin{equation}x \to {\mathcal J}(t,x;\, \alpha[v({\cdot})],v({\cdot})),\end{equation}

where $t\in[0,T]$, $v({\cdot}) \in \mathcal{V}(t,T)$ and $\alpha \in \mathcal{A}(t,T)$. The proof for $x \to {\mathcal J}(t,x;\,u({\cdot}),\beta[u({\cdot})])$, with $t\in[0,T]$, $u({\cdot}) \in \mathcal{U}(t,T)$ and $\beta \in \mathcal{B}(t,T)$, is similar.

Boundedness of (19) follows from boundedness of L and $\Psi$, guaranteed by assumption (A2), as well as boundedness of the non-constant discount factor $\Theta$, guaranteed by assumption (D1).

As for Lipschitz continuity of (19), this will follow from

(20) \begin{equation}{\mathbb E}_{{\mathbb{P}}^\omega_{t,T}}\big[|X^{u,v}_{t,x}(s)-X^{u,v}_{t,y}(s)|\big] \leq C |x-y| \quad \text{for all} \ x,y \in {\mathbb R}^N,\end{equation}

where $X^{u,v}_{t,x}(s)$ and $X^{u,v}_{t,y}(s)$, $s \in [t,T]$, are the solutions of (3) starting at t from x and y, respectively, with the same control pair (u, v). To see that (20) holds, set $Z(s)=X^{u,v}_{t,x}(s)-X^{u,v}_{t,y}(s)$, $s \in [t,T]$. From Itô’s formula, we obtain

\begin{align*}{\mathbb E}_{{\mathbb{P}}^\omega_{t,T}}\big[|X^{u,v}_{t,x}(s)-X^{u,v}_{t,y}(s)|^2\big] & = |x-y|^2+ {\mathbb E}_{{\mathbb{P}}^\omega_{t,T}}\bigg[\int_t^s \big[2Z(r)\cdot f_1(r, X^{u,v}_{t,x}(r),X^{u,v}_{t,y}(r),u(r),v(r)) \notag \\* &\quad\, + \mathrm{tr}\,(\sigma_1 \sigma_1^T)(r, X^{u,v}_{t,x}(r),X^{u,v}_{t,y}(r),u(r),v(r)) \big] {\mathrm{d}} r \bigg] \notag,\end{align*}

where $f_1$ and $\sigma_1$ are defined by

\begin{align*}f_1(t, x,y,u,v) &= f(t, x,u,v)-f(t, y,u,v) {,} \\*\sigma_1(t, x,y,u,v) &= \sigma(t, x,u,v)-\sigma(t, y,u,v)\end{align*}

for $t \in [0,T]$, $x,y \in {\mathbb R}^N$, $u \in U$, and $v \in V$. Using assumption (A2) and the Fubini–Tonelli theorem, we obtain that there exists a positive constant $C_1$ such that

\begin{equation*}{\mathbb E}_{{\mathbb{P}}^\omega_{t,T}}\big[|X^{u,v}_{t,x}(s)-X^{u,v}_{t,y}(s)|^2\big] \leq |x-y|^2 + C_1 \int_t^s {\mathbb E}_{{\mathbb{P}}^\omega_{t,T}}\big[|X^{u,v}_{t,x}(r)-X^{u,v}_{t,y}(r)|^2\big] \, \mathrm{d} r .\end{equation*}

Applying Gronwall’s inequality, we get that there exist positive constants $C_2$ and $C_3$ such that the following inequalities hold:

(21) \begin{equation}{\mathbb E}_{{\mathbb{P}}^\omega_{t,T}}\big[|X^{u,v}_{t,x}(s)-X^{u,v}_{t,y}(s)|^2\big] \leq \bigg(1+\int_t^s \,{\mathrm{e}}^{C_2 r} \, \mathrm{d} r\bigg)|x-y|^2 \leq C_3 |x-y|^2 .\end{equation}
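For the reader's convenience, the Gronwall step behind (21) can be written out explicitly; the shorthand $g$ and $a$ below is introduced only for this display:

```latex
\begin{aligned}
g(s) &:= {\mathbb E}_{{\mathbb{P}}^\omega_{t,T}}\big[|X^{u,v}_{t,x}(s)-X^{u,v}_{t,y}(s)|^2\big], \qquad a := |x-y|^2, \\
g(s) &\leq a + C_1 \int_t^s g(r) \, \mathrm{d} r
\quad \Longrightarrow \quad
g(s) \leq a \bigg(1 + C_1 \int_t^s \mathrm{e}^{C_1 (s-r)} \, \mathrm{d} r\bigg)
= a \, \mathrm{e}^{C_1 (s-t)} \leq \mathrm{e}^{C_1 T} |x-y|^2 .
\end{aligned}
```

In particular, one may take $C_3 = \mathrm{e}^{C_1 T}$ in (21).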

Inequality (20) now follows from combining (21) with Hölder’s inequality, and Lipschitz continuity of (19) with respect to x follows from combining inequality (20) with Lipschitz continuity of L and $\Psi$, as guaranteed by assumption (A2).

As for the proof of item (ii), we note that boundedness of $W^-$ and $W^+$, as well as Lipschitz continuity with respect to x, follows as a consequence of the corresponding uniform properties of the discounted payoff functionals.
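As a purely illustrative numerical sanity check of estimate (20) (not part of the proof), the sketch below simulates two Euler–Maruyama solutions of a one-dimensional SDE started from nearby points and driven by the same Brownian increments; the Lipschitz drift and diffusion used here are toy stand-ins for the coefficients $f$ and $\sigma$ of (3).

```python
import numpy as np

def mean_gap(x0, y0, T=1.0, n_steps=200, n_paths=2000, seed=0):
    """Euler-Maruyama estimate of E|X_{0,x0}(T) - X_{0,y0}(T)| for
    dX = f(X) dt + sigma(X) dB, both solutions driven by the SAME noise."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    f = lambda x: -x + np.sin(x)        # toy Lipschitz drift (assumption)
    sigma = lambda x: 0.5 * np.cos(x)   # toy Lipschitz diffusion (assumption)
    X = np.full(n_paths, float(x0))
    Y = np.full(n_paths, float(y0))
    for _ in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        X = X + f(X) * dt + sigma(X) * dB
        Y = Y + f(Y) * dt + sigma(Y) * dB
    return float(np.mean(np.abs(X - Y)))

# Estimate (20) predicts E|X_x(T) - X_y(T)| <= C|x - y|: scaling the
# initial gap by 100 should scale the mean terminal gap by roughly 100.
gap_small = mean_gap(1.0, 1.0 + 1e-3)
gap_large = mean_gap(1.0, 1.0 + 1e-1)
```

The ratio `gap_large / gap_small` stays close to the ratio of the initial gaps, in line with the linear bound $C|x-y|$.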

In the next section we will introduce a special class of restrictive strategies and the corresponding value functions, following a method originally developed by Fleming and Souganidis [Reference Fleming and Souganidis17]. These will enable us to prove certain sub- and super-optimal dynamic programming principles.

3.2. Sub-optimal and super-optimal dynamic programming principles

As noted by Fleming and Souganidis in their seminal paper [Reference Fleming and Souganidis17], serious measurability issues seem to prevent a direct generalization of the proof of the deterministic dynamic programming principle to the stochastic set-up. To overcome these difficulties, Fleming and Souganidis introduced the concept of restrictive strategies or, more commonly, r-strategies, which we employ here.

Before proceeding to the definition of r-strategies, we note that, by the definition of an admissible control process, for $0\leq \bar{t}\leq t\leq T$, $u({\cdot})\in{\mathcal U}(\bar{t},T)$, and ${\mathbb{P}}^\omega_{\bar{t},t}$-a.e. $\omega^{\bar{t},t}\in\Omega^\omega_{\bar{t},t}$, the map $u(\omega^{\bar{t},t})\colon [t,T]\times\Omega^\omega_{t,T}\to U$ defined via the relation

\begin{equation*}u(\omega^{\bar{t},t})(s,\omega^{t,T}) = u (s,\omega),\end{equation*}

where $\omega=\pi^{-1}(\omega^{\bar{t},t},\omega^{t,T})$, is an admissible control for Player I, i.e. $u(\omega^{\bar{t},t})\in {\mathcal U}(t,T)$.

Given the discounted SDG determined by (3) and (12), we say that an r-strategy $\beta$ for Player II on [t, T] is an admissible strategy with the following additional property: for every $\bar{t} < t < \skew3\hat{t}$ and $u({\cdot}) \in \mathcal{U}(\bar{t},T)$, the map $ (s,\omega) \rightarrow \beta[u(\omega^{\bar{t},t})({\cdot})](s,\omega^{t,T})$ is $ ({\mathcal B}([t,\skew3\hat{t}]) \otimes {\mathcal G}_{t,\skew3\hat{t}}^{\omega},{\mathcal B}(V))$-measurable, where ${\mathcal B}(X)$ stands for the Borel $\sigma$-algebra of a set X. The set of r-strategies for Player II is denoted by $\mathcal{B}_r(t,T)$. We define r-strategies for Player I in a similar fashion and denote the set of these strategies by $\mathcal{A}_r(t,T)$.

The r-lower and r-upper value functions of the discounted SDG determined by (3) and (12) with initial data (t, x) are given by

\begin{equation*}W_r^-(t,x) = \inf_{\beta \in \mathcal{B}_r(t,T)} \sup_{u({\cdot}) \in \mathcal{U}(t,T)} {\mathcal J}(t,x;\, u({\cdot}), \beta[u({\cdot})])\end{equation*}

and

\begin{equation*}W_r^+(t,x) = \sup_{\alpha \in \mathcal{A}_r(t,T)} \inf_{v({\cdot}) \in \mathcal{V}(t,T)} {\mathcal J}(t,x;\, \alpha[v({\cdot})], v({\cdot})) .\end{equation*}

The next result is a consequence of Lemma 2 as well as the definitions of admissible strategies and r-strategies. We skip its proof.

Corollary 2. Suppose that (A1)–(A2) and (D1)–(D2) hold.

  (a) The r-value functions $W_r^-$ and $W_r^+$ of the discounted SDG determined by (3) and (12) are bounded and Lipschitz-continuous in x, uniformly in t.

  (b) For every $ (t,x) \in [0,T] \times {\mathbb R}^N $,

    \begin{equation*}W^-(t,x) \leq W_r^-(t,x) \quad \text{and} \quad W_r^+(t,x) \leq W^+(t,x) .\end{equation*}

Although the r-value functions do not satisfy the full dynamic programming principle, it is nevertheless possible to obtain sub- and super-optimal dynamic programming principles for these functions. This is the content of the next result.

Proposition 1. (Sub-optimal and super-optimal dynamic programming principle.) Suppose that conditions (A1)–(A2) and (D1)–(D2) hold. For any $ (t,x) \in [0,T) \times \mathbb{R}^N$ and every $\skew3\hat{t}\in [t,T)$, we obtain

(22) \begin{align}W_r^-(t,x) &\leq \inf_{\beta\in \mathcal{B}_r(t,T)} \sup_{u({\cdot}) \in \mathcal{U}(t,T)}{\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \Theta(\skew3\hat{t},t) W_r^-(\skew3\hat{t},X_{t,x}^{u,v}(\skew3\hat{t}\kern1.5pt)) \notag \\ & \quad\, + \int_t^{\skew3\hat{t}} \Theta(s,t)L(s,X_{t,x}^{u,v}(s),u(s), \beta[u({\cdot})](s)) \, \mathrm{d} s \bigg], \end{align}

where $X_{t,x}^{u,v}({\cdot})$ is the solution of (3) with $v({\cdot}) = \beta[u({\cdot})]({\cdot})\in {\mathcal V}(t,T)$ for $u({\cdot}) \in {\mathcal U}(t,T)$, and

(23) \begin{align} W_r^+(t,x) &\geq \sup_{\alpha\in \mathcal{A}_r(t,T)} \inf_{v({\cdot}) \in \mathcal{V}(t,T)}{\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \Theta(\skew3\hat{t},t) W_r^+(\skew3\hat{t},X_{t,x}^{u,v}(\skew3\hat{t}\kern1.5pt)) \notag \\ & \quad\, + \int_t^{\skew3\hat{t}} \Theta(s,t)L(s,X_{t,x}^{u,v}(s),\alpha[v({\cdot})](s), v(s)) \, \mathrm{d} s \bigg],\end{align}

where $X_{t,x}^{u,v}({\cdot})$ is the solution of (3) with $u({\cdot}) = \alpha[v({\cdot})]({\cdot})\in {\mathcal U}(t,T)$ for $v({\cdot}) \in {\mathcal V}(t,T)$.

Proof. We only prove inequality (22), with the proof of (23) being analogous. For simplicity of notation, we will drop the superscripts u, v from the solution $X_{t,x}^{u,v}({\cdot})$, with the precise controls used at each instant being clear from the context.

Let $ (t,x)\in[0,T)\times{\mathbb R}^N$ be fixed, let $\skew3\hat{t}\in[t,T)$ be arbitrary, and denote the right-hand side of (22) by $\overline{W}(t,x)$. Note that for any $ \epsilon > 0$ there exists $\beta_\epsilon \in \mathcal{B}_r(t,T)$ such that

(24) \begin{equation}\overline{W}(t,x) \geq {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[ \Theta(\skew3\hat{t},t)W_r^-(\skew3\hat{t},X_{t,x}(\skew3\hat{t}\kern1.5pt)) + \int_t^{\skew3\hat{t}} \Theta(s,t)L(s,X_{t,x}(s),u(s), \beta_\epsilon[u({\cdot})](s)) \, \mathrm{d} s \bigg] -\epsilon\end{equation}

for every $u({\cdot}) \in \mathcal{U}(t,T)$. Moreover, for each $y \in {\mathbb R}^N$, we have

\begin{equation*} W_r^-(\skew3\hat{t},y)=\inf_{\beta \in \mathcal{B}_r(\skew3\hat{t},T)} \sup_{u({\cdot}) \in \mathcal{U}(\skew3\hat{t},T)} {\mathcal J}(\skew3\hat{t},y;\,u({\cdot}),\beta[u({\cdot})]) .\end{equation*}

Hence, there exists $\beta_{y} \in \mathcal{B}_r(\skew3\hat{t},T)$ such that

(25) \begin{equation}W_r^-(\skew3\hat{t},y)\geq\sup_{u({\cdot}) \in \mathcal{U}(\skew3\hat{t},T)}{\mathcal J}(\skew3\hat{t},y;\,u({\cdot}),\beta_{y}[u({\cdot})])-\epsilon .\end{equation}

Let $\{D_i\}_{i\in{\mathbb{N}}}$ be a Borel partition of $\mathbb{R}^N$ with $\mathrm{diam}(D_i)<\delta$ for every $i\in{\mathbb{N}}$, and pick $y_i \in D_i$ for each $i\in{\mathbb{N}}$. By Lemma 2(i) and Corollary 2(a), the constant $\delta>0$ can be chosen sufficiently small that, for any $y \in D_i$,

(26) \begin{equation}|{\mathcal J}(\skew3\hat{t},y;\,u({\cdot}),\beta[u({\cdot})])-{\mathcal J}(\skew3\hat{t},y_i;\,u({\cdot}),\beta[u({\cdot})])| < \epsilon\end{equation}

for every $u({\cdot}) \in \mathcal{U}(\skew3\hat{t},T)$ and $\beta \in \mathcal{B}(\skew3\hat{t},T)$, and also

\begin{equation*} |W_r^-(\skew3\hat{t},y)-W_r^-(\skew3\hat{t},y_i)| <\epsilon .\end{equation*}

For each $ (\skew3\hat{t},\omega) \in [t,T]\times \Omega^\omega_{t,T}$ and $u({\cdot}) \in \mathcal{U}(t,T)$, define

\begin{equation*}\tilde{\beta}[u({\cdot})](s,\omega)=\begin{cases}\beta_\epsilon[u({\cdot})](s,\omega) &\textrm{if}\ s\in[t,\skew3\hat{t}), \\\sum_{i\in{\mathbb{N}}} \textbf{1}_{D_i}(X_{t,x}(\skew3\hat{t}\kern1.5pt))\beta_{y_i}[u(\omega^{t,\skew3\hat{t}})({\cdot})](s,\omega^{\skew3\hat{t},T}) &\textrm{if}\ s\in[\skew3\hat{t},T],\end{cases}\end{equation*}

where $\omega=(\omega^{t,\skew3\hat{t}},\omega^{\skew3\hat{t},T}) \in \Omega^\omega_{t,\skew3\hat{t}} \times \Omega^\omega_{\skew3\hat{t},T}$ and $u(\omega^{t,\skew3\hat{t}})({\cdot}) \in \mathcal{U}(\skew3\hat{t},T)$ is the admissible control introduced immediately before the definition of the r-value functions. Note that $\tilde{\beta}$ is an r-strategy by construction, i.e. $\tilde{\beta} \in \mathcal{B}_r(t,T)$.

Moreover, whenever $X_{t,x}(\skew3\hat{t}\kern1.5pt) \in D_i$ for some $i\in{\mathbb{N}}$ and $u({\cdot}) \in \mathcal{U}(t,T)$, relation (25) and inequality (26) yield

(27) \begin{align}W_r^-(\skew3\hat{t},y_i) &\geq {\mathcal J}(\skew3\hat{t},y_i;\,u(\omega^{t,\skew3\hat{t}})({\cdot}),\beta_{y_i}[u(\omega^{t,\skew3\hat{t}})({\cdot})])-\epsilon \notag \\ &\geq {\mathcal J}(\skew3\hat{t},X_{t,x}(\skew3\hat{t}\kern1.5pt);\,u(\omega^{t,\skew3\hat{t}})({\cdot}),\beta_{y_i}[u(\omega^{t,\skew3\hat{t}})({\cdot})])-2\epsilon\end{align}

for all $u({\cdot})\in{\mathcal U}(t,T)$ and ${\mathbb{P}}^\omega_{t,\skew3\hat{t}}$-a.e. $\omega^{t,\skew3\hat{t}}\in\Omega^\omega_{t,\skew3\hat{t}}$.

From the definition of the discounted payoff functional in (12), we get

(28) \begin{align}&{\mathcal J}(t, x;\, u({\cdot}),\tilde{\beta}[u({\cdot})]) \notag\\ & \qquad = {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \int_t^{T} \Theta(s,t)L(s, X_{t,x}(s), u(s),\tilde{\beta}[u({\cdot})](s)) \, \mathrm{d} s + \Theta(T,t) \Psi(T, X_{t,x}(T)) \bigg] \notag \\& \qquad = {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \int_t^{\skew3\hat{t}} \Theta(s,t)L(s, X_{t,x}(s), u(s),\tilde{\beta}[u({\cdot})](s)) \, \mathrm{d} s \notag \\&\qquad\quad\, + \sum_{i\in{\mathbb{N}}} {\textbf 1}_{D_i}(X_{t,x}(\skew3\hat{t}\kern1.5pt))\bigg(\int_{\skew3\hat{t}}^{T} \Theta(s,t)L(s, X_{t,x}(s), u(s),\tilde{\beta}[u({\cdot})](s))\, \mathrm{d} s \notag\\ &\qquad\quad\, + \Theta(T,t) \Psi (T, X_{t,x}(T)) \bigg) \bigg] .\end{align}

Combining assumption (D2) with (28), we get

\begin{align*} &{\mathcal J}(t, x;\, u({\cdot}),\tilde{\beta}[u({\cdot})]) ={\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \int_t^{\skew3\hat{t}} \Theta(s,t)L(s, X_{t,x}(s), u(s),\tilde{\beta}[u({\cdot})](s)) \, \mathrm{d} s \notag \\ &\qquad + \Theta(\skew3\hat{t},t)\sum_{i\in{\mathbb{N}}} {\textbf 1}_{D_i}(X_{t,x}(\skew3\hat{t}\kern1.5pt))\bigg(\int_{\skew3\hat{t}}^{T} \Theta(s,\skew3\hat{t})L(s, X_{t,x}(s), u(s),\tilde{\beta}[u({\cdot})](s)) \, \mathrm{d} s\notag \\ &\qquad + \Theta(T,\skew3\hat{t}) \Psi(T, X_{t,x}(T))\bigg) \bigg] \notag .\end{align*}

From the definition of the r-strategy $\tilde{\beta}$, we get

\begin{align*} & {\mathcal J}(t, x;\, u({\cdot}),\tilde{\beta}[u({\cdot})]) = {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \int_t^{\skew3\hat{t}} \Theta(s,t)L(s, X_{t,x}(s), u(s),\beta_\epsilon[u({\cdot})](s)) \, \mathrm{d} s \\ &\qquad + \Theta(\skew3\hat{t},t)\sum_{i\in{\mathbb{N}}} {\textbf 1}_{D_i}(X_{t,x}(\skew3\hat{t}\kern1.5pt))\,{\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[\int_{\skew3\hat{t}}^{T} \Theta(s,\skew3\hat{t})L(s, X_{t,x}(s), u(s),\tilde{\beta}[u({\cdot})](s)) \, \mathrm{d} s \\ &\qquad + \Theta(T,\skew3\hat{t}) \Psi (T, X_{t,x}(T)) \mid {\mathcal G}_{t,\skew3\hat{t}}^{\omega}\bigg] \bigg] .\end{align*}

Combining the previous relation with Lemma 1, we obtain

\begin{align*} & {\mathcal J}(t, x;\, u({\cdot}),\tilde{\beta}[u({\cdot})]) = {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \int_t^{\skew3\hat{t}} \Theta(s,t)L(s, X_{t,x}(s), u(s),\beta_\epsilon[u({\cdot})](s)) \, \mathrm{d} s \\ &\qquad + \Theta(\skew3\hat{t},t)\sum_{i\in{\mathbb{N}}} {\textbf 1}_{D_i}(X_{t,x}(\skew3\hat{t}\kern1.5pt)){\mathcal J}(\skew3\hat{t},X_{t,x}(\skew3\hat{t}\kern1.5pt);\,u(\omega^{t,\skew3\hat{t}})({\cdot}),\beta_{y_i}[u(\omega^{t,\skew3\hat{t}})({\cdot})])\bigg] .\end{align*}

Using inequalities (27) and (26), we get

\begin{align*} {\mathcal J}(t, x;\, u({\cdot}),\tilde{\beta}[u({\cdot})]) &\leq {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \int_t^{\skew3\hat{t}} \Theta(s,t)L(s, X_{t,x}(s), u(s),\beta_\epsilon[u({\cdot})](s)) \, \mathrm{d} s \\ &\quad\, + \Theta(\skew3\hat{t},t)\sum_{i\in{\mathbb{N}}} {\textbf 1}_{D_i}(X_{t,x}(\skew3\hat{t}\kern1.5pt))W_r^-(\skew3\hat{t},y_i)\bigg] +2\epsilon\\&\leq {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \int_t^{\skew3\hat{t}} \Theta(s,t)L(s, X_{t,x}(s), u(s),\beta_\epsilon[u({\cdot})](s)) \, \mathrm{d} s \\ &\quad\, + \Theta(\skew3\hat{t},t)W_r^-(\skew3\hat{t},X_{t,x}(\skew3\hat{t}\kern1.5pt))\bigg] +3\epsilon\notag .\end{align*}

Finally, combining the previous inequality with (24), we conclude

\begin{equation*}{\mathcal J}(t, x;\, u({\cdot}),\tilde{\beta}[u({\cdot})]) \leq \overline{W}(t, x) + 4\epsilon\end{equation*}

for every $u({\cdot})\in{\mathcal U}(t,T)$. As a consequence, we obtain

\begin{equation*}W_r^-(t,x) \leq \overline{W}(t, x) + 4\epsilon .\end{equation*}

The proof is completed by letting $\epsilon$ go to zero.

Proposition 1 can be used to guarantee Hölder continuity of $W_r^-$ and $W_r^+$ with respect to t.

Corollary 3. Suppose that (A1)–(A2) and (D1)–(D2) hold. The r-value functions $W_r^-$ and $W_r^+$ of the discounted SDG determined by (3) and (12) are $\tfrac12$-Hölder-continuous in t, uniformly in x.

Proof. We will focus on establishing Hölder continuity of $W_r^-$ with respect to t, with the corresponding argument for $W_r^+$ being similar. To simplify notation, we will drop the superscripts u,v from the solution $X_{t,x}^{u,v}({\cdot})$, with the precise controls used at each instant being clear from the context.

Without loss of generality, suppose that $t_1,t_2\in[0,T]$ are such that $t_1 < t_2$ and $|t_2 - t_1| < 1$. Using (22) and rearranging terms, we get

\begin{align*}& W_r^-(t_1,x)-W_r^-(t_2,x) \notag \\ &\qquad\leq \inf_{\beta\in \mathcal{B}_r(t_1,T)} \sup_{u({\cdot}) \in \mathcal{U}(t_1,T)}{\mathbb E}_{{\mathbb{P}}_{t_1,T}^\omega} \bigg[\int_{t_1}^{t_2} \Theta(s,t_1)L(s,X_{t_1,x}(s),u(s), \beta[u({\cdot})](s)) \, \mathrm{d} s \notag \\ & \qquad\quad\, + \Theta(t_2,t_1) ( W_r^-(t_2,X_{t_1,x}(t_2)) -W_r^-(t_2,x))\notag \\ & \qquad\quad\, + (\Theta(t_2,t_1)-\Theta(t_1,t_1))W_r^-(t_2,x) {\bigg]} .\end{align*}

Combining the inequality above with uniform Lipschitz continuity of $W_r^-(t, x)$ in x and of $\Theta(s,t)$ in s, as well as boundedness of $\Theta$, L, and $W_r^-$, we obtain that there exists a positive constant $C_1$ such that

(29) \begin{equation}W_r^-(t_1,x)-W_r^-(t_2,x) \leq C_1\big( |t_2-t_1| + {\mathbb E}_{{\mathbb{P}}_{t_1,T}^\omega} \big[|X_{t_1,x}(t_2)-x|\big]\big) .\end{equation}

A first-moment estimate for SDEs [28, Corollary 2.4.6] guarantees the existence of a positive constant $C_2$ such that

(30) \begin{equation}{\mathbb E}_{{\mathbb{P}}_{t_1,T}^\omega} \big[|X_{t_1,x}(t_2)-x|\big] \leq C_2|t_2-t_1|^{1/2} .\end{equation}

Putting together inequalities (29) and (30), we conclude that

(31) \begin{equation}W_r^-(t_1,x)-W_r^-(t_2,x) \leq K_1|t_2-t_1|^{1/2}\end{equation}

for some positive constant $K_1$.
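The first-moment estimate (30) can likewise be illustrated numerically (again outside the proof): for an SDE with bounded coefficients the diffusion term dominates over short horizons, so the mean displacement grows like $\sqrt{h}$. The drift and diffusion below are illustrative assumptions, not the coefficients of (3).

```python
import numpy as np

def mean_displacement(h, x0=0.7, n_steps=100, n_paths=4000, seed=1):
    """Euler-Maruyama estimate of E|X_{t1,x}(t1 + h) - x| for a toy SDE
    with bounded drift and bounded, nondegenerate diffusion."""
    rng = np.random.default_rng(seed)
    dt = h / n_steps
    f = lambda x: np.cos(x)              # bounded drift (assumption)
    s = lambda x: 1.0 + 0.1 * np.sin(x)  # bounded diffusion (assumption)
    X = np.full(n_paths, float(x0))
    for _ in range(n_steps):
        X = X + f(X) * dt + s(X) * rng.normal(0.0, np.sqrt(dt), n_paths)
    return float(np.mean(np.abs(X - x0)))

# Quadrupling the horizon h should roughly double the mean displacement,
# consistent with the sqrt(h) rate in (30).
d1 = mean_displacement(0.01)
d2 = mean_displacement(0.04)
```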

Given $u({\cdot}) \in {\mathcal U}(t_2,T)$, define $u^*({\cdot})\in {\mathcal U}(t_1,T)$ as

\begin{equation*}u^*(s,\omega) =\begin{cases}u(t_2,\omega^{t_2,T}) &\textrm{if $s \in [t_1, t_2]$}, \\u(s,\omega^{t_2,T}) &\textrm{if $s \in (t_2, T]$},\end{cases}\end{equation*}

and given $\beta^* \in {\mathcal B}_r(t_1,T)$, define $\beta \in {\mathcal B}_r(t_2,T)$ as

\begin{equation*}\beta[u({\cdot})](s,\omega^{t_2,T}) = \beta^*[u^*({\cdot})](s,\pi^{-1}(\omega^{t_1,t_2},\omega^{t_2,T})) .\end{equation*}

We now observe that for $\beta^* \in {\mathcal B}_r(t_1,T)$ we have

\begin{align*} & {\mathcal J}(t_1, x;\, u^*({\cdot}),\beta^*[u^*({\cdot})]) \\ &\qquad = {{\mathbb E}}_{{\mathbb{P}}^\omega_{t_1,T}}\bigg[ \int_{t_1}^{T} \Theta(s,t_1)L(s, X_{t_1,x}(s), u^*(s),\beta^*[u^*({\cdot})](s)) \, \mathrm{d} s + \Theta(T,t_1)\Psi(T, X_{t_1,x}(T)) \bigg] \\&\qquad= {{\mathbb E}}_{{\mathbb{P}}^\omega_{t_1,T}}\bigg[ \int_{t_1}^{t_2} \Theta(s,t_1)L(s, X_{t_1,x}(s), u^*(s),\beta^*[u^*({\cdot})](s)) \, \mathrm{d} s \notag \\ & \qquad\quad\, + \Theta(t_2,t_1){\mathcal J}(t_2, X_{t_1,x}(t_2);\, u({\cdot}),\beta[u({\cdot})]) \bigg] \\&\qquad= {{\mathbb E}}_{{\mathbb{P}}^\omega_{t_1,T}}\bigg[ \int_{t_1}^{t_2} \Theta(s,t_1)L (s, X_{t_1,x}(s), u^*(s),\beta^*[u^*({\cdot})](s)) \, \mathrm{d} s \notag \\ &\qquad\quad\,+ \Theta(t_2,t_1) ({\mathcal J}(t_2, X_{t_1,x}(t_2);\, u({\cdot}),\beta[u({\cdot})])-{\mathcal J}(t_2, x;\, u({\cdot}),\beta[u({\cdot})])) \\ &\qquad\quad\,+ (\Theta(t_2,t_1)-\Theta(t_1,t_1)) {\mathcal J}(t_2, x;\, u({\cdot}),\beta[u({\cdot})]) + {\mathcal J}(t_2, x;\, u({\cdot}),\beta[u({\cdot})]) \bigg] {.}\end{align*}

Combining this equality with boundedness and Lipschitz continuity of $x\mapsto{\mathcal J}(t, x;\, u({\cdot}), v({\cdot}))$, as guaranteed by Corollary 2, as well as boundedness and Lipschitz continuity of $\Theta$ and boundedness of L, guaranteed by the assumptions in the statement, we obtain that there exists a positive constant C such that

\begin{equation*} {\mathcal J}(t_1, x;\, u^*({\cdot}),\beta^*[u^*({\cdot})]) \geq - C\big(|t_2-t_1| + {\mathbb E}_{{\mathbb{P}}_{t_1,T}^\omega} \big[|X_{t_1,x}(t_2)-x|\big] \big) + {\mathcal J}(t_2, x;\, u({\cdot}),\beta[u({\cdot})]){.}\end{equation*}

As a consequence, we obtain

\begin{align*}&\sup_{u({\cdot}) \in \mathcal{U}(t_1,T)}{\mathcal J}(t_1, x;\, u({\cdot}),\beta^*[u({\cdot})]) \\ &\qquad \geq - C\big(|t_2-t_1| + {\mathbb E}_{{\mathbb{P}}_{t_1,T}^\omega} \big[|X_{t_1,x}(t_2)-x|\big] \big) + \sup_{u({\cdot}) \in \mathcal{U}(t_2,T)}{\mathcal J}(t_2, x;\, u({\cdot}),\beta[u({\cdot})]) \\ &\qquad \geq - C\big(|t_2-t_1| + {\mathbb E}_{{\mathbb{P}}_{t_1,T}^\omega} \big[|X_{t_1,x}(t_2)-x|\big] \big) + W_r^-(t_2,x) .\end{align*}

Resorting once more to the first-moment estimate (30) and using the previous inequality, we obtain

(32) \begin{equation}W_r^-(t_1, x)-W_r^-(t_2,x) \geq - K_2|t_2-t_1|^{1/2}\end{equation}

for some positive constant $K_2$. Hölder continuity of $W_r^-$ follows from combining the estimates (31) and (32).

We now observe that the r-value functions $W_r^-$ and $W_r^+$ are continuous functions of (t, x), as a consequence of Corollaries 2 and 3. Moreover, the r-value functions $W_r^-$ and $W_r^+$ are, respectively, viscosity subsolutions and supersolutions of the HJBI equations (15) and (16). The proof of this fact is similar to that of [17, Proposition 1.12], with only minor adjustments being required. We skip the details here for the sake of brevity.

Proposition 2. Suppose that conditions (A1)–(A2) and (D1)–(D3) hold. The r-lower value function $W_r^-$ (resp. r-upper value function $W_r^+$) of the discounted SDG determined by (3) and (12) is a viscosity subsolution (resp. supersolution) of (15) (resp. (16)).

The next section employs an approximation procedure originally due to Fleming and Souganidis [Reference Fleming and Souganidis17, Reference Souganidis39, Reference Souganidis40]. This procedure is based on a discretization of the time variable and yields viscosity solutions for (15) and (16).

3.3. Time-discretization procedure

Let $\pi=\{0=t_0<t_1<\cdots<t_m=T\}$ be a partition of [0, T], and let

\begin{equation*}\|\pi\|=\max_{1\leq i\leq m} (t_i - t_{i-1})\end{equation*}

denote the mesh of the partition $\pi$.

A $\pi$-admissible control $u({\cdot})$ for Player I on [t, T] is an admissible control with the following additional property: if $i_0 \in \{0,\ldots,m-1\}$ is such that $t\in[t_{i_0},t_{i_0+1})$, then $u(s)=u$ for $s\in[t,t_{i_0+1})$, with $u\in U$, and $u(s)=u_{t_k}$ for $s\in[t_k,t_{k+1})$, $k=i_0+1,\ldots,m-1$, where $u_{t_k}$ is $\mathcal{G}_{t,t_k}^\omega$-measurable. The set of $\pi$-admissible controls for Player I on [t, T] will be denoted by $\mathcal{U}_{\pi}(t,T)$. A $\pi$-admissible control $v({\cdot})$ for Player II on [t, T] is defined similarly and the set of all such controls will be denoted by $\mathcal{V}_{\pi}(t,T)$.

A $\pi$-admissible strategy $\alpha$ for Player I on [t, T] is an element of the set of admissible strategies $\mathcal{A}(t,T)$ with the following additional properties: $\alpha[\mathcal{V}(t,T)] \subset \mathcal{U}_{\pi}(t,T)$; if $t \in[t_{i_0},t_{i_0+1})$, then for every $v({\cdot}) \in \mathcal{V}(t,T)$ the resulting control $\alpha[v({\cdot})]|_{[t,t_{i_0+1})}$ does not depend on $v({\cdot})$; and if $v({\cdot}) \approx \tilde{v}({\cdot})$ on $[t,t_k]$, then $\alpha[v({\cdot})](t_k) = \alpha[\tilde{v}({\cdot})](t_k)$, ${\mathbb{P}}_{t,T}^{\omega}$-a.s. for every $k\in\{i_0+1,\ldots,m\}$. The set of all $\pi$-admissible strategies for Player I on [t, T] will be denoted by $\mathcal{A}_{\pi}(t,T)$. A $\pi$-admissible strategy $\beta$ for Player II on [t, T] is defined similarly and the set of all these strategies will be denoted by $\mathcal{B}_{\pi}(t,T)$.
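The piecewise-constant structure of a $\pi$-admissible control can be sketched in a few lines; here plain numbers stand in for the $\mathcal{G}_{t,t_k}^\omega$-measurable random variables $u_{t_k}$, an assumption made purely for illustration.

```python
import numpy as np

# A pi-admissible control is constant on each right-open cell [t_k, t_{k+1})
# of the partition pi, with the value frozen at the left endpoint t_k.
partition = np.array([0.0, 0.25, 0.5, 0.75, 1.0])   # pi = {t_0 < ... < t_m}
decisions = np.array([0.3, -1.0, 0.5, 0.0])         # one action per cell

def pi_control(s):
    """Evaluate u(s) for the piecewise-constant control above."""
    k = int(np.searchsorted(partition, s, side='right')) - 1
    k = min(k, len(decisions) - 1)   # convention: s = T uses the last cell
    return float(decisions[k])
```

For instance, every $s$ in the cell $[t_1, t_2) = [0.25, 0.5)$ returns the action frozen at $t_1$.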

Let $C_b^{0,1}({\mathbb R}^N)$ denote the space of bounded, Lipschitz-continuous functions on ${\mathbb R}^N$. For every $t\in [0,T)$ and $\skew3\hat{t} \in (t,T]$, define the operator $F^-_{t,\skew3\hat{t}}\colon C_b^{0,1}({\mathbb R}^N)\to C_b^{0,1}({\mathbb R}^N)$ by

(33) \begin{equation} F^-_{t,\skew3\hat{t}}\,\phi(x) = \sup_{u\in U} \inf_{v({\cdot}) \in \mathcal{V}(t,\skew3\hat{t})} {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[\Theta(\skew3\hat{t},t)\phi(X_{t,x}^{u,v}(\skew3\hat{t}\kern1.5pt)) +\int_t^{\skew3\hat{t}}\Theta(s,t)L(s, X_{t,x}^{u,v}(s),u,v(s))\, \mathrm{d} s\bigg],\end{equation}

where $\mathcal{V}(t,\skew3\hat{t})$ denotes the set of admissible controls for Player II on $[t,\skew3\hat{t})$ and $X_{t,x}^{u,v}({\cdot})$ is the solution of (3) on $[t,\skew3\hat{t})$ associated with the choice of admissible controls $u({\cdot}) \equiv u$ and $v({\cdot})\in\mathcal{V}(t,\skew3\hat{t})$ having initial condition x at time t.

In a similar fashion, define the operator $F^+_{t,\skew3\hat{t}}\colon C_b^{0,1}({\mathbb R}^N)\to C_b^{0,1}({\mathbb R}^N)$ as

\begin{equation*}F^+_{t,\skew3\hat{t}}\,\phi(x) = \inf_{v\in V} \sup_{u({\cdot}) \in \mathcal{U}(t,\skew3\hat{t})} {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[\Theta(\skew3\hat{t},t)\phi(X_{t,x}^{u,v}(\skew3\hat{t}\kern1.5pt))+\int_t^{\skew3\hat{t}}\Theta(s,t)L(s, X_{t,x}^{u,v}(s),u(s),v)\, \mathrm{d} s\bigg],\end{equation*}

where $\mathcal{U}(t,\skew3\hat{t})$ denotes the set of admissible controls for Player I on $[t,\skew3\hat{t})$ and $X_{t,x}^{u,v}({\cdot})$ is the solution of (3) on $[t,\skew3\hat{t})$ associated with the choice of admissible controls $v({\cdot}) \equiv v$ and $u({\cdot})\in\mathcal{U}(t,\skew3\hat{t})$ having initial condition x at time t.

Let $w_{\pi}^-\colon [0,T]\times{\mathbb R}^N \to{\mathbb R}$ be such that $w_{\pi}^-(T,x)=\Psi(T,x)$ and

(34) \begin{equation}w_{\pi}^-(t,x) = F^-_{t,t_{i_0+1}} \prod_{k=i_0+2}^m F^-_{t_{k-1},t_{k}} \Psi(T,x)\end{equation}

whenever $t \in [t_{i_0},t_{i_0+1})$, and similarly, let $w_{\pi}^+\colon [0,T]\times{\mathbb R}^N \to{\mathbb R}$ be such that $w_{\pi}^+(T,x)=\Psi(T,x)$ and

(35) \begin{equation}w_{\pi}^+(t,x) = F^+_{t,t_{i_0+1}} \prod_{k=i_0+2}^m F^+_{t_{k-1},t_{k}} \Psi(T,x)\end{equation}

whenever $t \in [t_{i_0},t_{i_0+1})$. Under assumptions (A1)–(A2) and (D1)–(D2), $w_{\pi}^-$ and $w_{\pi}^+$ are both well-defined. Moreover, $w_{\pi}^-$ and $w_{\pi}^+$ admit a stochastic game characterization, as described in the next result.
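The backward recursion (34) can be made concrete in a simple setting. The sketch below computes an approximation of $w_{\pi}^-(0,\cdot)$ on a grid for a toy one-dimensional game with finite action sets and an exponential discount $\Theta(s,t)=\mathrm{e}^{-\rho(s-t)}$. All ingredients ($f$, $\sigma$, $L$, $\Psi$, $\rho$, the grid, and the quadrature) are illustrative assumptions; in particular, the inner infimum is taken over controls that are constant on each cell, and the conditional expectation in (33) is approximated by Gauss–Hermite quadrature with linear interpolation (clamped at the grid boundary).

```python
import numpy as np

U = np.array([-1.0, 0.0, 1.0])   # Player I actions (toy)
V = np.array([-1.0, 1.0])        # Player II actions (toy)
rho, T, m = 0.5, 1.0, 20         # discount rate, horizon, partition size
dt = T / m
xs = np.linspace(-3.0, 3.0, 121)                     # state grid
nodes, wts = np.polynomial.hermite_e.hermegauss(5)   # N(0,1) quadrature
wts = wts / wts.sum()

f = lambda x, u, v: u - 0.5 * v * x                  # toy drift
sig = lambda x, u, v: 0.3                            # toy diffusion
L = lambda x, u, v: x**2 + 0.1 * u**2 - 0.1 * v**2   # toy running payoff
Psi = lambda x: np.abs(x)                            # toy terminal payoff

def F_minus(phi):
    """One step of the operator F^-_{t,t+dt}: sup_u inf_v of the
    discounted one-step payoff plus continuation value."""
    out = np.empty_like(xs)
    for i, x in enumerate(xs):
        best_u = -np.inf
        for u in U:
            worst_v = np.inf
            for v in V:
                # One Euler step; np.interp clamps outside the grid.
                xn = x + f(x, u, v) * dt + sig(x, u, v) * np.sqrt(dt) * nodes
                cont = np.dot(wts, np.interp(xn, xs, phi))
                worst_v = min(worst_v, np.exp(-rho * dt) * cont + L(x, u, v) * dt)
            best_u = max(best_u, worst_v)
        out[i] = best_u
    return out

w = Psi(xs)            # w_pi^-(T, .) = Psi(T, .)
for _ in range(m):     # compose F^- backward over the partition cells
    w = F_minus(w)     # w now approximates w_pi^-(0, .)
```

Since the running payoff penalizes large $|x|$ from the minimizer's viewpoint, the computed value is larger near the edges of the grid than at the origin, as expected.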

Proposition 3. Suppose that conditions (A1)–(A2) and (D1)–(D2) hold. For every $ (t,x) \in [0,T] \times {\mathbb R}^N$, we have

(36) \begin{equation}w_{\pi}^-(t,x) = \inf_{\beta \in \mathcal{B}(t,T)} \sup_{u({\cdot}) \in \mathcal{U}_{\pi}(t,T)} {\mathcal J}(t,x;\,u({\cdot}),\beta[u({\cdot})])\end{equation}

and

(37) \begin{equation}w_{\pi}^+(t,x) = \sup_{\alpha \in \mathcal{A}(t,T)} \inf_{v({\cdot}) \in \mathcal{V}_{\pi}(t,T)} {\mathcal J}(t,x;\,\alpha[v({\cdot})],v({\cdot})) .\end{equation}

Proof. We prove relation (36) only, with the proof of (37) being similar. The proof of (36) relies on the following two claims.

  (i) For every $ (t,x) \in [0,T]\times{\mathbb R}^N$ and every $\epsilon >0$, there exist $\alpha_{\epsilon} \in \mathcal{A}_{\pi}(t,T)$ and $\beta_{\epsilon} \in \mathcal{B}_{\pi}(t,T)$ such that

    (38) \begin{equation}{\mathcal J}(t,x;\,u({\cdot}),\beta_{\epsilon}[u({\cdot})]) - \epsilon \leq w_{\pi}^-(t,x) \leq {\mathcal J}(t,x;\,\alpha_{\epsilon}[v({\cdot})],v({\cdot})) + \epsilon\end{equation}
    for all $u({\cdot}) \in \mathcal{U}_{\pi}(t,T)$ and $v({\cdot}) \in \mathcal{V}_{\pi}(t,T)$.
  (ii) For any $\beta \in \mathcal{B}(t,T)$, the pair of strategies $\alpha_{\epsilon} \in \mathcal{A}_{\pi}(t,T)$ and $\beta \in \mathcal{B}(t,T)$ define controls $u^{\epsilon}({\cdot}) \in \mathcal{U}_{\pi}(t,T)$ and $v^{\epsilon}({\cdot}) \in \mathcal{V}(t,T)$ for which

    (39) \begin{equation} {\mathcal J}(t,x;\,\alpha_{\epsilon}[v^\epsilon({\cdot})],v^{\epsilon}({\cdot})) = {\mathcal J}(t,x;\,u^{\epsilon}({\cdot}),\beta[u^\epsilon({\cdot})]) .\end{equation}

Indeed, once the two claims above are proved, the result follows from noting that the left-hand side of (38) guarantees that

\begin{equation*}w_{\pi}^-(t,x) \geq \inf_{\beta \in \mathcal{B}(t,T)} \sup_{u({\cdot}) \in \mathcal{U}_{\pi}(t,T)} {\mathcal J}(t,x;\,u({\cdot}),\beta[u({\cdot})]),\end{equation*}

while combining the right-hand side of (38) with (39) yields the reverse inequality.

The proof of claim (ii) is similar to that of the corresponding statement in [Reference Fleming and Souganidis17] and we skip it. Let us then prove claim (i). For $\varphi \in C_b^{0,1}({\mathbb R}^N)$, $x\in{\mathbb R}^N$, $u \in U$, $t\in [0,T]$, and $\skew3\hat{t} \in (t,T]$, define

\begin{align*}\psi(x,u,t,\skew3\hat{t},\varphi) = \inf_{v({\cdot}) \in \mathcal{V}(t,\skew3\hat{t})} {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[\Theta(\skew3\hat{t},t)\varphi(X_{t,x}^{u,v}(\skew3\hat{t}\kern1.5pt)) +\int_t^{\skew3\hat{t}}\Theta(s,t)L(s, X_{t,x}^{u,v}(s),u,v(s)) \, \mathrm{d} s\bigg],\end{align*}

where $X_{t,x}^{u,v}({\cdot})$ is the solution of (3) under the choice of the admissible controls $u(s) \equiv u$ and $v({\cdot})\in\mathcal{V}(t,\skew3\hat{t})$ and initial condition x at time t. Using assumptions (A1)–(A2) and (D1)–(D2), we obtain that $\psi(\cdot,\cdot,t,\skew3\hat{t},\varphi) \in C_b^{0,1}({\mathbb R}^N \times U)$ and

\begin{equation*}F^-_{t,\skew3\hat{t}}\,\varphi(x) = \sup_{u \in U} \psi(x,u,t,\skew3\hat{t},\varphi),\end{equation*}

where $F^-_{t,\skew3\hat{t}}$ is the operator defined in (33).

If $t \in [t_{i_0},t_{i_0+1})$ for $i_0 \in \{0,1,\ldots,m-1\}$, let

\begin{align*}\varphi_m &= \Psi(T,\cdot) {,} \\ \varphi_j &= F^-_{t_j,t_{j+1}}\varphi_{j+1}, \quad j=i_0+1,\ldots,m-1 {,}\\ \varphi_{i_0} &= F^-_{t,t_{i_0+1}}\varphi_{i_0+1} .\end{align*}

Hence, we obtain that

\begin{equation*}\varphi_{i_0}(x)=w_{\pi}^-(t,x) .\end{equation*}

Using [31, Lemma 1], we partition ${\mathbb R}^N$ and U into Borel sets of diameter less than some positive constant $\delta$, to be determined below. Denote these partitions by $\{A_k\colon k=1,2,\ldots\}$ and $\{B_\ell\colon \ell=1,2,\ldots,L\}$, respectively, and pick $x_k \in A_k$ and $u_\ell \in B_\ell$ for each $k=1,2,\ldots$ and $\ell=1,2,\ldots,L$. For any $\gamma >0$ there exists $\delta$ small enough and $u_{kj}^* = u_{\ell(k,j)} \in U$, $k=1,2,\ldots$ and $j=i_0+1,\ldots,m$, such that

\begin{equation*} \psi(x_k,u_{kj}^*,t_{j-1},t_j,\varphi_j) > F^-_{t_{j-1},t_j} \varphi_j(x_k) - \gamma .\end{equation*}

Further, we choose $v_{kj}^\ell({\cdot}) \in \mathcal{V}(t_{j-1},t_j)$ such that, for $u({\cdot})$ identically equal to $u_\ell\in U$ on the interval $[t_{j-1},t_j)$, we obtain

\begin{align*}&{\mathbb E}_{{\mathbb{P}}_{t_{j-1},T}^\omega}\bigg[\Theta(t_j,t_{j-1})\varphi_j(X_{t_{j-1},x_k}^{\ell}(t_j)) +\int_{t_{j-1}}^{t_j} \Theta(s,t_{j-1})L(s, X_{t_{j-1},x_k}^{\ell}(s),u_\ell,v_{kj}^\ell(s)) \, \mathrm{d} s\bigg]\\ & \qquad < \psi (x_k,u_\ell,t_{j-1},t_j,\varphi_{j}) +\gamma,\end{align*}

with $t_{i_0}=t$ whenever $j=i_0+1$. The notation $X_{t_{j-1},x_k}^{\ell}({\cdot})$ stands for the solution of (3) with initial condition $x_k$ at time $t_{j-1}$ subject to the admissible controls $u({\cdot})\equiv u_\ell$ and $v_{kj}^\ell({\cdot})$.

We will now exhibit the strategies $\alpha_{\epsilon}$ and $\beta_{\epsilon}$ in (38). Fix $ (t,x) \in [0,T)\times{\mathbb R}^N$. For $v({\cdot}) \in \mathcal{V}(t,T)$, define

\begin{equation*}\alpha_{\epsilon}[v({\cdot})](s) = I_{[t,t_{i_0+1})}(s) \sum_{k} u_{ki_0}^* I_{A_k}(x)+ \sum_{j=i_0+1}^{m-1}I_{[t_j,t_{j+1})}(s)\sum_{k} u_{kj}^* I_{A_k}(X(t_j)),\end{equation*}

where $X({\cdot})$ is defined on each of the intervals $[t,t_{i_0+1}]$ and $[t_j,t_{j+1}]$, $j=i_0+1,\ldots,m-1$, as the solution of (3) with $u({\cdot})=\alpha_{\epsilon}[v({\cdot})]$. For $u({\cdot}) \in \mathcal{U}(t,T)$, define

\begin{align*}\beta_{\epsilon}[u({\cdot})](s) &= I_{[t,t_{i_0+1})}(s) \sum_{k,\ell} \hat{v}^\ell_{k i_0}(s) I_{A_k}(x) I_{B_\ell}(u(s)) \\ &\quad\, + \sum_{j=i_0+1}^{m-1}\sum_{k,\ell}I_{[t_j,t_{j+1})}(s)\hat{v}^\ell_{k j}(s) I_{A_k}(X(t_j)) I_{B_\ell}(u(s)), \notag\end{align*}

where $X({\cdot})$ is now defined on each of the intervals $[t,t_{i_0+1}]$ and $[t_j,t_{j+1}]$, $j=i_0 + 1,\ldots,m-1$, as the solution of (3) with $v({\cdot})=\beta_{\epsilon}[u({\cdot})]$, and $\hat{v}^\ell_{kj}(\cdot,\omega) = v^\ell_{kj}(\cdot,\omega^{t_j,T})$ using the identification of $\Omega^\omega_{t,T}$ with $\Omega^\omega_{t,t_j} \times \Omega^\omega_{t_j,T}$ provided by $\pi(\omega) = (\omega^{t,t_j},\omega^{t_j,T})$ discussed in Section 2.1.

Let ${\mathcal J}$ stand for either ${\mathcal J}(t,x;\,\alpha_{\epsilon}[v({\cdot})],v({\cdot}))$ or ${\mathcal J}(t,x;\,u({\cdot}),\beta_{\epsilon}[u({\cdot})])$. For any $v({\cdot}) \in \mathcal{V}(t,T)$ and $u({\cdot}) = \alpha_{\epsilon}[v({\cdot})]$ or $u({\cdot}) \in \mathcal{U}_{\pi}(t,T)$ and $v({\cdot}) = \beta_{\epsilon}[u({\cdot})]$, we have

(40) \begin{align}& w_{\pi}^-(t,x) - {\mathcal J} \notag\\ &\qquad = \varphi_{i_0}(x)-{\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[ \int_t^{T} \Theta(s,t) L (s, X(s), u(s),v(s)) \, \mathrm{d} s + \Theta(T,t) \varphi_m(X(T)) \bigg] \notag\\&\qquad = \sum_{j=i_0+1}^m \big\{ {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}[ \Theta(t_{j-1},t) \varphi_{j-1}(X(t_{j-1}))]- {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} [\Theta(t_{j},t)\varphi_{j}(X(t_{j}))] \big\} \notag\\ & \qquad\quad\, -{\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[ \int_t^{T} \Theta(s,t) L (s, X(s), u(s),v(s)) \, \mathrm{d} s\bigg] .\end{align}

Using assumption (D2), we obtain that

(41) \begin{align}& \sum_{j=i_0+1}^m \big\{ {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}[ \Theta(t_{j-1},t) \varphi_{j-1}(X(t_{j-1}))]- {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} [\Theta(t_{j},t)\varphi_{j}(X(t_{j}))] \big\} \notag\\&\quad\, -{\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[ \int_t^{T} \Theta(s,t) L (s, X(s), u(s),v(s)) \, \mathrm{d} s\bigg] \notag\\&\qquad = \sum_{j=i_0+1}^m \Theta(t_{j-1},t)\bigg\{{\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}[\varphi_{j-1}(X(t_{j-1}))] \notag\\&\qquad\quad\, -{\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[\Theta(t_{j},t_{j-1})\varphi_j(X(t_{j})) + \int_{t_{j-1}}^{t_j} \Theta(s,t_{j-1}) L (s, X(s), u(s),v(s)) \, \mathrm{d} s\bigg]\bigg\} \notag \\&\qquad = {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[\sum_{j=i_0+1}^m \Theta(t_{j-1},t)\bigg\{\varphi_{j-1}(X(t_{j-1}))\\&\qquad\quad\, -{\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[\Theta(t_{j},t_{j-1})\varphi_j(X(t_{j})) + \int_{t_{j-1}}^{t_j} \Theta(s,t_{j-1}) L (s, X(s), u(s),v(s)) \, \mathrm{d} s \Bigm| \mathcal{G}^\omega_{t,t_{j-1}}\bigg] \bigg\} \bigg]\notag\end{align}

Combining (40) and (41), we get

\begin{align*} & w_{\pi}^-(t,x) - {\mathcal J} \\&\qquad = {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[\sum_{j=i_0+1}^m \Theta(t_{j-1},t)\bigg\{\varphi_{j-1}(X(t_{j-1}))\\&\qquad\quad\, -{\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[\Theta(t_{j},t_{j-1})\varphi_j(X(t_{j})) + \int_{t_{j-1}}^{t_j} \Theta(s,t_{j-1}) L (s, X(s), u(s),v(s)) \, \mathrm{d} s \Bigm|\mathcal{G}^\omega_{t,t_{j-1}}\bigg] \bigg\} \bigg] .\end{align*}

Inequality (38) follows from the relation above after checking that the following two statements hold ${\mathbb{P}}_{t,T}^{\omega}$-a.s.

  1. (A) For any $v({\cdot}) \in \mathcal{V}(t,T)$ and $u({\cdot}) = \alpha_{\epsilon}[v({\cdot})]$, we have

    \begin{align*}\varphi_{j-1}(X(t_{j-1}))& \leq {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[\Theta(t_{j},t_{j-1})\varphi_j(X(t_{j}))\\&\quad\, + \int_{t_{j-1}}^{t_j} \Theta(s,t_{j-1}) L (s, X(s), u(s),v(s)) \, \mathrm{d} s \Bigm| \mathcal{G}^\omega_{t,t_{j-1}}\bigg] + \epsilon(t_j-t_{j-1}) .\end{align*}
  2. (B) For any $u({\cdot}) \in \mathcal{U}_{\pi}(t,T)$ and $v({\cdot}) = \beta_{\epsilon}[u({\cdot})]$, we have

    \begin{align*}& {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega}\bigg[\Theta(t_{j},t_{j-1})\varphi_j(X(t_{j})) + \int_{t_{j-1}}^{t_j} \Theta(s,t_{j-1}) L (s, X(s), u(s),v(s)) \, \mathrm{d} s \Bigm| \mathcal{G}^\omega_{t,t_{j-1}}\bigg] \\ &\qquad \leq \varphi_{j-1}(X(t_{j-1})) + \epsilon(t_j-t_{j-1}) .\end{align*}

The proofs of (A) and (B) can be obtained by making appropriate adjustments to the proofs of the analogous statements in [17]. We skip them to keep the presentation brief.

The next lemma follows from assumptions (A1)–(A2) and (D1)–(D2), as well as the characterizations of $w_{\pi}^-$ and $w_{\pi}^+$ given above.

Lemma 3. There exists a positive constant C, depending solely on assumptions (A1)–(A2) and (D1)–(D2), such that the inequalities

\begin{equation*} |w_{\pi}^\pm(t,x)| \leq C \quad \textrm{and}\quad |w_{\pi}^\pm(t,x)-w_{\pi}^\pm(\skew3\hat{t},\hat{x})| \leq C (|x-\hat{x}|+|t-\skew3\hat{t}|^{1/2} ) \end{equation*}

hold for all $x,\hat{x} \in {\mathbb R}^N$ and $t,\skew3\hat{t} \in [0,T]$.

Resorting to Lemma 3 above and the Arzelà–Ascoli theorem, we obtain that the families of functions $\{w_{\pi}^-\}$ and $\{w_{\pi}^+\}$ converge uniformly along subsequences as $\|\pi\|\to 0$ to bounded, uniformly continuous functions. We will see that these uniform limits are viscosity solutions of (15) and (16), respectively. That is the content of the result below.

Proposition 4. Assume that (A1)–(A2) and (D1)–(D3) hold and let $w_{\pi}^-$ and $w_{\pi}^+$ be given by (34) and (35), respectively. Then the limits

\begin{equation*}w^-=\lim_{\|\pi\| \to 0}w_{\pi}^- \quad \textrm{and} \quad w^+=\lim_{\|\pi\| \to 0}w_{\pi}^+\end{equation*}

exist locally uniformly and are the unique viscosity solutions of (15) and (16), respectively.

Proof. Existence of $w^-$ and $w^+$ follows from a comparison theorem (Theorem 5 in the appendix) and Lemma 3 as long as one guarantees that any subsequential limit of the families $\{w_{\pi}^-\}$ and $\{w_{\pi}^+\}$ as $\|\pi\|\to 0$ is a viscosity solution of (15) and (16), respectively. This can be achieved by employing the same arguments as in [17, Proposition 2.5]. We omit the details here for the sake of brevity.

3.4. Characterization of $W^-$ and $W^+$ as viscosity solutions of (15) and (16)

In what follows, we will compile the results obtained in the preceding sections to complete the characterization of the lower and upper value functions $W^-$ and $W^+$ as the unique viscosity solutions of (15) and (16), respectively. For that purpose, we start by noting that since the limit functions $w^-$ and $w^+$ of Proposition 4 are the unique viscosity solutions of (15) and (16), respectively, then Proposition 2 and a comparison theorem (Theorem 5 in the appendix) yield the following result.

Lemma 4. For every $ (t,x) \in [0,T]\times{\mathbb R}^N$ we have that

\begin{equation*}W_r^-(t,x) \leq w^-(t,x) \quad \textit{and} \quad W_r^+(t,x) \geq w^+(t,x) .\end{equation*}

We will now show that $W^-(t,x) \geq w^-(t,x)$ and $W^+(t,x) \leq w^+(t,x)$ for every $ (t,x) \in [0,T]\times{\mathbb R}^N$. As a consequence, we will obtain that the lower and upper value functions $W^-$ and $W^+$ are the unique viscosity solutions of (15) and (16), respectively.

Theorem 2. Suppose that (A1)–(A2) and (D1)–(D3) hold. The lower and upper value functions $W^-$ and $W^+$ of the discounted SDG determined by (3) and (12) are, respectively, the unique viscosity solutions of the Hamilton–Jacobi–Bellman–Isaacs equations (15) and (16). Moreover, if the Isaacs condition (11) holds, then the discounted SDG determined by (3) and (12) has a value.

Proof. We only prove the statement concerning the lower value function, with a similar proof holding for the corresponding statement concerning the upper value function.

Combining Corollary 2 with Lemma 4 we obtain that $W^-\leq W_r^-\leq w^-$ on $[0,T]\times{\mathbb R}^N$. On the other hand, by Proposition 3 we have that for every partition $\pi$ of [0, T], the inequality $w_{\pi}^- \leq W^-$ holds on $[0,T]\times{\mathbb R}^N$. Proposition 4 then implies that $w^- \leq W^-$ on $[0,T]\times{\mathbb R}^N$, guaranteeing that $w^- = W^-$ on $[0,T]\times{\mathbb R}^N$.

Finally, if the Isaacs condition holds, then the HJBI equations (15) and (16) coincide. Hence, uniqueness of the viscosity solutions – a consequence of Theorem 5 – ensures that $W^-$ and $W^+$ are identical.

We are now able to state a dynamic programming principle for each of the value functions $W^-$ and $W^+$, defined in (13) and (14).

Theorem 3. (Dynamic programming principle.) Assume that conditions (A1)–(A2) and (D1)–(D3) hold and let $t, \skew3\hat{t} \in [0,T]$ be such that $t<\skew3\hat{t}$. Then, for every $x \in {\mathbb R}^N$, we have the following.

  1. (i) The lower value function of the discounted SDG (3)–(12) is determined by the recursive relation

    (42) \begin{align}W^-(t,x) & = \inf_{\beta \in \mathcal{B}(t,T)} \sup_{u \in \mathcal{U}(t,T)} {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \Theta(\skew3\hat{t},t) W^- (\skew3\hat{t},X_{t,x}^{u,v}(\skew3\hat{t}\kern1.5pt))\notag \\*&\quad\, + \int_t^{\skew3\hat{t}} \Theta(s,t)L(s,X_{t,x}^{u,v}(s),u(s),\beta[u({\cdot})](s)) \, \mathrm{d}s \bigg],\end{align}
    combined with the boundary condition $W^- (T,x) = \Psi (T,x)$, where $X_{t,x}^{u,v}(s)$, $s\in[t,T]$, is the solution of (3) with $v({\cdot})=\beta[u({\cdot})]\in \mathcal{V}(t,T)$ for $u({\cdot}) \in \mathcal{U}(t,T)$.
  2. (ii) The upper value function of the discounted SDG (3)–(12) is determined by the recursive relation

    \begin{align*} W^+(t,x) & = \sup_{\alpha \in \mathcal{A}(t,T)} \inf_{v \in \mathcal{V}(t,T)} {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \Theta(\skew3\hat{t},t) W^+ (\skew3\hat{t},X_{t,x}^{u,v}(\skew3\hat{t}\kern1.5pt)) \notag \\*& \quad\, + \int_t^{\skew3\hat{t}} \Theta(s,t)L(s,X_{t,x}^{u,v}(s),\alpha[v({\cdot})](s),v(s)) \, \mathrm{d} s \bigg],\end{align*}
    combined with the boundary condition $W^+ (T,x) = \Psi(T,x)$, where $X_{t,x}^{u,v}(s)$, $s\in[t,T]$, is the solution of (3) with $u({\cdot})=\alpha[v({\cdot})]\in \mathcal{U}(t,T)$ for $v({\cdot}) \in \mathcal{V}(t,T)$.

Proof. We start by proving that the lower value function of the discounted SDG (3)–(12) satisfies relation (42). The corresponding proof for the upper value function is similar and we omit it here.

Let $\skew3\hat{t} \in (0,T]$ be fixed and let $\overline{W}(t,x)$ denote the right-hand side of (42). It is enough to consider in (42) controls $u({\cdot})$ and strategies $\beta$ defined on $[t,\skew3\hat{t}]$. By Theorem 2, $\overline{W}$ is the viscosity solution of (15) on $[0,\skew3\hat{t}]\times{\mathbb R}^N$ with terminal condition $\overline{W}(\skew3\hat{t},x)=W^-(\skew3\hat{t},x)$. Since $W^-$ is the viscosity solution of the same problem, uniqueness of viscosity solutions yields that $\overline{W} = W^-$.
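The backward recursion in Theorem 3 can be illustrated on a hypothetical discrete surrogate: a two-stage, two-action matrix game with a constant factor standing in for the discount $\Theta(\skew3\hat{t},t)$. This is a minimal sketch, not the SDG above; the payoff matrices, the discount value, and the terminal payoff are all invented for illustration. Since the minimizer's strategy $\beta$ may react to the maximizer's control, each backward step of the lower value reduces to a max–min:

```python
# Hypothetical two-stage, two-action surrogate of the DPP (42): the
# maximizer picks u, the minimizer answers with a strategy v = beta[u],
# so each backward step is a max-min. All numbers are invented.
theta = 0.9                        # stands in for Theta(t_hat, t)
L1 = [[1.0, -2.0], [0.5, 3.0]]     # stage-1 running payoff L1[u][v]
L2 = [[2.0, 0.0], [-1.0, 1.0]]     # stage-2 running payoff L2[u][v]
Psi = 0.5                          # terminal payoff

# Lower value at the intermediate time t_hat.
W_hat = max(min(L2[u][v] + theta * Psi for v in (0, 1)) for u in (0, 1))
# DPP step: the lower value at t uses W_hat as its terminal data.
W_t = max(min(L1[u][v] + theta * W_hat for v in (0, 1)) for u in (0, 1))
```

With these numbers the recursion gives $W(\skew3\hat{t}\kern1.5pt)=0.45$ and $W(t)=0.905$; the point is only that the value at $t$ is computed from the value at $\skew3\hat{t}$ exactly as in (42), with the intermediate value playing the role of terminal data.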

4. Proof of Theorem 1

This section is devoted to the proof of Theorem 1. We will see that the payoff functional (4) can be related to the payoff functional of an auxiliary problem with a deterministic time horizon but adjusted running and terminal payoffs, which account for the uncertainty induced by the random horizon through the introduction of a non-constant discount rate. We will then discuss how to formulate the resulting problem as a zero-sum discounted SDG of the form studied in detail in Section 3.

We start by stating and proving a simple lemma concerning an iterative property for the conditional probabilities defined in (7).

Lemma 5. The identities

\begin{align*}G^+(s,t) &= G^+(s,\skew3\hat{t})\ G^+(\skew3\hat{t},t),\\ g^-(s,t) &= g^-(s,\skew3\hat{t})\ G^+(\skew3\hat{t},t)\end{align*}

hold for every $0\leq t\leq \skew3\hat{t}\leq s\leq T$.

Proof. Recall the definition of the conditional probability $G^+(s,t)$ given in (7):

(43) \begin{equation}G^+(s,t) = \mathbb{P}_t^\tau (\tau > s) = \mathbb{P}^\tau (\tau > s \mid \tau > t) .\end{equation}

Using the definition of conditional probability and the fact that $t\leq \skew3\hat{t}\leq s$, we get

(44) \begin{align}\mathbb{P}^\tau (\tau > s \mid \tau > t) & = \dfrac{\mathbb{P}^\tau (\{\tau > s\} \cap \{\tau > t\}) }{\mathbb{P}^\tau (\tau > t) } \notag \\ &= \dfrac{\mathbb{P}^\tau (\{\tau > s \} \cap \{ \tau > \skew3\hat{t}\}) }{\mathbb{P}^\tau (\tau > \skew3\hat{t}) }\dfrac{\mathbb{P}^\tau ( \{\tau > \skew3\hat{t}\}\cap \{ \tau > t\}) }{\mathbb{P}^\tau (\tau > t) } \notag \\ &= \mathbb{P}^\tau (\tau > s \mid \tau > \skew3\hat{t})\, \mathbb{P}^\tau ( \tau > \skew3\hat{t}\mid \tau > t) .\end{align}

The first relation in the statement follows from combining (43) and (44). The second relation is a consequence of the first one after noting that $g^-(s,t)$ is the density function associated with the distribution function $G^-(s,t)=1-G^+(s,t)$.
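The identities in Lemma 5 admit a quick numerical sanity check: pick any continuous lifetime distribution with a positive survival function $S$, for which $G^+(s,t)=S(s)/S(t)$, and verify the multiplicative property directly. The sketch below uses a Weibull survival function; the distribution, the shape parameter, and the evaluation points are arbitrary choices made for illustration only.

```python
import math

def survival(x, k=1.5):
    # Weibull survival function S(x) = P(tau > x); any continuous
    # distribution with S > 0 on the relevant range works equally well.
    return math.exp(-x ** k)

def G_plus(s, t):
    # Conditional survival G^+(s, t) = P(tau > s | tau > t) = S(s) / S(t).
    return survival(s) / survival(t)

def g_minus(s, t, h=1e-7):
    # Density of G^-(s, t) = 1 - G^+(s, t), via a central difference.
    return (G_plus(s - h, t) - G_plus(s + h, t)) / (2 * h)

t, t_hat, s = 0.2, 0.7, 1.3
assert math.isclose(G_plus(s, t), G_plus(s, t_hat) * G_plus(t_hat, t), rel_tol=1e-9)
assert math.isclose(g_minus(s, t), g_minus(s, t_hat) * G_plus(t_hat, t), rel_tol=1e-5)
```

The second assertion checks the density identity with the same finite-difference approximation on both sides, so the discretization errors largely cancel.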

The next lemma plays a central role in the formulation of an auxiliary discounted SDG with a deterministic time horizon associated with the SDG with random horizon (3)–(4). In what follows we will denote the indicator function of a set A by $\textbf{1}_A$.

Lemma 6. Let $u\colon [t,T]\times\Omega_{t,T}^\omega\to U$ and $v\colon [t,T]\times\Omega_{t,T}^\omega\to V$ be ${\mathbb G}_{t,T}^\omega$-adapted processes. For every $ (t,x)\in[0,T)\times{\mathbb R}^N$, the payoff functional $J(t,x;\, u({\cdot}), v({\cdot}))$ defined in (4) admits the representation

(45) \begin{align} J(t, x;\, u({\cdot}),v({\cdot})) = {\mathbb E}_{{\mathbb{P}}^\omega_{t,T}} \bigg[ \int_t^{T} G^+(s,t) {\mathcal{L}}(s, X_{t,x}^{u,v}(s), u(s),v(s)) \, \mathrm{d} s + G^+(T,t) \Psi(T, X_{t,x}^{u,v}(T)) \bigg],\nonumber\\\end{align}

where ${\mathcal{L}}$ is the conditional running payoff

(46) \begin{equation}\mathcal{L}(s,x,u,v) = L (s,x,u,v) + g^-(s,s) \Psi (s,x),\end{equation}

$G^+$ and $g^-$ are as given in (7) and (8), respectively, and $X_{t,x}^{u,v}(s)$, $s\in[t,T]$, denotes the solution of the initial value problem (3) associated with $ (u({\cdot}),v({\cdot})) $.

Proof. Given the definition of the random horizon $\xi$ in (2) and that of the payoff functional J in (4), we obtain

\begin{align*}& J(t,x;\,u({\cdot}), v({\cdot})) \notag\\ &\qquad = {\mathbb E}_{{\mathbb{P}}_t} \bigg[ \textbf{1}_{(T,+\infty)}(\tau)\bigg(\int_t^{T} L (s, X_{t,x}^{u,v}(s), u(s),v(s)) \, \mathrm{d} s + \Psi (T, X_{t,x}^{u,v}(T)) \bigg) \\ & \qquad \quad\, + \textbf{1}_{(t,T]}(\tau)\bigg(\int_t^{\tau} L (s, X_{t,x}^{u,v}(s), u(s),v(s)) \, \mathrm{d} s + \Psi (\tau, X_{t,x}^{u,v}(\tau)) \bigg)\bigg] \notag .\end{align*}

Combining the representation (1) for the probability measure ${\mathbb{P}}_t$ with the linearity of the expected value with respect to the distribution of $\tau$ and the definition of the conditional probabilities (7), we are able to rewrite the relation above as

(47) \begin{align}& J(t,x;\,u({\cdot}),v({\cdot})) \notag \\ &\qquad = {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ G^+(T,t)\bigg(\int_t^{T} L (s, X_{t,x}^{u,v}(s), u(s),v(s)) \, \mathrm{d} s + \Psi (T, X_{t,x}^{u,v}(T)) \bigg) \notag \\ &\qquad\quad\, + \int_t^T g^-(r,t) \int_t^{r} L (s, X_{t,x}^{u,v}(s), u(s),v(s)) \, \mathrm{d} s \, \mathrm{d} r \notag \\ &\qquad\quad\, + \int_t^T g^-(s,t) \Psi (s, X_{t,x}^{u,v}(s)) \, \mathrm{d} s \bigg] .\end{align}

Applying the Fubini–Tonelli theorem to the second term on the right-hand side of relation (47), we get

\begin{align*}& \int_t^T \int_t^r g^-(r,t)L(s,X_{t,x}^{u,v}(s),u(s),v(s))\, \mathrm{d} s \, \mathrm{d} r \\*&\qquad=\int_t^T \int_s^T g^-(r,t)L(s,X_{t,x}^{u,v}(s),u(s),v(s))\, \mathrm{d} r \, \mathrm{d} s \\*&\qquad=\int_t^T (G^+(s,t)-G^+(T,t)) L(s,X_{t,x}^{u,v}(s),u(s),v(s))\, \mathrm{d} s .\end{align*}
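The interchange of the order of integration above can be sanity-checked numerically. The sketch below drops the state dependence and uses an exponential lifetime together with a placeholder running payoff (both invented for illustration), comparing midpoint Riemann sums of the iterated integral against the swapped form:

```python
import math

lam, t, T, n = 0.8, 0.0, 2.0, 2000
G = lambda s: math.exp(-lam * (s - t))        # G^+(s, t) for an exponential tau
g = lambda s: lam * math.exp(-lam * (s - t))  # density g^-(s, t)
L = lambda s: math.cos(s)                     # placeholder running payoff

h = (T - t) / n
mid = [t + (i + 0.5) * h for i in range(n)]

# Cumulative sums approximate the inner integral int_t^r L(s) ds.
inner, acc = [], 0.0
for s in mid:
    acc += L(s) * h
    inner.append(acc)

lhs = sum(g(r) * inner[i] for i, r in enumerate(mid)) * h   # iterated integral
rhs = sum((G(s) - G(T)) * L(s) for s in mid) * h            # after the swap

assert abs(lhs - rhs) < 5e-3
```

The tolerance is loose because the inner cumulative sum misaligns with the midpoint grid by half a step; refining the grid shrinks the gap accordingly.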

Combining the last equality with (47) and rearranging terms, we get

\begin{align*} & J(t,x;\,u({\cdot}),v({\cdot})) \notag \\ &\qquad = {\mathbb E}_{{\mathbb{P}}_{t,T}^\omega} \bigg[ \int_t^{T} G^+(s,t) L (s, X_{t,x}^{u,v}(s), u(s),v(s)) + g^-(s,t) \Psi (s, X_{t,x}^{u,v}(s)) \, \mathrm{d} s \\ &\qquad\quad\, + G^+(T,t)\Psi (T, X_{t,x}^{u,v}(T)) \bigg] . \notag\end{align*}

The result now follows from Lemma 5 after factoring out the term $G^+(s,t)$ from the integrand in the equality above.

The representation (45) of the payoff functional (4) reflects the transformation of the SDG under consideration from a random planning horizon to a deterministic one via the introduction of a subjective rate of time preference, which resembles a non-constant discount factor related to the family of conditional probabilities (7).
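The representation (45)–(46) also lends itself to a simple Monte Carlo sanity check. In the state-independent sketch below (exponential lifetime, invented payoffs $L(s)=\cos s$ and $\Psi(s)=s$, none of which come from the paper), the hazard rate is $g^-(s,s)=\lambda$ and $\Theta(s,0)=G^+(s,0)=\mathrm{e}^{-\lambda s}$, so the simulated random-horizon payoff should match the discounted deterministic-horizon integral:

```python
import math, random

random.seed(0)
lam, T = 1.2, 1.0
L = lambda s: math.cos(s)                       # toy running payoff (state-independent)
Psi = lambda s: s                               # toy terminal payoff
intL = lambda a, b: math.sin(b) - math.sin(a)   # closed form of int_a^b L(s) ds

# Monte Carlo estimate of the random-horizon payoff (4) with t = 0:
# the game stops at min(tau, T) and pays the terminal payoff there.
N = 100_000
total = 0.0
for _ in range(N):
    stop = min(random.expovariate(lam), T)
    total += intL(0.0, stop) + Psi(stop)
mc = total / N

# Discounted representation (45): conditional running payoff (46) is
# L(s) + lam * Psi(s), discounted by exp(-lam * s).
n = 20_000
h = T / n
det = sum(math.exp(-lam * (i + 0.5) * h)
          * (L((i + 0.5) * h) + lam * Psi((i + 0.5) * h))
          for i in range(n)) * h + math.exp(-lam * T) * Psi(T)

assert abs(mc - det) < 0.02
```

With $10^5$ samples the Monte Carlo standard error is on the order of $10^{-3}$, well inside the tolerance.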

Proof of Theorem 1. We employ the representation for the payoff functional (4) provided in Lemma 6. Start by observing that the payoff functional (45) is of the same form as the discounted payoff functional (12), with non-constant discount factor given by $\Theta(s,t)=G^+(s,t)$ and running payoff of the form (46). Moreover, combining assumption (A3), the definition of the conditional probabilities (7) and (8), and Lemma 5, we obtain that hypotheses (D1)–(D3) hold. Therefore, Theorem 2 ensures that the value functions, say $W^-$ and $W^+$, associated with the auxiliary discounted SDG specified by (3) and (45) are, respectively, the unique viscosity solutions of the HJBI equations (9) and (10), each of which can be obtained from (15) and (16) by performing the appropriate adjustments listed above to the discount factor $\Theta(s,t)$ and the running payoff L(t, x, u, v). Finally, by Lemma 6, we conclude that the lower and upper value functions $V^-$ and $V^+$ associated with the SDG with a random horizon (3)–(4) are, respectively, identically equal to the lower and upper value functions $W^-$ and $W^+$ associated with the auxiliary discounted SDG mentioned above.

5. Conclusions

We have studied a two-player zero-sum stochastic differential game with a random horizon and diffusive state variable dynamics. We have employed dynamic programming and viscosity solution techniques to prove that the lower and upper value functions of this game are the unique viscosity solutions of the corresponding HJBI equations. Further, under the Isaacs condition, we have obtained that the value of the game exists.

Appendix A. Viscosity solutions and a comparison theorem

Let $\Omega$ be an open subset of ${\mathbb R}^n$. For any function $u\colon \Omega \rightarrow {\mathbb R}$, define $u^*\colon \bar{\Omega} \rightarrow {\mathbb R} \cup \{-\infty, \infty\}$ as

\begin{equation*}u^*(x)=\lim_{r \rightarrow 0^+} \sup{\{u(y)\colon y \in B(x;\,r) \cap \Omega\}} \quad \text{for} \ x \in \bar{\Omega},\end{equation*}

and $u_*\colon \bar{\Omega} \rightarrow {\mathbb R} \cup \{-\infty, \infty\}$ as

\begin{equation*}u_*(x)=\lim_{r \rightarrow 0^+} \inf{\{u(y)\colon y \in B(x;\,r) \cap \Omega\}} \quad \text{for} \ x \in \bar{\Omega} .\end{equation*}

Note that $u^* \geq u \geq u_*$ on $\Omega$, $u^*$ is upper semi-continuous (u.s.c.) on $\bar{\Omega}$, and $u_*$ is lower semi-continuous (l.s.c.) on $\bar{\Omega}$. Note also that $u_*=-({-}\,u)^*$ and that if u is u.s.c. at $x \in \Omega$, then $u^*(x)=u(x)$. The functions $u^*$ and $u_*$ are called, respectively, the u.s.c. and l.s.c. envelopes of u.
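For simple one-dimensional examples the envelopes can be approximated numerically. The sketch below (an illustration only; the ball radius and grid size are arbitrary choices) evaluates the supremum over a fine grid in a small ball around $x$, and obtains the lower envelope from the identity $u_* = -({-}u)^*$:

```python
def usc_envelope(u, x, r=1e-6, n=4001):
    # Approximate u*(x) = lim_{r -> 0+} sup { u(y) : |y - x| < r } by the
    # sup over a fine grid in a small ball; for a piecewise-continuous u
    # a single small radius already captures the limit.
    ys = [x - r + 2 * r * i / (n - 1) for i in range(n)]
    return max(u(y) for y in ys)

def lsc_envelope(u, x, r=1e-6, n=4001):
    # Lower envelope via the identity u_* = -(-u)^*.
    return -usc_envelope(lambda y: -u(y), x, r, n)

# A step function with a jump at 0 and u(0) = 0.
u = lambda y: 1.0 if y > 0 else 0.0

assert usc_envelope(u, 0.0) == 1.0   # u*(0) = 1: the jump is seen from the right
assert lsc_envelope(u, 0.0) == 0.0   # u_*(0) = 0 = u(0)
```

At the jump point the two envelopes disagree, which is exactly the situation the viscosity sub- and supersolution definitions below are designed to handle.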

Given $A \in {\mathbb R}^{m \times n}$, let $A'$ and $\|A\|$ stand, respectively, for the transpose and the norm of A. Let $\mathcal{A}$ and $\mathcal{B}$ be non-empty sets, and set $\Lambda=\Omega \times \mathcal{A} \times \mathcal{B}$. Consider functions $\Sigma\colon \Lambda \rightarrow {\mathbb R}^{m \times n}$, $b\colon \Lambda \rightarrow {\mathbb R}^n$, $c\colon \Lambda \rightarrow {\mathbb R}$, $d\colon \Lambda \rightarrow {\mathbb R}$, and define $A\colon \Lambda \rightarrow \mathbb{S}^n$ as

\begin{equation*}A(x,\alpha,\beta)=\Sigma'(x,\alpha,\beta) \Sigma(x,\alpha,\beta)\end{equation*}

and $F\colon \Omega \times {\mathbb R} \times {\mathbb R}^n \times \mathbb{S}^n \rightarrow {\mathbb R}$ as

\begin{equation*} F(x,r,p,X) = \inf_{\beta \in \mathcal{B}} \sup_{\alpha \in \mathcal{A}}\{- \mathrm{tr}\, (A(x,\alpha,\beta)X) +\langle b(x,\alpha,\beta),p \rangle +c(x,\alpha,\beta)r+d(x,\alpha,\beta)\} .\end{equation*}

Consider the nonlinear PDE

(48) \begin{equation}F(x,u,Du,D^2u)=0 \ \text{in} \ \Omega .\end{equation}

A function $u\colon \Omega \rightarrow {\mathbb R}$ is called a viscosity subsolution of (48) if $u^*(x) < \infty$ for $x \in \bar{\Omega}$ and if, whenever $\phi \in C^2(\Omega)$, $y \in \Omega$ and $ (u^*-\phi)(y)=\max_{\Omega}(u^*-\phi)$,

\begin{equation*}F (y,u^*(y),D\phi(y),D^2\phi(y)) \leq 0 .\end{equation*}

In a similar way, a function $u\colon \Omega \rightarrow {\mathbb R}$ is called a viscosity supersolution of (48) if $u_*(x) > -\infty$ for $x \in \bar{\Omega}$ and if, whenever $\phi \in C^2(\Omega)$, $y \in \Omega$ and $ (u_*-\phi)(y)=\min_{\Omega}(u_*-\phi)$,

\begin{equation*}F (y,u_*(y),D\phi(y),D^2\phi(y)) \geq 0 .\end{equation*}

A function $u\colon \Omega \rightarrow {\mathbb R}$ is called a viscosity solution of (48) if it is both a viscosity sub- and supersolution of (48).

Consider the following assumptions.

  1. (H1) For each bounded subset B of $\Omega$, the functions A, b, c, and d are bounded on $B \times \mathcal{A} \times \mathcal{B}$.

  2. (H2) $\Sigma$ and b are Lipschitz-continuous with respect to x, that is,

    \begin{equation*} \sup \dfrac{\|\Sigma(x,\alpha,\beta)-\Sigma(y,\alpha,\beta)\|}{|x-y|} < \infty \end{equation*}
    and
    \begin{equation*} \sup \dfrac{\|b(x,\alpha,\beta)-b(y,\alpha,\beta)\|}{|x-y|} < \infty, \end{equation*}
    where the supremum is taken for all $ (x,\alpha,\beta), (y,\alpha,\beta) \in \Lambda$ with $x \neq y$.
  3. (H3) The functions $f=c,d$ satisfy

    \begin{equation*} \lim_{r \rightarrow 0} \sup \{|f(x,\alpha,\beta)-f(y,\alpha,\beta)| \colon x,y \in B,\ (\alpha,\beta) \in \mathcal{A} \times \mathcal{B},\ |x-y| \leq r\} = 0, \end{equation*}
    for bounded subsets B of $\Omega$.
  4. (H4) $\inf \{c(x,\alpha,\beta)\colon (x,\alpha, \beta) \in \Lambda\} > 0$.

The following comparison result is due to Ishii [22, Theorem 7.3].

Theorem 4. Assume that (H1)–(H4) hold. Let u and v be, respectively, viscosity sub- and supersolutions of

\begin{equation*}F(x,u,Du,D^2u) = 0 \ \textit{in} \ \Omega .\end{equation*}

If $\Omega$ is unbounded, then assume that

\begin{equation*}\lim_{x \in \Omega, |x| \rightarrow \infty} \dfrac{u^+(x)}{\log |x|}=0 \quad \textit{and} \quad\lim_{x \in \Omega, |x| \rightarrow \infty} \dfrac{v^-(x)}{\log |x|}=0 .\end{equation*}

Suppose that $u^*(x) \leq v_*(x)$ for $x \in \partial \Omega$. Then $u^*\leq v_*$ on $\Omega$.

Details regarding further extensions of the theorem above, including the extension to parabolic equations such as those under consideration herein, may be found in [23] and the ‘User’s guide to viscosity solutions of second order partial differential equations’ [13]. In particular, the following theorem holds.

Theorem 5. Assume that the functions $\sigma$, f, L, and $\Psi$ are bounded and Lipschitz-continuous. If v and $\tilde{v}$ (resp. u and $\tilde{u}$) are a viscosity subsolution and a viscosity supersolution of (15) (resp. (16)) with boundary conditions $\Psi$ and $\tilde{\Psi}$, respectively, and if $\Psi\leq \tilde{\Psi}$ on ${\mathbb R}^N \times \{T\}$, then $v\leq \tilde{v}$ (resp. $u\leq \tilde{u}$) on ${\mathbb R}^N \times [0,T]$.

Acknowledgements

The authors would like to thank the Associate Editor and two anonymous referees for their careful reading and numerous suggestions leading to improvements in this paper. M. Ferreira’s research was funded by Fundação para a Ciência e a Tecnologia in the form of a postdoctoral scholarship with reference SFRH/BPD/109311/2015. D. Pinheiro’s research was supported by the PSC-CUNY research awards TRADA-46-251 and TRADA-47-142, jointly funded by the Professional Staff Congress and the City University of New York.

References

Bayraktar, E. and Sîrbu, M. (2012). Stochastic Perron’s method and verification without smoothness using viscosity comparison: the linear case. Proc. Amer. Math. Soc. 140, 3645–3654.
Bayraktar, E. and Sîrbu, M. (2013). Stochastic Perron’s method for Hamilton–Jacobi–Bellman equations. SIAM J. Control Optim. 51, 4274–4294.
Bayraktar, E. and Sîrbu, M. (2014). Stochastic Perron’s method and verification without smoothness using viscosity comparison: obstacle problems and Dynkin games. Proc. Amer. Math. Soc. 142, 1399–1412.
Bayraktar, E. and Yao, S. (2013). A weak dynamic programming principle for zero-sum stochastic differential games with unbounded controls. SIAM J. Control Optim. 51, 2036–2080.
Berkovitz, L. and Fleming, W. (1955). On Differential Games with Integral Payoff (Ann. Math. Study 39). Princeton University Press, Princeton, NJ.
Biswas, I. (2012). On zero-sum stochastic differential games with jump-diffusion driven state: a viscosity solution framework. SIAM J. Control Optim. 50, 1823–1858.
Blanchet-Scalliet, C., El Karoui, N., Jeanblanc, M. and Martellini, L. (2008). Optimal investment decisions when time-horizon is uncertain. J. Math. Econom. 44, 1100–1113.
Bruhn, K. and Steffensen, M. (2011). Household consumption, investment and life insurance. Insurance Math. Econom. 48, 315–325.
Buckdahn, R. and Li, J. (2008). Stochastic differential games and viscosity solutions of Hamilton–Jacobi–Bellman–Isaacs equations. SIAM J. Control Optim. 47, 444–475.
Buckdahn, R., Li, J. and Quincampoix, M. (2014). Value in mixed strategies for zero-sum stochastic differential games without Isaacs condition. Ann. Prob. 42, 1724–1768.
Cardaliaguet, P. and Rainer, C. (2009). Stochastic differential games with asymmetric information. Appl. Math. Optim. 59, 1–36.
Crandall, M. and Lions, P.-L. (1983). Viscosity solutions of Hamilton–Jacobi equations. Trans. Amer. Math. Soc. 277, 1–42.
Crandall, M., Ishii, H. and Lions, P.-L. (1992). User’s guide to viscosity solutions of second order partial differential equations. Bull. Amer. Math. Soc. 27, 1–67.
Duarte, I., Pinheiro, D., Pinto, A. and Pliska, S. (2014). Optimal life insurance purchase, consumption and investment on a financial market with multi-dimensional diffusive terms. Optimization 63, 1737–1760.
Elliott, R. and Kalton, N. (1972). The Existence of Value in Differential Games (Mem. Amer. Math. Soc. 126). American Mathematical Society.
Evans, L. and Souganidis, P. (1984). Differential games and representation formulas for solutions of Hamilton–Jacobi–Isaacs equations. Indiana Univ. Math. J. 33, 773–797.
Fleming, W. and Souganidis, P. (1989). On the existence of value functions of two-player, zero-sum stochastic differential games. Indiana Univ. Math. J. 38, 293–314.
Friedman, A. (1971). Differential Games. Wiley, New York, NY.
Friedman, A. (1972). Stochastic differential games. J. Differential Equations 11, 79–108.
Hamadène, S. and Mu, R. (2015). Existence of Nash equilibrium points for Markovian non-zero-sum stochastic differential games with unbounded coefficients. Stochastics 87, 85–111.
Isaacs, R. (1965). Differential Games: A Mathematical Theory with Applications to Warfare. Wiley, New York, NY.
Ishii, H. (1989). On uniqueness and existence of viscosity solutions of fully nonlinear second-order elliptic PDEs. Comm. Pure Appl. Math. 42, 15–45.
Ishii, H. and Lions, P.-L. (1990). Viscosity solutions of fully nonlinear second-order elliptic partial differential equations. J. Differential Equations 83, 26–78.
Karatzas, I. and Shreve, S. (2000). Brownian Motion and Stochastic Calculus. Springer, New York, Heidelberg and Berlin.
Katsoulakis, M. (1995). A representation formula and regularizing properties for viscosity solutions of second-order fully nonlinear degenerate parabolic equations. Nonlinear Anal. 24, 147–158.
Kumar, K. (2008). Nonzero sum stochastic differential games with discounted payoff criterion: an approximating Markov chain approach. SIAM J. Control Optim. 47, 374–395.
Kwak, M., Yong, H. and Choi, U. (2009). Optimal investment and consumption decision of family with life insurance. Insurance Math. Econom. 48, 176–188.
Mao, X. (2007). Stochastic Differential Equations and Applications, 2nd edn. Horwood Publishing, Chichester, UK.
Marín-Solano, J. and Shevkoplyas, E. (2011). Non-constant discounting and differential games with random time horizon. Automatica 47, 2626–2638.
Mousa, A., Pinheiro, D. and Pinto, A. (2016). Optimal life insurance purchase from a market of several competing life insurance providers. Insurance Math. Econom. 67, 133–144.
Nisio, M. (1988). Stochastic differential games and viscosity solutions of Isaacs equations. Nagoya Math. J. 110, 163–184.
Petrosyan, L. and Murzov, N. (1966). Game-theoretic problems of mechanics. Litovsk. Math. Sb. 6, 423–433.
Petrosyan, L. and Shevkoplyas, E. (2000). Cooperative games with random duration. Vestnik of St. Petersburg Univ. Series 1 4, 18–23.
Petrosyan, L. and Shevkoplyas, E. (2003). Cooperative solutions for games with random duration. Game Theory and Applications IX, 125–139.
Pham, T. and Zhang, J. (2014). Two person zero-sum game in weak formulation and path dependent Bellman–Isaacs equation. SIAM J. Control Optim. 52, 2090–2121.
Pliska, S. and Ye, J. (2007). Optimal life insurance purchase and consumption/investment under uncertain lifetime. J. Bank Finance 31, 1307–1319.
Shen, Y. and Wei, J. (2016). Optimal investment-consumption-insurance with random parameters. Scand. Actuar. J. 2016, 37–62.
Sîrbu, M. (2014). Stochastic Perron’s method and elementary strategies for zero-sum differential games. SIAM J. Control Optim. 52, 1693–1711.
Souganidis, P. (1985). Approximation schemes for viscosity solutions of Hamilton–Jacobi equations. J. Differential Equations 59, 1–43.
Souganidis, P. (1985). Max-min representations and product formulas for the viscosity solutions of Hamilton–Jacobi equations with applications to differential games. Nonlinear Anal. 9, 217–257.
Tang, S. and Hou, S. (2007). Switching games of stochastic differential systems. SIAM J. Control Optim. 46, 900–929.
Yaari, M. (1965). Uncertain lifetime, life insurance and the theory of the consumer. Rev. Econom. Stud. 32, 137–150.