1. Introduction
In recent years, stochastic control theory has attracted significant interest in the insurance literature, because an insurance company can control its surplus process so that a given objective function is minimized (or maximized). Three main criteria are studied: maximizing the discounted total dividends, minimizing the probability of ruin and maximizing exponential utility. The modern continuous-time approach was pioneered by Browne (1995) and Asmussen & Taksar (1997), who applied classical stochastic control methods to reduce the optimization problem to solving a Hamilton-Jacobi-Bellman (HJB) equation. Browne (1995) found the optimal investment strategies that minimize the probability of ruin and maximize exponential utility under the model of Brownian motion with drift. For the same model, Asmussen & Taksar (1997) obtained the optimal dividend strategy. Since this pioneering work, many attempts have been made to solve the optimization problem in frameworks that allow more controls. Examples where the optimal dividend problem is treated under a diffusion model are Paulsen & Gjessing (1997), Paulsen (2003), Asmussen et al. (2000), Højgaard & Taksar (1999, 2001, 2004) and Choulli et al. (2003). For the same model, Schmidli (2001), Taksar & Markussen (2003), Promislow & Young (2005) and Bai & Guo (2008) considered the problem of minimizing the probability of ruin. All of these works under the diffusion model give closed-form solutions. In the classical risk model, however, the corresponding HJB equation contains integro and differential terms simultaneously and is therefore more difficult to solve. When the objective function is an exponential utility, Yang & Zhang (2005) gave a closed-form solution for the jump-diffusion model, whereas when the objective is mean-variance, Bai & Zhang (2008) gave the optimal solution for the classical risk model. For other objective functions, especially the ruin probability, only the existence of a solution to the HJB equation has been proved, together with a verification theorem; see Hipp & Plum (2000), Hipp & Taksar (2000), Schmidli (2001, 2002) and Hipp & Vogt (2003).
In this paper, the surplus is modeled as a compound Poisson process perturbed by diffusion. We assume that the insurer is allowed to ask its customers for input in order to minimize the distance of the surplus from a prescribed target path together with the total discounted cost over a fixed interval. The objective is then to find the amount of input at each time (the optimal control) that minimizes this criterion and to calculate the minimal value (the optimal value function).
We first solve this optimization problem by a dynamic programming approach. By reducing the HJB equation to ordinary differential equations, we obtain analytic expressions for the optimal control and the optimal value function. The problem is then treated again by the completion of squares method and by the stochastic maximum principle. These differ from the dynamic programming approach in that both methods lead to a stochastic differential equation for the optimal control process rather than a nonlinear partial differential equation for the optimal value function. Solving this stochastic differential equation yields the optimal control; the optimal value function is then obtained by two different methods again.
By comparing the three methods, we find that (1) dynamic programming is the most efficient method for the problem solved in this paper, and (2) dynamic programming is limited to Markov processes, whereas the other two methods, the completion of squares and the stochastic maximum principle, are not. The solution procedure given in this paper therefore suggests using these two methods to solve optimal control problems for non-Markov risk processes (for example, the classical risk model perturbed by fractional Brownian motion).
The paper is organized as follows. In section 2, the model assumptions are formulated, and the control and the objective function are introduced. In section 3, the control problem is solved in three parts. In subsection 3.1, the dynamic programming approach is used: the optimal value function and the optimal control are obtained from the solution and the minimizing function of the HJB equation. In subsection 3.2, a stochastic differential equation for the optimal control process is first obtained by the completion of squares approach; solving this equation yields the optimal control, and the optimal value function is then obtained from its definition. In subsection 3.3, the same stochastic differential equation is obtained using the Hamiltonian system. Then, to obtain the optimal value function, we give a lemma that complements the results of Framstad et al. (2004). Combining these results, the expression of the optimal value function is obtained again.
2. The Model
Consider the following classical surplus process perturbed by diffusion
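$$X_t = x + ct - \sum_{i=1}^{N_t} Y_i + \sigma B_t, \qquad t\ge 0 \qquad\qquad (2.1)$$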
where c is the rate at which the premiums are received, $\{N_t;\,t\ge 0\}$ is a Poisson process with parameter β denoting the total number of claims, with claim times $T_i$ $(i=1,2,\ldots)$. $Y_1, Y_2,\ldots$, independent of $\{N_t;\,t\ge 0\}$, are positive i.i.d. random variables with common distribution function (df) F(x) and moments $\mu_j=\int_0^\infty x^j F(dx)$ for $j=1,2$. $\{B_t;\,t\ge 0\}$ is a standard Wiener process that is independent of the aggregate claims process $\sum_{i=1}^{N_t} Y_i$, and σ is the dispersion parameter.
In addition to the premium income, we here assume that the company also receives interest on its reserves with interest force $\delta_t$. The interest force is assumed to be a deterministic function of time. Thus, the surplus at time t, without control, is given by the dynamics
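$$dX_t = \left(c + \delta_t X_t\right)dt + \sigma\,dB_t - d\left(\sum_{i=1}^{N_t} Y_i\right), \qquad X_s = x,\quad s\le t\le T \qquad\qquad (2.2)$$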
where T is a fixed time and s and x denote the initial time and initial surplus, respectively.
For the remainder of this paper, we work on a complete probability space (Ω, F, P) on which the process $\{X_t,\, 0\le t\le T\}$ is defined. The information at time t is given by the complete filtration $\left\{F_t^s\right\}$ generated by $\{X_t,\, s\le t\le T\}$.
A strategy α is described by a stochastic process $\left\{u_t^{\alpha},\, s\le t\le T\right\}$, where $u_t^{\alpha}$ represents the input in the time interval (t, t+dt). When applying the strategy α, we let $\left\{X_t^{\alpha}\right\}$ denote the controlled risk process. The dynamics of $X_t^{\alpha}$ are then given by
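$$dX_t^{\alpha} = \left(c + \delta_t X_t^{\alpha} + u_t^{\alpha}\right)dt + \sigma\,dB_t - d\left(\sum_{i=1}^{N_t} Y_i\right), \qquad X_s^{\alpha} = x \qquad\qquad (2.3)$$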
The strategy α is said to be admissible if $u_t^{\alpha}$ is $F_t^s$-progressively measurable and such that the stochastic differential equation (2.3) has a unique solution. In this case, we call the process $\left\{u_t^{\alpha}\right\}$ the control process or simply the control. We denote by Π the set of all admissible strategies.
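For illustration, the controlled dynamics (2.3) can be simulated by a simple Euler scheme. The following is a minimal sketch, assuming Exp(1) claim sizes and a hypothetical linear feedback rule for $u_t^{\alpha}$; all numerical values are placeholders, not quantities prescribed by the model.

```python
import numpy as np

# Euler-scheme sketch of the controlled surplus (2.3). All numerical values
# (premium rate, claim law, interest force, feedback rule) are illustrative
# assumptions only.
rng = np.random.default_rng(1)

c, sigma, beta = 2.0, 1.0, 1.0      # premium rate, dispersion, Poisson claim rate
delta = lambda t: 0.03              # interest force delta_t
u = lambda t, x: -0.5 * x           # hypothetical Markov feedback control u(t, x)

def simulate_path(s=0.0, x=1.0, T=10.0, n_steps=10_000):
    """Simulate one path of the controlled surplus on [s, T]."""
    dt = (T - s) / n_steps
    X = np.empty(n_steps + 1)
    X[0], t = x, s
    for k in range(n_steps):
        # at most one claim per small step; claim sizes are Exp(1) here
        claim = rng.exponential(1.0) if rng.random() < beta * dt else 0.0
        drift = c + delta(t) * X[k] + u(t, X[k])
        X[k + 1] = X[k] + drift * dt + sigma * np.sqrt(dt) * rng.normal() - claim
        t += dt
    return X

print(f"X_T on one simulated path: {simulate_path()[-1]:.3f}")
```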
For a given admissible strategy α, we define the objective function $V_{\alpha}$ by
In (2.4), $q_t$ and $A_t$ are both continuous functions on the interval [0, T), and λ denotes a discount rate. $A_t$ represents the prescribed target path, and $q_t$ represents the prescribed proportion. In particular, when $q_t=1$, this choice of objective function ensures the minimization of the distance from the prescribed target path $A_t$ and simultaneously minimizes the total discounted cost over the interval [s, T].
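In view of the terminal value (3.2) and the relation $u_t^{*}=-p_t e^{\lambda t}$ derived in sections 3.2 and 3.3, (2.4) is a criterion of linear-quadratic type; a representative form consistent with those relations is
$$V_{\alpha}(s,x) = E\left[\int_s^T \frac{1}{2}e^{-\lambda t}\left\{q_t\left(X_t^{\alpha}-A_t\right)^2 + \left(u_t^{\alpha}\right)^2\right\}dt + \frac{1}{2}q_T\left(X_T^{\alpha}-A_T\right)^2\right].$$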
The objective is to find the optimal value function
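$$V(s,x) = \inf_{\alpha\in\Pi} V_{\alpha}(s,x) \qquad\qquad (2.5)$$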
and to find an optimal control α* such that
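$$V_{\alpha^{*}}(s,x) = V(s,x)$$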
Let $C^{1,2}$ denote the space of functions ϕ(r, x) such that ϕ and its partial derivatives $\phi_r$, $\phi_x$, $\phi_{xx}$ are continuous on [0, T]×R. Let $C_{pc}^{1,2}$ denote the space of ϕ(r, x) such that $\phi\in C^{1,2}$ and ϕ satisfies a polynomial growth condition, i.e., there exist constants K and n such that, for all $(r,x)\in R_{+}\times R$, $|\phi(r,x)|\le K\left(1+|x|^n\right)$. Moreover, ϕ(r, x) satisfies
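$$E\left[\int_s^T \int_0^\infty \left|\phi\left(r, X_r^{\alpha}-y\right)-\phi\left(r, X_r^{\alpha}\right)\right|\,\beta F(dy)\,dr\right] < \infty \qquad\qquad (2.6)$$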
for any control α. As will be seen in Theorem 3.1, the polynomial growth condition mainly ensures that the stochastic integral with respect to Brownian motion is a martingale (see Fleming & Soner (1993, p. 135)), while (2.6) ensures that the stochastic integral with respect to the compensated Poisson point process is a martingale (see Brémaud (1981, p. 235)).
Let $L_F^2(s,T;R)$ denote the set of all $\left\{F_t^s\right\}_{t\ge s}$-adapted R-valued processes Y(·) such that $E\int_s^T|Y(r)|^2\,dr<\infty$.
3. Solution of the Control Problem
We now present an analytic solution of the control problem. The problem is treated in three ways. The first is the dynamic programming approach, which is traditionally used to solve optimal control problems in which the controlled process has the Markov property. The second is a completion of squares method, inspired by the recent work of Frangos et al. (2008) on the same linear-quadratic problem under fractional Brownian motion. The third is the application of a stochastic maximum principle for jump diffusions, which was proposed for general control problems by Framstad et al. (2004).
3.1. The dynamic programming method
From standard arguments, we know that if the optimal value function $V\in C^{1,2}$, then V satisfies the following Hamilton-Jacobi-Bellman (HJB) equation
with the terminal value
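$$W(T,x) = \frac{1}{2}q_T\left(x-A_T\right)^2 \qquad\qquad (3.2)$$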
where, for notational convenience, we replace s by t in (2.5), and Y is a generic random variable with the same distribution as the $Y_i$ $(i=1,2,\ldots)$.
Note that in many cases the optimal value function may fail to be sufficiently smooth to satisfy the HJB equation (3.1) in the classical sense. However, it still satisfies (3.1) in the viscosity sense (see Fleming & Soner (1993)). The following verification theorem shows that a classical solution to the HJB equation yields the solution to the optimization problem.
Theorem 3.1. Assume that $W\in C_{pc}^{1,2}$ satisfies (3.1)–(3.2). Then, the value function V given by (2.5) and W coincide. Furthermore, let $u^{*}(t,x)$ be such that
for all (t, x)∈[0, T]×R. Then, the Markov control strategy α* of the form $u_t^{*}=u^{*}\left(t,X_t^{u^{*}}\right)$ is optimal. Specifically, $W(t,x)=V(t,x)=V_{\alpha^{*}}(t,x)$.
Proof Let α be an arbitrary control. Then, by applying the Itô formula
since W(t, x) satisfies (3.1). The terminal value (3.2) implies that $W\left(T,X_T^{\alpha}\right)=\frac{1}{2}q_T\left(X_T^{\alpha}-A_T\right)^2$. Then, rearranging yields
Since the compound Poisson process has only finitely many jumps in any finite interval, the second integral does not change if r is replaced by r−. Thus, since $W\in C_{pc}^{1,2}$, we have that
is a martingale. Taking expectations on both sides of inequality (3.3), it follows that $V_{\alpha}(t,x)\ge W(t,x)$, which implies $V(t,x)\ge W(t,x)$. For the optimal control α*, the inequality becomes an equality, that is, $V_{\alpha^{*}}(t,x)=W(t,x)$. Thus, $V(t,x)\le W(t,x)$, which completes the proof.
We see from Theorem 3.1 that if a classical solution $W\in C_{pc}^{1,2}$ to (3.1)–(3.2) can be found, then we have the (unique) optimal value function V(t, x) and the corresponding optimal control α*. In other words, for the above optimization problem, we need to solve the nonlinear partial differential equation (3.1) and find the value u*(t, x) that minimizes the function
Theorem 3.2. Define
where π(·), g(·) satisfy
f(t) satisfies
with boundary condition
Then $W(t,x)\in C_{pc}^{1,2}$ is a solution of the HJB equation (3.1). The corresponding minimizing function is given by
with terminal condition
Proof By direct calculation, we obtain that
Differentiating (3.4) with respect to u and setting the derivative equal to zero result in
Thus, we have
Plugging (3.10) into (3.11), we obtain
Notice that $E[Y]=\mu_1$ and $E[Y^2]=\mu_2$. Inserting (3.6), (3.7) and (3.8) into (3.12), we obtain that
It is obvious that W(T, x) satisfies (3.2). Thus, we deduce that W(t, x) is a solution of (3.1)–(3.2) and that the optimal control is given by (3.9).
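Numerically, the system (3.6)–(3.8) for π(t), g(t) and f(t) is a terminal-value problem and can be integrated backward in time. The following is a minimal scipy sketch of that pattern; the right-hand sides and terminal data are hypothetical placeholders (a Riccati-type equation for π and a linear equation for g are assumed, and only the (π, g) pair is shown), so the actual (3.6)–(3.8) must be substituted before use.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Backward integration of a terminal-value ODE system of the kind satisfied by
# pi(t) and g(t) in Theorem 3.2. The right-hand sides below are HYPOTHETICAL
# placeholders; substitute the actual equations (3.6) before use.
T, lam, delta, q_T = 10.0, 0.05, 0.03, 1.0

def rhs(t, y):
    pi, g = y
    d_pi = np.exp(lam * t) * pi**2 - 2.0 * delta * pi - 1.0  # placeholder Riccati form
    d_g = (np.exp(lam * t) * pi - delta) * g                 # placeholder linear form
    return [d_pi, d_g]

# solve_ivp accepts a decreasing time span, so the terminal condition at t = T
# is imposed directly and the system is integrated down to t = 0; the terminal
# values used here (pi(T) = q_T, g(T) = 0) are likewise illustrative.
sol = solve_ivp(rhs, (T, 0.0), y0=[q_T, 0.0], dense_output=True, rtol=1e-8)
pi_0, g_0 = sol.sol(0.0)
print(f"pi(0) = {pi_0:.4f}, g(0) = {g_0:.4f}")
```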
Remark 3.1. In particular, let $\delta_t=q_t=0$ for t<T. In this case, we can obtain the solution of equation (3.6):
3.2. The completion of squares method
We now show that the optimal control can be given by the solution of a forward-backward stochastic differential equation. The approach is similar to that of Frangos et al. (2008).
Theorem 3.3. The optimal control α* is given by $u_t^{*}=-p_t e^{\lambda t}$, where $p_t$ is the solution of the following backward stochastic differential equation:
for (s, x)∈[0, T]×R. Here, $X_t^{*}$ denotes the resulting process controlled by $\left\{u_t^{*}\right\}$, and $\eta_t$ and $\gamma_t$ are two continuous processes such that
for any control α.
Proof Let α be an arbitrary control and recall the definition of $V_{\alpha}$ given in (2.4). The objective function $V_{\alpha}$ may not be continuously differentiable. Consider
Using the equality
results in $V_{\alpha}(s,x)-V_{\alpha^{*}}(s,x)=I_1+I_2$, where
in which we have used $u_t^{*}=-p_t e^{\lambda t}$. Considering that $X_t^{\alpha}$ and $X_t^{*}$ solve equation (2.3), we can obtain
Substituting $u_t^{\alpha}-u_t^{*}$ from (3.17) into $I_2$ yields
In view of (3.14), $p_t$ satisfies
Then, $I_2$ evolves as
Since (3.15) and (3.16) imply that
is a martingale, $I_2$ becomes
Applying the Itô formula to $\left(X_t^{\alpha}-X_t^{*}\right)p_t$ results in
An analysis similar to that in Theorem 3.1 shows that the first term does not change if t− is replaced by t. Simultaneously, since $X_t^{\alpha}-X_t^{*}$ is a continuous finite-variation process, t− in the second term can also be replaced by t, and the last term is equal to zero. Thus, we have
where the second equality follows from the boundary condition $p_T=q_T\left(X_T^{*}-A_T\right)$. We therefore conclude that $V_{\alpha}(s,x)\ge V_{\alpha^{*}}(s,x)$ for all α∈Π, which proves that α* is optimal.
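The mechanism of the proof is the usual linear-quadratic completion of squares: the adjoint process $p_t$ is chosen precisely so that all terms linear in the perturbation $X_t^{\alpha}-X_t^{*}$ cancel, and the difference of the objective functions reduces to an expectation of non-negative quadratic remainders. Schematically, under the linear-quadratic structure of (2.4),
$$V_{\alpha}(s,x)-V_{\alpha^{*}}(s,x) = E\left[\int_s^T \frac{1}{2}e^{-\lambda t}\left\{\left(u_t^{\alpha}-u_t^{*}\right)^2 + q_t\left(X_t^{\alpha}-X_t^{*}\right)^2\right\}dt + \frac{1}{2}q_T\left(X_T^{\alpha}-X_T^{*}\right)^2\right] \ge 0,$$
with equality exactly when $u^{\alpha}=u^{*}$.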
We now give the solution of equation (3.14), which provides the optimal control and coincides with the result obtained in the previous subsection.
Theorem 3.4. The solution of equation (3.14) has the form
where the deterministic functions π(t) and g(t) are the solutions of the ordinary differential equation (3.6).
Proof Assume that $q_T$, $A_T$ are deterministic. Then,
Substituting (3.18) into (3.14), we have
and
Then, (3.19) becomes
Thus, by comparing the coefficient of $X_t^{*}\,dt$ in (3.20) and (3.21), we have
Taking the coefficients of $\sum_{i=1}^{N_t} Y_i$ and $dB_t$ yields
From the terms with dt, we have
The condition $p_T=q_T\left(X_T^{*}-A_T\right)$ implies that we have the following final conditions:
The proof is complete.
We now calculate the optimal value function V(s, x), that is, the objective function corresponding to the optimal control $u_t^{*}$. First, by definition, we have
Substituting $p_T=q_T\left(X_T^{*}-A_T\right)$ into the last term and applying the Itô formula to the resulting term yield
Then, we write $E\left[p_T\left(X_T^{*}-A_T\right)\right]=J_1+J_2$, where
and
Plugging (3.14) into the right-hand side of the above equality and using the martingale property result in
Note that $J_2$ does not change if t− is replaced by t. Thus, we have
Rearranging yields
In addition, Theorem 3.3 shows that $e^{2\lambda t}p_t^2=\left(u_t^{*}\right)^2$. Thus, all the above reasoning yields
In the following, we use properties of the function g(t) to cancel the stochastic term of (3.23). First, the boundary condition g(T)=0 yields $g(T)\left(X_T^{*}-A_T\right)=0$. On the other hand, applying the Itô formula to $g(t)\left(X_t^{*}-A_t\right)$ results in
Replacing $dX_t^{*}$ by the first equality in (3.14) and g′(t) by the second equality in (3.6) yields
Replacing t− by t and adding (3.25) to (3.23) result in
Replacing $\pi(t)X_t^{*}$ by means of (3.18) on the right-hand side and rearranging yields
where
In view of the first equality of (3.14), we have
So
which coincides with (3.5) and (3.7), with the boundary condition (3.8).
3.3. The maximum principle
This subsection employs the maximum principle to solve the problem. According to Framstad et al. (2004), the Hamiltonian $H:[0,T]\times R^4\times\mathcal{R}\to R$ for the above problem becomes
where ℛ is the set of functions $r: R^2\to R$ such that the integral in (3.26) converges. The adjoint equation (corresponding to the pair (X, u)) in the unknown adapted processes p(t)∈R, Q(t)∈R and r(t, z)∈R is the backward stochastic differential equation (BSDE)
with terminal condition
where N(t, z) is a Poisson random measure with Lévy measure βF(dz).
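For orientation, recall the general objects from Framstad et al. (2004), of which (3.26)–(3.28) are the specialization to our model (up to the sign conventions needed for a minimization problem): for a controlled jump diffusion $dX_t=b(t,X_t,u_t)\,dt+\sigma(t,X_t,u_t)\,dB_t+\int_R\gamma(t,X_t,u_t,z)\,\tilde N(dt,dz)$ with running gain f and terminal gain g, the Hamiltonian and the adjoint BSDE read
$$H\left(t,x,u,p,q,r(\cdot)\right) = f(t,x,u) + b(t,x,u)\,p + \sigma(t,x,u)\,q + \int_R \gamma(t,x,u,z)\,r(t,z)\,\nu(dz),$$
$$dp(t) = -\frac{\partial H}{\partial x}\left(t,X_t,u_t,p(t),Q(t),r(t,\cdot)\right)dt + Q(t)\,dB_t + \int_R r(t^{-},z)\,\tilde N(dt,dz), \qquad p(T)=g'\left(X_T\right),$$
where $\tilde N$ is the compensated jump measure with Lévy measure ν. In our model, σ is constant, the jump integrand is $\gamma(t,x,u,z)=-z$, ν(dz)=βF(dz), and the drift is $b(t,x,u)=c+\delta_t x+u$ (up to the compensator correction if the claims are written against $\tilde N$ rather than N).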
By Framstad et al. (2004, Theorem 2.1), (X*, u*) is an optimal pair if it satisfies
for all t∈[s, T] and that
exists and is a concave function of x for all t∈[s, T], where (p(t), Q(t), r(t, z)) is a solution of the adjoint equation (3.27) corresponding to (X*, u*). We take $\gamma(t,z)=\eta_t z$ and $Q(t)=-\sigma\gamma_t$; then, (3.27)–(3.28) become
All the above statements yield that the optimal control $u_t^{*}$ is
in which $p_t$ is the solution of the following equation:
Note that the optimal control given by (3.31)–(3.32) is equal to that given by Theorem 3.3. Thus, Theorem 3.3 is proven again. The solution of equation (3.32) was given in Theorem 3.4.
We now seek the expression of the optimal value function V not via its definition or the HJB equation but via the relations between the maximum principle and dynamic programming in the jump-diffusion case. By Framstad et al. (2004, equation (24a)), we know that
In our case, this implies that
where f(t) is a suitable function. To determine f(t), we give the following lemma, which also complements the results given by Framstad et al. (2004).
Lemma 3.1. Let (X*, u*) be an optimal pair. Suppose that the optimal value function $V\in C^{1,2}$. Then,
where G is defined by
Proof The previous analysis shows that the optimal control is
This shows that the optimal control is Markovian, i.e., it depends only on the current surplus and not on the history of the process. Thus, the resulting surplus process $X_t^{*}$ still has the Markov property. Therefore, we have
Inspired by Yong & Zhou (1999, p. 251), we define
Clearly, m(·) is an $F_t^s$-adapted square-integrable martingale. Thus, by the martingale representation theorem (see Tang & Li (1994, Lemma 2.3)), we have
where $M\in L_F^2(s,T;R)$ and $H\in B_F(s,T;R)$. Then, by (3.38) and (3.40),
On the other hand, applying the Itô formula to $V\left(t,X_t^{*}\right)$ yields
Comparing (3.41) with (3.42) results in
This proves the first equality in (3.35). Since $V\in C^{1,2}$, it satisfies the HJB equation (3.1), which implies the second equality in (3.35).
Combining (3.34) and (3.37) with the first equality of (3.35), we again obtain (3.5) with f(t) given by (3.7)–(3.8).
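Finally, the optimality of a feedback rule can be checked crudely by Monte Carlo: simulate the controlled surplus under a candidate rule and under a perturbed rule, and compare the realized values of the criterion. The sketch below assumes a quadratic criterion of the type (2.4); every numerical ingredient (parameters, claim law, both feedback rules) is an illustrative placeholder, and in practice one would substitute the optimal feedback (3.9) from Theorem 3.2.

```python
import numpy as np

# Monte Carlo comparison of two feedback rules under a quadratic criterion of
# the type (2.4). All parameters, the claim law and both feedback rules are
# illustrative assumptions.
rng = np.random.default_rng(7)
c, sigma, beta, delta = 2.0, 1.0, 1.0, 0.03
lam, q, q_T, T, A = 0.05, 1.0, 1.0, 5.0, 0.0

def expected_cost(u_feedback, x0=1.0, n_steps=500, n_paths=500):
    """Average realized cost over simulated paths on [0, T]."""
    dt, total = T / n_steps, 0.0
    for _ in range(n_paths):
        x, J, t = x0, 0.0, 0.0
        for _ in range(n_steps):
            u = u_feedback(t, x)
            J += 0.5 * np.exp(-lam * t) * (q * (x - A) ** 2 + u ** 2) * dt
            claim = rng.exponential(1.0) if rng.random() < beta * dt else 0.0
            x += (c + delta * x + u) * dt + sigma * np.sqrt(dt) * rng.normal() - claim
            t += dt
        total += J + 0.5 * q_T * (x - A) ** 2
    return total / n_paths

candidate = lambda t, x: -1.2 * x        # stand-in for the optimal feedback u*(t, x)
perturbed = lambda t, x: -1.2 * x + 0.5  # the same rule shifted away from the candidate
print(expected_cost(candidate), expected_cost(perturbed))
```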
Acknowledgements
This work was supported by the National Natural Science Foundation of China (11471171, 11571189).