1. Introduction
In recent years, stochastic control theory has gained significant interest in the insurance literature. This is because the insurance company can control the surplus process such that a certain objective function is minimized (or maximized). In particular, there are three main criteria: maximizing the discounted total dividend, minimizing the probability of ruin and maximizing the exponential utility. The corresponding modern continuous-time approach was pioneered by Browne (1995) and Asmussen & Taksar (1997), who applied classical stochastic control methods to reduce the optimization problem to solving a Hamilton-Jacobi-Bellman (HJB) equation. Browne (1995) found the optimal investment strategies to minimize the probability of ruin and to maximize the exponential utility function under the model of Brownian motion with drift. For the same model, Asmussen & Taksar (1997) obtained the optimal dividend strategy. Since their pioneering work, many attempts have been made to solve the optimization problem in frameworks that allow more controls. Examples where the optimal dividend problem was treated under the diffusion model are Paulsen & Gjessing (1997), Paulsen (2003), Asmussen et al. (2000), Højgaard & Taksar (1999, 2001, 2004) and Choulli et al. (2003). For the same model, Schmidli (2001), Taksar & Markussen (2003), Promislow & Young (2005) and Bai & Guo (2008) considered the problem of minimizing the probability of ruin. All of these works, under the diffusion model, obtained closed-form solutions. However, in the classical risk model, the corresponding HJB equation contains integro and differential terms simultaneously and is therefore more difficult to solve. When the objective function is an exponential utility function, Yang & Zhang (2005) gave the closed-form solution for the jump-diffusion model, whereas when the objective is mean-variance, Bai & Zhang (2008) gave the optimal solution for the classical risk model. For other objective functions, especially the ruin probability, only the existence of a solution to the HJB equation was proved, together with a verification theorem; see Hipp & Plum (2000), Hipp & Taksar (2000), Schmidli (2001, 2002), and Hipp & Vogt (2003).
In this paper, the surplus is modeled as a compound Poisson process perturbed by diffusion. We assume that the insurer is allowed to ask its customers for input so as to keep the surplus close to some prescribed target path while keeping the total discounted cost on a fixed interval small. The objective is then to find the amount of input at every time (the optimal control) such that the distance from the prescribed target path and the total discounted cost are minimized, and to calculate the minimal value (the optimal value function).
For the above optimization problem, we first use a dynamic programming approach. By converting the HJB equation into a partial differential equation without the integro term, analytic solutions for the optimal control and the optimal value function are obtained. The problem is then treated again by the completion of squares method and by the stochastic maximum principle. These differ from the dynamic programming approach in that the two methods lead to a stochastic differential equation for the optimal control process rather than a nonlinear partial differential equation for the optimal value function. Solving the stochastic differential equation yields the optimal control. The optimal value function is then obtained, again by two different methods.
By comparing the three methods, we find that (1) dynamic programming is the most effective method for obtaining the optimal solution in this paper, and (2) dynamic programming is limited to Markov processes, whereas the other two methods, the completion of squares and the stochastic maximum principle, are not. Therefore, the procedure given in this paper for obtaining the optimal solution suggests that these two methods can be used to solve optimal control problems for non-Markov risk processes (for example, the classical risk model with fractional Brownian motion perturbation).
The paper is organized as follows. In section 2, the model assumptions are formulated, and the control and the objective function are introduced. In section 3, the control problem is solved; the treatment is divided into three parts. In subsection 3.1, the dynamic programming approach is used: the optimal value function and the optimal control are obtained from the solution and the minimizing function of the HJB equation. In subsection 3.2, a stochastic differential equation for the optimal control process is first obtained by the completion of squares approach; solving this equation yields the optimal control, and the optimal value function is then obtained from its definition. In subsection 3.3, the same stochastic differential equation is obtained using the Hamiltonian system. Then, to obtain the optimal value function, we give a lemma that complements the results of Framstad et al. (2004). Combining these results, the expression for the optimal value function is obtained once again.
2. The Model
Consider the following classical surplus process perturbed by diffusion
$$X_t = x + ct - \sum_{i=1}^{N_t} Y_i + \sigma B_t, \qquad t\geq 0 \tag{2.1}$$
where c is the rate at which the premiums are received; $\{N_t;\, t\geq 0\}$ is a Poisson process with parameter β, denoting the total number of claims, with claim times $T_i$ (i=1, 2, …); $Y_1, Y_2, \ldots$, independent of $\{N_t;\, t\geq 0\}$, are positive i.i.d. random variables with a common distribution function (df) F(x) and moments $\mu_j = \int_0^\infty x^j F(dx)$, j=1, 2; $\{B_t;\, t\geq 0\}$ is a standard Wiener process, independent of the aggregate claim process $\sum_{i=1}^{N_t} Y_i$; and σ is the dispersion parameter.
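For readers who want to experiment with the model, the following minimal Python sketch simulates the perturbed surplus process (2.1) on a time grid. The parameter values and the choice of exponential claim sizes are illustrative assumptions only; the paper requires nothing beyond a claim df F with finite first two moments.

```python
import numpy as np

def simulate_surplus(x0=10.0, c=1.5, beta=1.0, sigma=0.5,
                     claim_mean=1.0, T=10.0, n_steps=10000, seed=0):
    """Simulate X_t = x0 + c*t - sum_{i<=N_t} Y_i + sigma*B_t on a grid.

    Claim sizes Y_i are drawn exponential with mean `claim_mean`
    (an illustrative assumption).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    X = np.empty(n_steps + 1)
    X[0] = x0
    for k in range(n_steps):
        n_claims = rng.poisson(beta * dt)          # claims arriving in (t, t+dt]
        claims = rng.exponential(claim_mean, size=n_claims).sum()
        dB = rng.normal(0.0, np.sqrt(dt))          # Brownian increment
        X[k + 1] = X[k] + c * dt - claims + sigma * dB
    return X

path = simulate_surplus()
print("terminal surplus:", path[-1])
```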
In addition to the premium income, we assume here that the company also receives interest on its reserves with interest force $\delta_t$. The interest force is assumed to be a deterministic function of time. Thus, the surplus at time t, without control, is given by the dynamics
$$dX_t = \left(c + \delta_t X_t\right)dt - d\left(\sum_{i=1}^{N_t} Y_i\right) + \sigma\, dB_t, \qquad X_s = x,\;\; s\leq t\leq T \tag{2.2}$$
where T is a fixed time and s and x denote the initial time and initial surplus, respectively.
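Because $\delta_t$ is deterministic, the dynamics (2.2) are linear in $X_t$, and a pathwise integrating-factor (variation-of-constants) argument gives an explicit representation of the uncontrolled surplus. This representation is not used in the paper but may aid intuition:

$$X_t = e^{\int_s^t \delta_v\, dv}\left[x + \int_s^t e^{-\int_s^r \delta_v\, dv}\left(c\, dr + \sigma\, dB_r - d\left(\sum_{i=1}^{N_r} Y_i\right)\right)\right], \qquad s\leq t\leq T.$$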
For the remainder of this paper, we work on a complete probability space (Ω, F, P) on which the process $\{X_t,\, 0\leq t\leq T\}$ is defined. The information at time t is given by the complete filtration $\left\{F_t^s\right\}$ generated by $\{X_t,\, s\leq t\leq T\}$.
A strategy α is described by a stochastic process $\left\{u_t^\alpha,\, s\leq t\leq T\right\}$, where $u_t^\alpha$ represents the input in the time interval (t, t+dt). When applying the strategy α, we let $\left\{X_t^\alpha\right\}$ denote the controlled risk process. The dynamics of $X_t^\alpha$ are then given by
$$dX_t^\alpha = \left(c + \delta_t X_t^\alpha + u_t^\alpha\right)dt - d\left(\sum_{i=1}^{N_t} Y_i\right) + \sigma\, dB_t, \qquad X_s^\alpha = x \tag{2.3}$$
The strategy α is said to be admissible if $u_t^\alpha$ is $F_t^s$-progressively measurable and such that the stochastic differential equation (2.3) has a unique solution. In this case, we call the process $\left\{u_t^\alpha\right\}$ the control process or simply the control. We denote by Π the set of all admissible strategies.
For a given admissible strategy α, we define the objective function $V_\alpha$ by

In (2.4), $q_t$ and $A_t$ are both continuous functions on the interval [0, T), and λ denotes a discount rate. $A_t$ represents the prescribed target path, and $q_t$ represents the prescribed proportion. In particular, when $q_t = 1$, this choice of objective function ensures the minimization of the distance from the prescribed target path $A_t$ while simultaneously minimizing the total discounted cost over the interval [s, T].
The objective is to find the optimal value function
$$V(s,x) = \inf_{\alpha\in\Pi} V_\alpha(s,x) \tag{2.5}$$
and to find an optimal control α* such that
$$V_{\alpha^{*}}(s,x) = V(s,x)$$
Let $C^{1,2}$ denote the space of functions ϕ(r, x) such that ϕ and its partial derivatives $\phi_r$, $\phi_x$, $\phi_{xx}$ are continuous on [0, T]×R. Let $C_{pc}^{1,2}$ denote the space of functions ϕ(r, x) such that $\phi\in C^{1,2}$ and ϕ satisfies a polynomial growth condition, i.e., there exist constants K and n such that, for all $(r, x)\in R_{+}\times R$, $|\phi(r, x)|\leq K\left(1 + |x|^{n}\right)$. Moreover, ϕ(r, x) satisfies
$$E\left[\int_s^T\!\!\int_0^\infty \left|\phi\left(r, X_{r-}^\alpha - y\right) - \phi\left(r, X_{r-}^\alpha\right)\right| \beta F(dy)\, dr\right] \lt \infty \tag{2.6}$$
for any control α. As will be seen in Theorem 3.1, the polynomial growth condition mainly ensures that the stochastic integral with respect to Brownian motion is a martingale (see Fleming & Soner (1993), p. 135), while (2.6) ensures that the stochastic integral with respect to the compensated Poisson point process is a martingale (see Brémaud (1981), p. 235).
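For completeness, the martingale property invoked here can be stated as follows (a standard fact; see Brémaud (1981)): if N(dr, dy) denotes the Poisson random measure of the claim arrivals, with intensity measure βF(dy)dr, and the integrand h satisfies an integrability condition analogous to (2.6), then

$$t\,\mapsto\,\int_s^t\!\!\int_0^\infty h(r, y)\left[N(dr, dy) - \beta F(dy)\, dr\right]$$

is a martingale.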
Let $L_F^2\left(s, T;\, R\right)$ denote the set of all $\left\{F_t^s\right\}_{t\geq s}$-adapted R-valued processes Y(·) such that $E\int_s^T \left|Y(r)\right|^2 dr \lt \infty$.
3. Solution of the Control Problem
We now present an analytic solution of the control problem. The problem is treated in three ways. The first is the dynamic programming approach, which is traditionally used to solve optimal control problems in which the controlled process has the Markov property. The second is a completion of squares method, inspired by the recent work of Frangos et al. (2008) on the same linear quadratic problem under a fractional Brownian motion. The third is the application of a stochastic maximum principle for jump diffusions; this method was proposed for general control problems by Framstad et al. (2004).
3.1. The dynamic programming method
From standard arguments, we know that if the optimal value function $V\in C^{1,2}$, then V satisfies the following Hamilton-Jacobi-Bellman (HJB) equation

with the terminal value

where, for notational convenience, we replace s by t in (2.5), and Y is a generic random variable that has the same distribution as $Y_i$ (i=1, 2, …).
Note that in many cases, the optimal value function may fail to be sufficiently smooth to satisfy the HJB equation (3.1) in the classical sense. However, it still satisfies (3.1) in the viscosity sense (see Fleming & Soner (1993)). The following verification theorem shows that a classical solution to the HJB equation yields the solution to the optimization problem.
Theorem 3.1. Assume that $W\in C_{pc}^{1,2}$ satisfies (3.1)–(3.2). Then, the value function V given by (2.5) and W coincide. Furthermore, let u*(t, x) be such that

for all (t, x)∈[0, T]×R. Then, the Markov control strategy α* of the form $u_t^{*} = u^{*}\left(t, X_t^{u^{*}}\right)$ is optimal. Specifically, $W(t,x) = V(t,x) = V_{\alpha^{*}}(t,x)$.
Proof Let α be an arbitrary control. Then, by applying the Itô formula

since W(t, x) satisfies (3.1). The terminal value (3.2) implies that $W\left(T, X_T^\alpha\right) = \frac{1}{2} q_T \left(X_T^\alpha - A_T\right)^2$. Then, rearranging yields

Since the compound Poisson process has only finitely many jumps in any finite interval, the second integral does not change if r is replaced by r−. Thus, since $W\in C_{pc}^{1,2}$, we have that

is a martingale. Taking expectations on both sides of inequality (3.3), it follows that $V_\alpha(t,x)\geq W(t,x)$, which implies $V(t,x)\geq W(t,x)$. For the optimal control α*, the inequality becomes an equality, that is, $V_{\alpha^{*}}(t,x) = W(t,x)$. Thus, $V(t,x)\leq W(t,x)$, which completes the proof.
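Schematically, and suppressing the discount factor and the exact form of the running cost in (2.4), the verification argument establishes that, for any admissible α,

$$W(t,x)\,\leq\, E\left[\int_t^T \left(\text{running cost at } r\right) dr + W\left(T, X_T^\alpha\right)\right] = V_\alpha(t,x),$$

with equality when the minimizing control u* is used; taking the infimum over α then gives W = V.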
We see from Theorem 3.1 that if a classical solution $W\in C_{pc}^{1,2}$ to (3.1)–(3.2) can be found, then we have the (unique) optimal value function V(t, x) and the corresponding optimal control α*. In other words, for the above optimization problem, we need to solve the nonlinear partial differential equation (3.1) and find the value u*(t, x) that minimizes the function

Theorem 3.2. Define

where π(·), g(·) satisfy

f(t) satisfies

with boundary condition

Then $W(t,x)\in C_{pc}^{1,2}$ is a solution of the HJB equation (3.1). The corresponding minimizing function is given by

with terminal condition

Proof By direct calculation, we obtain that

Differentiating (3.4) with respect to u and setting the derivative equal to zero yields

Thus, we have

Plugging (3.10) into (3.11), we obtain

Notice that $E[Y] = \mu_1$ and $E[Y^2] = \mu_2$. Inserting (3.6), (3.7) and (3.8) into (3.12), we obtain that

It is obvious that W(T, x) satisfies (3.2). Thus, we deduce that W(t, x) is a solution of (3.1)–(3.2) and that the optimal control is given by (3.9).
Remark 3.1. In particular, let $\delta_t = q_t = 0$ for t<T. In this case, we can obtain the solution of equation (3.6):

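In general, the system (3.6)–(3.8) has to be integrated numerically. The sketch below shows one way to do this with SciPy, integrating backward in time from the terminal conditions; the right-hand sides in `terminal_value_system` are generic Riccati-type placeholders and must be replaced by the actual equations (3.6)–(3.7), just as the terminal values must be taken from (3.8).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative model data (assumptions, not from the paper):
delta = lambda t: 0.02   # interest force delta_t
q = lambda t: 1.0        # weight q_t
A = lambda t: 5.0        # target path A_t
lam, T = 0.05, 10.0      # discount rate and horizon

def terminal_value_system(t, y):
    """Generic Riccati-type system for (pi, g, f).

    Placeholder right-hand sides of the standard linear-quadratic form;
    substitute the exact expressions from (3.6)-(3.7).
    """
    pi, g, f = y
    dpi = lam * pi - 2.0 * delta(t) * pi + pi**2 - q(t)
    dg = lam * g - delta(t) * g + pi * g + q(t) * A(t)
    df = lam * f + 0.5 * g**2 - 0.5 * q(t) * A(t) ** 2
    return [dpi, dg, df]

yT = [q(T), 0.0, 0.0]    # illustrative terminal values; see (3.8)
sol = solve_ivp(terminal_value_system, (T, 0.0), yT)  # integrate backward
pi0, g0, f0 = sol.y[:, -1]
print(f"pi(0)={pi0:.4f}, g(0)={g0:.4f}, f(0)={f0:.4f}")
```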
3.2. The completion of squares method
Now, we show that the optimal control can be given by the solution of a forward-backward stochastic differential equation. The approach is similar to that of Frangos et al. (2008).
Theorem 3.3. The optimal control α* is given by $u_t^{*} = -p_t e^{\lambda t}$, where $p_t$ is the solution of the following backward stochastic differential equation:

for (s, x)∈[0, T]×R. Here, $X_t^{*}$ denotes the resulting process controlled by $\left\{u_t^{*}\right\}$, and $\eta_t$ and $\gamma_t$ are two continuous processes such that


for any control α.
Proof Let α be an arbitrary control and recall the definition of $V_\alpha$, as given in (2.4). The objective function $V_\alpha$ may not be continuously differentiable. Consider

Using the equality
$$\frac{1}{2}a^{2} - \frac{1}{2}b^{2} = \frac{1}{2}\left(a - b\right)^{2} + b\left(a - b\right)$$
results in $V_\alpha(s, x) - V_{\alpha^{*}}(s, x) = I_1 + I_2$, where


in which we have used $u_t^{*} = -p_t e^{\lambda t}$. Considering that $X_t^\alpha$ and $X_t^{*}$ solve equation (2.3), we can obtain

Substituting $u_t^\alpha - u_t^{*}$ from (3.17) into $I_2$ yields

In view of (3.14), $p_t$ satisfies

Then, $I_2$ evolves as

Since (3.15) and (3.16) imply that

is a martingale, $I_2$ becomes

Applying the Itô formula to $\left(X_t^\alpha - X_t^{*}\right) p_t$ results in

An analysis similar to that in Theorem 3.1 shows that the first term does not change if t− is replaced by t. Moreover, since $X_t^\alpha - X_t^{*}$ is a continuous process of finite variation, t− in the second term can also be replaced by t, and the last term is equal to zero. Thus, we have

where the second equality follows from the boundary condition $p_T = q_T\left(X_T^{*} - A_T\right)$. We therefore conclude that $V_\alpha(s,x)\geq V_{\alpha^{*}}(s,x)$ for all α∈Π, which proves that α* is optimal.
We now give the solution of equation (3.14), which provides the optimal control and coincides with the result obtained in the previous subsection.
Theorem 3.4. The solution of equation (3.14) has the form

where the deterministic functions π(t) and g(t) are the solutions of the ordinary differential equation (3.6).
Proof Assume that $q_T$, $A_T$ are deterministic. Then,

Substituting (3.18) into (3.14), we have

and

Then, (3.19) becomes

Thus, by comparing the coefficient of $X_t^{*}\, dt$ in (3.20) and (3.21), we have

Taking the coefficients of $\sum_{i=1}^{N_t} Y_i$ and $dB_t$ yields

From the terms with dt, we have

The condition $p_T = q_T\left(X_T^{*} - A_T\right)$ implies that we have the following final conditions:


The proof is complete.
We now calculate the optimal value function V(s, x), that is, the objective function corresponding to the optimal control $u_t^{*}$. First, by definition, we have

Substituting $p_T = q_T\left(X_T^{*} - A_T\right)$ into the last term and applying the Itô formula to the resulting term yield

Then, we denote $Ep_T\left(X_T^{*} - A_T\right) = J_1 + J_2$, where

and

Plugging (3.14) into the right-hand side of the above equality and using the martingale property result in

Note that $J_2$ does not change if t− is replaced by t. Thus, we have

Rearranging yields

In addition, Theorem 3.3 shows that $e^{2\lambda t} p_t^{2} = \left(u_t^{*}\right)^{2}$. Thus, all the above reasoning yields

In the following, we use properties of the function g(t) to cancel the stochastic term of (3.23). First, the boundary condition g(T)=0 yields $g(T)\left(X_T^{*} - A_T\right) = 0$. On the other hand, applying the Itô formula to it results in

Replacing $dX_t^{*}$ by the first equality in (3.14) and g′(t) by the second equality in (3.6) yields

Replacing t− by t and adding (3.25) to (3.23) result in

Substituting (3.18) for $\pi(t) X_t^{*}$ on the right-hand side and rearranging yield

where

In view of the first equality of (3.14), we have

So

which coincides with (3.5) and (3.7) with the boundary condition (3.8).
3.3. The maximum principle
This subsection employs the maximum principle to solve the problem. According to Framstad et al. (2004), the Hamiltonian $H:[0, T]\times R^{4}\times{\cal R}\to R$ for the above problem becomes

where ℛ is the set of functions $r: R^{2}\to R$ such that the integral in (3.26) converges. The adjoint equation (corresponding to the pair (X, u)) in the unknown adapted processes p(t)∈R, Q(t)∈R and r(t, z)∈R is the backward stochastic differential equation (BSDE)

with terminal condition

where N(t, z) is a Poisson random measure with Lévy measure βF(dz).
By Framstad et al. (2004, Theorem 2.1), (X*, u*) is an optimal pair if it satisfies

for all t∈[s, T] and that

exists and is a concave function of x for all t∈[s, T], where (p(t), Q(t), r(t, z)) is a solution of the adjoint equation (3.27) corresponding to (X*, u*). We take $\gamma(t, z) = \eta_t z$ and $Q(t) = -\sigma\gamma_t$; then, (3.27)–(3.28) become

All the above statements yield that the optimal control $u_t^{*}$ is

in which $p_t$ is the solution of the following equation:

Note that the optimal control given by (3.31)–(3.32) is equal to that given by Theorem 3.3. Thus, Theorem 3.3 is again proven. The solution of equation (3.32) has been given by Theorem 3.4.
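As an informal consistency check (not carried out in the paper), suppose the control enters the drift of (2.3) linearly and the running cost enters the Hamiltonian (3.26) as $e^{-\lambda t}\frac{1}{2}u^2$; minimizing H over u then gives the first-order condition

$$\frac{\partial H}{\partial u} = e^{-\lambda t}u + p = 0 \quad\Longrightarrow\quad u_t^{*} = -p_t e^{\lambda t},$$

which agrees with the optimal control of Theorem 3.3 and with (3.31). The assumed forms of the cost and drift terms are illustrative and are not quotations of (3.26).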
We now seek the expression of the optimal value function V, not via its definition or the HJB equation, but via the relations between the maximum principle and dynamic programming in the jump-diffusion case. By Framstad et al. (2004, equation 24a), we know that

In our case, this implies that

where f(t) is a suitable function. To determine f(t), we give the following lemma, which also complements the results given by Framstad et al. (2004).
Lemma 3.1. Let (X*, u*) be an optimal pair. Suppose that the optimal value function $V\in C^{1,2}$. Then,

where G is defined by

Proof The previous analysis shows that the optimal control is

This shows that the optimal control is Markovian, i.e., it depends only on the current surplus and not on the history of the process. Thus, the resulting surplus process $X_t^{*}$ still has the Markov property. Therefore, we have

Inspired by Yong & Zhou (1999, p. 251), we define

Clearly, m(·) is an ${\scr F}_t^{s}$-adapted square-integrable martingale. Thus, by the martingale representation theorem (see Tang & Li (1994), Lemma 2.3), we have

where $M\in L_{\scr F}^{2}\left(s, T;\, R\right)$ and $H\in B_{\scr F}\left(s, T;\, R\right)$. Then, by (3.38) and (3.40),

On the other hand, applying the Itô formula to $V\left(t, X_t^{*}\right)$ yields

Comparing (3.41) with (3.42) results in

This proves the first equality in (3.35). Since $V\in C^{1,2}$, it satisfies the HJB equation (3.1), which implies the second equality in (3.35).
Combining (3.34) and (3.37) with the first equality of (3.35), we obtain (3.5), with f(t) given by (3.7)–(3.8), once again.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (grant numbers 11471171 and 11571189).