1. Introduction
Motivated by genetic studies, Karlin and Tavaré [11] introduced the class of continuous-time birth–death processes with killing (or CBDKs for short) and obtained explicit results on the killing time for the case of linear CBDKs. (Earlier, Puri [13] also discussed related processes.) Later, van Doorn and Zeifman [17, 18] considered more general CBDKs and derived the eventual killing probability, among other things. (See also [1, Chapter 9] and [2–6, 16] for work on the closely related topic of birth–death processes with catastrophes.)
A CBDK
$\{X_t\}$
is a continuous-time Markov chain with state space
$\{0,1,\ldots\}\cup \{e\}$
, where e is an absorbing state and transitions from state
$i \;(\ne e)$
may go to either
$i-1$
(if
$i>0$
) or
$i+1$
or e with respective transition rates
$\mu_i$
(death rate),
$\lambda_i$
(birth rate), and
$\gamma_i$
(killing rate). We assume that
$\{X_t\}$
is non-explosive. See equations (2) and (3) of [18] for conditions on
$\lambda_i, \mu_i$
, and
$\gamma_i$
under which
$\{X_t\}$
is non-explosive.
Let
$T_e\,:\!=\,\inf\{t\ge 0\colon X_t=e\}$
, the killing time. (By convention,
$\inf \emptyset \,:\!=\,\infty$
.) The distribution of
$T_e$
and the probability of
$T_e<\infty$
(the eventual killing probability) are of primary interest in the study of CBDKs. Let
$\mathcal{L}_i(T_e)$
denote the distribution of
$T_e$
given
$X_0=i$
. We write
$\mathcal{L}_i(T_e)\leq_d \mathcal{L}_j(T_e)$
(or equivalently
$\mathcal{L}_j(T_e) \geq_d \mathcal{L}_i(T_e)$
) if
$\mathcal{L}_i(T_e)$
is stochastically smaller than
$\mathcal{L}_j(T_e)$
, that is,
\begin{align}\mathbb{P}(T_e>t \mid X_0=i) \leq \mathbb{P}(T_e>t \mid X_0=j) \quad \text{for all } t \geq 0.\end{align}
CBDKs have found applications in medicine and biology. Nagylaki [12] considered a stochastic model for a progressive chronic disease. For a patient with the disease, it is assumed that there is a useful prognostic indicator for the course of the disease and the survival of the patient. This indicator may be modeled as a birth–death process with killing. (In [12], the indicator is modeled as a pure birth process with killing, under the assumption that the patient’s condition cannot improve.) For example, a patient in a coma is given a score
$i \in \{3,\ldots, 15\}$
(on the Glasgow Coma Scale) indicating the state of the patient’s consciousness. (A lower score corresponds to a more severe condition.) With the absorbing state e denoting the patient’s death, the killing time
$T_e$
is the patient’s survival time. A natural question is whether the survival time of the patient with score i is stochastically non-decreasing in i, i.e.
$\mathcal{L}_i(T_e) \leq_d \mathcal{L}_{j}(T_e)$
for
$i<j$
. Theorem 1 in the next section answers this question in the affirmative provided the killing rate
$\gamma_i$
is non-increasing in i, regardless of the values of
$\lambda_i$
and
$\mu_i$
. In another application of CBDKs, Hadeler [7] considered a situation where a host carries a finite number of parasites. Let
$X^\prime_t$
denote the parasite population size within the host at time t, which may be modeled as a birth–death process. Let T denote the death time of the host, and define the process
$\{X_t\}$
by
$X_t=X^\prime_t$
for
$t<T$
and
$X_t=e$
for
$t\geq T$
. Hadeler [7] proposed modeling
$\{X_t\}$
as a CBDK. Note that
$T=T_e=\inf\{t\geq 0\colon X_t=e\}$
is the survival time of the host. Again, Theorem 1 shows that the survival time of the host carrying i parasites is stochastically non-increasing in i if the killing rate
$\gamma_i$
is non-decreasing in i.
The rest of this paper is organized as follows. Section 2 states the main results: Theorem 1 for CBDKs and Theorem 2 for the discrete-time counterpart of CBDKs. The proofs of both theorems rely on Lemma 1, which concerns the monotonicity of the row sums of powers of non-negative tri-diagonal matrices and is of independent interest. Section 3 presents numerical examples. The proofs of Theorems 1 and 2 and Lemma 1 are relegated to Section 4.
2. Main results
For a non-negative matrix
$M=(M_{i,j})$
, let
$(M)_{i,+}$
denote the ith row sum of M, i.e.
$(M)_{i,+}=\sum_j M_{i,j}$
. Before presenting the main results, Theorems 1 and 2, we state Lemma 1 below, which is needed for their proofs. All the proofs are given in Section 4.
Lemma 1. Let
$\mathbb{S}=\{0,1,\ldots, I\}$
(for some integer
$I>0$
) or
$\mathbb{S}=\{0,1,\ldots\}$
or
$\mathbb{S}=\mathbb{Z}\,:\!=\,\{0, \pm 1,\ldots\}$
. Suppose
$M=(M_{i,j})_{i,j \in \mathbb{S}}$
is a non-negative tri-diagonal matrix.
-
(i) If, for some
$i^{*}$ ,
$(M)_{i,+} \le (M)_{j,+}$ for all
$i\leq i^{*}$ and
$j>i^{*}$ , then we have
\begin{align}(M^{n})_{i^{*},+} \leq (M^{n})_{i^{*}+1,+} \quad \text{for all } n\geq 2.\end{align}
-
(ii) If, for some
$i^*$ ,
$(M)_{i,+} \ge (M)_{j,+}$ for all
$i\leq i^{*}$ and
$j>i^{*}$ , then we have
\begin{align}(M^{n})_{i^{*},+} \geq (M^{n})_{i^{*}+1,+} \quad \text{for all } n\geq 2.\end{align}
-
(iii) If the row sums of M are monotone, so are the row sums of
$M^n$ for all
$n\ge 2$ .
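Lemma 1 is easy to check numerically. The following Python sketch (our own illustration, not part of the original analysis) builds a random non-negative tri-diagonal matrix whose row sums are forced to be non-decreasing and verifies part (iii) for a few powers; the matrix size and the rescaling scheme are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random non-negative tri-diagonal matrix on S = {0, 1, ..., 9} whose
# row sums are forced to be non-decreasing in the row index.
I = 10
M = np.zeros((I, I))
for i in range(I):
    for j in (i - 1, i, i + 1):
        if 0 <= j < I:
            M[i, j] = rng.uniform(0.0, 1.0)
    M[i, :] *= (1.0 + i / I) / M[i, :].sum()   # row sum of row i is 1 + i/I

# Lemma 1(iii): the row sums of M^n stay non-decreasing for all n >= 2.
for n in range(2, 6):
    row_sums = np.linalg.matrix_power(M, n).sum(axis=1)
    assert np.all(np.diff(row_sums) >= -1e-12), f"monotonicity fails at n={n}"
print("row sums of M^n are non-decreasing for n = 2, ..., 5")
```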
Theorem 1. Suppose
$\{X_t\}$
is a non-explosive CBDK.
-
(i) If, for some
$i^* \in \{0,1,\ldots\}$ , the killing rates satisfy
$\gamma_i \le \gamma_{j}$ for all
$i \le i^*$ and
$j>i^*$ , then we have
$\mathcal{L}_{i^*}(T_e) \geq_d \mathcal{L}_{i^*+1}(T_e)$ .
-
(ii) If, for some
$i^*$ ,
$\gamma_i \geq \gamma_{j}$ for all
$i \le i^*$ and
$j>i^*$ , then we have
$\mathcal{L}_{i^*}(T_e) \leq_d \mathcal{L}_{i^*+1}(T_e)$ .
-
(iii) If
$\gamma_i \leq \gamma_{i+1}$ for all
$i \geq 0$ , then we have
$\mathcal{L}_{i}(T_e) \geq_d \mathcal{L}_{i+1}(T_e)$ for all
$i \geq 0$ .
-
(iv) If
$\gamma_i \geq \gamma_{i+1}$ for all
$i \geq 0$ , then we have
$\mathcal{L}_{i}(T_e) \leq_d \mathcal{L}_{i+1}(T_e)$ for all
$i \geq 0$ .
Theorem 1 has a discrete-time analogue concerning a Markov chain
$\{Y_n\}$
with state space
$\{0,1,\ldots\} \cup \{e\}$
in discrete time
$n=0,1,\ldots.$
Specifically,
$\{Y_n\}$
has transition probabilities satisfying
\begin{align}\mathbb{P}(Y_{n+1}=j \mid Y_n=i)=0 \quad \text{for } i \in \{0,1,\ldots\} \text{ and } j \notin \{i-1,\, i,\, i+1,\, e\},\tag{1}\end{align}
and
$\mathbb{P}(Y_{n+1}=e \mid Y_n=e)=1$
. (Note that
$\mathbb{P}(Y_{n+1}=i \mid Y_n=i)>0$
is allowed.) We refer to
$\{Y_n\}$
as a discrete-time birth–death process with killing (or DBDK for short). Let
$T_e^d\,:\!=\,\inf\{n\ge 0\colon Y_n=e\}$
, which is the discrete-time counterpart of
$T_e$
. (By convention,
$\inf \emptyset\,:\!=\,\infty$
.)
Theorem 2. Let
$p_{i,e}\,:\!=\,\mathbb{P}(Y_{n+1}=e \mid Y_n=i)$
,
$i=0,1,\ldots.$
-
(i) If, for some
$i^*$ ,
$p_{i,e} \leq p_{j,e}$ for all
$i \le i^*$ and
$j>i^*$ , then we have
$\mathcal{L}_{i^*}(T^d_e) \geq_d$
$\mathcal{L}_{i^*+1}(T^d_e)$ .
-
(ii) If, for some
$i^*$ ,
$p_{i,e} \geq p_{j,e}$ for all
$i \le i^*$ and
$j>i^*$ , then we have
$\mathcal{L}_{i^*}(T^d_e) \leq_d$
$\mathcal{L}_{i^*+1}(T^d_e)$ .
-
(iii) If
$p_{i,e} \leq p_{i+1,e}$ for all
$i \geq 0$ , then we have
$\mathcal{L}_{i}(T^d_e) \geq_d \mathcal{L}_{i+1}(T^d_e)$ for all
$i \geq 0$ .
-
(iv) If
$p_{i,e} \geq p_{i+1,e}$ for all
$i \geq 0$ , then we have
$\mathcal{L}_{i}(T^d_e) \leq_d \mathcal{L}_{i+1}(T^d_e)$ for all
$i \geq 0$ .
Remark 1. By Theorem 1, we have
$\mathcal{L}_i(T_e)$
stochastically monotone in i provided the killing rate
$\gamma_i$
is monotone in i, regardless of the birth and death rates
$\lambda_i$
and
$\mu_i$
. The fact that the process
$\{X_t\}$
is skip-free to the left and to the right (apart from jumping into the absorbing state e) is crucial for Theorem 1 to hold. In a different context, Irle and Gani [10] and Irle [9] obtained level-crossing stochastic ordering results for Markov chains and semi-Markov processes which are skip-free to the right.
Remark 2. He and Chavoushi [8] considered queueing systems with customer interjections in which two parameters (denoted by
$\eta_I$
and
$\eta_C$
) are introduced to describe the interjection behavior. They investigated the effects of the two parameters on the customer’s waiting time, and derived some monotonicity properties of the distribution of the waiting time and its mean and variance in terms of the parameters. (The proof of their Lemma 2.2 contains a special case of our Lemma 1.) In particular, with
$W_n(\eta_I,\eta_C)$
denoting the waiting time of a customer initially in position n in the queue, they showed that
$W_n(\eta_I,\eta_C) \leq_d W_n(\eta_I',\eta_C')$
if
$\eta_I \leq \eta_I'$
and
$\eta_C \leq \eta_C'$
. Note that two different pairs of parameter values
$(\eta_I,\eta_C)$
and
$(\eta_I',\eta_C')$
correspond to two different transition rate matrices (with the same state space). In contrast, our results are concerned with a single transition rate matrix with an absorbing state and compare the absorbing time distributions when the Markov chain starts from different states.
3. Numerical examples
In this section we illustrate Theorems 1 and 2 with numerical examples. We first consider a DBDK
$\{Y_n\}$
with transition probabilities given by

Figure 1 plots the survival probabilities
\begin{align}\mathbb{P}(T_e^d>n \mid Y_0=i)=(M^n)_{i,+}, \qquad i=0,1,2,3,\end{align}
where the matrix
$M=(M_{i,j})$
is given by
$M_{i,j}=\mathbb{P}(Y_1=j \mid Y_0=i)$
for
$i,j=0,1,\ldots.$
It shows that
$\mathbb{P}(T_e^d > n \mid Y_0= i)$
decreases as i increases, as expected in view of Theorem 2(iii).

Figure 1: The survival probabilities
$\mathbb{P}(T_e^d > n \mid Y_0= i)$
,
$i=0,1,2,3$
.
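The transition probabilities behind Figure 1 are not essential for the computation; once the restricted transition matrix M is in hand, the survival probabilities are just row sums of its powers. The following Python sketch repeats the computation for a hypothetical DBDK of our own choosing (the killing probabilities $p_{i,e}$ increase in i, as in the figure); the values are illustrative only.

```python
import numpy as np

# A hypothetical DBDK on {0, 1, 2, 3} (plus e); these transition
# probabilities are illustrative only, not the values behind Figure 1.
M = np.array([
    [0.50, 0.45, 0.00, 0.00],   # p_{0,e} = 0.05
    [0.30, 0.30, 0.30, 0.00],   # p_{1,e} = 0.10
    [0.00, 0.30, 0.25, 0.30],   # p_{2,e} = 0.15
    [0.00, 0.00, 0.40, 0.40],   # p_{3,e} = 0.20
])

# P(T_e^d > n | Y_0 = i) = (M^n)_{i,+}, decreasing in i by Theorem 2(iii).
for n in range(1, 6):
    surv = np.linalg.matrix_power(M, n).sum(axis=1)
    print(n, np.round(surv, 4))
```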
For a linear CBDK
$\{X_t\}$
with
$\mu_i=ai$
,
$\lambda_i=bi+\theta$
, and
$\gamma_i=ci$
where a, b, c, and
$\theta$
are positive constants, Karlin and Tavaré [11] derived the explicit formula

where
$0<v_0<1<v_1$
are the two roots of the equation
$bx^2 -(a+b+c)x + a=0$
and
$\sigma_t={\mathrm{e}}^{-b(v_1-v_0)t}$
. Since
$\gamma_i$
is increasing in i,
$\mathbb{P}(T_e>t \mid X_0=i)$
should be decreasing in i by Theorem 1(iii). Noting that
$v_0(v_1-1)+v_1 \sigma_t (1-v_0)<v_1-1+\sigma_t (1-v_0)$
(since
$0<\sigma_t<1$
), we see that
$\mathbb{P}(T_e>t \mid X_0=i)$
decays geometrically in i. Figure 2 plots
$\mathbb{P}(T_e>t \mid X_0=i)$
for
$0<t<5$
and
$i=0,2,4,6,8$
with
$a=b=c=\theta=1$
.

Figure 2: The survival probabilities
$\mathbb{P}(T_e > t \mid X_0= i)$
,
$i=0,2,4,6, 8$
.
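For the parameter choice $a=b=c=\theta=1$ used in Figure 2, the quadratic $bx^2-(a+b+c)x+a=0$ reads $x^2-3x+1=0$, so $v_0=(3-\sqrt{5})/2 \approx 0.382$ and $v_1=(3+\sqrt{5})/2 \approx 2.618$, giving $\sigma_t={\mathrm{e}}^{-\sqrt{5}\,t}$. A minimal Python check of these ingredients:

```python
import numpy as np

a = b = c = theta = 1.0
# Roots of b*x^2 - (a+b+c)*x + a = 0; here x^2 - 3x + 1 = 0.
v0, v1 = sorted(np.roots([b, -(a + b + c), a]).real)
assert 0 < v0 < 1 < v1                         # v0 ~ 0.382, v1 ~ 2.618
sigma = lambda t: np.exp(-b * (v1 - v0) * t)   # sigma_t = exp(-b(v1-v0)t)
print(round(v0, 4), round(v1, 4), round(sigma(1.0), 4))  # 0.382 2.618 0.1069
```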
While no explicit formula for
$\mathbb{P}(T_e>t \mid X_0=i)$
is available for general CBDKs, we may use the technique of uniformization (see [14, Section 5.10] and [15, Section 6.7]) to compute
$\mathbb{P}(T_e>t \mid X_0=i)$
if
$\{X_t\}$
is uniformizable (i.e.
$\sup_i (\lambda_i+\mu_i+\gamma_i)\leq C<\infty$
for some constant
$C>0$
). Specifically, with
$\mu_0\,:\!=\,0$
, let
$\{Z_n\}$
be a DBDK with transition probabilities satisfying
$\mathbb{P}(Z_{n+1}=e \mid Z_n=e)=1$
, and for $i\geq 0$,
\begin{align}\mathbb{P}(Z_{n+1}=i-1 \mid Z_n=i)=\frac{\mu_i}{C}, \qquad \mathbb{P}(Z_{n+1}=i+1 \mid Z_n=i)=\frac{\lambda_i}{C}, \notag\\ \mathbb{P}(Z_{n+1}=e \mid Z_n=i)=\frac{\gamma_i}{C}, \qquad \mathbb{P}(Z_{n+1}=i \mid Z_n=i)=1-\frac{\lambda_i+\mu_i+\gamma_i}{C}.\end{align}
Let
$\{N(t)\}$
be a Poisson process of constant rate C, independent of
$\{Z_n\}$
. Then we have the following result:
\begin{align}\{X_t\}_{t\geq 0} \stackrel{d}{=} \{Z_{N(t)}\}_{t\geq 0}.\tag{2}\end{align}
In other words, the CBDK
$\{X_t\}$
may be constructed by placing the transitions of the DBDK
$\{Z_n\}$
at Poisson arrival epochs. It follows from (2) that
\begin{align}\mathbb{P}(T_e>t \mid X_0=i)=\sum_{n=0}^{\infty} {\mathrm{e}}^{-Ct}\,\frac{(Ct)^n}{n!}\,(M^{*n})_{i,+},\end{align}
where the matrix
$M^*=(M^*_{i,j})$
is given by
$M^*_{i,j}=\mathbb{P}(Z_1=j \mid Z_0=i)$
. So
$\mathbb{P}(T_e>t \mid X_0=i)$
may be approximated by
\begin{align}\sum_{n=0}^{L(t)} {\mathrm{e}}^{-Ct}\,\frac{(Ct)^n}{n!}\,(M^{*n})_{i,+}\end{align}
for sufficiently large integer L(t) (depending on t). The distributional equivalence of
$\{X_t\}$
and
$\{Z_{N(t)}\}$
in (2) will be invoked in the proof of Theorem 1 in the next section.
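To make the uniformization recipe concrete, here is a minimal Python sketch; the rates (truncated to a finite state space, so that the chain is trivially uniformizable) and the truncation levels K and L are our own illustrative assumptions, not part of the original examples.

```python
import numpy as np
from scipy.stats import poisson

# Illustrative bounded rates: states 0..K-1 with mu_0 = 0 and
# lambda_{K-1} = 0, so sup_i (lambda_i + mu_i + gamma_i) < infinity.
K = 50
lam = np.full(K, 1.0); lam[K - 1] = 0.0      # birth rates
mu = np.full(K, 0.8); mu[0] = 0.0            # death rates
gam = np.linspace(0.05, 0.5, K)              # killing rates, non-decreasing
C = float((lam + mu + gam).max())            # uniformization constant

# Transition matrix M* of the embedded DBDK {Z_n} restricted to {0,...,K-1}:
# mu_i/C down, lambda_i/C up, gamma_i/C to e, remaining mass stays put.
Mstar = np.diag(1.0 - (lam + mu + gam) / C)
Mstar += np.diag(lam[:-1] / C, k=1) + np.diag(mu[1:] / C, k=-1)

def survival(i, t, L=200):
    """Truncated series sum_{n=0}^{L} e^{-Ct} (Ct)^n / n! * (M*^n)_{i,+}."""
    weights = poisson.pmf(np.arange(L + 1), C * t)
    row = np.zeros(K); row[i] = 1.0          # e_i^T M*^n, updated iteratively
    total = weights[0] * row.sum()
    for n in range(1, L + 1):
        row = row @ Mstar
        total += weights[n] * row.sum()
    return total

# Non-increasing in i, in line with Theorem 1(iii), since gamma_i is
# non-decreasing here.
print([round(survival(i, 2.0), 4) for i in range(4)])
```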
4. Proofs of Theorems 1 and 2 and Lemma 1
To prove Theorem 2, let
$M=(M_{i,j})_{i,j\in \{0,1,\ldots\}}$
be the transition matrix of
$\{Y_n\}$
restricted to the states
$0,1,\ldots.$
. That is,
\begin{align}M_{i,j}=\mathbb{P}(Y_{n+1}=j \mid Y_n=i), \qquad i,j=0,1,\ldots.\end{align}
By (1), we see that
$M_{i,j}=0$
for
$|i-j|>1$
and that
\begin{align}(M)_{i,+}=1-p_{i,e}, \qquad i=0,1,\ldots.\tag{3}\end{align}
Moreover,
\begin{align}\mathbb{P}(T_e^d>n \mid Y_0=i)=(M^n)_{i,+}, \qquad i=0,1,\ldots,\ n=1,2,\ldots.\tag{4}\end{align}
In view of (3) and (4), Theorem 2 follows immediately from Lemma 1. To prove Theorem 1, one may approximate the CBDK
$\{X_t\}$
by a sequence of DBDKs via time-discretization and argue that
$T_e$
is the limit of the corresponding
$T^d_e$
. Instead of taking this approach, we first consider uniformizable
$\{X_t\}$
(i.e.
$\sup_i\; (\lambda_i+\mu_i+\gamma_i)<\infty$
) for which the associated (embedded) discrete-time Markov chain is a DBDK, so that the desired results follow from Theorem 2. The general CBDK
$\{X_t\}$
is then approximated by a sequence of uniformizable CBDKs. The details are given in the proof below.
Proof of Theorem 1. We first prove part (i) under the additional assumption that
$\{X_t\}$
is uniformizable, i.e.
$\sup_i\; (\lambda_i+\mu_i+\gamma_i)\leq C<\infty$
for some
$C>0$
. Let
$\{Z_n\}$
and
$\{N(t)\}$
be, respectively, the DBDK and (independent) Poisson process of constant rate C as described in the last paragraph of Section 3. Letting
\begin{align}p^{(Z)}_{i,e}\,:\!=\,\mathbb{P}(Z_{n+1}=e \mid Z_n=i)=\frac{\gamma_i}{C}, \qquad i=0,1,\ldots,\end{align}
we see that
$\{Z_n\}$
is a DBDK with
$p^{(Z)}_{i,e} \leq p^{(Z)}_{j,e}$
for
$i\leq i^*$
and
$j>i^*$
, so that by Theorem 2(i),
\begin{align}\mathcal{L}_{i^*}(T^d_e) \geq_d \mathcal{L}_{i^*+1}(T^d_e),\end{align}
where $T^d_e$ here denotes the killing time of $\{Z_n\}$.
Consequently, for
$t>0$
,
\begin{align}\mathbb{P}(T_e>t \mid X_0=i^*) &= \sum_{n=0}^{\infty} {\mathrm{e}}^{-Ct}\,\frac{(Ct)^n}{n!}\,\mathbb{P}\big(T^d_e>n \mid Z_0=i^*\big) \notag\\ &\geq \sum_{n=0}^{\infty} {\mathrm{e}}^{-Ct}\,\frac{(Ct)^n}{n!}\,\mathbb{P}\big(T^d_e>n \mid Z_0=i^*+1\big) = \mathbb{P}(T_e>t \mid X_0=i^*+1).\end{align}
This proves part (i) for uniformizable
$\{X_t\}$
.
For general (non-explosive)
$\{X_t\}$
, to prove
\begin{align}\mathbb{P}(T_e>t_0 \mid X_0=i^*) \geq \mathbb{P}(T_e>t_0 \mid X_0=i^*+1)\end{align}
for any (fixed)
$t_0>0$
, let
\begin{align}a_k\,:\!=\,\mathbb{P}(T_k\leq t_0 \mid X_0=i^*) \quad \text{and} \quad b_k\,:\!=\,\mathbb{P}(T_k\leq t_0 \mid X_0=i^*+1),\tag{6}\end{align}
where
$T_k\,:\!=\,\inf\{t\geq 0\colon X_t=k\}$
. For
$X_t$
to reach a (large) state k starting from
$i^*$
or
$i^*+1$
, at least
$k-i^*-1$
transitions are required. Since
$\{X_t\}$
is non-explosive, we have
\begin{align}\lim_{k\to\infty} a_k = \lim_{k\to\infty} b_k = 0.\tag{7}\end{align}
For each
$k>i^*$
, let
$\{X_t^{(k)}\}$
be a CBDK whose birth, death, and killing rates in state i are given by
$\lambda_i^{(k)}\,:\!=\,\lambda_{i\wedge k}$
,
$\mu_i^{(k)}\,:\!=\,\mu_{i\wedge k}$
, and
$\gamma_i^{(k)}\,:\!=\,\gamma_{i\wedge k}$
, where
$i \wedge k\,:\!=\,\min\{i,k\}$
. Clearly
$\{X_t^{(k)}\}$
is uniformizable with the killing rates satisfying
$\gamma_i^{(k)}\leq \gamma_j^{(k)}$
for
$i\leq i^*$
and
$j>i^*$
. Let
$\mathcal{L}_i(T_k)$
(
$\mathcal{L}_i(T_e \wedge T_k)$
, respectively) denote the distribution of
$T_k$
(
$T_e \wedge T_k$
, respectively) given
$X_0=i$
. Similarly, let
$\mathcal{L}_i(T_k^{(k)})$
(
$\mathcal{L}_i(T_e^{(k)}\wedge T_k^{(k)})$
, respectively) denote the distribution of
$T_k^{(k)}$
(
$T_e^{(k)}\wedge T_k^{(k)}$
, respectively) given
$X_0^{(k)}=i$
, where
$T_r^{(k)}\,:\!=\,\inf\{t \geq 0\colon X_t^{(k)}=r\}$
for
$r=k,e$
.
For
$k>i^*$
, given
$X_0=i^*$
or
$i^*+1$
, the distributions of
$T_k$
and
$T_e \wedge T_k$
depend only on the values of
$\lambda_i, \mu_i, \gamma_i$
for
$i< k$
. Similarly, given
$X_0^{(k)}=i^*$
or
$i^*+1$
, the distributions of
$T_k^{(k)}$
and
$T_e^{(k)} \wedge T_k^{(k)}$
depend only on the values of
$\lambda_i^{(k)}, \mu_i^{(k)}, \gamma_i^{(k)}$
for
$i< k$
. Since
$\lambda_i=\lambda_i^{(k)}$
,
$\mu_i=\mu_i^{(k)}$
, and
$\gamma_i=\gamma_i^{(k)}$
for
$i\leq k$
, we find that
\begin{align}\mathcal{L}_i(T_k)=\mathcal{L}_i\big(T_k^{(k)}\big) \quad \text{and} \quad \mathcal{L}_i(T_e \wedge T_k)=\mathcal{L}_i\big(T_e^{(k)} \wedge T_k^{(k)}\big) \quad \text{for } i=i^*,\ i^*+1,\end{align}
implying that for
$k>i^*$
,


Since
$\{X_t^{(k)}\}$
is uniformizable with
$\gamma^{(k)}_i \leq \gamma^{(k)}_j$
for
$i \leq i^*$
and
$j>i^*$
, we have shown that
\begin{align}\mathbb{P}\big(T_e^{(k)}>t_0 \mid X_0^{(k)}=i^*\big) \geq \mathbb{P}\big(T_e^{(k)}>t_0 \mid X_0^{(k)}=i^*+1\big).\tag{10}\end{align}
Also for
$k>i^*$
and
$i=i^*$
,
$i^*+1$
,

and

where
$a_k$
and
$b_k$
are as defined in (6) and
$a_k \vee b_k\,:\!=\,\max\{a_k,b_k\}$
. Consequently, for
$k>i^*$
,

Furthermore, for
$k>i^*$
and
$i=i^*$
,
$i^*+1$
,

which together with (11) implies that for
$k>i^*$
and
$i=i^*$
,
$i^*+1$
,

It follows from (7), (10), and (12) that

completing the proof of part (i).
Part (ii) can be proved by using a similar argument and invoking Theorem 2(ii). Parts (iii) and (iv) follow immediately from parts (i) and (ii), respectively. □
Proof of Lemma 1. Note that part (ii) follows from part (i) by reversing the ordering of the rows and that of the columns, and that part (iii) is a consequence of parts (i) and (ii). It remains to prove part (i). It suffices to deal only with the case
$\mathbb{S}=\mathbb{Z}$
, since the cases
$\mathbb{S}=\{0,1,\ldots, I\}$
and
$\mathbb{S}=\{0,1,\ldots\}$
can be treated as special cases of
$\mathbb{S}=\mathbb{Z}$
. More precisely, for
$M=(M_{i,j})_{i,j \in \{0,1,\ldots,I\}}$
satisfying
$(M)_{i,+}\le (M)_{j,+}$
for
$i \le i^*<j$
, define
$\widetilde{M}=(\widetilde{M}_{i,j})_{i,j \in \mathbb{Z}}$
by

where
$\textbf{1}_A$
denotes the indicator function of a set A and
$\delta_{i,j}=1$
if
$i=j$
and
$\delta_{i,j}=0$
otherwise. We have
$(\widetilde{M})_{i,+}\le (\widetilde{M})_{j,+}$
for
$i\le i^*<j$
. Moreover,
$(\widetilde{M}^n)_{i,+}=(M^n)_{i,+}$
for
$i=0,1,\ldots,I$
.
We now prove part (i) for
$\mathbb{S}=\mathbb{Z}$
. Without loss of generality, we assume
$i^*=0$
, so that the tri-diagonal matrix M satisfies
$M_{i,j}\ge 0$
for
$i,j \in \mathbb{Z}$
,
$M_{i,j}=0$
if
$|i-j|>1$
, and
$(M)_{i,+}\le (M)_{j,+}$
for
$i \le 0<j$
. To show that
$(M^{n})_{0,+} \leq (M^{n})_{1,+}$
for any (fixed)
$n\geq 2$
, note that
$(M^{n})_{0,+}$
and
$(M^{n})_{1,+}$
do not depend on the values of
$M_{i,i-1}$
,
$M_{i,i}$
, and
$M_{i,i+1}$
for
$i\leq -n$
or
$i\geq n+1$
. Thus
$(M^{n})_{0,+}=(\overline{M}^{n})_{0,+}$
and
$(M^{n})_{1,+} = (\overline{M}^{n})_{1,+}$
, where
$\overline{M}=(\overline{M}_{i,j})$
is defined by

Note that
$\overline{M}$
has bounded row sums and satisfies
$(\overline{M})_{i,+}\le (\overline{M})_{j,+}$
for
$i\leq 0<j$
If we can show part (i) of the lemma for non-negative tri-diagonal matrices with bounded row sums, then we have
\begin{align}(M^{n})_{0,+}=(\overline{M}^{n})_{0,+} \leq (\overline{M}^{n})_{1,+}=(M^{n})_{1,+}.\end{align}
So it suffices to establish part (i) with M having bounded row sums. We may further assume that the row sums of M are bounded by 1, since dividing M by a positive constant c scales $(M^n)_{i,+}$ by $c^{-n}$ and preserves all the inequalities concerned.
To show that
$(M^{n})_{0,+} \leq (M^{n})_{1,+}$
for
$n\geq 2$
, we introduce a Markov chain
$\{X_{n}\colon n=0,1,\ldots \}$
with state space
$\mathbb{Z}\cup \{e\}$
and transition probabilities given by
\begin{align}\mathbb{P}(X_{n+1}=j \mid X_n=i)=M_{i,j} \quad \text{and} \quad \mathbb{P}(X_{n+1}=e \mid X_n=i)=1-(M)_{i,+}, \qquad i,j\in\mathbb{Z}.\end{align}
Note that the state e is absorbing. Let
$T_{1}=\inf \{n\geq 0\colon X_{n}=1\}$
and
$T_{e}=\inf \{n\geq 0\colon X_{n}=e\}$
. (In Theorem 1,
$T_e$
is defined with respect to the continuous-time process
$\{X_t\}$
. Here the same notation
$T_e$
is used with respect to the discrete-time process
$\{X_n\}$
. This proof does not involve the continuous-time process
$\{X_t\}$
.)
We write
$\mathbb{P}_{i}( \cdot )=\mathbb{P}( \cdot\mid X_{0}=i)$
, and claim that for
$n=1,2,\ldots,$
and
$j=1,2,\ldots,$

Note that for
$1 \leq \ell < n$
,

By (14), the left-hand side of (13) with
$j=1$
equals

Thus the inequality (13) with
$j=1$
is equivalent to

Since
$\mathbb{P}_{i}(T_{e}>n) =\mathbb{P}(X_{n}\in \mathbb{Z} \mid X_{0}=i) = (M^{n})_{i,+}$
, (15) is equivalent to
$(M^{n})_{0,+} \leq (M^{n})_{1,+}$
.
We now prove (13) by induction on n. For
$n=1$
, the left-hand side of (13) equals

while the right-hand side equals
$\mathbb{P}_{j}(T_{e}>1)=(M)_{j,+}$
. Thus the inequality (13) with
$n=1$
follows from the assumption that
$(M)_{0,+}\leq (M)_{j,+}$
for
$j>0$
.
For
$m \geq 1$
, suppose (13) holds for
$n=1,\ldots,m$
and
$j=1,2,\ldots.$
In particular, the induction hypothesis (see (15)) implies that
\begin{align}\mathbb{P}_{0}(T_{e}>\ell) \leq \mathbb{P}_{1}(T_{e}>\ell), \qquad \ell=1,\ldots,m.\tag{16}\end{align}
We need to show that for
$j=1,2,\ldots,$

Note that for
$j=1,2,\ldots,$



Except for
$j=1$
in (18), (18)–(20) follow immediately from the induction hypothesis. By (16),
$\mathbb{P}_{0}(T_{e}>m-\ell) \leq \mathbb{P}_{1}(T_{e}>m-\ell) $
for
$\ell=1,\ldots,m-1$
, so the left-hand side of (18) with
$j=1$
equals

establishing (18) for
$j=1$
.
Note that for
$\ell=m$
,

so (18) is equivalent to

Similarly, (19) and (20), respectively, are equivalent to


Letting
$a_j=M_{j,j-1}$
,
$ b_j=M_{j,j}$
, and
$c_j=M_{j,j+1}$
, we have

Similarly, for
$1 \leq \ell \leq m$
,

In view of (24) and (25), it follows from (21)–(23) that

Note that, given
$X_0=0$
,

and

We have

The above inequality follows since

where the inequality is due to the assumption that
$a_{i}+b_{i}+c_{i} \leq a_{j}+b_{j}+c_{j}$
for
$i \leq 0 < j$
.
Finally (17) follows from (26) and (27). The proof is complete. □
Acknowledgements
The authors are grateful to the Editor and two reviewers for their helpful comments leading to an improved presentation. The authors also gratefully acknowledge support by the Ministry of Science and Technology, Taiwan, ROC.