1. Introduction
In this article we are interested in $X=(X_t, t\geq 0)$ , a finite-mean, continuous-state branching process (CSBP). In particular, this means that X is a $[0,\infty)$ -valued strong Markov process with absorbing state at zero and with law on $\mathbb{D}([0,\infty),\mathbb{R})$ (the space of càdlàg mappings from $[0,\infty)$ to $\mathbb{R}$ ) given by $\mathbb{P}_x$ for each initial state $x\geq 0$ , such that $\mathbb{P}_{x+y}=\mathbb{P}_x*\mathbb{P}_y$ . Here, $\mathbb{P}_{x+y}=\mathbb{P}_x*\mathbb{P}_y$ means that the sum of two independent processes, one issued from x and the other issued from y, has the same law as the process issued from $x+y$ . Its semigroup is characterised by the Laplace functional
$$\mathbb{E}_x[\text{e}^{-\theta X_t}] = \text{e}^{-x u_t(\theta)}, \qquad x,\theta,t\geq 0, \tag{1.1}$$
where $u_t(\theta)$ uniquely solves the evolution equation
$$\frac{\partial u_t}{\partial t}(\theta) + \psi(u_t(\theta)) = 0, \qquad u_0(\theta) = \theta. \tag{1.2}$$
Here, we assume that the so-called branching mechanism $\psi$ takes the form
$$\psi(\theta) = -\alpha\theta + \beta\theta^2 + \int_{(0,\infty)}(\text{e}^{-\theta x} - 1 + \theta x)\,\Pi(\text{d}x), \qquad \theta\geq 0,$$
where $\alpha\in\mathbb{R}$ , $\beta\geq 0$ , and $\Pi$ is a measure concentrated on $(0,\infty)$ which satisfies $\int_{(0,\infty)}(x\wedge x^2)\Pi(\text{d}x)<\infty$ . These restrictions on $\psi$ are very mild and only exclude the possibility of having a non-conservative process or processes which have an infinite mean growth rate.
We also assume for convenience that $-\psi$ is not the Laplace exponent of a subordinator (i.e. a Bernstein function), thereby ruling out the case that X has monotone paths. It is easily checked that $\psi$ is an infinitely smooth convex function on $(0,\infty)$ with at most two roots in $[0,\infty)$ . More precisely, 0 is always a root; however, if $\psi'(0+)<0$ , then there is a second root in $(0,\infty)$ .
The process X is henceforth referred to as a $\psi$ -CSBP. It is easily verified that
$$\mathbb{E}_x[X_t] = x\,\text{e}^{-\psi'(0+)t}, \qquad t,x\geq 0.$$
The mean growth of the process is therefore characterised by $\psi'(0+)$ , and accordingly we classify CSBPs by the value of this constant. We say that the $\psi$ -CSBP is supercritical, critical, or subcritical according as $-\psi'(0+) = \alpha$ is strictly positive, equal to zero, or strictly negative, respectively.
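As an illustration of how the evolution equation determines the mean, the following sketch integrates $\partial_t u_t(\theta) = -\psi(u_t(\theta))$ , $u_0(\theta)=\theta$ , numerically for the hypothetical mechanism $\psi(\theta) = -\alpha\theta+\beta\theta^2$ (that is, $\Pi\equiv 0$ ). The parameter values and function names are assumptions made for the example and are not taken from the paper. In this special case the solution has a logistic closed form, and the mean factor $\mathbb{E}_x[X_t]/x = \text{e}^{-\psi'(0+)t}$ is recovered as the $\theta$ -derivative of $u_t(\theta)$ at zero.

```python
import math

# Hypothetical parameters: supercritical case with alpha = -psi'(0+) > 0, Pi = 0.
ALPHA, BETA = 0.5, 1.0


def psi(theta):
    """Branching mechanism psi(theta) = -alpha*theta + beta*theta^2."""
    return -ALPHA * theta + BETA * theta ** 2


def u_numeric(theta, t, steps=2000):
    """Classical RK4 integration of du/dt = -psi(u) with u_0 = theta."""
    h = t / steps
    u = theta
    for _ in range(steps):
        k1 = -psi(u)
        k2 = -psi(u + 0.5 * h * k1)
        k3 = -psi(u + 0.5 * h * k2)
        k4 = -psi(u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return u


def u_closed(theta, t):
    """Logistic closed form, valid for this quadratic psi only."""
    g = math.exp(ALPHA * t)
    return theta * g / (1 + (BETA / ALPHA) * theta * (g - 1))


def mean_factor(t, eps=1e-6):
    """Finite-difference estimate of d/dtheta u_t(theta) at 0, i.e. E_x[X_t]/x."""
    return u_numeric(eps, t) / eps


print(u_numeric(2.0, 1.0), u_closed(2.0, 1.0))
print(mean_factor(1.0), math.exp(ALPHA * 1.0))
```

The finite-difference step `eps` trades truncation error against rounding error; the value used here is only a reasonable default for this example.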
It is known that the process $(X,\mathbb{P}_x)$ , $x>0$ , can also be represented as the unique strong solution to the stochastic differential equation (SDE)
$$X_t = x + \alpha\int_0^t X_{s-}\,\text{d}s + \sqrt{2\beta}\int_0^t\int_0^{X_{s-}} W(\text{d}s,\text{d}u) + \int_0^t\int_0^\infty\int_0^{X_{s-}} r\,\tilde{N}(\text{d}s,\text{d}r,\text{d}\nu) \tag{1.4}$$
for $ x>0$ and $t\geq 0$ , where $W(\text{d}s, \text{d}u)$ is a white noise process on $(0, \infty)^{2}$ based on the Lebesgue measure $\text{d}s\otimes \text{d}u $ , and $N(\text{d}s,\text{d}r,\text{d}\nu)$ is a Poisson point process on $[0, \infty)^{3}$ with intensity $\text{d}s\otimes \Pi(\text{d}r) \otimes \text{d}\nu$ . Moreover, we denote by $\tilde{N}(\text{d}s,\text{d}r,\text{d}\nu)$ the compensated measure of $N(\text{d}s,\text{d}r,\text{d}\nu)$ . See [Reference Bertoin and Le Gall6, Reference Dawson and Li7, Reference Dawson and Li8] for this fact and further properties of the above SDEs.
Whether a CSBP is represented as a strong Markov process whose semigroup is characterised by an integral equation, or as the solution to an SDE, three fundamental probabilistic decompositions play a crucial role in motivating the main results in this paper. These concern CSBPs conditioned to die out, CSBPs conditioned to survive, and a path decomposition of the supercritical CSBPs.
1.1. CSBPs conditioned to die out
To understand what this means, let us momentarily recall that for all supercritical CSBPs (without immigration) the event $\{\lim_{t\to\infty} X_t =0\}$ occurs with positive probability. Moreover, for all $x\geq 0$ ,
$$\mathbb{P}_x\Big(\lim_{t\to\infty}X_t = 0\Big) = \text{e}^{-\lambda^* x},$$
where $\lambda^*$ is the unique root on $(0,\infty)$ of the equation $\psi(\theta) = 0$ . Note that $\psi$ is strictly convex with the property that $\psi(0) = 0$ and $\psi(+\infty) = \infty$ , thereby ensuring that the root $\lambda^*>0$ exists; see Chapters 8 and 9 of [Reference Kyprianou24] for further details. It is straightforward to show that the law of $(X, \mathbb{P}_x)$ conditional on the event $\{\lim_{t\uparrow\infty} X_t =0\}$ , say $\mathbb{P}^*_x$ , agrees with the law of a $\psi^*$ -CSBP, where
$$\psi^*(\theta) = \psi(\theta+\lambda^*), \qquad \theta\geq -\lambda^*.$$
See, for example, [Reference Sheu45].
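The root $\lambda^*$ and the mechanism of the process conditioned to die out can be computed explicitly in simple cases. The sketch below assumes the hypothetical jump measure $\Pi(\text{d}x) = c\,\text{e}^{-x}\text{d}x$ , for which the integral term $\int_{(0,\infty)}(\text{e}^{-\theta x}-1+\theta x)\Pi(\text{d}x)$ of a finite-mean branching mechanism evaluates to $c\theta^2/(1+\theta)$ , and it takes $\psi^*(\theta) = \psi(\theta+\lambda^*)$ as the conditioned mechanism. The parameter values and function names are illustrative assumptions. The root is located by bisection, which is justified because a convex $\psi$ with $\psi(0)=0$ and $\psi'(0+)<0$ is negative before the root and positive after it.

```python
import math

# Hypothetical parameters; psi'(0+) = -A < 0, so the process is supercritical.
A, B, C = 1.0, 0.5, 1.0


def psi(theta):
    """psi(theta) = -A*theta + B*theta^2 + C*theta^2/(1+theta)."""
    return -A * theta + B * theta ** 2 + C * theta ** 2 / (1 + theta)


def dpsi(theta):
    """Derivative of psi."""
    return -A + 2 * B * theta + C * (theta ** 2 + 2 * theta) / (1 + theta) ** 2


def lambda_star(lo=1e-9, hi=1.0, tol=1e-13):
    """Bisection for the unique root of psi on (0, infinity)."""
    while psi(hi) <= 0:  # enlarge the bracket until psi(hi) > 0
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def psi_star(theta, lam):
    """Mechanism of the process conditioned to die out (assumed form)."""
    return psi(theta + lam)


lam = lambda_star()
print("lambda* =", lam)
print("psi*'(0+) = psi'(lambda*) =", dpsi(lam))      # positive: subcritical
print("extinction probability from x = 2:", math.exp(-lam * 2.0))
```

For these particular parameter values the equation $\psi(\theta)=0$ reduces to $\theta^2+\theta-2=0$ , so the bisection can be checked against the exact root $\lambda^*=1$ .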
1.2. CSBPs conditioned to survive
The event $\{\lim_{t\to\infty}X_t=0\}$ can be categorised further according to whether its intersection with $\{X_t >0 \text{ for all }t\geq 0\}$ is empty or not. The classical work of Grey [Reference Grey21] distinguishes between these two cases according to an integral test. Indeed, the intersection is empty if and only if
$$\int^\infty \frac{1}{\psi(\xi)}\,\text{d}\xi<\infty. \tag{1.5}$$
If we additionally assume that $-\psi'(0+)=\alpha \leq 0$ , that is to say, the process is critical or subcritical, then it is known that the notion of conditioning the process to stay positive can be made rigorous through a limiting procedure. More precisely, if we write
then for all $A\in \mathcal{F}_t^X: = \sigma(X_s\,:\,s\leq t)$ and $x>0$ ,
is well defined as a probability measure and satisfies the Doob h-transform
$$\left.\frac{\text{d}\mathbb{P}^\uparrow_x}{\text{d}\mathbb{P}_x}\right|_{\mathcal{F}^X_t} = \text{e}^{\psi'(0+)t}\,\frac{X_t}{x}, \qquad t\geq 0.$$
In addition, $(X, \mathbb{P}^\uparrow_x)$ , $x>0$ , has been shown to be equivalent in law to a process with a pathwise description which we give below. Before doing so, we need to introduce some more notation. To this end, define $N^*$ to be a Poisson random measure on $[0,\infty)^2\times\mathbb{D}([0,\infty), \mathbb{R})$ with intensity measure ${\text{d}} s\otimes r\Pi({\text{d}} r)\otimes\mathbb{P}_r({\text{d}}\omega)$ . Moreover, when it exists, $\mathbb{Q}$ is the intensity, or ‘excursion’ measure, on the space $\mathbb{D}([0,\infty), \mathbb{R})$ which satisfies
$$\mathbb{Q}(1-\text{e}^{-\theta\omega_t}) = u_t(\theta)$$
for $\theta,t\geq 0$ . Here, $\mathbb{Q}$ is the excursion measure associated to $\mathbb{P}_x$ , $x>0$ . See Theorems 3.10, 8.6, and 8.22 of [Reference Li36] and [Reference Duquesne and Labbé9, Reference Dynkin and Kuznetsov12, Reference El Karoui and Roelly14, Reference Le Gall30, Reference Li38] for further details. We should note in particular that Theorem 3.10 in [Reference Li36] gives a necessary and sufficient condition under which $\mathbb{Q}$ is well defined as an excursion entrance law, namely that $\lim_{\theta\to\infty}\psi'(\theta) = \infty$ . This is automatically satisfied when $\beta>0$ . However, when $\beta = 0$ , the reader will note that, in what we describe below and elsewhere in the paper, the use of $\mathbb{Q}$ is not needed. We can accordingly build a Poisson point process $N^{\text{c}}$ on $[0,\infty)\times \mathbb{D}([0,\infty), \mathbb{R})$ with intensity $2\beta{\text{d}} s\otimes \mathbb{Q}({\text{d}} \omega)$ . Then, for $x>0$ , $(X, \mathbb{P}^\uparrow_x)$ is equal in law to the stochastic process
where $X'$ has the law $\mathbb{P}_x$ and is independent of $N^{\text{c}}$ and $N^*$ , which are also independent of one another.
Intuitively, we can think of the process $(\Lambda_t, t\geq 0)$ as being the result of first running a subordinator
where we have slightly abused our notation and written $N^*({\text{d}} s,{\text{d}} r)$ , $s, r>0$ , in place of $\int_{ \mathbb{D}([0,\infty),\mathbb{R} )} N^*({\text{d}} s,{\text{d}} r, {\text{d}} \omega )$ , $s,r>0$ . The subordinator $(S_t, t\geq0)$ is usually referred to as the spine. Formula (1.6) can then be read as follows: we dress the spine in a Poissonian way with versions of X sampled under the excursion measure $\mathbb{Q}$ . Moreover, at each jump of S we initiate an independent copy of X with initial mass equal to the size of the jump of S. See, for example, (3.9) in [Reference Li35], (4.3) in [Reference Li34], (4.18) in [Reference Li33], or the discussion in Section 12.3.2 of [Reference Kyprianou24] or [Reference Li36]. The reader is also referred to, e.g., [Reference Lambert28, Reference Lambert29, Reference Roelly-Coppoletta and Rouault42] for further details of the notion of a spine.
It turns out that we may also identify the effect of the change of measure within the context of the SDE setting. In [Reference Fittipaldi and Fontbona19], it was shown that $(X, \mathbb{P}^\uparrow_x)$ , $x>0$ , offers the unique strong solution to the SDE
where W, N, and $\tilde{N}$ are as in (1.4) and $N^*$ is as above, and all noises are independent. See also [Reference Dawson and Li8] and [Reference Fu and Li20].
1.3. Skeletal path decomposition of supercritical CSBPs
In [Reference Berestycki, Kyprianou and Murillo-Salas4, Reference Bertoin, Fontbona and Martínez5, Reference Duquesne and Winkel11] it was shown that the law of the process X, where X is defined by (1.4), can be recovered from a supercritical continuous-time Galton–Watson (GW) process, issued with a Poisson number of initial ancestors, and dressed in a Poissonian way using the law of the original process conditioned to become extinguished.
To be more precise, they showed that for each $x\geq 0$ , $(X, \mathbb{P}_x)$ has the same law as the process $(\Lambda_t, t \geq 0)$ which has the following pathwise construction. First, sample from a continuous-time GW process with branching rate $q = \psi'(\lambda^*)$ and offspring distribution $\{p_k\,:\,k\geq 0\}$ such that its branching generator is given by
This continuous-time GW process goes by the name of the skeleton and offers the genealogy of prolific individuals, that is, individuals who have infinite genealogical lines of descent (cf. [Reference Bertoin, Fontbona and Martínez5]). With the particular branching generator given by (1.8), $p_0 = p_1 =0$ , and for $k\geq 2$ , $p_k : = p_k ([0,\infty))$ , where, for $r\geq 0$ ,
If we denote the aforesaid GW process by $Z = (Z_t, t\geq 0)$ then we shall also insist that $Z_0$ has a Poisson distribution with parameter $\lambda^*x$ . Next, thinking of the trajectory of Z as a graph, dress the life-lengths of Z in such a way that a $\psi^*$ -CSBP is independently grafted on to each edge of Z at time t with rate
Moreover, on the event that an individual dies and branches into $k\geq 2$ offspring, with probability $p_k(\text{d}x)$ , an additional independent $\psi^*$ -CSBP is grafted on to the branching point with initial mass $x\geq 0$ . The quantity $\Lambda_t$ is now understood to be the total dressed mass present at time t together with the mass present at time t in an independent $\psi^*$ -CSBP issued at time zero with initial mass x. Whilst it is clear that the pair $(Z,\Lambda)$ is Markovian, it is less clear that $\Lambda$ alone is Markovian. This must, however, be the case given the conclusion that $\Lambda$ and X are equal in law. A key element in this respect is the non-trivial observation that, for each $t\geq 0$ , the law of $Z_t$ given $\Lambda_t$ is that of a Poisson random variable with parameter $\lambda^*\Lambda_t$ .
Such skeletal path decompositions for continuous-state branching processes, and spatial versions thereof, are by no means new. Examples include [Reference Berestycki, Kyprianou and Murillo-Salas4, Reference Duquesne and Winkel11, Reference Etheridge and Williams15, Reference Evans and O’Connell18, Reference Harris, Hesse and Kyprianou22, Reference Kyprianou, Pérez and Ren26, Reference Kyprianou and Ren27, Reference Salisbury and Verzani43, Reference Salisbury and Verzani44].
1.4. Overview
In this paper our objective is to understand the relationship between the skeletal decompositions of the type described above and the emergence of a spine on conditioning the process to survive. In particular, our tool of choice will be SDE theory. The importance of this study is that it underlines a methodology that should carry over to the spatial setting of superprocesses, where recent results have increasingly sought the use of skeletal decompositions to transfer results from the branching particle setting to the setting of measure-valued processes; cf. [Reference Eckhoff, Kyprianou and Winkel13, Reference Kyprianou, Murillo-Salas and Pérez25, Reference Milos39, Reference Protter40]. In future work we hope to develop the SDE approach to skeletal decompositions in the spatial setting. We also expect this approach to be helpful in studying analogous decompositions in the setting of CSBPs with competition [Reference Berestycki, Fittipaldi and Fontbona3, Reference Protter40]. Moreover, although our method takes inspiration from the genealogical coding of CSBPs by Lévy excursions, cf. [Reference Duquesne and Le Gall10], our approach appears to be applicable where the aforesaid method fails, namely for supercritical processes.
2. Main results
In this section we summarise the three main results of the paper. First, we provide a slightly more general family of skeletal decompositions in the spirit of [Reference Duquesne and Winkel11], albeit under milder assumptions and in the language of SDEs. Second, taking lessons from this first result, we give a time-inhomogeneous skeletal decomposition, again using the language of SDEs, for both supercritical and (sub)critical CSBPs. Nonetheless, our proof will take inspiration from classical ideas on the genealogical coding of CSBPs through the exploration of associated excursions of reflected Lévy processes; see, for example, [Reference Duquesne and Le Gall10] and the references therein. Finally, our third main result shows that a straightforward limiting procedure in the SDE skeletal decomposition for (sub)critical processes, which corresponds to conditioning on survival, reveals a weak solution to the SDE given in (1.7). It will transpire that conditioning the process to survive until later and later times is equivalent to ‘thinning’ the skeleton such that, in the limit, we get the spine decomposition. The limiting procedure also intuitively explains how the spine emerges in the conditioned process as a consequence of stretching out the skeleton in the SDE decomposition of the (sub)critical processes.
Before moving to the first main result, let us introduce some more notation. The reader will note that it is very similar to, but nonetheless subtly different from, previously introduced terms. Define the Esscher transformed branching mechanism $\psi_\lambda\,:\,\mathbb{R}_+\rightarrow \mathbb{R}_+$ for $\theta\geq -\lambda$ and $\lambda\geq \lambda^*$ by
where
This is the branching mechanism of a subcritical branching process on account of the fact that $-\psi'_\lambda(0+) = -\psi'(\lambda)<0$ . Heuristically speaking, given that $\lambda\mapsto \psi'(\lambda)$ is increasing, the $\psi_\lambda$ -CSBP becomes more and more subcritical as $\lambda$ increases.
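The subcriticality claim can be checked in one line. Assume that the Esscher transform takes the usual form $\psi_\lambda(\theta) = \psi(\theta+\lambda)-\psi(\lambda)$ (an assumption, chosen to be consistent with the identity $-\psi'_\lambda(0+) = -\psi'(\lambda)$ and with $\psi_\lambda(0)=0$ ). Since $\psi(0)=\psi(\lambda^*)=0$ , the mean value theorem gives some $\xi\in(0,\lambda^*)$ with $\psi'(\xi)=0$ , and strict convexity of $\psi$ then yields, for $\lambda\geq\lambda^*$ ,

$$\psi'_\lambda(0+) = \psi'(\lambda) \geq \psi'(\lambda^*) > \psi'(\xi) = 0.$$

In particular, the mean of a $\psi_\lambda$ -CSBP issued from x decays like $x\,\text{e}^{-\psi'(\lambda)t}$ , and the decay rate $\psi'(\lambda)$ grows with $\lambda$ .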
Next, we need the continuous-time GW process parameterised by $\lambda\geq \lambda^*$ , which has been seen before in, e.g., [Reference Duquesne and Winkel11], and agrees with the process described by (1.8) when $\lambda = \lambda^*$ . It branches at rate $\psi'(\lambda)$ and has a branching generator given by
That is to say, writing $F_\lambda(s)$ as in the left-hand side of (1.8), we now have $p_0={\psi(\lambda)}/{\lambda \psi'(\lambda)}$ , $p_1=0$ , and, for $k\geq 2$ ,
We will also use the family $(\eta_k(\!\cdot\!))_{k\geq 0}$ of branch point immigration laws (conditional on the number of offspring at the branch point), where $\eta_1({\text{d}} r)=0$ , $r\geq0$ , and, otherwise,
for $r\geq 0$ . Note, in particular, that when $\lambda > \lambda^*$ there is the possibility that no offspring are allowed. Since in this case some lines of descent are finite, the GW process no longer represents the prolific individuals.
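To see the dead-end probability in action, the sketch below evaluates $p_0 = \psi(\lambda)/(\lambda\psi'(\lambda))$ and the branching rate $\psi'(\lambda)$ for a hypothetical mechanism $\psi(\theta) = -a\theta+b\theta^2+c\theta^2/(1+\theta)$ , which corresponds to the assumed jump measure $\Pi(\text{d}x) = c\,\text{e}^{-x}\text{d}x$ . The parameter values are chosen for illustration so that $\lambda^*=1$ ; none of them come from the paper. Convexity together with $\psi(0)=0$ gives $\psi(\lambda)\leq\lambda\psi'(\lambda)$ , which is why $p_0$ is a genuine probability.

```python
# Hypothetical example mechanism; for these values psi(1) = 0, so lambda* = 1.
A, B, C = 1.0, 0.5, 1.0
LAMBDA_STAR = 1.0


def psi(theta):
    """psi(theta) = -A*theta + B*theta^2 + C*theta^2/(1+theta)."""
    return -A * theta + B * theta ** 2 + C * theta ** 2 / (1 + theta)


def dpsi(theta):
    """Derivative of psi; equals the branching rate of the lambda-skeleton."""
    return -A + 2 * B * theta + C * (theta ** 2 + 2 * theta) / (1 + theta) ** 2


def p0(lam):
    """Probability that a branching event of the lambda-skeleton is a dead end."""
    return psi(lam) / (lam * dpsi(lam))


for lam in (LAMBDA_STAR, 1.5, 2.0, 4.0):
    print(f"lambda={lam:4.1f}  rate psi'(lambda)={dpsi(lam):.4f}  p0={p0(lam):.4f}")
```

At $\lambda=\lambda^*$ the dead-end probability vanishes, in line with the skeleton then consisting of prolific individuals only; for larger $\lambda$ it is strictly positive.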
Finally, we need to introduce a series of driving sources of randomness for the SDE which will appear in Theorem 2.1 below. Let $ {\texttt N}^{0}$ be a Poisson random measure on $[0, \infty)^{3}$ with intensity measure ${\text{d}} s\otimes \text{e}^{-\lambda r} \Pi({\text{d}} r) \otimes \text{d}\nu$ , $\tilde{\texttt N}^0$ be the associated compensated version of ${\texttt N}^0$ , $ {\texttt N}^1({\text{d}} s,{\text{d}} r, {\text{d}}{j}) $ be a Poisson point process on $[0, \infty)^{2}\times {\mathbb{N}}$ with intensity $ {\text{d}} s\otimes r \text{e}^{-\lambda r} \Pi({\text{d}} r) \otimes\sharp({\text{d}}{j})$ , and finally ${\texttt N}^{2}({\text{d}} s,{\text{d}} r, {\text{d}}{k},{\text{d}}{j})$ be a Poisson point process on $[0, \infty)^{2}\times {\mathbb{N}}_0\times\mathbb{N}$ with intensity $ \psi'(\lambda){\text{d}} s\otimes \eta_k({\text{d}} r) \otimes p_k\sharp({\text{d}} k)\otimes \sharp({\text{d}}{j}) $ , where $\mathbb{N}_0 = \{0\}\cup\mathbb{N}$ and $\sharp({\text{d}}\ell)= \sum_{i\in \mathbb{N}_0}\delta_{i}({\text{d}} \ell)$ , $\ell\geq 0$ , denotes the counting measure on $\mathbb{N}_0$ . As before, $W(\text{d}s, \text{d}u)$ will denote a white noise process on $(0, \infty)^{2}$ based on the Lebesgue measure $\text{d}s\otimes \text{d}u$ .
Theorem 2.1. Suppose that $\psi$ corresponds to a supercritical branching mechanism (i.e. $\alpha>0$ ) and $\lambda\geq \lambda^*$ . Consider the coupled system of SDEs
Equation (2.3) has a unique strong solution for arbitrary ( $\mathcal{F}_0$ -measurable) initial values $\Lambda_0\geq 0$ and $Z_0\in \mathbb{N}_0$ (where $\mathcal{F}_t\,:\!=\sigma((\Lambda_s,Z_s)\,:\,s\leq t)$ ). Furthermore, under the assumption that $Z_0$ is an independent random variable which is Poisson distributed with intensity $\lambda \Lambda_0$ , this unique solution satisfies the following:
(i) For $t\geq0$ , conditional on $\mathcal{F}^\Lambda_t: = \sigma(\Lambda_s\,:\,s\leq t)$ , $Z_t $ is Poisson distributed with intensity $\lambda\Lambda_t$ .
(ii) The process $(\Lambda_t, t\geq 0)$ is Markovian and a weak solution to (1.4).
(iii) If $Z_0=0$ , then $(\Lambda_t,t\geq 0)$ is a subcritical CSBP with branching mechanism $\psi_{\lambda}$ .
If we focus on the second element, Z, in the SDE (2.3), it can be seen that there is no dependency on the first element $\Lambda$ . The converse is not true, however. Indeed, the stochastic evolution for Z is simply that of the continuous-time GW process with branching mechanism given by $F_\lambda(s)$ , $s\in[0,1]$ . Given the evolution of Z, the process $\Lambda$ here describes nothing more than the aggregation of a Poisson and branch-point dressing on Z together with an independent copy of a $\psi_\lambda$ -CSBP. As is clear from (2.2), this results in the skeleton Z having the possibility of ‘dead ends’ (no offspring). Of course, if $\lambda = \lambda^*$ then this occurs with zero probability and the joint system of SDEs in (2.3) describes precisely the prolific skeleton decomposition. In the spirit of [Reference Duquesne and Winkel11], albeit using different technology and in a continuum setting, Theorem 2.1 puts into a common framework a parametric family of skeletal decompositions for supercritical processes. Related work also appears in [Reference Abraham and Delmas1, Reference Li37].
Remark 2.1. Although we have assumed in the introduction that
the reader can verify from the proof that this is in fact not needed. Indeed, suppose that we relax the assumption on $\Pi$ to just $\int_{(0,\infty)}(1\wedge x^2)\,\Pi({\text{d}} x)<\infty$ , and we take the branching mechanism in the form
where $\psi'(0)<0$ and
to ensure conservative supercriticality. Then the necessary adjustment we need to make occurs, for example, in (1.4), where jumps of size greater than or equal to 1 in the Poisson random measures N are separated out without compensation. However, the form of (2.3) remains the same, as all jumps of $N^0$ can be compensated.
Our objective, however, is to go further and demonstrate how the SDE approach can also apply in the finite-horizon setting. We do this below, but we should remark that the skeletal decomposition is heavily motivated by the description of the CSBP genealogy using the so-called height process in [Reference Duquesne and Le Gall10]. Indeed, for (sub)critical CSBPs we may consider the conclusion of Theorem 2.2 below as a rewording thereof. However, as the proof does not rely on the CSBP being (sub)critical, the same result holds in the supercritical case. Thus, Theorem 2.2 is also a time-inhomogeneous version of Theorem 2.1 for supercritical CSBPs, a setting that was not discussed in [Reference Duquesne and Le Gall10].
Assume that $\psi$ is a branching mechanism that satisfies Grey’s condition (1.5). We fix a time marker $T>0$ and we want to describe a coupled system of SDEs in the spirit of (2.3) in which the second component describes prolific genealogies to the time horizon T. In other words, our aim is to provide an SDE decomposition of the CSBP along those individuals in the population who have a descendant at time T.
To this end, recall that $(u_t(\theta), t\geq 0)$ is given by (1.1) and accordingly, for $t\geq 0$ , $u_t(\infty) = -x^{-1}\log \mathbb{P}_x(X_t=0) $ gives the rate at which extinction has occurred by time t. We need a Poisson random measure $\texttt{N}^{0}_T$ on $[0,T)\times [0, \infty)^{2}$ with intensity $\text{d}s\otimes \text{e}^{-u_{T-s}(\infty) r} \Pi(\text{d}r) \otimes \text{d}\nu$ , a Poisson process $ {\texttt N}^1_T$ on $[0,T)\times [0, \infty)\times {\mathbb{N}}_0$ with intensity $\text{d} s\otimes r \text{e}^{-u_{T-s}(\infty) r} \Pi(\text{d}r) \otimes\sharp(\text{d}j),$ and a Poisson process $ {\texttt N}^{2}_T(\text{d}s,\text{d}r,\text{d}k,\text{d}j)$ on $[0,T)\times[0, \infty)\times {\mathbb{N}}_0\times\mathbb{N}$ with intensity
where, for $k\geq 2$ ,
$p^{T-s}_k$ is such that $p^{T-s}_0 =p^{T-s}_1 = 0$ , and the remaining probabilities are computable by insisting that $\eta_k^{T-s}(\!\cdot\!)$ is itself a probability distribution for each $k\geq2$ .
Theorem 2.2. Suppose that $\psi$ corresponds to a branching mechanism which satisfies Grey’s condition (1.5). Fix a time horizon $T>0$ and consider the coupled system of SDEs
Equation (2.5) has a unique strong solution for arbitrary ( $\mathcal{F}_0^T$ -measurable) initial values $\Lambda_0^T\geq 0$ and $Z_0^T\in \mathbb{N}_0$ (where $\mathcal{F}_t^T: = \sigma((\Lambda_s^T,Z_s^T)\,:\,s\leq t)$ , $t{<}T$ ). Furthermore, under the assumption that $Z_0^T$ is an independent random variable which is Poisson distributed with intensity $u_T(\infty) \Lambda^T_0$ , this unique solution satisfies the following:
(i) For $T>t\geq0$ , conditional on $\mathcal{F}^{\Lambda^T}_t: = \sigma(\Lambda^T_s\,:\,s\leq t)$ , $Z^T_t $ is Poisson distributed with intensity $u_{T-t}(\infty)\Lambda^T_t$ .
(ii) The process $(\Lambda^T_t, 0\leq t{<}T)$ is Markovian and a weak solution to (1.4).
(iii) Conditional on $\{Z^T_0=0\}$ , the process $(\Lambda^T_t, 0\leq t{<}T)$ corresponds to a weak solution to (1.4) conditioned to become extinct by time T.
The SDE evolution in Theorem 2.2 mimics the skeletal decomposition in (2.3), albeit that the different components in the decomposition are time dependent. Putting the SDE representation aside, such time-varying skeletons have been observed in, e.g., [Reference Duquesne and Le Gall10, Reference Etheridge and Williams15]. We note that the underlying skeleton $Z^T$ can be thought of as a time-inhomogeneous GW process (a T-prolific skeleton) such that, at time $s< T$ , its branching rate is given by
and its offspring distribution is given by $\{p^{T-s}_k\,:\,k\geq 0\}$ . This has the feature that the branching rate explodes towards the time horizon T. To see why, we can appeal to (1.1), and note that
and hence $\lim_{t\to0}u_t(\infty) = \infty$ . Moreover, we can easily verify from (1.2) that
Together, these facts imply the explosion of (2.6) as $s\to T$ .
We also note from the integrals involving ${\texttt N}_T^1$ and ${\texttt N}_T^{2}$ that there is mass immigrating off the space-time trajectory of $Z^T$ . Moreover, once mass has immigrated, the first four terms of (2.5) show that it evolves as a time-inhomogeneous CSBP.
Note that in the supercritical setting $u_{T-t}(\infty)$ converges to $\lambda^*$ for all $t>0$ as $T\rightarrow\infty$ . This intuitively means that when T goes to $\infty$ , one can recover the prolific skeleton decomposition of Theorem 2.1 from the time-inhomogeneous one of Theorem 2.2.
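To make the explosion and the large-T limit concrete, consider the hypothetical case $\Pi\equiv 0$ with $\alpha,\beta>0$ , so that $\psi(\theta) = -\alpha\theta+\beta\theta^2$ and $\lambda^*=\alpha/\beta$ . This special case is chosen for illustration only; Grey's condition holds in it because $\beta>0$ . Solving $\partial_t u_t(\theta) = -\psi(u_t(\theta))$ with $u_0(\theta)=\theta$ and then letting $\theta\to\infty$ gives

$$u_t(\theta) = \frac{\theta\,\text{e}^{\alpha t}}{1+(\beta/\alpha)\,\theta\,(\text{e}^{\alpha t}-1)}, \qquad u_t(\infty) = \frac{\alpha}{\beta}\cdot\frac{1}{1-\text{e}^{-\alpha t}}.$$

Hence $u_{T-s}(\infty)\uparrow\infty$ as $s\uparrow T$ , while $u_{T-t}(\infty)\downarrow\alpha/\beta=\lambda^*$ as $T\to\infty$ for each fixed $t\geq 0$ .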
Finally, with the finite-horizon SDE skeletal decomposition in Theorem 2.2 we may now turn our attention to understanding what happens when we observe the solution to (2.5) in the (sub)critical case on a finite time horizon $[0,t_0]$ , and we condition on there being at least one T-prolific genealogy while letting $T\rightarrow\infty$ .
Theorem 2.3. Suppose that $\psi$ is a critical or subcritical branching mechanism such that Grey’s condition (1.5) holds. Suppose, moreover, that $((\Lambda^T_t, Z^T_t), 0\leq t{<}T)$ is a weak solution to (2.5) and that $Z_0^T$ is an independent random variable which is Poisson distributed with intensity $u_T(\infty) \Lambda^T_0$ . Then, conditional on the event $Z_0^T>0$ , in the sense of weak convergence with respect to the Skorokhod topology on $\mathbb{D}([0,\infty),\mathbb{R}^2)$ , for all $t_0>0$ ,
as $T\to\infty$ , where $X^\uparrow$ is a weak solution to (1.7).
Theorem 2.3 puts the phenomena of spines and skeletons in the same framework. Roughly speaking, any subcritical branching population contains a naturally embedded skeleton which describes the ‘fittest’ genealogies. In our setting ‘fittest’ means surviving until time T, but other notions of fitness can be considered, especially when one introduces a spatial type to mass in the branching process. For example, in [Reference Harris, Hesse and Kyprianou22] a branching Brownian motion in a strip is considered, where ‘fittest genealogies’ pertains to those lines of descent which survive in the strip for all eternity. Having at least one line of descent in the skeleton corresponds to the event of survival. Thus, conditioning on survival as we make the survival event itself increasingly unlikely, e.g. by taking $T\to\infty$ in our model or taking the width of the strip down to a critical value in the branching Brownian motion model, the natural stochastic behaviour of the skeleton is to thin down to a single line of descent. This phenomenon was originally observed in [Reference Etheridge and Williams15], where the scaling limit of a GW process conditioned on survival is shown to converge to the immortal particle decomposition of the $(1+\beta)$ -superprocess conditioned on survival.
The remainder of the paper is structured as follows. In the next section we explain the heuristic behind how (1.4) can be decoupled into components that arise in (2.3). The heuristic is used in Section 4, where the proof of Theorem 2.1 is given. In this sense our proof of Theorem 2.1 has the feel of a ‘guess-and-verify’ approach. In Section 5, again in the spirit of a ‘guess-and-verify’ approach, we use ideas from the classical description of the exploration process of CSBPs in, e.g., [Reference Duquesne and Le Gall10] to provide the heuristic behind the mathematical structures that lie behind the proof of Theorem 2.2. Given the similarity of this proof to that of Theorem 2.1, it is sketched in Section 6. Finally, in Section 7 we provide the proof of Theorem 2.3.
3. Thinning of the CSBP SDE
In this section we will perform an initial manipulation of the SDE (1.4), which we will need in order to make comparative statements for Theorems 2.1 and 2.2. To this end, we will introduce some independent marks on the atoms of the Poisson process N driving (1.4) and use them to thin out various contributions to the SDE evolution.
Denote by $(t_i,r_i,\nu_i\,:\,i\in {\mathbb{N}})$ some enumeration of the atoms of N and recall that $\mathbb{N}_0 = \{0\}\cup\mathbb{N}$ . By enlarging the probability space, we can introduce an additional mark to atoms of N, say $({k}_i\,:\,i\in {\mathbb{N}})$ , resulting in an ‘extended’ Poisson random measure,
on $[0,\infty)^3\times\mathbb{N}_0$ with intensity
Now define three random measures by
Classical Poisson thinning now tells us that $ N^{0}$ , $ N^{1}$ , and $ N^{2}$ are independent Poisson point processes on $[0, \infty)^{3}$ with respective intensities $\text{d}s\otimes \text{e}^{-{\lambda} r} \Pi(\text{d}r) \otimes \text{d}\nu$ , $\text{d}s\otimes ({\lambda}r) \text{e}^{-{\lambda} r} \Pi(\text{d}r) \otimes \text{d}\nu$ , and $ \text{d}s\otimes\sum_{k=2}^{\infty} ({\lambda}r)^k \text{e}^{-{\lambda} r} \Pi(\text{d}r)/{k!}\otimes \text{d}\nu. $
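The three intensities above are what one obtains if, given an atom with jump size r, its mark is Poisson distributed with mean $\lambda r$ and the atom is sent to $N^0$ , $N^1$ , or $N^2$ according to whether the mark is 0, 1, or at least 2. This reading of the marking is an assumption inferred from the stated intensities. The sketch below, with hypothetical function names, computes the three thinning weights for a given r and confirms that they sum to one, so that the three thinned intensities add up to the intensity $\text{d}s\otimes\Pi(\text{d}r)\otimes\text{d}\nu$ of N.

```python
import math


def mark_weights(lam, r, kmax=200):
    """Poisson(lam * r) probabilities for marks k = 0, ..., kmax (assumed mark law)."""
    weights = [math.exp(-lam * r)]
    for k in range(1, kmax + 1):
        weights.append(weights[-1] * lam * r / k)  # recursion avoids large factorials
    return weights


def thinning_split(lam, r):
    """Weights with which an atom of size r is assigned to N^0, N^1 and N^2."""
    w = mark_weights(lam, r)
    return w[0], w[1], sum(w[2:])


w0, w1, w2 = thinning_split(1.5, 2.0)
print(w0, w1, w2, w0 + w1 + w2)
```

The truncation level `kmax` is only adequate when $\lambda r$ is moderate; for very large jumps it would need to be increased.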
With these thinned Poisson random measures in hand, we may start to separate out the different stochastic integrals in (1.4). We have that, for $t\geq 0$ ,
where in the last equality we have used the easily derived fact that $-\int_{(0,\infty)}(1-\text{e}^{-{\lambda} r})r\, \Pi(\text{d}r)= -\alpha +2\beta {\lambda}- \psi'(\lambda) $ . Recalling (2.1), the first line in the last equality of (3.1) corresponds to the dynamics of a subcritical CSBP with branching mechanism $\psi_\lambda$ .
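The identity $-\int_{(0,\infty)}(1-\text{e}^{-\lambda r})r\,\Pi(\text{d}r) = -\alpha+2\beta\lambda-\psi'(\lambda)$ follows by differentiating a finite-mean branching mechanism under the integral sign. As a sanity check, the sketch below verifies it numerically for the hypothetical jump measure $\Pi(\text{d}r) = c\,\text{e}^{-r}\text{d}r$ , for which $\psi'(\lambda) = -\alpha+2\beta\lambda+c(\lambda^2+2\lambda)/(1+\lambda)^2$ . The parameter values, the truncation of the integral at 60, and the function names are assumptions made for the example.

```python
import math

# Hypothetical parameters for psi(theta) = -A*theta + B*theta^2 + C*theta^2/(1+theta).
A, B, C = 1.0, 0.5, 1.0


def dpsi(theta):
    """Derivative of the example branching mechanism."""
    return -A + 2 * B * theta + C * (theta ** 2 + 2 * theta) / (1 + theta) ** 2


def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def lhs(lam, upper=60.0, n=60000):
    """-int (1 - exp(-lam*r)) r Pi(dr), truncated at `upper` (tail is negligible)."""
    def integrand(r):
        return (1 - math.exp(-lam * r)) * r * C * math.exp(-r)
    return -simpson(integrand, 0.0, upper, n)


def rhs(lam):
    """-alpha + 2*beta*lam - psi'(lam)."""
    return -A + 2 * B * lam - dpsi(lam)


for lam in (0.5, 1.0, 3.0):
    print(lam, lhs(lam), rhs(lam))
```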
Inspecting the statement of Theorem 2.1, we see intuitively that in order to prove this result, our job is to show that the integrals on the right-hand side of (3.1) driven by $N^1$ and $N^2$ can be identified with the mass that immigrates off the skeleton.
4. $\lambda$ -skeleton: proof of Theorem 2.1
We start by addressing the claim that (2.3) possesses a unique strong solution. Thereafter we prove claims (i), (ii), and (iii) of the theorem in order.
We can identify the existence of any weak solution to (2.3) with initial value $(\Lambda_0, Z_0)= (x,n)$ , $x\geq0$ , $n\in\mathbb{N}_0$ , by introducing additionally marked versions of the Poisson random measures ${\texttt N}^{1}$ and ${\texttt N}^{2}$ , as well as an additional Poisson random measure ${\texttt N}^*$ . We will insist that ${\texttt N}^{1}({\text{d}} s,{\text{d}} r,{\text{d}}{j}, {\text{d}}\omega)$ has intensity ${\text{d}} s\otimes r\text{e}^{-\lambda r}\Pi({\text{d}} r)\otimes \sharp({\text{d}} j)\otimes\mathbb{P}^{(\lambda)}_r({\text{d}}\omega)$ on $[0,\infty)^2\times\mathbb{N}_0\times\mathbb{D}([0,\infty), \mathbb{R})$ , ${\texttt N}^{2}({\text{d}} s,{\text{d}} r,{\text{d}}{k},{\text{d}}{j}, {\text{d}}\omega)$ has intensity $ \psi'(\lambda){\text{d}} s\otimes \eta_k({\text{d}} r) \otimes p_k\sharp({\text{d}} k)\otimes \sharp({\text{d}}{j})\otimes \mathbb{P}^{(\lambda)}_r({\text{d}}\omega)$ on $[0,\infty)^2\times\mathbb{N}_0\times\mathbb{N}_0\times\mathbb{D}([0,\infty), \mathbb{R})$ , and ${\texttt N}^*({\text{d}} s, {\text{d}}{j}, {\text{d}} \omega)$ has intensity $2\beta{\text{d}} s\otimes \sharp({\text{d}} j)\otimes \mathbb{Q}^{(\lambda)}({\text{d}}\omega)$ on $[0,\infty)\times\mathbb{N}_0\times\mathbb{D}([0,\infty), \mathbb{R})$ , where $\mathbb{P}^{(\lambda)}_r$ is the law of a $\psi_\lambda$ -CSBP with initial value $r\geq 0$ (formally speaking, $\mathbb{P}^{(\lambda)}_0$ is the law of the null process) and $\mathbb{Q}^{(\lambda)}$ is the associated excursion measure.
Our proposed solution to (2.3) will be to first define $(Z_t, t\geq 0)$ as the continuous-time GW process with branching rate $\psi'(\lambda)$ and offspring distribution given by $(p_k, k\geq 0)$ . It is then easy to otherwise represent Z in the more complicated form
Next we take
where $X^{(\lambda)}$ is an autonomously independent copy of a $\psi_\lambda$ -CSBP issued with initial mass x and, given ${\texttt N}^{1}$ and ${\texttt N}^{2}$ , $D_t$ , $t\geq 0$ , is the uniquely identified (up to almost sure modification) ‘dressed skeleton’ described by
To see why this provides a weak solution to (2.3), we may appeal to the martingale representation of the Markov pair $(\Lambda, Z)$ described above. In particular, the generator of $(\Lambda, Z)$ can be identified consistently with the generator of the process associated to (2.3); that is to say, their common generator is given by
for $x\geq 0$ , $n\in\mathbb{N}_0$ , and for all non-negative, smooth, and compactly supported functions f. (Here, the penultimate term is understood to be zero when $n=0$ .) With this sense of commonality for their generators, it is then easy to verify the conditions of Theorem 2.3 of [Reference Kurtz and Crisan23] and thus to conclude that $(\Lambda, Z)$ provides a weak solution to (2.3).
Pathwise uniqueness is also relatively easy to establish. Indeed, suppose that $\Lambda$ is the first component of any path solution to (2.3) with driving source of randomness ${\texttt N}^{0}$ , ${\texttt N}^{1}$ , ${\texttt N}^{2}$ , and W, and suppose that we write it in the form
where
and
Recalling that the almost sure path of Z is uniquely defined by $ {\texttt N}^{2}$ , it follows that, if $\Lambda^{(1)}$ and $\Lambda^{(2)}$ are two path solutions to (2.3) with the same initial value, then
The reader will now note that the above equation is precisely the SDE obtained when looking at the path difference between two solutions of an SDE of the type given in (1.4). Since there is pathwise uniqueness for (1.4), we easily conclude that $\Lambda^{(1)} = \Lambda^{(2)}$ almost surely.
Finally, taking account of the existence of a weak solution and pathwise uniqueness, we may appeal to an appropriate version of the Yamada–Watanabe theorem (see, for example, Theorem 1.2 of [Reference Barczy, Li and Pap2]) to deduce that (2.3) possesses a unique strong solution. And since this holds for every fixed initial configuration x and n, it also holds when the initial values are independently randomised.
(i) This claim requires an analytical verification and, in some sense, is similar in spirit to the proof that, for $t\geq 0$ , $Z_t \mid \Lambda_t$ is Poisson distributed with rate $\lambda^*\Lambda_t$ in the prolific skeletal decomposition found in [Reference Berestycki, Kyprianou and Murillo-Salas4]. A fundamental difference here is that we work with SDEs, and hence stochastic calculus, rather than integral equations for semigroups as in [Reference Berestycki, Kyprianou and Murillo-Salas4] and, moreover, the parameter $\lambda$ need not be the minimal value, $\lambda^*$ , in its range.
Standard arguments show that the solution to (2.3) is a strong Markov process, and accordingly we write $\mathbf{P}_{x,n}$ , $x>0$ , $n\in\mathbb{N}_0$ for its probabilities. Moreover, with an abuse of notation we write, for $x>0$ ,
Define $f_t(\eta, \theta)\,:\!=\mathbf{E}_x[\text{e}^{-\eta \Lambda_t-\theta Z_t}]$ , $x,\theta,\eta, t\geq 0$ , and let $F_t\,:\!= \text{e}^{-\eta \Lambda_t-\theta Z_t}$ , $t\geq 0$ . Using Itô’s formula for semi-martingales (see Theorem 32 of [Reference Protter40]), for $t\geq 0$ ,
(Here and throughout the remainder of this paper, for any stochastic process Y we use the notation that $\Delta Y_t = Y_t-Y_{t-}$ .) As Z is a pure jump process, we have that $[Z,Z]_t^c=[\Lambda,Z]_t^c=0$ . Taking advantage of the fact that
we may thus write, in integral form,
where the sum is taken over the countable set of discontinuities of $(\Lambda, Z)$ . We can split up the sum of discontinuities according to the Poisson random measure in (2.3) that is responsible for the discontinuity. Hence, writing $\Delta^{(j)}$ , $j=0,1,2$ , to mean an increment coming from each of the three Poisson random measures,
Now, note that we can rewrite the first element of the vectorial SDE (2.3) as
where $M_t$ is a zero-mean martingale corresponding to the integral in (2.3) with respect to $\tilde{\texttt N}^0$ . Therefore, performing the necessary calculus in (4.3) for the integral with respect to $\text{d}\Lambda_t$ , we get that
is equal to a zero-mean martingale which is the sum of the previously mentioned $M_t$ , $t\geq 0$ , and the white noise integral. Taking expectations, we thus have
Accumulating terms, we find that $f_t(\eta,\theta)$ satisfies the PDE
where
Standard theory for the linear partial differential equation (4.4) (see, for example, Chapter 3 (Theorem 2, p. 107) of [Reference Evans17] and references therein) tells us that it has a unique local solution. Our aim now is to show that this solution is also represented by
where we recall that X is the $\psi$ -CSBP. To this end, let us define $\kappa = \eta + \lambda (1- \text{e}^{-\theta})$ and note that, for $x,t, \kappa \geq 0$ , $ g_t(\kappa): = \mathbb{E}_x[\exp\{-\kappa X_t\}] $ satisfies
(see, for example, Exercise 12.2 in [Reference Kyprianou24]). After a laborious amount of algebra we can verify that $-\psi({\kappa})=A_\lambda(\eta,\theta)+\lambda \text{e}^{-\theta}B_\lambda(\eta,\theta)$ , and hence we may develop the right-hand side of (4.6) and write, for $x, t, \eta, \theta\geq 0$ ,
Now we choose $x=1$ . Then local uniqueness of the solution to (4.4), or equivalently the local uniqueness of (4.6), thus tells us that there exists $t_0>0$ such that $g_t( \eta + \lambda (1- \text{e}^{-\theta})) = f_t (\eta, \theta)$ for all $\eta, \theta \geq 0$ and $t\in [0,t_0]$ .
In conclusion, now that we have proved that for $t\in[0,t_0]$
we can observe the following implications in turn. First, setting $\theta = 0$ and $\eta>0$ , we see that $\Lambda_t$ under $\mathbf{P}_1$ has the same distribution as $X_t$ under $\mathbb{P}_1$ for all $t\in[0,t_0]$ . Next, setting both $\eta,\theta>0$ , we observe that $(\Lambda_t, Z_t)$ under $\mathbf{P}_1$ has the same law as $(X_t, \textrm{Po}(\lambda x)|_{x=X_t})$ under $\mathbb{P}_1$ , where $\textrm{Po}(\lambda x)$ is an autonomously independent Poisson random variable with rate $\lambda x$ . In particular, it follows that, for all $t\in[0,t_0]$ , under $\mathbf{P}_1$ , the law of $Z_t$ given $\Lambda_t$ is $\textrm{Po}(\lambda \Lambda_t)$ .
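As a numerical sanity check of the Poissonisation step (ours, and not part of the proof), the following minimal sketch verifies, for a fixed value $X_t = x$ , that mixing a $\textrm{Po}(\lambda x)$ variable into the Laplace functional shifts the exponent from $\eta$ to $\kappa = \eta + \lambda(1-\text{e}^{-\theta})$ . The helper names and the parameter values are illustrative assumptions.

```python
import math

def poisson_laplace(rate, theta, terms=200):
    """E[exp(-theta * N)] for N ~ Poisson(rate), summing the pmf term by term."""
    total = 0.0
    pmf = math.exp(-rate)  # P(N = 0)
    for k in range(terms):
        total += math.exp(-theta * k) * pmf
        pmf *= rate / (k + 1)  # P(N = k + 1) from P(N = k)
    return total

def kappa(eta, theta, lam):
    """Shifted exponent kappa = eta + lam * (1 - exp(-theta))."""
    return eta + lam * (1.0 - math.exp(-theta))

# Hypothetical parameter values, chosen only for illustration.
lam, eta, theta, x = 1.5, 0.7, 0.4, 2.0

# Left: E[exp(-eta*x - theta*N)] with N ~ Po(lam * x); right: exp(-kappa * x).
lhs = math.exp(-eta * x) * poisson_laplace(lam * x, theta)
rhs = math.exp(-kappa(eta, theta, lam) * x)
print(lhs, rhs)
```

Integrating both sides over the law of $X_t$ then gives the identity between the joint Laplace transform of $(X_t, \textrm{Po}(\lambda X_t))$ and $g_t(\kappa)$ .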
To get a global result we first show that the previous conclusions hold for any initial mass $x>0$ on the time interval $[0,t_0]$ ; then, using the Markov property, we extend the results to any $t>0$ . First, from (4.5) we can observe that
Thus, in order to extend the previous results to any $x>0$ we only need to prove that
Recalling the representation (4.1) and the notation (4.2), we can write, for $t\leq t_0$ ,
as required.
Now take $t_0{<}t\leq 2t_0$ , and use the tower property to get
and similarly
Thus, using local uniqueness and the previously deduced implications on $[0,t_0]$ , we see that
and by iterating the previous argument we get equality for any $t>0$ .
Finally, on account of the fact that $(\Lambda_t, Z_t)$ , $t\geq0$ , is a joint Markovian pair, this now global Poissonisation allows us to infer that $\Lambda_t$ , $t\geq 0$ , is itself Markovian. Indeed, for any bounded measurable and positive h and $s,t\geq 0$ ,
We may now conclude that for all $t\geq 0$ and $x>0$ , under $\mathbf{P}_x$ , $Z_t \mid \mathcal{F}^\Lambda_t$ is $\textrm{Po}(\lambda \Lambda_t)$ distributed, as required.
(ii) We have seen that the pair $((\Lambda_t,Z_t),t\geq 0)$ is a Markov process for any initial state (x, n), but, due to the dependence on Z, on its own $(\Lambda_t,t\geq 0)$ is not Markovian. However, considering (4.7) we see that after the Poissonisation of $Z_0$ , $(\Lambda_t, t\geq 0)$ becomes a Markov process with a semi-group that agrees with that of $(X_t, t\geq 0)$ . On account of the fact that X is the unique weak solution to (1.4), it automatically follows that $\Lambda$ also represents the unique weak solution to (1.4).
(iii) Since the event $\{Z_0=0\}$ implies the event $\{Z_t=0,t\geq 0\}$ , the system (2.3) reduces to the SDE
\begin{equation*} \Lambda_t= x - \psi'(\lambda) \int_{0}^{t} \Lambda_{s-}\text{d}s + \sqrt{2\beta} \int_{0}^{t}\int_{0}^{\Lambda_{s-}}\! W(\text{d}s,\text{d}u) + \int_{0}^{t}\int_{0}^{\infty}\int_{0}^{\Lambda_{s-}}\!r \tilde{\texttt N}^0(\text{d}s, \text{d}r, \text{d}\nu), \end{equation*}
which has the exact form of the SDE describing the evolution of a CSBP with branching mechanism $\psi_\lambda$ .
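To make the Esscher-transformed mechanism concrete, here is a minimal sketch under two assumptions that are not restated in this section: that $\psi_\lambda(\theta) = \psi(\theta+\lambda)-\psi(\lambda)$ (the usual Esscher transform), and that $\Pi\equiv 0$ so that $\psi(\theta) = -\alpha\theta+\beta\theta^2$ . The numerical values of $\alpha$ , $\beta$ , and $\lambda$ are hypothetical. In this case $\psi_\lambda(\theta) = \psi'(\lambda)\theta + \beta\theta^2$ , so that $\psi_\lambda'(0+) = \psi'(\lambda)$ .

```python
# Hypothetical coefficients for a quadratic branching mechanism (Pi = 0).
ALPHA, BETA = 0.5, 1.0

def psi(theta):
    """psi(theta) = -alpha * theta + beta * theta^2."""
    return -ALPHA * theta + BETA * theta * theta

def psi_prime(theta):
    """Derivative of psi."""
    return -ALPHA + 2.0 * BETA * theta

def psi_esscher(theta, lam):
    """Assumed Esscher transform: psi_lambda(theta) = psi(theta + lam) - psi(lam)."""
    return psi(theta + lam) - psi(lam)

lam = 0.8  # any lam with psi'(lam) > 0 gives a subcritical psi_lambda
for theta in (0.0, 0.3, 1.0, 2.5):
    closed_form = psi_prime(lam) * theta + BETA * theta * theta
    print(theta, psi_esscher(theta, lam), closed_form)
```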
5. Exploration of subcritical CSBPs
The objective of this section is to give a heuristic description of how the notion of a prolific skeleton emerges in the subcritical case, and specifically why the structure of the SDE (2.5) is meaningful in this respect. We need to be careful about what one means by ‘prolific’, but nonetheless the inspiration for a decomposition can be gleaned by examining in more detail the description of subcritical CSBPs through the exploration process.
We assume throughout the conditions of Theorem 2.2. That is to say, X is a (sub)critical $\psi$ -CSBP where $\psi$ satisfies Grey’s condition (1.5). Let $(\xi_t,t\geq 0)$ be a spectrally positive Lévy process with Laplace exponent $\psi$ . Using the classical work of [Reference Le Gall and Le Jan31, Reference Le Gall and Le Jan32] (see also [Reference Duquesne and Le Gall10, Reference Le Gall30]) we can use generalised Ray–Knight-type theorems to construct X in terms of the so-called height process associated to $\xi$ . For convenience, and to introduce more notation, we give a brief overview here.
Denote by $(\hat{\xi}^{(t)}_r,0\leq r\leq t)$ the time-reversed process at time t, that is, $\hat{\xi}_r^{(t)}\,:\!=\xi_t-\xi_{(t-r)-}$ , and let $\hat{S}_r^{(t)}\,:\!=\sup_{s\leq r} \hat{\xi}_s^{(t)}$ . We define $H_t$ as the local time at level 0 at time t of the process $\hat{S}^{(t)}-\hat{\xi}^{(t)}$ . Because the reversed process has a different point from which it is reversed at each time, the process H does not behave in a monotone way. The process $(H_t,t\geq 0)$ is called the $\psi$ -height process, which, under assumption (1.5), is continuous. There exists a notion of local time up to time t of H at level $a\geq 0$ , henceforth denoted by $L_t^a$ . Specifically, the family $(L^a_t, a,t\geq 0)$ satisfies
where g is a non-negative measurable function.
For $x>0$ let $T_x\,:\!=\inf\{t\geq 0, \xi_t=-x\}$ . Then the generalised Ray–Knight theorem for the $\psi$ -CSBP process states that $(L^a_{T_x},a\geq 0)$ has a càdlàg modification for which
that is, the two processes are equal in law.
The height process also codes the genealogy of the $\psi$ -CSBP. It can be shown that the excursions of H from 0 form a time-homogeneous Poisson point process of excursions with respect to local time at 0. We shall use $\texttt{n}$ to denote its intensity measure.
If $X_0 = x$ , then the total amount of local time of H accumulated at zero is x. Each excursion codes a real tree (see [Reference Duquesne and Le Gall10] for a precise meaning) such that the excursion that occurs after $u\leq x$ units of local time can be thought of as the descendants of the uth individual in the initial population. Here we are interested in the genealogy of the conditioned process and what we will call the embedded ‘T-prolific’ tree, that is, the tree of the individuals that survive up to time T. Conditioning the process on survival up to time T corresponds to conditioning the height process to have at least one excursion above level T. (We have the slightly confusing, but nonetheless standard, notational anomaly that a spatial height for an excursion corresponds to the spatial height in the tree that it codes, but that this may also be seen as a time into the forward evolution of the tree.) Let $\texttt{n}_T$ denote the conditional probability $\texttt{n}(\cdot \mid \sup_{s\geq 0}\epsilon_s\geq T)$ , where $\epsilon$ is a canonical excursion of H under $\texttt{n}$ . Let $(Z_t^T,t\geq 0)$ be the process that counts the number of excursions above level t that hit level T within the excursion $\epsilon$ . Duquesne and Le Gall in [Reference Duquesne and Le Gall10] described the distribution of $Z^T$ under $\texttt{n}_T$ and proved the following.
Theorem 5.1. Under $\texttt{n}_T$ the process $(Z_t^T,0\leq t < T)$ is a time-inhomogeneous Markov process whose law is characterised by the following identities. For every $\lambda>0$ ,
and if $0\leq t{<} t' {<}T$ ,
In essence, the second part of the above theorem shows that $Z^T$ has the branching property. However, temporal inhomogeneity means that it is a time-dependent continuous-time GW process. In [Reference Duquesne and Le Gall10] it is moreover shown that, conditionally on $L_\sigma^t$ under $\texttt{n}_T$ , where $\sigma$ is the length of the excursion $\epsilon$ , $Z_t^T$ is Poisson distributed with intensity $u_{T-t}(\infty)L_\sigma^t$ . Thinking of $L_\sigma^t$ as the mass at time t in the tree of descendants of the prolific individual in the initial population that the excursion codes, we thus have a Poisson embedding of the number of prolific descendants of that one individual within the excursion.
The time-dependent continuous-time GW process in the theorem can also be characterised as follows. At time 0 we start with one individual. Then the law of the first branching time, $\gamma_T$ , is given by
and, conditionally on $\gamma_T$ , the probability-generating function of the offspring distribution is
The offspring distribution when a split occurs at height t in the excursion (equivalently, time t in the underlying genealogical tree), say $(p^{T-t}_k, k\geq 0)$ , is explicitly given by the following. We have $p^{T-t}_0=p^{T-t}_1=0$ , and for $k\geq 2$ ,
Using (5.1) we can compute the rate of branching at any height t in the excursion. First, it is not hard to see that
Hence, the rate is
Again thinking of $L^t_\sigma$ , $t < T$ , as the mass of the tree coded by the excursion, and noting that not all of this mass is prolific, we would like to characterise the non-T-prolific mass that has ‘immigrated’ along the path of the prolific tree. We expect this to be a CSBP conditioned to die before time T. Using (1.1), we know that the probability that the process dies up to time T is given by
where we assume Grey’s condition (1.5) to ensure that the above conditioning makes sense. A simple application of the Markov property tells us that the law of X conditioned to die out by time T can be obtained by the following change of measure:
Indeed, using the semigroup property of $u_t$ , $t\geq 0$ , it is not hard to verify that the right-hand side above is a martingale. We would like to understand how to characterise the evolution of the process $(X, \mathbb{P}^T_x)$ , $x>0$ , a little better as the change of measure is time inhomogeneous.
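The semigroup property invoked here, namely $u_t(u_{T-t}(\infty)) = u_T(\infty)$ , can be checked by hand in a simple case. The sketch below assumes the critical quadratic mechanism $\psi(\theta)=\beta\theta^2$ , for which the standard closed form is $u_t(\theta) = \theta/(1+\beta\theta t)$ and $u_t(\infty) = 1/(\beta t)$ ; this choice of $\psi$ and the numerical values are illustrative assumptions, not taken from the text. Under this property, $\mathbb{E}_x[\text{e}^{-X_t u_{T-t}(\infty)}] = \text{e}^{-x u_T(\infty)}$ does not depend on t, which is the constancy of expectation needed for the martingale property.

```python
import math

def u(t, theta, beta=1.0):
    """Closed form of u_t(theta) for psi(theta) = beta * theta^2 (assumed example).

    theta may be math.inf, in which case t must be strictly positive.
    """
    if math.isinf(theta):
        return 1.0 / (beta * t)
    return theta / (1.0 + beta * theta * t)

T = 2.0
for t in (0.1, 0.7, 1.5, 1.9):
    v = u(T - t, math.inf)     # v(t) = u_{T-t}(infinity)
    composed = u(t, v)         # u_t(u_{T-t}(infinity))
    print(t, composed, u(T, math.inf))
```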
To this end, we again appeal to Itô’s formula. Denote $v(t)\,:\!=u_{T-t}(\infty)$ ; then, for non-negative, twice-differentiable, and compactly supported functions f, after a routine, albeit lengthy, application of Itô’s formula we get, for $t\geq 0$ and $x>0$ ,
where $M_t$ , $t\geq 0$ , represents the martingale terms. Taking expectations, we get
Gathering terms, making use of the expression for $\psi$ in (1.2), and that
we have, for $t\geq 0$ and $x>0$ , the Dynkin formula
where the infinitesimal generator is given by
For comparison, consider the generator of a CSBP with Esscher-transformed branching mechanism $\psi_\lambda$ , which is given by
for suitably smooth and integrable functions f. Recall that the CSBP with generator (5.3) is subcritical providing $\psi'(\lambda)>0$ and, taking account of (1.3), the greater this value, the ‘more subcritical’ it becomes. It appears that $\hat{\mathcal{L}}_{T-t}$ has the form of an Esscher-transformed branching mechanism based on $\psi$ where the parameter shift is controlled by $u_{T-t}(\infty)$ , which explodes as $t\to T$ . Said another way, if we define $V^T_t(\theta)$ , $0\leq t{<}T$ , $x,\theta\geq 0$ , as the exponent satisfying
then
Recalling that we are assuming Grey’s condition (1.5) for a (sub)critical process, we note from (1.1) that
In that case, the density in (5.2) tends to unity as $T\to\infty$ .
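For a concrete feel for the decay of $u_T(\infty)$ , the following sketch integrates the evolution equation numerically and compares with a closed form. It assumes the evolution equation has the standard form $\partial u_t(\theta)/\partial t = -\psi(u_t(\theta))$ with $u_0(\theta)=\theta$ , and takes the critical quadratic mechanism $\psi(\theta)=\beta\theta^2$ as a hypothetical example, for which $u_t(\theta)=\theta/(1+\beta\theta t)$ and $u_T(\infty) = 1/(\beta T)\to 0$ . The function names and step count are our own choices.

```python
import math

def psi(u, beta=1.0):
    """Assumed example mechanism psi(u) = beta * u^2 (critical, Grey's condition holds)."""
    return beta * u * u

def u_closed(t, theta, beta=1.0):
    """Closed-form solution of du/dt = -psi(u), u_0 = theta, for the quadratic psi."""
    if math.isinf(theta):
        return 1.0 / (beta * t)
    return theta / (1.0 + beta * theta * t)

def u_rk4(t, theta, beta=1.0, steps=2000):
    """Classical fourth-order Runge-Kutta integration of du/dt = -psi(u)."""
    h = t / steps
    u = theta
    for _ in range(steps):
        k1 = -psi(u, beta)
        k2 = -psi(u + 0.5 * h * k1, beta)
        k3 = -psi(u + 0.5 * h * k2, beta)
        k4 = -psi(u + h * k3, beta)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u

numeric = u_rk4(2.0, 5.0)
exact = u_closed(2.0, 5.0)
# u_T(infinity) for growing horizons T: decays like 1 / (beta * T).
extinction_exponents = [u_closed(T, math.inf) for T in (1.0, 10.0, 100.0)]
print(numeric, exact, extinction_exponents)
```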
We conclude this section by returning to Theorem 2.2. The discussion in this section shows that in the (sub)critical case the components of the SDE (2.5) mimic precisely the description of the T-skeleton in the previous section. In particular, the first three integrals in (2.5) indicate that once mass is created in the SDE, it evolves as a time-dependent CSBP with generator $\hat{\mathcal{L}}_{T-t}$ . Moreover, the evolution of the skeleton $Z^T$ as described in (2.5), matches precisely the dynamics of the T-prolific skeleton described in the previous section (for which we preemptively used the same notation), which is a time-dependent continuous-time GW process. Indeed, the branching rate and the time-dependent offspring distribution of both match.
Remark 5.1. It is important to note that even though the time-dependent T-prolific skeleton is inspired by the height process, the description does not require $\psi$ to be a (sub)critical branching mechanism. Indeed, only requiring Grey’s condition to be satisfied ensures that the branching rate and offspring distribution of this section are well defined. Similarly, we can apply the change of measure in (5.2) for any CSBP satisfying (1.5), and get a time-dependent CSBP with generator $\hat{\mathcal{L}}_{T-t}$ . Although the results of Theorem 2.2 were motivated by [Reference Duquesne and Le Gall10], we can see that the theorem can be stated in a more general setting, and thus extends the existing family of finite-horizon decompositions for CSBPs.
6. Finite-time horizon skeleton: proof of Theorem 2.2
Now that we understand that the mathematical structure of (2.5) is little more than a time-dependent version of (2.3), the reader will not be surprised by the claim that the proof of strong uniqueness to (2.5), as well as parts (i) and (ii) of Theorem 2.1, pass through almost verbatim, albeit needing some minor adjustments for additional time derivatives of $u_{T-t}(\infty)$ in, e.g. (4.3) and (4.6), which plays the role of $\lambda$ . To avoid repetition we simply leave the proof of these two parts as an exercise for the reader.
On the event $\{Z_0 =0\}$ , which is concurrent with the event $\{Z_t = 0, \, 0\leq t{<}T\}$ , close inspection of (2.5) allows us to note that $\Lambda$ is generated by an SDE with time-varying coefficients. Indeed, standard arguments show that, conditional on $\{Z_0 =0\}$ , $\Lambda$ is a time-inhomogeneous Markov process.
Suppose that we write $\mathbf{P}^T_{x,n}$ , $x\geq 0, n\in\mathbb{N}_0$ , for the Markov probabilities corresponding to the solution of (2.5). Moreover, we will again abuse this notation in the spirit of (4.2) and write $\mathbf{P}^T_x$ , $x\geq 0$ , when $Z^T_0$ is randomised to be an independent Poisson random variable with rate $u_T(\infty)x$ . We can use parts (i) and (ii) of Theorem 2.2, together with (5.2), to deduce that
This tells us that the semigroups of $\Lambda$ conditional on $\{Z_0 =0\}$ and X conditioned to become extinct by time T agree. Part (iii) of Theorem 2.2 is thus proved.
7. Thinning the skeleton to a spine: proof of Theorem 2.3
The aim of this section is to recover the unique solution to (1.7) as a weak limit of (2.5) in the sense of Skorokhod convergence. To this end, we assume throughout the conditions of Theorem 2.3, in particular that $\psi$ is a critical or subcritical branching mechanism and Grey’s condition (1.5) holds.
There are three main reasons why we should expect this result, and these three reasons pertain to the three structural features of the skeleton decomposition: the feature of Poisson embedding, the Galton–Watson skeleton, and the branching immigration from the skeleton with an Esscher-transformed branching mechanism. Let us dwell briefly on these heuristics.
First, let us consider the behaviour of the skeleton $(Z_t^T, t{<}T)$ as $T\to\infty$ . As we are assuming that $\psi$ is a (sub)critical branching mechanism, it holds that $\lim_{T\rightarrow\infty}u_T(\infty) = 0$ . Recall that $Z^T_0\sim \textrm{Po}(u_T(\infty)x)$ , i.e. $Z^T_0$ is independent and Poisson distributed with parameter $u_T(\infty)x$ , and hence that conditioning on survival to time T in the skeletal decomposition is tantamount to conditioning on the event $\{Z^T_0\geq 1\}$ . We thus see that
We thus see that the probabilities (7.1) all tend to zero unless $k=1$ , in which case the limit is unity. Moreover, Theorem 2.2(ii) and (iii) imply that the law of $(\Lambda^T_t, 0\leq t{<}T)$ conditional on $(\mathcal{F}^{\Lambda_t^T}\cap\{Z^T_0\geq 1\}, 0\leq t{<}T)$ corresponds to the law of the $\psi$ -CSBP, X, conditioned to survive until time T. Intuitively, then, one is compelled to believe that, in law, there is asymptotically a single skeletal contribution to the law of X conditioned to survive.
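The limiting behaviour of the conditioned initial skeleton can be illustrated numerically. The sketch below assumes that the probabilities in (7.1) are those of a Poisson variable with parameter $\mu = u_T(\infty)x$ conditioned to be at least 1, as the preceding discussion suggests; the function name and the values of $\mu$ are hypothetical. As $\mu\downarrow 0$ the mass concentrates on $k=1$ .

```python
import math

def truncated_poisson_pmf(k, mu):
    """P(N = k | N >= 1) for N ~ Poisson(mu), k >= 1.

    Uses expm1 so that 1 - exp(-mu) is computed accurately for small mu.
    """
    return math.exp(-mu) * mu ** k / (math.factorial(k) * (-math.expm1(-mu)))

# mu plays the role of u_T(infinity) * x, which tends to 0 as T grows.
for mu in (1.0, 1e-1, 1e-3, 1e-6):
    print(mu, [truncated_poisson_pmf(k, mu) for k in (1, 2, 3)])
```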
Second, considering (5.1), it follows from l’Hospital’s rule that the rate at which the aforementioned most recent common ancestor branches begins to slow down, since
What we are thus observing is a thinning, in the weak sense, of the skeleton in terms of the number of branching events.
Third, we consider the mass that immigrates from the skeleton. For a fixed T, it evolves as a $\psi$ -CSBP conditioned to die before time T. We recall that conditioning to die before time T is tantamount to the change of measure given in (5.2). It is easy to see that, as $T\to\infty$ , the density in this change of measure converges to unity and hence immigrating mass, in the weak limit, should have the evolution of a $\psi$ -CSBP.
With all this evidence in hand, Theorem 2.3 should now take on a natural meaning. We give its proof below.
Proof of Theorem 2.3. According to Theorem 2.5 on p. 167 of [Reference Ethier and Kurtz16, Chapter 4], if E is a locally compact and separable metric space, $\mathcal{P}^T: = (\mathcal{P}_t^T, t\geq 0)$ , $T>0$ , is a sequence of Feller semi-groups on $C_0(E)$ (the space of continuous functions on E vanishing at $\infty$ , endowed with the supremum norm), $\mathcal{P}: = (\mathcal{P}_t, t\geq 0)$ is a Feller semi-group on $C_0(E)$ such that, for $f\in C_0(E)$ , with respect to the supremum norm on the space $C_0(E)$ ,
and moreover, $(\nu^T,T>0)$ is a sequence of Borel probability measures on E such that $\lim_{T\to\infty}\nu^T =\nu$ weakly for some probability measure $\nu$ ; then, with respect to the Skorokhod topology on $\mathbb{D}([0,\infty), E)$ , $\Xi^T$ converges weakly to $\Xi$ , where $(\Xi^T,T>0)$ are the strong Markov processes associated to $(\mathcal{P}^T, T>0)$ with initial law $(\nu^T,T>0)$ and $\Xi$ is the strong Markov process associated to $\mathcal{P}$ with initial law $\nu$ , respectively.
Note that such weak convergence results would normally require a tightness criterion; however, having the luxury of (7.2), where $\mathcal{P}$ is a Feller semi-group, removes this condition and this will be the setting in which we are able to apply the conclusion of the previous paragraph.
Fix $t_0>0$ . We want to prove the weak convergence result in the finite time window $[0,t_0]$ . In order to introduce the role of $\mathcal{P}^T$ , $T>0$ , in our setting, we will abuse yet further the previous notation and define $\mathbf{P}^T_{x,n, s}$ , $x\geq 0$ , $n\geq 0$ , $s\in[0,t_0]$ to be the Markov probabilities associated to the three-dimensional process $(\Lambda_t, Z_t, \tau_t)$ , $0\leq t\leq t_0$ , whenever $t_0{<}T$ , where $(\Lambda_t, Z_t)$ , $0\leq t\leq t_0$ , is the weak solution to (2.5), and $\tau_t: = t$ , $0\leq t\leq t_0$ . Consistently with the previous notation, we have $\mathbf{P}^T_{x,n, 0} = \mathbf{P}^T_{x,n}$ , $x\geq 0$ , $n\geq 0$ . Now define the associated time-dependent semi-group for the three-dimensional process, for $t\geq 0$ and $f\in C_0([0,\infty)\times\mathbb{N}_0\times[0,\infty))$ , such that $\texttt{P}^T_t[\,f](x,n,s)=f(x,n,s)$ when $T\leq t_0$ , and when $T> t_0$ we have
for $(x,n,s)\in[0,\infty)\times\mathbb{N}_0\times[0,t_0]$ , and $\texttt{P}^T_t[\,f](x,n,s): = f(x,n,s)$ for $(x,n,s)\in[0,\infty)\times\mathbb{N}_0\times(t_0,\infty)$ . We take $\mathcal{P}^T=\texttt{P}^T$ . In order to verify the Feller property of $\texttt{P}^T_t[\,f](x,n,s)$ we need to check two things (cf. Proposition 2.4 in Chapter III of [Reference Revuz and Yor41]):
(i) For each $t\geq 0$ , the function $(x,n,s)\mapsto \texttt{P}^T_t[\,f](x,n,s)$ belongs to $C_0([0,\infty)\times\mathbb{N}_0\times[0,\infty))$ for any f in that space.
(ii) For all $f\in C_0([0,\infty)\times\mathbb{N}_0\times[0,\infty))$ and for each $(x,n,s)\in [0,\infty)\times\mathbb{N}_0\times[0,\infty)$ , we have $\lim_{t\downarrow s}\texttt{P}^T_t[\,f](x,n,s) = f(x,n,s)$ .
Note that when $T\leq t_0$ , or $s\geq t_0$ , or $t\leq s\leq t_0$ , we have $\texttt{P}^T_t[\,f](x,n,s)=f(x,n,s)$ . Since $f\in C_0([0,\infty)\times\mathbb{N}_0\times[0,\infty))$ , both (i) and (ii) are trivially satisfied. We can also notice that the case when $T>t_0$ , $s\leq t_0$ and $t\geq t_0$ reduces to the case of $t=t_0$ , hence in order to show the Feller property of $\texttt{P}^T_t[\,f](x,n,s)$ we can restrict ourselves to the case of $s\leq t\leq t_0{<}T$ .
By the denseness of the subalgebra generated by exponential functions (according to the uniform topology) in $C_0(E)$ , it suffices to check, for (i), that
belongs to $C_0([0,\infty)\times\mathbb{N}_0\times[0,\infty))$ and, for (ii), that
To this end, note that
In order to evaluate the expectation on the right-hand side above, we want to work with an appropriate representation of the unique weak solution to (2.5). We shall do so by following the example of how the weak solution to (2.3) was identified in the form (4.1).
As before, we need to introduce additionally marked versions of the Poisson random measures ${\texttt N}_T^{1}$ and ${\texttt N}_T^{2}$ , as well as an additional Poisson random measure ${\texttt N}_T^*$ . We will insist that Poisson random measure $ {\texttt N}^1_T({\text{d}} s, {\text{d}} r, {\text{d}}{j}, {\text{d}}\omega)$ on $[0,T)\times[0,\infty) \times\mathbb{N}_0\times\mathbb{D}([0,\infty), \mathbb{R})$ has intensity $\text{d} s\otimes r \text{e}^{-u_{T-s}(\infty) r} \Pi(\text{d}r) \otimes\sharp(\text{d}j)\otimes \mathbb{P}^{T-s}_r({\text{d}} \omega)$ , Poisson random measure $ {\texttt N}^{2}_T(\text{d}s,\text{d}r,\text{d}k,\text{d}j, {\text{d}}\omega)$ on $[0,T)\times[0, \infty)\times {\mathbb{N}}_0\times\mathbb{N}\times \mathbb{D}([0,\infty), \mathbb{R})$ has intensity
and Poisson random measure ${\texttt N}^*_T({\text{d}} s, {\text{d}}{j}, {\text{d}} \omega)$ has intensity $2\beta{\text{d}} s\otimes \sharp({\text{d}} j)\otimes\mathbb{Q}^{T-s}({\text{d}}\omega)$ on $[0,T)\times\mathbb{N}_0\times\mathbb{D}([0,\infty), \mathbb{R})$ , where $\mathbb{Q}^T$ is the excursion measure associated to $\mathbb{P}^T_r$ , $r\geq 0$ , satisfying
for $0\leq t{<}T$ , where $V^T_t$ was defined in (5.4). To recall some of the notation used in these rates, see (2.4) and (2.6).
If the pair $(\Lambda, Z)$ has law $\mathbf{P}^T_{x,n}$ , then we can write
where X is autonomously independent with law $\mathbb{P}^T_x$ and, given ${\texttt N}^{2}$ , D is the uniquely identified (up to almost sure modification) ‘dressed skeleton’ described by
where $Z_0 = n$ . The verification of this claim follows almost verbatim that for (2.3), albeit with the obvious change to take account of the time-varying rates. We therefore omit the proof and leave it as an exercise for the reader.
With the representation (7.6), as Z is piecewise constant we can condition on the sigma-algebra generated by ${\texttt N}_T^{2}$ and show, using Campbell’s formula in between the jumps of Z, that, for $0\leq t{<}T$ , $\gamma, \theta\geq 0$ , $x\geq 0$ , and $n\in\mathbb{N}_0$ ,
where
and, for $\lambda, z \geq 0$ ,
Given the identities (7.5) and (7.7), the two required verifications in (7.3) and (7.4) follow easily as a direct consequence of the continuity and bounded convergence in (7.7).
The target semigroup $\mathcal{P}$ on $f\in C_0([0,\infty)\times\mathbb{N}_0\times[0,\infty))$ is defined as follows. For fixed $n\in\mathbb{N}_0$ , $x\geq 0$ , let $\mathbb{P}^{(n)}_{x}$ be the law of the homogeneous Markov process described by the weak solution to
with W, N, and $N^{(*,n)}$ Poisson random measures on $[0,\infty)^2\times\mathbb{D}([0,\infty), \mathbb{R})$ with intensity measure $n{\text{d}} s\otimes r\Pi({\text{d}} r)\otimes\mathbb{P}_r({\text{d}}\omega)$ . Note that, at no detriment to consistency, $\mathbb{P}^{(0)}_{x}$ can be replaced by $\mathbb{P}_{x}$ . The role of $\mathcal{P}_t$ is then played by the semi-group $\texttt{P}_t^\uparrow$ given by
for $(x,n,s)\in[0,\infty)\times\mathbb{N}_0\times[0,t_0]$ , and $\texttt{P}^\uparrow_t[\,f](x,n,s): = f(x,n,s)$ otherwise. Here, $f\in C_0([0,\infty)\times\mathbb{N}_0\times[0,\infty))$ and $\tau_t=t$ , as above. Notice that $(X, \mathbf{P}^{(n)}_{x})$ is a branching process with immigration, whose Laplace transform is given by
From this, it is easily seen that $\texttt{P}^\uparrow_t$ is Feller as well.
Lastly, for each $T\geq 0$ we take $\nu^T$ as the measure on $[0,\infty)\times\mathbb{N}_0\times[0,\infty)$ given for each $x\geq 0$ by $\delta_x\otimes \pi^{T,x} \otimes \delta_0$ , with $ \pi^{T,x}(\!\cdot\!) = \sum_{n\geq 1}\varpi_n^{T,x} \delta_n(\!\cdot\!). $ Recall from (7.1) that $\pi^{T,x}$ converges weakly, as $T\to\infty$ , to the measure $\delta_1(\!\cdot\!)$ on $\mathbb{N}_0$ , hence $\nu^T$ converges weakly to $\nu\,:\!=\delta_x\otimes \delta_1 \otimes \delta_0$ . Thus, in order to invoke Theorem 2.5 of [Reference Ethier and Kurtz16, Chapter 4], we just need to check the analogue of (7.2) in our setting.
To this end, notice first that we can restrict ourselves to $0\leq s\leq t\leq t_0$ , since when $s>t_0$ , $\texttt{P}^\uparrow_t[\,f](x,n,s)=\texttt{P}^T_t[\,f](x,n,s)$ by definition, and the case when $0\leq s\leq t_0\leq t$ reduces to the case when $t=t_0$ . Then, note from (2.6) that $q^T\to0$ as $T\to\infty$ , and this yields that, under $\mathbf{P}^T_{0,n}$ , the process Z converges in probability uniformly on [0, t] as $T\to\infty$ (cf. Theorem 6.1, Chapter 1, p. 28 of [Reference Ethier and Kurtz16]) to the constant process $Z_s\equiv n$ , $s\leq t$ . Referring back to (7.7), the continuity in T of the deterministic quantities as they appear on the right-hand side and the previously mentioned uniform convergence of $(Z, \mathbf{P}^T_{0,n})$ together imply that, for $x\geq 0$ , $0\leq s\leq t\leq t_0$ , $n\in\mathbb{N}_0$ ,
where
To conclude, it is thus enough to prove that this convergence holds uniformly in $x\geq 0$ , $0\leq s\leq t$ , $n\in\mathbb{N}_0$ , where $t\leq t_0$ . Consider fixed $R>0$ and $N\in \mathbb{N}$ . Since $V_t^T(\gamma)$ defined above is non-negative and, for each $n\in \mathbb{N}_0$ , $Z_t\geq Z_0=n$ , $t\geq0$ , almost surely under $\mathbf{P}^T_{0,n}$ , for all $T>0$ , using the triangle inequality, we have
where we have set:
and $B^N =2 \text{e}^{-\theta N}$ . First, it is not hard to see that
The identity ${\partial u_s(\theta)}/{\partial \theta}= \exp\big\{-\int_0^s\psi'(u_r(\theta))\,{\text{d}} r\big\}$ (see (12.12) in [Reference Kyprianou24, Chapter 12]) and the fact that $\psi'(\theta)\geq 0$ allow us to estimate $|u_{t-s}(\gamma+ u_{T-t}(\infty))- u_{t-s}(\gamma)|$ by $u_{T-t}(\infty)$ . Recalling that $u_T(\infty)\rightarrow 0$ as $T\rightarrow\infty$ , it follows that $A_R(T)$ tends to 0 as $T\to \infty$ , for each $R>0$ . Next, since $(s,\gamma)\mapsto u_s(\gamma)$ is increasing in $\gamma$ and decreasing in s, we have
which, for T sufficiently large, is bounded from below by $u_t(\gamma)/2>0$ . Fix $\varepsilon>0$ . Choosing $R>0$ such that $\text{e}^{-R u_t(\gamma)/2}+ \text{e}^{-R u_{t}(\gamma) } \leq \varepsilon$ , we thus get
With regard to the term $B_N(T)$ , we have
The first term on the right-hand side above is bounded by
and hence goes to 0 for each N as $T\to \infty$ . On the other hand, as a function of $(Z_s,s\leq t)$ , the expression inside the expectation in the second term of (7.8) is bounded and continuous with respect to the Skorokhod topology (recall that Skorokhod continuity is preserved for Z under the operation of supremum over finite time horizons). Moreover, it vanishes when $ Z_s\equiv n$ , $0\leq s\leq t$ . This implies that this term goes to 0 as well. Finally, the expression whose absolute value we take in the third term of (7.8) is bounded by 1, and vanishes unless Z jumps at least once on [0,s]. This shows that the last term is bounded by $ \max_{n\leq N} \sup_{s\leq t} \mathbf{P}^{T-s}_{0,n} ( \sup_{w\leq t}\Delta Z_w>0)$ , which goes to 0 when $T\to \infty$ . Note that for all three terms in (7.8), we are using the fact that if $g(T)\geq 0$ is continuous in T and $\lim_{T\to\infty}g(T) = 0$ then, for each $\varepsilon> 0$ and for $0< t\leq t_0$ , by choosing T sufficiently large, we have $\sup_{s\leq t}g(T-s)<\varepsilon$ . That is to say, $\lim_{T\to\infty}\sup_{s\leq t}g(T-s) = 0$ .
Putting the pieces together and choosing $N\in \mathbb{N}_0$ large enough that $B^N\leq \varepsilon$ , we thus get
Since $\varepsilon$ was arbitrary this shows the convergence of the semi-groups (7.2) in our setting which, together with the weak convergence of the initial configurations, gives the weak convergence of the associated processes on $[0,t_0]$ . And since $t_0>0$ was chosen arbitrarily, this completes the proof of Theorem 2.3.
Acknowledgements
Part of this work was carried out whilst AEK was visiting the Centre for Mathematical Modelling, Universidad de Chile, and JF was visiting the Department of Mathematical Sciences at the University of Bath; each is grateful to the host institution of the other for their support. DF was supported by a scholarship from the EPSRC Centre for Doctoral Training, SAMBa. JF was supported by Basal-Conicyt Centre for Mathematical Modeling AFB170001 and Millenium Nucleus MESCD. AEK was supported by EPSRC grant EP/L002442/1. The authors would also like to thank the anonymous referees whose extensive reading of earlier versions of this paper led to many improvements.