1. Motivation and background
Various biological forces interact with each other and jointly drive the evolution of populations. One important competing pair consists of selection and mutation. As early as 1937, Haldane [Reference Haldane14] put forward the concept of mutation–selection balance. The mathematical foundation of this subject was established by Crow and Kimura [Reference Crow and Kimura7], Ewens [Reference Ewens11], and Kingman [Reference Kingman20]. For more details on this topic, we refer to Bürger [Reference Bürger5, Reference Bürger6].
A simple setting is to consider a one-locus haploid infinite population with discrete generations under selection and mutation. The locus is assumed to have infinitely many possible alleles, which have continuous effects on a quantitative type. The continuum-of-alleles models were introduced by Crow and Kimura [Reference Crow and Kimura7] and Kimura [Reference Kimura17] and are used frequently in quantitative genetics.
Kingman [Reference Kingman18] suggested that the tendency for most mutations to be deleterious could be explained by assuming that the gene is independent before and after mutation. The paper [Reference Kingman19], which proposed Kingman's famous one-locus model, described this feature in terms of a ‘house of cards’; that is, a mutation destroys the biochemical house of cards built up by evolution. In the one-locus model, a population is characterised by its type distribution, which is a probability measure on [0,1], and any $x\in[0,1]$ is a type value. In Kingman’s setting, an individual with a larger type value is fitter, which means more productive. So the type value can also be referred to as a fitness value. Kingman’s model can be seen as the limit of a finite-population model; see [Reference Grange13].
Bürger [Reference Bürger4] generalised the mutation mechanism by allowing the gene after mutation to depend on the gene before mutation, and proved convergence in total variation. The genetic variation of the equilibrium distribution was computed and discussed. In [Reference Yuan22] the author of the present paper proposed a more general selection mechanism which can model general macroscopic epistasis, with the other settings the same as in Kingman’s model. This model was applied to the modelling of the Lenski experiment (see [Reference Gonzalez-Casanova, Kurt, Wakolbinger and Yuan12] for a description of the experiment).
There also exist models on the balance of selection and mutation in the setting of continuous generations. Bürger [Reference Bürger3] provided an exact mathematical analysis of Kimura’s continuum-of-alleles model, focusing on the equilibrium genetic variation. Steinsaltz et al. [Reference Steinsaltz, Evans and Wachter21] proposed a multi-locus model using a differential equation to study the ageing effect. Later on, recombination was incorporated into the model [Reference Evans, Steinsaltz and Wachter10]. The model of Betz et al. [Reference Betz, Dereich and Mörters2] generalised a continuous-time version of Kingman’s model and other models arising from physics.
However, to the best of the author’s knowledge, Kingman’s model has never been generalised to a random version. In this paper we will assume that the mutation probabilities of all generations form an independent and identically distributed (i.i.d.) sequence. Biologically, we think of a stable random environment such that the mutation probabilities vary with time but are independently sampled from the same distribution.
In Kingman’s model, condensation occurs if a certain positive proportion of the population travels to and condenses at the largest fitness value. This is due to the dominance of selection over mutation. In the random model proposed in this paper, we also consider the convergence of (random) fitness distributions to the equilibrium and the condensation phenomenon. Moreover, Kingman’s model has been revisited recently in terms of the travelling wave of mass to the largest fitness value [Reference Dereich and Mörters8]. The random model provides another example for consideration in this direction.
2. Models
2.1. Kingman’s model with time-varying mutation probabilities
Consider a haploid population of infinite size and discrete generations under the competition of selection and mutation. We use a sequence of probability measures $(P_n)=(P_n)_{n\geq 0}$ on [0,1] to describe the distribution of fitness values in the nth generation. We can assume, more generally, that the probability measures are supported on a finite interval, not necessarily [0,1]. But since only fitness ratios will be relevant (see [Reference Kingman19] or [Reference Yuan22] for a more explicit explanation), we adopt the setting of [0,1], which was used by Kingman [Reference Kingman19], and which is equivalent to general finite supports.
Individuals in the nth generation are children of the $(n-1)$ th generation. First of all, the fitness distribution of children is initially $P_{n-1}$ (an exact copy from the parents). Then selection takes effect, updating the fitness distribution from $P_{n-1}$ to the size-biased distribution
\begin{equation*}\frac{x\,P_{n-1}(dx)}{\int y\,P_{n-1}(dy)}.\end{equation*}
Here and henceforth, we use $\int$ to denote $\int_0^1.$ Basically, the new population is re-sampled from the existing population by using their fitness as a selective criterion. Next, each individual mutates independently with the same mutation probability, which we denote by $b_{n}$ , taking values in $[0,1).$ Each mutant has fitness value sampled independently from a common mutant distribution, which we denote by Q, a probability measure on [0,1]. Then the resulting distribution is the distribution of the nth generation:
(1) \begin{equation}P_{n}(dx)=(1-b_{n})\,\frac{x\,P_{n-1}(dx)}{\int y\,P_{n-1}(dy)}+b_{n}\,Q(dx).\end{equation}
The reason we exclude the case that $b_n$ equals 1 is that in this situation we have $P_{n}=Q$ ; that is, we have completely lost the accumulated evolutionary changes. This is not interesting either biologically or mathematically.
Expanding (1), we can also obtain
where
In particular, if $Q=\delta_0,$ the Dirac measure on $\{0\}$ , then $Q^k=\delta_0$ for any $k\geq 0$ .
When all the $b_n$ are equal to the same number $b\in[0,1),$ this is the model introduced by Kingman [Reference Kingman19]. In the general setting we allow the mutation probabilities to be different. We call it Kingman’s model with time-varying mutation probabilities, or the general model for short. We introduce a few more pieces of notation. Let M be the space of (nonnegative) Borel measures on [0,1] and $M_1$ the subspace of M consisting of probability measures. Let $M, M_1$ be endowed with the topology of weak convergence. We use $\stackrel{d}{\longrightarrow}$ to denote weak convergence. We say a sequence of measures $(u_n)$ converges in total variation to a measure u, and write $u_n\stackrel{TV}{\longrightarrow}u,$ if the total variation, defined as $\sup_B|u_n(B)-u(B)|$ where the supremum is taken over all Borel sets, converges to 0.
For any $u\in M_1$ , define
\begin{equation*}S_{u}\,{:\!=}\,\sup\{x\in[0,1]\,:\,u([x,1])>0\}.\end{equation*}
We interpret $S_{u}$ as the largest fitness value in a population with distribution u. Define $h\,{:\!=}\,S_{P_0}$ . It is not difficult to see that $S_{P_{n}}=\max\{S_{P_0}, S_Q\}$ for any $n\geq 1$ . Since we are interested in asymptotics, we may assume without loss of generality that $h\geq S_Q$ . Therefore $S_Q\leq h\leq 1.$
Note that the general model has parameters $(b_n)_{n\geq 1}, Q, P_0, h$ . Kingman’s model shares the same parameters, but with the $b_n$ all equal to b. We call $(P_n)$ the forward sequence or just the sequence. Although h is determined by $P_0$ , we still consider h as a parameter, as it will become clear later that for Kingman’s model and the random model considered in this paper, the limit of $(P_n)$ depends on $P_0$ only through h. This is the property of so-called global stability.
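To fix ideas, here is a minimal numerical sketch (not taken from the paper) of one generation of the update (1) on a discretised fitness grid; the grid, the mutant distribution Q, the starting distribution and the mutation probability below are illustrative choices.

```python
import numpy as np

# Minimal sketch (illustrative, not from the paper): one generation of the update (1)
# on a discretised fitness grid.
x = np.linspace(0.0, 1.0, 101)            # fitness values in [0, 1]
Q = np.ones_like(x) / x.size              # mutant distribution (uniform, for illustration)
P = np.ones_like(x) / x.size              # current fitness distribution P_{n-1}

def kingman_step(P, b, Q, x):
    """Selection (size-biasing by fitness) followed by mutation with probability b."""
    selected = x * P
    selected /= selected.sum()             # the size-biased distribution
    return (1.0 - b) * selected + b * Q    # the distribution P_n of equation (1)

P_next = kingman_step(P, b=0.2, Q=Q, x=x)
print(P_next.sum())                        # remains a probability vector (= 1 up to rounding)
```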
2.2. Convergence and condensation in Kingman’s model
Kingman [Reference Kingman19] proved the convergence of $(P_n)$ when all mutation probabilities are equal, i.e. $b_n=b$ for all $n\geq 1$ .
Theorem 1. (Kingman’s theorem, [Reference Kingman19].) 1. If $\int \frac{Q(dx)}{1-x/h}\geq b^{-1},$ then $(P_n)$ converges in total variation to
with $\theta_b$ , as a function of b, being the unique solution of
2. If $\int \frac{Q(dx)}{1-x/h}< b^{-1}$ , then $(P_n)$ converges weakly to
Note that $\mathcal{K}$ is uniquely determined by b, Q, h, but not by the choice of $P_0$ . In this sense $\mathcal{K}$ is a globally stable equilibrium. For simplicity, for any measure, say $\mu,$ its mass on a point x is denoted by $\mu(x)$ instead of $\mu(\{x\}).$ Then we say there is condensation at h in Kingman’s model if $Q(h)=0$ but $\mathcal{K}(h)>0$ . We call $\mathcal{K}(h)$ the condensate size if $Q(h)=0$ . In Case 1 above, there is no condensation. The condition
\begin{equation*}\int \frac{Q(dx)}{1-x/h}\geq b^{-1}\end{equation*}
is satisfied only if b is large and/or Q is fit (i.e., has more mass on larger values). It means that mutation is stronger than selection, so that the limit does not depend on $P_0$ at all.
In Case 2, the condition
\begin{equation*}\int \frac{Q(dx)}{1-x/h}< b^{-1}\end{equation*}
implies $Q(h)=0$ , but we have that $\mathcal{K}(h)>0$ . So there is condensation. In contrast to the first case, selection is favoured over mutation, so that the limit depends on $P_0$ through h. If $P_0(h)=0$ (implying $S_{P_n}=h$ and $P_n(h)=0$ for any n), a certain amount of mass
\begin{equation*}\mathcal{K}(h)=1-b\int \frac{Q(dy)}{1-y/h}>0\end{equation*}
travels to the largest fitness value h, by the force of selection.
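As a numerical illustration of this dichotomy (not taken from the paper; the grid, Q, h and the values of b below are ad hoc choices), the following sketch iterates Kingman's recursion and compares the simulated mass at h with the prediction of the criterion.

```python
import numpy as np

# Illustrative sketch of Kingman's dichotomy: Q is a discretised uniform law on [0, 0.5]
# (so S_Q = 0.5), P_0 = delta_h with h = 1, and b takes two illustrative values.
grid = np.concatenate([np.linspace(0.0, 0.5, 200), [1.0]])   # fitness grid, atom at h = 1
Q = np.where(grid <= 0.5, 1.0, 0.0)
Q /= Q.sum()
h = 1.0

def evolve(b, n_gen=2000):
    P = np.zeros_like(grid); P[-1] = 1.0                      # P_0 = delta_h
    for _ in range(n_gen):
        sel = grid * P
        sel /= sel.sum()                                      # selection (size-biasing)
        P = (1.0 - b) * sel + b * Q                           # mutation
    return P[-1]                                              # mass at h

mask = grid < h
crit = (Q[mask] / (1.0 - grid[mask] / h)).sum()               # approximates the integral in Theorem 1
for b in (0.3, 0.9):
    regime = "condensation" if crit < 1.0 / b else "no condensation"
    print(f"b={b}: criterion predicts {regime}; simulated mass at h = {evolve(b):.4f}")
```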
Next we introduce the random model, which is the main object of study in this paper.
2.3. Kingman’s model with random mutation probabilities
Let $(\beta_n)_{n\geq 0}$ be an i.i.d. sequence of random variables on a common probability space $(\Omega, \mathcal{F}, \mathbb{P})$ , taking values in [0,1) with common distribution $\mathcal{L}\in M_1$ . Kingman’s model with random mutation probabilities, or simply the random model, is defined by the following dynamical system:
(5) \begin{equation}P_{n}(dx)=(1-\beta_{n})\,\frac{x\,P_{n-1}(dx)}{\int y\,P_{n-1}(dy)}+\beta_{n}\,Q(dx).\end{equation}
The random model has parameters $(\beta_n), Q, P_0, h$ . It is a randomisation of Kingman’s model, as we can set each $\beta_n$ to equal b with probability 1.
We are interested in the convergence of $(P_n)$ to the equilibrium and in the phenomenon of condensation. Since we are dealing with random probability measures, i.e., random elements of $M_1,$ let us recall the definition of weak convergence in this context. Random (probability) measures $(\mu_n)$ supported on [0,1] converge weakly to a limit $\mu$ if and only if for any continuous function f on [0,1] we have
\begin{equation*}\int f(x)\,\mu_n(dx)\stackrel{d}{\longrightarrow}\int f(x)\,\mu(dx).\end{equation*}
We refer to [Reference Kallenberg16] for a reference on random measures. The definition of weak convergence for random measures stated in the follow-up paper [Reference Yuan23] is incorrect. But this does not affect anything there as the weak convergence results are all proved in this paper.
As the sequence $(P_n)$ is completely determined by $(\beta_n), Q, P_0$ , and h, the only randomness arises from $(\beta_n)$ . In the terminology of statistical physics, the weak limit of $(P_n)$ is an annealed limit, which is obtained given only the law of $(\beta_n)$ . A quenched limit, which would be obtained by conditioning on $(\beta_n)$ , does not exist unless $P_0=Q=\delta_0$ . A simple reason for nonexistence is that $P_n$ contains the component $\beta_nQ$ , which fluctuates persistently because $(\beta_n)$ is i.i.d. However, in Section 4.3 we will see that it is possible to obtain a quenched limit if the evolution is seen backwards.
For the particular case that $Q=\delta_0$ , we have
\begin{equation*}P_{n}(dx)=(1-\beta_{n})\,\frac{x^{n}P_0(dx)}{\int y^{n}P_0(dy)}+\beta_{n}\,\delta_0(dx), \qquad n\geq 1.\end{equation*}
From this, it is easily deduced that the sequence $(P_n)$ converges weakly to the random element $(1-\beta)\delta_h+\beta\delta_0$ , where $\beta$ is a random variable with law $\mathcal{L}$ , the common law of all the $\beta_n$ . So we assume from now on that $Q\neq\delta_0.$
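A simulation sketch of the random model (5) (the fitness grid, Q and the law $\mathcal{L}$ below are illustrative choices): along a single trajectory the measure keeps fluctuating, but statistics collected over independent trajectories stabilise, which is the annealed picture described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sketch of the random model (5): beta_n i.i.d., here uniform on [0.1, 0.5].
grid = np.concatenate([np.linspace(0.0, 0.5, 200), [1.0]])    # S_Q = 0.5, h = 1
Q = np.where(grid <= 0.5, 1.0, 0.0); Q /= Q.sum()

def trajectory(n_gen=500):
    P = np.zeros_like(grid); P[-1] = 1.0                      # P_0 = delta_h
    for _ in range(n_gen):
        beta = rng.uniform(0.1, 0.5)                          # i.i.d. sample from L (illustrative)
        sel = grid * P; sel /= sel.sum()
        P = (1.0 - beta) * sel + beta * Q                     # one step of (5)
    return P

# A single trajectory keeps fluctuating (no quenched limit), but statistics over
# independent trajectories stabilise: a Monte Carlo picture of the annealed limit.
mass_at_h = np.array([trajectory()[-1] for _ in range(200)])
print(f"mean mass at h: {mass_at_h.mean():.4f}, std: {mass_at_h.std():.4f}")
```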
3. Main results
3.1. Weak convergence
Recall that the sequence $(P_n)$ in the random model has parameters $(\beta_n)$ , Q, $P_0$ , and h, with $h=S_{P_0}.$ Then $(P_n)$ converges weakly to a globally stable equilibrium, in the sense that the limit depends on $P_0$ only through h. Recall that $\beta$ is a random variable with law $\mathcal{L}$ , the common law of the $\beta_n$ .
Theorem 2. For the random model (5), the sequence $(P_n)$ converges weakly to a random probability measure, denoted by $\mathcal{I}$ , whose distribution depends on $\mathcal{L}, Q, h$ but not on the choice of $P_0$ .
Remark 1. In [Reference Yuan23, p. 872], it is written that the distribution of $\mathcal{I}$ depends on $\beta, Q, h$ . This statement is true in the sense that the distribution of $\mathcal{I}$ depends on $\beta$ via its distribution. Here we make it clearer by replacing $\beta$ by $\mathcal{L}$ .
Remark 2. If we start with $P_0=\delta_h$ (recall that $h\in[S_Q,1]$ ), then all the $P_n$ are supported on $[0,S_Q]\cup\{h\}$ , which implies that the limit $\mathcal{I}$ is supported on the same set $[0,S_Q]\cup\{h\}$ . Moreover, we have either $\mathcal{I}(h)>0$ almost surely (a.s.) or $\mathcal{I}(h)=0$ a.s. (a justification is provided in Remark 12 in Section 4.4). In the latter case, $\mathcal{I}$ is supported only on $[0,S_Q]$ , and (the distribution of) $\mathcal{I}$ does not depend on h (see Theorem 3). Therefore, although we say h is a parameter of $\mathcal{I}$ , this should be understood in the sense that $\mathcal{I}$ is the weak limit of $(P_n)$ with $h=S_{P_0}.$ The limit $\mathcal{I}$ is introduced in Section 4.3, but the proof of weak convergence is deferred to a later stage, as it uses other main results, such as the condensation criterion for the random model.
3.2. Condensation criterion
The fact that either $\mathcal{I}(h)>0$ a.s. or $\mathcal{I}(h)=0$ a.s. allows us to give the precise definition of condensation, in line with that for Kingman’s model, as follows.
Definition 1. For the random model, we say there is condensation at the largest fitness value h if Q assigns zero mass at h (i.e., $Q(h)=0$ ) but the limiting measure $\mathcal{I}$ assigns positive mass at h (i.e., $\mathcal{I}(h)>0$ , a.s.).
Next we give the condensation criterion. If $h=S_Q$ , we write $\mathcal{I}_Q$ for $\mathcal{I}$ and $\mathcal{K}_Q$ for $\mathcal{K}$ .
Theorem 3. (Condensation criterion.) If there is no condensation at h, then $\mathcal{I}\stackrel{d}{=}\mathcal{I}_Q$ . The condensation criterion for $\mathcal{I}$ at h is as follows:
1. If $h=S_Q$ , then there is no condensation at h if
(6) \begin{equation}\mathbb{E}\left[\ln \frac{S_Q(1-\beta)}{\int y\mathcal{I}_Q(dy)}\right]<0.\end{equation}
2. If $h>S_Q$ , then there is no condensation at h if and only if
(7) \begin{equation}\mathbb{E}\left[\ln \frac{h(1-\beta)}{\int y\mathcal{I}_Q(dy)}\right]\leq 0.\end{equation}
Here
\begin{equation*}\mathbb{E}\left[\ln \frac{h(1-\beta)}{\int y\mathcal{I}_Q(dy)}\right]\end{equation*}
is well defined, takes values in $[\!-\infty,-\ln\int yQ(dy)]$ , and depends only on the marginal distributions of $\beta$ and $\mathcal{I}_Q.$
Remark 3. In fact, if there is no condensation at h, then $\mathcal{I},\mathcal{I}_Q$ are the same random probability measure, based on the definition of $\mathcal{I}$ introduced at the end of Section 4.3. But since here we do not have the definition yet, we write the weaker statement $\mathcal{I}\stackrel{d}{=}\mathcal{I}_Q.$
Remark 4. As the distribution of $\mathcal{I}_Q$ is determined by Q and $\mathcal L$ (the distribution of $\beta$ ),
depends only on Q and $\mathcal L$ . By Remark 13 in Section 4.4, we can only have
For the occurrence of condensation in the case where $h=S_Q$ , the fact that we cannot say anything when
can be better understood in Kingman’s model, which is a special random model. In this model,
becomes
By some simple computations using Theorem 1, the above display is equivalent to
But it covers cases with and without condensation. For full details see Appendix A, where the case $h>S_Q$ is also analysed.
We give some intuition for why Theorem 3 holds. Consider the unnormalised variant of the dynamical system that is given by
with $\overline{P}_0=P_0.$ By induction, it can be shown that
We can roughly think of the growth of $\overline{P}_n$ as contributed by two parts, the initial $P_0$ and the subsequently arriving distributions Q. If the initial distribution is supported on $[0,S_Q]$ , by Theorem 2, $P_i$ converges weakly to $\mathcal{I}_Q$ as $i\to\infty.$ Then the part of $\overline{P}_n$ contributed by the Qs grows at rate $\operatorname{gr}(Q)\,{:\!=}\,\mathbb{E}[\ln\int x\mathcal{I}_Q]$ (see (9)). In comparison, the largest fitness value h in $P_0$ can be assigned the growth rate $\operatorname{gr}(h)\,{:\!=}\,\mathbb{E}[\ln h(1-\beta)]$ (due to the term $(1-\beta_n)x\overline{P}_{n-1}(dx)$ in (8)). Then it is clear that the occurrence of condensation is determined by the comparison of $\operatorname{gr}(h)$ and $\operatorname{gr}(Q)$ . However, it is subtle when $\operatorname{gr}(h)=\operatorname{gr}(Q)$ : there is no condensation if $h>S_Q$ , and it is undetermined if $h=S_Q$ .
In the follow-up paper [Reference Yuan23], we provide a matrix representation for $\mathcal{I}_Q$ , so the condensation criterion can be written neatly (see [Reference Yuan23, Corollary 2, p. 877]). Moreover, using matrix analysis, we can compare the fitness of equilibria from different models (see [Reference Yuan23, Section 3.3-(3), pp. 878--879]). The challenging problem of finding a necessary and sufficient condition for the occurrence of condensation in the case $h=S_Q$ has not been dealt with anywhere and still remains open.
3.3. Invariant measure
We introduce the notion of invariant measure, which includes the limit $\mathcal{I}.$ We will use invariant measures heavily in the proofs.
Definition 2. (Invariant measure.) A random probability measure $\nu$ is invariant if it is supported on [0,1] and satisfies
(10) \begin{equation}\nu(dx)\stackrel{d}{=}(1-\beta)\,\frac{x\,\nu(dx)}{\int y\,\nu(dy)}+\beta\,Q(dx),\end{equation}
where $\beta$ is independent of $\nu$ . Clearly $\mathcal{I}$ is an invariant measure, since it is the weak limit of $(P_n)$ defined by (5).
Theorem 4. (Compoundness of invariant measures.) For any invariant measure $\nu$ , there exists a regular conditional distribution of $\nu$ on $S_\nu$ . Moreover, conditional on $S_\nu$ ,
\begin{equation*}\nu\stackrel{d}{=}\mathcal{I}, \quad \text{a.s.,}\end{equation*}
where $\mathcal{I}$ is the random probability measure introduced in Theorem 2 with parameters $\mathcal{L}$ , Q, $h=S_\nu$ and satisfies $\mathbb{P}(S_\mathcal{I}=S_\nu|S_\nu)=1$ , a.s.
Remark 5. Remark 2 says that if there is no condensation at h, then $\mathcal{I}$ is supported on $[0,S_Q]$ . Since $\mathcal{I}$ is an invariant measure, the above theorem implies that $\mathcal{I}\stackrel{d}{=}\mathcal{I}_Q.$ This assertion has been stated in Theorem 3.
Using the notion of invariant measures, we can solve a distributional equation in the following example. For a survey on distributional equations, we refer to Aldous and Bandyopadhyay [Reference Aldous and Bandyopadhyay1].
Example 1. Consider a particular case: Q is supported only on $\{c\}$ for some $c\in (0,1)$ , and $h\in (c,1)$ . Let $\nu$ be an invariant measure supported on $\{c\}\cup\{h\}$ . Then $\nu$ can be written as $\nu=X\delta_c+(1-X)\delta_h$ , where X is a random variable taking values in [0,1] and satisfies
\begin{equation*}X\delta_c+(1-X)\delta_h\stackrel{d}{=}(1-\beta)\,\frac{cX\,\delta_c+h(1-X)\,\delta_h}{cX+h(1-X)}+\beta\,\delta_c,\end{equation*}
with $\beta$ independent of X. The above display is equivalent to
\begin{equation*}X\stackrel{d}{=}\beta+(1-\beta)\,\frac{cX}{cX+h(1-X)}.\end{equation*}
We are interested in a necessary and sufficient condition for the above equation to have a solution X with $0\leq X<1$ a.s. (i.e., $\nu(h)>0$ a.s.). By Theorem 4, this is equivalent to saying that there is condensation at h. By Theorem 3, the necessary and sufficient condition is simply $\mathbb{E}[\ln (h(1-\beta)/c)]>0.$ Moreover, as such $\nu$ is unique (in distribution), the solution X is also unique (in distribution).
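A Monte Carlo sketch of this example follows; the parameter values and the law of $\beta$ are illustrative choices, and the recursion for X is simply iterated forwards from $P_0=\delta_h$, so the output only suggests how the criterion and the mass at h behave.

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo sketch of Example 1 (parameter values and the law of beta are illustrative).
# Starting from nu_0 = delta_h (i.e. X_0 = 0), the recursion for X = nu(c) obtained from (10)
# with Q = delta_c is iterated forwards.
def mean_mass_at_h(c, h, beta_low, beta_high, n_iter=300, n_samples=1000):
    mass = []
    for _ in range(n_samples):
        X = 0.0
        for _ in range(n_iter):
            beta = rng.uniform(beta_low, beta_high)
            X = beta + (1.0 - beta) * c * X / (c * X + h * (1.0 - X))
        mass.append(1.0 - X)                       # the mass nu puts on h
    crit = np.mean(np.log(h * (1.0 - rng.uniform(beta_low, beta_high, 100_000)) / c))
    return crit, float(np.mean(mass))

for c, h, b_low, b_high in [(0.3, 0.8, 0.1, 0.4), (0.7, 0.75, 0.3, 0.6)]:
    crit, m = mean_mass_at_h(c, h, b_low, b_high)
    print(f"c={c}, h={h}: E[ln(h(1-beta)/c)] = {crit:+.3f}, average mass at h ≈ {m:.4f}")
```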
The rest of the paper is organised as follows. Sections 4.1 and 4.2 provide necessary preparations. Sections 4.3 and 4.4 analyse the finite backward sequences, which are the main tool used in this paper. Section 4.5 proves Theorem 3. Section 4.6 analyses the invariant measures, and the results obtained there will be used in Section 4.7 to prove the weak convergence in Theorem 2. Section 4.8 is dedicated to the proof of Theorem 4.
4. Proofs
4.1. Relations between measures
We introduce the following notation to describe relations between measures:
1. For measures $u,v\in M$ , we say u is a component of v on [0,a] (resp. [0,a)), and we write $u\leq_a v$ (resp. $\leq_{a-}$ ), if
\begin{equation*}u(A)\leq v(A) \quad\text{ for any measurable set } A\subset [0,a] \ (\text{resp.}\ [0,a)).\end{equation*}
For random measures $\mu,\nu\in M$, we write $\mu\leq_a^d \nu$ if there exists a coupling $(\mu',\nu')$ with $\mu',\nu'\in M$ such that
(11) \begin{equation}\mu'\leq_a \nu' \text{ a.s. and }\,\, \mu'\stackrel{d}{=}\mu,\,\,\nu'\stackrel{d}{=}\nu.\end{equation}
The relation $\mu\leq_{a-}^d \nu$ is defined in a similar way.
2. For measures $(u_n)$ and u in M, we introduce the notation
\begin{equation*}u_n\leq_a\stackrel{TV}{\longrightarrow}u,\end{equation*}
which means that $u_n\leq_a u_{n+1}$ for any n, and $u_n$ converges in total variation to u. We define similarly $\leq_{a-}\stackrel{TV}{\longrightarrow}.$
3. For real-valued random variables $\xi, \eta,$ we write the well-known stochastic ordering as $\xi\preceq\eta$ , which holds if
\begin{equation*} \mathbb{P}(\xi\leq x)\geq \mathbb{P}(\eta\leq x) \qquad \forall\, x\in\mathbb{R}.\end{equation*}
4. For any $u\in M_1,$ let the distribution function of u be
\begin{equation*}D_{u}(x)\,{:\!=}\,u([0,x]) \qquad \forall x\in[0,1].\end{equation*}
For any $u, v\in M_1,$ we use the same notation $\preceq$ for the stochastic ordering and write $u\preceq v$ if $D_u(x)\geq D_v(x)$ for any $x\in[0,1]$ . This definition is natural, as $\xi\preceq \eta$ is equivalent to $u\preceq v$ , if u is the distribution of $\xi$ and v is the distribution of $\eta.$
Remark 6. We make a comment about the relationship between $\leq_{a-}$ and $\preceq.$ For two probability measures $u,v\in M_1$ , assume that $S_u=S_v=a$ ; then $u\leq_{a-}v$ implies that $v\preceq u.$ But the converse is not true.
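The following small sketch (illustrative only; the grid and the two measures are ad hoc choices, not from the paper) checks these orderings for measures discretised on a common grid and exhibits a counterexample to the converse implication in Remark 6.

```python
import numpy as np

# Illustrative check of the orderings of this subsection on a three-point grid.
grid = np.array([0.0, 0.5, 1.0])
u = np.array([0.0, 0.3, 0.7])   # u = 0.3*delta_{0.5} + 0.7*delta_1,  S_u = 1
v = np.array([0.5, 0.0, 0.5])   # v = 0.5*delta_0   + 0.5*delta_1,    S_v = 1

def component_below(u, v, a, grid):
    """u <=_{a-} v : u(A) <= v(A) for every set A contained in [0, a)."""
    mask = grid < a
    return bool(np.all(u[mask] <= v[mask]))

def stoch_leq(u, v):
    """u ⪯ v : D_u(x) >= D_v(x) for all x, i.e. u is stochastically smaller."""
    return bool(np.all(np.cumsum(u) >= np.cumsum(v)))

# Converse of Remark 6 fails: v ⪯ u holds, yet u is not a component of v on [0, 1).
print(stoch_leq(v, u))                     # True
print(component_below(u, v, 1.0, grid))    # False
```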
Remark 7. If we use the notation $\leq_{a}$ , $\leq_{a-}$ , $\leq_{a}\stackrel{TV}{\longrightarrow}$ , $\leq_{a-}\stackrel{TV}{\longrightarrow}$ , $\preceq$ to describe the relations between random measures, it should be understood that they hold in the almost sure sense, or even in the pointwise sense (i.e., for every $\omega\in\Omega$ ), if possible.
Similarly, if we use $\leq$ , $<$ , $\geq$ , $>$ , $=$ , $\neq$ to compare random variables, they should be understood in the almost sure sense or in the pointwise sense.
4.2. Three sequences
To study the asymptotic behaviour of $(P_n)$ , we also introduce $(P_n^{\prime}), (P_n^{\prime\prime})$ so that the three forward sequences correspond respectively to
The parameters of $(P_n^{\prime})$ and $(P_n^{\prime\prime})$ will be specified when they are used. In each application the two sequences will converge weakly, and $(P_n)$ will be compared to them (or to one of them) to show that $(P_n)$ also converges weakly. The first place where this technique is used is in Section 4.4.
Using (2), we write
(12) \begin{equation}P_{n}=\mathcal{M}_n+\mathcal{W}_n,\end{equation}
with
and
Therefore $\mathcal{M}_n$ is the contribution to $P_n$ made by $P_0$ , while $\mathcal{W}_n$ is the contribution by the Qs.
Similarly we introduce
with $\mathcal{M}_n^{\prime}, \mathcal{W}_n^{\prime}, \mathcal{M}_n^{\prime\prime}, \mathcal{W}_n^{\prime\prime}$ defined correspondingly.
4.3. Introducing the finite backward sequences
4.3.1. The general model
We introduce the finite backward sequence $( P_j^n)=( P_j^n)_{0\leq j\leq n}$ for the general model, which has parameters $n, (b_j)_{1\leq j\leq n},Q, P_n^n, h$ with $S_{P_n^n}=h$ :
(15) \begin{equation}P_{j-1}^n(dx)=(1-b_{j})\,\frac{x\,P_{j}^n(dx)}{\int y\,P_{j}^n(dy)}+b_{j}\,Q(dx), \qquad 1\leq j\leq n.\end{equation}
Here h,Q are from the general model and $P_n^n$ can be any measure in $M_1$ satisfying $S_{P_n^n}=h$ . The $(b_j)_{1\leq j\leq n}$ are the first n mutation probabilities in the general model. Here we use the index j to indicate that we are dealing with a finite backward sequence.
The sequence is backward in the sense that we use $b_n$ to generate $P_{n-1}^n$ from $P_{n}^n$ , use $b_{n-1}$ to generate $P_{n-2}^n$ from $P_{n-1}^n$ , and so on. That is, the $(b_j)$ are used backwards and the $(P_j^n)$ are generated backwards. The advantage of taking the backward approach is that $( P_j^n)$ converges as n tends to infinity, in contrast to the forward sequence.
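The contrast between the two directions can be seen numerically. In the sketch below (the grid, Q, h and the realisation of the mutation probabilities are illustrative choices), the backward value $P_0^n(h)$ stabilises as n grows, whereas the forward value $P_n(h)$ keeps moving, for the same fixed realisation of the mutation probabilities.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative sketch contrasting the forward and the backward use of the same mutation
# probabilities (the grid, Q, h and the b_j below are ad hoc choices).
grid = np.concatenate([np.linspace(0.0, 0.5, 100), [0.9]])    # S_Q = 0.5, h = 0.9
Q = np.where(grid <= 0.5, 1.0, 0.0); Q /= Q.sum()
b = rng.uniform(0.1, 0.5, size=400)                           # one fixed realisation b_1, ..., b_400

def step(P, bj):
    sel = grid * P; sel /= sel.sum()
    return (1.0 - bj) * sel + bj * Q

def backward_P0(n):                                           # P_0^n with P_n^n = delta_h
    P = np.zeros_like(grid); P[-1] = 1.0
    for j in range(n, 0, -1):                                 # apply b_n first, b_1 last
        P = step(P, b[j - 1])
    return P[-1]

def forward_Pn(n):                                            # P_n with P_0 = delta_h
    P = np.zeros_like(grid); P[-1] = 1.0
    for j in range(1, n + 1):                                 # apply b_1 first, b_n last
        P = step(P, b[j - 1])
    return P[-1]

for n in (50, 100, 200, 400):
    print(f"n={n:3d}:  P_0^n(h) = {backward_P0(n):.6f}   P_n(h) = {forward_Pn(n):.6f}")
```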
Lemma 1. In the general model, for the finite backward sequence with $P_n^n=\delta_h,$ $P_j^n$ converges in total variation to a limit, denoted by $\mathcal{G}_j=\mathcal{G}_{j,h}$ , as n goes to infinity with j fixed, such that
(16) \begin{equation}\mathcal{G}_{j}(dx)=(1-b_{j+1})\,\frac{x\,\mathcal{G}_{j+1}(dx)}{\int y\,\mathcal{G}_{j+1}(dy)}+b_{j+1}\,Q(dx), \qquad j\geq 0.\end{equation}
As a consequence, $\mathcal{G}_0:[0,1)^\infty\to M_1$ is a measurable function, with $\mathcal{G}_j=\mathcal{G}_0(b_{j+1},b_{j+2,\cdots})$ supported on $[0,S_Q]\cup \{h\}$ for any $j\geq0.$
Remark 8. We write $\mathcal{G}_{j,h}$ when h has to be specified for clarity. Otherwise we write $\mathcal{G}_j$ . This kind of abbreviation applies to other terms which will appear later.
Remark 9. Note that, by (16), the $\mathcal{G}_j(h)$ are either all zero or all strictly positive.
Proof. We prove a stronger version below:
It suffices to show that
as the $P_j^n$ are all supported on $[0,S_Q]\cup\{h\}$ .
First of all, $P_n^n=\delta_h\leq_{h-} P_n^{n+1}$ . Assume that for some $1\leq j\leq n$ we have $P_j^n\leq_{h-} P_j^{n+1}$ . By definition
Since $P_j^n\leq_{h-} P_j^{n+1}$ (and hence $P_j^{n+1} \preceq P_j^n$ ; see Remark 6), we have
and thus
From $P_j^n\leq_{h-} P_j^{n+1}$ and (19), we get $P_{j-1}^n\leq_{h-} P_{j-1}^{n+1}$ . Induction shows that
This completes the proof.
The monotonicity analysis in the above proof will be used many times in this paper, as it applies to both backward and forward sequences. An immediate application is the following: we can compare $(\mathcal{G}_j)$ and $(\mathcal{G}_j^{\prime})=(\mathcal{G}_{j,h'})$ for $S_Q\leq h< h'\leq 1$ with the same $(b_j), Q$ .
Corollary 1. Let $(\mathcal{G}_j)$ and $(\mathcal{G}_j^{\prime})$ be the above sequences. Then we have
(21) \begin{equation}\mathcal{G}_j^{\prime}\leq_{h-}\mathcal{G}_j \quad\text{and}\quad \mathcal{G}_{j}(h)\leq\mathcal{G}_{j}^{\prime}(h') \qquad \text{for any } j\geq 0.\end{equation}
Moreover, we have the exact equalities in the above display for any $h\in[S_Q, h']$ if and only if $\mathcal{G}_{0}^{\prime}(h')=0$ . In this case $(\mathcal{G}_j)$ and $(\mathcal{G}_j^{\prime})$ are all supported on $[0,S_Q]$ , and are both equal to $\big(\mathcal{G}_{j,S_Q}\big)$ .
Proof. Let $( P_j^n)$ be the sequence in Lemma 1. Let $\Big(P_{j,h'}^n\Big)$ be the variant of $( P_j^n)$ with $P_n^n=\delta_{h'}.$ By following the same monotonicity analysis as in the proof of (18), we obtain
By Lemma 1, $P_{j,h'}^n\stackrel{TV}{\longrightarrow}\mathcal{G}_j^{\prime}$ and $P_{j}^n\stackrel{TV}{\longrightarrow}\mathcal{G}_j$ as $n\to\infty.$ Thus we obtain (21).
Now let us prove the if-and-only-if statement. If $\mathcal{G}_0^{\prime}(h')=0$ , then by (21), $\mathcal{G}_0(h)=0$ . Using Remark 9, $\mathcal{G}_j^{\prime}(h')=0$ , $\mathcal{G}_j(h)=0$ for any j, and so (21) holds with equalities. For the other direction, if $\mathcal{G}_0^{\prime}(h')>0$ , then again by Remark 9, $\mathcal{G}_j^{\prime}(h')>0$ for any j. Using (21), it holds that
Similarly to (19),
As (22) implies that
and again using (21), we obtain $\mathcal{G}^{\prime}_{j-1}\leq_{h-}\mathcal{G}_{j-1}$ , but they are not equal on [0,h). Since they are probability measures, we have $\mathcal{G}_{j-1}(h)<\mathcal{G}_{j-1}^{\prime}(h')$ for any j. This completes the proof.
If (21) holds with equalities, $(\mathcal{G}_j)=(\mathcal{G}_j')$ are all supported on $[0,S_Q]$ . To show that they are equal to $(\mathcal{G}_{j,S_Q})$ , we only have to take $h=S_Q$ and apply the equalities in (21).
4.3.2. The random model
The goal of this paper is the random model, which is a randomised general model. Since $(\mathcal{G}_j)$ has parameters $(b_{j+1},b_{j+2},\cdots)$ and Q, h, we can define
\begin{equation*}\mathcal{I}_j\,{:\!=}\,\mathcal{G}_0(\beta_{j+1},\beta_{j+2},\cdots), \qquad j\geq 0.\end{equation*}
Therefore $\mathcal{I}_j$ is the quenched limit of the finite backward sequences in the random model with $P_n^n=\delta_h$ . Thanks to Lemma 1, we have the following result.
Corollary 2. The sequence $(\mathcal{I}_j)=(\mathcal{I}_j)_{j\geq 0}$ is stationary ergodic and satisfies
(23) \begin{equation}\mathcal{I}_{j}(dx)=(1-\beta_{j+1})\,\frac{x\,\mathcal{I}_{j+1}(dx)}{\int y\,\mathcal{I}_{j+1}(dy)}+\beta_{j+1}\,Q(dx), \qquad j\geq 0.\end{equation}
Remark 10. The equality (23) holds in the pointwise sense. In other words, given any realisation of $(\beta_j)$ (or equivalently, conditioning on $(\beta_j)$ ), the equality holds for any j as in the general model. In the sequel, when we present results regarding $(\mathcal{I}_j)$ , conditioning on $(\beta_j)$ should be understood in the pointwise sense. Sometimes we omit both phrases when the context is clear.
The proof of Corollary 2 requires the following result, which is proved in [Reference Kallenberg15, Lemma 9.5].
Lemma 2. Let $(S,\mathscr{S})$ and $(S',\mathscr{S}')$ be measurable spaces. Let $(\alpha_j)\in S^\infty$ be a stationary ergodic sequence of random variables. Let $f:S^\infty\to S'$ be a measurable function. Then $\left(f(\alpha_j,\alpha_{j+1},\cdots)\right)$ is also stationary ergodic.
Proof of Corollary 2. Since $(\beta_j)$ is i.i.d., it is stationary ergodic. As $\mathcal{G}_0$ is a measurable function from $[0,1)^\infty$ to $M_1$ , we apply Lemma 2 to obtain that $(\mathcal{I}_j)=(\mathcal{G}_0(\beta_{j+1},\beta_{j+2},\cdots))$ is also stationary ergodic. The recursive equation (23) is inherited from (16).
Since $(\mathcal{I}_j)$ is stationary ergodic, all of the $\mathcal{I}_j$ have the same distribution. We define $\mathcal{I}\,{:\!=}\,\mathcal{I}_0=\mathcal{I}_{0,h}$ , which is the weak limit appearing in Theorem 2. We drop the index, when it is appropriate to do so, to distinguish the limit from the backward context. The term $\mathcal{I}_Q$ used in Theorem 3 is in fact $\mathcal{I}_{0,S_Q}.$
We comment further on the importance of finite backward sequences. Let $(P_n)$ be a forward sequence and $(P_j^n)$ the finite backward sequence with $P_n^n=P_0$ , both in the random model with the same $(\beta_j)$ and Q. Since $(\beta_j)$ is i.i.d., we have
(24) \begin{equation}P_{n}\stackrel{d}{=}P_0^n \qquad \text{for any } n\geq 0.\end{equation}
So showing the weak convergence of $(P_n)_{n\geq 0}$ is equivalent to showing that of $(P_0^n)_{n\geq 0}.$ But investigating the finite backward sequences, via the general model, appears to be more convenient. In general, a dynamical system driven by i.i.d. randomness is easier to handle if we take a backward point of view; see Diaconis and Freedman [Reference Diaconis and Freedman9].
4.4. Finer analysis of the finite backward sequences
4.4.1. The general model
We consider $(P_j^n)$ with $P_n^n=\delta_h$ , as in Lemma 1. Developing (15), we obtain
We refer to (2) for the expansion of the forward sequence $(P_n)$ .
Proposition 1. Let $(P_j^n)$ be the finite backward sequence in the general model with $P_n^n=\delta_h$ . Then for the sequence $(\mathcal{G}_j)$ , we have
where the second term on the right side of (26) converges to that of (27),
and the term $G_0=G_{0,h}$ satisfies the following assertions:
Moreover, if we define $G_j$ for $\mathcal{G}_j$ similarly to $G_0$ for $\mathcal{G}_0$ , we have
As a consequence, the $G_j$ are either all 0 or all strictly positive.
Proof. By (17), $\int yP_l^n(dy)$ increases in n and converges to $\int y\mathcal{G}_l(dy)$ as $n\to\infty$ . Then, using (26), we obtain (28). Integrating on both sides of (26), we use (28) to deduce that
decreases in n and converges to the limit
So (27), (29), and (30) are proved.
From (27) we observe (31). To show (32), we develop (16) as follows:
Combining the above display with (27) and (28), we obtain (32), and also that
converges weakly to $\delta_h$ . Finally, combining (16) and (27) leads to (33).
Remark 11. The proposition implies that $(P_0^n)$ with $P_n^n=\delta_h$ in the random model converges in total variation to $\mathcal{I}=\mathcal{I}_0$ , pointwise. Then by (24), $(P_n)$ in the random model with $P_0=\delta_h$ converges weakly to $\mathcal{I}.$ Therefore Theorem 2 is proved for the particular case with $P_0=\delta_h.$ As will become clear later (in Section 4.7), a complete proof has to deal with different kinds of $P_0$ . The proof here with $P_0=\delta_h$ is the simplest case.
4.4.2. The random model
When carrying over the results of Proposition 1 to the random model, we change the symbol G to I, analogously to the change from $\mathcal{G}$ to $\mathcal{I}.$ For instance, we set $I_j=G_0(\beta_{j+1}, \beta_{j+2},\cdots)$ for any $j\geq 0$ . Then we have the following corollary.
Corollary 3. The process $(I_j)=(I_j)_{j\geq 0}$ is stationary ergodic. Moreover, $\mathbb{P}(\{I_j=0, \forall j\})=\mathbb{P}(I_0=0)\in\{0,1\}$ .
Remark 12. If $Q(h)>0$ , then it must be that $h=S_Q$ and $\mathcal{I}(h)=\mathcal{I}_0(h)>0$ a.s. If $Q(h)=0,$ then $\mathcal{I}(h)=\mathcal{I}_0(h)=I_0.$ So, applying Corollary 3, either $\mathcal{I}(h)>0$ a.s., or $\mathcal{I}(h)=0$ a.s.
Proof. By Proposition 1, $G_0=G_0(b_1, b_2,\cdots)$ is a measurable function from $[0,1)^\infty$ to [0,1]. As $(\beta_j)$ is i.i.d., we obtain that $(I_j)=(G_0(\beta_{j+1}, \beta_{j+2},\cdots))$ is stationary ergodic, thanks to Lemma 2.
By (33), for any k, $\{I_k=0\}=\{I_j=0,\forall j\}$ . Note that $\{I_j=0,\forall j\}$ is an invariant set in the sigma-algebra generated by $(I_j)$ . By ergodicity of $(I_j)$ , $\mathbb{P}(\{I_j=0,\forall j\})=\mathbb{P}(I_0=0)\in\{0,1\}$ .
The following result provides a tool for finding out more about $\mathcal{I}$ and Q. Let $I=I_{0,h}$ and $I_Q=I_{0,S_Q}$ . To summarise, $\mathcal{I}, I, \mathcal{I}_Q, I_Q$ are identical in value to $\mathcal{I}_{0,h}, I_{0,h}, \mathcal{I}_{0,S_Q}, I_{0,S_Q}$ , respectively.
Corollary 4. The following statements about $\mathbb{E}\left[\ln\frac{1-\beta}{\int y\mathcal{I}(dy)}\right]$ hold:
1. $\mathbb{E}\left[\ln\frac{1-\beta}{\int y\mathcal{I}(dy)}\right]$ is well defined, takes values in $[\!-\infty, -\ln \int yQ(dy)]$ , and depends only on the marginal distributions of $\beta$ and $\mathcal{I}.$
2. If $Q(h)=0$ , then
\begin{equation*} \mathbb{E}\left[\ln\frac{h(1-\beta)}{\int y\mathcal{I}(dy)}\right]\leq 0.\end{equation*}
3. If $\mathcal{I}(h)>0$ a.s. and $Q(h)=0,$ then
\begin{equation*}\mathbb{E}\left[\ln\frac{h(1-\beta)}{\int y\mathcal{I}(dy)}\right]=0.\end{equation*}
4. If $h=S_Q$ and $Q(S_Q)>0$ , then
\begin{equation*}\mathbb{E}\left[\ln \frac{S_Q(1-\beta)}{\int y\mathcal{I}(dy)}\right]<0\text{ and } I=0, \quad \text{a.s.}\end{equation*}
Remark 13. If $h=S_Q$ , we can only have
\begin{equation*}\mathbb{E}\left[\ln \frac{S_Q(1-\beta)}{\int y\mathcal{I}_Q(dy)}\right]\leq 0.\end{equation*}
Proof. Part 1: By (27), $\mathcal{G}_0$ is a convex combination of probability measures $\{\delta_h, Q, Q^1,Q^2, \cdots\}$ . As $Q^j\preceq Q^{j+1}\preceq \delta_h$ for any $j\geq 0$ , we have, in the pointwise sense,
Then
So $\mathbb{E}\left[\ln\int y\mathcal{I}(dy)\right]$ is a finite term. Consequently,
We observe that the above display depends only on the marginal distributions of $\beta$ and $\mathcal{I}.$
Part 2: Let $(P_j^n)$ be the finite backward sequence in the random model with $P_n^n=\delta_h$ . By assumption, $Q(h)=0$ . Adapting (26) to the random model and taking the expectation of the mass on h, we obtain
where the second inequality is due to Jensen’s inequality. By (17),
Combining the above two displays, it must be that
Part 3: Lemma 1 implies that there exists a measurable function $T:[0,1)^\infty\mapsto (0,\infty)$ such that for any j,
By Lemma 2,
By (32) and the fact that $\mathcal{I}(h)=\mathcal{I}_0(h)=I_0>0$ a.s. (because $Q(h)=0$ by assumption), we have
As $(\mathcal{I}_j)$ is stationary ergodic, $\int\left(\frac{y}{h}\right)^n\mathcal{I}_{n}(dy)\in [I_n,1]$ converges weakly to $I_0, $ which is strictly positive. Then
Moreover, since $\left(\frac{h(1-\beta_j)}{\int y\mathcal{I}_{j}(dy)}\right)$ is stationary ergodic, we have
The above three displays yield
Part 4: We show by contradiction that $I=I_0=0$ a.s. Adapting (26) to the random model, we have
If $I_0>0$ a.s., we consider the mass on $S_Q$ in the above display. Note that $m_jQ^j(S_Q)=S_Q^jQ(S_Q)$ . By (29) we obtain
This is a contradiction. So $I_0=0$ a.s. Note that by (34), $\mathcal{I}(S_Q)=\mathcal{I}_0(S_Q)\geq Q(S_Q)>0.$ Then we get $\mathbb{E}\left[\ln \frac{h(1-\beta)}{\int y\mathcal{I}(dy)}\right]<0$ using (35) and the arguments thereafter.
4.5. Proof of Theorem 3
Proof of Theorem 3. The statement about $\mathbb{E}\left[\ln \frac{h(1-\beta)}{\int y\mathcal{I}_Q(dy)}\right]$ concerns just a subcase of Corollary 4–1. So this is proved.
If there is no condensation at h, then by Corollary 1, $\mathcal{I}= \mathcal{I}_{0,h}=\mathcal{I}_{0, S_Q}=\mathcal{I}_Q$ , so of course $\mathcal{I}\stackrel{d}{=}\mathcal{I}_Q.$
The first assertion in the condensation criterion holds by Corollary 4–3. We consider the second one. If there is condensation at h, then $ \mathcal{I}_{0, S_Q}\neq \mathcal{I}_{0,h}$ . By Corollary 1, $\mathcal{I}_{0,h}\leq_{S_Q-}\mathcal{I}_{0, S_Q}$ and $\mathcal{I}_{0,S_Q}(S_Q)\leq \mathcal{I}_{0,h}(h)$ , which together with Corollary 4–3 implies that
The above inequality is strict because $ \mathcal{I}_{0, S_Q}\neq \mathcal{I}_{0,h}$ .
If there is no condensation at h, then by Corollary 1, $\mathcal{I}_{0,h}=\mathcal{I}_{0,S_Q}$ . Using Corollary 4-2, we have
4.6. Some properties of invariant measures
In this section, we prove some results concerning invariant measures. However, we leave the proof of Theorem 4 to the end. Invariant measures will play important roles in the proof of Theorem 2.
Lemma 3. For any invariant measure $\nu$ ,
is well defined, takes values in $[\!-\infty, -\ln\int yQ(dy)]$ , and depends only on the marginal distributions of $\beta$ and $\nu.$
Proof. By the definition of invariant measure,
where the inequality is due to the fact that $\int y^2\nu(dy)\geq (\int y\nu(dy))^2.$ Then we obtain
Proceeding similarly as in the proof of Corollary 4–1, we conclude that this lemma holds.
Corollary 5. $\mathcal{I}_Q$ is the unique (in distribution) invariant measure supported on $[0,S_Q]$ .
Proof. Let $\nu$ be any invariant measure on $[0,S_Q]$ . We show that $\nu\stackrel{d}{=}\mathcal{I}_Q$ . Note that $S_\nu=S_Q$ , a.s. Let $(P_n)$ and $(P_n^{\prime})$ be two forward sequences as in Section 4.2 with
and with $P_0$ independent of $(\beta_n)$ . The two sequences differ only in the starting measures (satisfying $P_0\leq_{S_Q-}P_0^{\prime}$ ), with other parameters identical. Since $\nu$ is invariant, $P_n\stackrel{d}{=}\nu$ for any $n\geq0$ . Using the notation $\mathcal{M}_n,\mathcal{M}_n^{\prime},\mathcal{W}_n,\mathcal{W}_n^{\prime}$ from Section 4.2, and by a monotonicity analysis as in the proof of Lemma 1, we obtain, in the pointwise sense,
If $I_Q=0$ a.s., by (29) in Proposition 1 and (24),
Remark 11 says that $P_n^{\prime}(=\mathcal{W}_n^{\prime}+\mathcal{M}_n^{\prime})\stackrel{d}{\to}\mathcal{I}_Q.$ So
Thus, applying (36) and the fact that $I_Q=0$ a.s., we obtain
Consequently,
Since $\nu\stackrel{d}{=}P_n$ for any n, we have $\nu\stackrel{d}{=}\mathcal{I}_Q.$
If $I_Q>0$ a.s., then by Corollary 4–4, $Q(S_Q)=0$ and $\mathcal{I}_Q(S_Q)=I_Q>0$ a.s. Then by Corollary 4–3, we have
Again using a monotonicity analysis, in a pointwise sense, we have
As $P_n^{\prime}\stackrel{d}{\rightarrow}\mathcal{I}_Q$ , and $P_n\stackrel{d}{=}\nu$ for all n, the above display implies that
Assume that $\nu$ is not equal to $\mathcal{I}_Q$ in distribution; then by the above display and (37),
The inequality implies that for $\varepsilon>0$ small enough, we have
As $S_\nu=S_Q$ a.s. and $P_0\stackrel{d}{=}\nu$ ,
as $n\to\infty$ . Again using the decomposition (12) in Section 4.2, we get
where the third inequality is due to Jensen’s inequality. So this is a contradiction, which means that $\nu$ is equal in distribution to $\mathcal{I}_Q$ .
4.7. Proof of Theorem 2
Case 1. ${P_0=\delta_h}$ .
Proof of Theorem 2, Case 1. This is shown in Remark 11.
Case 2. ${I_{0,h}=0}$ a.s.
Proof of Theorem 2, Case 2. Let $(P_n)_{n\geq 0}$ , $(P^{\prime}_n)_{n\geq 0}$ be two forward sequences as in Section 4.2 with
So the two sequences differ only in the starting measures (satisfying $P_0\leq_{h-}P_0^{\prime}$ ), with other parameters identical. Next it suffices to follow the same procedure as in the proof of Corollary 5 for the case $I_Q=0$ a.s. The proof is omitted.
Case 3. ${I_{0,h}>0 }$ a.s. and $P_0(h)>0.$
For Case 3, we first restate a result from [Reference Yuan22, p. 10]. The statement in [Reference Yuan22] considers only $h=1$ , but it is easily generalised to any h. Recall the distribution function $D_{u}$ for $u\in M_1$ , introduced in Section 4.1.
Lemma 4. Let $u_1,u_2\in M_1$ be any probability measures satisfying $S_{u_1}\,{=}\,S_{u_2}=h$ and $u_1\leq_{h-}u_2$ . If for some $\varepsilon>0$ there exists $a\in(0,h)$ such that $D_{u_1}(a)+\varepsilon\leq D_{u_2}(a)$ , then
where
Proof of Theorem 2, Case 3. Let $(P_n), (P_n^{\prime})$ be the two forward sequences in the proof of Case 2. Similarly to (38), conditionally on $(\beta_n)$ , we have
implying
For any $\varepsilon>0$ , $a\in(0,h)$ , let
Note that by Proposition 1–4, $ Q(h)=0$ . So using (12) and (13) in Section 4.2, we have
Then by Lemma 4, we have
Therefore,
But (29) of Proposition 1 and (24) imply that $P^{\prime}_{n}(h)$ converges weakly to $I_{0,h}$ , which is by assumption non-zero a.s. Thus $\lim_{n\to\infty}\kappa_n<\infty$ a.s. As $a, \varepsilon$ are arbitrary numbers and by Case 1 of this theorem $P_n^{\prime}\stackrel{d}{\longrightarrow}\mathcal{I}_{0,h}$ , we use (39) to conclude that $P_n$ also converges weakly to $\mathcal{I}_{0,h}.$
Case 4. ${I_{0,h}>0 }$ a.s. and $P_0(h)=0.$
Proof of Theorem 2, Case 4. The idea is to use a tripling argument similarly as in the proof of Theorem 5 in [Reference Yuan22]. For any $u\in M_1$ and any $a\in [0,1]$ , define
where $u_{[0,a)}$ is the restriction of u on $[0,a).$
We distinguish between $h>S_Q$ and $h=S_Q.$ For the former, let $(P_n), (P_n^{\prime}), (P_n^{\prime\prime})$ be three forward sequences as in Section 4.2 with
So the three sequences differ in the starting measures, including the largest fitness values, but have the same Q and $(\beta_n)$ . Since $P_0^{\prime}(h')=\delta_h(h)=1$ and $0<P_0^{\prime\prime}(h'')\leq 1$ , we use Case 1 for $(P_n^{\prime})$ and Cases 2–3 for $(P_n^{\prime\prime})$ to obtain that
Applying the monotonicity analysis, we find that the following holds in the pointwise sense:
Letting $h''\to h$ and using Corollary 1, we have that conditionally on $(\beta_j)$ , $\mathcal{I}_{0,h''}$ converges weakly to a limit in $M_1$ , denoted by $\nu$ . So $\nu$ is a (pointwise) weak limit of $\mathcal{I}_{0,h''}$ as $h''\to h.$ We prove next that $\nu=\mathcal{I}_{0,h}.$
Since $I_{0,h}>0$ a.s. and $h>S_Q$ , by Theorem 3,
Then for $h''$ close enough to h, we also have
The above display implies that there is condensation at $h''$ , thanks to Theorem 3. Together with Corollary 4–3, this gives us
Since $\mathcal{I}_{0,h''}$ is an invariant measure, the limit $\nu$ is still an invariant measure. By (42) and Corollary 1, the pointwise convergence of $\mathcal{I}_{0,h''}$ to $\nu$ as $h''\to h$ implies
Using Corollary 1 again, in the pointwise sense we have
implying that in the pointwise sense (since $\nu$ is a pointwise weak limit of $\mathcal{I}_{0,h''}$ )
On the other hand, by assumption $I_{0,h}>0$ a.s., so using Corollary 4–3, we have
The above display together with (43) and (44) implies that
Therefore we have proved that $\mathcal{I}_{0,h''}$ converges pointwise to the weak limit $\mathcal{I}_{0,h}$ as $h''\to h.$
Now, taking into account (40), for any continuous function f we have
Note that using (41), for any bounded continuous increasing function g we have
Together with (46), this yields
Since, by (41), $P_n^{\prime}\leq_{h-}P_n$ pointwise for any n, the above display implies that $P_n$ converges weakly to $\mathcal{I}_{0,h}$ , which is the same as the weak limit of $(P_n^{\prime})$ .
If $h=S_Q$ , we follow the same procedure, except that to prove (45) we require Corollary 5.
4.8. Proof of Theorem 4
First we prove two lemmas. Recall the definition of $S_{u}$ for $u\in M_1$ .
Lemma 5. $S_{(\cdot)}$ is a lower semicontinuous, and hence Borel measurable, function on $M_1$ with the topology of weak convergence.
Proof. Assume that a sequence $(u_n)$ converges weakly to u. If $\liminf_{n\to\infty}S_{u_n}<S_{u}$ , then there exists a subsequence $(u_{n_k})$ such that $S_{u_{n_k}}$ converges to a limit a with $a<S_{u}$ . We take a positive and continuous function f supported on
\begin{equation*}\Big(\tfrac{a+S_u}{2},\,1\Big],\end{equation*}
and then $\int f(x)u(dx)>0.$ But $\int f(x)u_{n_k}(dx)$ converges to 0. This contradicts the weak convergence, which completes the proof.
The next lemma generalises Corollary 5.
Lemma 6. For any invariant measure $\nu$ with $S_{\nu}=h$ a.s., we have $\nu\stackrel{d}{=}\mathcal{I}.$
Proof. Let $(P_n)$ be the forward sequence in the random model with $P_0\stackrel{d}{=}\nu$ and $P_0$ independent of $(\beta_n)$ . By Theorem 2, conditionally on $P_0$ , $P_n$ converges in distribution to the same random measure $\mathcal{I}.$ Then, unconditionally, $P_n\stackrel{d}{=}\nu$ converges in distribution to $\mathcal{I},$ implying $\nu\stackrel{d}{=}\mathcal{I}$ .
Proof of Theorem 4. Let $\nu$ be any invariant measure. By (10), $S_\nu \in [S_Q, 1]$ , a.s. By Lemma 5, $S_\nu$ is a random variable, and then by Theorem 5.3 in [Reference Kallenberg15], there exists a regular conditional distribution of $\nu$ on $S_\nu$ .
Conditioning on $S_\nu$ for both sides of (10), we see that $(\nu| S_\nu)$ must be an invariant measure a.s. By Lemma 6, conditionally on $S_\nu$ , we have $(\nu|S_\nu)\stackrel{d}{=}\mathcal{I}$ a.s., where $\mathcal{I}$ is the random probability measure with parameters $\mathcal L$ , Q, $h=S_\nu$ and satisfies $\mathbb{P}(S_{\mathcal{I}}=S_\nu|S_\nu)=1$ a.s. This finishes the proof.
Appendix A. Analysis of $\ln\frac{h(1-b)}{\int x\mathcal{K}_Q(dx)}$ in Kingman’s model
We discuss respectively Theorem 1-1, i.e.
\begin{equation*}\int \frac{Q(dx)}{1-x/h}\geq b^{-1},\end{equation*}
and Theorem 1-2, i.e.
\begin{equation*}\int \frac{Q(dx)}{1-x/h}< b^{-1}.\end{equation*}
For the former, let us first compute $\int x\mathcal{K}_Q(dx)$ :
where the last equality is due to the fact that $\theta_b$ is the solution of the equation (4). The equation (4) also implies
Recall that $\int \frac{Q(dx)}{1-x/h}\geq b^{-1}.$ Then the above display implies that
Taking into account (47), we arrive at
Equality holds if and only if
For Theorem 1-2, we have
Thus we obtain
where equality holds if and only if $h=S_Q$ .
In conclusion, if $h>S_Q,$ then
is equivalent to
(non-condensation case), and
is equivalent to
(condensation case).
If $h=S_Q$ , then
is equivalent to either
(non-condensation case) or
(condensation case), and
is equivalent to
(non-condensation case). The case
does not occur, which is in line with Remark 13.
Therefore, if $h=S_Q$ , knowing only
does not allow one to determine whether condensation occurs or not.
Acknowledgements
The author would like to thank Takis Konstantopoulos, Götz Kersting, and Pascal Grange for discussions. The author thanks the anonymous referees for their comments, which greatly improved the presentation of the paper, and for suggesting the intuition for Theorem 3.
Funding Information
The author acknowledges the support of the National Natural Science Foundation of China (Youth Programme, Grant 11801458).
Competing Interests
There were no competing interests to declare which arose during the preparation or publication process for this article.