1. Introduction
Bifurcating Markov chains (BMCs) are a class of stochastic processes indexed by the regular binary tree and which satisfy the branching Markov property (see below for a precise definition). This model represents the evolution of a trait along a population where each individual has two children. We refer to [Reference Bitseki Penda and Delmas4] for references on this subject. The recent study of BMC models was motivated by the understanding of the cell division mechanism (where the trait of an individual is given by its growth rate). The first BMC model, known as the ‘symmetric’ bifurcating autoregressive process (BAR) (see Section 4.1 for more details in a Gaussian framework), was introduced by Cowan and Staudte [Reference Cowan and Staudte7] in order to analyze cell lineage data. In [Reference Guyon9], Guyon studied ‘asymmetric’ BAR in order to prove statistical evidence of aging in Escherichia coli.
In this paper, our objective is to establish a central limit theorem for additive functionals of BMCs. This will be done for the class of functions which belong to $L^{4}(\mu)$ , where $\mu$ is the invariant probability measure of the Markov chain given by the genealogical evolution of an individual taken at random in the population. This paper completes the pointwise approach developed in [Reference Bitseki Penda and Delmas4] in a very close framework. Let us emphasize that the $L^2$ approach is an important step toward the kernel approximation of the densities of the transition kernel of the BMCs and of the invariant probability measure $\mu$ , which will be developed in a companion paper. The main contribution of this paper, with respect to [Reference Bitseki Penda and Delmas4], is the derivation of a nontrivial hypothesis on the transition kernel given in Assumption 2.2(i). More precisely, let the random variable (X, Y, Z) model the trait of the mother, X, and the traits of its two children Y and Z. Notice that we do not assume that conditionally on X, the random variables Y and Z are independent or that they have the same distribution. In this setting, $\mu$ is the distribution of an individual picked at random in the stationary regime. From an ergodic point of view, it would be natural to assume some $L^2(\mu)$ continuity in the sense that for some finite constant M and all functions f and g,
\begin{equation*}{\mathbb E}_{X \sim\mu}\Big[\, {\mathbb E}\big[\,f(Y)\, g(Z) \,\big|\, X \big]^2 \Big] \leq M\, {\mathbb E}_{W \sim\mu}\big[\,f(W)^2\big]\; {\mathbb E}_{W \sim\mu}\big[g(W)^2\big],\end{equation*}
where ${\mathbb E}_{W \sim\mu}$ means that the random variable W has distribution $\mu$ . However, this condition is not always true even in the simplest case of the symmetric BAR model; see the comments in Remark 2.2 and the detailed computation in Section 4. This motivates the introduction of Assumption 2.2(i), which allows us to recover the results from [Reference Bitseki Penda and Delmas4] in the context of the $L^2$ approach, and in particular the three regimes: the subcritical, critical, and supercritical regimes. Since the results are similar and the proofs follow the same steps, we provide a detailed proof only in the subcritical case. Finally, let us mention that the numerical study on the symmetric BAR in Section 4.2 illustrates the phase transitions for the fluctuations. We also provide an example where the asymptotic variance in the critical regime is 0; this happens when the function considered is orthogonal to the second eigenspace of the associated Markov chain.
The paper is organized as follows. In Section 2, we present the model and give the assumptions: we introduce the BMC model in Section 2.1, we give the assumptions under which our results will be stated in Section 2.2, and we give some useful notation in Section 2.3. In Section 3 we state our main results: the subcritical case in Section 3.1, the critical case in Section 3.2, and the supercritical case in Section 3.3. In Section 4, we study the special case of the symmetric BAR process. The proofs of the results in the subcritical case, given in Section 5, are in the same spirit as those of [Reference Bitseki Penda and Delmas4]; they rely essentially on the explicit second moment computations and precise upper bounds on fourth moments for BMCs which are recalled in Section 6.
The proof of the results in the critical case is an adaptation of the proof in the subcritical case, in the same spirit as in [Reference Bitseki Penda and Delmas4]; the interested reader can find the details in [Reference Bitseki Penda and Delmas3]. The proof of the results in the supercritical case does not involve the original Assumption 2.2(i); it is not reproduced here as it is very close to its counterpart in [Reference Bitseki Penda and Delmas4].
2. Models and assumptions
2.1. Bifurcating Markov chain: the model
We denote by ${\mathbb N}$ the set of nonnegative integers and ${\mathbb N}^*= {\mathbb N} \setminus \{0\}$ . If $(E, {\mathcal E})$ is a measurable space, then ${\mathcal B}(E)$ (resp. ${\mathcal B}_b(E)$ , resp. ${\mathcal B}_+(E)$ ) denotes the set of (resp. bounded, resp. nonnegative) ${\mathbb R}$ -valued measurable functions defined on E. For $f\in {\mathcal B}(E)$ , we set $\mathop{\parallel\! {f} \! \parallel}\nolimits_\infty =\sup\{|f(x)|$ , $x\in E\}$ . For a finite measure $\lambda$ on $(E,{\mathcal E})$ and $f\in {\mathcal B}(E)$ we shall write $\langle \lambda,f \rangle$ for $\int f(x) \, \textrm{d}\lambda(x)$ whenever this integral is well defined. For $p\geq 1$ and $f\in {\mathcal B}(E)$ , we set $\|f\|_{L^{p}(\lambda)} =\langle \lambda,|f|^p \rangle^{1/p}$ and we define the space $L^{p}(\lambda)= \left\{f \in {\mathcal B}(E);\, \|f\|_{L^{p}(\lambda)} < +\infty\right\}$ of $p\text{-}$ integrable functions with respect to $\lambda$ . For $n\in {\mathbb N}^*$ , the product space $E^n$ is endowed with the product $\sigma$ -field ${\mathcal E}^{\otimes n}$ .
Let $(S, {\mathscr S}\,)$ be a measurable space. Let Q be a probability kernel on $S \times {\mathscr S}\,$ ; that is, $Q(\cdot , A)$ is measurable for all $A\in {\mathscr S}\,$ , and $Q(x,\cdot)$ is a probability measure on $(S,{\mathscr S}\,)$ for all $x \in S $ . For any $f\in {\mathcal B}_b(S)$ , we set, for $x\in S$ ,
(1) \begin{equation*}(Qf)(x)=\int_S f(y)\, Q(x,dy).\end{equation*}
We define (Qf), or simply Qf, for $f\in {\mathcal B}(S)$ as soon as the integral (1) is well defined, and we have $Q f\in {\mathcal B}(S)$ . For $n\in {\mathbb N}$ , we denote by $Q^n$ the nth iterate of Q, defined by $Q^0=I_d$ , the identity map on ${\mathcal B}(S)$ , and $Q^{n+1}f=Q^n(Qf)$ for $f\in {\mathcal B}_b(S)$ .
Let P be a probability kernel on $S \times {\mathscr S}\,^{\otimes 2}$ ; that is, $P(\cdot , A)$ is measurable for all $A\in {\mathscr S}\,^{\otimes 2}$ , and $P(x,\cdot)$ is a probability measure on $\big(S^2,{\mathscr S}\,^{\otimes 2}\big)$ for all $x \in S$ . For any $g\in {\mathcal B}_b\big(S^3\big)$ and $h\in {\mathcal B}_b\big(S^2\big)$ , we set, for $x\in S$ ,
(2) \begin{equation*}(Pg)(x)=\int_{S^2} g(x,y,z)\, P(x,dy,dz) \quad\text{and}\quad (Ph)(x)=\int_{S^2} h(y,z)\, P(x,dy,dz).\end{equation*}
We define (Pg) (resp. (Ph)), or simply Pg for $g\in {\mathcal B}\big(S^3\big)$ $\big($ resp. Ph for $h\in {\mathcal B}\big(S^2\big)\big)$ , as soon as the corresponding integral (2) is well defined, and we have that Pg and Ph belong to ${\mathcal B}(S)$ .
We now introduce some notation related to the regular binary tree. We set $\mathbb{T}_0=\mathbb{G}_0=\{\emptyset\}$ , $\mathbb{G}_k=\{0,1\}^k$ , $\mathbb{T}_k = \bigcup _{0 \leq r \leq k} \mathbb{G}_r$ for $k\in {\mathbb N}^*$ , and $\mathbb{T} = \bigcup _{r\in {\mathbb N}} \mathbb{G}_r$ . The set $\mathbb{G}_k$ corresponds to the kth generation, $\mathbb{T}_k$ to the tree up to the kth generation, and $\mathbb{T}$ to the complete binary tree. For $i\in \mathbb{T}$ , we denote by $|i|$ the generation of i ( $|i|=k$ if and only if $i\in \mathbb{G}_k$ ), and $iA=\{ij;\, j\in A\}$ for $A\subset \mathbb{T}$ , where ij is the concatenation of the two sequences $i,j\in \mathbb{T}$ , with the convention that $\emptyset i=i\emptyset=i$ .
We recall the definition of a bifurcating Markov chain from [Reference Guyon9].
Definition 2.1. We say a stochastic process indexed by $\mathbb{T}$ , $X=(X_i, i\in \mathbb{T})$ , is a bifurcating Markov chain (BMC) on a measurable space $(S, {\mathscr S}\,)$ with initial probability distribution $\nu$ on $(S, {\mathscr S}\,)$ and probability kernel ${\mathcal P}$ on $S\times {\mathscr S}\,^{\otimes 2}$ if the following hold:
(Initial distribution.) The random variable $X_\emptyset$ is distributed as $\nu$ .
(Branching Markov property.) For a sequence $(g_i, i\in \mathbb{T})$ of functions belonging to ${\mathcal B}_b\big(S^3\big)$ , we have, for all $k\geq 0$ ,
\begin{equation*}{\mathbb E}\Bigg[\prod_{i\in \mathbb{G}_k} g_i\big(X_i,X_{i0},X_{i1}\big) |\sigma\big(X_j;\, j\in \mathbb{T}_k\big)\Bigg]=\prod_{i\in \mathbb{G}_k} {\mathcal P} g_i(X_{i}).\end{equation*}
Let $X=(X_i, i\in \mathbb{T})$ be a BMC on a measurable space $(S, {\mathscr S}\,)$ with initial probability distribution $\nu$ and probability kernel ${\mathcal P}$ . We define three probability kernels $P_0$ , $P_1$ , and ${\mathcal Q}$ on $S\times {\mathscr S}\,$ by
\begin{equation*}P_0(x, A)={\mathcal P}(x, A\times S), \qquad P_1(x, A)={\mathcal P}(x, S\times A), \qquad {\mathcal Q}=\frac{1}{2}\, (P_0+P_1),\end{equation*}
for $x\in S$ and $A\in {\mathscr S}\,$.
Notice that $P_0$ (resp. $P_1$ ) is the restriction of the first (resp. second) marginal of ${\mathcal P}$ to S. Following [Reference Guyon9], we introduce an auxiliary Markov chain $Y=(Y_n, n\in {\mathbb N}) $ on $(S,{\mathscr S}\,)$ with $Y_0$ distributed as $X_\emptyset$ and transition kernel ${\mathcal Q}$ . The distribution of $Y_n$ corresponds to the distribution of $X_I$ , where I is chosen independently from X and uniformly at random in generation $\mathbb{G}_n$ . We shall write ${\mathbb E}_x$ when $X_\emptyset=x$ , i.e. the initial distribution $\nu$ is the Dirac mass at $x\in S$ .
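As a concrete illustration of this identity in distribution, the following sketch uses a hypothetical two-point trait space with made-up marginal kernels $P_0$ and $P_1$ (the law of a random lineage depends only on ${\mathcal Q}=(P_0+P_1)/2$, so the joint kernel ${\mathcal P}$ is not needed here). It samples the trait of a uniformly chosen individual of $\mathbb{G}_n$ by drawing the n bits of its label uniformly, and compares the result with the exact n-step distribution of ${\mathcal Q}$:

```python
import random

# Illustrative two-point BMC on S = {0, 1}. Only the marginal kernels P0, P1
# (laws of the first and second child's trait given the mother's trait) are
# specified; the numbers are arbitrary.
P0 = {0: [0.9, 0.1], 1: [0.3, 0.7]}
P1 = {0: [0.6, 0.4], 1: [0.1, 0.9]}

def q_step(x, rng):
    # One Q-step: pick a child uniformly, then sample from its marginal kernel.
    p = P0[x] if rng.random() < 0.5 else P1[x]
    return 0 if rng.random() < p[0] else 1

def sample_random_node(x0, n, rng):
    # Trait X_I for I uniform on G_n: choose each of the n bits of I uniformly.
    x = x0
    for _ in range(n):
        x = q_step(x, rng)
    return x

rng = random.Random(0)
n, trials = 8, 20000
freq1 = sum(sample_random_node(0, n, rng) for _ in range(trials)) / trials

# Exact n-step law of Q started at 0, via repeated matrix-vector products.
Q = [[(P0[x][y] + P1[x][y]) / 2 for y in (0, 1)] for x in (0, 1)]
dist = [1.0, 0.0]
for _ in range(n):
    dist = [sum(dist[x] * Q[x][y] for x in (0, 1)) for y in (0, 1)]

mc_error = abs(freq1 - dist[1])
print(mc_error)  # small Monte Carlo error
```

The agreement (up to Monte Carlo error) illustrates why the auxiliary chain $Y$ captures the trait of an individual picked at random in generation $\mathbb{G}_n$.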
We end this section with a useful inequality and the Gaussian BAR model.
Remark 2.1. By convention, for $f,g\in {\mathcal B}(S)$ , we define the function $f\otimes g\in {\mathcal B}\big(S^2\big)$ by $(\,f\otimes g)(x,y)=f(x)g(y)$ for $x,y\in S$ and introduce the notation
\begin{equation*}f\otimes_{\textrm{sym}} g= \frac{1}{2}\, (\,f\otimes g + g\otimes f).\end{equation*}
Notice that ${\mathcal P}(g\otimes _{\textrm{sym}} {\bf 1})={\mathcal Q}(g)$ for $g\in {\mathcal B}_+(S)$ . For $f \in {\mathcal B}_+(S)$ , as $ f\otimes f \leq f^2 \otimes _{\textrm{sym}} {\bf 1} $ , we get
\begin{equation*}{\mathcal P}(\,f\otimes f) \leq {\mathcal Q}\big(\,f^2\big).\end{equation*}
Example 2.1 (Gaussian bifurcating autoregressive process). We will consider the real-valued Gaussian bifurcating autoregressive process (BAR) $X=(X_{u},u\in\mathbb{T})$ where for all $ u \in \mathbb{T}$ ,
\begin{equation*}X_{u0}= a_{0} X_{u} + b_{0} + \varepsilon_{u0}\quad\text{and}\quad X_{u1}= a_{1} X_{u} + b_{1} + \varepsilon_{u1},\end{equation*}
with $a_{0}, a_{1} \in ({-}1,1)$ , $b_{0}, b_{1} \in \mathbb{R}$ , and $((\varepsilon_{u0},\varepsilon_{u1}),\, u \in \mathbb{T})$ an independent sequence of bivariate Gaussian ${\mathcal N}(0,\Gamma)$ random vectors independent of $X_{\emptyset}$ with covariance matrix as follows, where $\sigma>0$ and $\rho\in {\mathbb R}$ satisfy $|\rho| < \sigma^2$ :
\begin{equation*}\Gamma=\begin{pmatrix}\sigma^2 & \rho \\ \rho & \sigma^2\end{pmatrix}.\end{equation*}
Then the process $X=(X_{u},u\in\mathbb{T})$ is a BMC with transition probability ${\mathcal P}$ given, for $x\in {\mathbb R}$ , by
\begin{equation*}{\mathcal P}(x, \cdot)={\mathcal N}\big(\big(a_{0} x+ b_{0},\, a_{1} x + b_{1}\big),\, \Gamma\big),\end{equation*}
the bivariate Gaussian distribution with mean vector $(a_{0} x+ b_{0},\, a_{1} x + b_{1})$ and covariance matrix $\Gamma$.
The transition kernel ${\mathcal Q}$ of the auxiliary Markov chain is defined by
\begin{equation*}{\mathcal Q}(x, \cdot)= \frac{1}{2}\, {\mathcal N}\big(a_{0} x+ b_{0},\, \sigma^2\big) + \frac{1}{2}\, {\mathcal N}\big(a_{1} x+ b_{1},\, \sigma^2\big).\end{equation*}
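A minimal simulation of the Gaussian BAR process of Example 2.1 (all parameter values below are illustrative, and the correlated noise pair is generated through a Cholesky factorization of $\Gamma$):

```python
import math
import random

def simulate_bar(n, a0=0.5, a1=0.3, b0=0.0, b1=0.2, sigma=1.0, rho=0.3, seed=0):
    # X[k][i] is the trait of the i-th node of generation G_k; the root trait is 0.
    # (eps_u0, eps_u1) is N(0, Gamma), Gamma = [[sigma^2, rho], [rho, sigma^2]],
    # generated via the Cholesky factorization of Gamma (requires |rho| < sigma^2).
    rng = random.Random(seed)
    c = rho / sigma
    s = math.sqrt(sigma ** 2 - c ** 2)
    X = [[0.0]]
    for _ in range(n):
        nxt = []
        for x in X[-1]:
            g0, g1 = rng.gauss(0, 1), rng.gauss(0, 1)
            nxt.append(a0 * x + b0 + sigma * g0)       # first child  X_{u0}
            nxt.append(a1 * x + b1 + c * g0 + s * g1)  # second child X_{u1}
        X.append(nxt)
    return X

X = simulate_bar(10)
print([len(g) for g in X[:4]])  # [1, 2, 4, 8]
```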
2.2. Assumptions
We assume that $\mu$ is an invariant probability measure for ${\mathcal Q}$ .
We first state some regularity assumptions on the kernels ${\mathcal P}$ and ${\mathcal Q}$ and the invariant measure $\mu$ we will use later on. Notice first that by Cauchy–Schwarz we have, for $f,g\in L^4(\mu)$ ,
\begin{equation*}{\mathcal P}(\,f\otimes_{\textrm{sym}} g)^2 \leq {\mathcal Q}\big(\,f^2\big)\, {\mathcal Q}\big(g^2\big),\end{equation*}
so that, as $\mu$ is an invariant measure of ${\mathcal Q}$ ,
\begin{equation*}\mathop{\parallel\! {\mathcal P} (\,f \otimes _{\textrm{sym}} g) \! \parallel}\nolimits_{L^{2}(\mu)}\leq \big\langle \mu, {\mathcal Q}\big(\,f^2\big)\, {\mathcal Q}\big(g^2\big) \big\rangle^{1/2}\leq \mathop{\parallel\! f \! \parallel}\nolimits_{L^{4}(\mu)} \mathop{\parallel\! g \! \parallel}\nolimits_{L^{4}(\mu)},\end{equation*}
where we use Jensen’s inequality for the second inequality. Similarly, for $f,g\in L^2(\mu)$,
\begin{equation*}\mathop{\parallel\! {\mathcal P} (\,f \otimes _{\textrm{sym}} g) \! \parallel}\nolimits_{L^{1}(\mu)}\leq \mathop{\parallel\! f \! \parallel}\nolimits_{L^{2}(\mu)} \mathop{\parallel\! g \! \parallel}\nolimits_{L^{2}(\mu)}.\end{equation*}
We shall in fact assume that ${\mathcal P}$ (in fact only its symmetrized version) is in a sense an $L^2(\mu)$ operator; see also Remark 2.2 below.
Assumption 2.2. There exists an invariant probability measure $\mu$ for the Markov transition kernel ${\mathcal Q}$ .
(i) There exists a finite constant M such that for all $f,g,h \in L^{2}(\mu)$ ,
(6) \begin{align}\mathop{\parallel\! {\mathcal P} ({\mathcal Q} f \otimes _{\textrm{sym}} {\mathcal Q} g) \! \parallel}\nolimits_{L^{2}(\mu)}&\leq M \mathop{\parallel\! f \! \parallel}\nolimits_{L^{2}(\mu)} \mathop{\parallel\! g \! \parallel}\nolimits_{L^{2}(\mu)} ,\end{align}(7) \begin{align}\mathop{\parallel\! {\mathcal P} \!\left({\mathcal P}({\mathcal Q} f \otimes _{\textrm{sym}} {\mathcal Q} g)\otimes _{\textrm{sym}} {\mathcal Q} h\right) \! \parallel}\nolimits_{L^{2}(\mu)}&\leq M \mathop{\parallel\! f \! \parallel}\nolimits_{L^{2}(\mu)} \mathop{\parallel\! g \! \parallel}\nolimits_{L^{2}(\mu)} \mathop{\parallel\! h \! \parallel}\nolimits_{L^{2}(\mu)} ,\end{align}(8) \begin{align}\mathop{\parallel\! {\mathcal P} (\,f \otimes _{\textrm{sym}} {\mathcal Q} g) \! \parallel}\nolimits_{L^{2}(\mu)}&\leq M \mathop{\parallel\! {f} \! \parallel}\nolimits_{L^{4}(\mu)} \mathop{\parallel\! g \! \parallel}\nolimits_{L^{2}(\mu)}. \end{align} -
(ii) There exists $k_0\in {\mathbb N}$ such that the probability measure $\nu {\mathcal Q}^{k_0}$ has a bounded density, say $\nu_0$ , with respect to $\mu$ . That is,
\begin{equation*}\nu {\mathcal Q}^{k_0}(dy) = \nu_0(y) \mu(dy)\quad \text{and} \quad \mathop{\parallel\! {\nu_0} \! \parallel}\nolimits_\infty <+\infty .\end{equation*}
Remark 2.2. Let $\mu$ be an invariant probability measure of ${\mathcal Q}$ . If there exists a finite constant M such that for all $f,g \in L^{2}(\mu)$ ,
(9) \begin{equation*}\mathop{\parallel\! {\mathcal P} (\,f \otimes _{\textrm{sym}} g) \! \parallel}\nolimits_{L^{2}(\mu)}\leq M \mathop{\parallel\! f \! \parallel}\nolimits_{L^{2}(\mu)} \mathop{\parallel\! g \! \parallel}\nolimits_{L^{2}(\mu)},\end{equation*}
then we deduce that (6), (7), and (8) hold. Condition (9) is much more natural and simpler than the latter ones, and it allows us to give shorter proofs. However, Condition (9) appears to be too strong even in the simplest case of the symmetric BAR model developed in Example 2.1 with $a_0=a_1$ and $b_0=b_1$ . Let a denote the common value of $a_0$ and $a_1$ . In fact, according to the value of $a\in ({-}1, 1)$ in the symmetric BAR model, there exists $k_1\in {\mathbb N}$ such that for all $f,g \in L^{2}(\mu)$ ,
(10) \begin{equation*}\mathop{\parallel\! {\mathcal P} \big({\mathcal Q}^{k_1} f \otimes _{\textrm{sym}} {\mathcal Q}^{k_1} g\big) \! \parallel}\nolimits_{L^{2}(\mu)}\leq M \mathop{\parallel\! f \! \parallel}\nolimits_{L^{2}(\mu)} \mathop{\parallel\! g \! \parallel}\nolimits_{L^{2}(\mu)},\end{equation*}
with $k_1$ increasing with $|a|$ . Since Assumption 2.2(i) is only necessary for the asymptotic normality in the case $|a|\in \big[0, 1/\sqrt{2}\big]$ (corresponding to the subcritical and critical regimes), it will be enough to consider $k_1=1$ (but not sufficient to consider $k_1=0$ ). For this reason, we consider (6), that is, (10) with $k_1=1$ . A similar remark holds for (7) and (8). In a sense Condition (10) (as well as similar extensions of (7) and (8)) is in the same spirit as Assumption 2.2(ii): one uses iterates of ${\mathcal Q}$ to get smoothness on the kernel ${\mathcal P}$ and the initial distribution $\nu$ .
Remark 2.3. Let $\mu$ be an invariant probability measure of ${\mathcal Q}$ and assume that the transition kernel ${\mathcal P}$ has a density, denoted by p, with respect to the measure $\mu^{\otimes 2}$ ; that is, ${\mathcal P}(x, dy, dz) = p(x,y,z) \, \mu(dy)\mu(dz)$ for all $x\in S$ . Then the transition kernel ${\mathcal Q}$ has a density, denoted by q, with respect to $\mu$ ; that is, ${\mathcal Q}(x, dy)= q(x,y) \mu(dy)$ for all $x \in S$ with $q(x,y)=2^{-1} \int_S (p(x,y,z)+p(x,z,y))\, \mu(dz)$ . We set
(11) \begin{equation*}{\mathfrak h}(x)= \bigg(\int_S q(x,y)^2\, \mu(dy)\bigg)^{\!1/2}.\end{equation*}
Assume that
(12) \begin{equation*}\mathop{\parallel\! {\mathcal P} ({\mathfrak h} \otimes {\mathfrak h}) \! \parallel}\nolimits_{L^{2}(\mu)}<+\infty ,\end{equation*}(13) \begin{equation*}\mathop{\parallel\! {\mathcal P} \!\left({\mathcal P}({\mathfrak h} \otimes {\mathfrak h})\otimes _{\textrm{sym}} {\mathfrak h}\right) \! \parallel}\nolimits_{L^{2}(\mu)}<+\infty ,\end{equation*}
and that there exists a finite constant C such that for all $f\in L^4(\mu)$ ,
(14) \begin{equation*}\mathop{\parallel\! {\mathcal P} (\,f \otimes _{\textrm{sym}} {\mathfrak h}) \! \parallel}\nolimits_{L^{2}(\mu)}\leq C \mathop{\parallel\! {f} \! \parallel}\nolimits_{L^{4}(\mu)}.\end{equation*}
Since $|{\mathcal Q} f| \leq \mathop{\parallel\! f \! \parallel}\nolimits_{L^{2}(\mu)} {\mathfrak h}$ , we deduce that (12), (13), and (14) imply respectively (6), (7), and (8).
We consider the following ergodic property of ${\mathcal Q}$ , which in particular implies that $\mu$ is indeed the unique invariant probability measure for ${\mathcal Q}$ . We refer to [Reference Douc, Moulines, Priouret and Soulier8, Section 22] for a detailed account of $L^2(\mu)$ -ergodicity (see in particular Definition 22.2.2 on the exponentially convergent Markov kernel).
Assumption 2.3. The Markov kernel ${\mathcal Q}$ has a (unique) invariant probability measure $\mu$ , and ${\mathcal Q}$ is $L^2(\mu)$ exponentially convergent; that is, there exist $\alpha \in (0,1)$ and M finite such that for all $f \in L^{2}(\mu)$ ,
\begin{equation*}\mathop{\parallel\! {\mathcal Q}^n f - \langle \mu, f \rangle \! \parallel}\nolimits_{L^{2}(\mu)}\leq M\, \alpha^{n} \mathop{\parallel\! f \! \parallel}\nolimits_{L^{2}(\mu)} \quad\text{for all } n\in {\mathbb N}.\end{equation*}
We now consider a stronger ergodic property, based on a second spectral gap. (Notice in particular that Assumption 2.4 implies Assumption 2.3.)
Assumption 2.4. The Markov kernel ${\mathcal Q}$ has a (unique) invariant probability measure $\mu$ , and there exist $\alpha \in (0,1)$ ; a finite nonempty set J of indices; distinct complex eigenvalues $\{\alpha_j, \, j\in J\}$ of the operator ${\mathcal Q}$ with $|\alpha_j|=\alpha$ ; nonzero complex projectors $\{{\mathcal R}_j, \, j\in J\}$ defined on ${\mathbb C} L^2(\mu)$ , the ${\mathbb C}$ -vector space spanned by $L^2(\mu)$ , such that ${\mathcal R}_j\circ {\mathcal R}_{j^{\prime}}={\mathcal R}_{j^{\prime}}\circ {\mathcal R}_{j}=0$ for all $j\neq j^{\prime}$ (so that $\sum_{j\in J} {\mathcal R}_j$ is also a projector defined on ${\mathbb C} L^2(\mu)$ ); and a positive sequence $(\beta_n, n\in {\mathbb N})$ converging to $0$ , such that for all $f\in L^2(\mu)$ , with $\theta_j=\alpha_j/\alpha$ ,
(16) \begin{equation*}\Big\Vert\, {\mathcal Q}^n f - \langle \mu, f \rangle - \alpha^n \sum_{j\in J} \theta_j^n\, {\mathcal R}_j f \,\Big\Vert_{L^{2}(\mu)} \leq \beta_n\, \alpha^{n} \mathop{\parallel\! f \! \parallel}\nolimits_{L^{2}(\mu)} \quad\text{for all } n\in {\mathbb N}.\end{equation*}
Assumptions 2.3 and 2.4 stated in an $L^2$ framework correspond to [Reference Bitseki Penda and Delmas4, Assumptions 2.4 and 2.6] stated in a pointwise framework. The structural Assumption 2.2 on the transition kernel ${\mathcal P}$ replaces the structural assumption [Reference Bitseki Penda and Delmas4, Assumption 2.2] on the set of functions considered.
Remark 2.4. Assume that ${\mathcal Q}$ has a density q with respect to an invariant probability measure $\mu$ such that ${\mathfrak h}\in L^2(\mu)$ , where ${\mathfrak h}$ is defined in (11); that is,
\begin{equation*}\int_{S^2} q(x,y)^2\, \mu(dx)\, \mu(dy) <+\infty .\end{equation*}
Then the operator ${\mathcal Q}$ is a nonnegative Hilbert–Schmidt operator (and thus a compact operator) on $L^2(\mu)$ . It is well known that in this case, except for the possible value 0, the spectrum of ${\mathcal Q}$ is equal to the set $\sigma_{p}({\mathcal Q})$ of eigenvalues of ${\mathcal Q}$ ; $\sigma_{p}({\mathcal Q})$ is a countable set with 0 as the only possible accumulation point, and for all $\lambda \in \sigma_{p}({\mathcal Q})\setminus \{0\}$ , the eigenspace associated to $\lambda$ is finite-dimensional (we refer for example to [Reference Beauzany2, Chapter 4] for more details). In particular, if 1 is the only eigenvalue of ${\mathcal Q}$ with modulus 1 and if it has multiplicity 1 (that is, the corresponding eigenspace is reduced to the constant functions), then Assumptions 2.3 and 2.4 also hold. Let us mention that $q(x,y)>0$ $\mu(dx)\otimes \mu(dy)$ -almost surely (a.s.) is a standard condition which implies that 1 is the only eigenvalue of ${\mathcal Q}$ with modulus 1 and that it has multiplicity 1; see for example [Reference Baxter and Rosenthal1].
2.3. Notation for averages of different functions over different generations
Let $X=(X_u, u\in \mathbb{T})$ be a BMC on $(S, {\mathscr S}\,)$ with initial probability distribution $\nu$ and probability kernel ${\mathcal P}$ . Recall that ${\mathcal Q}$ is the induced Markov kernel. We shall assume that $\mu$ is an invariant probability measure of ${\mathcal Q}$ . For a finite set $A\subset \mathbb{T}$ and a function $f\in {\mathcal B}(S)$ , we set
\begin{equation*}M_A(\,f)=\sum_{i\in A} f(X_i).\end{equation*}
We shall be interested in the cases $A=\mathbb{G}_n$ (the nth generation) and $A=\mathbb{T}_n$ (the tree up to the nth generation). We recall from [Reference Guyon9, Theorem 11 and Corollary 15] that under a geometric ergodicity assumption, for f a continuous bounded real-valued function defined on S we have the following convergences in $L^2(\mu)$ (resp. a.s.):
(17) \begin{equation*}\lim_{n\rightarrow \infty } |\mathbb{G}_n|^{-1}\, M_{\mathbb{G}_n}(\,f) = \langle \mu, f \rangle \quad\text{and}\quad \lim_{n\rightarrow \infty } |\mathbb{T}_n|^{-1}\, M_{\mathbb{T}_n}(\,f) = \langle \mu, f \rangle .\end{equation*}
Using Lemma 5.1 and the Borel–Cantelli theorem, one can prove that we also have (17) with the $L^2(\mu)$ and a.s. convergences under Assumptions 2.2(ii) and 2.3.
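The law-of-large-numbers behavior in (17) can be checked by simulation. The sketch below uses the symmetric Gaussian BAR of Example 2.1 with illustrative values $a_0=a_1=0.5$, $b_0=b_1=0$, $\rho=0$, $\sigma=1$ (so $\mu={\mathcal N}(0,\sigma^2/(1-a^2))$) and the function $f(x)=x^2$:

```python
import random

# Symmetric BAR (a0 = a1 = a, b = 0, rho = 0): the invariant law of Q is
# N(0, sigma^2 / (1 - a^2)). We compare the empirical average of f(x) = x^2
# over G_n with <mu, f>. Parameter values are illustrative.
a, sigma, n = 0.5, 1.0, 14
rng = random.Random(1)
gen = [0.0]
for _ in range(n):
    # Each node of the current generation produces two children with
    # independent Gaussian noise.
    gen = [a * x + sigma * rng.gauss(0, 1) for x in gen for _ in (0, 1)]

avg = sum(x * x for x in gen) / len(gen)   # |G_n|^{-1} M_{G_n}(f)
target = sigma ** 2 / (1 - a ** 2)         # <mu, f> = sigma_a^2 = 4/3
print(avg, target)
```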
We shall now consider the corresponding fluctuations. We will frequently use the following notation:
\begin{equation*}\tilde f= f - \langle \mu, f \rangle \quad\text{for } f\in L^1(\mu).\end{equation*}
In order to study the asymptotics of $ M_{\mathbb{G}_{n-\ell }}\big(\tilde f\big)$ , we shall consider the contribution of the descendants of the individual $i\in \mathbb{T}_{n-\ell}$ for $n\geq \ell\geq 0$ :
\begin{equation*}M_{i\mathbb{G}_{n-|i|-\ell}}\big(\tilde f\big)= \sum_{j \in i\mathbb{G}_{n-|i|-\ell}} \tilde f(X_j),\end{equation*}
where $i\mathbb{G}_{n-|i|-\ell}=\{ij, \, j\in\mathbb{G}_{n-|i|-\ell}\}\subset \mathbb{G}_{n-\ell}$ . For all $k\in {\mathbb N}$ such that $n\geq k+\ell$ , we have
\begin{equation*}M_{\mathbb{G}_{n-\ell}}\big(\tilde f\big)= \sum_{i\in \mathbb{G}_k} M_{i\mathbb{G}_{n-|i|-\ell}}\big(\tilde f\big).\end{equation*}
Let ${\mathfrak f}=(\,f_\ell, \ell\in {\mathbb N})$ be a sequence of elements of $L^1(\mu)$ . We set, for $n\in {\mathbb N}$ and $i\in \mathbb{T}_n$ ,
\begin{equation*}N_{n,i}({\mathfrak f})= |\mathbb{G}_n|^{-1/2 } \sum_{\ell=0}^{n-|i|} M_{i \mathbb{G}_{n-|i|-\ell}}\big(\tilde f_\ell\big).\end{equation*}
We deduce that $ \sum_{i\in \mathbb{G}_k} N_{n,i}({\mathfrak f})=|\mathbb{G}_n|^{-1/2 }\sum_{\ell=0}^{n-k}M_{\mathbb{G}_{n-\ell}}\big(\tilde f_\ell\big)$ , which gives, for $k=0$ ,
(20) \begin{equation*}N_{n,\emptyset }({\mathfrak f})= |\mathbb{G}_n|^{-1/2 } \sum_{\ell=0}^{n} M_{\mathbb{G}_{n-\ell}}\big(\tilde f_\ell\big).\end{equation*}
The notation $N_{n, \emptyset}$ means that we consider the average from the root $\emptyset$ to the nth generation.
Remark 2.5. We shall consider in particular the following two simple cases. Let $f\in L^1(\mu)$ and consider the sequence ${\mathfrak f}=(\,f_\ell, \ell\in {\mathbb N})$ . If $f_0=f$ and $f_\ell=0$ for $\ell\in {\mathbb N}^*$ , then we get
\begin{equation*}N_{n,\emptyset }({\mathfrak f})= |\mathbb{G}_n|^{-1/2 }\, M_{\mathbb{G}_{n}}\big(\tilde f\big).\end{equation*}
If $f_\ell=f$ for $\ell \in {\mathbb N}$ , then we shall write ${\bf f}=(\,f,f, \ldots)$ , and we get, as $|\mathbb{T}_n|=2^{n+1} - 1 $ and $|\mathbb{G}_n|= 2^n$ ,
\begin{equation*}N_{n,\emptyset }({\bf f})= |\mathbb{G}_n|^{-1/2 }\, M_{\mathbb{T}_{n}}\big(\tilde f\big).\end{equation*}
Thus, we will deduce the fluctuations of $M_{\mathbb{T}_n}(\,f)$ and $M_{\mathbb{G}_n}(\,f)$ from the asymptotics of $N_{n, \emptyset}({\mathfrak f})$ .
Because of Assumption 2.2(ii) (which roughly states that after $k_0$ generations, the distribution of the induced Markov chain is absolutely continuous with respect to the invariant measure $\mu$ ), it is better to consider only generations $k\geq k_0$ for some $k_0\in {\mathbb N}$ and thus remove the first $k_0-1$ generations in the quantity $N_{n,\emptyset}({\mathfrak f}) $ defined in (20). To study the asymptotics of $N_{n, \emptyset}({\mathfrak f})$ , it is convenient to write, for $n\geq k\geq 1$ ,
\begin{equation*}N_{n,\emptyset }({\mathfrak f})= |\mathbb{G}_n|^{-1/2 } \sum_{\ell=n-k+1}^{n} M_{\mathbb{G}_{n-\ell}}\big(\tilde f_\ell\big) + \sum_{i\in \mathbb{G}_k} N_{n,i}({\mathfrak f}).\end{equation*}
If ${\bf f}=(\,f,f, \ldots)$ is the infinite sequence in which each term is the same function f, this becomes
\begin{equation*}N_{n,\emptyset }({\bf f})= |\mathbb{G}_n|^{-1/2 }\, M_{\mathbb{T}_{k-1}}\big(\tilde f\big) + \sum_{i\in \mathbb{G}_k} N_{n,i}({\bf f}).\end{equation*}
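The bookkeeping of this subsection can be checked numerically: for the constant sequence ${\bf f}$ and $f(x)=x$, the quantity $N_{n,\emptyset}({\bf f})=|\mathbb{G}_n|^{-1/2}M_{\mathbb{T}_n}(\tilde f)$ must coincide exactly with the contribution of the first $k-1$ generations plus the subtree terms $N_{n,i}({\bf f})$, $i\in\mathbb{G}_k$. A sketch on a simulated symmetric BAR tree (illustrative parameters):

```python
import random

# Symmetric BAR tree: gens[m] lists the traits of generation G_m in order, so
# the descendants of node i of G_k within G_m occupy a contiguous block.
a, sigma, n, k = 0.5, 1.0, 10, 3
rng = random.Random(2)
gens = [[0.0]]
for _ in range(n):
    gens.append([a * x + sigma * rng.gauss(0, 1) for x in gens[-1] for _ in (0, 1)])

ftilde = lambda x: x - 0.0        # f(x) = x, <mu, f> = 0 for the symmetric BAR
half = 2.0 ** (-n / 2)            # |G_n|^{-1/2}

# N_{n,empty}(f) = |G_n|^{-1/2} * M_{T_n}(ftilde)
N_root = half * sum(ftilde(x) for g in gens for x in g)

# |G_n|^{-1/2} M_{T_{k-1}}(ftilde): generations 0, ..., k-1
head = half * sum(ftilde(x) for g in gens[:k] for x in g)

# sum over i in G_k of N_{n,i}(f): sum of ftilde over the subtree of i
# from generation k down to generation n, times |G_n|^{-1/2}
tail = 0.0
for i in range(2 ** k):
    for m in range(k, n + 1):
        w = 2 ** (m - k)          # number of descendants of i in G_m
        tail += half * sum(ftilde(x) for x in gens[m][i * w:(i + 1) * w])

print(abs(N_root - (head + tail)))  # 0 up to floating-point rounding
```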
3. Main results
3.1. The subcritical case: $2\alpha^2<1$
We shall consider, when well defined, for a sequence ${\mathfrak f}=(\,f_\ell,\ell\in {\mathbb N})$ of measurable real-valued functions defined on S, the quantities
where
The proof of the next result is detailed in Section 5.
Theorem 3.1. Let X be a BMC with kernel ${\mathcal P}$ and initial distribution $\nu$ such that Assumptions 2.2 and 2.3 are in force with $\alpha\in \big(0, 1/\sqrt{2}\big)$ . We have the following convergence in distribution for any sequence ${\mathfrak f}=(\,f_\ell, \ell\in {\mathbb N})$ that is bounded in $L^4(\mu)$ (that is, such that $\sup_{\ell\in {\mathbb N}} \mathop{\parallel\! {f_\ell} \! \parallel}\nolimits_{L^4(\mu)}<+\infty $ ):
(25) \begin{equation*}N_{n, \emptyset}({\mathfrak f}) \xrightarrow[n\rightarrow \infty ]{\textrm{(d)}} G,\end{equation*}
where G is a centered Gaussian random variable with variance $\Sigma^{\textrm{sub}}({\mathfrak f})$ given by (22) which is well defined and finite.
Notice that the variance $\Sigma^{\textrm{sub}}({\mathfrak f})$ already appears in the subcritical pointwise-approach case; see [Reference Bitseki Penda and Delmas4, (15) and Theorem 3.1]. Then, arguing similarly as in [Reference Bitseki Penda and Delmas4, Section 3.1], we deduce that if Assumptions 2.2 and 2.3 are in force with $\alpha\in \big(0, 1/\sqrt{2}\big)$ , then for $f\in L^{4}(\mu)$ , we have the following convergence in distribution:
\begin{equation*}|\mathbb{G}_n|^{-1/2}\, M_{\mathbb{G}_n}\big(\tilde f\big) \xrightarrow[n\rightarrow \infty ]{\textrm{(d)}} G_1 \quad\text{and}\quad |\mathbb{T}_n|^{-1/2}\, M_{\mathbb{T}_n}\big(\tilde f\big) \xrightarrow[n\rightarrow \infty ]{\textrm{(d)}} G_2,\end{equation*}
where $G_{1}$ and $G_2$ are centered Gaussian random variables with respective variances $\Sigma^{\textrm{sub}}_{\mathbb{G}}(\,f)=\Sigma^{\textrm{sub}}({\mathfrak f})$ , with ${\mathfrak f} = (\,f,0,0, \ldots)$ , and $\Sigma^{\textrm{sub}}_{\mathbb{T}}(\,f)=\Sigma^{\textrm{sub}}({\bf f})/2$ , with $ {\bf f}=(\,f,f, \ldots)$ , given in [Reference Bitseki Penda and Delmas4, Corollary 3.3], which are well defined and finite.
3.2. The critical case: $2\alpha^2=1$
In the critical case $\alpha=1/\sqrt{2}$ , we shall denote by ${\mathcal R}_j$ the projector on the eigenspace associated to the eigenvalue $\alpha_j$ , with $\alpha_j=\theta_j \alpha$ and $|\theta_j|=1$ , for j in the finite set of indices J. Since ${\mathcal Q}$ is a real operator, we get that if $\alpha_j$ is a nonreal eigenvalue, then so is $\overline \alpha_j$ . We shall denote by $\overline {\mathcal R}_j$ the projector associated to $\overline \alpha_j$ . Recall that the sequence $(\beta_n, n\in {\mathbb N})$ in Assumption 2.4 is nonincreasing and bounded from above by 1. For any measurable real-valued function f defined on S, we set, when this is well defined,
We shall consider, when well defined, for a sequence ${\mathfrak f}=(\,f_\ell,\ell\in {\mathbb N})$ of measurable real-valued functions defined on S, the quantities
where
with, for $k, \ell\in {\mathbb N}$ ,
Notice that $f_{k, \ell}^*=f_{\ell, k}^*$ and that $f_{k, \ell}^*$ is real-valued as
for j ′ such that $\alpha_{j^{\prime}}=\overline\alpha _j$ and thus ${\mathcal R}_{j^{\prime}}=\overline {\mathcal R}_j$ .
The technical proof of the next result is omitted, as it is an adaptation of the proof of Theorem 3.1 in the subcritical case, in the same spirit as [Reference Bitseki Penda and Delmas4, Theorem 3.4] (critical case) is an adaptation of the proof of [Reference Bitseki Penda and Delmas4, Theorem 3.1] (subcritical case). The interested reader can find the details in [Reference Bitseki Penda and Delmas3].
Theorem 3.2. Let X be a BMC with kernel ${\mathcal P}$ and initial distribution $\nu$ such that Assumptions 2.2 (with $k_0\in {\mathbb N}$ ), 2.3, and 2.4 are in force with $\alpha = 1/\sqrt{2}$ . We have the following convergence in distribution for any sequence ${\mathfrak f}=(\,f_\ell,\ell\in {\mathbb N})$ that is bounded in $L^4(\mu)$ $\big($ that is, such that $\sup_{\ell\in {\mathbb N}}\mathop{\parallel\! {f_\ell} \! \parallel}\nolimits_{L^4(\mu)}<+\infty\big) $ :
(30) \begin{equation*}n^{-1/2}\, N_{n, \emptyset}({\mathfrak f}) \xrightarrow[n\rightarrow \infty ]{\textrm{(d)}} G,\end{equation*}
where G is a centered Gaussian random variable with variance $\Sigma^{\textrm{crit}}({\mathfrak f})$ given by (27), which is well defined and finite.
Notice that the variance $\Sigma^{\textrm{crit}}({\mathfrak f})$ already appears in the critical pointwise-approach case; see [Reference Bitseki Penda and Delmas4, (20) and Theorem 3.4]. Then, arguing similarly as in [Reference Bitseki Penda and Delmas4, Section 3.2], we deduce that if Assumptions 2.2 (with $k_0\in {\mathbb N}$ ), 2.3, and 2.4 are in force with $\alpha = 1/\sqrt{2}$ , then for $f\in L^{4}(\mu)$ , we have the following convergence in distribution:
\begin{equation*}\big(n\, |\mathbb{G}_n|\big)^{-1/2}\, M_{\mathbb{G}_n}\big(\tilde f\big) \xrightarrow[n\rightarrow \infty ]{\textrm{(d)}} G_1 \quad\text{and}\quad \big(n\, |\mathbb{T}_n|\big)^{-1/2}\, M_{\mathbb{T}_n}\big(\tilde f\big) \xrightarrow[n\rightarrow \infty ]{\textrm{(d)}} G_2,\end{equation*}
where $G_{1}$ and $G_2$ are centered Gaussian random variables with respective variances $\Sigma^{\textrm{crit}}_{\mathbb{G}}(\,f)=\Sigma^{\textrm{crit}}({\mathfrak f})$ , with ${\mathfrak f} = (\,f,0,0, \ldots)$ , and $\Sigma^{\textrm{crit}}_{\mathbb{T}}(\,f)=\Sigma^{\textrm{crit}}({\bf f})/2$ , with $ {\bf f}=(\,f,f, \ldots)$ , given in [Reference Bitseki Penda and Delmas4, Corollary 3.6], which are well defined and finite.
3.3. The supercritical case $2\alpha^2>1$
We consider the supercritical case $\alpha\in \big(1/\sqrt{2}, 1\big)$ . This case is very similar to the supercritical case in the pointwise approach; see [Reference Bitseki Penda and Delmas4, Section 3.3]. So we only mention the most interesting results without proof. The interested reader can find the details in [Reference Bitseki Penda and Delmas3].
We shall assume that Assumptions 2.2(ii) and 2.4 hold. In particular, we do not assume Assumption 2.2(i). Recall (16) with the eigenvalues $\big\{\alpha_j=\theta_j \alpha, j\in J\big\} $ of ${\mathcal Q}$ , with modulus equal to $\alpha$ (i.e. $|\theta_j|=1$ ) and the projector ${\mathcal R}_j$ on the eigenspace associated to eigenvalue $\alpha_j$ . Recall that the sequence $(\beta_n, n\in {\mathbb N})$ in Assumption 2.4 can (and will) be chosen to be nonincreasing and bounded from above by 1. We shall consider the filtration ${\mathcal H}=({\mathcal H}_n, n\in {\mathbb N})$ defined by ${\mathcal H}_{n} = \sigma(X_{i},i\in\mathbb{T}_{n})$ . The next lemma exhibits martingales related to the projector ${\mathcal R}_j$ .
Lemma 3.1. Let X be a BMC with kernel ${\mathcal P}$ and initial distribution $\nu$ such that Assumptions 2.2(ii) and 2.4 are in force with $\alpha\in \big(1/\sqrt{2}, 1\big)$ in (16). Then, for all $j\in J$ and $f\in L^{2}(\mu)$ , the sequence $M_{j}(\,f)=\big(M_{n,j}(\,f), n\in {\mathbb N}\big)$ , with
\begin{equation*}M_{n,j}(\,f)= (2\alpha_j)^{-n}\, M_{\mathbb{G}_n}({\mathcal R}_j f),\end{equation*}
is an ${\mathcal H}$ -martingale which converges a.s. and in $L^{2}(\nu)$ to a random variable, say $M_{\infty,j}(\,f)$ .
The next result corresponds to [Reference Bitseki Penda and Delmas4, Corollary 3.13] in the pointwise approach.
Corollary 3.1. Let X be a BMC with kernel ${\mathcal P}$ and initial distribution $\nu$ such that Assumptions 2.2(ii) and 2.4 are in force with $\alpha \in \big(1/\sqrt{2},1\big) $ in (16). Assume $\alpha$ is the only eigenvalue of ${\mathcal Q}$ with modulus equal to $\alpha$ (and thus J is reduced to a singleton, say $\{j_0\}$ ). Then, for $f\in L^{2}(\mu)$ , we have
\begin{equation*}\lim_{n\rightarrow \infty } (2\alpha)^{-n}\, M_{\mathbb{G}_n}\big(\tilde f\big)= M_{\infty, j_0}(\,f) \quad\text{in probability},\end{equation*}
where $M_{\infty, j_0}(\,f)$ is the random variable defined in Lemma 3.1.
4. Application to the study of symmetric BAR
4.1. Symmetric BAR
We consider a particular case from [Reference Cowan and Staudte7] of the real-valued bifurcating autoregressive process (BAR) from Example 2.1. We keep the same notation. Let $a\in ({-}1, 1)$ and assume that $a=a_{0} = a_{1}$ , $b_{0} = b_{1} = 0$ , and $\rho = 0$ . In this particular case the BAR has a symmetric kernel, as
\begin{equation*}{\mathcal P}(x, dy\, dz)= {\mathcal Q}(x, dy)\, {\mathcal Q}(x, dz).\end{equation*}
We have ${\mathcal Q} f(x)={\mathbb E}[f(ax +\sigma G)]$ and more generally
\begin{equation*}{\mathcal Q}^n f(x)={\mathbb E}\Big[\,f\Big(a^n x +\sigma_a \sqrt{1- a^{2n}}\, G\Big)\Big] \quad\text{for } n\in {\mathbb N},\end{equation*}
where G is a standard ${\mathcal N}(0, 1)$ Gaussian random variable and $\sigma_a=\sigma \big(1- a^2\big)^{-1/2}$ . The kernel ${\mathcal Q}$ admits a unique invariant probability measure $\mu$ , which is ${\mathcal N}\big(0, \sigma_a^2\big)$ and whose density, still denoted by $\mu$ , with respect to the Lebesgue measure is given by
\begin{equation*}\mu(x)= \frac{1}{\sqrt{2\pi \sigma_a^2}}\, \exp\!\Big({-}\frac{x^2}{2\sigma_a^2}\Big).\end{equation*}
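A quick arithmetic sanity check (with illustrative values of a and $\sigma$): one ${\mathcal Q}$-step maps a centered Gaussian with variance v to one with variance $a^2v+\sigma^2$, and $\sigma_a^2$ is the unique fixed point of this map, reached geometrically fast from any starting variance:

```python
# Invariance of mu = N(0, sigma_a^2), sigma_a^2 = sigma^2 / (1 - a^2):
# x -> a x + sigma G sends N(0, v) to N(0, a^2 v + sigma^2).
a, sigma = 0.7, 1.3                    # arbitrary illustrative values, |a| < 1
var_a = sigma ** 2 / (1 - a ** 2)      # sigma_a^2

fixed_point_err = abs(a ** 2 * var_a + sigma ** 2 - var_a)

# Iterating Q from any Gaussian starting variance converges to sigma_a^2.
v = 9.0
for _ in range(50):
    v = a ** 2 * v + sigma ** 2
print(fixed_point_err, abs(v - var_a))
```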
The densities p (resp. q) of the kernel ${\mathcal P}$ (resp. ${\mathcal Q}$ ) with respect to $\mu^{\otimes 2}$ (resp. $\mu$ ) are given by
\begin{equation*}p(x,y,z)= q(x,y)\, q(x,z)\end{equation*}
and
\begin{equation*}q(x,y)= \frac{\sigma_a}{\sigma}\, \exp\!\Big({-}\frac{(y- a x)^2}{2\sigma^2} + \frac{y^2}{2\sigma_a^2}\Big).\end{equation*}
Notice that q is symmetric. The operator ${\mathcal Q}$ $\big($ in $L^2(\mu)\big)$ is a symmetric integral Hilbert–Schmidt operator whose eigenvalues are given by $\sigma_{p}({\mathcal Q}) = (a^{n}, n \in \mathbb{N})$ , their algebraic multiplicity is one, and the corresponding eigenfunctions $(\bar{g}_{n}(x), n \in \mathbb{N})$ are defined for $n\in {\mathbb N}$ by
\begin{equation*}\bar{g}_{n}(x)= g_{n}(x/\sigma_a),\end{equation*}
where $g_{n}$ is the Hermite polynomial of degree n ( $g_0=1$ and $g_1(x)=x$ ). Let ${\mathcal R}$ be the orthogonal projection on the vector space generated by $\bar{g}_{1}$ ; that is, ${\mathcal R} f= \langle \mu, f\bar g_1 \rangle\, \bar g_1$ , or equivalently, for $x\in {\mathbb R}$ ,
(31) \begin{equation*}{\mathcal R} f(x)= \sigma_a^{-2}\, x \int_{{\mathbb R}} y\, f(y)\, \mu(y)\, dy .\end{equation*}
Recall ${\mathfrak h}$ defined in (11). It is not difficult to check that
\begin{equation*}{\mathfrak h}(x)= \Big(\frac{\sigma_a}{\sigma}\Big)^{\!1/2} \big(1+a^2\big)^{-1/4}\, \exp\!\Big(\frac{a^2 x^2}{2\, (1+a^2)\, \sigma_a^2}\Big),\end{equation*}
and ${\mathfrak h} \in L^2(\mu)$ $\big($ that is, $\int_{{\mathbb R}^2} q(x,y)^2 \, \mu(x) \mu(y)\, dx dy<+\infty\big) $ . Using elementary computations, it is possible to check that ${\mathcal Q} {\mathfrak h}\in L^4(\mu)$ if and only if $|a|< 3^{-1/4}$ $\big($ whereas ${\mathfrak h}\in L^4(\mu)$ if and only if $|a|<3^{-1/2}\big)$ . As ${\mathcal P}$ is symmetric, we get ${\mathcal P}\big({\mathfrak h}\otimes ^2\big)\leq ({\mathcal Q}{\mathfrak h})^2$ and thus (12) holds for $|a|< 3^{-1/4}$ . We also get, using the Cauchy–Schwarz inequality, that
and thus (14) holds for $|a|< 3^{-1/4}$ . Some elementary computations give that (13) also holds for $|a|\leq 0.724$ (but (13) fails for $|a|\geq 0.725$ ). $\big($ Notice that $2^{-1/2}< 0.724< 3^{-1/4}\big)$ . As a consequence of Remark 2.3, if $|a|\leq 0.724$ , then (6)–(8) are satisfied and thus Assumption 2.2(i) holds.
Notice that $\nu {\mathcal Q}^k$ is the probability distribution of $a^k X_\emptyset + \sigma_a \sqrt{1- a^{2k}}\, G$ , with G an ${\mathcal N}(0, 1)$ random variable independent of $X_\emptyset$ . So Assumption 2.2(ii) holds in particular if $\nu$ has compact support (with $k_0=1$ ) or if $\nu$ has a density with respect to the Lebesgue measure, which we still denote by $\nu$ , such that $\mathop{\parallel\! {\nu/\mu} \! \parallel}\nolimits_\infty $ is finite (with $k_0\in {\mathbb N}$ ). Notice that if $\nu$ is the probability distribution of ${\mathcal N}\big(0, \rho_0^2\big)$ , then $\rho_0> \sigma_a$ (resp. $\rho_0\leq \sigma_a$ ) implies that Assumption 2.2(ii) fails (resp. is satisfied).
Using the fact that $\big(\bar g_n/\sqrt{n!}, \, n\in {\mathbb N}\big)$ is an orthonormal basis of $L^2(\mu)$ and Parseval’s identity, it is easy to check that Assumption 2.4 holds with $J=\{j_0\}$ , $\alpha_{j_0} = \alpha = a$ , $\beta_n = a^{n}$ , and ${\mathcal R}_{j_0}={\mathcal R}$ .
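The eigenvalue identity ${\mathcal Q}\bar g_n = a^n \bar g_n$ is an instance of Mehler's formula: in the normalized variable $u=x/\sigma_a$, one ${\mathcal Q}$-step becomes $u\mapsto au+\sqrt{1-a^2}\,G$. The sketch below verifies the identity exactly (no sampling), by expanding $g_n\big(au+\sqrt{1-a^2}\,Z\big)$ as a polynomial in u and taking Gaussian moments of Z:

```python
import math

def hermite(n):
    # Probabilists' Hermite polynomials as coefficient lists (low degree first):
    # He_0 = 1, He_1 = x, He_{n+1}(x) = x He_n(x) - n He_{n-1}(x).
    polys = [[1.0], [0.0, 1.0]]
    for m in range(1, n):
        p = [0.0] + polys[m]              # x * He_m
        for i, c in enumerate(polys[m - 1]):
            p[i] -= m * c                 # - m * He_{m-1}
        polys.append(p)
    return polys[n]

def gauss_moment(k):
    # E[Z^k] for Z ~ N(0,1): 0 for odd k, (k-1)!! for even k.
    return 0.0 if k % 2 else float(math.prod(range(k - 1, 0, -2)))

def Q_hermite(n, a):
    # Exact coefficients (in u) of E[He_n(a u + sqrt(1 - a^2) Z)].
    s = math.sqrt(1 - a * a)
    out = [0.0] * (n + 1)
    for k, ck in enumerate(hermite(n)):
        for j in range(k + 1):
            out[j] += ck * math.comb(k, j) * a ** j * s ** (k - j) * gauss_moment(k - j)
    return out

a = 0.6   # illustrative value, |a| < 1
errs = [max(abs(x - a ** n * y) for x, y in zip(Q_hermite(n, a), hermite(n)))
        for n in range(6)]
print(errs)  # all ~ 0: Q maps He_n(./sigma_a) to a^n He_n(./sigma_a)
```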
4.2. Numerical studies: illustration of phase transitions for the fluctuations
We consider the symmetric BAR model from Section 4.1 with $a=\alpha\in (0, 1)$ . Recall that $\alpha$ is an eigenvalue with multiplicity one, and we denote by ${\mathcal R}$ the orthogonal projection on the one-dimensional eigenspace associated to $\alpha$ . The expression for ${\mathcal R}$ is given in (31).
In order to illustrate the effects of the geometric rate of convergence $\alpha$ on the fluctuations, we plot for ${\mathbb A}_n\in \{\mathbb{G}_n, \mathbb{T}_n\}$ the slope, say $b_{\alpha,n}$ , of the regression line ${\log}\big({\textrm{Var}}\big(|{\mathbb A}_{n}|^{-1}M_{{\mathbb A}_{n}}(\,f)\big)\big) $ versus $ {\log}(|{\mathbb A}_{n}|)$ as a function of the geometric rate of convergence $\alpha$ . In the classical cases (e.g. Markov chains), the points are expected to be distributed around the horizontal line $y = -1$ . For n large, we have ${\log}(|{\mathbb A}_{n}|)\simeq n\, {\log}(2)$ , and for the symmetric BAR model, the convergences in (25) for $\alpha<1/\sqrt{2}$ , (30) for $\alpha=1/\sqrt{2}$ , and Corollary 3.1 for $\alpha>1/\sqrt{2}$ yield that $b_{\alpha,n} \simeq h_{1}(\alpha) $ with $h_1(\alpha)=\,{\log}\big(\alpha^2\vee 2^{-1}\big)/{\log}(2)$ as soon as the limiting Gaussian random variable in (25) and (30) or $M_\infty (\,f)$ in Corollary 3.1 is nonzero.
For our illustrations, we consider the empirical moments of order $p\in \{1, \ldots, 4\}$ ; that is, we use the functions $f(x) = x^p$ . As we can see in Figures 1 and 2, these curves present two trends, with a phase transition around the rate $\alpha = 1/\sqrt{2}$ for $p \in \{1,3\}$ and around the rate $\alpha^{2} = 1/\sqrt{2}$ for $p \in \{2,4\}$ . For convergence rates $\alpha \in \big(0,1/\sqrt{2}\big)$ , the trend is similar to that of the classical cases. For convergence rates $\alpha \in \big(1/\sqrt{2},1\big)$ , the trend differs from the classical cases. One can observe that the slope $b_{\alpha,n}$ increases with the value of the geometric convergence rate $\alpha$ . We also observe that for $\alpha > 1/ \sqrt{2}$ , the empirical curve agrees with the graph of $h_1(\alpha)=\,{\log}\big(\alpha^2\vee 2^{-1}\big)/{\log}(2)$ for $f(x) = x^{p}$ when p is odd; see Figure 1. However, the empirical curve does not agree with the graph of $h_1$ for $f(x) = x^{p}$ when p is even (see Figure 2); instead, it agrees with the graph of the function $h_2(\alpha)=\,{\log}\big(\alpha^4\vee 2^{-1}\big)/{\log}(2)$ . This is due to the fact that for p even, the function $f(x)=x^p$ belongs to the kernel of the projector ${\mathcal R}$ (which is clear from the formula (31)), and thus $M_\infty (\,f)=0$ . In fact, in those two cases, one should take into account the projection on the eigenspace associated to the third eigenvalue, which in this particular case is equal to $\alpha^2$ . Intuitively, this indeed gives a rate of order $h_2$ . Therefore, the normalization given for $f(x)=x^p$ when p is even is not correct.
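The experiment can be reproduced along the following lines. This is a minimal sketch of our own (with arbitrary sample sizes), simulating the symmetric BAR with conditionally independent children and unit stationary variance, and estimating the slope $b_{\alpha,n}$ for $f(x)=x$ (so $p=1$ , for which $M_\infty(\,f)\neq 0$ ).

```python
import numpy as np

def bar_generations(a, n, rng):
    """Generations G_0..G_n of the symmetric BAR: each individual with
    trait x has two (conditionally independent) children a*x + noise,
    normalized so that mu = N(0, 1) is invariant."""
    s = np.sqrt(1.0 - a**2)
    gens = [rng.standard_normal(1)]          # root drawn from mu
    for _ in range(n):
        x = np.repeat(gens[-1], 2)           # two children per individual
        gens.append(a * x + s * rng.standard_normal(x.size))
    return gens

def slope(a, n=9, reps=400, seed=0):
    """Regression slope b_{alpha,n} of log Var(|G_n|^{-1} M_{G_n}(f))
    against log |G_n|, with f(x) = x."""
    rng = np.random.default_rng(seed)
    means = np.array([[g.mean() for g in bar_generations(a, n, rng)]
                      for _ in range(reps)])
    v = means.var(axis=0)[1:]                # one empirical variance per generation
    logsize = np.arange(1, n + 1) * np.log(2.0)
    return np.polyfit(logsize, np.log(v), 1)[0]

# Subcritical (2 a^2 < 1): slope near -1, as in the classical case.
# Supercritical (2 a^2 > 1): slope near h_1(a) = log(a^2) / log(2).
b_sub, b_sup = slope(0.3), slope(0.85)
print(b_sub, b_sup)
```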
5. Proof of Theorem 3.1
In the following proofs, we will denote by C any unimportant finite constant which may vary from line to line (in particular, C does not depend on n or on ${\mathfrak f}$ ).
Let $(p_n, n\in {\mathbb N})$ be a nondecreasing sequence of elements of ${\mathbb N}^*$ such that
When there is no ambiguity, we write p for $p_n$ .
Remark 5.1. We stress that in the critical case (corresponding to Theorem 3.2, for which a detailed proof can be found in [Reference Bitseki Penda and Delmas3]), the condition (32) must be strengthened as follows: for all $\lambda>0$ ,
Let $i,j\in \mathbb{T}$ . We write $i\preccurlyeq j$ if $j\in i\mathbb{T}$ . We denote by $i\wedge j$ the most recent common ancestor of i and j, defined as the unique $u\in \mathbb{T}$ such that $u\preccurlyeq i$ , $u\preccurlyeq j$ , and every $v\in \mathbb{T}$ with $ v\preccurlyeq i$ and $v \preccurlyeq j$ satisfies $v \preccurlyeq u$ . We also define the lexicographic order $i\leq j$ if either $i \preccurlyeq j$ or $v0 \preccurlyeq i$ and $v1 \preccurlyeq j$ for $v=i\wedge j$ . Let $X=(X_i, i\in \mathbb{T})$ be a BMC with kernel ${\mathcal P}$ and initial measure $\nu$ . For $i\in \mathbb{T}$ , we define the $\sigma$ -field
By construction, the $\sigma$ -fields $({\mathcal F}_{i}; \, i\in \mathbb{T})$ are nested, as ${\mathcal F}_{i}\subset {\mathcal F}_{j} $ for $i\leq j$ .
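When nodes of $\mathbb{T}$ are encoded as binary strings (with the root $\emptyset$ encoded as the empty string), the relations above admit a direct implementation; a small illustrative sketch:

```python
def preceq(i, j):
    """i precedes j in the genealogical order: j belongs to the subtree
    i*T, i.e. the string i is a prefix of j."""
    return j.startswith(i)

def mrca(i, j):
    """Most recent common ancestor i ^ j: the longest common prefix."""
    k = 0
    while k < min(len(i), len(j)) and i[k] == j[k]:
        k += 1
    return i[:k]

def leq(i, j):
    """Lexicographic order: i <= j iff i is an ancestor of j, or i lies
    in the '0' subtree and j in the '1' subtree of v = i ^ j."""
    v = mrca(i, j)
    return preceq(i, j) or (preceq(v + "0", i) and preceq(v + "1", j))

print(mrca("010", "011"), leq("", "010"), leq("010", "011"), leq("011", "010"))
```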
For $n\in {\mathbb N}$ , $i\in \mathbb{G}_{n-p_n}$ , and ${\mathfrak f} = (\,f_{\ell}, \ell \in \mathbb{N})$ a bounded sequence in $L^{4}(\mu)$ , we define the martingale increments as follows:
Thanks to (19), we have
Using the branching Markov property and (19), we get for $i\in \mathbb{G}_{n-p_n}$
We deduce from (21) with $k=n-p_n$ that
with
A quick overview of our strategy
As a first step, we prove that $R_{0}(n)$ and $R_{1}(n)$ , which appear in (34), converge in probability to 0 (see Remark 5.2 and Lemmas 5.2–5.3). Then we shall prove a central limit theorem for the martingale $\Delta_n({\mathfrak f})$ defined in (33), by first proving the convergence of its bracket V(n), which is defined in (43) (see Lemma 5.7, which is a consequence of the technical Lemmas 5.4, 5.5, and 5.6), and checking that Lindeberg’s condition holds using a fourth moment condition (see Lemma 5.8). Notice that we use the condition (7) in this latter part only. Then we conclude, as $N_{n,\emptyset}({\mathfrak f})$ and $\Delta_{n}({\mathfrak f})$ have the same asymptotic behavior.
We first state a very useful lemma which holds in the subcritical, critical, and supercritical cases.
Lemma 5.1. Let X be a BMC with kernel ${\mathcal P}$ and initial distribution $\nu$ such that Assumption 2.2(ii) (with $k_0\in {\mathbb N}$ ) is in force. Then there exists a finite constant C such that for all $f\in {\mathcal B}_+(S)$ and all $n\geq k_0$ , we have
Proof. Using the first moment formula (74), Assumption 2.2(ii), and the fact that $\mu$ is invariant for ${\mathcal Q}$ , we get that
We also have
where we used the second moment formula (75) for the equality, (3) for the first inequality, Jensen’s inequality for the second, and Assumption 2.2(ii) and the fact that $\mu$ is invariant for ${\mathcal Q}$ for the last.
For $k\in {\mathbb N}^*$ , we set
As mentioned earlier, we will denote by C any unimportant finite constant, which may vary from line to line (in particular C does not depend on n or on ${\mathfrak f}$ , but may depend on $k_0$ and $\mathop{\parallel\! {\nu_0} \! \parallel}\nolimits_\infty $ ).
Remark 5.2. Recall $k_{0}$ given in Assumption 2.2(ii). Let ${\mathfrak f} = (\,f_{\ell}, \ell \in \mathbb{N})$ be a bounded sequence in $L^{4}(\mu)$ . We have
where we set
Using the Cauchy–Schwarz inequality, we get
Since the sequence ${\mathfrak f}$ is bounded in $L^{4}(\mu)$ and since $k_{0}$ is finite, we have, for all $\ell \in \{0, \ldots, k_{0}-1\}$ , that $\lim_{n \rightarrow \infty} |\mathbb{G}_{n}|^{-1/2} M_{\mathbb{G}_{\ell}}(|f_{n-\ell}|) = 0$ a.s. and then that (using (40))
Therefore, from (38), the study of $N_{n,\emptyset}({\mathfrak f})$ is reduced to that of $N^{[k_{0}]}_{n,\emptyset}({\mathfrak f})$ .
Recall that $(p_n, n\in {\mathbb N})$ is such that (32) holds. Assume that n is large enough so that $n-p_n -1\geq k_0$ . We have
where $\Delta_n({\mathfrak f})$ and $R_1(n)$ are defined in (33) and (35), and
Lemma 5.2. Under the assumptions of Theorem 3.1, we have the following convergence:
Proof. Assume $n-p\geq k_0$ . We write
We have that
where
We deduce from Assumption 2.2(ii) (see (36)) that ${\mathbb E}\Big[M_{\mathbb{G}_{k_0}}(h_{k,n})\Big]\leq C \langle \mu, h_{k, n}\rangle$ . We have also that
where we used (36) for the first inequality (notice one can take $k_0=0$ in this case as we consider the expectation ${\mathbb E}_\mu$ ), (15) in the second, and $2\alpha^2<1$ in the last. We deduce that
using the triangle inequality for the $L^2({\mathbb P})$ norm. As
by Jensen’s inequality, we obtain, with $a_i= {\mathbb E}\Big[M_{i\mathbb{G}_{k-k_0}}\big(\tilde f_{n-k}\big)^2\Big]^{1/2}$ , that
where we used that the sequence ${\mathfrak f}$ is bounded in $L^2(\mu)$ for the last inequality. We use that $\lim_{n\rightarrow \infty } p=\infty $ to conclude.
We have the following lemma.
Lemma 5.3. Under the assumptions of Theorem 3.1, we have the following convergence:
Proof. For $p\geq \ell \geq 0$ , $n-p\geq k_0$ , and $j\in \mathbb{G}_{k_0}$ , we set
so that $R_1(n)=\sum_{\ell=0}^{p} \sum_{j\in \mathbb{G}_{k_0}} R_{1,j} (\ell,n )$ . For $i\in \mathbb{G}_{n-p}$ we have
where we used the definition (18) of $N_{n,i}^\ell$ for the first equality, the Markov property of X for the second, and (74) for the third. Using (42), we get for $j\in \mathbb{G}_{k_0}$
We deduce from the Markov property of X that ${\mathbb E}[R_{1,j}(\ell,n)^2|\, {\mathcal F}_j]=2^{-n+2(p-\ell)} \,h_{\ell, n} (X_j)$ with
Thanks to Assumption 2.2(ii) (see (36)), we have that
We have
where we used (36) for the first inequality (notice one can take $k_0=0$ in this case as we consider the expectation ${\mathbb E}_\mu$ ), (15) in the second, and $2\alpha^2<1$ in the last. We deduce that
We get that
with the sequence $(a_{1, n},n\in {\mathbb N})$ defined by
The sequence $(a_{1, n}, n\in {\mathbb N})$ does not depend on ${\mathfrak f}$ and converges to 0, since $\lim_{n \rightarrow \infty} p = \infty$ , $2\alpha^{2} < 1$ , and
Then we use that ${\mathfrak f}$ is bounded in $L^2(\mu)$ to conclude.
Remark 5.3. From the proofs of Lemmas 5.2 and 5.3, we have that $\mathbb{E}\Big[\Big(N_{n,\emptyset}^{[k_{0}]}({\mathfrak f}) - \Delta_{n}({\mathfrak f})\Big)^{2}\Big] \leq a_{0,n} c_{2}({\mathfrak f})$ , where the sequence $(a_{0,n}, n\in \mathbb{N})$ converges to 0 as n goes to infinity.
We now study the central limit theorem for $\Delta_{n}({\mathfrak f}).$ First, we study the bracket of $\Delta_n$ :
with
Lemma 5.4. Under the assumptions of Theorem 3.1, we have the following convergence:
Proof. We define the sequence $(a_{2, n}, n\in {\mathbb N})$ for $n\in {\mathbb N}$ by
Notice that the sequence $(a_{2, n}, n\in {\mathbb N})$ converges to 0, since $\lim_{n\rightarrow \infty }p=\infty $ , $2\alpha^2<1$ , and
We now compute $ {\mathbb E}_x\!\left[R_2(n)\right]$ :
where we used the definition of $N_{n,i}({\mathfrak f})$ for the first equality, the Markov property of X for the second, and (74) for the third. From the latter equality, we have using Assumption 2.2(ii) that
We deduce that
then use that ${\mathfrak f}$ is bounded in $L^2(\mu)$ to conclude.
Remark 5.4. In particular, we have obtained from the previous proof that $\mathbb{E}[|V(n) - V_{1}(n) - V_{2}(n)|] \leq C c_{2}^{2}({\mathfrak f}) a_{2,n}$ , with the sequence $(a_{2,n}, n \in \mathbb{N})$ going to 0 as n goes to infinity.
Lemma 5.5. Under the assumptions of Theorem 3.1, we have that in probability $ \lim_{n\rightarrow \infty } V_2(n) =\Sigma^{\textrm{sub}}_2({\mathfrak f})$ with $\Sigma^{\textrm{sub}}_2({\mathfrak f})$ finite and defined in (24).
Proof. Using (76), we get
with
We consider the term $V_6(n)$ . We have
with
Define $H_6({\mathfrak f})=\sum_{0\leq \ell< k;\, r\geq 0} h_{k,\ell,r}$ with
Thanks to (5) and (15), we get that
We deduce that $|h_{k,\ell,r}|\leq C \, 2^{r-\ell} \alpha^{k-\ell +2r} c_2^2({\mathfrak f})$ and, as the sum $\sum_{0\leq \ell<k, \, r\geq 0}2^{r-\ell} \alpha^{k-\ell +2r}$ is finite,
We write $H_6({\mathfrak f})=H_6^{[n]}({\mathfrak f})+ B_{6,n}({\mathfrak f})$ , with
As $\lim_{n\rightarrow \infty } {\bf 1}_{\{r+k\geq p\}}=0$ , we get from (47), (48), and dominated convergence that $\lim_{n\rightarrow \infty } B_{6,n}({\mathfrak f})=0$ and thus
We set
so that from the definition of $V_{6}(n)$ , we get that
We now study the second moment of $ |\mathbb{G}_{n-p}|^{-1}\, M_{\mathbb{G}_{n-p}}(A_{6,n}({\mathfrak f}))$ . Using (36), for $n-p\geq k_0$ we get
Recall $c_k({\mathfrak f})$ and $q_k({\mathfrak f})$ from (37). We deduce that
where we used the triangle inequality for the first inequality; (15) for the second; (6) for $r\geq1$ and (15) again for the third; (8) for $r=0$ to get the $c_4({\mathfrak f})$ term and $c_2({\mathfrak f})\leq c_4({\mathfrak f})$ for the fourth; and the fact that $ \sum_{{0\leq \ell< k,\, r\geq 0}} 2^{r-\ell} \alpha^{k -\ell+2r} $ is finite for the last. As $\sum_{j=0}^{\infty } \big(2\alpha^2\big)^j$ is finite, we deduce that
We now consider the term $V_5(n)$ defined just after (45):
with
Define $H_5({\mathfrak f})=\sum_{0\leq \ell< k} h_{k,\ell}$ with $h_{k,\ell}= 2^{-\ell} \left\langle \mu, \tilde f_k {\mathcal Q}^{k-\ell} \tilde f_\ell\right\rangle$ . We have using the Cauchy–Schwarz inequality and (15) that
As the sum $\sum_{0\leq \ell<k}2^{-\ell} \alpha^{k-\ell}$ is finite, we deduce that
We write $H_5({\mathfrak f})=H_5^{[n]}({\mathfrak f})+ B_{5,n}({\mathfrak f})$ , with
As $\lim_{n\rightarrow \infty } {\bf 1}_{\{k> p\}}=0$ , we deduce from (51) and (52) that $\lim_{n\rightarrow \infty } B_{5,n}({\mathfrak f})=0$ by dominated convergence, and thus
We set
so that from the definition of $V_{5}(n)$ , we get that
We now study the second moment of $ |\mathbb{G}_{n-p}|^{-1}\, M_{\mathbb{G}_{n-p}} \big(A_{5,n}({\mathfrak f})\big) $ . Using (36), for $n-p\geq k_0$ we get
We also have that
where we used the triangle inequality for the first inequality, (15) for the second, and the Cauchy–Schwarz inequality for the last. As $\sum_{j=0}^{\infty }\big(2\alpha^2\big)^j$ is finite, we deduce that
Since $c_2({\mathfrak f}) \leq c_4({\mathfrak f})$ , we deduce from (50) and (57), as $V_2(n)=V_5(n)+V_6(n)$ (see (45)), that
Since, according to (49) and (54), $\Sigma^{\textrm{sub}}_2({\mathfrak f})=H_6({\mathfrak f})+H_5({\mathfrak f})$ (see (24)), we get $\lim_{n\rightarrow \infty } H_2^{[n]}({\mathfrak f})=\Sigma^{\textrm{sub}}_2({\mathfrak f})$ . This implies that $\lim_{n\rightarrow \infty }V_2(n)=\Sigma^{\textrm{sub}}_2({\mathfrak f})$ in probability.
We now study the limit of $V_1(n)$ .
Lemma 5.6. Under the assumptions of Theorem 3.1, we have that in probability $\lim_{n\rightarrow \infty } V_1(n) =\Sigma^{\textrm{sub}}_1({\mathfrak f})<+\infty $ with $\Sigma^{\textrm{sub}}_1({\mathfrak f})$ finite and defined in (23).
Proof. Using (75), we get
with
We first consider the term $V_4(n)$ . We have
with
Define the constant $H_4({\mathfrak f})= \sum_{\ell\geq 0, \, k\geq 0} h_{\ell, k} $ with $h_{\ell, k} = 2^{k-\ell}\, \left\langle \mu,{\mathcal P}\!\left({\mathcal Q}^k \tilde f_\ell \otimes ^2 \right) \right\rangle$ . Thanks to (3) and (15), we have
and thus, as the sum $\sum_{\ell\geq 0, \, k\geq 0} 2^{k-\ell}\alpha^{2k} $ is finite,
We write $H_4({\mathfrak f})=H_4^{[n]}({\mathfrak f})+ B_{4,n}({\mathfrak f})$ , with
Using that $\lim_{n\rightarrow \infty } {\bf 1}_{\{\ell+k\geq p\}}=0$ , we deduce from (59), (60), and dominated convergence that $\lim_{n\rightarrow \infty } B_{4,n}({\mathfrak f})=0$ , and thus
We set
so that from the definition of $V_{4}(n)$ , we get that
We now study the second moment of $ |\mathbb{G}_{n-p}|^{-1}\, M_{\mathbb{G}_{n-p}}(A_{4,n}({\mathfrak f}))$ . Using (36), for $n-p\geq k_0$ we get
Using (3), we obtain that
We deduce that
where we used the triangle inequality for the first inequality; (15) for the second; (6) for $k\geq1$ and (15) again for the third; and (3) as well as $c_2({\mathfrak f})\leq c_4({\mathfrak f})$ for the last. As $\sum_{j=0}^{\infty } \big(2\alpha^2\big)^j$ is finite, we deduce that
We now consider the term $V_3(n)$ defined just after (58):
with
Define the constant $H_3({\mathfrak f})=\sum_{\ell\geq 0} h_\ell$ with $h_\ell= 2^{-\ell} \left\langle\mu, \tilde f_\ell^2 \right\rangle=\left\langle \mu, h^{(n)}_\ell \right\rangle$ . As $h_\ell \leq \mathop{\parallel\! f_\ell \! \parallel}\nolimits_{L^{2}(\mu)}^2\leq c_2^2({\mathfrak f})$ , we get that $H_3({\mathfrak f})\leq 2 c_2^2({\mathfrak f})$ . We write $H_3({\mathfrak f})=H_3^{[n]}({\mathfrak f})+ B_{3,n}({\mathfrak f})$ , with
As $\lim_{n\rightarrow \infty } {\bf 1}_{\{\ell> p\}}=0$ , we get from dominated convergence that $\lim_{n\rightarrow \infty } B_{3,n}({\mathfrak f})=0$ , and thus
We set
so that from the definition of $V_{3}(n)$ , we get that
We now study the second moment of $ |\mathbb{G}_{n-p}|^{-1}\,M_{\mathbb{G}_{n-p}}(A_{3,n}({\mathfrak f}))$ . Using (36), for $n-p\geq k_0$ we get
We have that
where we used the triangle inequality for the first inequality and (15) for the third. As $\sum_{j=0}^{\infty } \big(2\alpha^2\big)^j$ is finite, we deduce that
Since $c_2({\mathfrak f}) \leq c_4({\mathfrak f})$ , we deduce from (62) and (65) that
Since, according to (61) and (63), $\Sigma^{\textrm{sub}}_1({\mathfrak f})=H_4({\mathfrak f})+H_3({\mathfrak f})$ (see (23)), we get
This implies that $\lim_{n\rightarrow \infty }V_1(n)=\Sigma^{\textrm{sub}}_1({\mathfrak f})$ in probability.
The next lemma is a direct consequence of (44) and Lemmas 5.4, 5.5, and 5.6.
Lemma 5.7. Under the assumptions of Theorem 3.1, we have $\lim_{n\rightarrow\infty } V(n)=\Sigma^{\textrm{sub}}({\mathfrak f})$ in probability, where, with $\Sigma^{\textrm{sub}}_1({\mathfrak f})$ and $\Sigma^{\textrm{sub}}_2({\mathfrak f})$ defined by (23) and (24), we have
We now check Lindeberg’s condition using a fourth moment condition. We set
Lemma 5.8. Under the assumptions of Theorem 3.1, we have that $\lim_{n\rightarrow\infty } R_3(n)=0$ .
Proof. We have
where we used that $\big(\sum_{k=0}^r a_k\big)^4 \leq (r+1)^3 \sum_{k=0}^r a_k^4$ for the two inequalities (with $r=1$ and $r=p$ , respectively), as well as Jensen’s inequality and (33) for the first and (19) for the last. Using (18), we get
so that
Using (36) (with f and n replaced by $h_{n, \ell}$ and $n-p$ ), we get that
Now we give the main steps to get an upper bound on ${\mathbb E}_\mu\!\left[M_{\mathbb{G}_{p-\ell}}\big(\tilde f_\ell\big) ^4\right]$ . Recall that
We have
Now we consider the case $0\leq \ell \leq p-3$ . Let the functions $\psi_{j,p-\ell}$ , with $1\leq j\leq 9$ , be as in Lemma 6.2, with f replaced by $\tilde f_\ell$ , so that for $\ell\in \{0, \ldots, p-3\}$ ,
We now assume that $p-\ell-1\geq 2$ . We shall give bounds on $\langle \mu, \psi_{j,p-\ell}\rangle$ based on computations similar to those in the second step in the proof of Theorem 2.1 in [Reference Bitseki Penda, Djellout and Guillin6]. We set $h_k={\mathcal Q}^{k-1} \tilde f_\ell$ , so that for $k\in {\mathbb N}^{*}$ ,
We recall the notation $f\otimes f=f\otimes ^2$ . We deduce for $k\geq 2$ from (6) applied with $h_k={\mathcal Q} h_{k-1}$ and for $k=1$ from (4) and (70) that
Upper bound on $\langle \mu, |\psi_{1,p-\ell}| \rangle$ . We have
Upper bound on $|\langle \mu, \psi_{2,p-\ell} \rangle|$ . Using Lemma 6.3 for the second inequality and (70) for the third, we get
Upper bound on $\langle \mu, |\psi_{3,p-\ell}| \rangle$ . Using (5), we easily get
Upper bound on $\langle \mu, |\psi_{4,p-\ell}| \rangle$ . Using (5) and then (71) with $p-\ell -1\geq 2$ , we get
Upper bound on $\langle \mu, |\psi_{5,p-\ell}| \rangle$ . We have
with
Using (5) and then (71), we get
We deduce that $\langle \mu,| \psi_{5, p-\ell}| \rangle \leq C \, 2^{2(p-\ell)} \, c_4^4({\mathfrak f})$ .
Upper bound on $\langle \mu, |\psi_{6,p-\ell} |\rangle$ . We have
with
Using (5) and then (71), we get
We deduce that $\langle \mu,| \psi_{6, p-\ell}| \rangle\leq C\, 2^{2(p-\ell)} \, \, c_4^4({\mathfrak f}) $ .
Upper bound on $|\langle \mu, \psi_{7,p-\ell} \rangle|$ . We have
with
For $k\leq p-\ell -2$ , we have
where we used (5) for the first inequality, (6) for the second, and (70) for the third. We now consider the case $k= p-\ell -1$ . Let $g\in {\mathcal B}_+(S)$ . As $2ba^2\leq b^3+ a^3$ for a, b nonnegative, we get that $g\otimes g^2 \leq g^3\otimes _{\textrm{sym}} {\bf 1}$ , and thus
Writing $A_r=\Gamma_{p-\ell-1,r}^{[7]}$ , we get, using (72) for the first inequality and Lemma 6.3 for the second,
Since $c_2({\mathfrak f}) \leq c_4({\mathfrak f})$ , we deduce that $ |\langle\mu, \psi_{7, p-\ell} \rangle|\leq C\,2^{2(p-\ell)} \, c_4^4({\mathfrak f})$ .
Upper bound on $\langle \mu, |\psi_{8,p-\ell} |\rangle$ . We have
with
Using (5) and then (71) (twice, and noticing that $p-\ell-r\geq 2$ ), we get
We deduce that $ \langle \mu, |\psi_{8, p-\ell} |\rangle\leq C\,2^{2(p-\ell)} \,c_4^4({\mathfrak f})$ .
Upper bound on $\langle \mu, |\psi_{9,p-\ell} |\rangle$ . We have
with
For $r\leq k-2$ , we have
where we used (5) for the first inequality; (6) as $p-\ell -r \geq 2$ and $k-r-1\geq 1$ for the second; and (70) (twice) and (71) (once) for the last. For $r= k-1$ and $k\leq p-\ell -2$ , we have
where we used (5) for the first inequality; (7) as $p-\ell -k \geq 2$ for the second (notice this is the only place in the proof of Theorem 3.1 where we use (7)); and (70) (three times) for the last. For $r= k-1= p-\ell -2$ , we have
where we used (5) for the first inequality, (3) (with f replaced by $f_\ell$ ) for the second, and (6) as well as (71) (with $p-\ell -j\geq 2$ ) for the last. Taking all of this together, we deduce that $\langle \mu,| \psi_{9, p-\ell}| \rangle\leq C\, 2^{2(p-\ell)}\, c_4^2({\mathfrak f})\, c_2^2({\mathfrak f})$ .
Combining all the upper bounds with (69), we deduce that for $\ell\in \{0, \ldots, p-3\}$ ,
Thanks to (68), this inequality holds for $\ell\in \{0,\ldots, p\}$ . We deduce from (67) that
This proves that $\lim_{n \rightarrow \infty} R_{3}(n) = 0$ .
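The elementary bound $\big(\sum_{k=0}^r a_k\big)^4 \leq (r+1)^3 \sum_{k=0}^r a_k^4$ used at the start of the proof is Jensen's inequality for $x\mapsto x^4$ applied to the uniform average; a quick numerical sanity check (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
ratios = []
for _ in range(1000):
    r = int(rng.integers(1, 20))
    a = rng.standard_normal(r + 1)
    # Jensen: ((r+1)^{-1} sum a_k)^4 <= (r+1)^{-1} sum a_k^4,
    # which rearranges to (sum a_k)^4 <= (r+1)^3 sum a_k^4.
    ratios.append(a.sum() ** 4 / ((r + 1) ** 3 * np.sum(a**4)))
print(max(ratios))  # the ratio never exceeds 1
```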
We can now use Theorem 3.2 and Corollary 3.1 from [Reference Hall and Heyde10, p. 58] and the remark from [Reference Hall and Heyde10, p. 59] to deduce from Lemmas 5.7 and 5.8 that $\Delta_n({\mathfrak f})$ converges in distribution towards a Gaussian real-valued random variable with deterministic variance $\Sigma^{\textrm{sub}}({\mathfrak f})$ given by (22). Using (34), Remark 5.2, and Lemmas 5.2 and 5.3, we then deduce Theorem 3.1.
6. Moments formula for BMCs
Let $X=(X_i, i\in \mathbb{T})$ be a BMC on $(S, {\mathscr S}\,)$ with probability kernel ${\mathcal P}$ . Recall that $|\mathbb{G}_n|=2^n$ and $M_{\mathbb{G}_n}(\,f)=\sum_{i\in \mathbb{G}_n} f(X_i)$ . We also recall that $2{\mathcal Q}(x,A)={\mathcal P}(x, A\times S) + {\mathcal P}(x, S\times A)$ for $A\in {\mathscr S}\,$ . We use the convention that $\sum_\emptyset=0$ .
We recall the following well-known and easy-to-establish many-to-one formulas for BMCs.
Lemma 6.1. Let $f,g\in {\mathcal B}(S)$ , $x\in S$ , and $n\geq m\geq 0$ . Assuming that all the quantities below are well defined, we have
We also give some bounds on ${\mathbb E}_x\!\left[M_{\mathbb{G}_{n}}(\,f) ^4\right]$ ; see the proof of Theorem 2.1 in [Reference Bitseki Penda, Djellout and Guillin6]. We will use the notation
Lemma 6.2. There exists a finite constant C such that for any $f\in {\mathcal B}(S)$ , $n\in {\mathbb N}$ , and $\nu$ a probability measure on S, assuming that all the quantities below are well defined, there exist functions $\psi_{j, n}$ for $1\leq j\leq 9$ such that
and, with $h_{k}= {\mathcal Q}^{k - 1} (\,f) $ (and notice that either $|\psi_j|$ or $|\langle \nu,\psi_j \rangle|$ is bounded), writing $\nu g=\langle\nu , g \rangle$ ,
We shall use the following lemma to bound the term $ | \nu \psi_{2, n}|$ .
Lemma 6.3. Let $\mu$ be an invariant probability measure on S for ${\mathcal Q}$ . Let $f,g\in L^4(\mu)$ . Then, for all $r \in {\mathbb N}$ , we have
Proof. We have
where we used Hölder’s inequality and the fact that $v\otimes w=(v\otimes 1) \,(1\otimes w)$ for the first inequality; the fact that ${\mathcal P}(v\otimes 1) \leq 2 {\mathcal Q} v$ and ${\mathcal P}(1\otimes v) \leq 2 {\mathcal Q} v$ if v is nonnegative for the second inequality; and Jensen’s inequality and the fact that $\mu$ is invariant for ${\mathcal Q}$ for the last.
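As an elementary check of the first many-to-one formula of Lemma 6.1, namely ${\mathbb E}_x[M_{\mathbb{G}_{n}}(\,f)] = 2^n\, {\mathcal Q}^n f(x)$ , one can compare it with the branching recursion on a finite state space. The sketch below is our own toy example on $S=\{0,1\}$ , with arbitrary kernel values.

```python
import numpy as np

# A toy kernel P on S = {0, 1}: P[x, y, z] = P(x, (y, z)); each row sums to 1.
P = np.array([[[0.10, 0.30], [0.40, 0.20]],
              [[0.25, 0.25], [0.15, 0.35]]])

# 2 Q(x, y) = P(x, {y} x S) + P(x, S x {y})
Q = 0.5 * (P.sum(axis=2) + P.sum(axis=1))

f = np.array([1.0, -2.0])

def expected_sum(x, n):
    """E_x[M_{G_n}(f)] computed from the branching recursion:
    condition on the traits (y, z) of the two children of the root."""
    if n == 0:
        return f[x]
    e = [expected_sum(y, n - 1) for y in (0, 1)]
    return sum(P[x, y, z] * (e[y] + e[z]) for y in (0, 1) for z in (0, 1))

n = 4
lhs = np.array([expected_sum(x, n) for x in (0, 1)])
rhs = 2**n * np.linalg.matrix_power(Q, n) @ f   # many-to-one formula
print(lhs, rhs)
```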
Funding information
There are no funding bodies to thank in relation to the creation of this article.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process for this article.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/apr.2022.3.