1 Introduction
Let A be a countable set, called an alphabet. Consider a measurable function $\phi : A\times A^{\{-1,-2, \ldots \}}\to \mathbb {R}$ such that $\sum _{a\in A}e^{\phi (a,x)}=1$ for all $x\in A^{\{-1,-2, \ldots \}}$ . The function $\phi $ is called a normalized potential, and the probability kernel $g:= e^{\phi }$ (also known as g-function) is a natural generalization of Markov kernels. Let $\eta =(\eta _n)_{\mathbb {Z}}$ be the canonical projections on $A^{\mathbb {Z}}$ , that is, for all $x\in A^{\mathbb {Z}}$ and all $n\in \mathbb {Z}$ , $\eta _n(x)=x_n$ . For $y \in A^{\{0,-1, \ldots \}}$ , let $\mu ^y$ be the probability measure on $A^{\mathbb {Z}}$ such that $\mu ^y[(\eta _0,\eta _{-1},\ldots ) \in B]=\delta _{y} [B]$ for all $B\subset A^{\{0, -1, \ldots \}}$ measurable. For $n\geq 0$ , $\mu ^y[\eta _{n+1}=a\mid (\eta _n,\eta _{n-1},\ldots )=x]=e^{\phi (a,x)}$ for every $a\in A$ and $\mu ^y$ -a.e. (almost every) x in $A^{\{-1,-2, \ldots \}}$ . Let T denote the shift operator on $A^{\{0,-1, \ldots \}}$ . We indicate by $d_{\text {TV}}$ the total variation distance, that is, if P and Q are two probability measures on the same $\sigma $ -algebra $\mathcal {F}$ ,
$$d_{\text {TV}}(P,Q):=\sup _{B\in \mathcal {F}}|P(B)-Q(B)|.$$
In this paper, we obtain upper bounds, respectively, for the relaxation rate
the mixing rate
and the decay rate of correlations
when $\tilde {\mu }$ is the unique shift-invariant measure compatible with $\phi $ (see the next section for the definition of compatibility) and $f, \hat {f}$ are suitable functions (see Theorem 3.6). Bressaud, Fernández and Galves [BFG99] and Pollicott [Pol00, Proposition 1] obtained upper bounds for $L(n), M(n)$, and $\rho _{f,\hat {f}}(n)$ for potentials of summable variations and finite alphabets. Gouëzel [Gou04] obtained sharp lower bounds for the decay of correlations for dynamics with Hölder continuous (that is, exponentially decaying) potentials and countable alphabet. Our contribution is twofold. First, we obtain upper bounds for $L(n), M(n),$ and $\rho _{f,\hat {f}}(n)$ when the variation rate $ \operatorname {{\mathrm {var}}}_k(\phi )$ decays as $\mathcal {O}(k^{-(1/2+\delta ')})$ for any $ \delta '> 0$. Second, our results also hold for a countably infinite alphabet A. Theorem 3.2 is our main result, showing a new upper bound for the coupling error between $\mu ^y$ and $\mu ^z$. Corollary 3.3 answers a question posed in [JÖ08], in which the authors ask for a bound for $L(n)$ when the variation of $\phi $ is not summable. Corollary 3.4 shows a bound for $M(n)$, which cannot be achieved by simply using the union bound and Corollary 3.3. The result is new even for the case of summable variations. The interest in $M(n)$ stems from the fact that it is the natural generalization of mixing times for Markov chains. Gallesco, Gallo and Takahashi [GGT18] showed that $M(n)$ converges to $0$ only when $\operatorname {{\mathrm {var}}}_k(\phi )$ is square summable $\tilde {\mu }$-a.s. (almost surely) and hence Corollary 3.4 covers the main cases of interest. Theorem 3.6 gives an upper bound for the speed of decay of correlations, extending Theorem 3.2 in [BFG99].
Johansson, Öberg and Pollicott [JOP12] showed, when the alphabet A is finite, that there is a unique shift-invariant measure $\tilde {\mu }$ compatible with $\phi $ when $\operatorname {{\mathrm {var}}}_k(\phi ) \in o(k^{-1/2})$. Moreover, Berger, Hoffman and Sidoravicius [BHS18] proved that, for any $\delta> 0$, there exists a normalized potential $\phi $ with $\operatorname {{\mathrm {var}}}_k(\phi ) \in \mathcal {O}(k^{-(1/2-\delta )})$ that exhibits multiple compatible shift-invariant measures. Therefore, Theorem 3.6 also covers the main variation rates of interest under uniqueness of the compatible shift-invariant measure. In Corollary 3.9, we use Theorem 3.6 to obtain upper bounds on the rate of correlation decay for non-normalized potentials. We illustrate the application of our inequalities in three cases. The first application proves a novel weak invariance principle for additive functionals of dynamics with non-summable variations. The second application shows that we can obtain Hoeffding-type bounds for averages of random variables when the variation of $\phi $ is not summable. The third application illustrates how our results apply to a Poisson autoregression model, which is popular in applied work.
The proof technique is based on a renewal equation and coupling inequalities. These ideas were developed in [BFG99, CFF02, CQ98]. We improve on the coupling bounds obtained in [BFG99] by coupling blocks of coordinates instead of one coordinate at a time. A block coupling idea was used in [JOP12] to obtain sharp conditions for uniqueness of the equilibrium measure for $\phi $ on a finite alphabet A, but no mixing rate was obtained there. A difference between [JOP12] and our approach is that we upper bound the block coupling using different renewal processes, leading to a distinct renewal equation. This new renewal equation allows us to upper bound the speed of decay of the coupling inequality even when the variation is not summable (see Theorem 3.2).
2 Definitions
Let the alphabet A be a countable set, $\mathcal {X}=A^{\mathbb {Z}}$, and $\mathcal {X}_{-} = A^{\mathbb {Z}_-}$, where $\mathbb {Z}_-=\{0,-1,-2,\ldots \}$. We endow $\mathcal {X}$ and $\mathcal {X}_-$ with the product topology and its corresponding Borel $\sigma $-algebra. The topologies and $\sigma $-algebras considered on subsets of $\mathcal {X}$ and $\mathcal {X}_-$ will always be the trace topologies and $\sigma $-algebras. We denote by $x_i$ the ith coordinate of $x \in \mathcal {X}$ and, for $-\infty < i \leq j<\infty $, we write $x^{-i}_{-j}:=(x_{-i},\ldots , x_{-j})$, $x^{-i}_{-\infty }:=(x_{-i},x_{-i-1},\ldots )$, and $x_i^{\infty }:=(\ldots ,x_{i+1},x_{i})$. If $i < j$, $x^i_j = \emptyset $, where $\emptyset $ denotes the empty sequence. For $x\in \mathcal {X}$ and $y \in \mathcal {X}_{-}$, a concatenation $x^{0}_{-i}y$ is a new sequence $z\in \mathcal {X}_{-}$ with $z^{0}_{-i} = x^{0}_{-i}$ and $z^{-i-1}_{-\infty } = y$. We take $\emptyset $ to be the neutral element of the concatenation operation, that is, $\emptyset x=x$ for all $x\in \mathcal {X}_{-}$. Note that we are using the convention, consistent with the concatenation operation, that when we scan an element $x\in \mathcal {X}$ from left to right we go further into the past.
Consider a measurable function $\phi : \mathcal {X}_{-}\to \mathbb {R}$ , which we call a potential. We say that $\phi $ is normalized if it satisfies
$$\sum _{a\in A}e^{\phi (ax)}=1$$
for all $x\in \mathcal {X}_{-}$ . To a normalized potential $\phi $ we can associate a probability kernel g on the alphabet A by defining $g=e^{\phi }$ . The variation of order $k\geq 0$ of $\phi $ is defined by
When A is finite, the variation is usually defined by taking the supremum over $b \in A$ instead of the sum. Nevertheless, our definition is more convenient when the alphabet is infinite and has appeared in the literature before [Reference Chazottes, Gallo and TakahashiCGT20]. The constant $1/2$ in the definition relates $\operatorname {{\mathrm {var}}}_k(e^{\phi })$ to total variation distance when $\phi $ is normalized.
We also define, for $k\geq 0$ , the $\chi ^2$ -variation of order k of $\phi $ as
The use of $\chi ^2$ -variation to measure the regularity of potentials seems to be new; therefore, it is interesting to compare it to variation, which is more standard. When $\phi $ is normalized, by using the Cauchy–Schwarz inequality, we have that $ \operatorname {{\mathrm {var}}}^2_{k}(e^\phi )\leq \tfrac {1}{4}\chi ^2_{k}(\phi )$ for any $k\geq 1$ . When the alphabet A is finite and $\phi $ is normalized, $\operatorname {{\mathrm {var}}}^2_k(\phi )$ and $\chi ^2_k(\phi )$ are comparable, that is, there exist positive constants $K_1$ and $K_2$ such that $K_1\operatorname {{\mathrm {var}}}^2_k(\phi )\leq \chi ^2_k(\phi )\leq K_2\operatorname {{\mathrm {var}}}^2_k(\phi )$ . The $\chi ^2$ -variation introduced in this work will be particularly useful to study asymptotic properties of positive probability kernels on infinite A (cf. §8.3).
Let $\eta =(\eta _n)_{\mathbb {Z}}$ be the canonical projections on $\mathcal {X}$ , that is, for all $x\in \mathcal {X}$ , $\eta _n(x)=x_n$ for all $n\in \mathbb {Z}$ . We say that a probability measure $\mu $ on $\mathcal {X}$ is compatible with a normalized potential $\phi $ if there exists a probability measure P on $\mathcal {X}_-$ such that
for all $B\subset \mathcal {X}_-$ measurable and if, for $n\geq 0$ ,
for every $a\in A$ and $\mu $ -a.e. x in $\mathcal {X}_{-}$ . Johansson, Öberg and Pollicott [Reference Johansson, Öberg and PollicottJÖP07] showed that if
then there is at most one shift-invariant measure compatible with $\phi $. From [Rei12, Lemma 3.3.9], for $k \geq 0$, we have
and hence the summability of $\chi ^2_k(\phi )$ implies the existence of at most one shift-invariant compatible measure.
When $\phi $ is not normalized, the definition of a compatible measure loses its meaning. Nevertheless, we can associate a set of shift-invariant measures called equilibrium states for not necessarily normalized $\phi $ [Reference WaltersWal75]. Equilibrium states are characterized via a variational principle and coincide with shift-invariant compatible measures when $\phi $ is normalized. An equilibrium state $\tilde {\mu }$ compatible with a normalized $\phi $ is also called a g-measure in ergodic theory [Reference KeaneKea72]. In probability literature, g-measures are known as chains of complete connections [Reference Doeblin and FortetDF37, Reference Iosifescu and GrigorescuIG90], chains of infinite order [Reference HarrisHar55, Reference KeaneKea72], random-step Markov processes [Reference KalikowKal90], and uniform martingales [Reference KalikowKal90]. Compatible measures that are not necessarily shift-invariant are called g-chains [Reference Johansson, Öberg and PollicottJOP12] or stochastic chains of unbounded memory [Reference Gallesco, Gallo and TakahashiGGT18]. When there is more than one shift-invariant measure compatible with $\phi $ , we say that there is a phase transition, otherwise we say that the shift-invariant compatible measure is unique.
3 Results
In this paper, we will work under the following assumption.
Assumption ( $\mathcal {A}$ ).
$\phi $ is a potential on $\mathcal {X}_-$ such that for all $k\geq 1$ ,
for some $C>0$ and $\delta>0$ .
Remark 3.1. When the alphabet A is finite and $\phi $ is normalized, Assumption ( $\mathcal {A}$ ) is equivalent to
for some $C'> 0$ and the same $\delta $ as in (1). Observe that $ \operatorname {{\mathrm {var}}}_k(\phi )$ is not summable when $\delta \in (0, 1]$ .
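The role of the threshold $\delta = 1$ can be checked against the classical p-series. The display below is a short sketch, assuming the rate in Remark 3.1 is of order $k^{-(1+\delta )/2}$, as the exponents used elsewhere in this section suggest; it concerns the upper bound on the variation, not the variation itself.

```latex
\sum_{k\geq 1}\frac{C'}{k^{(1+\delta)/2}}<\infty
\;\Longleftrightarrow\;
\frac{1+\delta}{2}>1
\;\Longleftrightarrow\;
\delta>1,
\qquad\text{whereas}\qquad
\sum_{k\geq 1}\frac{(C')^{2}}{k^{1+\delta}}<\infty
\quad\text{for every }\delta>0 .
```

In other words, for $\delta \in (0,1]$ the bound is square summable but not summable, which is the regime targeted by the results below.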
Now, consider $\mathcal {X}\times \mathcal {X}$ with the projection maps $\hat {\eta }=(\hat {\eta }_n)_{n\in \mathbb {Z}}$ and $\hat {\omega }=(\hat {\omega }_n)_{n\in \mathbb {Z}}$ such that for $(x,y)\in \mathcal {X}\times \mathcal {X}$ , $\hat {\eta }_n(x,y)=x_n$ and $\hat {\omega }_n(x,y)=y_n$ for all $n\in \mathbb {Z}$ . Let us also denote by $\widehat {\mathcal {C}}(\phi )$ the set of probability measures P on $\mathcal {X}\times \mathcal {X}$ such that the pushforward measures $\hat {\eta }_*P$ and $\hat {\omega }_*P$ are compatible with $\phi $ . We also introduce the process $X=(X_n)_{n\geq 1}$ such that for all $n\geq 1$ ,
where $(K_n)_{n\geq 1}$ is a fixed strictly increasing sequence of natural numbers such that $K_1=1$ . Here is our main result followed by two corollaries.
Theorem 3.2. Let $\phi $ be a normalized potential that satisfies Assumption ( $\mathcal {A}$ ). Let $K_n=\lfloor n^{\beta } \rfloor $ for $\beta \geq 1$ and $\beta>1/\delta $ . For all measures $\mu $ and $\nu $ compatible with $\phi $ , there exists $\mathbb {P}\in \widehat {\mathcal {C}}(\phi )$ such that $\hat {\eta }_*\mathbb {P}=\mu $ , $\hat {\omega }_*\mathbb {P}=\nu $ , and, for $n\geq 1$ ,
where $C_1$ is a positive constant depending on $C, \delta $ , and $\beta $ .
Corollary 3.3. Let $\phi $ be a normalized potential that satisfies Assumption ( $\mathcal {A}$ ). If $\delta>1$ , we have for all $n\geq 1$ ,
where $C_2$ is a positive constant depending on C and $\delta $ .
If $\delta \in (0,1]$ , we have for all $n\geq 1$ and $\delta '<\delta $ ,
where $C_3$ is a positive constant that depends on C, $\delta $ , and $\delta '$ .
Corollary 3.4. Let $\phi $ be a normalized potential that satisfies Assumption ( $\mathcal {A}$ ). For all $\delta '<\delta $ , we have for all $n\geq 1$ ,
where $C_4$ is a positive constant that depends on C, $\delta $ , and $\delta '$ .
Remark 3.5. When $\delta>1$ and A is finite, we can use [Reference Bressaud, Fernández and GalvesBFG99, Theorem 1] and the union bound to obtain
where $C_5> 0$ is a constant that depends on C and $\delta $. Hence, the result in Corollary 3.4 gives a sharper upper bound, even when the variation of the potential is summable and the alphabet A is finite.
We now look at the correlations decay for the shift-invariant measure compatible with a potential $\phi $ . For this, we need the following definitions. Consider the shift operator $T: \mathcal {X}_{-}\to \mathcal {X}_{-}$ such that for all $x\in \mathcal {X}_{-}$ , $Tx=Tx_{-\infty }^0=x_{-\infty }^{-1}$ . For non-constant $\phi $ , let us consider the seminorm
and the subspace of $\mathcal {C}(\mathcal {X}_-,\mathbb {R})$ defined by
Theorem 3.6. Let $\phi $ be a normalized potential that satisfies Assumption ( $\mathcal {A}$ ). Assume that a shift-invariant probability measure $\tilde {\mu }$ compatible with $\phi $ exists. Let $f\in L^1(\tilde {\mu })$ and $\hat {f}\in V_{\phi }$ .
If $\delta>1$ , we have for all $n\geq 1$ ,
where $C_6$ is a positive constant that depends on C and $\delta $ .
If $\delta \in (0,1]$ , we have for all $n\geq 1$ and $\delta '<\delta $ ,
where $C_7$ is a positive constant that depends on C, $\delta $ , and $\delta '$ .
Remark 3.7. When $\delta>1$ and A is finite, Theorem 3.6 recovers the rate obtained in [Reference Bressaud, Fernández and GalvesBFG99, Theorem 1].
Remark 3.8. When A is finite, continuity of $\phi $ guarantees the existence of a compatible shift-invariant measure; therefore, the assumption on the existence of a compatible measure in Theorem 3.6 is redundant. When A is infinite, the existence of a shift-invariant compatible measure is not immediate. Sufficient conditions for existence of shift-invariant compatible measures when A is infinite are given in [Reference Fernández and MaillardFM05, Reference Johansson, Öberg and PollicottJÖP07]. See §8.3 for a concrete example. Whenever a shift-invariant compatible measure exists, Assumption ( $\mathcal {A}$ ) implies uniqueness of $\tilde {\mu }$ in Theorem 3.6 [Reference Johansson, Öberg and PollicottJÖP07], although uniqueness is not a priori necessary for Theorem 3.6.
A natural question is whether we can obtain an upper bound for the rate of correlations decay for a potential $\phi $ that is not normalized. When A is finite, we can use the same strategy as in [Reference Bressaud, Fernández and GalvesBFG99, Reference PollicottPol00, Reference WaltersWal75]. The idea is to study normalized potentials $\psi $ that are cohomologous to $\phi $ , that is, $\psi = \phi + h - h\circ T + c$ for some $h \in \mathcal {C}(\mathcal {X}_-,\mathbb {R})$ and $c \in \mathbb {R}$ . If $\phi $ and $\psi $ are cohomologous, then both functions have the same associated equilibrium states [Reference WaltersWal75]. Hence, properties of equilibrium states for $\phi $ can be obtained by studying shift-invariant measures compatible with $\psi $ . Walters [Reference WaltersWal75] proved that when the rate of variation of $\phi $ is summable, there exist a unique h and a unique c such that $\psi $ is a normalized potential. Moreover, from the construction of h in [Reference WaltersWal75], we have that $\operatorname {{\mathrm {var}}}_{k}(h) \leq \sum _{j \geq k}\operatorname {{\mathrm {var}}}_{j}(\phi )$ . This implies that $\operatorname {{\mathrm {var}}}_{k}(\psi ) \leq 3\sum _{j \geq k}\operatorname {{\mathrm {var}}}_{j}(\phi )$ . Using these results, we obtain the following corollary, which improves the results in [Reference Bressaud, Fernández and GalvesBFG99, Reference PollicottPol00].
Corollary 3.9. Let the alphabet A be finite and $\phi $ be a potential not necessarily normalized. Assume that there exist a constant $C> 0$ and $\delta> 0$ such that
Let $\tilde {\mu }$ be an equilibrium state for $\phi $ , $f\in L^1(\tilde {\mu })$ , and $\hat {f}\in V_{\phi }$ .
If $\delta>1$ , we have for all $n\geq 1$ ,
where $C_8$ is a positive constant that depends on C and $\delta $ .
If $\delta \in (0,1]$ , we have for all $n\geq 1$ and $\delta '<\delta $ ,
where $C_9$ is a positive constant that depends on C, $\delta $ , and $\delta '$ .
Remark 3.10. When $\delta> 1$ , Corollary 3.9 recovers the rate obtained in [Reference PollicottPol00, Theorem 1(1)]. To generalize Corollary 3.9 to an infinite alphabet, we need a result equivalent to [Reference WaltersWal75, Theorem 3.3] for an infinite alphabet, which is currently unavailable.
4 Technical lemmas
Here we collect some results that we will use to prove Theorem 3.2. We first recall the definitions of the Kullback–Leibler and Pearson $\chi ^2$ divergences. Let P and Q be two probabilities on some discrete space $\mathcal {Y}$ . We define
$$D_{\text {KL}}(P||Q):=\sum _{y\in \mathcal {Y}}P(y)\log \frac {P(y)}{Q(y)}$$
and
$$D_{\chi ^2}(P||Q):=\sum _{y\in \mathcal {Y}}\frac {(P(y)-Q(y))^2}{Q(y)}.$$
It is well known that $D_{\text {KL}}(P||Q)\leq D_{\chi ^2}(P||Q)$ (cf. [Reference Sason and VerdúSV16, eq. 5]).
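The comparison between the two divergences can be illustrated numerically. The following is a minimal sketch, assuming strictly positive probability vectors on a finite set; the function names and the example vectors are illustrative and not taken from the paper.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(P||Q) for pmfs given as lists (q > 0)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def chi2_divergence(p, q):
    """Pearson chi-square divergence D_chi2(P||Q) = sum (p - q)^2 / q (q > 0)."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))

# Illustrative distributions on a three-letter alphabet.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
kl = kl_divergence(p, q)
chi2 = chi2_divergence(p, q)
print(kl, chi2)  # the first value should not exceed the second
```

This only checks one instance; the general inequality is the one cited from [SV16].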
Lemma 4.1. Let $x, y\in \mathcal {X}_-$ and $\mu ,\nu \in \mathcal {P}(\phi )$ such that $\mu [\eta _{-\infty }^{0}\in \cdot \;]=\delta _x(\cdot )$ and $\nu [\eta _{-\infty }^{0}\in \cdot \;]=\delta _y(\cdot )$ . For all $n\geq 1$ , $0\leq k\leq n-1$ , and all $a, b, c\in \mathcal {X}$ , we have
Proof Let us simply denote by D the left-hand term of inequality (2). We have by the chain rule property of the Kullback–Leibler divergence [Reference Cover and ThomasCT06, Theorem 2.5.3] that
Then we use the well-known bound
to conclude the proof.
Lemma 4.2. For $\alpha> 1$ and $0<a<b$ , we have
Proof By algebraic computations, we obtain that (3) is equivalent to
This last inequality is obtained from the Bernoulli inequality $(1+x)^{r}\geq 1+rx$, valid for $r\geq 1$ and $x>-1$, observing that
and
Define, for all $\delta>0$ , $\beta \geq 1$ , $k\geq 3$ , and $n\geq k+1$ ,
Lemma 4.3. For all $\delta>0$ , $\beta \geq 1$ , and $k\geq 3$ , $\Delta _k^n$ is a non-increasing function of $n\geq k+1$ .
Proof The statement of the lemma is trivial for $\beta =1$ . For $\beta>1$ , consider the function $f:[4,\infty )\to \mathbb {R}^+$ defined by
In order to prove the result, it is enough to show that the derivative of f is negative. Since
it is enough to show that
But this last inequality follows from Lemma 4.2.
Lemma 4.4. For all $\delta>0$ , $\beta \geq 1$ , and $k\geq 3$ , we have
Proof Observe that for $k\geq 3$ ,
where, to obtain the second inequality in (4), we used the inequality
for $\alpha \geq 1$ and $a, b\geq 0$ . This inequality can be obtained using the fundamental theorem of calculus applied to the function $f(x)=x^{\alpha }$ .
To obtain the last inequality in (4), we used that for $k\geq 3$ ,
and
Finally, we recall the following lemma in [Reference Bressaud, Fernández and GalvesBFG99] (see also Lemma A.4 in [Reference GiacominGia07]) that gives an estimate for the renewal sequence that will appear in the proof of Theorem 3.2. We state the lemma using a notation that is adapted to our purpose.
Lemma 4.5. (Proposition 2, item (iv) in [Reference Bressaud, Fernández and GalvesBFG99])
Let $(f_k)_{k \geq 1}$ be a sequence of positive real numbers such that $\sum _{k=1}^\infty f_k < 1$. Suppose that $(u_k)_{k \geq 0}$ is a sequence with $u_0 = 1$ that satisfies the renewal equation
$$u_n=\sum _{k=1}^{n}f_k\,u_{n-k}\quad \text {for all } n\geq 1.$$
If $f_n \leq c_1/n^{1+\alpha }$ for some $\alpha>0$ and a positive constant $c_1$ , then $u_n \leq c_2/n^{1+\alpha }$ , where $c_2$ is a constant that depends on $(f_k)_{k \geq 1}$ .
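The content of Lemma 4.5 can be explored numerically. The sketch below assumes the standard renewal recursion $u_n=\sum _{k=1}^n f_k u_{n-k}$ with $u_0=1$; the values of `alpha` and `c` are arbitrary illustrative choices with $\sum _k f_k<1$, not constants from the paper.

```python
def renewal_sequence(f, n_max):
    """Return [u_0, ..., u_{n_max}] with u_0 = 1 and
    u_n = sum_{k=1}^{n} f(k) * u_{n-k}, where f maps k >= 1 to f_k."""
    u = [1.0]
    for n in range(1, n_max + 1):
        u.append(sum(f(k) * u[n - k] for k in range(1, n + 1)))
    return u

alpha = 0.5   # illustrative decay exponent
c = 0.2       # small enough that sum_k c / k^(1 + alpha) < 1

def f_poly(k):
    """Polynomially decaying increments f_k = c / k^(1 + alpha)."""
    return c / k ** (1 + alpha)

u = renewal_sequence(f_poly, 1000)
# If u_n = O(n^{-(1 + alpha)}), these rescaled values should stay bounded.
ratios = [u[n] * n ** (1 + alpha) for n in (125, 250, 500, 1000)]
print(ratios)
```

A bounded list of rescaled values is consistent with the lemma but is of course no substitute for its proof.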
5 Proof of Theorem 3.2
Let $x, y\in \mathcal {X}_-$ and $\mu , \nu $ compatible measures such that $\mu [\eta _{-\infty }^{0}\in \cdot \;]=\delta _x(\cdot )$ and $\nu [\eta _{-\infty }^{0}\in \cdot \;]=\delta _y(\cdot )$ . We now construct the coupling of $\mu $ and $\nu $ , that we call $\mathbb {P}^{x,y}$ , as follows. We start by defining
Then, for all $n\geq 1$ , given the pasts $\hat {\eta }_{-\infty }^{K_n-1}$ and $\hat {\omega }_{-\infty }^{K_n-1}$ , we maximally couple $\hat {\eta }_{K_n}^{K_{n+1}-1}$ and $\hat {\omega }_{K_n}^{K_{n+1}-1}$ to complete the construction of $\mathbb {P}^{x,y}$ .
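For readers less familiar with maximal couplings, the following is a minimal sketch for two probability vectors on a finite set. A block of coordinates is itself a discrete random vector, so the same construction applies blockwise; the function names and the example vectors are illustrative assumptions and not the paper's notation.

```python
def total_variation(p, q):
    """Total variation distance between two pmfs on the same finite set."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def maximal_coupling(p, q):
    """Joint pmf J[i][j] with marginals p and q such that
    P[X != Y] equals the total variation distance between p and q."""
    n = len(p)
    overlap = [min(pi, qi) for pi, qi in zip(p, q)]
    tv = 1.0 - sum(overlap)
    joint = [[0.0] * n for _ in range(n)]
    for i in range(n):
        joint[i][i] = overlap[i]          # mass on which X and Y agree
    if tv > 0:
        for i in range(n):
            for j in range(n):
                # Residual masses are spread independently off the diagonal;
                # the diagonal contribution is zero because one factor vanishes.
                joint[i][j] += (p[i] - overlap[i]) * (q[j] - overlap[j]) / tv
    return joint

p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
joint = maximal_coupling(p, q)
mismatch = 1.0 - sum(joint[i][i] for i in range(len(p)))
print(mismatch, total_variation(p, q))
```

The two printed numbers agree up to rounding, which is the defining property of a maximal coupling.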
Next, we show that $\mathbb {P}^{x,y}$ satisfies the inequality in Theorem 3.2. For all $n\geq 1$ and $0\leq k\leq n-1$ , define
with the convention that if $k>l$ , then elements of the form $a_k^l$ are dropped from the conditional part. The shorthand notation $X_{n-k}^{n-1}=0$ means that $X_{n-k} = 0, \ldots , X_{n-1} = 0$ . Observe that for all $n\geq 1$ and $0\leq k\leq n-2$ , we have $q^n_{k} \geq q^n_{k+1}$ .
We start by proving the following lemma.
Lemma 5.1. Suppose that $(\chi ^2_j(\phi ))_{j\geq 0}\in \ell ^1$. Then there exists $\varepsilon>0$ such that for all ${k\geq 0}$ and all $n\geq k+1$,
For $k\geq 1$ and $n\geq k+1$ , we also have
Proof Inequality (5) is a direct consequence of the Bretagnolle–Huber inequality (cf. [Reference Sason and VerdúSV16, eq. 4]) and Lemma 4.1. Inequality (6) is a direct consequence of the Pinsker inequality (cf. [Reference Sason and VerdúSV16, eq. 1]) and again Lemma 4.1.
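The two inequalities play complementary roles: Pinsker is sharp for small divergences but becomes trivial once it exceeds 1, whereas Bretagnolle–Huber always stays strictly below 1. The sketch below illustrates this, assuming the commonly stated forms $d_{\text {TV}}\leq \sqrt {D_{\text {KL}}/2}$ and $d_{\text {TV}}\leq 1-\tfrac 12 e^{-D_{\text {KL}}}$; the helper names and the example distributions are illustrative.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence for pmfs given as lists (q > 0)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def tv(p, q):
    """Total variation distance between two pmfs."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def pinsker_bound(d_kl):
    """Pinsker: d_TV <= sqrt(D_KL / 2)."""
    return math.sqrt(d_kl / 2.0)

def bretagnolle_huber_bound(d_kl):
    """Bretagnolle-Huber: d_TV <= 1 - exp(-D_KL) / 2."""
    return 1.0 - 0.5 * math.exp(-d_kl)

# Two nearly singular distributions on a two-letter alphabet.
p = [0.99, 0.01]
q = [0.01, 0.99]
d = kl(p, q)
print(tv(p, q), pinsker_bound(d), bretagnolle_huber_bound(d))
```

In this example the Pinsker bound is larger than 1 while the Bretagnolle–Huber bound is not, which matches the way the two bounds are used for (5) and (6).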
Now, on some probability space $(\Omega , \mathcal {F}, P)$ , consider the random process $Y=(Y_n)_{n\geq 0}$ with values in $\{0,1\}$ such that $Y_0=1$ and, for $n\geq 1$ and $0\leq k\leq n-1$ ,
For all $m\geq 1$ and $a,b \in \{0,1\}^{m}$ , we say that $a\geq b$ if $a_i \geq b_i$ for $i \in \{1,\ldots , m\}$ . By construction, for all $n\geq 2$ , $a,b \in \{0,1\}^{n-1}$ , and $a\geq b$ , we have $P[Y_n = 1 | Y^{n-1}_1 = a] \geq \mathbb {P}^{x,y}[X_n = 1 | X^{n-1}_1 = b]$ . Therefore, by applying Strassen’s theorem on stochastic domination [Reference LindvallLin99] inductively on n, we can construct a coupling measure Q such that, for $a,b \in \{0,1\}^{n-1}$ and $a\geq b$ , we have $Q[Y_n \geq X_n| Y^{n-1}_1 = a, X^{n-1}_1 = b] = 1$ . Therefore, for all $n\geq 1$ , we have $Q[Y_n \geq X_n] = 1$ , which implies that
for all $n \geq 1$ .
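A single step of the domination argument can be sketched with the usual monotone coupling of two Bernoulli variables driven by one uniform random number. This is only an illustration of the idea behind the inductive use of Strassen's theorem; the function name and parameters are hypothetical.

```python
import random

def coupled_bernoullis(p_x, p_y, u):
    """Build X ~ Bernoulli(p_x) and Y ~ Bernoulli(p_y) from the same
    uniform number u in [0, 1). If p_y >= p_x, then Y >= X always."""
    x = 1 if u < p_x else 0
    y = 1 if u < p_y else 0
    return x, y

rng = random.Random(0)
# Conditional success probabilities with p_y >= p_x (illustrative values).
samples = [coupled_bernoullis(0.3, 0.6, rng.random()) for _ in range(1000)]
print(all(y >= x for x, y in samples))
```

Applying such a step conditionally on the ordered pasts, one coordinate after another, yields a coupling under which the dominating process never falls below the dominated one.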
Now, consider the process $Z=(Z_n)_{n\geq 0}$ with values in $\{0,1\}$ such that $Z_0=1$ and, for $n\geq 1$ and $0\leq k\leq n-1$ ,
Observe that, for all $k \geq 0$ , we have $b_k \geq b_{k+1}$ . Using the same argument used to show (7), we have that $P[Z_n = 1] \geq P[Y_n = 1]$ for all $n\geq 1$ . Also, by (6) and Lemmas 4.3 and 4.4, we have that for $k\geq 3$ and $n\geq k+1$ ,
Using (5), we obtain that $b_k\leq (2C({2^{\beta } 4^{\delta }\beta \delta ^{-1}})/({k^{\delta \beta +1}}))^{1/2}\wedge (1-\varepsilon )$ for all $k\geq 3$ and $b_k\leq 1-\varepsilon $ for $0\leq k\leq 2$ .
Next, let $f_i:= b_{i-1}\prod _{k=0}^{i-2}(1-b_k)$ for $i\geq 1$ (with the convention that $\prod _{j=0}^{-1}=1$ ) and $u_i:= P[Z_i = 1]$ for $i \geq 0$ . We have that
and hence the following renewal equation holds:
$$u_n=\sum _{k=1}^{n}f_k\,u_{n-k}\quad \text {for all } n\geq 1.$$
By definition, we have that $\sum _{k= 1}^\infty f_k = 1-\prod _{k = 0}^\infty (1-b_k)$. If $\beta>\delta ^{-1}$, we have $\sum _{k = 0}^\infty b_k < \infty $; since, in addition, $b_k\leq 1-\varepsilon $ for all $k\geq 0$, the infinite product is strictly positive. Hence, $\sum _{k= 1}^\infty f_k < 1$. Moreover, when $\beta>\delta ^{-1}$, we have that $b_k \leq c_1k^{-(\delta \beta +1)/2}$ for some positive constant $c_1$ that depends on $C, \delta $, and $\beta $. From Lemma 4.5, we have that, for all $n\geq 1$,
where $C_1$ is a positive constant that depends on $C, \delta $ , and $\beta $ . Because $P[Z_n = 1] \geq \mathbb {P}^{x,y}[X_n = 1]$ , we obtain that for all $n\geq 1$ ,
for $\beta>\delta ^{-1}$ . Because the bound is uniform on $x,y \in \mathcal {X}_-$ , we obtain the desired result.
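The renewal structure used in this proof can be checked numerically on a toy example. The sketch below assumes that the probability of $Z_n=1$ given exactly $k$ zeros since the last one equals $b_k$ (our reading of the construction of Z); the hazard sequence `b` is an arbitrary illustrative choice. It computes $u_n=P[Z_n=1]$ in two ways: through the renewal equation and through a direct recursion on the law of the time since the last one.

```python
def first_return_law(b, n_max):
    """f[i] = b(i-1) * prod_{k=0}^{i-2} (1 - b(k)) for i = 1..n_max (f[0] unused)."""
    f = [0.0] * (n_max + 1)
    survival = 1.0
    for i in range(1, n_max + 1):
        f[i] = b(i - 1) * survival
        survival *= 1.0 - b(i - 1)
    return f

def u_by_renewal(f, n_max):
    """Solve u_n = sum_{k=1}^{n} f[k] * u_{n-k} with u_0 = 1."""
    u = [1.0]
    for n in range(1, n_max + 1):
        u.append(sum(f[k] * u[n - k] for k in range(1, n + 1)))
    return u

def u_by_age_recursion(b, n_max):
    """Propagate the law of the number of zeros since the last one."""
    age = [1.0]                      # Z_0 = 1, so the age is 0 with probability 1
    u = [1.0]
    for _ in range(1, n_max + 1):
        hit = sum(age[k] * b(k) for k in range(len(age)))
        age = [hit] + [age[k] * (1.0 - b(k)) for k in range(len(age))]
        u.append(hit)
    return u

def b(k):
    """Illustrative hazard sequence, bounded away from 1 and summable."""
    return min(0.9, 0.5 / (k + 1) ** 1.5)

n_max = 200
f = first_return_law(b, n_max)
u_renewal = u_by_renewal(f, n_max)
u_direct = u_by_age_recursion(b, n_max)
max_gap = max(abs(x - y) for x, y in zip(u_renewal, u_direct))
print(max_gap)
```

The two computations agree up to rounding error, and the partial sums of `f` equal one minus the corresponding partial product of the factors $(1-b_k)$ starting at $k=0$.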
6 Proofs of Corollaries 3.3 and 3.4
6.1 Proof of Corollary 3.3
For $k\in [K_n, K_{n+1})$ and all $y,z \in \mathcal {X}_-$ , using the coupling inequality for total variation distance (cf. [Reference ThorissonTho00]), we have
Then, by Theorem 3.2, we obtain
for all $\beta \geq 1$ and $\beta>\delta ^{-1}$. If $\delta>1$, just take $\beta =1$. In this case $k=n$ and thus we obtain Corollary 3.3 with a constant $C_2$ that depends on C and $\delta $. If $\delta \in (0,1]$, since $k\leq (n+1)^{\beta }$, we have $n\geq k^{1/\beta }-1$. This leads to
for all $k\geq 1$ , where $C_3$ is a positive constant that depends on $C, \delta $ , and $\beta $ . Now, observe that for any $0< \delta '<\delta $ , we can choose $\beta $ such that $\beta \geq 1$ , $\beta>\delta ^{-1}$ , and $({\beta \delta +1})/ {2\beta }\geq \delta '$ .
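The choice of $\beta $ in the last step can be made explicit. The following short computation is a sketch that uses only the exponent $({\beta \delta +1})/{2\beta }$ appearing above and the constraints $\beta \geq 1$, $\beta>\delta ^{-1}$, with $\delta \in (0,1]$ and $0<\delta '<\delta $.

```latex
\frac{\beta\delta+1}{2\beta}=\frac{\delta}{2}+\frac{1}{2\beta}\geq\delta'
\;\Longleftrightarrow\;
\frac{1}{2\beta}\geq\delta'-\frac{\delta}{2}.
% If \delta' \le \delta/2, every admissible \beta works.
% If \delta/2 < \delta' < \delta, any
\beta\in\Bigl(\frac{1}{\delta},\,\frac{1}{2\delta'-\delta}\Bigr]
% works; the interval is nonempty because 2\delta'-\delta<\delta,
% and \beta>1/\delta\ge 1 since \delta\le 1.
```

As $\beta $ decreases to $1/\delta $ the exponent increases to $\delta $, which is why any $\delta '<\delta $ can be reached but $\delta $ itself cannot.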
6.2 Proof of Corollary 3.4
Consider $k\in [K_n, K_{n+1})$ . Let
with the convention that $\inf \emptyset =\infty $ . We start by observing that
By Theorem 3.2, we obtain
for $\beta \geq 1$ , $\beta>\delta ^{-1}$ , and $C_1'$ a positive constant that depends on $C, \delta $ , and $\beta $ . Since $n\geq k^{1/\beta }-1$ , we obtain
for all $k\geq 1$ and $C_4$ a positive constant that depends on $C, \delta $, and $\beta $. Finally, notice that $({\beta \delta -1})/{2\beta }=\delta /2-1/(2\beta )$ increases to $\delta /2$ as $\beta \to \infty $, so for all $\delta '<\delta $ we can choose $\beta $ large enough such that $({\beta \delta -1})/{2\beta }\geq \delta '/2$. Using the coupling inequality (cf. [Tho00]), we conclude that
7 Proof of Theorem 3.6
Consider $K_m=\lfloor m^{\beta } \rfloor $ for $m\ge 1$ . For each $x, y\in \mathcal {X}_-$ , we consider a probability space $(\Omega , \mathcal {F}, \mathbb {P}^{x,y})$ that supports the random elements $\tilde {\eta }$ , $\tilde {\omega }$ , and $\tilde {Z}$ defined as follows. Let $\tilde {\eta }_*\mathbb {P}^{x,y}$ , $\tilde {\omega }_*\mathbb {P}^{x,y}$ be compatible with $\phi $ and $\tilde {\eta }_{-\infty }^{0}=x,\, \tilde {\omega }_{-\infty }^{0}=y$ . Also, for all $m\geq 1$ given the pasts $\tilde {\eta }_{-\infty }^{K_m-1}$ and $\tilde {\omega }_{-\infty }^{K_m-1}$ , the blocks $\tilde {\eta }_{K_m}^{K_{m+1}-1}$ and $\tilde {\omega }_{K_m}^{K_{m+1}-1}$ are maximally coupled. Under $\mathbb {P}^{x,y}$ , the process $\tilde {Z}$ has the same law as the process Z defined in §5 and verifies for all $m\geq 1$ (this is indeed possible since Z stochastically dominates X; see §5). We denote by $\mathbb {E}^{x,y}$ the expectation with respect to $\mathbb {P}^{x,y}$ .
Fix some $n\in \mathbb {N}$ and let k be such that $n\in [K_{k-1}, K_{k})$ . We will show that
for some positive constant $c_1$ that depends only on C, $\delta $ , and $\beta $ . From this point, Theorem 3.6 is easily obtained following the proof of Corollary 3.3. To obtain (9), we follow the argument developed in [Reference Bressaud, Fernández and GalvesBFG99, §5]. Using (3.7) in [Reference Bressaud, Fernández and GalvesBFG99], we first observe that
For $k\geq 1$ , let
We have
Observe that for all $0\leq j\leq k$ ,
where the $b_{k-j+l}$ are from (8). Now, observe that for all $i\geq 1$ , we have
Thus, we have that for all $i\geq 1$ ,
with $f_k:=b_{k-1}\prod _{l=0}^{k-2}(1-b_l)$ , $k\geq 1$ . From this, we obtain
We deduce that
with
Finally, since $({\operatorname {{\mathrm {var}}}_{n-K_{k-l}+1}(e^\phi )})/({b_{l-1}})\leq 2$ and $\prod _{j=0}^{\infty }(1-b_j)>0$ (using that $b_0 < 1$ and $\sum _{j=0}^\infty b_j<\infty $ ), we observe that
for some positive constant $c_2$ depending on C, $\delta $ , and $\beta $ .
7.1 Proof of Corollary 3.9
The variation of $\phi $ is summable; therefore, there exists a normalized potential $\psi $ with the same unique equilibrium state as $\phi $ [Wal75, Theorem 3.2]. If $\operatorname {{\mathrm {var}}}_{k}(\phi )\leq {C}/{k^{({3+ \delta })/{2}}}$, then $\operatorname {{\mathrm {var}}}_{k}(\psi )\leq {C'}/{k^{({1+ \delta })/{2}}}$ for some constant $C'> 0$ that depends only on C (see, for example, [Pol00, Proposition 1]). Closely following the proof of Theorem 3.6 using the measures compatible with $\psi $, we obtain the desired results. The only difference is that in (10) we use the bound
instead of
8 Applications
8.1 Functional central limit theorem (FCLT) for potentials with non-summable variations
Let $\sigma> 0$ . A function $h:A\to \mathbb {R}$ satisfies the functional central limit theorem (FCLT), also called the weak invariance principle, when the process $\{\zeta _n(t), t \in [0,1], n\geq 1\}$ defined by
converges weakly to a standard Brownian motion on $D[0,1]$ . Tyran-Kamińska [Reference Tyran-KamińskaTK05, Section 4.3] showed that the FCLT holds when the potential $\phi $ has summable variations. A straightforward application of Theorem 3.2 in [Reference Tyran-KamińskaTK05] and our Corollary 3.3 is the following FCLT for potentials with non-summable variations.
Proposition 8.1. Assume that the alphabet A is finite and $\phi $ satisfies Assumption ( $\mathcal {A}$ ) with $\delta \in (1/2,1]$ . Let $\mu $ be shift-invariant and compatible with $\phi $ . Also, let $h:A \to \mathbb {R}$ be a function such that $\int h\circ \eta _0\; d\mu = 0$ . If $\sigma ^2 :=\int (h\circ \eta _0)^2\; d\mu> 0$ , then h satisfies the FCLT.
Proof Because the alphabet is finite, we have that $\sigma ^2 \leq \|h\|^2_\infty < \infty $ , as required by Theorem 3.2 in [Reference Tyran-KamińskaTK05]. It remains to verify the condition on the mixing rate. Using the processes $\{\tilde {\eta }_i, i \in \mathbb {Z}\}$ and $\{\tilde {\omega }_i, i \in \mathbb {Z}\}$ introduced in the proof of Theorem 3.6, it is sufficient to check that there exists $\gamma> 1/2$ such that
We have from Corollary 3.3 that, for all $x,y \in \mathcal {X}_-$ ,
where $\delta ' < \delta $ and $c_1 $ is a positive constant that depends on $C, \delta , \delta '$ , and h. Taking $\delta '$ and $\gamma $ such that $1/2 < \gamma \leq \delta ' <\delta $ , we obtain (11).
8.2 Hoeffding-type inequality for potentials with non-summable variations
A Hoeffding-type inequality gives finite sample bounds for deviations of additive functionals from their mean. When the variation rate of the potential is summable, we have an exponential inequality [Reference Gallesco, Gallo and TakahashiGGT14, Reference Chazottes, Gallo and TakahashiCGT20, Reference MartonMar98]. Nevertheless, the rate of concentration for potentials when the variation rate is not summable is an open question. Using our result, we can obtain the following stretched exponential inequality for sums of random variables.
Proposition 8.2. Assume that the alphabet A is finite and $\phi $ satisfies Assumption ( $\mathcal {A}$ ) with $\delta \in (1/2,1]$ . Let $\mu $ be compatible with $\phi $ . For all $\delta '<2\delta -1$ , $n\geq 1$ , $t\geq 0$ , and all functions $h:A \to \mathbb {R}$ , we have
where $R(h):=\max _{a\in A}h(a)-\min _{a\in A}h(a)$ and $C_{10}$ is a constant that depends on C, $\delta $ , and $\delta '$ .
Proof This is a consequence of Theorem 3.2 of [Reference Chazottes, Collet, Külske and RedigCCKR07] and Corollary 3.3. In order to apply Theorem 3.2 of [Reference Chazottes, Collet, Külske and RedigCCKR07], we need to estimate the terms $\|\overline {D}\|_{\ell ^2(\mathbb {N})}^2$ and $\|\delta f\|^2_{\ell ^2(\mathbb {N})}$ (for $f(x_1,\ldots ,x_n)=({1}/{n})\sum _{i=1}^n h(x_i)$ ) there. We have
Now, for $\delta '<2\delta -1$ , using Corollary 3.3, we obtain
for some positive constant $c_1$ depending on $C, \delta $ , and $\delta '$ .
For a given function $f: A^{n}\to \mathbb {R}$ , we define the oscillation of f at site $i\in \{1,\ldots ,n\}$ by
Now, taking $f(x_1,\ldots , x_{n})= ({1}/{n})\sum _{i=1}^{n}h(x_i)$ , we have $\delta _i f={R(h)}/{n}$ for $i\in \{1,\ldots ,n\}$ . Thus, we obtain
Finally, using (12) and (13) in Theorem 3.2 of [Reference Chazottes, Collet, Külske and RedigCCKR07], we obtain Proposition 8.2.
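The value of the oscillations of the empirical mean can be confirmed by brute force on a small example. The sketch below assumes the usual definition of the oscillation at site i, namely the supremum of $|f(x)-f(x')|$ over configurations that differ only at coordinate i; the alphabet, the function `h`, and the 0-based indexing are illustrative choices.

```python
from itertools import product

def oscillation(f, alphabet, n, i):
    """sup |f(x) - f(y)| over x, y in alphabet^n differing only at coordinate i
    (coordinates are indexed from 0 here)."""
    best = 0.0
    for x in product(alphabet, repeat=n):
        for a in alphabet:
            y = x[:i] + (a,) + x[i + 1:]
            best = max(best, abs(f(x) - f(y)))
    return best

h = {0: -1.0, 1: 0.5, 2: 2.0}       # illustrative function on a 3-letter alphabet
n = 3

def mean_h(x):
    """Empirical mean f(x_1, ..., x_n) = (1/n) * sum_i h(x_i)."""
    return sum(h[a] for a in x) / n

R = max(h.values()) - min(h.values())
osc = [oscillation(mean_h, list(h), n, i) for i in range(n)]
sq_norm = sum(d * d for d in osc)
print(osc, sq_norm, R ** 2 / n)
```

Each oscillation equals $R(h)/n$, so the squared $\ell ^2$ norm of the oscillation vector is $R(h)^2/n$.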
8.3 Poisson autoregression model
As a third application of our results, we consider a model with a countably infinite alphabet called Poisson autoregression, which is popular in applications [KF05]. Only the Markovian case of these models has been studied in the literature. We will show how to choose the parameters of non-Markovian Poisson autoregression models so that Assumption ( $\mathcal {A}$ ) is satisfied and the results of §3 apply.
Consider an absolutely converging sequence $(\beta _i)_{i\geq 1}$ and a sequence of non-negative integers $(\gamma _i)_{i\geq 1}$ such that $S:=\sum _{i=1}^{\infty }|\beta _i|\gamma _i<\infty $ . Consider $A=\mathbb {Z}_+$ and the potential $\phi $ defined for all $x\in \mathcal {X}_-$ by
where
For this model, we obtain
Now, since $e^{-S} \leq \lambda (x_{-\infty }^{-1})\leq e^S$ and the exponential function is locally bi-Lipschitz, using (14), we have that
where $c_1$ is a positive constant that depends only on S.
Choosing the sequences $(\beta _i)_{i\geq 1}$ and $(\gamma _i)_{i\geq 1}$ such that
for some $\varepsilon>0$ and $c_2\geq 1$ , we obtain
where $c_3$ is a positive constant.
Finally, for this model, we mention that the existence of a shift-invariant probability measure compatible with $\phi $ is obtained by applying Theorem 5.1 of [Reference Johansson, Öberg and PollicottJÖP07] with $K=e^{2\sinh S}$ and $\pi $ equal to the Poisson law with parameter $e^S$ . Assumption ( $\mathcal {A}$ ) implies the square summability of the variation, which guarantees the uniqueness of the shift-invariant probability measure [Reference Johansson, Öberg and PollicottJÖP07, Corollary 4.2].
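To give a feel for such a model, here is a minimal simulation sketch. It relies on a hypothetical intensity of the form $\lambda (x_{-\infty }^{-1})=\exp (\sum _i \beta _i \min (x_{-i},\gamma _i))$ truncated to finitely many lags, which is one natural form compatible with the bounds $e^{-S}\leq \lambda \leq e^{S}$ but is an assumption of this example and not necessarily the exact definition used above. The parameter values, the all-zero initial past, and the function names are likewise illustrative.

```python
import math
import random

def rate(past, beta, gamma):
    """Hypothetical intensity exp(sum_i beta_i * min(x_{-i}, gamma_i)).
    past[0] is the most recent symbol x_{-1}."""
    s = sum(b * min(x, g) for b, g, x in zip(beta, gamma, past))
    return math.exp(s)

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate(beta, gamma, n_steps, seed=0):
    """Generate a path where each new symbol is Poisson with the rate above."""
    rng = random.Random(seed)
    past = [0] * len(beta)           # arbitrary initial past (assumption)
    path, rates = [], []
    for _ in range(n_steps):
        lam = rate(past, beta, gamma)
        x = sample_poisson(lam, rng)
        rates.append(lam)
        path.append(x)
        past = [x] + past[:-1]       # shift the memory window
    return path, rates

beta = [0.3, -0.2, 0.1]              # illustrative coefficients
gamma = [2, 2, 1]                    # illustrative truncation levels
S = sum(abs(b) * g for b, g in zip(beta, gamma))
path, rates = simulate(beta, gamma, 500)
print(min(rates), max(rates), math.exp(-S), math.exp(S))
```

Under this assumed form, every realized intensity lies between $e^{-S}$ and $e^{S}$, in line with the two-sided bound on $\lambda $ used in this section.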
Acknowledgements
C. Gallesco and D. Y. Takahashi would like to thank Sandro Gallo for several fruitful discussions that motivated this work. We also thank Leandro Cioletti for comments on an early version of the manuscript. C. Gallesco was partially supported by FAPESP (grant 2017/19876-4) and CNPq (grant 312181/2017-5). D. Y. Takahashi thanks the support of FAPESP Research, Innovation and Dissemination Center for Neuromathematics (grant 2013/07699-0).