Mixing rates for potentials of non-summable variations

CHRISTOPHE GALLESCO; DANIEL Y. TAKAHASHI

doi:10.1017/etds.2021.65

Part of: Topological dynamics Stochastic processes Markov processes Ergodic theory

Published online by Cambridge University Press: 16 July 2021

CHRISTOPHE GALLESCO*: Affiliation:
Departmento de Estatística, Instituto de Matemática, Estatística e Ciência de Computação, Universidade de Campinas, Campinas, Brasil
DANIEL Y. TAKAHASHI: Affiliation:
Instituto do Cérebro, Universidade Federal do Rio Grande do Norte, Natal, Brasil (e-mail: takahashiyd@gmail.com)
*: e-mail: gallesco@unicamp.br

Abstract

Mixing rates, relaxation rates, and decay of correlations for dynamics defined by potentials with summable variations are well understood, but little is known for non-summable variations. This paper exhibits upper bounds for these quantities for dynamics defined by potentials with square-summable variations. We obtain these bounds as corollaries of a new block coupling inequality between pairs of dynamics starting with different histories. As applications of our results, we prove a new weak invariance principle and a Hoeffding-type inequality.

Keywords

g-measure; chains of infinite order; coupling; central limit theorem; concentration inequality

MSC classification

Primary: 60J05: Discrete-time Markov processes on general state spaces 37A35: Entropy and other invariants, isomorphism, classification 37B15: Cellular automata

Secondary: 60G07: General theory of processes

Type: Original Article
Information: Ergodic Theory and Dynamical Systems, Volume 42, Issue 9, September 2022, pp. 2823-2840

DOI: https://doi.org/10.1017/etds.2021.65
Copyright: © The Author(s), 2021. Published by Cambridge University Press

1 Introduction

Let A be a countable set, called an alphabet. Consider a measurable function $\phi : A\times A^{\{-1,-2, \ldots \}}\to \mathbb {R}$ such that $\sum _{a\in A}e^{\phi (a,x)}=1$ for all $x\in A^{\{-1,-2, \ldots \}}$ . The function $\phi $ is called a normalized potential, and the probability kernel $g:= e^{\phi }$ (also known as g-function) is a natural generalization of Markov kernels. Let $\eta =(\eta _n)_{\mathbb {Z}}$ be the canonical projections on $A^{\mathbb {Z}}$ , that is, for all $x\in A^{\mathbb {Z}}$ and all $n\in \mathbb {Z}$ , $\eta _n(x)=x_n$ . For $y \in A^{\{0,-1, \ldots \}}$ , let $\mu ^y$ be the probability measure on $A^{\mathbb {Z}}$ such that $\mu ^y[(\eta _0,\eta _{-1},\ldots ) \in B]=\delta _{y} [B]$ for all $B\subset A^{\{0, -1, \ldots \}}$ measurable. For $n\geq 0$ , $\mu ^y[\eta _{n+1}=a\mid (\eta _n,\eta _{n-1},\ldots )=x]=e^{\phi (a,x)}$ for every $a\in A$ and $\mu ^y$ -a.e. (almost every) x in $A^{\{-1,-2, \ldots \}}$ . Let T denote the shift operator on $A^{\{0,-1, \ldots \}}$ . We indicate by $d_{\text {TV}}$ the total variation distance, that is, if P and Q are two probability measures on the same $\sigma $ -algebra $\mathcal {F}$ ,

$$ \begin{align*} d_{\text{TV}}(P,Q)=\sup_{F\in \mathcal{F}}|P[F]-Q[F]|. \end{align*} $$
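For measures on a finite set, the supremum in this definition is attained at the event $\{a : P(a)>Q(a)\}$, so the distance equals half the $\ell^1$ distance between the probability vectors. The following minimal sketch illustrates this; the dictionary input format and the function name are assumptions made for the example only.

```python
def total_variation(p, q):
    """Total variation distance between two distributions on a finite set.

    p and q map outcomes to probabilities (illustrative input format).
    """
    support = set(p) | set(q)
    # The supremum over events is attained at {a : p(a) > q(a)},
    # which gives half the L1 distance between the two vectors.
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in support)


p = {"a": 0.5, "b": 0.5}
q = {"a": 0.9, "b": 0.1}
tv = total_variation(p, q)  # 0.4 up to floating-point rounding
```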

In this paper, we obtain upper bounds for the relaxation rate

$$ \begin{align*} L(n) :=\sup_{y,z} d_{\text{TV}}(\mu^y[\eta_n \in \cdot\;],\mu^z[\eta_n \in \cdot\;]), \end{align*} $$

the mixing rate

$$ \begin{align*} M(n) := \sup_{y,z} d_{\text{TV}}(\mu^y[(\eta_j)_{j\geq n} \in \cdot\;],\mu^z[(\eta_j)_{j\geq n} \in \cdot\;]), \end{align*} $$

and the decay rate of correlations

$$ \begin{align*} \rho_{f,\hat{f}}(n) := \bigg| \int f\circ T^n \;{\hat f}\,d\tilde{\mu}-\int f\,d\tilde{\mu}\int {\hat f}\,d\tilde{\mu} \bigg| \end{align*} $$

when $\tilde {\mu }$ is the unique shift-invariant measure compatible with $\phi $ (see the next section for the definition of compatibility) and $f, \hat {f}$ are suitable functions (see Theorem 3.6). Bressaud, Fernández and Galves [BFG99] and Pollicott [Pol00, Proposition 1] obtained upper bounds for $L(n), M(n)$ , and $\rho _{f,\hat {f}}(n)$ for potentials of summable variations and finite alphabets. Gouëzel [Gou04] obtained sharp lower bounds for the decay of correlations for dynamics with Hölder continuous (that is, exponentially decaying) potentials and countable alphabet. Our contribution is twofold. We obtain upper bounds for $L(n), M(n),$ and $\rho _{f,\hat {f}}(n)$ when the variation rate $ \operatorname {{\mathrm {var}}}_k(\phi )$ decays as $\mathcal {O}(k^{-(1/2+\delta ')})$ for any $ \delta '> 0$ . Moreover, our results also hold for a countably infinite alphabet A. Theorem 3.2 is our main result, showing a new upper bound for the coupling error between $\mu ^y$ and $\mu ^z$ . Corollary 3.3 answers a question posed in [JÖ08], in which the authors ask for a bound for $L(n)$ when the variation of $\phi $ is not summable. Corollary 3.4 shows a bound for $M(n)$ , which cannot be achieved by simply using the union bound and Corollary 3.3. The result is new even for the case of summable variations. The interest in $M(n)$ stems from the fact that it is the natural generalization of mixing times for Markov chains. Gallesco, Gallo and Takahashi [GGT18] showed that $M(n)$ converges to $0$ only when $\operatorname {{\mathrm {var}}}_k(\phi )$ is square summable $\tilde {\mu }$ -a.s. (almost surely) and hence Corollary 3.4 covers the main cases of interest. Theorem 3.6 gives an upper bound for the speed of decay of correlations, extending Theorem 3.2 in [BFG99].
Johansson, Öberg and Pollicott [JOP12] showed, when the alphabet A is finite, that there is a unique shift-invariant measure $\tilde {\mu }$ compatible with $\phi $ when $\operatorname {{\mathrm {var}}}_k(\phi ) \in o(k^{-1/2})$ . Moreover, Berger, Hoffman and Sidoravicius [BHS18] proved that, for any $\delta> 0$ , there exists a normalized potential $\phi $ with $\operatorname {{\mathrm {var}}}_k(\phi ) \in \mathcal {O}(k^{-(1/2-\delta )})$ that exhibits multiple compatible shift-invariant measures. Therefore, Theorem 3.6 also covers the main variation rates of interest under uniqueness of the compatible shift-invariant measure. In Corollary 3.9, we use Theorem 3.6 to obtain upper bounds on the rate of correlation decay for non-normalized potentials. We illustrate the application of our inequalities in three cases. The first application proves a novel weak invariance principle for additive functionals of dynamics with non-summable variations. The second application shows that we can obtain Hoeffding-type bounds for averages of random variables when the variation of $\phi $ is not summable. The third example illustrates how our results apply to a Poisson autoregression model, which is popular in applied work.

The proof technique is based on a renewal equation and coupling inequalities. These ideas were developed in [BFG99, CFF02, CQ98]. We improve on the coupling bounds obtained in [BFG99] by using a coupling between blocks of coordinates, instead of one coordinate at a time. A block coupling idea was used in [JOP12] to obtain sharp conditions for uniqueness of the equilibrium measure for $\phi $ on a finite alphabet A, but no mixing rate was obtained. A difference between [JOP12] and our approach is that we upper bound the block coupling using different renewal processes, leading to a distinct renewal equation. This new renewal equation allows us to upper bound the speed of decay of the coupling inequality even when the variation is not summable (see Theorem 3.2).

2 Definitions

Let the alphabet A be a countable set, $\mathcal {X}=A^{\mathbb {Z}}$ , and $\mathcal {X}_{-} = A^{\mathbb {Z}_-}$ , where $\mathbb {Z}_-=\{0,-1,-2,\ldots \}$ . We endow $\mathcal {X}$ and $\mathcal {X}_-$ with the product topology and its corresponding Borel $\sigma $ -algebra. The topologies and $\sigma $ -algebras considered on subsets of $\mathcal {X}$ and $\mathcal {X}_-$ will always be the trace topologies and $\sigma $ -algebras. We denote by $x_i$ the ith coordinate of $x \in \mathcal {X}$ and, for $-\infty < i \leq j<\infty $ , we write $x^{-i}_{-j}:=(x_{-i},\ldots , x_{-j})$ , $x^{-i}_{-\infty }:=(x_{-i},x_{-i-1},\ldots )$ , and $x_i^{\infty }:=(\ldots ,x_{i+1},x_{i})$ . If $i < j$ , $x^i_j = \emptyset $ . For $x\in \mathcal {X}$ and $y \in \mathcal {X}_{-}$ , a concatenation $x^{0}_{-i}y$ is a new sequence $z\in \mathcal {X}_{-}$ with $z^{0}_{-i} = x^{0}_{-i}$ and $z^{-i-1}_{-\infty } = y$ . We take $\emptyset $ to be the neutral element of the concatenation operation, that is, $\emptyset x=x$ for all $x\in \mathcal {X}_{-}$ . Note that we are using the convention, consistent with the concatenation operation, that when we scan an element $x\in \mathcal {X}$ from left to right we go further into the past.

Consider a measurable function $\phi : \mathcal {X}_{-}\to \mathbb {R}$ , which we call a potential. We say that $\phi $ is normalized if it satisfies

$$ \begin{align*} \sum_{a\in A}e^{\phi(ax)}=1 \end{align*} $$

for all $x\in \mathcal {X}_{-}$ . To a normalized potential $\phi $ we can associate a probability kernel g on the alphabet A by defining $g=e^{\phi }$ . The variation of order $k\geq 0$ of $\phi $ is defined by

$$ \begin{align*} \operatorname{{\mathrm{var}}}_{k}(\phi):=\frac{1}{2}\sup_{z \in \mathcal{X}}\sup_{x,y\in \mathcal{X}_-}\sum_{b\in A}|\phi(bz_{-k}^{-1}x)-\phi(bz_{-k}^{-1}y)|. \end{align*} $$

When A is finite, the variation is usually defined by taking the supremum over $b \in A$ instead of the sum. Nevertheless, our definition is more convenient when the alphabet is infinite and has appeared in the literature before [CGT20]. The constant $1/2$ in the definition relates $\operatorname {{\mathrm {var}}}_k(e^{\phi })$ to total variation distance when $\phi $ is normalized.

We also define, for $k\geq 0$ , the $\chi ^2$ -variation of order k of $\phi $ as

$$ \begin{align*} \chi^2_{k}(\phi)=\sup_{z\in \mathcal{X}}\sup_{x,y\in \mathcal{X}_-}\sum_{b\in A}\frac{(e^{\phi(bz_{-k}^{-1}x)}-e^{\phi(bz_{-k}^{-1}y)})^2}{e^{\phi(bz_{-k}^{-1}y)}}. \end{align*} $$

The use of $\chi ^2$ -variation to measure the regularity of potentials seems to be new; therefore, it is interesting to compare it to variation, which is more standard. When $\phi $ is normalized, by using the Cauchy–Schwarz inequality, we have that $ \operatorname {{\mathrm {var}}}^2_{k}(e^\phi )\leq \tfrac {1}{4}\chi ^2_{k}(\phi )$ for any $k\geq 1$ . When the alphabet A is finite and $\phi $ is normalized, $\operatorname {{\mathrm {var}}}^2_k(\phi )$ and $\chi ^2_k(\phi )$ are comparable, that is, there exist positive constants $K_1$ and $K_2$ such that $K_1\operatorname {{\mathrm {var}}}^2_k(\phi )\leq \chi ^2_k(\phi )\leq K_2\operatorname {{\mathrm {var}}}^2_k(\phi )$ . The $\chi ^2$ -variation introduced in this work will be particularly useful to study asymptotic properties of positive probability kernels on infinite A (cf. §8.3).
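The Cauchy–Schwarz step behind the first comparison can be written out as follows. Here $p_b$ and $q_b$ are shorthand introduced only for this computation (they are not notation from the text) for $e^{\phi (bz_{-k}^{-1}x)}$ and $e^{\phi (bz_{-k}^{-1}y)}$ ; when $\phi $ is normalized, $\sum _{b\in A}q_b=1$ , so that

$$ \begin{align*} \bigg(\frac{1}{2}\sum_{b\in A}|p_b-q_b|\bigg)^2=\frac{1}{4}\bigg(\sum_{b\in A}\frac{|p_b-q_b|}{\sqrt{q_b}}\sqrt{q_b}\bigg)^2\leq \frac{1}{4}\sum_{b\in A}\frac{(p_b-q_b)^2}{q_b}\sum_{b\in A}q_b=\frac{1}{4}\sum_{b\in A}\frac{(p_b-q_b)^2}{q_b}. \end{align*} $$

Taking the supremum over $z$ , $x$ , and $y$ on both sides gives $\operatorname {{\mathrm {var}}}^2_{k}(e^\phi )\leq \tfrac {1}{4}\chi ^2_{k}(\phi )$ .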

Let $\eta =(\eta _n)_{\mathbb {Z}}$ be the canonical projections on $\mathcal {X}$ , that is, for all $x\in \mathcal {X}$ , $\eta _n(x)=x_n$ for all $n\in \mathbb {Z}$ . We say that a probability measure $\mu $ on $\mathcal {X}$ is compatible with a normalized potential $\phi $ if there exists a probability measure P on $\mathcal {X}_-$ such that

$$ \begin{align*} \mu[\eta_{-\infty}^{0}\in B]=P[B] \end{align*} $$

for all $B\subset \mathcal {X}_-$ measurable and if, for $n\geq 0$ ,

$$ \begin{align*} \mu[\eta_{n+1}=a\mid \eta_{-\infty}^{n}=x]=e^{\phi(ax)} \end{align*} $$

for every $a\in A$ and $\mu $ -a.e. x in $\mathcal {X}_{-}$ . Johansson, Öberg and Pollicott [JÖP07] showed that if

$$ \begin{align*} \sum_{k = 0}^\infty \sup_{z \in \mathcal{X}}\sup_{x,y\in \mathcal{X}_-}\sum_{b\in A}(e^{\phi(bz_{-k}^{-1}x)/2}-e^{\phi(bz_{-k}^{-1}y)/2})^2 < \infty, \end{align*} $$

then there is at most one shift-invariant measure compatible with $\phi $ . From [Rei12, Lemma 3.3.9], for $k \geq 0$ , we have

$$ \begin{align*} \sum_{b\in A}(e^{\phi(bz_{-k}^{-1}x)/2}-e^{\phi(bz_{-k}^{-1}y)/2})^2 \leq \sum_{b\in A}\frac{(e^{\phi(bz_{-k}^{-1}x)}-e^{\phi(bz_{-k}^{-1}y)})^2}{e^{\phi(bz_{-k}^{-1}y)}} \end{align*} $$

and hence the summability of $\chi ^2_k(\phi )$ implies the existence of at most one shift-invariant compatible measure.

When $\phi $ is not normalized, the definition of a compatible measure loses its meaning. Nevertheless, we can associate with a not necessarily normalized $\phi $ a set of shift-invariant measures called equilibrium states [Wal75]. Equilibrium states are characterized via a variational principle and coincide with shift-invariant compatible measures when $\phi $ is normalized. An equilibrium state $\tilde {\mu }$ compatible with a normalized $\phi $ is also called a g-measure in ergodic theory [Kea72]. In the probability literature, g-measures are known as chains of complete connections [DF37, IG90], chains of infinite order [Har55, Kea72], random-step Markov processes [Kal90], and uniform martingales [Kal90]. Compatible measures that are not necessarily shift-invariant are called g-chains [JOP12] or stochastic chains of unbounded memory [GGT18]. When there is more than one shift-invariant measure compatible with $\phi $ , we say that there is a phase transition; otherwise, we say that the shift-invariant compatible measure is unique.

3 Results

In this paper, we will work under the following assumption.

Assumption ( $\mathcal {A}$ ).

$\phi $ is a potential on $\mathcal {X}_-$ such that for all $k\geq 1$ ,

(1)

$$ \begin{align} \chi^2_{k}(\phi)\leq \frac{C}{k^{1+\delta}} \end{align} $$

for some $C>0$ and $\delta>0$ .

Remark 3.1. When the alphabet A is finite and $\phi $ is normalized, Assumption ( $\mathcal {A}$ ) is equivalent to

$$ \begin{align*} \operatorname{{\mathrm{var}}}_{k}(\phi)\leq \frac{C'}{k^{({1+ \delta})/{2}}} \end{align*} $$

for some $C'> 0$ and the same $\delta $ as in (1). Observe that this bound allows $ \operatorname {{\mathrm {var}}}_k(\phi )$ to be non-summable when $\delta \in (0, 1]$ .

Now, consider $\mathcal {X}\times \mathcal {X}$ with the projection maps $\hat {\eta }=(\hat {\eta }_n)_{n\in \mathbb {Z}}$ and $\hat {\omega }=(\hat {\omega }_n)_{n\in \mathbb {Z}}$ such that for $(x,y)\in \mathcal {X}\times \mathcal {X}$ , $\hat {\eta }_n(x,y)=x_n$ and $\hat {\omega }_n(x,y)=y_n$ for all $n\in \mathbb {Z}$ . Let us also denote by $\widehat {\mathcal {C}}(\phi )$ the set of probability measures P on $\mathcal {X}\times \mathcal {X}$ such that the pushforward measures $\hat {\eta }_*P$ and $\hat {\omega }_*P$ are compatible with $\phi $ . We also introduce the process $X=(X_n)_{n\geq 1}$ such that for all $n\geq 1$ ,

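$$ \begin{align*} X_n=\mathbf{1}\{\hat{\eta}_{K_n}^{K_{n+1}-1}\neq \hat{\omega}_{K_n}^{K_{n+1}-1}\}, \end{align*} $$

where $(K_n)_{n\geq 1}$ is a fixed strictly increasing sequence of natural numbers such that $K_1=1$ . Here is our main result followed by two corollaries.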

Theorem 3.2. Let $\phi $ be a normalized potential that satisfies Assumption ( $\mathcal {A}$ ). Let $K_n=\lfloor n^{\beta } \rfloor $ for $\beta \geq 1$ and $\beta>1/\delta $ . For all measures $\mu $ and $\nu $ compatible with $\phi $ , there exists $\mathbb {P}\in \widehat {\mathcal {C}}(\phi )$ such that $\hat {\eta }_*\mathbb {P}=\mu $ , $\hat {\omega }_*\mathbb {P}=\nu $ , and, for $n\geq 1$ ,

$$ \begin{align*} \mathbb{P}[X_n=1]\leq \frac{C_1}{n^{({\beta \delta+1})/{2}}}, \end{align*} $$

where $C_1$ is a positive constant depending on $C, \delta $ , and $\beta $ .

Corollary 3.3. Let $\phi $ be a normalized potential that satisfies Assumption ( $\mathcal {A}$ ). If $\delta>1$ , we have for all $n\geq 1$ ,

$$ \begin{align*} L(n)\leq \frac{C_2}{n^{({1+\delta})/{2}}}, \end{align*} $$

where $C_2$ is a positive constant depending on C and $\delta $ .

If $\delta \in (0,1]$ , we have for all $n\geq 1$ and $\delta '<\delta $ ,

$$ \begin{align*} L(n)\leq \frac{C_3}{n^{\delta'}}, \end{align*} $$

where $C_3$ is a positive constant that depends on C, $\delta $ , and $\delta '$ .

Corollary 3.4. Let $\phi $ be a normalized potential that satisfies Assumption ( $\mathcal {A}$ ). For all $\delta '<\delta $ , we have for all $n\geq 1$ ,

$$ \begin{align*} M(n) \leq \frac{C_4}{n^{{\delta'}/{2}}}, \end{align*} $$

where $C_4$ is a positive constant that depends on C, $\delta $ , and $\delta '$ .

Remark 3.5. When $\delta>1$ and A is finite, we can use [BFG99, Theorem 1] and the union bound to obtain

$$ \begin{align*} M(n) \leq \frac{C_5}{n^{({\delta-1})/{2}}}, \end{align*} $$

where $C_5> 0$ is a constant that depends on C and $\delta $ . Hence, the result in Corollary 3.4 gives a sharper upper bound, even when the variation of the potential is summable and the alphabet A is finite.

We now turn to the decay of correlations for the shift-invariant measure compatible with a potential $\phi $ . For this, we need the following definitions. Consider the shift operator $T: \mathcal {X}_{-}\to \mathcal {X}_{-}$ such that for all $x\in \mathcal {X}_{-}$ , $Tx=Tx_{-\infty }^0=x_{-\infty }^{-1}$ . For non-constant $\phi $ , let us consider the seminorm

$$ \begin{align*} \|f\|_{\phi}=\sup_{k\geq 1}\frac{\operatorname{{\mathrm{var}}}_k(f)}{\operatorname{{\mathrm{var}}}_k(e^\phi)} \end{align*} $$

and the subspace of $\mathcal {C}(\mathcal {X}_-,\mathbb {R})$ defined by

$$ \begin{align*}V_{\phi}=\{f \in \mathcal{C}(\mathcal{X}_-,\mathbb{R}):\|f\|_{\phi}<\infty \}.\end{align*} $$

Theorem 3.6. Let $\phi $ be a normalized potential that satisfies Assumption ( $\mathcal {A}$ ). Assume that a shift-invariant probability measure $\tilde {\mu }$ compatible with $\phi $ exists. Let $f\in L^1(\tilde {\mu })$ and $\hat {f}\in V_{\phi }$ .

If $\delta>1$ , we have for all $n\geq 1$ ,

$$ \begin{align*} \rho_{f,\hat{f}}(n) \leq \frac{C_6}{n^{({1+\delta})/{2}}}\|f\|_1 \|\hat{f}\|_{\phi}, \end{align*} $$

where $C_6$ is a positive constant that depends on C and $\delta $ .

If $\delta \in (0,1]$ , we have for all $n\geq 1$ and $\delta '<\delta $ ,

$$ \begin{align*} \rho_{f,\hat{f}}(n) \leq \frac{C_7}{n^{\delta'}}\|f\|_1 \|\hat{f}\|_{\phi}, \end{align*} $$

where $C_7$ is a positive constant that depends on C, $\delta $ , and $\delta '$ .

Remark 3.7. When $\delta>1$ and A is finite, Theorem 3.6 recovers the rate obtained in [BFG99, Theorem 1].

Remark 3.8. When A is finite, continuity of $\phi $ guarantees the existence of a compatible shift-invariant measure; therefore, the assumption on the existence of a compatible measure in Theorem 3.6 is redundant. When A is infinite, the existence of a shift-invariant compatible measure is not immediate. Sufficient conditions for existence of shift-invariant compatible measures when A is infinite are given in [FM05, JÖP07]. See §8.3 for a concrete example. Whenever a shift-invariant compatible measure exists, Assumption ( $\mathcal {A}$ ) implies uniqueness of $\tilde {\mu }$ in Theorem 3.6 [JÖP07], although uniqueness is not a priori necessary for Theorem 3.6.

A natural question is whether we can obtain an upper bound for the rate of decay of correlations for a potential $\phi $ that is not normalized. When A is finite, we can use the same strategy as in [BFG99, Pol00, Wal75]. The idea is to study normalized potentials $\psi $ that are cohomologous to $\phi $ , that is, $\psi = \phi + h - h\circ T + c$ for some $h \in \mathcal {C}(\mathcal {X}_-,\mathbb {R})$ and $c \in \mathbb {R}$ . If $\phi $ and $\psi $ are cohomologous, then both functions have the same associated equilibrium states [Wal75]. Hence, properties of equilibrium states for $\phi $ can be obtained by studying shift-invariant measures compatible with $\psi $ . Walters [Wal75] proved that when the rate of variation of $\phi $ is summable, there exist a unique h and a unique c such that $\psi $ is a normalized potential. Moreover, from the construction of h in [Wal75], we have that $\operatorname {{\mathrm {var}}}_{k}(h) \leq \sum _{j \geq k}\operatorname {{\mathrm {var}}}_{j}(\phi )$ . This implies that $\operatorname {{\mathrm {var}}}_{k}(\psi ) \leq 3\sum _{j \geq k}\operatorname {{\mathrm {var}}}_{j}(\phi )$ . Using these results, we obtain the following corollary, which improves the results in [BFG99, Pol00].

Corollary 3.9. Let the alphabet A be finite and $\phi $ be a potential not necessarily normalized. Assume that there exist a constant $C> 0$ and $\delta> 0$ such that

$$ \begin{align*} \operatorname{{\mathrm{var}}}_{k}(\phi)\leq \frac{C}{k^{({3+ \delta})/{2}}}. \end{align*} $$

Let $\tilde {\mu }$ be an equilibrium state for $\phi $ , $f\in L^1(\tilde {\mu })$ , and $\hat {f}\in V_{\phi }$ .

If $\delta>1$ , we have for all $n\geq 1$ ,

$$ \begin{align*} \rho_{f,\hat{f}}(n) \leq \frac{C_8}{n^{({1+\delta})/{2}}}\|f\|_1 \|\hat{f}\|_{\phi}, \end{align*} $$

where $C_8$ is a positive constant that depends on C and $\delta $ .

If $\delta \in (0,1]$ , we have for all $n\geq 1$ and $\delta '<\delta $ ,

$$ \begin{align*} \rho_{f,\hat{f}}(n) \leq \frac{C_9}{n^{\delta'}}\|f\|_1 \|\hat{f}\|_{\phi}, \end{align*} $$

where $C_9$ is a positive constant that depends on C, $\delta $ , and $\delta '$ .

Remark 3.10. When $\delta> 1$ , Corollary 3.9 recovers the rate obtained in [Pol00, Theorem 1(1)]. To generalize Corollary 3.9 to an infinite alphabet, we need a result equivalent to [Wal75, Theorem 3.3] for an infinite alphabet, which is currently unavailable.

4 Technical lemmas

Here we collect some results that we will use to prove Theorem 3.2. We first recall the definitions of the Kullback–Leibler and Pearson $\chi ^2$ divergences. Let P and Q be two probabilities on some discrete space $\mathcal {Y}$ . We define

$$ \begin{align*} D_{\text{KL}}(P||Q)=\sum_{y\in \mathcal{Y}}P(y)\ln\bigg(\frac{P(y)}{Q(y)}\bigg) \end{align*} $$

and

$$ \begin{align*} D_{\chi^2}(P||Q)=\sum_{y\in \mathcal{Y}}\frac{(P(y)-Q(y))^2}{Q(y)}. \end{align*} $$

It is well known that $D_{\text {KL}}(P||Q)\leq D_{\chi ^2}(P||Q)$ (cf. [SV16, eq. 5]).
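As a numerical illustration of the two divergences and of the inequality between them, the following sketch evaluates both on a three-point space. The function names and the list-based representation of $P$ and $Q$ are assumptions made for the example; both distributions are taken strictly positive so that every term is finite.

```python
import math


def kl_divergence(p, q):
    """D_KL(P||Q) for strictly positive finite distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def chi2_divergence(p, q):
    """Pearson chi-square divergence D_{chi^2}(P||Q)."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
kl = kl_divergence(p, q)      # 0.3 * ln(2.5), about 0.275
chi2 = chi2_divergence(p, q)  # 0.45 + 0.18 = 0.63 up to rounding
```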

Lemma 4.1. Let $x, y\in \mathcal {X}_-$ and $\mu ,\nu \in \mathcal {P}(\phi )$ such that $\mu [\eta _{-\infty }^{0}\in \cdot \;]=\delta _x(\cdot )$ and $\nu [\eta _{-\infty }^{0}\in \cdot \;]=\delta _y(\cdot )$ . For all $n\geq 1$ , $0\leq k\leq n-1$ , and all $a, b, c\in \mathcal {X}$ , we have

(2)

$$ \begin{align} D_{\text{KL}}&(\mu[\eta_{K_{n}}^{K_{n+1}-1}\in\; \cdot\; | \eta_1^{K_n-1}= a_{K_{n-k}}^{K_n-1} b_{1}^{K_{n-k}-1}]\nonumber\\ &|| \nu [\eta_{K_{n}}^{K_{n+1}-1}\in \;\cdot \; | \eta_1^{K_n-1}=a_{K_{n-k}}^{K_n-1}c_{1}^{K_{n-k}-1}])\nonumber\\ &\leq \sum_{j=K_n}^{K_{n+1}-1}\chi^2_{j-K_{n-k}}(\phi). \end{align} $$

Proof. Let us simply denote by D the left-hand term of inequality (2). We have by the chain rule property of the Kullback–Leibler divergence [CT06, Theorem 2.5.3] that

$$ \begin{align*} D &=\sum_{i=K_n}^{K_{n+1}-1}D_{\text{KL}}( \mu[\eta_i\in \cdot \; | \eta_1^{i-1}=z_{K_n}^{i-1}a_{K_{n-k}}^{K_n-1} b_{1}^{K_{n-k}-1}] \nonumber\\ &\,\,\quad\phantom{\sum_{i=K_n}^{K_{n+1}-1}D_{\text{KL}}}|| \nu[\eta_i\in \cdot \; | \eta_1^{i-1}=z_{K_n}^{i-1}a_{K_{n-k}}^{K_n-1}c_{1}^{K_{n-k}-1}] )\nonumber\\ &=:\sum_{i=K_n}^{K_{n+1}-1}D_i. \end{align*} $$

Then we use the well-known bound

$$ \begin{align*} D_i&\leq D_{\chi^2}(\mu[\eta_i\in \cdot \; | \eta_1^{i-1}=z_{K_n}^{i-1}a_{K_{n-k}}^{K_n-1} b_{1}^{K_{n-k}-1}]\nonumber\\ &\phantom{\leq D_{\chi^2}}|| \nu[\eta_i\in \cdot \; | \eta_1^{i-1}=z_{K_n}^{i-1}a_{K_{n-k}}^{K_n-1}c_{1}^{K_{n-k}-1}])\nonumber\\ &\leq \chi^2_{i-K_{n-k}}(\phi) \end{align*} $$

to conclude the proof.

Lemma 4.2. For $\alpha> 1$ and $0<a<b$ , we have

(3)

$$ \begin{align} \frac{(b+1)^{\alpha}-a^{\alpha}}{b^{\alpha}-a^{\alpha}}\geq \frac{(b+1)^{\alpha-1}-a^{\alpha-1}}{b^{\alpha-1}-a^{\alpha-1}}. \end{align} $$

Proof. By algebraic computations, we obtain that (3) is equivalent to

$$ \begin{align*} \bigg(\frac{b}{a}\bigg)^{\alpha-1}\geq 1+(b-a)\bigg(1-\bigg(\frac{b}{b+1}\bigg)^{\alpha-1}\bigg). \end{align*} $$

This last inequality is obtained from the Bernoulli inequality $(1+x)^{r}\geq 1+rx$ , for $r>0$ and $x>-1$ , observing that

$$ \begin{align*} \bigg(\frac{b}{a}\bigg)^{\alpha-1}=\bigg(1+\frac{b-a}{a}\bigg)^{\alpha-1}\geq 1+(\alpha-1)\frac{b-a}{a} \end{align*} $$

and

$$ \begin{align*} \bigg(\frac{b}{b+1}\bigg)^{\alpha-1}=\bigg(1-\frac{1}{b+1}\bigg)^{\alpha-1}\geq 1-(\alpha-1)\frac{1}{b+1}. \end{align*} $$
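Inequality (3) can also be sanity-checked numerically. The sketch below evaluates both sides on a small grid of values of $\alpha $ , $a$ , and $b=a+\text {gap}$ ; the grid, the tolerance, and the function names are arbitrary choices made for illustration, and a passing check is of course no substitute for the proof.

```python
def lhs(alpha, a, b):
    """Left-hand side of inequality (3)."""
    return ((b + 1) ** alpha - a ** alpha) / (b ** alpha - a ** alpha)


def rhs(alpha, a, b):
    """Right-hand side of inequality (3)."""
    e = alpha - 1
    return ((b + 1) ** e - a ** e) / (b ** e - a ** e)


def holds_on_grid(alphas, a_values, gaps, tol=1e-9):
    """Check lhs >= rhs (up to a relative tolerance) for all grid points with b = a + gap."""
    for alpha in alphas:
        for a in a_values:
            for gap in gaps:
                b = a + gap
                if lhs(alpha, a, b) < rhs(alpha, a, b) * (1 - tol):
                    return False
    return True


ok = holds_on_grid(
    alphas=[1.1, 1.5, 2.0, 3.0, 5.0],
    a_values=[0.5, 1.0, 4.0, 10.0],
    gaps=[0.5, 1.0, 3.0, 7.0],
)
```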

Define, for all $\delta>0$ , $\beta \geq 1$ , $k\geq 3$ , and $n\geq k+1$ ,

$$ \begin{align*} \Delta^n_k:=(n^{\beta}-(n-k)^{\beta}-2)^{-\delta}-((n+1)^{\beta}-(n-k)^{\beta})^{-\delta}. \end{align*} $$

Lemma 4.3. For all $\delta>0$ , $\beta \geq 1$ , and $k\geq 3$ , $\Delta _k^n$ is a non-increasing function of $n\geq k+1$ .

Proof. The statement of the lemma is trivial for $\beta =1$ . For $\beta>1$ , consider the function $f:[4,\infty )\to \mathbb {R}^+$ defined by

$$ \begin{align*} f(x)=(x^{\beta}-(x-k)^{\beta}-2)^{-\delta}-((x+1)^{\beta}-(x-k)^{\beta})^{-\delta}. \end{align*} $$

In order to prove the result, it is enough to show that the derivative of f is negative. Since

$$ \begin{align*} f'(x)&=-\delta \beta[(x^{\beta}-(x-k)^{\beta}-2)^{-\delta-1}(x^{\beta-1}-(x-k)^{\beta-1})\nonumber\\ &\phantom{**}-((x+1)^{\beta}-(x-k)^{\beta})^{-\delta-1}((x+1)^{\beta-1}-(x-k)^{\beta-1})], \end{align*} $$

it is enough to show that

$$ \begin{align*} \frac{(x+1)^{\beta}-(x-k)^{\beta}}{x^{\beta}-(x-k)^{\beta}-2}\geq \frac{(x+1)^{\beta-1}-(x-k)^{\beta-1}}{x^{\beta-1}-(x-k)^{\beta-1}}. \end{align*} $$

But this last inequality follows from Lemma 4.2.
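The monotonicity stated in Lemma 4.3 is easy to observe numerically. The following sketch computes $\Delta ^n_k$ directly from its definition and checks that consecutive values do not increase; the parameter choices, the range of $n$ , and the absolute tolerance are assumptions made for the illustration.

```python
def delta_nk(n, k, beta, delta):
    """Delta^n_k as defined before Lemma 4.3."""
    first = (n ** beta - (n - k) ** beta - 2) ** (-delta)
    second = ((n + 1) ** beta - (n - k) ** beta) ** (-delta)
    return first - second


def is_non_increasing(k, beta, delta, n_max=200, tol=1e-12):
    """Check that n -> Delta^n_k does not increase for k + 1 <= n <= n_max."""
    values = [delta_nk(n, k, beta, delta) for n in range(k + 1, n_max + 1)]
    return all(later <= earlier + tol for earlier, later in zip(values, values[1:]))


checks = [
    is_non_increasing(k=3, beta=1.0, delta=0.5),
    is_non_increasing(k=3, beta=2.0, delta=0.5),
    is_non_increasing(k=5, beta=1.5, delta=1.0),
    is_non_increasing(k=10, beta=3.0, delta=0.2),
]
```

For $\beta =1$ the sequence is constant in $n$ , which is the trivial case mentioned in the proof.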

Lemma 4.4. For all $\delta>0$ , $\beta \geq 1$ , and $k\geq 3$ , we have

$$ \begin{align*} \Delta_k^{k+1}\leq 4\frac{2^{\beta} 4^{\delta}\beta}{k^{\delta\beta+1}}. \end{align*} $$

Proof. Observe that for $k\geq 3$ ,

(4)

$$ \begin{align} \Delta_k^{k+1}&=\int_{(k+1)^{\beta}-3}^{(k+2)^{\beta}-1}\frac{1}{x^{1+\delta}}dx \nonumber\\[5pt] & \leq \frac{(k+2)^{\beta}-(k+1)^{\beta}+2}{((k+1)^{\beta}-3)^{1+\delta}} \leq \frac{\beta (k+2)^{\beta-1}+2}{((k+1)^{\beta}-3)^{1+\delta}} \leq 4\frac{2^{\beta} 4^{\delta}\beta}{k^{\delta\beta+1}}, \end{align} $$

where, to obtain the second inequality in (4), we used the inequality

$$ \begin{align*} (a+b)^{\alpha}\leq a^{\alpha}+\alpha b (a+b)^{\alpha-1} \end{align*} $$

for $\alpha \geq 1$ and $a, b\geq 0$ . This inequality can be obtained using the fundamental theorem of calculus applied to the function $f(x)=x^{\alpha }$ .

To obtain the last inequality in (4), we used that for $k\geq 3$ ,

$$ \begin{align*} \beta (k+2)^{\beta-1}+2\leq 2\beta(k+2)^{\beta-1}\leq 2^{\beta}\beta k^{\beta-1} \end{align*} $$

and

$$ \begin{align*} (k+1)^{\beta}-3=(k+1)^{\beta}\bigg(1-\frac{3}{(k+1)^{\beta}}\bigg)\geq \frac{(k+1)^{\beta}}{4}\geq \frac{k^{\beta}}{4}. \end{align*} $$

Finally, we recall the following lemma in [BFG99] (see also Lemma A.4 in [Gia07]) that gives an estimate for the renewal sequence that will appear in the proof of Theorem 3.2. We state the lemma using a notation that is adapted to our purpose.

Lemma 4.5. (Proposition 2, item (iv) in [BFG99])

Let $(f_k)_{k \geq 1}$ be a sequence of positive real numbers such that $\sum _{k=1}^\infty f_k < 1$ . Suppose that $(u_k)_{k \geq 0}$ is a sequence with $u_0 = 1$ that satisfies the renewal equation

$$ \begin{align*} u_n = \sum_{k=1}^n f_k u_{n-k}. \end{align*} $$

If $f_n \leq c_1/n^{1+\alpha }$ for some $\alpha>0$ and a positive constant $c_1$ , then $u_n \leq c_2/n^{1+\alpha }$ , where $c_2$ is a constant that depends on $(f_k)_{k \geq 1}$ .
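To see the renewal equation of Lemma 4.5 at work, the following sketch computes $u_0,\ldots ,u_N$ by direct recursion. The function name, the callable representation of $(f_k)$ , and the example sequence $f_k=0.3/k^{2}$ (for which $\sum _k f_k<1$ ) are assumptions chosen for illustration.

```python
def renewal_sequence(f, n_max):
    """Return [u_0, ..., u_{n_max}] with u_0 = 1 and u_n = sum_{k=1}^n f(k) * u_{n-k}."""
    u = [1.0]
    for n in range(1, n_max + 1):
        u.append(sum(f(k) * u[n - k] for k in range(1, n + 1)))
    return u


# Example with polynomially decaying f_k = 0.3 / k^2, so alpha = 1 in the lemma.
alpha = 1.0
u = renewal_sequence(lambda k: 0.3 / k ** (1 + alpha), 200)

# The lemma predicts that n^(1 + alpha) * u_n stays bounded; inspect the rescaled tail.
rescaled_tail = [n ** (1 + alpha) * u[n] for n in (50, 100, 200)]
```

The recursion costs $\mathcal {O}(N^2)$ operations, which is adequate for exploring the decay rate on moderate ranges of $n$ .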

5 Proof of Theorem 3.2

Let $x, y\in \mathcal {X}_-$ and $\mu , \nu $ compatible measures such that $\mu [\eta _{-\infty }^{0}\in \cdot \;]=\delta _x(\cdot )$ and $\nu [\eta _{-\infty }^{0}\in \cdot \;]=\delta _y(\cdot )$ . We now construct the coupling of $\mu $ and $\nu $ , that we call $\mathbb {P}^{x,y}$ , as follows. We start by defining

$$ \begin{align*}\mathbb{P}^{x,y}[\hat{\eta}_{-\infty}^{0}\in\cdot\;, \hat{\omega}_{-\infty}^{0}\in\cdot\;]=\delta_x \otimes\delta_y.\end{align*} $$

Then, for all $n\geq 1$ , given the pasts $\hat {\eta }_{-\infty }^{K_n-1}$ and $\hat {\omega }_{-\infty }^{K_n-1}$ , we maximally couple $\hat {\eta }_{K_n}^{K_{n+1}-1}$ and $\hat {\omega }_{K_n}^{K_{n+1}-1}$ to complete the construction of $\mathbb {P}^{x,y}$ .
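For readers less familiar with maximal couplings, the sketch below builds one for two distributions on a finite set: the joint law puts mass $\min (p_i,q_i)$ on the diagonal and spreads the remaining mass as a product off the diagonal, so that the two coordinates differ with probability exactly $d_{\text {TV}}$ . This is a generic textbook construction on a finite set, given as an illustration only; in the proof the coupled objects are the conditional laws of whole blocks given the pasts, and the function name and list representation are assumptions.

```python
def maximal_coupling(p, q):
    """Joint law (nested list) of a maximal coupling of two finite distributions."""
    m = len(p)
    overlap = [min(pi, qi) for pi, qi in zip(p, q)]
    tv = 1.0 - sum(overlap)  # total variation distance between p and q
    joint = [[0.0] * m for _ in range(m)]
    for i in range(m):
        joint[i][i] = overlap[i]  # mass on which both coordinates agree
    if tv > 0:
        for i in range(m):
            for j in range(m):
                # Excess masses; the product vanishes on the diagonal because
                # at least one of the two factors is zero when i == j.
                joint[i][j] += (p[i] - overlap[i]) * (q[j] - overlap[j]) / tv
    return joint


p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
joint = maximal_coupling(p, q)
mismatch = 1.0 - sum(joint[i][i] for i in range(len(p)))  # equals d_TV(p, q) = 0.3
```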

Next, we show that $\mathbb {P}^{x,y}$ satisfies the inequality in Theorem 3.2. For all $n\geq 1$ and $0\leq k\leq n-1$ , define

$$ \begin{align*} q^n_k=\sup_{x,y,a,b \in\mathcal{X}}\mathbb{P}^{x,y}[X_{n}=1\mid X_{n-k}^{n-1}=0, \hat{\eta}^{K_{n-k}-1}_{1} = a_{1}^{K_{n-k}-1}, \hat{\omega}^{K_{n-k}-1}_{1} = b_{1}^{K_{n-k}-1}] \end{align*} $$

with the convention that if $k>l$ , then elements of the form $a_k^l$ are dropped from the conditional part. The shorthand notation $X_{n-k}^{n-1}=0$ means that $X_{n-k} = 0, \ldots , X_{n-1} = 0$ . Observe that for all $n\geq 1$ and $0\leq k\leq n-2$ , we have $q^n_{k} \geq q^n_{k+1}$ .

We start by proving the following lemma.

Lemma 5.1. Suppose that $(\chi ^2_j(\phi ))_{j\geq 0}\in \ell ^1$ . Then there exists $\varepsilon>0$ such that for all ${k\geq 0}$ and all $n\geq k+1$ ,

(5)

$$ \begin{align} q_k^n\leq\sqrt{1-\exp\bigg(-\sum_{j=0}^{\infty}\chi_j^2(\phi)\bigg)}\leq 1-\varepsilon. \end{align} $$

For $k\geq 1$ and $n\geq k+1$ , we also have

(6)

$$ \begin{align} q_k^n\leq \sqrt{\frac{1}{2}\sum_{j=K_n}^{K_{n+1}-1}\chi^2_{j-K_{n-k}}(\phi)}. \end{align} $$

Proof. Inequality (5) is a direct consequence of the Bretagnolle–Huber inequality (cf. [SV16, eq. 4]) and Lemma 4.1. Inequality (6) is a direct consequence of the Pinsker inequality (cf. [SV16, eq. 1]) and again Lemma 4.1.
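Both inequalities bound a total variation distance by a function of a Kullback–Leibler divergence: Pinsker gives $d_{\text {TV}}\leq \sqrt {D_{\text {KL}}/2}$ and Bretagnolle–Huber gives $d_{\text {TV}}\leq \sqrt {1-e^{-D_{\text {KL}}}}$ . The following sketch checks the two bounds on randomly generated distributions; the sampling scheme, the seed, and the function names are assumptions made for the illustration.

```python
import math
import random


def tv_distance(p, q):
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def random_distribution(rng, size):
    """Strictly positive random probability vector (weights bounded away from 0)."""
    weights = [rng.random() + 1e-3 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def check_bounds(trials=1000, size=4, seed=1, tol=1e-12):
    """Return True if Pinsker and Bretagnolle-Huber hold on all sampled pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = random_distribution(rng, size)
        q = random_distribution(rng, size)
        tv = tv_distance(p, q)
        kl = max(kl_divergence(p, q), 0.0)  # guard against rounding below zero
        pinsker = math.sqrt(kl / 2.0)
        bretagnolle_huber = math.sqrt(1.0 - math.exp(-kl))
        if tv > pinsker + tol or tv > bretagnolle_huber + tol:
            return False
    return True


all_hold = check_bounds()
```

Pinsker is the sharper of the two for small divergences, while the Bretagnolle–Huber bound stays strictly below $1$ for any finite divergence, which is what yields the uniform gap $\varepsilon $ in (5).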

Now, on some probability space $(\Omega , \mathcal {F}, P)$ , consider the random process $Y=(Y_n)_{n\geq 0}$ with values in $\{0,1\}$ such that $Y_0=1$ and, for $n\geq 1$ and $0\leq k\leq n-1$ ,

$$ \begin{align*} P[Y_n=1\mid Y^{n-1}_{n-k}=0, Y_{n-k-1}=1, Y_1^{n-k-2}]=q^n_k. \end{align*} $$

For all $m\geq 1$ and $a,b \in \{0,1\}^{m}$ , we say that $a\geq b$ if $a_i \geq b_i$ for $i \in \{1,\ldots , m\}$ . By construction, for all $n\geq 2$ , $a,b \in \{0,1\}^{n-1}$ , and $a\geq b$ , we have $P[Y_n = 1 | Y^{n-1}_1 = a] \geq \mathbb {P}^{x,y}[X_n = 1 | X^{n-1}_1 = b]$ . Therefore, by applying Strassen’s theorem on stochastic domination [Lin99] inductively on n, we can construct a coupling measure Q such that, for $a,b \in \{0,1\}^{n-1}$ and $a\geq b$ , we have $Q[Y_n \geq X_n| Y^{n-1}_1 = a, X^{n-1}_1 = b] = 1$ . Therefore, for all $n\geq 1$ , we have $Q[Y_n \geq X_n] = 1$ , which implies that

(7)

$$ \begin{align} P[Y_n = 1] \geq \mathbb{P}^{x,y}[X_n = 1] \end{align} $$

for all $n \geq 1$ .
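For two Bernoulli variables, the one-step coupling underlying this domination argument can be realized with a single uniform random number. The sketch below shows only this elementary special case (the standard monotone coupling), not the inductive construction of $Q$; the function name and parameters are hypothetical.

```python
def monotone_coupling(p, q, u):
    """Couple x ~ Bernoulli(p) and y ~ Bernoulli(q) through one uniform u in [0, 1).

    When p <= q, the event {u < p} is contained in {u < q}, so y >= x always.
    """
    x = 1 if u < p else 0
    y = 1 if u < q else 0
    return x, y


# Deterministic grid standing in for uniform draws (illustration only).
grid = [(i + 0.5) / 1000 for i in range(1000)]
pairs = [monotone_coupling(0.25, 0.6, u) for u in grid]
print(all(y >= x for x, y in pairs))
```

Each marginal is preserved because $\{u<p\}$ has probability $p$ and $\{u<q\}$ has probability $q$; the ordering comes for free from the shared $u$.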

Now, consider the process $Z=(Z_n)_{n\geq 0}$ with values in $\{0,1\}$ such that $Z_0=1$ and, for $n\geq 1$ and $0\leq k\leq n-1$ ,

(8)

$$ \begin{align} P[Z_n=1\mid Z^{n-1}_{n-k}=0, Z_{n-k-1}=1, Z_1^{n-k-2}]=b_k:=\sup_{n\geq k+1} q_k^n. \end{align} $$

Observe that, for all $k \geq 0$ , we have $b_k \geq b_{k+1}$ . Using the same argument used to show (7), we have that $P[Z_n = 1] \geq P[Y_n = 1]$ for all $n\geq 1$ . Also, by (6) and Lemmas 4.3 and 4.4, we have that for $k\geq 3$ and $n\geq k+1$ ,

$$ \begin{align*} 2(q_k^n)^2&\leq \sum_{j=K_n}^{K_{n+1}-1}\chi^2_{j-K_{n-k}}(\phi)\\ &\leq C\sum_{j=\lfloor n^{\beta}\rfloor}^{\lfloor(n+1)^{\beta}\rfloor-1}\frac{1}{(j-\lfloor (n-k)^{\beta}\rfloor)^{1+\delta}}=C\sum_{j=\lfloor n^{\beta}\rfloor-\lfloor (n-k)^{\beta}\rfloor}^{\lfloor(n+1)^{\beta}\rfloor-\lfloor (n-k)^{\beta}\rfloor-1}\frac{1}{j^{1+\delta}}\\ &\leq C\int_{n^{\beta}-(n-k)^{\beta}-2}^{(n+1)^{\beta}-(n-k)^{\beta}}\frac{1}{x^{1+\delta}}dx=C\frac{\Delta_k^n}{\delta}\leq C\delta^{-1}\Delta_k^{k+1}\leq 4C\frac{2^{\beta} 4^{\delta}\beta\delta^{-1}}{k^{\delta\beta+1}}. \end{align*} $$

Using (5), we obtain that $b_k\leq (2C({2^{\beta } 4^{\delta }\beta \delta ^{-1}})/({k^{\delta \beta +1}}))^{1/2}\wedge (1-\varepsilon )$ for all $k\geq 3$ and $b_k\leq 1-\varepsilon $ for $0\leq k\leq 2$ .

Next, let $f_i:= b_{i-1}\prod _{k=0}^{i-2}(1-b_k)$ for $i\geq 1$ (with the convention that $\prod _{j=0}^{-1}=1$ ) and $u_i:= P[Z_i = 1]$ for $i \geq 0$ . We have that

$$ \begin{align*}P[Z_n = 1] = \sum_{k= 1}^nP[Z_n = 1, Z^{n-1}_{n-k+1} = 0| Z_{n-k} = 1]P[Z_{n-k} = 1]\end{align*} $$

and hence the following renewal equation holds:

$$ \begin{align*}u_n = \sum_{k = 1}^nf_{k}u_{n-k}.\end{align*} $$

By definition, we have that $\sum _{k= 1}^\infty f_k = 1-\prod _{k = 0}^\infty (1-b_k)$. If $\beta>\delta ^{-1}$, we have $\sum _{k = 0}^\infty b_k < \infty $. Since, in addition, $b_k\leq 1-\varepsilon <1$ for all $k\geq 0$ by (5), the infinite product is strictly positive and hence $\sum _{k= 1}^\infty f_k < 1$. Moreover, when $\beta>\delta ^{-1}$, we have that $b_k \leq c_1k^{-(\delta \beta +1)/2}$ for some positive constant $c_1$ that depends on $C, \delta $, and $\beta $. From Lemma 4.5, we have that, for all $n\geq 1$,

$$ \begin{align*} u_n\leq \frac{C_1}{n^{({\delta\beta+1})/{2}}}, \end{align*} $$

where $C_1$ is a positive constant that depends on $C, \delta $ , and $\beta $ . Because $P[Z_n = 1] \geq \mathbb {P}^{x,y}[X_n = 1]$ , we obtain that for all $n\geq 1$ ,

$$ \begin{align*} \mathbb{P}^{x,y}[X_n=1]\leq \frac{C_1}{n^{({\delta\beta+1})/{2}}} \end{align*} $$

for $\beta>\delta ^{-1}$ . Because the bound is uniform on $x,y \in \mathcal {X}_-$ , we obtain the desired result.
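The renewal equation used in this proof is easy to evaluate numerically. The following sketch computes $u_n=P[Z_n=1]$ from a given sequence $(b_k)_{k\geq 0}$ through the recursion $u_n=\sum_{k=1}^n f_k u_{n-k}$; the function name and the particular polynomially decaying choice of $b_k$ in the demonstration are assumptions made for illustration, not values taken from the paper.

```python
def renewal_probabilities(b, n_max):
    """Return [u_0, ..., u_{n_max}] with u_n = P[Z_n = 1] and u_0 = 1.

    b(k) is the probability of a 1 given exactly k zeros since the last 1.
    """
    # f[i] = b(i-1) * prod_{k=0}^{i-2} (1 - b(k)): first return to 1 after i steps.
    f = [0.0] * (n_max + 1)
    survive = 1.0
    for i in range(1, n_max + 1):
        f[i] = b(i - 1) * survive
        survive *= 1.0 - b(i - 1)
    u = [0.0] * (n_max + 1)
    u[0] = 1.0
    for n in range(1, n_max + 1):
        u[n] = sum(f[k] * u[n - k] for k in range(1, n + 1))
    return u


# Hypothetical example: b_k decays like k^{-gamma}, capped away from 1.
gamma = 1.5
u = renewal_probabilities(lambda k: min(0.5, (k + 1) ** (-gamma)), 200)
print(u[200] * 200 ** gamma)
```

For constant $b_k=p$ the process $Z$ is an i.i.d. Bernoulli($p$) sequence after time 0, so the recursion should return $u_n=p$ for every $n\geq 1$, which gives a simple sanity check.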

6 Proofs of Corollaries 3.3 and 3.4

6.1 Proof of Corollary 3.3

For $k\in [K_n, K_{n+1})$ and all $y,z \in \mathcal {X}_-$, using the coupling inequality for total variation distance (cf. [Tho00]), we have

$$ \begin{align*} d_{\text{TV}}(\mu^y[\eta_k \in \cdot\;],\mu^z[\eta_k \in \cdot\;])\leq \mathbb{P}[\hat{\eta}_k\neq \hat{\omega}_k]\leq \mathbb{P}[X_n=1]. \end{align*} $$

Then, by Theorem 3.2, we obtain

$$ \begin{align*} \mathbb{P}[\hat{\eta}_k\neq \hat{\omega}_k]\leq \frac{C_1}{n^{({\beta \delta+1})/{2}}} \end{align*} $$

for all $\beta \geq 1$ and $\beta>\delta ^{-1}$ . If $\delta>1$ , just take $\beta =1$ . In this case $k=n$ and thus we obtain Corollary 3.3 with a constant $C_2$ that depends on C and $\delta $ . If $\delta \in (0,1)$ , since $k\leq (n+1)^{\beta }$ , we have $n\geq k^{1/\beta }-1$ . This leads to

$$ \begin{align*} \mathbb{P}[\hat{\eta}_k\neq \hat{\omega}_k]\leq \frac{C_3}{k^{({\beta \delta+1})/{2\beta}}} \end{align*} $$

for all $k\geq 1$ , where $C_3$ is a positive constant that depends on $C, \delta $ , and $\beta $ . Now, observe that for any $0< \delta '<\delta $ , we can choose $\beta $ such that $\beta \geq 1$ , $\beta>\delta ^{-1}$ , and $({\beta \delta +1})/ {2\beta }\geq \delta '$ .
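The last observation can be made explicit. Since $(\beta\delta+1)/(2\beta)=\delta/2+1/(2\beta)$, the exponent is at least $\delta'$ exactly when $\beta\leq 1/(2\delta'-\delta)$ (whenever $2\delta'>\delta$), and this upper limit exceeds $\delta^{-1}$ because $\delta'<\delta$. The sketch below picks one admissible $\beta$; the function names and the midpoint rule are assumptions made for the example.

```python
def coupling_exponent(beta, delta):
    """Exponent (beta*delta + 1) / (2*beta) of k in the bound with constant C_3."""
    return (beta * delta + 1.0) / (2.0 * beta)


def choose_beta(delta, delta_prime):
    """Pick beta with beta >= 1, beta > 1/delta and exponent >= delta_prime.

    Assumes 0 < delta_prime < delta < 1.
    """
    if not 0.0 < delta_prime < delta < 1.0:
        raise ValueError("expected 0 < delta_prime < delta < 1")
    lower = 1.0 / delta  # beta must exceed this value, which is > 1
    gap = 2.0 * delta_prime - delta
    if gap <= 0.0:
        # The exponent exceeds delta/2 >= delta_prime for every beta.
        return lower + 1.0
    upper = 1.0 / gap  # exponent >= delta_prime iff beta <= upper
    return 0.5 * (lower + upper)


beta = choose_beta(0.6, 0.55)
print(beta, coupling_exponent(beta, 0.6))
```

Taking $\beta$ closer to $\delta^{-1}$ pushes the exponent closer to $\delta$, at the price of a larger constant.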

6.2 Proof of Corollary 3.4

Consider $k\in [K_n, K_{n+1})$ . Let

$$ \begin{align*}\theta=\inf\{m\geq 1: \hat{\eta}_j=\hat{\omega}_j\; \text{for all}\; j\geq m\}\end{align*} $$

with the convention that $\inf \emptyset =\infty $ . We start by observing that

$$ \begin{align*} \mathbb{P}[\theta>k]\leq \mathbb{P}\bigg[\bigcup_{j\geq n} \{ X_j=1\}\bigg]\leq \sum_{j\geq n} \mathbb{P}[X_j=1]. \end{align*} $$

By Theorem 3.2, we obtain

$$ \begin{align*} \mathbb{P}[\theta>k]\leq C_1\sum_{j\geq n} \frac{1}{j^{({\beta \delta+1})/{2}}}\leq \frac{C_1'}{n^{({\beta\delta-1})/{2}}} \end{align*} $$

for $\beta \geq 1$ , $\beta>\delta ^{-1}$ , and $C_1'$ a positive constant that depends on $C, \delta $ , and $\beta $ . Since $n\geq k^{1/\beta }-1$ , we obtain

$$ \begin{align*} \mathbb{P}[\theta>k]\leq \frac{C_4}{k^{({\beta\delta-1})/{2\beta}}} \end{align*} $$

for all $k\geq 1$ and $C_4$ a positive constant that depends on $C, \delta $, and $\beta $. Finally, notice that $({\beta \delta -1})/{2\beta }=\delta /2-1/(2\beta )$, so for all $\delta '<\delta /2$, we can choose $\beta $ large enough such that $({\beta \delta -1})/{2\beta }\geq \delta '$. Using the coupling inequality (cf. [Tho00]), we conclude that

$$ \begin{align*} M(k) := \sup_{y,z} d_{\text{TV}}(\mu^y[(\eta_j)_{j\geq k} \in \cdot\;],\mu^z[(\eta_j)_{j\geq k} \in \cdot\;]) \leq \mathbb{P}[\theta>k].\end{align*} $$
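The passage from the bound on $\mathbb{P}[X_j=1]$ to the bound on $\mathbb{P}[\theta>k]$ rests on the elementary integral comparison $\sum_{j\geq n} j^{-s}\leq n^{-s}+\int_n^\infty x^{-s}\,dx\leq n^{1-s}\,s/(s-1)$ for $s>1$, applied with $s=(\beta\delta+1)/2$. The sketch below checks this numerically on a truncated sum; the function names and the truncation length are assumptions, and a truncated sum only bounds the full series from below.

```python
def tail_sum_bound(n, s):
    """Integral-comparison bound n**(1 - s) * s / (s - 1) on sum_{j >= n} j**(-s), s > 1."""
    if s <= 1.0 or n < 1:
        raise ValueError("need s > 1 and n >= 1")
    return n ** (1.0 - s) * s / (s - 1.0)


def truncated_tail_sum(n, s, terms=100_000):
    """Partial sum of j**(-s) over j = n, ..., n + terms - 1 (a lower bound on the tail)."""
    return sum(j ** (-s) for j in range(n, n + terms))


# Hypothetical parameters: beta * delta = 2 gives s = 1.5.
print(truncated_tail_sum(10, 1.5), tail_sum_bound(10, 1.5))
```

The condition $s>1$ is exactly $\beta\delta>1$, which is why the requirement $\beta>\delta^{-1}$ appears in the proof.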

7 Proof of Theorem 3.6

Consider $K_m=\lfloor m^{\beta } \rfloor $ for $m\ge 1$ . For each $x, y\in \mathcal {X}_-$ , we consider a probability space $(\Omega , \mathcal {F}, \mathbb {P}^{x,y})$ that supports the random elements $\tilde {\eta }$ , $\tilde {\omega }$ , and $\tilde {Z}$ defined as follows. Let $\tilde {\eta }_*\mathbb {P}^{x,y}$ , $\tilde {\omega }_*\mathbb {P}^{x,y}$ be compatible with $\phi $ and $\tilde {\eta }_{-\infty }^{0}=x,\, \tilde {\omega }_{-\infty }^{0}=y$ . Also, for all $m\geq 1$ given the pasts $\tilde {\eta }_{-\infty }^{K_m-1}$ and $\tilde {\omega }_{-\infty }^{K_m-1}$ , the blocks $\tilde {\eta }_{K_m}^{K_{m+1}-1}$ and $\tilde {\omega }_{K_m}^{K_{m+1}-1}$ are maximally coupled. Under $\mathbb {P}^{x,y}$ , the process $\tilde {Z}$ has the same law as the process Z defined in §5 and verifies for all $m\geq 1$ (this is indeed possible since Z stochastically dominates X; see §5). We denote by $\mathbb {E}^{x,y}$ the expectation with respect to $\mathbb {P}^{x,y}$ .
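To recall what "maximally coupled" means in the simplest setting, the following sketch builds a maximal coupling of two probability vectors on a finite set: the joint law puts mass $\min(p_i,q_i)$ on the diagonal and distributes the remaining mass as a product of normalized residuals, so that the probability of disagreement equals the total variation distance. This is the textbook construction for a single finite-valued variable, given here under hypothetical names; the block coupling in the text conditions on the whole past and is not reproduced.

```python
from fractions import Fraction


def maximal_coupling(p, q):
    """Return (joint, tv): joint is a dict {(i, j): mass} with marginals p and q,
    and tv = P[X != Y] is the total variation distance between p and q."""
    overlap = [min(a, b) for a, b in zip(p, q)]
    tv = 1 - sum(overlap)
    joint = {}
    for i, m in enumerate(overlap):
        if m > 0:
            joint[(i, i)] = m
    if tv > 0:
        for i, a in enumerate(p):
            ra = a - overlap[i]  # residual mass of p at i
            for j, b in enumerate(q):
                rb = b - overlap[j]  # residual mass of q at j
                if ra > 0 and rb > 0:
                    joint[(i, j)] = joint.get((i, j), 0) + ra * rb / tv
    return joint, tv


# Exact arithmetic on a toy example (values chosen only for illustration).
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
q = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
joint, tv = maximal_coupling(p, q)
print(tv, joint)
```

At any index at most one of the two residuals is non-zero, so the off-diagonal part never touches the diagonal and the disagreement probability is exactly `tv`.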

Fix some $n\in \mathbb {N}$ and let k be such that $n\in [K_{k-1}, K_{k})$ . We will show that

(9)

$$ \begin{align} \bigg| \int f\circ T^n \;{\hat f}\,d\tilde{\mu}-\int f\,d\tilde{\mu}\int {\hat f}\,d\tilde{\mu} \bigg|\leq c_1 P[Z_k=1] \end{align} $$

for some positive constant $c_1$ that depends only on $C$, $\delta $, and $\beta $. From this point, Theorem 3.6 is easily obtained following the proof of Corollary 3.3. To obtain (9), we follow the argument developed in [BFG99, §5]. Using (3.7) in [BFG99], we first observe that

$$ \begin{align*} \bigg| \int f\circ T^n \;{\hat f}\,d\tilde{\mu}-\int f\,d\tilde{\mu}\int {\hat f}\,d\tilde{\mu}\bigg | \leq \|f\|_1\sup_{x,y} \mathbb{E}^{x,y}[|\hat{f}(\tilde{\eta}_{-\infty}^n)-\hat{f}(\tilde{\omega}_{-\infty}^{n}) |]. \end{align*} $$

For $k\geq 1$ , let

$$ \begin{align*} \theta_k=\inf\{0\leq m\leq k:\tilde{Z}_{k-m}=1\}. \end{align*} $$

We have

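(10)

$$ \begin{align} \mathbb{E}^{x,y}[|\hat{f}(\tilde{\eta}_{-\infty}^n)-\hat{f}(\tilde{\omega}_{-\infty}^{n}) |] \leq \|\hat{f}\|_{\phi} \sum_{j=0}^k \operatorname{{\mathrm{var}}}_{n-K_{k-j}+1}(e^{\phi})\mathbb{P}^{x,y}[\theta_k=j]. \end{align} $$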

Observe that for all $0\leq j\leq k$ ,

$$ \begin{align*} \mathbb{P}^{x,y}[\theta_k=j]&=P[Z_k=0,\ldots,Z_{k-j+1}=0,Z_{k-j}=1]\nonumber\\ &=\prod_{l=1}^j(1-b_{k-j+l})P[Z_{k-j}=1], \end{align*} $$

where the $b_{k-j+l}$ are from (8). Now, observe that for all $i\geq 1$ , we have

$$ \begin{align*} P[Z_i=1]&=\sum_{k=1}^ib_{i-k}P[Z_{i-1}=0\mid Z_k^{i-1}=0, Z_{k-1}=1]\nonumber\\ &=\sum_{k=1}^i b_{i-k}\prod_{l=1}^{i-k}(1-b_{i-k-l})P[Z_{k-1}=1]. \end{align*} $$

Thus, we have that for all $i\geq 1$ ,

$$ \begin{align*} P[Z_i=1]&=\sum_{k=1}^if_kP[Z_{i-k}=1] \end{align*} $$

with $f_k:=b_{k-1}\prod _{l=0}^{k-2}(1-b_l)$ , $k\geq 1$ . From this, we obtain

$$ \begin{align*} \mathbb{E}^{x,y}[|\hat{f}(\tilde{\eta}_{-\infty}^n)-\hat{f}(\tilde{\omega}_{-\infty}^{n}) |]\,{\leq}\, &\|\hat{f}\|_{\phi}\bigg( \operatorname{{\mathrm{var}}}_{n-K_{k}+1}(e^\phi)\sum_{l=1}^k f_lP[Z_{k-l}=1]\nonumber\\ &\kern1.5pt{+}\, \sum_{l=1}^k \operatorname{{\mathrm{var}}}_{n-K_{k-l}+1}(e^\phi)\prod_{m=1}^{l}(1-b_{k-l+m})P[Z_{k-l}=1] \bigg). \end{align*} $$

We deduce that

$$ \begin{align*} \sup_{x,y}\mathbb{E}^{x,y}[|\hat{f}(\tilde{\eta}_{-\infty}^n)-\hat{f}(\tilde{\omega}_{-\infty}^{n}) |]&\leq \kappa \sum_{l=1}^k f_lP[Z_{k-l}=1]=\kappa P[Z_{k}=1] \end{align*} $$

with

$$ \begin{align*}\kappa:=\operatorname{{\mathrm{var}}}_{n-K_{k}+1}(e^\phi)+\sup_{1\leq l\leq k}\frac{\operatorname{{\mathrm{var}}}_{n-K_{k-l}+1}(e^\phi)}{f_l}. \end{align*} $$

Finally, since $({\operatorname {{\mathrm {var}}}_{n-K_{k-l}+1}(e^\phi )})/({b_{l-1}})\leq 2$ and $\prod _{j=0}^{\infty }(1-b_j)>0$ (using that $b_0 < 1$ and $\sum _{j=0}^\infty b_j<\infty $ ), we observe that

$$ \begin{align*} \kappa\leq 1+\frac{\operatorname{{\mathrm{var}}}_{n-K_{k-l}+1}(e^\phi)}{b_{l-1}\prod_{j=0}^{\infty}(1-b_j)}\leq c_2 \end{align*} $$

for some positive constant $c_2$ depending on C, $\delta $ , and $\beta $ .

7.1 Proof of Corollary 3.9

The potential $\phi $ is summable; therefore, there exists a normalized potential $\psi $ with the same unique equilibrium state as $\phi $ [Wal75, Theorem 3.2]. If $\operatorname {{\mathrm {var}}}_{k}(\phi )\leq {C}/{k^{({3+ \delta })/{2}}}$, then $\operatorname {{\mathrm {var}}}_{k}(\psi )\leq {C'}/{k^{({1+ \delta })/{2}}}$ for some constant $C'> 0$ that depends only on $C$ (see, for example, [Pol00, Proposition 1]). Closely following the proof of Theorem 3.6 using the measures compatible with $\psi $, we obtain the desired results. The only difference is that in (10) we use the bound

$$ \begin{align*} \mathbb{E}^{x,y}[|\hat{f}(\tilde{\eta}_{-\infty}^n)-\hat{f}(\tilde{\omega}_{-\infty}^{n}) |] \leq \|\hat{f}\|_{\phi} \sum_{j=0}^k \operatorname{{\mathrm{var}}}_{n-K_{k-j}+1}(e^{\phi})\mathbb{P}^{x,y}[\theta_k=j] \end{align*} $$

instead of

$$ \begin{align*} \mathbb{E}^{x,y}[|\hat{f}(\tilde{\eta}_{-\infty}^n)-\hat{f}(\tilde{\omega}_{-\infty}^{n}) |] \leq \|\hat{f}\|_{\psi} \sum_{j=0}^k \operatorname{{\mathrm{var}}}_{n-K_{k-j}+1}(e^{\psi})\mathbb{P}^{x,y}[\theta_k=j]. \\[-24pt] \end{align*} $$

8 Applications

8.1 Functional central limit theorem (FCLT) for potentials with non-summable variations

Let $\sigma> 0$ . A function $h:A\to \mathbb {R}$ satisfies the functional central limit theorem (FCLT), also called the weak invariance principle, when the process $\{\zeta _n(t), t \in [0,1], n\geq 1\}$ defined by

$$ \begin{align*} \zeta_n(t) = \frac{1}{\sigma \sqrt{n}} \sum_{i = 0}^{\lfloor nt \rfloor} h\circ \eta_i \end{align*} $$

converges weakly to a standard Brownian motion on $D[0,1]$. Tyran-Kamińska [TK05, Section 4.3] showed that the FCLT holds when the potential $\phi $ has summable variations. A straightforward application of Theorem 3.2 in [TK05] and our Corollary 3.3 is the following FCLT for potentials with non-summable variations.

Proposition 8.1. Assume that the alphabet A is finite and $\phi $ satisfies Assumption ( $\mathcal {A}$ ) with $\delta \in (1/2,1]$ . Let $\mu $ be shift-invariant and compatible with $\phi $ . Also, let $h:A \to \mathbb {R}$ be a function such that $\int h\circ \eta _0\; d\mu = 0$ . If $\sigma ^2 :=\int (h\circ \eta _0)^2\; d\mu> 0$ , then h satisfies the FCLT.

Proof. Because the alphabet is finite, we have that $\sigma ^2 \leq \|h\|^2_\infty < \infty $, as required by Theorem 3.2 in [TK05]. It remains to verify the condition on the mixing rate. Using the processes $\{\tilde {\eta }_i, i \in \mathbb {Z}\}$ and $\{\tilde {\omega }_i, i \in \mathbb {Z}\}$ introduced in the proof of Theorem 3.6, it is sufficient to check that there exists $\gamma> 1/2$ such that

(11)

$$ \begin{align} \limsup_{n \rightarrow \infty} n^\gamma \sup_{x,y}\mathbb{E}^{x,y}[|h(\tilde{\eta}_n)-h(\tilde{\omega}_n)|] < \infty. \end{align} $$

We have from Corollary 3.3 that, for all $x,y \in \mathcal {X}_-$ ,

$$ \begin{align*} \mathbb{E}^{x,y}[|h(\tilde{\eta}_n)-h(\tilde{\omega}_n)|] &\leq 2\|h\|_{\infty} \mathbb{P}^{x,y}[\tilde{\eta}_n\neq \tilde{\omega}_n]\\ &\leq \frac{c_1}{n^{\delta'}}, \end{align*} $$

where $\delta ' < \delta $ and $c_1 $ is a positive constant that depends on $C, \delta , \delta '$ , and h. Taking $\delta '$ and $\gamma $ such that $1/2 < \gamma \leq \delta ' <\delta $ , we obtain (11).

8.2 Hoeffding-type inequality for potentials with non-summable variations

A Hoeffding-type inequality gives finite-sample bounds for deviations of additive functionals from their mean. When the variation rate of the potential is summable, we have an exponential inequality [GGT14, CGT20, Mar98]. Nevertheless, the rate of concentration for potentials when the variation rate is not summable is an open question. Using our result, we can obtain the following stretched exponential inequality for sums of random variables.

Proposition 8.2. Assume that the alphabet A is finite and $\phi $ satisfies Assumption ( $\mathcal {A}$ ) with $\delta \in (1/2,1]$ . Let $\mu $ be compatible with $\phi $ . For all $\delta '<2\delta -1$ , $n\geq 1$ , $t\geq 0$ , and all functions $h:A \to \mathbb {R}$ , we have

$$ \begin{align*} \mu\bigg[\bigg|\frac{1}{n}\sum_{i=1}^n(h(\eta_i)-\mathbb{E}[h(\eta_i)])\bigg|\geq t\bigg]\leq 2\exp\bigg\{\!-\frac{C_{10}n^{\delta'}t^2}{R(h)^2}\bigg\}, \end{align*} $$

where $R(h):=\max _{a\in A}h(a)-\min _{a\in A}h(a)$ and $C_{10}$ is a constant that depends on C, $\delta $ , and $\delta '$ .

Proof. This is a consequence of Theorem 3.2 of [CCKR07] and Corollary 3.3. In order to apply Theorem 3.2 of [CCKR07], we need to estimate the terms $\|\overline {D}\|_{\ell ^2(\mathbb {N})}^2$ and $\|\delta f\|^2_{\ell ^2(\mathbb {N})}$ (for $f(x_1,\ldots ,x_n)=({1}/{n})\sum _{i=1}^n h(x_i)$) there. We have

$$ \begin{align*} \|\overline{D}\|_{\ell^2(\mathbb{N})}^2\leq \bigg(1+\sum_{i=1}^n \sup_{x,y\in \mathcal{X}_-}\mathbb{P}^{x,y}[\hat{\eta}_i\neq \hat{\omega}_i]\bigg)^2. \end{align*} $$

Now, for $\delta '<2\delta -1$ , using Corollary 3.3, we obtain

(12)

$$ \begin{align} \|\overline{D}\|_{\ell^2(\mathbb{N})}^2\leq c_1 n^{1-\delta'} \end{align} $$

for some positive constant $c_1$ depending on $C, \delta $ , and $\delta '$ .

For a given function $f: A^{n}\to \mathbb {R}$ , we define the oscillation of f at site $i\in \{1,\ldots ,n\}$ by

$$ \begin{align*} \delta_if:= \sup_{x_j=x'_j, j \neq i}|f(x_1,\ldots,x_n)-f(x'_1,\ldots,x'_n)|. \end{align*} $$

Now, taking $f(x_1,\ldots , x_{n})= ({1}/{n})\sum _{i=1}^{n}h(x_i)$ , we have $\delta _i f={R(h)}/{n}$ for $i\in \{1,\ldots ,n\}$ . Thus, we obtain

(13)

$$ \begin{align} \|\delta f\|^2_{\ell^2(\mathbb{N})}=\sum_{i=1}^{n}\bigg(\frac{R(h)}{n}\bigg)^2= \frac{R(h)^2}{n}. \end{align} $$

Finally, using (12) and (13) in Theorem 3.2 of [CCKR07], we obtain Proposition 8.2.
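The oscillation computation behind (13) can be confirmed by brute force on a small alphabet. The sketch below enumerates all configurations for the empirical mean $f(x_1,\ldots,x_n)=(1/n)\sum_i h(x_i)$ and evaluates $\delta_i f$ directly from its definition; the alphabet, the values of $h$, and the function names are assumptions chosen for illustration, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import product


def empirical_mean(h, xs):
    """f(x_1, ..., x_n) = (1/n) * sum_i h(x_i), with h given as a dict."""
    return sum(h[x] for x in xs) / len(xs)


def oscillation(h, alphabet, n, i):
    """Brute-force delta_i f: sup of |f(x) - f(x')| over pairs differing only at site i
    (sites are 0-based here)."""
    best = Fraction(0)
    for xs in product(alphabet, repeat=n):
        for a in alphabet:
            ys = xs[:i] + (a,) + xs[i + 1:]
            best = max(best, abs(empirical_mean(h, xs) - empirical_mean(h, ys)))
    return best


# Hypothetical three-letter alphabet and function h.
alphabet = (0, 1, 2)
h = {0: Fraction(0), 1: Fraction(3, 2), 2: Fraction(-2)}
n = 3
range_h = max(h.values()) - min(h.values())
oscillations = [oscillation(h, alphabet, n, i) for i in range(n)]
print(oscillations, range_h / n)
```

Changing a single coordinate moves the empirical mean by at most $R(h)/n$, and this value is attained, so the squared $\ell^2$ norm of the oscillation vector is $n\,(R(h)/n)^2=R(h)^2/n$.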

8.3 Poisson autoregression model

As a second application of our results, we consider a model with a countably infinite alphabet called Poisson autoregression, which is popular in applications [KF05]. Only the Markovian case of these models has been studied in the literature. We will show how we can choose the parameters of non-Markovian Poisson autoregression models to satisfy Assumption ( $\mathcal {A}$ ) and thus apply the results of §3.

Consider an absolutely converging sequence $(\beta _i)_{i\geq 1}$ and a sequence of non-negative integers $(\gamma _i)_{i\geq 1}$ such that $S:=\sum _{i=1}^{\infty }|\beta _i|\gamma _i<\infty $ . Consider $A=\mathbb {Z}_+$ and the potential $\phi $ defined for all $x\in \mathcal {X}_-$ by

$$ \begin{align*}\phi(x)=-\lambda(x_{-\infty}^{-1}) + x_0\log \lambda(x_{-\infty}^{-1}) - \sum_{k = 1}^{x_0}\log(k), \end{align*} $$

where

$$ \begin{align*}\lambda(x_{-\infty}^{-1})=\exp\bigg\{\sum_{i=1}^{\infty}\beta_i (x_{-i}\wedge \gamma_i)\bigg\}. \end{align*} $$
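In other words, $e^{\phi(x)}$ is the Poisson probability of $x_0$ with intensity $\lambda(x_{-\infty}^{-1})$. The sketch below evaluates the intensity and this transition probability under the simplifying assumption that only finitely many $\beta_i$ are non-zero, so that a finite list of past symbols suffices; the function names, the parameter values, and this truncation are all hypothetical.

```python
import math


def intensity(past, beta, gamma):
    """lambda = exp(sum_i beta_i * min(x_{-i}, gamma_i)); past[0] is x_{-1}, past[1] is x_{-2}, ..."""
    return math.exp(sum(b * min(x, g) for b, g, x in zip(beta, gamma, past)))


def transition_probability(x0, past, beta, gamma):
    """exp(phi(x)): Poisson(lambda) probability of x0, using lgamma(x0 + 1) = log(x0!)."""
    lam = intensity(past, beta, gamma)
    return math.exp(-lam + x0 * math.log(lam) - math.lgamma(x0 + 1))


# Hypothetical finite-memory parameters.
beta = [0.5, -0.25, 0.125]
gamma = [3, 2, 1]
past = [5, 0, 7]
S = sum(abs(b) * g for b, g in zip(beta, gamma))
lam = intensity(past, beta, gamma)
print(math.exp(-S), lam, math.exp(S))
```

The truncation $x_{-i}\wedge\gamma_i$ is what keeps the intensity between $e^{-S}$ and $e^{S}$ regardless of how large the past symbols are.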

For this model, we obtain

(14)

$$ \begin{align} \chi^2_k(\phi)=\sup_{a\in \mathcal{X}} \sup_{x,y\in \mathcal{X}_-}\bigg(\exp\bigg\{\lambda(a_{-k}^{-1}y)\bigg(\frac{\lambda(a_{-k}^{-1}x)}{\lambda(a_{-k}^{-1}y)}-1\bigg)^2\bigg\}-1\bigg). \end{align} $$
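The expression inside the suprema in (14) has the form of the chi-square divergence between two Poisson laws. Assuming that $\chi^2$ denotes the usual divergence $\sum_k p(k)^2/q(k)-1$, a direct summation gives $\exp\{\lambda_q(\lambda_p/\lambda_q-1)^2\}-1$ for intensities $\lambda_p$ and $\lambda_q$. The sketch below compares a truncated series with this closed form; the function names, the truncation length, and the test intensities are assumptions.

```python
import math


def poisson_log_pmf(k, lam):
    """Logarithm of the Poisson(lam) probability of k."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)


def chi_square_poisson_numeric(lam_p, lam_q, terms=400):
    """Truncated series for sum_k p(k)^2 / q(k) - 1 with p, q Poisson pmfs."""
    total = sum(
        math.exp(2.0 * poisson_log_pmf(k, lam_p) - poisson_log_pmf(k, lam_q))
        for k in range(terms)
    )
    return total - 1.0


def chi_square_poisson_closed_form(lam_p, lam_q):
    """Closed form exp(lam_q * (lam_p / lam_q - 1)^2) - 1."""
    return math.expm1(lam_q * (lam_p / lam_q - 1.0) ** 2)


print(chi_square_poisson_numeric(1.3, 0.8), chi_square_poisson_closed_form(1.3, 0.8))
```

When the two intensities are close, the closed form behaves like $(\lambda_p-\lambda_q)^2/\lambda_q$, which is consistent with the two-sided quadratic bound stated next.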

Now, since $e^{-S} \leq \lambda (x_{-\infty }^{-1})\leq e^S$ and the exponential function is locally bi-Lipschitz, using (14), we have that

$$ \begin{align*} c_1^{-1} \bigg(\sum_{i=k+1}^{\infty}|\beta_i|\gamma_i\bigg)^2 \leq \chi^2_k(\phi)\leq c_1 \bigg(\sum_{i=k+1}^{\infty}|\beta_i|\gamma_i\bigg)^2, \end{align*} $$

where $c_1$ is a positive constant that depends only on S.

Finally, choosing the sequences $(\beta _i)_{i\geq 1}$ and $(\gamma _i)_{i\geq 1}$ such that

$$ \begin{align*}\frac{c_2^{-1}}{i^{({3+\varepsilon})/{2}}}\leq |\beta_i|\gamma_i \leq \frac{c_2}{i^{({3+\varepsilon})/{2}}}\end{align*} $$

for some $\varepsilon>0$ and $c_2\geq 1$ , we obtain

$$ \begin{align*} \frac{c_3^{-1}}{k^{1+\varepsilon}}\leq \chi^2_k(\phi)\leq \frac{c_3}{k^{1+\varepsilon}}, \end{align*} $$

where $c_3$ is a positive constant.

Finally, for this model, we mention that the existence of a shift-invariant probability measure compatible with $\phi $ is obtained by applying Theorem 5.1 of [JÖP07] with $K=e^{2\sinh S}$ and $\pi $ equal to the Poisson law with parameter $e^S$. Assumption ( $\mathcal {A}$ ) implies the square summability of the variation, which guarantees the uniqueness of the shift-invariant probability measure [JÖP07, Corollary 4.2].

Acknowledgements

C. Gallesco and D. Y. Takahashi would like to thank Sandro Gallo for several fruitful discussions that motivated this work. We also thank Leandro Cioletti for comments on an early version of the manuscript. C. Gallesco was partially supported by FAPESP (grant 2017/19876-4) and CNPq (grant 312181/2017-5). D. Y. Takahashi thanks the support of FAPESP Research, Innovation and Dissemination Center for Neuromathematics (grant 2013/07699-0).

References

Bressaud, X., Fernández, R. and Galves, A. Decay of correlations for non-Hölderian dynamics. A coupling approach. Electron. J. Probab. 4(3) (1999), 19 pages (electronic).

Berger, N., Hoffman, C. and Sidoravicius, V. Non-uniqueness for specifications in ${\ell}^{2+\epsilon }$. Ergod. Th. & Dynam. Sys. 38(4) (2018), 1342–1352.

Chazottes, J.-R., Collet, P., Külske, C. and Redig, F. Concentration inequalities for random fields via coupling. Probab. Theory Related Fields 137(1–2) (2007), 201–225.

Comets, F., Fernández, R. and Ferrari, P. A. Processes with long memory: regenerative construction and perfect simulation. Ann. Appl. Probab. 12(3) (2002), 921–943.

Chazottes, J.-R., Gallo, S. and Takahashi, D. Y. Optimal Gaussian concentration bounds for stochastic chains of unbounded memory. Preprint, 2020, arXiv:2001.06633.

Coelho, Z. and Quas, A. Criteria for $\bar{d}$-continuity. Trans. Amer. Math. Soc. 350(8) (1998), 3257–3268.

Cover, T. M. and Thomas, J. A. Elements of Information Theory, 2nd edn. Wiley, Hoboken, NJ, 2006.

Doeblin, W. and Fortet, R. Sur des chaînes à liaisons complètes. Bull. Soc. Math. France 65 (1937), 132–148.

Fernández, R. and Maillard, G. Chains with complete connections: general theory, uniqueness, loss of memory and mixing properties. J. Stat. Phys. 118(3–4) (2005), 555–588.

Gallesco, C., Gallo, S. and Takahashi, D. Y. Explicit estimates in the Bramson–Kalikow model. Nonlinearity 27(9) (2014), 2281–2296.

Gallesco, C., Gallo, S. and Takahashi, D. Y. Dynamic uniqueness for stochastic chains with unbounded memory. Stochastic Process. Appl. 128(2) (2018), 689–706.

Giacomin, G. Random Polymer Models. Imperial College Press, Singapore, 2007.

Gouëzel, S. Sharp polynomial estimates for the decay of correlations. Israel J. Math. 139(1) (2004), 29–65.

Harris, T. E. On chains of infinite order. Pacific J. Math. 5 (1955), 707–724.

Iosifescu, M. and Grigorescu, Ş. Dependence with Complete Connections and Its Applications (Cambridge Tracts in Mathematics, 96). Cambridge University Press, Cambridge, 1990.

Johansson, A. and Öberg, A. Square summability of variations and convergence of the transfer operator. Ergod. Th. & Dynam. Sys. 28(4) (2008), 1145–1151.

Johansson, A., Öberg, A. and Pollicott, M. Countable state shifts and uniqueness of g-measures. Amer. J. Math. 129(6) (2007), 1501–1511.

Johansson, A., Öberg, A. and Pollicott, M. Unique Bernoulli $g$-measures. J. Eur. Math. Soc. 14(5) (2012), 1599–1615.

Kalikow, S. Random Markov processes and uniform martingales. Israel J. Math. 71(1) (1990), 33–54.

Keane, M. Strongly mixing $g$-measures. Invent. Math. 16(4) (1972), 309–324.

Kedem, B. and Fokianos, K. Regression Models for Time Series Analysis (Wiley Series in Probability and Statistics, 488). John Wiley & Sons, New Jersey, 2005.

Lindvall, T. On Strassen’s theorem on stochastic domination. Electron. Commun. Probab. 4 (1999), 51–59.

Marton, K. Measure concentration for a class of random processes. Probab. Theory Related Fields 110(3) (1998), 427–439.

Pollicott, M. Rates of mixing for potentials of summable variation. Trans. Amer. Math. Soc. 352(2) (2000), 843–853.

Reiss, R.-D. Approximate Distributions of Order Statistics: With Applications to Nonparametric Statistics. Springer Science & Business Media, Springer-Verlag, New York, 2012.

Sason, I. and Verdú, S. f-divergence inequalities. IEEE Trans. Inform. Theory 62(11) (2016), 5973–6006.

Thorisson, H. Coupling, Stationarity, and Regeneration (Probability and Its Applications). Springer, New York, 2000.

Tyran-Kamińska, M. An invariance principle for maps with polynomial decay of correlations. Comm. Math. Phys. 260(1) (2005), 1–15.

Walters, P. Ruelle’s operator theorem and $g$-measures. Trans. Amer. Math. Soc. 214 (1975), 375–387.
