
Absolute regularity of semi-contractive GARCH-type processes

Published online by Cambridge University Press:  12 July 2019

Paul Doukhan*
Affiliation:
Université Cergy-Pontoise
Michael H. Neumann**
Affiliation:
Friedrich-Schiller-Universität Jena
*Postal address: UMR 8088 Analyse, Géométrie et Modélisation, 2 avenue Adolphe Chauvin, 95302 Cergy-Pontoise Cedex, France.
**Postal address: Friedrich-Schiller-Universität Jena, Institut für Mathematik, Ernst-Abbe-Platz 2, 07743 Jena, Germany. Email address: michael.neumann@uni-jena.de

Abstract

We prove existence and uniqueness of a stationary distribution and absolute regularity for nonlinear GARCH and INGARCH models of order (p, q). In contrast to previous work we impose, besides a geometric drift condition, only a semi-contractive condition which allows us to include models which would be ruled out by a fully contractive condition. This results in a subgeometric rather than the more usual geometric decay rate of the mixing coefficients. The proofs are heavily based on a coupling of two versions of the processes.

Type
Research Papers
Copyright
© Applied Probability Trust 2019 

1 Introduction

Conditionally heteroscedastic processes are frequently used to model the evolution of stock prices, exchange rates and interest rates. Starting with the seminal papers by Engle [Reference Engle15] on autoregressive conditional heteroscedastic models (ARCH) and Bollerslev [Reference Bollerslev3] on generalized ARCH, numerous variants of these models have been proposed for modelling financial time series; see, for example, Francq and Zakoïan [Reference Francq and Zakoïan20] for a detailed overview. More recently, integer-valued GARCH models (INGARCH) which mirror the structure of GARCH models have been proposed for modelling time series of counts; see, for example, Fokianos [Reference Fokianos, Subba Rao, Subba Rao and Rao16].

In this paper, we prove existence and uniqueness of a stationary distribution under a time-homogeneous dynamic. As our main result, we show absolute regularity of the observable process under the semi-contractive condition (1.5) rather than a more common fully contractive condition on the volatility function. In conjunction with standard conditions (A1) and (A3) given in Section 2, this results in an atypical decay rate for the coefficients of absolute regularity,

(1.1)\begin{equation} \label{RePr} \beta_n = O\big(\rho^{\sqrt{n}}\big), \quad\mbox{for some }\rho<1.\end{equation}

Our technique allows us to obtain this strong result even for nonstationary models with a nonhomogeneous dynamic, under uniform (in t) versions of our regularity conditions. This opens a wide range of applications for modelling real data sets.

The results hold for general GARCH processes obeying the model equations

(1.2a)\begin{gather} Y_t = \sigma_t \varepsilon_t, \end{gather}
(1.2b)\begin{gather} \sigma_t^2 =\,f(Y_{t-1},\ldots,Y_{t-p};\;\sigma_{t-1},\ldots,\sigma_{t-q}). \end{gather}

Here $$(\varepsilon_t)_t$$ is a sequence of independent and identically distributed (i.i.d.) random variables, where $$\varepsilon_t$$ is independent of all lagged random variables and \[\mathbb{E}\varepsilon _t^2 = 1\]. A general INGARCH process is characterized by the model equations

(1.3a)\begin{equation} \label{1.2a} Y_t \mid {\mathcal F}_{t-1} \sim Q(\lambda_t), \end{equation}

where $${{\cal F}_s} = \sigma (({Y_s},{\lambda _s}),({Y_{s - 1}},{\lambda _{s - 1}}), \ldots )$$ and, analogously to the GARCH case,

(1.3b)\begin{equation} \lambda_t =\,f(Y_{t-1},\ldots,Y_{t-p};\,\lambda_{t-1},\ldots,\lambda_{t-q}). \end{equation}

Here {Q(λ) : λ ≥ 0} is a family of distributions on the nonnegative integers. An important aspect is that such models allow for a feedback mechanism in the hidden process which often makes a parsimonious parametrization possible. Absolute regularity (β-mixing) with a geometric decay rate of the coefficients of standard (linear) GARCH(p, q) processes was shown in the doctoral thesis of Boussama [Reference Boussama4]. Geometric β-mixing for nonlinear GARCH(1,1) specifications can be found in [Reference Carrasco and Chen6, Proposition 5] and [Reference Francq and Zakoïan19, Theorem 3]. Properties of INGARCH processes have already been studied under a fully contractive condition,

(1.4)\begin{align} & |\,f(y_1,\ldots,y_p;\,\lambda_1,\ldots,\lambda_q) - f(y_1',\ldots,y_p';\,\lambda_1',\ldots,\lambda_q')| \nonumber \\ & \qquad \leq \sum_{i=1}^p a_i |y_i-y_i'| + \sum_{j=1}^q b_j |\lambda_j-\lambda_j'|, \end{align}

where

y_1,\ldots,y_p,y_1',\ldots,y_p'\in\N_0=\{0,1,\ldots\},\qquad \lambda_1,\ldots,\lambda_q,\lambda_1',\ldots,\lambda_q'\geq 0,

and $$a_1, \ldots, a_p$$ and $$b_1, \ldots, b_q$$ are nonnegative constants such that $$\sum\nolimits_{i = 1}^p {a_i} + \sum\nolimits_{j = 1}^q {b_j} < 1$$. Neumann [Reference Neumann27] showed, in the case of p = q = 1, that condition (1.4) implies that the bivariate process $$((\lambda_t, Y_t))_t$$ has a unique stationary distribution and that a stationary version of the count process $$(Y_t)_t$$ is absolutely regular with mixing coefficients $$\beta_n = O(\rho^n)$$ for some ρ < 1. It was also shown that the intensity process $$(\lambda_t)_t$$ is ergodic but not strongly mixing in general (see Remark 3 in that paper for a simple counterexample). Franke [Reference Franke21] showed in the case of p, q ≥ 1 that there exists a stationary distribution. Moreover, he proved τ-weak dependence as defined in [Reference Dedecker, Doukhan, Lang, León, Louhichi and Prieur8], again with an exponential decay of the coefficients of weak dependence. Also, under a fully contractive condition, Fokianos et al. [Reference Fokianos, Rahbek and Tjøstheim18] analysed linear and nonlinear versions of INGARCH(1,1) processes. Since the verification of geometric ergodicity turned out to be unclear with conventional Markov chain theory, these authors proved ergodicity for a perturbed version of the original process. As the perturbations can be chosen arbitrarily small, this result could be used to derive the asymptotic distribution of parameter estimates.

We will cover both GARCH and INGARCH models, and we want to stress that we impose a contractive condition considerably weaker than (1.4),

(1.5)\begin{equation} |\,f(\kern1pt y_1,\ldots,y_p;\,z_1,\ldots,z_q) - f(\kern1pt y_1,\ldots,y_p;\,z_1',\ldots,z_q') | \leq \sum_{i=1}^q c_i | z_i-z_i' | , \end{equation}

where $$c_1, \ldots, c_q$$ are nonnegative constants with $$c_1 + \cdots + c_q < 1$$. This allows us to consider, for example, threshold models where the function f is specified as

(1.6)\begin{equation} f(\kern1pt y;\,\lambda) = \bigg\{\!\begin{array}{ll} a + b y + c \lambda & \mbox{if } y\in [L,U], \\ a' + b' y + c' \lambda & \mbox{if } y\not\in [L,U]. \end{array} \end{equation}

Such a specification was proposed in the framework of integer-valued time series by Woodard et al. [Reference Woodard, Matteson and Henderson32]. Furthermore, our semi-contractive condition also allows us to consider functions f with

$$f(y;{\kern 1pt} \lambda ) = g(y) + h(\lambda )$$

and with only Lip(h) < 1. Note that well-established threshold models in financial mathematics, such as those proposed for example by Glosten, Jagannathan and Runkle [Reference Glosten, Jagannathan and Runkle22],

$$\sigma _t^2 = \omega + \alpha {\kern 1pt} Y_{t - 1}^2 + \beta {\kern 1pt} Y_{t - 1}^2{1_{\{ {Y_{t - 1}} < 0\} }} + \gamma {\kern 1pt} \sigma _{t - 1}^2,$$

or by Francq and Zakoïan [Reference Francq and Zakoïan20, p. 250],

$${\sigma _t} = \omega + \sum\limits_{i = 1}^p (\alpha _i^ + {Y_{t - i}}{1_{\{ {Y_{t - i}} > 0\} }} - \alpha _i^ - {Y_{t - i}}{1_{\{ {Y_{t - i}} < 0\} }}) + \sum\limits_{j = 1}^q {\beta _j}{\sigma _{t - j}},$$

even fulfil the fully contractive condition (1.4).
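The difference between (1.4) and (1.5) for the threshold specification (1.6) can be checked numerically. The following is a minimal sketch for p = q = 1; the parameter values, the thresholds, and the names `threshold_f` and `lambda_lipschitz` are hypothetical choices made for illustration only and do not come from the paper.

```python
def threshold_f(y, lam, a=1.0, b=0.3, c=0.4, a2=3.0, b2=0.1, c2=0.5, L=2, U=5):
    """Threshold specification (1.6) for p = q = 1 (all parameter values are hypothetical)."""
    if L <= y <= U:
        return a + b * y + c * lam
    return a2 + b2 * y + c2 * lam


def lambda_lipschitz(f, ys, lams):
    """Largest observed ratio |f(y, l) - f(y, l')| / |l - l'| with y held fixed, as in (1.5)."""
    return max(
        abs(f(y, l1) - f(y, l2)) / abs(l1 - l2)
        for y in ys for l1 in lams for l2 in lams if l1 != l2
    )


ys = range(0, 10)
lams = [0.5 * k for k in range(0, 21)]

# Semi-contractive condition (1.5): only the lambda-argument is varied.
c_hat = lambda_lipschitz(threshold_f, ys, lams)

# Jump of f when y crosses the upper threshold U while lambda is held fixed.
jump = abs(threshold_f(5, 1.0) - threshold_f(6, 1.0))
```

With these values the Lipschitz constant in λ is max(c, c′) = 0.5 < 1, so (1.5) holds. The jump across the threshold exceeds 1 although |y − y′| = 1, which no choice of constants with a_1 + b_1 < 1 in (1.4) can accommodate.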

To unify our notation, we use the expression $$(\lambda_t)_t$$ for the hidden process in what follows, that is, $$\sigma _t^2$$ will be replaced by $$\lambda_t$$ in the case of a GARCH process. It is worth noting at this point that, although the bivariate process $$((Y_t, \lambda_t))_t$$ is a Markov chain of order $$p \vee q$$, the process $$(Y_t)_t$$ does not share this property, except for the case when q = 0, which is not of primary interest here.

We show as our main result that the coefficients of absolute regularity of the observable process (Yt)t satisfy (1.1). Recall that $$\beta _n = \mathop {\sup }\nolimits_k \beta ({\cal F}_{ - \infty }^k, {\cal F}_{k + n}^\infty )$$ with $${\cal F}_k^l = \sigma (Y_s :k \le s \le l)$$, where, for any couple of σ-fields $${\cal A}$$ and $${\cal B}$$,

\begin{align*} \beta(\mathcal{A},\mathcal{B}) = \frac{1}{2} \sup\biggl\{\sum_{i=1}^\ell \sum_{j=1}^m |\mathbb P(A_i\cap B_j) -\mathbb P(A_i)\mathbb P( B_j)|\biggr\}, \end{align*}

where the supremum is taken over all partitions $$(A_i)_{1 \le i \le \ell}$$ and $$(B_j)_{1 \le j \le m}$$ of Ω subject to $$A_i \in {\cal A}$$ for 1 ≤ i ≤ ℓ and $$B_j \in {\cal B}$$ for 1 ≤ j ≤ m. This subexponential rate is quite unusual; it is a consequence of the fact that we impose only a semi-contractive rather than a fully contractive condition.
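To make the definition of β concrete, consider the simplest situation, which is an assumption of this sketch and not the setting of the paper: the two σ-fields are generated by two random variables with finitely many values. The supremum is then attained by the partitions into atoms, so β is half the ℓ¹ distance between the joint probability mass function and the product of its marginals. The function name `beta_coefficient` is hypothetical.

```python
def beta_coefficient(joint):
    """beta(sigma(X), sigma(Y)) for a finite joint pmf given as a matrix joint[i][j]."""
    row = [sum(r) for r in joint]            # marginal pmf of X
    col = [sum(c) for c in zip(*joint)]      # marginal pmf of Y
    return 0.5 * sum(
        abs(joint[i][j] - row[i] * col[j])
        for i in range(len(row)) for j in range(len(col))
    )


independent = [[0.25, 0.25], [0.25, 0.25]]   # X and Y independent fair coins
identical = [[0.5, 0.0], [0.0, 0.5]]         # Y = X for a fair coin X

beta_independent = beta_coefficient(independent)
beta_identical = beta_coefficient(identical)
```

Independence gives β = 0, while full dependence of two fair coins gives β = 1/2.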

To prove this result, we construct a coupling of two versions of the bivariate process $$((Y_t, \lambda_t))_t$$, both started independently at time 0 with the stationary distribution. These two versions, $$((\widetilde Y_t, \widetilde \lambda_t))_t$$ and $$((\widetilde Y_t', \widetilde \lambda_t'))_t$$, are defined on a sufficiently rich probability space \[(\widetilde\Omega, \widetilde{\cal F}, \widetilde{\mathbb P})\]. In the context of Markov chains, such a coupling typically leads to a coalescence of the two versions at some random time τ, and \[\widetilde{\mathbb P}(\tau > n)\] then serves as an estimate of $$\beta_n$$. In our case, since $$(Y_t)_t$$ is not a Markov chain, it can well happen that $$\widetilde Y_\tau = \widetilde Y_\tau'$$ at some time τ, but that afterwards these two processes diverge again. This follows from the fact that the accompanying hidden processes $$(\widetilde \lambda_t)_t$$ and $$(\widetilde\lambda_t')_t$$ can still attain different values at time τ, which means that the observable processes may diverge again with positive probability. In view of this, we have to use \[\widetilde{\mathbb P}(\widetilde Y_m \ne \widetilde Y_m' \text{ for any } m \ge n)\] as an upper estimate for $$\beta_n$$. When the two processes reach a state with

(1.7)\begin{equation} \wty_t=\wty_t',\ldots,\wty_{t-p+1}=\wty_{t-p+1}' \quad \mbox{and} \quad |\wtl_t-\wtl_t'| + \cdots + |\wtl_{t-q+1}-\wtl_{t-q+1}'| \leq \rho^{\sqrt{n}}, \end{equation}

then we have p subsequent hits and the contractive condition begins to take effect which eventually leads to the result that both processes coalesce with a (conditional) probability exceeding $$1 - O(\rho ^{\sqrt n } )$$. To reach such a state with the crucial property (1.7), the two processes need several trials, beginning at certain stopping times τ1, τ2, … . Because of the condition of

$$|{\widetilde \lambda _t} - \widetilde \lambda _t^{'}| + \cdots + |{\widetilde \lambda _{t - q + 1}} - \widetilde \lambda _{t - q + 1}^{'}| \le {\rho ^{\sqrt n }},$$

in (1.7), each of these trials covers on the order of $$\sqrt n$$ time points. This means that, up to time n, there can be at most on the order of $$\sqrt n$$ such trials. Such a number of successive trials ensures that a state with (1.7) is reached before time n with a probability exceeding $$1 - O(\rho ^{\sqrt n } )$$. This might give some insight as to why we obtain the unusual rate of $$\rho ^{\sqrt n }$$ for the coefficients of absolute regularity. The desired uniqueness of the stationary law follows as a by-product of the successful coupling. The result on absolute regularity can be extended to nonstationary GARCH-type processes; a uniform (in t) version of our semi-contractive condition will ensure this.
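The trade-off behind the square-root rate can be illustrated with a small computation. Suppose, as an assumption of this sketch, that a trial covers about n^α time points for some α in (0, 1): the closeness achieved by a successful trial then decays like ρ^(n^α), while the chance that all of the roughly n^(1−α) trials fail decays like ρ^(n^(1−α)). The overall bound is governed by the smaller of the two exponents. The name `decay_exponent` and the grid of α values are hypothetical.

```python
def decay_exponent(n, alpha):
    """Smaller of the two exponents n**alpha (trial length) and n**(1 - alpha) (number of trials)."""
    return min(n ** alpha, n ** (1.0 - alpha))


n = 10_000
alphas = [k / 20 for k in range(1, 20)]          # grid 0.05, 0.10, ..., 0.95

# The best achievable exponent is obtained by balancing both terms.
best_alpha = max(alphas, key=lambda a: decay_exponent(n, a))
best_exponent = decay_exponent(n, best_alpha)
```

On this grid the maximum is attained at α = 1/2, where both exponents equal √n, in line with the rate in (1.1).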

The paper is organized as follows. In the next section we fix and discuss our assumptions. Our main results are based on a coupling technique which is introduced in Subsection 2.1. To make the main ideas of our proofs easily accessible, we present the consequences of this coupling for a simple special case in Subsection 2.2. The main results are formulated in Subsection 2.3. A few applications in statistics are mentioned in Subsection 2.4. All proofs are deferred to a final Section 3.

2 Assumptions and main results

We assume that the process (Yt)t, which is defined on some probability space $$(\Omega, {\cal F},P)$$, obeys the model equations

(2.1a)\begin{gather} Y_t\mid {\mathcal F}_{t-1} \sim Q(\lambda_t),\end{gather}
(2.1b)\begin{gather} \lambda_t =\,f(Y_{t-1},\ldots,Y_{t-p};\,\lambda_{t-1},\ldots,\lambda_{t-q}), \end{gather}

where $${\cal F}_s = \sigma ((Y_s, \lambda _s ),(Y_{s - 1}, \lambda _{s - 1} ), \ldots )$$ and {Q(λ) : λ ∈ [0,∞)} is some family of univariate distributions. Note that assumption (2.1a) is correctly formulated since it follows from (2.1b) that λt is $${\cal F}_{t - 1}$$-measurable.

The canonical domain of the function f is different in the two cases of GARCH and INGARCH models. To unify notation, we define f in both cases on $$\mathbb{R}^p \times [0,\infty)^q$$, e.g. by a linear interpolation in the INGARCH case. Recall that $$(\lambda_t)_t$$ denotes the volatility process in the case of GARCH(p, q) models ((1.2a)–(1.2b)) and the intensity process in the INGARCH(p, q) case ((1.3a)–(1.3b)). Here, the distribution of an observable random variable $$Y_t$$ conditioned on the past is $$Q(\lambda_t)$$, where the parameter $$\lambda_t$$ itself is random, depending on lagged variables $$Y_{t-1}, \ldots, Y_{t-p}$$ and previous values $$\lambda_{t-1}, \ldots, \lambda_{t-q}$$ of the (typically hidden) accompanying process $$(\lambda_t)_t$$.

Possible examples we have in mind are linear or nonlinear GARCH(p,q) processes, with $$\lambda_t$$ being the conditional variance of the observable variable $$Y_t$$, or integer-valued GARCH processes, where Q(λ) is often chosen to be a Poisson distribution with intensity parameter λ. Existence of a one-sided version of these processes, i.e. t ∈ ℕ, is guaranteed since we can construct such processes iteratively. We will show that there exists a stationary distribution which implies by Kolmogorov’s extension theorem (see, e.g. [Reference Durrett14]) that a stationary two-sided version, i.e. t ∈ ℤ, also exists. In the proof of our main result, we also use some Markov chain techniques. The process $$(Z_t)_t$$ with $$Z_t = (Y_t^2, \ldots, Y_{t - p + 1}^2, \sigma _t^2, \ldots, \sigma _{t - q + 1}^2 )$$ for a GARCH(p,q) model obeying (1.2a) and (1.2b), as well as $$Z_t = (Y_t, \ldots, Y_{t-p+1}, \lambda_t, \ldots, \lambda_{t-q+1})$$ in the INGARCH(p,q) case according to (1.3a) and (1.3b), has the Markov property. In the following it turns out to be convenient to drop the first component of the random vector $$Z_t$$ and we also define $$X_t = (Y_{t - 1}^2, \ldots, Y_{t - p + 1}^2, \sigma _t^2, \ldots, \sigma _{t - q + 1}^2 )$$ as well as $$X_t = (Y_{t-1}, \ldots, Y_{t-p+1}, \lambda_t, \ldots, \lambda_{t-q+1})$$, respectively.

We impose the following conditions.

  1. (A1) (Geometric drift condition.) There exist positive constants $$a_1, \ldots, a_{p-1}$$, $$b_0, \ldots, b_{q-1}$$, κ < 1, and $$a_0 < \infty$$ such that, for

    V((\kern1pt y_1,\ldots,y_{p-1};\,\lambda_0,\ldots,\lambda_{q-1})) = \sum_{i=1}^{p-1} a_i y_i + \sum_{j=0}^{q-1} b_j\lambda_j,
    the condition
    \begin{equation} \mathbb E( V(X_t) \mid X_{t-1} ) \leq \kappa V(X_{t-1}) + a_0 \end{equation}
    is fulfilled with probability 1.
  2. (A2) (Semi-contractive condition.) The function f is measurable and there exist nonnegative constants c 1, …, cq with c 1 + · · · + cq < 1 such that

    \begin{equation} |\,f(\kern1pt y_1,\ldots,y_p;\, \lambda_1,\ldots,\lambda_q) - f(\kern1pt y_1,\ldots,y_p;\, \lambda_1',\ldots,\lambda_q')| \leq \sum_{i=1}^q c_i |\lambda_i-\lambda_i'| \end{equation}
    for all $$y_1, \ldots, y_p \in \mathbb{R}$$, $$\lambda_1, \ldots, \lambda_q, \lambda_1', \ldots, \lambda_q' \ge 0$$.
  3. (A3) (Similarity condition.) There exists some constant δ ∈ (0,∞) such that

    $${\rm TV}(Q(\lambda), Q(\lambda')) \le 1 - {\rm e}^{-\delta|\lambda - \lambda'|} \quad \text{for all } \lambda, \lambda' \ge 0,$$
    where $${\kern 1pt} TV{\kern 1pt} (Q_1, Q_2 ) = \mathop {\sup }\nolimits_{A \in {\cal B}} |Q_1 (A) - Q_2 (A)|$$ denotes the total variation distance between probability measures Q 1 and Q 2.

Remark 2.1. In the case in which p = q = 1, Xt reduces to λt. Condition (A1) follows from the following drift condition which is frequently used in the context of linear and nonlinear GARCH-type models; see, e.g. [Reference Lindner and Mikosch26] and [Reference Franke21].

  1. (A1′) There exist constants $$\bar a_0 \in [0,\infty)$$ and $$\bar a_1, \ldots, \bar a_p, \bar b_1, \ldots, \bar b_q \in [0,1)$$ with $$\sum\nolimits_{i = 1}^p {\bar a_i} + \sum\nolimits_{j = 1}^q {\bar b_j} < 1$$ such that

    • in the GARCH(p,q) case,

      \begin{equation} \sigma_t^2 \leq \bar{a}_0 + \bar{a}_1 Y_{t-1}^2 + \cdots + \bar{a}_p Y_{t-p}^2 + \bar{b}_1 \sigma_{t-1}^2 + \cdots + \bar{b}_q \sigma_{t-q}^2, \end{equation}
    • in the INGARCH(p,q) case,

      \begin{equation} \lambda_t \leq \bar{a}_0 + \bar{a}_1 Y_{t-1} + \cdots + \bar{a}_p Y_{t-p} + \bar{b}_1 \lambda_{t-1} + \cdots + \bar{b}_q \lambda_{t-q}. \end{equation}
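As a worked instance of how (A1′) yields (A1), consider the INGARCH case with p = q = 1, so that X_t = λ_t, and take V(λ) = λ. The computation below additionally assumes that Q(λ) has mean λ, as for the Poisson family; this is an assumption made here for illustration and is not part of (A1′).

```latex
\begin{align*}
\mathbb{E}(\lambda_t \mid \lambda_{t-1})
  &\leq \bar{a}_0 + \bar{a}_1\,\mathbb{E}(Y_{t-1} \mid \lambda_{t-1}) + \bar{b}_1\,\lambda_{t-1} \\
  &= \bar{a}_0 + (\bar{a}_1 + \bar{b}_1)\,\lambda_{t-1}.
\end{align*}
```

Under this assumption the drift inequality in (A1) holds with a_0 = ā_0 and with κ = ā_1 + b̄_1 (or any larger value below 1).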

Remark 2.2. Condition (A2) is the essential difference from the fully contractive condition imposed in, e.g. [Reference Neumann27] and [Reference Truquet29]. Here, we only assume Lipschitz continuity of f with respect to lagged values $$\lambda_{t-1}, \ldots, \lambda_{t-q}$$. This includes the case of threshold models where the thresholds are set on the lagged variables of the observable process, $$Y_{t - 1}^2, \ldots, Y_{t - p}^2$$ or $$Y_{t - 1}, \ldots, Y_{t - p}$$, respectively.

Remark 2.3. With the standard specification for GARCH models, we have

\begin{equation} Y_t \mid {\mathcal F}_{t-1} \sim {\mathcal N}(0,\lambda_t), \end{equation}

that is, $$\lambda_t$$ takes the role of the conditional volatility $$\sigma _t^2$$. Let $$p_\lambda$$ be the density of an $${\cal N}(0,\lambda )$$ distribution. If the volatilities satisfy $$\lambda_t \ge \omega$$ then we obtain, for 0 < ω ≤ λ ≤ λ′,

\begin{equation} 1 - \mbox{TV}({\mathcal N}(0,\lambda),{\mathcal N}(0,\lambda')) = \int \,\,{p_\lambda \wedge p_{\lambda'}}\,\, \geq \sqrt{\frac{\lambda}{\lambda'}} \geq \frac{\lambda}{\lambda'} \geq \rme^{-|\lambda-\lambda'|/\lambda} \geq \rme^{-|\lambda-\lambda'|/\omega}, \end{equation}

that is, the similarity condition (A3) is fulfilled with δ = 1/ω. (In order to prove the third inequality in the above display, note that $$1 + u \le {\rm e}^u$$ for all u ≥ 0, which implies that $$\lambda'/\lambda = 1 + (\lambda' - \lambda)/\lambda \le {\rm e}^{|\lambda' - \lambda|/\lambda}$$.)
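The chain of inequalities in Remark 2.3 can be checked numerically. The sketch below computes the overlap ∫ p_λ ∧ p_λ′ in closed form from the crossing point of the two centred normal densities, using only the error function. The function name `normal_overlap`, the lower bound ω = 0.5 and the tested pairs are hypothetical choices.

```python
import math


def normal_overlap(lam, lam2):
    """Integral of min(p_lam, p_lam2) for centred normals with variances lam <= lam2."""
    if lam == lam2:
        return 1.0
    # The densities cross at |x| = c; inside (-c, c) the wider density p_lam2 is the smaller one.
    c = math.sqrt(lam * lam2 * math.log(lam2 / lam) / (lam2 - lam))
    inside = math.erf(c / math.sqrt(2 * lam2))         # mass of N(0, lam2) on (-c, c)
    outside = 1.0 - math.erf(c / math.sqrt(2 * lam))   # mass of N(0, lam) outside (-c, c)
    return inside + outside


omega = 0.5  # assumed lower bound for the volatilities
chains = []
for lam, lam2 in [(0.5, 0.7), (1.0, 4.0), (2.0, 2.1)]:
    chains.append([
        normal_overlap(lam, lam2),           # 1 - TV
        math.sqrt(lam / lam2),
        lam / lam2,
        math.exp(-(lam2 - lam) / lam),
        math.exp(-(lam2 - lam) / omega),
    ])
```

Each list in `chains` should be nonincreasing from left to right, mirroring the display above.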

While a normal distribution seems to be the dominating choice for the distribution of the innovations in GARCH models, there exist quite a few proposals for their integer-valued counterparts, the INGARCH models. For the sake of an easy description, let \[(\mathcal{P}_t (\lambda ))_{\lambda \geqslant 0}, {\kern 1pt} t \in \mathbb{Z}\], be a sequence of independent standard Poisson processes.

  1. Poisson seed. If Q(λ) = Poisson(λ) then $$Y_t$$ can be expressed as $$Y_t = {\cal P}_t (\lambda _t )$$.

  2. Mixed Poisson seed. Here we have the specification $$Y_t = {\cal P}_t (\lambda _t Z_t )$$, where $$Z_t$$ is a nonnegative random variable. The special case of a Bernoulli-distributed random variable $$Z_t$$ leads to the so-called zero-inflated Poisson model in [Reference Lambert23]; it takes into account additional unobserved data.

  3. Compound Poisson seed. Let $$(Z_{t,i})_{t,i \ge 0}$$ be a double sequence of i.i.d. nonnegative random variables. In this case, $$Y_t$$ is given by $$Y_t = \sum\nolimits_{i = 1}^{{\cal P}_t (\lambda _t )} Z_{t,i}$$. This process is integer valued if \[\mathbb{P}(Z_{t,i} \in \mathbb{N}_0 ) = 1\].

In cases 1 and 3, the similarity assumption (A3) is fulfilled with δ = 1; see [Reference Adell and Jodrá1]. Regarding case 2, let $$Q_{\rm MP}(\lambda)$$ denote the mixed Poisson distribution with intensity parameter λ. Then,

\begin{equation} \mbox{TV}(Q_{\rm MP}(\lambda),Q_{\rm MP}(\lambda')) \leq \E( 1 - \rme^{-Z_t|\lambda-\lambda'|} ) \leq 1 - \rme^{-\delta|\lambda-\lambda'|}, \end{equation}

where \[\delta = \mathbb{E}Z_t\].

Remark 2.4. For two probability measures $$Q_1$$ and $$Q_2$$ on $${\cal B}$$, let $$d_1 = {\rm d}Q_1/{\rm d}(Q_1 + Q_2)$$ and $$d_2 = {\rm d}Q_2/{\rm d}(Q_1 + Q_2)$$ be the respective densities with respect to the dominating measure $$Q_1 + Q_2$$. Then

\begin{equation} \Delta \,:\!= \int d_1\wedge d_2 \sd(Q_1+Q_2) = 1 - \mbox{TV}(Q_1, Q_2).\nonumber \end{equation}

Furthermore, using the method of maximal coupling as described, for example, in [Reference Den Hollander9, p. 15], we can construct, with the aid of an additional randomization, random variables $$X_1$$ and $$X_2$$ such that

  • $$X_1 \sim Q_1$$, $$X_2 \sim Q_2$$,

  • $$P(X_1 = X_2) = \Delta$$.

Indeed, let U be a random variable with a uniform distribution on [0,1]. If U ≤ Δ then we choose

\begin{equation} X_1 = X_2 = F^{-1}(U), \end{equation}

where $$F(x) = \int_{( - \infty, x]} d_1 \wedge d_2 \,{\rm d}(Q_1 + Q_2 )$$. Here and below, $$H^{-1}$$ denotes the generalized inverse of a generic distribution function H, that is, $$H^{-1}(t) = \inf\{x : H(x) \ge t\}$$. (This function is sometimes denoted by $$H^{\leftarrow}$$.) This definition makes sense whether H is a continuous or a discrete distribution function. If U > Δ then we set

\begin{equation} X_1 = G_1^{-1}(U-\Delta), \qquad X_2 = G_2^{-1}(U-\Delta), \end{equation}

where $$G_i (x) = \int_{( - \infty, x]} (d_i - d_1 \wedge d_2 ) \,{\rm d}(Q_1 + Q_2 )$$ for i = 1, 2.
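The construction in Remark 2.4 can be sketched for the special case of two probability mass functions on {0, 1, …, K − 1}, where the densities with respect to counting measure may be used in place of d_1 and d_2 (the minimum then sums to the same Δ). The function names `generalized_inverse` and `maximal_coupling` and the example distributions are hypothetical; the code follows the recipe above with a single uniform draw u.

```python
from itertools import accumulate


def generalized_inverse(masses, t):
    """inf{x : H(x) >= t}, where H is the running sum of `masses` on {0, 1, ...}."""
    for x, h in enumerate(accumulate(masses)):
        if h >= t:
            return x
    return len(masses) - 1  # guard against floating-point round-off


def maximal_coupling(u, q1, q2):
    """Map one uniform draw u in [0, 1] to a pair (X1, X2) with X1 ~ q1 and X2 ~ q2."""
    common = [min(a, b) for a, b in zip(q1, q2)]
    delta = sum(common)                       # Delta = 1 - TV(q1, q2)
    if u <= delta:                            # both variables take the same value
        x = generalized_inverse(common, u)
        return x, x
    g1 = [a - m for a, m in zip(q1, common)]  # residual part of q1
    g2 = [b - m for b, m in zip(q2, common)]  # residual part of q2
    return generalized_inverse(g1, u - delta), generalized_inverse(g2, u - delta)


q1 = [0.5, 0.5]
q2 = [0.25, 0.75]
pairs = [maximal_coupling((k + 0.5) / 1000, q1, q2) for k in range(1000)]
hit_frequency = sum(x1 == x2 for x1, x2 in pairs) / len(pairs)
```

Evaluating the map on an equispaced grid of u values, as done for `pairs`, gives a deterministic check: the fraction of grid points with X1 = X2 approximates Δ, which is 0.75 for the two example distributions.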

2.1 Definition of the coupling

We use a coupling approach to prove stationarity and absolute regularity of the GARCH-type process. In the case of a stationary Markov chain \[(Z_t )_{t \in \mathbb{N}_0 }\] defined on some probability space \[(\Omega, \mathcal{F},\mathbb{P})\], one usually constructs, on an appropriate probability space \[(\widetilde\Omega, \widetilde{\mathcal{F}}, \widetilde{\mathbb{P}})\], two versions \[(\widetilde Z_t )_{t \in \mathbb{N}_0 }\] and \[(\widetilde Z_t')_{t \in \mathbb{N}_0 }\] of this chain which are started at t = 0 independently, both with their stationary distribution. If one succeeds in constructing a coupling such that \[\widetilde{\mathbb{P}}(\widetilde Z_m \ne \widetilde Z_m' \text{ for any } m \ge n)\] tends to 0 as n → ∞, then the inequality

(2.2)\begin{equation} \beta_n \leq \widetilde{\P}(\widetilde{Z}_m \neq \widetilde{Z}'_m \text{ for any } m\geq n ) \end{equation}

provides an upper bound for the mixing coefficient. However, since a Markov process in discrete time is always strongly Markovian, it actually suffices to derive an upper estimate for \[\widetilde{\mathbb{P}}(\widetilde Z_n \ne \widetilde Z_n')\] and we can conclude that the original process \[(Z_t )_{t \in \mathbb{N}_0 }\] on \[(\Omega, \mathcal{F},\mathbb{P})\] is absolutely regular with coefficients satisfying \[\beta _n \leqslant \widetilde{\mathbb{P}}(\widetilde Z_n \ne \widetilde Z_n')\]. In our case, the process $$(Y_t)_t$$ is not a Markov chain. Once we have constructed a coupling of $$((\widetilde Y_t, \widetilde \lambda_t))_t$$ and $$((\widetilde Y_t', \widetilde \lambda_t'))_t$$, we have to stick to the estimate (2.2). (Even if $$\widetilde Y_n = \widetilde Y_n'$$, it could well happen that $$\widetilde\lambda_n \ne \widetilde\lambda_n'$$, which means that we cannot achieve $$\widetilde Y_{n + 1} = \widetilde Y_{n + 1}'$$ with a conditional probability of 1.) This means that we are required to find a construction where the two versions hit at some time and stay together afterwards (they coalesce).

Suppose that pre-sample values $$\widetilde Y_0, \ldots, \widetilde Y_{1 - p}, \widetilde \lambda_0, \ldots, \widetilde \lambda_{1 - q}$$ and $$\widetilde Y_0', \ldots, \widetilde Y_{1 - p}', \widetilde \lambda_0', \ldots, \widetilde \lambda_{1 - q}'$$ are given. The values of $$\widetilde \lambda_1$$ and $$\widetilde\lambda_1'$$ arise as a result of the model equation (2.1b),

\begin{equation} \wtl_1 =\,f(\wty_0,\ldots,\wty_{1-p};\,\wtl_0,\ldots,\wtl_{1-q}), \qquad \wtl_1' =\,f(\wty_0',\ldots,\wty_{1-p}';\,\wtl_0',\ldots,\wtl_{1-q}'). \end{equation}

Note that the conditional distribution of $$\widetilde Y_1$$ given the past has to be $$Q(\widetilde \lambda_1)$$ and that of $$\widetilde Y_1'$$ has to be $$Q(\widetilde\lambda_1')$$. We couple the two Markov chains in such a way that $$\widetilde Y_t = \widetilde Y_t'$$ with maximal conditional probability. According to Remark 2.4, we utilize a sequence \[(U_t )_{t \in \mathbb{N}}\] of i.i.d. random variables with a uniform distribution on the interval [0,1], also independent of $$(\widetilde Y_0, \widetilde Y_0', \widetilde\lambda_0, \widetilde\lambda_0'), (\widetilde Y_{-1}, \widetilde Y_{-1}', \widetilde\lambda_{-1}, \widetilde\lambda_{-1}')$$, etc.

Let

\begin{gather*} q_1 =\frac{\rd Q(\wtl_1)}{\rd(Q(\wtl_1)+Q(\wtl_1'))}, \qquad q_1' =\frac{\rd Q(\wtl_1')}{\rd(Q(\wtl_1)+Q(\wtl_1'))}, \\ \bar{q}_1 =\int q_1\wedge q_1' \sd(Q(\wtl_1)+Q(\wtl_1')). \end{gather*}

If $${U_1} \le {\bar q_1}$$ then we define

\begin{equation} \wty_1 = \wty_1' = F_1^{-1}(U_1), \end{equation}

where

F_1(x)= \int_{(-\infty, x]} q_1\wedge q_1' \sd(Q(\wtl_1)+Q(\wtl_1')).

If $${U_1} > {\bar q_1}$$ then we set

\begin{equation} \wty_1 = G_1^{-1}(U_1-\bar{q}_1) \quad \mbox{and} \quad \wty_1' = {G_1'}^{-1}(U_1-\bar{q}_1), \end{equation}

where

\begin{align*} G_1(x) & = \int_{-\infty}^x (q_1-q_1\wedge q_1') \sd(Q(\wtl_1)+Q(\wtl_1')),\\ G_1'(x) & = \int_{-\infty}^x (q_1'-q_1\wedge q_1') \sd(Q(\wtl_1)+Q(\wtl_1')). \end{align*}

We iterate this process in the same way.

Let

\begin{gather*} q_t =\frac{\rd Q(\wtl_t)}{\rd(Q(\wtl_t)+Q(\wtl_t'))}, \qquad q_t' =\frac{\rd Q(\wtl_t')}{\rd(Q(\wtl_t)+Q(\wtl_t'))},\\ \bar{q}_t =\int q_t\wedge q_t' \sd(Q(\wtl_t)+Q(\wtl_t')). \end{gather*}

Furthermore, denote by $$F_t$$, $$G_t$$, and $$G_t'$$ the distribution functions of the densities $$(q_t \wedge q_t')$$, $$(q_t - (q_t \wedge q_t'))$$, and $$(q_t' - (q_t \wedge q_t'))$$, respectively. On the basis of given values

\wty_{t-1},\ldots,\wty_{t-p},\wtl_{t-1},\ldots,\wtl_{t-q}\quad\text{and}\quad \wty_{t-1}',\ldots,\wty_{t-p}',\wtl_{t-1}',\ldots,\wtl_{t-q}',

we set

\begin{equation} \wtl_t =\,f(\wty_{t-1},\ldots,\wty_{t-p};\,\wtl_{t-1},\ldots,\wtl_{t-q}), \qquad \wtl_t' =\,f(\wty_{t-1}',\ldots,\wty_{t-p}';\,\wtl_{t-1}',\ldots,\wtl_{t-q}'), \end{equation}

as well as

(2.3a)\begin{equation} \wty_t = \wty_t' = F_t^{-1}(U_t) \quad \mbox{if } U_t\leq \bar{q}_t \end{equation}

and

(2.3b)\begin{equation} \label{eq21.1b} \wty_t = G_t^{-1}(U_t-\bar{q}_t), \qquad \wty_t' = {G_t'}^{-1}(U_t-\bar{q}_t), \quad \mbox{if } U_t> \bar{q}_t. \end{equation}
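The recursion (2.3a)–(2.3b) can be sketched in code for a Poisson INGARCH(1,1) model. Everything specific below is an assumption made for illustration: the linear volatility function `f` and its parameters, the truncation of the Poisson mass function at `size` cells, the starting values and all function names. The sketch only shows the mechanics of driving both versions with one uniform draw per time step.

```python
import math
import random
from itertools import accumulate


def poisson_pmf(lam, size):
    """Poisson(lam) masses on {0, ..., size-1}; the tail mass is lumped into the last cell."""
    pmf = [math.exp(-lam)]
    for k in range(1, size - 1):
        pmf.append(pmf[-1] * lam / k)
    pmf.append(max(0.0, 1.0 - sum(pmf)))
    return pmf


def inverse(masses, t):
    """Generalized inverse inf{x : H(x) >= t} of the running sum H of `masses`."""
    for x, h in enumerate(accumulate(masses)):
        if h >= t:
            return x
    return len(masses) - 1


def coupled_step(u, lam, lam2, size=200):
    """One step of the coupling: returns (Y_t, Y_t') given the two intensities and a uniform u."""
    q, q2 = poisson_pmf(lam, size), poisson_pmf(lam2, size)
    common = [min(a, b) for a, b in zip(q, q2)]
    q_bar = sum(common)
    if u <= q_bar:                                  # case (2.3a): a hit
        y = inverse(common, u)
        return y, y
    g = [a - m for a, m in zip(q, common)]          # case (2.3b): residual parts
    g2 = [b - m for b, m in zip(q2, common)]
    return inverse(g, u - q_bar), inverse(g2, u - q_bar)


def f(y, lam, a=0.5, b=0.3, c=0.4):
    """Hypothetical linear volatility function; c plays the role of c_1 in (A2)."""
    return a + b * y + c * lam


def simulate(n, lam0, lam0_prime, seed=0):
    """Run both versions for n steps from pre-sample values Y_0 = Y_0' = 0."""
    rng = random.Random(seed)
    y, y2, lam, lam2 = 0, 0, lam0, lam0_prime
    path = []
    for _ in range(n):
        lam, lam2 = f(y, lam), f(y2, lam2)          # model equation (2.1b) for both versions
        y, y2 = coupled_step(rng.random(), lam, lam2)
        path.append((y, y2, lam, lam2))
    return path


path = simulate(200, 0.5, 5.0)
```

Each entry of `path` holds the two observations and the two intensities at one time step. Whenever the two observations coincide, the next gap between the intensities shrinks by the factor c, which is the mechanism exploited in the next subsection.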

2.2 A first glimpse at the consequences of the coupling

To communicate the main ideas involved in the proofs in a transparent way, we first consider the special case of an INGARCH(1,1) process and present a sketch of the major steps in the proofs of the results. For definiteness, we assume that $$Y_t |{\cal F}_{t - 1} \sim {\kern 1pt} Poisson{\kern 1pt} (\lambda _t )$$.

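Note that $${\rm TV}({\rm Poisson}(\lambda), {\rm Poisson}(\lambda')) \le 1 - {\rm e}^{-|\lambda - \lambda'|}$$. To see this, assume without loss of generality that λ ≤ λ′. If Y ∼ Poisson(λ) and W ∼ Poisson(λ′ − λ) are independent, then Y′ = Y + W ∼ Poisson(λ′). It follows that $$P(Y \ne Y') = P(W \ne 0) = 1 - {\rm e}^{-|\lambda - \lambda'|}$$. Since the total variation distance is bounded by P(Y ≠ Y′) for any coupling of the two distributions, this implies that the similarity condition (A3) is satisfied with δ = 1.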
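The bound can also be checked numerically. The following sketch computes the total variation distance as half the ℓ¹ distance between the two mass functions, truncated after `size` terms, which can only make the computed value smaller. The function name `poisson_tv` and the tested pairs are hypothetical.

```python
import math


def poisson_tv(lam, lam2, size=400):
    """TV distance between Poisson(lam) and Poisson(lam2), summed over the first `size` terms."""
    p, q, total = math.exp(-lam), math.exp(-lam2), 0.0
    for k in range(size):
        total += abs(p - q)
        # Recursive update of the mass functions: p_{k+1} = p_k * lam / (k + 1).
        p, q = p * lam / (k + 1), q * lam2 / (k + 1)
    return 0.5 * total


checks = []
for lam, lam2 in [(1.0, 1.5), (0.0, 2.0), (3.0, 3.1), (0.2, 4.0)]:
    tv = poisson_tv(lam, lam2)
    bound = 1.0 - math.exp(-abs(lam - lam2))
    checks.append((tv, bound))
```

For the pair (0, 2) the bound is attained, since Poisson(0) is the point mass at 0 and the distance equals 1 − e^(−2).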

Let $${\cal G}_t = \sigma((\widetilde Y_t, \widetilde Y_t', \widetilde \lambda_t, \widetilde \lambda_t'), (\widetilde Y_{t - 1}, \widetilde Y_{t - 1}', \widetilde \lambda_{t - 1}, \widetilde \lambda_{t - 1}'), \ldots )$$ denote the σ-field generated by both versions of the process up to time t. Suppose that τ is some stopping time and that, for some reason, $$|\widetilde\lambda_{\tau + 1} - \widetilde\lambda_{\tau + 1}'| \le K$$. Note that $$\widetilde \lambda_{\tau + 1}$$ and $$\widetilde\lambda_{\tau + 1}'$$ are both $${\cal G}_\tau$$-measurable, where

{\mathcal G}_\tau=\biggl\{ G\in \bigcup_{n\in\N_0} {\mathcal G}_n\colon G\cap\{\tau\leq n\}\in {\mathcal G}_n \text{ for all } n\in\N_0\biggr\}.

Then, according to the maximal coupling explained above,

\begin{equation} \widetilde{\P}( \wty_{\tau+1}=\wty_{\tau+1}' \mid {\mathcal G}_\tau ) \geq \rme^{-|\wtl_{\tau+1}-\wtl_{\tau+1}'|} \geq \rme^{-K}. \end{equation}

If, in addition, $$\widetilde Y_{\tau + 1} = \widetilde Y_{\tau + 1}'$$, then the contractive condition (A2) implies that

\begin{equation} | \wtl_{\tau+2} - \wtl_{\tau+2}' | \leq c_1 K. \end{equation}

Therefore, for the next step, we obtain

\begin{equation} \widetilde{\P}( \wty_{\tau+2}=\wty_{\tau+2}' \mid \wty_{\tau+1}=\wty_{\tau+1}', {\mathcal G}_\tau ) \geq \rme^{-c_1 K}, \end{equation}

and, if additionally $$\widetilde Y_{\tau + 2} = \widetilde Y_{\tau + 2}'$$,

\begin{equation} | \wtl_{\tau+3} - \wtl_{\tau+3}' | \leq c_1^2 K. \end{equation}

Proceeding in the same way we obtain

(2.4)\begin{align}\label{eq.1} & \widetilde{\P}( \wty_{\tau+1}=\wty_{\tau+1}',\ldots,\wty_{\tau+M}=\wty_{\tau+M}' \mid {\mathcal G}_\tau ) \nonumber \\ & \qquad = \widetilde{\P}( \wty_{\tau+1}=\wty_{\tau+1}' \mid {\mathcal G}_\tau ) \widetilde{\P}( \wty_{\tau+2}=\wty_{\tau+2}' \mid \wty_{\tau+1}=\wty_{\tau+1}', {\mathcal G}_\tau ) \times \cdots \nonumber \\ & \qquad\jump \times \widetilde{\P}( \wty_{\tau+M}=\wty_{\tau+M}' \mid \wty_{\tau+1}=\wty_{\tau+1}',\ldots, \wty_{\tau+M-1}=\wty_{\tau+M-1}', {\mathcal G}_\tau ) \nonumber \\ & \qquad \geq \rme^{-K(1+c_1+\cdots +c_1^{M-1})}, \end{align}

which leads to

(2.5)\begin{align} \P( \wty_{\tau+m}=\wty_{\tau+m}' , |\wtl_{\tau+m}-\wtl_{\tau+m}'| \leq c_1^{m-1} K \text{ for all } m\in\N \mid {\mathcal G}_\tau ) & \geq \rme^{-K/(1-c_1)} \nonumber \\ & \geq 1 - \frac{K}{1-c_1}. \end{align}
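The passage from (2.4) to (2.5) rests on summing a geometric series in the exponent and on the elementary inequality e^(−x) ≥ 1 − x. A small numerical sketch, with hypothetical values for K and c_1 and a hypothetical function name, makes the three bounds visible:

```python
import math


def hit_probability_bound(K, c, M):
    """Lower bound exp(-K (1 + c + ... + c**(M-1))) for M successive hits, as in (2.4)."""
    return math.exp(-K * sum(c ** m for m in range(M)))


K, c = 0.2, 0.6                      # hypothetical values with c < 1
bounds = [hit_probability_bound(K, c, M) for M in (1, 5, 50)]
limit = math.exp(-K / (1 - c))       # bound for infinitely many hits, first line of (2.5)
linear = 1 - K / (1 - c)             # second line of (2.5)
```

The entries of `bounds` decrease with M but stay above `limit`, which in turn dominates `linear`; the smaller K is, the closer all three are to 1.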

In what follows we sketch how (2.5) can be used to prove absolute regularity. Let \[\widetilde{\mathbb{P}}_\pi\] denote the probability measure under which $$(\widetilde Y_0, \widetilde \lambda_0)$$ and $$(\widetilde Y_0', \widetilde\lambda_0')$$ are independent and distributed with their common stationary law π. (Its existence and uniqueness are proved in Corollary 2.1 below.) We define the stopping time

\begin{equation} \tau^{(n)} = \inf \{t\geq 0\colon |\wtl_{t+1}-\wtl_{t+1}'| \leq C \rho^{n^\alpha} \} \end{equation}

for some C < ∞ and some α > 0 whose optimal choice is explained below. We obtain, from (2.5),

(2.6)\begin{align} \beta_n & \leq \widetilde{\P}_\pi ( \wty_m\neq \wty_m' \ \mbox{for any } m\geq n ) \nonumber \\ & \leq \widetilde{\P}_\pi( \wty_m\neq \wty_m' \ \mbox{for any } m>\tau^{(n)} \mid {\mathcal G}_{\tau^{(n)}} ) + \widetilde{\P}_\pi( \tau^{(n)} \geq n ) \nonumber \\ & \leq \frac{C \rho^{n^\alpha}}{1-c_1} + \widetilde{\P}_\pi( \tau^{(n)} \geq n ). \end{align}

It remains to derive an upper estimate for the second term on the right-hand side of (2.6). To this end, we consider subsequent trials to achieve a state with $$|\widetilde\lambda_t - \widetilde\lambda_t'| \le C_1$$ for some $$C_1 \in (0,\infty)$$, followed by subsequent hits $$\widetilde Y_t = \widetilde Y_t', \ldots, \widetilde Y_{t + d_n - 1} = \widetilde Y_{t + d_n - 1}'$$, where $$d_n = [n^\alpha]$$. We define a first stopping time as

\begin{equation} \tau_1 = \inf\{t\geq 0\colon \wtl_{t}+\wtl_{t}' \leq C_1\}. \end{equation}

(If $$\widetilde\lambda_0 + \widetilde\lambda_0' \le C_1$$ then $$\tau_1 = 0$$. Otherwise, $$\tau_1$$ is the first arrival time of the process $$((\widetilde\lambda_t, \widetilde\lambda_t'))_t$$ at $$A := \{(u_1, u_2) : u_1 + u_2 \le C_1\}$$.) At time $$\tau_1$$ we have $$|\widetilde\lambda_{\tau_1} - \widetilde\lambda_{\tau_1}'| \le C_1$$. According to (2.4), there exists some constant $$C_2 > 0$$ such that

\begin{equation} \widetilde{\P}_\pi( \wty_{\tau_1}=\wty_{\tau_1}',\ldots, \wty_{\tau_1+d_n-1}=\wty_{\tau_1+d_n-1}' \mid {\mathcal G}_{\tau_1-1} ) \geq C_2. \end{equation}

After such a successful trial with dn hits, we obtain, from the contractive property (A2),

(2.7)\begin{equation} \label{eq.5} | \wtl_{\tau_1+d_n} - \wtl_{\tau_1+d_n}' | \leq C_1 c_1^{d_n}. \end{equation}

This yields

\widetilde{\P}_\pi( \wty_{\tau_1+m}\neq \wty_{\tau_1+m}' \text{ for any } m\geq d_n \mid \wty_{\tau_1}=\wty_{\tau_1}',\ldots,\wty_{\tau_1+d_n-1}=\wty_{\tau_1+d_n-1}', {\mathcal G}_{\tau_1-1} ) \leq \frac{C_1 c_1^{d_n}}{1-c_1},

which brings us closer to the desired result. This means that a trial which actually leads to a favourable state with (2.7) covers $$d_n$$ time points. Accordingly, for i > 1, we consider the following retarded return times as starting points for the next trials:

\begin{equation} \tau_i = \inf\{ t\geq \tau_{i-1}+d_n\colon \wtl_t+\wtl_t' \leq C_1\}. \end{equation}

Now we are in a position to derive an upper bound for \[\widetilde{\mathbb{P}}_\pi(\tau^{(n)} \geqslant n)\]. We define events

\begin{equation} A_i = \{ \wty_{\tau_i}=\wty_{\tau_i}', \ldots, \wty_{\tau_i+d_n-1}=\wty_{\tau_i+d_n-1}' \}. \end{equation}

Since each trial covers $$d_n$$ time points we cannot get more than $$O(n^{1-\alpha})$$ different stopping times $$\tau_i$$ before time n. Let $$K_n = C_3 n^{1-\alpha}$$ for some $$C_3 > 0$$. It follows from Lemma 3.1 that

\begin{equation} \widetilde{\P}_\pi( \tau_{K_n}+d_n \geq n ) \leq \eta^{-n}\, \widetilde{\E}_\pi \eta^{\tau_{K_n}+d_n} = o( \rho^{n^\alpha} ) \end{equation}

for some η > 1 and ρ < 1, if $$C_3$$ is small enough. Therefore, and since

\widetilde{\P}_\pi(A_1^c \cap \cdots \cap A_{K_n}^c) \leq (1-C_2)^{K_n},

we obtain

(2.8)\begin{equation} \widetilde{\P}_\pi( \tau^{(n)} \geq n ) \leq \widetilde{\P}_\pi( \tau_{K_n}+d_n \geq n ) + \widetilde{\P}_\pi( A_1^c \cap \cdots \cap A_{K_n}^c) = o( \rho^{n^\alpha} ) + O(\rho^{n^{1-\alpha}}) \end{equation}

for some ρ < 1. The first term on the right-hand side of (2.6) and the second term on the right-hand side of (2.8) are of the same order for the choice of $$\alpha = {\textstyle{1 \over 2}}$$, which gives the estimate

\begin{equation} \beta_n = O( \rho^{\sqrt{n}} ). \end{equation}
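The choice α = 1/2 can be read off from the two competing exponents. As a brief check (a sketch only, taking ρ ∈ (0,1) to be the larger of the two constants appearing in the two terms), note that for n ≥ 1 both error terms are dominated by the one with the smaller exponent:

\begin{align*} \rho^{n^\alpha} + \rho^{n^{1-\alpha}} \leq 2 \rho^{n^{\min\{\alpha,1-\alpha\}}}, \qquad \max_{\alpha\in(0,1)} \min\{\alpha, 1-\alpha\} = \tfrac12 . \end{align*}

With any other choice of α, one of the two terms in this bound would decay more slowly than $$\rho^{\sqrt{n}}$$.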

2.3 Main results

To prove our main results, we use the coupling method described in Subsection 2.1. Recall that $$((\widetilde{Y}_t, \widetilde{\lambda}_t))_t$$ and $$((\widetilde{Y}_t', \widetilde{\lambda}_t'))_t$$ denote the two versions of the process which are coupled on a suitable probability space \[(\widetilde{\Omega}, \widetilde{\mathcal{F}}, \widetilde{\mathbb{P}})\] according to (2.3a) and (2.3b). Moreover, we remind the reader that $$\mathcal{G}_t = \sigma((\widetilde{Y}_t, \widetilde{Y}_t', \widetilde{\lambda}_t, \widetilde{\lambda}_t'), (\widetilde{Y}_{t-1}, \widetilde{Y}_{t-1}', \widetilde{\lambda}_{t-1}, \widetilde{\lambda}_{t-1}'), \ldots)$$. The following lemma describes the core of our coupling method.

Lemma 2.1. Suppose that (A1)–(A3) are fulfilled, and let τ be any stopping time such that $$\widetilde{Y}_\tau = \widetilde{Y}_\tau', \ldots, \widetilde{Y}_{\tau - p + 2} = \widetilde{Y}_{\tau - p + 2}'$$. Then

\begin{align*} & \widetilde{\P}\,\Big( \wty_{\tau+m} = \wty_{\tau+m}' \text{ for all } m\in\N \text{ and } \\ &\hskip12pt \sum_{m=1}^\infty | \wtl_{\tau+m} - \wtl_{\tau+m}' | \leq \frac{1}{1-c} \sum_{i=1}^q |\wtl_{\tau-i+2}-\wtl_{\tau-i+2}'| \Bigm| {\mathcal G}_\tau \Big) \\ & \qquad \geq \exp\biggl\{ -\frac{\delta}{1-c} \sum_{i=1}^q |\wtl_{\tau-i+2}-\wtl_{\tau-i+2}' | \biggr\}, \end{align*}

where $$c = c_1 + \cdots + c_q$$.

This lemma tells us that the two processes $$(\widetilde{Y}_t)_t$$ and $$(\widetilde{Y}_t')_t$$ coalesce with a conditional probability greater than or equal to exp{−δK/(1 − c)}, where

K=\sum_{i=1}^q |\wtl_{\tau-i+2}-\wtl_{\tau-i+2}'|.
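To see how this bound is used, recall the elementary inequality $$\mathrm{e}^{-x} \geq 1 - x$$ for x ≥ 0. The following display is merely a worked consequence of Lemma 2.1 under the additional assumption, made here only for illustration, that $$K \leq \rho^{\sqrt{n}}$$ for some ρ ∈ (0,1):

\begin{align*} \exp\Bigl\{ -\frac{\delta K}{1-c} \Bigr\} \geq 1 - \frac{\delta K}{1-c} \geq 1 - \frac{\delta}{1-c}\, \rho^{\sqrt{n}}. \end{align*}

Hence the conditional probability that the two processes fail to coalesce is at most of order $$\rho^{\sqrt{n}}$$ once the intensities are this close.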

Therefore, in order to prove the desired decay rate for the coefficients of absolute regularity, we show that there exists a stopping time $$\tau^{(n)}$$ such that

\begin{align*} &\qquad\qquad\wty_{\tau^{(n)}} =\wty_{\tau^{(n)}}',\ldots,\wty_{\tau^{(n)}-p+2}=\wty_{\tau^{(n)}-p+2}', \\ &|\wtl_{\tau^{(n)}+1}-\wtl_{\tau^{(n)}+1}'|+\cdots +|\wtl_{\tau^{(n)}-q+2}-\wtl_{\tau^{(n)}-q+2}'|\leq \rho^{\sqrt{n}}, \end{align*}

and that \[\widetilde{\mathbb{P}}(\tau^{(n)} < n) = 1 - O(\rho^{\sqrt{n}})\] for some ρ < 1. The following proposition summarizes the outcome of our coupling method.

Proposition 2.1. Suppose that (A1)–(A3) are fulfilled. If

\begin{equation} \widetilde{\E}[ V(\wtX_0) + V(\wtX_0') ] < \infty, \end{equation}

then

\begin{equation} \widetilde{\P}\,\biggl( \wty_{m}=\wty_{m}' \text{ for all } m\geq n \text{ and } \sum_{m=n}^\infty | \wtl_{m} - \wtl_{m}' | \leq \frac{\rho^{\sqrt{n}}}{1-c} \biggr) = 1 - O( \rho^{\sqrt{n}} ). \end{equation}

The following two results are immediate consequences of the main Proposition 2.1.

Corollary 2.1. Suppose that (A1)–(A3) are fulfilled. Then the Markov process $$(Z_t)_t$$ has a unique stationary distribution π.

Remark 2.5. Woodard et al. [Reference Woodard, Matteson and Henderson32] and Douc et al. [Reference Douc, Doukhan and Moulines10] also derived properties of nonlinear INGARCH(1,1) processes which are, as in our case here, Markov chains that are not necessarily irreducible. Woodard et al. [Reference Woodard, Matteson and Henderson32] used the fact that a drift condition in conjunction with the weak Feller property of the Markov kernel ensures the existence of a stationary distribution while its uniqueness follows from a so-called asymptotic strong Feller property. These properties were, for example, verified for a Poisson threshold model with an intensity function as in (1.6). Douc et al. [Reference Douc, Doukhan and Moulines10] extended these results to more general intensity functions, including among other examples the log-linear Poisson autoregression model introduced by Fokianos and Tjøstheim [Reference Fokianos and Tjøstheim17]. They also focused on the intensity process and imposed the weak Feller condition directly on it. Under an additional high-level condition on two appropriately coupled versions of the Markov chain (see their condition (A3)), they showed that the intensity process (λt)t and, as a consequence, the bivariate process ((Yt, λt))t as well possess unique stationary distributions, and that stationary versions of the processes are ergodic. In the case of a Poisson threshold model (1.6) they also imposed the condition that max{c, c } < 1 in order to ensure semi-contractivity.

Under the semi-contractivity condition imposed here, we cannot derive the abovementioned Feller properties in general. On the other hand, the coupling result stated in Proposition 2.1 compensates for this failure. A metric d which resembles the coupling result is given by

\begin{align*} & d( (\kern1pt y_1,\ldots,y_p;\,\lambda_1,\ldots,\lambda_q), (\kern1pt y_1',\ldots,y_p';\,\lambda_1',\ldots,\lambda_q') ) \\ & \qquad = \1( (\kern1pt y_1,\ldots,y_p)\neq (\kern1pt y_1',\ldots,y_p') ) + \sum_{i=1}^q |\lambda_i-\lambda_i'|. \end{align*}

It follows for arbitrary $$z \in [0,\infty)^{p+q}$$ that $$P^{Z_1 \mid Z_0 = z'} \Rightarrow P^{Z_1 \mid Z_0 = z}$$ as d(z′, z) → 0, where ‘⇒’ denotes weak convergence. In other words, the weak Feller property holds with respect to the metric d rather than the more usual Euclidean norm. As can be seen in the proof of Corollary 2.1, we also obtain

\begin{equation} \inf\big\{ d(\zeta_n, \zeta_n')\colon \zeta_n\sim P^{Z_n\mid Z_0=z}, \zeta_n'\sim P^{Z_n\mid Z_0=z'} \big\} \to 0 \quad\text{as } n\to\infty, \end{equation}

which means that the asymptotic Feller property is also fulfilled.

The following theorem is our main result.

Theorem 2.1. Suppose that (A1)–(A3) are fulfilled. A stationary version of the process $$(Y_t)_t$$ is absolutely regular (β-mixing) with coefficients satisfying

\begin{equation} \beta_n \leq C \rho^{\sqrt{n}} \quad \text{ for all } n\in\N \end{equation}

for some C < ∞ and ρ < 1.

At this point we would like to recall that the accompanying process $$(\lambda_t)_t$$ is not mixing in general. The following counterexample was already given in [Reference Neumann27, Remark 3]. In the case of an INGARCH(1,1) process, consider the specification f(y; λ) = y/2 + g(λ), where g is strictly monotone and satisfies $$0 < \kappa_1 \le g(\lambda) < 0.5$$ as well as $$|g(\lambda) - g(\lambda')| \le \kappa_2 |\lambda - \lambda'|$$ for all λ, λ′ ≥ 0 and some $$\kappa_2 < 0.5$$. Then our regularity conditions (A1)–(A3) are fulfilled. Using the fact that $$g(\lambda) \in [\kappa_1, 0.5)$$, it follows from $$2\lambda_t = Y_{t-1} + 2g(\lambda_{t-1})$$ that $$Y_{t-1} = [2\lambda_t]$$ and, therefore, $$2g(\lambda_{t-1}) = 2\lambda_t - [2\lambda_t]$$. This means that we can perfectly recover $$\lambda_{t-1}$$ once we know the value of $$\lambda_t$$. Iterating this argument we see that we can recover from $$\lambda_t$$ the complete past of the hidden process $$(\lambda_t)_t$$. Taking into account that the above choice of f excludes the case that this process is purely nonrandom, we conclude that a stationary version of $$(\lambda_t)_t$$ cannot be strongly mixing, and therefore cannot be absolutely regular either. However, exploiting once more our coupling idea we can show that $$\lambda_t$$ can be expressed as

\begin{equation} \lambda_t = g(Y_{t-1},Y_{t-2},\ldots\kern-1.5pt) \end{equation}

for some measurable function g. This yields ergodicity of the process $$(\lambda_t)_{t \in \mathbb{Z}}$$ and also of the bivariate process $$((Y_t, \lambda_t))_{t \in \mathbb{Z}}$$, as stated in the following theorem.

Theorem 2.2. Suppose that (A1)–(A3) are fulfilled. Then a stationary version of the process $$((Y_t, \lambda_t))_{t \in \mathbb{Z}}$$ is ergodic.

Compared with the absolute regularity of the process $$(Y_t)_t$$, the ergodicity result for the accompanying process $$(\lambda_t)_t$$ is rather weak. However, combined with additional structural assumptions, even ergodicity might prove sufficient for deriving asymptotic properties of statistical procedures; see, e.g. [Reference Neumann27, Section 4], [Reference Leucht and Neumann24], and [Reference Leucht, Kreiss and Neumann25].

Remark 2.6. It is possible to extend our result on absolute regularity to the case of a time-varying transition mechanism, where the function f additionally depends on time. In this case, (2.1b) has to be replaced by

\begin{align*} \lambda_t =\,f_t(Y_{t-1},\ldots,Y_{t-p};\lambda_{t-1},\ldots,\lambda_{t-q}), \end{align*}

and assumption (A2) by

(A2′) (Uniform semi-contractive condition.) There exist nonnegative constants $$c_1, \ldots, c_q$$ with $$c_1 + \cdots + c_q < 1$$ such that

\begin{equation} |\,f_t(\kern1pt y_1,\ldots,y_p;\,\lambda_1,\ldots,\lambda_q) - f_t(\kern1pt y_1,\ldots,y_p;\,\lambda_1',\ldots,\lambda_q')| \leq \sum_{i=1}^q c_i |\lambda_i-\lambda_i'| \end{equation}

for all t ≥ 0, $$y_1, \ldots, y_p \in \mathbb{R}$$, $$\lambda_1, \ldots, \lambda_q, \lambda_1', \ldots, \lambda_q' \ge 0$$.

We are convinced that results similar to those in our paper can be proved under these conditions and we hope to be able to report on this elsewhere.

2.4 Some applications in statistics

In what follows we discuss a couple of instances where absolute regularity yields powerful uniform limit theorems, which also indicates the relevance of the present results. Assume that a real-valued process $$(Y_t)_{t \in \mathbb{Z}}$$ is strictly stationary and strongly mixing with coefficients satisfying $$\alpha_n \le C\rho^{\sqrt{n}}$$ for some C < ∞ and ρ < 1. If, in addition, \[\mathbb{E}g(Y_0) = 0\] and \[\mathbb{E}g^2(Y_0) \ln^2(|g(Y_0)| \vee 1) < \infty\], then Doukhan et al. [Reference Doukhan, Massart and Rio12] proved the following central limit theorem in the Skorokhod space D[0,1]:

\begin{equation} \frac{1}{\sqrt{n}\,} \sum_{j=1}^{[nu]} g(Y_j) \stackrel{D[0,1]}{\longrightarrow} \sigma(g)W(u). \end{equation}

Here W is a Brownian motion and the series \[\sigma^2(g) = \sum\nolimits_{j = -\infty}^\infty \mathbb{E}g(Y_0)g(Y_j)\] is assumed to converge. For the detection of changes in the mean, we refer the reader to Theorems 4.1.2 and 4.1.5 of [Reference Csorgö and Horvath7]. The same volume deals in Section 4.4 with the detection of change points for other parameters involving functional central limit theorems; Doukhan et al. [Reference Doukhan, Massart and Rio13] proved a corresponding result under β-mixing.

In the framework of nonparametric estimation, the specific structure of β-mixing is also fruitful. Viennet’s [Reference Viennet31] covariance inequality gives relevant bounds for the centred moments of kernel-type estimators (and more general nonparametric estimators) without imposing the existence of uniformly bounded joint densities, as is usually done under weaker strong mixing assumptions. This inequality is written as

\begin{align*} n\int_{\R^d}\var\kern1pt \skew3\widehat f_n(x) w(x)\sd x\le \biggl(1+4\sum_{i=1}^{n-1}\beta_i\biggr)\sup_{x\in \R^d} \biggl\{w^2(x)\sum_{j=1}^me_j^2(x) \biggr\} \end{align*}

for projection-type estimators on the vector space spanned by $$\{e_1, \ldots, e_m\}$$, which is an orthonormal system of \[\mathbb{L}^2(\mathbb{R}^d, w(x)\,\mathrm{d}x)\]. The standard bound of such quadratic loss has order m/n under weak β-mixing assumptions. This fact was also decisive in using model selection procedures under dependence. Baraud et al. [Reference Baraud, Comte and Viennet2] proposed adaptive estimation and a selection procedure for regression models (including autoregression) under this β-mixing condition. Beyond the abovementioned covariance inequality from [Reference Viennet31], they used the Berbee coupling for β-mixing sequences.

3 Proofs

Proof of Remark 2.1. For nonnegative $$y_1, \ldots, y_{p-1}, \lambda_0, \ldots, \lambda_{q-1}$$ and positive $$a_1, \ldots, a_{p-1}, b_0, \ldots, b_{q-1}$$, let

\begin{align*} V( (\kern1pt y_1,\ldots,y_{p-1},\lambda_0,\ldots,\lambda_{q-1}) ) = \sum_{i=1}^{p-1} a_i y_i + \sum_{j=0}^{q-1} b_j \lambda_j. \end{align*}

We consider, without loss of generality, only the case of an INGARCH(p,q) process since the proof in the GARCH(p,q) case is analogous. Recall that $$X_t = (Y_{t-1}, \ldots, Y_{t-p+1}, \lambda_t, \ldots, \lambda_{t-q+1})$$. Then

(3.1)\begin{align} & \E( V(X_t) \mid X_{t-1} )\nonumber \\ & \qquad = \E\biggl( a_1 Y_{t-1} + \sum_{i=2}^{p-1} a_i Y_{t-i} + b_0 \lambda_t + \sum_{j=1}^{q-1} b_j \lambda_{t-j} \biggm| Y_{t-2},\ldots,Y_{t-p},\lambda_{t-1},\ldots,\lambda_{t-q} \biggr) \nonumber \\ & \qquad \leq a_1 \lambda_{t-1} + \sum_{i=2}^{p-1} a_i Y_{t-i} + b_0 \biggl( \bar{a}_0 + \bar{a}_1 \lambda_{t-1} + \sum_{i=2}^p \bar{a}_i Y_{t-i} + \sum_{j=1}^q \bar{b}_j \lambda_{t-j} \biggr) + \sum_{j=1}^{q-1} b_j \lambda_{t-j}. \end{align}

We are going to find positive constants $$a_1, \ldots, a_{p-1}, b_0, \ldots, b_{q-1}$$, κ < 1, and $$a_0 < \infty$$ such that the right-hand side of (3.1) is smaller than or equal to

\begin{equation} a_0 + \kappa V(X_{t-1}) = a_0 + \kappa \biggl( \sum_{i=2}^p a_{i-1} Y_{t-i} + \sum_{j=1}^q b_{j-1} \lambda_{t-j} \biggr). \end{equation}

We set, without loss of generality, $$b_0 = 1$$ and, accordingly, $$a_0 = \bar{a}_0$$. Condition (A1) will be fulfilled for all possible values of the involved random variables if

(3.2a)\begin{gather}a_1 + b_1 + \bar{a}_1 + \bar{b}_1 < 1,\end{gather}
(3.2b)\begin{gather}\bar{b}_j + b_j < b_{j-1} \quad \mbox{for } j=2,\ldots,q-1,\end{gather}
(3.2c)\begin{gather}\bar{b}_q < b_{q-1},\end{gather}
(3.2d)\begin{gather}\bar{a}_i + a_i < a_{i-1} \quad \mbox{for } i=2,\ldots,p-1,\end{gather}
(3.2e)\begin{gather}\bar{a}_p < a_{p-1},\end{gather}

where the possible choice of κ becomes apparent at the end of the proof.

Let $$\bar a = \sum\nolimits_{i = 1}^p \bar a_i$$ and $$\bar b = \sum\nolimits_{j = 1}^q \bar b_j$$. We choose ε > 0 such that $$\bar a + \bar b + 2\varepsilon < 1$$ and we define

\begin{align*} a_1 = \bar{a} - \bar{a}_1 + \varepsilon, \qquad b_1 = \bar{b} - \bar{b}_1 + \varepsilon. \end{align*}

Then (3.2a) is fulfilled. Furthermore, we define recursively, for any δ ∈ (0, ε/( q − 2)),

\begin{equation} b_j = b_{j-1} - \bar{b}_j - \delta\quad \mbox{for } j=2,\ldots,q-1, \end{equation}

which implies that (3.2b) holds. Then

\begin{equation} b_{q-1} = \bar{b} - \bar{b}_1 - \cdots - \bar{b}_{q-1} + \varepsilon - (q-2)\delta > \bar{b}_q, \end{equation}

which means that (3.2c) is satisfied. Moreover, we set, for γ ∈ (0, ε/(p − 2)),

\begin{equation} a_i = a_{i-1} - \bar{a}_i - \gamma \quad \mbox{for } i=2,\ldots,p-1. \end{equation}

Then (3.2d) is fulfilled. Finally,

\begin{equation} a_{p-1} = \bar{a} - \bar{a}_1 - \cdots - \bar{a}_{p-1} + \varepsilon - (\kern1.5pt p-2) \gamma > \bar{a}_p, \end{equation}

which shows that (3.2e) is also satisfied.

Since all inequalities (3.2a)–(3.2e) are fulfilled in the strict sense, we can include a factor κ < 1 which is sufficiently close to 1 on the right-hand sides, which leaves the strict inequalities intact. This completes the proof.
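As a concrete check of this construction, consider the hypothetical coefficients below with p = q = 2 (the numerical values are our own illustration and do not come from the text). In this case the recursions for $$b_j$$ and $$a_i$$ are empty, so only (3.2a), (3.2c), and (3.2e) need to be verified:

\begin{align*} & \bar{a}_1 = 0.2, \quad \bar{a}_2 = 0.1, \quad \bar{b}_1 = 0.3, \quad \bar{b}_2 = 0.1, \quad \varepsilon = 0.1, \qquad \bar{a} + \bar{b} + 2\varepsilon = 0.9 < 1, \\ & a_1 = \bar{a} - \bar{a}_1 + \varepsilon = 0.2, \qquad b_1 = \bar{b} - \bar{b}_1 + \varepsilon = 0.2, \\ & a_1 + b_1 + \bar{a}_1 + \bar{b}_1 = 0.9 < 1, \qquad \bar{b}_2 = 0.1 < b_1 = 0.2, \qquad \bar{a}_2 = 0.1 < a_1 = 0.2. \end{align*}

In this example κ = 0.9 is admissible, since each left-hand side is at most 0.9 times the corresponding right-hand side.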

Proof of Lemma 2.1. Recall that $$\widetilde{\lambda}_{\tau + 1}$$ and $$\widetilde{\lambda}_{\tau + 1}'$$ are $$\mathcal{G}_\tau$$-measurable. Therefore, it follows from the similarity condition (A3) and the maximal coupling scheme that

\begin{equation} \widetilde{\P}( \wty_{\tau+1} = \wty_{\tau+1}' \mid {\mathcal G}_\tau ) \geq \rme^{-\delta |\wtl_{\tau+1}-\wtl_{\tau+1}'|}. \end{equation}

Now if, in addition, $$\widetilde{Y}_{\tau + 1} = \widetilde{Y}_{\tau + 1}'$$ then we obtain p consecutive hits ($$\widetilde{Y}_\tau = \widetilde{Y}_\tau', \ldots, \widetilde{Y}_{\tau - p + 2} = \widetilde{Y}_{\tau - p + 2}'$$ was assumed) and the contractive property begins to take effect, which implies that

\begin{equation} |\wtl_{\tau+2}-\wtl_{\tau+2}'| \leq c_1 |\wtl_{\tau+1}-\wtl_{\tau+1}'| + \cdots + c_q |\wtl_{\tau-q+2}-\wtl_{\tau-q+2}'|. \end{equation}

Again, by (A3),

\begin{equation} \widetilde{\P}( \wty_{\tau+2} = \wty_{\tau+2}' \mid {\mathcal G}_\tau, \wty_{\tau+1} = \wty_{\tau+1}' ) \geq \rme^{-\delta |\wtl_{\tau+2}-\wtl_{\tau+2}'|}, \end{equation}

and if, additionally, $$\widetilde{Y}_{\tau + 2} = \widetilde{Y}_{\tau + 2}'$$ then

\begin{align*} | \wtl_{\tau+3} - \wtl_{\tau+3}' | & \leq c_1 | \wtl_{\tau+2} - \wtl_{\tau+2}' | + \sum_{i=2}^q c_i | \wtl_{\tau+3-i} - \wtl_{\tau+3-i}' | \\ & \leq c_1 ( c_1 |\wtl_{\tau+1}-\wtl_{\tau+1}'| + \cdots + c_q |\wtl_{\tau-q+2}-\wtl_{\tau-q+2}'| ) \\ &\jump+ \sum_{i=2}^q c_i | \wtl_{\tau+3-i} - \wtl_{\tau+3-i}' |. \end{align*}

Iterating these calculations we obtain, for all k ∈ ℕ, the following general formulae. If $$\widetilde{Y}_{\tau - p + 2} = \widetilde{Y}_{\tau - p + 2}', \ldots, \widetilde{Y}_{\tau + k - 1} = \widetilde{Y}_{\tau + k - 1}'$$ then

(3.3)\begin{equation} \label{pl21.1} | \wtl_{\tau+k} - \wtl_{\tau+k}'| \leq \sum_{i=1}^q d_{k,i} | \wtl_{\tau-i+2} - \wtl_{\tau-i+2}' |, \end{equation}

where $$d_{1,1} = 1$$, $$d_{1,i} = 0$$ if i ≥ 2, and, for k ≥ 2,

(3.4)\begin{equation} d_{k,i} = \sum_{\{l\colon (k+i-2)/q\leq l\leq k+i-2\}} \sum_{\{(i_1,\ldots,i_l)\colon i_1+\cdots+i_l=k+i-2\}} c_{i_1}\times\cdots\times c_{i_l}. \end{equation}

Therefore,

\begin{equation} \widetilde{\P}( \wty_{\tau+k}=\wty_{\tau+k}' \mid {\mathcal G}_\tau, \wty_{\tau+1}=\wty_{\tau+1}', \ldots, \wty_{\tau+k-1}=\wty_{\tau+k-1}' ) \geq \rme^{-\delta \sum_{i=1}^q d_{k,i} |\wtl_{\tau-i+2}-\wtl_{\tau-i+2}'|}. \end{equation}

This leads to

(3.5)\begin{align} \label{pl21.3} & \widetilde{\P}( \wty_{\tau+1}=\wty_{\tau+1}',\ldots,\wty_{\tau+m}=\wty_{\tau+m}' \mid {\mathcal G}_\tau ) \nonumber \\ & \qquad= \widetilde{\P}( \wty_{\tau+1}=\wty_{\tau+1}' \mid {\mathcal G}_\tau )\times \widetilde{\P}( \wty_{\tau+2}=\wty_{\tau+2}' \mid \wty_{\tau+1}=\wty_{\tau+1}', {\mathcal G}_\tau ) \times \cdots \nonumber \\ & \qquad \jump \times \widetilde{\P}( \wty_{\tau+m}=\wty_{\tau+m}' \mid \wty_{\tau+1}=\wty_{\tau+1}', \ldots, \wty_{\tau+m-1}=\wty_{\tau+m-1}', {\mathcal G}_\tau ) \nonumber \\ & \qquad\geq \rme^{-\delta \sum_{i=1}^q d_{1,i} |\wtl_{\tau-i+2}-\wtl_{\tau-i+2}'|}\times \cdots \times \rme^{-\delta \sum_{i=1}^q d_{m,i} |\wtl_{\tau-i+2}-\wtl_{\tau-i+2}'|} \nonumber \\ & \qquad= \exp\biggl\{-\delta \sum_{i=1}^q D_{m,i} |\wtl_{\tau-i+2}-\wtl_{\tau-i+2}'|\biggr\}, \end{align}

where

\begin{equation} D_{m,i} \,:\!= \sum_{k=1}^m d_{k,i} \leq \sum_{l=0}^{m+i-2} (c_1+\cdots +c_q)^l \leq \frac{1}{1-c}. \end{equation}
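The bound on $$D_{m,i}$$ can be unpacked as follows; this is a sketch of the counting argument, in which the tuple notation is ours. For fixed i, the index sets in (3.4) belonging to different values of k are disjoint, since they prescribe different sums $$i_1 + \cdots + i_l = k+i-2 \le m+i-2$$, and each tuple has length $$l \le m+i-2$$ because every index is at least 1. Summing instead over all tuples of each length gives

\begin{align*} D_{m,i} \leq \sum_{l=0}^{m+i-2} \sum_{(i_1,\ldots,i_l)\in\{1,\ldots,q\}^l} c_{i_1}\cdots c_{i_l} = \sum_{l=0}^{m+i-2} (c_1+\cdots+c_q)^l = \frac{1-c^{m+i-1}}{1-c} \leq \frac{1}{1-c}. \end{align*}

For q = 1, for example, (3.4) reduces to $$d_{k,1} = c_1^{k-1}$$, so that $$D_{m,1} = (1-c_1^m)/(1-c_1)$$.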

Since $$\widetilde{Y}_{\tau - p + 2} = \widetilde{Y}_{\tau - p + 2}', \ldots, \widetilde{Y}_{\tau + m} = \widetilde{Y}_{\tau + m}'$$ means that the contractive property takes effect at all time points from τ + 1 to τ + m, in this case we obtain

\begin{align*} | \wtl_{\tau+1} - \wtl_{\tau+1}' | + \cdots + | \wtl_{\tau+m+1} - \wtl_{\tau+m+1}' | & \leq \sum_{k=1}^{m+1} \sum_{i=1}^q d_{k,i} | \wtl_{\tau-i+2} - \wtl_{\tau-i+2}' | \\ & \leq \sum_{i=1}^q D_{m+1,i} | \wtl_{\tau-i+2} - \wtl_{\tau-i+2}' |. \end{align*}

With m → ∞ we conclude that

\begin{align*} & \widetilde{\P}\,\biggl( \wty_{\tau+m} = \wty_{\tau+m}' \text{ for all } m\in\N \ \mbox{and} \nonumber \\ &\hskip12pt \sum_{m=1}^\infty | \wtl_{\tau+m} - \wtl_{\tau+m}' | \leq \frac{1}{1-c} \sum_{i=1}^q |\wtl_{\tau-i+2}-\wtl_{\tau-i+2}'| \biggm| {\mathcal G}_\tau \biggr) \nonumber \\ & \qquad \geq \exp\,\biggl\{ -\frac{\delta}{1-c} \sum_{i=1}^q |\wtl_{\tau-i+2}-\wtl_{\tau-i+2}'| \biggr\}, \end{align*}

which proves the assertion.

Proof of Proposition 2.1. In view of the result of Lemma 2.1, we define a stopping time as

\begin{align*} \tau^{(n)} &= \inf\{t\geq 0\colon \wty_t=\wty_t',\ldots,\wty_{t-p+2}=\wty_{t-p+2}' \ \mbox{and} \ |\wtl_{t+1}-\wtl_{t+1}'|+\cdots \\ &\hskip26pt +|\wtl_{t-q+2}-\wtl_{t-q+2}'|\leq \rho^{\sqrt{n}} \} \end{align*}

for some ρ ∈ (0, 1). Recall that

\begin{align*} {\mathcal G}_t &= \sigma((\wty_t,\wty_t',\wtl_t,\wtl_t'),(\wty_{t-1},\wty_{t-1}',\wtl_{t-1},\wtl_{t-1}'),\ldots\kern-1.5pt)\\ &= \sigma(\wtl_{t+1},\wtl_{t+1}',(\wty_t,\wty_t',\wtl_t,\wtl_t'),(\wty_{t-1},\wty_{t-1}',\wtl_{t-1},\wtl_{t-1}'),\ldots\kern-1.5pt). \end{align*}

It follows from Lemma 2.1 that

\begin{align*} & \widetilde{\P}\,\biggl( \wty_{\tau^{(n)}+m} = \wty_{\tau^{(n)}+m}' \text{ for all } m\in\N \ \mbox{and} \ \sum_{m=1}^\infty | \wtl_{\tau^{(n)}+m} - \wtl_{\tau^{(n)}+m}' | \leq \frac{\rho^{\sqrt{n}}}{1-c} \biggm| {\mathcal G}_{\tau^{(n)}} \biggr) \nonumber \\ & \qquad \geq \rme^{-(\delta/(1-c)) \rho^{\sqrt{n}}} \\ &\qquad\geq 1 - \frac{\delta}{1-c} \rho^{\sqrt{n}}. \end{align*}

Hence, it remains to estimate \[\widetilde{\mathbb{P}}(\tau^{(n)} \geqslant n)\]. To this end, we define stopping times $$\tau_1, \tau_2, \ldots$$, which serve as starting points of subsequent trials to reach a state with

(3.6)\begin{equation} \wty_t=\wty_t',\ldots,\wty_{t-p+2}=\wty_{t-p+2}' \quad \mbox{and} \quad |\wtl_{t+1}-\wtl_{t+1}'|+\cdots +|\wtl_{t-q+2}-\wtl_{t-q+2}'|\leq \rho^{\sqrt{n}}. \end{equation}

Recall that

\begin{equation} \wty_t=\wty_t',\ldots,\wty_{t-p+2}=\wty_{t-p+2}' \quad \mbox{and} \quad |\wtl_{t+1}-\wtl_{t+1}'|+\cdots +|\wtl_{t-q+2}-\wtl_{t-q+2}'|\leq \rho^{\sqrt{n}}. \end{equation}

in the case of a GARCH(p,q) model. Furthermore, in the INGARCH(p,q) case we define these quantities as

\wtX_t=(\wty_{t-1},\ldots,\wty_{t-p+1},\wtl_t,\ldots,\wtl_{t-q+1}), \qquad \wtX_t'=(\wty_{t-1}',\ldots,\wty_{t-p+1}',\wtl_t',\ldots,\wtl_{t-q+1}').

Let $$W_t = (V(\widetilde{X}_t) + V(\widetilde{X}_t'))/2$$ and

\begin{equation} \tau_1 = \inf\{t\geq 0\colon W_t \leq C_1^{(0)} \}, \end{equation}

where $$C_1^{(0)} \in (0,\infty)$$ is defined in the course of the proof of Lemma 3.1 below. Then there exists some $$C_2^{(0)} > 0$$ such that

\begin{equation} \widetilde{\P}( \wty_{\tau_1}=\wty_{\tau_1}' \mid {\mathcal G}_{\tau_1-1} ) \geq C_2^{(0)}.\nonumber \end{equation}

Furthermore, it follows from (A1) that there exist some $$C_1^{(1)} < \infty$$ and $$C_3^{(1)} > 0$$ such that

\begin{equation} \widetilde{\P}\big(W_{\tau_1+1} \leq C_1^{(1)} \mid {\mathcal G}_{\tau_1-1}, \wty_{\tau_1}=\wty_{\tau_1}' \big) \geq C_3^{(1)}.\nonumber \end{equation}

It follows, in turn, that there exist constants $$C_2^{(1)}, C_3^{(2)} > 0$$ and $$C_1^{(2)} < \infty$$ such that

\begin{equation} \widetilde{\P}\,\big(\wty_{\tau_1+1}=\wty_{\tau_1+1}' \mid {\mathcal G}_{\tau_1-1}, \wty_{\tau_1}=\wty_{\tau_1}', W_{\tau_1+1} \leq C_1^{(1)} \big) \geq C_2^{(1)}\nonumber \end{equation}

and

\begin{equation} \widetilde{\P}\,\big(W_{\tau_1+2} \leq C_1^{(2)} \mid {\mathcal G}_{\tau_1-1}, \wty_{\tau_1}=\wty_{\tau_1}', \wty_{\tau_1+1}=\wty_{\tau_1+1}', W_{\tau_1+1} \leq C_1^{(1)} \big) \geq C_3^{(2)}.\nonumber \end{equation}

Proceeding in the same way we obtain

\begin{align*} &\widetilde{\P}\,\big(\wty_{\tau_1+p-1} = \wty_{\tau_1+p-1}' \Bigm| {\mathcal G}_{\tau_1-1}, \wty_{\tau_1}=\wty_{\tau_1}',\ldots, \wty_{\tau_1+p-2}=\wty_{\tau_1+p-2}', \\ &\hskip102pt W_{\tau_1+1} \leq C_1^{(1)},\ldots, W_{\tau_1+p-1} \leq C_1^{(\kern1.5pt p-1)} \big) \\ &\qquad\geq C_2^{(\kern1.5pt p-1)}. \end{align*}

This leads to

\begin{equation} \widetilde{\P} ( \wty_{\tau_1}=\wty_{\tau_1}',\ldots,\wty_{\tau_1+p-1}=\wty_{\tau_1+p-1}' \mid {\mathcal G}_{\tau_1-1} ) \geq C_2^{(0)}\cdots C_2^{(\kern1.5pt p-1)} C_3^{(1)}\cdots C_3^{(\kern1.5pt p-1)} =\!:\,C_4,\nonumber \end{equation}

that is, with a probability not smaller than $$C_4 > 0$$ we reach after p steps a state with

\begin{align*} &\qquad\qquad\wty_{\tau_1}=\wty_{\tau_1}',\ldots,\wty_{\tau_1+p-1}=\wty_{\tau_1+p-1}' \\ &\text{and}\quad \sum_{i=1}^q b_i |\wtl_{\tau_1+p-i}-\wtl_{\tau_1+p-i}'| \leq W_{\tau_1+p-1} \leq C_1^{(\kern1.5pt p-1)}. \end{align*}

Now the contractive condition begins to take effect and it follows from Lemma 2.1 that after $$D_n - p + 1 := [C_5 \sqrt{n}]$$ additional hits we arrive at a state with (3.6), if $$C_5$$ is large enough. This actually happens with a probability bounded away from 0. Hence, we obtain

\begin{align*} &\P\,\biggl( \wty_{\tau_1+D_n-1}=\wty_{\tau_1+D_n-1}',\ldots,\wty_{\tau_1+D_n-p+1} =\wty_{\tau_1+D_n-p+1}' \\ &\hskip15pt \text{and } \sum_{i=1}^q |\wtl_{\tau_1+D_n-i+1}-\wtl_{\tau_1+D_n-i+1}'| \leq\rho^{\sqrt{n}} \biggm| {\mathcal G}_{\tau_1-1} \biggr) \\ &\qquad\geq C_6 \end{align*}

for some $$C_6 > 0$$. This means that a trial to reach a favourable state with (3.6) covers $$D_n$$ time points. Accordingly, for i > 1, we consider the following retarded return times:

\begin{equation} \tau_i = \inf\{t> \tau_{i-1}+D_n\colon W_t \leq C_1^{(0)} \}. \end{equation}

Now we are in a position to derive an upper bound for \[\widetilde{\mathbb{P}}(\tau^{(n)} \geqslant n)\].

We define events

\begin{equation} A_i = \biggl\{ \wty_{\tau_i+D_n-\ell}=\wty_{\tau_i+D_n-\ell}' \ \mbox{for } 1\le \ell<p \ \mbox{and}\ \sum_{j=1}^q |\wtl_{\tau_i+D_n-j+1}-\wtl_{\tau_i+D_n-j+1}'|\leq \rho^{\sqrt{n}} \biggr\}. \end{equation}

Let $$K_n = C_7 D_n$$. It follows from Lemma 3.1 that

\begin{gather*} \widetilde{\E} \eta^{\tau_1}\leq 1+ \widetilde{\E}(\eta^{\tau_1}\mid W_0>C_1^{(0)})\leq 1+\widetilde{\E} W_0 \\ \text{and}\quad \widetilde{\E}(\eta^{\tau_m-\tau_{m-1}}\mid {\mathcal G}_{\tau_{m-1}-1}) \leq \eta^{D_n}C \,:\!= \eta^{D_n} \biggl(1+ \frac{a_0+\kappa C_1^{(0)}}{1-\kappa}\biggr), \end{gather*}

which yields

\begin{align*} \widetilde{\E} \eta^{\tau_1+(\tau_2-\tau_1)+\cdots+(\tau_{K_n}-\tau_{K_n-1})} & = \widetilde{\E} [ \eta^{\tau_1+(\tau_2-\tau_1)+\cdots+(\tau_{K_n-1}-\tau_{K_n-2})} \widetilde{\E}( \eta^{\tau_{K_n}-\tau_{K_n-1}} \mid {\mathcal G}_{\tau_{K_n-1}-1} ) ] \\ & \leq \eta^{D_n}C \widetilde{\E} \eta^{\tau_1+(\tau_2-\tau_1)+\cdots+(\tau_{K_n-1}-\tau_{K_n-2})} \\ & \leq \cdots \\ &\leq \eta^{D_n(K_n-1)} C^{K_n-1} (1+\widetilde{\E} W_0). \end{align*}

This implies that

\begin{align*} \widetilde{\P}_\pi( \tau_{K_n}+D_n-1 \geq n ) & \leq \frac{ \eta^{D_n(K_n-1)} C^{K_n-1} (1+\widetilde{\E}_\pi W_0) }{ \eta^{n-D_n+1} } \\ & = O\Big( \eta^{ C_7D_n^2-n-1 } C^{C_7D_n-1} \Big)\\ & = o( \rho^{\sqrt{n}} ) \end{align*}

if $$C_7 < 1$$ is sufficiently small. Therefore, and since \[\widetilde{\mathbb{P}}(A_1^c \cap \cdots \cap A_{K_n}^c) \leqslant (1 - C_6)^{K_n}\], we obtain

\begin{align*} \widetilde{\P}( \tau^{(n)} \geq n ) \leq \widetilde{\P}( \tau_{K_n}+D_n-1\geq n ) + \widetilde{\P}(A_1^c \cap\cdots\cap A_{K_n}^c) = o( \rho^{\sqrt{n}} ) + (1-C_6)^{K_n}. \end{align*}
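The two terms can be combined into the rate claimed in the proposition. The following is a sketch under the assumption $$C_6 \in (0,1)$$, with $$\rho_1$$ and $$\rho_2$$ denoting auxiliary constants introduced here only for illustration. Since $$D_n = [C_5\sqrt{n}] + p - 1 \geq C_5\sqrt{n} - 1$$ and $$K_n = C_7 D_n$$,

\begin{align*} (1-C_6)^{K_n} \leq (1-C_6)^{C_7 (C_5 \sqrt{n} - 1)} = (1-C_6)^{-C_7} \rho_1^{\sqrt{n}}, \qquad \rho_1 := (1-C_6)^{C_5 C_7} \in (0,1), \end{align*}

so that $$\widetilde{\mathbb{P}}(\tau^{(n)} \geq n) = O(\rho_2^{\sqrt{n}})$$ with $$\rho_2 = \max\{\rho, \rho_1\} < 1$$.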

Proof of Corollary 2.1. In order to prove existence of a stationary version of $$(Z_t)_t$$, it would suffice to derive this property for $$(X_t)_t$$, where $$X_t = (Y_{t-1}, \ldots, Y_{t-p+1}, \lambda_t, \ldots, \lambda_{t-q+1})$$. It follows from the drift condition (A1) that conditions (F1) and (F3), and therefore (F2) as well, in [Reference Tweedie30] are fulfilled. If the Markov chain were weak Feller, i.e. if for any bounded and continuous function $$\varphi \colon \mathbb{R}^{p+q-1} \to \mathbb{R}$$ the map $$x \mapsto \int \varphi(y) P^{X_1 \mid X_0 = x}(\mathrm{d}y)$$ were continuous, then we could conclude from Theorem 2 of [Reference Tweedie30] that $$(X_t)_t$$ has a stationary distribution. This fact has been used in, e.g. [Reference Douc, Doukhan and Moulines10], where the weak Feller property was explicitly imposed. The Feller property can be easily shown in the case of a continuous volatility/intensity function f; however, it might fail for a discontinuous function such as those appearing in certain threshold models. We show below that the missing Feller property is compensated for by the coupling result in Proposition 2.1.

First we convert the coupling result into a convergence result for the conditional distributions $$P^{Z_n \mid X_0 = x}$$, where x is an arbitrarily chosen point in the range of $$X_0$$. Using maximal coupling as in the proof of Proposition 2.1, we construct two versions of the process, $$(\widetilde{Z}_t)_{t \in \mathbb{N}_0}$$ and \[(\widetilde{Z}_t')_{t \in \mathbb{N}_0}\], where $$\widetilde{X}_0 = x$$ and $$\widetilde{X}_0' \sim P^{X_1 \mid X_0 = x}$$. We obtain

\begin{align*} &\widetilde{\kP}\biggl( (\widetilde{Y}_n,\ldots,\widetilde{Y}_{n-p+1})\neq (\widetilde{Y}_n',\ldots,\widetilde{Y}_{n-p+1}') \ \mbox{or} \ \sum_{j=1}^q |\widetilde{\lambda}_{n-j+1}-\widetilde{\lambda}_{n-j+1}'| > \frac{\rho^{\sqrt{n-q+1}}}{1-c} \biggr) \\ &\qquad= O( \rho^{\sqrt{n}} ). \end{align*}

Now we can construct, on a suitable probability space $$(\widetilde{\widetilde{\Omega}}, \widetilde{\widetilde{\mathcal{F}}}, \widetilde{\widetilde{\mathbb{P}}})$$, a sequence of random vectors $$(\zeta_n)_{n \in \mathbb{N}}$$ such that

\zeta_n=(\zeta_{n,1},\ldots,\zeta_{n,p},\zeta_{n,p+1},\ldots,\zeta_{n,p+q})^T =\Big(\zeta_{n,Y}^T,\zeta_{n,\lambda}^T\Big)^T\sim P^{Z_n\mid X_0=x}

and

(3.7)\begin{equation} \widetilde{\widetilde{\kP}}\biggl( \zeta_{n,Y}\neq \zeta_{n+1,Y} \ \mbox{or} \ \| \zeta_{n,\lambda}-\zeta_{n+1,\lambda} \|_{l_1} > \frac{\rho^{\sqrt{n-q+1}}}{1-c} \biggr) = O( \rho^{\sqrt{n}} ). \end{equation}

(Given $$\zeta_1, \ldots, \zeta_n$$, the vector $$\zeta_{n+1}$$ has to be defined according to the conditional distribution of $$\widetilde{Z}_n'$$ given $$\widetilde{Z}_n$$.) Since

\sum_{m=n}^\infty \rho^{\sqrt{m}}=O(\sqrt{n} \rho^{\sqrt{n}}),

we obtain, from (3.7),

(3.8)\begin{align} &\widetilde{\widetilde{\kP}}\biggl( \zeta_{m,Y}=\zeta_{m+1,Y} \text{ for all } m\geq n \ \mbox{and} \ \sum_{m=n}^\infty \|\zeta_{m,\lambda}-\zeta_{m+1,\lambda}\|_{l_1} \leq K\sqrt{n} \rho^{\sqrt{n}} \biggr)\nonumber \\ &\qquad = 1 - O(\sqrt{n} \rho^{\sqrt{n}}) \end{align}

for some K < ∞. It follows that

\begin{equation} \widetilde{\widetilde{\kP}}\biggl( \bigcup_{n=1}^\infty \{\omega\colon \zeta_{m,Y}=\zeta_{m+1,Y} \text{ for all } m\geq n\} \biggr) = 1, \end{equation}

which means that all $$\zeta_{m,Y}$$ are equal for m ≥ n(ω), and therefore they are eventually equal to some random vector $$\zeta_Y$$. Furthermore, since

\zeta_{N,\lambda}=(\zeta_{N,\lambda}-\zeta_{N-1,\lambda})+\cdots +(\zeta_{n+1,\lambda}-\zeta_{n,\lambda})+\zeta_{n,\lambda},

we obtain

\begin{equation} \limsup_{N\to\infty} \zeta_{N,i} - \liminf_{N\to\infty} \zeta_{N,i} \leq \sum_{m=n}^\infty |\zeta_{m+1,i}-\zeta_{m,i}|\quad \text{for all } i=p+1,\ldots,p+q. \end{equation}

Hence, it follows from (3.8) that

\begin{equation} \widetilde{\widetilde{\kP}}\,\Big( \liminf_{N\to\infty} \zeta_{N,i} = \limsup_{N\to\infty} \zeta_{N,i} \text{ for all } i=p+1,\ldots,p+q \Big) = 1, \end{equation}

which implies that $$\zeta_{N,\lambda}$$ converges to some random vector $$\zeta_\lambda$$ with probability 1. Let $$\zeta = (\zeta_Y^T, \zeta_\lambda^T)^T$$ and denote by $$\pi = \widetilde{\widetilde{\mathbb{P}}}^\zeta$$ the distribution of ζ. Next we show that π is a stationary distribution of the Markov chain $$(Z_t)_t$$. Let $$\varphi \colon \mathbb{R}^{p+q} \to \mathbb{R}$$ be a bounded and uniformly continuous function. Since the map $$y \mapsto \int \varphi(z) P^{Z_1 \mid Z_0 = y}(\mathrm{d}z)$$ is continuous in the last q arguments $$y_{p+1}, \ldots, y_{p+q}$$, we obtain

\begin{equation} \int \biggl[ \int \varphi(z) P^{Z_1\mid Z_0=y}(\rd z) \biggr]\, \widetilde{\widetilde{\kP}}^{\zeta_n}(\rd y) \to \int\, \biggl[ \int \varphi(z) P^{Z_1\mid Z_0=y}(\rd z) \biggr]\, \pi(\rd y) \quad\text{as }{n\to\infty}, \end{equation}

which yields

\begin{align*} & \biggl| \int \varphi(\kern1pt y) \pi(\rd y) - \int \biggl[ \int \varphi(z) P^{Z_1\mid Z_0=y}(\rd z) \biggr] \pi(\rd y) \biggr| \\ & \qquad = \lim_{n\to\infty} \biggl| \int \varphi(\kern1pt y) \widetilde{\widetilde{\kP}}^{\zeta_n}(\rd y) - \int \biggl[ \int \varphi(z) P^{Z_1\mid Z_0=y}(\rd z) \biggr] \widetilde{\widetilde{\kP}}^{\zeta_n}(\rd y) \biggr| \\ & \qquad = \lim_{n\to\infty} \biggl| \int \varphi(\kern1pt y) \widetilde{\widetilde{\kP}}^{\zeta_n}(\rd y) - \int \varphi(\kern1pt y) \widetilde{\widetilde{\kP}}^{\zeta_{n+1}}(\rd y) \biggr| \\ &\qquad= 0. \end{align*}

Hence, π is a stationary distribution of (Zt)t.
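The tail bound $$\sum_{m \geq n} \rho^{\sqrt{m}} = O(\sqrt{n} \rho^{\sqrt{n}})$$ used to derive (3.8) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names `tail_sum`, `bound_constant` and `ratio`, the truncation point `m_max`, and the explicit constant (obtained by comparing the sum with an integral) are not part of the paper.

```python
import math


def tail_sum(rho: float, n: int, m_max: int = 200_000) -> float:
    """Truncated tail sum  sum_{m=n}^{m_max} rho**sqrt(m)  (a lower bound of the full tail)."""
    return sum(rho ** math.sqrt(m) for m in range(n, m_max + 1))


def bound_constant(rho: float) -> float:
    """Illustrative constant K with  sum_{m>=n} rho**sqrt(m) <= K * sqrt(n) * rho**sqrt(n).

    Derived (as an assumption of this sketch) from the integral comparison
    sum_{m>=n} rho**sqrt(m) <= int_{n-1}^inf rho**sqrt(x) dx
    together with the substitution u = sqrt(x).
    """
    c = -math.log(rho)  # c > 0 because 0 < rho < 1
    return (2.0 / rho) * (1.0 / c + 1.0 / c**2)


def ratio(rho: float, n: int) -> float:
    """Tail sum divided by the claimed rate sqrt(n) * rho**sqrt(n)."""
    return tail_sum(rho, n) / (math.sqrt(n) * rho ** math.sqrt(n))


if __name__ == "__main__":
    rho = 0.5
    for n in (1, 10, 100, 1000):
        # The ratio stays below the constant for every n, as the O(.) claim requires.
        print(n, ratio(rho, n), bound_constant(rho))
```

The ratio remaining bounded in n is exactly what the O-notation asserts; the particular constant is one convenient choice and is not claimed to be optimal.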

To show uniqueness, suppose that $$\pi_1$$ and $$\pi_2$$ are two arbitrary stationary distributions. We start the processes to be coupled such that $${\widetilde Z_0} \sim {\pi _1}$$ and $$\widetilde Z_0' \sim \pi _2$$. (Here, it does not matter whether or not $${\widetilde Z_0}$$ and $$\widetilde Z_0'$$ are independent.) Since both $$\pi_1$$ and $$\pi_2$$ are stationary laws, we have

(3.9)\begin{equation} \widetilde{Z}_n \sim \pi_1 \quad \mbox{and} \quad \widetilde{Z}_n' \sim \pi_2 \quad \text{for all } n\in\N. \end{equation}

Furthermore, it follows from the geometric drift condition (A1) that \[\widetilde{\mathbb{E}}(V(\widetilde X_1 ) + V(\widetilde X_1' )) < \infty\], which implies by Proposition 2.1 that

\begin{equation} \| \widetilde{Z}_n - \widetilde{Z}_n' \| \,\stackrel{\widetilde{\P}}{\longrightarrow}\, 0 \end{equation}

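as n → ∞. This and (3.9) imply that $$\pi_1 = \pi_2$$: for every bounded and uniformly continuous function φ, $$|\int \varphi \,\mathrm{d}\pi_1 - \int \varphi \,\mathrm{d}\pi_2| = |\widetilde{\mathbb{E}} \varphi(\widetilde Z_n) - \widetilde{\mathbb{E}} \varphi(\widetilde Z_n')| \to 0$$.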

Proof of Theorem 2.1. Let π denote the stationary distribution of $$(Z_t)_t$$ and let, for $$s \leq t \leq \infty$$, $${\cal F}_{s,t}^Y = \sigma (Y_s, \ldots, Y_t )$$. We start both versions of the process at time 0 independently, with $${\widetilde Z_0} \sim \pi $$ and $$\widetilde Z_0' \sim \pi$$. We denote by \[\widetilde{\mathbb{P}}_\pi\] and \[\widetilde{\mathbb{E}}_\pi\] the corresponding distribution and expectation, respectively. Since, by (3.10) below, $$\lambda_t = g(Y_{t-1}, Y_{t-2}, \ldots)$$, we have in particular $${\cal F}_{ - \infty, 0}^Y = \sigma (Z_0, Z_{ - 1}, \ldots )$$. We obtain

\begin{align*} \beta_n & = \E( \esssup \{ | \rP (V\mid {\mathcal F}_{-\infty,0}^Y) - \rP(V) |\colon V\in {\mathcal F}_{n,\infty}^Y \} ) \\ & = \E( \esssup \{ | \rP(V\mid Y_0,Z_0,Z_{-1},\ldots ) - \rP(V) |\colon V\in {\mathcal F}_{n,\infty}^Y \} ) \\ & \leq \widetilde{\E}_\pi ( \esssup \{ | \widetilde{\P}_\pi( (\wty_n,\wty_{n+1},\ldots\kern-1.5pt)\in A\mid {\mathcal G}_0) - \widetilde{\P}_\pi( (\wty_n',\wty_{n+1}',\ldots\kern-1.5pt)\in A\mid {\mathcal G}_0) |\colon A\in {\mathcal C} \} ) \\ & \leq \widetilde{\E}_\pi ( \widetilde{\P}_\pi( \text{there exists } m\geq n\colon \wty_m\neq \wty_m'\mid {\mathcal G}_0) ) \\ &= \widetilde{\P}_\pi( \text{there exists } m\geq n\colon \wty_m\neq \wty_m'). \end{align*}

Here $${\cal C}$$ denotes the σ-field generated by the cylinder sets. Proposition 2.1 yields $$\beta _n = O(\rho ^{\sqrt n } )$$, as required.

Proof of Theorem 2.2. Let $$((Y_t, \lambda_t))_{t \in \mathbb{Z}}$$ be a stationary version of the process. We will show that there exists a measurable function \[g:\mathbb{N}_0^\infty \to [0,\infty )\] such that $$\lambda_t = g(Y_{t-1}, Y_{t-2}, \ldots)$$. To this end, we consider the same ‘forward iterations’ as in the proof of Lemma 2.1. We use the true values $$Y_0, \ldots, Y_{1-p}, \lambda_0, \ldots, \lambda_{1-q}$$ as well as $$Y_0, \ldots, Y_{1-p}, \lambda_0', \ldots, \lambda_{1-q}'$$ with $$\lambda_0' = \cdots = \lambda_{1-q}' = 0$$ as starting values. Then we define, according to the model equation (2.1b),

\begin{align*} \lambda_1 & =\,f(Y_0,\ldots,Y_{1-p};\,\lambda_0,\ldots,\lambda_{1-q}), \\ \lambda_1' & =\,f(Y_0,\ldots,Y_{1-p};\,\lambda_0',\ldots,\lambda_{1-q}') =\!:\, g^{[1]}(Y_0,\ldots,Y_{1-p}). \end{align*}

Iterating this scheme we obtain

\begin{align*} \lambda_k & =\,f(Y_{k-1},\ldots,Y_{k-p};\,\lambda_{k-1},\ldots,\lambda_{k-q}), \\ \lambda_k' & =\,f(Y_{k-1},\ldots,Y_{k-p};\,\lambda_{k-1}',\ldots,\lambda_{k-q}') =\!:\, g^{[k]}(Y_{k-1},\ldots,Y_{1-p}). \end{align*}

Note that in all steps matching values of the process $$(Y_t)_t$$ are used for computing $$\lambda_k$$ and $$\lambda_k'$$, which means that the contractive property takes effect at each step. Therefore we obtain, analogously to (3.3) in the proof of Lemma 2.1,

\begin{equation} | \lambda_k - \lambda_k' | \leq \sum_{i=1}^q d_{k+1,i} \lambda_{1-i}, \end{equation}

where it follows from (3.4) that $$d_{k+1} \to 0$$ as k → ∞. By stationarity we conclude, for fixed t ∈ ℤ, that

\begin{equation} \E| \lambda_t - g^{[k]}(Y_{t-1},\ldots,Y_{t-p-k+1}) | \leq \sum_{i=1}^q d_{k+1,i} \E \lambda_{t-k-i+1} \to 0 \quad\text{as } k\to\infty, \end{equation}

that is, as k → ∞, $$g^{[k]}(Y_{t-1}, \ldots, Y_{t-p-k+1})$$ converges in $$L_1$$ to $$\lambda_t$$. By taking an appropriate subsequence we also get almost-sure convergence. This means that there exists some measurable function \[g:\mathbb{N}_0^\infty \to [0,\infty )\] such that

(3.10)\begin{equation} \lambda_t = g(Y_{t-1},Y_{t-2},\ldots\kern-1.5pt) \quad \mbox{almost surely}. \end{equation}

Since absolute regularity of the process $$(Y_t)_{t \in \mathbb{Z}}$$ implies strong mixing (see, e.g. [Reference Doukhan11, p. 20]), we conclude from Remark 2.6 on page 50 in combination with Proposition 2.8 of [Reference Bradley5, p. 51] that any stationary version of this process is also ergodic.

Finally, we conclude from representation (3.10) by Proposition 2.10(ii) of [Reference Bradley5, p. 54] that the bivariate process $$((Y_t, \lambda_t))_{t \in \mathbb{Z}}$$ is also ergodic.
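The forward iterations used in this proof can be illustrated numerically. The sketch below assumes, purely for illustration, a linear INGARCH(1,1) specification $$f(y; \lambda) = \omega + a y + b \lambda$$ with p = q = 1; this particular f, the parameter values and the function names `forward_iterations` and `f_linear` are assumptions and not part of the paper, which allows a general semi-contractive f. With matching values of Y, the gap between the iteration started at the true intensity and the one started at 0 shrinks by the factor b in each step.

```python
from typing import Callable, List, Sequence


def forward_iterations(
    f: Callable[[float, float], float],
    ys: Sequence[float],
    lam_start: float,
) -> List[float]:
    """Iterate lam_k = f(Y_{k-1}, lam_{k-1}) for k = 1, ..., len(ys) (case p = q = 1).

    ys[0] plays the role of Y_0 and lam_start the role of lambda_0.
    """
    lams = []
    lam = lam_start
    for y in ys:
        lam = f(y, lam)
        lams.append(lam)
    return lams


# Hypothetical linear INGARCH(1,1) link; OMEGA, A, B are illustrative values only.
OMEGA, A, B = 0.5, 0.3, 0.6


def f_linear(y: float, lam: float) -> float:
    """Assumed link function f(y; lam) = OMEGA + A * y + B * lam."""
    return OMEGA + A * y + B * lam


if __name__ == "__main__":
    ys = [3, 0, 1, 4, 2, 2, 0, 5, 1, 3]  # any fixed count sequence Y_0, Y_1, ...
    lam_true = forward_iterations(f_linear, ys, lam_start=7.0)  # "true" lambda_0 = 7
    lam_zero = forward_iterations(f_linear, ys, lam_start=0.0)  # lambda_0' = 0
    for k, (l1, l2) in enumerate(zip(lam_true, lam_zero), start=1):
        # In this linear case the gap equals B**k * 7.0 up to rounding.
        print(k, abs(l1 - l2))
```

In this linear toy case the contraction is geometric; under the weaker semi-contractive condition of the paper, only the convergence of the gap to zero is guaranteed by the argument above.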

Lemma 3.1. Suppose that (A1) is fulfilled. Then

  (i) \[\widetilde{\mathbb{E}}(\eta ^{\tau _1 } \mid \mathcal{G}_{ - 1} ) \leqslant (V(\widetilde X_0 ) + V(\widetilde X_0' ))/2\] if $$(V(\widetilde X_0 ) + V(\widetilde X_0' ))/2 > C_1^{(0)}$$, where $$\eta = 2/(1 + \kappa)$$ and $$C_1^{(0)} = (2a_0 + 2)/(1 - \kappa )$$.

  (ii) \[\widetilde{\mathbb{E}}(\eta ^{\tau _{m + 1} - \tau _m } \mid \mathcal{G}_{\tau _m - 1} ) \leqslant \eta ^{D_n } (1 + (a_0 + \kappa C_1^{(0)} )/(1 - \kappa ))\].

Proof of Lemma 3.1. We have already defined

\begin{gather*} \wtX_t=(\wty_{t-1}^2,\ldots,\wty_{t-p+1}^2,\widetilde{\sigma}_t^2,\ldots,\widetilde{\sigma}_{t-q+1}^2) \\ \text{and}\quad \wtX_t'=(\wty_{t-1}^{'2},\ldots,\wty_{t-p+1}^{'2},\widetilde{\sigma}_t^{'2},\ldots,\widetilde{\sigma}_{t-q+1}^{'2}) \end{gather*}

in the case of a GARCH(p,q) model. Furthermore, in the INGARCH(p, q) case we set analogously

\wtX_t=(\wty_{t-1},\ldots,\wty_{t-p+1},\wtl_t,\ldots,\wtl_{t-q+1}) \quad\text{and}\quad \wtX_t'=(\wty_{t-1}',\ldots,\wty_{t-p+1}',\wtl_t',\ldots,\wtl_{t-q+1}').

Let $$W_t = (V(\widetilde X_t ) + V(\widetilde X_t' ))/2$$.

Since $$\widetilde Y_{t - 1} \mid {\cal G}_{t - 1} = Q(\widetilde\lambda _{t - 1} )$$, we see that

\begin{gather*} \widetilde{\E}( \wty_{t-1}\mid {\mathcal G}_{t-1} ) =\widetilde{\E}( \wty_{t-1}\mid \wtX_{t-1}) \\ \text{and}\quad \widetilde{\E}( \wtl_t\mid {\mathcal G}_{t-1} ) =\widetilde{\E}(\,f(\wty_{t-1},\ldots,\wty_{t-p};\,\wtl_{t-1},\ldots,\wtl_{t-q})\mid {\mathcal G}_{t-1} ) =\widetilde{\E}( \wtl_t\mid \wtX_{t-1} ). \end{gather*}

Therefore, we obtain \[\widetilde{\mathbb{E}}(V(\widetilde X_t ) \mid \mathcal{G}_{t - 1} ) = \widetilde{\mathbb{E}}(V(\widetilde X_t ) \mid \widetilde X_{t - 1} )\] and, analogously, \[\widetilde{\mathbb{E}}(V(\widetilde X_t' ) \mid \mathcal{G}_{t - 1} ) = \widetilde{\mathbb{E}}(V(\widetilde X_t' ) \mid \widetilde X_{t - 1}' )\]. Hence, from the geometric drift condition (A1), we obtain

(3.11)\begin{equation} \widetilde{\E}( W_t \mid {\mathcal G}_{t-1} ) \leq \kappa W_{t-1} + a_0. \end{equation}

This implies that

(3.12)\begin{equation} \widetilde{\E}( W_t\mid {\mathcal G}_{t-1} ) \leq \eta^{-1} W_{t-1} - 1, \quad \mbox{if } W_{t-1}>C_1^{(0)} , \end{equation}

and

\begin{equation} \widetilde{\E}( W_t\mid {\mathcal G}_{t-1} ) \leq \kappa C_1^{(0)} + a_0, \quad \mbox{if } W_{t-1}\leq C_1^{(0)}.\nonumber \end{equation}

In what follows we adapt the line of arguments from Nummelin and Tuominen [Reference Nummelin and Tuominen28], who derived similar bounds for stopping times in the context of a Markov chain.
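For completeness, the step from (3.11) to (3.12) can be spelled out. The following short computation is a sketch rather than part of the original proof; it uses only $$\eta = 2/(1+\kappa)$$ and $$C_1^{(0)} = (2a_0+2)/(1-\kappa)$$ from the statement of the lemma and assumes $$0 \leq \kappa < 1$$, as is usual for a geometric drift condition.

```latex
\begin{align*}
\bigl( \eta^{-1} W_{t-1} - 1 \bigr) - \bigl( \kappa W_{t-1} + a_0 \bigr)
 &= \Bigl( \frac{1+\kappa}{2} - \kappa \Bigr) W_{t-1} - (1 + a_0) \\
 &= \frac{1-\kappa}{2}\, \bigl( W_{t-1} - C_1^{(0)} \bigr).
\end{align*}
```

The right-hand side is nonnegative whenever $$W_{t-1} \geq C_1^{(0)}$$, so the bound $$\kappa W_{t-1} + a_0$$ from (3.11) is then dominated by $$\eta^{-1} W_{t-1} - 1$$, which is (3.12). In the complementary case $$W_{t-1} \leq C_1^{(0)}$$, (3.11) gives $$\kappa C_1^{(0)} + a_0$$ directly.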

Proof of (i). Let $$W_0 = x > C_1^{(0)}$$. We denote by \[\widetilde{\mathbb{P}}_x\] and \[\widetilde{\mathbb{E}}_x\] the conditional distribution and expectation, respectively, given $$W_0 = x$$. It follows from (3.12) that

\begin{equation} \widetilde{\E}_x( W_1 ) \leq \eta^{-1} x - 1, \end{equation}

which implies that

(3.13)\begin{equation} x - \eta \widetilde{\E}_x ( W_1 ) \geq \eta. \end{equation}

Analogously, we conclude from (3.12) that

\begin{equation} \1( W_1>C_1^{(0)} ) \widetilde{\E}_x( W_2 \mid W_1 ) \leq \1( W_1>C_1^{(0)} ) (\eta^{-1} W_1 - 1), \end{equation}

which yields

\begin{equation} \1( W_1>C_1^{(0)} ) ( W_1 - \eta \widetilde{\E}_x( W_2\mid W_1 ) ) \geq \eta \1( W_1>C_1^{(0)} ). \end{equation}

Multiplying both sides by η and taking the expectation over $$W_1$$ under the condition that $$W_0 = x$$, we obtain

\begin{equation} \widetilde{\E}_x ( \1( W_1>C_1^{(0)} ) ( \eta W_1 - \eta^2 W_2 ) ) \geq \eta^2 \widetilde{\P}_x( W_1>C_1^{(0)} ).\nonumber \end{equation}

Proceeding in the same way we conclude that

(3.14)\begin{align} & \widetilde{\E}_x ( \1( W_1>C_1^{(0)},\ldots,W_k>C_1^{(0)} ) ( \eta^k W_k - \eta^{k+1} W_{k+1} ) ) \nonumber \\ & \qquad \geq \eta^{k+1} \widetilde{\P}_x( W_1>C_1^{(0)},\ldots,W_k>C_1^{(0)} ). \end{align}

Adding (3.13) and the inequalities (3.14) for all k ≥ 1, we obtain

\begin{align*} x\geq \sum_{k=0}^\infty \eta^{k+1} \widetilde{\P}_x( W_1>C_1^{(0)},\ldots,W_k>C_1^{(0)} ) = \sum_{k=0}^\infty \eta^{k+1} \widetilde{\P}_x ( \tau_1\geq k+1 ) \geq \widetilde{\E}_x ( \eta^{\tau_1} ), \end{align*}

as required.
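Part (i) can be sanity-checked on a toy example. The sketch below assumes a deterministic sequence $$W_t = \kappa W_{t-1} + a_0$$, which satisfies (3.11) with equality, and takes $$\tau_1$$ to be the first t ≥ 1 with $$W_t \leq C_1^{(0)}$$ (the reading suggested by the identity used in the last display). The deterministic dynamics, the parameter values and the function names `first_return_time` and `check_part_i` are illustrative assumptions, not the coupled process of the paper.

```python
def first_return_time(x: float, kappa: float, a0: float) -> int:
    """First t >= 1 with W_t <= C_1^(0), where W_0 = x and W_t = kappa * W_{t-1} + a0."""
    c1 = (2.0 * a0 + 2.0) / (1.0 - kappa)  # C_1^(0) as in Lemma 3.1
    w, t = x, 0
    while True:
        w = kappa * w + a0  # one step of the assumed deterministic recursion
        t += 1
        if w <= c1:
            return t


def check_part_i(x: float, kappa: float, a0: float) -> bool:
    """Check eta**tau_1 <= x for a starting value x > C_1^(0)."""
    eta = 2.0 / (1.0 + kappa)
    return eta ** first_return_time(x, kappa, a0) <= x


if __name__ == "__main__":
    kappa, a0 = 0.5, 1.0  # here C_1^(0) = 8 and eta = 4/3
    for x in (8.1, 20.0, 100.0, 1e6):
        print(x, first_return_time(x, kappa, a0), check_part_i(x, kappa, a0))
```

The loop terminates because the recursion converges to $$a_0/(1-\kappa)$$, which lies below $$C_1^{(0)}$$. The check covers only this degenerate case and does not replace the argument for random $$W_t$$.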

Proof of (ii). Here we have to take into account that $$\tau_{m+1}$$ is not a usual but a retarded return time. Recall that $$X_{\tau_m}$$ is $${\cal G}_{\tau _m - 1}$$-measurable. Since $$X_{\tau _m } \le C_1^{(0)}$$, from (i) we obtain

(3.15)\begin{align} & \widetilde{\E}( \eta^{\tau_{m+1}-\tau_m} \mid {\mathcal G}_{\tau_m-1} ) \nonumber \\ & \qquad = \eta^{D_n} \widetilde{\P}( W_{\tau_m+D_n} \leq C_1^{(0)} \mid {\mathcal G}_{\tau_m-1} ) \nonumber \\ & \qquad\jump + \eta^{D_n} \int_{(C_1^{(0)},\infty)} \widetilde{\E}( \eta^{\tau_{m+1}-(\tau_m+D_n)} \mid {\mathcal G}_{\tau_m-1}, W_{\tau_m+D_n}=x ) \widetilde{\P}^{W_{\tau_m+D_n}\mid {\mathcal G}_{\tau_m-1}}(\rd x) \nonumber \\ & \qquad \leq \eta^{D_n} ( 1 + \widetilde{\E}( W_{\tau_m+D_n} \mid {\mathcal G}_{\tau_m-1} ) ). \end{align}

Furthermore, since $$W_{\tau _m } \le C_1^{(0)}$$, from (3.11) we obtain

\begin{align*} \widetilde{\E}( W_{\tau_m+1} \mid {\mathcal G}_{\tau_m-1} ) &\leq \kappa W_{\tau_m} + a_0 \leq \kappa C_1^{(0)} + a_0, \\ \widetilde{\E}( W_{\tau_m+2} \mid {\mathcal G}_{\tau_m-1} ) & = \widetilde{\E}( \widetilde{\E}( W_{\tau_m+2} \mid {\mathcal G}_{\tau_m-1}, W_{\tau_m+1} ) \mid {\mathcal G}_{\tau_m-1} ) \\ & \leq \widetilde{\E}( \kappa W_{\tau_m+1} + a_0 \mid {\mathcal G}_{\tau_m-1}) \\ & \leq 2a_0 + \kappa (\kappa C_1^{(0)} + a_0 ), \end{align*}

and, eventually,

(3.16)\begin{equation} \widetilde{\E}( W_{\tau_m+k} \mid {\mathcal G}_{\tau_m-1} ) \leq \frac{a_0 + \kappa C_1^{(0)}}{1-\kappa} \quad \text{ for all } k\in\N. \end{equation}

Part (ii) now follows from (3.15) and (3.16).
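The bound (3.16) comes from iterating (3.11). A sketch of the induction, assuming $$0 \leq \kappa < 1$$ and $$W_{\tau_m} \leq C_1^{(0)}$$, and written with a slightly sharper intermediate constant than in the display for $$W_{\tau_m+2}$$, reads as follows.

```latex
\begin{align*}
\widetilde{\E}( W_{\tau_m+k} \mid {\mathcal G}_{\tau_m-1} )
 &\leq \kappa^k W_{\tau_m} + a_0 \sum_{j=0}^{k-1} \kappa^j \\
 &\leq \kappa\, C_1^{(0)} + \frac{a_0}{1-\kappa}
 \leq \frac{a_0 + \kappa\, C_1^{(0)}}{1-\kappa}
 \quad \text{for all } k\in\N.
\end{align*}
```

Inserting this with $$k = D_n$$ into the last line of (3.15) bounds the conditional expectation by $$\eta^{D_n} (1 + (a_0 + \kappa C_1^{(0)})/(1-\kappa))$$.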

Acknowledgements

This work has been developed within the MME-DII centre of excellence (ANR-11-LABEX-0023-01) and with the help of PAI-CONICYT MEC no. 80170072. The first author wishes to thank the University of Jena and Universidad de Valparaiso for their hospitality. The research of the second author was supported by a guest professorship of IEA at the University of Cergy-Pontoise. We thank two anonymous referees for their comments which helped improve the presentation of our results.

References

Adell, J. A. and Jodrá, P. (2006). Exact Kolmogorov and total variation distances between some familiar discrete distributions. J. Inequal. Appl. 2006, 64307, 8pp.
Baraud, Y., Comte, F. and Viennet, G. (2001). Adaptive estimation in autoregression or β-mixing regression via model selection. Ann. Statist. 29 (3), 839–875.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. J. Econometrics 31, 307–327.
Boussama, F. (1998). Ergodicité, mélangeances et estimation des modèles GARCH. Doctoral Thesis, University Paris 7.
Bradley, R. C. (2007). Introduction to Strong Mixing Conditions, Vol. I. Kendrick Press, Heber City.
Carrasco, M. and Chen, X. (2002). Mixing and moment properties of various GARCH and stochastic volatility models. Econometric Theory 18, 17–39.
Csörgő, M. and Horváth, L. (1997). Limit Theorems in Change-Point Analysis. Wiley, Chichester.
Dedecker, J., Doukhan, P., Lang, G., León, J. R., Louhichi, S. and Prieur, C. (2007). Weak Dependence: With Examples and Applications (Lecture Notes Statist. 190). Springer, New York.
Den Hollander, F. (2012). Probability theory: the coupling method. Lecture Notes, University of Leiden.
Douc, R., Doukhan, P. and Moulines, E. (2013). Ergodicity of observation-driven time series models and consistency of the maximum likelihood estimator. Stoch. Process. Appl. 123, 2620–2647.
Doukhan, P. (1994). Mixing: Properties and Examples (Lecture Notes Statist. 84). Springer, Berlin.
Doukhan, P., Massart, P. and Rio, E. (1994). The functional central limit theorem for strongly mixing processes. Ann. l’IHP Probabilités et Statistiques 30 (2), 62–82.
Doukhan, P., Massart, P. and Rio, E. (1995). Invariance principles for absolutely regular empirical processes. Ann. Inst. H. Poincaré Prob. Statist. 31, 393–427.
Durrett, R. (1991). Probability: Theory and Examples. Wadsworth, Pacific Grove.
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50, 987–1007.
Fokianos, K. (2012). Count time series models. In Time Series: Methods and Applications (Handbook Statist. 30), eds Subba Rao, T., Subba Rao, S. and Rao, C. R., Elsevier, Amsterdam, pp. 315–347.
Fokianos, K. and Tjøstheim, D. (2011). Log-linear Poisson autoregression. J. Multivariate Anal. 102, 563–578.
Fokianos, K., Rahbek, A. and Tjøstheim, D. (2009). Poisson autoregression. J. Amer. Statist. Assoc. 104, 1430–1439.
Francq, C. and Zakoïan, J.-M. (2006). Mixing properties of a general class of GARCH(1,1) models without moment assumptions on the observed process. Econometric Theory 22, 815–834.
Francq, C. and Zakoïan, J.-M. (2010). GARCH Models: Structure, Statistical Inference and Financial Applications. John Wiley, Chichester.
Franke, J. (2010). Weak dependence of functional INGARCH processes. Unpublished manuscript.
Glosten, L. R., Jagannathan, R. and Runkle, D. E. (1993). On the relation between the expected value and the volatility of the nominal excess return on stocks. J. Finance 48, 1779–1801.
Lambert, D. (1992). Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34, 1–14.
Leucht, A. and Neumann, M. H. (2013). Degenerate U- and V-statistics under ergodicity: asymptotics, bootstrap and applications in statistics. Ann. Inst. Statist. Math. 65, 349–386.
Leucht, A., Kreiss, J.-P. and Neumann, M. H. (2015). A model specification test for GARCH(1,1) processes. Scand. J. Stat. 42, 1167–1193.
Lindner, A. M. (2009). Stationarity, mixing, distributional properties and moments of GARCH(p,q)-processes. In Handbook of Financial Time Series, eds Mikosch, T. et al., Springer, Berlin, pp. 43–69.
Neumann, M. H. (2011). Absolute regularity and ergodicity of Poisson count processes. Bernoulli 17, 1268–1284.
Nummelin, E. and Tuominen, P. (1982). Geometric ergodicity of Harris recurrent Markov chain with applications to renewal theory. Z. Wahrscheinlichkeitsth. 12, 187–202.
Truquet, L. (2019). Local stationarity and time-inhomogeneous Markov chains. To appear in Ann. Statist.
Tweedie, R. L. (1988). Invariant measures for Markov chains with no irreducibility assumptions. J. Appl. Prob. 25, 275–285.
Viennet, G. (1997). Inequalities for absolutely regular sequences: application to density estimation. Prob. Theory Relat. Fields 107, 467–492.
Woodard, D. B., Matteson, D. S. and Henderson, S. G. (2011). Stationarity of generalized autoregressive moving average models. Electron. J. Statist. 5, 800–828.