MIXING PROPERTIES OF A GENERAL CLASS OF GARCH(1,1) MODELS WITHOUT MOMENT ASSUMPTIONS ON THE OBSERVED PROCESS

Christian Francq; Jean-Michel Zakoïan

doi:10.1017/S0266466606060373


Published online by Cambridge University Press: 30 August 2006


Christian Francq: Affiliation:
Université Lille 3, GREMARS
Jean-Michel Zakoïan: Affiliation:
Université Lille 3, GREMARS and CREST


Abstract

We consider general, and possibly nonparametric, GARCH(1,1) processes. First we give conditions for the existence and the uniqueness of stationary ergodic solutions. Then we identify additional conditions for geometric ergodicity. These conditions consist of mild restrictions on the distribution of the latent independent process. No moment assumption is made on the generalized autoregressive conditionally heteroskedastic (GARCH) process. Applications to the asymptotic behavior of sample autocorrelations and to unit-root tests are proposed.

This work was supported by INTAS (research project 03-51-3714). The authors gratefully acknowledge the quick and careful reading of the manuscript by Bruce Hansen and three referees. Their detailed comments led to a greatly improved presentation.

Type: Research Article
Information: Econometric Theory, Volume 22, Issue 5, October 2006, pp. 815–834

DOI: https://doi.org/10.1017/S0266466606060373
Copyright: © 2006 Cambridge University Press

1. INTRODUCTION

With the increasing popularity of generalized autoregressive conditional heteroskedasticity (GARCH) modeling, there is also increased interest in general, even nonparametric, models and in moving away from the particular specification of the classical GARCH models, as introduced by Engle (1982) and Bollerslev (1986). In this paper, we assume that (ε_t) belongs to the general class of GARCH(1,1) processes, defined by

$$\varepsilon_t = \sigma_t \eta_t, \qquad h_t = \omega(\eta_{t-1}) + a(\eta_{t-1})\,h_{t-1}, \qquad h_t := h(\sigma_t), \tag{1}$$

where the sequence (η_t) is independent and identically distributed (i.i.d.). For statistical purposes the assumption that η_t has zero mean and unit variance is often required, but we do not need this assumption in this paper. The following assumptions are made on the functions ω, a, and h.

(i) ω is such that its restrictions to the positive and negative real half-lines are either constant and strictly positive, or continuous and, respectively, strictly increasing and strictly decreasing;

(ii) a is such that its restrictions to the positive and negative real half-lines are continuous and, respectively, strictly increasing and strictly decreasing;

(iii) h is 1-1, onto and increasing.

The standard GARCH(1,1) model is obtained for ω(x) = ω, h(x) = x², and a(x) = αx² + β with α ≥ 0,β ≥ 0. This specification has been found adequate for a number of financial series and is arguably the most popular volatility model. When h has a form inspired by the Box–Cox transformation, and for some particular specifications of the functions ω and a, we get the augmented GARCH introduced by Duan (1997). For h(x) = x^δ we get the class of GARCH(1,1) models defined by He and Teräsvirta (1999), which includes a variety of other first-order specifications.¹

¹ Namely, the absolute value GARCH model (Taylor, 1986; Schwert, 1989) for ω(x) = ω, h(x) = x, and a(x) = α|x| + β; the threshold GARCH model of Zakoïan (1994) for ω(x) = ω, h(x) = x, and a(x) = α₋ max(0,−x) + α₊ max(0,x) + β; the Glosten, Jagannathan, and Runkle (1993) model for ω(x) = ω, h(x) = x², and a(x) = α₋{max(0,−x)}² + α₊{max(0,x)}² + β; the asymmetric power GARCH model of Ding et al. (1993) for ω(x) = ω, h(x) = x^δ, and a(x) = α(|x| − γx)^δ + β; a moving average GARCH process, inspired by the moving average conditionally heteroskedastic (MACH) model of Yang and Bewley (1995), for h(x) = x², a(x) = a, and, for instance, ω(x) = ω₁ + ω₂ x²; the sign-switching autoregressive conditional heteroskedasticity (ARCH) model of Fornari and Mele (1997) for h(x) = x², ω(x) = ω₁ + ω₂ sign(x), and a(x) = αx² + β. See also Bühlmann and McNeil (2000) and Yang and Tschernig (2006) for recent references on nonparametric GARCH(1,1) modeling.

See Ling and McAleer (2002) for strict stationarity and moment conditions for such models. Note that, under the preceding assumptions, the volatility σ_t increases with the magnitude of positive “shocks” η_t−1 and also with the magnitude of negative ones. Positive and negative shocks may nevertheless have different impacts on the volatility.
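
The recursion behind this general class can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate_general_garch11`, its arguments, the burn-in length, and all numerical parameter values are hypothetical choices, not taken from the paper. The user supplies ω, a, the inverse of h, and a sampler for η_t.

```python
import math
import random

def simulate_general_garch11(n, omega, a, h_inv, draw_eta,
                             h_start=1.0, burn_in=500, seed=0):
    """Simulate eps_t = sigma_t * eta_t, where h_t = h(sigma_t) follows
    h_t = omega(eta_{t-1}) + a(eta_{t-1}) * h_{t-1}."""
    rng = random.Random(seed)
    h_prev = h_start
    eta_prev = draw_eta(rng)
    eps = []
    for t in range(n + burn_in):
        h_t = omega(eta_prev) + a(eta_prev) * h_prev   # volatility recursion
        eta_t = draw_eta(rng)                          # fresh i.i.d. innovation
        if t >= burn_in:
            eps.append(h_inv(h_t) * eta_t)             # sigma_t = h^{-1}(h_t)
        h_prev, eta_prev = h_t, eta_t
    return eps

def gaussian(rng):
    return rng.gauss(0.0, 1.0)

# Standard GARCH(1,1): omega(x) = w, h(x) = x^2, a(x) = alpha * x^2 + beta.
standard = simulate_general_garch11(
    1000, omega=lambda x: 0.1, a=lambda x: 0.1 * x * x + 0.85,
    h_inv=math.sqrt, draw_eta=gaussian)

# GJR-type specification: different weights on negative and positive shocks.
gjr = simulate_general_garch11(
    1000, omega=lambda x: 0.1,
    a=lambda x: 0.15 * max(0.0, -x) ** 2 + 0.05 * max(0.0, x) ** 2 + 0.8,
    h_inv=math.sqrt, draw_eta=gaussian)
```

Passing the three functions as arguments mirrors the way the model is stated: switching from one specification to another only changes `omega`, `a`, and `h_inv`.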

Nelson (1990) and Bougerol and Picard (1992) showed in the standard GARCH case that if Eη_t = 0 and Var(η_t) = 1,

$$E \log\{a(\eta_t)\} < 0 \tag{2}$$

is a necessary and sufficient condition for the existence of a unique strictly stationary and nonanticipative solution to model (1). A nonanticipative solution is a process (ε_t) such that ε_t is a measurable function of the variables η_t−s, s ≥ 0. The extension to model (1) will be given subsequently. For statistical inference, however, strict stationarity is not a sufficient assumption, and it can be crucial to know when the stationary solution possesses mixing properties. Knowing that these properties hold may make it possible or easier to establish other properties such as central limit theorems (CLTs).
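
The strict stationarity condition can be checked numerically. The following sketch estimates E log a(η_t) by Monte Carlo for a Gaussian ARCH(1), that is, a(x) = αx². The helper name `lyapunov_exponent`, the sample size, and the two values of α are assumptions made for illustration. For Gaussian η_t the exact value is log α − γ_E − log 2, where γ_E is the Euler constant (a standard property of the normal distribution, not a result of the paper), so the condition can hold even when α is well above 1.

```python
import math
import random

def lyapunov_exponent(a, draw_eta, n_draws=200_000, seed=0):
    """Monte Carlo estimate of E log a(eta_t)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        value = a(draw_eta(rng))
        if value <= 0.0:
            # log a(eta) = -infinity on an event that occurred in the sample
            return float("-inf")
        total += math.log(value)
    return total / n_draws

def gaussian(rng):
    return rng.gauss(0.0, 1.0)

# ARCH(1) with alpha = 3: exact value log(3) - 0.5772... - log(2) is negative.
arch_3 = lyapunov_exponent(lambda x: 3.0 * x * x, gaussian)
# ARCH(1) with alpha = 4: exact value log(4) - 0.5772... - log(2) is positive.
arch_4 = lyapunov_exponent(lambda x: 4.0 * x * x, gaussian)
```

With α = 3 the estimate is negative, so a strictly stationary solution exists although the second moment is infinite; with α = 4 the estimate is positive and no strictly stationary solution exists.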

Mixing properties of classes of models including GARCH-type processes have been investigated by Ango Nze (1992, 1998), Lu (1996), Carrasco and Chen (2002), Rahbek, Hansen, and Dennis (2002), Lee and Shin (2004), Hwang and Kim (2004), and Meitz and Saikkonen (2004), among others. Unfortunately, when applied to standard GARCH processes, their results require moment assumptions that are much stronger than the strict stationarity assumption. Typically the condition α + β < 1 is imposed for the standard GARCH(1,1), which amounts to restricting the class of strictly stationary solutions to those admitting a second-order moment. To our knowledge the most significant contribution, specifically devoted to the standard GARCH(p,q), is the dissertation by Boussama (1998), which establishes strong mixing under conditions we will further discuss. However, the proof relies on heavy geometric algebra based upon the Mokkadem (1990) result for polynomial autoregressive processes. See also Kristensen (2006).

Our main contribution is to show that under (2), the β-mixing of the strictly stationary solution holds without any additional restriction on the function a(·). In particular we do not make any moment assumptions on the process (ε_t). We provide simple sufficient conditions on the process η_t under which the strictly stationary solution to model (1) is β-mixing with exponential decay. We do not impose a continuous distribution for η_t, contrary to the preceding references dealing with mixing. This may have interest for financial applications because prices, and hence returns, are not observed continuously but are multiples of a monetary unit called the tick. A continuous distribution for the i.i.d. process would typically imply a continuous distribution for ε_t. On the other hand, dealing with the mixing properties of discrete-valued time-series models is in general a difficult task. For these reasons, and for the sake of generality, we will allow for both a discrete and a continuous part in the distribution of η_t. We rely on the results displayed in the book by Meyn and Tweedie (1996) so that the proof can be easily followed.

The fact that requirements for the existence of second-order moments can be ignored is particularly important for the statistical inference of GARCH(1,1) models. Indeed, recent references establish asymptotic normality of the maximum likelihood estimator essentially under the assumption (2) using the martingale theory.²

² See, e.g., Lee and Hansen (1994), Lumsdaine (1996), Berkes, Horváth, and Kokoszka (2003), and Francq and Zakoïan (2004). See also Jensen and Rahbek (2004) for an extension to the nonstationary case.

In this framework, the asymptotic distribution of other statistics of interest (such as the autocorrelations of functions of ε_t) may be difficult, if not impossible, to derive using limit theorems for martingale differences. In such situations, the β-mixing property will be of invaluable help.

In the next section we give, for the reader's convenience, the Markov chain results we need. Section 3 is devoted to strict stationarity. In Section 4 we establish geometric ergodicity of the strictly stationary solution. Two statistical applications are proposed in Section 5.

2. BASIC MARKOV CHAIN THEORY

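This section is drawn from the papers by Tjøstheim (1990) and Basrak, Davis, and Mikosch (2002) and the book by Meyn and Tweedie (1996). All the random variables considered in this paper are defined on some probability space (Ω, 𝒜, P). Let {X_t,t ≥ 0} be a homogeneous Markov chain on (E, ℰ), where ℰ is the Borel σ-field on E. We denote the probability of moving from x to the set B in t steps by

$$P^t(x,B) = P(X_{s+t} \in B \mid X_s = x).$$

The Markov chain (X_t) is φ-irreducible if, for some nontrivial measure φ on (E, ℰ),

$$\varphi(B) > 0 \;\Longrightarrow\; \sum_{t>0} P^t(x,B) > 0 \quad \text{for all } x \in E \text{ and all } B \in \mathcal{E}.$$

If (X_t) is φ-irreducible, there exists a maximal irreducibility measure M (see Meyn and Tweedie, 1996, Prop. 4.2.2), and we set ℰ⁺ = {B ∈ ℰ : M(B) > 0}. We call the chain positive recurrent if

$$\limsup_{t\to\infty} P^t(x,B) > 0 \quad \text{for all } x \in E \text{ and all } B \in \mathcal{E}^{+}.$$

For a φ-irreducible Markov chain, positive recurrence is equivalent (see Meyn and Tweedie, 1996, Thm. 18.2.2) to the existence of a (unique) invariant probability measure, that is, a probability π such that

$$\pi(B) = \int P^1(x,B)\,\pi(dx) \quad \text{for all } B \in \mathcal{E}.$$

Let ∥·∥ denote the total variation norm. The Markov chain (X_t) is said to be geometrically ergodic if there exists a ρ, ρ ∈ (0,1), such that

$$\rho^{-t}\,\|P^t(x,\cdot) - \pi\| \to 0 \quad \text{as } t \to \infty, \text{ for all } x \in E. \tag{3}$$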

Recall that for a stationary process, the strong (α-) mixing coefficients are defined by

where the first supremum is taken over the set of measurable functions f and g such that | f | ≤ 1, |g| ≤ 1, and the second supremum is taken over the sets A ∈ σ(X_s,s ≤ 0) and B ∈ σ(X_s,s ≥ k), whereas the β-mixing coefficients are defined by

$$\beta_X(k) = E \sup_{B \in \sigma(X_s,\, s \ge k)} \bigl| P\{B \mid \sigma(X_s, s \le 0)\} - P(B) \bigr| = \frac{1}{2} \sup \sum_{i=1}^{I} \sum_{j=1}^{J} \bigl| P(A_i \cap B_j) - P(A_i)P(B_j) \bigr|,$$

where in the last equality the sup is taken over all pairs of partitions {A₁,…, A_I} and {B₁,…, B_J} of Ω such that A_i ∈ σ(X_s,s ≤ 0) for each i and B_j ∈ σ(X_s,s ≥ k) for each j. The process is called α-mixing (resp. β-mixing) if lim_k→∞ α_X(k) = 0 (resp. lim_k→∞ β_X(k) = 0). We have α_X(k) ≤ β_X(k), so that β-mixing implies α-mixing. If Y = (Y_t) is a process such that Y_t = f (X_t,…, X_t−r) for some measurable function f and some integer r ≥ 0, then σ(Y_t, t ≤ s) ⊂ σ(X_t, t ≤ s) and σ(Y_t, t ≥ s) ⊂ σ(X_t−r, t ≥ s). Thus

$$\alpha_Y(k) \le \alpha_X(k-r) \quad \text{and} \quad \beta_Y(k) \le \beta_X(k-r) \quad \text{for all } k \ge r. \tag{8}$$

Note that for a stationary Markov process we have α_X(k) = sup_f,g|Cov(f (X₀),g(X_k))|, where f and g are as in the previous definition (see Bradley, 1986). One consequence of the geometric ergodicity is that the Markov chain (X_t) is β-mixing, and hence strongly mixing, with geometric rate. Indeed, Davydov (1973) showed that for an ergodic Markov chain (X_t) with invariant probability measure π,

$$\beta_X(k) = \int \|P^k(x,\cdot) - \pi\|\,\pi(dx).$$

Thus β_X(k) = O(ρ^k) if (3) holds.
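
Geometric ergodicity is easiest to visualize on a finite state space. The sketch below is a toy example that is not part of the paper: a two-state chain with a hypothetical transition matrix `P`, for which the total variation distance to the invariant law (taken here as the sum of absolute differences) decays exactly like the second eigenvalue raised to the power t.

```python
def step(dist, P):
    """One transition: returns the row vector dist @ P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def tv_norm(mu, nu):
    """Total variation norm of the signed measure mu - nu (sum of absolute masses)."""
    return sum(abs(m - v) for m, v in zip(mu, nu))

P = [[0.7, 0.3],
     [0.2, 0.8]]
pi = [0.4, 0.6]            # invariant law: pi = pi P

dist = [1.0, 0.0]          # chain started at state 0
distances = []
for t in range(20):
    distances.append(tv_norm(dist, pi))   # || P^t(0, .) - pi ||
    dist = step(dist, P)

# Second eigenvalue of P is 0.7 + 0.8 - 1 = 0.5, so distances[t] = 1.2 * 0.5**t,
# and rho**(-t) * distances[t] tends to 0 for any rho in (0.5, 1).
```

For the GARCH volatility chain the state space is uncountable and no such closed form is available, which is why the drift criterion of the next theorem is used instead.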

To state the following criterion for the geometric ergodicity of a Markov chain, we need the idea of a Feller chain. We call (X_t) a Feller Markov chain (or weak Feller chain) if the function

$$x \mapsto E\{g(X_t) \mid X_{t-1} = x\}$$

is continuous for every bounded and continuous function g on E.

THEOREM 1 (Feigin and Tweedie, 1985, Thm. 1). Assume that

(i) (X_t) is a Feller Markov chain;

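(ii) (X_t) is φ-irreducible for some measure φ on the Borel σ-field of E;

(iii) there exists a compact set C ⊂ E such that φ(C) > 0 and a nonnegative continuous function (test function) V : E → ℝ such that

$$V(x) \ge 1 \quad \text{for all } x \in C, \tag{9}$$

and for some c > 0

$$E\{V(X_t) \mid X_{t-1} = x\} \le (1-c)\,V(x) \quad \text{for all } x \notin C. \tag{10}$$

Then (X_t) is geometrically ergodic.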

3. STRICT STATIONARITY

The assumption that the sequence (η_t) is i.i.d. with E log⁺{a(η_t)} < ∞, where log⁺x = max(log x,0) for x ≥ 0, is maintained throughout. Our first result can be deduced from results established by Bougerol and Picard (1992) for generalized autoregressive vector equations. But we prefer to give a simple self-contained proof of this theorem.

THEOREM 2. If (2) holds, then the series

$$h_t = \omega(\eta_{t-1}) + \sum_{i=1}^{\infty} a(\eta_{t-1}) \cdots a(\eta_{t-i})\,\omega(\eta_{t-i-1})$$

converges almost surely (a.s.), and the process (ε_t), defined by ε_t = h⁻¹(h_t)η_t, is a strictly stationary solution of (1). This solution is unique, nonanticipative, and ergodic.

If (2) does not hold and P[η_t = 0] ≠ 1, there exists no strictly stationary solution to model (1).

Proof. First note that γ =: E log a(η_t) exists in [−∞,+∞). Now let

$$h_t(N) = \omega(\eta_{t-1}) + \sum_{i=1}^{N} a(\eta_{t-1}) \cdots a(\eta_{t-i})\,\omega(\eta_{t-i-1}), \qquad h_t = \lim_{N\to\infty} h_t(N),$$

the limit being well defined in [0,+∞] in view of the positivity of the summands. Because h_t(N) = ω(η_t−1) + a(η_t−1)h_t−1(N − 1) for all N, we have, letting N go to infinity, h_t = ω(η_t−1) + a(η_t−1)h_t−1. It remains to show that h_t is a.s. finite. Let

for some constant τ > 0 and

. Let h_t*(N) be obtained by replacing ω by ω* in h_t(N) and denote by h_t* its a.s. limit. We have

as n → ∞, by the strong law of large numbers applied to the i.i.d. sequence (log{a(η_t)}). It follows from the Cauchy rule that for any t, the sequence {h_t*(N),N ≥ 1} converges a.s. in

. Because h_t ≤ h_t* we thus have h_t < ∞ a.s. As a function of an i.i.d. sequence, the limit h_t is thus strictly stationary and ergodic, in which case so is ε_t.
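
The truncation argument can be illustrated numerically. The sketch below computes the partial sums h_t(N) along one simulated path of innovations for a standard GARCH(1,1) with hypothetical parameter values (ω = 0.1, α = 0.3, β = 0.2, Gaussian η_t). The function name and the list convention `etas[j]` = η_{t−1−j} are assumptions made for the example.

```python
import random

def truncated_series(etas, omega, a, N):
    """h_t(N) = omega(eta_{t-1})
                + sum_{i=1}^{N} a(eta_{t-1}) ... a(eta_{t-i}) * omega(eta_{t-i-1}),
    with etas[j] standing for eta_{t-1-j}."""
    total = omega(etas[0])
    product = 1.0
    for i in range(1, N + 1):
        product *= a(etas[i - 1])          # a(eta_{t-1}) * ... * a(eta_{t-i})
        total += product * omega(etas[i])  # times omega(eta_{t-i-1})
    return total

rng = random.Random(0)
etas = [rng.gauss(0.0, 1.0) for _ in range(401)]
omega = lambda x: 0.1
a = lambda x: 0.3 * x * x + 0.2
partial = [truncated_series(etas, omega, a, N) for N in (0, 1, 5, 25, 100, 400)]
```

Because every summand is nonnegative, `partial` is nondecreasing in N, and under (2) the products of the a(η) terms shrink geometrically, so the last entries are numerically indistinguishable.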

To prove uniqueness, let

be another strictly stationary solution process of (1). Suppose

for some t. Iterating the second equation in (1) we have

. From the strong law of large numbers and (2), we have a(η_t−1) … a(η_t−n) → 0 with probability 1 as n → ∞. Thus

, which entails

with nonzero probability. This is not possible because the sequences

are stationary. Therefore

for any t, a.s.

To prove the necessary part, suppose there exists a strictly stationary solution (h_t) of (1). We have for n > 0,

from which we deduce that a(η₋₁) … a(η_−n)ω(η_−n−1) converges to zero, a.s., when n → ∞, or, equivalently, that

First suppose E log{a(η_t)} > 0. Then by the strong law of large numbers,

, and it is necessary for (14) to hold that log ω(η_−n−1) → −∞ a.s. This convergence implies ω(η₀) = 0 a.s., which is precluded because η_t is not identically equal to zero. Now suppose E log{a(η_t)} = 0. By the Chung–Fuchs theorem (see, e.g., Chow and Teicher, 1997) we have

with probability 1 and, using the elementary Lemma 1, which follows, the convergence (14) entails log ω(η_−n−1) → −∞ in probability. Thus, we are led to a contradiction as in the previous case. Thus, the assumption that a strictly stationary solution exists when E log{a(η_t)} ≥ 0 entails a contradiction. █

LEMMA 1. If (X_n) and (Y_n) are two independent sequences of random variables such that X_n + Y_n → −∞ and X_n ↛ −∞ in probability, then Y_n → −∞ in probability.

Remark 1. It can be seen from the proof that a solution (h_t), as given by (12), always exists in

but that when (2) does not hold and

, this solution satisfies

. See Klüppelberg, Lindner, and Maller (2004) for more detailed results in the standard GARCH(1,1) case.

4. GEOMETRIC ERGODICITY

To prove geometric ergodicity we require additional assumptions on the i.i.d. process (η_t), essentially to ensure that the transition kernel has a Lebesgue component.

Assumption A. The distribution

of the variable η_t is a mixture of an absolutely continuous component with respect to the Lebesgue measure λ on

and Dirac masses at some points

. With standard notation we then have

where f is a density of the continuous component. Let η₊⁰ = inf{η|η > 0,f (η) > 0} and η₋⁰ = sup{η|η < 0,f (η) > 0}, when these sets are nonempty, and assume that

for some τ > 0. By convention (η₋⁰ − τ,η₋⁰) = Ø (resp. (η₊⁰, η₊⁰ + τ) = Ø) when η₋⁰ (resp. η₊⁰) is not defined. Finally, E {ω(η_t)^r} < ∞ and E {a(η_t)^r} < ∞ for some r > 0.

Remark 2. The standard case where

is absolutely continuous with respect to the Lebesgue measure is obtained by taking p = 0. Note however that the case p = 1, that is, when the law of η_t has no continuous component, is excluded. In such a case, criteria based on topological properties of the chain fail to prove ergodic properties (see a similar example in Meyn and Tweedie, 1996, p. 127). This does not mean that the process is not geometrically ergodic in those situations: for example, in the standard GARCH(1,1) case, if η_t² = 1 with probability 1, then the strictly stationary solution process is an independent white noise, which is obviously geometrically ergodic.

The main result of this paper is as follows.

THEOREM 3. Under Assumption A and if the strict stationarity condition (2) holds, then the strictly stationary and nonanticipative solution (ε_t) of the GARCH(1,1) model (1) is β-mixing with exponential decay.

Remark 3. The proof of this theorem relies on showing that (2) entails the geometric ergodicity of (h_t). Moreover, geometric ergodicity implies strict stationarity. Under Assumption A, which entails

, condition (2) is therefore necessary and sufficient for the existence of a geometrically ergodic solution (h_t) and also for the existence of a strictly stationary and geometrically β-mixing solution (ε_t).

Remark 4. To our knowledge, existing results on mixing conditions for nonstandard GARCH(1,1) processes (see references in the introduction) are demanding in terms of moment assumptions. For instance, in Carrasco and Chen (2002), the mixing properties are obtained for various GARCH(1,1) models under moment conditions on the process (ε_t) (see their Table 1). By contrast, we find that the strictly stationary solution is β-mixing without any moment restriction.

Remark 5. When applied to standard GARCH(1,1) models, this theorem is also more general than those already established. In Boussama (1998), the geometric ergodicity of standard GARCH models is proved under the assumption that η_t has an absolutely continuous distribution with respect to the Lebesgue measure (i.e., p = 0 in our framework), with a positive density in a neighborhood of zero. In this case Assumption A holds with η₋⁰ = η₊⁰ = 0. Note however that our assumption allows us to handle more general cases, where the density is null on a neighborhood of zero or where the distribution of η_t does not admit a density with respect to the Lebesgue measure.

Before proving the theorem, we start by establishing geometric ergodicity of (h_t).

LEMMA 2. Under the assumptions of Theorem 3, the strictly stationary and nonanticipative solution (h_t) of model (1) is geometrically ergodic.

Proof. By the second equation in model (1), (h_t) is obviously a homogeneous Markov chain on

. The proof consists in checking the three conditions of Theorem 1.

Step (i): Feller property. For any bounded and continuous function g on

, the function

is continuous in x over

, by the Lebesgue dominated convergence theorem, which shows that the Markov chain (h_t) is Feller.

Step (ii): Irreducibility. Let τ′ ∈ (0,τ) be small enough so that the set D_τ′ =: (η₋⁰ − τ′,η₋⁰) ∪ (η₊⁰, η₊⁰ + τ′) does not contain any mass μ_i.

(a) First assume that

. Note that {a(x) = 0} ⊂ {x = 0} in view of the assumptions made on the function a. Set H(x,y) = ω(y) + a(y)ω(x) and remark that H is strictly increasing in |y| for fixed x. Let

and denote by λ_I the restriction of the Lebesgue measure to the (nonempty) set I. For

, we have

We have, for any Borel set B,

Note that, conditional on the event {η₁ ∈ D_τ′}, the variable H(0,η₁) admits a density, f_H, that is positive over I. Thus

which proves that the Markov chain (h_t) is λ_I-irreducible.

(b) Now suppose

. For ease of presentation we shall assume that

The case where the distribution of η is absolutely continuous with respect to λ, that is, p = 0 in (15), can be handled by a straightforward adaptation of what follows.

We have, in view of (15),

where

By convention take a(η₋⁰) = 1 if {η|η < 0,f (η) > 0} = Ø and a(η₊⁰) = 1 if {η|η > 0,f (η) > 0} = Ø. Inequality (17) follows from the fact that a is monotonic over the positive and negative real half-lines. It follows that, under (2),

Hence there exist some integers n₀, m₀, and n_i, for i = 1,…, N, such that

By continuity of a, it is not restrictive to assume that τ′ is small enough so that

Now let

. We have, for all t > 0,

The inequalities (18) and (19) will allow us to control the products of this sum, provided that we constrain the i.i.d. process to visit some states with appropriate frequencies. To this aim we introduce, for K = 1,2,…, the event

where

. When η₊⁰ or η₋⁰ is not defined, the corresponding terms can be withdrawn from the definition of A_K. Thus we have

Denote

, for k = 0,…, K − 1. The conditional distribution given A_K of the vector

has a density with respect to the Lebesgue measure on

Now we wish to show that

First suppose that the function ω is constant over

. For any integer ℓ and any vector u = (u₁,…,u_ℓ) denote u^← = (u₁,…,u_{ℓ−1}) and u⁺ = u_ℓ. By convention let

for n < k. Let

. We have, given A_K,

where

is a constant and S(Y_K^←) is an a.s. positive random variable. In view of (16) it follows that given A_K the mapping

is a C¹ diffeomorphism between open sets of

. Indeed, the determinant of the Jacobian matrix of this mapping is given by

S(Y_K^←) which is a.s. positive. Therefore the distribution of Z_K conditional on A_K has a density with respect to the Lebesgue measure on

. Consequently (22) holds.

Now suppose that ω is nonconstant over

. Let

and let

. We have

where

is a constant and

is a random variable. The conclusion follows from the same argument as before, noting that the mapping

is a C¹ diffeomorphism between open sets of

. The case where ω is nonconstant over

can be handled similarly.

To determine the support I_K of the conditional distribution of h_Kn, first note that, for t = Kn, the last term on the right-hand side of (20) can be written, conditional on A_K, as

and therefore belongs to the set [ρ^Kx,ρ₁^Kx], in view of the assumptions made on the function a. Products of the form a(η_Kn−1) … a(η_Kn−kn), for k = 1,…,K − 1, can be handled similarly. To deal with the other products in (20) we introduce the notation

If A_K holds true, using the assumptions (i) and (ii) on the functions a and ω, we have

and thus, for K sufficiently large, in view of (18) and (19),

Indeed, I is the closure of the limit of the sets I_K when K tends to infinity. Because the lower and upper bounds of I_K are reached, by the intermediate value theorem and in view of (22), we conclude that

Denoting by λ_I the restriction of the Lebesgue measure to the set I, it follows that λ_I is an irreducibility measure because, for any set

by (21) and (23).

Step (iii): First note that the assumptions E log a(η_t) < 0 and E {a(η_t)^r} < ∞ for r > 0 imply the existence of a number s ∈ (0,1) such that

$$\rho_2 =: E\{a(\eta_t)^s\} < 1$$

(see Nelson, 1990; Berkes, Horváth, and Kokoszka, 2003, Lem. 2.3).
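
The existence of such an exponent s, and the resulting drift inequality for V(x) = 1 + x^s, can be checked by hand in a simple case. The sketch below assumes a Gaussian ARCH(1), a(x) = αx² with constant ω, for which E{a(η_t)^s} = (2α)^s Γ(s + 1/2)/√π is a standard closed form for normal absolute moments. The function names, the choice α = 1.5, and the choice of c halfway into its admissible range are illustrative assumptions. The threshold is derived directly from the bound E(ω + a(η_t)x)^s ≤ ω^s + E{a(η_t)^s}x^s and is only one possible way of choosing the compact set.

```python
import math

def moment_a(s, alpha):
    """E{a(eta)^s} for a(x) = alpha * x^2 and eta ~ N(0, 1):
    (2 * alpha)^s * Gamma(s + 1/2) / sqrt(pi)."""
    return (2.0 * alpha) ** s * math.gamma(s + 0.5) / math.sqrt(math.pi)

def find_s(alpha, grid_size=1000):
    """Grid point s in (0, 1) minimizing E{a(eta)^s}, together with the minimum."""
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    s_best = min(grid, key=lambda s: moment_a(s, alpha))
    return s_best, moment_a(s_best, alpha)

def drift_threshold(s, rho2, omega, c):
    """Smallest x such that 1 + omega^s + rho2 * x^s <= (1 - c) * (1 + x^s)."""
    return ((omega ** s + c) / (1.0 - c - rho2)) ** (1.0 / s)

alpha, omega = 1.5, 0.1           # alpha > 1: infinite variance, yet (2) holds
s, rho2 = find_s(alpha)           # rho2 = E{a(eta)^s} < 1 for this s
c = 0.5 * (1.0 - rho2)            # any c in (0, 1 - rho2) would do
x_star = drift_threshold(s, rho2, omega, c)
```

Outside the interval [0, `x_star`] the bound on E{V(h_t) | h_t−1 = x} is at most (1 − c)V(x), which is the shape of condition required by the drift criterion.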

Let the test function be defined by V(x) = 1 + x^s, let 0 < c < 1 − ρ₂, and define the compact set

where ω_s = E {ω(η_t−1)^s} and s is chosen small enough so that ω_s < ∞. We have, for x ∉ C, using the elementary inequality (a + b)^s ≤ a^s + b^s for a,b ≥ 0 and s ∈ [0,1],

which proves (10). Moreover (9) holds true.

It remains to check that φ(C) > 0 where φ = λ_I is the irreducibility measure obtained previously. Given the shape of the intervals I and C, it is clear that

If c is chosen close enough to 1 − ρ₂ the latter inequality will be verified. For such a c, the compact set C meets the assumptions of Theorem 1. It follows that the Markov chain (h_t) is geometrically ergodic. █

Proof of Theorem 3. We will show that the process (ε_t) inherits the mixing property established for (h_t). We first show that the process Y_t = (h_t,η_t)′ has the mixing property. It is clear that (Y_t) is a Markov chain on

endowed with its Borel σ-field. Moreover (Y_t) is strictly stationary as a measurable function of η_t,η_t−1,…. By independence between h_t and η_t we can denote by

the stationary distribution of Y_t, where

is that of

that of η_t. Denote by

the transition probabilities of the Markov chain (Y_t). We have, for

Therefore, because

is a probability measure,

The right-hand-side term converging to 0 at exponential rate, by the geometric ergodicity of (h_t), we can deduce that (Y_t) is geometrically ergodic and thus geometrically β-mixing. Because ε_t = h⁻¹(h_t)η_t is a measurable function of Y_t, we can conclude in view of (8) that the process (ε_t) is geometrically β-mixing. █

Remark 6. The theorem could be straightforwardly extended to the case where N = ∞, provided that a(μ_i) < 1 for a finite number, say, N₀, of indexes i. Indeed, in this case the inequality (17) continues to hold with N replaced by N₀.

Remark 7. In Pham (1986), irreducibility is established for a class of models that is very similar to our model for (h_t). However we cannot use Pham's results because he assumes a continuous distribution for the i.i.d. process with a positive density in a neighborhood of 0, and more importantly, he requires ω(0) = 0 (with our notations), which does not hold, in particular, for the standard GARCH model.

Examples

1. Consider the standard ARCH(1) model and assume that η_t has a mass at zero, with Eη_t² = 1. Thus (2) is met because

$$E \log a(\eta_t) = E \log(\alpha \eta_t^2) = -\infty.$$

It follows that geometric ergodicity holds, under Assumption A, without any restriction on the parameter α.
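
The mechanism behind this first example can be seen in a short simulation. In the sketch below, the innovation is zero with probability `p_zero` and Gaussian otherwise, rescaled so that Eη_t² = 1. The function name, the value α = 100, and the mixture used for η_t are hypothetical choices. Each zero shock sends the volatility back to ω, whatever the size of α.

```python
import math
import random

def simulate_arch1_with_atom(n, alpha, omega=1.0, p_zero=0.2, seed=0):
    """h_t = omega + alpha * eta_{t-1}^2 * h_{t-1}, where eta_t = 0 with
    probability p_zero and is N(0, 1 / (1 - p_zero)) otherwise (E eta_t^2 = 1)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(1.0 - p_zero)
    h, etas, hs = omega, [], []
    for _ in range(n):
        eta = 0.0 if rng.random() < p_zero else rng.gauss(0.0, scale)
        etas.append(eta)
        h = omega + alpha * eta * eta * h   # a zero shock resets h to omega
        hs.append(h)
    return etas, hs

etas, hs = simulate_arch1_with_atom(5000, alpha=100.0)
# Volatility values observed immediately after a zero shock.
resets = [h for eta, h in zip(etas, hs) if eta == 0.0]
```

The path shows long excursions between resets but never diverges, in line with the fact that (2) holds for every α when η_t has a mass at zero.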

2. Consider the threshold ARCH(1) model, introduced by Zakoïan (1994), where ω(x) = ω > 0, h(x) = x, and a(x) = max(0,−x)α₋ + max(0,x)α₊, with α₋ > 0, α₊ > 0. Let

. We have

By Remark 3, under Assumption A, a necessary and sufficient condition for the existence of a stationary and geometrically ergodic solution is

5. STATISTICAL APPLICATIONS

The time-series literature abounds in statistical results requiring mixing assumptions. Consequently, Theorem 3 has numerous direct applications. We only give two of them. The first application concerns the asymptotic distribution of sample autocorrelations and is directly inspired from Romano and Thombs (1996). The second application shows that the standard Dickey–Fuller unit-root tests remain asymptotically valid when the error term follows the GARCH(1,1) model that we consider in this paper.

5.1. Behavior of the Sample Autocorrelations

For a time series ε₁,…ε_n the identification stage of a model of the form (1) may involve the use of many statistics. Traditional estimators of the population autocorrelations of the squares are given by

and

stands for the mean-corrected observations of a time series (X_t,1 ≤ t ≤ n). Such statistics are often used to get an insight into the fourth-order structure of the process (ε_t). For the model of Ding, Granger, and Engle (1993), based on a Box–Cox power transformation of the conditional standard deviation process, the squares can be replaced by powers δ of the |ε_t|. For nonparametric GARCH models, such as model (1), general transformations of the data lead to statistics of the form

for some measurable function g. The asymptotic distributions of such statistics are easily deduced from Theorem 3.
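
Statistics of this kind are straightforward to compute. The sketch below uses the common 1/n normalization with mean-corrected observations. The exact normalization, the function names, and the short data vector are assumptions made for the example rather than the paper's definitions.

```python
def sample_autocovariance(x, lag):
    """(1/n) * sum over t of (x_t - mean) * (x_{t+lag} - mean)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n

def sample_autocorrelations(eps, g, max_lag):
    """Sample autocorrelations of g(eps_t) at lags 1, ..., max_lag."""
    y = [g(e) for e in eps]
    gamma0 = sample_autocovariance(y, 0)
    return [sample_autocovariance(y, lag) / gamma0 for lag in range(1, max_lag + 1)]

eps = [0.5, -1.2, 0.3, 2.0, -0.7, 0.1, -0.4, 1.1]
rho_squares = sample_autocorrelations(eps, lambda e: e * e, 3)   # g(x) = x^2
rho_abs = sample_autocorrelations(eps, abs, 3)                   # g(x) = |x|
```

Choosing a bounded or slowly growing g, such as g(x) = |x|^δ with small δ, is one way to satisfy a moment condition on g(ε_t) without restricting the moments of ε_t itself.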

COROLLARY 1. Assume that E|g(ε_t)|^{4+ν} < ∞ for some ν > 0. Then, under the assumptions of Theorem 3, and for any fixed h ≥ 0, the vectors

where γ_g(ℓ) = Cov{g(ε_t),g(ε_{t−ℓ})} and ρ_g(ℓ) = γ_g(ℓ)/γ_g(0), are asymptotically normally distributed.

Proof. The proof is similar to that of Theorems 3.1 and 3.2 in Romano and Thombs (1996). Theorem 3 shows that (Y_t) =: (g(ε_t) − Eg(ε_t)) is geometrically β- (and α-) mixing. Let

. The Wold–Cramer device and the CLT for strongly mixing processes, given in Ibragimov (1962) and Herrndorf (1984), show that

is asymptotically normally distributed with covariance matrix Σ given by

The absolute convergence of the last sum follows from standard covariance inequalities for mixing processes. To show the asymptotic normality of the vector involving the

, it remains to show that

. This can be proved by the arguments given in the proof of Proposition 7.3.7 of Brockwell and Davis (1991). The vector of the sample autocorrelations, because it is a differentiable function of the sample autocovariances vector, is also asymptotically normally distributed. █

Remark 8. For g(x) = x, the result can be deduced from the Lindeberg CLT for martingale differences. However, for autocovariances of general transformations of ε_t, it may be difficult, if not impossible, to rely on asymptotic theorems for martingales. In such cases, mixing results offer an alternative.

5.2. Unit-Root Tests for Autoregressive (AR) Models with GARCH Errors

Many financial series, such as (logarithms of) stock-market indices, are suspected to behave roughly like random walks with conditionally heteroskedastic increments. For such series, one could consider a model of the form

$$X_t = \phi X_{t-1} + \varepsilon_t, \tag{24}$$

where (ε_t) belongs to the general class of GARCH(1,1) models (1). Given the consequences of the random walk hypothesis, especially in terms of persistence of the economic shocks, it is important to consider tests for the unit-root hypothesis H₀ : φ = 1 against the stationarity assumption H₁ : φ ∈ (−1,1). Let

$$\hat{\phi}_n = \frac{\sum_{t=1}^{n} X_t X_{t-1}}{\sum_{t=1}^{n} X_{t-1}^2}$$

be the least squares estimator (LSE) of φ and let {W(t), t ∈ [0,1]} be a standard Brownian motion. The following corollary demonstrates the asymptotic validity of the standard Dickey–Fuller tests, in our heteroskedastic framework.

COROLLARY 2. Suppose that (X_t) satisfies (24) where (ε_t) is a general GARCH(1,1) process satisfying the assumptions of Theorem 3. Then if E|ε_t|^{2+ν} < ∞ for some ν > 0,

and if E|ε_t|^{4+ν} < ∞ for some ν > 0,

Proof. Note that

. In view of Theorem 3, the weak convergence (25) is deduced from Phillips (1987), and (26) can be deduced, for instance, from Francq and Zakoïan (1998). █
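
A small simulation shows how the unit-root statistic is formed when the errors are conditionally heteroskedastic. The sketch below generates a random walk driven by standard Gaussian GARCH(1,1) errors and computes the least squares estimator without intercept, together with the normalized quantity n(φ̂ − 1). The function names, the starting value X₀ = 0, and the parameter values are assumptions chosen for illustration.

```python
import math
import random

def least_squares_ar1(x):
    """LSE of phi in X_t = phi * X_{t-1} + eps_t (no intercept)."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den

def random_walk_with_garch_errors(n, w=0.1, alpha=0.1, beta=0.85, seed=0):
    """X_0 = 0 and X_t = X_{t-1} + eps_t with standard Gaussian GARCH(1,1) errors."""
    rng = random.Random(seed)
    sigma2 = w / (1.0 - alpha - beta)      # start at the unconditional variance
    x = [0.0]
    for _ in range(n):
        eps = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        x.append(x[-1] + eps)
        sigma2 = w + alpha * eps * eps + beta * sigma2
    return x

n = 2000
x = random_walk_with_garch_errors(n)
phi_hat = least_squares_ar1(x)
df_coefficient_stat = n * (phi_hat - 1.0)   # compared with Dickey-Fuller critical values
```

Repeating the experiment over many seeds and comparing the empirical distribution of `df_coefficient_stat` with the one obtained from i.i.d. errors is a simple way to see the invariance stated in the corollary.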

Remark 9. For the standard GARCH(1,1) errors, Ling, Li, and McAleer (2003) derived the asymptotic distribution in (25) under the second moment condition, namely, α + β < 1.

Remark 10. Consider model (1) with h_t = σ_t². Assume that η_t has a symmetric distribution, that ω and a are even functions, and that the following moments exist, for i = 1,2,

with a₂ < 1. Then tedious computations show that the asymptotic variance of the LSE is, under H₁,

It is interesting to note that in the unit-root case the asymptotic distribution of the LSE is the same with i.i.d. or GARCH errors, whereas it depends on the GARCH parameters in the stationary case. Thus the Dickey–Fuller test statistic can still be used when the errors satisfy a model of the form (1). A similar finding was obtained by Rahbek et al. (2002) in a multivariate framework. They showed that the trace test for the cointegration rank remains valid when the standard i.i.d. Gaussian errors are replaced by ARCH-type innovations, with appropriate moment conditions.

Remark 11. Corollaries 1 and 2 are just given for illustrative purposes and can be straightforwardly extended. In particular, Corollary 2 could include more general models with augmented variables and/or an intercept and/or a deterministic trend, as in Phillips and Perron (1988). Similar convergences could also be stated for t-statistics.


REFERENCES

Ango Nze, P. (1992) Critères d'ergodicité de quelques modèles à représentation markovienne. Comptes Rendus de l'Academie des Sciences de Paris 315, 1301–1304.Google Scholar

Ango Nze, P. (1998) Critères d'ergodicité géométrique ou arithmétique de modèles linéaires perturbés à représentation markovienne. Comptes Rendus de l'Academie des Sciences de Paris 326, 371–376.Google Scholar

Basrak, B., R.A. Davis, & T. Mikosch (2002) Regular variation of GARCH processes. Stochastic Processes and Their Applications 99, 95–115.Google Scholar

Berkes, I., L. Horváth, & P.S. Kokoszka (2003) GARCH processes: Structure and estimation. Bernoulli 9, 201–227.Google Scholar

Bollerslev, T. (1986) Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307–327.Google Scholar

Bougerol, P. & N. Picard (1992) Stationarity of GARCH processes and of some nonnegative time series. Journal of Econometrics 52, 115–127.Google Scholar

Boussama, F. (1998) Ergodicité, mélange et estimation dans les modèles GARCH. Ph.D. dissertation, Paris-7 University.

Bradley, R.C. (1986) Basic properties of strong mixing conditions. In E. Eberlein & M.S. Taqqu (eds.), Dependence in Probability and Statistics, A Survey of Recent Results, pp. 165–192. Birkhäuser.

Brockwell, P.J. & R.A. Davis (1991) Time Series: Theory and Methods. Springer-Verlag.

Bühlmann, P. & A.J. McNeil (2000) Nonparametric GARCH Models. Manuscript, ETH, Zürich.

Carrasco, M. & X. Chen (2002) Mixing and moment properties of various GARCH and stochastic volatility models. Econometric Theory 18, 17–39.

Chow, Y.S. & H. Teicher (1997) Probability Theory, 3rd ed. Springer-Verlag.

Davydov, Y. (1973) Mixing conditions for Markov chains. Theory of Probability and Its Applications 18, 313–328.

Ding, Z., C. Granger, & R.F. Engle (1993) A long memory property of stock market returns and a new model. Journal of Empirical Finance 1, 83–106.

Duan, J.-C. (1997) Augmented GARCH(p,q) process and its diffusion limit. Journal of Econometrics 79, 97–127.

Engle, R.F. (1982) Autoregressive conditional heteroskedasticity with estimates of the variance of the United Kingdom inflation. Econometrica 50, 987–1007.

Feigin, P.D. & R.L. Tweedie (1985) Random coefficient autoregressive processes: A Markov chain analysis of stationarity and finiteness of moments. Journal of Time Series Analysis 6, 1–14.

Fornari, F. & A. Mele (1997) Sign- and volatility-switching ARCH models: Theory and applications to international stock markets. Journal of Applied Econometrics 12, 49–65.

Francq, C. & J.-M. Zakoïan (1998) Estimating linear representations of nonlinear processes. Journal of Statistical Planning and Inference 68, 145–165.

Francq, C. & J.-M. Zakoïan (2004) Maximum likelihood estimation of pure GARCH and ARMA-GARCH processes. Bernoulli 10, 605–637.

Glosten, L.R., R. Jagannathan, & D. Runkle (1993) On the relation between the expected value and the volatility of the nominal excess return on stocks. Journal of Finance 48, 1779–1801.

He, C. & T. Teräsvirta (1999) Properties of moments of a family of GARCH processes. Journal of Econometrics 92, 173–192.

Herrndorf, N. (1984) A functional central limit theorem for weakly dependent sequences of random variables. Annals of Probability 12, 141–153.

Hwang, S.Y. & T.Y. Kim (2004) Power transformation and threshold modeling for ARCH innovations with applications to tests for ARCH structure. Stochastic Processes and Their Applications 110, 295–314.

Ibragimov, I.A. (1962) Some limit theorems for stationary processes. Theory of Probability and Its Applications 7, 349–382.

Jensen, S.T. & A. Rahbek (2004) Asymptotic normality of the QMLE estimator of ARCH in the nonstationary case. Econometrica 72, 641–646.

Klüppelberg, C., A. Lindner, & R. Maller (2004) A continuous time GARCH process driven by a Lévy process: Stationarity and second order behaviour. Journal of Applied Probability 41, 601–622.

Kristensen, D. (2006) Geometric ergodicity of a class of Markov chains with applications to time series models. Paper presented at the Econometric Society 2005 World Congress.

Lee, O. & D.W. Shin (2004) Strict stationarity and mixing properties of asymmetric power GARCH models allowing a signed volatility. Economics Letters 84, 167–173.

Lee, S.-W. & B.E. Hansen (1994) Asymptotic theory for the GARCH(1,1) quasi-maximum likelihood estimator. Econometric Theory 10, 29–52.

Ling, S., W.K. Li, & M. McAleer (2003) Estimation and testing for unit root processes with GARCH(1,1) errors: Theory and Monte Carlo evidence. Econometric Reviews 22, 179–202.

Ling, S. & M. McAleer (2002) Stationarity and the existence of moments of a family of GARCH processes. Journal of Econometrics 106, 109–117.

Lu, Z. (1996) A note on geometric ergodicity of autoregressive conditional heteroscedasticity (ARCH) model. Statistics and Probability Letters 30, 305–311.

Lumsdaine, R.L. (1996) Consistency and asymptotic normality of the quasi-maximum likelihood estimator in IGARCH(1,1) and covariance stationary GARCH(1,1) models. Econometrica 64, 575–596.

Meitz, M. & P. Saikkonen (2004) Ergodicity, Mixing, and Existence of Moments of a Class of Markov Models with Applications to GARCH and ACD Models. Paper presented at the Econometric Society 2005 World Congress.

Meyn, S.P. & R.L. Tweedie (1996) Markov Chains and Stochastic Stability, 3rd ed., Springer-Verlag.

Mokkadem, A. (1990) Propriétés de mélange des processus autorégressifs polynomiaux. Annales de l'Institut Henri Poincaré 26, 219–260.

Nelson, D.B. (1990) Stationarity and persistence in the GARCH(1,1) model. Econometric Theory 6, 318–334.

Pham, D.T. (1986) The mixing property of bilinear and generalised random coefficients autoregressive models. Stochastic Processes and Their Applications 23, 291–300.

Phillips, P.C.B. (1987) Time series regression with a unit root. Econometrica 55, 277–301.

Phillips, P.C.B. & P. Perron (1988) Testing for a unit root in time series regression. Biometrika 75, 335–346.

Rahbek, A., E. Hansen, & J. Dennis (2002) ARCH Innovations and Their Impact on Cointegration Rank Testing. Working paper, University of Copenhagen.

Romano, J.L. & L.A. Thombs (1996) Inference for autocorrelations under weak assumptions. Journal of the American Statistical Association 91, 590–600.

Schwert, G.W. (1989) Why does stock market volatility change over time? Journal of Finance 44, 1115–1153.

Taylor, S. (1986) Modelling Financial Time Series. Wiley.

Tjøstheim, D. (1990) Non-linear time series and Markov chains. Advances in Applied Probability 22, 587–611.

Yang, L. & R. Tschernig (2006) Nonparametric Estimation of Generalized Impulse Response Functions. Unpublished document, Michigan State University.

Yang, M. & R. Bewley (1995) Moving average conditional heteroskedastic processes. Economics Letters 49, 367–372.

Zakoïan, J.-M. (1994) Threshold heteroskedastic models. Journal of Economic Dynamics and Control 18, 931–955.
