
A CLOSED-FORM ESTIMATOR FOR THE GARCH(1,1) MODEL

Published online by Cambridge University Press:  09 February 2006

Dennis Kristensen
Affiliation:
University of Wisconsin-Madison
Oliver Linton
Affiliation:
London School of Economics

Abstract

We propose a closed-form estimator for the linear GARCH(1,1) model. The estimator has the advantage over the often-used quasi-maximum likelihood estimator (QMLE) that it can be easily implemented and does not require the use of any numerical optimization procedures or the choice of initial values of the conditional variance process. We derive the asymptotic properties of the estimator, showing T(κ−1)/κ-consistency for some κ ∈ (1,2) when the fourth moment exists and √T-asymptotic normality when the eighth moment exists. We demonstrate that a finite number of Newton–Raphson iterations using our estimator as starting point will yield asymptotically the same distribution as the QMLE when the fourth moment exists. A simulation study confirms our theoretical results.

The first author's research was supported by the Shoemaker Foundation. The second author's research was supported by the Economic and Social Science Research Council of the United Kingdom.

Type
NOTES AND PROBLEMS
Copyright
© 2006 Cambridge University Press

1. INTRODUCTION

The estimation of the Bollerslev (1986) generalized autoregressive conditional heteroskedasticity (GARCH) model is often carried out using the quasi-maximum likelihood estimator (QMLE). The asymptotic properties of this estimator are by now well established (see Jeantheau, 1998; Lee and Hansen, 1994; Lumsdaine, 1996; Ling and McAleer, 2003), and its finite-sample properties have been examined through Monte Carlo studies (Fiorentini, Calzolari, and Panattoni, 1996; Lumsdaine, 1995). Recently, Giraitis and Robinson (2001) have proposed using the Whittle likelihood estimator, which exploits only the second-order properties of the series but does so in an optimal way. Unfortunately, the calculation of both these estimators requires the use of numerical optimization procedures because closed-form expressions are not available. Therefore, the resulting estimator depends on the implementation, with different optimization techniques leading to potentially different estimators. This has been demonstrated in the studies by Brooks, Burke, and Persand (2001) and McCullough and Renfro (1999) where different commercially available software packages were used to estimate GARCH models by quasi-maximum likelihood (QML). Both studies reported markedly different outputs across the various packages, reflecting the different initialization and algorithmic strategies employed.

We propose a simple estimator of the parameters in the GARCH(1,1) model based on the second-order properties of the series like Giraitis and Robinson (2001). We derive an autoregressive moving average (ARMA) representation of the squared GARCH process and use the implied autocovariance function to define closed-form estimators of the parameters of the GARCH model. This strategy has already been used in the literature on estimating standard ARMA models (see Tuan, 1979; Galbraith and Zinde-Walsh, 1994) and in ARFIMA models recently by Mayoral (2004). The ARMA representation of GARCH models was already noted by Bollerslev (1986), whereas Francq and Zakoïan (2000) and Gouriéroux and Jasiak (2001, p. 130) proposed to utilize this to obtain estimators of the parameters. However, the estimators of Francq and Zakoïan (2000) and Gouriéroux and Jasiak (2001) require numerical optimization. Baillie and Chung (2001) took a somewhat similar approach using a minimum distance estimator based on the autocorrelation function of the squared GARCH process; again, numerical optimization is required to obtain the actual estimator.

Our estimator can readily be implemented without using numerical optimization methods. Furthermore, the estimator does not require one to choose (arbitrary) initial values for the conditional variance process. In that sense, the proposed estimator is more robust compared to the aforementioned estimators. On the other hand, for the estimator to be consistent and asymptotically normally distributed, relatively strong assumptions about the moments of the GARCH process have to be made. Although consistency and asymptotic normality of the QMLE can be shown under virtually no moment restrictions of the GARCH process, we require fourth moments to obtain consistency and T(κ−1)/κ-convergence toward a so-called α-stable distribution with index κ ∈ (1,2), and eighth moments for √T-asymptotic normality.

The √T-asymptotic normality result when the eighth moment exists follows from standard central limit theorems because the asymptotic variance in this case is well defined. When the eighth moment does not exist, we are still able to show that a limiting distribution exists, but it does not have a second moment, and the convergence rate is slower. The latter result is based on recent results by Mikosch and Stărică (2000). By combining our estimator with a finite-order Newton–Raphson (NR) procedure one can achieve √T-asymptotic normality, and even full efficiency under Gaussianity of the rescaled errors, based on fourth moments.

2. AN ARMA REPRESENTATION OF THE GARCH(1,1) MODEL

We consider the GARCH(1,1) process given by

yt = σt zt,  (1)

σt2 = ω + αyt−12 + βσt−12.  (2)

We assume that E [zt|Ft−1] = 0 and E [zt2|Ft−1] = 1, where Ft is the σ-field generated by {zt,zt−1,…}. We can write xt ≡ yt2 as

xt = ω + φxt−1 + εt − θεt−1,  (3)

where φ = α + β, θ = β, and εt = xt − σt2 is a martingale difference sequence with respect to Ft. From this expression, we see that xt is a (heteroskedastic) ARMA(1,1) process with parameters φ and θ. We shall throughout assume that φ < 1, implying that E [xt] < ∞ (cf. Bollerslev, 1988). We introduce the covariance function of the process,

γ(k) = Cov(xt,xt+k),  k = 0,1,2,….

Assuming that {xt} is stationary with a second moment, γ(·) is well defined. Using standard results, we then have that the autocorrelation function, ρ(k) = γ(k)/γ(0), solves the following set of Yule–Walker equations:

ρ(1) = (φ − θ)(1 − φθ)/(1 + θ2 − 2φθ),  ρ(k) = φρ(k − 1),  k ≥ 2,  (4)

(cf. Harvey, 1993, Ch. 2, eqns. (4.13a) and (4.13b); these equations were also derived in Bollerslev, 1988, and He and Teräsvirta, 1999). We can express (4) as a quadratic equation in θ,

θ2 − bθ + 1 = 0,  where b = (1 + φ2 − 2φρ(1))/(φ − ρ(1)).  (5)

Observe that b is only well defined if φ ≠ ρ(1). It is easily checked that ρ(1) ≤ φ with equality if and only if φ2 = 1 or θ = 0. The first case is ruled out by our assumption that φ < 1, whereas in the following discussion we assume β > 0.

1 If β = 0, ρ(1) = α, and we estimate α by the first-order sample correlation.

Under this assumption, b > 2 is well defined, and a solution to the quadratic equation is given by

θ = (b − √(b2 − 4))/2.

There is a second root that is reciprocal to the one stated here; however, this root has |θ| = 1/β > 1 (since β ≤ φ < 1) and is therefore ruled out.

2 This corresponds to the invertibility condition of the ARMA(1,1) model.

Finally, we observe that

σ2 ≡ E [xt] = ω/(1 − φ).  (6)

This last expression is utilized by Engle and Sheppard (2001) to profile out ω in their estimation procedure.

The expressions (4), (5), and (6) can now be used to obtain estimators of the parameters α, β, and ω. First, we can estimate φ by

φ̂ = ρ̂(2)/ρ̂(1),  (7)

where ρ̂(k) = γ̂(k)/γ̂(0) is the sample autocorrelation function of xt,

γ̂(k) = T−1 ∑t=1T−k (xt − x̄)(xt+k − x̄),

with x̄ = T−1 ∑t=1T xt. Substituting the estimator of φ into (5), we obtain an estimator of θ,

θ̂ = (b̂ − √(b̂2 − 4))/2,  b̂ = (1 + φ̂2 − 2φ̂ρ̂(1))/(φ̂ − ρ̂(1)),  (8)

assuming that b̂ ≥ 2 so that the square root is real. This leads to the following estimators of λ = (α,β,ω)⊤:

β̂ = θ̂,  α̂ = φ̂ − θ̂,  ω̂ = (1 − φ̂)σ̂2,  where σ̂2 = x̄.  (9)

In practice, this method may lead to estimates of α and β outside the unit interval. To deal with this problem, the estimator can be Winsorized (censored) at zero and one or at ε and 1 − ε for small positive ε.
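
As an illustration, the closed-form recipe (sample autocorrelations of xt = yt2, then φ̂ = ρ̂(2)/ρ̂(1), then θ̂ from the quadratic, then α̂, β̂, and ω̂) can be sketched in Python; this is a sketch under the stated assumptions, and the function name and default Winsorization constant are our own choices:

```python
import numpy as np

def closed_form_garch(y, eps=0.001):
    """Closed-form GARCH(1,1) estimates from x_t = y_t^2.

    Sketch of the moment-based recipe: phi = rho(2)/rho(1), theta from
    the quadratic theta^2 - b*theta + 1 = 0 (invertible root), then
    beta = theta, alpha = phi - theta, omega = (1 - phi) * mean(x).
    """
    x = y ** 2
    T = len(x)
    xbar = x.mean()                       # estimates sigma^2 = omega / (1 - phi)
    # sample autocovariances gamma(0), gamma(1), gamma(2) of x_t
    gam = [((x[: T - k] - xbar) * (x[k:] - xbar)).mean() for k in range(3)]
    rho1, rho2 = gam[1] / gam[0], gam[2] / gam[0]

    phi = rho2 / rho1                     # AR parameter, phi = alpha + beta
    b = (1 + phi ** 2 - 2 * phi * rho1) / (phi - rho1)
    theta = (b - np.sqrt(max(b ** 2 - 4, 0.0))) / 2   # root with |theta| < 1

    beta = theta
    alpha = phi - theta
    omega = (1 - phi) * xbar
    # Winsorize alpha and beta into [eps, 1 - eps], as suggested in the text
    alpha, beta = (float(np.clip(v, eps, 1 - eps)) for v in (alpha, beta))
    return alpha, beta, omega
```

No optimizer and no starting value for σt2 are involved; only three sample autocovariances of the squared series are needed.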

Note that

φ = ∑j wj ρ(j + 1)/ρ(j)  (10)

for any weight sequence wj with ∑j wj = 1, so that a more general class of estimators can be defined based on this relationship. It can be expected that for a sufficiently general class of weights one can obtain the same efficiency as the Whittle estimator of Giraitis and Robinson (2001). We do not pursue this approach to achieving efficiency because a more standard approach is available; see the discussion that follows. Nevertheless, some smoothing of the ratio of autocorrelations based on (10) may be desirable in practice.
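
The weighted version in (10) is equally simple; the sketch below (our own illustration, with an arbitrary equal-weight choice rather than the weights used in Section 5) averages the lag ratios of a supplied autocorrelation sequence:

```python
import numpy as np

def phi_hat_weighted(rho, w):
    """phi-hat = sum_j w_j * rho(j+1)/rho(j), with weights summing to one.

    `rho` holds (sample) autocorrelations rho(1), rho(2), ... of x_t = y_t^2;
    each ratio rho(j+1)/rho(j) equals phi in the ARMA(1,1) representation.
    """
    rho = np.asarray(rho, dtype=float)
    w = np.asarray(w, dtype=float)
    if abs(w.sum() - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    return float(np.sum(w * rho[1 : len(w) + 1] / rho[: len(w)]))
```

On an exact ARMA(1,1) autocorrelation pattern every ratio equals φ, so any weights summing to one recover φ; on sample autocorrelations the averaging smooths the noise in the individual ratios.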

3. ASYMPTOTIC PROPERTIES OF THE ESTIMATOR

We derive the asymptotic properties of our estimator λ̂ under the following set of assumptions:

A.1. The error process {zt} is independent and identically distributed (i.i.d.) with marginal distribution given by a lower semicontinuous density with support equal to the real line. It satisfies E [zt] = 0 and E [zt2] = 1.

A.2. The observed process {yt} is strictly stationary with E [(β + αzt2)2] < 1.

A.3. E [(β + αzt2)4] < 1.

Under (A.1), the moment condition in (A.2) is necessary and sufficient for the GARCH model to have a strictly stationary solution with a fourth moment. This solution will be β-mixing with geometrically decreasing mixing coefficients (cf. Meitz and Saikkonen, 2004, Thm. 1). We then assume that this is the one we have observed. Assumption (A.3) is a strengthening of (A.2) implying that also the eighth moment of yt exists.

In some of the results stated later, the i.i.d. assumption in (A.1) can be weakened to {zt} being a stationary martingale difference sequence; this is more realistic, allowing for dependence in the rescaled errors. In (A.2) we assume that we have observed the stationary version of the process. This is merely for technical convenience and can most likely be removed.

Under (A.1)–(A.2), we prove consistency of the estimator and derive its asymptotic distribution (toward which it converges at a rate slower than √T). If additionally (A.3) holds, the estimator is shown to be √T-asymptotically normally distributed.

We first state a lemma that shows that the basic building blocks of our estimator are consistent and gives their asymptotic distribution. Under (A.2), E [yt4] < ∞, but the eighth moment does not necessarily exist. One can show for ARMA processes with homoskedastic errors that the sample autocorrelation function is asymptotically normally distributed with only a second moment of the errors. In our case, however, the errors, εt, are heteroskedastic, and to obtain asymptotic normality the fourth moment, E [εt4] < ∞, seems to be needed; see, for example, Hannan and Heyde (1972) for results in both the homoskedastic and heteroskedastic cases. The fourth moment of εt translates into the eighth moment of yt.

The sample autocorrelations ρ̂(k) have a well-defined asymptotic distribution even when the eighth moment of the GARCH process does not exist, but the limit will not be Gaussian. This result has been established in Mikosch and Stărică (2000); see also Basrak, Davis, and Mikosch (2002) and Davis and Mikosch (1998). They show that ρ̂(k) − ρ(k), suitably normalized, converges in distribution toward a so-called stable, regularly varying distribution with index κ ∈ (1,2); see Samorodnitsky and Taqqu (1994) and Resnick (1987) for an introduction. The index κ is shown to be the solution to E [(αzt2 + β)κ] = 1. The convergence takes place with rate TaT−1, where aT = T1/κ l(T) and l(T) is a slowly varying function. The sequence aT satisfies TP(yt4 > aT) → 1, such that the index κ is a measure of the tail thickness.
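
Since the equation E [(αzt2 + β)κ] = 1 has no closed-form solution, κ must be found numerically. The following sketch (our own illustration, not part of the paper) approximates the expectation for Gaussian zt by Monte Carlo and locates the root by bisection, assuming α + β < 1 and a nonexistent fourth moment so that the root lies in (1, 2):

```python
import numpy as np

def tail_index(alpha, beta, n=400_000, seed=0, lo=0.05, hi=2.0):
    """Solve E[(alpha*z^2 + beta)^kappa] = 1 for kappa, with z ~ N(0,1).

    Uses one fixed Monte Carlo sample of z^2 and bisection. Assumes
    g(lo) < 0 (holds when alpha + beta < 1) and g(hi) > 0 (holds when
    the fourth moment fails), so there is exactly one root in (lo, hi)
    because the moment function is convex in kappa.
    """
    z2 = np.random.default_rng(seed).standard_normal(n) ** 2
    g = lambda k: float(np.mean((alpha * z2 + beta) ** k)) - 1.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)
```

For instance, α = 0.25 and β = 0.7 give E [(αzt2 + β)2] = β2 + 2αβ + 3α2 ≈ 1.03 > 1, so the fourth moment of yt fails; the computed κ falls strictly between 1 and 2 (it exceeds 1 because α + β = 0.95 < 1).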

LEMMA 1. Under (A.1)–(A.2), the estimators φ̂ and θ̂ in (7) and (8) are consistent: φ̂ →p φ and θ̂ →p θ. Furthermore,

√T(σ̂2 − σ2) →d N(0,Vσ2),  (11)

where Vσ2 = ∑k=−∞∞ γ(k), and

TaT−1(ρ̂(i) − ρ(i))i=1,…,m →d W(m)  (12)

for some κ ∈ (1,2), where W(m) = (Zi − ρ(i)Z0)i=1,…,m and (Zi)i=0,…,m has a κ-stable distribution. If additionally (A.3) holds, then

√T(ρ̂(i) − ρ(i))i=1,…,m →d N(0,Vρ(m)),  (13)

where Vρ(m) is the m × m matrix given by the long-run variance of the vector γ(0)−1(Yt,i − ρ(i)(yt2 − σ2)2)i=1,…,m, where Yt is an m × 1 vector with Yt,i = (yt2 − σ2)(yt+i2 − σ2), i = 1,…,m.

Mikosch and Stărică (2000) establish a weak convergence result without the fourth moment condition (A.2), but in this case one does not have any consistency result because the law of large numbers does not hold. The preceding weak convergence result (12) under (A.1)–(A.2) has the flaw that the limit distribution is not available in explicit form, which makes it difficult in practice to carry out any inference.

The consistency results and the weak convergence results in (11) and (13) in the preceding lemma can be proved without the i.i.d. assumption on {zt} in (A.1) by using martingale limit theory. This can be done by utilizing the ARMA structure of {xt} and applying the results of Hannan and Heyde (1972). It is not clear, however, whether the more general weak convergence result in (12) can be extended to a non-i.i.d. setting.

Observe that our estimator of λ can be expressed in terms of σ̂2, ρ̂(1), and ρ̂(2). A simple application of the continuous mapping theorem therefore yields the following result.

THEOREM 2. Under (A.1)–(A.2), the estimator λ̂ given in (9) is consistent, λ̂ →p λ, and

TaT−1(λ̂ − λ) →d (∂D(σ2,ρ(1),ρ(2))/∂(ρ(1),ρ(2)))W(2),  (14)

where W(2) and κ are as in Lemma 1 and the function D is given subsequently in equations (A.1)–(A.3). If additionally Assumption (A.3) holds, then

√T(λ̂ − λ) →d N(0,V),  (15)

where

V = (∂D/∂(σ2,ρ(1),ρ(2))) diag(Vσ2,Vρ(2)) (∂D/∂(σ2,ρ(1),ρ(2)))⊤.  (16)

An estimator of the covariance matrix V in (16) can be obtained by first estimating Vσ2 and Vρ(2) using heteroskedasticity and autocorrelation consistent (HAC) variance estimators (see Robinson and Velasco, 1997) and then substituting σ̂2, ρ̂(1), and ρ̂(2) into ∂D(σ2,ρ(1),ρ(2))/∂(σ2,ρ(1),ρ(2)). One can alternatively use the analytic expressions of ρ(k) to obtain an estimator of Vσ2.

4. EFFICIENCY ISSUES

We here give a brief discussion of how one may improve the efficiency of the closed-form estimator, in terms of both convergence rate and asymptotic variance. The basic idea is to perform a number of NR iterations using either the Whittle objective function or the Gaussian quasi-likelihood. Given closed-form estimates, one may wish to proceed to the QMLE or the Whittle estimator,3 using the initial estimates as a starting point in the numerical optimization; this may help reduce numerical problems because our preliminary estimates are consistent. Alternatively, one can perform a fixed number of NR iterations, which do not necessitate the use of any numerical optimization procedure. We define the following sequence of NR estimators:

λ̄(k) = λ̄(k−1) − HT(λ̄(k−1))−1ST(λ̄(k−1)),  k = 1,2,…,

with the initial value λ̄(0) = λ̂ being the closed-form estimator, QT(λ) the criterion function, ST(λ) = ∂QT(λ)/∂λ its score, and HT(λ) = ∂2QT(λ)/∂λ∂λ⊤ its Hessian.4

3 Observe that the Whittle estimator proposed in Giraitis and Robinson (2001) is in fact based on the quasi-likelihood of the process {xt} under the assumption that it is Gaussian.

4 We here assume that HT(λ̄(k−1)) is invertible; Robinson (1988) discusses these issues in detail. Also, the step length is kept constant; in practice it may, however, be a good idea to use a line search in each iteration.

In the case of the QMLE, QT(λ) is the Gaussian log-likelihood obtained by assuming that zt ∼ i.i.d. N(0,1), whereas for the Whittle estimator QT(λ) is the discrete frequency form (cf. Giraitis and Robinson, 2001, pp. 611–612). For the QMLE, the NR estimator takes the form of a generalized least squares (GLS) type estimator; we refer to this GLS form of the updating step as (17). In general, the NR estimator will satisfy

λ̄(k) − λ̃ = Op(∥λ̄(k−1) − λ̃∥2),  (18)

where λ̃ is the actual M estimator (cf. Robinson, 1988, Thm. 2). Under regularity conditions, the M estimator will satisfy

√T(λ̃ − λ0) →d N(0,H−1ΣH−1),  (19)

where H = E [HT0)] and Σ = E [ST0)ST0)⊤] (see, e.g., Lee and Hansen, 1994, Thm. 2; Giraitis and Robinson, 2001, Thm. 1). Combining (18) and (19), we obtain the following result.

THEOREM 3. Assume that (19) holds together with (A.1)–(A.2). Then, with κ as in Lemma 1, for k ≥ 2 + [log(κ/(2(κ − 1)))/log 2],

√T(λ̄(k) − λ0) →d N(0,H−1ΣH−1).

If additionally (A.3) is satisfied, the preceding result holds for k ≥ 1.
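
To make the iteration concrete, here is a minimal sketch of k NR steps on the Gaussian quasi-likelihood, with finite-difference score and Hessian standing in for the analytic expressions; the variance recursion is initialized at the sample variance, which is our own implementation choice rather than anything prescribed in the text:

```python
import numpy as np

def neg_qll(lam, y):
    """Gaussian quasi-(negative-)log-likelihood of a GARCH(1,1), up to constants.

    lam = (alpha, beta, omega). The variance recursion starts at the
    sample variance of y (an implementation choice).
    """
    alpha, beta, omega = lam
    sig2 = y.var()
    q = 0.0
    for yt in y:
        q += np.log(sig2) + yt ** 2 / sig2
        sig2 = omega + alpha * yt ** 2 + beta * sig2
    return q / len(y)

def newton_raphson(lam0, y, k=1, h=1e-4):
    """k Newton-Raphson steps from the starting value lam0, with central
    finite-difference score and Hessian (a sketch; analytic derivatives
    and a line search would be preferable in practice)."""
    lam = np.asarray(lam0, dtype=float)
    p = len(lam)
    e = np.eye(p) * h
    for _ in range(k):
        s = np.array([(neg_qll(lam + e[i], y) - neg_qll(lam - e[i], y)) / (2 * h)
                      for i in range(p)])
        H = np.array([[(neg_qll(lam + e[i] + e[j], y)
                        - neg_qll(lam + e[i] - e[j], y)
                        - neg_qll(lam - e[i] + e[j], y)
                        + neg_qll(lam - e[i] - e[j], y)) / (4 * h ** 2)
                       for j in range(p)] for i in range(p)])
        lam = lam - np.linalg.solve(H, s)
    return lam
```

Starting the iteration from the consistent closed-form estimate is what makes a fixed, small number of such steps sufficient.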

5. A SIMULATION STUDY

We now examine the quality of our estimator in finite samples through a Monte Carlo study. In particular, we are interested in its behavior when the fourth moment does not exist. We simulate the GARCH process given in (1)–(2) with zt ∼ i.i.d. N(0,1) for four different sets of parameter values. For each choice of parameter values, we simulated 5,000 data sets with T = 200, 400, 800, and 1,000 observations. For each data set, we obtained the QMLE using the Matlab GARCH Toolbox,5 our ARMA estimator, and the GLS estimator in (17). To estimate φ, we used (10) with nonzero weights wj, j = 1,2,3, and wj = 0, j > 3, and Winsorized at ε and 1 − ε with ε = 0.001. The GLS estimator reported here is based on two iterations. For the first two sets of parameter values, λ = (0.2, 0.15, 0.25)⊤ and λ = (0.2, 0.25, 0.35)⊤, both the fourth and eighth moments exist; for the third, λ = (0.2, 0.35, 0.45)⊤, only the fourth moment is well defined; and for the fourth, λ = (0.2, 0.10, 0.89)⊤, neither the fourth nor the eighth moment exists, but the second does.

5 The Matlab package uses a subspace trust region method based on the interior-reflective Newton method of Coleman and Li (1996).
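
A miniature version of such an experiment can be sketched as follows (our own illustration, with far fewer replications, illustrative parameter values, and only the ratio estimator φ̂ = ρ̂(2)/ρ̂(1) of φ = α + β evaluated):

```python
import numpy as np

def simulate_garch(alpha, beta, omega, T, seed=0, burn=500):
    """Simulate y_t = sigma_t * z_t with Gaussian z_t; a burn-in is
    discarded so the retained path is close to stationary (the burn-in
    length is our choice)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(T + burn)
    sig2 = omega / (1 - alpha - beta)     # start at the unconditional variance
    y = np.empty(T + burn)
    for t in range(T + burn):
        y[t] = np.sqrt(sig2) * z[t]
        sig2 = omega + alpha * y[t] ** 2 + beta * sig2
    return y[burn:]

def phi_hat(y):
    """phi-hat = rho(2)/rho(1) computed from x_t = y_t^2."""
    x = y ** 2
    xc = x - x.mean()
    T = len(x)
    g1, g2 = [(xc[: T - k] * xc[k:]).mean() for k in (1, 2)]
    return g2 / g1

alpha0, beta0, omega0 = 0.15, 0.25, 0.2   # illustrative values; phi = 0.40
reps, T = 100, 2000
est = np.array([phi_hat(simulate_garch(alpha0, beta0, omega0, T, seed=s))
                for s in range(reps)])
bias = est.mean() - (alpha0 + beta0)
mse = np.mean((est - (alpha0 + beta0)) ** 2)
```

Increasing T drives bias and MSE toward zero while the fourth moment exists; pushing α + β toward one makes the ratio estimator markedly noisier, in line with the pattern reported in Tables 1, 2, 3, and 4.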

The results in terms of bias, variance, and mean squared error (MSE) are reported in Tables 1, 2, 3, and 4 for the four different sets of parameters. For these results, we have discarded a small number of data sets (eight in total) where the Matlab GARCH Toolbox stalled and did not terminate the optimization procedure. We have also set the ARMA estimator equal to zero whenever it returned negative estimates.

Table 1. MSE of the QMLE and ARMA estimator for λ = (0.2, 0.15, 0.25)

Table 2. MSE of the QMLE and ARMA estimator for λ = (0.2, 0.25, 0.35)

Table 3. MSE of the QMLE and ARMA estimator for λ = (0.2, 0.35, 0.45)

Table 4. MSE of the QMLE and ARMA estimator for λ = (0.2, 0.10, 0.89)

For the first set of parameters, the ARMA and GLS estimators are serious competitors to the QMLE. For small GARCH effects, the closed-form estimators seem to be a good alternative to the QMLE in terms of MSE. For the remaining three sets of parameters, the QMLE performs significantly better than the two other estimators, as expected. As α and/or β increases, the variances of the ARMA and GLS estimators generally increase; this is consistent with the theory, which predicts that the convergence rate of the estimator deteriorates as α + β increases. However, contrary to the theory, the ARMA estimators of α and β appear to remain consistent even when the fourth moment is not well defined; the performance of the closed-form estimator of ω is very poor, though. The GLS estimator based on two iterations yields a large improvement in precision relative to the initial ARMA estimator, except for the estimation of ω. The improvement might have been even more pronounced had we performed further iterations.

6. CONCLUDING REMARKS

The procedure easily extends to the case where there is a mean process, say, yt = β⊤xt + σt zt for some covariates xt, by applying the preceding procedure to the residuals from some preliminary fitting of the mean. Some aspects of the procedure extend to various multivariate cases and to GARCH(p,q). Specifically, in the multivariate case the relationships (3) and (6) continue to be useful and can be used as in Engle and Sheppard (2001) to reduce the dimensionality of the optimization space.

One may also consider other GARCH models that can be represented as an ARMA-like process. For example, suppose that the conditional variance is given as σt2 = ω + βσt−12 + αyt−12 1(yt−1 > 0) + δyt−12 1(yt−1 ≤ 0). Then for xt ≡ yt2 we have xt = ω + φxt−1 + ut, where ut is a martingale difference sequence with respect to Ft and φ = β + αm2+ + δm2−, where m2+ = E [zt2 1(zt > 0)] and m2− = E [zt2 1(zt ≤ 0)]. Furthermore, σ2 = ω/(1 − φ), so that we can identify ω and φ as before from the variance and the second-order covariance. To proceed further one needs to make strong assumptions about the distribution of zt, for example, symmetry about zero, in which case φ = β + (α + δ)/2.

Our estimator also allows for a simple t-test of no GARCH effect because the asymptotic distribution derived earlier does not require α,β > 0, in contrast to the QMLE, whose asymptotic theory relies on a Taylor expansion that requires this.

APPENDIX: Proofs

Proof of Lemma 1. The consistency part is a simple application of the law of large numbers for stationary and ergodic processes because E [xt] < ∞ and E [xt xt+k] < ∞ under (A.1)–(A.2). The weak convergence result in (12) follows from Mikosch and Stărică (2000, Sec. 5).

The convergence results in (11) and (13) are proved using a standard central limit theorem for Markov chains that are strongly mixing with geometric rate (cf. Meyn and Tweedie, 1993, Thm. 17.0.1). It is easily seen that the relevant third-order cross moments involving xs, xt, and xt+k vanish for any s,t ≥ 0 and k > 0. Thus, the asymptotic covariance between σ̂2 and ρ̂(k) is zero. █

Proof of Theorem 2. We have that λ̂ = D(σ̂2,ρ̂(1),ρ̂(2)), where D = (Dα,Dβ,Dω)⊤ is given by

Dα(σ2,ρ(1),ρ(2)) = F − Dβ(σ2,ρ(1),ρ(2)),  (A.1)

Dβ(σ2,ρ(1),ρ(2)) = (b(F,ρ(1)) − √(b(F,ρ(1))2 − 4))/2,  (A.2)

Dω(σ2,ρ(1),ρ(2)) = (1 − F)σ2,  (A.3)

with b(φ,ρ(1)) = (1 + φ2 − 2φρ(1))/(φ − ρ(1)) and F = F(ρ(1),ρ(2)) = ρ(2)/ρ(1). It is easily seen that D is a continuously differentiable mapping of σ2, ρ(1), and ρ(2). The convergence result in (14) now follows from the continuous mapping theorem together with the fact that σ̂2 is √T-consistent whereas ρ̂(1) and ρ̂(2) converge at a slower rate. The convergence result stated in (15) follows from the standard result for differentiable transformations of asymptotically normally distributed variables. █

REFERENCES

Baillie, R.T. & H. Chung (2001) Estimation of GARCH models from the autocorrelations of the squares of a process. Journal of Time Series Analysis 22, 631–650.
Basrak, B., R.A. Davis, & T. Mikosch (2002) Regular variation of GARCH processes. Stochastic Processes and Their Applications 99, 95–115.
Bollerslev, T. (1986) Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307–327.
Bollerslev, T. (1988) On the correlation structure for the generalized autoregressive conditional heteroskedastic process. Journal of Time Series Analysis 9, 121–131.
Brooks, C., S.P. Burke, & G. Persand (2001) Benchmarks and the accuracy of GARCH model estimation. International Journal of Forecasting 17, 45–56.
Coleman, T.F. & Y. Li (1996) An interior, trust region approach for nonlinear minimization subject to bounds. SIAM Journal on Optimization 6, 418–445.
Davis, R.A. & T. Mikosch (1998) The sample autocorrelations of heavy-tailed processes with applications to ARCH. Annals of Statistics 26, 2049–2080.
Engle, R.F. & K. Sheppard (2001) Theoretical and Empirical Properties of Dynamic Conditional Correlation Multivariate GARCH. NBER Working Paper w8554.
Fiorentini, G., G. Calzolari, & L. Panattoni (1996) Analytical derivatives and the computation of GARCH estimates. Journal of Applied Econometrics 11, 399–417.
Francq, C. & J.-M. Zakoïan (2000) Estimating weak GARCH representations. Econometric Theory 16, 692–728.
Galbraith, J.W. & V. Zinde-Walsh (1994) A simple noniterative estimator for moving average models. Biometrika 81, 143–155.
Giraitis, L. & P.M. Robinson (2001) Whittle estimation of ARCH models. Econometric Theory 17, 608–631.
Gouriéroux, C. & J. Jasiak (2001) Financial Econometrics: Problems, Models and Methods. Princeton University Press.
Hannan, E.J. & C.C. Heyde (1972) On limit theorems for quadratic functions of discrete time series. Annals of Mathematical Statistics 43, 2058–2066.
Harvey, A.C. (1993) Time Series Models, 2nd ed. Harvester Wheatsheaf.
He, C. & T. Teräsvirta (1999) Fourth moment structure of the GARCH(p,q) process. Econometric Theory 15, 824–846.
Jeantheau, T. (1998) Strong consistency of estimators for multivariate ARCH models. Econometric Theory 14, 70–86.
Lee, S.-W. & B. Hansen (1994) Asymptotic theory for the GARCH(1,1) quasi-maximum likelihood estimator. Econometric Theory 10, 29–53.
Ling, S. & M. McAleer (2003) Asymptotic theory for a new vector ARMA-GARCH model. Econometric Theory 19, 280–310.
Lumsdaine, R. (1995) Finite-sample properties of the maximum likelihood estimator in GARCH(1,1) and IGARCH(1,1) models: A Monte Carlo investigation. Journal of Business & Economic Statistics 13, 1–10.
Lumsdaine, R. (1996) Consistency and asymptotic normality of the quasi-maximum likelihood estimator in IGARCH(1,1) and covariance stationary GARCH(1,1) models. Econometrica 64, 575–596.
Mayoral, L. (2004) A New Minimum Distance Estimation Procedure of ARFIMA Models. Manuscript, Pompeu Fabra.
McCullough, B.D. & C.G. Renfro (1999) Benchmarks and software standards: A case study of GARCH procedures. Journal of Economic and Social Measurement 25, 59–71.
Meitz, M. & P. Saikkonen (2004) Ergodicity, Mixing and Existence of Moments of a Class of Markov Models with Applications to GARCH and ACD Models. SSE/EFI Working Paper Series in Economics and Finance 573, Stockholm School of Economics.
Meyn, S.P. & R.L. Tweedie (1993) Markov Chains and Stochastic Stability. Springer-Verlag.
Mikosch, T. & C. Stărică (2000) Limit theory for the sample autocorrelations and extremes of a GARCH(1,1) process. Annals of Statistics 28, 1427–1451.
Resnick, S.I. (1987) Extreme Values, Regular Variation, and Point Processes. Springer-Verlag.
Robinson, P.M. (1988) The stochastic difference between econometric statistics. Econometrica 56, 531–548.
Robinson, P.M. & C. Velasco (1997) Autocorrelation-robust inference. In G.S. Maddala & C.R. Rao (eds.), Robust Inference, pp. 267–298. North-Holland.
Samorodnitsky, G. & M.S. Taqqu (1994) Stable Non-Gaussian Processes: Stochastic Models with Infinite Variance. Chapman & Hall.
Tuan, P.-D. (1979) The estimation of parameters for autoregressive-moving average models from sample autocovariances. Biometrika 66, 555–560.