Published online by Cambridge University Press: 08 June 2004
This paper proposes a semiparametric approach by introducing a smooth scale function into the standard generalized autoregressive conditional heteroskedastic (GARCH) model so that conditional heteroskedasticity (CH) and scale change in financial returns can be modeled simultaneously. An estimation procedure combining kernel estimation of the scale function and maximum likelihood estimation of the GARCH parameters is proposed. Asymptotic properties of the estimators are investigated in detail. It is shown that asymptotically normal, √n-consistent parameter estimation is available. A data-driven algorithm is developed for practical implementation. Finite sample performance of the proposal is studied through simulation. The proposal is applied to model CH and scale change in the daily S&P 500 and DAX 100 returns. It is shown that both series simultaneously exhibit significant scale change and CH.

We are very grateful to the co-editor and two referees for their helpful comments and suggestions, which led to a substantial improvement of this paper. The paper was finished under the advice of Professor Jan Beran, Department of Mathematics and Statistics, University of Konstanz, Germany, and was financially supported by the Center of Finance and Econometrics (CoFE), University of Konstanz. We thank colleagues in CoFE, especially Professor Winfried Pohlmeier, for their interesting questions at a talk given by the author; it was these questions that motivated the author to write this paper. Our special thanks go to Dr. Erik Lüders, Department of Finance and Insurance, Laval University, and Stern School of Business, New York University, for his helpful suggestions.
Modeling of heteroskedasticity in financial returns is one of the most important and interesting themes of financial econometrics. Well-known conditional heteroskedastic (CH) models are the autoregressive conditional heteroskedastic (ARCH) model (Engle, 1982) and the generalized ARCH (GARCH) model (Bollerslev, 1986), together with numerous extensions. Most GARCH variants are, however, stationary models and hence time homoskedastic with constant unconditional variance. In practice it is observed that financial returns are often not only conditionally but also time heteroskedastic, with time-varying unconditional variance. This is shown by, e.g., Beran and Ocker (2001) by fitting a trend function to some volatility series defined by Ding, Granger, and Engle (1993). Nonstationarity in financial returns is investigated in detail by, e.g., Mikosch and Stărică (2004). They show that the phenomenon that the estimated α1 + β1 is close to one in a fitted GARCH(1,1) model often implies nonstationarity.
In recent years different approaches for simultaneously modeling conditional and time heteroskedasticity have been introduced in the literature by defining the volatility as a function not only of the past values but also of the time, e.g., GARCH model with change points (the piecewise GARCH model of Mikosch and Stărică, 2004) and local time homogeneous model with change points (Mercurio and Spokoiny, 2002). A general continuous time model to perform this may be found in Fan, Jiang, Zhang, and Zhou (2002). One can also obtain a similar model for discrete time series by introducing past information into the mean and volatility functions in the indexed stochastic model proposed by Yao and Morgan (1999). Another proposal in this context is the time heteroskedastic stochastic volatility model (Härdle, Spokoiny, and Teyssière, 2000).
In this paper another approach, called a semiparametric GARCH (SEMIGARCH) model, is proposed by introducing a scale function σ(t) into the parametric GARCH model. This proposal is motivated by the observation that one important reason for the time heteroskedasticity is a slowly changing scale function in volatility. The advantages of this approach are as follows. 1. The volatility is decomposed into two multiplicative components corresponding to the location and the past information, respectively. 2. The GARCH parameters are estimated globally, and hence asymptotically normal, √n-consistent estimators are available. 3. The SEMIGARCH model can also be used for predicting future volatility. A semiparametric estimation procedure combining kernel estimation of the scale function and maximum likelihood estimation of the GARCH parameters is proposed. Asymptotic properties of the estimators are investigated in detail. A data-driven algorithm is developed for practical implementation. Finite sample performance of the proposal is examined through a simulation study. The proposal is applied to model CH and scale change in the daily S&P 500 and DAX 100 returns. It is shown that both series simultaneously exhibit significant scale change and CH.
This approach provides an interesting alternative for modeling financial volatility. Whether or not it is better than another approach depends on the case considered. The idea proposed in this paper can be used to obtain semiparametric generalizations of other GARCH variants. Change points can also be introduced into the SEMIGARCH model.
The paper is organized as follows. Section 2 introduces the model. Section 3 describes the semiparametric estimation procedure. Asymptotic properties of the proposals are investigated in Section 4. Section 5 proposes a data-driven algorithm for practical implementation. Results of the simulation study are reported in Section 6. The proposal is applied to the log-returns of the daily S&P 500 and DAX 100 indices in Section 7. Section 8 contains some final discussion. Proofs of results are in the Appendix.
Consider the equidistant time series model

Y_i = μ + σ(t_i) ε_i,   (1)

where μ is an unknown constant, t_i = i/n, σ(t) > 0 is a smooth, bounded scale (or volatility) function, and {ε_i} is assumed to be a GARCH(r,s) process defined by

ε_i = η_i h_i^{1/2},  h_i = α_0 + Σ_{j=1}^{r} α_j ε_{i−j}² + Σ_{j=1}^{s} β_j h_{i−j}   (2)

(Bollerslev, 1986), where η_i are independent and identically distributed (i.i.d.) N(0,1) random variables, α_0 > 0, and α_1,…,α_r, β_1,…,β_s ≥ 0. Let v(t) = σ²(t) denote the local variance of Y_i. The rescaled time index t_i = i/n is introduced to guarantee that the information increases as n increases, which ensures the availability of a consistent estimator of v. Note that model (1) indeed defines a sequence of processes.
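As an illustration, model (1) and (2) can be simulated directly. The sketch below is not from the paper: the particular scale function and the GARCH(1,1) parameter values (which satisfy the standardization α_0 = 1 − α_1 − β_1 used later) are our own choices, and all names are hypothetical.

```python
import math, random

def simulate_semigarch(n, mu=0.0, a0=0.15, a1=0.1, b1=0.75, seed=1):
    """Draw Y_i = mu + sigma(t_i) * eps_i on rescaled time t_i = i/n,
    with GARCH(1,1) errors eps_i = eta_i * h_i^(1/2)."""
    rng = random.Random(seed)
    sigma = lambda t: 1.0 + 0.5 * math.sin(math.pi * t)  # hypothetical smooth scale
    h = a0 / (1.0 - a1 - b1)      # start h at the unconditional variance of eps
    eps_prev2 = h
    y = []
    for i in range(1, n + 1):
        h = a0 + a1 * eps_prev2 + b1 * h          # conditional variance h_i
        eps = rng.gauss(0.0, 1.0) * math.sqrt(h)  # eps_i = eta_i * h_i^(1/2)
        eps_prev2 = eps * eps
        y.append(mu + sigma(i / n) * eps)
    return y

y = simulate_semigarch(2000)
```

Note that with a0 = 0.15, a1 = 0.1, b1 = 0.75 the unconditional variance of ε_i equals a0/(1 − a1 − b1) = 1, matching the identification condition below.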
Let θ = (α_0, α_1,…,α_r, β_1,…,β_s)′ be the unknown parameter vector. It is assumed that Σ_{j=1}^{r} α_j + Σ_{j=1}^{s} β_j < 1, which ensures the existence of a unique strictly stationary solution of (2). The practical implementation of a nonparametric estimator of v requires the moment condition E(ε_i⁸) < ∞. However, as pointed out by an anonymous referee, the condition E(ε_i⁴) < ∞ is sufficient for the derivation of the asymptotic results. Necessary and sufficient conditions that guarantee the existence of high-order moments of a GARCH process may be found in Ling and Li (1997), Ling (1999), and Ling and McAleer (2002). It is further assumed that var(ε_i) = E(ε_i²) = 1, implying α_0 = 1 − Σ_{j=1}^{r} α_j − Σ_{j=1}^{s} β_j, to avoid identifiability problems.
The process defined by (1) and (2) is locally stationary in the sense of Dahlhaus (1997) and is a special case of Example 1 given there. Such a model provides a semiparametric extension of the standard GARCH model (Bollerslev, 1986) by introducing the scale function σ(t) into it, where h_i^{1/2} stands for the conditional standard deviation of the standardized process ε_i. The total standard deviation at t_i is hence given by σ(t_i) h_i^{1/2}. For σ(t) ≡ σ_0, model (1) and (2) reduces to the standard GARCH model. Our purpose is to estimate v(t) and h_i separately. If the scale function σ(t) in (1) changes over time, then the assumption of a GARCH model is a misspecification. In this case the estimation of the GARCH model will be inconsistent. It can be shown through simulation that, if a nonconstant scale function is not eliminated, a fitted GARCH(1,1) model will yield α̂_1 + β̂_1 → 1 as n → ∞, even when the ε_i are i.i.d. Furthermore, in the presence of scale change the estimation of v(t) is also necessary for prediction. On the other hand, if Y_i follows a GARCH model but model (1) and (2) is used, then the estimation is still √n-consistent, but with some loss in efficiency due to the estimation of σ(t).
The assumptions of model (1) and (2) can be weakened in different ways. For instance, if the constant mean μ in (1) is replaced by a smooth mean function g, then we obtain the following nonparametric regression with heteroskedastic and dependent errors:

Y_i = g(t_i) + σ(t_i) ε_i,   (3)

where {ε_i} is a zero mean stationary process. Estimation of the mean function g in model (3) with i.i.d. ε_i is discussed in, e.g., Ruppert and Wand (1994), Fan and Gijbels (1995), and Efromovich (1999). Discussion of the estimation of the scale function in heteroskedastic nonparametric regression may be found in, e.g., Efromovich (1999). This paper focuses on the estimation of σ(t) and θ under model (1) and (2).
Model (1) and (2) can be estimated by a semiparametric procedure combining nonparametric estimation of v(t) and parametric estimation of θ. A linear smoother of the squared residuals will be used to estimate v(t). Let Z_i = Y_i − μ. Then model (1) can be rewritten as

X_i = v(t_i) + v(t_i) ξ_i,   (4)

where X_i = Z_i² and ξ_i = ε_i² − 1 ≥ −1 are zero mean stationary time series errors. Model (4) transfers the estimation of the scale function to a general nonparametric regression problem (for a related idea, see Efromovich, 1999, Sect. 4.3). On the one hand, model (4) is a special case of (3) with g(t) and σ(t) both replaced by v(t). On the other hand, model (4) also applies to (3) by defining Z_i = Y_i − g(t_i). Hence, the extension of our results to model (3) is to be expected.
The kernel estimator of conditional variance proposed by Feng and Heiler (1998) will be adapted to estimate v(t). Let y_1,…,y_n denote the observations, and let x_i = (y_i − ȳ)², where ȳ is the sample mean. Let K(u) denote a second-order kernel with compact support [−1,1]. The Nadaraya–Watson estimator of v at t based on x_1,…,x_n is defined by

v̂(t) = Σ_{i=1}^{n} w_i(t) x_i,   (5)

where w_i(t) = K[(t_i − t)/b] / Σ_{j=1}^{n} K[(t_j − t)/b] and b is the bandwidth. And we define σ̂(t) = v̂^{1/2}(t). It is assumed that b → 0 and nb → ∞ as n → ∞, which together with other regularity conditions ensures the consistency of v̂. The estimator defined in (5) does not depend on the dependence structure of the errors because v̂ is a linear smoother. It is clear that v̂(t) > 0 if the observations for which |t_i − t| ≤ b are not all identical. The bias of v̂ at a boundary point is of a larger order than in the interior because of the asymmetry in the observations. This is the so-called boundary effect of the kernel estimator, which can be overcome by using a local linear estimator (see, e.g., Härdle, Tsybakov, and Yang, 1998). However, as mentioned in Feng and Heiler (1998), a local linear estimator of v may sometimes be nonpositive. Hence, the kernel estimator is preferable in the current context.
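The Nadaraya–Watson estimate of v(t) from the squared centred observations can be sketched as follows; this is an illustrative stand-alone implementation with our own names, using the Epanechnikov kernel and rescaled time t_i = i/n.

```python
def nw_variance_estimate(y, t, b):
    """Kernel estimate of v(t) = sigma^2(t): a weighted local average of the
    squared centred observations x_i = (y_i - ybar)^2 around rescaled time t."""
    n = len(y)
    ybar = sum(y) / n
    x = [(yi - ybar) ** 2 for yi in y]
    K = lambda u: 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0  # Epanechnikov
    w = [K(((i + 1) / n - t) / b) for i in range(n)]   # t_i = i/n
    sw = sum(w)
    if sw == 0.0:
        raise ValueError("empty kernel window; increase b")
    return sum(wi * xi for wi, xi in zip(w, x)) / sw

# e.g. v_hat = nw_variance_estimate(y, t=0.5, b=0.1)
```

Because the weights are nonnegative and the x_i are squares, the estimate is automatically nonnegative, which is the practical advantage over a local linear fit noted above.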
Following Bollerslev (1986), the conditional Gaussian log-likelihood in a parametric GARCH model takes the form (ignoring constants)

L(θ) = −(1/2) Σ_i [ln h_i + ε_i²/h_i].   (6)

The maximizer of L(θ), denoted by θ̃, is not available, because the ε_i are unobservable in the current context. Hence we define the approximate log-likelihood by

L̂(θ) = −(1/2) Σ_i [ln ĥ_i + ε̂_i²/ĥ_i],   (7)

where the ε̂_i are the standardized residuals given by

ε̂_i = (y_i − ȳ)/σ̂(t_i).   (8)

The symbols ĥ_i are used to indicate that, for a given value of θ, h_i(ε;θ) in L(θ) depends on the ε̂_i. Similar to the parameter estimation in the SEMIFAR (semiparametric fractional autoregressive) model (Beran, 1999), θ will be estimated by θ̂, the maximizer of L̂(θ). Any standard GARCH package can be used for estimating θ̂; in this paper S+GARCH will be used. The estimator θ̂ obtained in this way is an approximate maximum likelihood estimator (MLE), which may perform differently from θ̃ (were θ̃ available).
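As a rough illustration of this second step (not the paper's implementation, which relies on S+GARCH), the approximate Gaussian log-likelihood can be maximized over a coarse grid for a GARCH(1,1), imposing the standardization α_0 = 1 − α_1 − β_1. The grid search and all names below are our own simplifications.

```python
import math

def approx_garch11_mle(eps):
    """Crude grid-search maximizer of the approximate Gaussian log-likelihood
    for a GARCH(1,1) fitted to standardized residuals, with a0 = 1 - a1 - b1
    so that var(eps_i) = 1."""
    def negloglik(a1, b1):
        a0 = 1.0 - a1 - b1
        h, e2_prev, nll = 1.0, 1.0, 0.0   # start at the unconditional variance
        for e in eps:
            h = a0 + a1 * e2_prev + b1 * h   # GARCH(1,1) recursion for h_i
            nll += math.log(h) + e * e / h   # -2 x log-likelihood contribution
            e2_prev = e * e
        return nll
    grid_a = [i / 25 for i in range(1, 25)]   # alpha1 > 0 (Assumption A1)
    grid_b = [i / 25 for i in range(25)]      # beta1 >= 0
    best = min(((negloglik(a1, b1), a1, b1)
                for a1 in grid_a for b1 in grid_b if a1 + b1 < 0.999),
               key=lambda t: t[0])
    return best[1], best[2]   # (alpha1_hat, beta1_hat)
```

A real application would use a proper numerical optimizer and estimate all parameters jointly; the grid merely makes the two-step idea concrete.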
For the derivation of the asymptotic results the following assumptions are required.
A1. Model (1) and (2) holds with i.i.d. N(0,1) η_i and strictly stationary ε_i such that E(ε_i⁴) < ∞. Furthermore, it is assumed that α_1 + … + α_r > 0.
A2. The function v(t) is strictly positive, bounded, and at least twice continuously differentiable on [0,1].
A3. The kernel K(u) is a symmetric density with compact support [−1,1].
A4. The bandwidth b satisfies b → 0 and nb → ∞ as n → ∞.
Assumptions A2–A4 are regularity conditions in nonparametric regression. A1 summarizes the conditions required on the GARCH model. For a GARCH(1,1) model, these conditions are stronger than those used by, e.g., Lee and Hansen (1994) and Lumsdaine (1996). In particular, the condition E(ε_i⁴) < ∞ implies α_1 + β_1 < 1, and hence E[ln(α_1 η_i² + β_1)] < 0, one of the conditions used by Lee and Hansen (1994) and Lumsdaine (1996). In this paper the innovations η_i are assumed to be i.i.d. N(0,1) random variables, as in, e.g., Bollerslev (1986) and Ling and Li (1997), for simplicity, which implies Assumption 2 in Lumsdaine (1996). If non-Gaussian innovations are considered, suitable moment conditions have to be used, which might depend on the orders of the GARCH model. For instance, for a GARCH(1,1) model, Lumsdaine (1996) introduces the moment condition E(η_i³²) < ∞ together with further regularity conditions on the distribution of η_i (Assumption 2 therein). Furthermore, it can be shown that, under A1, the other assumptions in Lee and Hansen (1994) hold. The additional assumption α_1 + … + α_r > 0 in A1 is introduced to avoid the trivial case with α_i = 0 for all i = 1,…,r.
Equation (4) is a nonparametric regression model with dependent and heteroskedastic errors. Pointwise results in nonparametric regression with dependent errors, as given in, e.g., Altman (1990) and Hart (1991), can be adapted to v̂ defined in (5) without any difficulty. Let γ_ξ(k) denote the autocovariance function of ξ_i. It is well known that the variance of v̂ depends on c_f = f(0), where f(λ) = (2π)^{−1} Σ_{k=−∞}^{∞} γ_ξ(k) e^{−ikλ} is the spectral density of ξ_i. Let r′ = max(r,s). Following equations (6) and (7) in Bollerslev (1986) and observing that ξ_i = ε_i² − 1, we have the ARMA(r′,s) representation of ξ_i:

ξ_i = Σ_{j=1}^{r′} α′_j ξ_{i−j} + u_i − Σ_{j=1}^{s} β_j u_{i−j},   (9)

where α′_j = α_j + β_j for j ≤ min(r,s), α′_j = α_j for j > s if r > s, and α′_j = β_j for j > r if s > r, and

u_i = ε_i² − h_i = h_i(η_i² − 1)   (10)

is a sequence of zero mean, uncorrelated random variables with independent η_i ∼ N(0,1). Equations (9) and (10) allow us to calculate c_f.
Define R(K) = ∫ K²(u) du and I(K) = ∫ u² K(u) du. At an interior point 0 < t < 1 the following results hold.
THEOREM 1. Under Assumptions A1–A4 we have the following results.

(i) The bias of v̂(t) is given by

E[v̂(t)] − v(t) = I(K) v″(t) b²/2 + o(b²).   (11)

(ii) The variance of v̂(t) is given by

var[v̂(t)] = (nb)^{−1} 2π c_f R(K) v²(t) + o[(nb)^{−1}].   (12)

(iii) Assume that nb⁵ → d² as n → ∞, for some d > 0; then

(nb)^{1/2} [v̂(t) − v(t) − b² D] →_d N(0, V(t)),   (13)

where D = I(K)v″(t)/2 and V(t) = 2πc_f R(K)v²(t).
The proof of Theorem 1 is given in the Appendix. The asymptotic bias of v̂ is the same as in nonparametric regression with i.i.d. errors. The asymptotic variance of v̂ is similar to that in nonparametric regression with short-range dependence; it depends, however, on the unknown underlying function v itself.
Let σ_u² = var(u_i). Under A1 the representation (9) and (10) yields

2π c_f = σ_u² (1 − Σ_{j=1}^{s} β_j)² / (1 − Σ_{j=1}^{r′} α′_j)².   (14)

If ε_i follows a GARCH(1,1) model, (14) reduces to an explicit expression in α_1, β_1, and E(ε_i⁴); see (15). The last equation in (15) is due to the standardization of ε_i. The proof of (14) and (15) is given in the Appendix.
The mean integrated squared error (MISE) defined on [Δ, 1 − Δ] will be used as a goodness-of-fit criterion, where Δ > 0 is used to avoid the boundary effect of v̂. Define I(g) = ∫_Δ^{1−Δ} g(t) dt for a function g on [0,1]. The following theorem holds.
THEOREM 2. Under the assumptions of Theorem 1 we have the following results.

(i) The MISE of v̂ is

MISE(b) = (b⁴/4) I²(K) I((v″)²) + (nb)^{−1} 2π c_f R(K) I(v²) + o[b⁴ + (nb)^{−1}].   (16)

(ii) Assume that I((v″)²) ≠ 0. The asymptotically optimal bandwidth for estimating v, which minimizes the dominant part of the MISE, is given by

b_A = C_v n^{−1/5}   (17)

with

C_v = {2π c_f R(K) I(v²) / [I²(K) I((v″)²)]}^{1/5}.   (18)

The proof of Theorem 2 is straightforward and is omitted. If a bandwidth b = O(b_A) = O(n^{−1/5}) is used, we have MISE = O(n^{−4/5}).
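The optimal-bandwidth calculation can be made concrete in a few lines. The sketch below is ours and assumes the standard MISE decomposition b⁴I²(K)I((v″)²)/4 + (nb)^{−1}2πc_f R(K)I(v²); the default kernel constants are R(K) = 3/5 and I(K) = 1/5 for the Epanechnikov kernel, and all argument names are hypothetical.

```python
import math

def plugin_bandwidth(n, cf, I_v2, I_vpp2, RK=0.6, IK=0.2):
    """Bandwidth minimizing the dominant MISE terms, assuming
    MISE(b) ~ b^4 IK^2 I_vpp2 / 4 + (n b)^-1 2*pi*cf*RK*I_v2.
    Setting the derivative in b to zero gives b_A = (C/n)^(1/5)."""
    C = (2.0 * math.pi * cf * RK * I_v2) / (IK ** 2 * I_vpp2)
    return (C / n) ** 0.2
```

The n^{−1/5} rate is visible directly: quadrupling the sample size shrinks the bandwidth by the factor 4^{1/5}.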
Asymptotic properties of θ̃ defined in Section 3 are investigated by Ling and Li (1997) under the general fractionally autoregressive integrated moving average–GARCH (FARIMA–GARCH) framework. More detailed asymptotic results in the special case of a GARCH(1,1) model may be found in Lee and Hansen (1994) and Lumsdaine (1996). Asymptotic properties of θ̂ will be studied by comparing its performance with that of θ̃, based on the results in Ling and Li (1997). At first we will introduce a general lemma. Let θ_0 = (θ_{10},…,θ_{m0})′ be the true value of an m-dimensional parameter vector θ, lying in the interior of the compact set Θ. Assume that there exists a consistent MLE θ̃ satisfying the equation ∂L(θ)/∂θ = 0, where L(θ) is a standard likelihood or log-likelihood function. Furthermore, assume that L(θ) is three times differentiable, L″(θ) converges in probability to a positive definite matrix, and all third-order partial derivatives of L(θ) have bounded expectations in Θ. Let L̂(θ) be a consistent estimate of L(θ). Then we have the following result.

LEMMA 1. Assume that ∂L̂(θ)/∂θ = ∂L(θ)/∂θ + O_p(ζ_n) for θ in a neighborhood of θ_0. Under the preceding regularity conditions on L(θ) there exists a consistent MLE θ̂ satisfying ∂L̂(θ)/∂θ = 0 at θ = θ̂ and

θ̂ − θ̃ = O_p(ζ_n/n).   (19)

The proof of Lemma 1 is straightforward and is omitted. Lemma 1 ensures the existence of an approximate MLE and provides a tool to quantify the distance between it and an infeasible MLE. Note that θ̃ is in general √n-consistent and asymptotically normal. Hence, θ̂ will have the same properties if ζ_n/n = o(n^{−1/2}).
Now, denote by θ_0 = (α_{00}, α_{10},…,α_{r0}, β_{10},…,β_{s0})′ the true value of the unknown parameter vector θ. Assumption A1 ensures that θ_0 is in the interior of a compact parameter set Θ. Let θ̂ be as defined in Section 3. Let Ω_θ = E[(2h_i²)^{−1} (∂h_i/∂θ)(∂h_i/∂θ)′] and let Ω_0, the value of Ω_θ at θ = θ_0, denote the information matrix. Then, following Lemma 1 and Theorems 3.1 and 3.2 in Ling and Li (1997), we have the following result.
THEOREM 3. Assume that A1–A4 hold. Then

√n (θ̂ − θ_0 − B_θ) →_d N(0, Ω_0^{−1}),   (20)

where B_θ = O(b²) + O[(nb)^{−1}].
We see that θ̂ is √n-consistent and asymptotically normal up to a bias term B_θ. The proof of Theorem 3 is given in the Appendix and shows that the O(b²) term in B_θ is due to the bias of v̂ and the O[(nb)^{−1}] term is due to the variance of v̂. If O(n^{−1/2}) < b < O(n^{−1/4}), B_θ is negligible, and we have √n (θ̂ − θ_0) →_d N(0, Ω_0^{−1}). Similar observations have been made in other semiparametric contexts, e.g., within the context of partially linear models, where for a certain choice of bandwidth the nonparametric part has no effect on the rate of convergence of the parametric estimator (see Härdle, Liang, and Gao, 2000). If v is estimated using b = O(b_A), then B_θ = O(n^{−2/5}). If the Y_i follow a GARCH model and b > O(n^{−1/2}), then θ̂ is √n-consistent and asymptotically normal, because now v̂ is unbiased.
A plug-in bandwidth selector may be developed by replacing the unknowns c_f, I(v²), and I((v″)²) in (18) with suitable estimators. At first, it is proposed to estimate c_f through (14), with the parameters replaced by their estimates and E(ε_i⁴) by a nonparametric estimator Ê(ε_i⁴), defined in (21) as a sample mean of the fourth powers of the standardized residuals computed with a bandwidth b_ε. Although explicit formulas for E(ε_i⁴) are known (for general results, see He and Teräsvirta, 1999a; Karanasos, 1999; for results in some special cases, see Bollerslev, 1986; He and Teräsvirta, 1999b), we prefer to use Ê(ε_i⁴) defined in (21), because the formulas for E(ε_i⁴) are in general too complex. For a GARCH(1,1) model, another simple estimator, ĉ′_f say, may be defined based on (15) by replacing α_0, α_1, and β_1 with their estimates. The two estimators ĉ_f and ĉ′_f perform quite similarly. Assume that the bandwidth b_ε used for estimating E(ε_i⁴) satisfies A4 but is not necessarily the same as b. Furthermore, make the following assumption.
A1′. The same as A1 but with E(ε_i⁸) < ∞.
Then the following proposition holds.
PROPOSITION 1. Under Assumptions A1′ and A2–A4 we have

E[Ê(ε_i⁴)] − E(ε_i⁴) = O(b_ε²) + O[(nb_ε)^{−1}]   (22)

and

var[Ê(ε_i⁴)] = 2π c_{fε} n^{−1} + o(n^{−1}),   (23)

where c_{fε} denotes the value of the spectral density of the process ε_i⁴ at the origin.
The proof of Proposition 1 is given in the Appendix.
Remark 1. Equations (22) and (23) show that Ê(ε_i⁴) is √n-consistent if O(n^{−1/2}) ≤ b_ε ≤ O(n^{−1/4}). The optimal bandwidth in a second-order sense, which balances the two terms on the right-hand side of (22), is of order O(n^{−1/3}). In this paper, we propose to use a bandwidth b_ε = O(n^{−1/4}) for estimating E(ε_i⁴) so that the estimator is more stable. Note that Ê(ε_i⁴) is no longer √n-consistent if a bandwidth b_ε = O(b_A) = O(n^{−1/5}) is used. The finally selected bandwidth is, however, not very sensitive to the bandwidth used for estimating E(ε_i⁴).
The integral I(v²) can be estimated by

Î(v²) = n^{−1} Σ_{i=n_1}^{n_2} v̂²(t_i),   (24)

where n_1 and n_2 denote the integer parts of nΔ and n(1 − Δ), respectively, and v̂ is the estimator defined in (5) but obtained with another bandwidth b_v, say, that satisfies A4. The following results hold for Î(v²).

PROPOSITION 2. Under the assumptions of Proposition 1 we have

E[Î(v²)] − I(v²) = O(b_v²) + O[(nb_v)^{−1}]   (25)

and

var[Î(v²)] = O(n^{−1}).   (26)
The proof of Proposition 2 is given in the Appendix.
Remark 2. Note that the dominating orders of the biases and variances of Ê(ε_i⁴) and Î(v²) are the same. Hence statements similar to those in Remark 1 apply to the results in (25) and (26). This is not surprising, because both v²(t_i) and ε_i⁴ are related to the fourth moment of the errors.
A well-known estimator of I((v″)²) is given by

Î((v″)²) = n^{−1} Σ_{i=n_1}^{n_2} [v̂″(t_i)]²   (27)

(see, e.g., Ruppert, Sheather, and Wand, 1995), where v̂″ is a kernel estimator of v″ using a fourth-order kernel K_2 for estimating the second derivative (see, e.g., Müller, 1988) and again another bandwidth b_d. Results corresponding to those in Proposition 2 hold for Î((v″)²), for which the following adapted assumptions are required.
A2′. The function v(t) is strictly positive on [0,1] and is at least four times continuously differentiable.
A3′. v″ is estimated with a symmetric fourth-order kernel for estimating the second derivative, with compact support [−1,1].
A4′. The bandwidth b_d satisfies b_d → 0 and n b_d⁵ → ∞ as n → ∞.
PROPOSITION 3. Under Assumptions A1′–A4′ we have

E[Î((v″)²)] − I((v″)²) = O(b_d²) + O[(n b_d⁵)^{−1}]   (28)

and

var[Î((v″)²)] = o{(E[Î((v″)²)] − I((v″)²))²}.   (29)
The proof of Proposition 3 is omitted because the result is well known in nonparametric regression (for results with i.i.d. errors, see, e.g., Ruppert et al., 1995; for results with dependent errors, see, e.g., Beran and Feng, 2002a, 2002b).
Remark 3. The MSE (mean squared error) of Î((v″)²) is dominated by the squared bias. The optimal bandwidth for estimating I((v″)²), which balances the two terms on the right-hand side of (28), is of order O(n^{−1/7}). With a bandwidth b_d = O(n^{−1/7}) we have Î((v″)²) − I((v″)²) = O_p(n^{−2/7}).
We see that for selecting the bandwidth b we have to choose at first three pilot bandwidths b_ε, b_v, and b_d. This problem will be solved using the iterative plug-in idea (Gasser, Kneip, and Köhler, 1991) with a so-called exponential inflation method (see Beran and Feng, 2002a, 2002b). Let b_{j−1} denote the bandwidth for estimating v in the (j − 1)th iteration. Then in the jth iteration, the bandwidths b_{ε,j} = b_{v,j} = b_{j−1}^{5/4} and b_{d,j} = b_{j−1}^{5/7} will be used for estimating E(ε_i⁴), I(v²), and I((v″)²), respectively. These inflation methods are chosen so that b_{ε,j} and b_{v,j} are both of order O_p(n^{−1/4}) and b_{d,j} is of the optimal order O_p(n^{−1/7}), when b_{j−1} is of the optimal order O_p(n^{−1/5}). In an iterative plug-in algorithm the unknown constants in the pilot bandwidths can simply be omitted. Furthermore, we also need to choose a starting bandwidth b_0. In the current context, b_0 should satisfy A4 because we have to estimate θ in the first iteration. Theoretically, a bandwidth b_0 = O(n^{−1/5}) is more preferable. Our experience shows that b_0 = 0.5n^{−1/5} is a good choice. Detailed discussions on this topic may be found in the next two sections, especially in Section 6.3.
The proposed data-driven algorithm proceeds as follows:

1. Start with the bandwidth b_0 = c_0 n^{−1/5} with, e.g., c_0 = 0.5.

2. In the jth iteration, estimate v with the bandwidth b_{j−1}, standardize the observations, and estimate θ from the standardized residuals; then estimate E(ε_i⁴), I(v²), and I((v″)²) with the inflated bandwidths b_{ε,j} = b_{v,j} = b_{j−1}^{5/4} and b_{d,j} = b_{j−1}^{5/7}, and compute b_j by plugging these estimates into (18).

3. Increase j by one and repeatedly carry out step 2 until convergence is reached or until a given maximal number of iterations has been completed. Put b̂ = b_j.
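The iteration above can be sketched structurally as follows. This is our own skeleton, not the SEMIGARCH program: the `bandwidth_from` callback stands in for the whole plug-in step (re-estimating the unknowns in (18) with the inflated pilot bandwidths), and the bandwidth restriction and stopping rule follow the description in the text.

```python
def iterative_plugin(n, bandwidth_from, c0=0.5, max_iter=20):
    """Iterative plug-in bandwidth selection with exponential inflation.
    bandwidth_from(b_eps, b_v, b_d) is a user-supplied stand-in for the
    plug-in formula; it returns the next bandwidth b_j."""
    b_prev = c0 * n ** (-0.2)                 # starting bandwidth b0 = c0 * n^(-1/5)
    for _ in range(max_iter):
        b_eps = b_v = b_prev ** 1.25          # inflated pilots: b_{j-1}^(5/4)
        b_d = b_prev ** (5.0 / 7.0)           # and b_{j-1}^(5/7)
        b_next = bandwidth_from(b_eps, b_v, b_d)
        b_next = min(max(b_next, 1.0 / n), 0.5 - 1.0 / n)  # practical restriction
        if abs(b_next - b_prev) < 1.0 / n:    # convergence criterion |b_j - b_{j-1}| < 1/n
            return b_next
        b_prev = b_next
    return b_prev
```

With a callback that returns a fixed value of order n^{−1/5}, the loop converges after two passes; in the real procedure the callback depends on the data through the pilot estimates.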
The condition |b_j − b_{j−1}| < 1/n is used as the convergence criterion for b̂, because such a difference is negligible. The maximal number of iterations is set to 20. In this algorithm, θ is estimated using the bandwidth b_{j−1}, as for v̂, because we do not have a proper bandwidth selector for estimating θ. The asymptotic performance of b̂ is quantified by the following theorem.
THEOREM 4. Assume that A3 and A1′–A3′ hold and that I((v″)²) ≠ 0. Then we have

(b̂ − b_A)/b_A = O_p(n^{−2/7}) + O(n^{−2/5}).   (31)

The proof of Theorem 4 is given in the Appendix. Note that A4 and A4′ are automatically satisfied. The second, O(n^{−2/5}), term on the right-hand side of (31) is due to the error in ĉ_f caused by the bias in θ̂, which is indeed negligible compared with the first term.
The proposed algorithm is coded in an S-Plus function called SEMIGARCH. A practical restriction 1/n ≤ b ≤ 0.5 − 1/n is used in the program for simplicity. Four commonly used kernels, namely, the uniform, the Epanechnikov, the bisquare, and the triweight kernels (see, e.g., Müller, 1988), are built into the program. As a standard version we propose the use of the Epanechnikov kernel with Δ = 0.05 and c0 = 0.5, which will be used in the next two sections.
Remark 4. Note that b_A is not well defined if I((v″)²) = 0, implying v″(t) ≡ 0. However, the SEMIGARCH model still applies in this case. In particular, the proposed algorithm does work if the Y_i follow a GARCH model. Now it can be shown that, theoretically, b_j → O_p(1) as j → ∞. Following the discussion after Theorem 3, θ̂ then has the same asymptotic properties as under a GARCH model, because B_θ = o(n^{−1/2}) now. And v̂ is √n-consistent, with some loss in efficiency compared with a parametric estimator, provided that no maximal number of iterations is imposed, because (nb_j)^{−1} → O_p(n^{−1}) now.
To show the practical performance of our proposal, a simulation study was carried out. In the simulation study, the ε_i were generated using the simulate.garch function in S+GARCH following one of the two GARCH(1,1) models:

Model 1 (M1). ε_i = η_i h_i^{1/2}, h_i = 0.6 + 0.2ε_{i−1}² + 0.2h_{i−1}, and

Model 2 (M2). ε_i = η_i h_i^{1/2}, h_i = 0.15 + 0.1ε_{i−1}² + 0.75h_{i−1}.
The y_i are generated following model (1) with μ ≡ 0 and one of the three scale functions v_1(t), v_2(t), and v_3(t). The functions v_1(t) and v_2(t) are quite similar; they are designed following the estimated scale function in the daily DAX 100 returns. The scale change with v_2 is stronger than that with v_1, and it is strongest with v_3; to this end see the bandwidths required for estimating them given in Table 2. The scale function σ_2(t) may be found in Figure 2b, which follows. To confirm the statements in Remark 4, a constant scale function v_0(t) = σ_0²(t) ≡ 16 is also used. The simulation was carried out for three sample sizes, n = 1,000, 2,000, and 4,000. For each case 400 replications were done. For each replication, three GARCH(1,1) models were fitted: to the ε_i, to the data-driven standardized residuals ε̂_i, and to the y_i. The resulting estimators of α_1 and β_1 are denoted by α̂_1 and β̂_1, respectively. For v_0 we have σ_0(t) ≡ 4. Here, the estimates obtained from the ε_i are used as a benchmark. Note in particular that the estimated parameters may sometimes be negative when using S+GARCH.
To summarize the performance of the estimates based on ε̂_i and on y_i, and to compare them with those based on ε_i, the empirical efficiency (EFF) of an estimator with respect to the corresponding one estimated from the ε_i is calculated; for instance, the EFF of α̂_1 is the ratio of the empirical MSE of the benchmark estimate to that of α̂_1, in percent. These results are listed in Table 1. The difference between two related EFFs in a given case may be thought of as the gain from using the SEMIGARCH model. Table 1 shows that the EFFs of the estimates based on ε̂_i seem to tend to 100%, whereas those of the estimates based on y_i seem to tend to zero, as n → ∞. Hence, the gains seem to tend to 100% as n → ∞. The EFFs of the estimates based on ε̂_i under M2 are relatively low. In particular, for n = 1,000, the EFFs based on ε̂_i in the two cases of M2 with v_1 and v_3 are even smaller than those based on y_i; i.e., the gain in these cases is slightly negative. This shows that n = 1,000 is sometimes not large enough for estimating the scale function when β_1 is large.

Empirical efficiencies (%) of the estimated parameters
Box plots of the 400 replications of the estimated parameters for n = 1,000 are shown in Figures 1a–1f, where the symbols E1, E2, and E3 denote estimators obtained from ε_i, ε̂_i, and y_i, respectively. Those for n = 2,000 and n = 4,000 are omitted to save space. The simulation results show that the estimates based on ε̂_i perform in general quite well. One clear problem arises under M2 with n = 1,000: both the variance and the bias of these estimates are strongly affected by some extremely small estimates (see Figures 1m–1p). This is due to the nonrobustness of the bandwidth selection. Hence, it is worthwhile to develop a robust procedure to improve the poor performance for small n. The quality of the estimates based on ε̂_i clearly improves as n increases; in particular, the estimation becomes more and more stable. Detailed statistics (in the first version of this paper) show that the standard deviations of these estimates seem to converge at the same rate as those of the benchmark estimates from ε_i, but their biases converge a little more slowly. This confirms the results of Theorem 3. The simulation results also show clearly that, in the case with scale change, the estimates based on y_i are inconsistent as a result of their biases. The situation becomes worse as n increases. In particular, we can see that α̂_1 + β̂_1 estimated from y_i will tend to one as n → ∞, no matter how large β_{10} is. However, if there is no scale change, the estimators based on y_i should of course be used. It is hence helpful to test whether or not the estimated scale function is significant. For the data examples given in the next section it is proposed to carry out such a test based on simulation.
Box plots of the estimates obtained from ε_i (E1), ε̂_i (E2), and y_i (E3), respectively, with n = 1,000, where the horizontal lines show the true values.
Now let us consider the quality of the selected bandwidth b̂. The sample means, standard deviations, and square roots of the MSEs of b̂, together with the true asymptotically optimal bandwidths b_A, are given in Table 2. Note that b_A and the MSE in the cases with v_0 are not defined. Kernel density estimates of b̂ (omitted to save space) show that the performance of the bandwidth selector is satisfactory. In all cases the variance of b̂ decreases as n increases; the same is true of the bias in most of the cases. Both the variance and the bias of b̂ depend on the scale function and on the model of the errors. For two related cases, the variance of b̂ under M1 is smaller than that under M2. Generally, the stronger the scale change, the larger the variance of b̂. The bias of b̂ under v_1 is always negative, and it is always positive under v_3. The bandwidth for v_2 is easiest to choose. The choice of the bandwidth under v_3 is in general easier than that under v_1, except for the case of M2 with n = 1,000. In this case, the detailed structure of v_3 may sometimes be smoothed away because of the large variation caused by the GARCH model. This shows again that n = 1,000 is sometimes not large enough for distinguishing the CH from the scale change.

Statistics on the selected bandwidth
Remark 5. As suggested by a referee, the performance of the proposed procedure in cases with a highly persistent GARCH effect is investigated through an additional simulation under a third model, model 3 (M3), with α_1 = 0.07 and β_1 = 0.87 and without trend. As expected, the proposed procedure does not work well for n = 1,000, because the variance of α̂_1 and in particular that of β̂_1 are too large as a result of some extreme estimates. This shows again that a robust estimation procedure should be developed. For n ≥ 2,000, the procedure works well. The empirical efficiencies are a little lower than those for M2. Detailed results of this additional simulation are omitted to save space.
Remark 6. In this paper, the bandwidth is selected by minimizing the dominant part of the MISE of v̂. In a semiparametric context, the performance of the bandwidth selection and of the resulting parameter estimation may be improved if a plug-in algorithm is developed that takes the MSE of θ̂ into account. For this purpose a more detailed formula for the MSE of θ̂ is required, and one has to develop a suitable procedure to estimate this MSE. This is still an important open question and will be discussed elsewhere.
In the following discussion, two simulated data sets are selected to show some details. The first example (called Sim 1) is a typical example of the replications under M2 with the scale function σ_2(t) and n = 2,000. The observations y_i, i = 1,…,2,000, are shown in Figure 2a. For Sim 1 the same bandwidth b̂ is selected starting from any bandwidth 3/n ≤ b_0 ≤ 0.5 − 1/n; i.e., b̂ does not depend on b_0 if b_0 is not too small. The functions σ_2(t) (solid line) and σ̂(t) (dashed line) are shown in Figure 2b. Figure 2c shows the standardized residuals ε̂_i, which look stationary. The estimated GARCH(1,1) models are
Estimation results for the first simulated data set.
given in (32) for y_i and in (33) for ε̂_i. For model (32) the condition for the existence of the fourth moment is violated, so that the fourth moment of this model does not exist. In contrast, model (33) has finite moments up to at least the twelfth order, as for the underlying GARCH model. The estimated SEMIGARCH conditional and total standard deviations, i.e., ĥ_i^{1/2} and σ̂(t_i) ĥ_i^{1/2}, are shown in Figures 2d and 2e. The true conditional and total standard deviations of y_i, i.e., h_i^{1/2} and σ_2(t_i) h_i^{1/2}, are shown in Figures 2f and 2g. Figure 2h shows the estimated GARCH conditional (in this case also total) standard deviations (h_i^y)^{1/2}. The analysis of Sim 1 shows the following results.
(1) If a standard GARCH model is used, the scale change will be wrongly estimated as a part of the CH. Furthermore, the total variance tends to be overestimated when it is large and underestimated when it is small (compare Figures 2g and 2h). This phenomenon is mainly due to the overestimation of α̂_1 + β̂_1 and will be called the (volatility) inflation effect of the GARCH model in the presence of scale change.
(2) Following the SEMIGARCH model, both the conditional heteroskedasticity and the scale change are well estimated. The estimated SEMIGARCH total variances are quite close to the true values and are more stable and accurate than those following the standard GARCH model (compare Figures 2e and 2h). The errors in
are caused by the errors in these two estimates, and both can clearly be reduced if denser observations are available, e.g., by analyzing high-frequency financial data. The MSE of the estimated total variances is 0.687 for the SEMIGARCH model and 4.979 for the standard GARCH model; the latter is about seven times as large as the former.
Furthermore, (hiy)1/2 shown in Figure 2h (see also Figure 3f) exhibit a clear signal of covariance nonstationarity, a property not shared by the true and the estimated SEMIGARCH conditional standard deviations.
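The mechanism behind Sim 1 and the inflation effect can be sketched in a few lines. The scale function and GARCH(1,1) parameters below are illustrative stand-ins, not the M2 settings used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def garch11(n, a0, a1, b1, rng):
    """Simulate a GARCH(1,1) process eps_i together with its conditional variance h_i."""
    h, eps = np.empty(n), np.empty(n)
    h[0] = a0 / (1.0 - a1 - b1)              # start at the unconditional variance
    eps[0] = np.sqrt(h[0]) * rng.standard_normal()
    for i in range(1, n):
        h[i] = a0 + a1 * eps[i - 1] ** 2 + b1 * h[i - 1]
        eps[i] = np.sqrt(h[i]) * rng.standard_normal()
    return eps, h

n = 2000
t = (np.arange(n) + 0.5) / n                 # rescaled time t_i in (0, 1)
scale = 0.5 + np.sin(np.pi * t)              # illustrative smooth scale function
eps, h = garch11(n, a0=0.05, a1=0.10, b1=0.85, rng=rng)

y = scale * eps                              # SEMIGARCH-type observations
total_sd = scale * np.sqrt(h)                # true total standard deviations
```

Fitting a stationary GARCH model directly to y would absorb the deterministic scale change into the estimated CH, which is the source of the inflation effect described above.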
The estimation results for the S&P 500 returns.
The second simulated data set (called Sim 2) is one of the replications under M1 with v3 and n = 1,000, chosen to show that the selected bandwidth can sometimes be wrong if b0 is too small or too large. That is, a moderate b0 should be used, as proposed in Section 5. For this data set we have
if b0 < 0.020. On the other hand, we have
, the largest allowed bandwidth in the program, if b0 > 0.262. For any starting bandwidth b0 ∈ [0.021, 0.262] a bandwidth
will be selected. Now,
does not depend on b0. Note that the proposed default starting bandwidth b0 = 0.5n−1/5 = 0.126 lies in the middle of the interval [0.021, 0.262]. If it is doubtful whether the bandwidth selected with b0 = 0.5n−1/5 is optimal, we recommend trying several different b0's and choosing the most reasonable
from all possible selected bandwidths by means of further analysis (see Feng, 2002).
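The behavior described above is characteristic of data-driven selectors that iterate a bandwidth-updating map to a fixed point: moderate starting values converge to the same b̂, while extreme b0's can be attracted elsewhere. A minimal abstract sketch, where the update map T is a hypothetical stand-in and not the actual selector of Section 5:

```python
def select_bandwidth(T, b0, tol=1e-8, max_iter=100):
    """Iterate b_{j+1} = T(b_j) from the starting bandwidth b0 until convergence."""
    b = b0
    for _ in range(max_iter):
        b_new = T(b)
        if abs(b_new - b) < tol:
            return b_new
        b = b_new
    return b

# hypothetical contraction with fixed point 0.2: moderate b0's all yield the same b-hat
T = lambda b: 0.6 * b + 0.4 * 0.2
b_hat_a = select_bandwidth(T, b0=0.126)      # on the scale of the default 0.5 * n**(-1/5)
b_hat_b = select_bandwidth(T, b0=0.250)
```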
In this section the proposal will be applied to the log-returns of the daily S&P 500 and DAX 100 financial indexes from January 3, 1994, to August 23, 2000. For the S&P 500 returns shown in Figure 3a we have
(for any b0 ≥ 0.075). The fitted GARCH models are
for yi and
for
. As before, for model (34) we have
so that the fourth moment of this model does not exist. Model (35) has finite moments up to at least the twelfth order. To test whether the estimated trend is significantly nonconstant, 400 replications were generated following model (35) with the corresponding sample variance and without trend. The scale function was then estimated with the bandwidth b = 0.183 from each replication. Symmetric Monte Carlo confidence bounds that covered 95% or 99% of all estimated trends were calculated and are shown in Figure 3b together with the sample standard deviation (0.0099) and the estimated scale function
. We see that there is significant scale change in this data set. Furthermore, both
in model (35) are strongly significant. That is, this series has simultaneously significant scale change and CH. Figures 3c–3f show
, the SEMIGARCH conditional standard deviations
, the SEMIGARCH total standard deviations
, and the GARCH conditional standard deviations (hiy)1/2. Comparing Figures 3e and 3f we see again that the estimated total variances following the SEMIGARCH model are more stable and those following the GARCH model are inflated.
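The Monte Carlo bound construction used above can be sketched as follows. Everything here is illustrative: i.i.d. normal errors replace replications from model (35), a simple Epanechnikov kernel estimator of the scale stands in for the estimator of Section 3, and pointwise quantile bounds replace the symmetric bounds of the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def kernel_scale(y, b):
    """Kernel estimate of the scale function from the squared observations."""
    n = len(y)
    ti = (np.arange(n) + 0.5) / n
    v = np.empty(n)
    for j in range(n):
        u = (ti - ti[j]) / b
        k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)   # Epanechnikov
        v[j] = np.sum(k * y ** 2) / np.sum(k)
    return np.sqrt(v)

n, B, b = 300, 100, 0.183
sd0 = 1.0                                     # sample standard deviation under H0
est = np.array([kernel_scale(sd0 * rng.standard_normal(n), b) for _ in range(B)])

# pointwise bounds covering 95% of the replicated scale estimates under H0
lo = np.quantile(est, 0.025, axis=0)
hi = np.quantile(est, 0.975, axis=0)
```

A scale estimate from the data that leaves such bounds over a substantial region indicates significant scale change.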
For the DAX 100 returns we have
(for any b0 ≥ 0.075). The fitted GARCH models are
for yi and
for
. The condition for the existence of the fourth moment of model (36) is only marginally satisfied, and the eighth moment of this model does not exist. Again, model (37) has finite moments up to at least the twelfth order. The S&P 500 and DAX 100 return series behave quite similarly, and the conclusions given previously for the former also apply to the latter.
Now, we will compare the performance of the GARCH and SEMIGARCH by predicting future volatility. The GARCH unconditional variance,
, say, is calculated following (34) or (36). For the SEMIGARCH,
is used as the unconditional variance in the near future. The predicted (expected) conditional standard deviations
following the GARCH and
following the SEMIGARCH, k = 1,2,…,100, for the S&P 500 and DAX 100 returns are shown in Figure 4 together with
. Note that the conditional standard deviations of both series at the right end are lower than
. Consequently,
increase for both series. The
look quite reasonable and converge to
quickly. However,
in both cases seem to be underestimated, because of the inflation effect mentioned previously. Furthermore,
converge very slowly to some wrongly estimated limits. The sample standard deviation for the S&P 500 returns is 0.0099. Following (34) we have
, which is clearly overestimated as a result of the instability of this model. For the DAX 100 returns,
is about equal to its sample value, which is, however, clearly lower than the locally unconditional standard deviation at t = 1. There are two problems if the fitted parametric GARCH models from these data sets are used for predicting future volatility: (1) the unconditional variance at the current end was wrongly estimated; and (2) the predicted conditional variance converges very slowly, because these models only have finite moments of low orders. Both of these problems were overcome by applying the SEMIGARCH model.
Predicted standard deviations (middle dashes) and (solid line) together with their limits following the GARCH (short dashes) and following the SEMIGARCH (long dashes) for (a) the S&P 500 and (b) the DAX 100 returns.
The SEMIGARCH model introduced in this paper, which decomposes the volatility into a smooth scale function of the location and a CH component depending on the past information, provides a useful tool for estimating financial volatility in cases where the stationarity assumption of a GARCH model is likely to break down. A data-driven algorithm is developed for practical implementation. Simulation and data examples show that the proposal works well in practice. There are some other recent proposals dealing with similar problems, e.g., the parametric GARCH model with change points (Mikosch and Stărică, 2004) for modeling structural breaks in the unconditional variance, which cannot be used for modeling slowly changing unconditional variance. On the other hand, structural breaks in the unconditional variance cannot be modeled by the SEMIGARCH, so it is worthwhile to combine the two approaches. Another related work is Mercurio and Spokoiny (2002), where the volatility is assumed to be constant on some unknown time intervals. In that approach scale change and CH are modeled together but not separately.
Under models (1) and (2), v(t) is integrable. This implies that y is
-consistent. Hence, in the following discussion,
can be replaced by zi and xi respectively.
Proof of Theorem 1.
(i) The bias. Note that
is a linear smoother
where wi are the weights defined by (5). The bias of
, which is just the same as in nonparametric regression with i.i.d. errors. That is, the bias depends neither on the dependence structure nor on the heteroskedasticity of the errors. This leads to the result given in (11).
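The argument uses only that the estimator is linear with weights summing to one, so that the bias term involves v alone. A minimal Nadaraya-Watson illustration with an Epanechnikov kernel; the actual weights in (5) may differ:

```python
import numpy as np

def nw_weights(t, ti, b):
    """Nadaraya-Watson weights w_i at the point t with bandwidth b."""
    u = (ti - t) / b
    k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)   # Epanechnikov kernel
    return k / np.sum(k)

n = 500
ti = (np.arange(n) + 0.5) / n
w = nw_weights(0.5, ti, b=0.1)

vhat = np.sum(w * np.sin(ti))     # the linear smoother applied to illustrative data
```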
(ii) The variance. Let ζi = v(ti)ξi denote the errors in (4). Note that wi = 0 for |ti − t| > b.
For |ti − t| ≤ b and |tj − t| ≤ b we have ζi = [v(t) + O(b)]ξi and ζj = [v(t) + O(b)]ξj. This leads to
Inserting this into (A.2), we obtain
Results in (12) follow from known results on ∑i ∑j wi wj γξ(i − j) in nonparametric regression with dependent errors (see, e.g., Beran, 1999; Beran and Feng, 2002a).
(iii) Asymptotic normality. Consider the estimation problem under the model without scale change:
Define
where
are observations obtained following model (A.5). Following the results in (i) and (ii) we see
. Hence
is asymptotically normal if and only if
is. Furthermore, following Theorem 4 in Beran and Feng (2001) it can be shown that the kernel estimator
is asymptotically normal if and only if the sample mean of the squared GARCH process εi2 or equivalently the sample variance of εi is asymptotically normal. Basrak, Davis, and Mikosch (2002) show that the squared GARCH process εi2 is strongly mixing with geometric rate. The condition E(εi4) < ∞ implies that there is a δ > 0 such that E|εi2|2+δ < ∞. The conditions of Theorem 18.5.3 in Ibragimov and Linnik (1971) hold. This shows that n−1[sum ]εi2 of a GARCH process with finite fourth moment is asymptotically normal. Theorem 1 is proved. █
Proof of (14) and (15). Note that ξi has the autoregressive moving average (ARMA) representation
where φ(z) and ψ(z) are as defined before. Under A5 φ(z) and ψ(z) have no common roots. Under A1 all roots of φ(z) and ψ(z) lie outside the unit circle. Then the spectral density of ξ is given by
Note that E(εi4) = 3E(hi2) (Bollerslev, 1986) and var(ui) = E(ui2) = 2E(hi2). The last equation follows from (10). That is, var(ui) = (2/3)E(εi4). The result in (14) is proved by inserting this formula, ψ(1), and φ(1) into (A.8). The result in (15) is obtained by further inserting the explicit formula of E(εi4) for a GARCH(1,1) model (Bollerslev, 1986) into (14). █
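For convenience, the explicit fourth-moment formula for a GARCH(1,1) process referred to in the last step (Bollerslev, 1986) reads, with α0, α1, β1 denoting the GARCH(1,1) parameters:

```latex
E(\varepsilon_i^4) = \frac{3\alpha_0^2\,(1+\alpha_1+\beta_1)}
{(1-\alpha_1-\beta_1)\,\bigl(1-\beta_1^2-2\alpha_1\beta_1-3\alpha_1^2\bigr)},
\qquad \text{provided } 3\alpha_1^2+2\alpha_1\beta_1+\beta_1^2<1 .
```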
The following analysis involves the infinite past history of
. The presample values of
will be assumed to be zero. The presample values of εi2 and hi(ε;θ) (resp.
are chosen to be
(resp.
. For simplicity, it is also assumed that
(and hence
are of the same order of magnitude, if i and j are not far from each other. This is true if ti and tj are both in the interior or both in the boundary area. The preceding simplifications do not affect the asymptotic properties of
.
Consistency and asymptotic normality of
defined in Section 3 are a part of the results of Theorem 3.2 in Ling and Li (1997). Theorems 3.1 and 3.2 therein together show that conditions of Lemma 1 are fulfilled for the log-likelihood function L(θ). In the following discussion, we will investigate the difference between
caused by replacing the unobservable εi with
. Two lemmas are introduced first.
LEMMA A.1. Under the assumptions of Theorem 3 we have
Proof of Lemma A.1. For any trial value θ = (α0,α1,…,αr,β1,…,βs)′ ∈ Θ, one can rewrite hi(ε;θ) as
and
as
This leads to
where the coefficients aj, obtained by matching the powers of B, decay exponentially. █
LEMMA A.2. Under the assumptions of Theorem 3 we have, ∀θ ∈ Θ, the first element of
is zero and the other elements of it are all of the order
.
Proof of Lemma A.2. Following (21) in Bollerslev (1986) we have
where ζi = (1,εi−12,…,εi−r2,hi−1(ε;θ),…,hi−s(ε;θ))′. Analogously, we have
where
. Denote by
,
we have
This leads to
Again, cj decay exponentially. Observe that the first element of
is zero. Results of Lemma A.2 follow from (A.13) and Lemma A.1. █
Proof of Theorem 3.
(i) Under the conditions of Theorem 3, we have
. Following Lemmas A.1 and A.2,
. Following Lemma 1 there exists a consistent approximate MLE
satisfying the equation
such that
(ii) Note that
(see Ling and Li, 1997). Results given in this part hold if we can show
. Because
, we have to show that
, or equivalently
, is a matrix of the order o(n−1).
Note that
By means of Taylor expansion and using the results of Lemmas A.1 and A.2 we have
where Op denotes the order of magnitude of a random vector and
Furthermore, note that
Inserting these results into (A.15), we obtain
where the random vector
Observe that
. We have that each element of T is of the order
Hence, the variance of each element of T is of the order
and thus
is consistent. This shows that all entries of
are of the order o(n−1).
(iii) Now, we will calculate the order of magnitude of
. Observe that
at any point and
in the interior. We have, at an interior point ti,
Furthermore, note that
at the boundary and that the length of the boundary area is equal to 2b. This shows that the expected value of each element of T is of the order O[b2 + (nb)−1] and hence
Theorem 3 is proved. █
A sketched proof of Proposition 1. Taylor expansion of
leads to
We have
Furthermore, we have E(T1) = O(bε2) and T2 ≐ MISE[0,1] = O[(nbε)−1] + o(T1), where MISE[0,1] denotes the MISE on [0,1]. The results given in (22) are proved.
Observe that
. We have
Note that εi4 follows the square of an ARMA process, which is again a second-order stationary process with absolutely summable autocovariances under the assumption E(εi8) < ∞. Hence the spectral density of εi4 exists and
where cfε is the value of the spectral density of εi4 at the origin (see, e.g., Brockwell and Davis, 1991, pp. 218ff). Proposition 1 is proved. █
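The quantity 2πcfε is simply the long-run variance ∑k γ(k) of the εi4 series. A minimal truncated estimator of such a long-run variance can be sketched as follows; an i.i.d. squared-normal series serves as a stand-in for εi4, and the truncation lag is illustrative:

```python
import numpy as np

def long_run_variance(x, max_lag):
    """Truncated estimate of sum_k gamma(k) = 2*pi*f(0) for a stationary series x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    gamma = np.array([np.dot(xc[: n - k], xc[k:]) / n for k in range(max_lag + 1)])
    return gamma[0] + 2.0 * np.sum(gamma[1:])

rng = np.random.default_rng(2)
x = rng.standard_normal(5000) ** 2            # i.i.d. stand-in for the eps_i^4 series
lrv = long_run_variance(x, max_lag=20)        # for i.i.d. data this targets var(x) = 2
```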
A sketched proof of Proposition 2. Estimation of functionals ∫{v(ν)(t)}2 dt, where v(ν) is the νth derivative of v, was investigated by Ruppert et al. (1995) and Beran and Feng (2002b) in nonparametric regression with independent and dependent errors, respectively. Note that I(v2) = ∫{v2(t)}2 dt is a special case of such functionals with ν = 0. Furthermore, the results in Ruppert et al. (1995) and Beran and Feng (2002b) together show that the orders of magnitude in these results stay unchanged if short-range dependence and/or a bounded, smooth scale function are introduced into the error process. We obtain the results of Proposition 2 by setting k = 0, l = 2, and δ = 0 in the results in Beran and Feng (2002b), where k and l correspond to ν = 0 and the kernel order used here and δ is the long-memory parameter, which is zero in the current context. █
A sketched proof of Theorem 4. Note that
, where CA is as defined in (18). Hence we have
Taylor expansion shows that
Observe that
The term
is of a much smaller order than that given in (A.25) and hence is omitted. As a result of the bias in
one has
The results as given in Theorem 4 hold. █
Empirical efficiencies (%) of the estimated parameters
Box plots of (E3), respectively, with n = 1,000, where the horizontal lines show the true values.
Statistics on the selected bandwidth