
STATIONARITY TESTS UNDER TIME-VARYING SECOND MOMENTS

Published online by Cambridge University Press:  23 September 2005

Giuseppe Cavaliere
Affiliation:
University of Bologna
A.M. Robert Taylor
Affiliation:
University of Birmingham

Abstract

In this paper we analyze the effects of a very general class of time-varying variances on well-known “stationarity” tests of the I(0) null hypothesis. Our setup allows, among other things, for both single and multiple breaks in variance, smooth transition variance breaks, and (piecewise-) linear trending variances. We derive representations for the limiting distributions of the test statistics under variance breaks in the errors of I(0), I(1), and near-I(1) data generating processes, demonstrating the dependence of these representations on the precise pattern followed by the variance processes. Monte Carlo methods are used to quantify the effects of fixed and smooth transition single breaks and trending variances on the size and power properties of the tests. Finally, bootstrap versions of the tests are proposed that provide a solution to the inference problem. We are grateful to Peter Phillips, a co-editor, and two anonymous referees whose comments on an earlier draft have led to a considerable improvement in the paper.

Type
Research Article
Copyright
© 2005 Cambridge University Press

1. INTRODUCTION

Applied researchers have recently focused attention on the question of whether or not the variability in the shocks driving macroeconomic time series has changed over time (see, e.g., the literature review in Busetti and Taylor, 2003). The empirical evidence suggests that a decline in volatility over the past 20 years or so is a common phenomenon in many real and price variables. These findings have helped stimulate interest among econometricians in analyzing the effects of innovation variance shifts on unit root and stationarity tests. Among others, Hamori and Tokihisa (1997) and Kim, Leybourne, and Newbold (2002) have derived the implications of a single permanent variance shift in the innovations of an I(1) process on the size properties of Dickey–Fuller tests. The effect of a single variance shift on the stationarity test (KPSS test) of Kwiatkowski, Phillips, Schmidt, and Shin (1992) has been analyzed independently by Busetti and Taylor (2003) and Cavaliere (2004a), who found that the test can suffer severe size distortions when there is a late (early) positive (negative) variance shift under the null.

We analyze the effects that a very general class of permanent variance breaks has on the behavior of the KPSS stationarity test, together with those of Lo (1991) and Xiao (2001); a brief review of these tests is given in Section 3. Our unobserved components model, introduced in Section 2, generalizes that considered in, inter alia, Kwiatkowski et al. (1992) to allow for innovation processes whose variances evolve over time according to a quite general mechanism that allows, e.g., single and multiple breaks, smooth transition breaks, and trending variances. Variance nonconstancy is allowed in both the irregular component and the errors driving the level of the process. In Sections 4 and 5 we analyze the effects of time-varying variances on the large-sample behavior of these statistics under both the I(0) null and global I(1) and local alternatives. In Section 6 these effects are quantified, using Monte Carlo simulation, for the aforementioned examples.

Related but different work was carried out by Hansen (2000), who shows that the Lagrange multiplier (LM) test of Nyblom (1989) for structural change in the parameters of a linear regression model (which contains the KPSS test as a special case) underrejects the I(0) null when the marginal distribution of the regressors changes over time. Conversely, in this paper we show that where the variance of the errors changes over time the picture is quite different, with the KPSS (and other stationarity) tests both under- and overrejecting the null, but with a more pronounced tendency toward overrejecting. Similarly, whereas Hansen (2000) shows that Nyblom's test loses (size-unadjusted) power under structural changes in the marginal distribution of the regressors, for most of the cases we consider the KPSS test gains power when the errors are heteroskedastic. In Section 7 we adapt the heteroskedastic bootstrap of Hansen (2000) to the present problem and show that the bootstrap tests perform well in practice. Section 8 concludes. Sketch proofs are given in an Appendix; detailed proofs appear in Cavaliere and Taylor (2004).

We use

to denote weak convergence as the sample size diverges, the indicator function, and the space of càdlàg processes on [0,1] endowed with the Skorohod metric, respectively, whereas x := y means that x is defined by y. Finally, as in Phillips and Sun (2001), for two processes X and Y on [0,1] we define the projections

and

, where (1) denotes the first derivative.

2. THE UNOBSERVED COMPONENTS MODEL

Consider the unobservable components (UC) data generating process (DGP)

under the following set of assumptions (which are taken to hold throughout the paper, except where stated otherwise).

Assumption

. The term {σt} satisfies σ[sT] := ω(s), where

is a nonstochastic function with a finite number of points of discontinuity; moreover, ω(·) > 0 and satisfies a (uniform) first-order Lipschitz condition except at the points of discontinuity. Similarly, except where otherwise stated, ση[sT] := ωη(s), where ωη(·) satisfies the same conditions as ω(·).

Assumption

. The irregular component {εt} is a zero-mean, unit variance, strictly stationary mixing process with E|εt|p < ∞ for some p > 2 and with mixing coefficients {αm} satisfying

for some r ∈ (2,4], r ≤ p. The long-run variance

is strictly positive and finite. Furthermore, {εt} is independent of {ηt} at all leads and lags. As is standard, we refer to {εt} as an I(0) process.

Assumption

. The component xt is a p × 1 deterministic vector satisfying the condition that there exist a scaling matrix δT and a bounded piecewise-continuous function F(·) on [0,1] such that δT x[sT] → F(s) uniformly on [0,1], with

positive definite.

From (1) and (2), observe that under Assumption

, the variance of both the irregular component, ut := σtεt, and the shocks to the level process {μt} are heteroskedastic. Consequently, {ut} is I(0) provided {σt} is constant, whereas {μt} reduces to a standard random walk if {σηt} is constant and vanishes from (1) when σηt = 0, all t. Notice that the model considered here generalizes the UC model discussed in Kwiatkowski et al. (1992) by allowing both {σt} and {σηt} to be potentially nonconstant over time.

1. Busetti and Taylor (2003) consider the model discussed here under the constraint that σt = σηt. In our framework we do not require this constraint to hold.
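The structure of the DGP (1)–(2) can be sketched in simulation form. The following is a minimal sketch assuming the reading described above (yt = xt′β + μt + ut with ut = σtεt, μt a driftless random walk with shocks σηtηt, and σt = ω(t/T), σηt = ωη(t/T) as in the Assumption); the function names are illustrative.

```python
import numpy as np

def simulate_uc(T, beta_x, omega, omega_eta, rng):
    """Simulate y_t = x_t' beta + mu_t + u_t with u_t = sigma_t eps_t and
    mu_t = mu_{t-1} + sigma_eta,t eta_t, where sigma_t = omega(t/T) and
    sigma_eta,t = omega_eta(t/T)."""
    s = np.arange(1, T + 1) / T
    eps = rng.standard_normal(T)          # irregular-component innovations
    eta = rng.standard_normal(T)          # level (random walk) innovations
    u = omega(s) * eps                    # heteroskedastic I(0) component
    mu = np.cumsum(omega_eta(s) * eta)    # level component; vanishes if omega_eta = 0
    return beta_x + mu + u
```

Setting ωη(·) = 0 recovers a (possibly heteroskedastic) I(0) series, as noted in the text.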

Assumption

allows for a wide class of models for the variances of the errors. Models of single or multiple variance shifts satisfy Assumption

with ω(·) piecewise constant. For example, the function

gives the single break model with a variance shift at time [mT], 0 < m < 1, analyzed by Busetti and Taylor (2003) and Cavaliere (2004a). If ω(·)2 is an affine function, then the unconditional variance of the errors displays a linear trend. Piecewise-affine functions are also permitted, allowing for variances that follow a broken trend. Moreover, smooth transition variance shifts also satisfy Assumption

: e.g., the function

, which corresponds to a smooth (logistic) transition from σ02 to σ12. The parameter m determines the transition midpoint (for t = [mT], σt2 = 0.5(σ02 + σ12)) whereas γ > 0 controls the speed of transition (the fixed change-point model follows as a limiting case for γ → ∞).
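The smooth transition variance path can be written down directly. The sketch below assumes the standard logistic parameterization consistent with the stated properties (value 0.5(σ02 + σ12) at the midpoint s = m, speed γ, and the fixed change-point limit as γ → ∞); the function name is illustrative.

```python
import math

def omega_sq_smooth(s, sigma0_sq, sigma1_sq, m, gamma):
    """Logistic smooth-transition variance path omega(s)^2: equals the
    average of the two regime variances at s = m and approaches a single
    fixed break at m as gamma -> infinity."""
    return sigma0_sq + (sigma1_sq - sigma0_sq) / (1.0 + math.exp(-gamma * (s - m)))
```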

Assumption

is standard and allows for a wide variety of possible forms for the deterministic component, including the pth-order trend function xt := (1,t,…,tp)′, 0 ≤ p < ∞. The broken intercept and broken intercept and trend functions considered, e.g., in Busetti and Harvey (2001) are obtained by specifying

respectively, in (1), tmj being defined as

satisfying limT→∞(m/T) = μ ∈ (0,1) (see Phillips and Xiao, 1998, p. 448).

Remark 1. If ω(·) is not constant then the irregular component, {ut}, is unconditionally heteroskedastic. Conditional heteroskedasticity is also permitted through Assumption

(see, e.g., Hansen, 1992b). Assumption

has been used extensively in the econometric literature as it allows {εt} to belong to a wide class of weakly dependent stationary processes. The strict stationarity assumption is made without loss of generality and may be weakened to allow for weak heterogeneity of the errors, as in, e.g., Phillips (1987).

Remark 2. The assumption of nonstochastic variance functions {ω(·),ωη(·)} can be easily weakened simply by assuming stochastic independence between {εtt} and {σtηt}, given that the stochastic functionals {ω(·),ωη(·)} must have sample paths satisfying the requirements of Assumption

. In the stochastic variance framework, the results given in this paper hold conditionally on a given realization of {ω(·),ωη(·)}.

3. STATIONARITY TESTS

Kwiatkowski et al. (1992) focus on testing the I(0) null hypothesis, H0 : ση2 = 0, against the I(1) alternative hypothesis, H1 : ση2 > 0, under the ancillary assumption that σt = σ,σηt = ση, all t, so that, under H0, {yt} reduces to the I(0) process yt = xt′ β + ut, t = 1,…,T. Kwiatkowski et al. (1992) propose the test that rejects H0 for large values of the statistic

where

, the ordinary least squares (OLS) residuals from the regression of yt on xt, t = 1,…,T;

is a consistent estimator of the long-run variance of {ut} under H0 and has the form

being a bandwidth parameter and k(·) a weighting function. Kwiatkowski et al. (1992) assume

(Bartlett weights). However, because we are dealing with mixing errors (see Assumption

), throughout the paper we will require that qT and k(·) satisfy the following assumption (de Jong, 2000).

Assumption

. (K1) For all

is continuous at 0 and for almost all

, where l(x) is a nonincreasing function such that

; (K2) qT ↑ ∞ as T ↑ ∞, and qT = o(Tγ), γ ≤ 1/2 − 1/r, where r is given in

.

Assumption

is sufficiently general for our purposes as it is satisfied by many of the most commonly employed kernels (see Hansen, 1992a; Jansson, 2002).

Remark 3. The

statistic maps the sequence

onto [0,1] by averaging the squared values of the sequence. Other stationarity tests can be obtained by taking different mappings. For example, the supremum of

and range of

deliver, respectively, the test of Xiao (2001) and the rescaled range (RS) test of Lo (1991), which reject H0 for large values of the statistics

, respectively.
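The three mappings described in Remark 3 can be illustrated from a single standardized partial-sum (CUSUM) sequence. This is a hedged sketch: the scalings follow the standard definitions of the KPSS, Xiao, and Lo statistics and may differ from the authors' exact notation; `stationarity_stats` and its arguments are illustrative names.

```python
import numpy as np

def stationarity_stats(u_hat, lrv):
    """From OLS residuals u_hat and a long-run variance estimate lrv, build
    the standardized CUSUM sequence and apply the three mappings: average of
    squares (KPSS), supremum (Xiao), and range (Lo's RS)."""
    T = len(u_hat)
    Z = np.cumsum(u_hat) / np.sqrt(lrv * T)   # standardized partial sums
    kpss = np.mean(Z ** 2)                    # average of squared values
    ks = np.max(np.abs(Z))                    # supremum mapping
    rs = np.max(Z) - np.min(Z)                # range mapping
    return kpss, ks, rs
```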

4. ASYMPTOTIC SIZE

Under the null hypothesis considered by Kwiatkowski et al. (1992), H0 : σηt2 = ση2 = 0, all t, if {σt} is constant across the sample, it is well known that (e.g., Kwiatkowski et al., 1992, pp. 164–165)

, where

, with B(·) a standard Brownian motion. For example, if xt := (1,t,…,tp−1)′, then F(s) := (1,s,…,sp−1)′ and V(·) is a pth-level Brownian bridge.

Now, assume that H0 holds but that σt is not necessarily constant over time; rather it satisfies Assumption

. Then, the asymptotic distribution of the

statistic assumes the form detailed in the following theorem.

THEOREM 1. Under H0 : σηt2 = ση2 = 0, all t,

, where

and where

.

Consequently, with respect to the homoskedastic case, the asymptotic distribution of the

statistic has the usual structure but with B(·) replaced by Bω(·). It is only where ω(·) is constant throughout the sample that Bω(·) reduces to a standard Brownian motion and, hence, that

has the standard limiting distribution.

Remark 4. The process Bω(·) is a diffusion corresponding to the stochastic differential equation dBω(s) = (ω(s)/ω)dB(s) with initial condition Bω(0) = 0. Because Bω(·) has zero mean, variance

(where Λω(·) is an increasing homeomorphism on [0,1]) and has independent increments, Corollary 29.10 of Davidson (1994) implies that Bω(·) is distributed as B(Λω(·)), and therefore at time s ∈ [0,1], Bω(·) has the same distribution as the standard Brownian motion B(·) at time Λω(s) ∈ [0,1]. That is, Bω(·) is a Brownian motion under modification of the time domain (see, e.g., Revuz and Yor, 1991, p. 170).
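The time-change property of Remark 4 can be checked numerically. The sketch below assumes Λω(s) = ∫0s ω(r)2 dr / ∫01 ω(r)2 dr (the natural variance profile of Bω, consistent with Remark 4) and verifies by Monte Carlo that Var[Bω(s)] matches Λω(s) for a single-break ω(·); all function names are illustrative.

```python
import numpy as np

def variance_profile(omega_sq, grid):
    """Lambda_omega(s) = int_0^s omega(r)^2 dr / int_0^1 omega(r)^2 dr,
    approximated by Riemann sums on a grid of [0,1]."""
    cum = np.cumsum(omega_sq(grid)) / len(grid)
    return cum / cum[-1]

def simulate_B_omega(omega_sq, T, n_paths, rng):
    """Euler scheme for dB_omega(s) = (omega(s)/omega_bar) dB(s),
    B_omega(0) = 0, with omega_bar^2 = int_0^1 omega(r)^2 dr."""
    s = np.arange(1, T + 1) / T
    w = omega_sq(s)
    dB = rng.standard_normal((n_paths, T)) / np.sqrt(T)
    return np.cumsum(np.sqrt(w / w.mean()) * dB, axis=1)
```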

Remark 5. Under the conditions of Theorem 1,

. Interestingly, in the case of no deterministic terms (i.e., xt′ β = 0), because

(see Remark 4), it holds that

,

, and the asymptotic sizes of the

tests are not affected by variance changes that satisfy Assumption

. Simulation evidence reported in Cavaliere and Taylor (2004) suggests that this invariance property also holds reasonably well in small samples.

5. ASYMPTOTIC POWER

In this section we investigate the impact of time-varying variances in the irregular component in (1), and/or the error driving the level equation, (2), on both the consistency and local asymptotic power properties of the tests.

5.1. Consistency

It is well known (e.g., Kwiatkowski et al., 1992, eqn. (25)) that if σηt2 = ση2 > 0, then

where

is a standard Brownian motion independent of B(·). Because qT /T → 0, (4) implies that

diverges to +∞ at rate Op(T/qT) under the I(1) alternative. In addition to this result, note that if the {ut} component has a time-varying variance,

is still distributed as in (4), because as T → ∞, the I(1) component {ηt} dominates.

Now, consider the general case where σηt2 ≠ 0 but is not necessarily constant, satisfying Assumption

. Here the following result holds.

THEOREM 2. If σηt2 ≠ 0, all t, the weak convergence (4) holds with W(·) replaced by Wωη(·), where Wωη(s) := Bωη(s) − PF Bωη(s) with

.

Consequently, as in the case of constant variances, because qT /T → 0, Theorem 2 implies that

diverges to +∞ at rate Op(T/qT) under global I(1) alternatives.

Remark 6. Under the conditions of Theorem 2,

and

, which imply that both

also diverge to +∞, at rate Op((T/qT)1/2).

5.2. Asymptotic Local Power

We now focus attention on the limiting behavior of the

statistic under the local alternative (see also Busetti and Taylor, 2003, p. 513):

where c ≥ 0 is a noncentrality parameter and λεω/ωη > 0 is a scale factor that simplifies the representation of the asymptotic distributions. Notice that ω/ωη = 1 if σt = σηt, t = 1,…,T; i.e., if the pattern of time variation is common to the variances of the irregular component in (1) and the error driving the level in (2). Moreover, where σt = σ and σηt = ση, t = 1,…,T, Hc reduces to the local alternative considered by, inter alia, Stock (1994, p. 2799).

The following theorem details the large-sample behavior of

under Hc.

THEOREM 3. Under Hc of (5),

where the (independent) processes Vω(·) and Wωη(·) are as previously defined.

Remark 7. Notice from (6) that the asymptotic local power of

is affected by heteroskedasticity in both the irregular component and the errors driving the level process. Moreover, because the limiting processes relating to these components enter the asymptotic distribution in different forms (Wωη(·) is integrated whereas Vω(·) is not), it is anticipated that heteroskedasticity will have different effects in these two cases.

Remark 8. Under the homoskedastic condition that σt2 = σ2 and σηt2 = ση2, for all t, the local alternative simplifies to Hc : ση2 = (c2/T2)σ2λε2, and the right member of (6) reduces to

(cf. Busetti and Taylor, 2003, p. 513).

Remark 9. Under the conditions of Theorem 3,

.

6. NUMERICAL RESULTS

In this section we use Monte Carlo methods to quantify the finite-sample size and power properties of

of (3) and Remark 3, for the DGP (1)–(2) with β = 0 and (εt, ηt)′ ∼ NIID(0,I2), where {σt2} and/or {σηt2} vary according to Assumption

. We focus on the following three particular cases, where f (s) can be either ω(s) or ωη(s):

  • Case (a): Single Break:
  • Case (b): Smooth Transition:
  • Case (c): Piecewise-Linear Trend:
    .

Without loss of generality, in each case we set f0 = 1 and vary the ratio d = f0/f1 over d ∈ {0.25, 4}. A positive (negative) variance shift obtains for d < 1 (d > 1). In both Cases (a) and (b) we vary the parameter m over m ∈ {0.1, 0.5, 0.9}. In Case (b) we report results setting the speed of transition parameter γ = 10. Under Case (c) we consider m ∈ {0.0, 0.5, 0.9}. For m = 0.0 the variance process follows a linear trend between f02 at s = 0 and f12 at s = 1. When m > 0 the variance is fixed at f02 until time [mT], after which it follows a linear trend path, reaching f12 at s = 1. Other parameter values were considered but add little to what is reported.2

Indeed, for Case (c) we also considered the generalized trend function,

, for a range of values of r but found very little dependence on r.
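The three variance paths of Cases (a)–(c) can be collected in one function. The exact functional forms below are assumptions reconstructed from the surrounding description (single break at [mT], logistic transition with midpoint m and speed γ, and a trend that is flat up to m and then linear to f12 at s = 1); the function name is illustrative.

```python
import math

def f_sq(s, case, f0, f1, m, gamma=10.0):
    """Variance path f(s)^2 for Cases (a)-(c)."""
    f0_sq, f1_sq = f0 ** 2, f1 ** 2
    if case == "a":       # single break at time [mT]
        return f0_sq if s < m else f1_sq
    if case == "b":       # smooth (logistic) transition with midpoint m
        return f0_sq + (f1_sq - f0_sq) / (1.0 + math.exp(-gamma * (s - m)))
    if case == "c":       # flat at f0^2 up to m, then linear to f1^2 at s = 1
        return f0_sq if s <= m else f0_sq + (f1_sq - f0_sq) * (s - m) / (1.0 - m)
    raise ValueError("case must be 'a', 'b', or 'c'")
```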

We have set both {εt} and {ηt} to be serially uncorrelated Gaussian sequences because the effects we are looking to quantify are those caused by nonconstant variances rather than serial correlation; the latter are already well documented in the literature (see, inter alia, Kwiatkowski et al., 1992, pp. 169–172). Accordingly, we use a Bartlett kernel with qT = 1. Samples of sizes T = 50 and 250 are considered; all tests were run at the nominal 5% level using critical values, obtained in the same fashion, under σt = 1 and σηt = 0, t = 1,…,T.3

3. All simulation experiments were conducted using the RNDN function of Gauss 3.1 over 40,000 Monte Carlo replications.
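As a rough illustration of the size experiment for Case (a), the following sketch simulates the constant-only case with serially uncorrelated Gaussian errors, so the long-run variance can be estimated by the residual sample variance; the 5% critical value 0.463 is the standard KPSS level-case value, and the parameter choices here are illustrative rather than those of the reported tables.

```python
import numpy as np

def kpss_level(y):
    """KPSS statistic, constant-only case; with serially uncorrelated errors
    the long-run variance reduces to the residual sample variance."""
    u = y - y.mean()                      # OLS residuals on x_t = 1
    S = np.cumsum(u)
    return (S ** 2).sum() / (len(y) ** 2 * u.var())

def size_single_break(T=200, d=0.25, m=0.9, reps=2000, cv=0.463, seed=1):
    """Rejection frequency of the KPSS test under the I(0) null when the
    error standard deviation shifts from f0 = 1 to f1 = f0/d at time [mT]."""
    rng = np.random.default_rng(seed)
    sigma = np.where(np.arange(1, T + 1) / T < m, 1.0, 1.0 / d)
    rejections = 0
    for _ in range(reps):
        y = sigma * rng.standard_normal(T)
        rejections += kpss_level(y) > cv
    return rejections / reps
```

Consistent with the discussion in Section 6.1, a late positive shift (d = 0.25, m = 0.9) oversizes the test, while an early positive shift (d = 0.25, m = 0.1) undersizes it.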

6.1. Size Properties

Table 1 reports empirical rejection frequencies of the

tests when σηt = 0, t = 1,…,T, and ω(s)2, 0 ≤ s ≤ 1, satisfies either Case (a), (b), or (c) with σj = fj, j = 0,1, for the range of parameter values outlined before. Results are reported for the cases where

are the OLS residuals from the regression of yt on xt = 1 (a constant) or xt = (1, t)′ (a constant and linear trend), t = 1,…,T.

Table 1. Empirical size of stationarity tests: Heteroskedastic errors

Consider first the results for the single break model. For early breaks (m = 0.1) the

test is oversized (undersized) when d = 4 (d = 0.25). For late breaks this pattern is reversed. For the constant case,

displays the largest size distortions in most cases, whereas there seems to be little to choose between the

tests overall:

is better behaved (with only slight oversizing) than

for m = 0.5, but the reverse is true for both m = 0.1 and m = 0.9. Where significant size distortions occur in the

tests for the constant case, they worsen considerably for the linear trend case, especially so in the case of

. In the trend case the

test is noticeably better behaved than the other tests, behaving similarly to the constant case. Finally, for m = 0.5 the degree of oversizing seen in each of the three tests does not vary significantly between d = 4 and d = 0.25.

The results for the smooth transition break model largely mirror those for the single break but with the distortions somewhat ameliorated. This result is perhaps not surprising given that the logistic function used in Case (b) smooths the break across the sample. Although we report results for a relatively slow transition speed, γ = 10, we computed experiments for a range of values of γ and found the differences across γ quite small with results tending toward those for the single break model as γ increased. For example, by γ = 50 these results were indistinguishable.

Turning to the results for trending variances, for m = 0 the size of the

test is not substantially affected in either the constant or constant and trend cases, whereas the size distortions seen in the constant and trend cases for the

tests are roughly the same throughout for d = 4 and d = 0.25. Again the

test displays the worst size distortions. For all of the tests linear trending variances seem in most cases to have a lesser impact on size than either fixed or smooth transition breaks. The patterns of size distortions for the piecewise-linear trend (m = 0.5 and m = 0.9) exaggerate (dampen) those seen in the same setting when m = 0 and d = 0.25 (d = 4).

6.2. Local Power Properties

Table 3 reports empirical rejection frequencies of the

tests under a local alternative for each of Cases (a), (b), and (c). For each case, results are reported where either only {σt2} (labeled “shift in I(0) only”) or only {σηt2} (labeled “shift in I(1) only”) varies through time and for the case where both vary. The range of values for the parameters is as in Section 6.1, except in the case where both components vary through time, where {σηt2} is fixed throughout with d = 4 and m = 0.1 under Case (a), d = 4, m = 0.1, and γ = 10 under Case (b), and m = 0 and d = 4 under Case (c). In these cases, therefore, {σt2} and {σηt2} evolve according to the same function with the same parameters, whereas for the other entries in the table they evolve according to the same function but with different parameters. The local alternative considered is (5), except that we do not scale out the nuisance parameter ω/ωη.

4. Recall that this was done in Theorem 3 purely to simplify the right member of (6).

Results are reported for the linear trend case with c = 10. Results for the constant only case and for other values of c were qualitatively similar. Consequently, the results for the shift in I(0) only pertain to the local alternative Hc : σηt2 = ση2 = (10/T)2, t = 1,…,T, whereas all other results relate to Hc : σηt2 = (10/T)2 × ωη(t/T)2, t = 1,…,T, where ωη(.)2 is as defined previously for each of Cases (a), (b) and (c).

Consider first Table 2, which reports benchmark results for the power of the

tests for the homoskedastic case, σt2 = 1, t = 1,…,T, under the local alternative Hc : σηt2 = c2/T2, t = 1,…,T, for c = 1, 5, 10, 15, 20, and 25. Observe that the

test is dominated on local power by both the

tests. The

test is the locally best invariant (LBI) test in this setting, so it is no surprise that it displays the highest power in most cases. However, the

test is very competitive on power and, indeed, tends to display higher power than

for c ≥ 20.

Table 2. Empirical local power of stationarity tests: Homoskedastic errors

Table 3. Empirical local power of stationarity tests under heteroskedastic errors: xt = (1, t)′

Turning to the results for the heteroskedastic cases in Table 3, a number of regularities are seen. First, in each of the cases of variance shifts in the I(0), I(1), and both I(0) and I(1) components, the

tests behave almost identically. Second, in the case of variance shifts in the I(1) component only, all three tests behave almost identically. Third, in the case where variance shifts affect both the I(0) and I(1) components, for the entries in Cases (a) and (b) for m = 0.1, d = 4 and Case (c) for m = 0.0, d = 4 (i.e., instances where precisely the same variance process applies to both the I(0) and I(1) components) the results are very similar to those seen in Table 2 for c = 10. Fourth, and as predicted by the asymptotic distribution theory (cf. Remark 7), changing variances in the I(0) and I(1) components (but not both) effect very different outcomes: negative (positive) shifts in the variance of the I(1) component result in increases (decreases) in power relative to the benchmark homoskedastic power in Table 2, whereas the converse is true for variance shifts in the I(0) component. Fifth, and in contrast to the preceding point, shifts in both the I(0) and I(1) variances tend not to inflate power beyond the homoskedastic benchmark; indeed, for single and smooth transition breaks with early positive shifts the empirical rejection frequencies of all the tests are close to the nominal level. Finally, the effects on power (relative to the homoskedastic case) of heteroskedastic variances are most pronounced for the single break case and least pronounced in the trend case (cf. Table 1).

7. BOOTSTRAP PROCEDURES

In this section we show that the size biases caused by time-varying second moments can be corrected by properly adapting the heteroskedastic fixed regressor bootstrap of Hansen (2000) to the present framework. Interestingly, the heteroskedastic bootstrap allows us to retrieve asymptotically correct p-values even in the presence of autocorrelated errors. The rationale behind this result is that whereas the asymptotic null distribution of the

statistic is affected by the heteroskedasticity function ω(·), it is not affected by the short memory properties of the I(0) component {εt} (see Theorem 1). We outline the bootstrap procedure for the

-based procedure, although the

- and

-based procedures may be bootstrapped in an entirely analogous fashion.

Let

denote the limiting null distribution of

(Theorem 1) and its cumulative distribution function (c.d.f.), respectively. Let

denote the residuals obtained by regressing yt on xt and let {zt}t=1T denote an independent N(0,1) sequence. The bootstrap sample is defined as

, and the bootstrap statistic is given by

with

denoting the residuals obtained from the regression of ytb on xt, t = 1,…,T. The bootstrap p-value is

, where GTb(·) denotes the c.d.f. of

.

The usefulness of the heteroskedastic bootstrap in the present framework is given in Theorem 4, which shows (i) that the bootstrap allows us to retrieve the correct asymptotic null distribution and hence that the p-values based on

are asymptotically pivotal and (ii) that a test based on the bootstrap p-values is consistent.

THEOREM 4. (i) Under the conditions of Theorem 1,

, where

denotes weak convergence in probability (see Giné and Zinn, 1990). (ii) Under the conditions of Theorem 2,

.

In practice, GTb(·) is not known but can be approximated in the usual way through numerical simulation by generating N (conditionally) independent bootstrap statistics,

, computed as before but from

, with {{zn,t}t=1T}n=1N a doubly independent N(0,1) sequence. The simulated bootstrap p-value is then computed as

and is such that

.
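The bootstrap algorithm just described can be sketched for the constant-only KPSS case, after Hansen (2000): each bootstrap sample is ytb = ût zt with zt iid N(0,1), which preserves the variance pattern of the residuals while removing any persistence. The function names and the simple variance estimator (valid for serially uncorrelated errors) are illustrative assumptions.

```python
import numpy as np

def kpss_level(y):
    """KPSS statistic for the constant-only case (residual sample variance
    used in place of a kernel long-run variance estimator)."""
    u = y - y.mean()
    S = np.cumsum(u)
    return (S ** 2).sum() / (len(y) ** 2 * u.var())

def bootstrap_pvalue(y, N=199, seed=0):
    """Fixed regressor heteroskedastic bootstrap p-value: y_t^b = u_hat_t z_t
    with z_t iid N(0,1) keeps the residual variance pattern, so the bootstrap
    statistics mimic the (heteroskedastic) null distribution."""
    rng = np.random.default_rng(seed)
    u_hat = y - y.mean()
    stat = kpss_level(y)
    boot = np.array([kpss_level(u_hat * rng.standard_normal(len(y)))
                     for _ in range(N)])
    return (1 + np.sum(boot >= stat)) / (N + 1)
```

Under an I(1) alternative the original statistic diverges while the bootstrap statistics remain bounded, so the bootstrap p-value collapses toward zero, in line with Theorem 4(ii).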

In Table 4 we report results for the bootstrapped KPSS testing procedure, outlined before, applied to data generated according to Case (a) of Section 6.1. The results are therefore directly comparable with those given for Case (a) in Table 1 for the

test. Results are reported only for this case because this was the form of heteroskedasticity that effected the most significant size distortions in the original tests. The reported results are for experiments run over N = 1,000 bootstrap replications. Benchmark entries for the case where the errors are homoskedastic are also reported in the column labeled “IID.”

Table 4. Empirical size of bootstrap KPSS tests: Heteroskedastic errors, Case (a)

A comparison of the results in Tables 1 and 4 shows that the bootstrap performs very well in practice with empirical sizes much closer to the nominal level than for the standard

test. Some oversizing, associated with early negative and late positive variance breaks, is still seen for T = 50 but is much reduced relative to that seen for the standard

test and is largely eliminated for T = 250. The undersizing seen in the standard

test for early positive and late negative breaks is eliminated by the bootstrap. Although not reported here, qualitatively similar improvements (available on request) were seen for bootstrapped implementations of the

tests and for data generated under Cases (b) and (c).

8. CONCLUSIONS

In this paper we have analyzed the effects that time-varying second moments of a very general form have on the stationarity tests of Kwiatkowski et al. (1992), Lo (1991), and Xiao (2001). We have demonstrated that, in general, heteroskedasticity changes the limiting distributions of these stationarity test statistics under both the null and local alternatives and (for appropriately rescaled statistics) global alternatives. We have presented Monte Carlo simulation results to quantify the finite-sample effects of heteroskedasticity on the size and power properties of the three tests. Results were presented for variances displaying either a single break, a smooth transition break, or a linear/piecewise-linear trend. Bootstrap versions of the tests, adapted from the heteroskedastic bootstrap principle of Hansen (2000), were developed and shown to greatly improve the finite-sample size properties of the tests. Although not considered here, it would be interesting and reasonably straightforward to extend the results presented in this paper to the corresponding tests for the null hypothesis of cointegration of Shin (1994), inter alia.

APPENDIX

Proof of Theorem 1. Define the partial sum

. Under Assumptions

(see also Cavaliere, 2004b). Similarly,

. After some algebra, the preceding results taken together with the convergence result

(Assumption

) allow us to conclude that

and, hence, by the continuous mapping theorem (CMT), that

. Note that the CMT also allows us to prove that

and that

(see Remarks 3 and 5). The proof is completed by showing that

, which follows from Cavaliere (2004b, Thm. 4). █

Proof of Theorem 2. Because {σηt} satisfies Assumption

then

. This result also implies that

is dominated by {μt}. As in the proof of Theorem 1, it easily follows that the residuals

obey the functional central limit theorem

and, hence, by the CMT, the numerator of

satisfies

. Finally, as in Kwiatkowski et al. (1992) one can show that for any

and hence that

. █

Proof of Theorem 3. The proof follows directly from Theorems 1 and 2, using the CMT and the fact that, because

, as in Theorem 1. █

Proof of Theorem 4. (i) Conditionally on

is Gaussian with covariance kernel

(see, e.g., Hansen, 1996). Similarly, the process

is Gaussian with covariance kernel

. To simplify notation, but without loss of generality, assume that xt contains a constant, i.e.,

; then STb(s) = (1,0′)MTb(s), so that the asymptotic distribution of STb(·) easily follows from that of MTb(·). Now,

, which is a consequence of the fact that

(notice that the true value of β is zero here) is of Op(T−1/2) and the mixing properties of εt2Et2). It therefore follows that

and, by the CMT, that

. Hence,

weakly converges to

. Notice that

. As in Cavaliere (2004b) it is straightforward to show that

, so that

, a result that follows from a standard application of the weak law of large numbers for martingale difference sequences. The preceding results imply that

and hence that GTb(·) → G(·) uniformly in probability. The remainder of the proof is identical to the proof of Theorem 5 in Hansen (2000). (ii) Let

. Conditionally on

is Gaussian with zero mean and covariance kernel

. Hence,

and by the CMT

being a standard Brownian motion, independent of Wωη(r). Using similar arguments it can be shown that

and that

being a standard Brownian motion, independent of Wωη(·) and Bz1(·). Consequently, the standardized variance estimator

satisfies

. Taken together, the preceding results imply that

weakly converges in probability to the random variable

, whose c.d.f. is denoted by

, and hence that

, uniformly. Consequently,

. Because, by Theorem 2,

diverges at the rate qT−1T, it follows that

. █

REFERENCES

Busetti, F. & A.C. Harvey (2001) Testing for the presence of a random walk in series with structural breaks. Journal of Time Series Analysis 22, 127–150.
Busetti, F. & A.M.R. Taylor (2003) Variance shifts, structural breaks and stationarity tests. Journal of Business & Economic Statistics 21, 510–531.
Cavaliere, G. (2004a) Testing stationarity under a permanent variance shift. Economics Letters 82, 403–408.
Cavaliere, G. (2004b) Unit root tests under time-varying variances. Econometric Reviews 23, 259–292.
Cavaliere, G. & A.M.R. Taylor (2004) Stationarity Tests under Time-Varying Second Moments. Discussion Papers in Economics 04-12, University of Birmingham.
Davidson, J. (1994) Stochastic Limit Theory. Oxford University Press.
de Jong, R. (2000) A strong consistency proof for heteroskedasticity and autocorrelation consistent covariance matrix estimators. Econometric Theory 16, 262–268.
Giné, E. & J. Zinn (1990) Bootstrapping general empirical measures. Annals of Probability 18, 851–869.
Hamori, S. & A. Tokihisa (1997) Testing for a unit root in the presence of a variance shift. Economics Letters 57, 245–253.
Hansen, B.E. (1992a) Consistent covariance matrix estimation for dependent heterogeneous processes. Econometrica 60, 967–972.
Hansen, B.E. (1992b) Convergence to stochastic integrals for dependent heterogeneous processes. Econometric Theory 8, 489–500.
Hansen, B.E. (1996) Inference when a nuisance parameter is not identified under the null hypothesis. Econometrica 64, 413–430.
Hansen, B.E. (2000) Testing for structural change in conditional models. Journal of Econometrics 97, 93–115.
Jansson, M. (2002) Consistent covariance matrix estimation for linear processes. Econometric Theory 18, 1449–1459.
Kim, T.H., S. Leybourne, & P. Newbold (2002) Unit root tests with a break in innovation variance. Journal of Econometrics 109, 365–387.
Kwiatkowski, D., P.C.B. Phillips, P. Schmidt, & Y. Shin (1992) Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root? Journal of Econometrics 54, 159–178.
Lo, A.W. (1991) Long-term memory in stock market prices. Econometrica 59, 1279–1314.
Nyblom, J. (1989) Testing the constancy of parameters over time. Journal of the American Statistical Association 84, 223–230.
Phillips, P.C.B. (1987) Time series regression with a unit root. Econometrica 55, 277–301.
Phillips, P.C.B. & Y. Sun (2001) Nonorthogonal Hilbert projections in trend regression. Econometric Theory: Problem, 17, 854; Solution, 18, 1011–1015.
Phillips, P.C.B. & Z. Xiao (1998) A primer on unit root testing. Journal of Economic Surveys 12, 423–470.
Revuz, D. & M. Yor (1991) Continuous Martingales and Brownian Motion. Springer-Verlag.
Shin, Y. (1994) A residual-based test of the null of cointegration against the alternative of no cointegration. Econometric Theory 10, 91–115.
Stock, J.H. (1994) Unit roots, structural breaks and trends. In R.F. Engle & D.L. McFadden (eds.), Handbook of Econometrics, vol. 4, pp. 2739–2840. Elsevier Science.
Xiao, Z. (2001) Testing the null hypothesis of stationarity against an autoregressive unit root alternative. Journal of Time Series Analysis 22, 87–105.