
A CONSISTENT DIAGNOSTIC TEST FOR REGRESSION MODELS USING PROJECTIONS

Published online by Cambridge University Press:  03 November 2006

J. Carlos Escanciano
Affiliation:
Universidad de Navarra

Abstract

This paper proposes a consistent test for the goodness-of-fit of parametric regression models that overcomes two important problems of the existing tests, namely, the poor empirical power and size performance of the tests due to the curse of dimensionality and the subjective choice of parameters such as bandwidths, kernels, and integrating measures. We overcome these problems by using a residual marked empirical process based on projections (RMPP). We study the asymptotic null distribution of the test statistic, and we show that our test is able to detect local alternatives converging to the null at the parametric rate. It turns out that the asymptotic null distribution of the test statistic depends on the data generating process, and so a bootstrap procedure is considered. Our bootstrap test is robust to higher order dependence, in particular to conditional heteroskedasticity. For completeness, we propose a new minimum distance estimator constructed through the same RMPP as in the testing procedure. Therefore, the new estimator inherits all the good properties of the new test. We establish the consistency and asymptotic normality of the new minimum distance estimator. Finally, we present some Monte Carlo evidence that our testing procedure can play a valuable role in econometric regression modeling.

The author thanks Carlos Velasco and Miguel A. Delgado for useful comments. The paper has also benefited from the comments of two referees and the co-editor. This research was funded by the Spanish Ministry of Education and Science, reference number SEJ2004-04583/ECON, and by the Universidad de Navarra, reference number 16037001.

Type
Research Article
Copyright
© 2006 Cambridge University Press

1. INTRODUCTION

The purpose of the present paper is to develop a consistent, powerful, and simple diagnostic test for testing the adequacy of a parametric regression model with the property of being free of any user-chosen parameter (e.g., bandwidth) and, at the same time, being suitable for cases in which the covariate is of high or moderate finite dimension. Most consistent tests proposed in the literature give misleading results for this latter empirically relevant case. This problem is intrinsic and is often referred to as the "curse of dimensionality" in the regression literature; see Section 7.1 of Fan and Gijbels (1996) for some discussion on this problem. More precisely, let (Y,X′)′ be a random vector in a (d + 1)-dimensional Euclidean space, where Y represents the real-valued dependent (or response) variable, X is the d-dimensional explanatory variable, X ∈ R^d, and A′ denotes the matrix transpose of A. Under E|Y| < ∞, it is well known that the regression function m(x) = E[Y | X = x] is well defined. If in addition E|Y|² < ∞, then m(X) represents almost surely (a.s.) the "best" prediction of Y given X, in a mean square sense. Then, it is common in regression modeling to consider the following tautological expression:

Y = m(X) + ε,     (1)

where ε = Y − E[Y | X] is, by construction, the unpredictable part (in mean) of Y given X.

Much of the existing literature is concerned with parametric modeling in that m is assumed to belong to a given parametric family

{f(·,θ) : θ ∈ Θ},   with Θ a finite-dimensional parameter space,

and, by analogy, one considers the following parametric regression model:

Y = f(X,θ) + e(θ),

with f(X,θ) a parametric specification for the regression function m(X) and with e(θ) a random variable (r.v.), the disturbance of the model. Parametric regression models continue to be attractive to practitioners because these models have the appealing property that the parameter θ together with the functional form f(·,·) describes, in a very concise way, the relation between the response Y and the explanatory variable X. Because we do not know in advance the true regression model, to prevent wrong conclusions, every statistical inference that is based on the model f should be accompanied by a proper model check. As a matter of fact, a correct specification of m is important in model-based economic decisions and/or to interpret parameters correctly.

Note that

m(·) ∈ {f(·,θ) : θ ∈ Θ}

is tantamount to

E[e(θ0) | X] = 0 a.s. for some θ0 ∈ Θ.     (2)
There is a vast amount of literature on testing consistently the correct specification of a parametric regression model. Although the idea of the proposed consistent tests is similar in all cases, namely, comparing a parametric and a (semi-) nonparametric estimation of a functional of the conditional mean in (2), they can be divided into two classes of tests. The first class of tests uses nonparametric smoothing estimators of E[e(θ0) | X]. We call this approach the "local approach"; see Eubank and Spiegelman (1990), Eubank and Hart (1992), Wooldridge (1992), Yatchew (1992), Gozalo (1993), Härdle and Mammen (1993), Horowitz and Härdle (1994), Hong and White (1995), Zheng (1996), Li (1999), Horowitz and Spokoiny (2001), Koul and Ni (2004), and Guerre and Lavergne (2005) for some examples. A methodology related to the local approach is that of empirical likelihood procedures as proposed in Chen, Härdle, and Li (2003) and Tripathi and Kitamura (2003). The local approach requires smoothing of the data in addition to the estimation of the finite-dimensional parameter vector and leads to less precise fits. Tests based on the local approach have standard asymptotic null distributions, but their finite-sample distributions depend on the choice of a bandwidth (or similar) of the nonparametric estimator, which affects the inference procedures.

The second class of tests avoids smoothing estimation by means of reducing the conditional mean independence in (2) to an infinite (but parametric) number of unconditional orthogonality restrictions, i.e.,

E[e(θ0)w(X,x)] = 0   for almost all x ∈ Π,     (3)

where Π is a properly chosen space and the parametric family w(·,x) is such that the equivalence (3) holds; see Bierens and Ploberger (1997), Stinchcombe and White (1998), and Escanciano (2006) for primitive conditions on the family w(·,x) to satisfy this equivalence. We call the approach based on (3) the "integrated approach" because it uses the integrated (cumulative) measures of dependence E[e(θ0)w(X,x)]. In the literature the most frequently used weighting functions have been the exponential function, e.g., w(X,x) = exp(ix′X) in Bierens (1982, 1990), where i = √(−1) denotes the imaginary unit, and the indicator function w(X,x) = 1(X ≤ x); see, e.g., Stute (1997), Koul and Stute (1999), Whang (2000), and Li, Hsiao, and Zinn (2003), among many others. Different families w deliver different power properties of the integrated-approach-based tests. Most tests based on the integrated approach have nonstandard asymptotic null distributions, but they can be well approximated by bootstrap methods; see, e.g., Stute, Gonzalez-Manteiga, and Presedo-Quindimil (1998).

An important problem with the local approach arises when the dimension of the explanatory variable X is high or even moderate. The sparseness of the data in high-dimensional spaces leads most local-based test statistics to suffer a considerable bias, even for large sample sizes. This is an important practical limitation for most tests considered in the literature, because it is not uncommon in econometric modeling to have high-order models. Some statistical theories have been developed to overcome this problem; cf. generalized linear models (GLM) (see, e.g., McCullagh and Nelder, 1989) or single-index models (see, e.g., Powell, Stock, and Stoker, 1989). However, these theories are semiparametric and, therefore, need smoothing techniques. In addition, they do not cover all possible models.

Here, we propose a new consistent test within the integrated framework that compares very well to indicator- and exponential-based tests. The new test is simple to compute, does not need user-chosen parameters or high-dimensional numerical integration, is robust to higher order dependence (in particular to conditional heteroskedasticity), and presents excellent empirical power properties in finite samples; see Section 4. Furthermore, our test procedure provides a formalization of some well-known traditional exploratory tools based on residual-fitted values plots.

The organization of the paper is as follows. In Section 2 we define the residual marked process based on projections (RMPP) as the basis for our test statistic. In Section 3 we study the asymptotic null distribution and the behavior against Pitman's local alternatives of the new test statistic. For completeness of exposition, we consider in this section a new minimum distance estimator for the regression parameter based on the RMPP, and we show its consistency and asymptotic normality under similar assumptions as in the testing procedure. Also, because the asymptotic null distribution depends on the data generating process (DGP), a bootstrap procedure to approximate the asymptotic critical values of the test statistic is considered. In Section 4 we conduct a simulation exercise comparing the new proposed test with some competing tests considered in the literature. This Monte Carlo experiment shows that our new test can play a valuable role in parametric regression modeling. Proofs of the main results are deferred to Appendix A. Appendix B contains a simple algorithm to compute the new test statistic.

2. THE RESIDUAL MARKED PROCESS BASED ON PROJECTIONS

Let {Zi = (Yi,Xi′)′}i=1n be a sequence of independent and identically distributed (i.i.d.) (d + 1)-dimensional r.v.s defined on the probability space (Ω, F, P) and with the same distribution as Z = (Y,X′)′, with 0 < E|Y| < ∞. The main goal in this paper is to test the null hypothesis (2), i.e.,

H0 : E[Y | X] = f(X,θ0) a.s. for some θ0 ∈ Θ,

against the alternative

HA : P(E[Y | X] = f(X,θ)) < 1 for all θ ∈ Θ.
As argued before, one way to characterize H0 is by the infinite number of parametric unconditional moment restrictions

E[e(θ0)w(X,x)] = 0   for almost all x ∈ Π,

where the parametric family w(·,x) is such that the equivalence in (3) holds. Examples of such families are w(X,x) = 1(X ≤ x), w(X,x) = exp(ix′X), w(X,x) = sin(x′X), and w(X,x) = 1/(1 + exp(c − x′X)) with c ≠ 0; see the aforementioned references for many other families.
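As a concrete illustration (ours, not part of the paper), the four weighting families above can be written directly as functions of the data X and of the auxiliary parameter x; the logistic weight follows the form reconstructed above, with c ≠ 0 an arbitrary constant.

```python
import numpy as np

# Sketch of the weighting families w(X, x) listed above.
# X has shape (n, d); x is a d-vector (the auxiliary parameter); c != 0.

def w_indicator(X, x):
    # w(X, x) = 1(X <= x), understood componentwise for vector X
    return np.all(X <= x, axis=-1).astype(float)

def w_exponential(X, x):
    # w(X, x) = exp(i x'X), the complex exponential weight
    return np.exp(1j * (X @ x))

def w_sine(X, x):
    # w(X, x) = sin(x'X)
    return np.sin(X @ x)

def w_logistic(X, x, c=1.0):
    # w(X, x) = 1 / (1 + exp(c - x'X))
    return 1.0 / (1.0 + np.exp(c - X @ x))
```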

In view of a sample {Zi}i=1n, define the marked empirical process

Rn,w(x,θ) ≡ n^{−1/2} ∑_{i=1}^{n} (Yi − f(Xi,θ)) w(Xi,x),   x ∈ Π.     (5)

Define also Rn,w1(·) ≡ Rn,w(·,θn), where θn is a √n-consistent estimator of θ0. The marks in Rn,w1 are given by the classical residuals ei(θn) ≡ Yi − f(Xi,θn); therefore, we call Rn,w1 a residual marked empirical process.

Because of the equivalence (3), it is natural to base the tests on a distance from Rn,w1 to zero, i.e., on a norm Γ(Rn,w1), say. The most used norms are the Cramér–von Mises (CvM) and Kolmogorov–Smirnov (KS) functionals:

∫_Π |Rn,w1(x)|² Ψ(dx)   and   sup_{x∈Π} |Rn,w1(x)|,

respectively, where Ψ(x) is an integrating function satisfying some mild conditions; see A4 in Section 3. Other functionals are possible. Then, tests in the integrated approach reject the null hypothesis (2) for "large" values of Γ(Rn,w1).

The first consistent integrated test proposed in the literature was that of Bierens (1982) based on the exponential weighting family, i.e., using the residual marked process

n^{−1/2} ∑_{i=1}^{n} ei(θn) exp(ix′Φ(Xi)),

where Φ(·) is a bounded one-to-one Borel measurable mapping from R^d to R^d. Bierens (1982) considered a CvM norm with integrating measures Ψ(dx) = ϒ(x) dx, with ϒ(x) = 1(x ∈ ∏_{l=1}^{d}[−εl, εl]), where εl > 0, l = 1,…, d, are arbitrarily chosen numbers (Bierens, 1982, p. 109), and ϒ(x) equal to a d-variate normal density function (Bierens, 1982, p. 111).

On the other hand, Stute (1997) used the indicator family w(X,x) = 1(X ≤ x) in the residual marked process. The main advantage of the indicator weighting function over the exponential function is that it avoids the choice of an arbitrary integrating function Ψ, because in the indicator case this is given by the natural empirical distribution function of {Xi}i=1n. However, the indicator weight has the drawback of being more sensitive to the dimension d than the exponential weight, which is based on one-dimensional projections (see Escanciano, 2006).

In this paper we propose a test based on a new family {w,Ψ} of weighting and integrating functions that possesses the good properties of the exponential- and indicator-based tests and at the same time prevents their deficiencies. The new test avoids the arbitrary choice of the integrating function or numerical integration in high-dimensional spaces and is less sensitive to the dimension d than indicator-based tests because it is based on one-dimensional projections. The CvM test based on this new family presents an excellent performance in finite samples and is very simple to compute. In addition, the new family w formalizes some traditional model diagnostic tools based on residual-fitted values plots for linear models.

Our first aim is to avoid the problem of the curse of dimensionality. The following result can be viewed as a particularization of the Cramér–Wold principle to our main concern, the goodness-of-fit of the regression function. The term |A| denotes the Euclidean norm of A.

LEMMA 1. A necessary and sufficient condition for (2) to hold is that for any vector β ∈ R^d with |β| = 1,

E[e(θ0) | β′X] = 0 a.s.

Lemma 1 yields that consistent tests for H0 can be based on one-dimensional projections. In particular, we have the characterization of the null hypothesis H0:

E[e(θ0)1(β′X ≤ u)] = 0   for almost every (β,u) ∈ Π,

where from now on Π ≡ Sd × [−∞,∞] is the nuisance parameter space, with Sd the unit ball in R^d, i.e., Sd ≡ {β ∈ R^d : |β| = 1}. Therefore, the test we consider here rejects the null hypothesis for "large" values of the standardized sample analogue of E[e(θ0)1(β′X ≤ u)].

An approach related to ours is that of Stute and Zhu (2002), who considered the weighting family {1(β0′X ≤ u)} for model checks of GLM in an i.i.d. framework. However, note that they fix the direction to β0, the direction involved in the GLM, and so their approach is clearly different from that considered here, because we consider all the directions β in Sd simultaneously. As a consequence, our test will be consistent against all alternatives, whereas in our present framework the Stute and Zhu (2002) test is only consistent against alternatives satisfying that E[e(θ*)1(β*′X ≤ u)] ≠ 0 in a set with positive Lebesgue measure in R, where θ* and β* are the probabilistic limits under the alternative of the estimators of θ0 and β0, respectively.

The family 1(β′X ≤ u) yields the RMPP

Rn1(β,u) ≡ n^{−1/2} ∑_{i=1}^{n} ei(θn) 1(β′Xi ≤ u),   (β,u) ∈ Π.

The marks of Rn1 are given by the classical residuals and the "jumps" by the projected regressors. Note that for a fixed direction β, Rn1 is uniquely determined by the residuals and the projected variables {β′Xi}i=1n and vice versa. As in the usual residual-regressors plot, we can plot the path of Rn1 for different directions β as an exploratory diagnostic tool. In particular, in the linear model, the plot of the path of Rn1(βn,u), with βn the least squares estimator, resembles the usual residual-fitted values plot. Therefore, tests based on Rn1(βn,u) provide a formalization of such traditional well-known exploratory tools.
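To make this exploratory use concrete, the following sketch (ours; names and interface are illustrative) evaluates the path of Rn1(β,·) along a fixed direction β at the projected sample points, which can then be plotted against β′Xi in the spirit of a residual-fitted values plot.

```python
import numpy as np

def rmpp_path(residuals, X, beta):
    """Path of the residual marked process R_n^1(beta, u) along a fixed
    direction beta, evaluated at u = beta'X_i (illustrative sketch)."""
    n = len(residuals)
    proj = X @ beta                       # projected regressors beta'X_i
    order = np.argsort(proj)              # sort the jump points
    u = proj[order]
    # R_n^1(beta, u) = n^{-1/2} * sum_i e_i(theta_n) * 1(beta'X_i <= u)
    path = np.cumsum(residuals[order]) / np.sqrt(n)
    return u, path
```

For a fitted linear model one would take beta to be the (normalized) least squares coefficient vector, so that the horizontal axis is essentially the fitted value.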

To measure the distance from Rn1 to zero a norm has to be chosen. From computational considerations a CvM norm is very convenient in our context. Two facts motivate our choice of the integrating measure in the CvM norm. First, note that once the direction β is fixed, u lives in the projected regressor variable's space, and second, in principle, all the directions are equally important; cf. Lemma 1. To define our CvM test we need some notation. Let Fn,β(u) be the empirical distribution function of the projected regressors {β′Xi}i=1n and dβ the uniform density on the unit sphere. Also let Fβ(u) be the true cumulative probability distribution function (c.d.f.) of β′X. Then, we define the new CvM test as

PCvMn ≡ ∫_Π |Rn1(β,u)|² Fn,β(du) dβ.     (8)

Therefore, we reject the null hypothesis H0 for large values of PCvMn. See Appendix B for a simple algorithm to compute PCvMn from a given data set {Zi}i=1n. The next section justifies inference for PCvMn based on the asymptotic theory.
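Before turning to the asymptotics, the following sketch (ours, not the paper's algorithm) may help fix ideas: it approximates PCvMn by replacing the integral over the unit sphere with an average over randomly drawn directions β. The exact closed form used in the paper is given in Appendix B; note also that the overall normalization of the measure dβ only rescales the statistic and is immaterial once critical values are obtained by the bootstrap applied to the same statistic.

```python
import numpy as np

def pcvm_random_projections(residuals, X, n_dirs=500, rng=None):
    """Monte Carlo approximation of PCvM_n (illustrative sketch): average over
    random unit directions beta of the CvM distance of R_n^1(beta, .) with
    respect to the empirical distribution of the projections beta'X_i."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    betas = rng.standard_normal((n_dirs, d))
    betas /= np.linalg.norm(betas, axis=1, keepdims=True)   # uniform on the sphere
    total = 0.0
    for beta in betas:
        proj = X @ beta
        # R_n^1(beta, beta'X_j) for every sample point j
        R = np.array([(residuals * (proj <= u)).sum() for u in proj]) / np.sqrt(n)
        # integrate |R|^2 against the empirical distribution F_{n, beta}
        total += np.mean(R ** 2)
    return total / n_dirs
```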

Our test statistic PCvMn avoids the deficiencies of the Bierens (1982) and Stute (1997) tests, namely, the arbitrary choice of the integrating function or numerical integration in high-dimensional spaces and the low power performance when the dimension d is large, respectively. However, it is worthwhile to mention that our test is not necessarily better than the Bierens (1982) and Stute (1997) tests. In fact, using the results of Bierens and Ploberger (1997) it can be shown that all these tests are asymptotically admissible, and therefore none of them is strictly better than the others uniformly over the space of alternatives. However, in our simulations that follow we show that for the alternatives considered our test is the best or comparable to the best test. A simple intuition as to why our test performs so well with the alternatives considered is as follows. Under the alternative it can be shown that, uniformly in x ∈ Π,

n^{−1/2} Rn,w1(x) = E[e(θ1)w(X,x)] + oP(1),

where θ1 is the probabilistic limit of θn under the alternative HA. On the other hand, under the normalization E[m²(X,θ1)] = 1, where m(·,θ1) = E[e(θ1) | X = ·], it holds that the optimization problem

max_w E[e(θ1)w(X)]   subject to   E[w²(X)] = 1

attains its optimum at w*(·) = m(·,θ1). Therefore, as w(·,·) is closer to m(·,·), the test based on w is expected to have better power properties. It seems that for the models considered in Section 4 m(·,θ1) can be "well approximated" by our weight function 1(β′X ≤ u), and this might explain the good power properties of our test procedure.
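The optimality claim for w*(·) = m(·,θ1) is just the Cauchy–Schwarz inequality after conditioning on X; a one-line derivation (ours), under the normalization stated above:

```latex
% Assuming E[w^2(X)] = 1 and writing m(X,\theta_1) = E[e(\theta_1) \mid X]:
\begin{aligned}
E[e(\theta_1)\,w(X)]
  &= E\bigl[E\{e(\theta_1)\mid X\}\,w(X)\bigr]
   = E[m(X,\theta_1)\,w(X)] \\
  &\le \bigl(E[m^2(X,\theta_1)]\bigr)^{1/2}\bigl(E[w^2(X)]\bigr)^{1/2} = 1,
\end{aligned}
```

with equality if and only if w(X) is proportional to m(X,θ1) a.s.; under E[m²(X,θ1)] = 1 the maximizer is therefore w*(·) = m(·,θ1).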

During the revision process of the paper one of the referees suggested a modification of our test that might have better finite-sample performance. Based on the inequality

which follows from simple algebra, the modified test statistic is

However, contrary to PCvMn the latter test statistic involves numerical integration and is much more difficult to compute. Therefore, we do not study this modified test statistic further in the paper. Instead, the next section studies the asymptotic distribution theory for PCvMn.

3. ASYMPTOTIC THEORY

Now, we establish the limit distribution of Rn1 under the null hypothesis H0. For the asymptotic theory, note that Rn1 can be viewed as a mapping from (Ω, F, P) with values in ℓ∞(Π), the space of all real-valued functions that are uniformly bounded on Π. Let ⇒ denote weak convergence on ℓ∞(Π) and →P* denote convergence in outer probability; see Definitions 1.3.3 and 1.9.1, respectively, in van der Vaart and Wellner (1996). Also, →d stands for convergence in distribution of real r.v.s. To derive asymptotic results we consider the following assumptions. First, let us denote by FY(·) and FX(·) the marginal c.d.f. of Y and X, respectively. Also let Ψp(·) be the product measure of Fβ(·) and the uniform distribution on Sd, i.e., Ψp(dβ,du) = Fβ(du) dβ. In the discussion that follows C is a generic constant that may change from one expression to another.

Assumption A1.

A1(a) {Zi = (Yi,Xi′)′}i=1n is a sequence of i.i.d. random vectors with 0 < E|Yi| < ∞.

A1(b) E|ε|² < C.

Assumption A2. f(·,θ) is twice continuously differentiable in a neighborhood Θ0 of θ0, Θ0 ⊂ Θ. The score g(X,θ) ≡ (∂/∂θ′) f(X,θ) is such that there exists an FX(·)-integrable function M(·) with supθ∈Θ0 |g(·,θ)| ≤ M(·).

Assumption A3.

A3(a) The parameter space Θ is a compact subset of a finite-dimensional Euclidean space. The true parameter θ0 belongs to the interior of Θ. There exists a θ1 ∈ Θ such that |θn − θ1| = oP(1).

A3(b) The estimator θn satisfies the following asymptotic expansion under H0:

θn = θ0 + n^{−1} ∑_{i=1}^{n} l(Yi,Xi,θ0) + oP(n^{−1/2}),

where l(·) is such that E[l(Y,X,θ0)] = 0 and L(θ0) ≡ E[l(Y,X,θ0) l′(Y,X,θ0)] exists and is positive definite.

Assumption A4. Ψp(·) is absolutely continuous with respect to Lebesgue measure on Π.

Assumptions A1 and A2 are standard in the model checks literature; see, e.g., Bierens (1990) and Stute (1997). Assumption A3 is satisfied, e.g., for the nonlinear least squares estimator and (under further regularity assumptions) its robust modifications; see, e.g., Chapters 5 and 7 in Koul (2002). Note that A3(a) and A3(b) imply that θ0 = θ1 under the null H0, but they are not necessarily equal under the alternative. We shall show subsequently that A3 is also satisfied for a new minimum distance estimator. Assumption A4 is only necessary for consistency of the test.

Under A1 and (2), using a classical central limit theorem (CLT) for i.i.d. sequences, we have that the finite-dimensional distributions of Rn, where Rn is the process defined in (5) with θ = θ0 and w(X,x) = 1(β′X ≤ u), converge to those of a multivariate normal distribution with a zero mean vector and variance-covariance matrix given by the covariance function

K(x1,x2) ≡ E[ε² 1(β1′X ≤ u1) 1(β2′X ≤ u2)],     (9)

where x1 = (β1′,u1)′ and x2 = (β2′,u2)′. The next result is an extension of this convergence to weak convergence in the space ℓ∞(Π). Throughout the paper x = (β′,u)′ will denote the nuisance parameter, and we interchange the notation x and (β′,u)′ whenever this does not create confusion.

THEOREM 1. Under the null hypothesis H0 and Assumption A1,

Rn ⇒ R,

where R(·) is a Gaussian process with zero mean and covariance function given by (9).

In practice θ0 is unknown and has to be estimated from a sample {Zi}i=1n by an estimator θn, say. The next result shows the effect of the parameter uncertainty on the asymptotic null distribution of Rn1. To this end, let us define the function G(x,θ0) ≡ G(x) ≡ E[g(X,θ0)1(β′X ≤ u)] and let V be a normal random vector with zero mean and variance-covariance matrix given by L(θ0) as defined in A3(b).

THEOREM 2. Under the null hypothesis H0 and Assumptions A1–A3,

Rn1 ⇒ R1,

where R is the same process as in Theorem 1 and

R1(x) ≡ R(x) − G′(x)V.
Theorem 2 and the continuous mapping theorem (CMT) (see, e.g., van der Vaart and Wellner, 1996, Thm. 1.3.6) yield the asymptotic null distribution of the functional PCvMn.

COROLLARY 1. Under the assumptions of Theorem 2, for any continuous functional (with respect to the supremum norm) Γ(·),

Γ(Rn1) →d Γ(R1).

Furthermore,

PCvMn →d ∫_Π |R1(β,u)|² Fβ(du) dβ.
Note that the integrating measure in PCvMn is a random measure, but Corollary 1 shows that the asymptotic theory is not affected by this fact. Also note that the asymptotic null distribution of PCvMn depends in a complex way on the DGP and the specification under the null, and so critical values have to be tabulated for each model and each DGP, making the application of these asymptotic results difficult in practice. To overcome this problem we approximate the asymptotic null distribution of continuous functionals of Rn1 by a bootstrap procedure given subsequently.

In Assumption A3 we require that the estimator of θ0 admits an asymptotic linear representation. For completeness of the presentation we give some mild sufficient conditions under which a minimum distance estimator (see Koul, 2002, Ch. 5, and references therein) is asymptotically linear. Motivated by Lemma 1, we have that under the null

E[(Y − f(X,θ0))1(β′X ≤ u)] = 0   for almost every (β,u) ∈ Π,     (10)

and θ0 is the unique value that satisfies (10). Then, we propose estimating θ0 by the sample analogue of (10), i.e.,

θn ≡ arg min_{θ∈Θ} ∫_Π | n^{−1/2} ∑_{i=1}^{n} (Yi − f(Xi,θ)) 1(β′Xi ≤ u) |² Fn,β(du) dβ.     (11)

This estimator is a minimum distance estimator and extends in some sense the generalized method of moments (GMM) estimator, frequently used in econometric and statistical applications. This kind of generalization of GMM was first considered in Carrasco and Florens (2000) for univariate problems. Recently, and for w(X,x) = 1(X ≤ x), Dominguez and Lobato (2004) have considered an estimator similar to (11) for a conditional moment restriction in a time series context. Also using this principle, Koul and Ni (2004) have proposed a minimum distance estimator for θ0 using an L2-distance similar to that used in Härdle and Mammen (1993) in the "local approach." Our estimator θn has the advantage of being free of any user-chosen parameter (bandwidth, kernel, or integrating measure) and is expected to be more robust to the problem of the curse of dimensionality than the estimating procedures based on 1(X ≤ x) or on local approaches. Now, we shall show that θn in (11) satisfies Assumption A3. The following matrices are involved in the asymptotic variance-covariance matrix of the estimator:

For the consistency and asymptotic normality of the estimator we need an additional assumption.

Assumption A1′. The regression function f(·,θ) is such that there exists an FX(·)-integrable function Kf(·) with supθ∈Θ |f(·,θ)| ≤ Kf(·).

THEOREM 3. Under H0 and Assumptions A1, A2, and A1′

(i) the estimator given in (11) is consistent, i.e., θn → θ0 a.s.;

(ii) if, in addition, the matrix C is nonsingular, then

From Theorem 3 we immediately obtain the asymptotic linear expansion required in A3(b):

where now

Note that in general the estimator given in (11) is not asymptotically efficient. An asymptotically efficient estimator based on the same minimum distance principle can be constructed following the ideas of Carrasco and Florens (2000). This optimal estimator will require the choice of a regularization parameter needed to invert a covariance operator; see Carrasco and Florens (2000) for more details.
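For readers who wish to experiment with the minimum distance estimator in (11), the following sketch (ours) minimizes the sample criterion with a generic optimizer, replacing the exact integral over the sphere by an average over fixed random directions, as in the PCvMn approximation given earlier; f is a user-supplied regression function and theta0 a starting value.

```python
import numpy as np
from scipy.optimize import minimize

def md_criterion(theta, Y, X, f, betas):
    """Sample criterion behind the estimator in (11), with the integral over the
    sphere replaced by an average over the fixed directions `betas` (a sketch,
    not the paper's exact algorithm)."""
    n = len(Y)
    e = Y - f(X, theta)                      # residuals e_i(theta)
    value = 0.0
    for beta in betas:
        proj = X @ beta
        R = np.array([(e * (proj <= u)).sum() for u in proj]) / np.sqrt(n)
        value += np.mean(R ** 2)
    return value / len(betas)

def md_estimate(Y, X, f, theta0, n_dirs=200, seed=0):
    rng = np.random.default_rng(seed)
    betas = rng.standard_normal((n_dirs, X.shape[1]))
    betas /= np.linalg.norm(betas, axis=1, keepdims=True)
    result = minimize(md_criterion, np.asarray(theta0, dtype=float),
                      args=(Y, X, f, betas), method="Nelder-Mead")
    return result.x
```

For instance, for a linear specification one would take f = lambda X, theta: X @ theta and theta0 equal to the least squares estimate.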

Now we study the asymptotic distribution of Rn1 under a sequence of local alternatives converging to the null at the parametric rate n^{−1/2}. We consider the local alternatives

HA,n : E[Y | X] = f(X,θ0) + n^{−1/2} a(X) a.s.,     (12)

where the random variable a(X) is FX-integrable with zero mean and satisfies P(a(X) = 0) < 1. To derive the next result we need the following assumption.

Assumption A3′. The estimator θn satisfies the following asymptotic expansion under HA,n:

θn = θ0 + n^{−1} ∑_{i=1}^{n} l(Yi,Xi,θ0) + n^{−1/2} ξa + oP(n^{−1/2}),

where the function l(·) is as in Assumption A3 and ξa is a vector of the same dimension as θ0.

Remark 1. It is not difficult to show that θn in (11) satisfies Assumption A3′ under Assumptions A1, A2, and A1′ with

THEOREM 4. Under the local alternatives (12) and Assumptions A1, A2, and A3′,

Rn1 ⇒ R1 + Da,

where R1 is the process defined in Theorem 2 and the function Da(·) is the deterministic function

Da(β,u) ≡ E[a(X)1(β′X ≤ u)] − G′(β,u)ξa.
For some estimators, Da has an intuitive geometric interpretation. For instance, for the new minimum distance estimator (11) the shift function is given by

and represents the orthogonal projection in L2(Π,Ψp), the Hilbert space of all real-valued and Ψp-square-integrable functions on Π, of E[a(X)1(β′X ≤ u)] parallel to G(β,u). The next corollary is a consequence of the CMT and Theorem 4.

COROLLARY 2. Under the local alternatives (12) and Assumptions A1, A2, and A3′, for any continuous functional Γ(·),

Γ(Rn1) →d Γ(R1 + Da).

Furthermore,

PCvMn →d ∫_Π |R1(β,u) + Da(β,u)|² Fβ(du) dβ.
Note that because of Lemma 1, we have that

E[a(X)1(β′X ≤ u)] = 0 for almost every (β,u) ∈ Π if and only if a(X) = 0 a.s.

Therefore, from this result it is not difficult to show that the test based on PCvMn is able to detect asymptotically any local alternative a(·) that is not parallel to g(·,θ0). This result is not attainable for tests based on the local approach, e.g., the Härdle and Mammen (1993) test.

We have seen before that the asymptotic null distribution of continuous functionals of Rn1 depends in a complicated way on the DGP and the specification under the null. Therefore, critical values for the test statistics cannot be tabulated for general cases. Here we propose to implement the test with the assistance of a bootstrap procedure. Resampling methods have been extensively used in the model checks literature of regression models; see, e.g., Stute et al. (1998) or more recently Li et al. (2003). It is shown in these papers that the most relevant bootstrap method for regression problems is the wild bootstrap (WB) introduced in Wu (1986). We approximate the asymptotic null distribution of Rn1 by that of

Rn1*(β,u) ≡ n^{−1/2} ∑_{i=1}^{n} ei*(θn*) 1(β′Xi ≤ u),

where the sequence {ei*(θn*)}i=1n are the fixed design wild bootstrap (FDWB) residuals computed from ei*(θn*) = Yi* − f(Xi,θn*), where Yi* = f(Xi,θn) + ei(θn)Vi, θn* is the bootstrap estimator calculated from the data {(Yi*,Xi′)′}i=1n, and {Vi}i=1n is a sequence of i.i.d. random variables with zero mean, unit variance, and bounded support and also independent of the sequence {Zi}i=1n. Examples of {Vi}i=1n sequences are i.i.d. Bernoulli variates with

P(Vi = (1 − √5)/2) = (1 + √5)/(2√5)   and   P(Vi = (1 + √5)/2) = 1 − (1 + √5)/(2√5),     (13)

used in, e.g., Li et al. (2003). For other sequences see Mammen (1993). The reader is referred to Stute et al. (1998) for the theoretical justification of this bootstrap approximation and the assumptions needed. The results of these authors jointly with those proved here ensure that the proposed bootstrap test has a correct asymptotic level, is consistent, and is able to detect alternatives tending to the null at the parametric rate n^{−1/2}. The next section shows that this bootstrap procedure provides good approximations in finite samples.
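A compact sketch (ours) of the fixed design wild bootstrap scheme just described; `fit` stands for any estimator satisfying A3 (e.g., nonlinear least squares) and `statistic` for a functional of the residuals such as PCvMn, both supplied by the user.

```python
import numpy as np

def fdwb_pvalue(Y, X, f, fit, statistic, B=500, seed=0):
    """Fixed design wild bootstrap p-value for a residual-based test statistic
    (illustrative sketch of the scheme described above, not the paper's code)."""
    rng = np.random.default_rng(seed)
    theta_n = fit(Y, X)
    e = Y - f(X, theta_n)                         # residuals e_i(theta_n)
    t_obs = statistic(e, X)
    # two-point distribution in (13): zero mean, unit variance, bounded support
    a, b = (1 - np.sqrt(5)) / 2, (1 + np.sqrt(5)) / 2
    p_a = (1 + np.sqrt(5)) / (2 * np.sqrt(5))
    exceed = 0
    for _ in range(B):
        V = rng.choice([a, b], size=len(Y), p=[p_a, 1 - p_a])
        Y_star = f(X, theta_n) + e * V            # Y_i* = f(X_i, theta_n) + e_i(theta_n) V_i
        theta_star = fit(Y_star, X)               # bootstrap estimator theta_n*
        e_star = Y_star - f(X, theta_star)        # FDWB residuals e_i*(theta_n*)
        exceed += statistic(e_star, X) >= t_obs
    return exceed / B
```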

4. MONTE CARLO EVIDENCE

In this section we compare the new CvM test with some competing integrated-approach-based tests proposed in the literature. This study complements others considered in the literature; see, e.g., Miles and Mora (2003). We briefly describe our simulation setup. We denote by PCvMn the new CvM test defined in (8). For the explicit computation of PCvMn see Appendix B.

Bierens (1982, p. 111) proposed the CvM test statistic based on the exponential weight function w(X,x) = exp(ix′X) and the d-variate normal density function as the integrating function, i.e.,

We also consider here the CvM and KS statistics defined in Stute (1997), which are given by

CvMn ≡ n^{−1} ∑_{j=1}^{n} |Rn,w1(Xj)|²   and   KSn ≡ max_{1≤j≤n} |Rn,w1(Xj)|,   with w(X,x) = 1(X ≤ x),

respectively. Note that CvMn and PCvMn are the same test statistics when d = 1, by definition.

Recently, Stute and Zhu (2002) have considered an innovation process transformation of Rn1(βn,u) for testing the correct specification of GLMs, where βn is a suitable estimator of the GLM parameter, say, β0. More concretely, their test statistic is the CvM test

where

ann(u) and σnn2(u) are Nadaraya–Watson estimators of aβ0(u) = E[g(X,θ0) | β0′X = u] and σβ02(u) = E[ε² | β0′X = u], respectively,

is the 99% quantile of Fn,βn. Under the correct specification of the GLM and some additional assumptions,

SZn →d ∫01 B²(u) du,

where B(·) denotes a standard Brownian motion on [0,1]; see Stute and Zhu (2002) for further details. For the nonparametric estimators we have chosen a Gaussian kernel with bandwidth h = 0.5n^{−1/2}, as in Stute and Zhu (2002).

We consider the same FDWB for the version of the exponential Bierens test and for the Stute (1997) test as for our CvM test PCvMn. For SZn we consider empirical critical values based on 10,000 simulations on the first null model in each block of models. In the discussion that follows, εi ∼ iid N(0,1) and νi ∼ iid exp(1) are standard Gaussian and centered exponential noises, respectively. We consider in the simulations two blocks of models. In the first block the null model is

E[Yi | Xi] = Xi′α,   Xi ≡ (1, X1i, X2i)′,

where X1i = (Wi + W1i)/2 and X2i = (Wi + W2i)/2, and Wi, W1i, and W2i are i.i.d. U[0,2π], independent of εi, 1 ≤ i ≤ n. We examine the adequacy of this model under the following DGPs:

  1. DGP1: Yi = 1 + X1i + X2i + εi = Xi′α0 + εi.
  2. DGP1-EXP: Yi = 1 + X1i + X2i + νi = Xi′α0 + νi.
  3. DGP2: Yi = Xi′α0 + 0.1(W1i − π)(W2i − π) + εi.
  4. DGP3: Yi = Xi′α0 + Xi′α0 exp{−0.01(Xi′α0)²} + εi.
  5. DGP4: Yi = Xi′α0 + cos(0.6πXi′α0) + εi.

DGP1 and DGP2 are considered in Hong and White (1995). DGP3 here is similar to their DGP3; see also Koul and Stute (1999). DGP4 is similar to that considered in Eubank and Hart (1992). DGP1-EXP is considered here to show the robustness of the tests against fatter tailed error distributions. For the first block of models we consider a sample size of n = 50, 100, and 300. The number of Monte Carlo experiments is 1,000, and the number of bootstrap replications is B = 500. For the bootstrap approximation we employ the sequence {Vi}i=1n of i.i.d. Bernoulli variates given in (13). We estimate the null model by the usual least squares estimator. The nominal levels are 10%, 5%, and 1%.
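As an illustration of the first block of designs, a data-generating sketch (ours, following the formulas above):

```python
import numpy as np

def simulate_block1(n, dgp="DGP1", seed=None):
    """Generate one sample (Y, X) from the first block of designs described
    above (illustrative sketch; X contains the intercept and the two regressors)."""
    rng = np.random.default_rng(seed)
    W, W1, W2 = (rng.uniform(0, 2 * np.pi, n) for _ in range(3))
    X1, X2 = (W + W1) / 2, (W + W2) / 2
    lin = 1 + X1 + X2                            # X_i' alpha_0 with alpha_0 = (1, 1, 1)'
    eps = rng.standard_normal(n)                 # epsilon_i ~ N(0, 1)
    if dgp == "DGP1":
        Y = lin + eps
    elif dgp == "DGP1-EXP":
        Y = lin + rng.exponential(1.0, n) - 1.0  # centered exp(1) errors nu_i
    elif dgp == "DGP2":
        Y = lin + 0.1 * (W1 - np.pi) * (W2 - np.pi) + eps
    elif dgp == "DGP3":
        Y = lin + lin * np.exp(-0.01 * lin ** 2) + eps
    elif dgp == "DGP4":
        Y = lin + np.cos(0.6 * np.pi * lin) + eps
    else:
        raise ValueError("unknown design")
    X = np.column_stack([np.ones(n), X1, X2])    # regressors of the null linear model
    return Y, X
```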

In Table 1 we show the empirical rejection probabilities (RP) associated with models DGP1 and DGP1-EXP. The empirical levels of the test statistics are close to the nominal level, even for sample sizes as small as 50. The empirical levels for DGP1-EXP are less accurate than for DGP1 but are reasonable, showing that the tests are robust to fat-tailed error distributions.

Table 1. Empirical size of tests

In Table 2 we report the empirical power against DGP2. It increases with the sample size n for all test statistics, as expected. The new CvM test PCvMn has the best empirical power in all cases. The empirical power of CvMn,exp is reasonable: it is greater than or equal to that of CvMn and KSn for n = 50, and clearly better for n = 100 and 300. The Stute and Zhu (2002) test, SZn, is the worst against this alternative. The rejection probabilities of PCvMn are comparable to the best test in Hong and White (1995) against this alternative. In Table 3 we show the RP for DGP3. For this alternative SZn and our test statistic, PCvMn, generally have the best empirical powers, with SZn performing slightly better than PCvMn. Bierens' test CvMn,exp has good power properties for this alternative. Stute's test CvMn performs similarly to CvMn,exp, whereas KSn presents the worst results, with moderate power. For DGP4 (Table 4), PCvMn and CvMn,exp have excellent empirical power. Stute's tests, CvMn and KSn, and the Stute and Zhu (2002) test, SZn, have low power against this "high-frequency" alternative.

Table 2. Empirical power of tests

Table 3. Empirical power of tests

Table 4. Empirical power of tests

The second block of models is taken from Zhu (2003). The null model is

whereas the DGPs considered are

where Xi is a random d-dimensional covariate with i.i.d. U[0,2π] marginal components, d = 3 and 6. When d = 3, γ0 = (1,1,2)′ and β0 = (2,1,1)′, and when d = 6, γ0 = (1,2,3,4,5,6)′ and β0 = (6,5,4,3,2,1)′. Furthermore, we set b = 0.01, 0.02,…,0.1 when d = 3 and b = 0.001, 0.002,…,0.01 when d = 6. This experiment provides us with evidence of the power performance of the tests under local alternatives (b = 0 corresponds to the null hypothesis). The sample size is n = 25; the rest of the Monte Carlo parameters are as before.

We show the RP for these models in Figure 1. We see that in both cases, d = 3 and 6, our new test statistic PCvMn and SZn have the best empirical powers for all values of b. Neither of them is superior to the other for all values of b and for both models. For d = 3, SZn performs slightly better than PCvMn. They are followed by CvMn,exp. For d = 6, PCvMn has the best power for b ≤ 0.006, whereas SZn is the best for b > 0.006; CvMn,exp, CvMn, and KSn have very low empirical power against this alternative.

Figure 1. Rejection probabilities plots for d = 3 and 6. The solid, solid-star, dot, dash, and dash-dot lines are, respectively, for the empirical power of PCvMn, SZn, CvMn,exp, CvMn, and KSn.

Summarizing, these two Monte Carlo experiments show that our test possesses excellent power performance in finite samples for the alternatives considered. In all cases, our test has the best empirical power or is comparable to the best test among those proposed by Bierens (1982), Stute (1997), and Stute and Zhu (2002). In our Monte Carlo experiments we have focused on integrated-approach-based tests. Miles and Mora (2003) have compared through simulations some local-based and integrated-based tests. These authors conclude that for one-dimensional regressors, the integrated-approach-based tests perform slightly better than the smoothing-based ones, especially Bierens' statistic. When the number of regressors is greater than one, some of the smoothing tests considered by these authors perform better. Therefore, it is important to compare our new test with the smoothing-based tests considered by these authors, especially for the case of multivariate regressors. This study is beyond the scope of this paper and is deferred to future research. Our test has the advantage that no bandwidth selection is required, though its implementation requires a bootstrap procedure. Our Monte Carlo experiments show that our test should be considered a reasonable competitor to the best local-approach-based tests and a valuable diagnostic procedure for regression modeling.

APPENDIX A: Proofs

Proof of Lemma 1. This follows easily from Part I of Theorem 1 in Bierens (1982). █

Proof of Theorem 1. By a classical CLT we can show that the finite-dimensional distributions of Rn converge to those of the Gaussian process R. The asymptotic equicontinuity of Rn follows by a direct application of Theorem 2.5.2 in van der Vaart and Wellner (1996); see also their Problem 14 on p. 152. █

Proof of Theorem 2. Applying the classical mean value theorem argument we have

where

and where θ̄n (a mean value between θn and θ0) satisfies

|θ̄n − θ0| ≤ |θn − θ0|.

By Assumptions A1–A3, the generalization by Wolfowitz (1954) of the Glivenko–Cantelli theorem, and the uniform law of large numbers (ULLN) of Jennrich (1969), it is easy to show that I = oP(1) and II = oP(1) uniformly in x ∈ Π. Thus the theorem follows from Theorem 1 and Assumption A3. █

Proof of Corollary 1. For a nonrandom continuous functional, the result follows from the CMT and Theorem 2. For PCvMn the result follows because under the conditions of Theorem 2 we have that Rn1 is asymptotically tight and hence Lemma 3.1 in Chang (1990) applies. █

Proof of Theorem 3. The proof follows exactly the same steps as the proof of Theorems 1 and 2 in Dominguez and Lobato (2004), and thus it is omitted. █

Proof of Theorem 4. Under the local alternatives (12) write

with

Using A3′ as in Theorem 2, we obtain

uniformly in x ∈ Π. On the other hand, using the results of Wolfowitz (1954), we have uniformly in x ∈ Π

Using the preceding equations and (A.1), the theorem follows from Theorem 1 and Assumption A3′. █

APPENDIX B: Computation of the Test Statistic

By simple algebra,

PCvMn = (1/n²) ∑_{i=1}^{n} ∑_{j=1}^{n} ∑_{r=1}^{n} ei(θn) ej(θn) Aijr,   where   Aijr ≡ ∫_{Sd} 1(β′Xi ≤ β′Xr) 1(β′Xj ≤ β′Xr) dβ.

For d > 1, note that the integral Aijr is proportional to the volume of a spherical wedge, and hence we can compute it from the formula

Aijr = Aijr(0) π^{d/2−1} / Γ(d/2 + 1),

where Aijr(0) is the complementary angle between the vectors (Xi − Xr) and (Xj − Xr), measured in radians, and Γ(·) is the gamma function. Thus, Aijr(0) is given by

Aijr(0) = π − arccos( (Xi − Xr)′(Xj − Xr) / (|Xi − Xr| |Xj − Xr|) ).

Hence, the computation of these integrals is simple. In addition, there are some restrictions on the integrals Aijr that make the computation simpler: e.g., if Xi = Xj and Xi ≠ Xr then Aijr(0) = π, whereas if Xi = Xj and Xi = Xr then Aijr(0) = 2π. If Xi ≠ Xj and Xi = Xr or Xj = Xr, we have that Aijr(0) = π. Also, the symmetry property Aijr = Ajir holds.
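A direct implementation (ours) of the triple-sum expression reconstructed above; since any constant multiplying Aijr only rescales PCvMn, and the bootstrap is applied to the same statistic, the exact normalization of the spherical measure is immaterial for the resulting test.

```python
import numpy as np
from scipy.special import gamma

def pcvm_closed_form(residuals, X):
    """PCvM_n via the triple sum of Appendix B (sketch based on the formulas
    reconstructed above; valid for d > 1). Memory is O(n^3), fine for illustration."""
    n, d = X.shape
    D = X[:, None, :] - X[None, :, :]                        # D[i, r] = X_i - X_r
    norms = np.linalg.norm(D, axis=-1)                       # |X_i - X_r|
    inner = np.einsum('ird,jrd->ijr', D, D)                  # (X_i - X_r)'(X_j - X_r)
    denom = norms[:, None, :] * norms[None, :, :]            # |X_i - X_r| |X_j - X_r|
    with np.errstate(invalid='ignore', divide='ignore'):
        cos = np.clip(inner / denom, -1.0, 1.0)
        A0 = np.pi - np.arccos(cos)                          # complementary angle
    # degenerate cases, following the conventions stated in the text
    zero_ir = norms == 0                                     # X_i = X_r
    A0 = np.where(zero_ir[:, None, :] | zero_ir[None, :, :], np.pi, A0)
    A0 = np.where(zero_ir[:, None, :] & zero_ir[None, :, :], 2 * np.pi, A0)
    A = A0 * np.pi ** (d / 2 - 1) / gamma(d / 2 + 1)
    e = np.asarray(residuals, dtype=float)
    return np.einsum('i,j,ijr->', e, e, A) / n ** 2
```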


REFERENCES

Bierens, H.J. (1982) Consistent model specification tests. Journal of Econometrics 20, 105–134.
Bierens, H.J. (1990) A consistent conditional moment test of functional form. Econometrica 58, 1443–1458.
Bierens, H.J. & W. Ploberger (1997) Asymptotic theory of integrated conditional moment tests. Econometrica 65, 1129–1151.
Carrasco, M. & J.P. Florens (2000) Generalization of GMM to a continuum of moment conditions. Econometric Theory 16, 797–834.
Chang, N.M. (1990) Weak convergence of a self-consistent estimator of a survival function with doubly censored data. Annals of Statistics 18, 391–404.
Chen, S.X., W. Härdle, & M. Li (2003) An empirical likelihood goodness-of-fit test for time series. Journal of the Royal Statistical Society, Series B 65, 663–678.
Dominguez, M. & I. Lobato (2004) Consistent estimation of models defined by conditional moment restrictions. Econometrica 72, 1601–1615.
Escanciano, J.C. (2006) Goodness-of-fit tests for linear and nonlinear time series models. Journal of the American Statistical Association 101, 531–541.
Eubank, R. & J. Hart (1992) Testing goodness-of-fit in regression via order selection criteria. Annals of Statistics 20, 1412–1425.
Eubank, R. & S. Spiegelman (1990) Testing the goodness of fit of a linear model via nonparametric regression techniques. Journal of the American Statistical Association 85, 387–392.
Fan, J. & I. Gijbels (1996) Local Polynomial Modelling and Its Applications. Chapman and Hall.
Gozalo, P.L. (1993) A consistent model specification test for nonparametric estimation of regression function models. Econometric Theory 9, 451–477.
Guerre, E. & P. Lavergne (2005) Rate-optimal data-driven specification testing for regression models. Annals of Statistics 33, 840–870.
Härdle, W. & E. Mammen (1993) Comparing nonparametric versus parametric regression fits. Annals of Statistics 21, 1926–1947.
Hong, Y. & H. White (1995) Consistent specification testing via nonparametric series regression. Econometrica 63, 1133–1159.
Horowitz, J.L. & W. Härdle (1994) Testing a parametric model against a semiparametric alternative. Econometric Theory 10, 821–848.
Horowitz, J.L. & V.G. Spokoiny (2001) An adaptive, rate-optimal test of a parametric mean-regression model against a nonparametric alternative. Econometrica 69, 599–631.
Jennrich, R.I. (1969) Asymptotic properties of nonlinear least squares estimators. Annals of Mathematical Statistics 40, 633–643.
Koul, H.L. (2002) Weighted Empirical Processes in Dynamic Nonlinear Models, 2nd ed. Lecture Notes in Statistics, vol. 166. Springer-Verlag.
Koul, H.L. & P. Ni (2004) Minimum distance regression model checking. Journal of Statistical Planning and Inference 119, 109–144.
Koul, H.L. & W. Stute (1999) Nonparametric model checks for time series. Annals of Statistics 27, 204–236.
Li, Q. (1999) Consistent model specification test for time series econometric models. Journal of Econometrics 92, 101–147.
Li, Q., C. Hsiao, & J. Zinn (2003) Consistent specification tests for semiparametric/nonparametric models based on series estimation methods. Journal of Econometrics 112, 295–325.
Mammen, E. (1993) Bootstrap and wild bootstrap for high-dimensional linear models. Annals of Statistics 21, 255–285.
McCullagh, P. & J. Nelder (1989) Generalized Linear Models. Monographs on Statistics and Applied Probability 37. Chapman and Hall.
Miles, D. & J. Mora (2003) On the performance of nonparametric specification tests in regression models. Computational Statistics & Data Analysis 42, 477–490.
Powell, J.L., J.M. Stock, & T.M. Stoker (1989) Semiparametric estimation of index coefficients. Econometrica 57, 1403–1430.
Stinchcombe, M. & H. White (1998) Consistent specification testing with nuisance parameters present only under the alternative. Econometric Theory 14, 295–325.
Stute, W. (1997) Nonparametric model checks for regression. Annals of Statistics 25, 613–641.
Stute, W., W. Gonzalez-Manteiga, & M. Presedo-Quindimil (1998) Bootstrap approximations in model checks for regression. Journal of the American Statistical Association 93, 141–149.
Stute, W. & L.X. Zhu (2002) Model checks for generalized linear models. Scandinavian Journal of Statistics 29, 535–545.
Tripathi, G. & Y. Kitamura (2003) Testing conditional moment restrictions. Annals of Statistics 31, 2059–2095.
van der Vaart, A.W. & J.A. Wellner (1996) Weak Convergence and Empirical Processes. Springer-Verlag.
Whang, Y.-J. (2000) Consistent bootstrap tests of parametric regression functions. Journal of Econometrics 98, 27–46.
Wolfowitz, J. (1954) Generalization of the theorem of Glivenko-Cantelli. Annals of Mathematical Statistics 25, 131–138.
Wooldridge, J. (1992) A test for functional form against nonparametric alternatives. Econometric Theory 8, 452–475.
Wu, C.F.J. (1986) Jackknife, bootstrap and other resampling methods in regression analysis (with discussion). Annals of Statistics 14, 1261–1350.
Yatchew, A.J. (1992) Nonparametric regression tests based on least squares. Econometric Theory 8, 435–451.
Zheng, X. (1996) A consistent test of functional form via nonparametric estimation technique. Journal of Econometrics 75, 263–289.
Zhu, L.X. (2003) Model checking of dimension-reduction type for regression. Statistica Sinica 13, 283–296.