
A PORTMANTEAU TEST FOR SERIALLY CORRELATED ERRORS IN FIXED EFFECTS MODELS

Published online by Cambridge University Press:  30 August 2006

Atsushi Inoue, North Carolina State University
Gary Solon, University of Michigan

Abstract

We propose a portmanteau test for serial correlation of the error term in a fixed effects model. The test is derived as a Lagrange multiplier test, but it also has a straightforward Wald test interpretation. In Monte Carlo experiments, the test displays good size and power properties.

The authors thank the co-editor, the referee, David Drukker, Christian Hansen, and Jeffrey Wooldridge for their helpful comments.

Type
Research Article
Copyright
© 2006 Cambridge University Press

1. INTRODUCTION

Empirical researchers frequently use longitudinal data to estimate fixed effects models of the form

$$y_{it} = x_{it}'\beta + c_i + \varepsilon_{it}, \tag{1}$$

where $i = 1,2,\ldots,N$ indexes cross-sectional units (such as individuals, firms, states in a country, or countries) and $t = 1,2,\ldots,T$ indexes time periods. In analyses of longitudinal microdata, $T$ typically is fairly small. The $K$ explanatory variables in the $x_{it}$ vector are commonly assumed to be strictly exogenous,[1] whereas the “fixed effect” $c_i$ is a time-invariant unit-specific effect that may be correlated with elements of $x_{it}$ but not with the error term $\varepsilon_{it}$. If $\varepsilon_{it}$ is independent and identically distributed (i.i.d.) $N(0,\sigma^2)$, the efficient estimator of $\beta$ is the “fixed effects estimator” that applies ordinary least squares (OLS) to the mean-differenced regression of $y_{it} - \bar{y}_i$ on $x_{it} - \bar{x}_i$, where $\bar{y}_i = T^{-1}\sum_{t=1}^{T} y_{it}$ and $\bar{x}_i = T^{-1}\sum_{t=1}^{T} x_{it}$. An alternative way of computing the same estimator is to apply OLS to the regression of $y_{it}$ on $x_{it}$ and a vector of unit-specific dummy variables.

[1] See Arellano and Honore (2001) for a discussion of the additional issues that arise when the explanatory variables include lagged dependent variables.
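The equivalence between the mean-differenced regression and the dummy-variable regression can be checked numerically. The following is a minimal sketch on simulated data; the sample sizes, the coefficient value 2.0, and all variable names are arbitrary choices made for illustration and do not come from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 30, 4
x = rng.normal(size=(N, T))
c = rng.normal(size=(N, 1))                 # unit effects, arbitrary illustrative values
y = 2.0 * x + c + rng.normal(size=(N, T))   # beta = 2.0 is an arbitrary choice

# (a) Within estimator: OLS on deviations from unit means.
xd = (x - x.mean(axis=1, keepdims=True)).ravel()
yd = (y - y.mean(axis=1, keepdims=True)).ravel()
beta_within = (xd @ yd) / (xd @ xd)

# (b) Dummy-variable estimator: OLS of y on x and N unit dummies.
dummies = np.kron(np.eye(N), np.ones((T, 1)))   # (N*T) x N, one column per unit
Z = np.column_stack([x.ravel(), dummies])
coef, *_ = np.linalg.lstsq(Z, y.ravel(), rcond=None)
beta_lsdv = coef[0]

print(np.isclose(beta_within, beta_lsdv))  # True
```

The two slope estimates agree up to floating-point error, as the Frisch–Waugh–Lovell theorem implies.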

Often, however, the error term is not i.i.d. but instead is serially correlated. This occurs in longitudinal data for the same reasons it frequently occurs in single time series—mainly because of left-out variables that evolve gradually over time. Quite strangely, researchers who learned in introductory econometrics always to check for serial correlation when estimating time series regressions completely forget this lesson when estimating fixed effects regressions with multiple time series. When Kezdi (2002) scoured three recent years' issues of the American Economic Review, Journal of Political Economy, and Quarterly Journal of Economics, he found that, of the 42 articles that estimated fixed effects models, 36 paid no attention whatsoever to the serial correlation issue. Similarly, Bertrand, Duflo, and Mullainathan (2004), who focused on the “differences in differences” special case in which the explanatory variable of main interest is a binary policy variable, located 65 articles that appeared in the same journals plus three applied field journals over the 1990–2000 period, and they found that 60 of those 65 studies totally ignored serial correlation. The trouble with this state of affairs is that ignoring serial correlation in the fixed effects context has the same poor consequences that it has with a single time series: it leads to inconsistent estimation of standard errors and hence to inappropriate hypothesis tests, and it also leads to inefficient estimation of the regression coefficients.

We conjecture that practitioners' inattention to serial correlation in fixed effects models is partly due to a lack of simple diagnostics. Therefore, in the next section, we present a straightforward statistic for testing the null hypothesis of no serial correlation against a general alternative that at least some of the autocorrelations are nonzero. Like the Box–Pierce test in the context of a single time series, our test is a portmanteau test in the sense that it is sensitive to serial correlation at many orders instead of just the first order.[2]

[2] Existing tests for serial correlation in fixed effects models are discussed in Section 3.

When our test rejects the null hypothesis, as it often will, practitioners should proceed in the same three ways that they do in the time series context. First, they should consider whether the error term's serial correlation is a symptom of model misspecification.[3] Second, at a minimum, they should use a robust covariance matrix estimator to correct their estimated standard errors (Arellano, 1987; Kezdi, 2002). Third, they should consider attempting more efficient coefficient estimation through a feasible generalized least squares (GLS) procedure (Kiefer, 1980; Nickell, 1980; Bhargava, Franzini, and Narendranathan, 1982; Solon, 1984b; Hansen, 2003).

[3] For example, Solon (1984a), upon finding large positive autocorrelations at low orders and large negative ones at high orders, recognized that he needed to add state-specific time trends to his model. Note that an advantage of our test relative to existing alternatives is that its attention to higher order autocorrelations sometimes may help with identifying specification problems.

2. A PORTMANTEAU TEST

We can rewrite the model in equation (1) in matrix notation as

$$y_i = X_i\beta + \ell_T c_i + \varepsilon_i,$$

where $\ell_T$ is the $T$-dimensional column vector of ones, $y_i = [y_{i1}\; y_{i2}\; \cdots\; y_{iT}]'$, $X_i = [x_{i1}\; x_{i2}\; \cdots\; x_{iT}]'$, and $\varepsilon_i = [\varepsilon_{i1}\; \varepsilon_{i2}\; \cdots\; \varepsilon_{iT}]'$. Letting $\Sigma = E(\varepsilon_i\varepsilon_i')$, we wish to test the null hypothesis $\Sigma = \sigma^2 I_T$ against the alternative that at least some off-diagonal elements of $\Sigma$ are nonzero.[4]

[4] Like the existing tests for serial correlation in fixed effects models, the initial version of our test assumes homoskedasticity. At the end of Section 2, we describe a modification of our test that allows the variance of $\varepsilon_{it}$ to vary with $t$.

To devise a powerful test, we will start with a Lagrange multiplier (LM) approach under the assumption that εi is normally distributed. It will turn out, though, that the resulting test has a straightforward Wald test interpretation even when εi is nonnormal.

Because of the incidental parameters $c_i$, one cannot construct an LM test based on the likelihood function. Thus we construct a conditional likelihood function based on a sufficient statistic for the individual-specific effect $c_i$ (for this approach to the logit model for panel data, see Chamberlain, 1980). When $\Sigma = \sigma^2 I_T$, the sufficient statistic is $\bar{y}_i$. When $\Sigma \neq \sigma^2 I_T$, however, the sufficient statistic is $(\ell_T'\Sigma^{-1}y_i)/(\ell_T'\Sigma^{-1}\ell_T)$. Then the conditional log-likelihood function is given by

Because of the mean-differencing transformation, the estimated covariance matrix is singular and is of rank $T - 1$. Let $\varsigma_k$ be the $(T-1)(T-2)/2 \times 1$ parameter vector obtained by stacking the lower-diagonal elements of the $(T-1) \times (T-1)$ matrix that remains after deleting the $k$th column and row of the covariance matrix $\Sigma$.[5]

[5] Issues in the choice of $k$ are discussed in Sections 2 and 3.

Under the null hypothesis of no autocorrelation, $\varsigma_k$ is a vector of zeros. Let $D_{k,T}$ denote the $T^2 \times (T-1)(T-2)/2$ matrix such that $D_{k,T} = \partial\,\mathrm{vec}(\Sigma)/\partial\varsigma_k'$, where vec is the vec operator that transforms a matrix into a column vector by stacking the columns of the matrix. When $k = 2$ and $T = 3$, for example,
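The definition of $D_{k,T}$ can be made concrete with a short Python sketch. With $k = 2$ and $T = 3$, the only off-diagonal parameter left after deleting the second row and column is $\sigma_{31} = \sigma_{13}$, so working from the definition gives the $9 \times 1$ vector $[0\; 0\; 1\; 0\; 0\; 0\; 1\; 0\; 0]'$. The function name `selection_matrix` and the column-by-column ordering of the elements of $\varsigma_k$ are assumptions of this sketch, not taken from the article.

```python
import numpy as np

def selection_matrix(k, T):
    """Build D_{k,T} = d vec(Sigma) / d varsigma_k' for a symmetric T x T Sigma.

    k is the 1-based index of the deleted period. The ordering of the
    elements of varsigma_k (column by column below the diagonal) is an
    assumption of this sketch.
    """
    kept = [t for t in range(T) if t != k - 1]       # 0-based periods that remain
    pairs = [(kept[r], kept[c])                      # (row, col) with row > col
             for c in range(len(kept))
             for r in range(c + 1, len(kept))]
    D = np.zeros((T * T, len(pairs)), dtype=int)
    for j, (r, c) in enumerate(pairs):
        # vec stacks columns, so element (r, c) sits at position c*T + r.
        D[c * T + r, j] = 1
        D[r * T + c, j] = 1                          # symmetric counterpart
    return D

D = selection_matrix(k=2, T=3)
print(D.ravel())   # [0 0 1 0 0 0 1 0 0]
```

For $T = 5$ and any $k$, the same function returns a $25 \times 6$ matrix, matching the $(T-1)(T-2)/2 = 6$ degrees of freedom of the test.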

The gradient of the conditional log-likelihood function with respect to ςk is

When it is evaluated under the null hypothesis, it can be written as

where

. Let

Then under the null hypothesis, the distribution of the infeasible LM statistic

converges to a $\chi^2$ distribution with $(T-1)(T-2)/2$ degrees of freedom as $N \to \infty$. Thus, our test will be applicable in the typical panel data setting in which $T$ may be small but $N$ is large.

As is typical in the application of LM tests, operationalizing our test requires substituting estimated values for unknown parameters in the test statistic. First, write the fixed effects estimator

as

and let

Second, compute the feasible LM statistic:

where

THEOREM 1. Suppose that

(a) $X_i, \varepsilon_i$ are i.i.d. and have finite fourth moments.

(b) $E(\varepsilon_i \mid X_i, c_i) = 0$.

(c) $\mathrm{rank}[E(X_i' M X_i)] = \dim(x_{it})$.

(d) $T \geq 3$.

(e) $D_{k,T}' V D_{k,T}$ is nonsingular.

Then we have the following:

(i) Under the null hypothesis that $\Sigma = \sigma^2 I_T$, the LM test statistic is asymptotically distributed (as $N \to \infty$) as $\chi^2((T-1)(T-2)/2)$.

(ii) Under a sequence of local alternatives such that

where $C = \{c_{ij}\}$ is a $T \times T$ symmetric matrix whose diagonal elements are all zeros, the LM test statistic is asymptotically distributed as the noncentral $\chi^2((T-1)(T-2)/2)$ with noncentrality parameter $\delta' D_{k,T}(D_{k,T}' V D_{k,T})^{-1} D_{k,T}'\delta$ where

The proof of Theorem 1 is provided in the Technical Appendix. Condition (d) is not restrictive because, with $T = 2$, the fixed effects estimator is identical to the first-difference estimator and, with only one transformed error term $\varepsilon_{i2} - \varepsilon_{i1}$ per cross-sectional unit, serial correlation is not an issue. Condition (e) is not automatically implied by the other conditions but is extremely unlikely to be violated in practice.
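The remark on condition (d), that with $T = 2$ the fixed effects and first-difference estimators coincide, can be verified on simulated data. The sketch below assumes a scalar regressor and no intercept or time dummies; the sample size, the coefficient 1.5, and the variable names are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 200, 2
x = rng.normal(size=(N, T))
c = rng.normal(size=(N, 1))                 # unit-specific effects (hypothetical values)
y = 1.5 * x + c + rng.normal(size=(N, T))   # beta = 1.5 is an arbitrary choice

# Fixed effects: OLS on deviations from unit means.
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
beta_fe = (xd * yd).sum() / (xd ** 2).sum()

# First differences: OLS of (y_i2 - y_i1) on (x_i2 - x_i1), no intercept.
dx = x[:, 1] - x[:, 0]
dy = y[:, 1] - y[:, 0]
beta_fd = (dx * dy).sum() / (dx ** 2).sum()

print(np.isclose(beta_fe, beta_fd))  # True
```

With two periods, each demeaned observation is plus or minus half the first difference, so the two least squares ratios are algebraically the same.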

Although we have motivated our test as an LM test, inspection of the test statistic reveals that it also is a Wald statistic that checks whether the sample autocovariances of the fixed effects residuals are significantly different from their population counterparts under the null hypothesis. As explained in Wooldridge (2002, pp. 270, 274–275), autocovariances of the fixed effects residuals consistently estimate those of $e_{it} = \varepsilon_{it} - \bar{\varepsilon}_i$, not $\varepsilon_{it}$. As a result, under the null hypothesis that $\varepsilon_{it}$ is serially uncorrelated, the sample autocovariances of the fixed effects residuals converge not to zero but rather to $-\sigma^2/T$. Our test ascertains whether the discrepancies between the sample autocovariances and the estimate of $-\sigma^2/T$ are statistically significant. Equivalently, it tests whether the sample autocorrelations are significantly different from $-1/(T-1)$.
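The two benchmark values follow from a short calculation. The derivation below is a sketch that assumes only that the $\varepsilon_{it}$ are serially uncorrelated with common variance $\sigma^2$, and it writes $e_{it} = \varepsilon_{it} - \bar{\varepsilon}_i$ with $\bar{\varepsilon}_i = T^{-1}\sum_{t} \varepsilon_{it}$ (the bar notation is supplied here for illustration). It uses $\mathrm{Cov}(\varepsilon_{it}, \bar{\varepsilon}_i) = \mathrm{Var}(\bar{\varepsilon}_i) = \sigma^2/T$.

```latex
\begin{aligned}
\operatorname{Var}(e_{it})
  &= \sigma^2 - \frac{2\sigma^2}{T} + \frac{\sigma^2}{T}
   = \sigma^2\,\frac{T-1}{T}, \\
\operatorname{Cov}(e_{it}, e_{is})
  &= 0 - \frac{\sigma^2}{T} - \frac{\sigma^2}{T} + \frac{\sigma^2}{T}
   = -\frac{\sigma^2}{T}, \qquad t \neq s, \\
\operatorname{Corr}(e_{it}, e_{is})
  &= \frac{-\sigma^2/T}{\sigma^2 (T-1)/T}
   = -\frac{1}{T-1}.
\end{aligned}
```

For the panel lengths used in the Monte Carlo section, the benchmark autocorrelation is $-1/4$ when $T = 5$ and $-1/7$ when $T = 8$.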

Because $(D_{k,T}' V D_{k,T})^{-1}$ is positive definite, the LM test has nontrivial local power provided $D_{k,T}'\delta \neq 0$. Thus the test is consistent for global alternatives under which

where Σ is the covariance matrix of εi under the alternative hypothesis. Let cij denote the (i,j)th element of C. Then it follows that

where

. If the off-diagonal elements of $C$ take at least two distinct values, the elements of $\delta$ are all nonzero because of the fixed effects transformation, and thus the test will have nontrivial local power regardless of the choice of $k$. On the other hand, if the off-diagonal elements of $C$ take only one value, that is, $c_{ij} = c$ for all $i,j = 1,2,\ldots,T$ such that $i \neq j$, it follows that $\delta = 0$, and thus the test does not have any power. This is due to the fixed effects transformation, and other tests based on the fixed effects residuals, such as the test of Wooldridge (2002), also will have no power against such local alternatives.

Although our test is consistent regardless of the choice of $k$ except for the special alternatives discussed previously, the local power analysis in Theorem 1 provides some insight into the sensitivity of actual power to the choice of $k$. For example, consider a stationary autoregressive local alternative with $T = 5$, $\varepsilon_i \sim N(0, I_5 + N^{-1/2}C)$, where the Pitman drift matrix is given by[6]

[6] When the autoregressive parameter is $O(N^{-1/2})$, higher order autocorrelations are $o(N^{-1/2})$, and thus the corresponding elements in $C$ are zeros.

Then we have

where the nominal size is 0.05 and these numbers are estimated by Monte Carlo simulation with 100,000 replications. Although the differences are not very large, the test with k set to 1 or 5 has higher power than with k set to 2, 3, or 4. This is unsurprising because the test with k equal to 1 or 5 is based on three first-order sample autocovariances, whereas the test with k set to 2, 3, or 4 uses only two first-order autocovariances. Consequently, in the commonly encountered case in which serial correlation is most pronounced at the first order, the test with k set to 1 or 5 is better targeted for detecting serial correlation. The sensitivity of the actual finite-sample power to the choice of k will be further investigated in the next section.
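The counts of first-order autocovariances quoted above can be reproduced by enumerating the adjacent-period pairs that survive once period $k$ is deleted. This is a small illustrative sketch; the helper name `first_order_pairs` is hypothetical.

```python
def first_order_pairs(k, T):
    """Count adjacent-period pairs (t, t+1) that survive deleting period k."""
    kept = [t for t in range(1, T + 1) if t != k]
    return sum(1 for a, b in zip(kept, kept[1:]) if b - a == 1)

T = 5
counts = {k: first_order_pairs(k, T) for k in range(1, T + 1)}
print(counts)  # {1: 3, 2: 2, 3: 2, 4: 2, 5: 3}
```

Deleting an end period keeps the remaining four periods contiguous, whereas deleting an interior period breaks the sequence and removes one adjacent pair.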

Our test can be usefully modified in three ways. First, in certain circumstances, especially when T is relatively large, it may be desirable to focus the test statistic on only the lower-order autocovariances. Including all the autocovariances may lead to a loss of power analogous to that from including too many orders of autocorrelation when the Box–Pierce test is used with a single time series.

Second, the test is readily adapted to the case of unbalanced panel data in which some observations are randomly missing. Let $s_i = [s_{i1},\ldots,s_{iT}]'$ denote the $T$-dimensional column vector of selection indicators: $s_{it} = 1$ if $x_{it}$ and $y_{it}$ are observed and $s_{it} = 0$ otherwise. Then with some abuse of notation the modified LM statistic can be written as

where

THEOREM 2. Suppose that

(a) $X_i, \varepsilon_i, s_i$ are i.i.d. and have finite fourth moments.

(b) $E(\varepsilon_i \mid X_i, c_i, s_i) = 0$.

(d) $\mathrm{rank}[E(X_i' M_i X_i)] = \dim(x_{it})$.

(e)

is nonsingular.

Then under the null hypothesis that $E(\varepsilon_i\varepsilon_i' \mid s_i) = \sigma^2 I_T$, the modified LM test statistic is asymptotically distributed (as $N \to \infty$) as $\chi^2((T-1)(T-2)/2)$. The proof of Theorem 2 is analogous to the proof of Theorem 1 and thus is omitted.

Third, the test can be modified to allow for time-varying variances. Using the fixed effects residuals, estimate the possibly time-varying variances $\sigma_1^2,\ldots,\sigma_T^2$ by the method of moments based on moment conditions

where $D_T$ denotes the $T^2 \times T$ matrix such that $D_T = \partial\,\mathrm{vec}(\Sigma)/\partial[\sigma_1^2,\ldots,\sigma_T^2]$, $W = \Sigma^{-1} - \Sigma^{-1}\ell_T\ell_T'\Sigma^{-1}/(\ell_T'\Sigma^{-1}\ell_T)$, and $\Sigma$ is the diagonal matrix whose diagonal elements are given by $\sigma_1^2,\ldots,\sigma_T^2$. Define the approximate LM test statistic by

where

,

is the diagonal matrix whose diagonal elements are given by the method of moments estimator

.

THEOREM 3. In addition to Assumptions (a)–(d) in Theorem 1, suppose that

(e′) $D_{k,T}' E(v_i v_i') D_{k,T}$ is nonsingular where

(f) $D_T'(W \otimes W)D_T$ is nonsingular.

Then under the null hypothesis that $\Sigma$ is a diagonal matrix, the approximate LM test statistic is asymptotically distributed (as $N \to \infty$) as $\chi^2((T-1)(T-2)/2)$.

The proof of Theorem 3 is provided in the Technical Appendix. Because the fixed effects estimator is not a maximum likelihood estimator in the presence of time-varying variances, the test statistic is not the exact LM statistic. To obtain the exact LM statistic, one needs to compute the fixed effects GLS estimator by

and iterate method of moments estimation of $\sigma_1^2,\ldots,\sigma_T^2$ and fixed effects GLS estimation until

converge. Because the fixed effects estimator is consistent, however, the asymptotic null distributions of the approximate and exact LM statistics are identical.

3. MONTE CARLO ANALYSES

We have shown that, under the null hypothesis, our portmanteau test statistic converges to a $\chi^2((T-1)(T-2)/2)$ distribution as $N \to \infty$, but how applicable is that distribution when $N$ is large but finite? To explore that question, we have conducted a Monte Carlo study in which the data generating process is equation (1) with $\beta = 0$, scalar $x_{it} \sim$ i.i.d. $N(0,1)$, and $\varepsilon_{it} \sim$ i.i.d. $N(0,1)$. The sample sizes considered are $N = 50, 100, 250, 500$ and $T = 5, 8$. The number of Monte Carlo replications is set to 10,000, and the deleted time period in the test statistic is $k = 1$.

Table 1 reports the actual rejection frequencies when the nominal size is 5%. In the experiments with T = 5, the empirical sizes come quite close to the nominal size. With T = 8, there are some mild size distortions at smaller N. These distortions mostly disappear by the time N reaches 500.

Table 1. Empirical size of portmanteau tests for serial correlation

We also have conducted a series of Monte Carlo analyses to investigate the power of our test and compare it to the power of several other tests. Unlike our portmanteau test, most existing tests focus on the specific alternative of nonzero first-order autocorrelation. For example, Bhargava et al. (1982), assuming normality of the error term, have developed a Durbin–Watson test against the alternative that the error term follows a first-order autoregression.[7] Wooldridge (2002, p. 275) has suggested applying OLS to the first-order autoregression of the fixed effects residuals and then performing a t-test of the hypothesis that the autoregressive coefficient equals $-1/(T-1)$. He emphasized that the standard error estimate in the denominator of the t-ratio must be robust to serial correlation. Another test, hinted at by Wooldridge (2002, pp. 282–283) and developed by Drukker (2003), is based on the residuals from OLS estimation of the first difference of equation (1). This test applies OLS to the first-order autoregression of the residuals and then performs a t-test of the hypothesis that the autoregressive coefficient equals −½. Again, the standard error estimate must be robust to serial correlation. Finally, Kezdi (2002) has proposed a White-type test that checks whether a covariance matrix estimate robust to serial correlation differs significantly from the conventional covariance matrix estimate that assumes no serial correlation.

[7] Baltagi and Wu (1999) have proposed a related test. The Hansen (2003) analysis emphasizes feasible GLS estimation, but his methods can be used to formulate a test against the alternative of a pth-order autoregression.

The Monte Carlo analyses summarized in Table 2 compare the performances of all these tests in experiments with T = 8, N = 50, 100, 250, or 500, and 10,000 replications. The four rows within each panel correspond to experiments with four different data generating processes. The first (DGP1) is the same as in Table 1: the null hypothesis case of no serial correlation. In the second (DGP2), the error term εit follows a first-order autoregression with autoregressive parameter 0.4. In the third (DGP3), εit follows a second-order moving average process with first-order parameter 0.375 and second-order parameter 0.6. With these parameter values, the first- and second-order autocorrelations equal each other and are approximately 0.4. In each of these three data generating processes, Var(εit) = 1. In the fourth (DGP4), εit follows the nonstationary process

where $v_{it}$ and $\alpha_i$ are i.i.d. normal with zero mean, $\mathrm{Var}(v_{it}) = 0.5$, and $\mathrm{Var}(\alpha_i) = 0.02$. This experiment represents the situation in which misspecification of the fixed effects model (namely, the omission of the individual-specific linear time trends) may or may not be detected by serial correlation diagnostics.[8] The possibility of detection arises because the fixed effects residuals will be positively autocorrelated at low orders, negatively autocorrelated at high orders, and heteroskedastic.

[8] See Solon (1984a), Jacobson, LaLonde, and Sullivan (1993), Friedberg (1998), and Donohue and Levitt (2001) for examples of longitudinal analyses involving individual- or state-specific time trends.

Table 2. Empirical power of alternative tests for serial correlation

Two general results from Table 2 are worth noting at the outset. First, as shown in the first rows, all the tests display an empirical size reasonably close to the nominal size of 0.05, especially when N is large. Second, as shown in the last column, the Kezdi test has no power against any of the departures from the null hypothesis. This is by design: the regressor in our experiments is i.i.d. As a result, the conventional covariance matrix estimator remains consistent despite serial correlation (and, in DGP4, heteroskedasticity) of the error term. Kezdi's test, which compares conventional and robust covariance matrix estimates, discovers no problem. If consistency of standard error estimation were the only concern, this would be a good outcome. As noted in Section 1, however, there are two other motives for serial correlation diagnostics: (1) to detect model misspecification and (2) to check whether more efficient estimation may be possible through feasible GLS. Our experiments highlight the point that Kezdi's test sometimes lacks power for these purposes.

The other lessons from Table 2 are specific to the particular data generating processes. Under DGP2, the serial correlation of the error term is most pronounced at the first order. Because all of the tests other than Kezdi's are sensitive to this type of serial correlation, all of them show good power when N is at least 100. With N = 50, however, our portmanteau test is conspicuously less powerful than the tests designed to focus on first-order autocorrelation.

Under DGP3, the first- and second-order autocorrelations both are around 0.4. Most of the tests still are powerful, but not the Wooldridge–Drukker test based on the residuals from first-difference estimation. That test checks an implication of the null hypothesis that the first-differenced error term has a first-order autocorrelation of −½. The trouble is that the same implication applies to any process for εit in which the first- and second-order autocorrelations are the same (but not necessarily zero). By design, the MA(2) process in DGP3 has that property, so the Wooldridge–Drukker test has no power in this case. The more general lesson is that the Wooldridge–Drukker test will lack power for detecting serial correlation of εit whenever the first- and second-order autocorrelations are similar.
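The argument about DGP3 can be checked with a few lines of arithmetic. The sketch below assumes the MA(2) error is written as $e_t = u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2}$ with white-noise $u_t$ (a parameterization supplied here for illustration), and it uses the standard formula for the first-order autocorrelation of a first-differenced stationary series. Function names are hypothetical.

```python
def ma2_autocorrelations(theta1, theta2):
    """First- and second-order autocorrelations of e_t = u_t + theta1*u_{t-1} + theta2*u_{t-2}."""
    gamma0 = 1 + theta1 ** 2 + theta2 ** 2       # variance in units of Var(u_t)
    rho1 = (theta1 + theta1 * theta2) / gamma0
    rho2 = theta2 / gamma0
    return rho1, rho2

def first_difference_autocorrelation(rho1, rho2):
    """First-order autocorrelation of e_t - e_{t-1} for a stationary e_t."""
    return (2 * rho1 - 1 - rho2) / (2 * (1 - rho1))

rho1, rho2 = ma2_autocorrelations(0.375, 0.6)
print(round(rho1, 4), round(rho2, 4))                            # 0.3998 0.3998
print(round(first_difference_autocorrelation(rho1, rho2), 4))    # -0.5 (DGP3)
print(first_difference_autocorrelation(0.0, 0.0))                # -0.5 (no serial correlation)
print(round(first_difference_autocorrelation(0.4, 0.16), 4))     # -0.3 (AR(1) with parameter 0.4)
```

Under the MA(2) parameters the differenced errors have the same first-order autocorrelation as under the null, whereas the AR(1) case moves it away from −½, which is consistent with the contrast the text draws between DGP2 and DGP3.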

Under DGP4, an important manifestation of the fixed effects model's misspecification is negative higher order autocorrelations of the residuals. As a result, our portmanteau test tends to show much better power than tests focused on first-order autocorrelation. The reason that our own test specialized to first-order autocovariances is also fairly powerful is that it is sensitive to the heteroskedasticity that causes the first-order autocovariances from different time periods to differ from each other. This is true to varying but lesser degrees for the other tests, which also assume homoskedasticity. If one wishes to use a test sensitive only to serial correlation and not to heteroskedasticity, one may use the variant of our test described at the end of Section 2.

Finally, following up on the theoretical analysis of Section 2, we have conducted Monte Carlo studies of the size and power of our test with varying k for T = 5 and 8. Because the choice of k matters more when T = 5, we display the results for that case in Table 3. Recall first that, under the null hypothesis, our test statistic is asymptotically χ2 regardless of the choice of k. Correspondingly, the Monte Carlo results for DGP1 (the null hypothesis case) show that the empirical size of the test is close to the nominal size 0.05 for all values of k. Recall also that, when the null hypothesis is false, the power of the test may vary with k. For example, when serial correlation is most pronounced at the first order, the test may be more powerful with k set to 1 or T because then it uses more first-order sample autocovariances than it does with intermediate values of k. Indeed, the Monte Carlo results for smaller N do show a power advantage for k = 1 or 5 under DGP2, the first-order autoregressive case. Because the test is consistent regardless of the choice of k, however, the rejection probability approaches 1 for all k as N increases.

Table 3. Empirical power of our portmanteau test with different choices of k

Although our portmanteau test performs relatively well in most of these experiments, it is obvious that it will be suboptimal in certain circumstances. For example, when serial correlation is most pronounced at the first order and T is sufficiently large, the portmanteau test that uses all autocovariances should be expected to have less power than a test like the Bhargava et al. test, which focuses on first-order autocorrelation. On the other hand, for serial correlation manifested at higher orders, our portmanteau test can have a large power advantage. Our recommendation to practitioners is to use both a portmanteau test and a test for first-order autocorrelation. We are convinced that following this advice would be a major improvement over the typical current practice of ignoring serial correlation altogether.

TECHNICAL APPENDIX

Jacobians of (7). The population and sample versions of the Jacobian of the moment condition (7) are

respectively, and are obtained from repeated applications of Theorem 2 of Magnus and Neudecker (1999, p. 30).

Proof of Theorem 1. Because part (i) is a special case of part (ii), we will prove part (ii) only. Under local alternatives (5) the fixed effects estimator remains $N^{1/2}$-consistent. Because $X_i$ is strictly exogenous, it follows that

For any value of C the second moments of ei converge in probability to those of ei that correspond to C = 0. Thus

Combining (A.3) and (A.4) we obtain the desired result.

Proof of Theorem 3. Because $X_i$ is strictly exogenous and the fixed effects estimator is $N^{1/2}$-consistent even when the variances are time-varying, we can treat

as ei in the following proof. Because the moment condition (7) is the first-order condition for the maximum likelihood estimator, the minimum distance estimator is consistent and asymptotically normal. It follows from Theorem 2 of Magnus and Neudecker (1999, p. 30), WMΣMW = W, and (A.1) that

Because

by the consistency of the fixed effects estimator

and the method of moments estimator

, this completes the proof of Theorem 3. █


REFERENCES

Arellano, M. (1987) Computing robust standard errors for within-groups estimators. Oxford Bulletin of Economics and Statistics 49, 431–434.
Arellano, M. & B. Honore (2001) Panel data models: Some recent developments. In J.J. Heckman & E. Leamer (eds.), Handbook of Econometrics, vol. 5, pp. 3229–3296. North-Holland.
Baltagi, B.H. & P.X. Wu (1999) Unequally spaced panel data regressions with AR(1) disturbances. Econometric Theory 15, 814–823.
Bertrand, M., E. Duflo, & S. Mullainathan (2004) How much should we trust differences-in-differences estimates? Quarterly Journal of Economics 119, 249–275.
Bhargava, A., L. Franzini, & W. Narendranathan (1982) Serial correlation and the fixed effects model. Review of Economic Studies 49, 533–549.
Chamberlain, G. (1980) Analysis of covariance with qualitative data. Review of Economic Studies 47, 225–238.
Donohue, J.J. III & S.D. Levitt (2001) The impact of legalized abortion on crime. Quarterly Journal of Economics 116, 379–420.
Drukker, D.M. (2003) Testing for serial correlation in linear panel-data models. Stata Journal 3, 168–177.
Friedberg, L. (1998) Did unilateral divorce raise divorce rates? American Economic Review 88, 608–627.
Hansen, C. (2003) Generalized Least Squares Estimation in Differences-in-Differences and Other Panel Models. Manuscript, MIT.
Jacobson, L.S., R.J. LaLonde, & D.G. Sullivan (1993) Earnings losses of displaced workers. American Economic Review 83, 685–709.
Kezdi, G. (2002) Robust Standard Error Estimation in Fixed-Effects Panel Models. Manuscript, University of Michigan.
Kiefer, N.M. (1980) Estimation of fixed effect models for time series of cross-sections with arbitrary intertemporal covariance. Journal of Econometrics 14, 195–202.
Magnus, J.R. & H. Neudecker (1999) Matrix Differential Calculus with Applications in Statistics and Econometrics, rev. ed. Wiley.
Nickell, S. (1980) Correcting the Biases in Dynamic Models with Fixed Effects. Working paper 133, Industrial Relations Section, Princeton University.
Solon, G. (1984a) The effects of unemployment insurance eligibility rules on job quitting behavior. Journal of Human Resources 19, 118–126.
Solon, G. (1984b) Estimating Autocorrelations in Fixed-Effects Models. Technical working paper 32, National Bureau of Economic Research.
Wooldridge, J.M. (2002) Econometric Analysis of Cross Section and Panel Data. MIT Press.