1. MOTIVATION AND RESULTS
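Mundlak (1978) considered a panel data regression model with error component disturbances
$$ y_{it} = X_{it}'\beta + \mu_i + \nu_{it}, \qquad i = 1, \ldots, N; \; t = 1, \ldots, T, \tag{1} $$
where the individual effects are a linear function of the averages of all the explanatory variables across time,
$$ \mu_i = \bar{X}_{i.}'\pi + \varepsilon_i, \tag{2} $$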
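where $\varepsilon_i \sim \mathrm{IIN}(0, \sigma_\varepsilon^2)$, $\nu_{it} \sim \mathrm{IIN}(0, \sigma_\nu^2)$, and $\bar{X}_{i.}'$ is a $1 \times K$ vector of observations on the explanatory variables averaged over time. Mundlak showed that generalized least squares (GLS) on the resulting model,
$$ y = X\beta + PX\pi + (Z_\mu\varepsilon + \nu), \tag{3} $$
yields
$$ \hat{\beta}_{GLS} = \tilde{\beta}_{Within} = (X'QX)^{-1}X'Qy \tag{4} $$
and
$$ \hat{\pi}_{GLS} = \hat{\beta}_{Between} - \tilde{\beta}_{Within} = (X'PX)^{-1}X'Py - (X'QX)^{-1}X'Qy, \tag{5} $$
with
$$ \mathrm{var}(\hat{\pi}_{GLS}) = \mathrm{var}(\hat{\beta}_{Between}) + \mathrm{var}(\tilde{\beta}_{Within}) = (T\sigma_\varepsilon^2 + \sigma_\nu^2)(X'PX)^{-1} + \sigma_\nu^2(X'QX)^{-1}, \tag{6} $$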
where $P$ is a matrix that averages the observations across time for each individual and $Q = I_{NT} - P$ is a matrix that obtains the deviations from individual means. This note gives an alternative derivation of this result using system estimation. Arellano (1993) applied system estimation to obtain an alternative derivation of the Hausman (1978) test, using the forward orthogonal deviations operator. Here, we use the usual fixed effects transformation. In particular, we write the panel model in vector form as
$$ y = X\beta + PX\pi + \eta, \tag{7} $$
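where $\eta = Z_\mu\varepsilon + \nu$ and $Z_\mu = I_N \otimes \iota_T$, with $I_N$ denoting an identity matrix of dimension $N$ and $\iota_T$ a vector of ones of dimension $T$. Here $P$ is the projection matrix on $Z_\mu$, i.e., $P = Z_\mu(Z_\mu'Z_\mu)^{-1}Z_\mu' = I_N \otimes \bar{J}_T$, where $J_T$ is a matrix of ones of dimension $T$ and $\bar{J}_T = J_T/T$. Premultiplying (7) by $P$ one gets
$$ Py = PX(\beta + \pi) + (Z_\mu\varepsilon + P\nu), \tag{8} $$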
because $P^2 = P$ and $PZ_\mu = Z_\mu$. Note that ordinary least squares (OLS) or GLS on (8) yields $\widehat{(\beta + \pi)} = (X'PX)^{-1}X'Py$, which is the usual between estimator of $y$ on $X$. Similarly, premultiplying (7) by $Q$ one gets
$$ Qy = QX\beta + Q\nu, \tag{9} $$
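because $QP = 0$, so that $QPX = 0$ and $QZ_\mu = QPZ_\mu = 0$. OLS or GLS on (9) yields $\tilde{\beta}_{Within} = (X'QX)^{-1}X'Qy$, which is the usual within or fixed effects estimator of $y$ on $X$. Stacking the system of equations (8) and (9), we get
$$ \begin{pmatrix} Py \\ Qy \end{pmatrix} = \begin{pmatrix} PX & PX \\ QX & 0 \end{pmatrix} \begin{pmatrix} \beta \\ \pi \end{pmatrix} + \begin{pmatrix} P\eta \\ Q\eta \end{pmatrix}, \tag{10} $$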
and the system error vector has mean 0 and variance-covariance matrix given by
$$ \Sigma = \begin{pmatrix} \sigma_1^2 P & 0 \\ 0 & \sigma_\nu^2 Q \end{pmatrix}, \tag{11} $$
where $\sigma_1^2 = T\sigma_\varepsilon^2 + \sigma_\nu^2$. This system estimation has been useful in deriving error components two-stage least squares (EC2SLS) and error components three-stage least squares (EC3SLS) (see Baltagi, 1981). It has also been used to derive GMM estimators for dynamic panel data models (see Arellano and Bover, 1995, and Blundell and Bond, 1998). For the Mundlak case, there is no need for partitioned inversion. In fact, the OLS normal equations on (10) yield
$$ X'y = (X'X)\beta + (X'PX)\pi \tag{12} $$
and
$$ X'Py = (X'PX)\beta + (X'PX)\pi, \tag{13} $$
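because $P + Q = I_{NT}$. Subtracting (13) from (12) one gets $X'Qy = (X'QX)\beta$, which yields $\hat{\beta}_{OLS} = \tilde{\beta}_{Within} = (X'QX)^{-1}X'Qy$. Solving (13) yields $\hat{\pi}_{OLS} = (X'PX)^{-1}X'Py - \hat{\beta}_{OLS} = \hat{\beta}_{Between} - \tilde{\beta}_{Within}$. Similarly, the GLS normal equations on (10) yield
$$ \frac{X'Py}{\sigma_1^2} + \frac{X'Qy}{\sigma_\nu^2} = \left( \frac{X'PX}{\sigma_1^2} + \frac{X'QX}{\sigma_\nu^2} \right)\beta + \frac{X'PX}{\sigma_1^2}\,\pi \tag{14} $$
and
$$ \frac{X'Py}{\sigma_1^2} = \frac{X'PX}{\sigma_1^2}\,\beta + \frac{X'PX}{\sigma_1^2}\,\pi. \tag{15} $$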
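Equation (15) yields $\hat{\pi}_{GLS} = (X'PX)^{-1}X'Py - \hat{\beta}_{GLS}$. Subtracting (15) from (14) one gets $X'Qy = (X'QX)\beta$, which yields $\hat{\beta}_{GLS} = \tilde{\beta}_{Within} = (X'QX)^{-1}X'Qy$, and hence $\hat{\pi}_{GLS} = \hat{\beta}_{Between} - \tilde{\beta}_{Within}$. This proves that system OLS or GLS on (10) yields the same results that Mundlak found by applying GLS to (3).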
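The equivalence can be checked numerically. The following is a minimal sketch, not part of the original derivation: it simulates a small balanced panel sorted by individual and then time, builds the stacked system (10), and compares system OLS and system GLS with the within and between estimators. The dimensions, seed, and parameter values are hypothetical, and the GLS weight matrix is assumed to be the generalized inverse $\mathrm{diag}(P/\sigma_1^2, Q/\sigma_\nu^2)$ of the singular matrix $\Sigma$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, K = 30, 4, 2                    # hypothetical panel dimensions
sigma_eps2, sigma_nu2 = 2.25, 1.0     # hypothetical variance components
s1 = T * sigma_eps2 + sigma_nu2       # sigma_1^2 = T*sigma_eps^2 + sigma_nu^2

# P averages over time for each individual; Q takes deviations from means.
P = np.kron(np.eye(N), np.full((T, T), 1.0 / T))
Q = np.eye(N * T) - P

# Simulated Mundlak model: y = X beta + P X pi + Z_mu eps + nu.
X = rng.normal(size=(N * T, K))
beta = np.array([1.0, -0.5])
pi = np.array([0.7, 0.3])
eps = np.repeat(rng.normal(scale=np.sqrt(sigma_eps2), size=N), T)  # Z_mu @ eps
nu = rng.normal(scale=np.sqrt(sigma_nu2), size=N * T)
y = X @ beta + P @ X @ pi + eps + nu

# Within and between estimators of y on X.
b_within = np.linalg.solve(X.T @ Q @ X, X.T @ Q @ y)
b_between = np.linalg.solve(X.T @ P @ X, X.T @ P @ y)

# Stacked system (10): regressor matrix Z and stacked dependent variable.
Z = np.block([[P @ X, P @ X], [Q @ X, np.zeros((N * T, K))]])
y_star = np.concatenate([P @ y, Q @ y])

# System OLS: solve the normal equations Z'Z d = Z'y*.
d_ols = np.linalg.solve(Z.T @ Z, Z.T @ y_star)

# System GLS with the assumed generalized inverse of Sigma as weight matrix.
O = np.zeros((N * T, N * T))
W = np.block([[P / s1, O], [O, Q / sigma_nu2]])
d_gls = np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ y_star)

print("beta  (OLS, GLS, within):", d_ols[:K], d_gls[:K], b_within)
print("pi    (OLS, GLS, between - within):", d_ols[K:], d_gls[K:],
      b_between - b_within)
```

In this sketch the first $K$ elements of each system estimate should coincide with the within estimator and the last $K$ with the between minus within difference, up to floating-point error.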
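In fact, one can prove that the Zyskind (1967) necessary and sufficient condition for OLS to be equivalent to GLS on the system of equations (10) is satisfied. This calls for $P_Z\Sigma = \Sigma P_Z$, where $P_Z = Z(Z'Z)^{-1}Z'$ is the projection matrix on $Z$,
$$ Z = \begin{pmatrix} PX & PX \\ QX & 0 \end{pmatrix} $$
is the matrix of regressors in (10), and $\Sigma$ is the variance-covariance matrix of its disturbances. It is straightforward to show that
$$ P_Z = \begin{pmatrix} P_{PX} & 0 \\ 0 & P_{QX} \end{pmatrix}, \qquad P_{PX} = PX(X'PX)^{-1}X'P, \quad P_{QX} = QX(X'QX)^{-1}X'Q, $$
from which it follows that
$$ P_Z\Sigma = \Sigma P_Z = \begin{pmatrix} \sigma_1^2 P_{PX} & 0 \\ 0 & \sigma_\nu^2 P_{QX} \end{pmatrix}. $$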
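The Zyskind condition can also be illustrated numerically. The sketch below is an illustration under assumed values (dimensions, seed, and variance components are hypothetical): it forms the projection on the regressor matrix of (10), checks that it is block diagonal in the projections on $PX$ and $QX$, and checks that it commutes with $\Sigma$. The helper `proj` is introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, K = 20, 3, 2                    # hypothetical panel dimensions
sigma_eps2, sigma_nu2 = 2.0, 1.0      # hypothetical variance components
s1 = T * sigma_eps2 + sigma_nu2       # sigma_1^2

P = np.kron(np.eye(N), np.full((T, T), 1.0 / T))
Q = np.eye(N * T) - P
X = rng.normal(size=(N * T, K))
O = np.zeros((N * T, N * T))


def proj(A):
    """Orthogonal projector onto the column space of A (illustrative helper)."""
    return A @ np.linalg.pinv(A)


# Regressor matrix and disturbance covariance of the stacked system (10).
Z = np.block([[P @ X, P @ X], [Q @ X, np.zeros((N * T, K))]])
Sigma = np.block([[s1 * P, O], [O, sigma_nu2 * Q]])

# Projection on Z, and its conjectured block-diagonal form.
P_Z = proj(Z)
P_Z_blocks = np.block([[proj(P @ X), O], [O, proj(Q @ X)]])

block_form_holds = np.allclose(P_Z, P_Z_blocks)
zyskind_holds = np.allclose(P_Z @ Sigma, Sigma @ P_Z)
print(block_form_holds, zyskind_holds)
```

The block-diagonal form reflects the fact that the columns of $Z$ span the same space as the columns of $\mathrm{diag}(PX, QX)$.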
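Note that the Hausman (1978) specification test based on the between minus within estimators is basically a test for $H_0\colon \pi = 0$ in (3), and this is based upon
$$ \hat{\pi}_{GLS} = \hat{\beta}_{Between} - \tilde{\beta}_{Within}. $$
The $\mathrm{var}(\hat{\pi}_{GLS})$ can be obtained from the GLS variance-covariance matrix of (10). This is given by the inverse of
$$ \begin{pmatrix} \dfrac{X'PX}{\sigma_1^2} + \dfrac{X'QX}{\sigma_\nu^2} & \dfrac{X'PX}{\sigma_1^2} \\[2ex] \dfrac{X'PX}{\sigma_1^2} & \dfrac{X'PX}{\sigma_1^2} \end{pmatrix}, $$
which can be easily shown by partitioned inversion to be
$$ \begin{pmatrix} \sigma_\nu^2(X'QX)^{-1} & -\sigma_\nu^2(X'QX)^{-1} \\ -\sigma_\nu^2(X'QX)^{-1} & \sigma_1^2(X'PX)^{-1} + \sigma_\nu^2(X'QX)^{-1} \end{pmatrix}. $$
Note that the second diagonal block is exactly the same as that given by (6), which completes the proof.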
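The partitioned inversion step can be verified with a short numerical sketch. All values below (dimensions, seed, variance components) are hypothetical; the code inverts the GLS information matrix of the stacked system directly and compares the result with the closed form, whose lower-right block is the variance in (6).

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, K = 25, 4, 3                    # hypothetical panel dimensions
sigma_eps2, sigma_nu2 = 0.8, 1.2      # hypothetical variance components
s1 = T * sigma_eps2 + sigma_nu2       # sigma_1^2

P = np.kron(np.eye(N), np.full((T, T), 1.0 / T))
Q = np.eye(N * T) - P
X = rng.normal(size=(N * T, K))

# Blocks of the GLS information matrix for (beta, pi).
A = X.T @ P @ X / s1                  # X'PX / sigma_1^2
B = X.T @ Q @ X / sigma_nu2           # X'QX / sigma_nu^2
info = np.block([[A + B, A], [A, A]])

# Direct numerical inverse versus the partitioned-inverse closed form.
V = np.linalg.inv(info)
A_inv, B_inv = np.linalg.inv(A), np.linalg.inv(B)
V_closed = np.block([[B_inv, -B_inv], [-B_inv, A_inv + B_inv]])

# Variance of pi-hat as in (6): sigma_1^2 (X'PX)^{-1} + sigma_nu^2 (X'QX)^{-1}.
var_pi = s1 * np.linalg.inv(X.T @ P @ X) + sigma_nu2 * np.linalg.inv(X.T @ Q @ X)

print(np.allclose(V, V_closed), np.allclose(V[K:, K:], var_pi))
```

The closed form follows because the Schur complement of the lower-right block $A$ in the information matrix is $(A + B) - A A^{-1} A = B$.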