TIME-INVARIANT REGRESSOR IN NONLINEAR PANEL MODEL WITH FIXED EFFECTS

Jinyong Hahn; Juergen Meinecke

doi:10.1017/S0266466605050243


Published online by Cambridge University Press: 31 March 2005


Jinyong Hahn: Affiliation:
UCLA
Juergen Meinecke: Affiliation:
UCLA

Article contents

Abstract
1. INTRODUCTION
2. IV ESTIMATION
3. APPLICATION: NONLINEAR MODEL OF SOCIAL INTERACTIONS
4. SUMMARY AND EXTENSION
APPENDIX A: REGULARITY CONDITIONS
APPENDIX B: CONSISTENCY
APPENDIX C: EXPANSION
APPENDIX D: UNIFORM CONSISTENCY OF α̂_i
APPENDIX E: PROOF OF THEOREM 1
References


Abstract

This paper generalizes the intuition of Hausman and Taylor (1981, Econometrica 49, 1377–1398) and develops a method of dealing with a time-invariant regressor in nonlinear panel models with fixed effects. We illustrate the usefulness of our result by discussing the implication for some nonlinear models of social interactions.

We are grateful to Dan Ackerberg and Jerry Hausman for helpful comments. The first author gratefully acknowledges financial support from NSF grant SES-0313651.

Type: MISCELLANEA
Information: Econometric Theory, Volume 21, Issue 2, April 2005, pp. 455–469

DOI: https://doi.org/10.1017/S0266466605050243
Copyright: © 2005 Cambridge University Press

1. INTRODUCTION

Panel data allow the possibility of controlling for unobserved individual specific effects. In linear models, such “fixed effects” are usually eliminated by differencing. An unintended consequence of differencing is that it also eliminates the time-invariant regressor, which renders its coefficient unidentified. Hausman and Taylor (1981) used an instrumental variables (IV) approach to overcome such a problem. They show that a variable that is uncorrelated with individual fixed effects can be used as a valid instrument in estimating such a coefficient.

In this paper, we generalize their intuition and develop a method of dealing with a time-invariant regressor in the nonlinear framework. This method requires a large number of observations per individual (T), so its applicability is limited to the case where T is large. Because the IV estimation requires a large number of individuals (n), we adopt an asymptotic framework where both n and T grow to infinity at the same rate. This result is made possible by recent technical progress of panel analysis under such alternative asymptotics. See, e.g., Hahn and Kuersteiner (2002, 2003), Hahn and Newey (2004), and Woutersen (2002). We illustrate the usefulness of our result by discussing the implication for some nonlinear models of social interactions.

2. IV ESTIMATION

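Suppose that we are given a set of moment restrictions

$$E\left[\varphi\left(y_{it},\ \gamma_{i0} + w_i'\delta_0 + x_{it}'\theta_0\right)\right] = 0 \tag{1}$$

for some vector-valued function φ, where y_it, x_it, and w_i denote the dependent variable in the tth period, the time-varying regressor in the tth period, and the time-invariant regressor, respectively. Unobserved individual specific effects are summarized by the scalar variable γ_i. For example, in the case of the linear model

$$y_{it} = \gamma_{i0} + w_i'\delta_0 + x_{it}'\theta_0 + \varepsilon_{it} \tag{2}$$
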
with (w_i′,x_it′) strictly exogenous such that

Our primary focus is to estimate the coefficient δ₀ of the time-invariant regressor w_i when γ_i is possibly correlated with w_i and x_it. We should note that estimation of θ₀ does not present any substantive conceptual challenge. If both n and T grow to infinity at the same rate, θ₀ can be √(nT)-consistently estimated. Letting α_i0 ≡ γ_i0 + w_i′δ₀, we can rewrite the model as E [φ(y_it,α_i0 + x_it′θ₀)] = 0, to which we can apply variants of recently developed methods discussed in Hahn and Kuersteiner (2002, 2003), Hahn and Newey (2004), and Woutersen (2002).

To understand how δ₀ could be estimated, suppose for a moment that we observe α_i0. Also suppose that we observe an additional variable z_i with dim(z_i) = dim(w_i)¹ such that the following condition applies.

¹ It is easy to generalize the discussion to the overidentified case where dim(z_i) > dim(w_i). Because the primary purpose of this paper is identification and consistent estimation of δ₀, we focus on the exactly identified case.

CONDITION 1. (i) E [z_iγ_i0] = 0; (ii) E [z_i w_i′] is nonsingular.

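Note that z_i is required to be uncorrelated with the individual fixed effects. Because α_i0 = γ_i0 + w_i′δ₀, Condition 1 implies E [z_iα_i0] = E [z_i w_i′]δ₀. It is therefore clear that we can consistently estimate δ₀ by

$$\tilde\delta \equiv \left(\sum_{i=1}^{n} z_i w_i'\right)^{-1} \sum_{i=1}^{n} z_i \alpha_{i0} \tag{3}$$

under the mild condition that the data are independent and identically distributed (i.i.d.) over i.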

CONDITION 2. ({y_i1,y_i2,…},{x_i1,x_i2,…},z_i,w_i,γ_i0) is i.i.d. over i.
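As a minimal sketch of the infeasible estimator (3), the simulation below treats α_i0 as observed and solves the sample analogue of E [z_iα_i0] = E [z_i w_i′]δ₀. The data-generating process, the value δ₀ = 1.5, and the helper name `iv_delta` are hypothetical choices made only for illustration. The instrument is drawn independently of γ_i0 so that Condition 1 holds by construction.

```python
import numpy as np

def iv_delta(z, w, alpha):
    """Exactly identified IV: solve (sum_i z_i w_i') delta = sum_i z_i alpha_i."""
    return np.linalg.solve(z.T @ w, z.T @ alpha)

rng = np.random.default_rng(0)
n = 20000
delta0 = np.array([1.5])            # hypothetical true coefficient

z = rng.normal(size=(n, 1))         # instrument, independent of gamma
gamma = rng.normal(size=n)          # individual effect gamma_i0
# w is correlated with z (relevance) and with gamma (endogeneity)
w = 0.8 * z + 0.6 * gamma[:, None] + rng.normal(size=(n, 1))
alpha0 = gamma + w @ delta0         # alpha_i0 = gamma_i0 + w_i' delta0

delta_iv = iv_delta(z, w, alpha0)                    # estimator (3)
delta_ols = np.linalg.solve(w.T @ w, w.T @ alpha0)   # ignores the correlation

print("IV :", delta_iv)
print("OLS:", delta_ols)
```

In this design the least squares regression of α_i0 on w_i is pulled away from δ₀ by the correlation between w_i and γ_i0, whereas the IV estimate stays close to it.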

Hausman and Taylor (1981) noted that the estimator δ̃ in (3) would remain consistent even if we replace α_i0 by an unbiased estimate. In the nonlinear context, it is difficult to come up with such an unbiased estimator for α_i0. Therefore, Hausman and Taylor's method cannot be directly applied. The basic intuition in this paper is that, when both n and T grow to infinity at the same rate, we can come up with a √T-consistent estimator for α_i0, say, α̂_i.

CONDITION 3. n,T → ∞ such that n/T → ρ, where 0 < ρ < ∞.

Because the estimation error becomes very small as the sample size increases, the IV estimator

$$\hat\delta \equiv \left(\sum_{i=1}^{n} z_i w_i'\right)^{-1} \sum_{i=1}^{n} z_i \hat\alpha_i \tag{4}$$

where α̂_i denotes the estimator of α_i0, is expected to be consistent for δ₀ in general. This is quite intuitive. Note that δ̂ is an IV estimator from the regression of α̂_i on w_i. Because α̂_i is a proxy for α_i0, and because the “measurement error” disappears as T → ∞, we should expect the distribution of δ̂ to be similar to that of δ̃ in (3) if T is large.

To come up with a √T-consistent estimator α̂_i, we will assume that

$$E\left[u(X_{it};\theta_0,\alpha_{i0})\right] = 0, \qquad E\left[v(X_{it};\theta_0,\alpha_{i0})\right] = 0$$

for some functions u and v, where X_it ≡ (y_i1,…,y_iT,x_i1,…,x_iT) and where dim(θ) = dim(u) = p and dim(α) = dim(v) = 1. We will consider the estimator that solves

$$0 = \sum_{i=1}^{n}\sum_{t=1}^{T} u(X_{it};\hat\theta,\hat\alpha_i), \qquad 0 = \sum_{t=1}^{T} v(X_{it};\hat\theta,\hat\alpha_i), \quad i = 1,\ldots,n. \tag{5}$$

This indicates that (i) the first component u is used throughout the sample for estimation of θ₀; and (ii) the second component v is used only for the ith individual to estimate α_i0. We do not expect this separation to be constraining in practice. For example, in the linear model (2), we may take u(X_it;θ,α_i) = x_it·(y_it − α_i − x_it′θ), v(X_it;θ,α_i) = y_it − α_i − x_it′θ, which will result in the usual fixed effects estimator.
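The linear example can be sketched end to end. With u = x_it·(y_it − α_i − x_it′θ) and v = y_it − α_i − x_it′θ, solving the v equations gives α_i as the individual mean of y_it − x_it′θ. Substituting that into the u equation gives the within (fixed effects) estimator of θ. The simulated design below, including scalar regressors and the values θ₀ = 0.5 and δ₀ = 1.5, is a hypothetical illustration rather than a design taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 2000, 50
theta0, delta0 = 0.5, 1.5                        # hypothetical true values

z = rng.normal(size=n)                           # instrument, independent of gamma
gamma = rng.normal(size=n)                       # fixed effect gamma_i0
w = 0.8 * z + 0.6 * gamma + rng.normal(size=n)   # time-invariant regressor
x = gamma[:, None] + rng.normal(size=(n, T))     # x_it correlated with gamma_i0
eps = rng.normal(size=(n, T))
y = (gamma + w * delta0)[:, None] + theta0 * x + eps

# Step 1: within estimator of theta (the u equation after concentrating out alpha_i)
x_dm = x - x.mean(axis=1, keepdims=True)
y_dm = y - y.mean(axis=1, keepdims=True)
theta_hat = (x_dm * y_dm).sum() / (x_dm ** 2).sum()

# Step 2: alpha_hat_i solves sum_t v(X_it; theta_hat, alpha_i) = 0
alpha_hat = (y - theta_hat * x).mean(axis=1)

# Step 3: IV of alpha_hat_i on w_i with instrument z_i
delta_hat = (z @ alpha_hat) / (z @ w)

# Infeasible benchmark that uses the true alpha_i0
delta_oracle = (z @ (gamma + w * delta0)) / (z @ w)

print(theta_hat, delta_hat, delta_oracle)
```

The feasible and infeasible IV estimates differ only through the error in α̂_i, which averages T disturbances per individual and therefore shrinks as T grows.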

Under regularity conditions discussed in Appendix A, it can be shown that the estimator α̂_i, which solves (5), is uniformly consistent over i.²

² See Appendix D.

Because α̂_i is a proxy for α_i0, there is a “measurement error.” If there is a correlation between the measurement error and the instrument z_i, the resultant estimator δ̂ may be biased. Condition 4 rules this out.

CONDITION 4. E [v(X_it;θ,α_i)|z_i] = 0.³

³ We have v(X_it;θ,α_i) = ε_it in the linear model (2), and Hausman and Taylor's instrument z_i for such a linear model is required to satisfy E [z_i·ε_it] = 0.

Theorem 1 establishes that the IV estimator δ̂ in (4) based on the proxy α̂_i of α_i0 has the same asymptotic distribution as δ̃ in (3).

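THEOREM 1. Assume Conditions 1–7. Further assume that E [|z_i w_i′|] < ∞ and E [|γ_i0²z_i z_i′|] < ∞. We then have

$$\sqrt{n}\,(\hat\delta - \delta_0) \ \to_d\ N\!\left(0,\ \left(E[z_i w_i']\right)^{-1} E\!\left[\gamma_{i0}^2 z_i z_i'\right] \left(E[w_i z_i']\right)^{-1}\right).$$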

Proof. See Appendix E.
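A nonlinear illustration of the same two-step idea can be sketched with a Poisson model, y_it ~ Poisson(exp(α_i0 + x_itθ₀)). The Poisson model is an assumption chosen here for convenience and is not an example from the paper. Its score components are u = x_it·(y_it − exp(α_i + x_itθ)) and v = y_it − exp(α_i + x_itθ), so the v equation has the closed-form solution exp(α_i) = Σ_t y_it / Σ_t exp(x_itθ). The bounded design and all parameter values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T = 2000, 50
theta0, delta0 = 0.5, 0.5                          # hypothetical true values

z = rng.uniform(-1, 1, size=n)                     # mean-zero instrument
gamma = 1.0 + rng.uniform(-0.5, 0.5, size=n)       # effect, independent of z
w = z + (gamma - 1.0) + rng.uniform(-0.5, 0.5, size=n)
alpha0 = gamma + w * delta0                        # alpha_i0 = gamma_i0 + w_i delta0
x = rng.uniform(-1, 1, size=(n, T)) + (gamma - 1.0)[:, None]
y = rng.poisson(np.exp(alpha0[:, None] + theta0 * x))

# Step 1: Newton iterations on the u equation with alpha_i concentrated out
Y = y.sum(axis=1)                                  # total count per individual
theta = 0.0
for _ in range(25):
    e = np.exp(theta * x)
    p = e / e.sum(axis=1, keepdims=True)           # within-individual weights
    xbar = (x * p).sum(axis=1)
    score = (x * y).sum() - (Y * xbar).sum()
    info = (Y * ((x ** 2 * p).sum(axis=1) - xbar ** 2)).sum()
    theta += score / info

# Step 2: alpha_hat_i solves sum_t v(X_it; theta, alpha_i) = 0
alpha_hat = np.log(Y / np.exp(theta * x).sum(axis=1))

# Step 3: feasible IV (proxy alpha_hat) versus infeasible IV (true alpha0)
delta_hat = (z @ alpha_hat) / (z @ w)
delta_oracle = (z @ alpha0) / (z @ w)

print(theta, delta_hat, delta_oracle)
```

In this design the feasible and infeasible estimates are close to each other, in line with the message of Theorem 1. The sketch is a single simulated draw and is not evidence about the limiting distribution.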

3. APPLICATION: NONLINEAR MODEL OF SOCIAL INTERACTIONS

Identification and estimation of various social effects in the nonlinear model of social interactions based on the preceding framework are straightforward and provide one way of dealing with the typical identification problems that are peculiar to these models.

For a grouped cross-section of data, a model of social interactions can have the form of a conditional likelihood f (y_gi,γ_g0 + E_g [y_gi]β + E_g [s_gi′]φ + s_gi′ζ + x_gi′θ₀). Here, y_gi is the outcome/behavior of interest for the ith individual in the gth group, and E_g [·] denotes the mean for the gth group. Following the classification of Manski (1993), the coefficient on E_g [y_gi] determines the strength of endogenous social effects in explaining individual outcomes. In addition to y_gi, we observe s_gi, a vector of individual characteristics that also generate exogenous (contextual) social effects, and x_gi, a vector of individual characteristics that operate at the individual level only. Finally γ_g0, which is not observed by the econometrician, captures the presence of correlated group effects.

The focus of Graham and Hahn (2003) is on the linear-in-means model without contextual effects. They exploit the idea of Hausman and Taylor (1981) of using IVs to identify the parameters. Identification of the endogenous effect is made possible by an instrument that exogenously explains the between-group variation of the individual characteristics x_gi. To be more specific, they consider the simplified model

and examine whether the endogenous effects can be identified in the presence of correlated effects. For such purpose, they considered the social equilibrium

They show that

under a standard strict exogeneity condition, where for any vector

. They also note that θ₀ /(1 − β) is the limit of the two-stage least squares estimator applied to the social equilibrium (6) if (i) E_g [y_gi] and E_g [x_gi] are observed; and (ii) there exists an instrument z_g such that E [z_gγ_g0] = 0 and E [z_g E_g [x_gi′]] ≠ 0.
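The factor θ₀/(1 − β) can be traced with a short calculation. The sketch below assumes, for illustration only, that the simplified model takes the linear-in-means form y_gi = γ_g0 + βE_g [y_gi] + x_gi′θ₀ + ε_gi with E_g [ε_gi] = 0 and β ≠ 1. This exact form and the error term ε_gi are assumptions, not a quotation of Graham and Hahn (2003).

```latex
\begin{align*}
E_g[y_{gi}] &= \gamma_{g0} + \beta E_g[y_{gi}] + E_g[x_{gi}]'\theta_0
  && \text{(group mean of the assumed model)} \\
E_g[y_{gi}] &= \frac{\gamma_{g0}}{1-\beta} + E_g[x_{gi}]'\,\frac{\theta_0}{1-\beta}
  && \text{(solve for } E_g[y_{gi}]\text{)} \\
E\bigl[z_g E_g[y_{gi}]\bigr] &= E\bigl[z_g E_g[x_{gi}]'\bigr]\,\frac{\theta_0}{1-\beta}
  && \text{(use } E[z_g\gamma_{g0}] = 0\text{)}
\end{align*}
```

Under these assumptions, an instrument satisfying the two requirements in (ii) pins down the ratio θ₀/(1 − β), not θ₀ and β separately.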

Brock and Durlauf (2001a, 2001b) are concerned with nonlinear models without correlated group effects. To be specific, they considered a logit model⁴

with

where h_g = (E_g [s_gi′],s_gi′)′. They show that the parameter (k,β,ξ,θ₀) is identified and can be consistently estimated by maximum likelihood estimation. Exploiting nonlinearity, they established that the endogenous effects β can be identified from ξ. Because h_g contains E_g [s_gi], the contextual effects are identified from the endogenous effects in nonlinear models in general.⁵

⁴ They actually considered the model

where m_g^e denotes the (common) expectation of y_g among agents in group g. Under some auxiliary assumptions, including rational expectations and common knowledge, the model is reduced to the simpler form presented here.

⁵ They note that these identification results are still valid in the presence of multiple equilibria.

The result in the previous section can be used to identify social effects in the presence of correlated group effects. Let γ_g0 denote the group characteristic that may be correlated with observed variables such as E_g [y_gi] or x_gi. Assume that the conditional likelihood given as γ_g0,E_g [y_gi],h_g,x_gi takes the form f (y_gi,γ_g0 + E_g [y_gi]β + h_g′ξ + x_gi′θ₀). Note that the explanatory variables affect the outcome through the linear index γ_g0 + E_g [y_gi]β + h_g′ξ + x_gi′θ₀. For example, we may have

which is a generalized version of Brock and Durlauf's logit model that allows correlated group effects. Interpreting E_g [y_gi] as just one of the regular time invariant regressors and writing w_g = (E_g [y_gi],h_g′)′ and w_g′δ₀ = βE_g [y_gi] + h_g′ξ, we get the conditional likelihood as f (y_gi,γ_g0 + w_g′δ₀ + x_gi′θ₀). This model can be understood to be a nonlinear panel model with group fixed effects and some individual-invariant regressor w_g, for which identification results have been established earlier in this paper. We note that consistent estimation of γ_g0 + w_g′δ₀ and θ₀ can be achieved by considering the maximum likelihood estimator, i.e., by taking

We may therefore conclude that the social effects are identified as such.

4. SUMMARY AND EXTENSION

In this paper, we generalized the result of Hausman and Taylor (1981) to nonlinear panel models with fixed effects. The usefulness of the result is illustrated with some nonlinear models of social interactions. It would be interesting to generalize Hausman and Taylor's specification test to the nonlinear setup, which is left for future research.

APPENDIX A: REGULARITY CONDITIONS

Condition 5. (i) Given time-invariant variables (α_i0,z_i,w_i), (y_it,x_it) is i.i.d. over t; (ii) for every i, G_(i)(θ₀,α_i0) = 0; (iii) for each η > 0, inf_i inf_{{(θ,α):|(θ,α)−(θ₀,α_i0)|>η}}|G_(i)(θ,α)| > 0, where

Remark 1. We are assuming that the α_i form a deterministic sequence of numbers; i.e., all the results in this paper are results conditional on α_i.

Condition 6. (i) The function g(·;θ,α) is continuous in

; (ii) the parameter space

is compact; (iii) there exists a function M(X_it) such that

and sup_i E [M(X_it)^Q] < ∞ for some Q > 64.

Condition 7. (i) min_i E [v(X_it;θ₀,α_i0)²] > 0; (ii)

, where

APPENDIX B: CONSISTENCY

LEMMA 1. Assume that W_t are i.i.d. with E [W_t] = 0 and E [W_t^{2k}] < ∞. Then,

for some constant C(k).

Proof. By adopting an argument in the proof of Lemma 5.1 in Lahiri (1992), we have

where for each fixed j ∈ {1,…,2k}, Σ_α extends over all j-tuples of positive integers (α₁,…,α_j) such that α₁ + ··· + α_j = 2k and Σ_I extends over all ordered j-tuples (t₁,…,t_j) of integers such that 1 ≤ t_j ≤ T. Also, C(α₁,…,α_j) stands for a bounded constant. Note that if j > k then at least one of the indices α_j = 1. By independence and the fact that E [W_t] = 0 it follows that

whenever j > k. This shows that

for some constant C(k). █
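The counting argument can be checked by hand in the smallest nontrivial case. The worked example below takes k = 2 (an illustrative choice) and expands the fourth moment of the sum directly. Every term containing a lone index vanishes because E [W_t] = 0, and the surviving terms are of order T².

```latex
\begin{align*}
E\Bigl[\Bigl(\sum_{t=1}^{T} W_t\Bigr)^{4}\Bigr]
  &= \sum_{t} E[W_t^4] + 3\sum_{t \neq s} E[W_t^2]\,E[W_s^2] \\
  &= T\,E[W_1^4] + 3T(T-1)\bigl(E[W_1^2]\bigr)^2
   \;\le\; 3T^2\,E[W_1^4].
\end{align*}
```

The last inequality uses (E [W_1²])² ≤ E [W_1⁴], so for k = 2 the moment of the sum is bounded by a constant multiple of T^k.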

LEMMA 2. Suppose that, for each i, {ξ_it,t = 1,2,…} is a sequence of zero mean i.i.d. random variables. We assume that {ξ_it,t = 1,2,…} are independent across i. We also assume that max_i E [|ξ_it|¹⁶] < ∞. Finally, we assume that n = O(T). We then have

for every η > 0.

Proof. Using Lemma 1, we obtain

, where C > 0 is a constant. Therefore, we have

, or

. █

LEMMA 3. Suppose that Conditions 3 and 6 hold. We then have for all η > 0 that

Proof. Let η > 0 be given. We note that

Let ε > 0 be chosen such that 2ε max_i E [M(X_it)] < η/3. Divide

into subsets

such that |(θ,α) − (θ′,α′)| < ε whenever (θ,α) and (θ′,α′) are in the same subset. Let (θ_j,α_j) denote some point in ϒ_j for each j. Then,

and therefore

For

, we have

and therefore

by Lemma 2. Combining (B.2)–(B.4) and n = O(T), we obtain the desired conclusion. █

APPENDIX C: EXPANSION

Let

be such that

. Letting U_i(X_it;θ,α_i) ≡ u(X_it;θ,α_i) − ρ_i0 v(X_it;θ,α_i), ρ_i0 ≡ E [∂u(X_it;θ,α_i)/∂α_i′](E [∂v(X_it;θ,α_i)/∂α_i′])⁻¹ (in the likelihood framework, U is the efficient score for θ), we can recognize that

is a solution to

Let F ≡ (F₁,…,F_n) denote the collection of distribution functions F_i, where each F_i denotes the distribution function of X_it. Let

, where

denotes the empirical distribution function for the stratum i. Define F(ε) ≡ F + εΔ_iT for ε ∈ [0,T^−1/2], where

. For each fixed θ and ε, let V(X_it;θ,α_i) ≡ v(X_it;θ,α_i) and α_i(θ,F_i(ε)) be the solution to the estimating equation 0 = ∫V_i [θ,α_i(θ,F_i(ε))] dF_i(ε) and let θ(ε) be the solution to the estimating equation

. By Taylor series expansion, we have

, where

is somewhere in between 0 and T^−1/2. We therefore have

The last term in (C.1) can be shown to be o_p(1) by the same method as in Hahn and Newey (2004).

LEMMA 4. For every η > 0, we have

Proof. Only the first assertion is proved. The second assertion can be proved similarly. Let η be given. Recall that

. We therefore have

, from which we find

. By Lemma 3, we also have

. Therefore, for every

with probability equal to 1 − o(T⁻¹), we have

where ε ≡ inf_i inf_{{(θ,α):|(θ,α)−(θ₀,α_i0)|>η}}|G_(i)(θ,α)| > 0. It follows that

LEMMA 5. Suppose that, for each i, {ξ_it(φ),t = 1,2,…} is a sequence of zero mean i.i.d. random variables indexed by some parameter φ ∈ Φ. We assume that {ξ_it(φ), t = 1,2,…} are independent across i. We also assume that sup_φ∈Φ|ξ_it(φ)| ≤ B_it for some sequence of random variables B_it that is i.i.d. across t and independent across i. Finally, we assume that max_i E [|B_it|⁶⁴] < ∞ and n = O(T). We then have

for every υ such that

. Here, {φ_i} is an arbitrary sequence in Φ.

Proof. By Markov's inequality, we obtain

By Lemma 1, we have

for some C. Therefore, we have

LEMMA 6. Suppose that K_i(·;θ(ε),α_i(θ(ε),ε)) is equal to ∂^(m₁+m₂)g(X_it;θ(ε),α_i(θ(ε),ε))/∂θ^(m₁)∂α_i^(m₂) for some m₁ + m₂ ≤ 1,…,5. Then, for any η > 0, we have

Also,

for some constant

Proof. Note that we have

Therefore, we have

the right-hand side of which can be bounded by using Lemmas 4 and 5 in absolute value by some η > 0 with probability 1 − o(T⁻¹), which proves the first claim. The second claim can be proved similarly.

As for the third claim, we can show using Lemma 5 that max_i|∫K_i(·;θ(ε), α_i(θ(ε),ε)) dΔ_iT| can be bounded in absolute value by CT^(1/10)−υ for some constant C > 0 and υ such that

with probability 1 − o(T⁻¹). █

LEMMA 7.

for some constant

Proof. In Hahn and Kuersteiner (2003), it is shown that

Using Lemma 6, we can see that (∫[∂V_i(·,θ,ε)/∂α_i] dF_i(ε))⁻¹ is uniformly bounded away from zero with probability 1 − o(T⁻¹). We can also see that, with probability 1 − o(T⁻¹), ∫[∂V_i(·,θ,ε)/∂θ] dF_i(ε) is uniformly bounded by some constant C and ∫V_i(·,θ,ε) dΔ_iT is uniformly bounded by CT^(1/10)−υ. █

LEMMA 8.

for some constant

Proof. In Hahn and Kuersteiner (2003), it is shown that θ^ε(ε) is equal to

Using Lemmas 6 and 7 we can bound the denominator of θ^ε(ε) by some C > 0 and the numerator by some CT^(1/10)−υ with probability 1 − o(T⁻¹). █

LEMMA 9.

for some constant

. Here, α_i^θ_rθ_r′ ≡ ∂²α_i /∂θ_r∂θ_r′. We similarly define α_i^θ_rε.

Proof. By repeatedly differentiating 0 = ∫V_i(·,θ,ε) dF_i(ε) with respect to ε, we obtain

The result then follows by applying the same argument as in the proof of Lemma 7. █

LEMMA 10.

for some constant

Sketch of Proof. By repeatedly differentiating

with respect to ε, we obtain a characterization of θ^εε(ε). (For more detailed characterization, see Hahn and Kuersteiner, 2003.) The conclusion follows by combining it with Lemmas 6–9. █

APPENDIX D: UNIFORM CONSISTENCY OF α̂_i

THEOREM 2. Under Conditions 3, 5, 6, and 7, we have

where

and Pr[max_i|κ_i| ≥ η] = o(1) for every η > 0. Here, v_it ≡ v_it(X_it,θ₀,α_i0).

Let

denote α_i that sets

equal to zero. (In other words, let

.) We then have the expansion

for some

. Let v_i(·,ε) ≡ V_i(θ(F(ε)),α_i(F_i(ε))). The first-order condition may be written as 0 = ∫v_i(·,ε) dF_i(ε). Differentiating repeatedly with respect to ε, we obtain

Because dv_i(·,ε)/dε = [∂v_i(·,ε)/∂θ′](∂θ/∂ε) + [∂v_i(·,ε)/∂α_i](∂α_i /∂ε), (D.2) implies that

Evaluating at ε = 0 we obtain

where V_i^α ≡ ∂v(X_it;θ,α_i)/∂α_i, V_i^θ ≡ ∂v(X_it;θ,α_i)/∂θ, and θ^ε(0) can be deduced from (C.2). Next, consider

such that

is characterized by

We now combine (D.1), (D.4), and

and obtain

Here, (D.6) can be obtained by evaluating (C.2) at ε = 0.

In light of (D.7), Theorem 2 can be obtained by showing that

for any

. This follows from representation (D.5) and also from Lemmas 6, 8, and 10.

APPENDIX E: PROOF OF THEOREM 1

Theorem 2 implies that we have

Under Conditions 4 and 5, the term

is of order o_p(1), and substitution of

for α_i0 in (3) does not affect the asymptotic distribution of the resultant estimator (4):

REFERENCES

Brock, W.A. & S.N. Durlauf (2001a) Interactions-based models. In J. Heckman & E. Leamer (eds.), Handbook of Econometrics, vol. 5, pp. 3297–3380. North-Holland.

Brock, W.A. & S.N. Durlauf (2001b) Discrete choice with social interactions. Review of Economic Studies 68, 235–260.

Graham, B.S. & J. Hahn (2003) Identification and Estimation of the Linear-in-Means Model of Social Interactions. Manuscript, UCLA.

Hahn, J. & G. Kuersteiner (2002) Asymptotically unbiased inference for a dynamic panel model with fixed effects when both n and T are large. Econometrica 70, 1639–1657.

Hahn, J. & G. Kuersteiner (2003) Bias Reduction for Dynamic Nonlinear Panel Models with Fixed Effects. Manuscript, UCLA.

Hahn, J. & W.K. Newey (2004) Jackknife and analytical bias reduction for nonlinear panel models. Econometrica 72, 1295–1319.

Hausman, J. & W. Taylor (1981) Panel data and unobservable individual effects. Econometrica 49, 1377–1398.

Lahiri, S. (1992) Edgeworth correction by moving block bootstrap for stationary and nonstationary data. In R. LePage & L. Billard (eds.), Exploring the Limits of Bootstrap, pp. 183–214. Wiley.

Manski, C. (1993) Identification of endogenous social effects: The reflection problem. Review of Economic Studies 60, 531–542.

Woutersen, T. (2002) Robustness against Incidental Parameters. Manuscript, University of Western Ontario.
