Mostly Harmless Econometrics: An Empiricist’s Companion
Fixed Effects versus Lagged Dependent Variables
Fixed effects and differences-in-differences estimators are based on the presumption of time-invariant (or group-invariant) omitted variables. Suppose, for example, we are interested in the effects of participation in a subsidized training program, as in the Dehejia and Wahba (1999) and Lalonde (1986) studies discussed in section (3.3.3). The key identifying assumption motivating fixed effects estimation in this case is
$$E(Y_{0it} \mid \alpha_i, X_{it}, D_{it}) = E(Y_{0it} \mid \alpha_i, X_{it}), \tag{5.3.1}$$
where $\alpha_i$ is an unobserved personal characteristic that determines, along with covariates, $X_{it}$, whether individual $i$ gets training. To be concrete, $\alpha_i$ might be a measure of vocational skills, though a strike against the fixed-effects setup is the fact that the exact nature of the unobserved variables typically remains somewhat mysterious. In any case, coupled with a linear model for $E(Y_{0it} \mid \alpha_i, X_{it})$, assumption (5.3.1) leads to simple estimation strategies involving differences or deviations from means.
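As an illustration of how differencing exploits assumption (5.3.1), here is a minimal simulation sketch that is not taken from the text. It assumes a two-period panel with no covariates, a training dummy that switches on only in the second period, and selection into training driven by the unobserved individual effect plus independent noise. The function names, parameter values, and data-generating process are all illustrative assumptions. With two periods, first differences and deviations from means give the same estimate, so only the differenced regression is shown.

```python
import random


def slope(x, y):
    """OLS slope from a regression of y on x and a constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_fixed_effects(n=20000, beta=1.0, seed=0):
    """Return (cross-section estimate, first-difference estimate) of beta.

    Assumed data-generating process (hypothetical, for illustration only):
      period 1: Y = alpha_i + noise              (nobody is trained yet)
      period 2: Y = alpha_i + 0.5 + beta*D + noise
    where D = 1 is more likely for individuals with a low alpha_i.
    """
    rng = random.Random(seed)
    d, y1, y2 = [], [], []
    for _ in range(n):
        alpha = rng.gauss(0, 1)  # unobserved, time-invariant characteristic
        train = 1.0 if alpha + rng.gauss(0, 1) < 0 else 0.0
        y1.append(alpha + rng.gauss(0, 1))
        y2.append(alpha + 0.5 + beta * train + rng.gauss(0, 1))
        d.append(train)
    # Period-2 comparison of trainees and non-trainees: confounded by alpha_i.
    cross_section = slope(d, y2)
    # Differencing removes alpha_i; the change in D equals D because D was 0 in period 1.
    first_difference = slope(d, [b - a for a, b in zip(y1, y2)])
    return cross_section, first_difference


if __name__ == "__main__":
    cross, fd = simulate_fixed_effects()
    print(f"cross-section: {cross:.2f}  first differences: {fd:.2f}  (true beta = 1.0)")
```

Under these assumptions the cross-sectional comparison understates the effect, because trainees have low values of the unobserved characteristic, while the differenced regression should land close to the true value.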
For many causal questions, the notion that the most important omitted variables are time-invariant doesn’t seem plausible. The evaluation of training programs is a case in point. It seems likely that people looking to improve their labor market options by participating in a government-sponsored training program have suffered some kind of setback. Many training programs explicitly target people who have suffered a recent setback, e.g., men who recently lost their jobs. Consistent with this, Ashenfelter (1978) and Ashenfelter and Card (1985) find that training participants typically have earnings histories that exhibit a pre-program dip. Past earnings is a time-varying confounder that cannot be subsumed in a time-invariant variable like $\alpha_i$.
The distinctive earnings histories of trainees motivate an estimation strategy that controls for past earnings directly and dispenses with the fixed effects. To be precise, instead of (5.3.1), we might base causal inference on the conditional independence assumption,
$$E(Y_{0it} \mid Y_{it-h}, X_{it}, D_{it}) = E(Y_{0it} \mid Y_{it-h}, X_{it}). \tag{5.3.2}$$
This is like saying that what makes trainees special is their earnings $h$ periods ago. We can then use panel data to estimate
$$Y_{it} = \alpha + \theta Y_{it-h} + \lambda_t + \beta D_{it} + X_{it}'\delta + v_{it}, \tag{5.3.3}$$
where the causal effect of training is $\beta$. To make this more general, $Y_{it-h}$ can be a vector including lagged earnings for multiple periods.
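The lagged-outcome strategy can be sketched with a small simulation. This is a hedged illustration rather than anything from the text: it assumes a single lag, no other covariates, and a hypothetical selection rule under which people with low past earnings are more likely to train. The names `ols2` and `simulate_lagged_outcome_control`, and all parameter values, are invented for the example.

```python
import random


def ols2(y, x1, x2):
    """Coefficients on x1 and x2 from OLS of y on a constant, x1, and x2."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simulate_lagged_outcome_control(n=20000, beta=1.0, theta=0.7, seed=0):
    """Return (naive difference in means, estimate controlling for lagged Y).

    Assumed process (hypothetical): training depends only on past earnings
    plus independent noise, so conditioning on past earnings is sufficient.
    """
    rng = random.Random(seed)
    d, y_lag, y = [], [], []
    for _ in range(n):
        past = rng.gauss(0, 1)  # earnings h periods ago
        train = 1.0 if past + rng.gauss(0, 1) < -0.5 else 0.0  # low earners train
        y.append(theta * past + beta * train + rng.gauss(0, 1))
        d.append(train)
        y_lag.append(past)
    n_treated = sum(d)
    mean_treated = sum(v for v, t in zip(y, d) if t == 1.0) / n_treated
    mean_control = sum(v for v, t in zip(y, d) if t == 0.0) / (n - n_treated)
    naive = mean_treated - mean_control
    controlled, _ = ols2(y, d, y_lag)
    return naive, controlled


if __name__ == "__main__":
    naive, controlled = simulate_lagged_outcome_control()
    print(f"naive: {naive:.2f}  controlling for lagged earnings: {controlled:.2f}")
```

In this setup the naive comparison is pulled down by the trainees' low past earnings, while the regression that controls for the lagged outcome should recover a value near the assumed effect of 1.0.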
Applied researchers using panel data are often faced with the challenge of choosing between fixed-effects and lagged dependent variables models, i.e., between causal inferences based on (5.3.1) and (5.3.2). One solution to this dilemma is to work with a model that includes both lagged dependent variables and unobserved individual effects. In other words, identification might be based on a weaker conditional independence assumption:
$$E(Y_{0it} \mid \alpha_i, Y_{it-h}, X_{it}, D_{it}) = E(Y_{0it} \mid \alpha_i, Y_{it-h}, X_{it}), \tag{5.3.4}$$
which requires conditioning on both $\alpha_i$ and $Y_{it-h}$. We can then try to estimate causal effects using a specification like
$$Y_{it} = \alpha_i + \theta Y_{it-h} + \lambda_t + \beta D_{it} + X_{it}'\delta + v_{it}. \tag{5.3.5}$$
Unfortunately, the conditions for consistent estimation of $\beta$ in equation (5.3.5) are much more demanding than those required with fixed effects or lagged dependent variables alone. This can be seen in a simple example where the lagged dependent variable is $Y_{it-1}$. We kill the fixed effect by differencing, which produces
$$\Delta Y_{it} = \theta \Delta Y_{it-1} + \Delta\lambda_t + \beta \Delta D_{it} + \Delta X_{it}'\delta + \Delta v_{it}. \tag{5.3.6}$$
The problem here is that the differenced residual, $\Delta v_{it}$, is necessarily correlated with the lagged dependent variable, $\Delta Y_{it-1}$, because both are a function of $v_{it-1}$. Consequently, OLS estimates of (5.3.6) are not consistent for the parameters in (5.3.5), a problem first noted by Nickell (1981). This problem can be solved, though the solution requires strong assumptions. The easiest solution is to use $Y_{it-2}$ as an instrument for $\Delta Y_{it-1}$ in (5.3.6). But this requires that $Y_{it-2}$ be uncorrelated with the differenced residuals, $\Delta v_{it}$. This seems unlikely since residuals are the part of earnings left over after accounting for covariates. Most people’s earnings are highly correlated from one year to the next, so that past earnings are an excellent predictor of future earnings and earnings growth. If $v_{it}$ is serially correlated, there may be no consistent estimator for (5.3.6). (Note also that the IV strategy using $Y_{it-2}$ as an instrument requires at least three periods to obtain data for $t$, $t-1$, and $t-2$.)
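The inconsistency of OLS in the differenced equation, and the fragility of the instrumental-variables fix, can be illustrated with a short simulation. The sketch below is an assumption-laden simplification, not the authors' procedure: it drops the treatment dummy and covariates, keeps only the individual effect and a one-period lag, and lets the residual follow a hypothetical AR(1) process with parameter `rho`. Setting `rho=0` gives serially uncorrelated residuals, the case in which the twice-lagged level is a valid instrument.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n


def simulate_dynamic_panel(n=20000, theta=0.5, rho=0.0, burn_in=15, seed=0):
    """Return (OLS, IV) estimates of theta from the first-differenced equation.

    Assumed model (hypothetical): Y_it = alpha_i + theta * Y_it-1 + v_it,
    with v_it = rho * v_it-1 + white noise. Only the last three periods of
    each simulated history are used, matching the three-period requirement.
    """
    rng = random.Random(seed)
    y_lag2, dy_lag, dy = [], [], []
    for _ in range(n):
        alpha = rng.gauss(0, 1)
        y, v = alpha / (1 - theta), 0.0  # start at the individual's long-run mean
        path = []
        for _ in range(burn_in + 3):
            v = rho * v + rng.gauss(0, 1)
            y = alpha + theta * y + v
            path.append(y)
        y_tm2, y_tm1, y_t = path[-3:]
        y_lag2.append(y_tm2)          # instrument: level two periods back
        dy_lag.append(y_tm1 - y_tm2)  # differenced lagged dependent variable
        dy.append(y_t - y_tm1)        # differenced outcome
    ols = cov(dy_lag, dy) / cov(dy_lag, dy_lag)
    iv = cov(y_lag2, dy) / cov(y_lag2, dy_lag)  # just-identified IV estimate
    return ols, iv


if __name__ == "__main__":
    for rho in (0.0, 0.5):
        ols, iv = simulate_dynamic_panel(rho=rho)
        print(f"rho={rho}: OLS={ols:.2f}  IV={iv:.2f}  (true theta = 0.5)")
```

With serially uncorrelated residuals, OLS on the differenced data should fall well below the assumed coefficient while the IV estimate should be close to it. Once `rho` is positive, the twice-lagged level is correlated with the differenced residual and the IV estimate should drift away from the truth as well.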
Given the difficulties that arise when trying to estimate (5.3.6), we might ask whether the distinction between fixed effects and lagged dependent variables matters. The answer, unfortunately, is yes. The fixed-effects and lagged dependent variables models are not nested, which means we cannot hope to estimate one and get the other as a special case if need be. Only the more general and harder-to-identify model, (5.3.5), nests both fixed effects and lagged dependent variables.
So what’s an applied guy to do? One answer, as always, is to check the robustness of your findings using alternative identifying assumptions. That means that you would like to find broadly similar results using both models. Fixed effects and lagged dependent variables estimates also have a useful bracketing property. The appendix to this chapter shows that if (5.3.2) is correct, but you mistakenly use fixed effects, estimates of a positive treatment effect will tend to be too big. On the other hand, if (5.3.1) is correct and you mistakenly estimate an equation with lagged outcomes like (5.3.3), estimates of a positive treatment effect will tend to be too small. You can therefore think of fixed effects and lagged dependent variables as bounding the causal effect you are after. Guryan (2004) illustrates this sort of reasoning in a study estimating the effects of court-ordered busing on Black high school graduation rates.
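The bracketing property can be illustrated with a simulation that applies each estimator to data generated under the other model's assumption. This is a sketch under strong, invented assumptions (two periods, no covariates, a positive effect of 1.0, and selection rules chosen for convenience), not a reproduction of the appendix argument. The function names are hypothetical.

```python
import random


def slope(x, y):
    """OLS slope from a regression of y on x and a constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))


def ols2(y, x1, x2):
    """Coefficients on x1 and x2 from OLS of y on a constant, x1, and x2."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def fe_when_ldv_is_true(n=20000, beta=1.0, theta=0.5, seed=1):
    """Selection on lagged earnings, but estimated by first differences."""
    rng = random.Random(seed)
    d, dy = [], []
    for _ in range(n):
        y_lag = rng.gauss(0, 1)
        train = 1.0 if y_lag + rng.gauss(0, 1) < -0.5 else 0.0
        y = theta * y_lag + beta * train + rng.gauss(0, 1)
        d.append(train)
        dy.append(y - y_lag)  # differencing is the wrong adjustment here
    return slope(d, dy)


def ldv_when_fe_is_true(n=20000, beta=1.0, seed=2):
    """Selection on a fixed effect, but estimated with a lagged-outcome control."""
    rng = random.Random(seed)
    d, y_lag, y = [], [], []
    for _ in range(n):
        alpha = rng.gauss(0, 1)
        train = 1.0 if alpha + rng.gauss(0, 1) < -0.5 else 0.0
        y_lag.append(alpha + rng.gauss(0, 1))  # noisy proxy for alpha
        y.append(alpha + beta * train + rng.gauss(0, 1))
        d.append(train)
    b_train, _ = ols2(y, d, y_lag)
    return b_train


if __name__ == "__main__":
    print(f"fixed effects, lagged-outcome model true: {fe_when_ldv_is_true():.2f}")
    print(f"lagged outcome, fixed-effects model true: {ldv_when_fe_is_true():.2f}")
```

In this design the misapplied fixed-effects estimate should exceed the assumed effect of 1.0 and the misapplied lagged-outcome estimate should fall below it, which is the bracketing pattern described above. The size of each gap depends entirely on the chosen parameters.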