A COMPANION TO Theoretical Econometrics
Panel Data Models
A panel (or longitudinal or temporal cross-sectional) data set is one which follows a number of individuals over time. By providing sequential observations for a number of individuals, panel data allow us to distinguish inter-individual differences from intra-individual differences, and thus to construct and test more complicated behavioral models than a single time series or cross-section data set would allow. Moreover, panel data offer many more degrees of freedom, make it possible to control for omitted variable bias, and reduce the problem of multicollinearity, hence improving the accuracy of parameter estimates and predictions (e.g., Baltagi, 1995; Chamberlain, 1984; Hsiao, 1985, 1986, 1995; Matyas and Sevestre, 1996).
However, the emphasis of panel data is on individual outcomes. Factors affecting individual outcomes are numerous, yet a model is a simplification of the real world. It is neither feasible nor desirable to include all factors that affect the outcomes in the specification. If the specification of the relationships among variables appears proper, yet the outcomes conditional on the included explanatory variables cannot be viewed as random draws from a probability distribution, then standard statistical procedures will lead to misleading inferences. Therefore, the focus of panel data research is on controlling the impact of unobserved heterogeneity among cross-sectional units over time in order to draw inference about the population characteristics.
If heterogeneity among cross-sectional units over time cannot be captured by the explanatory variables, one can either let this heterogeneity be represented by the error term or let the coefficients vary across individuals and/or over time. For instance, for a panel of N individuals over T time periods, a linear model specification can take the form

$$y_{it} = \boldsymbol{\beta}_{it}'\mathbf{x}_{it} + u_{it}, \qquad i = 1, \ldots, N, \quad t = 1, \ldots, T, \tag{16.1}$$
where both the coefficients of the x variables and the error term of the equation vary across individuals and over time. However, model (16.1) has only descriptive value. One can neither estimate $\beta_{it}$, since the model has at least as many unknown coefficients as observations, nor use it to draw inference about the population if each individual is different and changes behavioral patterns over time. In this chapter we give a selected survey of panel data models.
For ease of exposition we shall assume that the unobserved heterogeneities vary across individuals but stay constant over time. We discuss linear models in Section 2, dynamic models in Section 3, nonlinear models in Section 4, and sample attrition and sample selectivity in Section 5. Conclusions are in Section 6.
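To make the assumption of time-invariant unobserved heterogeneity concrete, the following is a minimal simulation sketch, not a procedure taken from this chapter. It assumes a hypothetical data-generating process with one regressor, y_it = alpha_i + beta * x_it + u_it, in which the individual effect alpha_i is correlated with x_it. The function names (`simulate_panel`, `pooled_ols_slope`, `within_slope`) and all parameter values are illustrative assumptions. The sketch compares a pooled least squares slope, which ignores alpha_i, with the slope obtained after subtracting each individual's time mean (the standard within transformation).

```python
import numpy as np


def simulate_panel(n_units=500, n_periods=5, beta=1.0, seed=0):
    """Simulate y_it = alpha_i + beta * x_it + u_it (assumed, illustrative DGP)."""
    rng = np.random.default_rng(seed)
    # Time-invariant individual effect, one value per unit.
    alpha = rng.normal(size=(n_units, 1))
    # Regressor correlated with alpha_i: this is the source of omitted variable bias.
    x = alpha + rng.normal(size=(n_units, n_periods))
    u = rng.normal(size=(n_units, n_periods))
    y = alpha + beta * x + u
    return y, x


def pooled_ols_slope(y, x):
    """Slope from pooled OLS with a common intercept, ignoring alpha_i."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())


def within_slope(y, x):
    """Slope after removing each individual's time mean (within transformation)."""
    xw = x - x.mean(axis=1, keepdims=True)
    yw = y - y.mean(axis=1, keepdims=True)
    return float((xw * yw).sum() / (xw ** 2).sum())


y, x = simulate_panel()
b_pooled = pooled_ols_slope(y, x)
b_within = within_slope(y, x)
print(f"pooled OLS slope: {b_pooled:.3f}")
print(f"within slope:     {b_within:.3f}")
```

Under these assumed settings the pooled slope should drift away from the true value of 1 because alpha_i is left in the error term, while the within slope should stay close to it, since demeaning over time removes any effect that is constant within an individual. This shortcut is available only because the heterogeneity is assumed constant over time; it does not apply to the fully general model (16.1).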