A COMPANION TO Theoretical Econometrics
Panel data
Panel data are repeated measurements over time for a set of cross-sectional units (e.g., households, firms, regions). The additional time dimension allows for consistent estimation when there is measurement error. To start, consider the impact of measurement error in the simple case with a single regressor. For a typical observation indexed by $n$, $n = 1, \ldots, N$, the model is
$$y_n = \xi_n \beta + \iota_T \alpha_n + \varepsilon_n,$$
$$x_n = \xi_n + v_n,$$
where $y_n$, $\xi_n$, $\varepsilon_n$, and $v_n$ are vectors of length $T$ and $T$ is the number of observed time points; $\xi_n$, $\varepsilon_n$, and $v_n$ are mutually independent, and their distributions are independent of $\alpha_n$, which is the so-called individual effect, a time-constant latent characteristic (fixed or random) of the cross-sectional units commonly included in a panel data model; see, e.g., Baltagi (1995) for an overview. The vector $\iota_T$ is a $T$-vector of ones. The random vectors $\xi_n$, $\varepsilon_n$, and $v_n$ are iid with zero expectation and variances $E(\xi_n \xi_n') = \Sigma_\xi$, $E(\varepsilon_n \varepsilon_n') = \sigma_\varepsilon^2 I_T$, and $E(v_n v_n') = \Sigma_v$, respectively. Consequently, $E(x_n x_n') = \Sigma_x = \Sigma_\xi + \Sigma_v$. Eliminating $\xi_n$ gives $y_n = x_n \beta + \iota_T \alpha_n + u_n$, where
$$u_n = \varepsilon_n - v_n \beta.$$
Let $Q$ be a symmetric $T \times T$ matrix which is as yet unspecified apart from the property $Q\iota_T = 0$, so that $Qy_n$ no longer contains the individual effect $\alpha_n$. We consider estimators of $\beta$ of the form
$$\hat\beta = \frac{\sum_n x_n' Q y_n}{\sum_n x_n' Q x_n}. \qquad (8.17)$$
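This general formulation includes the so-called within estimator, when $Q = I_T - \iota_T \iota_T'/T$ is the centering operator, and the first-difference estimator, when $Q = RR'$, where $R'$ is the $(T-1) \times T$ matrix taking first differences. Now, because $Q\iota_T = 0$ and $E(x_n' Q u_n) = -\beta\, E(v_n' Q v_n)$,
$$\operatorname{plim}_{N \to \infty} \hat\beta = \beta + \frac{E(x_n' Q u_n)}{E(x_n' Q x_n)} = \beta - \beta\, \frac{\operatorname{tr}(Q\Sigma_v)}{\operatorname{tr}(Q\Sigma_x)}.$$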
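The estimator (8.17) and the two choices of $Q$ can be illustrated with a short simulation. The following is a minimal sketch in Python with NumPy, not part of the original text: the helper names (`q_within`, `q_first_difference`, `beta_hat`), the sample sizes, and the data-generating design (a unit-level component plus noise for the true regressor, white-noise measurement error) are all assumptions chosen for illustration.

```python
import numpy as np


def q_within(T):
    """Centering operator Q = I_T - iota iota' / T."""
    iota = np.ones((T, 1))
    return np.eye(T) - iota @ iota.T / T


def q_first_difference(T):
    """Q = R R', where R' is the (T-1) x T first-difference matrix."""
    R_t = np.diff(np.eye(T), axis=0)  # row t has -1 in column t, +1 in column t+1
    return R_t.T @ R_t


def beta_hat(x, y, Q):
    """Estimator (8.17); x and y are N x T arrays, one row per unit."""
    num = np.einsum("nt,ts,ns->", x, Q, y)  # sum_n x_n' Q y_n
    den = np.einsum("nt,ts,ns->", x, Q, x)  # sum_n x_n' Q x_n
    return num / den


# Hypothetical design, chosen only for illustration.
rng = np.random.default_rng(0)
N, T, beta = 20000, 5, 1.0

# True regressor: a unit-level component plus noise, so it is
# strongly correlated over time.
xi = rng.normal(size=(N, 1)) + 0.5 * rng.normal(size=(N, T))
alpha = rng.normal(size=(N, 1))        # individual effect
eps = rng.normal(size=(N, T))          # equation error
v = 0.5 * rng.normal(size=(N, T))      # measurement error, white noise

y = beta * xi + alpha + eps
x = xi + v                             # observed, error-ridden regressor

b_within = beta_hat(x, y, q_within(T))
b_fd = beta_hat(x, y, q_first_difference(T))
print("within:", b_within, "first difference:", b_fd)
```

In this assumed design the measurement error accounts for a large share of the variation in $x$ that survives the transformation by $Q$, so both estimates should come out well below the true value $\beta = 1$.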
Again we have an estimator that is asymptotically biased towards zero. Whatever the precise structure of $\Sigma_v$ and $\Sigma_x$, it seems reasonable to assume that the true regressor values $\xi$ will be much more strongly correlated over time than the measurement errors $v$. Therefore, $x$ will also be more strongly correlated over time than $v$. Hence, eliminating the means over time with the $Q$ matrix reduces the variance matrix $\Sigma_x$ more than it reduces $\Sigma_v$, and the bias of the estimator (8.17) will be worse than the bias of the OLS estimator.
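To make the comparison with OLS concrete, the relative asymptotic bias $\operatorname{tr}(Q\Sigma_v)/\operatorname{tr}(Q\Sigma_x)$ implied by the model can be evaluated for given covariance matrices. The sketch below does this in Python with NumPy under assumptions that are not in the original text: an AR(1)-type correlation pattern for the true regressor, measurement errors uncorrelated over time, and OLS treated as the case $Q = I_T$ (which gives the same formula here because $\alpha_n$ is independent of the zero-mean regressor). The function name `attenuation` and all numerical values are hypothetical.

```python
import numpy as np


def attenuation(Q, sigma_v, sigma_x):
    """Relative asymptotic bias tr(Q Sigma_v) / tr(Q Sigma_x)."""
    return np.trace(Q @ sigma_v) / np.trace(Q @ sigma_x)


T = 5
rho = 0.9  # assumed correlation between adjacent periods of the true regressor

# Assumed covariance matrices (illustrative values only).
lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
sigma_xi = rho ** lags                 # AR(1)-type pattern, unit variances
sigma_v = 0.25 * np.eye(T)             # measurement error uncorrelated over time
sigma_x = sigma_xi + sigma_v

iota = np.ones((T, 1))
Q_within = np.eye(T) - iota @ iota.T / T
R_t = np.diff(np.eye(T), axis=0)       # (T-1) x T first-difference matrix R'
Q_fd = R_t.T @ R_t

bias_ols = attenuation(np.eye(T), sigma_v, sigma_x)
bias_within = attenuation(Q_within, sigma_v, sigma_x)
bias_fd = attenuation(Q_fd, sigma_v, sigma_x)

print("OLS:", bias_ols)
print("within:", bias_within)
print("first difference:", bias_fd)
```

Under these assumed values the relative bias is smallest for OLS and larger for both transformed estimators, in line with the argument above; other covariance structures may of course give a different ranking between the within and first-difference estimators.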
The main virtue of the panel data structure is, however, that in some cases, several of these estimators (with different Q matrices) can be combined into a consistent estimator. The basic results on measurement error in panel data are due to Griliches and Hausman (1986); see also Wansbeek and Koning (1991). Further elaboration for a variety of cases is given by Biørn (1992a, 1992b).
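One way such a combination can work is sketched below. It rests on an assumption that the text does not impose: the measurement errors are uncorrelated over time with constant variance, $\Sigma_v = \sigma_v^2 I_T$. Under that assumption the probability limit of (8.17) is $\beta - \beta\sigma_v^2 a_Q$ with $a_Q = \operatorname{tr}(Q)/\operatorname{tr}(Q\Sigma_x)$, and $\operatorname{tr}(Q\Sigma_x)$ can be estimated by $N^{-1}\sum_n x_n' Q x_n$, so two different $Q$ matrices give two equations in the two unknowns $\beta$ and $\beta\sigma_v^2$. The function names and the simulated design are hypothetical, and the code is an illustration rather than a reproduction of any cited estimator.

```python
import numpy as np


def quad(x, Q, y):
    """Return sum_n x_n' Q y_n for N x T arrays x and y."""
    return np.einsum("nt,ts,ns->", x, Q, y)


def beta_hat(x, y, Q):
    """Estimator (8.17)."""
    return quad(x, Q, y) / quad(x, Q, x)


def beta_combined(x, y, Q1, Q2):
    """Solve the two plim equations for beta, ASSUMING Sigma_v = sigma_v^2 I_T."""
    N = x.shape[0]
    b1, b2 = beta_hat(x, y, Q1), beta_hat(x, y, Q2)
    # a_j = tr(Q_j) / tr(Q_j Sigma_x), with the denominator estimated from the sample
    a1 = np.trace(Q1) / (quad(x, Q1, x) / N)
    a2 = np.trace(Q2) / (quad(x, Q2, x) / N)
    # b_j = beta - c * a_j with c = beta * sigma_v^2; eliminate c
    return (a2 * b1 - a1 * b2) / (a2 - a1)


# Hypothetical design, chosen only for illustration.
rng = np.random.default_rng(1)
N, T, beta, rho = 20000, 5, 1.0, 0.9

# True regressor follows a stationary AR(1) with unit variance (assumption).
xi = np.empty((N, T))
xi[:, 0] = rng.normal(size=N)
for t in range(1, T):
    xi[:, t] = rho * xi[:, t - 1] + np.sqrt(1 - rho**2) * rng.normal(size=N)

alpha = rng.normal(size=(N, 1))        # individual effect
eps = 0.5 * rng.normal(size=(N, T))    # equation error
v = 0.5 * rng.normal(size=(N, T))      # measurement error, white noise

y = beta * xi + alpha + eps
x = xi + v

iota = np.ones((T, 1))
Q_within = np.eye(T) - iota @ iota.T / T
R_t = np.diff(np.eye(T), axis=0)       # (T-1) x T first-difference matrix R'
Q_fd = R_t.T @ R_t

b_within = beta_hat(x, y, Q_within)
b_fd = beta_hat(x, y, Q_fd)
b_comb = beta_combined(x, y, Q_within, Q_fd)
print("within:", b_within, "first difference:", b_fd, "combined:", b_comb)
```

The combination only identifies $\beta$ when the two values $a_Q$ differ, which requires the true regressor to have some correlation structure over time that the two transformations treat differently; if the assumption on $\Sigma_v$ fails, the combined estimate is in general no longer consistent.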