A COMPANION TO Theoretical Econometrics
Dynamic Models
When $x_{it}$ and/or $z_{it}$ contain lagged dependent variables, the way the initial value of the dependent variable is modeled plays a crucial role in the consistency and efficiency of an estimator, because a typical panel contains a large number of cross-sectional units followed over a short period of time (e.g. Anderson and Hsiao, 1981, 1982; Bhargava and Sargan, 1983; Blundell and Bond, 1998). Moreover, if there is unobserved heterogeneity and the individual specific effects are more appropriately treated as random, the random effects and the lagged dependent variables are correlated, so (16.6) is violated. If the individual effects are treated as fixed, the number of individual specific parameters increases with the number of cross-sectional units, N. In contrast to the static case of Section 2, the estimation of the individual specific effects in a dynamic model is not independent of the estimation of the structural parameters that are common across N and T. Since each individual has only a finite number of observations, the individual specific parameters cannot be accurately estimated, and the errors in their estimation are transmitted into the estimation of the structural parameters when the two estimators are not independent. This is the classical incidental parameters problem (Neyman and Scott, 1948).
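The incidental parameters problem can be illustrated with a small simulation. The sketch below is not taken from the chapter: it assumes a pure autoregressive design (no exogenous regressors), standard normal effects and errors, and a stationary start, and it uses the within (least squares dummy variable) estimator as one example of treating the effects as fixed. All function names and parameter values are hypothetical.

```python
import random


def simulate_panel(n_units, n_periods, gamma, seed=0):
    """Simulate y_it = alpha_i + gamma * y_{i,t-1} + u_it for each unit.

    Returns a list of series, each of length n_periods + 1 (y_i0 included).
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        alpha = rng.gauss(0.0, 1.0)
        # Assumption: y_i0 is drawn from the stationary distribution,
        # so it is correlated with alpha_i.
        y = [alpha / (1.0 - gamma) + rng.gauss(0.0, 1.0) / (1.0 - gamma ** 2) ** 0.5]
        for _ in range(n_periods):
            y.append(alpha + gamma * y[-1] + rng.gauss(0.0, 1.0))
        panel.append(y)
    return panel


def within_estimate(panel):
    """Within slope: regress unit-demeaned y_it on unit-demeaned y_{i,t-1}."""
    num = den = 0.0
    for y in panel:
        lag, cur = y[:-1], y[1:]
        lag_mean = sum(lag) / len(lag)
        cur_mean = sum(cur) / len(cur)
        for a, b in zip(lag, cur):
            num += (a - lag_mean) * (b - cur_mean)
            den += (a - lag_mean) ** 2
    return num / den


GAMMA = 0.5
short_T = within_estimate(simulate_panel(2000, 5, GAMMA, seed=1))
long_T = within_estimate(simulate_panel(200, 200, GAMMA, seed=1))
print(f"T=5: {short_T:.3f}   T=200: {long_T:.3f}   true value: {GAMMA}")
```

Under these assumptions the short-panel estimate should fall well below the true coefficient even with many units, while the long-panel estimate should be close to it, which is the pattern the paragraph above describes.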
Consider a simple dynamic model with individual specific effects appearing in the intercepts only,
\[ y_i = e_T \alpha_i + y_{i,-1} \gamma_1 + Z_i \gamma_2 + u_i, \tag{16.14} \]
where $z_{it}$ is a $(k_2 - 1)$-dimensional vector of exogenous variables, $Z_i = (z_{i1}, \ldots, z_{iT})'$, $e_T$ is a $T \times 1$ vector of ones, $y_{i,-1} = (y_{i0}, \ldots, y_{i,T-1})'$, and, for ease of exposition, we assume that the $y_{i0}$ are observable. Let $H$ be an $m \times T$ transformation matrix such that $He_T = 0$. Premultiplying (16.14) by $H$ eliminates the individual specific effects $\alpha_i$ from the specification,
\[ Hy_i = Hy_{i,-1} \gamma_1 + HZ_i \gamma_2 + Hu_i. \tag{16.15} \]
Since (16.15) does not depend on $\alpha_i$, if we can find instruments that are correlated with the explanatory variables but uncorrelated with $Hu_i$, we can apply the instrumental variable method to estimate $\gamma_1$ and $\gamma_2$ (e.g. Anderson and Hsiao, 1981, 1982). Let $W_i$ be the $q \times m$ matrix of instrumental variables that satisfies
\[ E(W_i H u_i) = 0. \tag{16.16} \]
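A minimal sketch of the instrumental variable idea follows. It assumes a model with no exogenous regressors, serially uncorrelated errors, and first differencing as the transformation, and it uses the level $y_{i,t-2}$ as the single instrument for the lagged difference, in the spirit of the Anderson and Hsiao approach cited above. The simulation design and all names are hypothetical and not taken from the chapter.

```python
import random


def simulate_panel(n_units, n_periods, gamma, seed=0):
    """Simulate y_it = alpha_i + gamma * y_{i,t-1} + u_it with a stationary start."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        alpha = rng.gauss(0.0, 1.0)
        y = [alpha / (1.0 - gamma) + rng.gauss(0.0, 1.0) / (1.0 - gamma ** 2) ** 0.5]
        for _ in range(n_periods):
            y.append(alpha + gamma * y[-1] + rng.gauss(0.0, 1.0))
        panel.append(y)
    return panel


def iv_first_difference(panel):
    """IV slope for dy_t = gamma * dy_{t-1} + du_t, instrumenting dy_{t-1} with y_{t-2}."""
    num = den = 0.0
    for y in panel:
        for t in range(2, len(y)):
            instrument = y[t - 2]
            num += instrument * (y[t] - y[t - 1])
            den += instrument * (y[t - 1] - y[t - 2])
    return num / den


def pooled_ols(panel):
    """Slope from regressing y_t on y_{t-1} with a common intercept, ignoring alpha_i."""
    xs = [y[t - 1] for y in panel for t in range(1, len(y))]
    ys = [y[t] for y in panel for t in range(1, len(y))]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


GAMMA = 0.5
panel = simulate_panel(5000, 6, GAMMA, seed=2)
iv_hat = iv_first_difference(panel)
ols_hat = pooled_ols(panel)
print(f"IV on differences: {iv_hat:.3f}   pooled OLS in levels: {ols_hat:.3f}   true: {GAMMA}")
```

In this design the level $y_{i,t-2}$ is uncorrelated with the differenced error but correlated with the lagged difference, so the IV estimate should be close to the true value, whereas pooled least squares in levels absorbs the individual effects into the slope.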
The generalized method of moments (GMM) estimator takes the form
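\[
\begin{pmatrix} \hat{\gamma}_1 \\ \hat{\gamma}_2 \end{pmatrix}
= \left\{ \left[ \sum_{i=1}^{N} (Hy_{i,-1}, HZ_i)' W_i' \right] \left[ \sum_{i=1}^{N} W_i \Phi W_i' \right]^{-1} \left[ \sum_{i=1}^{N} W_i (Hy_{i,-1}, HZ_i) \right] \right\}^{-1}
\left\{ \left[ \sum_{i=1}^{N} (Hy_{i,-1}, HZ_i)' W_i' \right] \left[ \sum_{i=1}^{N} W_i \Phi W_i' \right]^{-1} \left[ \sum_{i=1}^{N} W_i H y_i \right] \right\}, \tag{16.17}
\]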
where $\Phi = E(Hu_i u_i' H')$. When the transformation matrix $H$ takes the form of first differencing (16.14), $H$ is a $(T - 1) \times T$ matrix with all the elements in the $t$th row equal to zero except for the $t$th and $(t + 1)$th elements, which take the values $-1$ and $1$, respectively. If $u_{it}$ is iid, then $W_i$ takes the form (e.g. Ahn and Schmidt, 1995; Amemiya and MaCurdy, 1986; Arellano and Bover, 1995; Blundell and Bond, 1998)
where $z_i' = (z_{i1}', \ldots, z_{iT}')$.
GMM estimator (16.17) makes use of $T(T - 1)/2 + T(k_2 - 1)$ orthogonality conditions, where $k_2 - 1$ denotes the dimension of $z_{it}$. In most applications, this is a large number. For instance, even when $k_2 - 1 = 0$, there are still 45 orthogonality conditions for a model with only one lagged dependent variable when $T = 10$, which makes the implementation of the GMM estimator (16.17) nontrivial. Moreover, in finite samples it suffers from severe bias, as demonstrated in a Monte Carlo study conducted by Ziliak (1997), because of the correlation between the estimated weight matrix and the sample moments and/or the weak instruments phenomenon (Wansbeek and Knaap, 1999). The bias of the GMM estimator leads to poor coverage rates for confidence intervals based on asymptotic critical values. Hsiao, Pesaran and Tahmiscioglu (1998b) propose a transformed maximum likelihood estimator that maximizes the likelihood function of (16.15) and the initial value function
(16.19)
where $\Delta z_i$ denotes the first difference of $z_i$. The transformed MLE is consistent and asymptotically normally distributed provided that the generating process of $z_{it}$ is difference stationary, i.e. the data generating process of $z_{it}$ is of the form
\[ z_{it} = \mu_i + g t + \sum_{j=0}^{\infty} B_j \xi_{i,t-j}, \tag{16.20} \]
where $\xi_{it}$ is iid with constant mean and covariances, and $\sum_{j=0}^{\infty} \| B_j \| < \infty$. The transformed MLE is easier to implement than the GMM and is asymptotically more efficient because the GMM requires each instrument to be orthogonal to the transformed error, while the transformed MLE only requires a linear combination of the instruments to be orthogonal to the transformed errors. The Monte Carlo studies conducted by Hsiao, Pesaran and Tahmiscioglu (1998) show that the transformed MLE performs very well even when both T and N are small.
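To see how quickly the number of GMM orthogonality conditions grows with the panel length, which is part of the implementation burden discussed above, the count $T(T - 1)/2 + T(k_2 - 1)$ can be tabulated directly. The helper name and the grid of values below are illustrative only.

```python
def n_orthogonality_conditions(T, k2_minus_1):
    """Number of moment conditions T(T-1)/2 + T(k2-1) for panel length T."""
    return T * (T - 1) // 2 + T * k2_minus_1


# The case quoted in the text: one lagged dependent variable, no exogenous regressors.
print(n_orthogonality_conditions(10, 0))

# Illustrative grid: the count grows quadratically in T and linearly in k2 - 1.
for T in (5, 10, 20):
    for k in (0, 2, 5):
        print(f"T={T:2d}, k2-1={k}: {n_orthogonality_conditions(T, k)} conditions")
```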
Where individual heterogeneity cannot be completely captured by individual time-invariant effects, a varying parameter model is often used. However, while it may be reasonable to assume (16.5), (16.6) cannot hold if xit contains lagged dependent variables. For instance, consider a simple dynamic model
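\[
\begin{aligned}
y_{it} &= \beta_i y_{i,t-1} + \gamma' z_{it} + u_{it} \\
       &= \bar{\beta} y_{i,t-1} + \gamma' z_{it} + v_{it},
\end{aligned}
\tag{16.21}
\]
where $\beta_i = \bar{\beta} + \varepsilon_i$ and $v_{it} = \varepsilon_i y_{i,t-1} + u_{it}$. By continuous substitution, it can be shown that
\[
y_{i,t-1} = \gamma' \sum_{j=0}^{\infty} (\bar{\beta} + \varepsilon_i)^j z_{i,t-j-1} + \sum_{j=0}^{\infty} (\bar{\beta} + \varepsilon_i)^j u_{i,t-j-1}. \tag{16.22}
\]
It follows that $E(v_{it} \mid y_{i,t-1}) \neq 0$, since both $v_{it}$ and $y_{i,t-1}$ depend on $\varepsilon_i$. Therefore, the least squares estimator is inconsistent. Nor is the instrumental variable estimator feasible, because the instruments that are uncorrelated with $v_{it}$ are most likely uncorrelated with $z_{it}$ as well.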
Noting that when $T \to \infty$, estimating the coefficients of each cross-sectional unit by least squares is consistent, Pesaran and Smith (1995) propose a mean group estimator that takes the average of the individual least squares estimates $\hat{\theta}_i$ of the unit-specific coefficients,
\[ \hat{\bar{\theta}} = \frac{1}{N} \sum_{i=1}^{N} \hat{\theta}_i. \tag{16.23} \]
When both T and N $\to \infty$, (16.23) is consistent. However, the Monte Carlo studies conducted by Hsiao, Pesaran and Tahmiscioglu (1999) show that (16.23) does not perform well in finite samples.
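The mean group idea can be sketched in a few lines: estimate each unit's regression separately by least squares, then average the estimates across units. The simulation below is an assumption-laden illustration, not the design of any cited study. It uses a single scalar exogenous regressor following an AR(1) process, no intercept, uniformly distributed coefficient heterogeneity, and a long panel; all names and parameter values are hypothetical.

```python
import random


def simulate_unit(T, beta_i, gamma, rng, burn_in=50):
    """Rows (y_lag, z, y) from y_t = beta_i * y_{t-1} + gamma * z_t + u_t."""
    y_prev = z_prev = 0.0
    rows = []
    for t in range(T + burn_in):
        z = 0.5 * z_prev + rng.gauss(0.0, 1.0)  # assumed AR(1) regressor
        y = beta_i * y_prev + gamma * z + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            rows.append((y_prev, z, y))
        y_prev, z_prev = y, z
    return rows


def ols_two_regressors(rows):
    """Least squares of y on (y_lag, z) without intercept via the 2x2 normal equations."""
    s11 = sum(a * a for a, _, _ in rows)
    s12 = sum(a * b for a, b, _ in rows)
    s22 = sum(b * b for _, b, _ in rows)
    s1y = sum(a * c for a, _, c in rows)
    s2y = sum(b * c for _, b, c in rows)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(0)
BETA_BAR, GAMMA, N, T = 0.5, 1.0, 200, 200

unit_estimates = []
for _ in range(N):
    beta_i = BETA_BAR + rng.uniform(-0.2, 0.2)  # heterogeneous lag coefficient
    unit_estimates.append(ols_two_regressors(simulate_unit(T, beta_i, GAMMA, rng)))

# Mean group estimator: simple average of the unit-by-unit estimates.
beta_mg = sum(b for b, _ in unit_estimates) / N
gamma_mg = sum(g for _, g in unit_estimates) / N
print(f"mean group: beta_bar = {beta_mg:.3f}, gamma = {gamma_mg:.3f}")
```

With T this large the unit-level estimates carry little small-sample bias, so the average should sit near the assumed mean coefficients. Shrinking T in this sketch is one way to explore the finite-sample weakness noted above.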
Under (16.4) and (16.5), conditional on $y_{i0}$ being a fixed constant, the Bayes estimators of $\bar{\beta}$ and $\gamma$ are identical to (16.10), conditional on C1 and C2 with diffuse priors for $\bar{\beta}$ and $\gamma$ (Hsiao, Pesaran and Tahmiscioglu, 1999). The Monte Carlo studies show that a hierarchical Bayesian approach (e.g. Lindley and Smith, 1972) performs fairly well even when T is small, and better than other consistent estimators, even though treating $y_{i0}$ as a fixed constant is not justifiable. This makes the Bayes procedure particularly appealing. Moreover, the implementation of a Bayesian approach has been substantially simplified by recent advances in Markov chain Monte Carlo methods (e.g. Gelfand and Smith, 1990).