A COMPANION TO Theoretical Econometrics
Nearly uninformative moment conditions
While it is desirable to base estimation on the optimal moment conditions, this is not necessary. Even if the population moment condition is sub-optimal, the GMM framework can be used to obtain consistent, asymptotically normal estimators provided that the parameter is identified. In recent years, there has been a growing awareness that this proviso may not be so trivial in situations which arise in practice. In a very influential paper, Nelson and Startz (1990) drew attention to this potential problem and provided the first evidence of the problems it causes for the inference framework we have described above. Their paper has prompted considerable interest in the behavior of GMM in cases in which the population moment condition provided is nearly uninformative about $\theta_0$. In this section we concentrate on illustrating the nature of the problem, and then briefly consider a potential solution.
For expositional simplicity, we restrict attention to the simple linear regression model,
$$y_t = x_t \theta_0 + u_t, \qquad (11.41)$$
in which $u_t$ is an iid process with mean zero and variance $\sigma_u^2$. Suppose the scalar parameter $\theta_0$ is estimated by instrumental variables which, as we have seen, is just GMM estimation based on the population moment condition
$$E[z_t u_t(\theta_0)] = 0, \qquad (11.42)$$
where $z_t$ is a $q \times 1$ vector of instruments and $u_t(\theta_0) = y_t - x_t \theta_0$. From Lemma 1, $\theta_0$ is identified$^{31}$ by (11.42) if $\mathrm{rank}\{E[z_t x_t]\} = 1$. In this simple example, $\theta_0$ is unidentified if $E[z_t x_t]$ is the null vector, which would occur if $z_t$ and $x_t$ are uncorrelated and both possess zero means. In practice, it is unlikely that $E[z_t x_t]$ is exactly zero. The contribution of Nelson and Startz's (1990) paper is to demonstrate that problems occur if $E[z_t x_t]$ is nonzero but small.$^{32}$ It is this scenario which we refer to as "nearly uninformative moment conditions."
To proceed, it is necessary to develop a model which can capture the idea of nearly uninformative moment conditions. Following Staiger and Stock (1997), we solve this problem by assuming that
$$x_t = z_t' \gamma_T + \varepsilon_t, \qquad (11.43)$$
where $\gamma_T = T^{-1/2} c$, $c$ is a nonzero $q \times 1$ vector of constants, and $\varepsilon_t$ is the unobserved error, which has zero mean and is uncorrelated with $z_t$.$^{33}$ Notice that (11.43) implies that $E_T[z_t x_t] = E[z_t z_t'] T^{-1/2} c$ and so is nonzero for finite $T$ but zero in the limit as $T \to \infty$.$^{34}$ So the concept of nearly uninformative moment conditions is captured by assuming that $\{x_t;\ t = 1, 2, \ldots, T\}$ is generated by a sequence of processes whose relationship to $z_t$ disappears at rate $T^{1/2}$. This rate is chosen so that the effects of the nearly uninformative moment conditions manifest themselves in the limiting behavior of the estimator. Since $p = 1$, we have
$$\hat{\theta}_T - \theta_0 = \frac{x'Z(Z'Z)^{-1}Z'u}{x'Z(Z'Z)^{-1}Z'x}. \qquad (11.44)$$
To analyze the limiting behavior of $\hat{\theta}_T - \theta_0$ it is necessary to impose certain regularity conditions. We explicitly assume that $z_t$ is independent of $u_t$, but leave the other necessary regularity conditions unstated for brevity. Using the weak law of large numbers and the central limit theorem respectively, it follows that: (i) $T^{-1}Z'Z \xrightarrow{p} M_{zz}$, a positive definite matrix of constants; (ii) $T^{-1/2}Z'u \xrightarrow{d} N(0, \sigma_u^2 M_{zz})$. Notice that neither (i) nor (ii) involves the relationship between $x_t$ and $z_t$, and so both would equally hold if $\theta_0$ is properly identified. The key difference comes in the behavior of $Z'x$. From (11.43), it follows that
$$Z'x = T^{-1/2} Z'Z c + Z'\varepsilon, \qquad (11.45)$$
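where $\varepsilon$ is the $T \times 1$ vector with $t$th element $\varepsilon_t$. Therefore, $T^{-1}Z'x \xrightarrow{p} 0$, while dividing (11.45) by $T^{1/2}$ gives $T^{-1/2}Z'x = (T^{-1}Z'Z)c + T^{-1/2}Z'\varepsilon \xrightarrow{d} N(M_{zz}c, \sigma_\varepsilon^2 M_{zz})$, where $\sigma_\varepsilon^2$ denotes the variance of $\varepsilon_t$. The nature of this limiting behavior means that,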
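$$\hat{\theta}_T - \theta_0 = \frac{T^{-1/2}x'Z\,(T^{-1}Z'Z)^{-1}\,T^{-1/2}Z'u}{T^{-1/2}x'Z\,(T^{-1}Z'Z)^{-1}\,T^{-1/2}Z'x} \xrightarrow{d} \frac{\Psi_1' M_{zz}^{-1} \Psi_2}{\Psi_1' M_{zz}^{-1} \Psi_1}, \qquad (11.46)$$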
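where $\Psi_1 \sim N(M_{zz}c, \sigma_\varepsilon^2 M_{zz})$ and $\Psi_2 \sim N(0, \sigma_u^2 M_{zz})$. Therefore, $\hat{\theta}_T$ converges in distribution to a random variable, rather than to the constant $\theta_0$, if the moment conditions are nearly uninformative in the sense of (11.43). This is in marked contrast to the case when $\theta_0$ is identified in the sense of Assumption 4. In that case, Theorem 1 indicates that $\hat{\theta}_T$ converges in probability to $\theta_0$.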
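The contrast between the two cases can be illustrated by simulation. The following is a minimal sketch, not part of the original analysis: the function names `simulate_iv_errors` and `iqr`, the Gaussian design with $M_{zz} = I_q$, and the correlation `rho` between $u_t$ and $\varepsilon_t$ are all assumptions made for illustration. It draws repeated samples from (11.41) and (11.43) and records the estimation error of the instrumental variables estimator, either with $\gamma_T = T^{-1/2}c$ or with a fixed coefficient $\gamma = c$.

```python
import numpy as np


def simulate_iv_errors(T, c, theta0=1.0, rho=0.5, n_reps=1000,
                       local_to_zero=True, seed=0):
    """Return n_reps draws of theta_hat - theta0 for the IV estimator.

    Assumed design (illustrative, not from the text): z_t ~ N(0, I_q), and
    (u_t, eps_t) jointly normal with unit variances and correlation rho.
    """
    rng = np.random.default_rng(seed)
    c = np.asarray(c, dtype=float)
    q = c.size
    # gamma_T = T^{-1/2} c mimics (11.43); a fixed gamma = c is the identified case.
    gamma = c / np.sqrt(T) if local_to_zero else c
    errors = np.empty(n_reps)
    for r in range(n_reps):
        Z = rng.standard_normal((T, q))
        u = rng.standard_normal(T)
        eps = rho * u + np.sqrt(1.0 - rho**2) * rng.standard_normal(T)
        x = Z @ gamma + eps
        y = x * theta0 + u
        # Projection of x on the instruments: x_hat = Z (Z'Z)^{-1} Z'x.
        x_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x)
        # theta_hat = x'Z(Z'Z)^{-1}Z'y / x'Z(Z'Z)^{-1}Z'x
        theta_hat = (x_hat @ y) / (x_hat @ x)
        errors[r] = theta_hat - theta0
    return errors


def iqr(a):
    """Interquartile range, a spread measure that does not rely on moments."""
    q75, q25 = np.percentile(a, [75, 25])
    return q75 - q25


if __name__ == "__main__":
    for T in (100, 400, 1600):
        weak = simulate_iv_errors(T, c=[1.0, 1.0], local_to_zero=True)
        strong = simulate_iv_errors(T, c=[1.0, 1.0], local_to_zero=False)
        print(f"T={T:5d}  IQR nearly uninformative: {iqr(weak):.3f}  "
              f"IQR identified: {iqr(strong):.3f}")
```

Under these assumptions, the spread of the estimation error should shrink as $T$ grows when the coefficient is fixed, but should remain roughly stable when $\gamma_T = T^{-1/2}c$. The interquartile range is used instead of the standard deviation because the estimator can have very heavy tails when the instruments are weak.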
This analysis indicates that the asymptotic theory derived in Section 4 is inappropriate for the nearly uninformative moment condition case. It is unlikely to be known a priori whether the population moment condition in question is informative, in the sense of Assumption 4, or nearly uninformative. Therefore, it is useful to develop statistical tests to discriminate between the two cases. In our linear model example, a natural diagnostic is the F-statistic for the hypothesis that $x_t$ is linearly unrelated to $z_t$. If this hypothesis is not rejected, then this can be interpreted as evidence that identification of $\theta_0$ is suspect. Faced with an insignificant F-statistic, there are two possible responses. One strategy is to keep changing the instrument vector until the F-statistic is significant. However, Hall, Rudebusch and Wilcox (1996) report evidence that this approach does not solve the problem and in fact tends to make matters worse. A second, and more promising, strategy is to develop an inference theory which provides a better approximation in the nearly uninformative moment condition case. This line of research is still in its early stages but significant advances have been made by Staiger and Stock (1997), Stock and Wright (1997), and Wang and Zivot (1998).
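The F-statistic diagnostic described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not a prescribed procedure: the function name `first_stage_f` is hypothetical, the regression of $x_t$ on $z_t$ is run without an intercept to match (11.43), where the variables are taken to have zero means, and the errors are assumed homoskedastic.

```python
import numpy as np


def first_stage_f(x, Z):
    """F-statistic for H0: gamma = 0 in the regression x = Z gamma + eps.

    Assumes no intercept and homoskedastic errors (illustrative choices).
    Returns the statistic with its numerator and denominator degrees of freedom.
    """
    x = np.asarray(x, dtype=float)
    Z = np.asarray(Z, dtype=float)
    if Z.ndim == 1:
        Z = Z[:, None]
    T, q = Z.shape
    # Least squares fit of x on the instruments.
    gamma_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
    fitted = Z @ gamma_hat
    resid = x - fitted
    explained = fitted @ fitted   # x'Z(Z'Z)^{-1}Z'x
    rss = resid @ resid           # residual sum of squares
    f_stat = (explained / q) / (rss / (T - q))
    return f_stat, q, T - q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, c = 200, np.array([1.0, 1.0])
    Z = rng.standard_normal((T, 2))
    # Instruments whose coefficient shrinks with T, as in (11.43).
    x_weak = Z @ (c / np.sqrt(T)) + rng.standard_normal(T)
    # Instruments with a fixed, nonzero coefficient.
    x_strong = Z @ c + rng.standard_normal(T)
    print("nearly uninformative:", first_stage_f(x_weak, Z))
    print("informative:         ", first_stage_f(x_strong, Z))
```

The statistic would be compared with a critical value from the F distribution with the returned degrees of freedom. If the regression in practice includes an intercept or other exogenous regressors, the restricted and unrestricted sums of squares would need to be adjusted accordingly.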