Advanced Econometrics Takeshi Amemiya
Least Squares Estimator as Best Unbiased Estimator (BUE)
In this section we shall show that under Model 1 with normality, the least squares estimator of the regression parameters $\boldsymbol{\beta}$ attains the Cramér-Rao lower bound and hence is the best unbiased estimator. Assumptions A, B, and C of Theorem 1.3.1 are easy to verify for our model; as for assumption D, we shall verify it only for the case where $\beta$ is a scalar and $\sigma^2$ is known, assuming that the parameter space of $\beta$ and $\sigma^2$ is compact and does not contain $\sigma^2 = 0$.
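Consider $\log L$ given in (1.3.2). In applying Theorem 1.3.1 to our model, we put $\boldsymbol{\theta}' = (\boldsymbol{\beta}', \sigma^2)$. The first- and second-order derivatives of $\log L$ are given by

$$\frac{\partial \log L}{\partial \boldsymbol{\beta}} = \frac{1}{\sigma^2}(\mathbf{X}'\mathbf{y} - \mathbf{X}'\mathbf{X}\boldsymbol{\beta}), \tag{1.3.15}$$

$$\frac{\partial \log L}{\partial \sigma^2} = -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta}), \tag{1.3.16}$$

$$\frac{\partial^2 \log L}{\partial \boldsymbol{\beta}\,\partial \boldsymbol{\beta}'} = -\frac{1}{\sigma^2}\mathbf{X}'\mathbf{X}, \tag{1.3.17}$$

$$\frac{\partial^2 \log L}{\partial (\sigma^2)^2} = \frac{T}{2\sigma^4} - \frac{1}{\sigma^6}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta}), \tag{1.3.18}$$

$$\frac{\partial^2 \log L}{\partial \sigma^2\,\partial \boldsymbol{\beta}} = -\frac{1}{\sigma^4}(\mathbf{X}'\mathbf{y} - \mathbf{X}'\mathbf{X}\boldsymbol{\beta}). \tag{1.3.19}$$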
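From (1.3.15) and (1.3.16) we can immediately see that assumption A is satisfied. Taking the expectation of (1.3.17), (1.3.18), and (1.3.19), we obtain

$$E\frac{\partial^2 \log L}{\partial \boldsymbol{\theta}\,\partial \boldsymbol{\theta}'} = -\begin{bmatrix} \dfrac{1}{\sigma^2}\mathbf{X}'\mathbf{X} & \mathbf{0} \\ \mathbf{0}' & \dfrac{T}{2\sigma^4} \end{bmatrix}. \tag{1.3.20}$$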
We have
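$$E\frac{\partial \log L}{\partial \boldsymbol{\beta}}\frac{\partial \log L}{\partial \boldsymbol{\beta}'} = \frac{1}{\sigma^4}E\,\mathbf{X}'\mathbf{u}\mathbf{u}'\mathbf{X} = \frac{1}{\sigma^2}\mathbf{X}'\mathbf{X}, \tag{1.3.21}$$

$$E\left(\frac{\partial \log L}{\partial \sigma^2}\right)^2 = \frac{T^2}{4\sigma^4} - \frac{T}{2\sigma^6}E\,\mathbf{u}'\mathbf{u} + \frac{1}{4\sigma^8}E(\mathbf{u}'\mathbf{u})^2 = \frac{T}{2\sigma^4}, \tag{1.3.22}$$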
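because $E(\mathbf{u}'\mathbf{u})^2 = (T^2 + 2T)\sigma^4$, and

$$E\frac{\partial \log L}{\partial \boldsymbol{\beta}}\frac{\partial \log L}{\partial \sigma^2} = \mathbf{0}. \tag{1.3.23}$$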
Therefore, from (1.3.20) to (1.3.23) we can see that assumptions B and C are both satisfied.
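The moment calculations behind assumptions B and C can be checked numerically. The following is a minimal Monte Carlo sketch and is not part of the original text: the design matrix, the value of $\sigma^2$, the number of draws, and the function name `score_moments` are illustrative assumptions. It simulates $\mathbf{u} \sim N(\mathbf{0}, \sigma^2\mathbf{I})$, forms the score vector at the true parameters, and compares its sample second-moment matrix with the block-diagonal information matrix with blocks $\mathbf{X}'\mathbf{X}/\sigma^2$ and $T/(2\sigma^4)$.

```python
import numpy as np

def score_moments(X, sigma2, n_draws=200_000, seed=0):
    """Simulate the score of the normal regression log likelihood at the true parameters.

    Returns the sample mean of the score, the sample mean of its outer product,
    the theoretical information matrix, and the sample mean of (u'u)^2.
    """
    rng = np.random.default_rng(seed)
    T, K = X.shape
    # Draw u ~ N(0, sigma2 * I); each row is one draw of the T-vector u.
    u = rng.normal(scale=np.sqrt(sigma2), size=(n_draws, T))
    uu = np.einsum("ij,ij->i", u, u)                    # u'u for each draw
    s_beta = (u @ X) / sigma2                           # score for beta: X'u / sigma^2
    s_sig = -T / (2 * sigma2) + uu / (2 * sigma2**2)    # score for sigma^2
    scores = np.column_stack([s_beta, s_sig])
    outer = scores.T @ scores / n_draws                 # sample E[score score']
    info = np.zeros((K + 1, K + 1))
    info[:K, :K] = X.T @ X / sigma2                     # beta block
    info[K, K] = T / (2 * sigma2**2)                    # sigma^2 block
    return scores.mean(axis=0), outer, info, np.mean(uu**2)

# Hypothetical design: intercept plus a linear trend, T = 6.
X = np.column_stack([np.ones(6), np.arange(6.0)])
sigma2 = 2.0
mean_score, outer, info, m4 = score_moments(X, sigma2)
print(np.round(mean_score, 3))          # should be close to the zero vector
print(np.round(outer, 2))               # should be close to info
print(info)
print(m4, (6**2 + 2 * 6) * sigma2**2)   # sample vs. (T^2 + 2T) sigma^4
```

Up to simulation error, the mean score should be near zero (assumption A) and the outer-product matrix should match the information matrix, including the zero off-diagonal block.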
We shall verify assumption D only for the case where $\beta$ is a scalar and $\sigma^2$ is known so that we can use Theorem 1.3.2, which is stated for the case of a scalar parameter. Take $\hat{\beta}L$ as the $f$ of that theorem. We need to check only the last condition of the theorem. Differentiating (1.3.1) with respect to $\beta$, we have

$$\frac{\partial L}{\partial \beta} = \frac{1}{\sigma^2}(\mathbf{x}'\mathbf{y} - \beta\,\mathbf{x}'\mathbf{x})L, \tag{1.3.24}$$
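where we have written $\mathbf{X}$ as $\mathbf{x}$, inasmuch as it is a vector. Therefore, by Hölder's inequality (see Royden, 1968, p. 113),

$$\int \left|\hat{\beta}\,\frac{\partial L}{\partial \beta}\right| d\mathbf{y} \le \frac{1}{\sigma^2}\left(\int \hat{\beta}^2 L\, d\mathbf{y}\right)^{1/2}\left[\int (\mathbf{x}'\mathbf{y} - \beta\,\mathbf{x}'\mathbf{x})^2 L\, d\mathbf{y}\right]^{1/2}. \tag{1.3.25}$$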
The first integral on the right-hand side is finite because $\hat{\beta}$ is assumed to have a finite variance. The second integral on the right-hand side is also finite because the moments of the normal distribution are finite. Moreover, both integrals are uniformly bounded in the assumed parameter space. Thus the last condition of Theorem 1.3.2 is satisfied.
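Because the integrals in (1.3.25) are taken against the density $L$, they are expectations under the model, so the inequality can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not the author's argument: it takes $\hat{\beta}$ to be the least squares estimator $\mathbf{x}'\mathbf{y}/\mathbf{x}'\mathbf{x}$ (the text allows any unbiased estimator with finite variance), and the regressor values, parameter values, and the function name `holder_sides` are hypothetical.

```python
import numpy as np

def holder_sides(x, beta, sigma2, n_draws=100_000, seed=0):
    """Monte Carlo estimates of the two sides of the Hoelder bound (1.3.25).

    Assumes scalar beta, known sigma2, and beta_hat = x'y / x'x (an illustrative choice).
    """
    rng = np.random.default_rng(seed)
    T = x.shape[0]
    u = rng.normal(scale=np.sqrt(sigma2), size=(n_draws, T))
    y = beta * x + u                    # each row is one sample of y
    xy = y @ x                          # x'y for each draw
    xx = x @ x                          # x'x (a scalar)
    beta_hat = xy / xx
    resid_term = xy - beta * xx         # x'y - beta x'x, which equals x'u
    lhs = np.mean(np.abs(beta_hat * resid_term)) / sigma2
    rhs = np.sqrt(np.mean(beta_hat**2)) * np.sqrt(np.mean(resid_term**2)) / sigma2
    return lhs, rhs

x = np.array([1.0, 2.0, 3.0])           # hypothetical regressor
beta, sigma2 = 0.5, 1.0                 # hypothetical parameter values
lhs, rhs = holder_sides(x, beta, sigma2)
# Closed form of the right-hand side: E(beta_hat^2) = beta^2 + sigma2 / x'x, E(x'u)^2 = sigma2 * x'x.
rhs_theory = np.sqrt(beta**2 + sigma2 / (x @ x)) * np.sqrt(sigma2 * (x @ x)) / sigma2
print(lhs, rhs, rhs_theory)
```

Both factors on the right-hand side have closed forms that are continuous in $\beta$ and $\sigma^2$, which is consistent with the uniform boundedness claimed for a compact parameter space excluding $\sigma^2 = 0$.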
Finally, from (1.3.8) and (1.3.20) we have
$$V\hat{\boldsymbol{\beta}} \ge \sigma^2(\mathbf{X}'\mathbf{X})^{-1} \tag{1.3.26}$$

for any unbiased $\hat{\boldsymbol{\beta}}$. The right-hand side of (1.3.26) is the variance-covariance matrix of the least squares estimator of $\boldsymbol{\beta}$, thereby proving that the least squares estimator is the best unbiased estimator under Model 1 with normality. Unlike the result in Section 1.2.5, the result in this section is not constrained by the linearity condition because the normality assumption was added. Nevertheless, even with the normality assumption, there may be a biased estimator that has a smaller average mean squared error than the least squares estimator, as we shall show in Section 2.2. In nonnormal situations, certain nonlinear estimators may be preferred to LS, as we shall see in Section 2.3.
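The matrix inequality in (1.3.26) means that the difference between the covariance matrix of any unbiased estimator and $\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$ is nonnegative definite. As a small illustration, the sketch below compares the bound with the exact covariance of one particular competitor, a weighted least squares estimator with arbitrary positive weights. This competitor is linear, so the example illustrates only a special case of the result, which under normality covers nonlinear unbiased estimators as well. The design matrix, the weights, and the function names are hypothetical.

```python
import numpy as np

def cr_bound(X, sigma2):
    """Cramer-Rao bound for beta, which is also the covariance of least squares."""
    return sigma2 * np.linalg.inv(X.T @ X)

def weighted_ls_cov(X, w, sigma2):
    """Exact covariance of the linear unbiased estimator (X'WX)^{-1} X'W y, W = diag(w)."""
    XtW = X.T * w                          # X'W: scales column t of X' by w[t]
    A = np.linalg.solve(XtW @ X, XtW)      # A = (X'WX)^{-1} X'W, so A X = I (unbiased)
    return sigma2 * A @ A.T                # Var(A y) = sigma2 * A A'

# Hypothetical design and weights.
X = np.column_stack([np.ones(6), np.arange(6.0)])
w = np.array([1.0, 2.0, 0.5, 3.0, 1.0, 0.25])
sigma2 = 2.0

bound = cr_bound(X, sigma2)
gap = weighted_ls_cov(X, w, sigma2) - bound
eigs = np.linalg.eigvalsh(gap)             # eigenvalues of the symmetric difference
print(eigs)                                # all should be nonnegative up to rounding
```

With unit weights the competitor reduces to least squares and the difference is the zero matrix, so the bound is attained.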