INTRODUCTION TO STATISTICS AND ECONOMETRICS
Known Variance-Covariance Matrix
In this subsection we develop the theory of generalized least squares under the assumption that $\Sigma$ is known (known up to a scalar multiple, to be precise); in the remaining subsections we discuss various ways the elements of $\Sigma$ are specified as a function of a finite number of parameters so that they can be consistently estimated.
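Since $\Sigma$ is symmetric, by Theorem 11.5.1 we can find an orthogonal matrix $H$ which diagonalizes $\Sigma$ as $H'\Sigma H = \Lambda$, where $\Lambda$ is the diagonal matrix consisting of the characteristic roots of $\Sigma$. Moreover, since $\Sigma$ is positive definite, the diagonal elements of $\Lambda$ are positive by Theorem 11.5.10. Using (11.5.4), we define $\Sigma^{-1/2} = H\Lambda^{-1/2}H'$, where $\Lambda^{-1/2} = D\{\lambda_i^{-1/2}\}$ is the diagonal matrix with $i$th diagonal element $\lambda_i^{-1/2}$ and $\lambda_i$ is the $i$th diagonal element of $\Lambda$. Premultiplying (13.1.1) by $\Sigma^{-1/2}$, we obtain
(13.1.3) $y^* = X^*\beta + u^*$,
where $y^* = \Sigma^{-1/2}y$, $X^* = \Sigma^{-1/2}X$, and $u^* = \Sigma^{-1/2}u$. Then, by Theorem 4.1.6, $Eu^* = 0$ and
(13.1.4) $Eu^*u^{*\prime} = E\,\Sigma^{-1/2}uu'(\Sigma^{-1/2})'$
$= \Sigma^{-1/2}\Sigma(\Sigma^{-1/2})'$
$= \Sigma^{-1/2}\Sigma^{1/2}\Sigma^{1/2}\Sigma^{-1/2}$
$= I$.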
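The construction of the inverse square root and the whitening identity in (13.1.4) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the 3 x 3 matrix `sigma` and the helper name `inv_sqrt_pd` are hypothetical and chosen only for illustration.

```python
import numpy as np

def inv_sqrt_pd(sigma):
    """Return H Lambda^{-1/2} H' for a symmetric positive definite matrix."""
    # eigh returns the characteristic roots and an orthogonal matrix H of eigenvectors.
    lam, H = np.linalg.eigh(sigma)
    if np.any(lam <= 0):
        raise ValueError("sigma must be positive definite")
    return H @ np.diag(lam ** -0.5) @ H.T

# Hypothetical positive definite variance-covariance matrix (illustration only).
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

s_inv_half = inv_sqrt_pd(sigma)
# Variance-covariance matrix of the transformed error term; should be the identity.
whitened = s_inv_half @ sigma @ s_inv_half.T
```

Here `whitened` plays the role of the left-hand side of (13.1.4), and `s_inv_half` is symmetric because it has the form $H D H'$ with $D$ diagonal.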
(The reader should verify that $\Sigma^{1/2}\Sigma^{1/2} = \Sigma$, that $\Sigma^{-1/2}\Sigma^{1/2} = I$, and that $(\Sigma^{-1/2})' = \Sigma^{-1/2}$ from the definitions of these matrices.) Therefore (13.1.3) is a classical regression model, and hence the least squares estimator applied to (13.1.3) has all the good properties derived in Chapter 12. We call it the generalized least squares (GLS) estimator applied to the original model (13.1.1). Denoting it by $\hat{\beta}_G$, we have
(13.1.5) $\hat{\beta}_G = (X^{*\prime}X^*)^{-1}X^{*\prime}y^*$
$= (X'\Sigma^{-1/2}\Sigma^{-1/2}X)^{-1}X'\Sigma^{-1/2}\Sigma^{-1/2}y = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y$.
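The two routes to the GLS estimator, least squares on the transformed model and the closed form in terms of the inverse variance-covariance matrix, can be compared on simulated data. This is only a sketch: the sample size, the true coefficient vector, and the particular variance-covariance matrix (elements $0.6^{|i-j|}$) are assumptions made for the illustration, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50
X = np.column_stack([np.ones(T), rng.normal(size=T)])   # regressors
beta = np.array([1.0, 2.0])                             # assumed true coefficients

# Hypothetical known variance-covariance matrix with (i, j) element 0.6**|i-j|.
idx = np.arange(T)
sigma = 0.6 ** np.abs(np.subtract.outer(idx, idx))
u = np.linalg.cholesky(sigma) @ rng.normal(size=T)      # errors with E uu' = sigma
y = X @ beta + u

# Route 1: premultiply by the inverse square root, then apply least squares.
lam, H = np.linalg.eigh(sigma)
s_inv_half = H @ np.diag(lam ** -0.5) @ H.T
X_star, y_star = s_inv_half @ X, s_inv_half @ y
beta_transformed, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)

# Route 2: closed form (X' S^{-1} X)^{-1} X' S^{-1} y, using solves instead of inverses.
sigma_inv_X = np.linalg.solve(sigma, X)
beta_gls = np.linalg.solve(X.T @ sigma_inv_X, sigma_inv_X.T @ y)
```

Up to floating-point error, `beta_transformed` and `beta_gls` should coincide. Using `np.linalg.solve` rather than an explicit inverse is a numerical design choice, not something the derivation requires.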
(Suppose $\Sigma$ is known up to a scalar multiple. That is, suppose $\Sigma = aQ$, where $a$ is a scalar positive unknown parameter and $Q$ is a known positive definite matrix. Then $a$ drops out of formula (13.1.5) and we have $\hat{\beta}_G = (X'Q^{-1}X)^{-1}X'Q^{-1}y$. The classical regression model is a special case, in which $a = \sigma^2$ and $Q = I$.)
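That the unknown scalar cancels can be seen numerically as well. The sketch below assumes NumPy; the helper `gls`, the matrix `Q`, and the value of `a` are hypothetical and serve only to illustrate the cancellation and the special case $Q = I$.

```python
import numpy as np

def gls(X, y, sigma):
    """GLS estimate (X' S^{-1} X)^{-1} X' S^{-1} y for symmetric positive definite S."""
    sigma_inv_X = np.linalg.solve(sigma, X)
    return np.linalg.solve(X.T @ sigma_inv_X, sigma_inv_X.T @ y)

rng = np.random.default_rng(1)
T = 30
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = rng.normal(size=T)                      # arbitrary data; the identity is algebraic

Q = np.diag(np.linspace(1.0, 3.0, T))       # hypothetical known positive definite Q
a = 7.5                                     # stands in for the unknown positive scalar

beta_with_Q = gls(X, y, Q)
beta_with_aQ = gls(X, y, a * Q)             # the scalar a cancels in the formula

# Special case Q = I: GLS reduces to ordinary least squares.
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_identity = gls(X, y, np.eye(T))
```

The estimates `beta_with_Q` and `beta_with_aQ` should agree up to rounding, as should `beta_identity` and `beta_ls`.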
Inserting (13.1.1) into the final term of (13.1.5) and using Theorem 4.1.6, we can readily show that
(13.1.6) $E\hat{\beta}_G = \beta$
and
(13.1.7) $V\hat{\beta}_G = (X'\Sigma^{-1}X)^{-1}$.
It is important to study the properties of the least squares estimator applied to the model (13.1.1) because the researcher may use the LS estimator under the mistaken assumption that the model is (at least approximately) the classical regression model. We have, using Theorem 4.1.6,
(13.1.8) $E\hat{\beta} = \beta$
and
(13.1.9) $V\hat{\beta} = E(X'X)^{-1}X'uu'X(X'X)^{-1}$
$= (X'X)^{-1}X'\Sigma X(X'X)^{-1}$.
Thus the LS estimator is unbiased even under the model (13.1.1). Its variance-covariance matrix, however, is different from either (13.1.7) or (12.2.22). Since the GLS estimator is the best linear unbiased estimator under the model (13.1.1) and the LS estimator is a linear estimator, it follows from Theorem 12.2.1 that
(13.1.10) $(X'X)^{-1}X'\Sigma X(X'X)^{-1} \geq (X'\Sigma^{-1}X)^{-1}$.
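The matrix inequality (13.1.10) says that the LS variance-covariance matrix minus the GLS variance-covariance matrix is nonnegative definite, which can be checked by inspecting the smallest characteristic root of the difference. The sketch below assumes NumPy and a hypothetical diagonal variance-covariance matrix; it illustrates the inequality for one design and is not a proof.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 40
X = np.column_stack([np.ones(T), rng.normal(size=T)])

# Hypothetical heteroscedastic variance-covariance matrix (illustration only).
sigma = np.diag(np.linspace(0.5, 5.0, T))

# LS variance under the model: (X'X)^{-1} X' S X (X'X)^{-1}, as in (13.1.9).
XtX_inv = np.linalg.inv(X.T @ X)
var_ls = XtX_inv @ X.T @ sigma @ X @ XtX_inv

# GLS variance: (X' S^{-1} X)^{-1}, as in (13.1.7).
var_gls = np.linalg.inv(X.T @ np.linalg.solve(sigma, X))

# The difference should be nonnegative definite: smallest root >= 0 up to rounding.
diff = var_ls - var_gls
min_root = np.linalg.eigvalsh((diff + diff.T) / 2).min()
```

Symmetrizing `diff` before calling `eigvalsh` only guards against rounding asymmetry. A `min_root` that is zero up to rounding would correspond to a case where equality holds, for example when the variance-covariance matrix is a scalar multiple of the identity.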
The above can also be directly verified using theorems in Chapter 11.
Although strict inequality generally holds in (13.1.10), there are cases where equality holds. (See Amemiya, 1985, section 6.1.3.)
The consistency and the asymptotic normality of the GLS estimator follow from Section 12.2.4. The LS estimator can also be shown to be consistent and asymptotically normal under general conditions in the model (13.1.1).
If $\Sigma$ is unknown, its elements cannot be consistently estimated unless we specify them to be functions of a finite number of parameters. In the next three subsections we consider various parameterizations of $\Sigma$. Let $\theta$ be a vector of unknown parameters of a finite dimension. In each of the models to be discussed, we shall indicate how $\theta$ can be consistently estimated. Denoting the consistent estimator by $\hat{\theta}$, we can define the feasible generalized least squares (FGLS) estimator, denoted by $\hat{\beta}_F$, by
(13.1.11) $\hat{\beta}_F = [X'\Sigma(\hat{\theta})^{-1}X]^{-1}X'\Sigma(\hat{\theta})^{-1}y$,
where the dependence of $\Sigma$ on $\theta$ is expressed by the symbol $\Sigma(\theta)$. Under general conditions, $\hat{\beta}_F$ is consistent, and $\sqrt{T}(\hat{\beta}_F - \beta)$ has the same limit distribution as $\sqrt{T}(\hat{\beta}_G - \beta)$.
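The two-step logic of FGLS can be sketched under one simple, assumed parameterization: a diagonal variance-covariance matrix with one error variance in the first half of the sample and another in the second half. This particular parameterization, the simulated data, and all variable names are hypothetical choices for illustration and are not taken from the text; NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 200
half = T // 2
X = np.column_stack([np.ones(T), rng.normal(size=T)])
beta = np.array([1.0, 2.0])                    # assumed true coefficients

# Assumed model: error variance 1 in the first half, 9 in the second half,
# so theta consists of these two variances.
true_var = np.r_[np.full(half, 1.0), np.full(T - half, 9.0)]
y = X @ beta + rng.normal(size=T) * np.sqrt(true_var)

# Step 1: estimate theta from LS residuals (mean squared residual in each half).
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_ls
theta_hat = np.array([np.mean(resid[:half] ** 2), np.mean(resid[half:] ** 2)])

# Step 2: plug the estimated diagonal matrix into the GLS formula.
sigma_hat_diag = np.r_[np.full(half, theta_hat[0]), np.full(T - half, theta_hat[1])]
Xw = X / sigma_hat_diag[:, None]               # inverse of a diagonal matrix times X
beta_fgls = np.linalg.solve(X.T @ Xw, Xw.T @ y)
```

Because the assumed matrix is diagonal, multiplying by its inverse reduces to dividing each row by the estimated variance, which avoids forming a $T \times T$ matrix.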