A COMPANION TO Theoretical Econometrics

# Limited-Information Estimators of Structural Parameters

We now consider the estimation of an equation in the linear simultaneous system. Note that the $i$th rows of $B$ and $\Gamma$ contain the coefficients in the $i$th structural equation of the system. For the moment, let us consider the first equation of the system and, after imposing zero restrictions and a normalization rule, write it as

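$$y_{t1} = -\sum_{i=2}^{G_1} \beta_{1i}\, y_{ti} - \sum_{j=1}^{K_1} \gamma_{1j}\, x_{tj} + u_{t1} \quad\text{or}\quad y_1 = Y_1\beta + X_1\gamma + u_1 = Z\delta + u_1, \tag{6.3}$$

where $y_{ti}$ = $(t, i)$ element of the observation matrix $Y$, $x_{tj}$ = $(t, j)$ element of $X$, $u_{ti}$ = $(t, i)$ element of $U$, $\beta_{ij}$ = $(i, j)$ element of $B$, $\gamma_{ij}$ = $(i, j)$ element of $\Gamma$, $u_1$ = first column of $U$, $X = (X_1, X_2)$, $Y = (y_1, Y_1, Y_2)$, $Z = (Y_1, X_1)$, $\delta' = (\beta', \gamma')$, $-\beta' = (\beta_{12}, \beta_{13}, \ldots, \beta_{1G_1})$, and $-\gamma' = (\gamma_{11}, \gamma_{12}, \ldots, \gamma_{1K_1})$.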

Thus, the normalization rule is $\beta_{11} = 1$ and the prior restrictions impose the exclusion of the last $G - G_1$ endogenous variables and the last $K - K_1$ exogenous variables from the first equation.

From assumption A4 it follows that $u_1 \sim (0, \sigma_{11} I)$, where $\sigma_{11}$ is the $(1, 1)$ element of $\Sigma$. By assumption A2, $X_1$ and $u_1$ are independent of each other. However, in general $Y_1$ and $u_1$ will be correlated.

We shall consider two general categories of estimators of (6.3):

1. limited information or single equation methods, and

2. full information or system methods.

Together with the specification (6.3) for the first equation, all that is needed for the limited information estimators are the reduced form equations for the "included endogenous variables," namely, $(y_1, Y_1)$. Under the classical assumptions we have listed for the linear SEM (including A4'), this requirement reduces to a specification of the exogenous variables in the system, since the reduced form equations are linear in the exogenous variables with additive disturbances that are independent and identically distributed multivariate Gaussian across sample observations. Full information methods, on the other hand, require the exact specification of the other structural equations composing the system.

In (6.3), the ordinary least squares (OLS) estimator of $\delta$ is obtained directly from the linear regression of $y_1$ on $Z$:

$$\hat\delta_{OLS} = (Z'Z)^{-1}Z'y_1 = \delta + (Z'Z)^{-1}Z'u_1. \tag{6.4}$$

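Because of the nonzero covariance between $Y_1$ and $u_1$, this estimator is inconsistent, in general:

$$\begin{aligned}
\operatorname{plim}\hat\delta_{OLS} - \delta &= \operatorname{plim}\,[(Z'Z/T)^{-1}(Z'u_1/T)] \\
&= [\operatorname{plim}\,(Z'Z/T)^{-1}]\,[\operatorname{plim}\,(Z'u_1/T)] \\
&= [\operatorname{plim}\,(Z'Z/T)^{-1}] \begin{pmatrix} \Omega_{21} - \Omega_{22}\beta \\ 0 \end{pmatrix},
\end{aligned}$$

where $\Omega_{22}$ is the covariance matrix of the $t$th row of $Y_1$ and $\Omega_{21}$ is the covariance vector between the $t$th rows of $Y_1$ and $y_1$.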
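The inconsistency of OLS can be illustrated with a small simulation. The sketch below assumes a hypothetical design that is not taken from the text: three standard normal exogenous variables, one included endogenous regressor, and illustrative values $\beta = 0.5$, $\gamma = 1$. Under these assumptions the probability limit of the OLS estimator can be worked out directly from $[\operatorname{plim}(Z'Z/T)^{-1}][\operatorname{plim}(Z'u_1/T)]$ and compared with the large-sample estimate.

```python
import numpy as np

# Hypothetical design; every number below is an illustrative assumption.
rng = np.random.default_rng(0)
T = 20_000
beta, gamma = 0.5, 1.0                          # true structural coefficients

X = rng.standard_normal((T, 3))                 # exogenous variables x1, x2, x3
v2 = rng.standard_normal(T)                     # reduced-form disturbance of y2
u1 = 0.8 * v2 + 0.6 * rng.standard_normal(T)    # Var(u1) = 1, Cov(u1, v2) = 0.8

y2 = X.sum(axis=1) + v2                         # included endogenous regressor (Y1)
y1 = beta * y2 + gamma * X[:, 0] + u1           # structural equation with X1 = x1

Z = np.column_stack([y2, X[:, 0]])              # Z = (Y1, X1)
delta_ols = np.linalg.solve(Z.T @ Z, Z.T @ y1)

# Probability limit implied by this design:
# plim Z'Z/T = [[4, 1], [1, 1]] and plim Z'u1/T = (0.8, 0)'.
plim_ols = np.array([beta, gamma]) + np.linalg.solve(
    np.array([[4.0, 1.0], [1.0, 1.0]]), np.array([0.8, 0.0])
)

print("OLS estimate:", delta_ols)
print("plim of OLS :", plim_ols)
print("true delta  :", np.array([beta, gamma]))
```

With a sample this large, the OLS estimate should sit close to its probability limit rather than to the true coefficients, which is the sense in which the estimator is inconsistent.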

A generic class of estimators of $\delta$ in (6.4) can be obtained by using an instrument matrix, say $W$, of the same size as $Z$ to form the so-called limited information instrumental variable (IV) estimator satisfying the modified normal equations

$$W'y_1 = W'Z\hat\delta_{IV} \quad\text{or}\quad \hat\delta_{IV} = (W'Z)^{-1}W'y_1. \tag{6.5}$$

The minimal requirements on the instrument matrix W, which can be either stochastic or not, are

$$|W'Z| \neq 0 \text{ for any } T \quad\text{and}\quad \operatorname{plim}\,(W'Z/T) \text{ is nonsingular.} \tag{6.6}$$

The ordinary least squares estimator belongs to this class, with the instrument matrix, say $W_{OLS}$, equal to $Z$ itself.

With (6.6) satisfied, a necessary and sufficient condition for $\hat\delta_{IV}$ to be consistent is

$$\operatorname{plim}\,(W'u_1/T) = 0. \tag{6.7}$$

Condition (6.7) is not satisfied in the case of OLS and hence OLS would be inconsistent in general. Note that a nonstochastic matrix W satisfying (6.6) will satisfy (6.7) as well.
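As a minimal sketch of the modified normal equations (6.5), the function below solves $W'y_1 = W'Z\hat\delta_{IV}$ for a given instrument matrix. The function name `iv_estimator` and the tiny data set are hypothetical and serve only to show that the choice $W = Z$ reproduces ordinary least squares.

```python
import numpy as np

def iv_estimator(W, Z, y):
    """Solve the modified normal equations W'y = W'Z d for d."""
    W, Z, y = np.asarray(W, float), np.asarray(Z, float), np.asarray(y, float)
    if W.shape != Z.shape:
        raise ValueError("W must have the same shape as Z")
    return np.linalg.solve(W.T @ Z, W.T @ y)

# Made-up data: 6 observations, 2 regressors (a constant and one variable).
Z = np.array([[1.0, 0.2], [1.0, 1.1], [1.0, 1.9],
              [1.0, 3.2], [1.0, 3.8], [1.0, 5.1]])
y = np.array([0.9, 2.1, 2.8, 4.3, 4.9, 6.2])

d_ols = iv_estimator(Z, Z, y)                     # W = Z gives OLS
d_lstsq = np.linalg.lstsq(Z, y, rcond=None)[0]    # reference least-squares fit
print(d_ols, d_lstsq)
```

Using `np.linalg.solve` rather than forming an explicit inverse is a numerical design choice; it fails loudly when $W'Z$ is singular, which corresponds to a violation of the first part of (6.6).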

The limited information estimators that we discuss presently, and that have been developed as alternatives to OLS, can be interpreted as members of this class of IV estimators corresponding to stochastic matrices. First of all, since $X_1$ is exogenous, we have $\operatorname{plim} X_1'u_1/T = 0$ and so, for consistency, we can take an instrument matrix of the form

$$W = (W_y, X_1), \tag{6.8}$$

where $W_y$ is the instrument matrix for $Y_1$ chosen so that

$$\operatorname{plim} W_y'u_1/T = 0. \tag{6.9}$$

One possibility is to use $E(Y_1)$ for $W_y$, since $E(Y_1) = X\Pi_1'$ and hence $\operatorname{plim}\,(\Pi_1 X'u_1/T) = \Pi_1 \operatorname{plim}\,(X'u_1/T) = 0$. However, there is still one problem: $\Pi_1$ is not known and $E(Y_1)$ is not observable. To remedy this, we can use an estimate of $\Pi_1$, say $\hat\Pi_1$, and as long as $\operatorname{plim}\hat\Pi_1$ is finite, we get $\operatorname{plim}\,(\hat\Pi_1 X'u_1/T) = (\operatorname{plim}\hat\Pi_1)\,[\operatorname{plim}\,(X'u_1/T)] = 0$, and consequently the instrument matrix $(X\hat\Pi_1', X_1)$ provides a subclass of consistent instrumental variable estimators of $\delta$. Note in this discussion that $\hat\Pi_1$ need not be consistent for $\Pi_1$; all that is needed is that $\hat\Pi_1$ have a finite probability limit and produce instruments that satisfy (6.6).
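The point that the first-stage coefficients need not be consistent can be checked numerically. The sketch below assumes a hypothetical design (three standard normal exogenous variables, true reduced-form coefficients $(1, 1, 1)$, illustrative $\beta = 0.5$ and $\gamma = 1$) and replaces the reduced-form coefficients by a deliberately wrong fixed vector, `pi_wrong`, chosen only so that the rank condition in (6.6) holds.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 20_000
delta = np.array([0.5, 1.0])                    # true (beta, gamma), assumed values

X = rng.standard_normal((T, 3))                 # exogenous variables x1, x2, x3
v = rng.standard_normal(T)
u1 = 0.8 * v + 0.6 * rng.standard_normal(T)     # correlated with v, so Y1 is endogenous
Y1 = X.sum(axis=1) + v                          # true reduced-form coefficients (1, 1, 1)
X1 = X[:, 0]
Z = np.column_stack([Y1, X1])
y1 = Z @ delta + u1

# A deliberately "wrong" but fixed coefficient vector used in place of Pi_1.
pi_wrong = np.array([0.3, 2.0, -0.5])
W = np.column_stack([X @ pi_wrong, X1])         # instrument matrix (X Pi_hat', X1)

delta_iv = np.linalg.solve(W.T @ Z, W.T @ y1)   # IV estimator as in (6.5)
delta_ols = np.linalg.solve(Z.T @ Z, Z.T @ y1)  # OLS for comparison

print("IV with wrong Pi:", delta_iv)
print("OLS             :", delta_ols)
```

In this design the IV estimate should land near the true coefficients even though `pi_wrong` bears no relation to the true reduced form, while OLS remains visibly off. The price of a poor choice is efficiency, not consistency.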

One member of this subclass is the two-stage least squares (2SLS) estimator, where $\Pi_1$ is estimated from the unrestricted least squares regression of $Y_1$ on $X$; that is, $\hat\Pi_1' = (X'X)^{-1}X'Y_1$, and the instrument matrix for $Z$ is the matrix of fitted values from the regression of $Z$ on $X$:

$$W_{2SLS} = (X(X'X)^{-1}X'Y_1, X_1) = P_X Z. \tag{6.10}$$

The terminology for this estimator derives from the fact that it can be interpreted as least squares applied twice. The procedure, which first regresses $Y_1$ on $X$ to get $P_X Y_1$ and then regresses $y_1$ on $P_X Y_1$ and $X_1$, produces in the second-step regression exactly the 2SLS estimator defined above.

The 2SLS estimator also can be interpreted as a generalized least squares (GLS) estimator in the linear model obtained by premultiplying (6.3) by X'.
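The three descriptions of 2SLS given here (IV with instrument matrix $P_X Z$, least squares applied twice, and GLS on the equation premultiplied by $X'$) can be compared numerically. The sketch below uses simulated data from a hypothetical design; the coefficient values and sample size are illustrative assumptions. For the GLS version it relies on the fact that the disturbance $X'u_1$ has covariance matrix proportional to $X'X$ when $X$ is treated as fixed.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500
X = rng.standard_normal((T, 3))                     # all exogenous variables
X1 = X[:, :1]                                       # included exogenous variable
v = rng.standard_normal(T)
Y1 = (X @ np.array([1.0, 1.0, 1.0]) + v)[:, None]   # included endogenous regressor
u1 = 0.8 * v + 0.6 * rng.standard_normal(T)
y1 = 0.5 * Y1[:, 0] + 1.0 * X1[:, 0] + u1
Z = np.hstack([Y1, X1])                             # Z = (Y1, X1)

# (a) IV with W = P_X Z, computed as fitted values from regressing Z on X.
PxZ = X @ np.linalg.solve(X.T @ X, X.T @ Z)
d_iv = np.linalg.solve(PxZ.T @ Z, PxZ.T @ y1)

# (b) Two explicit least-squares stages.
Y1_hat = X @ np.linalg.lstsq(X, Y1, rcond=None)[0]
d_two_stage = np.linalg.lstsq(np.hstack([Y1_hat, X1]), y1, rcond=None)[0]

# (c) GLS on X'y1 = X'Z d + X'u1 with weight matrix (X'X)^{-1}.
XtX_inv = np.linalg.inv(X.T @ X)
XtZ, Xty = X.T @ Z, X.T @ y1
d_gls = np.linalg.solve(XtZ.T @ XtX_inv @ XtZ, XtZ.T @ XtX_inv @ Xty)

print(d_iv, d_two_stage, d_gls)
```

The three vectors agree up to floating-point error because $P_X$ is symmetric and idempotent, so that $(P_XZ)'Z = (P_XZ)'(P_XZ) = Z'X(X'X)^{-1}X'Z$.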

Other preliminary estimates of $\Pi_1$ have also been proposed in the literature. One alternative is to estimate each structural equation by ordinary least squares, thus obtaining $\hat B_{OLS}$ and $\hat\Gamma_{OLS}$, and then to use the appropriate submatrix of the derived reduced form coefficient matrix $\hat\Pi = -(\hat B_{OLS})^{-1}\hat\Gamma_{OLS}$, say $\hat\Pi_1^{(0)}$, to construct the instrument matrix

$$(X\hat\Pi_1^{(0)\prime}, X_1). \tag{6.11}$$

Although $\hat\Pi_1^{(0)}$ itself is inconsistent, the IV estimate of $\delta$ based on (6.11) as instrument for $Z$ will be consistent since (6.9) is satisfied.

The limited information instrumental variable efficient (LIVE) estimator is based on the instrument matrix

$$W_{LIVE} = (X\hat\Pi_1^{(1)\prime}, X_1),$$

where $\hat\Pi_1^{(1)}$ is a consistent estimator of $\Pi_1$ derived from some initial consistent estimates of $B$ and $\Gamma$. This would be a two-step procedure if the initial consistent estimates of $B$ and $\Gamma$ are themselves obtained by an instrumental variable procedure. In fact, one way of getting these initial consistent estimates of $B$ and $\Gamma$ is by using $\hat\Pi^{(0)}$, as in (6.11), to generate instruments for the first-step consistent estimation of $B$ and $\Gamma$. Further iteration of this sequential procedure leads to the so-called iterated instrumental variable (IIV) procedure.

Yet another alternative to 2SLS is the so-called modified two-stage least squares (M2SLS). Like 2SLS, this is a two-step regression procedure; but here, in the first stage, we would regress $Y_1$ on $H$ instead of $X$, where $H$ is a $T \times h$ data matrix of full column rank and $\operatorname{rank}(\bar P_{X_1}H) \ge G_1 - 1$ for $\bar P_{X_1} = I - P_{X_1}$. We can further show that this estimator will be exactly equivalent to the instrumental variable method with $(P_H Y_1, X_1)$ as the instrument matrix if the column space of $H$ contains the column space of $X_1$ (see Mariano, 1977). Because of this, a suggested manner of constructing $H$ is to start with $X_1$ and then add at least $(G_1 - 1)$ more of the remaining $K_2$ exogenous variables or, alternatively, the first $G_1 - 1$ principal components of $\bar P_{X_1}X_2$.

Another instrumental variable estimator is Theil's $k$-class estimator. In this estimator, the instrument matrix is a linear combination of the instrument matrices for OLS and 2SLS. Thus, for $W_{(k)} = kW_{2SLS} + (1-k)W_{OLS} = kP_XZ + (1-k)Z$, the $k$-class estimator of $\delta$ is

$$\hat\delta_{(k)} = (W_{(k)}'Z)^{-1}W_{(k)}'y_1 = \delta + (W_{(k)}'Z)^{-1}W_{(k)}'u_1. \tag{6.12}$$

For consistency, we see from (6.12) that

$$\operatorname{plim} W_{(k)}'u_1/T = [\operatorname{plim}\,(1-k)]\,[\operatorname{plim} Z'u_1/T], \tag{6.13}$$

assuming that plim $k$ is finite. Thus, for (6.13) to be equal to zero (and hence for the $k$-class estimator to be consistent), a necessary and sufficient condition is $\operatorname{plim}\,(1-k) = 0$.
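A short sketch of the $k$-class estimator in (6.12) follows. The helper name `k_class` and the simulated data are hypothetical; the example only checks the two boundary cases named in the text, $k = 0$ (OLS) and $k = 1$ (2SLS).

```python
import numpy as np

def k_class(y1, Z, X, k):
    """k-class estimator with instrument matrix W_(k) = k P_X Z + (1 - k) Z."""
    PxZ = X @ np.linalg.solve(X.T @ X, X.T @ Z)
    W_k = k * PxZ + (1.0 - k) * Z
    return np.linalg.solve(W_k.T @ Z, W_k.T @ y1)

# Simulated data from an assumed design (illustrative values only).
rng = np.random.default_rng(3)
T = 400
X = rng.standard_normal((T, 3))
v = rng.standard_normal(T)
Y1 = X.sum(axis=1) + v                       # endogenous regressor
u1 = 0.8 * v + 0.6 * rng.standard_normal(T)
Z = np.column_stack([Y1, X[:, 0]])           # Z = (Y1, X1)
y1 = Z @ np.array([0.5, 1.0]) + u1

d_k0 = k_class(y1, Z, X, 0.0)                # should equal OLS
d_k1 = k_class(y1, Z, X, 1.0)                # should equal 2SLS

d_ols = np.linalg.lstsq(Z, y1, rcond=None)[0]
PxZ = X @ np.linalg.solve(X.T @ X, X.T @ Z)
d_2sls = np.linalg.solve(PxZ.T @ Z, PxZ.T @ y1)
print(d_k0, d_ols, d_k1, d_2sls)
```

Intermediate values of `k` trace out a path between the two estimates; only sequences with $\operatorname{plim}(1-k) = 0$ remove the term involving $Z'u_1/T$ in (6.13).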

The limited information maximum likelihood (LIML) estimator, though based, as the term connotes, on the principle of maximizing a certain likelihood function, can also be given an instrumental variable interpretation. In fact, as we shall show presently, it is a member of the $k$-class of estimators in (6.12). Essentially, the LIML estimator of $\beta$ and $\gamma$ maximizes the likelihood of the included endogenous variables subject to the identifiability restrictions imposed on the equation being estimated. This is limited information (rather than full information) maximum likelihood in the sense that the likelihood function considered pertains only to those endogenous variables appearing in the estimated equation; endogenous variables excluded from this equation are thus disregarded. Also, identifiability restrictions on other equations in the system are not taken into account in the constrained maximization of the appropriate likelihood function.

For an explicit development of the LIML estimator, we start with the non-normalized version of the equation to be estimated: $Y_*\beta_* = (y_1, Y_1)\beta_* = X_1\gamma + u_1$. Thus, for the moment, we take the first row of $B$ as $(\beta_*', 0')$. The reduced form equations for $Y_*$ are $Y_* = X\Pi_*' + V_* = X_1\Pi_{*1}' + X_2\Pi_{*2}' + V_*$. From the relationship $B\Pi = -\Gamma$, we get $\Pi_{*1}'\beta_* = \gamma$ and $\Pi_{*2}'\beta_* = 0$. For identifiability of the first equation, a necessary and sufficient condition is $\operatorname{rank}\Pi_{*2} = G_1 - 1$.

The LIML estimator thus maximizes the likelihood function for $Y_*$ subject to the restriction that $\Pi_{*2}'\beta_* = 0$. This constrained maximization process reduces to the minimization of

$$\nu = (\beta_*'A\beta_*)/(\beta_*'S\beta_*) = 1 + (\beta_*'W\beta_*)/(\beta_*'S\beta_*) \tag{6.14}$$

with respect to $\beta_*$, for

$$S = Y_*'\bar P_X Y_*, \quad W = Y_*'(P_X - P_{X_1})Y_*, \quad A = S + W = Y_*'\bar P_{X_1}Y_*, \tag{6.15}$$

where $\bar P_X = I - P_X$ and $\bar P_{X_1} = I - P_{X_1}$.

Solving this minimization problem, we get $\hat\beta_{*LIML}$ = a characteristic vector of $A$ (with respect to $S$) corresponding to $h$, where $h$ is the smallest root of $|A - \nu S| = 0$ and is equal to $(\hat\beta_{*LIML}'A\hat\beta_{*LIML})/(\hat\beta_{*LIML}'S\hat\beta_{*LIML})$.

The above derivation of LIML also provides a least variance ratio interpretation for it. In (6.14), $\beta_*'W\beta_*$ is the marginal contribution (regression sum of squares) of $X_2$, given $X_1$, in the regression of $Y_*\beta_*$ on $X$, while $E\{\beta_*'S\beta_*/(T-K)\} = \sigma_{11}$.

Thus, LIML minimizes the explained sum of squares of $Y_*\beta_*$ due to $X_2$ given $X_1$, relative to a stochastic proxy for $\sigma_{11}$. On the other hand, $\hat\beta_{*2SLS}$ simply minimizes $\beta_*'W\beta_*$ in absolute terms.

For the estimator $\hat\beta_{*LIML}$ as described above to be uniquely determined, a normalization rule needs to be imposed. We shall use the normalization that the first element of $\hat\beta_{*LIML}$ is equal to unity, as in the case of all the limited information estimators discussed so far. In this case, it can be easily shown that the LIML estimator of $\beta$ and $\gamma$ in the normalized equation (6.3) is a $k$-class estimator. The value of $k$ which gives the LIML estimator is $h$, where $h$ is the smallest characteristic root of $A$ with respect to $S$. Note that because $A = S + W$, we also have $h = 1 + \ell$, where $\ell$ is the smallest characteristic root of $W$ (with respect to $S$). Thus we can interpret LIML as a linear combination of OLS and 2SLS, with $k > 1$. Indeed, it can be shown formally that the 2SLS estimate of $\beta$ lies between the OLS and LIML estimates.
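The equivalence between the characteristic-root form of LIML and the $k$-class estimator with $k = h$ can be checked numerically. The sketch below is written under stated assumptions: the data come from a hypothetical simulated design, the function names `residuals`, `liml`, and `k_class` are illustrative, and the sign convention $\beta_* = (1, -\beta')'$ implied by (6.3) is used when normalizing the characteristic vector. The generalized eigenproblem $A b = \nu S b$ is reduced to a symmetric one through a Cholesky factor of $S$, which is one of several possible numerical routes.

```python
import numpy as np

def residuals(M, Y):
    """Residuals from regressing the columns of Y on M, i.e. (I - P_M) Y."""
    return Y - M @ np.linalg.lstsq(M, Y, rcond=None)[0]

def liml(y1, Y1, X1, X):
    """LIML via the smallest root of |A - nu S| = 0, first element normalized to 1."""
    Y_star = np.column_stack([y1, Y1])
    R_X, R_X1 = residuals(X, Y_star), residuals(X1, Y_star)
    S = R_X.T @ R_X                       # Y*'(I - P_X) Y*
    A = R_X1.T @ R_X1                     # Y*'(I - P_X1) Y*
    # Reduce A b = nu S b to a symmetric problem using S = L L'.
    L_inv = np.linalg.inv(np.linalg.cholesky(S))
    roots, vecs = np.linalg.eigh(L_inv @ A @ L_inv.T)   # roots in ascending order
    h = roots[0]
    b_star = L_inv.T @ vecs[:, 0]
    b_star = b_star / b_star[0]           # normalization: first element equals one
    beta = -b_star[1:]                    # because beta_* = (1, -beta')'
    gamma = np.linalg.lstsq(X1, Y_star @ b_star, rcond=None)[0]
    return h, np.concatenate([beta, gamma])

def k_class(y1, Z, X, k):
    W_k = Z - k * residuals(X, Z)         # equals k P_X Z + (1 - k) Z
    return np.linalg.solve(W_k.T @ Z, W_k.T @ y1)

# Simulated, overidentified design (illustrative values only).
rng = np.random.default_rng(4)
T = 500
X = rng.standard_normal((T, 3))
X1 = X[:, :1]
v = rng.standard_normal(T)
Y1 = (X.sum(axis=1) + v)[:, None]
u1 = 0.8 * v + 0.6 * rng.standard_normal(T)
y1 = 0.5 * Y1[:, 0] + 1.0 * X1[:, 0] + u1
Z = np.hstack([Y1, X1])

h_val, d_liml = liml(y1, Y1, X1, X)
d_k = k_class(y1, Z, X, h_val)
print("h =", h_val)
print("LIML (eigenvector):", d_liml)
print("k-class with k = h:", d_k)
```

Because $W$ is positive semidefinite, the computed root satisfies $h \ge 1$ up to rounding, and the two coefficient vectors coincide up to floating-point error.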

Also note that $\hat\beta_{LIML}$ and $\hat\gamma_{LIML}$ can be characterized equivalently as the maximum likelihood estimates of $\beta$ and $\gamma$ based on the "limited information" model

$$y_1 = Y_1\beta + X_1\gamma + u_1$$

$$Y_1 = X\Pi_1' + V_1.$$

If the equation being estimated is exactly identified by zero restrictions, then the indirect least squares (ILS) estimator of $\beta$ and $\gamma$ is well-defined. This is obtained directly from the unrestricted least squares estimate $\hat\Pi_*$ and is the solution to the system of equations taken from $\hat\Pi_{*2}'\beta_* = 0$ and $\hat\Pi_{*1}'\beta_* = \gamma$ after setting $\beta_*' = (1, -\beta')$. We can further verify that if the equation being estimated is exactly identified, then the following estimators are exactly equivalent: 2SLS, LIML, ILS, and IV using $(X_1, X_2)$ as the instrument matrix for the regressors $(X_1, Y_1)$.
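The equivalence under exact identification can be illustrated for three of the estimators (ILS, 2SLS, and IV with the exogenous variables as instruments). The sketch below assumes a hypothetical design with one included endogenous regressor and exactly one excluded exogenous variable, so that $K_2 = G_1 - 1 = 1$. It also assumes the sign convention $\beta_* = (1, -\beta')'$ implied by (6.3) when solving the ILS equations.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 300
X = rng.standard_normal((T, 2))              # K = 2 exogenous variables
X1, X2 = X[:, :1], X[:, 1:]                  # K1 = 1 included, K2 = 1 excluded
v = rng.standard_normal(T)
Y1 = (X.sum(axis=1) + v)[:, None]            # one included endogenous regressor
u1 = 0.8 * v + 0.6 * rng.standard_normal(T)
y1 = 0.5 * Y1[:, 0] + 1.0 * X1[:, 0] + u1
Z = np.hstack([Y1, X1])                      # regressors (Y1, X1)
K1 = X1.shape[1]

# IV using the exogenous variables themselves as instruments.
W = np.hstack([X2, X1])
d_iv = np.linalg.solve(W.T @ Z, W.T @ y1)

# 2SLS with instrument matrix P_X Z.
PxZ = X @ np.linalg.solve(X.T @ X, X.T @ Z)
d_2sls = np.linalg.solve(PxZ.T @ Z, PxZ.T @ y1)

# ILS: unrestricted reduced form of Y* = (y1, Y1), then solve the restrictions.
Y_star = np.column_stack([y1, Y1])
Pi_hat_T = np.linalg.lstsq(X, Y_star, rcond=None)[0]   # K x G1, estimate of Pi_*'
Pi1_T, Pi2_T = Pi_hat_T[:K1], Pi_hat_T[K1:]            # rows for X1 and for X2
beta_ils = np.linalg.solve(Pi2_T[:, 1:], Pi2_T[:, 0])  # from Pi_*2' beta_* = 0
gamma_ils = Pi1_T[:, 0] - Pi1_T[:, 1:] @ beta_ils      # from Pi_*1' beta_* = gamma
d_ils = np.concatenate([beta_ils, gamma_ils])

print(d_iv, d_2sls, d_ils)
```

All three vectors agree up to floating-point error because, with as many instruments as regressors, each method solves the same square system $X'(y_1 - Z\delta) = 0$.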