Using gretl for Principles of Econometrics, 4th Edition
Alternatives to TSLS
There are several alternatives to the standard IV/TSLS estimator. Among them is the limited information maximum likelihood (LIML) estimator, which was first derived by Anderson and Rubin (1949). There is renewed interest in LIML because evidence indicates that it performs better than TSLS when instruments are weak. Several modifications of LIML have been suggested by Fuller (1977) and others. These estimators are unified in a common framework, along with TSLS, using the idea of a k-class of estimators. LIML suffers less from test size aberrations than the TSLS estimator, and the Fuller modification suffers less from bias. Each of these alternatives will be considered below.
In a system of $M$ simultaneous equations, let the endogenous variables be $y_1, y_2, \ldots, y_M$ and let there be $K$ exogenous variables $x_1, x_2, \ldots, x_K$. The first structural equation within this system is
$$y_1 = \alpha_2 y_2 + \beta_1 x_1 + \beta_2 x_2 + e_1 \qquad (11.7)$$
The endogenous variable $y_2$ has reduced form
$$y_2 = \pi_{12} x_1 + \pi_{22} x_2 + \cdots + \pi_{K2} x_K + v_2 = E(y_2) + v_2,$$
which is consistently estimated by least squares. The predictions from the reduced form are
$$\widehat{E(y_2)} = \hat{\pi}_{12} x_1 + \hat{\pi}_{22} x_2 + \cdots + \hat{\pi}_{K2} x_K \qquad (11.8)$$
and the residuals are $\hat{v}_2 = y_2 - \widehat{E(y_2)}$.
The two-stage least squares estimator is an IV estimator that uses the reduced-form prediction $\widehat{E(y_2)}$ as an instrument. A k-class estimator is an IV estimator that uses the instrumental variable $y_2 - k\hat{v}_2$, where $\hat{v}_2$ is the reduced-form residual. Setting $k = 1$ gives TSLS, since $y_2 - \hat{v}_2 = \widehat{E(y_2)}$. The LIML estimator uses $k = \hat{\ell}$, where $\hat{\ell}$ is the minimum ratio of the sums of squared residuals from two regressions; the explanation is given on pages 468-469 of POE4. Fuller (1977) suggested a modification that uses the k-class value
$$k = \hat{\ell} - \frac{a}{N - K} \qquad (11.9)$$
where $K$ is the total number of instrumental variables (included and excluded exogenous variables) and $N$ is the sample size. The value of $a$ is a constant, usually 1 or 4. When a model is just identified, the LIML and TSLS estimates are identical; the two diverge only in overidentified models. There is some evidence that LIML is indeed superior to TSLS when instruments are weak and models are substantially overidentified.
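For reference, the k-class family can also be written in matrix form, which is the form the matrix script later in this section computes. The notation here is an assumption introduced for illustration: $X$ holds all right-hand-side variables of the structural equation, $W$ holds all instruments, and $M_W$ is the residual-maker matrix for $W$.

```latex
M_W = I_N - W (W'W)^{-1} W'
\qquad
\hat{\beta}_k = \left[ X'(I_N - k M_W) X \right]^{-1} X'(I_N - k M_W)\, y

% Members of the family, by choice of k:
%   k = 0                          -> OLS
%   k = 1                          -> TSLS
%   k = \hat{\ell}                 -> LIML
%   k = \hat{\ell} - a/(N - K)     -> Fuller's modified LIML
```

Because $\hat{\ell} \ge 1$, with equality in the just-identified case, LIML reduces to TSLS exactly when the model is just identified.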
With the Mroz data we estimate the hours supply equation
$$hours = \beta_1 + \beta_2\, mtr + \beta_3\, educ + \beta_4\, kidsl6 + \beta_5\, nwifeinc + e \qquad (11.10)$$
A script can be used to estimate the model via LIML. The following one is used to replicate the results in Table 11B.3 of POE4.
1 open "@gretldir\data\poe\mroz.gdt"
2 square exper
3 series nwifeinc = (faminc-wage*hours)/1000
4 smpl hours>0 --restrict
5 list x = mtr educ kidsl6 nwifeinc const
6 list z1 = educ kidsl6 nwifeinc const exper
7 list z2 = educ kidsl6 nwifeinc const exper sq_exper largecity
8 list z3 = kidsl6 nwifeinc const mothereduc fathereduc
9 list z4 = kidsl6 nwifeinc const mothereduc fathereduc exper
10
11 tsls hours x; z1 --liml
12 tsls hours x; z2 --liml
13 tsls hours x; z3 --liml
14 tsls hours x; z4 --liml
LIML estimation uses the tsls command with the --liml option. The results from LIML estimation of the hours equation (11.10), using the fourth model in line 14, are given below. The variables mtr and educ are endogenous, and the external instruments are mothereduc, fathereduc, and exper. With two endogenous variables and three external instruments, the model is overidentified, with one overidentifying restriction.
LIML, using observations 1-428
Dependent variable: hours
Instrumented: mtr educ
Smallest eigenvalue = 1.00288
LR over-identification test: χ²(1) = 1.232 [0.2670]
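The earlier statement that LIML and TSLS coincide in a just-identified model can be checked with the first specification from the script. This is a minimal sketch, assuming the Mroz setup above has already been run so that the lists x and z1 exist. In that specification educ appears in both lists, so mtr is the only endogenous regressor and exper is the only external instrument.

```hansl
# Assumes the lists x and z1 from the script above are defined
# and the sample is restricted to hours > 0.

# TSLS for the just-identified specification
tsls hours x ; z1

# LIML for the same specification; the coefficient estimates
# should coincide with the TSLS estimates above
tsls hours x ; z1 --liml
```

Repeating the comparison with z4 in place of z1 should show the two sets of estimates diverging, since that specification is overidentified.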
The LIML results are easy to replicate using matrix commands. Doing so reveals some of hansl’s power.
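# endogenous variables in the equation: hours, mtr, educ
matrix y1 = { hours, mtr, educ }
# all instruments (included and excluded exogenous variables)
matrix w = { kidsl6, nwifeinc, const, exper, mothereduc, fathereduc}
# exogenous variables included in the equation
matrix z = { kidsl6, nwifeinc, const}
# residual-maker matrices for z and w
matrix Mz = I($nobs)-z*invpd(z'*z)*z'
matrix Mw = I($nobs)-w*invpd(w'*w)*w'
# residual cross-product matrices from the two sets of regressions
matrix Ez= Mz*y1
matrix W0 = Ez'*Ez
matrix Ew = Mw*y1
matrix W1 = Ew'*Ew
# the LIML value of k is the smallest eigenvalue of inv(W1)*W0
matrix G = inv(W1)*W0
matrix l = eigengen(G, null)
scalar minl = min(l)
printf "\nThe minimum eigenvalue is %.8f \n",minl
# k-class estimator evaluated at k = minl
matrix X = { mtr, educ, kidsl6, nwifeinc, const }
matrix y = { hours }
matrix kM = (I($nobs)-(minl*Mw))
matrix b =invpd(X'*kM*X)*X'*kM*y
a=rownames(b, " mtr educ kidsl6 nwifeinc const ")
printf "\nThe liml estimates are \n %.6f \n", b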
The equations that make this magic are found in Davidson and MacKinnon (2004, pp. 537-538).
The liml estimates are
 mtr      -19196.516697
 educ       -197.259108
 kidsl6      207.553130
 nwifeinc   -104.941545
 const     18587.905980
These match the estimates produced by gretl's tsls command with the --liml option.
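The same matrix expression can be reused for other members of the k-class. As a minimal sketch, assuming the matrices X, y, and Mw and the lists x and z4 from the scripts above are still in memory, setting k = 1 should reproduce TSLS. The names k, kM1, and b_tsls are introduced here for illustration only, and inv() is used in place of invpd() as a cautious choice.

```hansl
# Assumes X, y and Mw from the matrix script above are still in memory.
# With k = 1, I - k*Mw is the projection onto the instruments in w.
scalar k = 1
matrix kM1 = I($nobs) - k*Mw
matrix b_tsls = inv(X'*kM1*X)*X'*kM1*y
print b_tsls

# Built-in TSLS for the same specification, for comparison
tsls hours x ; z4
```

The rows of b_tsls follow the column order of X: mtr, educ, kidsl6, nwifeinc, const.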
set echo off
open "@gretldir\data\poe\truffles.gdt"
# reduced form estimation
list x = const ps di pf
ols q x
ols p x

# demand and supply of truffles
open "@gretldir\data\poe\truffles.gdt"
list x = const ps di pf
tsls q const p ps di; x
tsls q const p pf; x

# Hausman test
ols p x
series v = $uhat
ols q const p pf v
omit v

# supply estimation by OLS
ols q const p pf

# Fulton Fish
open "@gretldir\data\poe\fultonfish.gdt"
# Estimate the reduced form equations
list days = mon tue wed thu
list z = const stormy days
ols lquan z
ols lprice z
omit days --quiet

tsls lquan const lprice days ; z

# LIML
open "@gretldir\data\poe\mroz.gdt"
square exper
series nwifeinc = (faminc-wage*hours)/1000
smpl hours>0 --restrict
list x = mtr educ kidsl6 nwifeinc const
list z1 = educ kidsl6 nwifeinc const exper
list z2 = educ kidsl6 nwifeinc const exper sq_exper largecity
list z3 = kidsl6 nwifeinc const mothereduc fathereduc
list z4 = kidsl6 nwifeinc const mothereduc fathereduc exper

# LIML using tsls
tsls hours x; z1 --liml
tsls hours x; z2 --liml
tsls hours x; z3 --liml
tsls hours x; z4 --liml
tsls hours x; z4

# LIML using matrices
matrix y1 = { hours, mtr, educ }
matrix w = { kidsl6, nwifeinc, const, exper, mothereduc, fathereduc}
matrix z = { kidsl6, nwifeinc, const}
matrix Mz = I($nobs)-z*invpd(z'*z)*z'
matrix Mw = I($nobs)-w*invpd(w'*w)*w'
matrix Ez= Mz*y1
matrix W0 = Ez'*Ez
matrix Ew = Mw*y1
matrix W1 = Ew'*Ew
matrix G = inv(W1)*W0
matrix l = eigengen(G, null)
scalar minl = min(l)
printf "\nThe minimum eigenvalue is %.8f \n",minl
matrix X = { mtr, educ, kidsl6, nwifeinc, const }
matrix y = { hours }
matrix kM = (I($nobs)-(minl*Mw))
matrix b =invpd(X'*kM*X)*X'*kM*y
a=rownames(b, " mtr educ kidsl6 nwifeinc const ")
printf "\nThe liml estimates are \n %.6f \n", b

# Fuller's Modified LIML a=1
scalar fuller_l=minl-(1/($nobs-cols(w)))
printf "\nThe minimum eigenvalue is %.8f \n",minl
matrix X = { mtr, educ, kidsl6, nwifeinc, const }
matrix y = { hours }
matrix kM = (I($nobs)-(fuller_l*Mw))
matrix b =invpd(X'*kM*X)*X'*kM*y
a=rownames(b, " mtr educ kidsl6 nwifeinc const ")
printf "\nThe liml estimates using Fuller a=1 \n %.6f \n", b
tsls hours mtr educ kidsl6 nwifeinc const ; z4 --liml
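As a rough check on the scalar fuller_l computed in the script, equation (11.9) can be evaluated by hand. This sketch assumes the values reported earlier for the fourth model: $N = 428$ observations, $K = 6$ instruments (the columns of w), and a smallest eigenvalue of 1.00288. Because that eigenvalue is rounded, the results are approximate.

```latex
% Fuller's k with a = 1, the case computed in the script
k = \hat{\ell} - \frac{a}{N - K}
  \approx 1.00288 - \frac{1}{428 - 6}
  \approx 1.00288 - 0.00237
  \approx 1.00051

% With a = 4, the other commonly used value
k \approx 1.00288 - \frac{4}{422}
  \approx 1.00288 - 0.00948
  \approx 0.99340
```

With a = 1 the adjustment moves k from the LIML value toward 1, the TSLS value; with a = 4 it falls slightly below 1 in this sample.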