Using gretl for Principles of Econometrics, 4th Edition
Two-Stage Least Squares
To perform Two-Stage Least Squares (TSLS) or Instrumental Variables (IV) estimation, you need instruments that are correlated with your independent variables but not correlated with the errors of your model. In the wage model, we will need some variables that are correlated with education, but not with the model’s errors. We propose that mother’s education (mothereduc) is suitable. The mother’s education is unlikely to enter the daughter’s wage equation directly, but it is reasonable to believe that daughters of more highly educated mothers tend to get more education themselves. These propositions can and will be tested later. In the meantime, estimating the wage equation using the instrumental variable estimator is carried out in the following example. First, load the mroz.gdt data into gretl. Then, to open the basic gretl dialog box that computes the IV estimator, choose Model>Instrumental Variables>Two-Stage Least Squares from the pull-down menu as shown below in Figure 10.1. This opens the dialog box shown in Figure 10.2.
Figure 10.1: Two-stage least squares estimator from the pull-down menus
In this example we choose l_wage as the dependent variable, put all of the desired instruments into the Instruments box, and put all of the independent variables, including the one(s) measured with error, into the Independent Variables box. If some of the right-hand side variables for the model are exogenous, they should be referenced in both lists. That’s why the const, exper, and sq_exper variables appear in both places. Press the OK button and the results are found in Table 10.1. Notice that gretl ignores the sound advice offered by the authors of your textbook and computes an R2. Keep in mind, though, gretl computes this as the squared correlation between observed and fitted values of the dependent variable, and you should resist the temptation to interpret R2 as the proportion of variation in l_wage accounted for by the model.
If you prefer to use a script, the syntax is very simple.
TSLS, using observations 1-428
Dependent variable: l_wage
Instrumented: educ
Instruments: const mothereduc exper sq_exper

             Coefficient      Std. Error       z         p-value
const         0.198186        0.472877         0.4191    0.6751
educ          0.0492630       0.0374360        1.3159    0.1882
exper         0.0448558       0.0135768        3.3039    0.0010
sq_exper     -0.000922076     0.000406381     -2.2690    0.0233

Mean dependent var    1.190173    S.D. dependent var    0.723198
Sum squared resid     195.8291    S.E. of regression    0.679604
R2                    0.135417    Adjusted R2           0.129300
F(3,424)              7.347957    P-value(F)            0.000082
Log-likelihood       -3127.203    Akaike criterion      6262.407
Schwarz criterion     6278.643    Hannan-Quinn          6268.819
Table 10.1: Results from two-stage least squares estimation of the wage equation.
tsls
Arguments: depvar indepvars ; instruments
Options:   --vcv (print covariance matrix)
           --robust (robust standard errors)
           --liml (use Limited Information Maximum Likelihood)
           --gmm (use the Generalized Method of Moments)
Example:   tsls y1 0 y2 y3 x1 x2 ; 0 x1 x2 x3 x4 x5 x6
The basic syntax is this: tsls y x ; z, where y is the dependent variable, x are the regressors, and z the instruments. Thus, the gretl command tsls calls for the IV estimator to be used and it is followed by the linear model you wish to estimate.
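As an illustration of how an option attaches to the command, the following minimal sketch estimates the wage equation with robust standard errors. It assumes that mroz.gdt can be found in gretl's data path and that l_wage and sq_exper are built with the logs and square commands; only the --robust flag from the option list above is used.

```hansl
# Assumption: mroz.gdt is installed where gretl can find it
open mroz.gdt
# Assumption: l_wage and sq_exper are created this way
logs wage
square exper
# Regressors come first, then a semicolon, then the instruments;
# the option flag goes at the end of the command line
tsls l_wage const educ exper sq_exper ; const exper sq_exper mothereduc --robust
```

The coefficient estimates should be unaffected by --robust; only the reported standard errors and the statistics built from them change.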
The script for the example above is
list x = const educ exper sq_exper
list z = const exper sq_exper mothereduc
tsls l_wage x ; z
In the script, the regressors for the wage equation are collected into a list called x. The instruments, which should include all exogenous variables in the model including the constant, are placed in the list called z. Notice that z includes all of the exogenous variables in x. Here the generic dependent variable, y, is replaced with its actual name in the example, l_wage.
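The earlier remark about the R2 that gretl reports for TSLS can be checked directly. The sketch below saves the TSLS fitted values and squares their correlation with l_wage. The series name yhat_iv and the scalar name r2_check are made up for illustration, and the data setup (logs, square) is an assumption about how the variables were built.

```hansl
# Assumption: mroz.gdt is installed where gretl can find it
open mroz.gdt
logs wage
square exper
# Keep only the observations with a positive wage, so l_wage is defined
smpl wage>0 --restrict

list x = const educ exper sq_exper
list z = const exper sq_exper mothereduc
tsls l_wage x ; z

# Fitted values from the TSLS model (hypothetical series name)
series yhat_iv = $yhat
# Squared correlation between observed and fitted values
scalar r2_check = corr(l_wage, yhat_iv)^2
printf "Squared correlation = %.6f\n", r2_check
```

If gretl computes R2 as described, the printed value should agree with the R2 line of Table 10.1.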
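The two stages can also be carried out by hand using least squares. The first stage regresses educ on the instruments and saves the fitted values:
smpl wage>0 --restrict
ols educ z
series educ_hat = $yhat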
Notice that the sample had to be restricted to those wages greater than zero using the --restrict option. If you fail to do this, the first stage regression will be estimated with all 753 observations instead of the 428 used in tsls. TSLS is implicitly limiting the first stage estimation to the nonmissing values of l_wage. When educ_hat replaces educ in a second-stage least squares regression of l_wage, the coefficient estimates are the same as those in Table 10.1, but the standard errors are not.
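The comparison just described can be sketched as a complete two-step script. This is an illustration rather than the text's own listing: the second-stage command is reconstructed from the description, and the data setup (logs, square) is an assumption about how l_wage and sq_exper were created.

```hansl
# Assumption: mroz.gdt is installed where gretl can find it
open mroz.gdt
logs wage
square exper
list z = const exper sq_exper mothereduc

# First stage: regress educ on all instruments, using only the
# observations that have a positive wage
smpl wage>0 --restrict
ols educ z
series educ_hat = $yhat

# Second stage: replace educ with its first-stage fitted values
ols l_wage const educ_hat exper sq_exper

# For comparison: the one-step estimator with correct standard errors
tsls l_wage const educ exper sq_exper ; z
```

The second-stage standard errors differ because least squares builds its residuals from educ_hat, whereas tsls builds them from the observed educ. For inference, the tsls output is the one to use.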