Using gretl for Principles of Econometrics, 4th Edition
Estimation
Hill et al. (2011) provide a subset of the National Longitudinal Survey, which is conducted by the U.S. Department of Labor. The database includes observations on women who, in 1968, were between the ages of 14 and 24. It then follows them through time, recording various aspects of their lives annually until 1973 and bi-annually afterwards. Our sample consists of 716 women observed in 5 years (1982, 1983, 1985, 1987 and 1988). The panel is balanced and there are 3580 total observations.
The model considered is found in equation (15.4) below.
$$\ln(wage)_{it} = \beta_{1i} + \beta_2 educ_{it} + \beta_3 exper_{it} + \beta_4 exper_{it}^2 + \beta_5 tenure_{it} + \beta_6 tenure_{it}^2 + \beta_7 south_{it} + \beta_8 union_{it} + \beta_9 black_{it} + e_{it} \qquad (15.4)$$
The main command used to estimate models with panel data in gretl is panel. Its syntax is:
panel
Arguments: depvar indepvars
Options:   --vcv (print covariance matrix)
           --fixed-effects (estimate with group fixed effects)
           --random-effects (random effects or GLS model)
           --between (estimate the between-groups model)
           --robust (robust standard errors; see below)
           --time-dummies (include time dummy variables)
           --unit-weights (weighted least squares)
           --iterate (iterative estimation)
           --matrix-diff (use matrix-difference method for Hausman test)
           --quiet (less verbose output)
           --verbose (more verbose output)
All of the basic panel data estimators are available: fixed effects, two-way fixed effects, random effects, between estimation and (not shown) pooled least squares.
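As a rough illustration of how these estimators are requested, the following sketch runs each of them on a small simulated panel. The data, the variable names (x, z, a, y) and the seed are hypothetical and chosen only so that the script is self-contained; the option flags are the ones listed above, plus --pooled for pooled least squares.

```hansl
# Hypothetical balanced panel: 100 units observed over 5 periods
nulldata 500
setobs 5 1:1 --stacked-time-series
set seed 20110

# Simulated regressor and a unit-specific effect (constant within each unit)
series x = normal()
series z = normal()
series a = pmean(z)
series y = 1 + 0.5*x + a + normal()

# Pooled least squares
panel y const x --pooled
# One-way (group) fixed effects
panel y const x --fixed-effects
# Two-way fixed effects: group effects plus time dummies
panel y const x --fixed-effects --time-dummies
# Random effects (GLS)
panel y const x --random-effects
# Between-groups estimator
panel y const x --between
```

Each call prints its own results, so the estimates can be compared side by side on the same data.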
The first model to be estimated is referred to as pooled least squares. Basically, it imposes the restriction that $\beta_{1i} = \beta_1$ for all individuals, so that every individual has the same intercept. Applying pooled least squares in a panel is restrictive in a number of ways. First, estimating the model by least squares violates at least one assumption that is used in the proof of the Gauss-Markov theorem: it is almost certain that the errors for a given individual will be correlated over time. If Johnny isn't the sharpest marble in the bag, it is likely that his earnings, given equivalent education, experience, tenure and so on, will be on the low side of average in each year. He has low ability and that affects each year's wage similarly.
It is also possible that an individual may have a smaller or larger earnings variance than others in the sample. The solution to these specification issues is to use robust estimates of the variance-covariance matrix. Recall that least squares is consistent for the slopes and intercept (but not efficient) when errors are correlated or heteroskedastic, but that this changes the form of the estimator's variance-covariance matrix, so the usual standard errors are no longer valid.
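The within-individual correlation described above can be made concrete with a small derivation. This is a sketch under an assumption that goes beyond the text: suppose the error consists of a time-invariant individual component $u_i$ (for example, unobserved ability) with variance $\sigma_u^2$ and an idiosyncratic component $\varepsilon_{it}$ with variance $\sigma_\varepsilon^2$, where the two are uncorrelated and $\varepsilon_{it}$ is not serially correlated.

```latex
e_{it} = u_i + \varepsilon_{it}
\qquad\Longrightarrow\qquad
\operatorname{cov}(e_{it}, e_{is}) = \sigma_u^2 \quad (t \neq s),
\qquad
\operatorname{corr}(e_{it}, e_{is}) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}
```

Under this assumption, any individual effect with positive variance makes the errors for the same person correlated across years, which is why the conventional least squares standard errors are unreliable here.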
Robust covariances in panel data take into account the special nature of these data. Specifically, they account for autocorrelation within the observations on each individual and they allow the variances for different individuals to vary. Since panel data have both a time-series and a cross-sectional dimension, one might expect that, in general, robust estimation of the covariance matrix would require handling both heteroskedasticity and autocorrelation (the HAC approach).
Gretl currently offers two robust covariance matrix estimators specifically for panel data. These are available for models estimated via fixed effects, pooled OLS, and pooled two-stage least squares. The default robust estimator is that suggested by Arellano (2003), which is HAC provided the panel is of the “large n, small T” variety (that is, many units are observed in relatively few periods).
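For reference, the Arellano estimator is usually written in the sandwich form below. The notation is an assumption made for this sketch and is not used elsewhere in this section: $X_i$ is the matrix of regressors for unit $i$, $\hat{u}_i$ is that unit's vector of least squares residuals, $X$ stacks the $X_i$, and $n$ is the number of units.

```latex
\hat{\Sigma}_A = (X'X)^{-1}
\left( \sum_{i=1}^{n} X_i' \hat{u}_i \hat{u}_i' X_i \right)
(X'X)^{-1}
```

Because the middle term uses the full outer product $\hat{u}_i \hat{u}_i'$ for each unit, it allows arbitrary correlation among the errors of one individual as well as different variances across individuals.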
In cases where autocorrelation is not an issue, however, the estimator proposed by Beck and Katz (1995) and discussed by Greene (2003, chapter 13) may be appropriate. This estimator takes into account contemporaneous correlation across the units and heteroskedasticity by unit.
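A minimal sketch of switching between the two estimators follows. It assumes the Beck and Katz estimator is selected through the pcse setting of the set command, and it uses a simulated panel with hypothetical variable names (x, y) so that it runs without any external data file.

```hansl
# Hypothetical balanced panel: 100 units observed over 5 periods
nulldata 500
setobs 5 1:1 --stacked-time-series
set seed 20110
series x = normal()
series y = 1 + 0.5*x + normal()

# Default robust estimator (Arellano)
panel y const x --pooled --robust

# Beck-Katz panel-corrected standard errors
set pcse on
panel y const x --pooled --robust

# Restore the default for later estimation
set pcse off
```

The coefficient estimates are identical in the two runs; only the reported standard errors differ.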
open "@gretldir\data\poe\nls_panel.gdt"
list xvars = const educ exper exper2 tenure tenure2 south black union
panel lwage xvars --pooled --robust
The first thing to notice is that even though the model is being estimated by least squares, the panel command is used with the --pooled option. The --robust option requests the default HCCME for panel data, which is basically a special version of HAC (see section 9.6.1).
Pooled OLS, using 3580 observations
Included 716 cross-sectional units
Time-series length = 5
Dependent variable: lwage
Robust (HAC) standard errors
             Coefficient     Std. Error    t-ratio   p-value
  const       0.476600       0.0844094      5.6463   0.0000
  educ        0.0714488      0.00548952    13.0155   0.0000
  exper       0.0556851      0.0112896      4.9324   0.0000
  exper2     -0.00114754     0.000491577   -2.3344   0.0196
  tenure      0.0149600      0.00711024     2.1040   0.0354
  tenure2    -0.000486042    0.000409482   -1.1870   0.2353
  south      -0.106003       0.0270124     -3.9242   0.0001
  black      -0.116714       0.0280831     -4.1560   0.0000
  union       0.132243       0.0270255      4.8933   0.0000

  rho         0.811231       Durbin-Watson  0.337344
As long as omitted effects (e.g., individual differences) are uncorrelated with any of the regressors, these estimates are consistent. If the individual differences are correlated with regressors, then you can estimate the model's parameters consistently using fixed effects.
For comparison purposes, the pooled least squares results without cluster-robust standard errors are shown below.
Pooled OLS, using 3580 observations
Included 716 cross-sectional units
Time-series length = 5
Dependent variable: lwage
             Coefficient     Std. Error    t-ratio   p-value
  const       0.476600       0.0561559      8.4871   0.0000
  educ        0.0714488      0.00268939    26.5669   0.0000
  exper       0.0556851      0.00860716     6.4696   0.0000
  exper2     -0.00114754     0.000361287   -3.1763   0.0015
  tenure      0.0149600      0.00440728     3.3944   0.0007
  tenure2    -0.000486042    0.000257704   -1.8860   0.0594
  south      -0.106003       0.0142008     -7.4645   0.0000
  black      -0.116714       0.0157159     -7.4265   0.0000
  union       0.132243       0.0149616      8.8388   0.0000
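The coefficient estimates are the same in both sets of results; only the standard errors change, and the robust ones are larger for every coefficient. One way to quantify the difference is to take the ratio of the two sets of standard errors. The sketch below simply copies the reported values into matrices; the matrix names (se_robust, se_conv, ratio) are hypothetical.

```hansl
# Standard errors copied from the two sets of results reported above, in the order:
# const educ exper exper2 tenure tenure2 south black union
matrix se_robust = {0.0844094; 0.00548952; 0.0112896; 0.000491577; 0.00711024; 0.000409482; 0.0270124; 0.0280831; 0.0270255}
matrix se_conv = {0.0561559; 0.00268939; 0.00860716; 0.000361287; 0.00440728; 0.000257704; 0.0142008; 0.0157159; 0.0149616}

# Element-by-element ratio of robust to conventional standard errors
matrix ratio = se_robust ./ se_conv
print ratio
```

A ratio above one means that the conventional standard error understates the sampling uncertainty relative to the robust estimate, which in turn inflates the corresponding t-ratio.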