Using gretl for Principles of Econometrics, 4th Edition
Treatment Effects
In order to understand the measurement of treatment effects, consider a simple regression model in which the explanatory variable is a dummy variable indicating whether a particular individual is in the treatment group or the control group. Let y be the outcome variable, the measured characteristic the treatment is designed to affect. Define the indicator variable d as

$$d_i = \begin{cases} 1 & \text{individual in treatment group} \\ 0 & \text{individual in control group} \end{cases}$$
The effect of the treatment on the outcome can be modeled as
$$y_i = \beta_1 + \beta_2 d_i + e_i, \qquad i = 1, 2, \ldots, N \qquad (7.10)$$
where $e_i$ represents the collection of other factors affecting the outcome. The regression functions for the treatment and control groups are

$$E(y_i) = \begin{cases} \beta_1 + \beta_2 & \text{individual in treatment group } (d_i = 1) \\ \beta_1 & \text{individual in control group } (d_i = 0) \end{cases}$$
The treatment effect that we want to measure is $\beta_2$. The least squares estimator of $\beta_2$ is

$$b_2 = \bar{y}_1 - \bar{y}_0$$

where $\bar{y}_1$ is the sample mean of the observations on y for the treatment group and $\bar{y}_0$ is the sample mean of the observations on y for the control (untreated) group. In this treatment/control framework the estimator $b_2$ is called the difference estimator, because it is simply the difference between the sample means of the treatment and control groups.
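This equivalence is easy to verify numerically in gretl. The script below is only a sketch, assuming a hypothetical dataset containing an outcome series y and a 0/1 treatment dummy d (these names are placeholders, not variables from the STAR data used below): it estimates (7.10) by least squares and compares the slope with the difference in subsample means.

1 ols y const d                    # least squares fit of (7.10)
2 scalar b2 = $coeff(d)            # difference estimator
3 smpl d == 1 --restrict           # treatment group only
4 scalar ybar1 = mean(y)
5 smpl full
6 smpl d == 0 --restrict           # control group only
7 scalar ybar0 = mean(y)
8 smpl full
9 printf "b2 = %g, ybar1 - ybar0 = %g\n", b2, ybar1 - ybar0

The two numbers printed in the last line should coincide, up to rounding.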
To illustrate, we use the data from Project STAR described in section 7.5.3 of POE4.
The first thing to do is to take a look at the descriptive statistics for a subset of the variables. The list v is created to hold the names of the variables of interest. Then the summary command is issued for the variables in v with the --by option. This option takes as its argument the name of a discrete variable that defines the subsets. Here, small and regular are binary: small equals 1 for small classes and 0 otherwise, and regular equals 1 for regular-sized classes and 0 otherwise. Each summary command therefore produces two sets of summary statistics, one for each value of the grouping variable.
1 open "@gretldir\data\poe\star.gdt"
2 list v = totalscore small tchexper boy freelunch white_asian \
3      tchwhite tchmasters schurban schrural
4 summary v --by=small --simple
5 summary v --by=regular --simple
Here is a partial listing of the output:
regular = 1 (n = 2005):

                    Mean      Minimum      Maximum    Std. Dev.
totalscore        918.04       635.00       1229.0       73.138
small            0.00000      0.00000      0.00000      0.00000
tchexper          9.0683      0.00000       24.000       5.7244
boy              0.51322      0.00000       1.0000      0.49995
freelunch        0.47382      0.00000       1.0000      0.49944
white_asian      0.68130      0.00000       1.0000      0.46609
tchwhite         0.79800      0.00000       1.0000      0.40159
tchmasters       0.36509      0.00000       1.0000      0.48157
schurban         0.30125      0.00000       1.0000      0.45891
schrural         0.49975      0.00000       1.0000      0.50012

small = 1 (n = 1738):

                    Mean      Minimum      Maximum    Std. Dev.
totalscore        931.94       747.00       1253.0       76.359
small             1.0000       1.0000       1.0000      0.00000
tchexper          8.9954      0.00000       27.000       5.7316
boy              0.51496      0.00000       1.0000      0.49992
freelunch        0.47181      0.00000       1.0000      0.49935
white_asian      0.68470      0.00000       1.0000      0.46477
tchwhite         0.86249      0.00000       1.0000      0.34449
tchmasters       0.31761      0.00000       1.0000      0.46568
schurban         0.30610      0.00000       1.0000      0.46100
schrural         0.46260      0.00000       1.0000      0.49874
The --simple option drops the median, C.V., skewness, and excess kurtosis from the summary statistics. Since we don't need those here, the option is used.
Next, we want to drop the observations for those classrooms that have a teacher’s aide and to construct a set of variable lists to be used in the regressions that follow.
1 smpl aide != 1 --restrict
2 list x1 = const small
3 list x2 = x1 tchexper
4 list x3 = x2 boy freelunch white_asian
5 list x4 = x3 tchwhite tchmasters schurban schrural
In the first line the smpl command is used to limit the sample (--restrict) to those observations for which the aide variable is not equal (!=) to one. The list commands are worth a closer look. Notice that x1 is constructed in the conventional way using list; to the right of the equals sign are the names of two variables. Then x2 is created with its first elements taken from the existing list x1, followed by the additional variable tchexper. Thus, x2 contains const, small, and tchexper. The lists x3 and x4 are constructed similarly: new variables are simply appended to previously defined lists. It's quite seamless and natural, and the membership of each list is easy to check, as sketched below.
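To double-check what ended up in each list, gretl can report a list's membership by name. A quick sketch, using the print variant of the list command and the varname() function (the exact output format may differ across gretl versions):

1 list x2 print                            # show the member series of x2
2 printf "x4 contains: %s\n", varname(x4)  # names in x4 as a comma-separated string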
Now each of the models is estimated with the --quiet option and put into a model table, as sketched below.
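One way to build such a table is with gretl's modeltab commands: estimate each specification, add it to the table, and display the result at the end. The script below is a minimal sketch of that workflow using the lists defined above; the --quiet flag suppresses the individual regression printouts.

1 ols totalscore x1 --quiet
2 modeltab add
3 ols totalscore x2 --quiet
4 modeltab add
5 ols totalscore x3 --quiet
6 modeltab add
7 ols totalscore x4 --quiet
8 modeltab add
9 modeltab show

The resulting model table is shown here: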
OLS estimates
Dependent variable: totalscore

                     (1)           (2)           (3)           (4)

const             918.0**       907.6**       927.6**       936.0**
                 (1.667)       (2.542)       (3.758)       (5.057)

small             13.90**       13.98**       13.87**       13.36**
                 (2.447)       (2.437)       (2.338)       (2.352)

tchexper                        1.156**      0.7025**      0.7814**
                               (0.2123)      (0.2057)      (0.2129)

boy                                          -15.34**      -15.29**
                                              (2.335)       (2.330)

freelunch                                    -33.79**      -32.05**
                                              (2.600)       (2.666)

white_asian                                   11.65**       14.99**
                                              (2.801)       (3.510)

tchwhite                                                    -2.775
                                                            (3.535)

tchmasters                                                 -8.180**
                                                            (2.562)

schurban                                                   -8.216**
                                                            (3.673)

schrural                                                   -9.133**
                                                            (3.210)

n                   3743          3743          3743          3743
R²                0.0083        0.0158        0.0945        0.0988
ℓ            -2.145e+004   -2.144e+004   -2.128e+004   -2.127e+004

Standard errors in parentheses
* indicates significance at the 10 percent level
** indicates significance at the 5 percent level
The coefficient on the small indicator variable is hardly affected by adding or dropping variables from the model; it ranges only from 13.36 to 13.98 across the four specifications. This is indirect evidence that small is not correlated with the other regressors. In contrast, the estimated effect of teacher experience on test scores falls quite a bit when boy, freelunch, and white_asian are added to the equation. This suggests that tchexper is correlated with one or more of these variables and that omitting them from the model leads to biased least squares estimates of its coefficient. The correlation check sketched below is a quick way to see this.
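As an informal check (a sketch, run on the restricted sample used above), gretl's corr command prints the sample correlation matrix for a list of series:

1 list w = small tchexper boy freelunch white_asian   # helper list for the check (sketch)
2 corr w                                              # sample correlation matrix

If small shows near-zero correlations with the other regressors while tchexper does not, that is consistent with the pattern of coefficient changes across columns (1) through (4) of the model table.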