Using gretl for Principles of Econometrics, 4th Edition
Linear Combination of Parameters
Since gretl stores and gives access to the estimated coefficients and their variance-covariance matrix, testing hypotheses about linear combinations of parameters is straightforward. Suppose you want an estimate of the average weekly food expenditure for a family earning $2000 per week. The average for any level of income is modeled using linear regression:
\[ E(food\_exp \mid income) = \beta_1 + \beta_2\, income \tag{3.9} \]
It can easily be shown that \(E(c_1X + c_2Y + c_3) = c_1E(X) + c_2E(Y) + c_3\), where \(c_1\), \(c_2\), and \(c_3\) are constants. If least squares is unbiased for the intercept and slope, then \(E(b_1) = \beta_1\) and \(E(b_2) = \beta_2\). Hence, an estimate of the food expenditure for a family earning $2000 per week (income = 20, since income is measured in $100) is
\[ \widehat{food\_exp} = b_1 + b_2 \cdot 20 = 83.416 + 10.2096 \times 20 = 287.6089 \tag{3.10} \]
The hypothesis that the average is statistically greater than $250 can be formally tested as:
\[ H_0: \beta_1 + 20\beta_2 \le 250 \qquad H_1: \beta_1 + 20\beta_2 > 250 \]
Taking the variance of a linear combination is only slightly more complicated than finding the mean, since the variance calculation must account for any covariance between X and Y. In general, \(\mathrm{var}(c_1X + c_2Y + c_3) = c_1^2\,\mathrm{var}(X) + c_2^2\,\mathrm{var}(Y) + 2c_1c_2\,\mathrm{cov}(X, Y)\). Notice that adding a constant to a linear combination of random variables has no effect on its variance, only on its mean. For a regression model, the elements needed to make this computation are found in the variance-covariance matrix.
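The same variance can be written as a quadratic form, which is convenient when the variances and covariance are stored together in a matrix. The following is a short sketch using only the symbols defined above:

```latex
\[
\mathrm{var}(c_1X + c_2Y + c_3)
= \begin{pmatrix} c_1 & c_2 \end{pmatrix}
  \begin{pmatrix}
    \mathrm{var}(X)    & \mathrm{cov}(X,Y) \\
    \mathrm{cov}(X,Y)  & \mathrm{var}(Y)
  \end{pmatrix}
  \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
= c_1^2\,\mathrm{var}(X) + c_2^2\,\mathrm{var}(Y) + 2c_1c_2\,\mathrm{cov}(X,Y)
\]
```

Multiplying out the matrix product reproduces the three terms of the scalar formula; the constant \(c_3\) does not appear because it shifts only the mean.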
The precision of least squares (and other estimators) is summarized by the variance-covariance matrix, which contains the variances of the intercept and slope estimators and the covariance between them. The variances of the least squares estimator fall on the diagonal of this square matrix and the covariance is on the off-diagonal.
All of these elements have to be estimated from the data. To print an estimate of the variance-covariance matrix following a regression, use the --vcv option with your regression in gretl:
ols food_exp const income --vcv
In terms of the hypothesis, \(\mathrm{var}(b_1 + 20b_2 - 250) = 1^2\,\mathrm{var}(b_1) + 20^2\,\mathrm{var}(b_2) + 2(1)(20)\,\mathrm{cov}(b_1, b_2)\). The covariance matrix printed by this option is:
Covariance matrix of regression coefficients:

         const        income
       1884.44      -85.9032  const
                     4.38175  income
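The arithmetic for the variance is \(\mathrm{var}(b_1 + 20b_2 - 250) = 1884.44 + (400)(4.38175) + (40)(-85.9032) = 201.017\). (The rounded entries printed above give 201.012; the last digits differ presumably because gretl carries more decimal places than it prints.) The square root of this is the standard error, i.e., 14.178.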
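As a quick check of this arithmetic, the computation can be sketched in hansl without estimating a model. The matrix entries below are typed in from the printed (rounded) covariance matrix, and the names V and c are illustrative choices, not part of the original script:

```hansl
# Rounded entries copied from the printed covariance matrix
# (assumption: typed in by hand, not read from $vcv)
matrix V = {1884.44, -85.9032; -85.9032, 4.38175}

# Weights on b1 and b2 in the combination b1 + 20*b2
matrix c = {1; 20}

# Quadratic form c'Vc; the 1x1 result is stored as a scalar
scalar vc = c' * V * c
scalar se = sqrt(vc)

printf "variance = %.3f, standard error = %.3f\n", vc, se
```

Because the inputs are rounded, the variance printed here can differ from the value in the text in the last digits; after a regression, $vcv supplies the unrounded entries.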
Of course, once you know the estimated standard error, you could just as well estimate an interval for the average food expenditure. The script to do just that is found below. Using hansl to do the arithmetic makes things a lot easier.
1  scalar vc = $vcv[1,1]+20^2*$vcv[2,2]+2*20*$vcv[2,1]
2  scalar se = sqrt(vc)
3  scalar tval = ($coeff(const)+20*$coeff(income)-250)/se
4  scalar p = pvalue(t,$df, tval)
5
6  scalar avg_food_20 = $coeff(const)+20*$coeff(income)
7  scalar lb = avg_food_20-critical(t,$df,0.025)*se
8  scalar ub = avg_food_20+critical(t,$df,0.025)*se
9
10 print vc se tval p avg_food_20 lb ub
In the first line, the accessor $vcv is used. It contains the variance-covariance matrix from the previously estimated model. The square brackets contain the row and column location of the desired element: the estimated variance of b1 is the element located in the first row and first column, hence $vcv[1,1]. The covariance between b1 and b2 can be found either in the first row, second column or in the second row, first column, so $vcv[1,2]=$vcv[2,1]. The script also produces the p-value of the one-sided test, which can be compared with the 5% significance level.
In line 6 the average food expenditure is computed at income = 20, which corresponds to $2000/week (income is measured in $100). The lower and upper limits of the 95% confidence interval are computed in lines 7 and 8.
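To make the decision rule for the one-sided test explicit, here is a minimal sketch that uses the point estimate and standard error reported in the text instead of the model accessors. The value of 38 degrees of freedom is taken from the t(38) distribution mentioned in the scripts; the names est, dof and tcrit are illustrative:

```hansl
# Values reported in the text (assumption: entered by hand)
scalar est = 287.6089     # estimated average expenditure at income = 20
scalar se  = 14.178       # its estimated standard error
scalar dof = 38           # degrees of freedom, t(38)

# Test H0: beta1 + 20*beta2 <= 250 against H1: beta1 + 20*beta2 > 250
scalar tval  = (est - 250) / se
scalar tcrit = critical(t, dof, 0.05)   # one-sided 5% critical value
scalar p     = pvalue(t, dof, tval)     # right-tail area

printf "t = %.4f, critical value = %.4f, p-value = %.4f\n", tval, tcrit, p
if tval > tcrit
    printf "Reject H0 at the 5%% level\n"
else
    printf "Do not reject H0 at the 5%% level\n"
endif
```

Comparing tval with tcrit and comparing p with 0.05 are equivalent ways of reaching the same decision.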
set echo off
# confidence intervals
open "@gretldir\data\poe\food.gdt"
ols food_exp const income
scalar lb = $coeff(income) - 2.024 * $stderr(income)
scalar ub = $coeff(income) + 2.024 * $stderr(income)
print lb ub

# using the critical function to get critical values
scalar lb = $coeff(income) - critical(t,$df,0.025) * $stderr(income)
scalar ub = $coeff(income) + critical(t,$df,0.025) * $stderr(income)
print lb ub

# t-ratio
open "@gretldir\data\poe\food.gdt"
ols food_exp const income

# One sided test (Ha: b2 > zero)
scalar tratio1 = ($coeff(income) - 0)/ $stderr(income)
scalar c1 = critical(t,$df,.05)
scalar p1 = pvalue(t,$df, tratio1)
printf "The statistic = %.4f, 5%% critical value = %.4f and pvalue = %.4f\n",tratio1, c1,p1

# One sided test (Ha: b2 > 5.5)
scalar tratio2 = ($coeff(income) - 5.5)/ $stderr(income)
scalar c2 = critical(t,$df,.05)
scalar p2 = pvalue(t,$df, tratio2)
printf "The statistic = %.4f, 5%% critical value = %.4f and pvalue = %.4f\n",tratio2, c2,p2

# One sided test (Ha: b2 < 15)
scalar tratio3 = ($coeff(income) - 15)/ $stderr(income)
scalar c3 = -1*critical(t,$df,.05)
# left-tail p-value, obtained by symmetry from the right tail of |t|
scalar p3 = pvalue(t,$df, abs(tratio3))
printf "The statistic = %.4f, 5%% critical value = %.4f and pvalue = %.4f\n",tratio3, c3,p3

# Two sided test (Ha: b2 not equal 7.5)
scalar tratio4 = ($coeff(income) - 7.5)/ $stderr(income)
scalar c4 = critical(t,$df,.025)
# abs() keeps the two-sided p-value valid if the t-ratio is negative
scalar p4 = 2*pvalue(t,$df, abs(tratio4))
printf "The statistic = %.4f, 5%% critical value = %.4f and pvalue = %.4f\n",tratio4, c4,p4

# Confidence interval
scalar lb = $coeff(income) - critical(t,$df,0.025) * $stderr(income)
scalar ub = $coeff(income) + critical(t,$df,0.025) * $stderr(income)
printf "The 95%% confidence interval is (%.4f, %.4f)\n",lb, ub

# Two sided test (Ha: b2 not equal zero)
scalar tratio5 = ($coeff(income) - 0)/ $stderr(income)
scalar c5 = critical(t,$df,.025)
# abs() keeps the two-sided p-value valid if the t-ratio is negative
scalar p5 = 2*pvalue(t,$df, abs(tratio5))
printf "The statistic = %.4f, 5%% critical value = %.4f and pvalue = %.4f\n",tratio5, c5,p5

# linear combinations of coefficients
open "@gretldir\data\poe\food.gdt"
ols food_exp const income --vcv

scalar vc = $vcv[1,1]+20^2*$vcv[2,2]+2*20*$vcv[2,1]
scalar se = sqrt(vc)
scalar tval = ($coeff(const)+20*$coeff(income)-250)/se
scalar p = pvalue(t,$df, tval)

scalar avg_food_20 = $coeff(const)+20*$coeff(income)
scalar lb = avg_food_20-critical(t,$df,0.025)*se
scalar ub = avg_food_20+critical(t,$df,0.025)*se

print vc se tval p avg_food_20 lb ub
And for the repeated sampling exercise, the script is:
set echo off
open "@gretldir\data\poe\table2_2.gdt"
list ylist = y1 y2 y3 y4 y5 y6 y7 y8 y9 y10
loop foreach i ylist --progressive --quiet
    ols ylist.$i const x
    scalar b1 = $coeff(const)
    scalar b2 = $coeff(x)
    scalar s1 = $stderr(const)
    scalar s2 = $stderr(x)

    # 2.024 is the .025 critical value from the t(38) distribution
    scalar c1L = b1 - critical(t,$df,.025)*s1
    scalar c1R = b1 + critical(t,$df,.025)*s1
    scalar c2L = b2 - critical(t,$df,.025)*s2
    scalar c2R = b2 + critical(t,$df,.025)*s2

    scalar sigma2 = $sigma^2
    store "@workdir\coeff.gdt" b1 b2 s1 s2 c1L c1R c2L c2R sigma2
endloop

open "@workdir\coeff.gdt"
print c1L c1R c2L c2R --byobs
Monte Carlo to measure coverage probabilities of confidence intervals
set echo off
open "@gretldir\data\poe\food.gdt"
set seed 3213798
loop 100 --progressive --quiet
    series u = normal(0,88)
    series y = 80 + 10*income + u
    ols y const income
    # 2.024 is the .025 critical value from the t(38) distribution
    scalar c1L = $coeff(const) - critical(t,$df,.025)*$stderr(const)
    scalar c1R = $coeff(const) + critical(t,$df,.025)*$stderr(const)
    scalar c2L = $coeff(income) - critical(t,$df,.025)*$stderr(income)
    scalar c2R = $coeff(income) + critical(t,$df,.025)*$stderr(income)

    # Compute the coverage probabilities of the Confidence Intervals
    scalar p1 = (80>c1L && 80<c1R)
    scalar p2 = (10>c2L && 10<c2R)

    print p1 p2
    store "@workdir\cicoeff.gdt" c1L c1R c2L c2R
endloop
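In the progressive loop above, print p1 p2 reports the averages of the two indicators across replications, which are the estimated coverage rates. The sketch below shows the same idea with the bookkeeping done explicitly in a matrix. It is a variation under stated assumptions, not the script above: the regressor x is simulated so that no data file is needed, and the number of replications (1000) and the names nreps, cover and rate are illustrative.

```hansl
set echo off
set messages off

# Assumption: an artificial sample of 40 observations with a simulated
# regressor stands in for the food expenditure data
nulldata 40
set seed 3213798
series x = 20 * uniform()

scalar nreps = 1000
matrix cover = zeros(nreps, 2)   # one row per replication

loop r = 1..nreps --quiet
    series u = normal(0, 88)
    series y = 80 + 10*x + u     # true intercept 80, true slope 10
    ols y const x --quiet
    scalar tc = critical(t, $df, 0.025)
    # 1 if the 95% interval contains the true value, 0 otherwise
    cover[r,1] = abs($coeff(const) - 80) < tc * $stderr(const)
    cover[r,2] = abs($coeff(x) - 10) < tc * $stderr(x)
endloop

# Column means are the estimated coverage rates
matrix rate = meanc(cover)
printf "Coverage: intercept %.3f, slope %.3f\n", rate[1], rate[2]
```

With correctly specified errors, both rates should be close to the nominal 0.95, differing only by simulation noise.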