Using gretl for Principles of Econometrics, 4th Edition
Monte Carlo Experiment
Once again, the consequences of repeated sampling can be explored using a simple Monte Carlo study. In this case, we will generate 100 samples and count the number of times the confidence interval includes the true value of the parameter. The simulation will be based on the food.gdt dataset.
The new script looks like this:
1 open "@gretldir\data\poe\food.gdt"
2 set seed 3213798
3 loop 100 --progressive --quiet
4 series u = normal(0,88)
5 series y = 80 + 10*income + u
6 ols y const income
7
8 scalar c1L = $coeff(const) - critical(t,$df,.025)*$stderr(const)
9 scalar c1R = $coeff(const) + critical(t,$df,.025)*$stderr(const)
10 scalar c2L = $coeff(income) - critical(t,$df,.025)*$stderr(income)
11 scalar c2R = $coeff(income) + critical(t,$df,.025)*$stderr(income)
12
13 # Compute the coverage probabilities of the Confidence Intervals
14 scalar p1 = (80>c1L && 80<c1R)
15 scalar p2 = (10>c2L && 10<c2R)
16
17 print p1 p2
18 store @workdir\cicoeff.gdt c1L c1R c2L c2R
19 endloop
The results are stored in the gretl data set cicoeff.gdt. Opening this data set (open @workdir\cicoeff.gdt) and examining the data will reveal interval estimates that vary much like those in Tables 3.1 and 3.2 of POE4. In line 4 of this script pseudo-random normals are drawn using the normal(mean, sd) function, with the mean set to 0 and the standard deviation set to 88. In line 5 the samples of y are generated from the linear part (80 + 10*income), to which the random component is added. Then, the lower and upper bounds of each interval are computed.
In lines 14 and 15 gretl's "and" logical operator, &&, is used to determine whether the true coefficient (80 or 10) falls within the computed bounds. The && operator yields the intersection of the two conditions: if 80 is greater than the lower bound and smaller than the upper bound, then the condition is true and p1 is equal to 1. If the statement is false, p1 is equal to zero. Averaging each of p1 and p2 over the samples gives the proportion of times in the Monte Carlo that the condition is true, which amounts to the empirical coverage rate of the computed interval.
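The coverage rates can also be recomputed after the fact from the stored bounds. The following is a minimal sketch, assuming the loop above has already been run and that cicoeff.gdt was written to the working directory; the names in1, in2, cover1 and cover2 are hypothetical and do not appear in the original script.

```hansl
# Sketch only: assumes cicoeff.gdt exists in @workdir and holds
# one row of bounds (c1L c1R c2L c2R) per Monte Carlo sample.
open "@workdir/cicoeff.gdt"

# 0/1 indicators: does each stored interval contain the true value?
series in1 = (80 > c1L && 80 < c1R)
series in2 = (10 > c2L && 10 < c2R)

# The mean of a 0/1 indicator is the empirical coverage rate
scalar cover1 = mean(in1)
scalar cover2 = mean(in2)
printf "Coverage: intercept %.2f, slope %.2f\n", cover1, cover2

# Summary statistics show how much the bounds vary across samples
summary c1L c1R c2L c2R
```

Because the indicators are rebuilt from the same bounds, the two coverage rates should agree with the means of p1 and p2 printed by the loop.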
With this seed, I get the following result (Figure 3.6):

OLS estimates using the 40 observations 1-40
Statistics for 100 repetitions
Dependent variable: y

Statistics for 100 repetitions

Variable    mean        std. dev.
p1          0.930000    0.255147
p2          0.920000    0.271293

store: using filename c:\temp\cicoeff.gdt
Data written OK.

You can see that the intercept falls within the estimated interval 93 out of 100 times and the slope within its interval 92% of the time.
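As a rough check on these numbers (not part of the original discussion): p1 and p2 are 0/1 indicators, so a standard deviation computed with divisor 100 equals the square root of the proportion times one minus the proportion, which is consistent with the reported values. The last line is an added assumption: if the samples are independent and the intervals have their nominal coverage of 0.95, the Monte Carlo standard error of an estimated coverage rate based on 100 samples is about 0.022, so rates of 0.93 and 0.92 lie within roughly one to one and a half standard errors of 0.95. The symbol \hat{p} is notation introduced here for the estimated coverage rate.

```latex
\begin{align*}
\operatorname{sd}(p_1) &= \sqrt{0.93 \times 0.07} = \sqrt{0.0651} \approx 0.2551, \\
\operatorname{sd}(p_2) &= \sqrt{0.92 \times 0.08} = \sqrt{0.0736} \approx 0.2713, \\
\operatorname{se}(\hat{p}) &= \sqrt{\frac{0.95 \times 0.05}{100}} \approx 0.0218 .
\end{align*}
```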