Using gretl for Principles of Econometrics, 4th Edition

# Monte Carlo Simulation

The first step in a Monte Carlo exercise is to model the data generation process. This requires what Davidson and MacKinnon (2004) refer to as a fully specified statistical model. A fully specified parametric model “is one for which it is possible to simulate the dependent variable once the values of the parameters are known” (Davidson and MacKinnon, 2004, p. 19). First you’ll need a regression function, for instance:

E(yt | Ωt) = β1 + β2 xt    (2.14)

where yt is your dependent variable, xt the independent variable, Ωt the current information set, and β1 and β2 the parameters of interest. The information set Ωt contains xt as well as other potential explanatory variables that determine the average of yt. The conditional mean of yt given the information set could represent a linear regression model or a discrete choice model. However, equation (2.14) is not complete; it requires some description of how the unobserved or excluded factors affect yt | Ωt.

To complete the specification we need to specify an “unambiguous recipe” for simulating the model on a computer (Davidson and MacKinnon, 2004, p. 17). This means we’ll need to specify a probability distribution for the unobserved components of the model and then use a pseudo-random number generator to generate samples of the desired size.
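For readers who want to see what such a recipe looks like outside of gretl, here is a minimal sketch in Python/NumPy (an illustration only; the function name, seed, and parameter values are ours, not part of POE4):

```python
import numpy as np

def simulate_y(x, beta1, beta2, sigma, rng):
    """Simulate the dependent variable from a fully specified model:
    y_i = beta1 + beta2*x_i + e_i, with e_i ~ iid N(0, sigma^2)."""
    e = rng.normal(0.0, sigma, size=len(x))  # the unobserved component
    return beta1 + beta2 * x + e

# An 'unambiguous recipe': fixed x, known parameters, seeded RNG
rng = np.random.default_rng(42)
x = np.array([10.0] * 20 + [20.0] * 20)
y = simulate_y(x, beta1=100.0, beta2=10.0, sigma=50.0, rng=rng)
```

Because every ingredient (regressor values, parameters, error distribution, and seed) is pinned down, the simulated y can be regenerated exactly, which is what makes the model “fully specified.”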

In this example the data generation process will be as follows. We will let N = 40 and consider a linear model of the form

yi = β1 + β2 xi + ei,    i = 1, 2, ..., 40.    (2.15)

The errors of the model will be iid N(0, 88). The parameters are β1 = 100 and β2 = 10. Finally, let x1, x2, ..., x20 = 10 and x21, x22, ..., x40 = 20. This gives us enough information to simulate samples of yi from the model. The hansl script (hansl is a recursive acronym for “hansl’s a neat scripting language”) is:

```
1  nulldata 40
2  # Generate X
3  series x = (index>20) ? 20 : 10
4
5  # Generate systematic portion of model
6  series ys = 100 + 10*x
7
8  loop 1000 --progressive --quiet
9      series y = ys + normal(0,50)
10     ols y const x
11     scalar b1 = $coeff(const)
12     scalar b2 = $coeff(x)
13     scalar sig2 = $sigma^2
14     print b1 b2 sig2
15     store "@workdir\coef.gdt" b1 b2 sig2
16 endloop
17
18 open "@workdir\coef.gdt"
19 summary
20 freq b2 --normal
```

The first line creates an empty dataset that has room for 40 observations. Line 3 contains a ternary conditional assignment operator.² Here is how it works. A series x is being created. The statement in parentheses is checked. The question mark (?) is the conditional assignment. If the statement in parentheses is true, then x is assigned the value to the left of the colon; if false, it gets the value to the right. So, when index (gretl’s built-in way of identifying the observation number) is greater than 20, x is set to 20; if index is less than or equal to 20, it is set to 10.
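If it helps, the same conditional assignment can be mimicked in Python with NumPy’s `where` (a hypothetical translation, not part of the gretl script):

```python
import numpy as np

index = np.arange(1, 41)           # gretl's index runs from 1 to N
# (index > 20) ? 20 : 10  in hansl becomes:
x = np.where(index > 20, 20, 10)
```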

Next, the systematic portion of the model is created. For this we need x and the known values of the parameters (100, 10). Then we loop from 1 to 1000 in increments of 1. Normal random variates are added to the model, it is estimated by ols, and several statistics from that computation are retrieved, printed, and stored in a specified location.

The normal(0,50) statement generates normal random variables with a mean of 0 and a standard deviation of 50. The print statement used in this context actually tells gretl to accumulate the things that are listed and to print out summary statistics from their computation inside the loop. The store command tells gretl to output b1, b2, and sig2 to an external file. The --progressive option to the loop command alters the print and store commands a bit; you can consult the Gretl User’s Guide for more information about how.
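As a cross-check on the loop’s logic, here is a rough NumPy re-implementation of the same experiment (our own sketch; gretl itself needs none of this). The means it prints should be close to the true values β1 = 100, β2 = 10, and σ² = 2500:

```python
import numpy as np

rng = np.random.default_rng(12345)           # arbitrary seed, ours
x = np.array([10.0] * 20 + [20.0] * 20)
X = np.column_stack([np.ones(40), x])        # constant plus x
ys = 100.0 + 10.0 * x                        # systematic portion

draws = []
for _ in range(1000):
    y = ys + rng.normal(0.0, 50.0, size=40)  # 50 is the standard deviation
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    draws.append((beta[0], beta[1], e @ e / (40 - 2)))  # b1, b2, sig2

b1, b2, sig2 = np.array(draws).T
print(b1.mean(), b2.mean(), sig2.mean())
```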

Here is the output from the Monte Carlo. First, the output from the progressive loop:

```
OLS estimates using the 40 observations 1-40
Statistics for 1000 repetitions
Dependent variable: y
```

In a progressive loop, gretl will print out the mean and standard deviation from the series of estimates. It works with all single-equation estimators in gretl and is quite useful for Monte Carlo analysis. From this you can see that the average value of the constant in 1000 samples is 100.491. The average slope was 9.962. The third column gives the mean of the standard error calculation from the simulation. If the standard errors are being estimated consistently, then these should be fairly close to the standard deviation of the estimated coefficients to their left. The outcome from the print command is:

```
Statistics for 1000 repetitions

Variable        mean    std. dev.
b1           100.491      24.5847
b2           9.96204      1.57931
sig2         2497.03      551.720
```

² A ternary operator has three parts. In this case, the parts give us a fancy way of creating if/else statements. The first part, a, lies to the left of the ?; the second, b, falls between the question mark and the colon; and the last, c, is to the right of the colon, e.g., a ? b : c. If a is true, then b; if not, then c.

When the print command is issued, it will compute and print to the screen the ‘mean’ and ‘std. dev.’ of the estimated scalars. Notice that b1 and b2 match the output produced by the --progressive option. The print command is useful for studying the behavior of various statistics (like tests, confidence intervals, etc.) and other estimators that cannot be handled properly within a progressive loop (e.g., mle, gmm, and the system estimation commands).

The store statement works behind the scenes, but yields this informative piece of information:

```
store: using filename c:\temp\coef.gdt
Data written OK.
```

This tells you where gretl wrote the dataset that contains the listed scalars, and that it was written properly. Now you are ready to open it up and perform additional analysis. In this example, we have used the @workdir macro. This basically tells gretl to go to the working directory to write the file. You could instead write files to gretl’s temporary directory using @dotdir\coef.gdt.

The data set is opened and the summary statistics generated (again, if needed):

```
open "@workdir\coef.gdt"
summary
freq b2 --normal
```

From here you can plot the frequency distribution and test whether the least squares estimator of the slope is normally distributed. The histogram certainly appears to be normally distributed compared to the overlaid plot of the normal density, and the hypothesis test of the normality null against nonnormality cannot be rejected at any reasonable level of significance.
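The kind of normality check that `freq b2 --normal` reports can also be computed by hand. The sketch below evaluates a Jarque-Bera-type statistic; note that the b2 draws here are simulated stand-ins, not the values stored by the gretl script:

```python
import numpy as np

def jarque_bera(z):
    """n/6 * (S^2 + (K-3)^2 / 4); approximately chi^2(2) under normality."""
    n = len(z)
    u = z - z.mean()
    s2 = u @ u / n
    S = (u**3).mean() / s2**1.5    # sample skewness
    K = (u**4).mean() / s2**2      # sample kurtosis
    return n / 6.0 * (S**2 + (K - 3.0)**2 / 4.0)

rng = np.random.default_rng(7)
b2_draws = rng.normal(10.0, 1.58, size=1000)  # stand-in for the stored b2 series
jb = jarque_bera(b2_draws)
# compare jb to the chi^2(2) 5% critical value, 5.99
```

Since these draws are genuinely normal, jb should usually fall below 5.99, in which case normality is not rejected, mirroring the conclusion in the text.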

# 2.2 Script

The script for chapter 2 is found below. These scripts can also be found at my website http://www.learneconometrics.com/gretl.

```
set echo off
open "@gretldir\data\poe\food.gdt"
setinfo food_exp -d "household food expenditure per week" -n "Food Expenditure"
setinfo income -d "weekly household income" -n "Weekly Income"
labels

# Least squares
ols food_exp const income --vcv
ols 1 0 2

# Summary Statistics
summary food_exp income

# Plot the Data
gnuplot food_exp income

# List the Data
print food_exp income --byobs

# Elasticity
ols food_exp const income --quiet
scalar elast = $coeff(income)*mean(income)/mean(food_exp)

# Prediction
scalar yhat = $coeff(const) + $coeff(income)*20

# Table 2.2
open "@gretldir\data\poe\table2_2.gdt"
list ylist = y1 y2 y3 y4 y5 y6 y7 y8 y9 y10
loop foreach i ylist
    ols ylist.$i const x
endloop

# slopes and elasticities at different points
open "@gretldir\data\poe\br.gdt"
series sqft2 = sqft^2
ols price const sqft2
scalar slope_2000 = 2*$coeff(sqft2)*2000
scalar slope_4000 = 2*$coeff(sqft2)*4000
scalar slope_6000 = 2*$coeff(sqft2)*6000
scalar elast_2000 = slope_2000*2000/117461.77
scalar elast_4000 = slope_4000*4000/302517.39
scalar elast_6000 = slope_6000*6000/610943.42

# histogram for price and log(price)
series l_price = ln(price)
freq price
freq l_price

# regression using indicator variables
open "@gretldir\data\poe\utown.gdt"
ols price const utown --quiet
scalar ut = $coeff(const)+$coeff(utown)
scalar other = $coeff(const)
printf "\nThe average in Utown is %.4f and the average elsewhere is %.4f\n", ut, other

# Monte Carlo simulation
open "@gretldir\data\poe\food.gdt"
set seed 3213789
loop 100 --progressive --quiet
    series u = normal(0,88)
    series y1 = 80+10*income+u
    ols y1 const income
endloop

# Monte Carlo simulation #2
nulldata 40
# Generate X
series x = (index>20) ? 20 : 10
# Generate systematic portion of model
series ys = 100 + 10*x
loop 1000 --progressive --quiet
    series y = ys + normal(0,50)
    ols y const x
    scalar b1 = $coeff(const)
    scalar b2 = $coeff(x)
    scalar sig2 = $sigma^2
    print b1 b2 sig2
    store "@workdir\coef.gdt" b1 b2 sig2
endloop

open "@workdir\coef.gdt"
summary
freq b2 --normal
```

Figure 2.12: Results from the script to compute an elasticity based on a linear regression.

Figure 2.13: Obtain the matrix that contains the least squares estimates of variance and covariance from the pull-down menu of your estimated model.

The output from the first Monte Carlo (100 repetitions) is:

```
OLS estimates using the 40 observations 1-40
Statistics for 100 repetitions
Dependent variable: y1

Statistics for 100 repetitions
Variable      mean
b1            38.1474
b2            9.59723

store: using filename c:\temp\coeff.gdt
Data written OK.
```

Figure 2.15: The summary results from 100 random samples of the Monte Carlo experiment.

Figure 2.16: Price versus size from the quadratic model.

Figure 2.17: Price and its natural logarithm.


