Springer Texts in Business and Economics
Prediction
Let us now predict $Y_0$ given $X_0$. Usually this is done for a time series regression, where the researcher is interested in predicting the future, say one period ahead. This new observation $Y_0$ is generated by (3.1), i.e.,
$$Y_0 = \alpha + \beta X_0 + u_0 \qquad (3.11)$$
What is the Best Linear Unbiased Predictor (BLUP) of $E(Y_0)$? From (3.11), $E(Y_0) = \alpha + \beta X_0$ is a linear combination of $\alpha$ and $\beta$. Using the Gauss-Markov result, $\hat{Y}_0 = \hat{\alpha}_{OLS} + \hat{\beta}_{OLS} X_0$ is BLUE for $\alpha + \beta X_0$, and the variance of this predictor of $E(Y_0)$ is $\sigma^2[(1/n) + (X_0 - \bar{X})^2/\sum_{i=1}^{n} x_i^2]$; see Problem 10. But what if we are interested in the BLUP for $Y_0$ itself? $Y_0$ differs from $E(Y_0)$ by $u_0$, and the best predictor of $u_0$ is zero, so the BLUP for $Y_0$ is still $\hat{Y}_0$. The forecast error is
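A minimal numerical sketch of this point predictor and the variance of the predictor of $E(Y_0)$, using simulated consumption-income data (the sample, the true coefficients, and $X_0$ are all illustrative assumptions, not from the text):

```python
import numpy as np

# Hypothetical sample: X = income, Y = consumption (illustrative values only)
rng = np.random.default_rng(0)
n = 50
X = rng.uniform(10, 30, n)
Y = 5 + 0.8 * X + rng.normal(0, 2, n)

# OLS estimates via deviations from sample means
x = X - X.mean()
y = Y - Y.mean()
beta_hat = (x @ y) / (x @ x)               # slope
alpha_hat = Y.mean() - beta_hat * X.mean()  # intercept

# BLUP of E(Y0) at a new point X0 (also the BLUP of Y0 itself)
X0 = 20.0
Y0_hat = alpha_hat + beta_hat * X0

# Unbiased estimate s^2 of sigma^2, then the variance of the
# predictor of E(Y0): s^2 * [1/n + (X0 - Xbar)^2 / sum x_i^2]
resid = Y - alpha_hat - beta_hat * X
s2 = (resid @ resid) / (n - 2)
var_mean_pred = s2 * (1 / n + (X0 - X.mean()) ** 2 / (x @ x))
```

Note that the same $\hat{Y}_0$ serves as the predictor of both $E(Y_0)$ and $Y_0$; only the attached variance differs, as discussed below.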
$$Y_0 - \hat{Y}_0 = [Y_0 - E(Y_0)] + [E(Y_0) - \hat{Y}_0] = u_0 + [E(Y_0) - \hat{Y}_0]$$
where $u_0$ is the error committed even if the true regression line is known, and $E(Y_0) - \hat{Y}_0$ is the difference between the sample and population regression lines. Hence, the variance of the forecast error becomes:
$$\mathrm{var}(u_0) + \mathrm{var}[E(Y_0) - \hat{Y}_0] + 2\,\mathrm{cov}[u_0,\, E(Y_0) - \hat{Y}_0] = \sigma^2\left[1 + (1/n) + (X_0 - \bar{X})^2/\textstyle\sum_{i=1}^{n} x_i^2\right]$$
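A quick numerical check of this forecast-error variance formula, using made-up values for $\sigma^2$, $n$, $(X_0 - \bar{X})^2$, and $\sum x_i^2$:

```python
# All numbers below are illustrative, not from the text
sigma2 = 4.0      # var(u)
n = 25            # sample size
dev2 = 25.0       # (X0 - Xbar)^2
sum_x2 = 100.0    # sum of squared deviations of X from its mean

# Variance of the predictor of E(Y0)
var_mean_pred = sigma2 * (1 / n + dev2 / sum_x2)

# Forecast-error variance adds var(u0) = sigma2; the covariance term is zero
var_forecast = sigma2 + var_mean_pred   # = 4 * (1 + 0.04 + 0.25) = 5.16
```

The gap between the two variances is exactly $\sigma^2$, the price paid for predicting an individual outcome rather than its mean.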
This says that the variance of the forecast error is equal to the variance of the predictor of $E(Y_0)$, plus $\mathrm{var}(u_0)$, plus twice the covariance of the predictor of $E(Y_0)$ and $u_0$. But this last covariance is zero, since $u_0$ is a new disturbance and is not correlated with the disturbances in the sample upon which $\hat{Y}_i$ is based. Therefore, the predictor of the average consumption of a $20,000 income household is the same as the predictor of consumption for a specific household whose income is $20,000. The difference is not in the predictor itself but in the variance attached to it: the latter variance is larger, by exactly $\sigma^2$, the variance of $u_0$. The variance of the predictor therefore depends upon $\sigma^2$, the sample size, the variation in the $X$'s, and how far $X_0$ is from the sample mean of the observed data. To summarize, the smaller $\sigma^2$ is, the larger $n$ and $\sum_{i=1}^{n} x_i^2$ are, and the closer $X_0$ is to $\bar{X}$, the smaller is the variance of the predictor. One can construct 95% confidence intervals around these predictions for every value of $X_0$. In fact, this interval is
$$(\hat{\alpha}_{OLS} + \hat{\beta}_{OLS} X_0) \pm t_{.025;\,n-2}\; s\left[1 + (1/n) + (X_0 - \bar{X})^2/\textstyle\sum_{i=1}^{n} x_i^2\right]^{1/2}$$
where $s$ replaces $\sigma$, and $t_{.025;\,n-2}$ represents the 2.5% critical value obtained from a $t$-distribution with $n - 2$ degrees of freedom. Figure 3.5 shows this confidence band around the estimated regression line. This is a hyperbola, narrowest at $\bar{X}$ as expected, and widening as we predict away from $\bar{X}$.
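The interval above can be sketched as follows; the simulated data, the seed, and the evaluation point $X_0$ are assumptions for illustration, and `scipy.stats.t.ppf` supplies the $t_{.025;\,n-2}$ critical value:

```python
import numpy as np
from scipy import stats

# Hypothetical consumption-income sample (illustrative only)
rng = np.random.default_rng(1)
n = 40
X = rng.uniform(10, 30, n)
Y = 5 + 0.8 * X + rng.normal(0, 2, n)

# OLS fit
x = X - X.mean()
beta_hat = (x @ (Y - Y.mean())) / (x @ x)
alpha_hat = Y.mean() - beta_hat * X.mean()
resid = Y - alpha_hat - beta_hat * X
s = np.sqrt((resid @ resid) / (n - 2))   # s replaces sigma

# 95% prediction interval for Y0 at X0
X0 = 20.0
Y0_hat = alpha_hat + beta_hat * X0
se_forecast = s * np.sqrt(1 + 1 / n + (X0 - X.mean()) ** 2 / (x @ x))
t_crit = stats.t.ppf(0.975, n - 2)       # 2.5% critical value, n-2 df
lower = Y0_hat - t_crit * se_forecast
upper = Y0_hat + t_crit * se_forecast
```

Evaluating `se_forecast` over a grid of $X_0$ values and plotting `lower` and `upper` would trace out the hyperbolic band of Figure 3.5, narrowest at $\bar{X}$.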