THE ECONOMETRICS OF MACROECONOMIC MODELLING
Model-based vs. data-based expectations
Apparently, it is often forgotten that the ‘classical’ regression formulation in (4.31) is consistent with the view that behaviour is driven by expectations, albeit not by model-based or rational expectations with unknown parameters that need to be estimated (unless they reside like memes in agents’ minds). To establish the expectations interpretation of (4.31), replace (4.30) by
$y_t^{p} = \beta x_{t+1}^{e}$
and assume that agents solve $\Delta x_{t+1}^{e} = 0$ to obtain $x_{t+1}^{e}$. Substitution of $x_{t+1}^{e} = x_t$, and using (4.17) for $y_t^{p}$, gives (4.31).
The rule $\Delta x_{t+1}^{e} = 0$ is an example of a univariate prediction rule that has no parameters and is instead based directly on data properties. Expectations formed in this way are referred to as data-based expectations; see Hendry (1995b: ch. 6.2.3). Realistically, agents might choose to use data-based predictors because of the cost of information collection and processing associated with model-based predictors. It is true that agents who rely on $\Delta x_{t+1}^{e} = 0$ use a mis-specified model of the $x$-process in (4.25), and thus their forecasts will not attain the minimum mean squared forecast error.9 Hence, in a stationary world there are gains from estimating $a_1$ in (4.25). However, in practice there is no guarantee that the parameters of the $x$-process stay constant over the forecast horizon, and in this non-stationary state of the world a model-based forecast cannot be ranked as better than the forecast derived from the simple rule $\Delta x_{t+1}^{e} = 0$. In fact, depending on the dating of the regime shift relative to the ‘production’ of the forecast, the data-based forecast can be better than the model-based forecast in terms of bias.
In order to see this, we introduce a growth term in (4.25), that is,
$x_t = a_0 + a_1 x_{t-1} + \varepsilon_{x,t}, \quad E[\varepsilon_{x,t} \mid x_{t-1}] = 0$ (4.32)
and assume that there is a shift in $a_0$ (to $a_0^{*}$) in period $T+1$.
9 This is the well-known theorem that the conditional mean of a correctly specified model attains the minimum mean squared forecast error; see Granger and Newbold (1986: ch. 4), Brockwell and Davis (1991: ch. 5.1), or Clements and Hendry (1998: ch. 2.7).
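The gain from using the conditional mean in a stationary world, as stated in the footnote, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the process is taken to be an AR(1) without intercept, $x_t = a_1 x_{t-1} + \varepsilon_{x,t}$, with Gaussian disturbances, and the function name and parameter values (`simulate_msfe`, `a1 = 0.5`, `sigma = 1`) are illustrative choices, not taken from the text.

```python
import random


def simulate_msfe(a1=0.5, sigma=1.0, n=200_000, burn_in=500, seed=0):
    """Return (msfe_model, msfe_naive) for one-step forecasts of a stationary AR(1).

    Assumed process (illustrative): x_t = a1 * x_{t-1} + eps_t, eps_t ~ N(0, sigma^2),
    with constant parameters, i.e. no regime shift.
    """
    rng = random.Random(seed)
    x = 0.0
    for _ in range(burn_in):  # discard start-up effects
        x = a1 * x + rng.gauss(0.0, sigma)

    sq_model = 0.0
    sq_naive = 0.0
    for _ in range(n):
        x_next = a1 * x + rng.gauss(0.0, sigma)
        sq_model += (x_next - a1 * x) ** 2  # model-based: conditional mean with true a1
        sq_naive += (x_next - x) ** 2       # data-based: forecast equals last observation
        x = x_next
    return sq_model / n, sq_naive / n


msfe_model, msfe_naive = simulate_msfe()
print(msfe_model, msfe_naive)
```

Under these assumptions the model-based mean squared forecast error should be close to $\sigma^2$, while that of the data-based rule should be close to $2\sigma^2/(1 + a_1)$, which is larger whenever $|a_1| < 1$. The ranking relies on the parameters staying constant over the forecast period.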
We consider two agents, A and B, who forecast $x_{T+1}$. Agent A collects data for a period $t = 1, 2, 3, \ldots, T$ and is able to discover the true values of $\{a_0, a_1\}$ over that period. However, because of the unpredictable shift $a_0 \to a_0^{*}$ in period $T+1$, A’s forecast error will be
$e_{A,T+1} = a_0^{*} - a_0 + \varepsilon_{x,T+1}$. (4.33)
Agent B, using the data-based forecast $x_{T+1}^{e} = x_T$, will experience a forecast error
$e_{B,T+1} = a_0^{*} - (1 - a_1)x_T + \varepsilon_{x,T+1}$,
which can be expressed as
$e_{B,T+1} = a_0^{*} - a_0 + (1 - a_1)(x^{0} - x_T) + \varepsilon_{x,T+1}$, (4.34)
where $x^{0}$ denotes the (unconditional) mean of $x_T$ (i.e. for the pre-shift intercept $a_0$), $x^{0} = a_0/(1 - a_1)$, so that $a_0 = (1 - a_1)x^{0}$. Comparison of (4.33) and (4.34) shows that the only difference between the two forecast errors is the term $(1 - a_1)(x^{0} - x_T)$ in (4.34). Thus, both forecasts are damaged by a regime shift that occurs after the forecast is made. The conditional means and variances of the two errors are
$E[e_{A,T+1} \mid T] = a_0^{*} - a_0$, (4.35)
$E[e_{B,T+1} \mid T] = a_0^{*} - a_0 + (1 - a_1)(x^{0} - x_T)$, (4.36)
$\mathrm{Var}[e_{A,T+1} \mid T] = \mathrm{Var}[e_{B,T+1} \mid T]$, (4.37)
establishing that in this example of a post-forecast regime shift, there is no ranking of the two forecasting methods in terms of the first two moments of the forecast error. The conditional forecast error variances are identical, and the bias of the model-based forecast is not necessarily smaller than the bias of the naive data-based predictor: assume, for example, that $a_0 > a_0^{*}$; if at the same time $x_T < x^{0}$, the two terms of the data-based bias have opposite signs, so the data-based bias can still be the smaller of the two in absolute value. Moreover, unconditionally, the two predictors have the same bias and variance:
$E[e_{A,T+1}] = E[e_{B,T+1}] = a_0^{*} - a_0$, (4.38)
$\mathrm{Var}[e_{A,T+1}] = \mathrm{Var}[e_{B,T+1}]$. (4.39)
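The conditional biases of the two agents for the post-forecast shift can be evaluated numerically. The sketch below simply codes the two conditional-mean expressions for the errors of A and B; the function names and the parameter values are hypothetical and chosen only for illustration (the intercept falls from 1.0 to 0.5, with $a_1 = 0.5$, so the pre-shift mean is 2.0).

```python
def bias_model_T1(a0, a0_star):
    """Conditional bias of A's forecast of x_{T+1}: the unmodelled intercept shift."""
    return a0_star - a0


def bias_naive_T1(a0, a0_star, a1, x_T):
    """Conditional bias of B's forecast x_T of x_{T+1}."""
    x0 = a0 / (1.0 - a1)  # pre-shift unconditional mean of x
    return a0_star - a0 + (1.0 - a1) * (x0 - x_T)


# Illustrative values (assumptions, not from the text)
a0, a0_star, a1 = 1.0, 0.5, 0.5

bias_A = bias_model_T1(a0, a0_star)                     # -0.5
bias_B_low = bias_naive_T1(a0, a0_star, a1, x_T=1.5)    # -0.25: x_T below the pre-shift mean
bias_B_high = bias_naive_T1(a0, a0_star, a1, x_T=2.5)   # -0.75: x_T above the pre-shift mean

print(bias_A, bias_B_low, bias_B_high)
```

With these numbers the data-based bias is smaller in absolute value than the model-based bias when $x_T$ lies below the pre-shift mean, and larger when $x_T$ lies above it, which is the sense in which the two methods cannot be ranked.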
Next consider the forecasts made for period $T+2$, conditional on $T+1$, as an example of a pre-forecast regime shift ($a_0 \to a_0^{*}$ in period $T+1$). Unless A discovers the shift in $a_0$ and successfully intercept-corrects the forecast, his error-bias will once again be
$E[e_{A,T+2} \mid T+1] = [a_0^{*} - a_0]$. (4.40)
The bias of agent B’s forecast error on the other hand becomes
$E[e_{B,T+2} \mid T+1] = (1 - a_1)(x^{*} - x_{T+1})$, (4.41)
where $x^{*}$ denotes the post-regime-shift unconditional mean of $x$, that is, $x^{*} = a_0^{*}/(1 - a_1)$. Clearly, the bias of the data-based predictor can easily be smaller than the bias of the model-based prediction error (but the opposite can of course also be the case). However,
$E[e_{A,T+2}] = [a_0^{*} - a_0]$,
$E[e_{B,T+2}] = 0$,
and the unconditional forecast errors are always smallest for the data-based prediction in this case of pre-forecast regime shift.
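A simple simulation can make the contrast concrete. The sketch below is an illustration under assumptions that are not in the text: Gaussian disturbances, a process that starts at the pre-shift mean, and agent A never intercept-correcting. It averages one-step forecast errors over a long run of post-shift periods rather than over repeated samples; the function name and parameter values are hypothetical.

```python
import random


def post_shift_mean_errors(a0=1.0, a0_star=2.0, a1=0.5, sigma=1.0,
                           n_post=50_000, seed=1):
    """Average one-step forecast errors of A and B over n_post post-shift periods."""
    rng = random.Random(seed)
    x = a0 / (1.0 - a1)  # start at the pre-shift unconditional mean
    sum_a = 0.0
    sum_b = 0.0
    for _ in range(n_post):
        # After the shift the intercept is a0_star
        x_next = a0_star + a1 * x + rng.gauss(0.0, sigma)
        sum_a += x_next - (a0 + a1 * x)  # A: pre-shift model, no intercept correction
        sum_b += x_next - x              # B: data-based rule, forecast = last observation
        x = x_next
    return sum_a / n_post, sum_b / n_post


mean_err_A, mean_err_B = post_shift_mean_errors()
print(mean_err_A, mean_err_B)
```

In this set-up A's average error stays close to $a_0^{*} - a_0$, whereas B's errors sum to the total change in $x$ over the window, so their average shrinks towards zero as the window lengthens.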
The analysis generalises to the case of a unit root in the $x$-process. In fact, it is seen directly from the above that the data-based forecast errors have even better properties in the case of $a_1 = 1$; for example, $E[e_{B,T+2} \mid T+1] = 0$ in (4.41). More generally, if $x_t$ is I($d$), then solving $\Delta^{d} x_{t+1}^{e} = 0$ to obtain $x_{t+1}^{e}$ will result in forecasts with the same robustness with respect to regime shifts as illustrated in our example; see Hendry (1995a: ch. 6.2.3). This class of predictors belongs to forecasting models that are cast in terms of differences of the original data, that is, differenced vector autoregressions, denoted dVARs. They have a tradition in macroeconomics that goes back at least to the 1970s, then in the form of Box-Jenkins time-series analysis and ARIMA models. A common thread running through many published evaluations of forecasts is that the naive time-series forecasts are often superior to the forecasts of the macroeconometric models under scrutiny (see, for example, Granger and Newbold 1986: ch. 9.4). Why dVARs tend to do so well in forecast competitions is now understood more fully, thanks to the work of, for example, Clements and Hendry (1996, 1998, 1999a). In brief, the explanation is exactly along the lines of our comparison of ‘naive’ and ‘sophisticated’ expectation formation above: the dVAR provides robust forecasts of non-stationary time-series that are subject to intermittent regime shifts. To beat them, the user of an econometric model must regularly take recourse to intercept corrections and other judgemental corrections (see Section 4.6). These issues are also discussed in further detail in Chapter 11.
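The rule obtained by setting the $d$-th difference of the forecast to zero can be written out explicitly by expanding the difference operator with binomial coefficients. The following is a minimal sketch of that calculation; the function name `difference_rule_forecast` and the example series are illustrative assumptions.

```python
from math import comb


def difference_rule_forecast(history, d=1):
    """One-step forecast that sets the d-th difference of the forecast to zero.

    Uses Delta^d x_{t+1} = sum_{k=0}^{d} (-1)^k * C(d, k) * x_{t+1-k};
    setting this to zero and solving for the k = 0 term gives the forecast.
    `history` holds observations in time order, the last element being x_t.
    """
    if d < 1 or len(history) < d:
        raise ValueError("need d >= 1 and at least d observations")
    return -sum((-1) ** k * comb(d, k) * history[-k] for k in range(1, d + 1))


# d = 1: the forecast is the last observation
print(difference_rule_forecast([3, 5, 7], d=1))  # 7
# d = 2: the forecast extrapolates the last change, 2*x_t - x_{t-1}
print(difference_rule_forecast([3, 5, 7], d=2))  # 9
# d = 3: reproduces the next value of a quadratic sequence
print(difference_rule_forecast([1, 4, 9], d=3))  # 16
```

None of these rules involves estimated parameters; each uses only the last $d$ observations.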