THE ECONOMETRICS OF MACROECONOMIC MODELLING

# Identifying partial structure in submodels

Model builders often face demands from model users that are incompatible with a small closed-form model of three to five equations. Hence, modellers often find themselves working with submodels for the different sectors of the economy. It is therefore useful to think in terms of simplifying the joint distribution of all observable variables in the model through sequential factorisation, conditioning, and marginalisation.

2.3.1 The theory of reduction

Consider the joint distribution of $x_t = (x_{1t}, x_{2t}, \ldots, x_{nt})'$, $t = 1, \ldots, T$, and let $X_T^1 = \{x_t\}_{t=1}^T$. Sequential factorisation means that we factorise the joint density function $\mathsf{D}_x(X_T^1 \mid x_0; \lambda_x)$ into

$$
\mathsf{D}_x(X_T^1 \mid x_0; \lambda_x) = \mathsf{D}_x(x_1 \mid x_0; \lambda_x) \prod_{t=2}^{T} \mathsf{D}_x(x_t \mid X_{t-1}^1, x_0; \lambda_x), \tag{2.1}
$$

which is what Spanos (1989) named the Haavelmo distribution. It explains the present $x_t$ as a function of the past $X_{t-1}^1$, initial conditions $x_0$, and a time-invariant parameter vector $\lambda_x$. This is, by assumption, as close as we can get to representing what Hendry (1995a) calls the data generating process (DGP), which requires the error terms, $\varepsilon_t = x_t - \mathsf{E}(x_t \mid X_{t-1}^1, x_0; \lambda_x)$, to be an innovation process. The ensuing approach has been called 'the theory of reduction', as it seeks to explain the origin of empirical models in terms of reduction operations conducted implicitly on the DGP to induce the relevant empirical model (see Hendry and Richard 1982, 1983).

The second step in data reduction is further conditioning and simplification. We consider the partitioning $x_t = (y_t', z_t')'$ and factorise the joint density function into a conditional density function for $y_t \mid z_t$ and a marginal density function for $z_t$:

$$
\mathsf{D}_x(x_t \mid X_{t-1}^1, x_0; \lambda_x) = \mathsf{D}_{y\mid z}(y_t \mid z_t, X_{t-1}^1, x_0; \lambda_{y\mid z}) \cdot \mathsf{D}_z(z_t \mid X_{t-1}^1, x_0; \lambda_z). \tag{2.2}
$$
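As a worked illustration (not from the original text), consider the standard case in which $(y_t, z_t)'$ is bivariate normal given the past: the conditional-marginal factorisation in (2.2) then takes a fully explicit form, with the conditional mean of $y_t$ linear in $z_t$.

```latex
\[
\begin{aligned}
\begin{pmatrix} y_t \\ z_t \end{pmatrix} \Bigm| X_{t-1}^1, x_0
  &\sim \mathsf{N}\!\left( \begin{pmatrix} \mu_y \\ \mu_z \end{pmatrix},
    \begin{pmatrix} \sigma_y^2 & \sigma_{yz} \\ \sigma_{yz} & \sigma_z^2 \end{pmatrix} \right),
\\[4pt]
z_t \mid X_{t-1}^1, x_0 &\sim \mathsf{N}\!\left(\mu_z,\; \sigma_z^2\right)
  \quad \text{(marginal density)},
\\[4pt]
y_t \mid z_t, X_{t-1}^1, x_0 &\sim
  \mathsf{N}\!\left(\mu_y + \frac{\sigma_{yz}}{\sigma_z^2}\,(z_t - \mu_z),\;
  \sigma_y^2 - \frac{\sigma_{yz}^2}{\sigma_z^2}\right)
  \quad \text{(conditional density)},
\end{aligned}
\]
```

and the product of the conditional and marginal densities recovers the joint density, exactly as in (2.2).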

In practice we then simplify by approximating with $k$th-order Markov processes and develop models for

$$
\mathsf{D}_x(x_t \mid X_{t-1}^1, x_0; \lambda_x) \approx \mathsf{D}_x(x_t \mid x_{t-1}, \ldots, x_{t-k}; \theta_x), \tag{2.3}
$$

$$
\mathsf{D}_{y\mid z}(y_t \mid z_t, X_{t-1}^1, x_0; \lambda_{y\mid z}) \approx \mathsf{D}_{y\mid z}(y_t \mid z_t, x_{t-1}, \ldots, x_{t-k}; \theta_{y\mid z}), \tag{2.4}
$$

for $t > k$. The validity of this reduction requires that the residuals remain innovation processes.

A general linear dynamic class of models with a finite number of lags, commonly used to model the $n$-dimensional process $x_t$, is the $k$th-order VAR with Gaussian errors, that is,

$$
x_t = \mu + \sum_{i=1}^{k} \Pi_i x_{t-i} + \varepsilon_t,
$$

where $\varepsilon_t$ is normal, independent and identically distributed, $\mathsf{N.i.i.d.}(0, \Lambda_\varepsilon)$. A VAR model is also the starting point for analysing the cointegrating relationships that may be identified in the $x_t$-vector (see Johansen 1988, 1991, 1995b). Economic theory helps in determining which information sets to study and in interpreting the outcome of the analysis. In the following, we assume for simplicity that the elements of $x_t$ are non-stationary I(1) variables that become stationary after being differenced once. Then, if there is cointegration, it is shown in Engle and Granger (1987) that the VAR system always has a vector equilibrium-correcting model (VEqCM) representation, which can be written in differences and levels (disregarding the possible presence of deterministic variables like trends) in the following way:
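To make the VAR notation concrete, the following sketch simulates a bivariate VAR(1), that is, $k = 1$; the intercept vector, coefficient matrix, and error standard deviation are illustrative choices, not values from the text.

```python
import random

random.seed(42)

# Illustrative bivariate VAR(1): x_t = mu + Pi_1 x_{t-1} + eps_t
mu = [0.1, 0.2]                # intercept vector (illustrative)
Pi1 = [[0.5, 0.1],             # coefficient matrix Pi_1 (illustrative; its
       [0.2, 0.4]]             # eigenvalues lie inside the unit circle)
sigma = 1.0                    # st. dev. of the (here uncorrelated) Gaussian errors
T = 200

x = [[0.0, 0.0]]               # initial condition x_0
for t in range(T):
    prev = x[-1]
    eps = [random.gauss(0.0, sigma) for _ in range(2)]
    new = [mu[i] + sum(Pi1[i][j] * prev[j] for j in range(2)) + eps[i]
           for i in range(2)]
    x.append(new)

print(len(x) - 1)              # number of simulated periods
```

Because the eigenvalues of $\Pi_1$ are inside the unit circle, the simulated process is stationary; with unit roots it would instead be I(1), the case considered next.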

$$
\Delta x_t = \sum_{i=1}^{k-1} A_i \Delta x_{t-i} + \alpha(\beta' x_{t-1}) + \varepsilon_t, \tag{2.5}
$$

where $\alpha$ and $\beta$ are $n \times r$ matrices of rank $r$ ($r < n$) and $\beta' x_{t-1}$ comprises $r$ cointegrating I(0) relationships. Cointegrated processes thus define a long-run equilibrium trajectory, and departures from it induce 'equilibrium correction', which moves the economy back towards its steady-state path. These models are useful because they often lend themselves to an economic interpretation of model properties, and their long-run (steady-state) properties may be interpreted as long-run equilibria between economic variables derived from economic theory. Theoretical consistency, that is, that the model contains identifiable structures that are interpretable in the light of economic theory, is but one criterion for a satisfactory representation of the economy.
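The equilibrium-correction mechanism in (2.5) can be illustrated with a deterministic sketch (errors suppressed, no short-run dynamics): with one hypothetical cointegrating vector $\beta = (1, -1)'$ and illustrative adjustment coefficients $\alpha$, the disequilibrium $\beta' x_t$ decays geometrically towards zero.

```python
# Deterministic sketch of equilibrium correction (errors set to zero):
# dx_t = alpha * (beta' x_{t-1}), with beta = (1, -1)' and illustrative alpha.
alpha = [-0.5, 0.2]          # adjustment coefficients (illustrative)
x = [10.0, 0.0]              # start far from the long-run relation x1 = x2
deviations = []
for t in range(10):
    d = x[0] - x[1]          # disequilibrium beta' x_{t-1}
    x = [x[0] + alpha[0] * d, x[1] + alpha[1] * d]
    deviations.append(x[0] - x[1])
print(deviations[-1])        # disequilibrium after 10 periods: close to zero
```

Each period the gap shrinks by the factor $1 + \alpha_1 - \alpha_2 = 0.3$, so the system is pulled back towards its steady-state path, which is exactly the 'equilibrium correction' described above.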

2.3.2 Congruence

If one considers all the reduction operations involved in the process of going from the hypothetical DGP to an empirical model, it is evident that any econometric model is unlikely to coincide with the DGP. An econometric model may, however, possess certain desirable properties which render it a valid representation of the DGP. According to the LSE methodology (see Mizon 1995 and Hendry 1995a), such a model should satisfy the following six criteria:

1. The model contains identifiable structures that are interpretable in the light of economic theory.

2. The errors should be homoscedastic innovations in order for the model to be a valid simplification of the DGP.

3. The model must be data admissible on accurate observations.

4. The conditioning variables must be (at least) weakly exogenous for the parameters of interest in the model.

5. The parameters must be constant over time and remain invariant to certain classes of interventions (depending on the purpose for which the model is to be used).

6. The model should be able to encompass rival models. A model $M_i$ encompasses other models ($M_j$, $j \neq i$) if it can explain the results obtained by the other models.

Models that satisfy the first five criteria are said to be congruent, whereas an encompassing congruent model satisfies all six. Below, we comment on each of the requirements.

Economic theory (item 1) is a main source of guidance in the formulation of econometric models. A clear interpretation also aids communication of ideas and results among researchers, and it structures the debate about economic issues. However, since economic theories are necessarily abstract and built on simplifying assumptions, a direct translation of a theoretical relationship into an econometric model will generally not produce a satisfactory model. Notwithstanding their structural interpretation, such models will lack structural properties.

There is an important distinction between seeing theory as representing the correct specification (leaving parameter estimation to the econometrician), and viewing theory as a guideline in the specification of a model which also accommodates institutional features, attempts to accommodate heterogeneity among agents, addresses the temporal aspects of the data set, and so on (see, for example, Granger 1999). Likewise, there is a huge methodological difference between a procedure of sequential simplification while controlling for innovation errors, as in Section 2.3.1, and the practice of adopting an axiom of a priori correct specification which by assumption implies white-noise errors.

Homoscedastic innovation errors (item 2) mean that the residuals cannot be predicted from the model's own information set; hence they are innovations relative to that set. This property follows logically from the reduction process, and it is a necessary requirement for the empirical model to be one that is derived from the DGP. If the errors lack this property, for example if they are not white noise, some regularity in the data has not yet been captured in the specification.

The requirement that the model must be data admissible (item 3) entails that the model must not produce predictions that are not logically possible. For example, if the data to be explained are proportions, the model should force all outcomes into the zero to one range.
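A minimal sketch of data admissibility for proportions data: specify the model in log-odds space and map predictions back through the logistic function, so that every prediction is forced into the (0, 1) range by construction. The function name and the linear-predictor values below are purely illustrative.

```python
import math

def predict_proportion(linear_predictor: float) -> float:
    """Map an unbounded linear predictor to (0, 1) via the logistic function."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Even extreme linear predictors yield logically possible proportions.
for eta in (-30.0, -1.0, 0.0, 2.5, 30.0):
    p = predict_proportion(eta)
    assert 0.0 < p < 1.0
print(predict_proportion(0.0))  # → 0.5
```

A linear model fitted directly to the proportions, by contrast, could predict values below zero or above one, and would therefore not be data admissible.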

Criterion 4 (weak exogeneity) holds if the parameters of interest are functions of $\theta_{y\mid z}$ (see (2.4)), which vary independently of $\theta_x$ (see equation (2.3); see Engle et al. 1983 for a formal definition). This property relates to estimation efficiency: weak exogeneity of the conditioning variables $z_t$ is required for estimation of the conditional model for $y_t$ without loss of information relative to estimation of the joint model for $y_t$ and $z_t$. In order to make conditional forecasts from the conditional model without loss of information, strong exogeneity is required. This is defined as the joint occurrence of weak exogeneity and Granger noncausality, that is, absence of feedback from $y_t$ to $z_t$: the conditioning set $X_{t-1}^1$ in the marginal density function for $z_t$, $\mathsf{D}_z(z_t \mid X_{t-1}^1, x_0; \lambda_z)$ in equation (2.2), must not include lagged values of $y_t$.

Item 5 in the list is spelt out in greater detail in Hendry (1995a: pp. 33-4), where he gives a formal and concise definition. He defines structure as the set of basic permanent features of the economic mechanism. A vector of parameters defines a structure if it is invariant and directly characterises the relations under analysis, that is, it is not derived from more basic parameters. A parameter can be structural only if it is

• constant, and so invariant to an extension of the sample period;

• unaltered by changes elsewhere in the economy, and so invariant to regime shifts, etc.;

• unchanged by extensions of the information set, and so invariant to adding more variables to the analysis.

This invariance property is of particular importance for a progressive research programme: ideally, empirical modelling is a cumulative process in which models are continually superseded by new and more useful ones. By useful models, we mean models that possess structural properties (items 1-5), in particular models that are relatively invariant to changes elsewhere in the economy, that is, models that contain autonomous parameters (see Frisch 1938, Haavelmo 1944, Johansen 1977, and Aldrich 1989). Models with a high degree of autonomy represent structure: they remain invariant to changes in economic policies and other shocks to the economic system, as implied by the definition above.

However, structure is partial in two respects: first, autonomy is a relative concept, since an econometric model cannot be invariant to every imaginable shock; second, all parameters of an econometric model are unlikely to be equally invariant. Parameters with the highest degree of autonomy represent partial structure (see Hendry 1993b, 1995b). Examples are elements of the $\beta$-vector in a cointegrating equation, which are often found to represent partial structure, as documented by Ericsson and Irons (1994). Finally, even though submodels are unlikely to contain partial structure to the same degree, it seems plausible that highly aggregated models are less autonomous than the submodels, simply because the submodels can build on a richer information set.

Data congruence, that is, the ability to characterise the data, remains an essential quality of useful econometric models (see Granger 1999 and Hendry 2002). In line with this, our research strategy is to check any hypothesised general model, chosen as the starting point of a specification search, for data congruence, and to decide on a final model after a general-to-specific (Gets) specification search. Thanks to recent advances in the theory and practice of data-based model building, we know that by using Gets algorithms a researcher stands a good chance of finding a close approximation to the data generating process (see Hoover and Perez 1999 and Hendry and Krolzig 1999), and that the danger of overfitting is in fact surprisingly low.[9]
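As a toy sketch of one ingredient of a general-to-specific search (a deliberate simplification of the Gets algorithms of Hoover and Perez 1999 and Hendry and Krolzig 1999, which also test congruence at each reduction step and explore multiple deletion paths), the pure-Python example below starts from a general model with four simulated regressors, only two of which enter the hypothetical DGP, and repeatedly deletes the least significant regressor until every remaining one is significant.

```python
import math
import random

random.seed(1)

# --- simulate data: y depends on x1 and x2 only (hypothetical DGP) ---
n = 200
X_all = {name: [random.gauss(0.0, 1.0) for _ in range(n)]
         for name in ("x1", "x2", "x3", "x4")}
y = [1.5 * X_all["x1"][i] - 1.0 * X_all["x2"][i] + random.gauss(0.0, 0.5)
     for i in range(n)]

def invert(M):
    """Invert a small matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(M)
    A = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
         for i, row in enumerate(M)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        div = A[col][col]
        A[col] = [v / div for v in A[col]]
        for r in range(k):
            if r != col and A[r][col] != 0.0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[k:] for row in A]

def ols_t_stats(names):
    """OLS of y on a constant plus the named regressors; return {name: t-stat}."""
    cols = [[1.0] * n] + [X_all[nm] for nm in names]
    k = len(cols)
    XtX = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    Xty = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    inv = invert(XtX)
    beta = [sum(inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]
    resid = [y[i] - sum(beta[j] * cols[j][i] for j in range(k)) for i in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)
    return {nm: beta[j + 1] / math.sqrt(s2 * inv[j + 1][j + 1])
            for j, nm in enumerate(names)}

# --- backward elimination: drop the least significant regressor until all |t| >= 2
selected = ["x1", "x2", "x3", "x4"]
while selected:
    tstats = ols_t_stats(selected)
    weakest = min(selected, key=lambda nm: abs(tstats[nm]))
    if abs(tstats[weakest]) >= 2.0:
        break
    selected.remove(weakest)

print(sorted(selected))
```

The relevant regressors x1 and x2 survive the search because their t-statistics are very large under this simulated DGP; the irrelevant ones are usually, though not always, deleted, which is the over-fitting risk the text notes is surprisingly low for full Gets algorithms.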

A congruent model is not necessarily a true model. Hendry (1995a: ch. 4) shows that an innovation is relative to its information set and may still be predictable from other information. Hence, a sequence of congruent models can be developed, each encompassing all previous models. Satisfying all six criteria therefore provides a recipe for a progressive research strategy. Congruency, and its absence, can be tested against available information; hence, unlike truth, it is an operational concept in an empirical science (see Bontemps and Mizon 2003).

Finally, it should be noted that a strategy that puts a lot of emphasis on forecast behaviour, without a careful ex post evaluation of the causes of forecast failure, runs the risk of discarding models that actually contain important elements of structure. For example, Doornik and Hendry (1997a) and Clements and Hendry (1999a: ch. 3) show that the main source of forecast failure is deterministic shifts in means (e.g. the equilibrium savings rate), and not shifts in such coefficients (e.g. the propensity to consume) as are of primary concern in policy analysis. Structural breaks are a main concern in econometric modelling, but like any hypothesis of theory, the only way to judge the quality of a hypothesised break is by confrontation with the evidence in the data. Moreover, given that an encompassing approach is followed, a forecast failure is not merely destructive but represents a potential for improvement, since respecification follows in its wake: see Section 2.4.2.