Some First-Generation RCMs
When considering situations in which the parameters of a regression model are thought to change - perhaps as frequently as every observation - the parameter variation must be given structure to make the problem tractable. A main identifying characteristic of first-generation RCMs is that they are concerned with providing a structure to the process generating the coefficients. In other words, this class of models seeks to account for the process generating the coefficients of the regression model, but does not address specification issues related to functional form, omitted variables, and measurement errors.2
To explain, consider the following model:
$m_t = x_{t1}\beta_{t1} + \sum_{j=2}^{K} x_{tj}\beta_{tj} = x_t'\beta_t \quad (t = 1, 2, \ldots, T),$   (19.1)
where $m_t$ is the logarithm of real money balances (i.e. a measure of the money supply divided by a price-level variable); $x_t'$ is a row vector of K elements having the jth explanatory variable $x_{tj}$ as its jth element; $x_{t1} = 1$ for all t; the remaining $x_{tj}$ (j = 2, ..., K) are the variables thought to influence the demand for real money balances (such as the logarithms of real income and interest rates); $\beta_t$ is a column vector of K elements having the jth coefficient $\beta_{tj}$ as its jth element; the first coefficient $\beta_{t1}$ combines the usual disturbance term and intercept; and t indexes time series or cross section observations.
Since the model assumes that $\beta_t$ is changing, we have to say something about how it is changing. That is, we have to introduce some structure into the process thought to be determining the coefficients. One simple specification is
$\beta_t = \bar{\beta} + \varepsilon_t,$   (19.2)
where $\bar{\beta} = (\bar{\beta}_1, \ldots, \bar{\beta}_K)'$ is the mean vector and the $\varepsilon_t = (\varepsilon_{t1}, \ldots, \varepsilon_{tK})'$ are disturbance vectors that are identically and independently distributed with $E(\varepsilon_t) = 0$. Basically, this equation says that the variations in all the coefficients of equation (19.1) are random unless shown otherwise by the real-world sources and interpretations of $\beta_t$, and that at any particular time period or for any one individual these coefficients differ from their means.3 Assume further that the random components $\varepsilon_{t1}, \ldots, \varepsilon_{tK}$ are uncorrelated,
$E(\varepsilon_t\varepsilon_t') = \mathrm{diag}[\sigma_{\varepsilon 1}^2, \sigma_{\varepsilon 2}^2, \ldots, \sigma_{\varepsilon K}^2]$   (19.3)
which is a K × K diagonal matrix whose ith diagonal element is $\sigma_{\varepsilon i}^2$. This assumption stipulates that the random component of one coefficient is uncorrelated with that of another. We also assume that the $x_t$ are independent of the $\beta_t$.4 Correlations among the elements of $\varepsilon_t$ will be introduced at the end of the section.
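To make the setup concrete, the following is a minimal simulation sketch of the data-generating process in equations (19.1)-(19.3). The sample size, the value of K, and all parameter values below are illustrative assumptions, not part of the model itself.

```python
import numpy as np

# Minimal sketch: simulate data from equations (19.1)-(19.3).
# All parameter values are hypothetical choices for demonstration.
rng = np.random.default_rng(0)
T, K = 200, 3

beta_bar = np.array([1.0, 0.5, -0.3])      # coefficient means (equation (19.2))
sigma2_eps = np.array([0.25, 0.04, 0.01])  # diagonal of (19.3): var(eps_tj)

# Regressors: x_t1 = 1 for all t; the rest are drawn independently of beta_t.
x = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])

# Random coefficients: beta_t = beta_bar + eps_t, iid with diagonal covariance.
eps = rng.normal(size=(T, K)) * np.sqrt(sigma2_eps)
beta = beta_bar + eps

# Equation (19.1): m_t = x_t' beta_t.
m = np.sum(x * beta, axis=1)
```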
Since we have postulated that all the coefficients of equation (19.1) vary over time or across individuals according to equation (19.2), an issue that arises is how to test whether equations (19.1) and (19.2) are true. In order to answer this question, first substitute equation (19.2) into equation (19.1) to obtain
$m_t = x_t'\bar{\beta} + \sum_{j=1}^{K} x_{tj}\varepsilon_{tj}.$   (19.4)
Next, let $w_t$ denote the combined disturbance (i.e. the sum of the products of $x_{tj}$ and $\varepsilon_{tj}$), $w_t = \sum_{j=1}^{K} x_{tj}\varepsilon_{tj}$, so that $m_t = x_t'\bar{\beta} + w_t$, where $E(w_t\,|\,x_t) = 0$. Moreover, the conditional variance of the combined disturbance term of equation (19.4) is
$E(w_t^2\,|\,x_t) = x_{t1}^2\sigma_{\varepsilon 1}^2 + \ldots + x_{tK}^2\sigma_{\varepsilon K}^2,$   (19.5)
since $w_t$ is a linear combination of uncorrelated random variables and the $x_t$ are independent of the $\varepsilon_t$. For $t \neq s$, the covariance between $w_t$ and $w_s$ is zero. Let $v_t = w_t^2 - E(w_t^2\,|\,x_t)$, where $E(w_t^2\,|\,x_t)$ is given in (19.5). It is straightforward to show that the conditional mean of $v_t$ given $x_t$ is zero.
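To spell out the step behind (19.5), expand the square and use assumption (19.3):

$E(w_t^2\,|\,x_t) = E\Big[\Big(\sum_{j=1}^{K} x_{tj}\varepsilon_{tj}\Big)^2 \,\Big|\, x_t\Big] = \sum_{j=1}^{K}\sum_{k=1}^{K} x_{tj}x_{tk}E(\varepsilon_{tj}\varepsilon_{tk}) = \sum_{j=1}^{K} x_{tj}^2\sigma_{\varepsilon j}^2,$

since $E(\varepsilon_{tj}\varepsilon_{tk}) = 0$ for $j \neq k$.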
The definition of $v_t$ can be rearranged to form a regression as

$w_t^2 = E(w_t^2\,|\,x_t) + v_t,$   (19.6)

where $E(v_t\,|\,x_t)$, as pointed out, is zero. Accordingly, using equations (19.5) and (19.6), $w_t^2$ can be expressed as $w_t^2 = \sum_{j=1}^{K} x_{tj}^2\sigma_{\varepsilon j}^2 + v_t$, where for $t \neq s$, $v_t$ and $v_s$ are uncorrelated if the $\varepsilon_{tj}$ are independent.
If our goal is to test whether equation (19.1) is the same as the fixed-coefficient model of the conventional type, then our null hypothesis is that the random components of all the coefficients on $x_{t2}, \ldots, x_{tK}$ are zero with probability 1. A complete statement of this hypothesis is
H0: For i, j = 1, ..., K; t, s = 1, ..., T, $E(\beta_{ti}) = \bar{\beta}_i$ and

$E[(\beta_{ti} - \bar{\beta}_i)(\beta_{sj} - \bar{\beta}_j)] = \begin{cases} \sigma_{\varepsilon 1}^2 > 0 & \text{if } i = j = 1 \text{ and } t = s \\ 0 & \text{otherwise} \end{cases}$

such that $E(\varepsilon_{t1}\,|\,x_t) = E(\varepsilon_{t1}) = 0$ and $\varepsilon_{t1}$ is normally distributed.
There are several alternatives to this hypothesis. One of them, provided by equations (19.1)-(19.3), is
H1: For i, j = 1, ..., K; t, s = 1, ..., T, $E(\beta_{ti}) = \bar{\beta}_i$ and

$E[(\beta_{ti} - \bar{\beta}_i)(\beta_{sj} - \bar{\beta}_j)] = \begin{cases} \sigma_{\varepsilon i}^2 > 0 & \text{if } i = j \text{ and } t = s \\ 0 & \text{otherwise} \end{cases}$

such that $E(\varepsilon_{ti}\,|\,x_t) = E(\varepsilon_{ti}) = 0$ and $\varepsilon_{ti}$ is normally distributed.
The following steps lead to a test of H0 against H1: (i) obtain the ordinary least squares (OLS) estimate of $\bar{\beta}$, denoted by $b_{OLS}$, by running the classical least squares regression of $m_t$ on $x_t$; (ii) in equation (19.4), replace $\bar{\beta}$ by $b_{OLS}$ to calculate $\hat{w}_t = m_t - x_t'b_{OLS}$; (iii) square $\hat{w}_t$ and run the classical least squares regression of $\hat{w}_t^2$ on $x_{t1}^2, \ldots, x_{tK}^2$;5 the sum of squares of the residuals of this regression gives an unrestricted error sum of squares (ESSU); and (iv) regress $\hat{w}_t^2$ only on $x_{t1}^2$; the sum of squares of the residuals of this regression gives a restricted error sum of squares (ESSR) because all the $\sigma_{\varepsilon j}^2$, j = 2, ..., K, are assumed to be zero (i.e. it is restricted because it imposes the restrictions implied by H0). Let $q = (T/(K - 1))(\mathrm{ESSR} - \mathrm{ESSU})/\mathrm{ESSU}$ be a test statistic. Reject H0 if the value of $q$ obtained in a particular sample is greater than some critical value c, and do not reject H0 otherwise.
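Steps (i)-(iv) can be carried out with standard least squares routines. The following is a minimal sketch in Python, assuming the data are held in NumPy arrays; the function name and interface are illustrative, not from the original literature.

```python
import numpy as np

def rcm_variation_test(m, x):
    """Sketch of steps (i)-(iv): returns q = (T/(K-1)) * (ESSR - ESSU) / ESSU.

    m : (T,) dependent variable; x : (T, K) regressors, first column all ones.
    """
    T, K = x.shape

    # (i) OLS estimate of beta_bar.
    b_ols = np.linalg.lstsq(x, m, rcond=None)[0]

    # (ii) residuals w_hat_t = m_t - x_t' b_OLS, then (iii) square them.
    w2 = (m - x @ b_ols) ** 2

    # (iii) unrestricted regression of w_hat_t^2 on x_t1^2, ..., x_tK^2.
    xsq = x ** 2
    resid_u = w2 - xsq @ np.linalg.lstsq(xsq, w2, rcond=None)[0]
    ess_u = resid_u @ resid_u

    # (iv) restricted regression of w_hat_t^2 on x_t1^2 alone
    # (H0 sets sigma^2_eps_j = 0 for j = 2, ..., K).
    resid_r = w2 - xsq[:, :1] @ np.linalg.lstsq(xsq[:, :1], w2, rcond=None)[0]
    ess_r = resid_r @ resid_r

    return (T / (K - 1)) * (ess_r - ess_u) / ess_u
```

Comparing the returned value with a chosen critical value c completes the test; as the next paragraph stresses, however, the sampling distribution needed to choose c is unknown.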
Although the foregoing testing methodology is characteristic of the first-generation literature, this test is misleading under the usual circumstances. The probabilities of false rejection of H0 (Type I error) and false acceptance of H0 (Type II error) associated with this test are unknown because both the exact finite sample and asymptotic distributions of $q$ when H0 and H1 are true are unknown. All that is known is that these probabilities are positive and less than 1. This is all we need to show that the following argument is valid: rejection of H0 is not proof that H0 is false, and acceptance of H0 is not proof that H0 is true (Goldberger, 1991, p. 215). However, the occurrence of a real-world event does constitute strong evidence against H0 if that event has a small probability of occurring whenever H0 is true and a high probability of occurring whenever H1 is true. This is an attractive definition of evidence, but finding such strong evidence is no easy task. To explain, we use the above test procedure. Note that there is no guarantee that either H0 or H1 is true. When both H0 and H1 are false, observing a value of $q$ that lies in the critical region $\{q > c\}$ is not equivalent to observing a real-world event, because the critical region whose probability is calculated under H0 (or H1) to evaluate the Type I error (or 1 − Type II error) probability is not the event of observing the actual data. Both H0 and H1 are false when, for example, the $x_t$ are correlated with the $\beta_t$ or when $\varepsilon_t$ is not normal. We show in the next section that we need to assume that the $x_t$ are correlated with the $\beta_t$ if we want to make assumptions that are consistent with the real-world interpretations of the coefficients of equation (19.1). If our assumptions are inconsistent, then both H0 and H1 are false. In that event, finding a sample value of $q$ greater than c should not be taken as strong evidence against H0 and for H1, since the probabilities of the critical region $\{q > c\}$ calculated under H0 and H1 are incorrect. More generally, a test of a false null hypothesis against a false alternative hypothesis either rejects the false null hypothesis in favor of the false alternative hypothesis or accepts the false null hypothesis and rejects the false alternative hypothesis. Such tests continue to trap the unwary. We can never guarantee that one of H0 and H1 is true, particularly when the assumptions under which the test statistic $q$ is derived are inconsistent with the real-world interpretations of the coefficients of equation (19.1). We explain in the next section why such inconsistencies arise.6
At the very minimum, the above argument should indicate how difficult it is to produce strong evidence against hypotheses of our interest. For this reason, de Finetti (1974a, p. 128) says, "accept or reject is the unhappy formulation which I consider as the principal cause of the fogginess widespread all over the field of statistical inference and general reasoning." It is not possible to find useful approximations to reality by testing one false hypothesis against another false hypothesis. As discussed below, second-generation RCMs stress the importance of finding sufficient and logically consistent explanations of real phenomena (see, e. g., Zellner, 1988, p. 8) because of the limits to the usefulness of hypothesis testing.
In order to estimate the foregoing model, note that the error structure embedded in equation (19.4) is heteroskedastic. Specifically, the variance of the error at each sample point is a linear combination of the squares of the explanatory variables at that point (equation (19.5)). This suggests that this RCM can be estimated using a feasible generalized least squares estimation procedure that accounts for the heteroskedastic nature of the error process (Judge et al., 1985, pp. 808-9).
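A minimal sketch of such a feasible GLS step is given below, with the variance weights estimated from the auxiliary regression (19.6). The function name is illustrative, and the clipping of the fitted variances away from zero is an ad hoc illustrative safeguard, since the auxiliary regression can yield negative fitted values in finite samples.

```python
import numpy as np

def rcm_fgls(m, x):
    """Sketch of feasible GLS for (19.4), with error variances given by (19.5)."""
    T, K = x.shape

    # First-stage OLS and squared residuals.
    b_ols = np.linalg.lstsq(x, m, rcond=None)[0]
    w2 = (m - x @ b_ols) ** 2

    # Estimate sigma^2_eps_j from the auxiliary regression (19.6).
    xsq = x ** 2
    s2 = np.linalg.lstsq(xsq, w2, rcond=None)[0]

    # Fitted error variances at each sample point (equation (19.5)),
    # clipped away from zero as an illustrative safeguard.
    var_w = np.clip(xsq @ s2, 1e-8, None)

    # Reweight the original regression by 1/sqrt(var_w) and re-run OLS.
    wgt = 1.0 / np.sqrt(var_w)
    return np.linalg.lstsq(x * wgt[:, None], m * wgt, rcond=None)[0]
```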
In the case where t indexes time, a natural extension of this class of models is to incorporate serial correlation in the process determining the coefficients as follows: For t = 1, 2,..., T,
(a) $m_t = x_t'\beta_t$,   (b) $\beta_t = \bar{\beta} + \varepsilon_t$,   (c) $\varepsilon_t = \Phi\varepsilon_{t-1} + a_t$.   (19.7)
This model differs from the previous model because it assumes that the process generating the coefficients is autoregressive, where $\Phi$ is a K × K matrix with eigenvalues less than 1 in absolute value and $a_t$ is a vector of white noise disturbances.
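The following simulation sketch of the extended process (19.7) uses hypothetical parameter values, with $\Phi$ chosen so that its eigenvalues lie inside the unit circle.

```python
import numpy as np

# Hypothetical simulation of equation (19.7): the random components of the
# coefficients follow a first-order vector autoregression.
rng = np.random.default_rng(1)
T, K = 200, 3

beta_bar = np.array([1.0, 0.5, -0.3])
phi = np.diag([0.8, 0.5, 0.3])        # Phi: eigenvalues less than 1 in modulus
sigma_a = np.array([0.2, 0.1, 0.05])  # std. deviations of the innovations a_t

x = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
eps = np.zeros(K)
m = np.empty(T)
for t in range(T):
    eps = phi @ eps + rng.normal(size=K) * sigma_a  # (19.7c)
    m[t] = x[t] @ (beta_bar + eps)                  # (19.7a)-(19.7b)
```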
The discussion above has presented the basic building blocks of first-generation RCMs. These models have been extended in a number of directions. For example, the stochastic structure determining the coefficients can be made to vary as a function of some observable variables. A main difficulty in using such models, however, is that one must specify the structure explaining coefficient variations, and errors in this specification lead to the familiar and unfortunate consequences of misspecification. Further, the structures are usually specified in a mechanical manner. Accordingly, we now discuss the class of second-generation RCMs, which directly confront these and other specification errors, and which can be shown to include the first-generation RCMs and several well-known fixed-coefficient models as special cases (Swamy and Tavlas, 1995).