A COMPANION TO Theoretical Econometrics
The Cox procedure
This procedure focuses on the loglikelihood ratio statistic which, for the above regression models, is given by (using the notation of Section 2):
$$
LR_{fg} = L_f(\hat{\theta}_T) - L_g(\hat{\gamma}_T) = \frac{T}{2}\ln\left(\frac{\hat{\omega}_T^2}{\hat{\sigma}_T^2}\right),
$$
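where
$$
\hat{\sigma}_T^2 = T^{-1} e_f' e_f, \quad \hat{\alpha}_T = (X'X)^{-1}X'y,
$$
$$
e_f = y - X\hat{\alpha}_T = M_x y, \quad M_x = I_T - X(X'X)^{-1}X', \tag{13.26}
$$
and
$$
\hat{\omega}_T^2 = T^{-1} e_g' e_g, \quad \hat{\beta}_T = (Z'Z)^{-1}Z'y,
$$
$$
e_g = y - Z\hat{\beta}_T = M_z y, \quad M_z = I_T - Z(Z'Z)^{-1}Z'. \tag{13.27}
$$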
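As a minimal numerical sketch of these definitions, the following computes the two residual variances and the loglikelihood ratio with NumPy. The function names, the simulated data, and the parameter values are illustrative assumptions, not part of the text.

```python
import numpy as np

def ols_residual_variance(y, W):
    """Regress y on W by least squares; return e'e / T (no degrees-of-freedom correction)."""
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    resid = y - W @ coef
    return resid @ resid / len(y)

def loglik_ratio(y, X, Z):
    """LR_fg = (T/2) * ln(omega2_hat / sigma2_hat)."""
    sigma2_hat = ols_residual_variance(y, X)  # from H_f: y regressed on X
    omega2_hat = ols_residual_variance(y, Z)  # from H_g: y regressed on Z
    return 0.5 * len(y) * np.log(omega2_hat / sigma2_hat)

# Hypothetical data generated under H_f, with Z correlated with X.
rng = np.random.default_rng(0)
T = 2000
X = rng.normal(size=(T, 2))
Z = 0.7 * X + 0.7 * rng.normal(size=(T, 2))
y = X @ np.array([1.0, -0.5]) + rng.normal(size=T)

lr_fg = loglik_ratio(y, X, Z)
lr_gf = loglik_ratio(y, Z, X)  # roles of the two models reversed
print(lr_fg / T, lr_gf / T)    # average loglikelihood ratios
```

Reversing the roles of the two models only changes the sign of the statistic, so `lr_gf` equals `-lr_fg` up to rounding.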
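In the general case where the regression models are nonnested, the average loglikelihood ratio statistic, $T^{-1}LR_{fg} = \frac{1}{2}\ln(\hat{\omega}_T^2/\hat{\sigma}_T^2)$, does not converge to zero however large $T$ is. For example, under $H_f$ we have
$$
\operatorname*{plim}_{T\to\infty}\left(T^{-1}LR_{fg} \mid H_f\right) = \frac{1}{2}\ln\left(\frac{\omega_*^2(\theta_0)}{\sigma_0^2}\right) = \frac{1}{2}\ln\left(1 + \frac{\alpha_0'\Sigma_f\alpha_0}{\sigma_0^2}\right),
$$
and under $H_g$:
$$
\operatorname*{plim}_{T\to\infty}\left(T^{-1}LR_{fg} \mid H_g\right) = \frac{1}{2}\ln\left(\frac{\omega_0^2}{\sigma_*^2(\gamma_0)}\right) = -\frac{1}{2}\ln\left(1 + \frac{\beta_0'\Sigma_g\beta_0}{\omega_0^2}\right).
$$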
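The limit under $H_f$ can be checked in a few lines. The following is a sketch rather than the text's own derivation. It assumes that $y = X\alpha_0 + u_f$ with error variance $\sigma_0^2$ under $H_f$, that $\Sigma_f = \operatorname{plim}_{T\to\infty} T^{-1}X'M_zX$ exists, and that $T^{-1}X'M_z u_f$ converges in probability to zero.

```latex
\begin{align*}
\hat{\omega}_T^2 &= T^{-1} y' M_z y
  = T^{-1}\alpha_0' X' M_z X \alpha_0 + 2T^{-1}\alpha_0' X' M_z u_f + T^{-1} u_f' M_z u_f
  \xrightarrow{p} \alpha_0'\Sigma_f\alpha_0 + \sigma_0^2, \\
\hat{\sigma}_T^2 &= T^{-1} y' M_x y = T^{-1} u_f' M_x u_f \xrightarrow{p} \sigma_0^2, \\
T^{-1} LR_{fg} &= \tfrac{1}{2}\ln\!\left(\hat{\omega}_T^2 / \hat{\sigma}_T^2\right)
  \xrightarrow{p} \tfrac{1}{2}\ln\!\left(1 + \frac{\alpha_0'\Sigma_f\alpha_0}{\sigma_0^2}\right).
\end{align*}
```

The limit is zero only when $\alpha_0'\Sigma_f\alpha_0 = 0$. The case under $H_g$ follows by interchanging the roles of $X$ and $Z$, which also reverses the sign of the limit of $T^{-1}LR_{fg}$.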
The LR statistic is naturally centered at zero if one or the other of the above probability limits is equal to zero, namely if either $\Sigma_f = 0$ or $\Sigma_g = 0$.¹⁴ When $\Sigma_f = 0$, $X \subset Z$ and $H_f$ is nested in $H_g$. Alternatively, if $\Sigma_g = 0$, then $Z \subset X$ and $H_g$ is nested in $H_f$. Finally, if both $\Sigma_f = 0$ and $\Sigma_g = 0$, the two regression models are observationally equivalent. In the nonnested case, where both $\Sigma_f \neq 0$ and $\Sigma_g \neq 0$, the standard LR statistic is not applicable and needs to be properly centered. Cox's contribution was to note that this problem can be overcome if a consistent estimate of $\operatorname{plim}_{T\to\infty}(T^{-1}LR_{fg} \mid H_f)$, which we denote by $\hat{E}_f(T^{-1}LR_{fg})$, is subtracted from $T^{-1}LR_{fg}$. This yields the new centered (modified) loglikelihood ratio statistic (also known as the Cox statistic) for testing $H_f$ against $H_g$:
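$$
S_{fg} = T^{-1}LR_{fg} - \hat{E}_f(T^{-1}LR_{fg}) = \frac{1}{2}\ln\left(\frac{\hat{\omega}_T^2}{\hat{\sigma}_T^2 + \hat{\alpha}_T'\hat{\Sigma}_f\hat{\alpha}_T}\right), \qquad \hat{\Sigma}_f = T^{-1}X'M_zX.
$$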
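It is now clear that, by construction, the Cox statistic $S_{fg}$ has asymptotic mean zero under $H_f$. As was pointed out earlier, since there is no natural null hypothesis in this setup, one also needs to consider the modified loglikelihood ratio statistic for testing $H_g$ against $H_f$, which is given by
$$
S_{gf} = T^{-1}LR_{gf} - \hat{E}_g(T^{-1}LR_{gf}) = \frac{1}{2}\ln\left(\frac{\hat{\sigma}_T^2}{\hat{\omega}_T^2 + \hat{\beta}_T'\hat{\Sigma}_g\hat{\beta}_T}\right), \qquad \hat{\Sigma}_g = T^{-1}Z'M_xZ.
$$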
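Both of these test statistics (when appropriately normalized by $\sqrt{T}$) are asymptotically normally distributed under their respective nulls, with zero mean and finite variance. For the test of $H_f$ against $H_g$ we have¹⁵
$$
\sqrt{T}\,S_{fg} \mid H_f \overset{a}{\sim} N(0, v_{fg}^2),
$$
where $v_{fg}^2$ denotes the asymptotic variance of $\sqrt{T}\,S_{fg}$ under $H_f$. The associated standardized Cox statistic is given by
$$
N_{fg} = \frac{\sqrt{T}\,S_{fg}}{\hat{v}_{fg}} \overset{a}{\sim} N(0, 1),
$$
where $\hat{v}_{fg}$ is a consistent estimator of $v_{fg}$.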
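The centering and standardization can be sketched numerically as follows. This is an illustration under stated assumptions, not the text's own formula. The variance estimate `v2` uses the regression-case form $\hat{\sigma}_T^2\,(T^{-1}\hat{\alpha}_T'X'M_zM_xM_zX\hat{\alpha}_T)/(\hat{\sigma}_T^2 + \hat{\alpha}_T'\hat{\Sigma}_f\hat{\alpha}_T)^2$ with $\hat{\Sigma}_f = T^{-1}X'M_zX$. The function names and simulated data are hypothetical.

```python
import numpy as np

def annihilator(W):
    """Return M_w = I_T - W (W'W)^{-1} W'."""
    return np.eye(W.shape[0]) - W @ np.linalg.solve(W.T @ W, W.T)

def cox_statistic(y, X, Z):
    """Return (S_fg, N_fg) for H_f: y = X a + u_f against H_g: y = Z b + u_g."""
    T = len(y)
    Mx, Mz = annihilator(X), annihilator(Z)
    sigma2 = y @ Mx @ y / T                 # sigma2_hat = e_f'e_f / T
    omega2 = y @ Mz @ y / T                 # omega2_hat = e_g'e_g / T
    Xa = y - Mx @ y                         # fitted values X alpha_hat
    MzXa = Mz @ Xa
    omega2_star = sigma2 + MzXa @ MzXa / T  # sigma2_hat + alpha_hat' Sigma_f_hat alpha_hat
    S = 0.5 * np.log(omega2 / omega2_star)  # centered statistic S_fg
    # ASSUMPTION: variance estimate of the regression-case form described above.
    v2 = sigma2 * (MzXa @ Mx @ MzXa) / T / omega2_star**2
    return S, np.sqrt(T) * S / np.sqrt(v2)

# Hypothetical data generated under H_f, with Z correlated with X.
rng = np.random.default_rng(1)
T = 300
X = rng.normal(size=(T, 2))
Z = 0.7 * X + 0.7 * rng.normal(size=(T, 2))
y = X @ np.array([1.0, -0.5]) + rng.normal(size=T)

S_fg, N_fg = cox_statistic(y, X, Z)  # H_f as the null
S_gf, N_gf = cox_statistic(y, Z, X)  # H_g as the null
print(S_fg, N_fg, S_gf, N_gf)
```

The simulated regressors are deliberately correlated: if $X'Z = 0$ exactly, then $M_xM_zX = 0$ and this variance estimate is zero, so the ratio is undefined.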
By reversing the roles of the null and the alternative hypotheses, a similar standardized Cox statistic, denoted by $N_{gf}$, can be computed for testing $H_g$ against $H_f$. Denoting the $(1 - \alpha)$ percent critical value of the standard normal distribution by $C_\alpha$, four outcomes are possible:
1. Reject $H_g$ but not $H_f$ if $|N_{fg}| < C_\alpha$ and $|N_{gf}| > C_\alpha$,
2. Reject $H_f$ but not $H_g$ if $|N_{fg}| > C_\alpha$ and $|N_{gf}| < C_\alpha$,
3. Reject both $H_f$ and $H_g$ if $|N_{fg}| > C_\alpha$ and $|N_{gf}| > C_\alpha$,
4. Reject neither $H_f$ nor $H_g$ if $|N_{fg}| < C_\alpha$ and $|N_{gf}| < C_\alpha$.
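The four outcomes listed above can be written as a small decision function. This is a sketch: the function name is hypothetical, the default critical value of 1.96 is an assumed two-sided 5 percent choice, and ties (a statistic exactly equal to the critical value, which the list leaves open) are treated here as non-rejection.

```python
def cox_outcome(n_fg, n_gf, c_alpha=1.96):
    """Classify a pair of standardized Cox statistics into one of four outcomes.

    n_fg tests H_f as the null; n_gf tests H_g as the null.
    c_alpha is the critical value (assumed default: 1.96).
    """
    reject_f = abs(n_fg) > c_alpha  # H_f rejected when N_fg is large in absolute value
    reject_g = abs(n_gf) > c_alpha  # H_g rejected when N_gf is large in absolute value
    if reject_g and not reject_f:
        return "reject Hg but not Hf"
    if reject_f and not reject_g:
        return "reject Hf but not Hg"
    if reject_f and reject_g:
        return "reject both Hf and Hg"
    return "reject neither Hf nor Hg"

# Illustrative inputs, not results from any data set.
print(cox_outcome(0.4, -5.2))
print(cox_outcome(-3.1, 2.8))
```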
These are to be contrasted with the outcomes of nested hypothesis testing, where the null is either rejected or not. The difference stems from the fact that when the hypotheses under consideration are nonnested there is no natural null (or maintained) hypothesis, so each hypothesis must be considered in turn as the null. There are therefore twice as many possible outcomes as in the nested case. Note that if we also use the information in the direction of rejection, that is, if instead of comparing the absolute value of $N_{fg}$ with $C_\alpha$ we determine whether rejection is in the direction of the null or the alternative, there are a total of eight possible test outcomes (see the discussion in Fisher and McAleer (1979) and Dastoor (1981)). This aspect of nonnested hypothesis testing has been criticized by some commentators, who point out that the test outcomes can be ambiguous (see, for example, Granger, King, and White, 1995). However, this is a valid criticism only if the primary objective is to select a specific model for forecasting or decision making, not if the aim is to learn about the comparative strengths and weaknesses of rival explanations. What is viewed as a weakness from the perspective of model selection becomes a strength when placed in the context of statistical inference and model building. For example, when both models are rejected, the analysis points the investigator toward developing a third model that incorporates the main desirable features of the original models and is also theoretically meaningful (see Pesaran and Deaton, 1978).