A COMPANION TO Theoretical Econometrics
Diagnostic Testing in Time Series Contexts
All of the tests we covered in Section 2 can be applied in time series contexts. However, because we can no longer assume that the observations are independent of one another, the discussion of auxiliary assumptions under H0 is more complicated. We assume in this section that weak laws of large numbers and central limit theorems can be applied, so that standard inference procedures are available. This rules out processes with unit roots, or fractionally integrated processes. (See Wooldridge (1994) or Pötscher and Prucha (chapter 10 in this volume) for a discussion of the kinds of dependence allowed.) For notational simplicity, we assume that the process is strictly stationary so that moment matrices do not depend on t.
1.2 Conditional mean diagnostics
To illustrate the issues that arise in obtaining conditional mean diagnostics for time series models, let xt be a vector of conditioning variables, which can contain contemporaneous variables, zt, as well as lagged values of yt and zt. Given xt, we may be interested in testing linearity of E(yt|xt), which is stated as
E(yt|xt) = β0 + xtβ (9.34)
for some β0 ∈ R and β ∈ RK. For example, yt might be the return on an asset, and xt might contain lags of yt and lagged economic variables. The same kinds of tests we discussed in Section 2.1 can be applied here, including RESET, the Davidson-MacKinnon test, and LM tests against a variety of nonlinear alternatives. Let gt = g(xt, λ̂) be the 1 × q vector of misspecification indicators. The LM statistic is obtained exactly as in (9.14), with standard notational changes (the t subscript replaces i, and the sample size is denoted T rather than N). Just as in the cross section case, we need to assume homoskedasticity conditional on xt:
var(yt|xt) = σ2. (9.35)
If xt contains lagged yt, this rules out dynamic forms of heteroskedasticity, such as ARCH (Engle, 1982) and GARCH (Bollerslev, 1986), as well as static forms of heteroskedasticity if xt contains zt.
Because of the serial dependence in time series data, we must add another auxiliary assumption in order for the usual LM statistic to have an asymptotic χ2q distribution. If we write the model in error form as
yt = β0 + xtβ + ut, (9.36)
then a useful auxiliary assumption is
E(ut|xt, ut-1, xt-1, ...) = 0. (9.37)
Assumption (9.37) implies that {ut} is serially uncorrelated, but it implies much more. For example, ut and us are uncorrelated conditional on (xt, xs), for t ≠ s. Also, ut is uncorrelated with any function of (xt, ut-1, xt-1, ...).
We can easily see why (9.37) is sufficient, along with (9.35), to apply the usual LM test. As in the cross section case, we can show under (9.34) that (9.20) holds with the obvious changes in notation. Now, for (9.19) to have an asymptotic chi-square distribution, we need σ̂2(T-1 ∑Tt=1 r̂t′r̂t) to consistently estimate

Avar(T-1/2 ∑Tt=1 rt′ut). (9.38)
Assumption (9.37) ensures that all of the covariance terms in this asymptotic variance are zero. For s < t, E(utusrt′rs) = E[E(ut|rt, us, rs)usrt′rs] = 0 because E(ut|rt, us, rs) = 0 under (9.37). The last statement follows because (rt, us, rs) is a function of (xt, ut-1, xt-1, ...). When we add the homoskedasticity assumption, we see that the usual T-R2 statistic is asymptotically valid.
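To make the mechanics concrete, the following sketch computes the T-R2 form of the LM statistic for a simulated linear model, using powers of the regressor as misspecification indicators. The function name and the data-generating process are invented for illustration; the simulation satisfies homoskedasticity and dynamic completeness, as assumed in the text.

```python
import numpy as np

def lm_test_tr2(y, X, G):
    """T * R-squared LM statistic for adding the indicators G to the
    linear model y = b0 + X*b + u, estimated by OLS (a sketch of the
    test described in the text; under H0, homoskedasticity, and
    dynamic completeness, LM is approximately chi-squared with q df)."""
    T = y.shape[0]
    Z = np.column_stack([np.ones(T), X])
    u = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]  # null-model residuals
    # Auxiliary regression of u on (1, X, G); u has mean zero because
    # the null regression includes an intercept, so R^2 = 1 - SSR/SST.
    W = np.column_stack([Z, G])
    e = u - W @ np.linalg.lstsq(W, u, rcond=None)[0]
    r2 = 1.0 - (e @ e) / (u @ u)
    return T * r2, G.shape[1]

# Illustrative simulation: the linear null is true, errors are iid.
rng = np.random.default_rng(0)
T = 500
x = rng.standard_normal(T)
y = 1.0 + 2.0 * x + rng.standard_normal(T)
lm, q = lm_test_tr2(y, x.reshape(-1, 1), np.column_stack([x**2, x**3]))
```

Under H0 the statistic is compared with χ2q critical values; here q = 2.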
It is easily seen that (9.37) is equivalent to
E(yt|xt, yt-1, xt-1, yt-2, ...) = E(yt|xt) = β0 + xtβ, (9.39)
which we call dynamic completeness of the conditional mean. Under (9.39), all of the dynamics are captured by what we have put in xt. For example, if xt = (yt-1, zt-1), then (9.39) becomes
E(yt|yt-1, zt-1, yt-2, zt-2, ...) = E(yt|yt-1, zt-1), (9.40)
which means that at most one lag each of yt and zt is needed to fully capture the dynamics. (Because zt is not in xt in this example, (9.40) places no restrictions on
any contemporaneous relationship between yt and zt.) Generally, if xt contains lags of yt and possibly lags of other variables, we are often willing to assume dynamic completeness of the conditional mean when testing for nonlinearities. In any case, we should know that the usual kind of test essentially requires this assumption.
If xt = zt for a vector of contemporaneous variables, (9.39) is very strong:
E(yt|zt, yt-1, zt-1, yt-2, ...) = E(yt|zt), (9.41)
which means that once contemporaneous zt has been controlled for, lags of yt and zt are irrelevant. If we are just interested in testing for nonlinearities in a static linear model for E(yt|zt), we might not want to impose dynamic completeness.
Relaxing the homoskedasticity assumption is easy: the same heteroskedasticity-robust statistic from regression (9.23) is valid, provided (9.39) holds. This statistic is not generally valid in the presence of serial correlation.
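Regression (9.23) is given earlier in the chapter and is not reproduced in this section. One standard way to compute a heteroskedasticity-robust LM statistic, which we assume is what (9.23) delivers, is T minus the sum of squared residuals from regressing 1 on the products ûtr̂t. A minimal sketch under that assumption (simulated data; all names are illustrative):

```python
import numpy as np

def robust_lm(y, X, G):
    """Heteroskedasticity-robust LM statistic, computed as T - SSR from
    the regression of 1 on u_t * r_t (an assumed implementation of the
    robust form discussed in the text)."""
    T = y.shape[0]
    Z = np.column_stack([np.ones(T), X])
    u = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]   # null-model residuals
    R = G - Z @ np.linalg.lstsq(Z, G, rcond=None)[0]   # r-hat: G net of (1, X)
    UR = u[:, None] * R                                # u_t * r_t, T x q
    ones = np.ones(T)
    resid = ones - UR @ np.linalg.lstsq(UR, ones, rcond=None)[0]
    return T - resid @ resid                           # T - SSR

# Illustrative simulation: the linear null is true, but the errors are
# heteroskedastic, so the robust form is the appropriate one.
rng = np.random.default_rng(1)
T = 500
x = rng.standard_normal(T)
u = (1.0 + 0.5 * np.abs(x)) * rng.standard_normal(T)
y = 1.0 + 2.0 * x + u
stat = robust_lm(y, x.reshape(-1, 1), np.column_stack([x**2, x**3]))
```

As with the nonrobust form, stat is compared with χ2q critical values (q = 2 here).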
Wooldridge (1991a) discusses different ways to make conditional mean diagnostics robust to serial correlation (as well as to heteroskedasticity). One approach is to obtain an estimator of (9.38) that allows general serial correlation; see, for example, Newey and West (1987) and Andrews (1991). Perhaps the simplest approach is to prewhiten k̂t = ûtr̂t, where the ût are the OLS residuals from estimating the null model and the r̂t are the 1 × q residuals from the multivariate regression of gt on 1, xt, t = 1, 2, ..., T. If êt, t = (p + 1), ..., T, are the 1 × q residuals from a vector autoregression (VAR) of k̂t on 1, k̂t-1, ..., k̂t-p, then the test statistic is
(∑Tt=p+1 êt)(∑Tt=p+1 êt′êt)⁻¹(∑Tt=p+1 êt)′;
under (9.34), the statistic has an asymptotic χ2q distribution, provided the VAR adequately captures the serial correlation in {kt = utrt}.
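The prewhitening construction above can be sketched as follows, with simulated data; the VAR order p and the data-generating process are illustrative choices, and the statistic is the quadratic form in the VAR residuals given in the text.

```python
import numpy as np

def prewhitened_lm(y, X, G, p=1):
    """Serial-correlation-robust diagnostic: form k_t = u_t * r_t,
    prewhiten it with a VAR(p), and compute
    (sum e_t)(sum e_t' e_t)^(-1)(sum e_t)', as described in the text."""
    T = y.shape[0]
    Z = np.column_stack([np.ones(T), X])
    u = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]   # null-model residuals
    R = G - Z @ np.linalg.lstsq(Z, G, rcond=None)[0]   # r-hat: G net of (1, X)
    K = u[:, None] * R                                 # k_t, T x q
    # VAR(p): regress k_t on 1, k_{t-1}, ..., k_{t-p}, for t = p+1, ..., T
    lags = np.column_stack([K[p - j - 1:T - j - 1] for j in range(p)])
    W = np.column_stack([np.ones(T - p), lags])
    E = K[p:] - W @ np.linalg.lstsq(W, K[p:], rcond=None)[0]
    s = E.sum(axis=0)                                  # sum of e_t (1 x q)
    M = E.T @ E                                        # sum of e_t' e_t (q x q)
    return s @ np.linalg.solve(M, s)

# Illustrative simulation: linear null true, AR(1) errors, exogenous x.
rng = np.random.default_rng(2)
T = 600
x = rng.standard_normal(T)
u = np.empty(T)
u[0] = rng.standard_normal()
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + u
stat = prewhitened_lm(y, x.reshape(-1, 1), np.column_stack([x**2, x**3]), p=1)
```

Here a VAR(1) is adequate because the errors are AR(1), so stat is compared with χ2q critical values (q = 2).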
We can gain useful insights by studying the appropriate asymptotic representation of the LM statistic. Under (9.34), regularity conditions, and strict stationarity and weak dependence, we can write the T-R2 LM statistic as

LM = (T-1/2 ∑Tt=1 utrt)[σ2E(rt′rt)]⁻¹(T-1/2 ∑Tt=1 rt′ut) + op(1). (9.42)
This representation does not assume either (9.35) or (9.37), but if either fails, then σ2E(rt′rt) does not generally equal (9.38). This is why, without (9.35) and (9.37), the usual LM statistic does not have a limiting chi-square distribution.
We can use (9.42) to help resolve outstanding debates in the literature. For example, there has long been a debate about whether RESET in a model with strictly exogenous explanatory variables is robust to serial correlation (with homoskedasticity maintained). The evidence is based on simulation studies. Thursby (1979) claims that RESET is robust to serial correlation; Porter and Kashyap (1984) find that it is not. We can help reconcile the disagreement by studying (9.42). With strictly exogenous regressors, {ut} is independent of {xt}, and the {ut} are always assumed to have a constant variance (typically, {ut} follows a stable AR(1) model). Combined, these assumptions imply (9.35). Therefore, RESET will have a limiting chi-square distribution when the covariance terms in (9.38) are all zero, that is,
E(utusrt′rs) = 0, t ≠ s. (9.43)
When {xt} is independent of {ut},

E(utusrt′rs) = E(utus)E(rt′rs),

because rt is a function of xt. Recall that rt is a population residual from a regression that includes an intercept, and so it has zero mean. Here is the key: if {xt} is an independent sequence, as is often the case in simulation studies, then E(rt′rs) = 0, t ≠ s. But then (9.43) holds, regardless of the degree of serial correlation in {ut}. Therefore, if {xt} is generated to be strictly exogenous and serially independent, RESET is asymptotically robust to arbitrary serial correlation in the errors. (We have also shown that (9.37) is not necessary for the T-R2 LM statistic to have a limiting chi-square distribution, as (9.37) is clearly false when {ut} is serially correlated. Instead, strict exogeneity and serial independence of {xt} are sufficient.)
If {xt} is serially correlated, the usual RESET statistic is not robust. However, what matters is serial correlation in rt, and this might be small even with substantial serial correlation in {xt}. For example, xt2 net of its linear projection onto (1, xt) might not have much serial correlation, even if {xt} does.
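This point can be checked by simulation. For a Gaussian AR(1) {xt}, the square xt2 is uncorrelated with xt, so rt is essentially xt2 minus its mean, and its first-order autocorrelation is roughly the square of that of xt (the numbers below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
T = 20000
x = np.empty(T)
x[0] = rng.standard_normal()
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()  # autocorrelated regressor

# r_t: x_t^2 net of its linear projection onto (1, x_t)
Z = np.column_stack([np.ones(T), x])
g = x**2
r = g - Z @ np.linalg.lstsq(Z, g, rcond=None)[0]

def acf1(v):
    """First-order sample autocorrelation."""
    v = v - v.mean()
    return (v[1:] @ v[:-1]) / (v @ v)

rho_x = acf1(x)  # near 0.5
rho_r = acf1(r)  # markedly smaller for Gaussian x (roughly 0.5 squared)
```

So even a clearly autocorrelated regressor can produce an indicator residual rt with much weaker serial correlation.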
Earlier we emphasized that, in general, the usual LM statistic, or its heteroskedasticity-robust version, maintains dynamic completeness under H0. Because dynamic completeness implies that the errors are not serially correlated, serially correlated errors provide evidence against (9.39). Therefore, testing for serial correlation is a common specification test.
The most common form of the LM statistic for AR(p) serial correlation (see, e.g., Breusch, 1978; Godfrey, 1978; Engle, 1984) is LM = (T − p)R2u, where R2u is the usual R2 from the regression
ût on 1, xt, ût-1, ût-2, ..., ût-p, t = (p + 1), ..., T. (9.44)
Under what assumptions is LM asymptotically χ2p? In addition to (9.39) (equivalently, (9.37)), it suffices to add the homoskedasticity assumption
var(ut|xt, ut-1, ..., ut-p) = σ2. (9.45)
Notice that (9.35) is no longer sufficient; we must rule out heteroskedasticity conditional on lagged ut as well.
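Regression (9.44) and the statistic LM = (T − p)R2u are straightforward to compute. The sketch below (the familiar Breusch-Godfrey construction, with an invented data-generating process) contrasts AR(1) errors with iid errors:

```python
import numpy as np

def lm_ar_test(y, X, p):
    """LM test for AR(p) serial correlation: LM = (T - p) * R^2 from
    regressing u_t on 1, x_t, u_{t-1}, ..., u_{t-p}, as in (9.44)."""
    T = y.shape[0]
    Z = np.column_stack([np.ones(T), X])
    u = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]   # null-model residuals
    ulags = np.column_stack([u[p - j - 1:T - j - 1] for j in range(p)])
    W = np.column_stack([Z[p:], ulags])                # 1, x_t, lagged residuals
    udep = u[p:]
    e = udep - W @ np.linalg.lstsq(W, udep, rcond=None)[0]
    c = udep - udep.mean()
    r2 = 1.0 - (e @ e) / (c @ c)
    return (T - p) * r2

# Illustrative simulation: same regressor, AR(1) versus iid errors.
rng = np.random.default_rng(3)
T = 500
x = rng.standard_normal(T)
u = np.empty(T)
u[0] = rng.standard_normal()
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()
lm_ar = lm_ar_test(1.0 + 2.0 * x + u, x.reshape(-1, 1), p=1)
lm_iid = lm_ar_test(1.0 + 2.0 * x + rng.standard_normal(T), x.reshape(-1, 1), p=1)
```

Both statistics are compared with χ2p critical values (p = 1 here); in this design the serially correlated case produces a much larger statistic.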
More generally, we can test for misspecified dynamics, misspecified functional form, or both, by using specification indicators g(wt, λ), where wt is a subset of (xt, yt-1, xt-1, ...). If we want to ensure that the appropriate no serial correlation assumption holds, we take the null to be (9.39), which implies that E(yt|xt, wt) = E(yt|xt) = β0 + xtβ. The homoskedasticity assumption is var(yt|xt, wt) = σ2. The adjustment for heteroskedasticity is the same as described for pure functional form tests (see equation (9.23)).
In this section we have focused on a linear null model. Once we specify hypotheses in terms of conditional means, there are no special considerations for nonlinear regression functions with weakly dependent data. All of the tests we discussed for cross section applications can be applied. The statement of homoskedasticity is the same as in the linear case, and the dynamic completeness assumption is stated as in (9.39), but with the linear regression function replaced by m(xt, β). The standard LM test discussed in Section 2.3 is valid, and both heteroskedasticity- and serial-correlation-robust forms are easily computed (see Wooldridge (1991a) for details).