7.1. Uniform Weak Laws of Large Numbers
7.1.1. Random Functions Depending on Finite-Dimensional Random Vectors
On the basis of Theorem 7.7, all the convergence-in-probability results in Chapter 6 for i.i.d. random variables or vectors carry over to strictly stationary time series processes with an $\alpha$-mixing base. In particular, the uniform weak law of large numbers can now be restated as follows:
Theorem 7.8(a): (UWLLN) Let $X_t$ be a strictly stationary $k$-variate time series process with an $\alpha$-mixing base, and let $\theta \in \Theta$ be nonrandom vectors in a compact subset $\Theta \subset \mathbb{R}^m$. Moreover, let $g(x, \theta)$ be a Borel-measurable function on $\mathbb{R}^k \times \Theta$ such that for each $x$, $g(x, \theta)$ is a continuous function on $\Theta$. Finally, assume that $E[\sup_{\theta \in \Theta} |g(X_1, \theta)|] < \infty$. Then
$$\operatorname{plim}_{n\to\infty} \sup_{\theta \in \Theta} \left| (1/n) \sum_{t=1}^{n} g(X_t, \theta) - E[g(X_1, \theta)] \right| = 0.$$
Theorem 7.8(a) can be proved along the same lines as the proof of the uniform weak law of large numbers for the i.i.d. case in Appendix 6.A of Chapter 6 simply by replacing the reference to the weak law of large numbers for i.i.d. random variables by a reference to Theorem 7.7.
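To see Theorem 7.8(a) at work, the following Python sketch (an added illustration, not part of the original text) simulates a Gaussian AR(1) process, which is strictly stationary with an i.i.d. (hence $\alpha$-mixing) base, and tracks the uniform gap $\sup_{\theta \in \Theta} |(1/n) \sum_{t=1}^{n} g(X_t, \theta) - E[g(X_1, \theta)]|$ as $n$ grows. The choices $g(x, \theta) = \cos(\theta x)$ and $\Theta = [-2, 2]$ (approximated by a grid) are assumptions made for the example because $E[\cos(\theta X_1)]$ then has a closed form:

```python
import numpy as np

# Illustration of Theorem 7.8(a): X_t is a Gaussian AR(1) process, hence strictly
# stationary with an i.i.d. (so alpha-mixing) base, and g(x, theta) = cos(theta * x)
# is continuous in theta with E[sup_theta |g|] <= 1 < infinity. For X_1 ~ N(0, var),
# E[cos(theta * X_1)] = exp(-theta**2 * var / 2), so the uniform gap is computable.

rng = np.random.default_rng(0)
b = 0.5                                      # AR(1) coefficient, an arbitrary choice
var = 1.0 / (1.0 - b**2)                     # stationary variance of X_t
thetas = np.linspace(-2.0, 2.0, 201)         # grid standing in for the compact set Theta
expected = np.exp(-thetas**2 * var / 2.0)    # E[g(X_1, theta)] in closed form

for n in (100, 1000, 10000):
    v = rng.standard_normal(n + 500)
    x = np.zeros(n + 500)
    for t in range(1, n + 500):              # simulate X_t = b * X_{t-1} + V_t
        x[t] = b * x[t - 1] + v[t]
    x = x[500:]                              # drop burn-in so the path is ~stationary
    means = np.cos(np.outer(thetas, x)).mean(axis=1)   # (1/n) sum_t g(X_t, theta)
    print(n, np.max(np.abs(means - expected)))         # sup_theta gap shrinks with n
```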
7.1.2. Random Functions Depending on Infinite-Dimensional Random Vectors
In time series econometrics we quite often have to deal with random functions that depend on a countably infinite sequence of random variables or vectors. As an example, consider the time series process
$$Y_t = \beta_0 Y_{t-1} + X_t, \quad \text{with } X_t = V_t - \gamma_0 V_{t-1}, \tag{7.18}$$
where the $V_t$'s are i.i.d. with zero expectation and finite variance $\sigma^2$, and the parameters involved satisfy $|\beta_0| < 1$ and $|\gamma_0| < 1$. The part
$$Y_t = \beta_0 Y_{t-1} + X_t \tag{7.19}$$
is an autoregression of order 1, denoted by AR(1), and the part
$$X_t = V_t - \gamma_0 V_{t-1} \tag{7.20}$$
is a moving average process of order 1, denoted by MA(1). Therefore, model (7.18) is called an ARMA(1, 1) model (see Box and Jenkins 1976). The condition $|\beta_0| < 1$ is necessary for the strict stationarity of $Y_t$ because then, by backwards substitution of (7.18), we can write model (7.18) as
$$Y_t = \sum_{j=0}^{\infty} \beta_0^{j} \left(V_{t-j} - \gamma_0 V_{t-1-j}\right) = V_t + (\beta_0 - \gamma_0) \sum_{j=1}^{\infty} \beta_0^{j-1} V_{t-j}. \tag{7.21}$$
This is the Wold decomposition of $Y_t$. The MA(1) model (7.20) can be written as an AR(1) model in $V_t$:
$$V_t = \gamma_0 V_{t-1} + X_t. \tag{7.22}$$
If $|\gamma_0| < 1$, then by backwards substitution of (7.22) we can write (7.20) as
$$X_t = -\sum_{j=1}^{\infty} \gamma_0^{j} X_{t-j} + V_t. \tag{7.23}$$
If we substitute $X_t = Y_t - \beta_0 Y_{t-1}$ in (7.23), the ARMA(1, 1) model (7.18) can now be written as an infinite-order AR model:
$$Y_t = \beta_0 Y_{t-1} - \sum_{j=1}^{\infty} \gamma_0^{j} \left(Y_{t-j} - \beta_0 Y_{t-1-j}\right) + V_t = (\beta_0 - \gamma_0) \sum_{j=1}^{\infty} \gamma_0^{j-1} Y_{t-j} + V_t. \tag{7.24}$$
Note that if $\beta_0 = \gamma_0$, then (7.24) and (7.21) reduce to $Y_t = V_t$; thus, there is no way to identify the parameters. Consequently, we need to assume that $\beta_0 \neq \gamma_0$. Moreover, observe from (7.21) that $Y_t$ is strictly stationary with an independent (hence $\alpha$-mixing) base.
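As a quick numerical sanity check on (7.21) (an added illustration; the values $\beta_0 = 0.5$ and $\gamma_0 = 0.3$ are arbitrary choices satisfying $|\beta_0| < 1$, $|\gamma_0| < 1$, and $\beta_0 \neq \gamma_0$), one can simulate the recursion (7.18) and compare $Y_t$ with the truncated Wold sum:

```python
import numpy as np

# Check that the ARMA(1,1) recursion (7.18) and the truncated Wold
# representation (7.21) produce the same Y_t.

rng = np.random.default_rng(1)
beta0, gamma0 = 0.5, 0.3
n, burn = 200, 1000              # long burn-in makes the truncation error negligible
v = rng.standard_normal(n + burn)

y = np.zeros(n + burn)           # Y_t from the recursion (7.18)
for t in range(1, n + burn):
    y[t] = beta0 * y[t - 1] + v[t] - gamma0 * v[t - 1]

t = n + burn - 1                 # compare the last observation
j = np.arange(1, burn + 1)
wold = v[t] + (beta0 - gamma0) * np.sum(beta0 ** (j - 1) * v[t - j])  # (7.21), truncated
print(y[t], wold)                # agree up to a truncation error of order beta0**burn
```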
There are different ways to estimate the parameters $\beta_0, \gamma_0$ in model (7.18) on the basis of observations on $Y_t$ for $t = 0, 1, \ldots, n$ only. If we assume that the $V_t$'s are normally distributed, we can use maximum likelihood (see Chapter 8). But it is also possible to estimate the model by nonlinear least squares (NLLS).
If we could observe all the $Y_t$'s for $t \le n$, the nonlinear least-squares estimator of $\theta_0 = (\beta_0, \gamma_0)^T$ would be
$$\hat\theta = \operatorname*{argmin}_{\theta \in \Theta} (1/n) \sum_{t=1}^{n} \left(Y_t - f_t(\theta)\right)^2, \tag{7.25}$$
where
$$f_t(\theta) = (\beta - \gamma) \sum_{j=1}^{\infty} \gamma^{j-1} Y_{t-j}, \quad \text{with } \theta = (\beta, \gamma)^T, \tag{7.26}$$
and
$$\Theta = [-1 + \varepsilon, 1 - \varepsilon] \times [-1 + \varepsilon, 1 - \varepsilon], \quad \varepsilon \in (0, 1), \tag{7.27}$$
for instance, where $\varepsilon$ is a small number. If we only observe the $Y_t$'s for $t = 0, 1, \ldots, n$, which is the usual case, then we can still use NLLS by setting the $Y_t$'s for $t < 0$ to zero. This yields the feasible NLLS estimator
$$\tilde\theta = \operatorname*{argmin}_{\theta \in \Theta} (1/n) \sum_{t=1}^{n} \left(Y_t - \tilde f_t(\theta)\right)^2, \tag{7.28}$$
where
$$\tilde f_t(\theta) = (\beta - \gamma) \sum_{j=1}^{t} \gamma^{j-1} Y_{t-j}. \tag{7.29}$$
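A minimal Python sketch of the feasible objective in (7.28) and (7.29) follows (an added illustration; the values $\beta_0 = 0.5$, $\gamma_0 = 0.3$, and the sample size are arbitrary choices). The inner sum of (7.29) is computed recursively via $s_t = Y_{t-1} + \gamma s_{t-1}$:

```python
import numpy as np

def f_tilde(theta, y):
    """Feasible regression function (7.29): f~_t(theta) = (beta - gamma) *
    sum_{j=1}^{t} gamma**(j-1) * y[t-j], computed with the recursion
    s_t = y[t-1] + gamma * s_{t-1}, which equals the inner sum at time t."""
    beta, gamma = theta
    out = np.zeros(len(y))
    s = 0.0
    for t in range(1, len(y)):
        s = y[t - 1] + gamma * s
        out[t] = (beta - gamma) * s
    return out

def criterion(theta, y):
    """Feasible NLLS objective in (7.28): (1/n) sum_{t=1}^{n} (Y_t - f~_t(theta))^2."""
    r = y[1:] - f_tilde(theta, y)[1:]
    return np.mean(r ** 2)

# Simulate model (7.18) with illustrative values beta0 = 0.5, gamma0 = 0.3.
rng = np.random.default_rng(2)
beta0, gamma0, n = 0.5, 0.3, 2000
v = rng.standard_normal(n + 1)
y = np.zeros(n + 1)
for t in range(1, n + 1):
    y[t] = beta0 * y[t - 1] + v[t] - gamma0 * v[t - 1]
print(criterion((0.5, 0.3), y), criterion((0.8, -0.2), y))
```

Evaluating the criterion at $\theta_0$ and at another point of $\Theta$ shows that the objective discriminates between the true and a false parameter value, which is what the minimization in (7.28) exploits.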
For proving the consistency of (7.28), we need to show first that
$$\operatorname{plim}_{n\to\infty} \sup_{\theta \in \Theta} (1/n) \left| \sum_{t=1}^{n} \left[ \left(Y_t - \tilde f_t(\theta)\right)^2 - \left(Y_t - f_t(\theta)\right)^2 \right] \right| = 0 \tag{7.30}$$
(Exercise), and
$$\operatorname{plim}_{n\to\infty} \sup_{\theta \in \Theta} \left| (1/n) \sum_{t=1}^{n} \left(Y_t - f_t(\theta)\right)^2 - E\left[\left(Y_1 - f_1(\theta)\right)^2\right] \right| = 0 \tag{7.31}$$
(Exercise).
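The following sketch (an added illustration; the parameter values, presample length, and grid are arbitrary) gives a numerical feel for (7.30). The infeasible $f_t(\theta)$, which depends on the infinite past, is approximated by running its recursion through a long simulated presample, while the feasible $\tilde f_t(\theta)$ starts from zero at the beginning of the observed sample:

```python
import numpy as np

# Numerical feel for (7.30), with illustrative values beta0 = 0.5, gamma0 = 0.3:
# the sup (over a grid on Theta) of the normalized gap between the feasible and
# (approximately) infeasible criteria should shrink as n grows.

rng = np.random.default_rng(4)
beta0, gamma0, pre = 0.5, 0.3, 2000
eps = 0.05
grid = np.linspace(-1 + eps, 1 - eps, 13)

def sup_gap(n):
    v = rng.standard_normal(pre + n)
    y = np.zeros(pre + n)
    for t in range(1, pre + n):
        y[t] = beta0 * y[t - 1] + v[t] - gamma0 * v[t - 1]
    gap = 0.0
    for b in grid:
        for g in grid:
            s_inf = 0.0
            for t in range(1, pre):           # warm up with the presample
                s_inf = y[t - 1] + g * s_inf
            s_fea, diff = 0.0, 0.0
            for t in range(pre, pre + n):     # observed sample
                s_inf = y[t - 1] + g * s_inf  # ~ sum over the infinite past, as in (7.26)
                s_fea = y[t - 1] + g * s_fea  # truncated sum over observed lags, as in (7.29)
                diff += (y[t] - (b - g) * s_fea) ** 2 - (y[t] - (b - g) * s_inf) ** 2
            gap = max(gap, abs(diff) / n)
    return gap

for n in (100, 400, 1600):
    print(n, sup_gap(n))   # the sup gap should decay roughly like 1/n
```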
However, the random functions $g_t(\theta) = (Y_t - f_t(\theta))^2$ depend on the infinite-dimensional random vectors $(Y_t, Y_{t-1}, Y_{t-2}, \ldots)^T$, and thus Theorem 7.8(a) is not applicable to (7.31). Therefore, we need to generalize Theorem 7.8(a) in order to prove (7.31):
Theorem 7.8(b): (UWLLN) Let $\mathscr{F}_t = \sigma(V_t, V_{t-1}, V_{t-2}, \ldots)$, where $V_t$ is a time series process with an $\alpha$-mixing base. Let $g_t(\theta)$ be a sequence of random functions on a compact subset $\Theta$ of a Euclidean space. Write $N_\delta(\theta_*) = \{\theta \in \Theta : \|\theta - \theta_*\| < \delta\}$ for $\theta_* \in \Theta$ and $\delta > 0$. If for each $\theta_* \in \Theta$ and each $\delta > 0$,

(a) $\sup_{\theta \in N_\delta(\theta_*)} g_t(\theta)$ and $\inf_{\theta \in N_\delta(\theta_*)} g_t(\theta)$ are $\mathscr{F}_t$-measurable and strictly stationary,
(b) $E[\sup_{\theta \in N_\delta(\theta_*)} g_t(\theta)] < \infty$ and $E[\inf_{\theta \in N_\delta(\theta_*)} g_t(\theta)] > -\infty$, and
(c) $\lim_{\delta \downarrow 0} E[\sup_{\theta \in N_\delta(\theta_*)} g_t(\theta)] = \lim_{\delta \downarrow 0} E[\inf_{\theta \in N_\delta(\theta_*)} g_t(\theta)] = E[g_t(\theta_*)]$;

then $\operatorname{plim}_{n\to\infty} \sup_{\theta \in \Theta} \left| (1/n) \sum_{t=1}^{n} g_t(\theta) - E[g_1(\theta)] \right| = 0$.
Theorem 7.8(b) can also be proved easily along the lines of the proof of the uniform weak law of large numbers in Appendix 6.A of Chapter 6.
Note that it is possible to strengthen the (uniform) weak laws of large numbers to corresponding strong laws of large numbers by imposing conditions on the speed of convergence to zero of $\alpha(m)$ (see McLeish 1975).
It is not too hard (but rather tedious) to verify that the conditions of Theorem 7.8(b) apply to the random functions $g_t(\theta) = (Y_t - f_t(\theta))^2$ with $Y_t$ defined by (7.18) and $f_t(\theta)$ by (7.26).
7.1.3. Consistency of M-Estimators
Further conditions for the consistency of M-estimators are stated in the next theorem, which is a straightforward generalization of a corresponding result in Chapter 6 for the i.i.d. case:
Theorem 7.9: Let the conditions of Theorem 7.8(b) hold, and let $\theta_0 = \operatorname{argmax}_{\theta \in \Theta} E[g_1(\theta)]$ and $\hat\theta = \operatorname{argmax}_{\theta \in \Theta} (1/n) \sum_{t=1}^{n} g_t(\theta)$. If for $\delta > 0$, $\sup_{\theta \in \Theta \setminus N_\delta(\theta_0)} E[g_1(\theta)] < E[g_1(\theta_0)]$, then $\operatorname{plim}_{n\to\infty} \hat\theta = \theta_0$. Similarly, if $\theta_0 = \operatorname{argmin}_{\theta \in \Theta} E[g_1(\theta)]$, $\hat\theta = \operatorname{argmin}_{\theta \in \Theta} (1/n) \sum_{t=1}^{n} g_t(\theta)$, and for $\delta > 0$, $\inf_{\theta \in \Theta \setminus N_\delta(\theta_0)} E[g_1(\theta)] > E[g_1(\theta_0)]$, then $\operatorname{plim}_{n\to\infty} \hat\theta = \theta_0$.
Again, it is not too hard (but rather tedious) to verify that the conditions of Theorem 7.9 apply to (7.25) with $Y_t$ defined by (7.18) and $f_t(\theta)$ by (7.26). Thus the feasible NLLS estimator (7.28) is consistent.
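As a closing illustration (added here; the parameter values and the grid resolution are arbitrary), a small Monte Carlo experiment shows the feasible NLLS estimate settling near $\theta_0$ as $n$ grows, as the consistency result predicts:

```python
import numpy as np

# Consistency check: the feasible NLLS estimate should approach
# theta0 = (beta0, gamma0) = (0.5, 0.3) (illustrative values) as n grows.

rng = np.random.default_rng(5)
beta0, gamma0 = 0.5, 0.3
eps = 0.05
grid = np.linspace(-1 + eps, 1 - eps, 39)   # grid on Theta from (7.27)

def feasible_nlls(y):
    """Crude grid-search minimizer of the feasible criterion (7.28);
    slow but dependency-free, a stand-in for a proper numerical optimizer."""
    best, best_sse = None, np.inf
    for b in grid:
        for g in grid:
            s, sse = 0.0, 0.0
            for t in range(1, len(y)):
                s = y[t - 1] + g * s               # sum_{j=1}^{t} g**(j-1) * y[t-j]
                sse += (y[t] - (b - g) * s) ** 2   # (Y_t - f~_t(theta))^2
            if sse < best_sse:
                best, best_sse = (b, g), sse
    return best

for n in (250, 1000, 4000):
    v = rng.standard_normal(n + 1)
    y = np.zeros(n + 1)
    for t in range(1, n + 1):                      # simulate model (7.18)
        y[t] = beta0 * y[t - 1] + v[t] - gamma0 * v[t - 1]
    print(n, feasible_nlls(y))                     # estimates should settle near (0.5, 0.3)
```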