Introduction to the Mathematical and Statistical Foundations of Econometrics
A Generic Central Limit Theorem
In this section I will explain McLeish’s (1974) central limit theorems for dependent random variables with an emphasis on stationary martingale difference processes.
The following approximation of exp(i ■ x) plays a key role in proving central limit theorems for dependent random variables.
Lemma 7.1: For $x \in \mathbb{R}$ with $|x| < 1$, $\exp(i\cdot x) = (1 + i\cdot x)\exp(-x^2/2 + r(x))$, where $|r(x)| \le |x|^3$.
Proof: It follows from the definition of the complex logarithm and the series expansion of $\log(1 + i\cdot x)$ for $|x| < 1$ (see Appendix III) that
$$\log(1 + i\cdot x) = i\cdot x + x^2/2 + \sum_{k=3}^{\infty}(-1)^{k-1} i^k x^k/k + i\cdot m\cdot \pi = i\cdot x + x^2/2 - r(x) + i\cdot m\cdot \pi,$$
where $r(x) = -\sum_{k=3}^{\infty}(-1)^{k-1} i^k x^k/k$. Taking the $\exp$ of both sides of the equation for $\log(1 + i\cdot x)$ yields $\exp(i\cdot x) = (1 + i\cdot x)\exp(-x^2/2 + r(x))$. To prove the inequality $|r(x)| \le |x|^3$, observe that
$$\begin{aligned}
r(x) &= -\sum_{k=3}^{\infty}(-1)^{k-1} i^k x^k/k = x^3\sum_{k=0}^{\infty}(-1)^k i^{k+1} x^k/(k+3)\\
&= x^3\sum_{k=0}^{\infty}(-1)^{2k} i^{2k+1} x^{2k}/(2k+3) + x^3\sum_{k=0}^{\infty}(-1)^{2k+1} i^{2k+2} x^{2k+1}/(2k+4)\\
&= x^3\sum_{k=0}^{\infty}(-1)^k x^{2k+1}/(2k+4) + i\cdot x^3\sum_{k=0}^{\infty}(-1)^k x^{2k}/(2k+3)\\
&= \sum_{k=0}^{\infty}(-1)^k x^{2k+4}/(2k+4) + i\cdot\sum_{k=0}^{\infty}(-1)^k x^{2k+3}/(2k+3),
\end{aligned}\tag{7.37}$$
where the last equality in (7.37) follows from
$$\frac{d}{dx}\sum_{k=0}^{\infty}(-1)^k x^{2k+4}/(2k+4) = \sum_{k=0}^{\infty}(-1)^k x^{2k+3} = \frac{x^3}{1+x^2}, \quad |x| < 1,$$
and similarly $\frac{d}{dx}\sum_{k=0}^{\infty}(-1)^k x^{2k+3}/(2k+3) = x^2/(1+x^2)$, so that the real and imaginary parts of $r(x)$ equal $\int_0^x y^3/(1+y^2)\,dy$ and $\int_0^x y^2/(1+y^2)\,dy$, respectively.
The result now follows from (7.37) and the easy inequalities
$$\int_0^{|x|}\frac{y^3}{1+y^2}\,dy \le \int_0^{|x|} y^3\,dy = \frac{|x|^4}{4} \le \frac{|x|^3}{\sqrt{2}}, \qquad \int_0^{|x|}\frac{y^2}{1+y^2}\,dy \le \int_0^{|x|} y^2\,dy = \frac{|x|^3}{3} \le \frac{|x|^3}{\sqrt{2}},$$
which hold for $|x| < 1$, because then $|r(x)|^2 \le (|x|^3/\sqrt{2})^2 + (|x|^3/\sqrt{2})^2 = |x|^6$. Q.E.D.
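As a quick sanity check, the identity and the bound of Lemma 7.1 can be verified numerically. The sketch below (Python, standard library only) recovers $r(x)$ from the identity itself and compares it with the series form in (7.37):

```python
import cmath

# Lemma 7.1: exp(i*x) = (1 + i*x) * exp(-x**2/2 + r(x)) with |r(x)| <= |x|**3
# for |x| < 1.  Recover r(x) by solving the logarithmic identity for it.
def r(x):
    return 1j * x + x**2 / 2 - cmath.log(1 + 1j * x)

for x in [-0.9, -0.3, 0.05, 0.5, 0.99]:
    lhs = cmath.exp(1j * x)
    rhs = (1 + 1j * x) * cmath.exp(-x**2 / 2 + r(x))
    assert abs(lhs - rhs) < 1e-12           # the identity of Lemma 7.1
    assert abs(r(x)) <= abs(x)**3 + 1e-12   # the bound |r(x)| <= |x|^3
    # series form (7.37): real and imaginary parts of r(x)
    re = sum((-1)**k * x**(2*k + 4) / (2*k + 4) for k in range(2000))
    im = sum((-1)**k * x**(2*k + 3) / (2*k + 3) for k in range(2000))
    assert abs(r(x) - complex(re, im)) < 1e-6
print("Lemma 7.1 verified numerically")
```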
The result of Lemma 7.1 plays a key role in the proof of the following generic central limit theorem:
Lemma 7.2: Let $X_t$, $t = 1, 2, \ldots, n, \ldots$ be a sequence of random variables satisfying the following four conditions:
$$\operatorname*{plim}_{n\to\infty}\,\max_{1\le t\le n}|X_t|/\sqrt{n} = 0, \tag{7.38}$$
$$\operatorname*{plim}_{n\to\infty}\,(1/n)\sum_{t=1}^n X_t^2 = \sigma^2 \in (0, \infty), \tag{7.39}$$
$$\lim_{n\to\infty} E\Big[\prod_{t=1}^n\big(1 + i\cdot\xi\cdot X_t/\sqrt{n}\big)\Big] = 1, \quad \forall \xi \in \mathbb{R}, \tag{7.40}$$
and
$$\sup_{n\ge 1} E\Big[\prod_{t=1}^n\big(1 + \xi^2 X_t^2/n\big)\Big] < \infty, \quad \forall \xi \in \mathbb{R}. \tag{7.41}$$
Then
$$(1/\sqrt{n})\sum_{t=1}^n X_t \to_d N(0, \sigma^2). \tag{7.42}$$
Proof: Without loss of generality we may assume that $\sigma^2 = 1$ because, if not, we may replace $X_t$ by $X_t/\sigma$. It follows from the first part of Lemma 7.1 that, on the event $\max_{1\le t\le n}|\xi|\cdot|X_t|/\sqrt{n} < 1$,
$$\exp\Big(i\cdot\xi\cdot(1/\sqrt{n})\sum_{t=1}^n X_t\Big) = \Big[\prod_{t=1}^n\big(1 + i\cdot\xi\cdot X_t/\sqrt{n}\big)\Big]\exp\Big(-(\xi^2/2)(1/n)\sum_{t=1}^n X_t^2 + \sum_{t=1}^n r\big(\xi X_t/\sqrt{n}\big)\Big). \tag{7.43}$$
Condition (7.39) implies that
$$\operatorname*{plim}_{n\to\infty}\exp\Big(-(\xi^2/2)(1/n)\sum_{t=1}^n X_t^2\Big) = \exp(-\xi^2/2). \tag{7.44}$$
Moreover, the probability of the event $\max_{1\le t\le n}|\xi|\cdot|X_t|/\sqrt{n} < 1$ converges to one by condition (7.38), and on this event it follows from the inequality $|r(x)| \le |x|^3$ in Lemma 7.1 that
$$\Big|\sum_{t=1}^n r\big(\xi X_t/\sqrt{n}\big)\Big| \le |\xi|^3\Big(\max_{1\le t\le n}|X_t|/\sqrt{n}\Big)(1/n)\sum_{t=1}^n X_t^2; \tag{7.45}$$
hence, conditions (7.38) and (7.39) imply
$$\operatorname*{plim}_{n\to\infty}\sum_{t=1}^n r\big(\xi X_t/\sqrt{n}\big) = 0. \tag{7.46}$$
Therefore,
$$\operatorname*{plim}_{n\to\infty}\exp\Big(-(\xi^2/2)(1/n)\sum_{t=1}^n X_t^2 + \sum_{t=1}^n r\big(\xi X_t/\sqrt{n}\big)\Big) = \exp(-\xi^2/2), \tag{7.47}$$
and consequently (7.43) yields
$$E\Big[\exp\Big(i\cdot\xi\cdot(1/\sqrt{n})\sum_{t=1}^n X_t\Big)\Big] = \exp(-\xi^2/2)E\Big[\prod_{t=1}^n\big(1 + i\cdot\xi\cdot X_t/\sqrt{n}\big)\Big] - E\Big[\Big(\prod_{t=1}^n\big(1 + i\cdot\xi\cdot X_t/\sqrt{n}\big)\Big)Z_n(\xi)\Big] + o(1), \tag{7.48}$$
where the $o(1)$ term accounts for the event $\max_{1\le t\le n}|\xi|\cdot|X_t|/\sqrt{n} \ge 1$, whose probability vanishes by (7.38), and
$$Z_n(\xi) = \exp(-\xi^2/2) - \exp\Big(-(\xi^2/2)(1/n)\sum_{t=1}^n X_t^2\Big)\exp\Big(\sum_{t=1}^n r\big(\xi X_t/\sqrt{n}\big)\Big) \to_p 0. \tag{7.49}$$
Because $|Z_n(\xi)| \le 2$ with probability 1 given that
$$\big|\exp(-x^2/2 + r(x))\big| \le 1, \tag{7.50}$$
it follows from (7.49) and the dominated convergence theorem that
$$\lim_{n\to\infty} E\big[|Z_n(\xi)|^2\big] = 0. \tag{7.51}$$
Moreover, condition (7.41) implies (using $\overline{z\cdot w} = \bar z\cdot\bar w$ and $|z| = \sqrt{z\bar z}$) that
$$\sup_{n\ge 1} E\Big[\Big|\prod_{t=1}^n\big(1 + i\cdot\xi\cdot X_t/\sqrt{n}\big)\Big|^2\Big] = \sup_{n\ge 1} E\Big[\prod_{t=1}^n\big(1 + \xi^2 X_t^2/n\big)\Big] < \infty. \tag{7.52}$$
Therefore, it follows from the Cauchy-Schwarz inequality and (7.51) and (7.52) that
$$\lim_{n\to\infty}\Big|E\Big[\Big(\prod_{t=1}^n\big(1 + i\cdot\xi\cdot X_t/\sqrt{n}\big)\Big)Z_n(\xi)\Big]\Big| \le \lim_{n\to\infty}\sqrt{E\Big[\Big|\prod_{t=1}^n\big(1 + i\cdot\xi\cdot X_t/\sqrt{n}\big)\Big|^2\Big]\,E\big[|Z_n(\xi)|^2\big]} = 0. \tag{7.53}$$
Finally, it follows now from (7.40), (7.48), and (7.53) that
$$\lim_{n\to\infty} E\Big[\exp\Big(i\cdot\xi\cdot(1/\sqrt{n})\sum_{t=1}^n X_t\Big)\Big] = \exp(-\xi^2/2). \tag{7.54}$$
Because the right-hand side of (7.54) is the characteristic function of the $N(0, 1)$ distribution, the lemma follows for the case $\sigma^2 = 1$. Q.E.D.
Lemma 7.2 is the basis for various central limit theorems for dependent processes. See, for example, Davidson’s (1994) textbook. In the next section, I will specialize Lemma 7.2 to martingale difference processes.
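The conditions of Lemma 7.2 are easy to inspect by simulation. The sketch below (Python with numpy, assumed available) uses i.i.d. standard normal draws, the simplest case in which all four conditions hold with $\sigma^2 = 1$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 5_000, 1_000

# I.i.d. standard normal draws trivially satisfy (7.38)-(7.41), sigma^2 = 1.
X = rng.standard_normal((reps, n))

# Condition (7.38): max_t |X_t| / sqrt(n) is small.
print((np.abs(X).max(axis=1) / np.sqrt(n)).mean())

# Condition (7.39): (1/n) * sum_t X_t^2 is close to sigma^2 = 1.
print((X**2).mean(axis=1).mean())

# Conclusion (7.42): (1/sqrt(n)) * sum_t X_t behaves like N(0, 1).
S = X.sum(axis=1) / np.sqrt(n)
print(round(S.mean(), 2), round(S.var(), 2))
```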
7.5.3. Martingale Difference Central Limit Theorems
Note that Lemma 7.2 carries over if we replace the $X_t$'s by a double array $X_{n,t}$, $t = 1, 2, \ldots, n$, $n = 1, 2, 3, \ldots$. In particular, let
$$Y_{n,1} = X_1, \qquad Y_{n,t} = X_t\cdot I\Big[(1/n)\sum_{j=1}^{t-1} X_j^2 \le \sigma^2 + 1\Big] \text{ for } t \ge 2. \tag{7.55}$$
Then, by condition (7.39),
$$P\big[Y_{n,t} \ne X_t \text{ for some } t \le n\big] \le P\Big[(1/n)\sum_{t=1}^n X_t^2 > \sigma^2 + 1\Big] \to 0; \tag{7.56}$$
hence, (7.42) holds if
$$(1/\sqrt{n})\sum_{t=1}^n Y_{n,t} \to_d N(0, \sigma^2). \tag{7.57}$$
Therefore, it suffices to verify the conditions of Lemma 7.2 for (7.55).
First, it follows straightforwardly from (7.56) that condition (7.39) implies
$$\operatorname*{plim}_{n\to\infty}(1/n)\sum_{t=1}^n Y_{n,t}^2 = \sigma^2. \tag{7.58}$$
Moreover, if $X_t$ is strictly stationary with an $\alpha$-mixing base and $E[X_1^2] = \sigma^2 \in (0, \infty)$, then it follows from Theorem 7.7 that (7.39) holds, and so does (7.58).
Next, let us have a closer look at condition (7.38). It is not hard to verify that, for arbitrary $\varepsilon > 0$,
$$P\Big[\max_{1\le t\le n}|X_t|/\sqrt{n} > \varepsilon\Big] \le P\Big[(1/n)\sum_{t=1}^n X_t^2\, I\big(|X_t| > \varepsilon\sqrt{n}\big) > \varepsilon^2\Big]. \tag{7.59}$$
Hence, (7.38) is equivalent to the condition that, for arbitrary $\varepsilon > 0$,
$$(1/n)\sum_{t=1}^n X_t^2\, I\big(|X_t| > \varepsilon\sqrt{n}\big) \to_p 0. \tag{7.60}$$
Note that (7.60) is true if $X_t$ is strictly stationary, because then
$$E\Big[(1/n)\sum_{t=1}^n X_t^2\, I\big(|X_t| > \varepsilon\sqrt{n}\big)\Big] = E\big[X_1^2\, I\big(|X_1| > \varepsilon\sqrt{n}\big)\big] \to 0. \tag{7.61}$$
Next, consider condition (7.41) for the $Y_{n,t}$'s. Because the partial averages $(1/n)\sum_{j=1}^{t-1}X_j^2$ are nondecreasing in $t$, we have $Y_{n,t} = X_t$ for $t \le \tau_n$ and $Y_{n,t} = 0$ for $t > \tau_n$, so that
$$\prod_{t=1}^n\big(1 + \xi^2 Y_{n,t}^2/n\big) \le \exp\Big(\xi^2(1/n)\sum_{j=1}^{\tau_n - 1}X_j^2\Big)\big(1 + \xi^2 X_{\tau_n}^2/n\big) \le \exp\big(\xi^2(\sigma^2 + 1)\big)\Big(1 + \xi^2(1/n)\sum_{t=1}^n X_t^2\Big), \tag{7.62}$$
where $\tau_n$ is the largest $t \le n$ for which $(1/n)\sum_{j=1}^{t-1}X_j^2 \le \sigma^2 + 1$. Hence,
$$\sup_{n\ge 1} E\Big[\prod_{t=1}^n\big(1 + \xi^2 Y_{n,t}^2/n\big)\Big] \le \exp\big(\xi^2(\sigma^2 + 1)\big)\Big(1 + \xi^2\sup_{n\ge 1}(1/n)\sum_{t=1}^n E[X_t^2]\Big). \tag{7.63}$$
Thus, (7.63) is finite if $\sup_{n\ge 1}(1/n)\sum_{t=1}^n E[X_t^2] < \infty$, which in its turn is true if $X_t$ is covariance stationary.
Finally, it follows from the law of iterated expectations that, for a martingale difference process $X_t$,
$$E\Big[\prod_{t=1}^n\big(1 + i\cdot\xi\cdot X_t/\sqrt{n}\big)\Big] = E\Big[\prod_{t=1}^n\big(1 + i\cdot\xi\cdot E[X_t|\mathscr{F}_{t-1}]/\sqrt{n}\big)\Big] = 1, \quad \forall \xi \in \mathbb{R},$$
and therefore also
$$E\Big[\prod_{t=1}^n\big(1 + i\cdot\xi\cdot Y_{n,t}/\sqrt{n}\big)\Big] = E\Big[\prod_{t=1}^n\big(1 + i\cdot\xi\cdot E[Y_{n,t}|\mathscr{F}_{t-1}]/\sqrt{n}\big)\Big] = 1, \quad \forall \xi \in \mathbb{R}.$$
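The iterated-expectations argument can be checked by Monte Carlo. The sketch below (Python with numpy) uses a hypothetical ARCH(1)-type recursion, chosen here only as a convenient example of a martingale difference process; the sample average of the product $\prod_t(1 + i\xi X_t/\sqrt{n})$ should be close to 1:

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps, xi = 200, 20_000, 1.0

# Hypothetical ARCH(1)-type martingale difference process (illustration only):
# X_t = eps_t * sqrt(0.2 + 0.5 * X_{t-1}^2), so E[X_t | F_{t-1}] = 0.
eps = rng.standard_normal((reps, n))
X = np.zeros((reps, n))
for t in range(1, n):
    X[:, t] = eps[:, t] * np.sqrt(0.2 + 0.5 * X[:, t - 1] ** 2)

# By iterated expectations, E[prod_t (1 + i*xi*X_t/sqrt(n))] = 1 exactly,
# so the Monte Carlo average should be close to 1 + 0i.
P = np.prod(1 + 1j * xi * X / np.sqrt(n), axis=1)
print(P.mean())
```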
We can now specialize Lemma 7.2 to martingale difference processes:
Theorem 7.10: Let $X_t \in \mathbb{R}$ be a martingale difference process satisfying the following three conditions:
(a) $(1/n)\sum_{t=1}^n X_t^2 \to_p \sigma^2 \in (0, \infty)$;
(b) For arbitrary $\varepsilon > 0$, $(1/n)\sum_{t=1}^n X_t^2\, I(|X_t| > \varepsilon\sqrt{n}) \to_p 0$;
(c) $\sup_{n\ge 1}(1/n)\sum_{t=1}^n E[X_t^2] < \infty$.
Then $(1/\sqrt{n})\sum_{t=1}^n X_t \to_d N(0, \sigma^2)$.
Moreover, it is not hard to verify that the conditions of Theorem 7.10 hold if the martingale difference process $X_t$ is strictly stationary with an $\alpha$-mixing base and $E[X_1^2] = \sigma^2 \in (0, \infty)$:
Theorem 7.11: Let $X_t \in \mathbb{R}$ be a strictly stationary martingale difference process with an $\alpha$-mixing base satisfying $E[X_1^2] = \sigma^2 \in (0, \infty)$. Then $(1/\sqrt{n})\sum_{t=1}^n X_t \to_d N(0, \sigma^2)$.
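Theorem 7.11 can be illustrated by simulation. The sketch below (Python with numpy) again uses an ARCH(1)-type recursion as an illustrative martingale difference process (started at zero, hence only approximately stationary), whose stationary variance is $\sigma^2 = 0.2/(1 - 0.5) = 0.4$:

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 500, 5_000

# Hypothetical ARCH(1)-type martingale difference process:
# X_t = eps_t * sqrt(0.2 + 0.5 * X_{t-1}^2), stationary variance 0.4.
eps = rng.standard_normal((reps, n))
X = np.zeros((reps, n))
for t in range(1, n):
    X[:, t] = eps[:, t] * np.sqrt(0.2 + 0.5 * X[:, t - 1] ** 2)

# Theorem 7.11: (1/sqrt(n)) * sum_t X_t is approximately N(0, 0.4).
S = X.sum(axis=1) / np.sqrt(n)
print(round(S.mean(), 3), round(S.var(), 3))
# Crude normality check: compare with the N(0, 1) 95% quantile 1.645.
print(round(float(np.quantile(S / np.sqrt(0.4), 0.95)), 2))
```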
1. Let $U$ and $V$ be independent standard normal random variables, and let $X_t = U\cdot\cos(\lambda t) + V\cdot\sin(\lambda t)$ for all integers $t$ and some nonrandom number $\lambda \in (0, \pi)$. Prove that $X_t$ is covariance stationary and deterministic.
2. Show that the process $X_t$ in problem 1 does not have a vanishing memory but that nevertheless $\operatorname*{plim}_{n\to\infty}(1/n)\sum_{t=1}^n X_t = 0$.
3. Let $X_t$ be a time series process satisfying $E[|X_t|] < \infty$, and suppose that the events in the remote $\sigma$-algebra $\mathscr{F}_{-\infty} = \cap_{t=0}^{\infty}\sigma(X_{-t}, X_{-t-1}, X_{-t-2}, \ldots)$ have either probability 0 or 1. Show that $P\big(E[X_t|\mathscr{F}_{-\infty}] = E[X_t]\big) = 1$.
4. Prove (7.30).
5. Prove (7.31) by verifying the conditions of Theorem 7.8(b) for $g_t(\theta) = (Y_t - f_t(\theta))^2$, with $Y_t$ defined by (7.18) and $f_t(\theta)$ by (7.26).
6. Verify the conditions of Theorem 7.9 for $g_t(\theta) = (Y_t - f_t(\theta))^2$, with $Y_t$ defined by (7.18) and $f_t(\theta)$ by (7.26).
7. Prove (7.50).
8. Prove (7.59).
APPENDIX
7.A.1. Introduction
In general terms, a Hilbert space is a space of elements for which properties similar to those of Euclidean spaces hold. We have seen in Appendix I that the Euclidean space $\mathbb{R}^n$ is a special case of a vector space, that is, a space of elements endowed with two arithmetic operations: addition, denoted by "+," and scalar multiplication, denoted by a dot. In particular, a space $V$ is a vector space if, for all $x$, $y$, and $z$ in $V$ and all scalars $c$, $c_1$, and $c_2$,
(a) $x + y = y + x$;
(b) $x + (y + z) = (x + y) + z$;
(c) There is a unique zero vector $0$ in $V$ such that $x + 0 = x$;
(d) For each $x$ there exists a unique vector $-x$ in $V$ such that $x + (-x) = 0$;
(e) $1\cdot x = x$;
(f) $(c_1 c_2)\cdot x = c_1\cdot(c_2\cdot x)$;
(g) $c\cdot(x + y) = c\cdot x + c\cdot y$;
(h) $(c_1 + c_2)\cdot x = c_1\cdot x + c_2\cdot x$.
Scalars are real or complex numbers. If the scalar multiplication rules are confined to real numbers, the vector space $V$ is a real vector space. In the sequel I will only consider real vector spaces.
The inner product of two vectors $x$ and $y$ in $\mathbb{R}^n$ is defined by $x^{\mathrm{T}}y$. If we denote $(x, y) = x^{\mathrm{T}}y$, it is trivial that this inner product obeys the rules in the more general definition of the term:
Definition 7.A.1: An inner product on a real vector space $V$ is a real function $(x, y)$ on $V \times V$ such that for all $x$, $y$, $z$ in $V$ and all $c$ in $\mathbb{R}$,
(1) $(x, y) = (y, x)$;
(2) $(cx, y) = c(x, y)$;
(3) $(x + y, z) = (x, z) + (y, z)$;
(4) $(x, x) > 0$ when $x \ne 0$.
A vector space endowed with an inner product is called an inner-product space. Thus, $\mathbb{R}^n$ is an inner-product space. In $\mathbb{R}^n$ the norm of a vector $x$ is defined by $\|x\| = \sqrt{x^{\mathrm{T}}x}$. Therefore, the norm on a real inner-product space is defined similarly as $\|x\| = \sqrt{(x, x)}$. Moreover, in $\mathbb{R}^n$ the distance between two vectors $x$ and $y$ is defined by $\|x - y\| = \sqrt{(x - y)^{\mathrm{T}}(x - y)}$. Therefore, the distance between two vectors $x$ and $y$ in a real inner-product space is defined similarly as $\|x - y\| = \sqrt{(x - y, x - y)}$. The latter is called a metric.
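These definitions are easy to make concrete in $\mathbb{R}^n$. The sketch below (Python with numpy) implements the inner product, norm, and metric and checks the axioms of Definition 7.A.1 on sample vectors:

```python
import numpy as np

# Inner product, norm, and metric on R^n, as in Definition 7.A.1 and the
# norm/metric derived from it.
def inner(x, y): return float(x @ y)
def norm(x): return np.sqrt(inner(x, x))
def dist(x, y): return norm(x - y)

x, y, z = np.array([1., 2.]), np.array([3., -1.]), np.array([0., 4.])
c = 2.5
assert inner(x, y) == inner(y, x)                              # rule (1)
assert inner(c * x, y) == c * inner(x, y)                      # rule (2)
assert np.isclose(inner(x + y, z), inner(x, z) + inner(y, z))  # rule (3)
assert inner(x, x) > 0                                         # rule (4)
print(norm(x), dist(x, y))  # sqrt(5) and sqrt(13)
```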
An inner-product space with associated norm and metric is called a pre-Hilbert space. The reason for the "pre" is that one crucial property of $\mathbb{R}^n$ is still missing, namely, that every Cauchy sequence in $\mathbb{R}^n$ has a limit in $\mathbb{R}^n$.
Definition 7.A.2: A sequence of elements $x_n$ of an inner-product space with associated norm and metric is called a Cauchy sequence if, for every $\varepsilon > 0$, there exists an $n_0$ such that for all $k, m \ge n_0$, $\|x_k - x_m\| < \varepsilon$.
Theorem 7.A.1: Every Cauchy sequence in $\mathbb{R}^{\ell}$, $\ell < \infty$, has a limit in the space involved.
Proof: Consider first the case $\mathbb{R}$. Let $\bar x = \limsup_{n\to\infty} x_n$, where $x_n$ is a Cauchy sequence. I will show first that $\bar x < \infty$.
There exists a subsequence $n_k$ such that $\bar x = \lim_{k\to\infty}x_{n_k}$. Note that $x_{n_k}$ is also a Cauchy sequence. For arbitrary $\varepsilon > 0$ there exists an index $k_0$ such that $|x_{n_k} - x_{n_m}| < \varepsilon$ if $k, m \ge k_0$. If we keep $k$ fixed and let $m \to \infty$, it follows that $|x_{n_k} - \bar x| \le \varepsilon$; hence, $\bar x < \infty$. Similarly, $\underline{x} = \liminf_{n\to\infty}x_n > -\infty$. Now we can find an index $k_0$ and subsequences $n_k$ and $n_m$ such that, for $k, m \ge k_0$, $|x_{n_k} - \bar x| < \varepsilon$, $|x_{n_m} - \underline{x}| < \varepsilon$, and $|x_{n_k} - x_{n_m}| < \varepsilon$; hence, $|\bar x - \underline{x}| < 3\varepsilon$. Because $\varepsilon$ is arbitrary, we must have $\bar x = \underline{x} = \lim_{n\to\infty}x_n$. If we apply this argument to each component of a vector-valued Cauchy sequence, the result for the case $\mathbb{R}^{\ell}$ follows. Q.E.D.
For an inner-product space to be a Hilbert space, we have to require that the result of Theorem 7.A.1 carry over to the inner-product space involved:
Definition 7.A.3: A Hilbert space H is a vector space endowed with an inner product and associated norm and metric such that every Cauchy sequence in H has a limit in H.