A COMPANION TO Theoretical Econometrics
Orders of magnitude
In determining the limiting behavior of sequences of random variables it is often helpful to employ notions of relative orders of magnitude. We start with a review of the concept of order of magnitude for sequences of real numbers.
Definition 6. (Order of magnitude of a sequence of real numbers) Let $a_n$ be a sequence of real numbers and let $c_n$ be a sequence of positive real numbers. We then say $a_n$ is at most of order $c_n$, and write $a_n = O(c_n)$, if there exists a constant $M < \infty$ such that $c_n^{-1}|a_n| < M$ for all $n \in \mathbb{N}$. We say $a_n$ is of smaller order than $c_n$, and write $a_n = o(c_n)$, if $c_n^{-1}|a_n| \to 0$ as $n \to \infty$. (The definition extends to vectors and matrices by applying the definition to their norm.)
The following results concerning the algebra of order in magnitude operations are often useful.
Theorem 16. Let $a_n$ and $b_n$ be sequences of real numbers, and let $c_n$ and $d_n$ be sequences of positive real numbers.
(a) If $a_n = o(c_n)$ and $b_n = o(d_n)$, then $a_n b_n = o(c_n d_n)$, $|a_n|^s = o(c_n^s)$ for $s > 0$, and $a_n + b_n = o(\max\{c_n, d_n\}) = o(c_n + d_n)$.
(b) If $a_n = O(c_n)$ and $b_n = O(d_n)$, then $a_n b_n = O(c_n d_n)$, $|a_n|^s = O(c_n^s)$ for $s > 0$, and $a_n + b_n = O(\max\{c_n, d_n\}) = O(c_n + d_n)$.
(c) If $a_n = o(c_n)$ and $b_n = O(d_n)$, then $a_n b_n = o(c_n d_n)$.
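As a quick numerical illustration of Definition 6 and Theorem 16(c), the following Python sketch evaluates the ratios $c_n^{-1}|a_n|$ for two sequences chosen purely for this example (they do not appear in the text): $a_n = 3n^2 + 5n$, which is $O(n^2)$, and $b_n = n \log n$, which is $o(n^2)$. A finite table of ratios can only suggest the limiting behavior, not prove it.

```python
import math

def ratios(a, c, ns):
    """Return c_n^{-1} |a_n| for each n in ns."""
    return [abs(a(n)) / c(n) for n in ns]

ns = [10, 100, 1_000, 10_000, 100_000]

# Hypothetical example sequences (not from the text).
a = lambda n: 3 * n**2 + 5 * n      # a_n = O(n^2): ratio equals 3 + 5/n <= 8
b = lambda n: n * math.log(n)       # b_n = o(n^2): ratio equals log(n)/n -> 0

big_o = ratios(a, lambda n: n**2, ns)
small_o = ratios(b, lambda n: n**2, ns)

# Theorem 16(c): a_n = O(n^2) and b_n = o(n^2) imply a_n * b_n = o(n^4).
product = ratios(lambda n: a(n) * b(n), lambda n: n**4, ns)

for n, r1, r2, r3 in zip(ns, big_o, small_o, product):
    print(f"n={n:>6}  |a_n|/n^2={r1:.5f}  |b_n|/n^2={r2:.5f}  |a_n b_n|/n^4={r3:.5f}")
```

The first ratio stays bounded while the second and third shrink toward zero, which is the pattern the definitions describe.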
We now generalize the concept of order of magnitude from sequences of real numbers to sequences of random variables.
Definition 7. (Order in probability of a sequence of random variables) Let $Z_n$ be a sequence of random variables, and let $c_n$ be a sequence of positive real numbers. We then say $Z_n$ is at most of order $c_n$ in probability, and write $Z_n = O_p(c_n)$, if for every $\varepsilon > 0$ there exists a constant $M_\varepsilon < \infty$ such that $P(c_n^{-1}|Z_n| > M_\varepsilon) < \varepsilon$ for all $n \in \mathbb{N}$. We say $Z_n$ is of smaller order in probability than $c_n$, and write $Z_n = o_p(c_n)$, if $c_n^{-1}|Z_n| \xrightarrow{p} 0$ as $n \to \infty$. (The definition extends to vectors and matrices by applying the definition to their norm.)
The algebra of order in probability operations $O_p$ and $o_p$ is identical to that of the order in magnitude operations $O$ and $o$ presented in the theorem above; see, e.g., Fuller (1976, p. 184).
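A small Monte Carlo sketch may help to make the two notions of order in probability concrete. It assumes, purely for illustration, that $Z_n$ is the sum of $n$ independent draws from the uniform distribution on $(-1, 1)$; the thresholds 2.0 and 0.1, the replication count, and the seed are likewise arbitrary choices. Under this assumption $Z_n = O_p(n^{1/2})$ and $Z_n = o_p(n)$, so the first exceedance frequency should stay small for every $n$, while the second should fall toward zero as $n$ grows.

```python
import math
import random

def exceedance_frequencies(n, reps, rng, m_root=2.0, m_lin=0.1):
    """Estimate P(n^{-1/2}|Z_n| > m_root) and P(n^{-1}|Z_n| > m_lin) by simulation.

    Z_n is the sum of n independent U(-1, 1) draws (an assumption of this sketch).
    """
    hits_root = 0
    hits_lin = 0
    for _ in range(reps):
        z = abs(sum(rng.uniform(-1.0, 1.0) for _ in range(n)))
        hits_root += z / math.sqrt(n) > m_root   # scaling c_n = n^{1/2}
        hits_lin += z / n > m_lin                # scaling c_n = n
    return hits_root / reps, hits_lin / reps

rng = random.Random(0)   # fixed seed so the run is reproducible
results = {n: exceedance_frequencies(n, 1000, rng) for n in (10, 100, 1000)}

for n, (p_root, p_lin) in results.items():
    print(f"n={n:>5}  freq(|Z_n|/sqrt(n) > 2.0)={p_root:.3f}  freq(|Z_n|/n > 0.1)={p_lin:.3f}")
```

The simulation only estimates probabilities for a few values of $n$; the definitions themselves are statements about all $n$ and about the limit.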
A sequence of random variables $Z_n$ that is $O_p(1)$ is also said to be "stochastically bounded" or "bounded in probability". The next theorem gives sufficient conditions for a sequence to be stochastically bounded.
Theorem 17.
(a) If $E|Z_n|^r = O(1)$ for some $r > 0$, then $Z_n = O_p(1)$.
(b) If $Z_n \xrightarrow{d} Z$, then $Z_n = O_p(1)$.
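Proof. Part (a) follows readily from Markov's inequality. To prove part (b), fix $\varepsilon > 0$, with $F_n$ and $F$ denoting the CDFs of $Z_n$ and $Z$. Now choose $M^*$ such that $F$ is continuous at $-M^*$ and $M^*$, and $F(-M^*) < \varepsilon/4$ and $F(M^*) > 1 - \varepsilon/4$. Since every CDF has at most a countable number of discontinuity points, such a choice is possible. By assumption $F_n(z) \to F(z)$ for all continuity points $z$ of $F$. Let $n_\varepsilon$ be such that for all $n \ge n_\varepsilon$
\[ |F_n(-M^*) - F(-M^*)| < \varepsilon/4 \]
and
\[ |F_n(M^*) - F(M^*)| < \varepsilon/4. \]
Then for $n \ge n_\varepsilon$
\[ P(|Z_n| > M^*) \le F_n(-M^*) - F_n(M^*) + 1 < F(-M^*) - F(M^*) + 1 + \varepsilon/2 < \varepsilon. \]
Since $\lim_{M \to \infty} P(|Z_i| > M) = 0$ for each $i \in \mathbb{N}$, we can find an $M^{**}$ such that $P(|Z_i| > M^{**}) < \varepsilon$ for $i = 1, \ldots, n_\varepsilon - 1$. Now let $M_\varepsilon = \max\{M^*, M^{**}\}$. Then $P(|Z_n| > M_\varepsilon) < \varepsilon$ for all $n \in \mathbb{N}$. ■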
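The Markov-inequality step behind part (a) can be spelled out as follows. The constant $C$ is notation introduced only for this sketch; it denotes a finite bound on $E|Z_n|^r$, which exists because $E|Z_n|^r = O(1)$. For any $M > 0$ and all $n \in \mathbb{N}$,
\[ P(|Z_n| > M) \le P(|Z_n|^r \ge M^r) \le \frac{E|Z_n|^r}{M^r} \le \frac{C}{M^r}. \]
Given $\varepsilon > 0$, any choice of $M_\varepsilon > (C/\varepsilon)^{1/r}$ makes the right-hand side strictly smaller than $\varepsilon$ uniformly in $n$, which is the requirement for $Z_n = O_p(1)$.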
3 Laws of Large Numbers
Let $Z_t$, $t \in \mathbb{N}$, be a sequence of random variables with $EZ_t = \mu_t$. Furthermore let $\bar{Z}_n = n^{-1}\sum_{t=1}^{n} Z_t$ denote the sample mean, and let $\bar{\mu}_n = E\bar{Z}_n = n^{-1}\sum_{t=1}^{n} \mu_t$. A law of large numbers (LLN) then specifies conditions under which
\[ \bar{Z}_n - E\bar{Z}_n = n^{-1}\sum_{t=1}^{n} (Z_t - \mu_t) \]
converges to zero either in probability or almost surely. If the convergence is in probability we speak of a weak LLN; if the convergence is almost sure we speak of a strong LLN. We note that in applications the random variables $Z_t$ may themselves be functions of other random variables.
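The centered sample mean can be simulated directly. The sketch below assumes a data-generating process that is not taken from the text: independent $Z_t = \mu_t + e_t$ with non-constant means $\mu_t = \sin(t)$ and disturbances $e_t$ uniform on $(-1, 1)$. Since the variances are bounded and the draws independent, a weak LLN applies in this setting, and the difference between the sample mean and the average of the means should become small for large $n$.

```python
import math
import random

def sample_mean_and_mean_of_means(n, rng):
    """Return (Zbar_n, mubar_n) for Z_t = mu_t + e_t, t = 1, ..., n.

    Assumptions of this sketch: mu_t = sin(t), e_t ~ U(-1, 1), independent over t.
    """
    z_sum = 0.0
    mu_sum = 0.0
    for t in range(1, n + 1):
        mu_t = math.sin(t)
        z_sum += mu_t + rng.uniform(-1.0, 1.0)
        mu_sum += mu_t
    return z_sum / n, mu_sum / n

rng = random.Random(1)   # fixed seed so the run is reproducible
deviations = {}
for n in (100, 10_000, 100_000):
    z_bar, mu_bar = sample_mean_and_mean_of_means(n, rng)
    deviations[n] = z_bar - mu_bar
    print(f"n={n:>7}  Zbar_n - mubar_n = {deviations[n]: .5f}")
```

A single simulated path illustrates the statement but does not distinguish between convergence in probability and almost sure convergence.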
The usefulness of LLNs stems from the fact that many estimators can be expressed as (continuous) functions of sample averages of random variables, or differ from such a function only by a term that can be shown to converge to zero i.p. or a.s. Thus to establish the probability or almost sure limit of such an estimator we may try to establish in a first step the limits for the respective averages by means of LLNs. In a second step we may then use Theorem 14 to derive the actual limit for the estimator.
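Example 6. As an illustration consider the linear regression model $y_t = x_t\theta + \epsilon_t$, $t = 1, \ldots, n$, where $y_t$, $x_t$ and $\epsilon_t$ are all scalar and denote the dependent variable, the independent variable, and the disturbance term in period $t$. The ordinary least squares estimator for the parameter $\theta$ is then given by
\[ \hat{\theta}_n = \frac{\sum_{t=1}^{n} x_t y_t}{\sum_{t=1}^{n} x_t^2} = \theta + \frac{n^{-1}\sum_{t=1}^{n} x_t \epsilon_t}{n^{-1}\sum_{t=1}^{n} x_t^2}, \]
and thus $\hat{\theta}_n$ is seen to be a function of the sample averages of $x_t\epsilon_t$ and $x_t^2$.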
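To see the role of the two sample averages numerically, the following sketch simulates the scalar regression without intercept and computes the least squares estimate in two ways: as the ratio of the sum of $x_t y_t$ to the sum of $x_t^2$, and as the true parameter plus the ratio of the averages of $x_t\epsilon_t$ and $x_t^2$. The true parameter value 2.0, the standard normal regressors and disturbances, the sample size, and the seed are all assumptions made for this illustration only.

```python
import random

rng = random.Random(2)   # fixed seed so the run is reproducible
theta = 2.0              # hypothetical true parameter
n = 100_000

sum_xy = 0.0   # running sum of x_t * y_t
sum_xx = 0.0   # running sum of x_t^2
sum_xe = 0.0   # running sum of x_t * eps_t
for _ in range(n):
    x = rng.gauss(0.0, 1.0)      # regressor
    eps = rng.gauss(0.0, 1.0)    # disturbance, independent of x
    y = x * theta + eps
    sum_xy += x * y
    sum_xx += x * x
    sum_xe += x * eps

avg_xe = sum_xe / n              # sample average of x_t * eps_t (close to 0 here)
avg_xx = sum_xx / n              # sample average of x_t^2 (close to 1 here)

theta_hat = sum_xy / sum_xx              # least squares estimate
decomposed = theta + avg_xe / avg_xx     # same quantity via the two averages

print(f"theta_hat={theta_hat:.5f}  decomposed={decomposed:.5f}")
print(f"avg(x*eps)={avg_xe:.5f}  avg(x^2)={avg_xx:.5f}")
```

The two expressions agree up to floating-point rounding, and the estimate is close to the assumed parameter because the average of $x_t\epsilon_t$ is close to zero while the average of $x_t^2$ stays away from zero.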