Advanced Econometrics Takeshi Amemiya
Distribution Function
Definition 3.1.5. The distribution function $F(x)$ of a random variable $X(\omega)$ is defined by
$$F(x) = P\{\omega \mid X(\omega) < x\}.$$
Note that the distribution function can be defined for any random variable because a probability is assigned to every element of $\mathcal{A}$ and hence to $\{\omega \mid X(\omega) < x\}$ for any $x$. We shall write $P\{\omega \mid X(\omega) < x\}$ more compactly as $P(X < x)$.
A distribution function has the properties:
(i) $F(-\infty) = 0$.
(ii) $F(\infty) = 1$.
(iii) It is nondecreasing and continuous from the left.
[Some authors define the distribution function as $F(x) = P\{\omega \mid X(\omega) \le x\}$. Then it is continuous from the right.]
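The difference between the two conventions is easiest to see at a jump point. The following is a minimal sketch, assuming a hypothetical three-point distribution (the values and probabilities in `pmf` are illustrative, not from the text); `F_strict` follows Definition 3.1.5 and `F_weak` follows the alternative convention.

```python
from fractions import Fraction

# Hypothetical example: X takes the values 1, 2, 3 with these probabilities.
pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

def F_strict(x):
    """F(x) = P(X < x), the convention of Definition 3.1.5."""
    return sum((p for c, p in pmf.items() if c < x), Fraction(0))

def F_weak(x):
    """F(x) = P(X <= x), the convention used by some other authors."""
    return sum((p for c, p in pmf.items() if c <= x), Fraction(0))

eps = Fraction(1, 10**9)  # a small step used to probe each side of the jump at x = 2

# Approaching x = 2 from the left leaves F_strict unchanged (left-continuous),
# while approaching from the right leaves F_weak unchanged (right-continuous).
print(F_strict(2 - eps), F_strict(2), F_strict(2 + eps))
print(F_weak(2 - eps), F_weak(2), F_weak(2 + eps))
```

Both functions agree everywhere except at the jump points themselves, where they differ by exactly the probability mass at that point.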
Using a distribution function, we can define the expected value of a random variable whether it is discrete, continuous, or a mixture of the two. This is done by means of the Riemann-Stieltjes integral, which is a simple generalization of the familiar Riemann integral. Let $X$ be a random variable with a distribution function $F$ and let $Y = h(X)$, where $h(\cdot)$ is Borel-measurable.⁶ We define the expected value of $Y$, denoted by $EY$, as follows. Divide an interval $[a, b]$ into $n$ intervals with the end points $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$ and let $x_i^*$ be an arbitrary point in $[x_i, x_{i+1}]$. Define the partial sum
$$S_n = \sum_{i=0}^{n-1} h(x_i^*)\,[F(x_{i+1}) - F(x_i)] \tag{3.1.1}$$
associated with this partition of the interval $[a, b]$. If, for any $\epsilon > 0$, there exists a real number $A$ and a partition such that for every finer partition and for any choice of $x_i^*$, $|S_n - A| < \epsilon$, we call $A$ the Riemann-Stieltjes integral and denote it by $\int_a^b h(x)\,dF(x)$. It exists if $h$ is a continuous function except possibly for a countable number of discontinuities, provided that, whenever its discontinuity coincides with that of $F$, it is continuous from the right.⁷ Finally, we define
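$$EY = \int_{-\infty}^{\infty} h(x)\,dF(x) = \lim_{\substack{a \to -\infty \\ b \to \infty}} \int_a^b h(x)\,dF(x), \tag{3.1.2}$$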
provided the limit (which may be $+\infty$ or $-\infty$) exists regardless of the way $a \to -\infty$ and $b \to \infty$.
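The construction can be tried numerically on a distribution that is neither purely discrete nor purely continuous. The sketch below is illustrative only and rests on assumptions not in the text: a uniform partition, midpoints as the arbitrary points $x_i^*$, and a hypothetical mixture in which $X = 0$ with probability 1/2 and is otherwise uniform on $(0, 1)$, so that $EX = 0.25$. Because this $X$ lies in $[0, 1]$, the fixed interval $[-1, 2]$ already captures the whole integral and the limit in $a$ and $b$ is trivial.

```python
def stieltjes_sum(h, F, a, b, n):
    """Partial sum of h against F on a uniform partition of [a, b] into n
    intervals, using each interval's midpoint as the evaluation point."""
    total = 0.0
    for i in range(n):
        x_lo = a + (b - a) * i / n
        x_hi = a + (b - a) * (i + 1) / n
        x_star = 0.5 * (x_lo + x_hi)
        total += h(x_star) * (F(x_hi) - F(x_lo))
    return total

def F_mixture(x):
    """F(x) = P(X < x) for the assumed mixture: X = 0 with probability 1/2,
    otherwise X is uniform on (0, 1)."""
    if x <= 0.0:
        return 0.0
    if x <= 1.0:
        return 0.5 + 0.5 * x
    return 1.0

exact = 0.5 * 0.0 + 0.5 * 0.5  # EX for the assumed mixture

# Refining the partition should drive the partial sum toward EX.
approximations = {
    n: stieltjes_sum(lambda x: x, F_mixture, -1.0, 2.0, n)
    for n in (30, 300, 3000)
}
for n, value in approximations.items():
    print(n, value, abs(value - exact))
```

The jump of $F$ at 0 and its smooth part on $(0, 1)$ are handled by the same sum, with no separate treatment of the discrete and continuous components.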
If $dF/dx$ exists and is equal to $f(x)$, then $F(x_{i+1}) - F(x_i) = f(x_i^*)(x_{i+1} - x_i)$ for some $x_i^* \in [x_i, x_{i+1}]$ by the mean value theorem. Therefore
$$Eh(X) = \int_{-\infty}^{\infty} h(x) f(x)\,dx. \tag{3.1.3}$$
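As a worked illustration of the continuous case, assume (this example is not from the text) that $X$ has the standard exponential distribution, with $F(x) = 1 - e^{-x}$ for $x > 0$ and $F(x) = 0$ otherwise, so that $f(x) = e^{-x}$ for $x > 0$. The single kink of $F$ at $x = 0$ does not affect the integral. Taking $h(x) = x$ and integrating by parts gives:

```latex
% Assumed example: F(x) = 1 - e^{-x} for x > 0, F(x) = 0 otherwise; h(x) = x.
\begin{align*}
EX &= \int_{-\infty}^{\infty} x f(x)\,dx
    = \lim_{b \to \infty} \int_0^b x e^{-x}\,dx \\
   &= \lim_{b \to \infty} \Bigl[ -(1 + x) e^{-x} \Bigr]_0^b
    = \lim_{b \to \infty} \bigl[ 1 - (1 + b) e^{-b} \bigr] = 1 .
\end{align*}
```

The lower limit $a$ plays no role here because $f$ vanishes for $x \le 0$.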
On the other hand, suppose $X = c_i$ with probability $p_i$, $i = 1, 2, \ldots, K$. Take $a < c_1$ and $c_K < b$; then, for sufficiently large $n$, each interval contains at most one of the $c_i$'s. Then, of the $n$ terms in the summand of (3.1.1), only the $K$ terms containing $c_i$'s are nonzero. Therefore
$$Eh(X) = \sum_{i=1}^{K} h(c_i)\,p_i. \tag{3.1.4}$$
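The collapse of the partial sum to $K$ terms can be checked directly. The following is a minimal sketch under stated assumptions: the support points `c`, probabilities `p`, the function `h(x) = x**2`, the interval $[-1, 4]$, and the midpoint rule are all hypothetical choices for illustration. Exact rational arithmetic is used so that zero increments of $F$ are detected without rounding error.

```python
from fractions import Fraction

# Hypothetical discrete distribution with K = 3: X = c[i] with probability p[i].
c = [Fraction(0), Fraction(1), Fraction(3)]
p = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]

def F(x):
    """F(x) = P(X < x)."""
    return sum((p_i for c_i, p_i in zip(c, p) if c_i < x), Fraction(0))

def h(x):
    return x * x

def partial_sum(a, b, n):
    """Return the midpoint partial sum over a uniform partition of [a, b],
    together with the number of intervals on which F actually increases."""
    total = Fraction(0)
    nonzero = 0
    for i in range(n):
        x_lo = a + (b - a) * Fraction(i, n)
        x_hi = a + (b - a) * Fraction(i + 1, n)
        dF = F(x_hi) - F(x_lo)
        if dF != 0:
            nonzero += 1
            total += h((x_lo + x_hi) / 2) * dF
    return total, nonzero

# a = -1 < c_1 and c_K < b = 4, as required in the text.
S_n, nonzero_terms = partial_sum(Fraction(-1), Fraction(4), 5000)
exact = sum((h(c_i) * p_i for c_i, p_i in zip(c, p)), Fraction(0))
print(float(S_n), float(exact), nonzero_terms)
```

With 5000 intervals only three of them carry a nonzero increment of $F$, and the partial sum differs from the weighted sum of $h(c_i)$ only because the midpoint $x_i^*$ is not exactly $c_i$; that gap shrinks as the partition is refined.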