Introduction to the Mathematical and Statistical Foundations of Econometrics
4.4. Transformations of Absolutely Continuous Random Vectors
4.4.1. The Linear Case
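Let $X = (X_1, X_2)^T$ be a bivariate random vector with distribution function
$$F(x) = \int_{-\infty}^{x_1}\int_{-\infty}^{x_2} f(u_1, u_2)\, du_2\, du_1 = \int_{(-\infty, x_1] \times (-\infty, x_2]} f(u)\, du,$$
where $x = (x_1, x_2)^T$, $u = (u_1, u_2)^T$.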
In this section I will derive the joint density of $Y = AX + b$, where $A$ is a (nonrandom) nonsingular $2 \times 2$ matrix and $b$ is a nonrandom $2 \times 1$ vector.
Recall from linear algebra (see Appendix I) that any square matrix $A$ can be decomposed into
$$A = R^{-1} L \cdot D \cdot U, \tag{4.19}$$
where $R$ is a permutation matrix (possibly equal to the unit matrix $I$), $L$ is a lower-triangular matrix with diagonal elements all equal to 1, $U$ is an upper-triangular matrix with diagonal elements all equal to 1, and $D$ is a diagonal matrix. The transformation $Y = AX + b$ can therefore be conducted in five steps:
$$\begin{aligned}
Z_1 &= UX, \\
Z_2 &= DZ_1, \\
Z_3 &= LZ_2, \\
Z_4 &= R^{-1}Z_3, \\
Y &= Z_4 + b.
\end{aligned} \tag{4.20}$$
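The five steps can be checked numerically. The following is a minimal sketch in plain Python, assuming hypothetical $2 \times 2$ factors chosen only for illustration (none of the numbers come from the text). It builds $A = R^{-1}LDU$ from the factors, applies the steps one at a time, and compares the result with the direct computation of $AX + b$.

```python
def matmul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matvec(P, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [sum(P[i][k] * v[k] for k in range(2)) for i in range(2)]


# Hypothetical factors (illustrative values only).
R_inv = [[0.0, 1.0], [1.0, 0.0]]  # inverse of the 2x2 swap, which is the swap itself
L = [[1.0, 0.0], [0.5, 1.0]]      # unit lower-triangular
D = [[2.0, 0.0], [0.0, -3.0]]     # nonsingular diagonal
U = [[1.0, 4.0], [0.0, 1.0]]      # unit upper-triangular
b = [1.0, -2.0]
x = [0.3, -1.2]                   # one realization of X

# A = R^{-1} L D U
A = matmul(R_inv, matmul(L, matmul(D, U)))

# The five steps, applied in order.
z1 = matvec(U, x)
z2 = matvec(D, z1)
z3 = matvec(L, z2)
z4 = matvec(R_inv, z3)
y_steps = [z4[i] + b[i] for i in range(2)]

# Direct computation of A x + b.
Ax = matvec(A, x)
y_direct = [Ax[i] + b[i] for i in range(2)]

max_gap = max(abs(y_steps[i] - y_direct[i]) for i in range(2))
print(max_gap)
```

Because the two routes agree, a density result established separately for each factor can be chained through the steps to cover a general nonsingular $A$.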
Therefore, I will consider the first four cases, $A = U$, $A = D$, $A = L$, and $A = R^{-1}$ for $b = 0$, and then the case $A = I$, $b \neq 0$.
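Let $Y = AX$ with $A$ an upper-triangular matrix:
$$A = \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}. \tag{4.21}$$
Then $Y_1 = X_1 + aX_2$ and $Y_2 = X_2$; hence, the joint distribution function $H(y)$ of $Y$ is
$$H(y) = P(X_1 + aX_2 \le y_1,\ X_2 \le y_2) = \int_{-\infty}^{y_2}\left(\int_{-\infty}^{y_1 - a x_2} f(x_1, x_2)\, dx_1\right) dx_2,$$
and differentiating with respect to $y_1$ and $y_2$ gives the joint density
$$h(y) = \frac{\partial^2 H(y)}{\partial y_1\, \partial y_2} = f(y_1 - a y_2,\ y_2) = f(A^{-1}y).$$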
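Along the same lines, it follows that, if $A$ is a lower-triangular matrix with unit diagonal and off-diagonal element $a$, so that $Y_1 = X_1$ and $Y_2 = aX_1 + X_2$, then the joint density of $Y = AX$ is
$$h(y) = f(y_1,\ y_2 - a y_1) = f(A^{-1}y). \tag{4.23}$$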
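As a worked illustration of the triangular case, suppose $X_1$ and $X_2$ are independent standard exponential variables and take the upper-triangular matrix with off-diagonal element $a = 1$, so that $Y_1 = X_1 + X_2$ and $Y_2 = X_2$. The exponential density and the value $a = 1$ are assumptions made for this example only; they do not appear in the text.

```latex
\begin{aligned}
f(x_1, x_2) &= e^{-x_1 - x_2}\, \mathbf{1}(x_1 > 0,\ x_2 > 0), \qquad
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad
A^{-1} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}, \\
h(y) &= f(A^{-1} y) = f(y_1 - y_2,\ y_2) = e^{-y_1}\, \mathbf{1}(0 < y_2 < y_1), \\
h_1(y_1) &= \int_0^{y_1} e^{-y_1}\, dy_2 = y_1 e^{-y_1}, \qquad y_1 > 0.
\end{aligned}
```

The marginal density of $Y_1$ obtained in the last line is the Gamma(2, 1) density, which is the known density of the sum of two independent standard exponential variables, so the formula passes this consistency check.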
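Next, let $Y = AX$ with $A$ a nonsingular diagonal matrix,
$$A = \begin{pmatrix} a_1 & 0 \\ 0 & a_2 \end{pmatrix},$$
where $a_1 \neq 0$, $a_2 \neq 0$. Then $Y_1 = a_1 X_1$ and $Y_2 = a_2 X_2$; hence, the joint distribution function $H(y)$ is
$$H(y) = P(Y_1 \le y_1,\ Y_2 \le y_2) = P(a_1 X_1 \le y_1,\ a_2 X_2 \le y_2),$$
which, depending on the signs of $a_1$ and $a_2$, equals
$$\begin{aligned}
P(X_1 \le y_1/a_1,\ X_2 \le y_2/a_2) &= \int_{-\infty}^{y_1/a_1}\int_{-\infty}^{y_2/a_2} f(x_1, x_2)\, dx_2\, dx_1 && \text{if } a_1 > 0,\ a_2 > 0, \\
P(X_1 \le y_1/a_1,\ X_2 \ge y_2/a_2) &= \int_{-\infty}^{y_1/a_1}\int_{y_2/a_2}^{\infty} f(x_1, x_2)\, dx_2\, dx_1 && \text{if } a_1 > 0,\ a_2 < 0, \\
P(X_1 \ge y_1/a_1,\ X_2 \le y_2/a_2) &= \int_{y_1/a_1}^{\infty}\int_{-\infty}^{y_2/a_2} f(x_1, x_2)\, dx_2\, dx_1 && \text{if } a_1 < 0,\ a_2 > 0, \\
P(X_1 \ge y_1/a_1,\ X_2 \ge y_2/a_2) &= \int_{y_1/a_1}^{\infty}\int_{y_2/a_2}^{\infty} f(x_1, x_2)\, dx_2\, dx_1 && \text{if } a_1 < 0,\ a_2 < 0.
\end{aligned}$$
In each of the four cases, differentiating with respect to $y_1$ and $y_2$ yields the same density:
$$h(y) = \frac{\partial^2 H(y)}{\partial y_1\, \partial y_2} = \frac{f(y_1/a_1,\ y_2/a_2)}{|a_1 a_2|} = f(A^{-1}y)\, |\det(A^{-1})|.$$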
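The sign cases for a diagonal $A$ all lead to the single expression $h(y) = f(y_1/a_1, y_2/a_2)/|a_1 a_2|$, and this can be checked numerically. The sketch below assumes, purely for illustration, that $X_1$ and $X_2$ are independent standard normal, so that $H(y)$ has a closed form in terms of the normal distribution function. It approximates the mixed partial derivative of $H$ by central differences for each sign combination and compares it with the formula. The evaluation point, the values of $a_1$ and $a_2$, and the step size are arbitrary choices.

```python
import math


def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def H(y1, y2, a1, a2):
    """P(a1*X1 <= y1, a2*X2 <= y2) for independent standard normal X1, X2."""
    # Dividing by a negative a_i reverses the inequality, hence 1 - Phi.
    p1 = Phi(y1 / a1) if a1 > 0 else 1.0 - Phi(y1 / a1)
    p2 = Phi(y2 / a2) if a2 > 0 else 1.0 - Phi(y2 / a2)
    return p1 * p2


def h_formula(y1, y2, a1, a2):
    """Density from h(y) = f(y1/a1, y2/a2) / |a1 * a2|."""
    return phi(y1 / a1) * phi(y2 / a2) / abs(a1 * a2)


def h_numeric(y1, y2, a1, a2, eps=1e-4):
    """Central-difference approximation of the mixed partial derivative of H."""
    return (H(y1 + eps, y2 + eps, a1, a2) - H(y1 + eps, y2 - eps, a1, a2)
            - H(y1 - eps, y2 + eps, a1, a2) + H(y1 - eps, y2 - eps, a1, a2)
            ) / (4.0 * eps * eps)


# One (a1, a2) pair per sign combination.
cases = [(2.0, 0.5), (2.0, -0.5), (-2.0, 0.5), (-2.0, -0.5)]
gaps = [abs(h_numeric(0.7, 0.3, a1, a2) - h_formula(0.7, 0.3, a1, a2))
        for a1, a2 in cases]
print(gaps)
```

The absolute value in the denominator is what absorbs the sign changes: each negative $a_i$ reverses an inequality in $H(y)$, and differentiating the reversed integral contributes a factor $-1/a_i = 1/|a_i|$.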
Now consider the case $Y = AX$ where, for instance, $A$ is the inverse of a permutation matrix (a matrix that permutes the columns of the unit matrix):
$$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
Then the joint distribution function $H(y)$ of $Y$ is
$$H(y) = P(Y_1 \le y_1,\ Y_2 \le y_2) = P(X_2 \le y_1,\ X_1 \le y_2) = F(y_2, y_1) = F(Ay),$$
and the density involved is
$$h(y) = \frac{\partial^2 H(y)}{\partial y_1\, \partial y_2} = f(y_2, y_1) = f(Ay).$$
Finally, consider the case $Y = X + b$ with $b = (b_1, b_2)^T$. Then the joint distribution function $H(y)$ of $Y$ is
$$H(y) = P(Y_1 \le y_1,\ Y_2 \le y_2) = P(X_1 \le y_1 - b_1,\ X_2 \le y_2 - b_2) = F(y_1 - b_1,\ y_2 - b_2);$$
hence, the density of $Y$ is
$$h(y) = \frac{\partial^2 H(y)}{\partial y_1\, \partial y_2} = f(y_1 - b_1,\ y_2 - b_2) = f(y - b).$$
Combining these results, it is not hard to verify, using the decomposition (4.19) and the five steps in (4.20), that for the bivariate case ($k = 2$):
Theorem 4.3: Let $X$ be $k$-variate, absolutely continuously distributed with joint density $f(x)$, and let $Y = AX + b$, where $A$ is a nonsingular square matrix. Then $Y$ is $k$-variate, absolutely continuously distributed with joint density $h(y) = f(A^{-1}(y - b))\, |\det(A^{-1})|$.
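A quick numerical sanity check of Theorem 4.3 for $k = 2$ is sketched below, under the assumption (made for this example only) that $X$ is standard bivariate normal. In that case $Y = AX + b$ is known to be bivariate normal with mean $b$ and covariance matrix $AA^T$, so the density given by the theorem can be compared with the bivariate normal density evaluated directly. The matrix $A$, the vector $b$, and the evaluation point are arbitrary illustrative values.

```python
import math


def inv2(M):
    """Inverse and determinant of a nonsingular 2x2 matrix (nested lists)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    inv = [[M[1][1] / det, -M[0][1] / det],
           [-M[1][0] / det, M[0][0] / det]]
    return inv, det


def f(x):
    """Assumed density of X: standard bivariate normal."""
    return math.exp(-0.5 * (x[0] ** 2 + x[1] ** 2)) / (2.0 * math.pi)


def h(y, A, b):
    """Density of Y = AX + b via h(y) = f(A^{-1}(y - b)) |det(A^{-1})|."""
    A_inv, det = inv2(A)
    d = [y[0] - b[0], y[1] - b[1]]
    x = [A_inv[0][0] * d[0] + A_inv[0][1] * d[1],
         A_inv[1][0] * d[0] + A_inv[1][1] * d[1]]
    return f(x) * abs(1.0 / det)


def normal_density(y, mean, S):
    """Bivariate normal density with mean vector `mean` and covariance S."""
    S_inv, det_S = inv2(S)
    d = [y[0] - mean[0], y[1] - mean[1]]
    q = (d[0] * (S_inv[0][0] * d[0] + S_inv[0][1] * d[1])
         + d[1] * (S_inv[1][0] * d[0] + S_inv[1][1] * d[1]))
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det_S))


A = [[1.0, 2.0], [-1.0, 3.0]]   # nonsingular: det(A) = 5
b = [0.5, -1.0]

# Covariance matrix of Y: S = A A^T.
S = [[A[0][0] ** 2 + A[0][1] ** 2, A[0][0] * A[1][0] + A[0][1] * A[1][1]],
     [A[1][0] * A[0][0] + A[1][1] * A[0][1], A[1][0] ** 2 + A[1][1] ** 2]]

y = [1.2, 0.4]
gap = abs(h(y, A, b) - normal_density(y, b, S))
print(gap)
```

The two expressions agree because $\det(AA^T) = \det(A)^2$, so the normalizing constant $1/\sqrt{\det(AA^T)}$ of the normal density is exactly the factor $|\det(A^{-1})|$ in the theorem.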
However, this result holds for the general case as well.