INTRODUCTION TO STATISTICS AND ECONOMETRICS
Tests for Structural Change
Suppose we have two regression regimes

(10.3.7)  y_{1t} = \alpha_1 + \beta_1 x_{1t} + u_{1t},  t = 1, 2, \ldots, T_1,  and

(10.3.8)  y_{2t} = \alpha_2 + \beta_2 x_{2t} + u_{2t},  t = 1, 2, \ldots, T_2,
where each equation satisfies the assumptions of the model (10.1.1). We
denote Vu_{1t} = \sigma_1^2 and Vu_{2t} = \sigma_2^2. In addition, we assume that \{u_{1t}\} and \{u_{2t}\} are normally distributed and independent of each other. This two-regression model is useful for analyzing the possible occurrence of a structural change from one period to another. For example, (10.3.7) may represent a relationship between y and x in the prewar period and (10.3.8) in the postwar period.
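As a numerical illustration of this setup (not part of the text), the two regimes can be simulated and estimated separately by least squares. Every number below, from the sample sizes to the "prewar" and "postwar" slopes, is invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_line(x, y):
    """Least squares (alpha, beta) for y_t = alpha + beta * x_t + u_t."""
    xstar = x - x.mean()                       # deviations from the mean
    beta = (xstar * y).sum() / (xstar ** 2).sum()
    alpha = y.mean() - beta * x.mean()
    return alpha, beta

T1, T2 = 40, 60
x1 = rng.uniform(0.0, 10.0, T1)
x2 = rng.uniform(0.0, 10.0, T2)
# A structural change: the slope shifts from 1.0 ("prewar") to 1.5 ("postwar").
y1 = 2.0 + 1.0 * x1 + rng.normal(0.0, 1.0, T1)
y2 = 2.0 + 1.5 * x2 + rng.normal(0.0, 1.0, T2)

a1, b1 = ols_line(x1, y1)
a2, b2 = ols_line(x2, y2)
```

Fitting each regime on its own is exactly what the tests of this section build on: the estimates (a1, b1) and (a2, b2) come from (10.3.7) and (10.3.8) separately.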
First, we study the test of the null hypothesis H_0: \beta_1 = \beta_2, assuming \sigma_1^2 = \sigma_2^2 under either the null or the alternative hypothesis. We can construct a Student's t statistic similar to the one defined in (10.3.5). Let \hat{\beta}_1 and \hat{\beta}_2 be the least squares estimators of \beta_1 and \beta_2 obtained from equations (10.3.7) and (10.3.8), respectively. Then, defining x_{1t}^* = x_{1t} - \bar{x}_1 and x_{2t}^* = x_{2t} - \bar{x}_2 as in (10.2.11), we have under H_0

(10.3.9)  \frac{\hat{\beta}_1 - \hat{\beta}_2}{\sqrt{\dfrac{\sigma_1^2}{\sum_{t=1}^{T_1} (x_{1t}^*)^2} + \dfrac{\sigma_2^2}{\sum_{t=1}^{T_2} (x_{2t}^*)^2}}} \sim N(0, 1).
Let \{\hat{u}_{1t}\} and \{\hat{u}_{2t}\} be the least squares residuals calculated from (10.3.7) and (10.3.8), respectively. Then (10.3.2) implies

(10.3.10)  \frac{\sum_{t=1}^{T_1} \hat{u}_{1t}^2}{\sigma_1^2} \sim \chi^2_{T_1 - 2}  and

(10.3.11)  \frac{\sum_{t=1}^{T_2} \hat{u}_{2t}^2}{\sigma_2^2} \sim \chi^2_{T_2 - 2}.
Therefore, by Theorem 1 of the Appendix, we have

(10.3.12)  \frac{\sum_{t=1}^{T_1} \hat{u}_{1t}^2}{\sigma_1^2} + \frac{\sum_{t=1}^{T_2} \hat{u}_{2t}^2}{\sigma_2^2} \sim \chi^2_{T - 4},

where we have set T_1 + T_2 = T. Since (10.3.9) and (10.3.12) are independent, we have by Definition 2 of the Appendix
(10.3.13)  \frac{\hat{\beta}_1 - \hat{\beta}_2}{\sqrt{\dfrac{\sigma_1^2}{\sum_{t=1}^{T_1} (x_{1t}^*)^2} + \dfrac{\sigma_2^2}{\sum_{t=1}^{T_2} (x_{2t}^*)^2}}} \bigg/ \sqrt{\frac{1}{T - 4}\left(\frac{\sum_{t=1}^{T_1} \hat{u}_{1t}^2}{\sigma_1^2} + \frac{\sum_{t=1}^{T_2} \hat{u}_{2t}^2}{\sigma_2^2}\right)} \sim t_{T - 4}.

Under the assumption \sigma_1^2 = \sigma_2^2, the unknown variances cancel out of (10.3.13), and it reduces to

(10.3.14)  \frac{\hat{\beta}_1 - \hat{\beta}_2}{\hat{\sigma}\sqrt{\left[\sum_{t=1}^{T_1} (x_{1t}^*)^2\right]^{-1} + \left[\sum_{t=1}^{T_2} (x_{2t}^*)^2\right]^{-1}}} \sim t_{T - 4},

where \hat{\sigma}^2 = (T - 4)^{-1}\left(\sum_{t=1}^{T_1} \hat{u}_{1t}^2 + \sum_{t=1}^{T_2} \hat{u}_{2t}^2\right). The null hypothesis can be tested using (10.3.14) in either a one-tail or a two-tail test, depending on the alternative hypothesis.
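The pooled statistic above is easy to compute directly. The following is a minimal sketch, not part of the text: the function and variable names are my own, and the simulated data (sample sizes, parameter values, seed) are invented for illustration.

```python
import numpy as np

def slope_rss_sxx(x, y):
    """OLS slope, residual sum of squares, and sum of squared x deviations."""
    xstar = x - x.mean()                       # x_t^* as in (10.2.11)
    sxx = (xstar ** 2).sum()
    beta = (xstar * y).sum() / sxx
    alpha = y.mean() - beta * x.mean()
    rss = ((y - alpha - beta * x) ** 2).sum()
    return beta, rss, sxx

def pooled_t(x1, y1, x2, y2):
    """The statistic of (10.3.14); t with T - 4 d.f. under H0: beta1 = beta2."""
    b1, rss1, sxx1 = slope_rss_sxx(x1, y1)
    b2, rss2, sxx2 = slope_rss_sxx(x2, y2)
    T = len(x1) + len(x2)
    sigma2_hat = (rss1 + rss2) / (T - 4)       # pooled variance estimate
    return (b1 - b2) / np.sqrt(sigma2_hat * (1.0 / sxx1 + 1.0 / sxx2))

# Invented data satisfying H0 (the same slope in both regimes).
rng = np.random.default_rng(1)
x1 = rng.uniform(0.0, 10.0, 30)
x2 = rng.uniform(0.0, 10.0, 30)
y1 = 1.0 + 2.0 * x1 + rng.normal(0.0, 1.0, 30)
y2 = 1.0 + 2.0 * x2 + rng.normal(0.0, 1.0, 30)
t_stat = pooled_t(x1, y1, x2, y2)
```

The statistic would then be compared with the critical values of the t distribution with T - 4 degrees of freedom.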
Before discussing the difficult problem of testing \beta_1 = \beta_2 without assuming \sigma_1^2 = \sigma_2^2, let us consider testing the null hypothesis H_0: \sigma_1^2 = \sigma_2^2. A simple test of this hypothesis can be constructed by using the chi-square variables defined in (10.3.10) and (10.3.11). Since they are independent of each other because \{u_{1t}\} and \{u_{2t}\} are independent, we have by Definition 3 of the Appendix
(10.3.15)  \frac{\sigma_2^2 (T_2 - 2) \sum_{t=1}^{T_1} \hat{u}_{1t}^2}{\sigma_1^2 (T_1 - 2) \sum_{t=1}^{T_2} \hat{u}_{2t}^2} \sim F(T_1 - 2, T_2 - 2).

Note that \sigma_1^2 and \sigma_2^2 drop out of the formula above under the null hypothesis \sigma_1^2 = \sigma_2^2. A one-tail or a two-tail test should be used, depending on the alternative hypothesis.
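With the variances imposed equal under H_0, the statistic is just a ratio of the two unbiased variance estimates. A sketch (names and simulated data are my own, chosen so that H_0 holds):

```python
import numpy as np

def rss(x, y):
    """Residual sum of squares from the bivariate least squares fit."""
    xstar = x - x.mean()
    beta = (xstar * y).sum() / (xstar ** 2).sum()
    alpha = y.mean() - beta * x.mean()
    return ((y - alpha - beta * x) ** 2).sum()

def variance_ratio(x1, y1, x2, y2):
    """The statistic of (10.3.15) with sigma1^2 = sigma2^2 imposed:
    [rss1/(T1-2)] / [rss2/(T2-2)] ~ F(T1-2, T2-2) under H0."""
    T1, T2 = len(x1), len(x2)
    return (rss(x1, y1) / (T1 - 2)) / (rss(x2, y2) / (T2 - 2))

# Invented data with equal error variances, so H0 is true.
rng = np.random.default_rng(2)
x1 = rng.uniform(0.0, 10.0, 30)
x2 = rng.uniform(0.0, 10.0, 30)
y1 = 1.0 + 2.0 * x1 + rng.normal(0.0, 1.0, 30)
y2 = 3.0 + 0.5 * x2 + rng.normal(0.0, 1.0, 30)
F = variance_ratio(x1, y1, x2, y2)
```

Note that the intercepts and slopes may differ across regimes without affecting this test; only the error variances matter.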
Finally, we consider a test of the null hypothesis H_0: \beta_1 = \beta_2 without assuming \sigma_1^2 = \sigma_2^2. The difficulty of this situation arises from the fact that (10.3.14) cannot be derived from (10.3.13) without assuming \sigma_1^2 = \sigma_2^2. Several procedures are available to cope with this so-called Behrens-Fisher problem, but we shall present only one: the method proposed by Welch (1938). For other methods see Kendall and Stuart (1973).
Welch's method is based on the assumption that the following is approximately true when appropriate degrees of freedom, denoted \nu, are chosen:

(10.3.16)  \frac{\hat{\beta}_1 - \hat{\beta}_2}{\sqrt{\dfrac{\hat{\sigma}_1^2}{\sum_{t=1}^{T_1} (x_{1t}^*)^2} + \dfrac{\hat{\sigma}_2^2}{\sum_{t=1}^{T_2} (x_{2t}^*)^2}}} \sim t_\nu,

where \hat{\sigma}_1^2 = (T_1 - 2)^{-1} \sum_{t=1}^{T_1} \hat{u}_{1t}^2 and \hat{\sigma}_2^2 = (T_2 - 2)^{-1} \sum_{t=1}^{T_2} \hat{u}_{2t}^2. The assumption that (10.3.16) is approximately true is equivalent to the assumption that \nu\xi, where \xi is defined by

(10.3.17)  \xi = \frac{\dfrac{\hat{\sigma}_1^2}{\sum_{t=1}^{T_1} (x_{1t}^*)^2} + \dfrac{\hat{\sigma}_2^2}{\sum_{t=1}^{T_2} (x_{2t}^*)^2}}{\dfrac{\sigma_1^2}{\sum_{t=1}^{T_1} (x_{1t}^*)^2} + \dfrac{\sigma_2^2}{\sum_{t=1}^{T_2} (x_{2t}^*)^2}},

is approximately distributed as \chi^2_\nu for an appropriately chosen value of \nu. Then we can apply Definition 2 of the Appendix to (10.3.9) and (10.3.17) to obtain (10.3.16).
The remaining question, therefore, is how we should determine the degrees of freedom \nu in such a way that \nu\xi is approximately \chi^2_\nu. Since E\xi = 1 and since E\chi^2_\nu = \nu by Theorem 2 of the Appendix, we have E\nu\xi = E\chi^2_\nu. We now equate the variances of \nu\xi and \chi^2_\nu. Noting that V\hat{\sigma}_i^2 = 2\sigma_i^4 / (T_i - 2) for i = 1, 2, we have

(10.3.18)  V\xi = \frac{\dfrac{2\sigma_1^4}{(T_1 - 2)\left[\sum_{t=1}^{T_1} (x_{1t}^*)^2\right]^2} + \dfrac{2\sigma_2^4}{(T_2 - 2)\left[\sum_{t=1}^{T_2} (x_{2t}^*)^2\right]^2}}{\left[\dfrac{\sigma_1^2}{\sum_{t=1}^{T_1} (x_{1t}^*)^2} + \dfrac{\sigma_2^2}{\sum_{t=1}^{T_2} (x_{2t}^*)^2}\right]^2}.

Since V\chi^2_\nu = 2\nu by Theorem 2 of the Appendix, the variance of \nu\xi equals that of \chi^2_\nu when \nu^2 V\xi = 2\nu; therefore we should determine \nu by \nu = 2(V\xi)^{-1}. In practice, \nu must be estimated by inserting \hat{\sigma}_1^2 and \hat{\sigma}_2^2 into the right-hand side of (10.3.18) and then choosing the integer that most closely satisfies \nu = 2(V\xi)^{-1}.
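The rule \nu = 2(V\xi)^{-1}, with the \sigma_i^2 replaced by their estimates, can be sketched as a small function. Here sxx stands for \sum (x^*)^2, and the names are my own; in the symmetric case (equal samples, equal estimated components) the formula reduces to \nu = 2(T - 2).

```python
def welch_df(rss1, sxx1, T1, rss2, sxx2, T2):
    """Estimated Welch degrees of freedom nu = 2 / V(xi), with sigma_i^2
    in (10.3.18) replaced by sigma_i^2 hat = rss_i / (T_i - 2)."""
    a = (rss1 / (T1 - 2)) / sxx1          # first variance component
    b = (rss2 / (T2 - 2)) / sxx2          # second variance component
    v_xi = 2.0 * (a ** 2 / (T1 - 2) + b ** 2 / (T2 - 2)) / (a + b) ** 2
    return 2.0 / v_xi                     # round to the nearest integer in use

# Symmetric check: equal samples and equal components give nu = 2 * (T - 2).
nu = welch_df(10.0, 5.0, 12, 10.0, 5.0, 12)
```

The estimated \nu always lies between the smaller of T_1 - 2 and T_2 - 2 and their sum T - 4, so Welch's test is never more generous with degrees of freedom than the pooled test (10.3.14).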
EXERCISES
1. (Section 10.2.2)
Following the proof of the best linear unbiasedness of \hat{\beta}, prove the same for \hat{\alpha}.
2. (Section 10.2.2)
In the model (10.1.1) obtain the constrained least squares estimator of \beta, denoted \tilde{\beta}, based on the assumption \alpha = \beta. That is to say, \tilde{\beta} minimizes \sum_{t=1}^{T}(y_t - \beta - \beta x_t)^2. Derive its mean squared error without assuming that \alpha = \beta. Show that if in fact \alpha = \beta, the mean squared error of \tilde{\beta} is smaller than that of the least squares estimator \hat{\beta}.
3. (Section 10.2.2)
In the model (10.1.1) assume that \alpha = 0, \beta = 1, T = 3, and x_t = t for t = 1, 2, and 3. Also assume that \{u_t\}, t = 1, 2, and 3, are i.i.d. with the distribution P(u_t = 1) = P(u_t = -1) = 0.5. Obtain the mean and mean squared error of the reverse least squares estimator (minimizing the sum of squares of the deviations in the direction of the x-axis) defined by \hat{\beta}_R = \sum_{t=1}^{T} y_t^2 / \sum_{t=1}^{T} y_t x_t and compare them with those of the least squares estimator \hat{\beta} = \sum_{t=1}^{T} y_t x_t / \sum_{t=1}^{T} x_t^2. Create your own data by generating \{u_t\} according to the above scheme and calculate \hat{\beta}_R and \hat{\beta} for T = 25 and T = 50.
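The simulation part of the exercise might be sketched as follows; this is only an illustration, and the seed and function name are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def both_estimators(T):
    """Generate one sample as in Exercise 3 and return the
    (least squares, reverse least squares) slope estimates."""
    x = np.arange(1, T + 1, dtype=float)       # x_t = t
    u = rng.choice([-1.0, 1.0], size=T)        # P(u=1) = P(u=-1) = 0.5
    y = x + u                                  # alpha = 0, beta = 1
    beta_ls = (y * x).sum() / (x ** 2).sum()
    beta_rev = (y ** 2).sum() / (y * x).sum()
    return beta_ls, beta_rev

for T in (25, 50):
    print(T, both_estimators(T))
```

Both estimates should fall close to the true slope \beta = 1 for these sample sizes.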
4. (Section 10.2.4)
Give an example of a sequence that satisfies (10.2.64) but not (10.2.74).
5. (Section 10.2.4)
Suppose that y_t = y_t^* + u_t and x_t = x_t^* + v_t, t = 1, 2, \ldots, T, where \{y_t^*\} and \{x_t^*\} are unknown constants, \{y_t\} and \{x_t\} are observable random variables, and \{u_t\} and \{v_t\} are unobservable random variables. Assume (u_t, v_t) is a bivariate i.i.d. random variable with mean zero, variances \sigma_u^2 and \sigma_v^2, and covariance \sigma_{uv}. The problem is to estimate the unknown parameter \beta in the relationship y_t^* = \beta x_t^*, t = 1, 2, \ldots, T, on the basis of observations \{y_t\} and \{x_t\}. Obtain the probability limit of \hat{\beta} = \sum_{t=1}^{T} y_t x_t / \sum_{t=1}^{T} x_t^2, assuming \lim_{T \to \infty} T^{-1} \sum_{t=1}^{T} (x_t^*)^2 = c. This is known as an errors-in-variables model.
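The exercise asks for an analytic probability limit; purely as a numerical illustration of the attenuation it produces, one can simulate the model and watch the least squares estimate settle below \beta when \sigma_v^2 > 0. All parameter values below are invented for the sketch.

```python
import numpy as np

# Invented setup: beta = 2, x* uniform on (0, 3) so that
# T^{-1} sum (x*)^2 -> c = 3, sigma_u^2 = sigma_v^2 = 1, sigma_uv = 0.
# With these values, plim beta_hat = beta * c / (c + sigma_v^2) = 1.5 < 2.
rng = np.random.default_rng(3)
T = 200_000
beta = 2.0
x_star = rng.uniform(0.0, 3.0, T)
y = beta * x_star + rng.normal(0.0, 1.0, T)    # y_t = y_t^* + u_t
x = x_star + rng.normal(0.0, 1.0, T)           # x_t = x_t^* + v_t
beta_hat = (y * x).sum() / (x ** 2).sum()
```

The simulated estimate approaches 1.5 rather than the true value 2, illustrating the bias toward zero caused by measurement error in x.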
6. (Section 10.2.4)
In the model of the preceding exercise, assume also that \lim_{T \to \infty} T^{-1} \sum_{t=1}^{T} x_t^* = d \neq 0 and obtain the probability limit of \tilde{\beta} = \sum_{t=1}^{T} y_t / \sum_{t=1}^{T} x_t.
7. (Section 10.2.4)
Consider a bivariate regression model y_t = \alpha + \beta x_t + u_t, t = 1, 2, \ldots, T, where \{x_t\} are known constants and \{u_t\} are i.i.d. with Eu_t = 0 and Vu_t = \sigma^2. Arrange \{x_t\} in ascending order and define x_{(1)} \le x_{(2)} \le \ldots \le x_{(T)}. Let S be T/2 if T is even and (T + 1)/2 if T is odd. Also define

\bar{x}_1 = S^{-1} \sum_{t=1}^{S} x_{(t)}  and  \bar{x}_2 = (T - S)^{-1} \sum_{t=S+1}^{T} x_{(t)},

and define \bar{y}_1 and \bar{y}_2 analogously from the values of y_t corresponding to the ordered values of x_t, where we assume \lim_{T \to \infty} \bar{x}_1 = c < \lim_{T \to \infty} \bar{x}_2 = d < \infty. Prove the consistency of \tilde{\beta} and \tilde{\alpha} defined by

\tilde{\beta} = \frac{\bar{y}_2 - \bar{y}_1}{\bar{x}_2 - \bar{x}_1}  and  \tilde{\alpha} = \bar{y}_2 - \tilde{\beta} \bar{x}_2.

Are these estimators better or worse than the least squares estimators \hat{\beta} and \hat{\alpha}? Explain.
8. (Section 10.2.6)
Consider a bivariate regression model y_t = \alpha + \beta x_t + u_t, t = 1, 2, \ldots, 5, where \{x_t\} are known constants and equal to (2, 0, 2, 0, 4) and \{u_t\} are i.i.d. with Eu_t = 0 and Vu_t = \sigma^2. We wish to predict y_5 on the basis of observations (y_1, y_2, y_3, y_4). We consider two predictors of y_5:

(1) \hat{y}_5 = \hat{\alpha} + \hat{\beta} x_5, where \hat{\alpha} and \hat{\beta} are the least squares estimators based on the first four observations on \{x_t\} and \{y_t\};

(2) \tilde{y}_5 = \tilde{\alpha} + \tilde{\alpha} x_5, where \tilde{\alpha} = \frac{\sum_{t=1}^{4} (1 + x_t) y_t}{\sum_{t=1}^{4} (1 + x_t)^2}.

Obtain the mean squared prediction errors of the two predictors. For what values of \alpha and \beta is \tilde{y}_5 preferred to \hat{y}_5?
9. (Section 10.3.1)
Test the hypothesis that there is no gender difference in the wage rate by estimating the regression model

y_i = \alpha + \beta x_i + u_i,  i = 1, 2, \ldots, n,

where y_i is the wage rate (dollars per hour) of the ith person and x_i = 1 or 0, depending on whether the ith person is male or female. We assume that \{u_i\} are i.i.d. N(0, \sigma^2). The data are given in the following table:
           Number       Sample mean     Sample variance
           of people    of wage rate    of wage rate
Male       20           5               3.75
Female     10           4               3.00
10. (Section 10.3.2)
The accompanying table gives the annual U.S. data on hourly wage rates (y) and labor productivity (x) in two periods: Period 1, 1972-1979; and Period 2, 1980-1986. (Source: Economic Report of the President, Government Printing Office, Washington, D.C., 1992.)
Period 1
y:  3.70   3.94   4.24   4.53   4.86   5.25    5.69   6.16
x: 92.60  95.00  93.30  95.50  98.30  99.80  100.40  99.30

Period 2
y:  6.66   7.25    7.68    8.02    8.32    8.57    8.76
x: 98.60  99.90  100.00  102.20  104.60  106.10  108.30
(a) Calculate the linear regression of y on x for each period and test whether the two lines differ in slope, assuming that the error variances are the same in both regressions.
(b) Test the equality of the error variances.
(c) Test the equality of the slope coefficients without assuming the equality of the variances.
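Purely as a computational aid for parts (a)-(c), the three statistics of this section can be evaluated on the table's data as follows. The helper names are my own, and the resulting values must still be compared with the appropriate t and F critical values by the reader.

```python
import numpy as np

# Data transcribed from the table (y = hourly wage, x = labor productivity).
y1 = np.array([3.70, 3.94, 4.24, 4.53, 4.86, 5.25, 5.69, 6.16])
x1 = np.array([92.60, 95.00, 93.30, 95.50, 98.30, 99.80, 100.40, 99.30])
y2 = np.array([6.66, 7.25, 7.68, 8.02, 8.32, 8.57, 8.76])
x2 = np.array([98.60, 99.90, 100.00, 102.20, 104.60, 106.10, 108.30])

def fit(x, y):
    """OLS slope, residual sum of squares, and sum of squared x deviations."""
    xstar = x - x.mean()
    sxx = (xstar ** 2).sum()
    beta = (xstar * y).sum() / sxx
    rss = ((y - y.mean() - beta * xstar) ** 2).sum()
    return beta, rss, sxx

b1, rss1, sxx1 = fit(x1, y1)
b2, rss2, sxx2 = fit(x2, y2)
T1, T2 = len(x1), len(x2)

# (a) Pooled t statistic (10.3.14); t with T1 + T2 - 4 d.f. under H0.
s2 = (rss1 + rss2) / (T1 + T2 - 4)
t_pooled = (b1 - b2) / np.sqrt(s2 * (1.0 / sxx1 + 1.0 / sxx2))

# (b) Variance-ratio statistic (10.3.15); F(T1 - 2, T2 - 2) under H0.
F = (rss1 / (T1 - 2)) / (rss2 / (T2 - 2))

# (c) Welch t statistic (10.3.16) with estimated d.f. nu = 2 / V(xi).
a = (rss1 / (T1 - 2)) / sxx1
b = (rss2 / (T2 - 2)) / sxx2
t_welch = (b1 - b2) / np.sqrt(a + b)
nu = (a + b) ** 2 / (a ** 2 / (T1 - 2) + b ** 2 / (T2 - 2))
```

The estimated degrees of freedom nu necessarily falls between min(T1 - 2, T2 - 2) and T1 + T2 - 4, and in practice would be rounded to the nearest integer before looking up the t critical value.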