Springer Texts in Business and Economics
Stochastic Explanatory Variables
Sections 5.5 and 5.6 will study violations of assumptions 2 and 3 in detail. This section deals with violations of assumption 4 and their effect on the properties of the OLS estimators. In this case, X is a random variable which may be (i) independent of the disturbances; (ii) contemporaneously uncorrelated with the disturbances; or (iii) simply correlated with the disturbances.
Case 1: If X is independent of u, then all the results of Chapter 3 still hold, but now they are conditional on the particular set of X’s drawn in the sample. To illustrate this result, recall that for the simple linear regression:
$\hat{\beta}_{OLS} = \beta + \sum_{i=1}^{n} w_i u_i$ where $w_i = x_i / \sum_{i=1}^{n} x_i^2$ (5.2)
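Hence, when we take expectations, $E(\sum_{i=1}^{n} w_i u_i) = \sum_{i=1}^{n} E(w_i)E(u_i) = 0$. The first equality holds because X and u are independent and the second equality holds because the u's have zero mean. In other words, the unbiasedness property of the OLS estimator still holds. However, the variance is now
$\text{var}(\hat{\beta}_{OLS}) = E(\sum_{i=1}^{n} w_i u_i)^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} E(w_i w_j)E(u_i u_j) = \sigma^2 E(\sum_{i=1}^{n} w_i^2)$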
where the last equality follows from assumptions 2 and 3, homoskedasticity and no serial correlation. The only difference between this result and that of Chapter 3 is that we have expectations on the X's rather than the X's themselves. Hence, by conditioning on the particular set of X's that are observed, we can use all the results of Chapter 3. Also, maximizing the likelihood involves both the X's and the u's. But, as long as the distribution of the X's does not involve the parameters we are estimating, i.e., $\alpha$, $\beta$ and $\sigma^2$, the same maximum likelihood estimators are obtained. Why? Because $f(x_1, x_2, \ldots, x_n, u_1, u_2, \ldots, u_n) = f_1(x_1, x_2, \ldots, x_n) f_2(u_1, u_2, \ldots, u_n)$ since the X's and the u's are independent. Maximizing $f$ with respect to $(\alpha, \beta, \sigma^2)$ is the same as maximizing $f_2$ with respect to $(\alpha, \beta, \sigma^2)$ as long as $f_1$ is not a function of these parameters.
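The Case 1 argument can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original text: it redraws a random X in every replication, independently of u, and averages the OLS slope across replications. The parameter values (alpha = 1, beta = 2, n = 30), the normal distributions, and the function names are illustrative assumptions.

```python
import random


def ols_slope(x, y):
    """OLS slope computed from deviations from the sample means."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def mean_slope_independent_x(alpha=1.0, beta=2.0, n=30, reps=4000, seed=0):
    """Average OLS slope over many samples in which X is stochastic
    but drawn independently of the disturbance u (Case 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x = [rng.gauss(5.0, 2.0) for _ in range(n)]   # random regressor
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]   # independent of x
        y = [alpha + beta * xi + ui for xi, ui in zip(x, u)]
        total += ols_slope(x, y)
    return total / reps


avg_slope = mean_slope_independent_x()
print("average OLS slope:", round(avg_slope, 3), "(true beta = 2.0)")
```

Under these assumptions the average slope should sit very close to the true beta, which is what unbiasedness with an independent stochastic X predicts.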
Case 2: Consider a simple model of consumption, where $Y_t$, current consumption, is a function of $Y_{t-1}$, consumption in the previous period. This is the case for a habit-forming consumption good like cigarette smoking. In this case our regression equation becomes
$Y_t = \alpha + \beta Y_{t-1} + u_t \qquad t = 2, \ldots, T$ (5.3)
where we lose one observation due to lagging. It is obvious that $Y_t$ is correlated with $u_t$, but the question here is whether $Y_{t-1}$ is correlated with $u_t$. After all, $Y_{t-1}$ is our explanatory variable $X_t$. As long as assumption 3 is not violated, i.e., the u's are not correlated across periods, $u_t$ represents a freshly drawn disturbance independent of previous disturbances and hence is not correlated with the already predetermined $Y_{t-1}$. This is what we mean by contemporaneously uncorrelated, i.e., $u_t$ is correlated with $Y_t$, but it is not correlated with $Y_{t-1}$. The OLS estimator of $\beta$ is
$\hat{\beta}_{OLS} = \sum_{t=2}^{T} y_t y_{t-1} / \sum_{t=2}^{T} y_{t-1}^2 = \beta + \sum_{t=2}^{T} y_{t-1} u_t / \sum_{t=2}^{T} y_{t-1}^2$ (5.4)
and the expected value of (5.4) is not $\beta$ because in general,
$E(\sum_{t=2}^{T} y_{t-1} u_t / \sum_{t=2}^{T} y_{t-1}^2) \neq E(\sum_{t=2}^{T} y_{t-1} u_t) / E(\sum_{t=2}^{T} y_{t-1}^2)$.
The expected value of a ratio is not the ratio of expected values. Also, even if $E(Y_{t-1} u_t) = 0$, one can easily show that $E(y_{t-1} u_t) \neq 0$. In fact, $y_{t-1} = Y_{t-1} - \bar{Y}$, and $\bar{Y}$ contains $Y_t$ in it, and we know that $E(Y_t u_t) \neq 0$. Hence, we lose the unbiasedness property of OLS. However, all the asymptotic properties still hold. In fact, $\hat{\beta}_{OLS}$ is consistent because
$\text{plim}\, \hat{\beta}_{OLS} = \beta + \text{cov}(Y_{t-1}, u_t) / \text{var}(Y_{t-1}) = \beta$ (5.5)
where the second equality follows from (5.4) and the fact that $\text{plim}(\sum_{t=2}^{T} y_{t-1} u_t / T)$ is $\text{cov}(Y_{t-1}, u_t)$, which is zero, and $\text{plim}(\sum_{t=2}^{T} y_{t-1}^2 / T) = \text{var}(Y_{t-1})$, which is positive and finite.
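The contrast between finite-sample bias and consistency in Case 2 can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not the author's procedure: it generates the lagged dependent variable model with beta = 0.5, alpha = 0, serially independent standard normal disturbances, and a burn-in period to remove the effect of the starting value. All names and parameter values are hypothetical choices.

```python
import random


def ols_slope(x, y):
    """OLS slope computed from deviations from the sample means."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def mean_ar1_slope(beta, T, reps, seed=0, alpha=0.0, burn_in=50):
    """Average OLS slope from regressing Y_t on Y_{t-1} over many samples
    of length T, with disturbances that are independent across periods."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        y_prev = 0.0
        for _ in range(burn_in):                 # discard start-up values
            y_prev = alpha + beta * y_prev + rng.gauss(0.0, 1.0)
        series = [y_prev]
        for _ in range(T - 1):
            series.append(alpha + beta * series[-1] + rng.gauss(0.0, 1.0))
        # Regress Y_t (series[1:]) on Y_{t-1} (series[:-1]); one observation is lost.
        total += ols_slope(series[:-1], series[1:])
    return total / reps


beta = 0.5
small_T_mean = mean_ar1_slope(beta, T=10, reps=2000)
large_T_mean = mean_ar1_slope(beta, T=400, reps=300)
print("T = 10 :", round(small_T_mean, 3))
print("T = 400:", round(large_T_mean, 3))
```

With a short sample the average estimate should fall noticeably below 0.5, while with a long sample it should be close to 0.5, in line with an estimator that is biased but consistent.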
Case 3: X and u are correlated. In this case OLS is biased and inconsistent. This can be easily deduced from (5.2) since $\text{plim}(\sum_{i=1}^{n} x_i u_i / n)$ is $\text{cov}(X, u) \neq 0$, and $\text{plim}(\sum_{i=1}^{n} x_i^2 / n)$ is positive and finite. This means that OLS is no longer a viable estimator, and an alternative estimator that corrects for this bias has to be derived. In fact, we will study three specific cases where this assumption is violated. These are: (i) the errors in measurement case; (ii) the case of a lagged dependent variable with correlated errors; and (iii) simultaneous equations.
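The deduction from (5.2) can be written out in one line. This is a sketch of the standard argument: divide the numerator and denominator of the sampling error in (5.2) by n and take probability limits, assuming both limits exist.

```latex
\[
\operatorname{plim} \hat{\beta}_{OLS}
  = \beta + \frac{\operatorname{plim}\left(\sum_{i=1}^{n} x_i u_i / n\right)}
                 {\operatorname{plim}\left(\sum_{i=1}^{n} x_i^2 / n\right)}
  = \beta + \frac{\operatorname{cov}(X, u)}{\operatorname{var}(X)}
  \neq \beta
  \quad \text{whenever } \operatorname{cov}(X, u) \neq 0 .
\]
```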
Briefly, the errors in measurement case involves a situation where the true regression model is in terms of $X_i^*$, but $X_i^*$ is measured with error, i.e., $X_i = X_i^* + v_i$, so we observe $X_i$ but not $X_i^*$. Hence, when we substitute this $X_i$ for $X_i^*$ in the regression equation, we get
$Y_i = \alpha + \beta X_i^* + u_i = \alpha + \beta X_i + (u_i - \beta v_i)$ (5.6)
where the composite error term is now correlated with $X_i$ because $X_i$ is correlated with $v_i$. After all, $X_i = X_i^* + v_i$ and $E(X_i v_i) = E(v_i^2)$ if $X_i^*$ and $v_i$ are uncorrelated.
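A small simulation can make the consequence of (5.6) concrete. The sketch below is illustrative only and rests on assumptions not stated in the text: the true regressor, the measurement error, and the disturbance are independent normal draws with unit variance, and beta = 2. Under these assumptions the standard attenuation result gives a probability limit of beta var(X*) / (var(X*) + var(v)) = 1 for the OLS slope; that formula is a well-known textbook result, not one derived in this section.

```python
import random


def ols_slope(x, y):
    """OLS slope computed from deviations from the sample means."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def slope_with_measurement_error(n, alpha=1.0, beta=2.0,
                                 sd_true=1.0, sd_noise=1.0, seed=0):
    """Regress Y on the mismeasured X = X* + v instead of the true X*."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, sd_true) for _ in range(n)]       # X*, unobserved
    x_obs = [xs + rng.gauss(0.0, sd_noise) for xs in x_true]   # X = X* + v
    y = [alpha + beta * xs + rng.gauss(0.0, 1.0) for xs in x_true]
    return ols_slope(x_obs, y)


for n in (200, 20000):
    print(n, round(slope_with_measurement_error(n), 3))
```

The estimate should stay well below the true beta = 2 even in the large sample, which is the inconsistency that the correlation between $X_i$ and the composite error implies.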
Similarly, in case (ii) above, if the u's were correlated across time, i.e., $u_{t-1}$ is correlated with $u_t$, then $Y_{t-1}$, which is a function of $u_{t-1}$, will also be correlated with $u_t$, and $E(Y_{t-1} u_t) \neq 0$. Chapter 6 discusses this further, including how to test for serial correlation in the presence of a lagged dependent variable.
Finally, consider demand and supply equations in which quantity $Q_t$ is a function of price $P_t$ in both equations:
$Q_t = \alpha + \beta P_t + u_t$ (demand) (5.7)
$Q_t = \delta + \gamma P_t + v_t$ (supply) (5.8)
The question here is whether $P_t$ is correlated with the disturbances $u_t$ and $v_t$ in both equations. The answer is yes, because (5.7) and (5.8) are two equations in two unknowns, $P_t$ and $Q_t$. Solving for these variables, one gets $P_t$ as well as $Q_t$ as a function of a constant and both $u_t$ and $v_t$. This means that $E(P_t u_t) \neq 0$ and $E(P_t v_t) \neq 0$, and OLS performed on either (5.7) or (5.8) is biased and inconsistent. We will study this simultaneous bias problem more rigorously in Chapter 11.
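The solution of (5.7) and (5.8) can be sketched explicitly. The derivation below assumes $\beta \neq \gamma$, and, for the two expectations only, that $u_t$ and $v_t$ have zero means, are uncorrelated with each other, and have variances written here as $\sigma_u^2$ and $\sigma_v^2$; these symbols and assumptions are introduced for illustration and are not stated in the text.

```latex
\begin{align*}
\alpha + \beta P_t + u_t &= \delta + \gamma P_t + v_t \\
P_t &= \frac{\alpha - \delta}{\gamma - \beta} + \frac{u_t - v_t}{\gamma - \beta} \\
Q_t &= \frac{\alpha\gamma - \beta\delta}{\gamma - \beta}
       + \frac{\gamma u_t - \beta v_t}{\gamma - \beta} \\
E(P_t u_t) &= \frac{\sigma_u^2}{\gamma - \beta} \neq 0, \qquad
E(P_t v_t) = -\frac{\sigma_v^2}{\gamma - \beta} \neq 0
\end{align*}
```

Both $P_t$ and $Q_t$ depend on both disturbances, so the price regressor is correlated with the disturbance in each equation.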
For all situations where X and u are correlated, it would be illuminating to show graphically why OLS is no longer a consistent estimator. Let us consider the case where the disturbances are, say, positively correlated with the explanatory variable. Figure 3.3 of Chapter 3 shows the true regression line $\alpha + \beta X_i$. It also shows that when $X_i$ and $u_i$ are positively correlated, an $X_i$ higher than its mean will tend to be associated with a disturbance $u_i$ above its mean, i.e., a positive disturbance. Hence, $Y_i = \alpha + \beta X_i + u_i$ will tend to lie above the true regression line whenever $X_i$ is above its mean. Similarly, $Y_i$ will tend to lie below the true regression line whenever $X_i$ is below its mean. This means that, not knowing the true regression line, a researcher fitting OLS on these data will obtain a biased intercept and slope. In fact, the intercept will be understated and the slope will be overstated. Furthermore, this bias does not disappear with more data, since any new data will be generated by the same mechanism described above. Hence these OLS estimates are inconsistent.
Similarly, if $X_i$ and $u_i$ are negatively correlated, the intercept will be overstated and the slope will be understated. This story applies to any equation with at least one of its right-hand-side variables correlated with the disturbance term. Correlation due to a lagged dependent variable with autocorrelated errors is studied in Chapter 6, whereas correlation due to the simultaneous equations problem is studied in Chapter 11.
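The graphical argument can also be reproduced numerically. The following sketch is an illustration under assumptions of our own choosing: X is normal with mean 5 and unit variance, and the disturbance is built as u = dep (X - 5) + noise, so that cov(X, u) has the sign of the hypothetical parameter dep. The true values are alpha = 1 and beta = 2. The direction of the intercept bias described in the text relies on X having a positive mean, as it does here.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) of the OLS line of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def fit_with_correlated_disturbance(n, dep, alpha=1.0, beta=2.0, seed=0):
    """Fit OLS when the disturbance depends linearly on X.

    dep > 0 makes X and u positively correlated; dep < 0 negatively.
    """
    rng = random.Random(seed)
    x = [rng.gauss(5.0, 1.0) for _ in range(n)]
    u = [dep * (xi - 5.0) + rng.gauss(0.0, 1.0) for xi in x]
    y = [alpha + beta * xi + ui for xi, ui in zip(x, u)]
    return ols_fit(x, y)


for dep in (0.5, -0.5):
    for n in (200, 20000):
        a_hat, b_hat = fit_with_correlated_disturbance(n, dep)
        print("dep =", dep, "n =", n,
              "intercept =", round(a_hat, 2), "slope =", round(b_hat, 2))
```

With positive dependence the fitted slope should exceed 2 and the intercept should fall below 1 at both sample sizes; with negative dependence the pattern reverses. Increasing n tightens the estimates around the wrong values rather than around the true ones.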