Least Squares Estimation
As explained in Chapter 3, least squares minimizes the residual sum of squares where the residuals are now given by ei = Yi — 2 — £K=2 PkXki and 2 and (3k denote guesses on the regression parameters a and вk, respectively. The residual sum of squares
RSS = £"=! e? = £!=!(£ — a — 32X2i —.. — вкXKi)2
is minimized by the following K first-order conditions:
3(£i=i e?)/da = —2 £"=! ei = 0 d(£7=1 e?)/d@k = —2 7=1 eiXki = 0, for k = 2,...,K. or, equivalently
ST=1 Yi = an + @2^7=1 X2i + .. + вК ЕІ=1 Xki
£7=i YiX2i = a £7=1 X2i + в2 E7=1 X2i + .. + вК E7=1 X2iXKi
£7=1 YiXKi = a £7=1 XKi + в2 E7=1 X2iXKi + .. + @kYa=1 XKi
B. H. Baltagi, Econometrics, Springer Texts in Business and Economics, DOI 10.1007/978-3-642-20059-5_4, © Springer-Verlag Berlin Heidelberg 2011
where the first equation multiplies the regression equation by the constant and sums, the second equation multiplies the regression equation by X2 and sums, and the K-th equation multiplies the regression equation by XK and sums. ^™=1 ui = 0 and 52™=1 uiXki = 0 for k = 2,...,K are implicitly imposed to arrive at (4.3). Solving these K equations in K unknowns, we get the OLS estimators. This can be done more succinctly in matrix form, see Chapter 7. Assumptions 1-4 insure that the OLS estimator is BLUE. Assumption 5 introduces normality and as a result the OLS estimator is also (i) a maximum likelihood estimator, (ii) it is normally distributed, and (iii) it is minimum variance unbiased. Normality also allows test of hypotheses. Without the normality assumption, one has to appeal to the Central Limit Theorem and the fact that the sample is large to perform hypotheses testing.
In order to make sure we can solve for the OLS estimators in (4.3) we need to impose one further assumption on the model besides those considered in Chapter 3.
Assumption 6: No perfect multicollinearity, i. e., the explanatory variables are not perfectly correlated with each other. This assumption states that, no explanatory variable Xk for k = 2,...,K is a perfect linear combination of the other X’s. If assumption 6 is violated, then one of the equations in (4.2) or (4.3) becomes redundant and we would have K — 1 linearly independent equations in K unknowns. This means that we cannot solve uniquely for the OLS estimators of the K coefficients.
Example 1: If X2i = 3X4i — 2X5i + X7i for i = 1,...,n, then multiplying this relationship by ei and summing over i we get
521=1 X2iei = 3 n=1 X4iei — 2 52 n=1X5iei + 527=1 X7iei.
This means that the second OLS normal equation in (4.2) can be represented as a perfect linear combination of the fourth, fifth and seventh OLS normal equations. Knowing the latter three equations, the second equation adds no new information. Alternatively, one could substitute this relationship in the original regression equation (4.1). After some algebra, X2 would be eliminated and the resulting equation becomes:
Yi = a + e3X3i + (3в2 + в 4 )X4i + (в 5 — 2^2)X5i + P 6X6i + (P 2 + Pl)X7i (4.4)
+ .. + P K XKi + Щ.
Note that the coefficients of X4i, X5i and X7i are now (3в2 + в4), (в5 — 2в2) and (в2 + в7), respectively. All of which are contaminated by в2. These linear combinations of в2, в4, в5 and в7 can be estimated from regression (4.4) which excludes X2i. In fact, the other X’s, not contaminated by this perfect linear relationship, will have coefficients that are not contaminated by в2 and hence are themselves estimable using OLS. However, в2, в4, в5 and в7 cannot be estimated separately. Perfect multicollinearity means that we cannot separate the influence on Y of the independent variables that are perfectly related. Hence, assumption 6 of no perfect multicollinearity is needed to guarantee a unique solution of the OLS normal equations. Note that it applies to perfect linear relationships and does not apply to perfect non-linear relationships among the independent variables. In other words, one can include X1i and Xfi like (years of experience) and (years of experience)2 in an equation explaining earnings of individuals. Although, there is a perfect quadratic relationship between these independent variables, this is not a perfect linear relationship and therefore, does not cause perfect multicollinearity.