A plot of the residuals of the regression is very important. The residuals are consistent estimates of the true disturbances. But unlike the u_i's, these e_i's are not independent. In fact, the OLS normal equations (3.2) and (3.3) give us two relationships between these residuals. Therefore, knowing (n - 2) of these residuals, the remaining two residuals can be deduced. If we had the true u_i's and plotted them, they should look like a random scatter around the horizontal axis with no specific pattern to them. A plot of the e_i's that shows a certain pattern, like a set of positive residuals followed by a set of negative residuals as shown in Figure 3.6(a), may be indicative of a violation of one of the 5 assumptions imposed on the model, or may simply indicate a wrong functional form. For example, if assumption 3 is violated, so that the u_i's are, say, positively correlated, then a positive residual is likely to be followed by a positive one, and a negative residual by a negative one, as observed in Figure 3.6(b). Alternatively, if we fit a linear regression line to a true quadratic relation between Y and X, then a scatter of residuals like that in Figure 3.6(c) will be generated. We will study how to deal with this violation and how to test for it in Chapter 5.
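The two points above, that the residuals satisfy two exact restrictions and that a wrong functional form leaves a visible pattern, can be illustrated with a small sketch. It assumes the normal equations take their usual form for a regression with an intercept (the residuals sum to zero and are orthogonal to the regressor). The data set and the helper names `ols_fit` and `residuals` are made up for illustration and do not come from the text.

```python
# Sketch: residual restrictions implied by the OLS normal equations,
# and the sign pattern left by fitting a line to a quadratic relation.
# Data and helper names are illustrative assumptions.

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Return the OLS residuals e_i = y_i - (a + b * x_i)."""
    a, b = ols_fit(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# The true relation is quadratic, but we fit a straight line.
x = [float(i) for i in range(1, 11)]
y = [xi ** 2 for xi in x]
e = residuals(x, y)

# The two restrictions: sum of e_i and sum of e_i * x_i are both zero.
sum_e = sum(e)
sum_ex = sum(ei * xi for ei, xi in zip(e, x))

# Deduce the last two residuals from the first n - 2 by solving
#   e_pen + e_last = -s0
#   x_pen * e_pen + x_last * e_last = -s1
s0 = sum(e[:-2])
s1 = sum(ei * xi for ei, xi in zip(e[:-2], x[:-2]))
x_pen, x_last = x[-2], x[-1]
e_last = (x_pen * s0 - s1) / (x_last - x_pen)
e_pen = -s0 - e_last

# Sign pattern of the residuals in order of x.
signs = "".join("+" if ei > 0 else "-" for ei in e)
print(signs)  # prints ++------++
```

The runs of same-signed residuals (positive at both ends, negative in the middle) are the kind of systematic pattern that a plot would reveal at a glance.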
Large residuals are indicative of bad predictions in the sample. A large residual could reflect a typo, where the researcher entered this observation wrongly. Alternatively, it could be an influential observation, or an outlier which behaves differently from the other data points in the sample and is therefore farther away from the estimated regression line than the other data points. The fact that OLS minimizes the sum of squares of these residuals means that a large weight is put on this observation and hence it is influential. In other words, removing this observation from the sample may change the estimates and the regression line significantly. For more on the study of influential observations, see Belsley, Kuh and Welsch (1980). We will focus on this issue in Chapter 8 of this book.
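As a rough illustration of how one mis-entered observation can move the fitted line, the sketch below fits a regression with and without the observation that has the largest absolute residual. The data are hypothetical: an exact line Y = 2 + 3X in which the last value, 32, is assumed to have been typed as 320. The helper `ols_fit` is an illustrative name, not something defined in the text.

```python
# Sketch: effect of dropping the observation with the largest residual.
# The data and the data-entry error are hypothetical.

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


x = [float(i) for i in range(1, 11)]
y = [2.0 + 3.0 * xi for xi in x]
y[-1] = 320.0  # assumed typo: the correct value would be 32.0

# Fit on the full sample and locate the largest absolute residual.
a_all, b_all = ols_fit(x, y)
e_all = [yi - (a_all + b_all * xi) for xi, yi in zip(x, y)]
worst = max(range(len(e_all)), key=lambda i: abs(e_all[i]))

# Refit after removing that observation.
x_drop = x[:worst] + x[worst + 1:]
y_drop = y[:worst] + y[worst + 1:]
a_drop, b_drop = ols_fit(x_drop, y_drop)

print("index of largest residual:", worst)
print("slope with / without it:", b_all, b_drop)
```

In this constructed example the suspicious point is also the one with the largest residual. That need not hold in general, since an observation with an extreme X value can pull the line toward itself and end up with a modest residual.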
Figure 3.6 Positively Correlated Residuals
Figure 3.7 Residual Variation Growing with X
One can also plot the residuals versus the X_i's. If a pattern like Figure 3.7 emerges, this could be indicative of a violation of assumption 2 because the variation of the residuals is growing with X_i when it should be constant for all observations. Alternatively, it could imply a relationship between the X_i's and the true disturbances, which is a violation of assumption 4.
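A numerical counterpart to this plot is to compare the spread of the residuals at low and high values of X. The sketch below simulates disturbances whose standard deviation is proportional to X (an assumed data-generating process chosen only for illustration) and compares the mean squared residual in the two halves of the sample. This is an informal check under those assumptions, not a formal test.

```python
# Sketch: residual variation growing with X in simulated data.
# The data-generating process and all names here are assumptions.
import random


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


rng = random.Random(0)  # fixed seed so the run is reproducible
x = [float(i) for i in range(1, 201)]
# The disturbance standard deviation is 0.5 * x, so it grows with x.
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]

a, b = ols_fit(x, y)
e = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# x is already sorted, so the first half holds the low-X observations.
half = len(x) // 2
msr_low = sum(ei ** 2 for ei in e[:half]) / half
msr_high = sum(ei ** 2 for ei in e[half:]) / half

print("mean squared residual, low X :", msr_low)
print("mean squared residual, high X:", msr_high)
```

A much larger value in the high-X half corresponds to the fan shape described for Figure 3.7.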
In summary, one should always plot the residuals to check the data, identify influential observations, and check for violations of the 5 assumptions underlying the regression model. In the next few chapters, we will study various tests for violations of the classical assumptions. Most of these tests are based on the residuals of the model. These tests, along with residual plots, should help the researcher gauge the adequacy of his or her model.
Table 3.1 Simple Regression Computations