A COMPANION TO Theoretical Econometrics
The Linear Regression Model with Measurement Error
The standard linear multiple regression model can be written as
$$y = \Xi\beta + \varepsilon, \tag{8.1}$$
where $y$ is an observable $N$-vector, $\varepsilon$ is an unobservable $N$-vector of random variables whose elements are independently and identically distributed (iid) with zero expectation and variance $\sigma^2$, and $N$ is the sample size. The $g$-vector $\beta$ is fixed but unknown. The $N \times g$ matrix $\Xi$ contains the regressors, which are assumed to be independent of $\varepsilon$. For simplicity, variables are assumed to be measured in deviations from their means.[9]
If there are measurement errors in the explanatory variables, $\Xi$ is not observed. Instead, we observe the matrix $X$:
$$X = \Xi + V, \tag{8.2}$$
where $V$ ($N \times g$) is a matrix of measurement errors. Its rows are assumed to be iid with zero expectation and covariance matrix $\Omega$ ($g \times g$), and to be independent of $\Xi$ and $\varepsilon$. Columns of $V$ (and the corresponding rows and columns of $\Omega$) are zero when the corresponding regressors are measured without error.
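As a rough illustration of this setup, the following sketch generates data from the regression model and the measurement equation. The names `Xi`, `V`, `Omega`, `beta`, and `sigma`, all numerical values, and the choice of normal distributions are assumptions made for the example only; the text itself does not specify them. The third regressor is taken to be error-free, so the third row and column of `Omega` are zero.

```python
import numpy as np

rng = np.random.default_rng(0)

N, g = 50_000, 3                       # illustrative sample size and number of regressors
beta = np.array([1.0, -0.5, 2.0])      # hypothetical coefficient vector
sigma = 1.0                            # hypothetical std. dev. of the disturbance

# Omega: covariance matrix of one row of V (values assumed for illustration).
# The third regressor is measured without error, so the third row and
# column of Omega are zero.
Omega = np.array([[0.5, 0.1, 0.0],
                  [0.1, 0.3, 0.0],
                  [0.0, 0.0, 0.0]])

# True regressors, drawn at random here (structural interpretation),
# expressed in deviations from their column means.
Xi = rng.normal(size=(N, g))
Xi -= Xi.mean(axis=0)

eps = rng.normal(scale=sigma, size=N)  # iid disturbances, independent of Xi

# Measurement errors: draw only the columns with nonzero variance and
# leave the error-free column exactly zero.
V = np.zeros((N, g))
V[:, :2] = rng.multivariate_normal(np.zeros(2), Omega[:2, :2], size=N)

y = Xi @ beta + eps                    # equation (8.1)
X = Xi + V                             # equation (8.2): what is actually observed

# The sample covariance of the rows of V should be close to Omega.
print(np.round(V.T @ V / N, 2))
```

In practice only `y` and `X` would be available to the analyst; `Xi`, `V`, and `eps` are visible here only because the data are simulated.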
We consider the consequences of neglecting the measurement errors. Let
$$b = (X'X)^{-1}X'y$$
$$s^2 = \frac{1}{N-g}(y - Xb)'(y - Xb) = \frac{1}{N-g}\, y'\bigl(I_N - X(X'X)^{-1}X'\bigr)y$$
be the ordinary least squares (OLS) estimators of $\beta$ and $\sigma^2$. Substitution of (8.2) into (8.1) yields
$$y = (X - V)\beta + \varepsilon = X\beta + u, \tag{8.3}$$
with $u = \varepsilon - V\beta$. This means that the disturbance term in (8.3) shares a stochastic component ($V$) with the regressor matrix. Thus $u$ is correlated with $X$, and $E(u \mid X) \neq 0$. This lack of orthogonality violates a crucial assumption underlying ordinary least squares regression. As we shall see below, the main consequence is that $b$ and $s^2$ are no longer consistent estimators of $\beta$ and $\sigma^2$. In order to analyze the inconsistency, let $S_\Xi \equiv \Xi'\Xi/N$ and $S_X \equiv X'X/N$. Note that $S_X$ is observable but $S_\Xi$ is not.
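The failure of consistency can be seen in a small simulation. The sketch below is only an illustration under assumed values: two independent standard normal regressors, the first measured with unit-variance error and the second measured exactly, coefficients (2, 1), and disturbance variance 1. The helper names `ols` and `simulate` are hypothetical. It applies the OLS formulas for the coefficient vector and the variance estimator to the mismeasured regressors at increasing sample sizes.

```python
import numpy as np

def ols(X, y):
    """Return the OLS coefficient vector b and the variance estimate s2."""
    N, g = X.shape
    b = np.linalg.solve(X.T @ X, X.T @ y)   # b = (X'X)^{-1} X'y
    resid = y - X @ b
    s2 = resid @ resid / (N - g)            # s2 = (y - Xb)'(y - Xb) / (N - g)
    return b, s2

def simulate(N, beta, sigma, omega_diag, rng):
    """Draw one sample with measurement error and return OLS b and s2.

    omega_diag holds the (assumed diagonal) measurement error variances;
    a zero entry means that regressor is observed without error.
    """
    g = len(beta)
    Xi = rng.normal(size=(N, g))                       # true regressors
    eps = rng.normal(scale=sigma, size=N)              # disturbances
    V = rng.normal(size=(N, g)) * np.sqrt(omega_diag)  # measurement errors
    y = Xi @ beta + eps
    X = Xi + V
    # Work in deviations from means, as in the text.
    X = X - X.mean(axis=0)
    y = y - y.mean()
    return ols(X, y)

rng = np.random.default_rng(1)
beta = np.array([2.0, 1.0])          # hypothetical true coefficients
sigma = 1.0                          # hypothetical disturbance std. dev.
omega_diag = np.array([1.0, 0.0])    # only the first regressor is mismeasured

for N in (1_000, 20_000, 200_000):
    b, s2 = simulate(N, beta, sigma, omega_diag, rng)
    print(N, np.round(b, 3), round(float(s2), 3))
```

In this particular design the estimate of the first coefficient stays well below its true value of 2 however large the sample becomes, and the variance estimate stays well above 1. Because the two regressors are drawn independently here, the error-free second coefficient is still estimated accurately; with correlated regressors that need not be the case.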
We can interpret (8.1) in two ways: as either a functional or a structural model. Under the former interpretation, we do not make explicit assumptions regarding the distribution of $\Xi$, but consider its elements as unknown fixed parameters. Under the latter interpretation, the elements of $\Xi$ are supposed to be random variables. For both cases, we assume $\operatorname{plim}_{N\to\infty} S_\Xi = \Sigma_\Xi$, with $\Sigma_\Xi$ a positive definite $g \times g$ matrix. Hence,
$$\operatorname{plim}_{N\to\infty} S_X = \Sigma_X \equiv \Sigma_\Xi + \Omega.$$
Note that since $\Sigma_\Xi$ is positive definite and $\Omega$ is positive semidefinite, $\Sigma_X$ is also positive definite.
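The step from the assumption on the true regressors to the limit of the observable moment matrix can be spelled out as follows. This is a sketch rather than a full proof. Here $\Xi$ denotes the matrix of true regressors, $\Sigma_\Xi$ the probability limit of $\Xi'\Xi/N$, $\Omega$ the covariance matrix of the rows of the measurement error matrix $V$, and $\Sigma_X$ the probability limit of $X'X/N$. The argument assumes that the cross-product terms vanish in probability, which relies on $V$ having zero mean and being independent of $\Xi$, together with mild regularity conditions that the text does not state explicitly.

```latex
\begin{align*}
S_X &= \frac{1}{N} X'X
     = \frac{1}{N}(\Xi + V)'(\Xi + V)
     = \frac{1}{N}\Xi'\Xi + \frac{1}{N}\Xi'V + \frac{1}{N}V'\Xi + \frac{1}{N}V'V, \\
\operatorname{plim}_{N\to\infty} \frac{1}{N}\Xi'\Xi &= \Sigma_\Xi, \qquad
\operatorname{plim}_{N\to\infty} \frac{1}{N}\Xi'V = 0, \qquad
\operatorname{plim}_{N\to\infty} \frac{1}{N}V'V = \Omega, \\
\operatorname{plim}_{N\to\infty} S_X &= \Sigma_\Xi + \Omega = \Sigma_X. \\
\intertext{Positive definiteness follows because, for any $a \neq 0$,}
a'\Sigma_X a &= \underbrace{a'\Sigma_\Xi a}_{>\,0} + \underbrace{a'\Omega a}_{\geq\,0} > 0.
\end{align*}
```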