A Companion to Theoretical Econometrics
Artificial Regressions
Russell Davidson and James G. MacKinnon
All popular nonlinear estimation methods, including nonlinear least squares (NLS), maximum likelihood (ML), and the generalized method of moments (GMM), yield estimators that are asymptotically linear. Provided the sample size is large enough, the behavior of these nonlinear estimators in the neighborhood of the true parameter values closely resembles the behavior of the ordinary least squares (OLS) estimator. A particularly illuminating way to see the relationship between a nonlinear estimation method and OLS is to formulate the artificial regression that corresponds to the nonlinear estimator.
An artificial regression is a linear regression in which the regressand and regressors are constructed as functions of the data and parameters of the nonlinear model that is really of interest. In addition to helping us understand the asymptotic properties of nonlinear estimators, artificial regressions are often extremely useful as calculating devices. Among other things, they can be used to estimate covariance matrices, to serve as key ingredients of nonlinear optimization methods, to compute one-step efficient estimators, and to calculate test statistics.
In Section 2, we discuss the defining properties of an artificial regression. In Section 3, we introduce the Gauss-Newton regression, which is probably the most popular artificial regression. Then, in Section 4, we illustrate a number of uses of artificial regressions, using the Gauss-Newton regression as an example. In Section 5, we develop the most important use of artificial regressions, namely, hypothesis testing. We go beyond the Gauss-Newton regression in Sections 6 and 7, in which we introduce two quite generally applicable artificial regressions, one for models estimated by maximum likelihood, and one for models estimated by the generalized method of moments. Section 8 shows how artificial regressions may be modified to take account of the presence of heteroskedasticity of unknown form. Then, in Sections 9 and 10, we discuss double-length regressions and artificial regressions for binary response models, respectively.
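As a concrete preview of the idea, the following Python sketch builds the Gauss-Newton regression for a small NLS problem. It is an illustration under stated assumptions, not the chapter's own code: the exponential model, the simulated data, the starting values, and the helper names `x_fn`, `jacobian`, and `gnr` are all hypothetical. The regressand is the vector of residuals and the regressors are the derivatives of the regression function, both evaluated at a given parameter vector. Running the regression repeatedly serves as a simple optimization step; running it at the converged estimates should give coefficients that are numerically zero, along with an OLS covariance matrix that can serve as an estimate of the NLS covariance matrix.

```python
import numpy as np

def x_fn(beta, z):
    """Hypothetical regression function x(beta) = beta[0] * exp(beta[1] * z)."""
    return beta[0] * np.exp(beta[1] * z)

def jacobian(beta, z):
    """n x 2 matrix of derivatives of x(beta) with respect to beta."""
    e = np.exp(beta[1] * z)
    return np.column_stack([e, beta[0] * z * e])

def gnr(beta, y, z):
    """Gauss-Newton regression at beta: OLS of residuals on the Jacobian."""
    r = y - x_fn(beta, z)                 # regressand: residuals at beta
    X = jacobian(beta, z)                 # regressors: derivatives at beta
    b, *_ = np.linalg.lstsq(X, r, rcond=None)
    resid = r - X @ b
    s2 = resid @ resid / (len(y) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)     # usual OLS covariance matrix
    return b, cov

# Simulated data (assumed purely for illustration).
rng = np.random.default_rng(0)
z = np.linspace(0.0, 2.0, 200)
beta_true = np.array([1.5, 0.8])
y = x_fn(beta_true, z) + 0.1 * rng.standard_normal(z.size)

# Use the artificial regression as the step of a plain Gauss-Newton
# iteration (no line search, so a reasonable starting value is assumed).
beta = np.array([1.0, 1.0])
for _ in range(50):
    b, cov = gnr(beta, y, z)
    beta = beta + b
    if np.max(np.abs(b)) < 1e-10:
        break

# At the NLS estimates the GNR coefficients vanish, which is just the
# first-order conditions; cov_hat estimates the covariance of beta.
b_hat, cov_hat = gnr(beta, y, z)
print("NLS estimates:   ", beta)
print("GNR coefficients:", b_hat)
print("standard errors: ", np.sqrt(np.diag(cov_hat)))
```

The same function is used both as an optimization step and as a covariance matrix estimator, which is the sense in which an artificial regression acts as a calculating device.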