A COMPANION TO Theoretical Econometrics
Nonparametric Kernel Methods of Estimation and Hypothesis Testing
Over the last five decades, much research in empirical and theoretical econometrics has centered on the estimation and testing of various econometric functions. Examples include the regression functions used to study consumption and production functions, the heteroskedasticity functions used to study the variability or volatility of financial returns, the autocorrelation function used to explore the nature of a time series, and the density functions used to analyze the shape of the residuals or of any economic variable. A traditional approach to studying these functions has been to first impose a parametric functional form and then proceed with the estimation and testing of interest. A major disadvantage of this traditional approach is that the econometric analysis may not be robust to even slight inconsistencies between the data and the particular parametric specification. Indeed, any misspecification in the functional form may lead to erroneous conclusions. In view of these problems, a vast literature has recently appeared on the nonparametric and semiparametric approaches to econometrics; see the books by Prakasa Rao (1983), Silverman (1986), Härdle (1990), Fan and Gijbels (1996), and Pagan and Ullah (1999). A large number of papers continue to appear in various journals of statistics and econometrics.
The basic point in the nonparametric approach to econometrics is to realize that, in many instances, one is attempting to estimate an expectation of one variable, $y$, conditional upon others, $x$. This identification directs attention to the need to be able to estimate the conditional mean of $y$ given $x$ from the data $y_i$ and $x_i$, $i = 1, \ldots, n$. A nonparametric estimate of this conditional mean simply follows as a weighted average $\sum_{i=1}^{n} w(x_i, x)\, y_i$, where the $w(x_i, x)$ are a set of weights that depend upon the distance of $x_i$ from the point $x$ at which the conditional expectation is to be evaluated. A kernel weight is considered, and it is the subject of discussion in Section 2. This section also indicates how the procedures extend to the estimation of higher order moments and of the derivatives of the function linking $y$ and $x$. Finally, a detailed discussion of existing and some new goodness-of-fit procedures for nonparametric regression is presented, and their applications to determining the window width in the kernel weight and to variable selection are discussed.
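The weighted-average idea can be illustrated with a minimal sketch. It assumes a Gaussian kernel and weights normalized to sum to one (the Nadaraya-Watson form); the function names, the simulated data, and the window width h = 0.05 are illustrative choices made here, not the chapter's own specification, which is given in Section 2.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K(u) (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_weights(x0, xs, h):
    """Weights w(x_i, x0): large when x_i is close to x0, normalized to sum to one.

    h is the window width. If x0 lies far from every x_i relative to h,
    the kernel values can underflow to zero and the division will fail.
    """
    k = [gaussian_kernel((xi - x0) / h) for xi in xs]
    total = sum(k)
    return [ki / total for ki in k]


def kernel_regression(x0, xs, ys, h):
    """Estimate E(y | x = x0) as the weighted average sum_i w(x_i, x0) * y_i."""
    w = kernel_weights(x0, xs, h)
    return sum(wi * yi for wi, yi in zip(w, ys))


# Simulated data (hypothetical): y = sin(2*pi*x) + noise
random.seed(0)
xs = [random.uniform(0.0, 1.0) for _ in range(200)]
ys = [math.sin(2.0 * math.pi * xi) + random.gauss(0.0, 0.2) for xi in xs]

# Estimated conditional mean at three evaluation points
m_hat = [kernel_regression(x0, xs, ys, h=0.05) for x0 in (0.25, 0.5, 0.75)]
```

A smaller h puts nearly all the weight on the observations closest to the evaluation point, so the fit follows the data closely but is more variable; a larger h averages over more observations and gives a smoother but potentially more biased fit.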
A problem with an a priori specified parametric function is that when it is misspecified, even in small regions of the data, the parametric fit may be poor (biased) though it may be smooth (low variance). On the other hand, nonparametric functional estimation techniques, which depend entirely on the data and have no a priori specified functional form, may trace the irregular pattern in the data well (less bias) but may be more variable (high variance). A solution discussed in Section 3 is to use a combination of parametric and nonparametric regressions, which can improve upon, in the mean squared error (MSE) sense, the drawbacks of each when used individually.
Perhaps the major complication in a purely nonparametric approach to estimation is the "curse of dimensionality", which implies that, if an accurate measurement of the function is to be made, the size of the sample should increase rapidly with the number of variables involved in any relation. This problem has led to the development of additive nonparametric regressions, which estimate regressions with a large number of x variables with an accuracy similar to that of a regression with one variable. This is discussed in Section 4. Another solution is to consider a linear relationship for some variables while allowing a much smaller number to have an unknown nonlinear relation. Accordingly, Section 4.1 deals with these and other related models, which are referred to as semiparametric models.
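As a rough sketch of the two structures just described, an additive regression and a partially linear semiparametric model are commonly written as follows. The symbols $\alpha$, $m_j$, $\beta$, $g$, $z$, and $u$ are notation assumed here for illustration and may differ from the notation used in Section 4.

```latex
% Additive regression: q unknown one-dimensional functions
% replace a single unknown q-dimensional function.
E(y \mid x_1, \ldots, x_q) = \alpha + \sum_{j=1}^{q} m_j(x_j)

% Partially linear model: x enters linearly, and only the
% low-dimensional z enters through the unknown function g.
y_i = x_i'\beta + g(z_i) + u_i, \qquad E(u_i \mid x_i, z_i) = 0
```

In both cases each unknown function has a low-dimensional argument, which is what lets these models avoid the rapid growth in sample size required by an unrestricted regression on all of the variables.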
While the major developments in nonparametric and semiparametric research have been in the area of estimation, only recently have papers started appearing that deal with hypothesis testing issues. The general question is how to deal with the traditional hypothesis testing problems (such as tests of functional form, restrictions, and heteroskedasticity) in nonparametric and semiparametric models. This is explored in Section 5.
The plan of the paper is as follows. In Section 2 we present the estimation of nonparametric regression. Then in Section 3 we discuss combined regressions. Section 4 deals with additive regressions and semiparametric models. Finally, in Section 5 we explore the issues in nonparametric hypothesis testing.