A COMPANION TO Theoretical Econometrics
Combined Regression
Both parametric and nonparametric regressions have drawbacks when used on their own. When the a priori specified parametric regression $m(x) = f(\beta, x)$ is misspecified, even in small regions of the data, the parametric fit may be poor (biased) although smooth (low variance). Nonparametric regression techniques, on the other hand, depend entirely on the data and impose no a priori functional form; they may trace irregular patterns in the data well (less bias) but may be more variable (high variance). Thus, when the functional form of $m(x)$ is unknown, a parametric model may not adequately describe the data over its entire range, whereas a nonparametric analysis ignores important a priori information about the underlying model. A solution considered in the literature is to combine parametric and nonparametric regressions so as to mitigate the drawbacks each has when used individually; see Eubank and Spiegelman (1990), Fan and Ullah (1998), and Glad (1998). Essentially, the combined regression estimator controls both the bias and the variance and hence improves the MSE of the fit. To see the idea behind combined estimation, let us start with a parametric model ($m(x) = f(\beta, x)$), which can be written as
$$y_i = m(x_i) + u_i = f(\beta, x_i) + g(x_i) + \varepsilon_i,$$
where $g(x_i) = E(u_i \mid x_i) = m(x_i) - E(f(\beta, x_i) \mid x_i)$ and $\varepsilon_i = u_i - E(u_i \mid x_i)$, such that $E(\varepsilon_i \mid x_i) = 0$. Note that $f(\beta, x_i)$ may not be a correctly specified model, so $g(x_i) \neq 0$. If it is indeed a correct specification, $g(x_i) = 0$. The combined estimation of $m(x)$ can be written as
$$\hat{m}_c(x_i) = f(\hat{\beta}, x_i) + \hat{g}(x_i),$$
where $\hat{g}(x_i) = \hat{E}(\hat{u}_i \mid x_i)$ is obtained by the LLS estimation technique and $\hat{u}_i = y_i - f(\hat{\beta}, x_i)$ is the parametric residual.
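The two-step construction just described (fit the parametric model, then smooth its residuals by LLS and add the result back) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the simulated data, the linear parametric start, the Gaussian kernel, the fixed bandwidth `h = 0.25`, and the helper name `local_linear` are all hypothetical choices, not taken from the cited papers.

```python
import numpy as np

def local_linear(x, y, h):
    """Local linear least squares (LLS) fit of y on x, evaluated at each x[i].

    Gaussian kernel and fixed bandwidth h are illustrative assumptions.
    """
    fitted = np.empty(len(x))
    for i in range(len(x)):
        d = x - x[i]                        # regressor centred at x[i]
        w = np.exp(-0.5 * (d / h) ** 2)     # kernel weights
        X = np.column_stack([np.ones_like(d), d])
        XtW = X.T * w                       # X'W without forming the n x n matrix W
        coef = np.linalg.solve(XtW @ X, XtW @ y)
        fitted[i] = coef[0]                 # local intercept = fitted value at x[i]
    return fitted

# Simulated data (hypothetical): the true m(x) is not linear in x.
rng = np.random.default_rng(0)
n = 200
x = np.sort(rng.uniform(0.0, 3.0, n))
m_true = 3.0 + 2.0 * x + np.sin(2.0 * x)
y = m_true + rng.normal(0.0, 0.3, n)

# Parametric start: f(beta, x) = beta0 + beta1 * x, estimated by OLS.
Z = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(Z, y, rcond=None)[0]
f_hat = Z @ beta_hat

# Nonparametric correction: LLS regression of the parametric residuals on x.
u_hat = y - f_hat
g_hat = local_linear(x, u_hat, h=0.25)

# Combined estimator: parametric fit plus smoothed residuals.
m_c = f_hat + g_hat

def mse(fit):
    """Mean squared error against the (known, simulated) true regression."""
    return np.mean((fit - m_true) ** 2)

print(f"MSE parametric: {mse(f_hat):.4f}, combined: {mse(m_c):.4f}")
```

In practice the bandwidth would be chosen by a data-driven rule such as cross-validation rather than fixed in advance.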
An alternative way to combine the two models is to introduce a weight parameter $\lambda$ and write $y_i = f(\beta, x_i) + \lambda g(x_i) + \varepsilon_i$. If the parametric model is correct, $\lambda = 0$. Thus, the parameter $\lambda$ measures the degree of accuracy of the parametric model. A value of $\lambda$ in the range 0 to 1 can be obtained by using the goodness-of-fit measures described in Section 2.1, especially $R^2$ and EPE. Alternatively, an LS estimator $\hat{\lambda}$ can be obtained by a density-weighted regression of $\hat{u}_i = y_i - f(\hat{\beta}, x_i)$ on $\hat{g}(x_i)$; see Fan and Ullah (1998). The combined estimator of $m(x)$ is then given by
$$\hat{m}_c^*(x_i) = f(\hat{\beta}, x_i) + \hat{\lambda}\,\hat{g}(x_i).$$
This estimator shows that the parametric start model $f(\hat{\beta}, x)$ is adjusted by $\hat{\lambda}$ times $\hat{g}(x)$ to obtain a more accurate fit of the unknown $m(x)$.
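As a rough illustration of the weighted version, the sketch below estimates the weight by a weighted least squares regression (without intercept) of the parametric residuals on their LLS smooth. The weights used here are a Gaussian kernel density estimate at each observation; this is an assumption made for illustration, and the precise weighting scheme should be taken from Fan and Ullah (1998). The simulated data, the bandwidth, and the helper names `local_linear` and `kernel_density` are likewise hypothetical.

```python
import numpy as np

def local_linear(x, y, h):
    """LLS fit of y on x at each sample point (Gaussian kernel, bandwidth h)."""
    fitted = np.empty(len(x))
    for i in range(len(x)):
        d = x - x[i]
        w = np.exp(-0.5 * (d / h) ** 2)
        X = np.column_stack([np.ones_like(d), d])
        XtW = X.T * w
        fitted[i] = np.linalg.solve(XtW @ X, XtW @ y)[0]
    return fitted

def kernel_density(x, h):
    """Gaussian kernel density estimate evaluated at each sample point."""
    d = x[:, None] - x[None, :]
    k = np.exp(-0.5 * (d / h) ** 2) / np.sqrt(2.0 * np.pi)
    return k.sum(axis=1) / (len(x) * h)

# Simulated data (hypothetical) with a misspecified linear parametric start.
rng = np.random.default_rng(1)
n = 200
x = np.sort(rng.uniform(0.0, 3.0, n))
y = 3.0 + 2.0 * x + np.sin(2.0 * x) + rng.normal(0.0, 0.3, n)

Z = np.column_stack([np.ones(n), x])
f_hat = Z @ np.linalg.lstsq(Z, y, rcond=None)[0]   # parametric fit
u_hat = y - f_hat                                   # parametric residuals
h = 0.25
g_hat = local_linear(x, u_hat, h)                   # smoothed residuals

# Weighted LS of u_hat on g_hat (no intercept), with density weights
# (assumed weighting, see the note above).
w = kernel_density(x, h)
lam_hat = np.sum(w * u_hat * g_hat) / np.sum(w * g_hat ** 2)

# Weighted combined estimator.
m_star = f_hat + lam_hat * g_hat
print(f"lambda_hat = {lam_hat:.3f}")
```

Setting `lam_hat` to 0 recovers the purely parametric fit, and setting it to 1 recovers the unweighted combined estimator.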
Instead of additive adjustments to the parametric start in $\hat{m}_c(x)$ and $\hat{m}_c^*(x)$, Glad (1998) proposed a multiplicative adjustment as given below. Write
$$m(x_i) = f(\beta, x_i)\,E(y_i^* \mid x_i),$$
where $y_i^* = y_i / f(\beta, x_i)$. Then $\hat{m}_g(x_i) = f(\hat{\beta}, x_i)\,\hat{E}(\hat{y}_i^* \mid x_i)$ is the estimator proposed by Glad (1998), where $\hat{y}_i^* = y_i / f(\hat{\beta}, x_i)$, and $\hat{E}(\cdot)$ is obtained by the LLS estimator described above.
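The multiplicative correction can be sketched along the same lines: divide the responses by the parametric fit, smooth the ratios by LLS, and multiply the parametric fit by the smoothed ratio. As before, this is a minimal sketch under assumptions, not Glad's own implementation: the simulated data, the linear parametric start, the Gaussian kernel, the bandwidth, and the helper name `local_linear` are hypothetical. The simulated regression is kept well away from zero so that division by the parametric fit is safe.

```python
import numpy as np

def local_linear(x, y, h):
    """LLS fit of y on x at each sample point (Gaussian kernel, bandwidth h)."""
    fitted = np.empty(len(x))
    for i in range(len(x)):
        d = x - x[i]
        w = np.exp(-0.5 * (d / h) ** 2)
        X = np.column_stack([np.ones_like(d), d])
        XtW = X.T * w
        fitted[i] = np.linalg.solve(XtW @ X, XtW @ y)[0]
    return fitted

# Simulated data (hypothetical): positive regression function, linear start.
rng = np.random.default_rng(2)
n = 200
x = np.sort(rng.uniform(0.0, 3.0, n))
m_true = 3.0 + 2.0 * x + np.sin(2.0 * x)
y = m_true + rng.normal(0.0, 0.3, n)

# Parametric start fitted by OLS.
Z = np.column_stack([np.ones(n), x])
f_hat = Z @ np.linalg.lstsq(Z, y, rcond=None)[0]

# The ratio y / f_hat is only well behaved when f_hat stays away from zero.
if np.any(np.abs(f_hat) < 1e-8):
    raise ValueError("parametric fit is too close to zero for a ratio correction")

y_star = y / f_hat                          # rescaled responses
r_hat = local_linear(x, y_star, h=0.25)     # LLS estimate of E(y* | x)
m_g = f_hat * r_hat                         # multiplicatively corrected fit

def mse(fit):
    """Mean squared error against the (known, simulated) true regression."""
    return np.mean((fit - m_true) ** 2)

print(f"MSE parametric: {mse(f_hat):.4f}, multiplicative: {mse(m_g):.4f}")
```

If the parametric start were exactly right, the smoothed ratio would be close to 1 everywhere and the correction would leave the parametric fit nearly unchanged.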
The asymptotic convergence rates of $\hat{m}_c(x)$ and its asymptotic normality are given in Fan and Ullah (1998). In small samples, the simulation results of Rahman and Ullah (1999) suggest that the combined estimators perform, in the MSE sense, as well as the parametric estimator when the parametric model is correct, and better than both the parametric and the nonparametric LLS estimators when the parametric model is incorrect.