Estimation of the intercept
The semiparametric estimation of sample selection models described above has focused on the estimation of the regression coefficients of the outcome and selection equations. The intercept of the outcome equation has been absorbed into the unknown distribution of the disturbances. For some empirical applications, however, one might be interested in estimating counterfactual outcomes and, hence, the intercept of the outcome equation. The semi-nonparametric likelihood approach of Gallant and Nychka (1987), based on series expansion, can consistently estimate the unknown intercept, but the asymptotic distribution of that estimator remains unknown. Instead of the likelihood approach, alternative approaches can be based on the observed outcome equation under the assumption that the underlying disturbances have zero mean. One may imagine that the observed outcome equation is unlikely to be subject to selection bias for individuals whose participation decisions are almost certain, i.e. individuals with observed characteristics x such that P(I = 1|x) = 1. This idea appears in Olsen (1982) and has lately been picked up in Heckman (1990) and Andrews and Schafgans (1998).
Consider the sample selection model with outcome equation y = β₁ + xβ₂ + u, which is observed only if zγ > ε, where (u, ε) is independent of x and z. The coefficients β₂ and γ can be estimated by various semiparametric methods as described before. Let β̂₂ and γ̂ be, respectively, consistent estimates of β₂ and γ. Heckman (1990) suggests the estimator β̂₁ = Σᵢ₌₁ⁿ Iᵢ(yᵢ − xᵢβ̂₂)1{zᵢγ̂ > bₙ} / Σᵢ₌₁ⁿ Iᵢ1{zᵢγ̂ > bₙ}, where {bₙ} is a sequence of bandwidth parameters depending on the sample size n such that bₙ → ∞. The latter requirement is needed because only the upper tail of zγ̂ is likely to identify individuals whose choice probability is close to one: E(y | x, z, I = 1) = β₁ + xβ₂ + E(u | ε < zγ), and the bias term E(u | ε < zγ) converges to E(u) = 0 as zγ → ∞, so only observations with large zᵢγ̂ are informative about β₁. Heckman did not provide an asymptotic analysis of his suggested estimator. Andrews and Schafgans (1998) suggest a smooth version in which the indicator 1{zᵢγ̂ > bₙ} is replaced by a smooth distribution function. The latter modification permits a relatively easy asymptotic analysis of the estimator. The rate of convergence and the possible asymptotic distribution of the estimator depend on the relative behavior of the upper tails of the distributions of zγ and ε. Andrews and Schafgans (1998) show that the intercept estimator can achieve at most a cube-root-n rate of convergence when the upper tail of the distribution of zγ has the same thickness as the upper tail of the distribution of ε. The rate of convergence can be up to square-root-n only when the upper tail of the distribution of zγ is thicker than the upper tail of the distribution of ε. One may recall a similarity with the asymptotic behavior of order statistics in the estimation of the range of a distribution.
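To make the construction concrete, the following is a minimal numerical sketch of the two intercept estimators in Python. It assumes that consistent first-step estimates β̂₂ and γ̂ and a threshold bₙ are already available; the function names, the numpy interface, and the logistic choice of smooth weight in the second function are illustrative assumptions, not part of Heckman (1990) or Andrews and Schafgans (1998).

import numpy as np

def heckman_intercept(y, x, z, d, beta2_hat, gamma_hat, b_n):
    # Identification-at-infinity estimator of the intercept (Heckman, 1990):
    # average the residuals y_i - x_i beta2_hat over selected observations
    # (d_i = 1) whose index z_i gamma_hat exceeds the threshold b_n.
    index = z @ gamma_hat
    w = d * (index > b_n)          # indicator weights 1{z_i gamma_hat > b_n}
    resid = y - x @ beta2_hat
    return np.sum(w * resid) / np.sum(w)

def smooth_intercept(y, x, z, d, beta2_hat, gamma_hat, b_n):
    # Smoothed variant in the spirit of Andrews and Schafgans (1998): the
    # indicator is replaced by a smooth weight increasing from 0 to 1. A
    # shifted logistic cdf is used here purely as a placeholder; their
    # analysis imposes specific conditions on this weight function.
    index = z @ gamma_hat
    w = d / (1.0 + np.exp(-(index - b_n)))
    resid = y - x @ beta2_hat
    return np.sum(w * resid) / np.sum(w)

In practice, β̂₂ and γ̂ would come from the semiparametric first-stage estimators discussed earlier, d is the vector of selection indicators Iᵢ, and bₙ would be chosen to grow slowly with the sample size so that bₙ → ∞.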