A COMPANION TO Theoretical Econometrics
Semiparametric efficiency bound and semiparametric MLE
On asymptotic efficiency for semiparametric estimation, Chamberlain (1986) derived an asymptotic lower bound for the variances of $\sqrt{n}$-consistent regular semiparametric estimators for the sample selection model (18.1). The notion of asymptotic efficiency for parameter estimators in a semiparametric model originated in Stein (1956). The semiparametric asymptotic variance bound $V$ for a semiparametric model is defined as the supremum of the Cramér-Rao bounds for all regular parametric submodels. The intuition behind this efficiency criterion is that a parametric MLE for any parametric submodel should be at least as efficient as any semiparametric estimator. The class of estimators is restricted to regular estimators so as to exclude superefficient estimators and estimators using information that is not contained in a semiparametric model. Newey (1990) provides a useful characterization for a semiparametric estimator to be regular. A way to derive the semiparametric bound is to derive a relevant tangent set and its efficient score $S$. Let $\delta = (\theta', \eta')'$ be the parameters of a submodel and let $S_\delta = (S_\theta', S_\eta')'$ be the score vector. A tangent set $\mathcal{T}$ is defined as the mean-square closure of all linear combinations of the scores $S_\eta$ for parametric submodels. The efficient score $S$ is the unique vector such that $S_\theta - S \in \mathcal{T}$ and $E(S't) = 0$ for all $t \in \mathcal{T}$. The semiparametric variance bound is $V = (E[SS'])^{-1}$.
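As a worked special case, offered only as an illustration and not taken from the cited papers: when the nuisance parameter $\eta$ is finite-dimensional, the tangent set is the linear span of $S_\eta$, the projection onto it is an ordinary least-squares projection, and the bound reduces to the familiar partitioned-information formula. The block symbols $I_{\theta\theta}$, $I_{\theta\eta}$ and $I_{\eta\eta}$ are notation introduced here for the illustration.

```latex
\begin{align*}
S &= S_\theta - I_{\theta\eta} I_{\eta\eta}^{-1} S_\eta,
  \qquad I_{\theta\eta} = E[S_\theta S_\eta'], \quad I_{\eta\eta} = E[S_\eta S_\eta'], \\
V &= (E[SS'])^{-1}
   = \bigl( I_{\theta\theta} - I_{\theta\eta} I_{\eta\eta}^{-1} I_{\eta\theta} \bigr)^{-1},
  \qquad I_{\theta\theta} = E[S_\theta S_\theta'].
\end{align*}
```

In the semiparametric case the same residual-from-projection logic applies, but the projection is onto the closure of the scores over all parametric submodels rather than onto a single finite-dimensional span.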
The sample selection model considered in Chamberlain (1986) is (18.1) under the assumption that $\varepsilon$ and $u$ are independent of all the regressors in the model. With a parametric submodel, the scores for $(\beta, \gamma)$ of a single observation have the form $\partial \ln l(\beta, \gamma)/\partial\beta = a_1(y - x\beta, I, z\gamma)\,x$ and $\partial \ln l(\beta, \gamma)/\partial\gamma = a_2(y - x\beta, I, z\gamma)\,z$, for some functions $a_1$ and $a_2$. Chamberlain (1986) showed that the effective scores of the semiparametric sample selection model are simply the preceding scores with the factors $x$ and $z$ replaced, respectively, by $x - E(x \mid z\gamma)$ and $z - E(z \mid z\gamma)$, i.e.,

$$\frac{\partial \ln l(\beta, \gamma)}{\partial \beta} = a_1(y - x\beta, I, z\gamma)\,\bigl(x - E(x \mid z\gamma)\bigr) \quad \text{and} \quad \frac{\partial \ln l(\beta, \gamma)}{\partial \gamma} = a_2(y - x\beta, I, z\gamma)\,\bigl(z - E(z \mid z\gamma)\bigr). \qquad (18.13)$$
The information matrix of the semiparametric model is formed from these effective scores. The expressions in (18.13) provide insight into qualitative differences between parametric and semiparametric models. The information matrix of the semiparametric model is singular if there is no restriction on $\gamma$ in the choice equation. This is so because $(z - E(z \mid z\gamma))\gamma = 0$. Furthermore, if $z$ consists only of discrete variables, then $E(z \mid z\gamma)$ generally equals $z$. So, for the information matrix to be nonsingular, at least one component of $z$ must be a continuously distributed variable. If $z$ were a subvector of $x$, the effective scores would also be linearly dependent and the information matrix would be singular. So an exclusion restriction on $\beta$ is also needed. Chamberlain (1986) pointed out that if the information matrix of the semiparametric model is singular, then there are parameters for which no (regular) consistent estimator can converge at rate $n^{-1/2}$.
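The two singularity arguments can be checked numerically. The sketch below is an illustration under stated assumptions, not part of Chamberlain's derivation: $z$ is simulated as two discrete regressors, the conditional mean $E(z \mid z\gamma)$ is replaced by the sample mean within each distinct index value, and the helper name `index_residuals` and both values of $\gamma$ are hypothetical.

```python
import numpy as np

def index_residuals(z, gamma):
    """Return z - E(z | z @ gamma), estimating the conditional mean by the
    sample mean of z within each distinct value of the index (discrete z)."""
    index = z @ gamma
    resid = np.empty_like(z, dtype=float)
    for value in np.unique(index):
        mask = index == value
        resid[mask] = z[mask] - z[mask].mean(axis=0)
    return resid

rng = np.random.default_rng(0)
# Two discrete regressors, each taking values in {0, 1, 2} (assumed design).
z = rng.integers(0, 3, size=(500, 2)).astype(float)

# Case A: several z values share an index value, so residuals are nonzero,
# but they are always orthogonal to gamma: (z - E(z | z gamma)) gamma = 0.
gamma_a = np.array([1.0, 1.0])
resid_a = index_residuals(z, gamma_a)
moment_a = resid_a.T @ resid_a / len(z)   # second-moment matrix of residuals
eigenvalues_a = np.linalg.eigvalsh(moment_a)

# Case B: the index z1 + 3*z2 is one-to-one in z, so E(z | z gamma) = z and
# the residuals vanish identically.
gamma_b = np.array([1.0, 3.0])
resid_b = index_residuals(z, gamma_b)

print("resid_a @ gamma_a is zero:", np.allclose(resid_a @ gamma_a, 0.0))
print("eigenvalues of residual moment matrix (case A):", eigenvalues_a)
print("residuals vanish in case B:", np.allclose(resid_b, 0.0))
```

In case A the residual moment matrix has $\gamma$ in its null space, which is the source of the singularity when $\gamma$ is unrestricted. In case B the residual is zero for every observation, which mirrors the need for a continuously distributed component of $z$.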
Ai (1997) and Chen and Lee (1998) proposed semiparametric scoring estimation methods based on index restrictions and kernel regression functions. The approaches involve the construction of efficient score functions. Ai's estimator was defined by setting an estimated sample score equal to zero and involved solving nonlinear equations. Chen and Lee's approach was a two-step efficient scoring method. Given an initial $\sqrt{n}$-consistent estimator, the latter estimator has a closed form. Both the estimator of Ai and that of Chen and Lee were shown to be asymptotically efficient for model (18.1) under the independence assumption. Chen and Lee (1998) also derived the efficiency bound for the polychotomous choice model with index restrictions and pointed out that their estimator attained that bound. In an index model with $L$ choice alternatives, the choice probabilities $P(I_l \mid x) = E(I_l \mid z\gamma)$ are functions of the index $z\gamma$ of the choice equations, and the density function of $y$
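conditional on $I_1 = 1$ and all exogenous variables in the model is the conditional density function of $y - x\beta$ conditional on $I_1 = 1$ and $z\gamma$ at the true parameter vector $(\beta_0, \gamma_0)$, i.e. $f(y \mid I_1 = 1, x, z) = f(y - x\beta_0 \mid I_1 = 1, z\gamma_0) = g(y - x\beta_0, z\gamma_0 \mid I_1 = 1)/p(z\gamma_0 \mid I_1 = 1)$, where $g(\varepsilon, z\gamma_0 \mid I_1 = 1)$ is the conditional density of $\varepsilon$ and $z\gamma_0$ conditional on $I_1 = 1$, and $p(z\gamma_0 \mid I_1 = 1)$ is the conditional density of $z\gamma_0$. Given a random sample of size $n$, for any possible value $\theta$, the probability function $E(I_l \mid z\gamma, \theta)$ of $I_l$ conditional on $z\gamma$, evaluated at the point $z_i\gamma$, can be estimated by $P_{nl}(x_i, \theta) = A_n(I_l \mid x_i, \theta)/A_n(1 \mid x_i, \theta)$, where

$$A_n(v \mid x_i, \theta) = \frac{1}{n-1} \sum_{j \neq i} v_j \, \frac{1}{a_n^{m}} K\!\left(\frac{z_i\gamma - z_j\gamma}{a_n}\right)$$

for $v = I_l$ or $v = 1$, with $a_n$ a bandwidth and $K$ a kernel function on $R^m$ when $z\gamma$ is an $m$-dimensional vector of indices. On the other hand, $f(y - x\beta \mid I_1 = 1, z\gamma, \theta)$ evaluated at the point $(y_i - x_i\beta, z_i\gamma)$ can be estimated by $f_n(y_i - x_i\beta \mid I_{1i} = 1, z_i\gamma) = C_n(x_i, y_i, \theta)/A_n(I_1 \mid x_i, \theta)$, where

$$C_n(x_i, y_i, \theta) = \frac{1}{n-1} \sum_{j \neq i} I_{1j} \, \frac{1}{a_n^{m+k}} J\!\left(\frac{(y_i - x_i\beta) - (y_j - x_j\beta)}{a_n}, \frac{z_i\gamma - z_j\gamma}{a_n}\right)$$

with $J$ a kernel function on $R^{m+k}$ when $y - x\beta$ is a vector of dimension $k$.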
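The kernel estimate of the choice probabilities can be sketched as a leave-one-out ratio of kernel-weighted averages over the index. The code below is a minimal illustration under assumptions that are not in the source: a product Gaussian kernel, a fixed bandwidth of 0.3, a single simulated index, and three alternatives. The function names `gaussian_kernel`, `loo_kernel_average` and `choice_probabilities` are hypothetical.

```python
import numpy as np

def gaussian_kernel(u):
    """Product Gaussian kernel on R^m (an assumed choice); u has shape (..., m)."""
    m = u.shape[-1]
    return np.exp(-0.5 * np.sum(u**2, axis=-1)) / (2.0 * np.pi) ** (m / 2.0)

def loo_kernel_average(v, index, bandwidth):
    """Leave-one-out average (1/(n-1)) * sum_{j != i} v_j K((idx_i - idx_j)/a) / a^m."""
    n, m = index.shape
    diffs = (index[:, None, :] - index[None, :, :]) / bandwidth
    weights = gaussian_kernel(diffs) / bandwidth**m
    np.fill_diagonal(weights, 0.0)          # drop the j == i term
    return weights @ v / (n - 1)

def choice_probabilities(indicators, index, bandwidth):
    """Ratio estimator: kernel average of each indicator over kernel average of 1."""
    denom = loo_kernel_average(np.ones(len(index)), index, bandwidth)
    numer = np.column_stack([
        loo_kernel_average(indicators[:, l], index, bandwidth)
        for l in range(indicators.shape[1])
    ])
    return numer / denom[:, None]

# Simulated data (purely illustrative): one index, three alternatives.
rng = np.random.default_rng(1)
n = 200
index = rng.normal(size=(n, 1))             # plays the role of z_i * gamma
utilities = index @ np.array([[0.0, 1.0, -1.0]]) + rng.gumbel(size=(n, 3))
choice = utilities.argmax(axis=1)
indicators = np.eye(3)[choice]              # one-hot choice indicators

probs = choice_probabilities(indicators, index, bandwidth=0.3)
print(probs[:5])
```

Because the indicators sum to one for each observation, the estimated probabilities sum to one across alternatives at every evaluation point. The conditional density estimate follows the same pattern, with a higher-dimensional kernel in the numerator.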
These nonparametric functions form a semiparametric loglikelihood function

$$\ln L_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \Bigl\{ I_{1i} \ln f_n(y_i - x_i\beta \mid I_{1i} = 1, z_i\gamma) + \sum_{l=1}^{L} I_{li} \ln P_{nl}(x_i, \theta) \Bigr\}.$$

But, due to technical difficulties, this semiparametric likelihood function can hardly be used directly. Instead, one can work with its implied score function, i.e. the derivative of the semiparametric loglikelihood function with respect to $\theta$. With an initial $\sqrt{n}$-consistent estimate of $\theta$, the Chen-Lee estimator is a semiparametric two-step scoring estimator.
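The closed form of a scoring step can be illustrated generically. The sketch below is not Chen and Lee's construction: it takes a matrix of per-observation scores evaluated at an initial estimate as given, and it assumes the information matrix is estimated by the outer product of those scores. The function name `one_step_scoring` and the toy normal-mean example are hypothetical.

```python
import numpy as np

def one_step_scoring(theta_init, scores):
    """One scoring step from an initial estimate.

    scores: (n, p) array of per-observation scores evaluated at theta_init.
    The information matrix is approximated by the outer product of the scores
    (an assumption of this sketch).
    """
    scores = np.asarray(scores, dtype=float)
    total_score = scores.sum(axis=0)
    outer_product = scores.T @ scores
    step = np.linalg.solve(outer_product, total_score)
    return np.asarray(theta_init, dtype=float) + step

# Toy example: y_i ~ N(theta, 1), whose score at theta is (y_i - theta).
y = np.array([1.0, 2.0, 3.0, 6.0])
theta_init = np.array([np.median(y)])        # initial estimate: 2.5
scores = (y - theta_init[0]).reshape(-1, 1)  # scores at the initial estimate
theta_hat = one_step_scoring(theta_init, scores)
print(theta_hat)
```

In the semiparametric setting the rows of `scores` would be estimated efficient scores built from the kernel estimates, which is where the technical work lies. The update itself needs no iterative optimization.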