Generalized Maximum Likelihood Estimator
Cosslett (1983) proposed maximizing the log likelihood function (9.2.7) of a binary QR model with respect to $\beta$ and $F$, subject to the condition that $F$ is a distribution function. The log likelihood function, denoted here as $\psi$, is
$$\psi(\beta, F) = \sum_{i=1}^{n} \left\{ y_i \log F(x_i'\beta) + (1 - y_i) \log\left[1 - F(x_i'\beta)\right] \right\}. \tag{9.6.33}$$
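As a concrete illustration of how (9.6.33) might be evaluated for a given $\beta$ and a step-function $F$, consider the following Python sketch. The representation of $F$ by jump points and heights, and all function and variable names, are assumptions made here for illustration; they are not part of Cosslett's exposition.

```python
import numpy as np

def log_likelihood(beta, x, y, jump_points, heights):
    """Evaluate the log likelihood (9.6.33) for a candidate beta and a
    nondecreasing step function F given by its jump points and the values
    it takes at and to the right of each jump (illustrative sketch)."""
    xb = x @ beta
    # F(x_i'beta) = height at the last jump point not exceeding x_i'beta,
    # and 0 to the left of the first jump point
    idx = np.searchsorted(jump_points, xb, side="right") - 1
    F = np.where(idx >= 0, np.asarray(heights)[np.clip(idx, 0, None)], 0.0)
    F = np.clip(F, 1e-10, 1 - 1e-10)  # keep the logarithms finite at F = 0 or 1
    return np.sum(y * np.log(F) + (1 - y) * np.log(1 - F))
```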
The consistency proof of Kiefer and Wolfowitz (1956) applies to this kind of model. Cosslett showed how to compute the MLE $\hat{\beta}$ and $\hat{F}$ and derived conditions for their consistency, translating the general conditions of Kiefer and Wolfowitz into this particular model. The conditions Cosslett found, which are not reproduced here, are quite reasonable and likely to hold in most practical applications.
Clearly some kind of normalization is needed on $\beta$ and $F$ before we maximize (9.6.33). Cosslett adopted the following normalization: the constant term is 0 and the sum of squares of the remaining parameters is equal to 1. Note that the assumption of a zero constant term is adopted in lieu of Manski's assumption $F(0) = 0.5$. We assume that the constant term has already been eliminated from the $x_i'\beta$ that appears in (9.6.33). Thus we can proceed, assuming $\beta'\beta = 1$.
The maximization of (9.6.33) is carried out in two stages. In the first stage we shall fix $\beta$ and maximize $\psi(\beta, F)$ with respect to $F$. Let the solution be $\hat{F}(\beta)$. Then in the second stage we shall maximize $\psi[\beta, \hat{F}(\beta)]$ with respect to $\beta$. Although the second stage presents a more difficult computational problem, we shall describe only the first stage because it is nonstandard and conceptually more difficult.
The first-stage maximization consists of several steps:
Step 1. Given $\beta$, rank order $\{x_i'\beta\}$. Suppose $x_{(1)}'\beta < x_{(2)}'\beta < \cdots < x_{(n)}'\beta$, assuming there is no tie. Determine a sequence $(y_{(1)}, y_{(2)}, \ldots, y_{(n)})$ accordingly. Note that this is a sequence consisting only of ones and zeros.
Step 2. Partition this sequence into the smallest possible number of successive groups in such a way that each group consists of a nonincreasing sequence.
Step 3. Calculate the ratio of the number of ones to the number of elements in each group. Let the sequence of ratios thus obtained be $(r_1, r_2, \ldots, r_K)$, assuming there are $K$ groups. If this is a nondecreasing sequence, we are done. We define $\hat{F}(x_{(i)}'\beta) = r_j$ if the $(i)$th observation is in the $j$th group.
Step 4. If, however, $r_j < r_{j-1}$ for some $j$, combine the $j$th and $(j-1)$th groups and repeat step 3 until we obtain a nondecreasing sequence.
The preceding procedure can best be taught by example:
Example 1.
[Table omitted.]
In this example, there is no need for step 4.
Example 2.
[Table omitted.]
Here, the second and third groups must be combined to yield
[Table omitted.]
Note that $\hat{F}$ is not unique over some parts of the domain. For example, between $x_{(2)}'\beta$ and $x_{(3)}'\beta$ in Example 1, $\hat{F}$ may take any value between 0 and $\hat{F}(x_{(3)}'\beta)$.
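Because the tables accompanying the two examples are not reproduced here, the first-stage computation can also be illustrated with a small code sketch. The following Python fragment is a minimal rendering of steps 1 through 4 (a pool-adjacent-violators style procedure); the example data and all names are hypothetical and chosen only to show the mechanics.

```python
import numpy as np

def first_stage(xb, y):
    """Sketch of the first-stage maximization for a fixed beta: given the
    index values x_i'beta (xb) and binary outcomes y, return the group
    sizes and the nondecreasing step heights r_j of F-hat."""
    y_sorted = y[np.argsort(xb)]            # Step 1: order by x_i'beta

    # Step 2: fewest successive groups, each a nonincreasing 0/1 sequence
    groups = [[y_sorted[0]]]
    for obs in y_sorted[1:]:
        if obs <= groups[-1][-1]:
            groups[-1].append(obs)          # group stays nonincreasing
        else:
            groups.append([obs])            # a 0 -> 1 jump opens a new group

    # Step 3: proportion of ones in each group
    ones = [sum(g) for g in groups]
    sizes = [len(g) for g in groups]

    # Step 4: pool adjacent groups whenever the ratios decrease
    j = 1
    while j < len(ones):
        if ones[j] / sizes[j] < ones[j - 1] / sizes[j - 1]:
            ones[j - 1] += ones.pop(j)
            sizes[j - 1] += sizes.pop(j)
            j = max(j - 1, 1)               # re-check the preceding pair
        else:
            j += 1

    return sizes, [o / s for o, s in zip(ones, sizes)]

# Hypothetical data: eight observations already indexed by x_i'beta.
xb = np.array([0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5])
y = np.array([0, 0, 1, 0, 1, 1, 0, 1])
print(first_stage(xb, y))   # group sizes and nondecreasing ratios r_j
```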
Asymptotic normality has not been proved for Cosslett's MLE $\hat{\beta}$, nor for any model to which the consistency proof of Kiefer and Wolfowitz is applicable. This seems to be as difficult a problem as proving the asymptotic normality of Manski's maximum score estimator.
Cosslett’s MLE may be regarded as a generalization of Manski’s estimator because the latter searches only among one-jump step functions to maximize (9.6.33). However, this does not necessarily imply that Cosslett’s estimator is superior. Ranking of estimators can be done according to various criteria. If the purpose is prediction, Manski’s estimator is an attractive one because it maximizes the number of correct predictions.
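For comparison, a minimal sketch of the criterion that Manski's maximum score estimator maximizes, the number of correct predictions, might look as follows; the function name and the convention of predicting $y_i = 1$ exactly when $x_i'\beta \geq 0$ are assumptions made for illustration.

```python
import numpy as np

def number_correct(beta, x, y):
    """Count correct predictions when y_i is predicted to be one exactly
    when x_i'beta >= 0 (illustrative version of the maximum score criterion)."""
    pred = (x @ beta >= 0).astype(int)
    return int(np.sum(pred == y))
```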