Springer Texts in Business and Economics
Unordered Response Models
$$L = \prod_{i=1}^{n} (\pi_{i1})^{y_{i1}} (\pi_{i2})^{y_{i2}} \cdots (\pi_{im})^{y_{im}} \tag{13.41}$$
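Taking logs of (13.41) turns the product into a double sum, which is the form usually maximized in practice. The line below is a sketch that assumes $y_{ij}$ equals one when individual $i$ chooses alternative $j$ and zero otherwise, so that exactly one exponent is nonzero for each $i$:

$$\ln L = \sum_{i=1}^{n} \sum_{j=1}^{m} y_{ij} \ln \pi_{ij}, \qquad \sum_{j=1}^{m} y_{ij} = 1 \ \text{for every } i$$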
This model can be motivated by a utility maximization story in which the utility that individual $i$ derives from, say, occupational choice $j$ is denoted by $U_{ij}$ and is a function of the job attributes for the $i$-th individual, i.e., some $x_{ij}$'s such as the present value of potential earnings and the training cost/net worth for that job choice for individual $i$, see Boskin (1974).
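$$U_{ij} = x'_{ij}\beta + \epsilon_{ij} \tag{13.42}$$
where $\beta$ is a vector of implicit prices for these occupational characteristics. Therefore, the probability of choosing the first occupation is given by:
$$\begin{aligned}
\pi_{i1} &= \Pr[U_{i1} > U_{i2},\, U_{i1} > U_{i3},\, \ldots,\, U_{i1} > U_{im}] \\
&= \Pr[\epsilon_{i2} - \epsilon_{i1} < (x'_{i1} - x'_{i2})\beta,\; \epsilon_{i3} - \epsilon_{i1} < (x'_{i1} - x'_{i3})\beta,\; \ldots,\; \epsilon_{im} - \epsilon_{i1} < (x'_{i1} - x'_{im})\beta]
\end{aligned} \tag{13.43}$$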
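The probability in (13.43) can be approximated by simulation once a distribution for the errors is chosen. The sketch below assumes independent errors drawn from $F(z) = \exp(-\exp(-z))$ and compares the simulated frequency with which the first alternative has the highest utility against the closed-form logit probability in (13.44). The systematic utilities in `v` (standing in for $x'_{ij}\beta$) are hypothetical values chosen for illustration, as are the function names.

```python
import math
import random


def draw_extreme_value(rng):
    """Inverse-CDF draw from F(z) = exp(-exp(-z))."""
    u = rng.random()
    while u == 0.0:  # random() may return exactly 0.0; avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_first_choice_prob(v, n_draws=100_000, seed=0):
    """Monte Carlo estimate of Pr[U_1 > U_j for all j != 1], U_j = v[j] + e_j."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_draws):
        utils = [vj + draw_extreme_value(rng) for vj in v]
        if utils[0] == max(utils):
            wins += 1
    return wins / n_draws


def logit_first_choice_prob(v):
    """Closed form exp(v_1) / sum_k exp(v_k)."""
    return math.exp(v[0]) / sum(math.exp(vj) for vj in v)


# Hypothetical systematic utilities for m = 3 alternatives (illustration only).
v = [0.5, 0.0, -0.5]
simulated = simulate_first_choice_prob(v)
closed_form = logit_first_choice_prob(v)
print(simulated, closed_form)
```

With enough draws the two numbers should agree up to simulation error. Under a normality assumption no such closed form is available, which is why that case requires numerical integration.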
The normality assumption involves a number of integrals but has the advantage of not necessarily assuming the $\epsilon$'s to be independent. The computationally more popular assumption leads to the multinomial logit model. This arises if and only if the $\epsilon$'s are independent and identically distributed with a type I extreme value distribution, referred to as a Weibull distribution in McFadden (1974). Its distribution function is $F(z) = \exp(-\exp(-z))$. The difference between any two random variables with this distribution has a logistic distribution $\Lambda(z) = e^{z}/(1 + e^{z})$, giving the conditional logit model:
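$$\begin{aligned}
\pi_{ij} = \Pr[y_i = j] &= \frac{\exp[(x_{ij} - x_{im})'\beta]}{1 + \sum_{k=1}^{m-1} \exp[(x_{ik} - x_{im})'\beta]} \\
&= \frac{\exp[x'_{ij}\beta]}{\sum_{k=1}^{m} \exp[x'_{ik}\beta]} \qquad \text{for } j = 1, 2, \ldots, m-1
\end{aligned} \tag{13.44}$$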
and $\pi_{im} = \Pr[y_i = m] = 1/\{1 + \sum_{k=1}^{m-1} \exp[(x_{ik} - x_{im})'\beta]\} = \exp[x'_{im}\beta]/\sum_{k=1}^{m} \exp[x'_{ik}\beta]$. There are two consequences of this conditional logit specification. The first is that the odds of any two alternative occupations, say 1 and 2, are given by
$$\pi_{i1}/\pi_{i2} = \exp[(x_{i1} - x_{i2})'\beta]$$
and this does not change when the number of alternatives changes from $m$ to $m^*$, since the denominators divide out. Therefore, the odds are unaffected by an additional alternative. This is known as the independence of irrelevant alternatives property and can represent a serious weakness in the conditional logit model. For example, suppose the choices are between a pony and a bicycle, and children choose a pony two-thirds of the time. Suppose that an additional alternative is made available, an additional bicycle but of a different color. One would still expect two-thirds of the children to choose the pony and the remaining one-third to split their choices among the bicycles according to their color preference. In the conditional logit model, however, the proportion choosing the pony must fall to one half if the odds relative to either bicycle are to remain two to one in favor of the pony. This illustrates the point that when two or more of the $m$ alternatives are close substitutes, the conditional logit model may not produce reasonable results. This feature is a consequence of assuming the errors $\epsilon_{ij}$ to be independent. Hausman and McFadden (1984) proposed a Hausman-type test to check for the independence of these errors. They suggest that if a subset of the choices is truly irrelevant, then omitting it from the model altogether will not change the parameter estimates systematically. Including these choices when they are irrelevant preserves consistency but is inefficient. The test statistic is
$$q = (\hat{\beta}_s - \hat{\beta}_f)'[\hat{V}_s - \hat{V}_f]^{-1}(\hat{\beta}_s - \hat{\beta}_f) \tag{13.45}$$
where $s$ indicates the estimator based on the restricted subset and $f$ denotes the estimator based on the full set of choices. This is asymptotically distributed as $\chi^2_k$, where $k$ is the dimension of $\beta$.
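The pony and bicycle arithmetic behind the independence of irrelevant alternatives property can be checked directly. The sketch below assumes the pony's systematic utility exceeds the bicycle's by $\log 2$, which reproduces the two-to-one odds, and that the second bicycle has the same systematic utility as the first. The function name and these utility values are illustrative assumptions.

```python
import math


def conditional_logit_probs(v):
    """Choice probabilities exp(v_j) / sum_k exp(v_k) for systematic utilities v."""
    shift = max(v)  # subtracting the max avoids overflow and leaves the ratios unchanged
    weights = [math.exp(vj - shift) for vj in v]
    total = sum(weights)
    return [w / total for w in weights]


# Two alternatives: odds of 2:1 in favor of the pony.
v_pony, v_bike = math.log(2.0), 0.0
two_way = conditional_logit_probs([v_pony, v_bike])

# Add a second bicycle that differs only in color (same systematic utility).
three_way = conditional_logit_probs([v_pony, v_bike, v_bike])

print("pony share, two alternatives:  ", two_way[0])
print("pony share, three alternatives:", three_way[0])
print("pony/bicycle odds:", two_way[0] / two_way[1], three_way[0] / three_way[1])
```

The odds of the pony against either bicycle stay at two to one, so the pony's share is forced down from two-thirds to one half, rather than staying at two-thirds as intuition about close substitutes would suggest.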
Second, in this specification, none of the $x_{ij}$'s can be constant across alternatives, because the corresponding $\beta$ would not be identified. This means that we cannot include individual-specific variables that do not vary across alternatives, such as race, sex, age, experience, income, etc. The latter type of data is more frequent in economics, see Schmidt and Strauss (1975). In this case the specification can be modified to allow for a differential impact of the explanatory variables upon the odds of choosing one alternative rather than the other:
$$\pi_{ij} = \Pr[y_i = j] = \frac{\exp(x'_{ij}\beta_j)}{\sum_{k=1}^{m} \exp(x'_{ik}\beta_k)} \qquad \text{for } j = 1, \ldots, m \tag{13.46}$$
where now the parameter vector is indexed by $j$. If the $x_{ij}$'s are the same for every $j$, then
$$\pi_{ij} = \Pr[y_i = j] = \frac{\exp(x'_{i}\beta_j)}{\sum_{k=1}^{m} \exp(x'_{i}\beta_k)} \qquad \text{for } j = 1, \ldots, m \tag{13.47}$$
This is the model used by Schmidt and Strauss (1975). A normalization would be to take $\beta_m = 0$, in which case we get the multinomial logit model
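$$\pi_{im} = \frac{1}{\sum_{k=1}^{m} \exp(x'_{i}\beta_k)} = \frac{1}{1 + \sum_{k=1}^{m-1} \exp(x'_{i}\beta_k)} \tag{13.48}$$
and
$$\pi_{ij} = \frac{\exp(x'_{i}\beta_j)}{1 + \sum_{k=1}^{m-1} \exp(x'_{i}\beta_k)} \qquad \text{for } j = 1, 2, \ldots, m-1.$$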
The likelihood function, score equations, Hessian and information matrices are given in Maddala (1983, pp. 36-37).
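The role of the normalization $\beta_m = 0$ can be seen numerically: subtracting the same vector from every $\beta_j$ leaves all the probabilities in (13.47) unchanged, so only differences relative to a base alternative are identified. The sketch below illustrates this, along with the log-likelihood implied by (13.41). The regressor values, coefficient vectors, and function names are hypothetical, and this is not the estimation routine of any particular package.

```python
import math


def mnl_probs(x, betas):
    """Multinomial logit probabilities exp(x'b_j) / sum_k exp(x'b_k) for one individual."""
    index = [sum(xk * bk for xk, bk in zip(x, b)) for b in betas]
    shift = max(index)  # numerical stability; the probabilities are unaffected
    weights = [math.exp(v - shift) for v in index]
    total = sum(weights)
    return [w / total for w in weights]


def normalize(betas):
    """Subtract the last alternative's coefficients so that beta_m = 0."""
    base = betas[-1]
    return [[bj - b0 for bj, b0 in zip(b, base)] for b in betas]


def log_likelihood(xs, ys, betas):
    """Sum over individuals of log pi_{i, y_i}; ys holds 0-based chosen alternatives."""
    return sum(math.log(mnl_probs(x, betas)[y]) for x, y in zip(xs, ys))


# Hypothetical values: a constant plus one individual-specific regressor, m = 3.
x = [1.0, 2.0]
betas = [[0.2, -0.1], [-0.3, 0.4], [0.5, 0.1]]

p_raw = mnl_probs(x, betas)
p_norm = mnl_probs(x, normalize(betas))
print(p_raw)
print(p_norm)
```

Both calls should return the same probabilities, and with the normalized coefficients the last entry equals $1/[1 + \sum_{k=1}^{m-1}\exp(x'_i\beta_k)]$ as in (13.48).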
Multinomial Logit Model. Terza (2002) reconsidered the Mullahy and Sindelar (1996) data set for problem drinking described in Section 13.9. However, Terza reclassified the dependent variable as follows: y = 1 when the individual is out of the labor force, y = 2 when the individual is unemployed, and y = 3 when the individual is employed. Table 13.14 reports a multinomial logit model, estimated using Stata, which replicates some of the results in Terza (2002, p. 399) for males. Although the health variables are still significant, the problem drinking variable is not. Problem 16 asks the reader to replicate these results for females.