Springer Texts in Business and Economics
Goodness of Fit Measures
There are problems with the use of conventional $R^2$-type measures when the explained variable $y$ takes on two values, see Maddala (1983, pp. 37-41). The predicted values $\hat{y}$ are probabilities and the actual values of $y$ are either 0 or 1, so the usual $R^2$ is likely to be very low. Also, if there is a constant in the model, the linear probability and logit models satisfy $\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \hat{y}_i$. However, the probit model does not necessarily satisfy this exact relationship, although it is approximately valid.
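The adding-up property of the fitted probabilities can be checked numerically. The following is a minimal sketch, assuming simulated data and a hand-written Newton-Raphson routine (`fit_logit` is a hypothetical helper, not taken from the text); it fits a linear probability model by OLS and a logit model by maximum likelihood, both with a constant, and compares the sum of the fitted values with the sum of the observed outcomes.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Newton-Raphson iterations for the logit MLE (illustrative helper).

    X must contain a column of ones so that the model has a constant.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))          # fitted probabilities
        score = X.T @ (y - p)                         # gradient of the log-likelihood
        info = (X * (p * (1.0 - p))[:, None]).T @ X   # negative Hessian
        beta = beta + np.linalg.solve(info, score)
    return beta

# Simulated data (an assumption made purely for illustration).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = (0.5 + x + rng.logistic(size=n) > 0).astype(float)

# Linear probability model: OLS residuals sum to zero when a constant is included.
b_lpm = np.linalg.lstsq(X, y, rcond=None)[0]
yhat_lpm = X @ b_lpm

# Logit: the first-order condition for the constant is sum(y - p) = 0.
b_logit = fit_logit(X, y)
yhat_logit = 1.0 / (1.0 + np.exp(-X @ b_logit))

print("sum of y:              ", y.sum())
print("sum of LPM fitted:     ", yhat_lpm.sum())
print("sum of logit fitted:   ", yhat_logit.sum())
```

In both cases the equality follows from the first-order condition associated with the constant term. The probit score weights each residual by a factor that varies across observations, which is why the same argument does not deliver an exact equality there.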
Several $R^2$-type measures have been suggested in the literature. Some of these are the following:
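(ii) Measures based on the residual sum of squares: Efron (1978) suggested using
$$R_2^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} = 1 - \frac{n\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n_1 n_2}$$
since $\sum_{i=1}^{n}(y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 = n_1 - n(n_1/n)^2 = n_1 n_2/n$, where $n_1 = \sum_{i=1}^{n} y_i$ and
$n_2 = n - n_1$.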
Amemiya (1981, p. 1504) suggests using $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2/[\hat{y}_i(1 - \hat{y}_i)]$ as the residual sum of squares. This weights each squared error by the inverse of its variance.
(iii) Measures based on likelihood ratios: $R_3^2 = 1 - (\ell_r/\ell_u)^{2/n}$, where $\ell_r$ is the restricted likelihood and $\ell_u$ is the unrestricted likelihood. In the standard linear regression model, this corresponds to the likelihood ratio test that all the slope coefficients are zero. For the limited dependent variable model, however, the likelihood function has a maximum of 1. This means that $\ell_r \le \ell_u \le 1$, or $\ell_r \le (\ell_r/\ell_u) \le 1$, or $\ell_r^{2/n} \le 1 - R_3^2 \le 1$, or $0 \le R_3^2 \le 1 - \ell_r^{2/n}$. Hence, Cragg and Uhler (1970) suggest a pseudo-$R^2$ that lies between 0 and 1, given by $R_4^2 = (\ell_u^{2/n} - \ell_r^{2/n})/[(1 - \ell_r^{2/n})\ell_u^{2/n}]$, which is $R_3^2$ divided by its upper bound. Another measure, suggested by McFadden (1974), is $R_5^2 = 1 - (\log \ell_u/\log \ell_r)$.
(iv) Proportion of correct predictions: After computing $\hat{y}$, one classifies the $i$-th observation as a success if $\hat{y}_i > 0.5$, and a failure if $\hat{y}_i < 0.5$. This measure is useful but may not have enough discriminatory power.
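The measures in (ii) to (iv) are simple functions of the observed outcomes, the fitted probabilities, and the two log-likelihoods. The following is a minimal sketch, assuming a small made-up sample; the function names and the fitted probabilities in `yhat` are hypothetical and are not estimates from an actual model. The restricted model is taken to be the constant-only model, whose fitted probability is the sample proportion of ones.

```python
import numpy as np

def efron_r2(y, yhat):
    """Efron's measure: 1 - n * sum((y - yhat)^2) / (n1 * n2)."""
    n = y.size
    n1 = y.sum()
    n2 = n - n1
    return 1.0 - n * np.sum((y - yhat) ** 2) / (n1 * n2)

def amemiya_weighted_rss(y, yhat):
    """Squared errors weighted by the inverse of the variance yhat * (1 - yhat)."""
    return np.sum((y - yhat) ** 2 / (yhat * (1.0 - yhat)))

def bernoulli_loglik(y, p):
    """Log-likelihood of binary outcomes y given success probabilities p."""
    return np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def lr_measures(loglik_u, loglik_r, n):
    """Likelihood-ratio-based measures, computed from log-likelihoods."""
    r3 = 1.0 - np.exp(2.0 * (loglik_r - loglik_u) / n)  # 1 - (l_r / l_u)^(2/n)
    r3_max = 1.0 - np.exp(2.0 * loglik_r / n)           # upper bound 1 - l_r^(2/n)
    r4 = r3 / r3_max                                    # Cragg-Uhler rescaling
    r5 = 1.0 - loglik_u / loglik_r                      # McFadden
    return r3, r4, r5

def proportion_correct(y, yhat, cutoff=0.5):
    """Share of observations classified correctly.

    A fitted value exactly equal to the cutoff is treated as a failure here;
    the text leaves that case unspecified.
    """
    return np.mean((yhat > cutoff) == (y == 1))

# Made-up outcomes and hypothetical fitted probabilities.
y = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0])
yhat = np.array([0.8, 0.3, 0.4, 0.9, 0.2, 0.4, 0.7, 0.1])
n = y.size

loglik_u = bernoulli_loglik(y, yhat)
loglik_r = bernoulli_loglik(y, np.full(n, y.mean()))  # constant-only model

r2_efron = efron_r2(y, yhat)
w_rss = amemiya_weighted_rss(y, yhat)
r3, r4, r5 = lr_measures(loglik_u, loglik_r, n)
hit_rate = proportion_correct(y, yhat)

print("Efron R2:", r2_efron)
print("Amemiya weighted RSS:", w_rss)
print("R3, R4, R5:", r3, r4, r5)
print("Proportion correct:", hit_rate)
```

Working with log-likelihoods rather than likelihoods avoids numerical underflow when $n$ is large, since the likelihood itself is a product of $n$ probabilities.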