A COMPANION TO Theoretical Econometrics
Other Parametric Count Regression Models
Various models that are less restrictive than Poisson are presented in this section.
First, overdispersion in count data may be due to unobserved heterogeneity. In this case counts are viewed as being generated by a Poisson process, but the researcher is unable to specify the rate parameter of this process correctly. Instead, the rate parameter is itself a random variable. This mixture approach, presented in Sections 3.1-3.2, leads to the widely used negative binomial model.
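The mixture argument can be checked numerically. The sketch below is illustrative only and assumes one common parameterization that is not spelled out in this overview: the count is Poisson with rate mu * nu, where nu is gamma distributed with mean 1 and variance alpha. Under that assumption the marginal distribution is negative binomial with mean mu and variance mu + alpha * mu^2. The names `negbin_pmf`, `moments`, `mu`, and `alpha` are hypothetical.

```python
import math


def negbin_pmf(y, mu, alpha):
    """Negative binomial pmf obtained by mixing Poisson(mu * nu) over
    nu ~ Gamma(shape=1/alpha, scale=alpha), so E[nu] = 1 and Var[nu] = alpha.
    (Assumed parameterization, chosen for illustration.)"""
    r = 1.0 / alpha
    log_p = (
        math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def moments(pmf, upper=500):
    """Mean and variance of a count distribution, truncating the sum at `upper`."""
    mean = sum(y * pmf(y) for y in range(upper))
    var = sum((y - mean) ** 2 * pmf(y) for y in range(upper))
    return mean, var


mu, alpha = 3.0, 0.5  # hypothetical values
mean, var = moments(lambda y: negbin_pmf(y, mu, alpha))
print(mean, var)  # variance exceeds the mean, unlike the Poisson
```

With these values the variance is mu + alpha * mu^2 = 7.5 against a mean of 3, so heterogeneity in the rate alone is enough to produce overdispersion.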
Second, overdispersion, and in some cases underdispersion, may arise because the process generating the first event may differ from that determining later events. For example, an initial doctor consultation may be solely a patient's choice, while subsequent visits are also determined by the doctor. This leads to the hurdle model, presented in Section 3.3.
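A small numerical sketch may help show why a two-part process can produce either overdispersion or underdispersion. It assumes, purely for illustration, that the zero/positive decision is governed by a Poisson with rate `lam_zero` and that positive counts follow a zero-truncated Poisson with rate `lam_pos`. These names and the Poisson choice for both parts are assumptions, not a specification taken from this section.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability of count y with rate lam (lam > 0)."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def hurdle_pmf(y, lam_zero, lam_pos):
    """Hurdle pmf: zeros come from the first process, positive counts
    from the second process truncated at zero and rescaled."""
    p0 = poisson_pmf(0, lam_zero)
    if y == 0:
        return p0
    scale = (1.0 - p0) / (1.0 - poisson_pmf(0, lam_pos))
    return scale * poisson_pmf(y, lam_pos)


def dispersion(lam_zero, lam_pos, upper=200):
    """Variance-to-mean ratio of the hurdle distribution (sum truncated at `upper`)."""
    probs = [hurdle_pmf(y, lam_zero, lam_pos) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return var / mean


print(dispersion(2.0, 2.0))  # both parts identical: collapses to the Poisson
print(dispersion(0.5, 2.0))  # more zeros than the second process implies
print(dispersion(3.0, 1.0))  # fewer zeros than the second process implies
```

When the two rates coincide the ratio is 1, as for the Poisson. In this sketch, excess zeros push the ratio above 1, while too few zeros push it below 1.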
Third, overdispersion in count data may be due to failure of the assumption of independence of events that is implicit in the Poisson process. One can introduce dependence so that, for example, the occurrence of one doctor visit makes subsequent doctor visits more likely. This approach has not been widely used in count data analysis. (In duration data analysis this is called true state dependence, to be contrasted with the first approach of unobserved heterogeneity.) Particular assumptions again lead to the negative binomial; see also Winkelmann (1995). A discrete choice model that progressively models Pr[y = j | y > j - 1] is presented in Section 3.4, and issues of dependence also arise in Section 5 on time series.
Fourth, one can refer to the extensive and rich literature on univariate iid count distributions, which offers intriguing possibilities such as the logarithmic series and hypergeometric distribution (Johnson, Kotz, and Kemp, 1992). New regression models can be developed by letting one or more parameters be a specified function of regressors. Such models are not presented here. The approach has less motivation than the first three approaches and the resulting models may not be any better.