Introduction to the Mathematical and Statistical Foundations of Econometrics
The Tobit Model
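Let $Z_j = (Y_j, X_j^T)^T$, $j = 1, \ldots, n$, be independent random vectors such that
$$Y_j = \max(Y_j^*, 0), \quad \text{where } Y_j^* = \alpha_0 + \beta_0^T X_j + U_j \text{ with } U_j \mid X_j \sim N(0, \sigma_0^2). \tag{8.16}$$
The random variables $Y_j^*$ are only observed if they are positive. Note that
$$P[Y_j = 0 \mid X_j] = P[\alpha_0 + \beta_0^T X_j + U_j \le 0 \mid X_j] = P[U_j > \alpha_0 + \beta_0^T X_j \mid X_j] = 1 - \Phi\big((\alpha_0 + \beta_0^T X_j)/\sigma_0\big),$$
where
$$\Phi(x) = \int_{-\infty}^{x} \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\, du.$$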
This is a Probit model. Because model (8.16) was proposed by Tobin (1958) and involves a Probit model for the case Yj = 0, it is called the Tobit model. For example, let the sample be a survey of households, where Yj is the amount of money household j spends on tobacco products and Xj is a vector of household characteristics. But there are households in which nobody smokes, and thus for these households Yj = 0.
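The Probit formula for the probability of a zero outcome can be checked by simulation. The following is a minimal sketch, assuming a scalar regressor and hypothetical parameter values (`ALPHA0`, `BETA0`, `SIGMA0`); the function names are illustrative and do not come from the text.

```python
import random
from statistics import NormalDist

# Hypothetical parameter values with a scalar regressor, for illustration only.
ALPHA0, BETA0, SIGMA0 = 0.5, 1.0, 2.0
STD_NORMAL = NormalDist()  # standard normal distribution; .cdf is Phi


def censoring_probability(x):
    """Probit formula: P[Y = 0 | X = x] = 1 - Phi((alpha0 + beta0*x)/sigma0)."""
    return 1.0 - STD_NORMAL.cdf((ALPHA0 + BETA0 * x) / SIGMA0)


def simulated_zero_share(x, n_draws=100_000, seed=0):
    """Fraction of simulated Y = max(Y*, 0) that equal zero at a fixed x."""
    rng = random.Random(seed)
    zeros = 0
    for _ in range(n_draws):
        y_star = ALPHA0 + BETA0 * x + rng.gauss(0.0, SIGMA0)
        y = max(y_star, 0.0)
        zeros += (y == 0.0)
    return zeros / n_draws


# Compare the formula with the simulated frequency at a few values of x.
for x in (-1.0, 0.0, 1.0):
    print(x, round(censoring_probability(x), 4), round(simulated_zero_share(x), 4))
```

The two printed columns should agree up to simulation error, which shrinks as `n_draws` grows.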
In this case the setup of the conditional likelihood function is not as straightforward as in the previous examples because the conditional distribution of Yj given Xj is neither absolutely continuous nor discrete. Therefore, in this case it is easier to derive the likelihood function indirectly from Definition 8.1 as follows.
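First note that the conditional distribution function of $Y_j$, given $X_j$ and $Y_j > 0$, is
$$
\begin{aligned}
P[Y_j \le y \mid X_j, Y_j > 0] &= \frac{P[0 < Y_j \le y \mid X_j]}{P[Y_j > 0 \mid X_j]} \\
&= \frac{P[-\alpha_0 - \beta_0^T X_j < U_j \le y - \alpha_0 - \beta_0^T X_j \mid X_j]}{P[Y_j > 0 \mid X_j]} \\
&= \frac{\Phi\big((y - \alpha_0 - \beta_0^T X_j)/\sigma_0\big) - \Phi\big((-\alpha_0 - \beta_0^T X_j)/\sigma_0\big)}{\Phi\big((\alpha_0 + \beta_0^T X_j)/\sigma_0\big)};
\end{aligned}
$$
hence, the conditional density function of $Y_j$, given $X_j$ and $Y_j > 0$, is
$$h(y \mid \theta_0, X_j, Y_j > 0) = \frac{\varphi\big((y - \alpha_0 - \beta_0^T X_j)/\sigma_0\big)}{\sigma_0\, \Phi\big((\alpha_0 + \beta_0^T X_j)/\sigma_0\big)}\, I(y > 0), \quad \text{where } \varphi(x) = \frac{\exp(-x^2/2)}{\sqrt{2\pi}}.$$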
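As a sanity check on the conditional density just derived, one can verify numerically that it integrates to one over the positive half-line. This is only a sketch under assumptions: a scalar regressor, hypothetical parameter values, and a simple midpoint rule with the upper tail cut off at twelve standard deviations.

```python
from statistics import NormalDist

# Hypothetical parameter values with a scalar regressor, for illustration only.
ALPHA0, BETA0, SIGMA0 = 0.5, 1.0, 2.0
STD_NORMAL = NormalDist()  # .pdf is the standard normal density, .cdf is Phi


def truncated_density(y, x):
    """Density of Y given X = x and Y > 0: normal density rescaled by P[Y > 0 | x]."""
    if y <= 0.0:
        return 0.0
    index = ALPHA0 + BETA0 * x
    return STD_NORMAL.pdf((y - index) / SIGMA0) / (
        SIGMA0 * STD_NORMAL.cdf(index / SIGMA0)
    )


def midpoint_integral(func, lower, upper, n_steps=20_000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    step = (upper - lower) / n_steps
    return step * sum(func(lower + (k + 0.5) * step) for k in range(n_steps))


def total_mass(x):
    """Integral of the density over (0, upper), with upper far in the right tail."""
    upper = max(ALPHA0 + BETA0 * x, 0.0) + 12.0 * SIGMA0
    return midpoint_integral(lambda y: truncated_density(y, x), 0.0, upper)


for x in (-2.0, 0.0, 2.0):
    print(x, total_mass(x))  # each value should be very close to 1
```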
Next, observe that, for any Borel-measurable function $g$ of $(Y_j, X_j)$ such that $E[|g(Y_j, X_j)|] < \infty$, we have
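$$
\begin{aligned}
E[g(Y_j, X_j) \mid X_j] &= g(0, X_j)\, P[Y_j = 0 \mid X_j] + E[g(Y_j, X_j)\, I(Y_j > 0) \mid X_j] \\
&= g(0, X_j)\, P[Y_j = 0 \mid X_j] + E\big( E[g(Y_j, X_j) \mid X_j, Y_j > 0]\, I(Y_j > 0) \,\big|\, X_j \big) \\
&= g(0, X_j) \big(1 - \Phi\big((\alpha_0 + \beta_0^T X_j)/\sigma_0\big)\big) \\
&\quad + E\Big( \int_0^\infty g(y, X_j)\, h(y \mid \theta_0, X_j, Y_j > 0)\, dy \cdot I(Y_j > 0) \,\Big|\, X_j \Big) \\
&= g(0, X_j) \big(1 - \Phi\big((\alpha_0 + \beta_0^T X_j)/\sigma_0\big)\big) \\
&\quad + \int_0^\infty g(y, X_j)\, h(y \mid \theta_0, X_j, Y_j > 0)\, dy \cdot \Phi\big((\alpha_0 + \beta_0^T X_j)/\sigma_0\big) \\
&= g(0, X_j) \big(1 - \Phi\big((\alpha_0 + \beta_0^T X_j)/\sigma_0\big)\big) \\
&\quad + \frac{1}{\sigma_0} \int_0^\infty g(y, X_j)\, \varphi\big((y - \alpha_0 - \beta_0^T X_j)/\sigma_0\big)\, dy.
\end{aligned} \tag{8.17}
$$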
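Hence, if we choose
$$g(Y_j, X_j) = \frac{\big(1 - \Phi((\alpha + \beta^T X_j)/\sigma)\big)\, I(Y_j = 0) + \sigma^{-1} \varphi\big((Y_j - \alpha - \beta^T X_j)/\sigma\big)\, I(Y_j > 0)}{\big(1 - \Phi((\alpha_0 + \beta_0^T X_j)/\sigma_0)\big)\, I(Y_j = 0) + \sigma_0^{-1} \varphi\big((Y_j - \alpha_0 - \beta_0^T X_j)/\sigma_0\big)\, I(Y_j > 0)}, \tag{8.18}$$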
it follows from (8.17) that
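$$
\begin{aligned}
E[g(Y_j, X_j) \mid X_j] &= 1 - \Phi\big((\alpha + \beta^T X_j)/\sigma\big) + \frac{1}{\sigma} \int_0^\infty \varphi\big((y - \alpha - \beta^T X_j)/\sigma\big)\, dy \\
&= 1 - \Phi\big((\alpha + \beta^T X_j)/\sigma\big) + 1 - \Phi\big((-\alpha - \beta^T X_j)/\sigma\big) = 1.
\end{aligned} \tag{8.19}
$$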
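The identity in (8.19) says that the ratio in (8.18) has conditional expectation one for every candidate parameter value. A Monte Carlo sketch can illustrate this; it assumes a scalar regressor, hypothetical true parameters `THETA0`, and an arbitrary alternative parameter, none of which come from the text.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
THETA0 = (0.5, 1.0, 2.0)  # hypothetical true (alpha0, beta0, sigma0)


def tobit_density(y, x, theta):
    """Numerator or denominator of (8.18): point mass at 0, scaled normal density for y > 0."""
    alpha, beta, sigma = theta
    index = alpha + beta * x
    if y == 0.0:
        return 1.0 - STD_NORMAL.cdf(index / sigma)
    return STD_NORMAL.pdf((y - index) / sigma) / sigma


def mean_ratio(theta, x, n_draws=100_000, seed=1):
    """Monte Carlo average of g(Y, x) in (8.18), with Y drawn under THETA0."""
    alpha0, beta0, sigma0 = THETA0
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        y = max(alpha0 + beta0 * x + rng.gauss(0.0, sigma0), 0.0)
        total += tobit_density(y, x, theta) / tobit_density(y, x, THETA0)
    return total / n_draws


# The average should be close to 1 even though theta differs from THETA0.
print(mean_ratio((0.3, 1.2, 1.8), x=0.5))
```

The alternative value of sigma is chosen smaller than the true one so that the ratio is bounded and the simulation average settles quickly; this is a convenience of the sketch, not a requirement of (8.19).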
In view of Definition 8.1, (8.18) and (8.19) suggest defining the conditional likelihood function of the Tobit model as
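$$\hat{L}_n(\theta) = \prod_{j=1}^{n} \Big[ \big(1 - \Phi((\alpha + \beta^T X_j)/\sigma)\big)\, I(Y_j = 0) + \sigma^{-1} \varphi\big((Y_j - \alpha - \beta^T X_j)/\sigma\big)\, I(Y_j > 0) \Big].$$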
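To make the likelihood concrete, the following sketch evaluates its logarithm for a scalar regressor. The simulated design (regressors drawn from a standard normal), the parameter values, and the function names are assumptions made for illustration. The term for zero outcomes uses `math.erfc`, since 1 - Φ(a) = erfc(a/√2)/2, which stays accurate far in the tail.

```python
import math
import random


def tobit_log_likelihood(theta, ys, xs):
    """Log of the Tobit likelihood for a scalar regressor; theta = (alpha, beta, sigma)."""
    alpha, beta, sigma = theta
    total = 0.0
    for y, x in zip(ys, xs):
        index = alpha + beta * x
        if y == 0.0:
            # Zero outcome: log(1 - Phi(index / sigma)), computed via erfc.
            total += math.log(0.5 * math.erfc(index / (sigma * math.sqrt(2.0))))
        else:
            # Positive outcome: log of sigma^{-1} * phi((y - index) / sigma).
            z = (y - index) / sigma
            total += -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma)
    return total


def simulate_tobit(theta0, n, seed=0):
    """Draw (Y_j, X_j) with X_j ~ N(0, 1); this design is an assumption."""
    alpha0, beta0, sigma0 = theta0
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [max(alpha0 + beta0 * x + rng.gauss(0.0, sigma0), 0.0) for x in xs]
    return ys, xs


theta0 = (0.5, 1.0, 2.0)  # hypothetical true parameters
ys, xs = simulate_tobit(theta0, n=2000)
print(tobit_log_likelihood(theta0, ys, xs))           # at the true parameters
print(tobit_log_likelihood((0.0, 0.5, 4.0), ys, xs))  # at a distant parameter value
```

In this simulated sample the log-likelihood is larger at the true parameters than at the distant value. An estimate would be obtained by maximizing this function numerically over (alpha, beta, sigma) with sigma > 0, which the sketch leaves out.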
The conditions (b) in Definition 8.1 now follow from (8.19) with the σ-algebras involved defined as in the regression case. Moreover, the conditions (c) also apply.
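Finally, note that
$$E[Y_j \mid X_j, Y_j > 0] = \alpha_0 + \beta_0^T X_j + \sigma_0\, \frac{\varphi\big((\alpha_0 + \beta_0^T X_j)/\sigma_0\big)}{\Phi\big((\alpha_0 + \beta_0^T X_j)/\sigma_0\big)}. \tag{8.20}$$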
Therefore, if one estimated a linear regression model using only the observations with Yj > 0, the OLS estimates would be inconsistent, owing to the last term in (8.20).
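The inconsistency of OLS on the positive observations can be illustrated with a small simulation. This is a sketch under assumptions: a scalar regressor drawn from a standard normal distribution, hypothetical parameter values, and a hand-written simple-regression slope (`ols_slope`). The regression on the latent variable is included only as a benchmark; it is infeasible in practice because the latent variable is not observed when it is negative.

```python
import random

ALPHA0, BETA0, SIGMA0 = 0.5, 1.0, 2.0  # hypothetical true parameters


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs (with an intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx


def simulate(n, seed=0):
    """Return regressors X_j ~ N(0, 1) (an assumed design) and latent outcomes Y_j*."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y_stars = [ALPHA0 + BETA0 * x + rng.gauss(0.0, SIGMA0) for x in xs]
    return xs, y_stars


xs, y_stars = simulate(50_000)
# Keep only the observations with a positive outcome, as in the text.
positive = [(x, y) for x, y in zip(xs, y_stars) if y > 0.0]
slope_latent = ols_slope(xs, y_stars)  # infeasible benchmark, close to BETA0
slope_positive = ols_slope([x for x, _ in positive], [y for _, y in positive])
print(slope_latent, slope_positive)
```

In this design the slope from the positive observations stays well below `BETA0` however large the sample, whereas the benchmark slope is close to `BETA0`.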