Introduction to the Mathematical and Statistical Foundations of Econometrics
Selecting a Test
The Wald, LR, and LM tests basically test the same null hypothesis against the same alternative, so which one should we use? The Wald test employs only the unrestricted ML estimator $\hat{\theta}$, and thus this test is the most convenient if we have to conduct unrestricted ML estimation anyway. The LM test is entirely based on the restricted ML estimator $\tilde{\theta}$, and there are situations in which we start with restricted ML estimation, in which restricted ML estimation is much easier to do than unrestricted ML estimation, or even in which unrestricted ML estimation is not feasible because, without the restriction imposed, the model is incompletely specified. Then the LM test is the most convenient test. Both the Wald and the LM tests require the estimation of the matrix $\bar{H}$. That may be a problem for complicated models because of the partial derivatives involved. In that case I recommend using the LR test.
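To make the trade-offs concrete, the following is a minimal Python sketch that computes all three statistics for one deliberately simple case. The model is an assumption chosen for illustration and does not come from the text: an i.i.d. Poisson sample with a scalar mean and the simple null hypothesis that the mean equals `mu0`, so there are no nuisance parameters. The function name `poisson_tests` is hypothetical. In this model the information per observation is 1/mu, which plays the role of the matrix that the Wald and LM tests have to estimate.

```python
import math


def poisson_tests(y, mu0):
    """Wald, LR, and LM statistics for H0: mu = mu0 in an i.i.d. Poisson(mu) sample.

    Illustrative assumption: scalar parameter, no nuisance parameters.
    Under H0 each statistic is asymptotically chi-square with 1 degree of freedom.
    """
    n = len(y)
    total = sum(y)
    mu_hat = total / n  # unrestricted ML estimator (the sample mean)
    if mu_hat <= 0 or mu0 <= 0:
        raise ValueError("need a positive sample mean and a positive mu0")

    def loglik(mu):
        # Log-likelihood without the term -sum(log(y_j!)), which does not
        # depend on mu and therefore cancels in the LR statistic.
        return total * math.log(mu) - n * mu

    # Wald: only the unrestricted estimator is used; information is 1/mu_hat.
    wald = n * (mu_hat - mu0) ** 2 / mu_hat

    # LR: needs both maximized log-likelihoods but no derivatives.
    lr = 2.0 * (loglik(mu_hat) - loglik(mu0))

    # LM: score and information are evaluated at the restricted value mu0 only.
    score = total / mu0 - n
    lm = score ** 2 / (n / mu0)

    return wald, lr, lm


wald, lr, lm = poisson_tests([2, 3, 1, 4, 2, 0, 3, 5], mu0=2.0)
print(wald, lr, lm)  # compare each with the 5% chi-square(1) critical value 3.841
```

For this small sample the three statistics are about 0.8, 0.926, and 1.0, so none of them rejects at the 5% level. They differ in finite samples even though they share the same limiting null distribution. Note that only the Wald and LM lines need an expression for the information, which is the practical point behind recommending the LR test when the derivatives are cumbersome.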
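Although I have derived the Wald, LR, and LM tests for the special case of a null hypothesis of the type $\theta_{2,0} = 0$, the results involved can be modified to general linear hypotheses of the form $R\theta_0 = q$, where $R$ is an $r \times m$ matrix of rank $r$, by reparametrizing the likelihood function as follows. Specify an $(m - r) \times m$ matrix $R_*$ such that the matrix
$$Q = \begin{pmatrix} R_* \\ R \end{pmatrix}$$
is nonsingular. Then define new parameters by
$$\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} R_*\theta \\ R\theta \end{pmatrix} - \begin{pmatrix} 0 \\ q \end{pmatrix} = Q\theta - \begin{pmatrix} 0 \\ q \end{pmatrix}.$$
If we substitute
$$\theta = Q^{-1}\beta + Q^{-1}\begin{pmatrix} 0 \\ q \end{pmatrix}$$
in the likelihood function, the null hypothesis involved is equivalent to $\beta_2 = 0$.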
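As a small worked illustration of this reparametrization, take $m = 3$, $r = 1$, and the hypothesis $\theta_1 + \theta_2 = 1$. The particular numbers and the choice of $R_*$ below are assumptions made only for this example. Any $R_*$ that makes the stacked matrix $Q$ (with $R_*$ on top of $R$) nonsingular would serve equally well.

```latex
% Hypothesis R\theta_0 = q with R = (1, 1, 0) and q = 1 (illustrative choice).
\[
R = \begin{pmatrix} 1 & 1 & 0 \end{pmatrix}, \qquad q = 1, \qquad
R_* = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
Q = \begin{pmatrix} R_* \\ R \end{pmatrix}
  = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix},
\qquad \det Q = -1 .
\]
% New parameters: the first m - r = 2 components are free, the last r = 1 is restricted.
\[
\beta = Q\theta - \begin{pmatrix} 0 \\ 0 \\ q \end{pmatrix}
      = \begin{pmatrix} \theta_1 \\ \theta_3 \\ \theta_1 + \theta_2 - 1 \end{pmatrix},
\qquad
\theta = Q^{-1}\left( \beta + \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right)
       = \begin{pmatrix} \beta_1 \\ \beta_3 + 1 - \beta_1 \\ \beta_2 \end{pmatrix}.
\]
% The restricted block is the last component, so the hypothesis becomes \beta_3 = 0.
\[
\theta_1 + \theta_2 = 1 \quad \Longleftrightarrow \quad \beta_3 = 0 .
\]
```

Here the last component of $\beta$ is the restricted block, so after substituting $\theta$ as a function of $\beta$ into the likelihood, the tests derived for a zero restriction on a subvector apply directly.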
Exercises
1. Derive $\hat{\theta} = \operatorname{argmax}_{\theta} L_n(\theta)$ for the case (8.11) and show that, if $Z_1, \ldots, Z_n$ is a random sample, then the ML estimator involved is consistent.
2. Derive $\hat{\theta} = \operatorname{argmax}_{\theta} L_n(\theta)$ for the case (8.13).
3. Show that the log-likelihood function of the Logit model is unimodal, that is, the matrix $\partial^2 \ln[L_n(\theta)]/(\partial\theta\,\partial\theta^T)$ is negative-definite for all $\theta$.
4. Prove (8.20).
5. Extend the proof of Theorem 8.2 to the multivariate parameter case.
6. Let $(Y_1, X_1), \ldots, (Y_n, X_n)$ be a random sample from a bivariate continuous distribution with conditional density
$f(y|x, \beta_0) = (x/\beta_0)\exp(-y \cdot x/\beta_0)$ if $x > 0$ and $y > 0$; $f(y|x, \beta_0) = 0$ elsewhere,
where $\beta_0 > 0$ is an unknown parameter. The marginal density $h(x)$ of $X_j$ is unknown, but we do know that $h$ does not depend on $\beta_0$ and $h(x) = 0$ for $x < 0$.
(a) Specify the conditional likelihood function $L_n^c(\beta)$.
(b) Derive the maximum likelihood estimator $\hat{\beta}$ of $\beta_0$.
(c) Show that $\hat{\beta}$ is unbiased.
(d) Show that the variance of $\hat{\beta}$ is equal to $\beta_0^2/n$.
(e) Verify that this variance is equal to the Cramer-Rao lower bound.
(f) Derive the test statistic of the LR test of the null hypothesis $\beta_0 = 1$ in the form for which it has an asymptotic $\chi^2_1$ null distribution.
(g) Derive the test statistic of the Wald test of the null hypothesis $\beta_0 = 1$.
(h) Derive the test statistic of the LM test of the null hypothesis $\beta_0 = 1$.
(i) Show that under the null hypothesis $\beta_0 = 1$ the LR test in part (f) has a limiting $\chi^2_1$ distribution.
7. Let $Z_1, \ldots, Z_n$ be a random sample from the (nonsingular) $N_k[\mu, \Sigma]$ distribution. Determine the maximum likelihood estimators of $\mu$ and $\Sigma$.
8. In the case in which the dependent variable $Y$ is a duration (e.g., an unemployment duration spell), the conditional distribution of $Y$ given a vector $X$ of explanatory variables is often modeled by the proportional hazard model
$$P[Y \le y \mid X = x] = 1 - \exp\left(-\varphi(x)\int_0^y \lambda(t)\,dt\right), \quad y > 0,$$
where $\lambda(t)$ is a positive function on $(0, \infty)$ such that $\int_0^\infty \lambda(t)\,dt = \infty$ and $\varphi$ is a positive function.
The reason for calling this model a proportional hazard model is the following. Let $f(y|x)$ be the conditional density of $Y$ given $X = x$, and let $G(y|x) = \exp\left(-\varphi(x)\int_0^y \lambda(t)\,dt\right)$, $y > 0$. The latter function is called the conditional survival function. Then $f(y|x)/G(y|x) = \varphi(x)\lambda(y)$ is called the hazard function because, for a small $\delta > 0$, $\delta f(y|x)/G(y|x)$ is approximately the conditional probability (hazard) that $Y \in (y, y + \delta]$ given that $Y > y$ and $X = x$.
Convenient specifications of $\lambda(t)$ and $\varphi(x)$ are
$\lambda(t) = \gamma t^{\gamma - 1}$, $\gamma > 0$ (Weibull specification),
$\varphi(x) = \exp(\alpha + \beta^T x)$.
Now consider a random sample of size $n$ of unemployed workers. Each unemployed worker $j$ is interviewed twice. The first time, worker $j$ tells the interviewer how long he or she has been unemployed and reveals his or her vector $X_j$ of characteristics. Call this time $Y_{1,j}$. A fixed period of length $T$ later the interviewer asks worker $j$ whether he or she is still (uninterruptedly) unemployed and, if not, how long it took during this period to find employment for the first time. Call this duration $Y_{2,j}$. In the latter case the observed unemployment duration is $Y_j = Y_{1,j} + Y_{2,j}$, but if the worker is still unemployed we only know that $Y_j > Y_{1,j} + T$. The latter is called censoring. On the assumption that the $X_j$'s do not change over time, set up the conditional likelihood function for this case, using the specifications (8.68) and (8.69).
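The hazard interpretation used in this exercise can be checked numerically before attempting the likelihood. The sketch below is only an illustration under stated assumptions: a scalar explanatory variable, the Weibull baseline with the exponential specification for the covariate function, and arbitrary parameter values (`alpha`, `beta`, `gamma`, `x`, `y`, `delta` are all made up for the example). It compares the exact conditional probability of leaving unemployment in a short interval with the hazard approximation. It does not set up the censored likelihood, which is left as the exercise.

```python
import math


def survival(y, phi, gamma):
    """Conditional survival function under the Weibull baseline.

    The integral of gamma * t**(gamma - 1) from 0 to y equals y**gamma,
    so G(y|x) = exp(-phi * y**gamma).
    """
    return math.exp(-phi * y ** gamma)


def hazard(y, phi, gamma):
    """Hazard: covariate factor phi times the Weibull baseline gamma * y**(gamma - 1)."""
    return phi * gamma * y ** (gamma - 1.0)


def conditional_exit_prob(y, delta, phi, gamma):
    """Exact P[Y in (y, y + delta] | Y > y, X = x] = 1 - G(y + delta|x) / G(y|x)."""
    return 1.0 - survival(y + delta, phi, gamma) / survival(y, phi, gamma)


# Illustrative values only (assumptions, not taken from the text).
alpha, beta, x = -1.0, 0.5, 2.0
phi = math.exp(alpha + beta * x)  # covariate factor exp(alpha + beta * x)
gamma, y, delta = 1.5, 2.0, 1e-4

exact = conditional_exit_prob(y, delta, phi, gamma)
approx = delta * hazard(y, phi, gamma)
print(exact, approx)  # the two numbers should nearly coincide for small delta
```

Shrinking `delta` makes the two numbers agree more closely, which is the sense in which the hazard is an instantaneous conditional exit rate. Changing `x` rescales the hazard by the same factor at every duration, which is the proportionality in the model's name.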