Definition of Best
Before we prove that the least squares estimator is best linear unbiased, we must define the term best. First we shall define it for scalar estimators, then for vector estimators.
Definition 1.2.1. Let $\hat{\theta}$ and $\theta^*$ be scalar estimators of a scalar parameter $\theta$. The estimator $\hat{\theta}$ is said to be at least as good as (or at least as efficient as) the estimator $\theta^*$ if $E(\hat{\theta} - \theta)^2 \le E(\theta^* - \theta)^2$ for all parameter values. The estimator $\hat{\theta}$ is said to be better (or more efficient) than the estimator $\theta^*$ if $\hat{\theta}$ is at least as good as $\theta^*$ and $E(\hat{\theta} - \theta)^2 < E(\theta^* - \theta)^2$ for at least one parameter value. An estimator is said to be best (or efficient) in a class if it is better than any other estimator in the class.
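As a concrete illustration of this definition, the following small simulation sketch compares the estimated mean squared errors of the sample mean and of a single observation as estimators of a population mean (the normal population, the two estimators, and all numbers are illustrative choices, not part of the definition):

    # Sketch: compare mean squared errors of two scalar estimators of theta.
    # The population, estimators, and sample size are illustrative choices.
    import numpy as np

    rng = np.random.default_rng(0)
    theta, sigma, n, reps = 1.0, 2.0, 20, 100_000

    y = rng.normal(theta, sigma, size=(reps, n))
    theta_hat = y.mean(axis=1)   # sample mean of each simulated sample
    theta_star = y[:, 0]         # a single observation used as an estimator

    mse_hat = np.mean((theta_hat - theta) ** 2)    # close to sigma**2 / n
    mse_star = np.mean((theta_star - theta) ** 2)  # close to sigma**2
    print(mse_hat, mse_star)

Under these assumptions the sample mean has the smaller mean squared error for every value of $\theta$, so it is better in the sense of Definition 1.2.1.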
The mean squared error is a reasonable criterion in many situations and is mathematically convenient. So, following the convention of the statistical literature, we have defined “better” to mean “having a smaller mean squared error.” However, there may be situations in which a researcher wishes to use other criteria, such as the mean absolute error.
Definition 1.2.2. Let $\hat{\theta}$ and $\theta^*$ be estimators of a vector parameter $\theta$. Let $A$ and $B$ be their respective mean squared error matrices; that is, $A = E(\hat{\theta} - \theta)(\hat{\theta} - \theta)'$ and $B = E(\theta^* - \theta)(\theta^* - \theta)'$. Then we say $\hat{\theta}$ is better (or more efficient) than $\theta^*$ if
$c'(B - A)c \ge 0$ for every vector $c$ and every parameter value (1.2.19)

and

$c'(B - A)c > 0$ for at least one value of $c$ and at least one value of the parameter. (1.2.20)
This definition of better clearly coincides with Definition 1.2.1 if $\theta$ is a scalar. In view of Definition 1.2.1, equivalent forms of statements (1.2.19) and (1.2.20) are statements (1.2.21) and (1.2.22):
$c'\hat{\theta}$ is at least as good as $c'\theta^*$ for every vector $c$ (1.2.21)

and

$c'\hat{\theta}$ is better than $c'\theta^*$ for at least one value of $c$. (1.2.22)
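The equivalence of the two forms rests on a one-line identity for the mean squared error of a linear combination:

$$E(c'\hat{\theta} - c'\theta)^2 = E\bigl[c'(\hat{\theta} - \theta)(\hat{\theta} - \theta)'c\bigr] = c'Ac, \qquad E(c'\theta^* - c'\theta)^2 = c'Bc,$$

so $c'\hat{\theta}$ is at least as good as $c'\theta^*$ exactly when $c'(B - A)c \ge 0$.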
Using Theorem 4 of Appendix 1, they also can be written as
$B \ge A$ for every parameter value (1.2.23)

and

$B \ne A$ for at least one parameter value. (1.2.24)
(Note that $B \ge A$ means $B - A$ is nonnegative definite and $B > A$ means $B - A$ is positive definite.)
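Statements (1.2.23) and (1.2.24) are also easy to check numerically: it suffices to test whether $B - A$ is nonnegative definite and nonzero. A minimal sketch (the two matrices are illustrative, chosen only to show the mechanics):

    # Sketch: check B >= A (B - A nonnegative definite) and B != A numerically.
    # A and B are illustrative mean squared error matrices.
    import numpy as np

    A = np.array([[1.0, 0.0],
                  [0.0, 1.0]])
    B = np.array([[2.0, 0.5],
                  [0.5, 1.5]])

    diff = B - A
    eigvals = np.linalg.eigvalsh(diff)         # eigenvalues of the symmetric difference
    B_geq_A = bool(np.all(eigvals >= -1e-12))  # nonnegative definite up to rounding
    B_neq_A = not np.allclose(diff, 0.0)
    print(eigvals, B_geq_A, B_neq_A)           # both True: the first estimator is better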
We shall now prove the equivalence of (1.2.20) and (1.2.24). Because the phrase “for at least one parameter value” is common to both statements, we shall ignore it in the following proof. First, suppose (1.2.24) is not true. Then $B = A$. Therefore $c'(B - A)c = 0$ for every $c$, a condition that implies that (1.2.20) is not true. Second, suppose (1.2.20) is not true. Then, given (1.2.19), $c'(B - A)c = 0$ for every $c$, and every diagonal element of $B - A$ must be 0 (choose $c$ to be the zero vector, except for 1 in the $i$th position). Also, the $i,j$th element of $B - A$ is 0 (choose $c$ to be the zero vector, except for 1 in the $i$th and $j$th positions, and note that $B - A$ is symmetric). Thus $B = A$, so (1.2.24) is not true. This completes the proof.
Note that replacing $B \ne A$ in (1.2.24) with $B > A$ (or making the corresponding change in (1.2.20) or (1.2.22)) is unwise, because we could not then rank one estimator above another when the difference of their mean squared error matrices is nonzero and nonnegative definite but not positive definite.
A problem with Definition 1.2.2 (more precisely, a problem inherent in the comparison of vector estimates rather than in this definition) is that often it does not allow us to say one estimator is either better or worse than the other. For example, consider
$A = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}$. (1.2.25)
Clearly, neither $A \ge B$ nor $B \ge A$. In such a case one might compare the trace and conclude that $\hat{\theta}$ is better than $\theta^*$ because $\operatorname{tr} A < \operatorname{tr} B$. Another example is
A"[l 2] and B“[o 2} <1126)
Again, neither $A \ge B$ nor $B \ge A$. If one were using the determinant as the criterion, one would prefer $\hat{\theta}$ over $\theta^*$ because $\det A < \det B$.
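Both comparisons can be verified directly. A brief sketch (the helper ordered() is an ad hoc function, not a library routine; the matrices are those displayed in (1.2.25) and (1.2.26)):

    # Sketch: neither ordering holds for (1.2.25) or (1.2.26); the trace ranks
    # the first pair and the determinant ranks the second pair.
    import numpy as np

    def ordered(B, A, tol=1e-12):
        """Return True if B - A is nonnegative definite, i.e. B >= A."""
        return bool(np.all(np.linalg.eigvalsh(B - A) >= -tol))

    A1, B1 = np.diag([2.0, 1.0]), np.diag([1.0, 3.0])   # matrices in (1.2.25)
    A2 = np.array([[1.0, 1.0],
                   [1.0, 2.0]])                          # matrices in (1.2.26)
    B2 = np.array([[1.0, 0.0],
                   [0.0, 2.0]])

    print(ordered(A1, B1), ordered(B1, A1))      # False False: no ranking
    print(np.trace(A1), np.trace(B1))            # 3.0 < 4.0: trace favors A1
    print(ordered(A2, B2), ordered(B2, A2))      # False False: no ranking
    print(np.linalg.det(A2), np.linalg.det(B2))  # about 1 < 2: det favors A2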
Note that $B \ge A$ implies both $\operatorname{tr} B \ge \operatorname{tr} A$ and $\det B \ge \det A$. The first follows from Theorem 7 and the second from Theorem 11 of Appendix 1. As these two examples show, neither $\operatorname{tr} B \ge \operatorname{tr} A$ nor $\det B \ge \det A$ implies $B \ge A$.
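The trace part of this implication can also be seen directly: if $B - A$ is nonnegative definite, then, with $e_i$ denoting the $i$th unit vector,

$$\operatorname{tr} B - \operatorname{tr} A = \operatorname{tr}(B - A) = \sum_i e_i'(B - A)e_i \ge 0.$$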
Use of the trace as a criterion has an obvious intuitive appeal, inasmuch as it is the sum of the individual variances. Justification for the use of the determinant involves more complicated reasoning. Suppose $\hat{\theta} \sim N(\theta, V)$, where $V$ is the variance-covariance matrix of $\hat{\theta}$. Then, by Theorem 1 of Appendix 2, $(\hat{\theta} - \theta)'V^{-1}(\hat{\theta} - \theta) \sim \chi^2_K$, the chi-square distribution with $K$ degrees of freedom, $K$ being the number of elements of $\theta$. Therefore the $(1 - \alpha)\%$ confidence ellipsoid for $\theta$ is defined by
$\{\theta \mid (\hat{\theta} - \theta)'V^{-1}(\hat{\theta} - \theta) < \chi^2_K(\alpha)\}$, (1.2.27)

where $\chi^2_K(\alpha)$ is the number such that $P[\chi^2_K \ge \chi^2_K(\alpha)] = \alpha$. Then the volume of the ellipsoid (1.2.27) is proportional to the square root of the determinant of $V$, as shown by Anderson (1958, p. 170).
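A numerical sketch of this justification follows (the matrix $V$, the level $\alpha$, and the use of SciPy's chi-square quantile are illustrative; the volume formula used is the standard one for the ellipsoid $\{x : x'V^{-1}x \le c\}$ in $K$ dimensions, $\pi^{K/2} c^{K/2} |V|^{1/2} / \Gamma(K/2 + 1)$, which is increasing in $\det V$):

    # Sketch: volume of the confidence ellipsoid (1.2.27) for an illustrative V.
    # A smaller det(V) gives a smaller ellipsoid at the same confidence level,
    # which is the motivation for the determinant criterion.
    import numpy as np
    from math import gamma, pi
    from scipy.stats import chi2

    alpha, K = 0.05, 2
    V = np.array([[1.0, 0.3],
                  [0.3, 0.5]])        # illustrative covariance matrix
    c = chi2.ppf(1 - alpha, df=K)     # chi-square cutoff chi2_K(alpha)

    volume = pi ** (K / 2) / gamma(K / 2 + 1) * c ** (K / 2) * np.sqrt(np.linalg.det(V))
    print(c, np.linalg.det(V), volume)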
A more intuitive justification for the determinant criterion is possible for the case in which $\theta$ is a two-dimensional vector. Let the mean squared error matrix of an estimator $\hat{\theta} = (\hat{\theta}_1, \hat{\theta}_2)'$ be

$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$.
Suppose that $\theta_2$ is known; define another estimator of $\theta_1$ by $\tilde{\theta}_1 = \hat{\theta}_1 + \alpha(\hat{\theta}_2 - \theta_2)$. Its mean squared error is $a_{11} + \alpha^2 a_{22} + 2\alpha a_{12}$ and attains the minimum value of $a_{11} - (a_{12}^2/a_{22})$ when $\alpha = -a_{12}/a_{22}$. The larger $a_{12}$ is in absolute value, the more efficient can be the estimation of $\theta_1$. Because a larger $a_{12}$ implies a smaller determinant, the preceding reasoning provides a justification for the determinant criterion.
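The minimization behind this argument is elementary. Setting the derivative of the mean squared error with respect to $\alpha$ to zero,

$$\frac{d}{d\alpha}\bigl(a_{11} + \alpha^2 a_{22} + 2\alpha a_{12}\bigr) = 2\alpha a_{22} + 2a_{12} = 0 \quad\Longrightarrow\quad \alpha = -\frac{a_{12}}{a_{22}},$$

and substituting back gives the minimum value $a_{11} - a_{12}^2/a_{22} = (a_{11}a_{22} - a_{12}^2)/a_{22}$, the determinant of the mean squared error matrix divided by $a_{22}$. For a given $a_{22}$, therefore, a smaller determinant means a smaller attainable mean squared error for $\theta_1$.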
Despite these justifications, both criteria inevitably suffer from a certain degree of arbitrariness and therefore should be used with caution. Another useful scalar criterion based on the predictive efficiency of an estimator will be proposed in Section 1.6.