Other Developments
5.1 Unequal observations and missing data
Extending the standard SUR model to allow for an unequal number of observations in different equations causes some problems for estimation of the disturbance covariance matrix. (Problems associated with sample selection bias are avoided by assuming that data are missing at random.) If there are at least T0 < T observations available for all equations, then the key issue is how to utilize the "extra" observations in the estimation of the disturbance covariance matrix. Monte Carlo comparisons between alternative estimators led Schmidt (1977) to conclude that estimators utilizing less information did not necessarily perform poorly relative to those using more of the sample information. Baltagi, Garvin, and Kerman (1989) provide an extensive Monte Carlo evaluation of several alternative covariance matrix estimators and the associated FGLS estimators of β, attempting to shed further light on the conclusions of Schmidt (1977). They conclude that while the use of extra observations may lead to better estimates of Σ and Σ⁻¹, this does not necessarily translate into better estimates of β. Baltagi et al. (1989) considered both Σ and Σ⁻¹ because Hwang (1990) noted that the mathematical form of the alternative estimators of Σ gave a misleading impression of their respective information content; this was clarified by considering the associated estimators of Σ⁻¹. Hwang (1990) also proposes a modification of the Telser (1964) estimator, which performs well when the contemporaneous correlation is high.
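To fix ideas, the following sketch contrasts two of the simpler ways of estimating Σ in a hypothetical two-equation system in which the second equation has m extra observations: one discards the extra observations altogether, the other uses all available observations for the variances, while the covariance can only be computed from the overlapping sample. The data-generating process, parameter values, and variable names are illustrative assumptions only; the papers cited compare a number of more refined variants.

```python
# Sketch: two simple estimators of the disturbance covariance matrix Sigma
# when equation 2 has m "extra" observations (setup and names hypothetical).
import numpy as np

rng = np.random.default_rng(0)
T, m, k = 50, 20, 3                      # overlapping obs, extra obs, regressors

X1 = rng.normal(size=(T, k))             # regressors, equation 1 (T obs)
X2 = rng.normal(size=(T + m, k))         # regressors, equation 2 (T + m obs)
Sigma = np.array([[1.0, 0.8], [0.8, 1.5]])
u = rng.multivariate_normal([0.0, 0.0], Sigma, size=T + m)
y1 = X1 @ np.ones(k) + u[:T, 0]
y2 = X2 @ np.ones(k) + u[:, 1]

# Equation-by-equation OLS residuals.
e1 = y1 - X1 @ np.linalg.lstsq(X1, y1, rcond=None)[0]    # length T
e2 = y2 - X2 @ np.linalg.lstsq(X2, y2, rcond=None)[0]    # length T + m

# (a) "Restricted" estimator: discard the extra observations entirely.
E = np.column_stack([e1, e2[:T]])
S_restricted = E.T @ E / T

# (b) Use all available observations for the variances; the covariance term
#     can only be formed from the T overlapping observations.
S_extra = np.array([[e1 @ e1 / T,     e1 @ e2[:T] / T],
                    [e1 @ e2[:T] / T, e2 @ e2 / (T + m)]])

print("restricted estimate of Sigma:\n", S_restricted)
print("estimate using the extra observations:\n", S_extra)
```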
When there are unequal numbers of observations in the equations, concern over the use of the "extra" observations arises because of two types of restrictions: (i) imposing equality of variances across groups defined by complete and incomplete observations; and (ii) the need to maintain a positive definite estimate of the covariance matrix. In the groupwise heteroskedasticity model employed by Bartels et al. (1996) the groupings correspond to the divisions between complete and incomplete observations. Provided that there are no across-group restrictions on the parameters of the variance-covariance matrices, positive definite covariance matrix estimates can readily be obtained by applying standard SUR estimation to each group of data separately.
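A minimal sketch of the groupwise arrangement, under the same kind of hypothetical setup, is given below: a covariance matrix is estimated separately for the "complete" group (both equations observed) and the "incomplete" group (only the second equation observed), and the two are stacked into a block-diagonal weighting matrix that is positive definite whenever each block is. All names and the simulated data are assumptions for illustration, not the estimator of Bartels et al. (1996).

```python
# Sketch: groupwise (block-diagonal) covariance estimate that is positive
# definite by construction; data-generating process and names are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
T, m, k = 50, 20, 3
X1 = rng.normal(size=(T, k))
X2 = rng.normal(size=(T + m, k))
u = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.5]], size=T + m)
y1 = X1 @ np.ones(k) + u[:T, 0]
y2 = X2 @ np.ones(k) + u[:, 1]

e1 = y1 - X1 @ np.linalg.lstsq(X1, y1, rcond=None)[0]
e2 = y2 - X2 @ np.linalg.lstsq(X2, y2, rcond=None)[0]

# Group A: the T observations with both equations observed (2 x 2 block).
EA = np.column_stack([e1, e2[:T]])
Sigma_A = EA.T @ EA / T                  # PD whenever EA has full column rank

# Group B: the m extra observations, where only equation 2 is observed (scalar).
sigma_B = e2[T:] @ e2[T:] / m

# Block-diagonal covariance for the stacked disturbances [u1; u2(first T); u2(extra m)];
# no across-group restrictions are imposed, so the whole matrix inherits the
# positive definiteness of its blocks.
Omega = np.block([
    [np.kron(Sigma_A, np.eye(T)), np.zeros((2 * T, m))],
    [np.zeros((m, 2 * T)),        sigma_B * np.eye(m)],
])
assert np.all(np.linalg.eigvalsh(Omega) > 0)
```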
Consider a two-equation SUR system of the form
$$y_i = X_i \beta_i + u_i, \qquad i = 1, 2, \tag{5.12}$$

where y1 and u1 are T-dimensional vectors, X1 is T × k, the βi are k-dimensional vectors, and

$$y_2 = \begin{bmatrix} y_{21} \\ y_{2e} \end{bmatrix}, \qquad X_2 = \begin{bmatrix} X_1 \\ X_e \end{bmatrix}, \qquad u_2 = \begin{bmatrix} u_{21} \\ u_{2e} \end{bmatrix},$$
where the e subscript denotes m extra observations that are available for the second equation. If m = 0 there is no gain from joint estimation, because the system reduces to a basic SUR model with each equation containing an equal number of observations and common regressors. Conniffe (1985) and Im (1994) demonstrate that this conclusion no longer holds when there are unequal numbers of observations, because y2e and Xe are available. OLS applied to the second equation is the best linear unbiased estimator (BLUE), as one would expect, but joint estimation delivers a more efficient estimator of β1 than OLS applied to the first equation.
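A small Monte Carlo along the following lines illustrates the point. It uses the true Σ in the GLS step to keep the sketch short (an FGLS implementation would estimate it first), and the design, parameter values, and names are all illustrative assumptions rather than anything taken from Conniffe (1985) or Im (1994).

```python
# Monte Carlo sketch of the efficiency gain for beta_1 from joint (GLS)
# estimation when equation 2 has m extra observations.
import numpy as np

rng = np.random.default_rng(42)
T, m, k, R = 50, 25, 2, 2000
beta1, beta2 = np.array([1.0, -1.0]), np.array([0.5, 2.0])
s11, s22, rho = 1.0, 1.0, 0.9
s12 = rho * np.sqrt(s11 * s22)

X1 = rng.normal(size=(T, k))                 # common regressors, first T periods
Xe = rng.normal(size=(m, k))                 # regressors for the m extra periods
Z = np.block([[X1, np.zeros((T, k))],
              [np.zeros((T, k)), X1],
              [np.zeros((m, k)), Xe]])       # stacked design: [eq1; eq2 (T); eq2 (extra)]

I_T, I_m = np.eye(T), np.eye(m)
Omega = np.block([[s11 * I_T, s12 * I_T, np.zeros((T, m))],
                  [s12 * I_T, s22 * I_T, np.zeros((T, m))],
                  [np.zeros((m, 2 * T)), s22 * I_m]])
W = np.linalg.inv(Omega)
M = np.linalg.solve(Z.T @ W @ Z, Z.T @ W)    # maps y to the GLS coefficient vector

ols_b1, gls_b1 = np.empty((R, k)), np.empty((R, k))
for r in range(R):
    u = rng.multivariate_normal([0.0, 0.0], [[s11, s12], [s12, s22]], size=T)
    u2e = rng.normal(scale=np.sqrt(s22), size=m)
    y1 = X1 @ beta1 + u[:, 0]
    y2 = np.concatenate([X1 @ beta2 + u[:, 1], Xe @ beta2 + u2e])
    y = np.concatenate([y1, y2])
    ols_b1[r] = np.linalg.lstsq(X1, y1, rcond=None)[0]
    gls_b1[r] = (M @ y)[:k]

print("variance of OLS estimates of beta_1:", ols_b1.var(axis=0))
print("variance of GLS estimates of beta_1:", gls_b1.var(axis=0))  # smaller when m > 0
```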
Suppose that y2 and X are fully observed but y1 is not. Instead, realizations of a dummy variable D are available, where D = 1 if y1 > 0 and D = 0 otherwise. Under an assumption of bivariate normality, a natural approach is to estimate the first equation by probit and the second by OLS. Chesher (1984) showed that joint estimation can deliver more efficient estimates than the "single-equation" probit but, again as one would expect, there is no gain for the other equation. What if these two situations are combined? Conniffe (1997) examines this case, where the system comprises a probit and a regression equation but more observations are available for the probit equation. In this case the estimates for both equations can be improved upon by joint estimation.
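The sketch below shows one way such a joint estimator can be set up: for observations with both variables, the likelihood factors into the marginal density of y2 and a conditional probit term, while the extra observations contribute only a marginal probit term. It is a minimal illustration under assumed names, data, and optimizer settings, not the estimators proposed by Chesher (1984) or Conniffe (1997).

```python
# Sketch: joint maximum likelihood for a probit equation paired with a
# regression equation under bivariate normality, allowing m extra
# observations for the probit equation (all details hypothetical).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(7)
T, m, k = 300, 150, 2
X = rng.normal(size=(T + m, k))
beta1, beta2, sigma, rho = np.array([0.5, -0.5]), np.array([1.0, 2.0]), 1.0, 0.6
cov = [[1.0, rho * sigma], [rho * sigma, sigma ** 2]]
u = rng.multivariate_normal([0.0, 0.0], cov, size=T + m)
D = (X @ beta1 + u[:, 0] > 0).astype(float)    # observed for all T + m observations
y2 = X[:T] @ beta2 + u[:T, 1]                  # observed for the first T observations only

def negloglik(theta):
    b1, b2 = theta[:k], theta[k:2 * k]
    sig, r = np.exp(theta[2 * k]), np.tanh(theta[2 * k + 1])
    # Complete observations: factor the joint density as f(y2) * P(D | y2).
    e2 = (y2 - X[:T] @ b2) / sig
    a = (X[:T] @ b1 + r * e2) / np.sqrt(1.0 - r ** 2)
    ll = norm.logpdf(e2).sum() - T * np.log(sig)
    ll += (D[:T] * norm.logcdf(a) + (1 - D[:T]) * norm.logcdf(-a)).sum()
    # Extra observations: only the marginal probit contribution is available.
    b = X[T:] @ b1
    ll += (D[T:] * norm.logcdf(b) + (1 - D[T:]) * norm.logcdf(-b)).sum()
    return -ll

theta0 = np.zeros(2 * k + 2)
fit = minimize(negloglik, theta0, method="BFGS")
print("joint ML beta_1:", fit.x[:k], " beta_2:", fit.x[k:2 * k],
      " rho:", np.tanh(fit.x[2 * k + 1]))
```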
Meng and Rubin (1996) were also concerned with SUR models containing latent variables. They use an extension of the expectation maximization (EM) algorithm, the expectation conditional maximization (ECM) algorithm, to treat estimation and inference in SUR models when latent variables are present or when observations are missing because of nonresponse. One application of this work is to seemingly unrelated tobit models; see Hwang, Sloan, and Adamache (1987) for an example.
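To indicate the shape of such an algorithm, the sketch below runs an EM/ECM-style iteration for a two-equation SUR model in which some y1 values are missing at random: the E-step fills in the missing responses with their conditional means, one conditional maximization step updates the coefficients by GLS holding Σ fixed, and a second updates Σ holding the coefficients fixed. It is only a stylized illustration under assumed data and names, not the ECM algorithm as presented by Meng and Rubin (1996).

```python
# Sketch of an EM/ECM-style iteration for a two-equation SUR model in which
# some y1 values are missing at random (all details illustrative assumptions).
import numpy as np

rng = np.random.default_rng(3)
T, k = 200, 2
X = rng.normal(size=(T, k))
beta1, beta2 = np.array([1.0, -1.0]), np.array([0.5, 2.0])
Sigma = np.array([[1.0, 0.7], [0.7, 1.2]])
U = rng.multivariate_normal([0.0, 0.0], Sigma, size=T)
y1 = X @ beta1 + U[:, 0]
y2 = X @ beta2 + U[:, 1]
miss = rng.random(T) < 0.3                       # indices where y1 is unobserved

b1 = np.linalg.lstsq(X[~miss], y1[~miss], rcond=None)[0]   # starting values
b2 = np.linalg.lstsq(X, y2, rcond=None)[0]
S = np.eye(2)

for _ in range(200):
    # E-step: replace missing y1 by its conditional mean given y2 at the
    # current parameters; keep the conditional variance for the Sigma update.
    cond_mean = X[miss] @ b1 + S[0, 1] / S[1, 1] * (y2[miss] - X[miss] @ b2)
    cond_var = S[0, 0] - S[0, 1] ** 2 / S[1, 1]
    y1_fill = y1.copy()
    y1_fill[miss] = cond_mean

    # CM-step 1: update the betas by GLS on the completed data, holding S fixed.
    W = np.linalg.inv(S)
    Z = np.block([[X, np.zeros_like(X)], [np.zeros_like(X), X]])
    Omega_inv = np.kron(W, np.eye(T))
    y_stack = np.concatenate([y1_fill, y2])
    b = np.linalg.solve(Z.T @ Omega_inv @ Z, Z.T @ Omega_inv @ y_stack)
    b1, b2 = b[:k], b[k:]

    # CM-step 2: update Sigma from expected squared residuals, holding betas fixed.
    e1, e2 = y1_fill - X @ b1, y2 - X @ b2
    S_new = np.array([[e1 @ e1 + miss.sum() * cond_var, e1 @ e2],
                      [e1 @ e2, e2 @ e2]]) / T
    if np.max(np.abs(S_new - S)) < 1e-8:
        S = S_new
        break
    S = S_new

print("beta_1:", b1, " beta_2:", b2, "\nSigma:\n", S)
```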