A COMPANION TO Theoretical Econometrics

# Stochastic Specification

Many of the recent developments in estimation of the parameters of the SUR model have been motivated by the need to allow for more general stochastic specifications. These developments are driven in part by the usual diagnostic procedures of practitioners, but also by the strong theoretical arguments that have been presented for behavioral models of consumers and producers. Chavas and Segerson (1987) and Brown and Walker (1989, 1995) argue that behavioral models should include a stochastic component as an integral part of the model. When this is done, however, each equation in the resultant input share system or system of demand equations will typically exhibit heteroskedasticity. The basic SUR specification allows for heteroskedasticity across but not within equations and thus will be inappropriate for these systems.

Heteroskedastic SUR specifications that have been discussed include the share equations system of Chavas and Segerson (1987); the SUR random coefficients model of Fiebig et al. (1991); and the groupwise heteroskedasticity model described in Bartels et al. (1996). Each of these examples is a member of what Bartels and Fiebig (1992) called generalized SUR (GSUR).

Recall that the covariance matrix of the basic SUR model is $\Omega = \Sigma \otimes I_T$. GSUR allows for a covariance matrix with the following structure:

$$\Omega = \Omega(\theta, \Sigma) = R(\Sigma \otimes I_T)R', \tag{5.7}$$

where $R = R(\theta)$ is block-diagonal, non-singular, and the parameters $\theta$ and $\Sigma$ are assumed to be separable. With this specification, GLS estimation can be viewed as proceeding in two stages: the first stage involves transforming the GSUR model on an equation-by-equation basis so that the classical SUR structure appears as the transformed model; and in the second stage the usual SUR estimator is applied to the transformed model.
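The two-stage view can be checked numerically. The following is a minimal sketch, assuming the blocks of R and the contemporaneous covariance matrix are known (so it is GLS, not a feasible estimator); the function names, the diagonal form of each block of R, and the simulated data are all hypothetical choices made for illustration. It compares transform-then-SUR with GLS applied directly to the covariance structure in (5.7).

```python
import numpy as np

def block_diag(blocks):
    """Place 2-D blocks along the diagonal of a zero matrix."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r, c = r + b.shape[0], c + b.shape[1]
    return out

def gsur_two_stage(y_list, X_list, R_list, Sigma):
    """GLS for known R_i and Sigma: transform each equation, then apply SUR."""
    T = y_list[0].shape[0]
    # Stage 1: premultiply equation i by R_i^{-1}.
    y_star = np.concatenate([np.linalg.solve(R, y) for R, y in zip(R_list, y_list)])
    X_star = block_diag([np.linalg.solve(R, X) for R, X in zip(R_list, X_list)])
    # Stage 2: usual SUR GLS with weight matrix Sigma^{-1} kron I_T.
    W = np.kron(np.linalg.inv(Sigma), np.eye(T))
    return np.linalg.solve(X_star.T @ W @ X_star, X_star.T @ W @ y_star)

def direct_gls(y_list, X_list, R_list, Sigma):
    """GLS using Omega = R (Sigma kron I_T) R' without any transformation."""
    T = y_list[0].shape[0]
    y, X, R = np.concatenate(y_list), block_diag(X_list), block_diag(R_list)
    Omega = R @ np.kron(Sigma, np.eye(T)) @ R.T
    Omega_inv_X = np.linalg.solve(Omega, X)
    return np.linalg.solve(X.T @ Omega_inv_X, Omega_inv_X.T @ y)

def demo(seed=0, T=40):
    """Hypothetical two-equation system; the equality is algebraic, so any y works."""
    rng = np.random.default_rng(seed)
    Sigma = np.array([[1.0, 0.6], [0.6, 2.0]])
    X_list = [np.column_stack([np.ones(T), rng.normal(size=(T, 2))]) for _ in range(2)]
    # Assumed form of R_i: diagonal, holding observation-specific scale factors.
    R_list = [np.diag(rng.uniform(0.5, 2.0, size=T)) for _ in range(2)]
    y_list = [rng.normal(size=T) for _ in range(2)]
    return (gsur_two_stage(y_list, X_list, R_list, Sigma),
            direct_gls(y_list, X_list, R_list, Sigma))

b_two_stage, b_direct = demo()
```

The two coefficient vectors agree up to rounding error, which is the sense in which GLS "proceeds in two stages" under this structure. In practice both sets of covariance parameters would have to be estimated, one in each stage.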

The distinguishing feature of the GSUR class of models is the convenience of estimating $\theta$ and $\Sigma$ in separate stages with both stages involving familiar estimation techniques. Bollerslev (1990) also notes the computational convenience of this type of simplification for an SUR model that allows for time-varying conditional variances and covariances. In this case estimation by maximum likelihood is proposed.

While computational convenience is important, this attractive feature of GSUR must be weighed against the potentially restrictive covariance structure shown in (5.7). While any symmetric, non-singular matrix $\Omega$ can be written as $R(\Sigma \otimes I_T)R'$, the matrix $R$ will in general depend on $\Sigma$ as well as $\theta$ and/or $R$ may not be block diagonal. Ultimately, the validity of the assumption regarding the structure of the covariance matrix remains an empirical matter.
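One way to see that the factorization alone carries no restriction is the following construction, offered as an illustration rather than taken from the source. Let $L$ denote the Cholesky factor of a positive definite covariance matrix $\Omega$ of dimension $NT \times NT$. Then

$$\Omega = LL' = L(I_N \otimes I_T)L',$$

so the form in (5.7) holds with $\Sigma = I_N$ and $R = L$. This $R$ is lower triangular rather than block diagonal whenever $\Omega$ has non-zero cross-equation blocks, and its elements depend on every parameter of $\Omega$. The content of the GSUR assumption therefore lies in block-diagonality and separability, not in the factorization.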

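Mandy and Martins-Filho (1993) extended the basic SUR specification by assuming a contemporaneous covariance matrix that varies across observations within an equation. If we collect the $N$ disturbances associated with the $t$th time period into the vector $u_{(t)} = (u_{1t}, \ldots, u_{Nt})'$, they assume

$$E(u_{(t)}u_{(t)}') = \Omega_t = \begin{bmatrix} \sigma^t_{11} & \sigma^t_{12} & \cdots & \sigma^t_{1N} \\ \sigma^t_{21} & \sigma^t_{22} & \cdots & \sigma^t_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma^t_{N1} & \sigma^t_{N2} & \cdots & \sigma^t_{NN} \end{bmatrix}. \tag{5.8}$$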

Specifically, they consider an SUR model subject to additive heteroskedasticity of the form

$$\sigma^t_{ij} = \alpha_{ij}' z^t_{ij}, \tag{5.9}$$

where $\alpha_{ij}$ is a vector of unknown parameters and $z^t_{ij}$ a conformable vector of explanatory variables.

The framework of Mandy and Martins-Filho (1993) is less restrictive than GSUR but in general does not have the simplified estimation solutions of GSUR. Instead they develop an FGLS procedure that represents an SUR generalization of Amemiya's (1977) efficient estimator for the parameters of the covariance matrix in a single-equation additive heteroskedastic model. A practical problem with this class of estimators is the appearance of estimated covariance matrices which are not positive definite. Such problems should disappear with large enough samples but for any particular data set there are no guarantees. With some specifications, such as SUR with groupwise heteroskedasticity, the structure ensures that nonnegative variance estimates are obtained. Alternatively, for their SUR random coefficients model, Fiebig et al. (1991) employ an estimation procedure that automatically ensures the same result.
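The positive-definiteness problem is easy to reproduce. The sketch below builds the period-specific covariance matrix implied by the additive form (5.9) and flags periods in which it fails to be positive definite. It is an illustration under simplifying assumptions: the same explanatory vector is used for every pair of equations, the parameter values are invented, and the helper names are hypothetical. It does not implement the FGLS procedure of Mandy and Martins-Filho (1993).

```python
import numpy as np

def omega_t(alpha, z_t):
    """Return the N x N matrix whose (i, j) element is alpha[i, j] @ z_t.

    alpha has shape (N, N, K) and is symmetric in its first two indices;
    z_t has shape (K,). A common z_t for all (i, j) is an assumption.
    """
    return alpha @ z_t

def is_positive_definite(M, tol=1e-12):
    """Check positive definiteness through the smallest eigenvalue."""
    return bool(np.linalg.eigvalsh(M).min() > tol)

def flag_bad_periods(alpha, Z):
    """List the periods t whose implied covariance matrix is not positive definite."""
    return [t for t, z_t in enumerate(Z)
            if not is_positive_definite(omega_t(alpha, z_t))]

# Hypothetical estimates for N = 2 equations and K = 2 variance regressors.
alpha = np.array([[[1.0, 0.5], [0.2, 0.0]],
                  [[0.2, 0.0], [1.0, 0.5]]])
# Rows are z_t = (1, x_t) for three periods; the last x_t is far from the others.
Z = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [1.0, -3.0]])

bad = flag_bad_periods(alpha, Z)
```

Because the variances are linear in the explanatory variables, nothing prevents a fitted variance from turning negative at an extreme value of a regressor, as happens in the third period here.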

When the observations within an equation have a grouped structure it is reasonable to consider a random effects specification for each equation of the SUR system. Avery (1977) and Baltagi (1980) undertook the initial work on SUR models with error component disturbances. An extension allowing for the error components to be heteroskedastic has recently been proposed by Wan, Griffiths, and Anderson (1992) in their study of rice, maize, and wheat production in 28 regions of China over the period 1980-83. In another application, Kumbhakar and Heshmati (1996) consider a system comprising a cost equation and several cost share equations estimated using 26 annual observations for several Swedish manufacturing industries. For the cost equation, but not the share equations, they also specify disturbances comprising heteroskedastic error components. Estimation proceeds along the same lines as discussed in the context of GSUR. The cost equation is first transformed, and the usual SUR estimation can proceed for the share equations and the transformed cost equation.

As has been apparent from the discussion, a major use of SUR is in the estimation of systems of consumer or factor demands. Often these are specified in share form which brings with it special problems. In particular the shares should be restricted to lie between zero and one and should sum to unity. By far the most common approach to the specification of the stochastic component in such models is to append an additive error term to the deterministic component of the model that is obtained from economic theory. Typically the error term is assumed to be multivariate normal. Even if the deterministic component respects the constraint of lying between zero and one, it is clear that assuming normality for the stochastic component means that the modeled share can potentially violate the constraint. One approach is to choose a more appropriate distribution. Woodland (1979), who chose the Dirichlet distribution, seems to be the only example of this approach. An alternative approach advocated by Fry, Fry, and McLaren (1996) is to append a multivariate normal error after performing a logratio transformation of the observed shares.
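The effect of a logratio transformation can be sketched as follows. The code uses the additive logratio with the last share as reference category, which is an assumption; the exact transformation used by Fry, Fry, and McLaren (1996) may differ in detail, and the share values and function names are hypothetical.

```python
import numpy as np

def alr(shares):
    """Additive logratio: log of each share relative to the last share."""
    shares = np.asarray(shares, dtype=float)
    return np.log(shares[..., :-1] / shares[..., -1:])

def alr_inverse(z):
    """Map any real (N-1)-vector back to N shares in (0, 1) that sum to one."""
    z = np.asarray(z, dtype=float)
    expz = np.exp(z)
    denom = 1.0 + expz.sum(axis=-1, keepdims=True)
    return np.concatenate([expz / denom, 1.0 / denom], axis=-1)

rng = np.random.default_rng(0)
observed = np.array([0.5, 0.3, 0.2])          # hypothetical observed shares

# A normal error added on the logratio scale, then mapped back to shares.
noisy = alr_inverse(alr(observed) + rng.normal(scale=2.0, size=2))

# By contrast, a normal error added directly to the shares can leave the unit interval.
additive = observed + np.array([-0.6, 0.4, 0.2])
```

However large the normal disturbance on the logratio scale, the implied shares stay strictly between zero and one and sum to unity, whereas the additive disturbance in the last line produces a negative share.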

A second property of share equations, that of adding up, implies that additive errors sum identically to zero across equations. The induced singularity of the covariance matrix is typically accommodated in estimation by deleting one of the equations. Conditions under which the resultant estimators are invariant to the equation that is dropped have long been known; see Bewley (1986) for a summary of these early contributions. More recently, McLaren (1990) and Dhrymes (1994) have provided alternative treatments which, they argue, provide a more transparent demonstration of the conditions for invariance. Dhrymes (1994) is the more general discussion as it allows for the added complication of autocorrelated errors.
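The singularity induced by adding up can be seen in a small numerical sketch. The shares below are hypothetical Dirichlet draws rather than output from an economic model, and every equation is fitted by OLS on the same regressors including an intercept; both choices are assumptions made only to keep the illustration short.

```python
import numpy as np

def ols_residuals(Y, X):
    """Residuals from regressing each column of Y on the common regressors X."""
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return Y - X @ coef

rng = np.random.default_rng(1)
T, N = 60, 3
X = np.column_stack([np.ones(T), rng.normal(size=T)])   # intercept plus one regressor
Y = rng.dirichlet([4.0, 3.0, 2.0], size=T)               # each row of shares sums to one

E = ols_residuals(Y, X)            # T x N matrix of residuals
S = E.T @ E / T                    # estimated contemporaneous covariance matrix
eigvals = np.linalg.eigvalsh(S)    # ascending order

# Deleting one equation (here the last) leaves a non-singular (N-1) x (N-1) block.
S_dropped = S[:-1, :-1]
```

The residuals sum to zero in every period, so the smallest eigenvalue of `S` is zero up to rounding error, while `S_dropped` can be inverted for use in SUR estimation of the remaining equations.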

Singularity of the disturbance covariance matrix places strong restrictions on any autocorrelation structure that is specified. If we consider the $N$-vector of disturbances associated with the $t$th time period, then adding up implies $\iota'u_{(t)} = 0$, where $\iota$ is a vector of ones. If a first-order autoregressive process is assumed, i.e.

$$u_{(t)} = Au_{(t-1)} + \varepsilon_{(t)}, \tag{5.10}$$

then adding up requires that $\iota'\varepsilon_{(t)} = 0$ and $\iota'A = k\iota'$, where $k$ is a scalar constant, implying that the columns of $A$ sum to the same constant. If $A$ is specified to be diagonal, all equations will need to have the same autocorrelation parameter. At the other extreme, $A$ is a full matrix with $(N-1)^2$ identifiable parameters. A series of contributions by Moschini and Moro (1994), McLaren (1996) and Holt (1998) have suggested alternative specifications involving $(N-1)$ parameters, which represent a compromise between the very restrictive one-parameter specification and the computationally demanding full specification.
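The column-sum restriction can be illustrated by simulation. In the sketch below, innovations are forced to sum to zero by demeaning, the process is started at zero, and the three autoregressive matrices are hypothetical examples; the function names are likewise invented for illustration.

```python
import numpy as np

def column_sums_equal(A, tol=1e-10):
    """Check that every column of A sums to the same constant k."""
    sums = A.sum(axis=0)
    return bool(np.all(np.abs(sums - sums[0]) < tol))

def simulate_sums(A, T=20, seed=0):
    """Simulate the first-order process and return the cross-equation sum of u for each t."""
    rng = np.random.default_rng(seed)
    N = A.shape[0]
    u = np.zeros(N)                 # starting value satisfies adding up trivially
    sums = []
    for _ in range(T):
        e = rng.normal(size=N)
        e -= e.mean()               # innovations sum to zero across equations
        u = A @ u + e
        sums.append(u.sum())
    return np.array(sums)

A_equal = 0.4 * np.eye(3)                      # diagonal, one common parameter
A_unequal = np.diag([0.5, 0.2, 0.1])           # diagonal, equation-specific parameters
A_full = np.array([[0.3, 0.1, -0.1],           # full matrix, every column sums to 0.4
                   [0.0, 0.2, 0.2],
                   [0.1, 0.1, 0.3]])

sums_equal = simulate_sums(A_equal)
sums_unequal = simulate_sums(A_unequal)
sums_full = simulate_sums(A_full)
```

With `A_equal` or `A_full` the simulated disturbances continue to sum to zero in every period, whereas the diagonal matrix with unequal parameters breaks adding up from the second period onward.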

It is not unusual for estimated systems of static equations to exhibit serially correlated residuals. The advice commonly followed is to re-estimate assuming an autoregressive error structure. For example, Judge et al. (1985, p. 497) conclude that: "as a general recommendation for a researcher estimating a set of equations, we suggest that possible contemporaneous correlation should always be allowed for and, if the number of observations is sufficient, some kind of autocorrelation process could also be assumed."

In the single-equation context there has been a movement away from the simple-to-general modeling approach, which involves correcting for autocorrelation in the errors of static regression models. For example, Anderson and Blundell (1982, p. 1560) note that: "in the context of a single equation model, it has been argued that an autoregressive error specification whilst being a convenient simplification, when appropriate, may be merely accommodating a dynamic structure in the model which could be better represented by a general unrestricted dynamic formulation." Mizon (1995) is even more forceful as evidenced by his paper's title: "A simple message for autocorrelation correctors: Don't." Such advice has largely gone unheeded in the SUR literature where there has been little work on dynamic SUR models. Support for this contention is provided by Moschini and Moro (1994) who report that they found 24 papers published over the period 1988-92 that estimated singular systems with autocorrelated disturbances and that out of these only three did so as special cases of the general dynamic model of Anderson and Blundell (1982).

Deschamps (1998) is one exception where a general dynamic model has been used. He proposes and illustrates a methodology for estimating long-run demand relationships by maximum likelihood. Unlike previous work such as Anderson and Blundell (1982), Deschamps formulates the full likelihood function that is not conditional on the first observations of the dependent variables.

One other contribution that recognizes the need for more work in the area of dynamic systems is that of Kiviet, Phillips, and Schipp (1995). They employ asymptotic expansions to compare the biases of alternative estimators of an SUR model comprised of dynamic regression equations.

As we have previously noted, the GLS estimator of a basic SUR model reduces to OLS when the design matrices are the same in each equation. Baltagi (1980), Bartels and Fiebig (1992) and Mandy and Martins-Filho (1993) all mention that this well-known result needs modification when dealing with more general stochastic structures. The fact that each equation contains an identical set of explanatory variables is not a sufficient condition for joint GLS to collapse to OLS performed on each equation separately. The two-stage estimation process of GSUR highlights the intuition. The first stage involves transforming the GSUR model on an equation-by-equation basis so that the classical SUR structure appears as the transformed model. Even if the original explanatory variables were the same in each equation, the explanatory variables to be used in the second stage are generally not identical after being transformed. A related question is under what conditions joint GLS collapses to GLS performed on each equation separately. Bartels and Fiebig (1992) provide necessary and sufficient conditions for the first-stage GLS estimates to be fully efficient. Lee (1995) provides some specific examples where this occurs in the estimation of singular systems with autoregressive errors.
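The intuition can be checked numerically. The sketch below is an illustration under assumptions: the covariance parameters are treated as known, the first-stage transformation is a division by equation-specific scale factors, and all names and data are hypothetical. It compares joint GLS with equation-by-equation estimation, first for a basic SUR model with identical regressors and then after a GSUR-style transformation.

```python
import numpy as np

def block_diag(blocks):
    """Place 2-D blocks along the diagonal of a zero matrix."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r, c = r + b.shape[0], c + b.shape[1]
    return out

def sur_gls(y_list, X_list, Sigma):
    """Joint GLS for a basic SUR system with known Sigma."""
    T = y_list[0].shape[0]
    y, X = np.concatenate(y_list), block_diag(X_list)
    W = np.kron(np.linalg.inv(Sigma), np.eye(T))
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

def equation_by_equation_ols(y_list, X_list):
    """Least squares applied to each equation separately, coefficients stacked."""
    return np.concatenate([np.linalg.lstsq(X, y, rcond=None)[0]
                           for X, y in zip(X_list, y_list)])

rng = np.random.default_rng(2)
T = 50
beta = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 0.7], [0.7, 1.5]])
X = np.column_stack([np.ones(T), rng.normal(size=T)])          # same regressors in both equations
errors = rng.multivariate_normal(np.zeros(2), Sigma, size=T)   # T x 2, contemporaneously correlated
scales = [rng.uniform(0.5, 2.0, size=T) for _ in range(2)]     # assumed known, differ by equation

# Basic SUR with identical regressors: joint GLS equals OLS.
y_basic = [X @ beta + errors[:, i] for i in range(2)]
gls_basic = sur_gls(y_basic, [X, X], Sigma)
ols_basic = equation_by_equation_ols(y_basic, [X, X])

# Heteroskedastic system: the first-stage transformation makes the regressors differ.
y_het = [X @ beta + scales[i] * errors[:, i] for i in range(2)]
y_star = [y / s for y, s in zip(y_het, scales)]
X_star = [X / s[:, None] for s in scales]
gls_star = sur_gls(y_star, X_star, Sigma)          # joint GLS
single_star = equation_by_equation_ols(y_star, X_star)   # GLS equation by equation
```

In the first comparison the two sets of estimates coincide. In the second they do not, because dividing by equation-specific scale factors leaves each equation with its own transformed regressors, even though the original explanatory variables were identical.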