A COMPANION TO Theoretical Econometrics
Bayesian random effects model
The Bayesian fixed effects model described above might initially appeal to researchers who do not want to make distributional assumptions about the inefficiency distribution. However, as we have shown above, this model implicitly makes strong and possibly unreasonable prior assumptions. Furthermore, we can only calculate relative, as opposed to absolute, efficiencies. For these reasons, it is desirable to develop a model which makes an explicit distributional assumption for the inefficiencies. With such a model, absolute efficiencies can be calculated in the spirit of the cross-sectional stochastic frontier model of Section 2, since the distribution assumed for the $z_i$s allows us to separately identify $z_i$ and $\beta_0$. In addition, the resulting prior efficiency distributions will typically be more in line with our prior beliefs. Another important issue is the sensitivity of the posterior results on efficiency to the prior specification chosen. Since $T$ is usually quite small, it makes sense to "borrow strength" from the observations for the other firms by linking the inefficiencies. Due to Assumption 3 in subsection 2.2, this is not done through the sampling model. Thus, Koop et al. (1997) define the difference between Bayesian fixed and random effects models through the prior for $z_i$. In particular, what matters are the prior links that are assumed between the $z_i$s. Fixed effects models assume, a priori, that the $z_i$s are fully separated.
Random effects models introduce links between the $z_i$s, typically by assuming they are all drawn from distributions that share some common unknown parameter(s). In Bayesian language, the random effects model then implies a hierarchical prior for the individual effects.
Formally, we define a Bayesian random effects model by combining (24.18) with the prior:
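$$p(\beta_0, \tilde{\beta}, h, z, \lambda^{-1}) \propto h^{-1}\, p(\tilde{\beta})\, f_G(\lambda^{-1} \mid 1, -\ln(\tau^*)) \prod_{i=1}^{N} f_G(z_i \mid 1, \lambda^{-1}). \tag{24.27}$$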
That is, we assume noninformative priors for $h$ and $\beta_0$, whereas the inefficiencies are again assumed to be drawn from the exponential distribution with mean $\lambda$. Note that the $z_i$s are now linked through this common parameter $\lambda$, for which we choose the same prior as in Section 2.
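To make the hierarchical link concrete, the following minimal sketch simulates efficiencies $\exp(-z_i)$ from the prior in (24.27). It assumes that the second argument of $f_G$ is a rate, which is what makes $f_G(z_i \mid 1, \lambda^{-1})$ exponential with mean $\lambda$. The function name and the value 0.875 for $\tau^*$ are illustrative choices, not taken from the text. Integrating out $\lambda^{-1}$ gives $P(\exp(-z_i) \le t) = b/(b - \ln t)$ with $b = -\ln(\tau^*)$, so $\tau^*$ is the prior median efficiency, and the simulation can be used to check this.

```python
import numpy as np

def simulate_prior_efficiencies(n_firms, tau_star, n_draws, seed=0):
    """Draw efficiencies exp(-z_i) from the hierarchical prior in (24.27).

    Assumption: f_G(. | a, b) is a gamma density with shape a and rate b.
    """
    rng = np.random.default_rng(seed)
    rate = -np.log(tau_star)
    # lambda^{-1} ~ Gamma(shape 1, rate -ln(tau*)); NumPy uses scale = 1 / rate.
    lam_inv = rng.gamma(shape=1.0, scale=1.0 / rate, size=n_draws)
    # Given lambda, every z_i is exponential with mean lambda = 1 / lam_inv,
    # so all firms in one draw share the same lambda.
    z = rng.exponential(scale=1.0 / lam_inv[:, None], size=(n_draws, n_firms))
    return np.exp(-z)

if __name__ == "__main__":
    eff = simulate_prior_efficiencies(n_firms=5, tau_star=0.875, n_draws=200_000)
    print("prior median efficiency of firm 1:", np.median(eff[:, 0]))
    print("prior correlation, firms 1 and 2:", np.corrcoef(eff[:, 0], eff[:, 1])[0, 1])
```

The positive prior correlation between firms is the "borrowing of strength" at work: it arises only because the $z_i$s share $\lambda$, and it would be zero under a prior that fully separates them.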
Bayesian analysis of this model proceeds along similar lines to the cross-sectional stochastic frontier model presented in Section 2. In particular, a Gibbs sampler with data augmentation can be set up. Defining $\beta = (\beta_0 \;\; \tilde{\beta}')'$ and $X = (\iota_{TN} : x)$, the posterior conditional for the measurement error precision can be written as:
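$$p(h \mid y, x, z, \beta, \lambda^{-1}) = f_G\!\left(h \,\Big|\, \frac{TN}{2},\; \frac{1}{2}\,[y - X\beta + (I_N \otimes \iota_T)z]'[y - X\beta + (I_N \otimes \iota_T)z]\right). \tag{24.28}$$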
Next we obtain:
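$$p(\beta \mid y, x, z, h, \lambda^{-1}) \propto f_N^{k+1}\big(\beta \mid \bar{\beta},\, h^{-1}(X'X)^{-1}\big)\, p(\tilde{\beta}), \tag{24.29}$$
where
$$\bar{\beta} = (X'X)^{-1} X'\,[y + (I_N \otimes \iota_T)z].$$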
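The posterior conditional for the inefficiencies takes the form:
$$p(z \mid y, x, \beta, h, \lambda^{-1}) \propto f_N^{N}\big(z \mid (\iota_N : \bar{x})\beta - \bar{y} - (Th)^{-1}\lambda^{-1}\iota_N,\; (Th)^{-1} I_N\big) \prod_{i=1}^{N} I(z_i \ge 0), \tag{24.30}$$
where $\bar{y} = (\bar{y}_1 \ldots \bar{y}_N)'$ and $\bar{x} = (\bar{x}_1' \ldots \bar{x}_N')'$ collect the time averages of $y$ and $x$ for each firm.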
Furthermore, the posterior conditional for $\lambda^{-1}$, $p(\lambda^{-1} \mid y, x, z, \beta, h)$, is the same as for the cross-sectional case (i.e. equation (24.10)).
Using these results, Bayesian inference can be carried out using a Gibbs sampling algorithm based on (24.10), (24.28), (24.29), and (24.30). Although the formulas look somewhat complicated, it is worth stressing that all conditionals are either gamma or truncated normal.
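A minimal sketch of such a Gibbs sampler is given below. It rests on several assumptions that are not from the text: the prior $p(\tilde{\beta})$ is taken to be flat, the data are stacked firm by firm with $T$ consecutive rows per firm, and the draw for $\lambda^{-1}$ uses the gamma form implied by the prior in (24.27) (shape $N + 1$, rate $\sum_i z_i - \ln \tau^*$), which is assumed to coincide with (24.10). The function names and the default $\tau^* = 0.875$ are illustrative.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()

def draw_truncnorm_pos(mean, sd, rng):
    """One draw from N(mean, sd^2) truncated to [0, inf), by inverse CDF."""
    a = -mean / sd                          # standardized lower bound
    tail = max(_STD.cdf(-a), 1e-300)        # P(standard normal >= a)
    p = min(max(rng.random() * tail, 1e-300), 1.0 - 1e-16)
    # By symmetry, -inv_cdf(p) is a standard normal draw restricted to [a, inf).
    return max(0.0, mean - sd * _STD.inv_cdf(p))

def gibbs_random_effects(y, x, T, tau_star=0.875, n_iter=2000, burn=500, seed=0):
    """Posterior mean efficiencies exp(-z_i); assumes a flat prior on the slopes."""
    rng = np.random.default_rng(seed)
    TN, k = x.shape
    N = TN // T
    X = np.column_stack([np.ones(TN), x])            # X = (iota_TN : x)
    XtX_inv = np.linalg.inv(X.T @ X)
    chol = np.linalg.cholesky(XtX_inv)
    ybar = y.reshape(N, T).mean(axis=1)              # firm averages of y
    Xbar = X.reshape(N, T, k + 1).mean(axis=1)       # rows (1, xbar_i)
    beta = XtX_inv @ X.T @ y                         # starting values
    z = np.full(N, 0.1)
    lam_inv = 1.0
    kept = []
    for it in range(n_iter):
        # h | rest: gamma with shape TN/2 and rate resid'resid / 2
        resid = y - X @ beta + np.repeat(z, T)       # (I_N kron iota_T) z
        h = rng.gamma(TN / 2.0, 2.0 / (resid @ resid))
        # beta | rest: normal, mean beta_bar, covariance (X'X)^{-1} / h
        beta_bar = XtX_inv @ X.T @ (y + np.repeat(z, T))
        beta = beta_bar + chol @ rng.standard_normal(k + 1) / np.sqrt(h)
        # z_i | rest: independent normals truncated to z_i >= 0
        mean_z = Xbar @ beta - ybar - lam_inv / (T * h)
        sd_z = 1.0 / np.sqrt(T * h)
        z = np.array([draw_truncnorm_pos(m, sd_z, rng) for m in mean_z])
        # lambda^{-1} | rest: gamma, shape N + 1, rate sum(z) - ln(tau*)
        lam_inv = rng.gamma(N + 1.0, 1.0 / (z.sum() - np.log(tau_star)))
        if it >= burn:
            kept.append(np.exp(-z))
    return np.mean(kept, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    N, T = 30, 5                                     # simulated panel
    x = rng.normal(size=(N * T, 1))
    z_true = rng.exponential(0.2, size=N)
    y = 1.0 + 0.5 * x[:, 0] - np.repeat(z_true, T) + rng.normal(scale=0.1, size=N * T)
    print(gibbs_random_effects(y, x, T, n_iter=600, burn=200).round(3))
```

A non-flat $p(\tilde{\beta})$, for instance one that imposes economic regularity conditions, would require an extra accept/reject step on the draw of $\beta$ that is omitted here.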
Extending the random effects stochastic frontier model to allow for a nonlinear production function or explanatory variables in the efficiency distribution can easily be done in a similar fashion as for the cross-sectional model (see subsection 2.3 and Koop et al., 1997). Furthermore, different efficiency distributions can be allowed for in a straightforward manner and multiple outputs can be handled as in Fernandez et al. (2000). Here we concentrate on extending the model in a different direction. In particular, we free up the assumption that each firm's efficiency, $\tau_i$, is constant over time. Let us use the definitions of $X$ and $\beta$ introduced in the previous subsection and write the stochastic frontier model with panel data as:
$$y = X\beta - \gamma + v, \tag{24.31}$$
where $\gamma$ is a $TN \times 1$ vector containing inefficiencies for each individual observation and $y$ and $v$ are defined as in subsection 3.1. In practice, we may want to put some structure on $\gamma$ and, thus, Fernandez et al. (1997) propose to rewrite it in terms of an $M$-dimensional vector $u$ as:
$$\gamma = Du, \tag{24.32}$$
where $M \le TN$ and $D$ is a known $TN \times M$ matrix. Above, we implicitly assumed $D = I_N \otimes \iota_T$, which implies $M = N$ and $u_i = \gamma_{it} = z_i$. That is, firm-specific inefficiency was constant over time. However, a myriad of other possibilities exist. For instance, $D$ can correspond to cases where clusters of firms or time periods share common efficiencies, or parametric time dependence exists in firm-specific efficiency. Also note the case $D = I_{TN}$, which allows each firm in each period to have a different inefficiency (i.e. $\gamma = u$). We are then effectively back in the cross-section framework without exploiting the panel structure of the data. This case is considered in Koop et al. (1999, 2000), where interest is centered on the change in efficiency over time.$^{17}$ With all such specifications, it is possible to conduct Bayesian inference by slightly altering the posterior conditionals presented above in an obvious manner.
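The choices of $D$ described above can be written down with Kronecker products. The sketch below is illustrative only: the panel dimensions, the cluster membership matrix, and the values in `u` are hypothetical, and the rows are assumed to be stacked firm by firm.

```python
import numpy as np

N, T = 4, 3   # hypothetical panel: 4 firms, 3 periods, stacked firm by firm

# Constant firm-specific inefficiency: D = I_N kron iota_T, so M = N.
D_const = np.kron(np.eye(N), np.ones((T, 1)))

# Hypothetical clustering: firms 0-1 share one inefficiency, firms 2-3 another,
# so M = 2 and D = (membership matrix) kron iota_T.
membership = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
D_cluster = np.kron(membership, np.ones((T, 1)))

# Fully free inefficiencies: D = I_TN, so M = TN and gamma = u.
D_free = np.eye(T * N)

u = np.array([0.10, 0.25, 0.05, 0.40])   # one inefficiency per firm
gamma = D_const @ u                      # each u_i repeated over the T periods
print(D_const.shape, D_cluster.shape, D_free.shape)
print(gamma)
```

In each case only $D$ and the dimension $M$ of $u$ change, while the form $\gamma = Du$ in (24.32) stays the same.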
However, as discussed in Fernandez et al. (1997), it is very important to be careful when using improper priors on any of the parameters. In some cases, improper priors imply that the posterior does not exist and, hence, valid Bayesian inference cannot be carried out. Intuitively, the inefficiencies can be interpreted as unknown parameters. If there are too many of these, prior information becomes necessary. As an example of the types of results proved in Fernandez et al. (1997), we state one of their main theorems:
Theorem 24.1 (Fernandez et al., 1997, Theorem 1).
Consider the general model given in (24.31) and (24.32) and assume the standard noninformative prior for $h$: $p(h) \propto h^{-1}$. If $\mathrm{rank}(X : D) < TN$, then the posterior distribution exists for any bounded or proper $p(\beta)$ and any proper $p(u)$. However, if $\mathrm{rank}(X : D) = TN$, then the posterior does not exist.
The Bayesian random effects model discussed above has $\mathrm{rank}(X : D) < TN$, so the posterior does exist even though we have used an improper prior for $h$. However, for the case where efficiency varies over time and across firms (i.e. $D = I_{TN}$), more informative priors are required in order to carry out valid Bayesian inference. Fernandez et al. (1997, Proposition 2) show that a weakly informative (not necessarily proper) prior on $h$ that penalizes large values of the precision is sufficient.
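The rank condition of Theorem 24.1 is easy to check numerically for a given design. The sketch below does so for the two cases just discussed, using simulated regressors. The helper name `rank_condition_holds` and the panel dimensions are hypothetical, and the check covers only the rank part of the theorem, not the requirements on $p(\beta)$ and $p(u)$.

```python
import numpy as np

def rank_condition_holds(X, D):
    """True when rank(X : D) < TN, the existence condition in Theorem 24.1."""
    XD = np.hstack([X, D])
    return np.linalg.matrix_rank(XD) < X.shape[0]

rng = np.random.default_rng(0)
N, T, k = 4, 3, 1                                    # hypothetical panel
x = rng.normal(size=(T * N, k))
X = np.column_stack([np.ones(T * N), x])             # X = (iota_TN : x)

D_random_effects = np.kron(np.eye(N), np.ones((T, 1)))   # D = I_N kron iota_T
D_free = np.eye(T * N)                                    # D = I_TN

print(rank_condition_holds(X, D_random_effects))
print(rank_condition_holds(X, D_free))
```

Since `np.linalg.matrix_rank` computes a numerical rank from singular values with a tolerance, nearly collinear designs may need a closer look than this check provides.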