A random variable y is said to be heteroskedastic if its variance can be different for different observations. Conversely, it is said to be homoskedastic if its variance is constant for all observations. In econometrics, heteroskedasticity is most commonly studied in the context of the general linear model
$$y_i = x_i'\beta + e_i \qquad (4.1)$$
where $x_i$ is a $K$-dimensional vector of observations on a set of explanatory variables, $\beta$ is a $K$-dimensional vector of coefficients which we wish to estimate, and $y_i$ denotes the $i$th observation ($i = 1, 2, \ldots, N$) on a dependent variable. In a heteroskedastic model the error term $e_i$ is assumed to have zero mean and variance $\sigma_i^2$, the $i$ subscript on $\sigma^2$ reflecting possibly different variances for each observation. Conditional on $x_i'\beta$, the dependent variable $y_i$ has mean $x_i'\beta$ and variance $\sigma_i^2$. Thus, in the heteroskedastic general linear model, the mean and variance of the random variable y can both change over observations.
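The statement that both the mean and the variance of y change over observations can be checked numerically. The following is a minimal sketch, not part of the original exposition: the values of `beta`, `X`, and `sigma`, and the choice of normally distributed errors, are illustrative assumptions (the model itself only requires zero mean and variance $\sigma_i^2$).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and parameter values (assumptions, not from the text).
N, K = 5, 2
beta = np.array([1.0, 0.5])                                # coefficient vector
X = np.column_stack([np.ones(N), np.arange(1.0, N + 1)])   # row i holds x_i'
sigma = np.array([0.5, 1.0, 1.5, 2.0, 2.5])                # one sigma_i per observation

# Draw many replications of the whole sample, holding X fixed.
R = 200_000
e = rng.normal(0.0, sigma, size=(R, N))   # column i has standard deviation sigma_i
y = X @ beta + e                          # shape (R, N)

# Across replications, observation i should have mean x_i'beta
# and variance sigma_i^2.
mean_y = y.mean(axis=0)
var_y = y.var(axis=0)

print("x_i'beta      :", X @ beta)
print("sample means  :", mean_y)
print("sigma_i^2     :", sigma**2)
print("sample vars   :", var_y)
```

Normality is used here only because it is convenient to simulate; any error distribution with zero mean and the stated variances would serve the same purpose.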
Heteroskedasticity can arise empirically, through theoretical considerations and from model misspecification. Empirically, heteroskedasticity is often encountered when using cross-sectional data on a number of microeconomic units such as firms or households. A common example is the estimation of household expenditure functions. Expenditure on a commodity is more easily explained by conventional variables for households with low incomes than it is for households with high incomes. The lower predictive ability of the model for high incomes can be captured by specifying a variance $\sigma_i^2$ which is larger when income is larger. Data on firms invariably involve observations on economic units of varying sizes. Larger firms are likely to be more diverse and flexible with respect to the way in which values for $y_i$ are determined. This additional diversity is captured through an error term with a larger variance. Theoretical considerations, such as randomness in behavior, can also lead to heteroskedasticity. Brown and Walker (1989, 1995) give examples of how it arises naturally in demand and production models. Moreover, as chapter 19 in this volume (by Swamy and Tavlas) illustrates, heteroskedasticity exists in all models with random coefficients. Misspecifications such as incorrect functional form, omitted variables and structural change are other reasons that a model may exhibit heteroskedasticity.
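The household expenditure example can be illustrated with simulated data. The sketch below is hypothetical throughout: the income range, the coefficients, and the specification in which the error standard deviation is proportional to income are assumptions chosen for illustration, not a specification taken from this chapter. It fits the model by least squares and compares the spread of the residuals for low-income and high-income households.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical cross-section of households (all values are assumptions).
N = 2000
income = rng.uniform(10.0, 100.0, size=N)
beta = np.array([2.0, 0.3])                 # intercept and income coefficient
X = np.column_stack([np.ones(N), income])

# Assumed variance specification: sigma_i grows in proportion to income.
sigma = 0.05 * income
expenditure = X @ beta + rng.normal(0.0, sigma)

# Least squares fit, ignoring the heteroskedasticity.
b_ols, *_ = np.linalg.lstsq(X, expenditure, rcond=None)
resid = expenditure - X @ b_ols

# Compare residual variances below and above median income.
low = income < np.median(income)
var_low = resid[low].var()
var_high = resid[~low].var()

print("least squares coefficients:", b_ols)
print("residual variance, low incomes :", var_low)
print("residual variance, high incomes:", var_high)
```

Under these assumptions the residuals are visibly more dispersed for high-income households, which is the pattern the larger $\sigma_i^2$ is meant to capture.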
In this chapter we give the fundamentals of sampling theory and Bayesian estimation, and of sampling theory hypothesis testing, for a linear model with heteroskedasticity. For sampling theory estimation it is convenient first to describe estimation for a known error covariance matrix and then to extend it to an unknown error covariance matrix. No attempt is made to give specific details of developments beyond what we consider to be the fundamentals. However, references to such developments and how they build on the fundamentals are provided. Autoregressive conditional heteroskedasticity (ARCH), which is popular for modeling volatility in time series, is considered elsewhere in this volume and is not discussed in this chapter.