Methods for introducing exact nonsample information
The most familiar method for introducing nonsample information into a regression model is to use restricted least squares (RLS). The restricted least squares estimator, which we denote as b*, is obtained by minimizing the sum of squared errors subject to J exact linear parameter restrictions, Rβ = r. Examples of linear restrictions are β2 + β3 = 1 and β5 = β6. The variances of the RLS estimator are smaller than those of the OLS estimator, but b* is biased unless the parameter restrictions imposed are exactly true. As noted above, the restrictions do not have to be exactly true for RLS to be better than OLS under a criterion such as MSE, which trades off bias against variance reduction. A question that naturally arises is why such restrictions, if they exist, are not imposed at the outset. A classic example of RLS used to mitigate collinearity is the Almon (1965) polynomial distributed lag. To determine whether the imposed restrictions improve the conditioning of X, substitute the restrictions into the model, via the method outlined in Fomby et al. (1984, p. 85), and apply the collinearity diagnostics.
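As an illustration of the estimator just described, the following is a minimal NumPy sketch of RLS under J exact linear restrictions, written in the closed form that adjusts the OLS estimate by just enough to satisfy the restrictions. The function names `ols` and `rls`, the simulated data, and the particular restriction (second and third coefficients summing to one) are assumptions made for the example, not part of the original discussion.

```python
import numpy as np

def ols(X, y):
    """Unrestricted least squares estimate b."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def rls(X, y, R, r):
    """Restricted least squares estimate b*: minimizes SSE subject to R @ b* = r."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)
    # Move the OLS estimate by the amount needed to satisfy the restrictions.
    middle = R @ XtX_inv @ R.T                      # J x J matrix
    return b + XtX_inv @ R.T @ np.linalg.solve(middle, r - R @ b)

# Simulated data (hypothetical): an intercept plus three regressors.
rng = np.random.default_rng(0)
T = 200
X = np.column_stack([np.ones(T), rng.normal(size=(T, 3))])
beta_true = np.array([1.0, 0.4, 0.6, -0.5])   # second and third coefficients sum to 1
y = X @ beta_true + rng.normal(size=T)

# One restriction (J = 1): second coefficient + third coefficient = 1.
R = np.array([[0.0, 1.0, 1.0, 0.0]])
r = np.array([1.0])

b = ols(X, y)
b_star = rls(X, y, R, r)

def sse(coef):
    """Sum of squared errors for a given coefficient vector."""
    return float(np.sum((y - X @ coef) ** 2))

print("OLS estimate:", b, "SSE:", sse(b))
print("RLS estimate:", b_star, "SSE:", sse(b_star))
```

The restricted estimate satisfies the restriction exactly, and its sum of squared errors can never fall below that of OLS, since it minimizes the same criterion over a smaller set.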
Some familiar "tricks" employed in the presence of collinearity are, in fact, RLS estimators. The most common, and often ill-advised, strategy is to drop a variable if it is involved in a collinear relationship and its estimated coefficient is statistically insignificant. Dropping a variable, xk, is achieved by imposing the linear constraint that βk = 0. Unless βk = 0, dropping xk from the model generally biases all coefficient estimators. Similarly, two highly correlated variables are often replaced by their sum, say z = xk + xm. How is this achieved? By imposing the restriction that βk = βm. Once again, if this constraint is not correct, reductions in variance are obtained at the expense of biasing estimation of all regression coefficients. Kennedy (1983) detects the failure of a similar collinearity trick used by Buck and Hakim (1981) in the context of estimating and testing differences in parameters between two groups of observations.
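The equivalence between these two tricks and RLS can be checked numerically. The sketch below, on simulated data, compares (i) RLS with one coefficient restricted to zero against the regression that omits that variable, and (ii) RLS with two coefficients restricted to be equal against the regression on the sum of the two variables. The helper `rls`, the variable names, and the data-generating values are all hypothetical choices for the illustration.

```python
import numpy as np

def rls(X, y, R, r):
    """Restricted least squares estimate subject to R @ beta = r."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)
    middle = R @ XtX_inv @ R.T
    return b + XtX_inv @ R.T @ np.linalg.solve(middle, r - R @ b)

def ols(X, y):
    """Least squares via a numerically stable solver."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Simulated data (hypothetical): x1 and x2 are highly correlated.
rng = np.random.default_rng(1)
T = 100
x1 = rng.normal(size=T)
x2 = x1 + 0.05 * rng.normal(size=T)
x3 = rng.normal(size=T)
X = np.column_stack([np.ones(T), x1, x2, x3])
y = X @ np.array([1.0, 0.5, 0.5, 2.0]) + rng.normal(size=T)

# Trick 1: dropping x2 is RLS with the coefficient on x2 restricted to zero.
R_drop = np.array([[0.0, 0.0, 1.0, 0.0]])
b_drop_rls = rls(X, y, R_drop, np.array([0.0]))
b_drop_short = ols(X[:, [0, 1, 3]], y)          # regression without x2

# Trick 2: replacing x1 and x2 by z = x1 + x2 is RLS with equal coefficients.
R_sum = np.array([[0.0, 1.0, -1.0, 0.0]])
b_sum_rls = rls(X, y, R_sum, np.array([0.0]))
z = x1 + x2
b_sum_short = ols(np.column_stack([np.ones(T), z, x3]), y)

print("drop x2, RLS form:  ", b_drop_rls)
print("drop x2, short form:", b_drop_short)
print("sum, RLS form:      ", b_sum_rls)
print("sum, short form:    ", b_sum_short)
```

In the second comparison the single coefficient on z reappears as the common coefficient on both original variables in the RLS solution.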
Economists recognize the bias/precision tradeoff and wish to impose constraints that are "good." It is standard practice to check potential constraints against the data by testing them as if they were hypotheses. Should we drop xk? Test the hypothesis that βk = 0. Should we sum xk and xm? Test the hypothesis that βk = βm. Belsley (1991, p. 212) suggests formally testing for MSE improvement. The MSE test amounts to comparing the usual F-statistic for a joint hypothesis to critical values tabled in Toro-Vizcarrondo and Wallace (1968). Following the tests, a decision is made to abandon restrictions that are rejected and to use restrictions that are not rejected. Such a strategy prevents egregious errors, but it actually defines a new, "pre-test" estimation rule. This rule, which chooses either the OLS or the RLS estimator based on the outcome of a test, does not have desirable statistical properties, but it seems unlikely that this practice will be abandoned. See Judge et al. (1985, ch. 3).
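To make the pre-test rule concrete, here is a minimal sketch that computes the usual F-statistic for the joint restrictions and then returns the RLS estimate if the restrictions are not rejected and the OLS estimate otherwise. The function name `pretest_estimator` and the simulated data are hypothetical. The critical value is passed in by the caller: the value 3.89 used below is an approximate conventional 5% point for an F distribution with 1 and 196 degrees of freedom, assumed here for illustration, and it is not one of the MSE-test critical values from Toro-Vizcarrondo and Wallace (1968).

```python
import numpy as np

def pretest_estimator(X, y, R, r, f_crit):
    """Return (estimate, F-statistic, used_rls) for the pre-test rule.

    The RLS estimate is used when F <= f_crit (restrictions not rejected),
    and the OLS estimate is used otherwise.
    """
    T, K = X.shape
    J = R.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)                       # OLS
    resid = y - X @ b
    sigma2_hat = float(resid @ resid) / (T - K)   # unrestricted error variance
    gap = R @ b - r                               # how far OLS is from the restrictions
    middle = R @ XtX_inv @ R.T
    f_stat = float(gap @ np.linalg.solve(middle, gap)) / (J * sigma2_hat)
    b_star = b - XtX_inv @ R.T @ np.linalg.solve(middle, gap)   # RLS
    use_restricted = f_stat <= f_crit
    return (b_star if use_restricted else b), f_stat, use_restricted

# Simulated data (hypothetical): two highly correlated regressors.
rng = np.random.default_rng(2)
T = 200
x1 = rng.normal(size=T)
x2 = x1 + 0.1 * rng.normal(size=T)
x3 = rng.normal(size=T)
X = np.column_stack([np.ones(T), x1, x2, x3])
y = X @ np.array([1.0, 0.5, 0.5, 2.0]) + rng.normal(size=T)

# Hypothesis: the coefficients on x1 and x2 are equal (J = 1).
R = np.array([[0.0, 1.0, -1.0, 0.0]])
r = np.array([0.0])
F_CRIT = 3.89   # assumed approximate 5% critical value, F(1, 196)

estimate, f_stat, used_rls = pretest_estimator(X, y, R, r, F_CRIT)
print("F-statistic:", f_stat)
print("estimator chosen:", "RLS" if used_rls else "OLS")
print("estimate:", estimate)
```

Because the choice between the two estimators depends on the data, the reported estimate is a discontinuous function of the sample, which is the source of the poor sampling properties mentioned above.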
Another alternative is the Stein-rule estimator, a "smart" weighted average of the OLS and RLS estimators that weights the RLS estimator more when the restrictions are compatible with the data and weights the OLS estimator more when they are not. The Stein rule usually provides an MSE gain over OLS, but it is not guaranteed to ameliorate the specific problems caused by collinearity. See Judge et al. (1985, ch. 22).
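The weighting idea can be sketched as follows. The code uses one textbook form of the rule, in which the weight on OLS is 1 - a·SSE/q, where SSE is the unrestricted sum of squared errors and q is the quadratic form measuring how far the OLS estimate lies from the restrictions. The shrinkage constant a = (J - 2)/(T - K + 2), the requirement of at least three restrictions, and the positive-part truncation of the weight at zero are assumptions taken for this illustration rather than details given in the text above. The function name `stein_rule` and the simulated data are likewise hypothetical.

```python
import numpy as np

def stein_rule(X, y, R, r, a=None, positive_part=True):
    """Weighted average of OLS (b) and RLS (b*): b* + w * (b - b*).

    Returns (estimate, w, b, b_star), where w is the weight on OLS.
    """
    T, K = X.shape
    J = R.shape[0]
    if J < 3:
        raise ValueError("this form of the rule assumes at least three restrictions")
    if a is None:
        a = (J - 2) / (T - K + 2)                 # assumed shrinkage constant
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)                       # OLS
    sse = float(np.sum((y - X @ b) ** 2))         # unrestricted SSE
    gap = R @ b - r
    middle = R @ XtX_inv @ R.T
    q = float(gap @ np.linalg.solve(middle, gap)) # incompatibility with the data
    b_star = b - XtX_inv @ R.T @ np.linalg.solve(middle, gap)   # RLS
    w = 1.0 - a * sse / q                         # small q -> more weight on RLS
    if positive_part:
        w = max(w, 0.0)                           # never shrink past b*
    return b_star + w * (b - b_star), w, b, b_star

# Simulated data (hypothetical): intercept plus four regressors.
rng = np.random.default_rng(3)
T = 60
X = np.column_stack([np.ones(T), rng.normal(size=(T, 4))])
y = X @ np.array([1.0, 2.0, 0.1, -0.1, 0.05]) + rng.normal(size=T)

# Three restrictions (J = 3): the last three coefficients are zero.
R = np.hstack([np.zeros((3, 2)), np.eye(3)])
r = np.zeros(3)

delta, w, b, b_star = stein_rule(X, y, R, r)
print("weight on OLS:", w)
print("OLS:       ", b)
print("RLS:       ", b_star)
print("Stein rule:", delta)
```

When the restrictions fit the data well, q is small and the weight on OLS falls toward zero. When they fit poorly, q is large and the estimate stays close to OLS.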