Estimation of unrestricted VARs and VECMs
Given a sample $y_1, \dots, y_T$ and presample values $y_{-p+1}, \dots, y_0$, the $K$ equations of the VAR (32.1) may be estimated separately by least squares (LS) without losing efficiency relative to generalized LS (GLS) approaches. In fact, in this case LS is identical to GLS. Under standard assumptions, the LS estimator $\hat{A}$ of $A = [A_1 : \dots : A_p]$ is consistent and asymptotically normally distributed (see, e.g., Lütkepohl, 1991),
$$\sqrt{T}\,\mathrm{vec}(\hat{A} - A) \xrightarrow{d} N(0, \Sigma_{\hat{A}}) \quad \text{or, more intuitively,} \quad \mathrm{vec}(\hat{A}) \overset{a}{\sim} N(\mathrm{vec}(A), \Sigma_{\hat{A}}/T). \tag{32.8}$$
Here vec denotes the column stacking operator, which stacks the columns of a matrix in a column vector, $\xrightarrow{d}$ signifies convergence in distribution, and $\overset{a}{\sim}$ indicates "asymptotically distributed as".
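The equation-by-equation LS estimator described above can be sketched in a few lines. This is a minimal illustration, assuming a zero-mean VAR without deterministic terms; the function name `var_ls`, the simulated coefficient matrix `A_true`, and the sample size are hypothetical choices for the example, not part of the text.

```python
import numpy as np

def var_ls(y, p):
    """LS estimate of A = [A_1 : ... : A_p] for a zero-mean VAR(p).

    y is a (T + p, K) array whose first p rows are the presample values.
    Returns (A_hat, resid) with shapes (K, K*p) and (T, K).
    """
    y = np.asarray(y, dtype=float)
    n, K = y.shape
    # Regressor matrix: row t holds [y_{t-1}', ..., y_{t-p}'].
    Z = np.hstack([y[p - i:n - i] for i in range(1, p + 1)])
    Y = y[p:]
    # Every equation has the same regressors, so a single lstsq call
    # reproduces the K separate single-equation LS fits (one per column of B).
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return B.T, Y - Z @ B

# Illustrative data: a stable bivariate VAR(1) with assumed coefficients.
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.1],
                   [0.0, 0.3]])
T = 2000
y_sim = np.zeros((T + 1, 2))
for t in range(1, T + 1):
    y_sim[t] = A_true @ y_sim[t - 1] + rng.standard_normal(2)

A_hat, resid = var_ls(y_sim, p=1)
```

Each row of `A_hat` holds the coefficients of one equation, so estimating a single equation on its own gives the same numbers as the corresponding row of the joint fit.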
Although this result also holds for I(1) cointegrated systems (see Sims, Stock, and Watson, 1990; Lütkepohl, 1991, ch. 11), it is important to note that in this case the covariance matrix $\Sigma_{\hat{A}}$ is singular, whereas it is nonsingular in the usual I(0) case. In other words, if there are integrated or cointegrated variables, some estimated coefficients or linear combinations of coefficients converge at a faster rate than $\sqrt{T}$. Therefore, the usual $t$-, $\chi^2$-, and $F$-tests for inference regarding the VAR parameters may not be valid in this case (see, e.g., Toda and Phillips, 1993). Although inference problems may arise in VAR models with I(1) variables, there are also many unproblematic cases. Dolado and Lütkepohl (1996) show that if all variables are I(1) or I(0) and if a null hypothesis is considered which does not restrict elements of each of the $A_i$ ($i = 1, \dots, p$), the usual tests have their standard asymptotic properties. For example, if the VAR order $p \geq 2$, the $t$-ratios have their usual asymptotic standard normal distributions (see also Toda and Yamamoto (1995) for a related result).
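To make the $t$-ratios concrete, the sketch below computes them for every coefficient of a VAR(2) fitted to two independent random walks. It assumes the textbook covariance estimator $(Z'Z)^{-1} \otimes \hat{\Sigma}_u$ for $\mathrm{vec}(\hat{A})$, which the text does not spell out; the function name `var_t_ratios` and the simulated data are hypothetical.

```python
import numpy as np

def var_t_ratios(y, p):
    """LS coefficients and t-ratios (H0: coefficient = 0) for a zero-mean VAR(p).

    Assumption: Cov(vec(A_hat)) is estimated by (Z'Z)^{-1} kron Sigma_u_hat.
    """
    y = np.asarray(y, dtype=float)
    n, K = y.shape
    T = n - p
    Z = np.hstack([y[p - i:n - i] for i in range(1, p + 1)])  # (T, K*p)
    Y = y[p:]
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    resid = Y - Z @ B
    # Residual covariance with a degrees-of-freedom correction.
    sigma_u = resid.T @ resid / (T - K * p)
    zz_inv = np.linalg.inv(Z.T @ Z)
    # Standard error of the coefficient on regressor j in equation k:
    # sqrt(sigma_u[k, k] * zz_inv[j, j]).
    se = np.sqrt(np.outer(np.diag(sigma_u), np.diag(zz_inv)))
    A_hat = B.T
    return A_hat, A_hat / se

# Two independent random walks, so all variables are I(1); p = 2 >= 2.
rng = np.random.default_rng(1)
y_rw = np.cumsum(rng.standard_normal((302, 2)), axis=0)
A_hat_rw, t_rw = var_t_ratios(y_rw, p=2)
```

Each entry of `t_rw` restricts a single element of a single $A_i$, which is the kind of hypothesis the result quoted above covers when $p \geq 2$. A joint hypothesis restricting elements of both $A_1$ and $A_2$ would fall outside it.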
If the white noise process $u_t$ is normally distributed (Gaussian) and the process $y_t$ is I(0), then the LS estimator is identical to the maximum likelihood (ML) estimator conditional on the initial values. It is also straightforward to include deterministic terms such as polynomial trends in the model (32.1). In this case the asymptotic properties of the VAR coefficients remain essentially the same as in the case without deterministic terms (Sims et al., 1990).
If the cointegrating rank of the system under consideration is known and one wishes to impose a corresponding restriction, working with the VECM form (32.3) is convenient. If the VAR order is $p = 1$, the estimators may be obtained by applying reduced rank regression (RRR) to $\Delta y_t = \Pi y_{t-1} + u_t$ subject to $\mathrm{rank}(\Pi) = r$. The approach is easily extended to higher order VAR processes as well (see Johansen, 1995). Under Gaussian assumptions the ML estimators conditional on the presample values may, in fact, be obtained in this way. However, in order to estimate the matrices $\alpha$ and $\beta$ in $\Pi = \alpha\beta'$ consistently, it is necessary to impose identifying restrictions. Without such restrictions only the product $\alpha\beta' = \Pi$ can be estimated consistently. If uniqueness restrictions are imposed, it can be shown that $T(\hat{\beta} - \beta)$ and $\sqrt{T}(\hat{\alpha} - \alpha)$ converge in distribution (Johansen, 1995). Hence, the estimator of $\beta$ converges with the fast rate $T$ and is therefore sometimes called super-consistent. In contrast, the estimator of $\alpha$ converges with the usual rate $\sqrt{T}$. The estimators of $\Gamma = [\Gamma_1 : \dots : \Gamma_{p-1}]$ and $\Pi$ are consistent and asymptotically normal under general assumptions. The asymptotic distribution of $\hat{\Gamma}$ is nonsingular, so that standard inference may be used for the short-term parameters $\Gamma_i$. On the other hand, the asymptotic distribution of $\hat{\Pi}$ is singular if $r < K$. This result is due to two forces. On the one hand, imposing the rank constraint in estimating $\Pi$ restricts the parameter space and, on the other hand, $\Pi$ involves the cointegrating relations, which are estimated super-consistently.
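For the $p = 1$ case, reduced rank regression can be sketched as an eigenvalue problem on sample moment matrices. This is a minimal illustration without deterministic terms. The function name `rrr_vecm1`, the simulated parameters, and the normalization that sets the first $r$ rows of $\hat{\beta}$ to an identity matrix are assumptions made for the example; the text only says that some identifying restriction is needed.

```python
import numpy as np

def rrr_vecm1(y, r):
    """Reduced rank regression of Delta y_t on y_{t-1} with rank(Pi) = r.

    y is a (T + 1, K) array; its first row is the presample value y_0.
    Returns (alpha_hat, beta_hat, Pi_hat) with Pi_hat = alpha_hat @ beta_hat.T.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y, axis=0)          # Delta y_t for t = 1, ..., T
    ylag = y[:-1]                    # y_{t-1}
    T = dy.shape[0]
    S00 = dy.T @ dy / T
    S01 = dy.T @ ylag / T
    S11 = ylag.T @ ylag / T
    # Solve |lambda * S11 - S10 S00^{-1} S01| = 0 via a Cholesky transform,
    # which turns it into a symmetric eigenvalue problem.
    L = np.linalg.cholesky(S11)
    Linv = np.linalg.inv(L)
    M = Linv @ S01.T @ np.linalg.solve(S00, S01) @ Linv.T
    M = (M + M.T) / 2                # remove rounding asymmetry
    _, eigvec = np.linalg.eigh(M)    # eigenvalues in ascending order
    V = eigvec[:, ::-1][:, :r]       # eigenvectors of the r largest eigenvalues
    beta = Linv.T @ V                # satisfies beta' S11 beta = I_r
    alpha = S01 @ beta
    # Assumed identifying restriction: first r rows of beta equal I_r
    # (requires that block of beta to be invertible).
    c = beta[:r, :]
    beta_n = beta @ np.linalg.inv(c)
    alpha_n = alpha @ c.T
    return alpha_n, beta_n, alpha_n @ beta_n.T

# Illustrative cointegrated system with K = 2 and r = 1 (assumed parameters).
rng = np.random.default_rng(0)
T = 500
alpha_true = np.array([[-0.5], [0.0]])
beta_true = np.array([[1.0], [-1.0]])
Pi_true = alpha_true @ beta_true.T
y_ci = np.zeros((T + 1, 2))
for t in range(1, T + 1):
    y_ci[t] = y_ci[t - 1] + Pi_true @ y_ci[t - 1] + rng.standard_normal(2)

alpha_hat, beta_hat, Pi_hat = rrr_vecm1(y_ci, r=1)
```

The renormalization leaves the product unchanged, since $\hat{\alpha}c'(\hat{\beta}c^{-1})' = \hat{\alpha}\hat{\beta}'$. This is why only $\Pi$ is identified without such a restriction.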
It is perhaps interesting to note that an estimator of $A$ can be computed via the estimates of $\Pi$ and $\Gamma$. That estimator has the advantage of imposing the cointegrating restrictions on the levels version of the VAR process. However, its asymptotic distribution is the same as in (32.8), where no restrictions have been imposed in estimating $A$.
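The mapping from VECM estimates back to the levels VAR can be sketched as follows. The VECM form (32.3) is not reproduced in this section, so the sketch assumes the standard parameterization $\Delta y_t = \Pi y_{t-1} + \Gamma_1 \Delta y_{t-1} + \dots + \Gamma_{p-1} \Delta y_{t-p+1} + u_t$, with $\Pi = -(I_K - A_1 - \dots - A_p)$ and $\Gamma_i = -(A_{i+1} + \dots + A_p)$. The function name `vecm_to_var` and the numerical values are hypothetical.

```python
import numpy as np

def vecm_to_var(Pi, Gammas):
    """Levels VAR matrices [A_1, ..., A_p] from Pi and [Gamma_1, ..., Gamma_{p-1}].

    Assumes the parameterization stated in the lead-in:
      A_1 = Pi + I_K + Gamma_1,
      A_i = Gamma_i - Gamma_{i-1}  for 1 < i < p,
      A_p = -Gamma_{p-1}.
    """
    Pi = np.asarray(Pi, dtype=float)
    K = Pi.shape[0]
    G = [np.asarray(g, dtype=float) for g in Gammas]
    p = len(G) + 1
    if p == 1:
        return [Pi + np.eye(K)]
    A = [Pi + np.eye(K) + G[0]]
    A += [G[i] - G[i - 1] for i in range(1, p - 1)]
    A.append(-G[-1])
    return A

# Round trip on assumed VAR(3) coefficients: levels -> VECM -> levels.
A_levels = [np.array([[0.6, 0.2], [0.1, 0.5]]),
            np.array([[0.2, -0.1], [0.0, 0.3]]),
            np.array([[0.1, 0.0], [-0.1, 0.1]])]
Pi_demo = -(np.eye(2) - sum(A_levels))
Gammas_demo = [-sum(A_levels[i + 1:]) for i in range(len(A_levels) - 1)]
A_back = vecm_to_var(Pi_demo, Gammas_demo)
```

Feeding rank-restricted estimates of $\Pi$ and $\Gamma$ into such a mapping yields levels coefficients that satisfy the cointegrating restrictions by construction.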