1. MOTIVATION
In regression models with heteroskedasticity and serial correlation of unknown form, the standard approach to testing hypotheses on the regression parameters involves estimation of the correlation structure using nonparametric heteroskedasticity and autocorrelation consistent (HAC) estimators. These estimators have been thoroughly examined in the literature by, among many others, Andrews (1991), Andrews and Monahan (1992), Hansen (1992), de Jong and Davidson (2000), Newey and West (1987), Robinson (1991), and White (1984). They also furnish consistent estimates of the correlation structure in single-equation models of cointegration, allowing inference on the cointegrating vector to be carried out using conventional tests. Inference conducted in this manner leads to pivotal tests and is robust to heteroskedasticity and serial correlation of unknown form. Even though tests that use HAC estimators are valid asymptotically, they typically display substantial size distortions in finite samples (see, e.g., Andrews, 1991; Andrews and Monahan, 1992; den Haan and Levin, 1997).
Recent efforts have been made to improve upon the HAC approach in standard (stationary) regression models using inconsistent covariance matrix estimates. The first paper in this literature was Kiefer, Vogelsang, and Bunzel (2000), where a new test based on the Bartlett kernel with bandwidth equal to sample size was developed. Continuing this line of research, Bunzel, Kiefer, and Vogelsang (2001) extended the theory to nonlinear stationary regression models, and Kiefer and Vogelsang (2002) developed the new fixed bandwidth (fixed-b) asymptotic theory. In the case of a cointegration relationship with exogenous regressors, applying fixed-b asymptotics would have been a straightforward extension of the theory in the standard regression model. However, when the regressors are allowed to be endogenous, the task is nontrivial. Here we provide the asymptotic distribution of Wald and t-type statistics in single-equation cointegration models with endogenous regressors.
The principle behind the fixed-b theory is to let b = M/T where T is the sample size and M is the truncation lag or bandwidth used to compute the HAC estimator. The standard assumptions would require that b → 0, but fixed-b asymptotics instead assumes that the truncation lag is a fixed proportion of the sample, i.e., b is fixed. This approach has several advantages. First, it improves the asymptotic approximation, resulting in reduced size distortions. Second, it provides an asymptotic distribution that depends on the bandwidth and kernel, thus providing us with better tools for choosing these parameters.
2. RESULTS
Consider the following model containing a single cointegrating relationship in addition to some deterministic variables:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348frm001.gif?pub-status=live)
where f(t) denotes a (k1 × 1) vector of trend functions, Xt is a
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm001.gif?pub-status=live)
vector of regressors, and α and β are (k1 × 1) and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm002.gif?pub-status=live)
vectors of parameters, respectively. Let ′ denote the transpose, except when it is used in conjunction with the kernel function, where it will denote the derivative. It is assumed that the sequence $\{u_t\} = \{(u_{1,t}, u_{2,t}')'\}$ does not contain unit roots but may exhibit serial correlation or heteroskedasticity.
At times, it will be useful to stack the first equation in (1) and rewrite it as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348frm002.gif?pub-status=live)
Here f(T) is the (T × k1) matrix of stacked trend functions, and X is the
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm003.gif?pub-status=live)
matrix of regressors. The following notation is required before we state the main assumptions of the note. Denote
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm004.gif?pub-status=live)
, $\Gamma(j) = E(u_t u_{t+j}')$, and $\Gamma_{22}(j) = E(u_{2,t} u_{2,t+j}')$, let $w_j(r)$ be a $j$-vector of independent Wiener processes, and let $[rT]$ denote the integer part of $rT$, where $r \in [0,1]$. In the discussion that follows, ⇒ is used to denote weak convergence.
The first assumption, which is similar to that of Vogelsang (1998), is made to rule out ill-behaved trend functions and to provide some useful notation for deriving and stating the asymptotic distributions.
Assumption 1. There exist a (k1 × k1) diagonal matrix τT and a vector of functions F, such that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm005.gif?pub-status=live)
exists and is nonsingular. In addition, f(t) includes a constant term.
Assumption 1 can be relaxed but, as it stands, is sufficiently general to cover most commonly used models. For later use, let F(T) be the matrix of the stacked F(t/T) functions.
The next assumption provides us with the necessary invariance principles and ensures that we can estimate (1) consistently, even when the regressors are endogenous.
Assumption 2. $\{u_t\}_{t=1}^{\infty}$ is weakly stationary and satisfies the following conditions:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-96303-mediumThumb-S0266466606060348ffm006.jpg?pub-status=live)
Assumptions 2(a)–(d) have been used extensively in the literature on nonparametric covariance matrix estimation to ensure that the relevant multivariate invariance principles hold. These conditions are sufficient to provide the asymptotic distribution of the ordinary least squares (OLS) estimates of (1) if the regressors are exogenous. Assumptions 2(e)–(g) are made to allow us to deal with endogenous regressors in the manner suggested by Saikkonen (1991), Phillips and Loretan (1991), Stock and Watson (1993), and Wooldridge (1991). A direct implication (see Saikkonen, 1991) is that we can write u1,t as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348frm003.gif?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm007.gif?pub-status=live)
is a stationary process such that $E(u_{2,t} v_{t+l}') = E((X_t - X_{t-1}) v_{t+l}') = 0$, $l = 0, \pm 1, \pm 2, \ldots$. Following standard procedure, we can thus estimate the model using dynamic ordinary least squares (DOLS); i.e., we estimate
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348frm004.gif?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm008.gif?pub-status=live)
. We are now ready to make the third and final assumption.
Assumption 3. Let $p \to \infty$ such that $p^3/T \to 0$ and $T^{1/2} \sum_{|j|>p} \|\gamma_j\| \to 0$.
Saikkonen (1991) shows that if Assumption 3 holds, then (4) is asymptotically equivalent to
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm009.gif?pub-status=live)
and under Assumptions 1–3, the asymptotic distributions of the least squares estimates of α and β are well known (for the case where trends are included in the model, see Saikkonen, 1991; Phillips and Loretan, 1991; Stock and Watson, 1993; Wooldridge, 1991; Phillips and Hansen, 1990).
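The leads-and-lags regression can be illustrated with a minimal sketch. It assumes a single scalar regressor, a constant as the only trend term, and the usual DOLS layout of regressing y on a constant, the level of x, and the differences of x from lag p to lead p. The function names `dols_design` and `ols` and the simulated data are hypothetical and chosen for illustration only; the normal equations are solved by plain Gaussian elimination to keep the example free of dependencies.

```python
import random

def dols_design(y, x, p):
    """Rows [1, x_t, dx_{t-p}, ..., dx_{t+p}] for a scalar regressor x."""
    T = len(y)
    # dx[t] = x[t] - x[t-1]; dx[0] is a placeholder that is never used
    dx = [0.0] + [x[t] - x[t - 1] for t in range(1, T)]
    rows, targets = [], []
    for t in range(p + 1, T - p):  # keeps T - (2p + 1) observations
        rows.append([1.0, x[t]] + [dx[t + j] for j in range(-p, p + 1)])
        targets.append(y[t])
    return rows, targets

def ols(rows, targets):
    """Solve the normal equations (A'A) c = A'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [A'A | A'y]
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yv for r, yv in zip(rows, targets))]
         for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(k):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    return [m[i][k] / m[i][i] for i in range(k)]

# Simulated example (assumed data-generating process, not from the paper):
# x is a random walk and y = 1 + 2 x_t + 0.5 dx_t + noise.
random.seed(0)
T, p = 200, 2
x, level = [], 0.0
for _ in range(T):
    level += random.gauss(0.0, 1.0)
    x.append(level)
dx = [0.0] + [x[t] - x[t - 1] for t in range(1, T)]
y = [1.0 + 2.0 * x[t] + 0.5 * dx[t] + random.gauss(0.0, 0.1) for t in range(T)]

rows, targets = dols_design(y, x, p)
coef = ols(rows, targets)  # coef[1] estimates the cointegrating coefficient
resid = [yt - sum(c * r for c, r in zip(coef, row))
         for row, yt in zip(rows, targets)]
```

Trimming p observations at each end, plus one for differencing, leaves T − (2p + 1) usable observations. Because a constant is included, the residuals sum to zero up to rounding error.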
As usual, inference on β is conducted using the DOLS estimates and a HAC estimate of the asymptotic covariance matrix to form Wald or t-type test statistics. Denote the Cholesky decomposition of Ω by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm010.gif?pub-status=live)
. Then HAC estimators of σ take the general form
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348frm005.gif?pub-status=live)
Here N = T − (2p + 1),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm011.gif?pub-status=live)
are the residuals from (4), M is called the bandwidth or the truncation lag, and k(x) is a kernel function satisfying k(x) = k(−x), k(0) = 1, |k(x)| ≤ 1, k(x) continuous at x = 0, and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm012.gif?pub-status=live)
. Following Kiefer and Vogelsang (2002), we assume that M is directly proportional to T, such that M = [bT] (in place of the usual assumption that M/T → 0 as T → ∞) and develop this asymptotic theory for the cointegration model. The limiting distribution of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm013.gif?pub-status=live)
will depend on the specific bandwidth (now fully determined by the parameter b) and kernel used to construct the estimator. To proceed we provide the following definition, which describes two different types of kernels.
Definition. A kernel is labeled type 1 if k(x) is twice continuously differentiable everywhere and is labeled type 2 if k(x) is continuous, k(x) = 0 for |x| ≥ 1, and k(x) is twice continuously differentiable everywhere except at |x| = 1.
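As a rough illustration of how the bandwidth rule and the kernel enter the estimator, the sketch below computes a kernel-weighted sum of sample autocovariances with bandwidth M equal to the integer part of bN. It assumes the usual HAC form, in which the autocovariance at lag j is weighted by k(j/M), and a scalar residual series; the function names are hypothetical. The quadratic spectral kernel is smooth everywhere and so serves as an example of a type 1 kernel, and the Bartlett kernel is included for comparison.

```python
import math

def bartlett(x):
    """Bartlett kernel: 1 - |x| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(x))

def quadratic_spectral(x):
    """Quadratic spectral kernel, smooth everywhere with k(0) = 1."""
    if x == 0:
        return 1.0
    z = 6.0 * math.pi * x / 5.0
    return 25.0 / (12.0 * math.pi ** 2 * x ** 2) * (math.sin(z) / z - math.cos(z))

def hac_variance(v, b, kernel):
    """Kernel-weighted long-run variance estimate with bandwidth M = [bN].

    v is a list of scalar residuals and b in (0, 1] is the fixed
    proportion of the sample used as the bandwidth.
    """
    N = len(v)
    M = max(1, int(b * N))  # integer part of bN, at least 1

    def gamma(j):
        # Sample autocovariance at lag j, normalized by N
        return sum(v[t] * v[t + j] for t in range(N - j)) / N

    # Symmetry k(x) = k(-x) lets negative lags be folded into positive ones
    return gamma(0) + 2.0 * sum(kernel(j / M) * gamma(j) for j in range(1, N))

# Two residuals, Bartlett kernel, bandwidth equal to the sample size (b = 1):
# gamma(0) = 1, gamma(1) = -0.5, weight k(1/2) = 0.5, so the estimate is 0.5.
example = hac_variance([1.0, -1.0], 1.0, bartlett)
```

With b held fixed, the weights k(j/M) do not concentrate on low-order lags as the sample grows, which is why the estimate stays random in the limit and why the limit depends on both b and the kernel.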
In addition, we will consider the Bartlett kernel separately. The following lemma provides the asymptotic distribution of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm014.gif?pub-status=live)
under fixed-b asymptotics and for various choices of kernels. To state the asymptotic distributions, we define
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-13718-mediumThumb-S0266466606060348ffm015.jpg?pub-status=live)
where w1(s) is independent of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm016.gif?pub-status=live)
and where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm017.gif?pub-status=live)
is defined as the residual from the projection of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm018.gif?pub-status=live)
on the subspace generated by F(s) in the Hilbert space of square integrable functions on [0,1] with the inner product
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm019.gif?pub-status=live)
, such that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm020.gif?pub-status=live)
. Correspondingly, F(s)X is the residual from the projection of F(s) onto the space generated by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm021.gif?pub-status=live)
.
LEMMA 1. If k is type 1,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm022.gif?pub-status=live)
If k is type 2
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm023.gif?pub-status=live)
where $k^*(x) = k(x/b)$ and $k_-^{*\prime}(b)$ is the derivative of $k^*(x)$ from below at $b$. If k is the Bartlett kernel,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm024.gif?pub-status=live)
The proof of Lemma 1 follows that of Kiefer and Vogelsang (2002) but with the added complication that endogenous regressors are present, and therefore some additional work is required to determine the asymptotic distribution of the partial sums of the residuals. The asymptotic distribution of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm025.gif?pub-status=live)
is proportional to $\sigma^2$ and depends on the bandwidth and kernel, as expected.
Using Lemma 1, hypotheses of the form H0 : Rβ = β0 can be tested using the standard Wald test. In what follows R is a nonstochastic restriction matrix of dimension
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm026.gif?pub-status=live)
and rank q. The Wald test for H0 is defined as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-45115-mediumThumb-S0266466606060348ffm027.jpg?pub-status=live)
The corresponding one-dimensional t-test can be obtained in the usual manner. Theorem 1, which follows, states the asymptotic distribution of W under fixed-b asymptotics.
THEOREM 1. Suppose Assumptions 1–3 hold. Then, under H0,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-99686-mediumThumb-S0266466606060348ffm028.jpg?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-11767-mediumThumb-S0266466606060348ffm029.jpg?pub-status=live)
and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm030.gif?pub-status=live)
is the residual from the projection of the first q coordinates of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm031.gif?pub-status=live)
onto the last
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm032.gif?pub-status=live)
coordinates of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm033.gif?pub-status=live)
.
This theorem demonstrates that it is possible to obtain pivotal test statistics with the fixed-b assumption. The asymptotic distribution of the Wald test depends on the kernel and bandwidth, and through
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm034.gif?pub-status=live)
it also depends on the number of restrictions being tested, the number of regressors in the model, and the trends included, whereas standard b → 0 asymptotics would have resulted in an asymptotic $\chi^2$ distribution. Note that although the asymptotic distribution is nonstandard, it is simple to obtain critical values through simulations because the distribution is simply a function of independent Wiener processes. (Simulations guiding the choice of b and also some critical values are available from the author upon request.) With the critical values in hand, it is possible to use the fixed-b approach for testing in single-equation models of cointegration. Although the choice of b and the simulations demonstrating the efficacy of the test are treated in a companion paper (Bunzel, 2006), the fixed-b approach is likely to improve the asymptotic approximation, resulting in reduced size distortions. It also provides an asymptotic distribution that depends on the bandwidth and kernel, thus providing us with better tools for choosing these parameters.
APPENDIX
A.1. Proof of Lemma 1.
Following Kiefer and Vogelsang (2002), we define
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm035.gif?pub-status=live)
and use this expression to rewrite
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm036.gif?pub-status=live)
as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348frm006.gif?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm037.gif?pub-status=live)
are the residuals from (4). Note that (A.1) is valid only if the residuals sum to zero. Therefore, for the results that follow to hold, f(t) must include a constant term, as required by Assumption 1. To establish the asymptotic distribution of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm038.gif?pub-status=live)
, it is necessary first to determine the asymptotic distribution of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm039.gif?pub-status=live)
.
LEMMA 2. Under Assumptions 1–3,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm040.gif?pub-status=live)
.
Proof. First define $\gamma = [\gamma_{-p}', \ldots, \gamma_p']'$, which is a
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm076.gif?pub-status=live)
vector of parameters, and let $\Delta Z_{t+p} = [\Delta X_{t-p}', \Delta X_{t-p+1}', \ldots, \Delta X_{t+p}']'$ be the corresponding vector of regressors. Simple matrix manipulations yield
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-02034-mediumThumb-S0266466606060348frm007.jpg?pub-status=live)
In what follows, we will show that the last term in the expression for
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm041.gif?pub-status=live)
, (A.2), is
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm042.gif?pub-status=live)
and therefore does not affect the asymptotic distribution of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm043.gif?pub-status=live)
. First note that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-47904-mediumThumb-S0266466606060348ffm044.jpg?pub-status=live)
by Assumption 2(f). Thus
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm045.gif?pub-status=live)
. We then obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm046.gif?pub-status=live)
by equation (23) in Saikkonen (1991), which states that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm047.gif?pub-status=live)
.
We can now determine the asymptotic distribution of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm048.gif?pub-status=live)
from the first two terms of (A.2). By Assumptions 1–3 we know from Saikkonen (1991) and Phillips and Hansen (1990) that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-64250-mediumThumb-S0266466606060348ffm049.jpg?pub-status=live)
It also follows directly from Assumption 2 that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm050.gif?pub-status=live)
. Because N/T → 1, it will also be the case that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm051.gif?pub-status=live)
. So it is now established that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-76055-mediumThumb-S0266466606060348ffm052.jpg?pub-status=live)
The rest of the proof of Lemma 1 is split into three cases, corresponding to type 1, type 2, and Bartlett kernels. Each case follows directly from Kiefer and Vogelsang (2002) and Lemma 2.
Case 1. k(x) is a type 1 kernel. By definition of the second derivative, $T^2 \Delta^2 \kappa_{il} - (-k^{*\prime\prime}((i - l)/N)) \to 0$, and using Lemma 2 it follows easily that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm053.gif?pub-status=live)
Case 2. k(x) is a type 2 kernel. Following Kiefer and Vogelsang (2002), we use simple algebra and the definition of $\Delta^2 \kappa_{ij}$ to establish that when $|i - j| > [bN]$, $\Delta^2 \kappa_{ij} = 0$, and when $|i - j| = [bN]$, $\Delta^2 \kappa_{ij} = -k(([bN] - 1)/[bN])$. Also recall that k(x) is twice continuously differentiable when $|i - j| < [bN]$. We split up the expression of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm054.gif?pub-status=live)
as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-99521-mediumThumb-S0266466606060348ffm055.jpg?pub-status=live)
where the asymptotic distribution follows directly from Lemma 2 and Kiefer and Vogelsang (2002).
Case 3. k(x) is the Bartlett kernel. Here again following Kiefer and Vogelsang (2002), it can be verified that when $|i - j| = 0$, $\Delta^2 \kappa_{ij} = 2/[bN]$, and when $|i - j| = [bN]$, $\Delta^2 \kappa_{ij} = -(1/[bN])$. Using these expressions and Lemma 2 in (A.1), we obtain the following limiting distribution:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-00956-mediumThumb-S0266466606060348ffm056.jpg?pub-status=live)
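The partial-sum rewriting used in the Bartlett case can be checked numerically. The sketch below assumes a scalar residual series that sums to zero and the usual kernel-weighted autocovariance form of the estimator; the function names are hypothetical. With second differences equal to 2/M on the diagonal and −1/M where |i − j| = M, and zero elsewhere, the double sum over partial sums collapses to two single sums, which are compared with the direct computation.

```python
import random

def bartlett_hac_direct(v, M):
    """Bartlett-weighted sum of sample autocovariances with bandwidth M."""
    N = len(v)

    def gamma(j):
        return sum(v[t] * v[t + j] for t in range(N - j)) / N

    # Bartlett weights 1 - j/M vanish for j >= M
    return gamma(0) + 2.0 * sum((1.0 - j / M) * gamma(j)
                                for j in range(1, min(M, N)))

def bartlett_hac_partial_sums(v, M):
    """Same quantity written in terms of partial sums S_i = v_1 + ... + v_i.

    Valid only when the series sums to zero, so that the last partial
    sum drops out of the summation by parts.
    """
    N = len(v)
    S, total = [], 0.0
    for value in v:
        total += value
        S.append(total)
    own = sum(s * s for s in S[:-1])                         # |i - j| = 0 terms
    cross = sum(S[i] * S[i + M] for i in range(N - 1 - M))   # |i - j| = M terms
    return 2.0 * (own - cross) / (M * N)

# Demeaned noise stands in for regression residuals (assumed for illustration)
random.seed(1)
raw = [random.gauss(0.0, 1.0) for _ in range(120)]
mean = sum(raw) / len(raw)
v = [r - mean for r in raw]
M = int(0.3 * len(v))  # b = 0.3
direct = bartlett_hac_direct(v, M)
via_sums = bartlett_hac_partial_sums(v, M)
```

The two values agree up to rounding error. If the series does not sum to zero, the final partial sum no longer vanishes and the identity fails, which is the role played by the constant term in Assumption 1.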
A.2. Proof of Theorem 1.
The initial step of the proof will be to rewrite the model, projecting out all regressors that are not related to the hypothesis in question. Then we will prove that the statistic is numerically unchanged if it is calculated from the rewritten model. Finally the expression of W obtained from the rewritten model will be used to derive the asymptotic distribution of the statistic.
To rewrite the model, let
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm057.gif?pub-status=live)
, where D is chosen such that L has full rank
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm077.gif?pub-status=live)
, and define
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm058.gif?pub-status=live)
. Using these definitions, model (4) can be rewritten in the following manner:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-35017-mediumThumb-S0266466606060348ffm059.jpg?pub-status=live)
Because
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm060.gif?pub-status=live)
are linear combinations of X, they too contain unit root processes if the original assumption of just one cointegration relationship is maintained. Furthermore
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm061.gif?pub-status=live)
contains the leads and lags of the differenced
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm062.gif?pub-status=live)
variables. We will now show that testing H0 : Rβ = β0 is equivalent to testing the hypothesis
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm063.gif?pub-status=live)
: $\beta_1^* = \beta_0$ in the model
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348frm008.gif?pub-status=live)
where for any matrix G,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm064.gif?pub-status=live)
.
LEMMA 3. The statistic for testing
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm065.gif?pub-status=live)
from (A.3) is numerically identical to the statistic for testing H0 : Rβ = β0 in the model given by (4).
The proof follows from extensive but simple matrix algebra. To complete the proof of the theorem we thus need to determine the asymptotic distribution of
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm066.gif?pub-status=live)
Because
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm067.gif?pub-status=live)
, where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm068.gif?pub-status=live)
is defined as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm069.gif?pub-status=live)
, but for the model in (A.3), we know
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-34936-mediumThumb-S0266466606060348ffm070.jpg?pub-status=live)
and therefore,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-58400-mediumThumb-S0266466606060348ffm071.jpg?pub-status=live)
By the definition of $X_1^*$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170215065628874-0502:S0266466606060348:S0266466606060348ffm072.gif?pub-status=live)
The distribution of W* can now be obtained.
If k is type 1,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-46492-mediumThumb-S0266466606060348ffm073.jpg?pub-status=live)
If k is type 2,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-99260-mediumThumb-S0266466606060348ffm074.jpg?pub-status=live)
If k is Bartlett,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170408223603-51128-mediumThumb-S0266466606060348ffm075.jpg?pub-status=live)