Published online by Cambridge University Press: 31 March 2005
This paper proposes a test of the rank of a submatrix of β, where β is a cointegrating matrix. In addition, the corresponding submatrix of β⊥, an orthogonal complement to β, is investigated. We construct the test statistic from the eigenvalues of a quadratic form of the submatrix. We show that the test statistic has a limiting chi-square distribution when data are nontrending, whereas for trending data we must resort to a conservative test or to another testing procedure that requires a pretest of the structure of the matrix. Finite sample simulations show that, although the simulation settings are limited, the proposed test works well for nontrending data, whereas the test must be used with care for trending data because it may become too conservative in some cases.

I owe special thanks to two anonymous referees, the co-editor, Pierre Perron, and Taku Yamamoto. All errors are my responsibility. This research was supported by the Ministry of Education, Culture, Sports, Science and Technology under grants-in-aid 13730023 and 14203003.
A vector autoregressive (VAR) process has often been used to model a multivariate economic time series and, following the seminal work of Engle and Granger (1987), a cointegrating relation has been incorporated into the VAR model. A typical n-dimensional VAR model of order m is

xt = A1 xt−1 + ··· + Am xt−m + d + εt,  (1)

for t = 1,…,T, where {εt} is independently and identically distributed (i.i.d.) with mean zero and a positive definite covariance matrix Σ, and det(In − A1 z − ··· − Am z^m) has all roots outside the unit circle or equal to 1. The model (1) can be written in the error correction (EC) format

Δxt = αβ′xt−1 + Γ1Δxt−1 + ··· + Γm−1Δxt−m+1 + d + εt,  (2)
where α and β are n × r matrices with rank r, Δ = 1 − L, and L denotes the lag operator. We assume 0 < r < n, so that there are r cointegrating relations. The exact condition for the existence of cointegration is given by Johansen (1991, 1992). We also assume that the cointegrating rank r is known or has been estimated by some testing procedure, such as the likelihood ratio (LR) test proposed by Johansen (1988, 1991) or the Lagrange multiplier (LM) test of Lütkepohl and Saikkonen (2000) and Saikkonen and Lütkepohl (2000). Other testing procedures for the cointegrating rank are reviewed by Hubrich, Lütkepohl, and Saikkonen (2001) and the papers cited therein.
In this paper, we investigate tests of the rank of β1, a submatrix of β, and of the rank of β⊥,1, a submatrix of β⊥, where β = [β1′,β2′]′ and β⊥ = [β⊥,1′,β⊥,2′]′, with β⊥ being an orthogonal complement to β. In practical analysis, we sometimes encounter cases where we need to know the rank of β1 and/or β⊥,1. For example, the cointegrating matrix is sometimes normalized as β* = β(a′β)−1, as proposed by Johansen (1988, 1991) and Paruolo (1997), where a is an n × r matrix with full column rank and the prototype normalization is represented by a = [Ir,0]′. However, there is no guarantee that a′β is of full rank. In such a situation, we would like to know whether the first r rows of β have full rank. The second example is the Granger noncausality test. As shown in Toda and Phillips (1993), when there is a cointegrating relationship, in general the Wald statistic of the Granger noncausality test from the last n3 variables of xt to the first n1 variables has a nonstandard limiting distribution, depending on nuisance parameters. However, if either the last n3 rows of β or the first n1 rows of α have full row rank, the Wald statistic is asymptotically χ² distributed. Thus, the testing procedure in this paper is useful for checking the rank of the submatrix of β, whereas existing testing procedures may be available for the test of the rank of the submatrix of α. The third example is the test of long-run Granger noncausality proposed by Yamamoto and Kurozumi (2001, 2003). In the usual sense, Granger causality is concerned with the one-period-ahead forecast. This concept can be extended to predictability at the h-period-ahead horizon, and long-run Granger causality is defined when the forecast horizon h goes to infinity. See, for example, Bruneau and Jondeau (1999) and Dufour and Renault (1998).
Yamamoto and Kurozumi (2003) proposed the test for long-run block noncausality, in which it is shown that the ranks of the submatrices of β and β⊥ play an important role in constructing the test statistic. See Yamamoto and Kurozumi (2003) for more details.
Tests of the rank of a matrix have been investigated in the literature, and recent econometric developments can be seen in works by Camba-Mendez, Kapetanios, Smith, and Weale (2003), Cragg and Donald (1996, 1997), and Robin and Smith (2000), among others. Although these papers proposed tests of the rank of a matrix, they assumed that the estimator of the matrix is T1/2 consistent and has a limiting normal distribution with a nonstochastic variance matrix. However, the estimator of the cointegrating matrix is T (or T3/2) consistent and has an asymptotic nonstandard distribution. As a result, we cannot apply existing testing procedures to the cointegrating matrix.
The paper is organized as follows. In Section 2, we propose tests of the rank of β1 and β⊥,1 for nontrending data. We will show that the two test statistics proposed have asymptotically a χ2 distribution and a distribution of the maximum eigenvalue of the product of normal random matrices. Section 3 considers the case of trending data. In this case, the test statistics do not necessarily converge to a χ2 distribution and a distribution of the maximum eigenvalue. To overcome this situation, we propose two testing procedures. Section 4 investigates the finite sample properties of the tests. Section 5 concludes the paper.
In regard to notation, we use vec(A) to stack the rows of a matrix A into a column vector, [x] to denote the largest integer ≤ x, and ā = a(a′a)−1 for a full column rank matrix a. The symbols →p, →d, and ⇒ signify convergence in probability, convergence in distribution, and weak convergence of the associated probability measures, respectively. We denote the rank of A by rk(A) and the column space of A by sp(A). We write integrals such as ∫₀¹X(s) dY′(s) simply as ∫X dY′ to achieve notational economy, and all integrals are from 0 to 1 except where otherwise noted.
In this section we consider a test of rank for nontrending data with d = 0. The model considered in this section is

Δxt = αβ′xt−1 + Γ1Δxt−1 + ··· + Γm−1Δxt−m+1 + εt.  (3)
We estimate the model (3) by the maximum likelihood (ML) method assuming that {εt} is Gaussian, although the asymptotic properties are preserved under more general assumptions. We denote ML estimators with a caret; for example, the ML estimator of β is denoted by β̂. Using the result that

T^{−1/2} Σ_{t=1}^{[Tr]} εt ⇒ W(r)  for 0 ≤ r ≤ 1

by the functional central limit theorem, where W(·) is an n-dimensional Brownian motion with a variance matrix Σ, Johansen (1988, 1995) showed that, with β̂ suitably normalized,

T(β̂ − β) ⇒ β̄⊥ (∫G0G0′ ds)^{−1} ∫G0 dV′,  (4)

where G0(s) = β⊥′CW(s) and V(s) = (α′Σ−1α)−1α′Σ−1W(s), with C = β⊥(α⊥′Γβ⊥)−1α⊥′ and Γ = In − Γ1 − ··· − Γm−1, and G0(·) and V(·) are independent, so that the limiting distribution is mixed Gaussian. He also showed that α̂, Σ̂, and Γ̂i are consistent estimators of α, Σ, and Γi, respectively.
Let us partition β as β′ = [β1′,β2′], where β1 and β2 are n1 × r and (n − n1) × r matrices, respectively (0 < n1 < n). Similarly, we partition β⊥′ = [β⊥,1′,β⊥,2′] conformably. Note that β1′β⊥,1 does not necessarily equal zero, whereas β′β⊥ = β1′β⊥,1 + β2′β⊥,2 must be zero. Our interest lies in finding the rank of β1, and thus we consider the following testing problem:

H0: rk(β1) = f  vs.  H1: rk(β1) > f.  (5)
Note that the rank of β1 is at most p ≡ min(n1,r).
To test the rank of β1, we follow the same strategy as Robin and Smith (2000), who test the rank of a matrix and investigate its quadratic form. In our situation, we construct a quadratic form of β1. The advantage of considering a quadratic form is that the eigenvalues are nonnegative real values, even if those of β1 are complex values. Then, the null hypothesis H0 becomes equivalent to the existence of f positive real and n1 − f zero eigenvalues.
Let Ψ and Φ be r × r and n1 × n1 possibly stochastic matrices that are symmetric and positive definite almost surely (a.s.). Because they are full rank matrices (a.s.), the rank of β1 is equal to the rank of Φ−1β1Ψβ1′ (a.s.). Therefore, the test of the rank of β1 is equivalent to that of Φ−1β1Ψβ1′, and we consider the rank of the latter matrix. Note that, although this strategy is basically the same as that of Robin and Smith (2000), we cannot directly use their result because they assume that the estimated matrix is asymptotically normally distributed with a convergence rate T^{1/2}, whereas β̂ is T-consistent with a mixed Gaussian limiting distribution.
For the test of the rank of β1, we define Ψ = α′Σ−1α and Φ as in (6). These Ψ and Φ are chosen so that the limiting distribution of the test statistic does not depend on nuisance parameters. Other choices of Φ may be possible because, as shown in the Appendix, the test statistic asymptotically does not depend on β1(β′β)−1β1′, which appears when (6) is expanded. For example, we can use a constant multiple of (β′β)−1 in the second term of (6). However, as indicated in the Appendix, Φ has to be invariant to the normalization of β. We use the definition (6) simply because it seems the simplest among the possible choices.
Let λ1 ≥ λ2 ≥ ··· ≥ λn1 be the ordered eigenvalues of Φ−1β1Ψβ1′, which are the solutions of the determinant equation

|λΦ − β1Ψβ1′| = 0.  (7)

Then, under H0, λ1 ≥ ··· ≥ λf > 0 and λf+1 = ··· = λn1 = 0 (a.s.).
We construct a sample analogue of (7) using the ML estimators and investigate the limiting distributions of the eigenvalues. The sample analogue of (7) is given by

|λΦ̂ − β̂1Ψ̂β̂1′| = 0,  (8)

where β̂1 is the first n1 rows of β̂, Ψ̂ = α̂′Σ̂−1α̂, and Φ̂ is constructed as in (9) and (10), with R1t being the regression residual of xt−1 on Δxt−1,…,Δxt−m+1. We denote the ordered eigenvalues of (8) as λ̂1 ≥ ··· ≥ λ̂n1. Note that when n1 > r, the smallest n1 − r eigenvalues are obviously equal to 0, that is, λ̂r+1 = ··· = λ̂n1 = 0. We can easily see from the expressions (6) and (9) that Φ and Φ̂ are positive definite (a.s.), whereas the expression (10) is simpler and may be used to construct Φ̂ in practice.
To test the rank of β1, we consider the trace-type test statistic formed from the scaled sum of the smallest estimated eigenvalues,

T² Σ_{i=f+1}^{n1} λ̂i = T² Σ_{i=f+1}^{p} λ̂i,

which rejects the null hypothesis when it takes large values. The second equality is established because p = min(n1,r) and λ̂r+1 = ··· = λ̂n1 = 0 when n1 > r.
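As an illustration, the statistic above can be computed by solving a generalized symmetric eigenproblem numerically. This is a minimal sketch, not the paper's code: the estimates `beta1_hat`, `Psi_hat`, and `Phi_hat` are assumed to be given (their construction follows (6), (9), and (10)), and the function name is ours.

```python
import numpy as np
from scipy.linalg import eigh

def rank_test_statistic(beta1_hat, Psi_hat, Phi_hat, T, f):
    """Trace-type statistic for H0: rk(beta1) = f.

    Solves the sample analogue |lambda*Phi_hat - beta1_hat Psi_hat beta1_hat'| = 0
    as a generalized symmetric eigenproblem and sums the n1 - f smallest
    eigenvalues, scaled by T**2 (the eigenvalues are O(T^-2) under H0).
    """
    A = beta1_hat @ Psi_hat @ beta1_hat.T          # n1 x n1, positive semidefinite
    lam = np.sort(eigh(A, Phi_hat, eigvals_only=True))[::-1]  # descending order
    return T**2 * lam[f:].sum()

# toy check: beta1 of rank 1, so the statistic for f = 1 is numerically zero
beta1 = np.array([[1.0, 0.0], [2.0, 0.0]])
stat = rank_test_statistic(beta1, np.eye(2), np.eye(2), T=100, f=1)
```

In practice Φ̂ would be a stochastic weighting matrix rather than the identity used in this toy check; the generalized-eigenvalue formulation handles any positive definite Φ̂.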
We can also consider the null hypothesis of rk(β1) = f against the alternative of rk(β1) = f + 1. In this case, the test statistic is the maximum-eigenvalue analogue T²λ̂f+1. To describe its limiting distribution, we define λ*max,j,k as the maximum eigenvalue of X*X*′, that is, the largest root λ of

|λI − X*X*′| = 0,  (11)

where X*′ is a j × k matrix with vec(X*′) ∼ N(0,Ijk). The critical points of this distribution are given in Table 1 for the case where j ≥ k. They are calculated by simulations with 1,000,000 replications. Because the nonzero eigenvalues of X*X*′ are the same as those of X*′X*, we can refer to the percentage points of λ*max,k,j when j < k.
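Percentage points of this kind can be reproduced (approximately) by direct simulation. The sketch below uses far fewer replications than the paper's 1,000,000, and the function name is ours; it exploits the fact that the largest eigenvalue of X*X*′ is the squared largest singular value of X*′.

```python
import numpy as np

def lambda_max_quantiles(j, k, reps=200_000, qs=(0.90, 0.95, 0.99), seed=0):
    """Simulate quantiles of lambda*_{max,j,k}: the largest eigenvalue of
    X*X*' where X*' is j x k with vec(X*') ~ N(0, I_{jk}).

    The largest eigenvalue equals the largest singular value of X*' squared,
    which avoids forming X*X*' explicitly.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((reps, j, k))              # each slice plays X*'
    smax = np.linalg.svd(X, compute_uv=False)[:, 0]    # largest singular values
    return np.quantile(smax**2, qs)

# sanity check: for j = k = 1 the distribution is chi-square with 1 d.f.,
# whose 95% point is about 3.84
q = lambda_max_quantiles(1, 1, qs=(0.95,))
```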
THEOREM 1. Let Φ̂ be given by (10). If f < p, under H0, the trace-type statistic converges in distribution to χ²(n1−f)(r−f), and the maximum-eigenvalue statistic converges in distribution to λ*max,n1−f,r−f.
Remark 1. Because the determinant equation (8) converges to (7) in distribution, the estimated ordered eigenvalues of (8) also converge in distribution to those of (7). Then, under the alternative, λ̂f+1 converges to a positive value (a.s.), so that the test statistic goes to infinity. Therefore, the tests are consistent.
Remark 2. Although the test statistics are constructed using the estimator of β⊥,1, we do not have to assume that it is of full rank. We can see that the rank of β⊥,1 is at least n1 − f under H0, noting that the column space of β⊥,1 must contain n1 − f bases that are orthogonal to sp(β1) because [β1,β⊥,1] has full row rank n1. Because β1′ β⊥,1 is not necessarily equal to zero, it is possible for sp(β⊥,1) to contain some of the bases that span sp(β1), so that the rank of β⊥,1 may be greater than n1 − f. It is shown in the Appendix that the limiting distributions of the test statistics depend not on the rank of β⊥,1 but on the number of the bases orthogonal to sp(β1), n1 − f, unless f = n1. When f = n1, all the eigenvalues are asymptotically greater than zero (a.s.), and then the test statistics will diverge. This case is excluded from the theorem (f is assumed to be less than p = min(n1,r)). In other words, our tests cannot be applied for the null hypothesis of full rank. If we need to check whether β1 is of full rank or not, we may test for the null of f = n1 − 1, and if we rejected the null hypothesis, we would conclude that it is a full row rank matrix.
Remark 3. Because the hypothesis about the rank of β1 can be regarded as a restriction on the cointegrating matrix β, we may consider using the LR test as proposed by, for example, Johansen (1991, 1995) and Johansen and Juselius (1990, 1992). In fact, when f = 0 the null hypothesis is equivalent to β1 = 0, and this hypothesis can be expressed as a linear restriction on β such as β = Hφ, where H = [0,In−n1]′ and φ is an (n − n1) × r unknown parameter. Then, the LR test is applicable to the test of f = 0. However, for 0 < f < p, the null hypothesis is expressed as β1 = β11 β12′ where β11 and β12 are n1 × f matrices with full column rank f. Then, we have to estimate the model with this restriction. Although the LR test might be applicable to the nonlinear hypothesis, it seems tedious to estimate the model with this nonlinear restriction, whereas our test uses only the ML estimator without the restriction. It is beyond our scope to investigate the applicability of the LR test to our case, and we do not discuss this in detail.
We may represent the null hypothesis as proposed by Boswijk (1996) and apply the LR test. According to his paper, the null hypothesis of rk(β1) = f is expressed as β = (Hoφ,ψ) where Ho = [0,In2]′ and (φ,ψ) ∈ Rn2×(r−f) × Rn×f. As pointed out by Boswijk (1996, p. 156), the LR test for this hypothesis has an asymptotic χ2 distribution only when “no linear combination of ψ lies in the column space of” Ho. Because there is no guarantee of this condition, we do not consider his method in this paper.
Next, we consider a test of the rank of the submatrix of β⊥. The testing problem is

H0⊥: rk(β⊥,1) = g  vs.  H1⊥: rk(β⊥,1) > g.

For the same reason as in the test of β1, we investigate the rank of Φ⊥−1β⊥,1Ψ⊥β⊥,1′, where Ψ⊥ and Φ⊥ are (n − r) × (n − r) and n1 × n1 full rank matrices (a.s.). Similar to (7), we consider the following determinant equation:

|λΦ⊥ − β⊥,1Ψ⊥β⊥,1′| = 0,  (12)

where Ψ⊥ and Φ⊥ are defined analogously to Ψ and Φ, and the sample analogue of (12) is given by

|λΦ̂⊥ − β̂⊥,1Ψ̂⊥β̂⊥,1′| = 0,  (13)

where Ψ̂⊥ and Φ̂⊥ are given by (14). Let λ⊥,1 ≥ ··· ≥ λ⊥,n1 and λ̂⊥,1 ≥ ··· ≥ λ̂⊥,n1 be the ordered eigenvalues of (12) and (13), respectively, and we construct the trace-type and maximum-eigenvalue test statistics in the same way as before, with q = min(n1,n − r).
THEOREM 2. Let Φ̂⊥ be given by (14). If g < q, under H0⊥, the trace-type statistic converges in distribution to χ²(n1−g)(n−r−g), and the maximum-eigenvalue statistic converges in distribution to λ*max,n1−g,n−r−g.
Note that the consistency of the tests is shown in the same way as in Remark 1. We also note that we cannot test the null of rk(β⊥,1) = q, for a reason similar to that given in Remark 2.
Given the preceding two theorems, we can test the rank of β1 and of β⊥,1. In addition, we may use a procedure to decide the rank of the submatrix, just as the cointegrating rank is selected sequentially using the test of the cointegrating rank. For example, to decide the rank of β1, we first test the null of f = 0. If the null hypothesis is accepted, the rank of β1 is decided to be zero. Otherwise, we next test the hypothesis of f = 1. We continue testing sequentially until the null hypothesis is accepted. When the null of f = p − 1 is rejected, we conclude that β1 has full rank. The rank of β⊥,1 can be decided by the same procedure.
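The sequential procedure just described can be sketched as follows; `test_stat` and `crit_val` stand for any of the statistics and critical values above (the names are ours).

```python
def select_rank(test_stat, crit_val, p):
    """Sequentially test H0: rank = f for f = 0, 1, ..., p - 1.

    test_stat(f) -> statistic for the null of rank f
    crit_val(f)  -> critical value for that null

    Returns the first f whose null is accepted; if every null up to
    f = p - 1 is rejected, the matrix is judged to have full rank p.
    """
    for f in range(p):
        if test_stat(f) <= crit_val(f):
            return f
    return p

# toy run: the null f = 0 is rejected (10 > 5), the null f = 1 accepted (2 <= 5)
rank = select_rank(lambda f: [10.0, 2.0][f], lambda f: 5.0, p=2)
```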
In the previous section, we considered the model with d = 0 for nontrending data. However, in practice, we sometimes consider the model (2) with d ≠ 0 but with the level of the data having no linear trend. In this case, the constant term can be expressed as d = αρ0, where ρ0 is an r × 1 coefficient vector, so that the model (2) becomes

Δxt = αβ+′x+t−1 + Γ1Δxt−1 + ··· + Γm−1Δxt−m+1 + εt,  (15)

where β+ = [β′,ρ0]′ and x+t−1 = [xt−1′,1]′. The ML estimator of β+ can be obtained by the reduced rank regression of Δxt on x+t−1 corrected for Δxt−1,…,Δxt−m+1, and the estimator of the cointegrating matrix is the first n rows of β̂+.
To test the rank of the submatrix of β for the model (15), we use Φ̂ defined by (16), which involves two auxiliary matrices of dimensions (n − r + 1) × (n − r) and (n + 1) × (n − r + 1), with R1t+ being the regression residual of x+t−1 on Δxt−1,…,Δxt−m+1.
THEOREM 3. Consider the model (15) and let Φ̂ be given by (16). If f < p, under H0, the same limiting results as in Theorem 1 hold.

THEOREM 4. Consider the model (15) and let Φ̂⊥ be given by (14). If g < q, under H0⊥, the same limiting results as in Theorem 2 hold.
In practical analysis, we will obtain β̂ by the reduced rank regression, and we have to calculate β̂⊥ from β̂. If d = 0, β̂⊥ can be easily obtained as explained in Johansen (1995, p. 95). When d = αρ0, one method to calculate β̂⊥ is as follows. First we calculate the orthogonal projection matrix M = In − β̂(β̂′β̂)−1β̂′. Then, by the singular value decomposition, M is expressed as Ml Mλ Mr′, where Ml and Mr are n × (n − r) orthogonal matrices and Mλ is an (n − r) × (n − r) diagonal matrix with positive diagonal elements. Because sp(M) = sp(Ml) and these spaces are orthogonal to sp(β̂), we can use Ml as β̂⊥.
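The projection-plus-SVD construction of β̂⊥ can be sketched generically as follows (the function name is ours):

```python
import numpy as np

def orth_complement(beta_hat):
    """Orthogonal complement of sp(beta_hat) via the method in the text:
    M = I_n - beta(beta'beta)^{-1}beta' projects onto the orthogonal
    complement; its SVD M = Ml Mlam Mr' has exactly n - r positive
    singular values, and the corresponding left singular vectors span
    the orthogonal complement of sp(beta_hat).
    """
    n, r = beta_hat.shape
    M = np.eye(n) - beta_hat @ np.linalg.solve(beta_hat.T @ beta_hat, beta_hat.T)
    U, s, Vt = np.linalg.svd(M)
    return U[:, : n - r]          # n x (n - r), orthonormal columns

beta_hat = np.array([[1.0], [1.0], [0.0]])   # n = 3, r = 1
B_perp = orth_complement(beta_hat)           # 3 x 2, orthogonal to beta_hat
```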
When data are trending, xt can be expressed as the sum of the stochastic trend, the deterministic trend, and the I(0) component such that

xt = C Σ_{i=1}^{t} εi + τt + C1(L)(d + εt) + x0*,  (17)

where C = β⊥(α⊥′Γβ⊥)−1α⊥′ as defined in Section 2.1, τ = Cd, C1(L) = (C(L) − C(1))/(1 − L) with C(L) being the lag polynomial when Δxt is represented as the vector moving-average process Δxt = C(L)(d + εt), and x0* is a stochastic component such that β′x0* = 0. See Johansen (1991, 1995) for more details. In this case, β⊥ is decomposed into τ, the coefficient of the linear trend in (17), and γ, an n × (n − r − 1) matrix that is orthogonal to τ. We partition γ and τ into [γ1′,γ2′]′ and [τ1′,τ2′]′ in the same way as β. As shown in Chapter 13.2 of Johansen (1995), β̂ can be expressed as in (18), in terms of random quantities U1 and U2, where G(r) = [G1′(r),G2′(r)]′ with G1(r) = G0(r) − ∫G0 ds, G0(r) = γ′CW(r), and G2(r) = r − ½. We denote Ω = ∫GG′ ds and partition it into 2 × 2 blocks conformably with [U1′,U2′]′. We express the (i,j) block element of (∫GG′ ds)−1 as Ωij for i,j = 1 and 2. In this section, we need the estimator of Ω11, denoted Ω̂11; in its construction, S11 is defined in the same way as in the previous section, with R1t being the regression residual of xt−1 on a constant and Δxt−1,…,Δxt−m+1. Convergence of Ω̂11 is proved in Lemma 2(iii) in the Appendix, whereas the consistency of the other ML estimators, such as α̂ and Σ̂, is shown by Johansen (1991, 1995).
In the following discussion, we will show that the limiting distribution of the test statistic depends on whether the rank of [β1,γ1] is n1 − 1 or n1, or equivalently, whether τ2 = 0 or not. We will propose two testing procedures to cope with this problem.

Let us consider the testing problem (5). Under the null hypothesis, we can find f linearly independent column vectors in β1, and we define β1* as an n1 × f matrix whose columns consist of those f vectors. We also define an n1 × (n1 − f) matrix δ* as an orthogonal complement to β1*, so that δ*′β1* = 0. We show that the direction of δ* is important in deciding the convergence rate of δ*′β̂1 and that it also affects the limiting property of the test statistic.
Let us consider the case where r < n − 1. Because β̂1 is the first n1 rows of β̂, it is expressed from (18) as (19). Suppose that an n1 × 1 vector τ1* exists that is orthogonal to γ1 (τ1*′γ1 = 0) and belongs to the column space of δ*. Here, note that, because the n × n matrix [β,γ,τ] is of full rank, the first n1 rows of this matrix, [β1,γ1,τ1], must be of full row rank, which implies that a′[β1,γ1,τ1] ≠ 0 for any nonzero vector a. Then, because τ1* is orthogonal to both β1 and γ1 by assumption, we have τ1*′[β1,γ1,τ1] = [0,0,τ1*′τ1] ≠ 0, so that τ1*′τ1 ≠ 0. This implies the convergence rate in (20), whereas a different rate applies for an n1 × (n − r − 1) matrix δ0* whose columns span the orthogonal complement to τ1* in sp(δ*). On the other hand, if there exists no vector in sp(δ*) that is orthogonal to γ1, we have (21). Therefore, the convergence rate of δ*′β̂1 depends on whether a vector τ1* orthogonal to γ1 exists in sp(δ*).

The existence of τ1* indicates that the column space of [β1,γ1] does not include τ1* because τ1*′β1 = 0 and τ1*′γ1 = 0. We also note that the rank of [β1,γ1] must be n1 − 1 or n1 because [β1,γ1,τ1] has full rank n1. Then, from another point of view, we can say that the rank of [β1,γ1] is n1 − 1 if a vector τ1* exists, whereas the nonexistence of τ1* is equivalent to rk([β1,γ1]) = n1. Thus, we have to consider the asymptotic properties separately in the two cases where the rank of [β1,γ1] is n1 and n1 − 1 when r < n − 1.
For further investigation, let us consider the case where the rank of [β1,γ1] equals n1 − 1. In this case, this matrix can be expressed as [Θ11,0] by some nonsingular transformation from the right-hand side, where Θ11 is an n1 × (n1 − 1) matrix with rank n1 − 1. Then, using the same nonsingular transformation, [β,γ] takes the form displayed in (22). Let τ1* be the orthogonal complement to the column space of Θ11. Then, because τ1*′Θ11 = 0, and using the expression (22), we can see that the n × 1 vector [τ1*′,0]′ is orthogonal to [β,γ]. Therefore, in this case, the trend parameter τ, which is orthogonal to β and γ, is a constant multiple of [τ1*′,0]′. In other words, when rk([β1,γ1]) = n1 − 1, τ2 must be equal to zero. Note that, because this τ1* is orthogonal to sp(β1) and sp(γ1), it is essentially the same as the τ1* defined earlier.

On the other hand, when τ2 = 0, τ is expressed as [τ1′,0]′, and then τ1′[β1,γ1] equals zero because τ′[β,γ] = 0. This implies that the n1 × (n − 1) matrix [β1,γ1] does not have full row rank. Then, we have the following proposition.
PROPOSITION 1. The rank of [β1,γ1] is n1 − 1 if and only if τ2 = 0.
When r = n − 1, there is no γ, and in this case, rk(β1) must be n1 − 1 or n1. Then, under the null hypothesis of rk(β1) = n1 − 1, δ* becomes an n1 × 1 vector, and we have the rate given in (23). In this case, the test statistics should be multiplied by T, that is, T Σ_{i=f+1}^{p} λ̂i and Tλ̂f+1 are the appropriate test statistics.
In the following theorem, the test statistics are constructed from the eigenvalues of (8) using the same Ψ̂ as in the previous section and with Φ̂ given by either (24) or (25).
THEOREM 5. When r < n − 1:

(i.a) Let Φ̂ be given by (24). If rk([β1,γ1]) = n1 and f < p, under H0, the statistics converge in distribution to χ²(n1−f)(r−f) and λ*max,n1−f,r−f, respectively.

(i.b) Let Φ̂ be given by (25). If rk([β1,γ1]) = n1 and f < p, under H0, the statistics converge in distribution to random variables that are bounded above by χ²(n1−f)(r−f) and λ*max,n1−f,r−f, respectively.

(ii) Let Φ̂ be given by (25). If rk([β1,γ1]) = n1 − 1 and f < p, under H0, the statistics converge in distribution to χ²(n1−f−1)(r−f) and λ*max,n1−f−1,r−f, respectively.

When r = n − 1:

(iii) Let Φ̂ be given by (25). Under the null hypothesis of f = n1 − 1, the statistics, multiplied by T as explained above, converge in distribution to the corresponding limiting random variables.
Remark 4. In the case of (i.b), the statistic converges in distribution to χ²(n1−f)(r−f) if and only if δ*′τ1 = 0, which is equivalent to the case where τ1 ∈ sp(β1*) = sp(β1). See the proof in the Appendix. In general, the test using (25) is conservative if rk([β1,γ1]) = n1.
From Theorem 5, if we knew the rank of [β1,γ1] when r < n − 1, we could construct a test statistic that converges to a χ² distribution by appropriately using (24) or (25). However, such information is not available in practice. Notice that if rk([β1,γ1]) = n1 − 1, Φ̂ given by (24) may violate the condition that it is a full rank matrix, and in that case, the test statistic converges not to the χ² distribution given by Theorem 5(ii) but to a random variable that depends on a nuisance parameter. Hence the test using (24) is not desirable in practice. On the other hand, if we use Φ̂ given by (25), we can test the hypothesis by referring to a χ² distribution irrespective of the rank of [β1,γ1], although the test may be conservative and the degrees of freedom may change depending on the rank of [β1,γ1]. Then, noting that the critical value of χ²(n1−f)(r−f) in Theorem 5(i) is greater than that of χ²(n1−f−1)(r−f) in (ii), we propose to test the null of rk(β1) = f as follows.

1. We construct the test statistic using (25).

2. If the statistic is greater than the critical value of χ²(n1−f)(r−f), we reject the null hypothesis.

3. If the statistic is less than the critical value of χ²(n1−f−1)(r−f), we accept the null hypothesis.

The maximum-eigenvalue statistic is used in the same manner. In this procedure, we may encounter the case where the test statistic is greater than the critical value of χ²(n1−f−1)(r−f) but less than that of χ²(n1−f)(r−f), where c(n1−f−1)(r−f) and c(n1−f)(r−f) denote the corresponding critical values. To cope with such a case, the following corollary is useful.
COROLLARY 1. Let Φ̂ be given by (25). Suppose that r < n − 1 and the rank of β1 is f (< p).

(i) If rk([β1,γ1]) = n1, the statistic converges in distribution to a random variable that is bounded above by λ*min,r−f,n1−f, the smallest nonzero eigenvalue of (11) with j = r − f and k = n1 − f.

(ii) If rk([β1,γ1]) = n1 − 1, the statistic converges in probability to zero.

The percentage points of λ*min,r−f,n1−f are tabulated in Table 1.
Using the preceding corollary, we can cope with the situation where the test statistic falls between the two critical values. If the statistic is less than some percentage (10, 5, or 1%) point of λ*min,r−f,n1−f, we reject the hypothesis of rk([β1,γ1]) = n1. In that case, c(n1−f−1)(r−f) is the appropriate critical value, so that the null of rk(β1) = f is rejected. On the other hand, if the statistic is greater than the critical point of λ*min,r−f,n1−f, we accept the hypothesis of rk([β1,γ1]) = n1, so that the rank of β1 is decided to be f. We call this testing procedure TEST1.
The other strategy is to use the result of Proposition 1. From Johansen (1995), the estimator of τ, suitably scaled, converges in distribution to a normal random vector with mean zero and variance matrix given by CΣC′. Although a Wald-type test may not be applicable to the test of τ2 = 0 because the variance matrix might be degenerate, we can test whether each element of τ2 is zero or not by the t-test statistic. We call the following testing procedure TEST2.
1. We test each element of τ2.
2. If some of the elements of τ2 are significant, we use Theorem 5(i.a).
3. If none of the elements of τ2 are significant, we use Theorem 5(ii).
Next, we investigate a test of the rank of β⊥,1. When data are trending, β⊥,1 can be decomposed into [γ1,τ1], where γ1 and τ1 are the first n1 rows of γ and τ, respectively. Then, testing the rank of β⊥,1 is equivalent to testing the rank of [γ1,τ1], and therefore we construct a test statistic from [γ̂1,τ̂1]. Note that [γ̂1,τ̂1] is the first n1 rows of [γ̂,τ̂] and is not necessarily numerically equal to β̂⊥,1, although they span the same column space.

Let us consider the same determinant equation as (13) with Ψ̂⊥ and Φ̂⊥ replaced by (26) and (27). We construct the test statistics in the same way as in the previous section. As in Theorem 5, we have to distinguish the two cases r < n − 1 and r = n − 1. When r = n − 1, the rank of β⊥,1 (= τ1) must be 0 or 1, and in this case, we consider the null hypothesis of g = 0.
THEOREM 6. Let Ψ̂⊥ and Φ̂⊥ be given by (26) and (27). When r < n − 1 and g < q, under H0⊥, the statistics converge in distribution to random variables that are bounded above by χ²(n1−g)(n−g−r) and λ*max,n1−g,n−g−r, respectively. When r = n − 1, under the null hypothesis of g = 0, the statistics, scaled by T, converge in distribution to the corresponding limiting random variables.
In this section, we investigate the finite sample properties of the tests proposed in the previous sections. We consider a four-dimensional EC model as the data generating process (DGP), where {εt} ∼ i.i.d. N(0,I4), under several settings of the parameters. Here DGP1(1o), 2(2o), and 3(3o) correspond to the cases where the cointegrating rank is 1, 2, and 3, respectively. We set the (2,1) element of β as c1, which takes the values 0, 0.005, 0.01, 0.025, 0.05, 0.075, and 0.1, and we consider the test of the rank of the first two rows of β. The case of c1 = 0 corresponds to the null hypothesis, under which the rank of β1 is 0, 1, and 1 for DGP1, 2, and 3, whereas it is 1, 2, and 2 when c1 ≠ 0, which corresponds to the alternative. For the case of nontrending data, we set d = 0 for the zero-mean process, whereas d is defined as αρ0 for the case of d ≠ 0, where ρ0 is set to 1, [1,1]′, and [1,1,1]′ for DGP1(1o), 2(2o), and 3(3o), respectively. On the other hand, for the case of trending data, d is set to d1 or d2; the former corresponds to the case where [β1,γ1] is of full rank (τ2 ≠ 0), whereas the rank of [β1,γ1] is n1 − 1 (τ2 = 0) when d = d2.
Similarly, we set the (2,1) element of β⊥ as c2 and consider the test of the rank of the first two rows of β⊥. In this case, c2 = 0 implies that the rank of β⊥,1 is 1, 1, and 0 for DGP1o, 2o, and 3o, respectively, whereas it is 2, 2, and 1 under the alternative of c2 ≠ 0.
We set x0 = 0 and discard the first 100 observations in all experiments. The number of replications is 5,000, and the level of significance is set to 0.05. We report only the results for the trace-type test statistics because the performance of the maximum-eigenvalue statistics is almost the same.
Table 2 shows the simulation results of the test of rk(β1). When the cointegrating rank is 1, the empirical size is greater than the nominal size of 0.05 for T = 100 when data are nontrending (d = 0 or d = αρ0), whereas it becomes closer to 0.05 for T = 200. When data are trending, τ takes different values for d = d1 and d = d2. Similar to the case of nontrending data, the testing procedure TEST2 tends to over-reject the null of c1 = 0 for T = 100, whereas the testing procedure TEST1 is slightly conservative. Under the alternative of c1 ≠ 0, the power increases rapidly around c1 = 0.025 for nontrending data and for trending data with TEST2, whereas TEST1 is less powerful. This is because TEST1 is a conservative test. When data are trending, both TEST1 and TEST2 are more powerful for the model with rk([β1,γ1]) = n1 (d = d1) than for the model with rk([β1,γ1]) = n1 − 1 (d = d2).
When the cointegrating rank is 2, the relative performance is preserved for the cases of d = 0 and d = αρ0. For trending data, τ again takes different values for d = d1 and d = d2. Note that the maximum-eigenvalue statistic is numerically equal to the trace-type statistic here because the determinant equation (11) with j = k = 1 has only one eigenvalue. Then, we can see that the statistics converge in distribution to χ²1 under H0 when rk([β1,γ1]) = n1 = 2, whereas they converge in probability to zero when rk([β1,γ1]) = n1 − 1 = 1. Accordingly, the testing procedure TEST1 accepts the null hypothesis when the statistic is less than the critical point of χ²1. On the other hand, the asymptotic size of TEST1 becomes 0 when rk([β1,γ1]) = n1 − 1 (d = d2) because the statistic converges in probability to zero in that case. Reflecting this fact, TEST1 is too conservative for d = d2, and it is not powerful when the alternative is close to the null. TEST2 also seems to have no power when rk([β1,γ1]) = n1 − 1 = 1 (d = d2). This is because τ2 is very close to zero; for example, even when c1 = 0.1, the third and fourth elements of τ are very small (the latter is 3/430).
When the cointegrating rank is 3, we can see that the first two variables of xt are cointegrated, whereas the last two variables are stationary. Note that we cannot generate the process such that the rank of β1 is 1 while all the variables are nonstationary. Because we want to investigate the property of the test under the null hypothesis, we allowed several variables to be stationary.
In this case, the power properties are improved in all cases compared with the cases where r = 1 and 2. For trending data, τ takes different values for d = d1 and d = d2. Note that in this case the last two rows of the impact matrix C become zero because the corresponding variables are stationary, so that τ2, the last two rows of Cd, becomes zero irrespective of the value of d. We also note that the result of Theorem 5(iii) applies because r = n − 1 = 3. That is, we do not have to use the conservative test or the pretest as in the cases where r < n − 1. This is why both the size and power properties are improved for trending data compared with the cases where r < 3.
Table 3 reports the results of the test of rk(β⊥,1). From the table, the test tends to overly reject the null hypothesis for several cases when T = 100, whereas the size becomes reasonable when T = 200, except for the case where r = 3 and d = d1. In that case, the test becomes conservative as investigated in Theorem 6. As to the power, we can see that the more complicated the deterministic term becomes, the less powerful is the test.
In this paper, we proposed tests of the rank of a submatrix of the cointegrating matrix and of its orthogonal complement. We can test the hypothesis straightforwardly when data are nontrending, whereas for trending data, we have to examine whether [β1,γ1] is of full rank or we have to use the conservative test. The simulation results show that the test of rk(β1) must be used with care when data are trending and f = n1 − 1, because the test might become too conservative to reject the null hypothesis.
Throughout the Appendix, we reuse the notation H for different matrices when no confusion arises.
Proof of Theorem 1. First, note that we can replace
in (8), where
is the first n1 rows of
, because
. The latter relation is established because
is obtained by the nonsingular transformation of the columns of
does not depend on the normalization of
. We also define
whose columns span the orthogonal complement to
), so that
span the same column space. This implies that
can be obtained by the nonsingular transformation of the columns of
. Then, we can also replace
.
Under the null hypothesis, rk(β1) = f, so there exists an n1 × f matrix β1* with rank f such that sp(β1) = sp(β1*). We denote by δ* the orthogonal complement to β1*; that is, δ* is an n1 × (n1 − f) matrix with rank (n1 − f) such that δ*′β1* = 0.
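The orthogonal complement δ* used here can be computed numerically from a singular value decomposition; the following sketch (with hypothetical numbers for n1 and f) is one standard way to do so:

```python
import numpy as np

def orth_complement(B):
    """Basis of the orthogonal complement of sp(B).

    For an n1 x f matrix B with rank f, returns an n1 x (n1 - f)
    matrix D of full column rank with D.T @ B = 0 -- the role
    played by delta* in the proof.
    """
    n1, f = B.shape
    # left singular vectors beyond the first f span the complement
    U, _, _ = np.linalg.svd(B, full_matrices=True)
    return U[:, f:]

# hypothetical example with n1 = 3, f = 2
B = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])
D = orth_complement(B)
```

Stacking [B, D] then gives a nonsingular n1 × n1 matrix, which is the property exploited repeatedly in the proofs below.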
LEMMA 1.
Proof.
(ii) As shown in Chapter 13.2 of Johansen (1995),
can be expressed as
for nontrending data, where TUT converges in distribution to (∫G0 G0′ ds)−1∫G0 dV′. Because
is the first n1 rows of
, we have
, so that
(iii) holds because
from (A.1).
Now, let us consider the determinant equation (8). Note that (8) is equivalent to
where H = [β1*,Tδ*] is an n × n nonsingular matrix. Using Lemma 1, we have
To investigate the asymptotic behavior of
, we consider
with the same expression as (9). Note that
because
by Lemma 1. Then,
is asymptotically equivalent to
Then, the equation (A.2) is asymptotically equal to
Therefore, the eigenvalues
converge in probability to zero and are of order T−2.
Here, notice that, in the same way as Johansen (1988, p. 246), we can find an r × (r − f) matrix J with rank (r − f) such that
with J′(β1′ β1*) = 0 and J′Ψ−1J = Ir−f , implying that J′(α′Σ−1α)−1J = Ir−f because Ψ = α′Σ−1α. Then, because |β1*′β1Ψβ1′ β1*| ≠ 0, (A.4) becomes
The variance matrix of X0′J conditioned on G0(·) is given by
Noting that sp(β⊥,1) must contain δ* because [β1,β⊥,1] has full row rank n1 and sp(β1) does not contain δ*, we can see that δ*′β⊥,1 has full row rank n1 − f irrespective of the rank of β⊥,1, which is greater than n1 − f, as explained in Remark 2. As a result, the conditional variance matrix of X0′J is nonsingular (a.s.). Then, multiplying both sides of (A.6) by the square root of the left-hand side of (A.7), the determinant equation becomes (11) with j = n1 − f and k = r − f, and then
converges in distribution to the solution of (11). This proves Theorem 1. █
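The matrix J appearing in this proof, satisfying J′(β1′β1*) = 0 and J′Ψ−1J = Ir−f, can be constructed explicitly: take a null-space basis and whiten it with respect to Ψ−1. A numerical sketch with hypothetical values of r, f, and Ψ:

```python
import numpy as np

def whitened_null_basis(M, Psi):
    """Construct J with J.T @ M = 0 and J.T @ inv(Psi) @ J = I.

    M is r x f with rank f and Psi is r x r symmetric positive
    definite, mirroring M = beta1' beta1* and Psi = alpha' Sigma^-1 alpha
    in the proof; J is r x (r - f).
    """
    r, f = M.shape
    # basis N of the orthogonal complement of sp(M)
    U, _, _ = np.linalg.svd(M, full_matrices=True)
    N = U[:, f:]
    # rescale so that J' Psi^{-1} J = I, via the inverse symmetric
    # square root of S = N' Psi^{-1} N
    S = N.T @ np.linalg.inv(Psi) @ N
    w, V = np.linalg.eigh(S)
    S_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    return N @ S_inv_sqrt

# hypothetical numbers: r = 3, f = 1
M = np.array([[1.0], [2.0], [-1.0]])
Psi = np.array([[2.0, 0.3, 0.0],
                [0.3, 1.0, 0.2],
                [0.0, 0.2, 1.5]])
J = whitened_null_basis(M, Psi)
```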
Proof of Theorem 2. The outline of the proof is the same as that of Theorem 1, and thus we omit the details.
Under the null hypothesis, an n1 × g matrix β⊥,1* exists such that sp(β⊥,1*) = sp(β⊥,1) and rk(β⊥,1*) = g, and we denote the orthogonal complement to β⊥,1* by η*. Consider the following determinant equation:
where H = [β⊥,1*,Tη*]. As in the previous proof, we replace the hat (ˆ) estimators with their tilde (˜) counterparts. Because
is the first n1 rows of
, we obtain, using Lemma 1(iii),
Then, similar to the previous proof, we can show that
converges in distribution to a solution of (11) with j = n1 − g and k = n − r − g. This proves Theorem 2. █
Proof of Theorems 3 and 4. Let
. In exactly the same way as in the proof of Lemma 13.2 in Johansen (1995), we can show that
where G0+ = [G0′,1]′. Then, because
is the first n rows of
, we have
whose conditional variance is given by L′(∫G0+G0+′ ds)−1L [otimes ] (α′Σ−1α)−1. Because
as expressed in Johansen (1995, p. 179), we have
We also have
, which is proved in the same way as Lemma 1(iv), where
with
replaced by
. Then, the theorems are proved similarly to Theorems 1 and 2. █
Proof of Theorem 5. For the case where r < n − 1, we give the following lemma.
LEMMA 2.
Proof.
From Lemma 10.3 in Johansen (1995), T−1[γ,T−1/2τ]′S11 [γ,T−1/2τ] converges in distribution to Ω, whereas β′S11 β converges in probability to a positive definite matrix Σβ, and [γ,T−1/2τ]′S11 β = Op(1). Then,
In addition, we can see that
because
. Using this result, we have
From (A.10) and (A.11),
converges in distribution to Ω11. █
(i.a) Proved in the same way as Theorem 1.
(i.b) In this case, the determinant equation becomes asymptotically equivalent to
Note that, in general, for a given symmetric and positive definite matrix A and a vector b,
and then
for any nonzero vector c. By substituting δ*′γ1(γ′γ)−1Ω11(γ′γ)−1γ1′δ* and
for A and b, we obtain, for a given G(·),
where X* is an (r − f) × (n1 − f) matrix with vec(X*) ∼ N(0,I(r−f)(n1−f)). Equality holds if and only if δ*′τ1 = 0.
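The matrix identity invoked for (A + bb′)−1 above is presumably the Sherman–Morrison formula, which also delivers the inequality c′(A + bb′)−1c ≤ c′A−1c with equality exactly when b′A−1c = 0; a quick numerical check (with arbitrary random inputs, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
Q = rng.standard_normal((n, n))
A = Q @ Q.T + n * np.eye(n)      # symmetric positive definite
b = rng.standard_normal(n)
c = rng.standard_normal(n)

Ainv = np.linalg.inv(A)
# Sherman-Morrison:
# (A + b b')^{-1} = A^{-1} - A^{-1} b b' A^{-1} / (1 + b' A^{-1} b)
sm = Ainv - np.outer(Ainv @ b, b @ Ainv) / (1.0 + b @ Ainv @ b)

lhs = c @ sm @ c                 # c'(A + bb')^{-1} c
rhs = c @ Ainv @ c               # c' A^{-1} c
```

Since the correction term subtracted from c′A−1c is (c′A−1b)²/(1 + b′A−1b) ≥ 0, the quadratic form can only shrink, mirroring the equality condition δ*′τ1 = 0 stated above.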
(ii) Let us consider the determinant equation (A.2) with H = [β1*,Tδ0*,Tτ1*]. Using Lemma 2 and by some algebra, the determinant equation is shown to be asymptotically equivalent to
This determinant equation implies that there are f nonzero eigenvalues, p − f − 1 eigenvalues of order T−2, and one eigenvalue of order smaller than T−2. Then, we can see that
We can also show that
is of order T3 if we choose H = [β1*,Tδ0*,T3/2τ1*].
For the case where r = n − 1, the limiting distribution is derived similarly using (23). █
Proof of Corollary 1.
(i) Note that, in general, for a given positive definite matrix A, a vector b, and a matrix D,
where we used the relation (A.12). By Theorem 9 of Magnus and Neudecker (1988, p. 208), the (p − f)th eigenvalue of D′A−1D is greater than that of D′(A + bb′)−1D. Then, by substituting δ*′γ1(γ′γ)−1Ω11(γ′γ)−1γ1′δ*,
, and X′J for A, b, and D, the limiting distribution of
is shown to be bounded above by λmin,r−f,n1−f* because D′A−1D = X*X*′ in this case. Note that
if and only if δ*′τ1 = 0.
(ii) is proved in Theorem 5(ii). █
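The eigenvalue comparison used in part (i), that every eigenvalue of D′(A + bb′)−1D is dominated by the corresponding eigenvalue of D′A−1D, is easy to verify numerically (random inputs below are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 5, 3
Q = rng.standard_normal((n, n))
A = Q @ Q.T + n * np.eye(n)      # symmetric positive definite
b = rng.standard_normal(n)
D = rng.standard_normal((n, k))

# A^{-1} - (A + bb')^{-1} is positive semidefinite, so the sorted
# eigenvalues of the two quadratic forms are ordered elementwise.
ev_small = np.sort(np.linalg.eigvalsh(
    D.T @ np.linalg.inv(A + np.outer(b, b)) @ D))
ev_big = np.sort(np.linalg.eigvalsh(D.T @ np.linalg.inv(A) @ D))
```

This elementwise ordering is what bounds the limiting distribution in Corollary 1(i) from above by λmin,r−f,n1−f*.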
Proof of Theorem 6. Let us define β⊥,1* and η* as in the proof of Theorem 2.
LEMMA 3.
(i)
, say, where
(ii)
.
Proof.
(i) Because η*′γ1 = 0 and η*′τ1 = 0, we have, using (A.14),
(ii) First, note that, because
is invariant to each normalization of
, we can express
.
From the expression (A.1), we can see that
We also have, from the definition of τ,
Because the left-hand side is zero from the orthogonality between γ and τ, the first n − r − 1 rows of (α⊥′Γβ⊥)−1α⊥′ μ are zero. Then, because each estimator is consistent, we have
Combining (A.15) and (A.16), we obtain
. █
Similar to the proof of Theorem 2, we consider the same determinant equation as (A.8). Using Lemma 3, we have
where S1 = [In−r−1,0], and then, using
, (A.8) is expressed as
for large values of T, where an (n − r) × (n − r − g) matrix J satisfies
. Noting that the conditional variance of Y′S1 J is given by
the test statistic
conditioned on G(·) converges in distribution to
where vec(Y*) ∼ N(0,I(n1−g)(n−r−g)) and J = [J1′,J2′]′. Because
the limiting distribution (A.18) is bounded above by