We propose an optimal test procedure for testing the marginal density functions of a class of nonlinear diffusion processes. The proposed test is not only optimal but also avoids undersmoothing. An adaptive test is constructed, and its asymptotic properties are investigated. To establish these properties, we derive some general moment inequalities and asymptotic distribution results for strictly stationary processes under the α-mixing condition. These results are applicable to other estimation and testing problems for strictly stationary α-mixing processes. An example of implementation is given to demonstrate that the proposed model specification procedure is applicable to economic and financial model specification and can be implemented in practice. To ensure applicability and ease of implementation, we propose a computer-intensive simulation scheme for choosing a suitable bandwidth for the kernel estimation and a simulated critical value for the proposed adaptive test. Our finite sample studies support both the proposed theory and the simulation procedure.

The authors thank the co-editor and three anonymous referees for their constructive comments and suggestions. The first author also thanks Song Xi Chen for some constructive suggestions, in particular the suggestion to use the local linear form instead of the Nadaraya–Watson kernel form in equation (2.6), and Yongmiao Hong for sending a working paper. The authors acknowledge comments from seminar participants at the International Chinese Statistical Association Meeting in Hong Kong in July 2001, the Western Australian Branch Meeting of the Statistical Society of Australia in September 2001, the University of Western Australia, and Monash University. Thanks also go to the Australian Research Council for its financial support.
Continuous-time diffusion processes arise in many applications in econometrics, but perhaps nowhere do they play as large a role as in finance. Following the pathbreaking work of Black and Scholes (1973), the use of continuous-time diffusion processes has become a common feature of many applications, especially asset pricing models. This is probably due to two reasons. The first is that continuous-time diffusion processes are able to mimic some important macroeconomic and financial phenomena (see Sundaresan, 2001). The second is that various parametric diffusion processes have proved convenient for modeling financial data. In both theory and practice, however, one needs to determine whether a parametric diffusion process is appropriate for a given set of financial data, that is, whether it is appropriate to use a diffusion process with both the drift and the volatility assumed to be parametric. To assess whether the use of parametric diffusion processes is appropriate for a given set of financial data, empirical researchers have recently shown a preference for nonparametric alternatives. Aït-Sahalia (1996a, 1996b) was among the first to pioneer the nonparametric approach. Other related studies include Jiang and Knight (1997), Stanton (1997), Chapman and Pearson (2000), Gao and King (2001), Hong and Li (2004), and Fan and Zhang (2003). Aït-Sahalia (1996a) considers testing the marginal density functions of a class of diffusion processes under the β-mixing condition. Pritsker (1998) conducts a finite sample simulation of the nonparametric kernel test proposed in Aït-Sahalia (1996a). The principal result of Pritsker (1998) is that the test rejects true models much too often when asymptotic critical values are used. This suggests that the use of an asymptotic critical value may not be suitable for analyzing the finite sample power of a test. In addition, the use of an estimation-based bandwidth in the nonparametric kernel test may also contribute to the poor finite sample performance of the test, because an estimation-based optimal bandwidth need not imply that the corresponding test is optimal. These two aspects have motivated us to establish a simulation procedure for choosing both an appropriate critical value and a bandwidth that is suitable for testing purposes, in order to improve the test proposed in Aït-Sahalia (1996a).
Recently, Horowitz and Spokoiny (2001) have developed a new test of a parametric model of a conditional mean function against a nonparametric alternative. The test adapts to the unknown smoothness of the alternative model and is uniformly consistent against alternatives whose distance from the parametric model converges to zero at the fastest possible rate. This rate is slower than T^{−1/2}, where T is the number of observations. To the best of our knowledge, the problem of extending the approach of Horowitz and Spokoiny (2001) to construct an adaptive and optimal test for marginal density functions has not been considered. This paper therefore proposes an adaptive test for testing marginal density functions. The proposed test has an optimal-rate property: in theory, it is consistent against some local alternatives approaching the null at an optimal rate, as stated in Section 3. In practice, we demonstrate in Section 4 how to apply the test using a simulated example. Our studies show that the proposed test has some advantages over the test proposed in Aït-Sahalia (1996a).
The rest of the paper is organized as follows. Section 2 discusses the testing of the marginal density. An adaptive test procedure is proposed in Section 3. Section 4 provides an example of implementation. Section 5 concludes the paper with some remarks on extensions. Mathematical assumptions and proofs are relegated to Appendixes A–C.
Consider a continuous-time diffusion process of the form
where μ(·) and σ(·) > 0 are, respectively, the univariate drift and volatility functions of the process indexed by θ and Bt is standard Brownian motion.
Let {rt} satisfy model (2.1) and f (·,θ) be a parametric form of the marginal density function of {rt}. Within the diffusion process, f (·,θ) is completely determined by the corresponding drift μ(·,θ) and the diffusion σ(·,θ) (see Aït-Sahalia, 1996a, expression (6)) given by
where {rt} is distributed on D = (xmin,xmax) with −∞ ≤ xmin < xmax ≤ ∞, both the lower bound x0 and ξ(θ) can be chosen to ensure that f (x,θ) is a probability density, and θ is an unknown parameter vector. Let Θ denote a parameter space in Rq and θ0 ∈ Θ denote the true value of θ.
Let f (x) be a nonparametric form of the density function. The null and alternative hypotheses are
where 0 ≤ CT ≤ 1, lim_{T→∞} CT = 0, and ΔT(x) is a continuous function satisfying ∫ΔT(x) dx = 0 and f(x) ≥ 0 under H1. Theoretically, this requires that under H1 the alternative function is still a probability density. In practice, the form of ΔT(x) needs to be constructed. A simple and natural choice is ΔT(x) = f1(x,θ1) − f(x,θ1), where f1(x,θ) is another specified density function and θ1 ∈ Θ. For example, f(x,θ) may be the marginal density of {rt} satisfying the CIR model proposed in Cox, Ingersoll, and Ross (1985), and f1(x,θ) the marginal density of {rt} satisfying the AG model proposed in Ahn and Gao (1999).
For this case, the hypothesis structure (2.3) can be written as H0 : f(x) = f(x,θ0) versus H1 : f(x) = f(x,θ1) + CT[f1(x,θ1) − f(x,θ1)]. This is equivalent to testing H0 : f(x) = f(x,θ0) against H1 : f(x) = (1 − CT)f(x,θ1) + CT f1(x,θ1). This basically requires us to test whether {rt} is sampled from f(x,θ0), or from f(x,θ1) with probability 1 − CT and from f1(x,θ1) with probability CT. Obviously, such a structure of the null hypothesis versus a sequence of local alternatives naturally extends the usual structure of the null hypothesis against a global alternative of the form
For the diffusion process, we observe the process at dates {tΔ : t = 0,1,2,…}, where Δ > 0 is generally small but fixed. Let Xt = r_{(t−1)Δ} for t ≥ 1 throughout this section. Let k(·) be a kernel function, h a bandwidth, and f̂(x) = (Th)^{−1} Σ_{t=1}^{T} k((x − Xt)/h) the standard kernel density estimator of f(x). Intuitively, it is natural to compare f̂(·) with its parametric counterpart f(·,θ) directly.
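For illustration, a minimal sketch of the standard kernel density estimator just described is given below (the function name and interface are ours, and the Gaussian kernel matches the choice adopted later in Section 4):

```python
import numpy as np

def kernel_density(x_grid, X, h):
    """Standard kernel density estimator: f_hat(x) = (Th)^{-1} sum_t k((x - X_t)/h)."""
    X = np.asarray(X, dtype=float)
    x_grid = np.asarray(x_grid, dtype=float)
    T = X.size
    u = (x_grid[:, None] - X[None, :]) / h           # scaled distances, shape (len(x_grid), T)
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)   # standard normal kernel
    return k.sum(axis=1) / (T * h)
```

For example, kernel_density(np.linspace(0.01, 0.2, 200), X, h) evaluates the estimator on a grid of interest-rate values.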
In a seminal paper, Aït-Sahalia (1996a) uses a test statistic of the form
where
.
It then follows from (13) of Aït-Sahalia (1996a) that as T → ∞
under the β-mixing and some other conditions, where
in which R(k) = ∫k2(u) du < ∞ and k(j)(0) denotes the j-times convolution product of k(·) given by
The preceding test statistic is based on
, which measures directly the difference between
. It can be shown that under H0,
This implies that it has the same order as the mean square error of
if h is chosen to be O(T^{−1/5}). Thus, to obtain an asymptotically normal distribution with zero mean, h has to satisfy lim_{T→∞} Th^{4.5} = 0, as required in Assumption A5 of Aït-Sahalia (1996a). This amounts to undersmoothing.
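To make the undersmoothing point concrete (an elementary illustration of ours, not part of the original derivation): with the estimation-optimal rate h = cT^{−1/5},

```latex
\[
  Th^{4.5} = c^{4.5}\,T^{1/10} \to \infty ,
  \qquad\text{whereas}\qquad
  Th^{5} = c^{5} < \infty .
\]
```

Hence the condition lim_{T→∞} Th^{4.5} = 0 rules out h ∝ T^{−1/5} and forces a smaller bandwidth, whereas the weaker condition lim sup_{T→∞} Th^{5} < ∞ used below retains the estimation-optimal rate.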
To reduce the bias and avoid undersmoothing, we propose a nonparametric estimator,
, of f (x,θ) of the form
where
is a consistent estimator of θ, wt(x) = wt(x,h) = (1/T)kh(x − Xt) × [(s2(x) − s1(x)(x − Xt))/(s2(x)s0(x) − s1²(x))], and sr(x) = (1/T) Σ_{t=1}^{T} kh(x − Xt)(x − Xt)^r for r = 0,1,2, where kh(·) = h^{−1}k(·/h).
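A minimal sketch of the local linear weights defined above, under the standard convention kh(u) = h^{−1}k(u/h) and with the Gaussian kernel (the helper name is ours):

```python
import numpy as np

def local_linear_weights(x, X, h):
    """Weights w_t(x) = T^{-1} k_h(x - X_t) [s_2 - s_1 (x - X_t)] / [s_2 s_0 - s_1^2]."""
    X = np.asarray(X, dtype=float)
    T = X.size
    d = x - X                                                     # x - X_t
    kh = np.exp(-0.5 * (d / h)**2) / (np.sqrt(2.0 * np.pi) * h)   # k_h(x - X_t)
    s0, s1, s2 = ((kh * d**r).mean() for r in (0, 1, 2))          # s_r(x), r = 0, 1, 2
    w = kh * (s2 - s1 * d) / (s2 * s0 - s1**2) / T
    return w                                                      # by construction, w.sum() equals 1 up to rounding
```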
We also define
If the estimator of θ is √T-consistent, then we have
It follows from Fan and Gijbels (1996) that
provided that the first three derivatives of f(x) exist, where ξ1 and ξ2 lie between x − h and x + h, ck and dk are constants depending on functionals of k(·), and σk² = ∫x²k(x) dx.
This implies that as T → ∞
As can be seen from (2.7), the use of the difference
can avoid undersmoothing. In other words, we can still assume lim sup_{T→∞} Th^{5} < ∞, under which the estimation-optimal rate h ∝ T^{−1/5} remains admissible.
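For orientation, the leading term of the kernel-smoothing expansion discussed above is the textbook bias formula, recorded here in generic form (a summary of ours; the exact remainder involving ξ1, ξ2, ck, and dk is as in the display above):

```latex
\[
  E\{\hat f(x)\} - f(x) \;=\; \tfrac{1}{2}\,h^{2}\,\sigma_k^{2}\,f''(x) + o(h^{2}),
  \qquad \sigma_k^{2} = \int x^{2}k(x)\,dx .
\]
```

Because the smoothed parametric estimator in (2.6) is built from the same local linear weights, its expansion contains the same O(h²) term; heuristically, this is why the difference used in the test can avoid undersmoothing.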
Let us now establish our test statistic. We first have a look at the following distance function:
This naturally suggests estimating D(f,θ) by
We then propose using a test statistic of the form
We now state the main results of this section. Their proofs are relegated to Appendix A.
THEOREM 2.1. (i) Suppose that Assumptions A.1–A.5 in Appendix A hold. Then under H0 in (2.3) we have
where
.
(ii) Assume that the conditions of (i) hold. In addition, assume that there is a random data-driven
such that
. Then under H0 in (2.3) we have
THEOREM 2.2. (i) Suppose that Assumptions A.1–A.5 in Appendix A hold. Then under H0 in (2.3) we have
where
are as defined in (2.4).
(ii) Assume that the conditions of (i) hold. In addition, assume that there is a random data-driven
such that
. Then under H0 in (2.3) we have
Remark 2.1. (i) Similar to
of (2.4), one may replace
by
(ii) As can be seen from Theorem 2.2(i), we need to estimate both the asymptotic mean and variance of
involved in practice. It is possible to avoid estimating this kind of unknown quantity by introducing a weight function into
. In both theory and practice, however, the asymptotic power of the test may depend on the choice of such a weight function. We therefore follow a suggestion made by two of the referees and use the natural form
to construct an adaptive test in Section 3.
(iii) Theorem 2.2(i) establishes an asymptotic normality test statistic. Theorem 2.2(ii) shows that the asymptotic normality remains unchanged when h is replaced with the random data-driven
, which is known as the plug-in method. Fan and Gijbels (1996, pp. 152–154) have shown that the plug-in method has some advantages in applications. Whether the proposed test statistic
is optimal has not been discussed. A modified form of the test statistic is shown to be optimal, and the detailed discussion is given in Section 3.
Section 2 establishes the asymptotic normality of the test statistic for testing the marginal densities. The test statistic has nontrivial power only if CT converges more slowly than T−1/2. To improve the asymptotic power properties of the test, we consider extending the approach of Horowitz and Spokoiny (2001) for testing nonparametric regression functions. It is assumed that a marginal density function g belongs to a class of s-times (s ≥ 2) differentiable density functions on R1, such as a Hölder, Sobolev, or Besov class,
, which is separated from the null hypothesis by some distance CT that converges to zero as T → ∞. The objective of this section is to find the fastest rate at which CT can approach zero while permitting consistent testing uniformly over
. This rate is called the optimal rate of testing. A test is consistent uniformly over
if
Thus, the optimal rate of testing is the fastest rate at which CT can approach zero while maintaining (3.1).
As can be seen in Section 2, the proposed test statistic depends on the bandwidth. This section then suggests using
where HT = {h = hmax a^k : h ≥ hmin, k = 0,1,2,…}, in which 0 < hmin < hmax and 0 < a < 1. Let JT denote the number of elements of HT. In this case, JT ≤ log_{1/a}(hmax/hmin). Detailed conditions on hmin and hmax will be given in Assumption B.3 in Appendix B.
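A sketch of constructing the geometric bandwidth grid HT just defined (the numerical values in the commented example are illustrative placeholders, not the paper's choices):

```python
import numpy as np

def bandwidth_grid(h_max, h_min, a):
    """H_T = {h_max * a**k : k = 0, 1, 2, ...} truncated at h >= h_min, with 0 < a < 1."""
    assert 0 < h_min < h_max and 0 < a < 1
    grid = []
    h = h_max
    while h >= h_min:
        grid.append(h)
        h *= a
    return np.array(grid)

# Illustrative example only:
# T = 2755
# H_T = bandwidth_grid(h_max=1.0 / np.log(np.log(T)), h_min=T**(-0.9), a=0.8)
# len(H_T) grows like log(h_max / h_min) / log(1 / a), matching the bound on J_T above.
```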
We discuss how to obtain a critical value for L*. The exact α-level critical value, lα* (0 < α < 1), is the 1 − α quantile of the exact finite sample distribution of L*. Because θ0 is unknown, lα* cannot be evaluated in practice. We therefore suggest choosing a simulated α-level critical value, lα, by using the following simulation procedure.
1. For the simulation, we either use resamples of the sampled data Xt or generate the data Xt from the marginal density f(x,θ0) or the corresponding transition density with an initial value of θ0 under H0.
2. The true value θ0 is estimated based on the simulated {Xt}, and the resulting estimate is denoted by
.
3. We choose HT as specified following (3.2) with hmin and hmax satisfying Assumption B.3 in Appendix B and then compute L* of (3.2) using the simulated {Xt} and
.
4. Repeat the preceding steps M times to produce M versions of L*, denoted Lm* for m = 1,2,…,M. The simulated critical value lα is then the 1 − α empirical quantile (i.e., the 100(1 − α)th percentile) of the M values Lm*, m = 1,2,…,M, of L*.
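A schematic of steps 1–4 above is given next; here simulate_null, estimate_theta, and adaptive_statistic are placeholder routines standing in for the operations described in the text, not functions defined in the paper.

```python
import numpy as np

def simulated_critical_value(T, alpha, M, bandwidths,
                             simulate_null, estimate_theta, adaptive_statistic):
    """Monte Carlo approximation of the alpha-level critical value of L*.

    simulate_null(T)                         -> sample X_1,...,X_T generated under H0 (step 1)
    estimate_theta(X)                        -> parameter estimate from the simulated sample (step 2)
    adaptive_statistic(X, theta, bandwidths) -> value of L* maximized over H_T (step 3)
    """
    stats = np.empty(M)
    for m in range(M):                           # step 4: repeat M times
        X = simulate_null(T)
        theta_hat = estimate_theta(X)
        stats[m] = adaptive_statistic(X, theta_hat, bandwidths)
    return np.quantile(stats, 1.0 - alpha)       # empirical (1 - alpha) quantile
```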
We now state the following result, and its proof is relegated to Appendix B.
THEOREM 3.1. Assume that Assumptions A.1, A.3, and A.4 in Appendix A and B.1–B.3 listed in Appendix B hold. Then under H0, lim_{T→∞} P(L* > lα) = α.
The main result on the behavior of the test statistic L* under H0 is that lα is an asymptotically correct α-level critical value under any model in the parametric family {f(·,θ) : θ ∈ Θ}.
We now show that L* is consistent against a fixed alternative model. Assume that model (1.1) holds. Let the parameter set Θ be an open subset of Rq. Let
satisfy Assumption B.1 in Appendix B. For convenience, let
Measure the distance between
by the normalized l2 distance
where ∥·∥ denotes the Euclidean norm. If H0 is false, then
for all sufficiently large T and some Cρ > 0. A consistent test will reject a false H0 with probability approaching one as T → ∞.
The following theorem establishes a consistency result, and its proof is relegated to Appendix B.
THEOREM 3.2. Assume that the conditions of Theorem 3.1 hold. In addition, if there is a Cρ > 0 such that
holds then
In this section, we consider the consistency of L* under local alternatives of the form
with
for some constant C0 > 0 and θ1 ∈ Θ.
Let
We now have that
To ensure that the rate of convergence of fT to the parametric model F(θ1) is the same as the rate of convergence of CT to zero, in view of (3.4), we need to assume that ΔT(x) is a continuous function that is normalized so that
for some δ > 0. When ΔT(·) does not depend on T, condition (3.5) can be replaced by E[Δ²(X1)] > 0, which holds automatically whenever Δ(·) is not identically zero on the support of f.
We now state the following consistency result, and its proof is relegated to Appendix B.
THEOREM 3.3. Assume that Assumptions A.1, A.3, and A.4 in Appendix A and B.1–B.3 with hmax = cmax(log log T)−1 for some constant cmax > 0 in Appendix B hold. Let
be a √T-consistent estimator of θ. Let fT satisfy (3.3) with
for some constant C > 0. In addition, let condition (3.5) hold. Then
The result shows that the power of the adaptive, rate-optimal test approaches one as T → ∞ for any function ΔT(·) and sequence {CT} that satisfy the conditions of Theorem 3.3.
This section establishes that L* is consistent uniformly over alternatives in a Hölder smoothness class whose distance from the parametric model approaches zero at the fastest possible rate. It can be shown that we can extend the results to Sobolev and Besov classes under more technical conditions.
Before specifying our smoothness classes, we introduce the following notation. Define the Hölder norm
where Sf = {x ∈ R1 : f (x) > 0}.
The smoothness classes that we consider consist of functions f ∈ S(H,s) ≡ {f : ∥ f ∥H,s ≤ cH} for some (unknown) s ≥ 2 and cH < ∞.
For some s ≥ 2 and all sufficiently large Cf < ∞, define
where
is as defined in Section 3.2.
We now state the following consistency result, and its proof is relegated to Appendix B.
THEOREM 3.4. Assume that Assumptions A.1, A.3, and A.4 in Appendix A and B.1–B.3 in Appendix B hold. Then for 0 < α < 1 and BH,T as defined in (3.7)
Remark 3.1. Theorems 3.1–3.4 show that we have established some consistency results for the proposed test given in (3.2). These consistency results correspond to Theorems 1–4 of Horowitz and Spokoiny (2001) for a fixed design regression. In our setting, the observations are stationary and α-mixing time series. In addition, the optimum version L* is asymptotically consistent, as established in Theorem 3.2. This is one of the advantages of our test over existing ones, such as the natural competitor proposed in Aït-Sahalia (1996a). In Section 4, we show that our test also outperforms this natural competitor in the finite sample case.
This section illustrates the proposed adaptive test by the following example. As the bootstrap simulation procedure for selecting both the bandwidth and the simulated critical values is extremely computationally demanding, especially for large samples, we consider only the CIR model proposed by Cox et al. (1985) and show how to implement the adaptive test statistic L* of (3.2) in practice using a simulated example. The main reason for choosing this model is not only that both the marginal and transition density functions have closed forms but also that the model has been studied extensively in the literature. See, for example, Aït-Sahalia (1999) and Hong and Li (2004).
We consider using the CIR model given by
where κ > 0, β > 0, and σ > 0 are unknown parameters and Bt is standard Brownian motion. It can be shown that {rt} is distributed on R+ = (0,∞) if 2κβ/σ2 ≥ 1. Furthermore, it follows from Lemma 3.1 of Masry and Tjøstheim (1995) that the process {rt} satisfies Assumption A.1(i). Alternatively, one may apply Assumption A.3′ of Aït-Sahalia (1996b, p. 552) to verify that {rt} is strictly stationary and α-mixing.
As a result of (2.2), the marginal density function of {rt} satisfying model (4.1) is
where θ = (β,κ,σ), ν = 2κβ/σ2 − 1, and Γ(·) is the usual gamma function. Let θ0 be the true value of θ.
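It is a well-known fact that the stationary distribution of the CIR process (4.1) is the gamma distribution with shape parameter 2κβ/σ² = ν + 1 and rate 2κ/σ². A small sketch evaluating this density is given below (the function name is ours):

```python
import numpy as np
from scipy.special import gammaln

def cir_marginal_density(x, kappa, beta, sigma):
    """Stationary (gamma) density of the CIR process: shape 2*kappa*beta/sigma**2, rate 2*kappa/sigma**2."""
    shape = 2.0 * kappa * beta / sigma**2      # equals nu + 1 in the notation above
    rate = 2.0 * kappa / sigma**2
    x = np.asarray(x, dtype=float)
    log_f = shape * np.log(rate) - gammaln(shape) + (shape - 1.0) * np.log(x) - rate * x
    return np.exp(log_f)
```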
To construct a sequence of local alternatives, we also consider using a marginal density of the form
where ν1 = 2κ/σ2. It is known that f1(x,θ) is the marginal density of {rt} satisfying the AG model proposed in Ahn and Gao (1999)
with parameter values κ > 0, β > 0, and σ > 0. The necessary and sufficient conditions for stationarity and for the unattainability of 0 and ∞ in finite expected time are κ > 0 and β > 0 (see Ahn and Gao, 1999). To show that {rt} is strictly stationary and α-mixing, as explained in Appendix A of Ahn and Gao (1999, pp. 755–756), one needs only to verify Assumption A.3′ of Aït-Sahalia (1996b, p. 552). It is easy to see that such an assumption holds for the marginal density, drift, and diffusion functions given in (4.3) and (4.4).
The corresponding structure of the test problem (2.3) for this example can be constructed as
where
in which θ1 ∈ Θ. The reason for choosing such a ΔT(·) as the local shift function is to ensure that the models under H1 fluctuate closely around those under H0. The choice of (4.5) and (4.6) ensures that (3.7) holds with s = 2. This implies that the adaptive test is consistent against this sequence of local alternatives at an optimal rate. Note that Assumptions B.1 and B.2 hold.
In the following simulation, we consider using a class of alternatives of the form
where θ1 ∈ Θ and 0 < ψ < 1 is defined as the truncation parameter to be chosen.
To compute the nonparametric estimators involved, we choose the normal kernel function k(x) = (2π)^{−1/2} exp(−x²/2) throughout the simulation. Observe that Assumptions A.1–A.4 hold. For the CIR and AG models, we simulate the data from their marginal and transition density functions, which all have closed forms.
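Because the transition density of the CIR model is a scaled noncentral chi-square distribution (a standard property of the process), an exact simulation of a discretely sampled path can be sketched as follows (the function name and interface are ours):

```python
import numpy as np

def simulate_cir_path(T, delta, kappa, beta, sigma, r0, rng=None):
    """Exact simulation of a CIR path at sampling interval delta via the
    noncentral chi-square transition distribution."""
    rng = np.random.default_rng() if rng is None else rng
    c = sigma**2 * (1.0 - np.exp(-kappa * delta)) / (4.0 * kappa)
    df = 4.0 * kappa * beta / sigma**2                  # degrees of freedom
    r = np.empty(T)
    r[0] = r0
    for t in range(1, T):
        nc = r[t - 1] * np.exp(-kappa * delta) / c      # noncentrality parameter
        r[t] = c * rng.noncentral_chisquare(df, nc)
    return r
```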
In the detailed simulation, we simulate the data from (4.2) for the CIR model, (4.3) for the AG model, and then (4.7) under H1. Using the simulated data, we compute
in which R(k) =
are used after the choice of (4.8) and HT is as defined following (3.2) with
. Note that Assumption B.3 holds.
To compare L* with
in (2.4), we construct a test statistic of the form
where h* is chosen by using the following procedure.
• We simulate Xt with probability 1 − ψ from the CIR model and with probability ψ from the AG model with an initial value of θ1 under H1.
• Use the simulated data {Xt : t = 1,2,…,T} to estimate θ1.
• Compute the resulting function of h given by
• Repeat the preceding steps Q = 1,000 times and produce Q versions of
denoted by
for m = 1,2,…,Q. Use the Q functions of h,
for m = 1,2,…,Q, to construct their empirical bootstrap distribution function, that is,
where I(U ≤ u) is the usual indicator function.
• For a given asymptotic critical value ecvα at the level α (e.g., ecv0.05 = 1.645 at the 5% level), we then calculate the following power function:
• Find approximately at which h value the power function ψ(h) is maximized. Denote the maximizer by h*.
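A schematic of the bullet-point procedure above for selecting h* is given next; simulate_mixture, estimate_theta, and statistic_for_h are placeholder routines for the steps described in the text, not functions defined in the paper.

```python
import numpy as np

def power_optimal_bandwidth(T, psi, h_grid, Q, ecv_alpha,
                            simulate_mixture, estimate_theta, statistic_for_h):
    """Pick h* maximizing the simulated rejection frequency of the h-indexed statistic.

    simulate_mixture(T, psi)     -> draw X_t from the CIR model w.p. 1 - psi, AG model w.p. psi
    estimate_theta(X)            -> re-estimate the parameter vector from the simulated data
    statistic_for_h(X, theta, h) -> value of the h-indexed test statistic
    """
    rejections = np.zeros(len(h_grid))
    for _ in range(Q):                                   # Q bootstrap replications
        X = simulate_mixture(T, psi)
        theta_hat = estimate_theta(X)
        stats = np.array([statistic_for_h(X, theta_hat, h) for h in h_grid])
        rejections += (stats > ecv_alpha)                # empirical power at each h
    power = rejections / Q
    return h_grid[int(np.argmax(power))], power          # h* and the power curve over h_grid
```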
We then consider using the same choice of the parameter values as in (17) of Pritsker (1998). This means that the baseline model is model (4.1) with
. In this example, the same parameter values were also used as θ1 in computing the power of the tests L* and L0*. The truncation parameter was chosen as ψ = 0 under H0,
whereas under H1 the truncation parameter was chosen as
. Three different sample sizes, T = 1,000, 2,755, and 5,500, were then considered. The corresponding simulated critical values, lα and l0α, of L* and L0* at the α level were found by using the simulation scheme proposed in Section 3.1. The sizes of the tests were then computed based on the data simulated under H0, and the power values of the tests were calculated based on the data generated under H1
. In implementing the simulation procedure, we used M = 1,000 in the simulation scheme proposed in Section 3.1. The number of simulations used in producing Table 1 was also 1,000. Both the size and the power of L* and L0* are given in Table 1.
Remark 4.1. (i) As can be seen from Table 1, the power values of both L* and L0* look reasonable when
, or about 3%. This may show that both L* and L0* are practically applicable to moderate sample sizes, because the difference between the null hypothesis and its alternative was deliberately made small. We also computed the power of the tests for the case where
or 5%. Our small sample results showed that the power of L* was already 100% even when T = 1,000. In general, it is true that the power increases as ψ increases for each case. Observe that L* is slightly more powerful than L0*, although h* involved in
has been chosen based on the assessment of its power. We observe that the sizes of the two tests are also close to either 5% in the first half of Table 1 or 1% in the second half of Table 1.
(ii) We also examined the dependence of the power on the choice of the initial parameter values. Our experience suggests that the power of the tests depends mainly on the choice of the truncation parameter ψ. This is both understandable and expected, because the test statistics ultimately depend only on the estimation and reestimation procedure for the parameter vector rather than on the initial parameter values themselves. This is probably why artificial values, or parameter values estimated from a set of real data, are used as initial values for starting a simulation procedure. For example, Hong and Li (2004) use the parameter values estimated from the U.S. interest rate series for their simulation procedure.
(iii) Compared with existing results (see Pritsker, 1998), both the size and power of L0* have been significantly improved. This is probably because (a) the choice of h involved in
is based on the assessment of the power of
rather than using an estimation-based optimal value and (b) to avoid using the asymptotic distribution of
and then an asymptotic critical value of 1.645 at the 5% level or 2.33 at the 1% level, we have used the bootstrap-based simulated critical value, l0α, at the level α. We also computed both the power and size values for the case where h was chosen by using a cross-validation criterion; the resulting sizes and power values were similar to those obtained by Pritsker (1998), although L* always performed better than L0*. This further demonstrates that the asymptotic distribution of either test statistic can provide only a rough guide to its behavior. In practice, we strongly suggest using the proposed bootstrap simulation procedure for choosing a simulated critical value rather than an asymptotic critical value.
In this paper, we have considered testing the general continuous-time diffusion model (1.1) under the α-mixing condition. The results for continuous-time models under the α-mixing condition complement some existing results under the β-mixing condition; see, for example, Aït-Sahalia (1996a). Moreover, an adaptive and optimal test procedure has been established. This extension corresponds to Horowitz and Spokoiny (2001) for fixed design nonparametric regression and to Chen, Gao, and Li (2001) for a nonparametric time series regression model. To deal with the α-mixing condition, we have established some novel results for moment inequalities (see Lemma C.2) and limit theorems (see Lemma A.1) for degenerate U-statistics of strongly dependent processes. Both Lemmas A.1 and C.2 are applicable to other estimation and testing problems for diffusion processes under the α-mixing condition (for more about various mixing conditions, see Doukhan, 1995). In addition, we have demonstrated how to implement the proposed test procedure in practice using a simulated example.
The results given in this paper can be extended in a number of directions. First, it is possible to consider testing for both the marginal and transition density functions simultaneously, because the transition density can capture the full dynamics of a diffusion process and, in particular, can distinguish the diffusion processes that have the same marginal density but different transition densities. Second, the results of this paper for the short-range dependent continuous-time case can be extended to the long-range dependent continuous-time case. Third, one probably can relax the strict stationarity and the mixing condition, as the recent work by Aït-Sahalia (1999) and Karlsen and Tjøstheim (2001) indicates that it is possible to do such work without the stationarity and the mixing condition. This part is particularly important for two reasons: (i) for the long-range dependent case one needs to avoid assuming both the long-range dependence and the mixing condition, as they contradict each other; and (ii) some important models are nonstationary. These are some issues left for future research.
This Appendix lists the necessary assumptions for the establishment and the proof of the main results given in Section 2.
A.1. Assumptions. Let the parameter set Θ be an open subset of Rq. Let
. Define ∇θ f(x,θ) = ∂f(x,θ)/∂θ, ∇θ² f(x,θ) = ∂²f(x,θ)/∂θ ∂θ′, and ∇θ³ f(x,θ) = ∂³f(x,θ)/∂θ ∂θ′ ∂θ″ whenever these derivatives exist. For any q × q matrix D, define
where
.
Assumption A.1. (i) Assume that the process {rt} is strictly stationary and α-mixing with the mixing coefficient α(t) = Cα α^t defined by sup{|P(A ∩ B) − P(A)P(B)| : A ∈ Ω_1^s, B ∈ Ω_{s+t}^∞} ≤ α(t) for all s,t ≥ 1, where 0 < Cα < ∞ and 0 < α < 1 are constants and Ω_i^j denotes the σ-field generated by {rt : i ≤ t ≤ j}.
(ii) Assume that the univariate kernel function k(·) is nonnegative, symmetric, and four-times differentiable on R1 = (−∞,∞). In addition,
.
Assumption A.2. (i) The parameter space Θ ⊂ Rq is compact. In a neighborhood of the true parameter θ0, f(x,θ) is twice continuously differentiable in θ; E[(∂f(x,θ)/∂θ)(∂f(x,θ)/∂θ)^τ] is of full rank. In addition, assume that G(x) is a positive and integrable function with E[G(Xt)] < ∞ uniformly in t ≥ 1 such that supθ∈Θ |f(Xt,θ)|² ≤ G(Xt) and supθ∈Θ ∥∇θ^j f(Xt,θ)∥² ≤ G(Xt) for j = 1,2,3, where for
.
(ii) Assume that
is a √T-consistent estimator of θ0.
Assumption A.3. For every θ ∈ Θ:
(i) The drift and the diffusion functions are three times continuously differentiable in x ∈ R+ = (0,∞), and σ > 0 on R+.
(ii) The integral of
converges at both boundaries of D, where v is fixed in D.
(iii) The integral of
diverges at both boundaries of D.
Assumption A.4. (i) Assume that the first three derivatives of f (x) are continuous on D and that f (x) > cf > 0 on the interior of D for some cf > 0. In addition, both f (x) and f2(x) are integrable on D.
(ii) The initial random variable r0 is distributed as f (x).
(iii) The true drift and diffusion functions satisfy Assumption A.3.
Assumption A.5. The bandwidth parameter h satisfies that
Remark A.1. Assumptions A.1–A.4 are quite natural in this kind of problem. Assumptions A.2–A.4 correspond to Assumptions A0, A1, and A3 of Aït-Sahalia (1996a). Assumption A.1 is the exception. Assumption A.1(i) assumes the α-mixing condition, which is weaker than the β-mixing condition. Assumption A.1(ii) is quite general, allowing the use of the standard normal kernel. Assumption A.5 ensures that the theoretically optimum value h_optimal = CT^{−1/5} can be included. This is important, because there may be cases in which h_optimal is also optimal for testing purposes.
A.2. Technical Lemmas.
The following lemmas are necessary for the proof of the main results stated in Section 2.
LEMMA A.1. Let ξt be an r-dimensional strictly stationary and strong mixing (α-mixing) stochastic process. Let φ(·,·) be a symmetric Borel function defined on Rr × Rr. Assume that for any fixed x ∈ Rr, E [φ(ξ1,x)] = 0 and E [φ(ξi,ξj)|Ω0j−1] = 0 for any i < j, where Ωij denotes the σ-field generated by {ξs : i ≤ s ≤ j}. Let
. For some small constant 0 < δ < 1, let
where the maximization over P in the equation for MT4 is taken over the four probability measures P(ξ1,ξi,ξj,ξk), P(ξ1)P(ξi,ξj,ξk), P(ξ1)P(ξi1)P(ξi2,ξi3), and P(ξ1)P(ξi)P(ξj)P(ξk), where (i1,i2,i3) is the permutation of (i,j,k) in ascending order;
Assume that all the MT's are finite. Let
If limT→∞(max{MT,NT}/σT2) = 0, then
Remark A.2. Lemma A.1 establishes central limit theorems for degenerate U-statistics of strongly dependent processes. It should be pointed out that the conclusion of Lemma A.1 remains true when the martingale assumption that E [φ(ξi,ξj)|Ω0j−1] = 0 for any i < j is removed. Such a martingale assumption is used only for a direct application of an existing central limit theorem (CLT) for martingales. Without such a condition, one needs only to decompose
and then apply the martingale CLT to
. Using the condition that E [φ(ξ1,x)] = 0 for each given x, one can show that the terms involving E [φ(ξi,ξj)|Ω0j−1] are negligible (see Roussas and Ioannides, 1987, Theorem 5.5). Thus, as assumed in Lemma 3.2 of Hjellvik, Yao, and Tjøstheim (1996) and Theorem 2.1 of Fan and Li (1998), the condition that E [φ(ξ1,x)] = 0 for each given x is the key assumption.
Proof. Let
To prove Lemma A.1, it suffices to show that as T → ∞
and
By Lemma C.1 (with η1 = φik, η2 = φjk, l = 2, pi = 2(1 + δ), and Q = 1/(1 + δ)),
Therefore,
because
.
Observe that
Let ηijk = 1/3(φikφjk + φijφkj + φjiφki) and ηij = 1/3∫φikφjk dP(ξk).
Then by Lemma C.2(i) in Appendix C,
Let Cφ = ∫φ122φ342 dP(ξ1) dP(ξ2) dP(ξ3) dP(ξ4), where P(ξ) denotes the probability measure of ξ.
Using Lemma C.1 repeatedly, we have that for different i,j,k,l
where Δ(i,j,k,l) is the minimum increment in the sequence that is the permutation of i,j,k,l in ascending order.
Similar to (A.5), one can have for all different i,j,k,l
Therefore,
It now follows from (A.3)–(A.5) that for any ε > 0
Thus, (A.1) holds.
Note that for 2 ≤ k ≤ T,
It is easy to see that
Similar to (A.5), one can have for any (i,j) ≠ (s,t),
where Δ(·) is as defined in (A.5).
Consequently, the first two terms on the right-hand side of (A.7) are of order O(T3MT41/(1+δ)), because
.
Thus, (A.2) follows from
This finishes the proof. █
Before stating the following lemmas, we define the following notation.
using
where
in which A is the T × T matrix with {ast} as its (s,t) element.
We assume without loss of generality throughout the rest of this paper that
LEMMA A.2. Under the conditions of Theorem 2.1, we have as T → ∞
Proof. We now prove (A.9). It follows from Assumptions A.2 and A.3 that as T → ∞
This completes the proof of the first part of (A.9). For the proof of the second part of (A.9), let
Then
We first look at the main component of σT2. We now have
Using Assumptions A.1–A.4, we have as T → ∞
where L(x) = ∫k(x + y)k(y) dy is as defined in (2.5).
Therefore, as T → ∞
where k(4)(·) is as defined in (2.5).
Similarly, one can show that as T → ∞
We now deal with the remainder term of var[N0T(h)] . By Lemma C.1 (with η1 = φik, η2 = φjk, l = 2, pi = 2(1 + δ), and Q = 1/(1 + δ)),
where MT1 is as defined in Lemma A.1.
Therefore, using the fact that
,
whose proof is similar to that of (A.17), which follows.
Equations (A.10)–(A.12) imply
This finishes the proof of the second part of (A.9). █
LEMMA A.3. Under the conditions of Theorem 2.1, we have as T → ∞
and
where C1 is a constant and
are as defined in Section 2.
Proof. We now give only the proof of (A.14) in some detail, as the proofs of (A.13) and (A.14) are similar and quite standard and the details follow similarly from some existing results. See, for example, Fan and Gijbels (1996).
In view of the definition of wt(x) and the second equation of (A.13), to prove (A.14), it suffices to show that as T → ∞
using a Taylor expansion of f(x) − f(x − vh). This finishes the proof of (A.14). █
A.3. Proof of Theorem 2.1.
Proof of Theorem 2.1(i). To prove Theorem 2.1(i), in view of Remark A.2 and Lemma A.3, it suffices to show that
To apply Lemma A.1, let ξt = Xt and φ(ξs,ξt) = φst defined previously. Let MT and NT be defined as in Lemma A.1. We now verify only the following condition listed in Lemma A.1:
for MT1, MT21, MT3, MT51, MT52, and MT6, where σh² = hσ0². The others follow similarly.
For the MT part, one justifies only
The others follow similarly.
Let ψst = (1/Th)∫k((x − Xs)/h)k((x − Xt)/h)p(x) dx. We now have
where L(·) is as defined previously.
For any given 1 < ζ < 2 and T sufficiently large, we obtain
using Assumption A.1(ii), where f (x,y,z) denotes the joint density of (Xi,Xj,Xk) and Cp is a constant.
Thus, as T → ∞
Hence, (A.17) shows that (A.15) holds for the first part of MT1. The proof for the second part of MT1 follows in a similar way. Similarly, we have that as T → ∞
using Assumption A.1(ii).
This implies that as T → ∞
Thus, (A.18) now shows that (A.15) holds for MT3. It follows from the structure of {ψij} that (A.15) holds automatically for MT51, MT52, and MT6, because E [φst] = 0 for s ≠ t.
We now prove that (A.15) holds for MT21. For some 0 < δ < 1 and 1 ≤ i < j < k ≤ T, let MT21 = E[|ψik ψjk|^{2(1+δ)}]. Similar to (A.16) and (A.17), we obtain that as T → ∞
using the fact that limT→∞ Th = ∞.
This completes the proof of (A.15) for MT21, and thus (A.15) holds for the first part of {φst}. Similarly, one can show that (A.15) holds for the other parts of {φst}. Thus, we have shown that under H0
The proof of Theorem 2.1(i) is therefore finished. █
Proof of Theorem 2.1(ii). Note that as T → ∞
using the continuity of
in h. This completes the proof of Theorem 2.1(ii). █
Proof of Theorem 2.2. The proof follows from Theorem 2.1 and the following standard result:
This Appendix lists the necessary assumptions for the establishment and the proof of the main results given in Section 3.
B.1. Assumptions.
Assumption B.1. The parameter set Θ is an open subset of Rq for some q ≥ 1. The parametric family
satisfies the following conditions.
holds with probability one (almost surely).
Assumption B.2. (i) Let H0 be true. Then θ0 ∈ Θ and
for any ε > 0 and all sufficiently large CL.
(ii) Let H0 be false. Then there is a θ* ∈ Θ such that
for any ε > 0 and all sufficiently large CL.
Assumption B.3. (i) Assume that the set HT has the structure of (3.2) with cmin T^{−γ} = hmin < hmax = cmax(log log T)^{−1}, where γ, cmin, and cmax are constants satisfying 0 < γ < 1 and 0 < cmin, cmax < ∞.
(ii) Assume that ΔT(x) is continuous in x ∈ D and satisfies
for all T ≥ 1.
Remark B.1. Assumptions B.1(i) and B.1(ii) are quite standard in this kind of problem. See Assumptions 1(i) and (ii) of Horowitz and Spokoiny (2001). Assumption B.1(iii) is required to ensure that the marginal density function is identifiable. A similar condition is used in Assumption 1(iii) of Horowitz and Spokoiny (2001). It can be shown that Assumption B.1(iii) holds when f(x,θ) belongs to classes of simple linear and certain nonlinear functions in θ. The identifiability assumption is imposed to exclude the case where f(x,θ) is flat as a function of θ over a certain range of θ and some value of x, because such a function may be neither identifiable nor a probability density. Assumption B.2 is needed to ensure that the true version of θ under either hypothesis can be estimated by a √T-consistent estimator. Assumption B.3(i) imposes some conditions on both hmin and hmax. The theoretical condition on hmin is quite general. In practice, we would suggest using
to include the estimation-based optimal bandwidth h_optimal = CT^{−1/(2s+1)}, because the estimation-based optimal value may also be optimal for testing purposes in some cases. The restriction on hmax is required only for the proof of Theorem 3.3. It should be noted that hmax is not necessarily the bandwidth at which the power of the resulting test is maximized. As explained at the beginning of Section 2, both the existence and the reasonableness of Assumption B.3(ii) can be justified. Unlike the regression setting discussed in Horowitz and Spokoiny (2001), we need to assume
to ensure that the alternative is also a probability density. As the main results in Section 2 are only concerned with the null hypothesis, we do not need to assume such a rigorous condition for the main results.
This paper considers using only a set of discrete bandwidths for constructing the adaptive test. It is believed that corresponding versions of Theorems 3.1–3.4 can be established for the case where HT is an interval of continuous bandwidth values. As HT is always chosen as a set of discrete bandwidths in practice, such an extension from a set of discrete bandwidths to an interval of continuous bandwidth values is probably mainly of theoretical and technical interest. As such an extension also involves considerably more tedious technical detail, we do not discuss this issue further in this paper.
B.2. Technical Lemmas. Before stating the necessary lemmas for the proof of the results given in Section 3, we introduce the following notation.
LEMMA B.1. Suppose that the conditions of Theorem 2.1 hold.
(i) For every δ > 0
in probability, where C > 0 is a constant.
(ii) For each θ ∈ Θ and sufficiently large T
Proof. (i) It follows from the definition of QT(θ) that
To prove Lemma B.1(i), one first needs to show that
in probability for some constant C > 0.
Using the conditions of Lemma B.1, we now have
in probability.
In view of (B.2), to prove Lemma B.1(i), it suffices to show that
in probability.
A Taylor series expansion of f(Xt,θ) − f(Xt,θ0) and an application of Assumption B.1(i) imply (B.3). This finishes the proof of Lemma B.1(i).
(ii) Let λmin(A) and λmax(A) denote the smallest and largest eigenvalues of A, respectively. In view of
to prove Lemma B.1(ii), it suffices to show that for n large enough
for some C > 0. Similar to the proof of Lemma A.2 of Gao, Tong, and Wolff (2002), one can easily finish the proof of (B.5). █
Without loss of generality, we consider the case of q = 1 in the following lemmas and their proofs. Define
LEMMA B.2. Under the conditions of Theorem 3.1, we have for any given θ ∈ Θ and i = 1,2
Proof. It suffices to show that for any large constant C0 > 0
where
Similar to the proof of (A.1), one can show that as T → ∞
for some function C(θ).
Using Lemmas C.1 and C.2 in Appendix C and the fact that E [εt(x)] = 0 for x ∈ D, one can show that as T → ∞
Thus, equations (B.7)–(B.9) complete the proof. █
LEMMA B.3. Under the conditions of Theorem 3.1, we have as T → ∞
Proof. Similar to (B.7), we have for large constant C0 > 0
Similar to (B.8), we can have as T → ∞
Analogous to (B.9), one can show that as T → ∞
Thus, equations (B.11)–(B.13) complete the proof of (B.10). █
LEMMA B.4. Under the conditions of Theorem 3.1, we have for each u > 0,
under
.
Proof. We now prove (B.14). Using a Taylor series expansion of f(Xt,θ) − f(Xt,θ0) and Assumption B.1, we have for θ′ between θ and θ0
Hence, (B.4), (B.10), (B.15), and Assumption B.1(i) imply
The proof of (B.14) follows from (B.15) and (B.16). █
LEMMA B.5. Suppose that the conditions of Theorem 3.1 hold. Then for every u > 0, some h ∈ HT, and as T → ∞
under
.
Proof. In view of the definition of Qn(θ), to prove (B.17), it suffices to show that as T → ∞
where qT = E [QT(θ*)] .
Note that
where θ′ lies between θ and θ*.
In view of (B.6), (B.10), (B.18), and Assumptions B.1(i) and B.2(ii), to prove (B.17), it suffices to show that for any δ > 0,
as T → ∞.
Similar to (B.8) and (B.9), one can show that as T → ∞
Thus, equations (B.19) and (B.20) imply that as T → ∞
using qT = CTh(1 + o(1)) given in the proof of Lemma B.1(ii), where C is a constant independent of T. Lemma B.5 therefore follows from (B.21). █
Recall the notation introduced in (A.9). We assume without loss of generality that k(4)(0) = 1 in Lemma A.2. Define
LEMMA B.6. Suppose that the conditions of Theorem 3.1 hold. Then as T → ∞
uniformly over h ∈ HT.
Proof. The proof of (B.23) follows from (2.7) and (2.8) immediately. █
LEMMA B.7. Suppose that the conditions of Theorem 3.1 hold. Then maxh∈HT L0(h) and maxh∈HT LT(h) have identical asymptotic distributions under
.
Proof. Note that QT(θ0) = 0 under
and that Lemmas A.3 and B.1–B.5 imply as T → ∞
Therefore, equations (B.21), (B.22), and (B.24) complete the proof of Lemma B.7. █
LEMMA B.8. Suppose that the conditions of Theorem 3.1 hold. Then for any x ≥ 0, h ∈ HT, and all sufficiently large T
Proof. It follows from the beginning of the proof of Theorem 2.1(i) that for any small δ > 0 there exists a large integer T0 ≥ 1 such that for T ≥ T0
where
.
This implies for any T ≥ T0 and x ≥ 0
using
.
The proof follows by letting
for any x ≥ 0. █
For 0 < α < 1, define
to be the 1 − α quantile of maxh∈HT L0(h).
LEMMA B.9. Suppose that the conditions of Theorem 3.1 hold. Then for large enough T
Proof. The proof is trivial.
LEMMA B.10. Suppose that the conditions of Theorem 3.1 hold. Suppose that
for some h ∈ HT, where
Then
Proof. To prove Lemma B.10, in view of Lemmas B.6 and B.7, it suffices to show that
which holds if
for some h ∈ HT. For any h ∈ HT, using (B.21) and then (B.17) we have
On the other hand, condition (B.25) implies that as T → ∞
Observe that
Thus, it follows from (B.26) that as T → ∞
because L0(h) is asymptotically normal and therefore bounded in probability and
.
Because of (B.27), as T → ∞
This finishes the proof. █
B.3. Proofs of Theorems 3.1–3.4.
Proof of Theorem 3.1. The proof follows from Lemmas B.6 and B.7.
Proof of Theorem 3.2. This proof is similar to that of Theorem 3.3, which follows, using Lemma B.1(ii). Alternatively, one can follow the corresponding proof of Theorem 2 of Horowitz and Spokoiny (2001) by using Lemma B.1(ii) and the condition that
to verify (B.25). █
Proof of Theorem 3.3. Condition (3.5) ensures that the rate of convergence of fT to the parametric model F(θ1) is the same as the rate of convergence of CT to zero. In particular, when (3.5) holds,
In view of Lemma B.10, to complete the proof of Theorem 3.3, it suffices to verify (B.25). This verification follows from Lemma B.1(ii) and (B.28). █
Proof of Theorem 3.4. In our proof, we mainly use Lemma B.1(ii) and the condition of Theorem 3.4 that
to verify (B.25). █
The following two technical lemmas have already been used in the proofs of Lemma A.1 and Theorem 2.1. The two lemmas are of general interest in themselves and can be used for other nonparametric estimation and testing problems associated with the α-mixing condition.
LEMMA C.1. Suppose that M_m^n are the σ-fields generated by a stationary α-mixing process ξi with the mixing coefficient α(i). For some positive integer m, let ηi ∈ M_{si}^{ti}, where s1 < t1 < s2 < t2 < ··· < tm, and suppose ti − si > τ for all i. Assume further that
for some pi > 1 for which
Then
Proof. See Roussas and Ioannides (1987).
LEMMA C.2. (i) Let ψ(·,·,·) be a symmetric Borel function defined on Rr × Rr × Rr. Let the process ξi be defined as in Lemma A.1. Assume that for any fixed x,y ∈ Rr, E[ψ(ξ1,x,y)] = 0. Then
where 0 < δ < 1 is a small constant, C > 0 is a constant independent of T and the function ψ, M = max{M1,M2,M3}, and
(ii) Let φ(·,·) be a symmetric Borel function defined on Rr × Rr. Let the process ξi be defined as in Lemma A.1. Assume that for any fixed x ∈ Rr, E [φ(ξ1,x)] = 0. Then
where δ > 0 is a constant, C > 0 is a constant independent of T and the function φ, and
Proof. As the proof of (ii) is similar to that of (i), one proves only (i). Let i1,…,i6 be distinct integers and 1 ≤ ij ≤ T, let 1 ≤ k1 < ··· < k6 ≤ T be the permutation of i1,…,i6 in ascending order, and let dc be the cth largest difference among kj+1 − kj, j = 1,…,5. Let
By Lemma C.1 (with η1 = ψ(ξi1,ξi2,ξi3), η2 = ψ(ξi4,ξi5,ξi6), l = 2, pi = 2(1 + δ) and Q = 1/(1 + δ)),
Thus,
Similarly,
Analogously, it can be shown in a similar way that
On the other hand, if {k6 − k5,k2 − k1} = {d4,d5}, by using Lemma C.1 three times we have the inequality
Hence,
It follows from (C.3)–(C.7) that
Similar to (C.8), one can show that
Finally, it is easy to see that
The conclusion of Lemma C.2(i) follows immediately from (C.8)–(C.11). █