Published online by Cambridge University Press: 10 June 2004
This paper discusses the asymptotic behavior of distributions of state variables of Markov processes generated by first-order stochastic difference equations. It studies the problem in a context that is general in the sense that (i) the evolution of the system takes place in a general state space (i.e., a space that is not necessarily finite or even countable); and (ii) the orbits of the unperturbed, deterministic component of the system converge to subsets of the state space which can be more complicated than a stationary state or a periodic orbit, that is, they can be aperiodic or chaotic. The main result of the paper consists of the proof that, under certain conditions on the deterministic attractor and the stochastic perturbations, the Markov process describing the dynamics of a perturbed deterministic system possesses a unique, invariant, and stochastically stable probability measure. Some simple economic applications are also discussed.
This paper discusses the asymptotic behavior of distributions of the state variables of Markov processes generated by first-order stochastic difference equations. The theory of stochastic dynamical systems in discrete time plays a very significant role in economic dynamics. The well-known treatise by Stokey and Lucas (1989) provides an extensive discussion of the relevant mathematical and statistical methods and includes many applications of the theory of Markov processes to economic models.
To fix ideas and put the mathematical background in place, we start with some formal definitions.
The basic stochastic dynamical system investigated in the paper can be described by the following equation:

xt+1=T(xt, ξεt+1), (1)

where the ξεt are i.i.d. random vectors with values in Wε, an open subset of ℝn; the state variables xt take values in M, an open subset of ℝm; the initial vector x0 is a given constant (not random). Alternatively, we can take x0 as a random vector taking values in M, arbitrary but independent of ξεt for t≥1. In either case, xt is independent of ξεt+1, for all t≥0. The index ε parameterizes the level of the perturbations ξ. T is a measurable function mapping M×Wε to M.
The fact that xt+1 is conditionally independent of xt−1, xt−2, …, given xt, ensures that (1) has the Markovian property; that is, for any integrable function ϕ, we have

E[ϕ(xt+1)|xt, xt−1, …, x0]=E[ϕ(xt+1)|xt].

In words, this means that the present value of the state variable x contains all the information from its past history relevant for the prediction of its future. System (1) generates a collection {xt}t≥0 of random variables, which is called a Markov chain or process.
The dynamics of Markov processes such as (1) can be defined by the iterations of a one-step transition probability kernel P(x, A), where, denoting by M the state space and by 𝔅(M) the Borel σ-algebra on M, we have

P(x, A)=Prob(xt+1∈A|xt=x), x∈M, A∈𝔅(M).

The formal relation between the map T of equation (1) and the corresponding transition probability kernel is the following:

P(x, A)=∫Wε χA[T(x, ξ)]νε(dξ),

where χA is the indicator function of the set A, i.e., χA(y)=1 if y∈A and χA(y)=0 otherwise, and νε is the probability measure of the random vectors ξεt, identical for all t.
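The relation between the map T and the kernel P(x, A) lends itself to a simple numerical illustration: the probability of landing in a set A from x is the νε-probability of drawing a shock that T sends into A. The following sketch uses an illustrative contracting map and uniform additive noise; these primitives are our own assumptions, not taken from the paper.

```python
import random

random.seed(0)

def simulate_chain(T, x0, draw_noise, n):
    """Iterate x_{t+1} = T(x_t, xi_{t+1}) for n steps, with i.i.d. shocks xi."""
    xs = [x0]
    for _ in range(n):
        xs.append(T(xs[-1], draw_noise()))
    return xs

def kernel_estimate(T, x, draw_noise, indicator, n_draws=20000):
    """Monte Carlo estimate of P(x, A) = nu_eps{xi : T(x, xi) in A}."""
    hits = sum(indicator(T(x, draw_noise())) for _ in range(n_draws))
    return hits / n_draws

# Illustrative primitives: contracting deterministic core, additive
# uniform noise on (-0.05, 0.05).
T = lambda x, xi: 0.5 * x + xi
draw_noise = lambda: random.uniform(-0.05, 0.05)

path = simulate_chain(T, 1.0, draw_noise, 200)
p = kernel_estimate(T, 0.0, draw_noise, lambda y: abs(y) < 0.05)
```

With these choices, p equals 1: starting from x=0, every admissible shock lands the chain in (−0.05, 0.05).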
In the investigation of systems represented by equation (1), we are especially interested in finding invariant configurations of the (random) state variables. Then we need the following.
DEFINITION 1. We say that a probability measure π on 𝔅(M) is invariant with respect to P or, equivalently, that π is preserved by P, if we have

π(A)=∫M P(x, A)π(dx), ∀A∈𝔅(M).

In what follows, the map F(x)=T(x, 0) denotes the deterministic component of the stochastic dynamical system (1), which we shall often call the “deterministic core.”
Although the results presented below, and in particular Theorem 1, can be applied to any problem taking the form of a Markovian process, we had in mind especially two broad types of economic models giving rise to stochastic difference equations such as (1). Because the models in question are well known, we describe them very briefly, ignoring variations and omitting many technical details, for which we refer the reader to the literature. Our purpose here is to relate the models in question to the mathematical setup described above.
The first class of models deals with intertemporal optimization problems in the presence of exogenous shocks perturbing fundamentals. It includes single-agent models of optimal growth, inventory accumulation, asset pricing, search unemployment, and many others. Earlier results in this area are found in Brock and Mirman (1972), subsequently extended by Radner (1973), Brock and Majumdar (1978), Majumdar and Zilcha (1987), and Joshi (1995), among others. A detailed discussion of the basic mathematical methods involved is provided by Stokey and Lucas (1989).
A general representation of these problems can be written as follows:

max E0 Σt≥0 βtf(kt, yt) s.t. kt+1=ϕ(kt, yt, ωt+1), yt∈Γ(kt, ωt), k0 given, (P1)

where k is the endogenous state variable; y is the control variable; k takes values in a set K and y in a set Y; {ω1, ω2, …} is a sequence of i.i.d. exogenous shocks taking values in a set Ω; β∈(0, 1) is the constant discount factor. The correspondence Γ gives the set of all possible control choices, given the known, endogenous and exogenous, state of the system; f:K×Y→ℝ is the return function.
The Bellman equation for problem (P1) can be written as

V(k, ω)=max y∈Γ(k, ω) {f(k, y)+β∫Ω V[ϕ(k, y, ω′), ω′]ν(dω′)}, (2)

where (k, ω) is the current, known state of the system; y is the control variable; ω′ is next period's shock, unknown at the moment of decision, with common distribution ν; and the function ϕ is the given “law of motion” relating the next period's value of the endogenous variable to its current state, the current decision, and the future exogenous shock.
Under appropriate, fairly standard assumptions on the spaces K, Y, Ω, the return function f, the feasibility constraint Γ, and the “law of motion” ϕ, there exists a unique, continuously differentiable function V satisfying (2) and an associated, continuous “policy function” g(k, ω), which determines the optimal value of the control variable y for any given pair of the endogenous and exogenous state variables. Thus, from g and ϕ, we can derive a difference equation of the form

kt+1=ϕ[kt, g(kt, ωt), ωt+1]. (3)

The sequence of the joint state variable {xt=(kt, ωt)} taking values in the set K×Ω follows a Markov process. If the sequence {ωt} is i.i.d., from equation (3) we can derive a first-order stochastic difference equation of the form

xt+1=T(xt, ωt+1), (4)

with the same properties as (1).
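A Bellman equation of the form (2) can be solved numerically by value iteration on finite grids. The sketch below uses illustrative primitives of our own choosing (log return ln(x−y), law of motion x′=w′·y^α, three equiprobable shock values); none of these specifics comes from the paper, and the grids and tolerances are arbitrary.

```python
import math

alpha, beta = 0.3, 0.9
shocks = [0.9, 1.0, 1.1]                      # equiprobable i.i.d. shock values
grid = [0.05 * i for i in range(1, 41)]       # grid for the state x
controls = [0.0] + grid                       # grid for the control y

def nearest(v):
    # index of the grid point closest to v
    return min(range(len(grid)), key=lambda i: abs(grid[i] - v))

# Precompute the next-state index for every (control, shock) pair.
nxt = {(y, w): nearest(w * y ** alpha) for y in controls for w in shocks}

def value_iteration(tol=1e-6, max_iter=2000):
    V = [0.0] * len(grid)
    policy = [0.0] * len(grid)
    for _ in range(max_iter):
        V_new = []
        for ix, x in enumerate(grid):
            best, best_y = -math.inf, 0.0
            for y in controls:
                if y >= x:
                    break                     # controls are sorted: further y infeasible
                # expected continuation value over next period's shock
                ev = sum(V[nxt[(y, w)]] for w in shocks) / len(shocks)
                val = math.log(x - y) + beta * ev
                if val > best:
                    best, best_y = val, y
            V_new.append(best)
            policy[ix] = best_y
        if max(abs(a - b) for a, b in zip(V, V_new)) < tol:
            V = V_new
            break
        V = V_new
    return V, policy
```

The returned policy plays the role of g(k, ω) above; iterating it together with the law of motion produces a Markov chain of the form (4).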
This class of models includes many variations, mostly of the overlapping generations (OLG) type. We shall provide an abstract characterization of it, referring the reader to the comprehensive survey by Chiappori and Guesnerie (1991) for details and a rich bibliography.
Consider a competitive economy in which, at each time t, the present state xt is entirely determined by nonstochastic fundamentals, agents' expectations about the state one step ahead, xt+1, and the equilibrium requirement that markets clear at all times. Suppose now that agents observe a Markov process of i.i.d. signals, {ξεt}, characterized by a probability distribution νε, as described earlier, and that they commonly believe that these signals are perfectly correlated with the equilibrium values of the state variable x, in the sense that there exists a homeomorphism ϕ linking the signals and those values. Then, the equilibrium conditions can be written as equations (5) and (6), where P(xt, dxt+1)=νε[ϕ−1(xt, dxt+1)], and the function appearing in them depends on (nonstochastic) fundamentals, for example, utility functions. We say that agents' expectations are self-fulfilling if the equilibrium values of x, as determined by fundamentals and beliefs, actually validate equations (5) and (6). In this case, the resulting sequence of equilibrium values {xt} is a Markov process with probability transition kernel P(x, A), which is called a sunspot equilibrium (SE). An SE is stationary (SSE) if there exists a probability measure π, invariant for the Markov process, as in Definition 1.
As will be illustrated in Example 2 below, sufficient conditions for the validation of (5) and (6) can be represented by an equation of the form Φ(xt, xt+1, ξεt+1)=0. When the conditions for the implicit function theorem hold, Φ can be inverted locally with respect to xt+1, yielding a difference equation xt+1=T(xt, ξεt+1) like (1). In sunspot OLG models, “locally” usually means in a neighborhood of a stationary, or periodic, solution of the perfect-foresight “deterministic core.” Whether or not the map T is uniquely defined globally depends on the properties of the function Φ and ultimately on the fundamental functions of the model. The results of this paper can therefore be applied both to the “invertible” and, locally, to the “noninvertible” case. Global analysis of the noninvertible case is difficult and involves discussion of delicate questions concerning the so-called “backward dynamics” that would take us far afield. Because our main interest here is to establish global results, the examples discussed in Section 3 refer to problems of Type 1 (stochastic dynamic optimization) and to problems of Type 2 (sunspots) limited to the invertible case.
Our discussion of existence and stability of invariant probability distributions for Markov processes generated by equation (1) is general in the following, twofold sense: (i) the evolution of the system takes place in a general state space, that is, a space that is not necessarily finite or even countable; and (ii) the orbits of the unperturbed, deterministic component of the system converge to subsets of the state space that can be more complicated than a stationary state or a periodic orbit, that is, they can be aperiodic or chaotic.
Proving the existence of an invariant probability distribution for the Markov process under investigation may not be very interesting if that distribution only obtains for very special (random) initial conditions. Therefore, we explicitly discuss the question of uniqueness and stochastic stability of invariant distributions, making use of some recent, powerful results in this field of research.
Given the rather technical nature of the argument, we postpone a more detailed discussion of the relevant specialized literature until the end of the paper, when the necessary concepts and methods have been properly defined. For the unexplained concepts and the basic results used in the following pages and in particular in Appendices A and B, we refer the reader to the excellent treatise on stochastic stability by Meyn and Tweedie (1993), which contains the state of the art in this area.
In view of what we said before and without loss of generality, we study the basic stochastic difference equation (1) in the following decomposed form:

xt+1=F(xt)+G(xt, ξεt+1), (7)

where the map F(xt)=T(xt, 0) is the deterministic core, which we assume to be known, and the map G(xt, ξεt+1)=T(xt, ξεt+1)−F(xt) denotes the perturbation term.
We also need to consider the related family of equations:

xk=Tk(x0; ξ1, …, ξk), ξi∈Wε, (8)

where for each k the function Tk (to be distinguished from the kth iterate of T, Tk!) can be determined inductively as follows:

Tk+1(x0; ξ1, …, ξk+1)=T[Tk(x0; ξ1, …, ξk), ξk+1],

with T1(x0; ξ1)=T(x0, ξ1).

Equation (8) can be interpreted by saying that, for any given initial state x0∈M and any given set of values ξ1, …, ξk∈Wε, the value of x at time k, xk, is determined by the function Tk. In that sense, equations (8) describe a deterministic system uniquely related to the Markov chain (1), known in the literature as an associated control system (ACS). We have more to say about ACS later.
Let us now introduce the following assumptions:
Assumption 1. The maps F and G (and therefore the map T) are continuously differentiable with bounded derivatives. This implies that those maps are Lipschitzian on bounded sets.
To formulate the next assumption, we need a few preliminary definitions.
For a set A⊂M and a scalar constant r>0, let L(r, A)={x∈M|d(x, A)<r}, where d(x, A) denotes the distance from x to A, namely,

d(x, A)=inf y∈A ‖x−y‖,

and ‖·‖ is any vector norm, such as the Euclidean distance. Then we have the following definition.
DEFINITION 2. A compact subset Λ of the state space M, invariant under a map f, is said to be Lyapunov stable if, for any ε>0, there exists δ>0 (depending on ε) such that fn[L(δ, Λ)]⊂L(ε, Λ), for all n≥0. Λ is said to be asymptotically stable if it is Lyapunov stable and, in addition, there exists η>0 such that limn→∞ d[fn(x), Λ]=0 for all x∈L(η, Λ).
DEFINITION 3. A map f:Λ→Λ is said to be topologically transitive (t.t.) on Λ if, for any two open sets U, V⊂Λ, there exists an integer n such that fn(U)∩V≠∅.
Transitivity implies that orbits generated by the map f starting from any arbitrarily small open neighborhood visit any other arbitrarily small open neighborhood in Λ in finite time. Thus, the set Λ is dynamically indecomposable and must be studied as one piece.
Sometimes we need stronger forms of transitivity, as defined below.
DEFINITION 4. A map f:Λ→Λ is said to be strongly topologically transitive (s.t.t.) on Λ if for any integer m>0 the map fm is topologically transitive on Λ.
DEFINITION 5. A map f:Λ→Λ is said to be topologically mixing on Λ if, for any two open nonempty sets U, V⊂Λ, there exists a positive integer N such that, for every n>N, fn(U)∩V≠∅.
Topological mixing implies s.t.t. and each of these properties implies t.t.
DEFINITION 6. An asymptotically stable set Λ is said to be an attractor if it is “indecomposable” under the action of the map f in the sense that f is topologically transitive on Λ.
We can now state Assumption 2.
Assumption 2. The deterministic system

xt+1=F(xt) (9)

possesses a (locally unique) attractor Λ⊂M, which is in the interior of its basin of attraction B(Λ). Moreover, the map F is strongly topologically transitive on Λ.

The stronger form of indecomposability is necessary to guarantee aperiodicity of the Markov chain generated by (7). The case in which F is not t.t., and the one in which F is t.t. but not strongly t.t., are discussed in Remarks 2 and 3 below. Notice that we do not require that the attractor be exponentially stable.
Assumption 3. For all ε>0, the probability measure νε characterizing the i.i.d. random variables ξεt (discussed earlier) possesses a density γε that is lower semicontinuous (l.s.c.), and its support, that is, the set

supp(γε)={ξ|γε(ξ)>0},

is an open, bounded subset of the relevant Euclidean space, containing the singleton {0}. (For ε=0, the measure νε degenerates to the Dirac measure centered on {0}.)
This assumption is obviously satisfied for many distributions with continuous density, commonly assumed in the description of noise. However, continuity is a sufficient, but not necessary, condition for l.s.c. For example, uniform distributions on bounded, open sets, although not continuous, would satisfy Assumption 3. In fact, a discrete distribution with mass concentrated on k points (x1, …, xk) can be approximated by a uniform distribution with support on small open balls centered on those points.
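The two points just made, that a uniform density is l.s.c. because its support is open, and that a discrete distribution can be approximated by uniform mass on small open balls, can be sketched directly. The function names and the two-point example below are our own illustrations.

```python
import random

def lsc_uniform_density(xi, half_width):
    """Density of U(-h, h): discontinuous at the endpoints, yet lower
    semi-continuous, because the set {xi : density > 0} is open."""
    return 1.0 / (2 * half_width) if -half_width < xi < half_width else 0.0

def smoothed_two_point(points, radius):
    """Approximate a discrete distribution on `points` by a mixture of
    uniform draws on small open balls centred on those points."""
    p = random.choice(points)
    return p + random.uniform(-radius, radius)

random.seed(0)
# Draws concentrate near -1 and +1, but have an l.s.c. density with
# open, bounded support, as Assumption 3 requires.
draws = [smoothed_two_point([-1.0, 1.0], 0.01) for _ in range(1000)]
```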
Before stating Assumption 4, we need to define the concept of forward accessibility. Keeping in mind the earlier definition of ACS and, in particular, equation (8), we can write Definition 7.
DEFINITION 7. Let A+(x) denote the set of all states reachable from x at some time in the future under admissible sequences of perturbations ξ1, …, ξk∈Wε. Then, an ACS is said to be forward accessible (FA) if, for each x0∈M, the set A+(x0)⊂M has a nonempty interior.
Then, we can state the following assumption.
Assumption 4. The deterministic ACS (8) associated with the stochastic system (7) is forward accessible.
From Definition 7 (and the definition of interior), we gather that a Markov chain X, for which the corresponding ACS is FA, cannot be concentrated in some lower-dimensional subset of the state space (e.g., if M is an open subset of ℝ, the chain cannot be concentrated in a point; if M is an open subset of ℝ2, it cannot be concentrated in a line, and so forth). For additional comments on FA, together with a brief description of the method for ascertaining the presence of FA in simple models, see Appendix B.
Assumption 5. For all ξε, the perturbation term ‖G(x, ξε)‖ is bounded uniformly for x∈B(Λ); i.e., for each ε≥0 there exists an Lε<∞ such that

sup x∈B(Λ), ξ∈Wε ‖G(x, ξ)‖≤Lε

and

limε→0 Lε=0.

Thus, by reducing ε, we can make the perturbation level as small as we please, and in the limit for ε→0 the stochastic process (7) degenerates to its “deterministic core” (9).
We can now prove the following theorem.
THEOREM 1. Let a Markov chain X be defined by (7) and by the associated transition probability kernel Pε(x, A), and let Assumptions 1–5 hold. Then, for any deterministic attractor Λ and a sufficiently small ε [depending on the size of the basin of attraction B(Λ)], there exists a set Oε⊂B(Λ) that is absorbing; i.e., Pε(x, Oε)=1 if x∈Oε. Moreover, for the chain X restricted to Oε, the following properties hold: there exists a unique invariant probability measure πε, and the chain is uniformly ergodic; that is, for every x∈Oε,

‖Pnε(x, ·)−πε‖→0 as n→∞.

Notice that, here, ‖·‖ denotes the total variation norm defined as

‖μ‖=sup A∈𝔅(M) μ(A)−inf A∈𝔅(M) μ(A).

Convergence in the total variation norm implies weak convergence. We say that a sequence of probability measures {μk} converges weakly to μ if, for any bounded continuous function f, we have

limk→∞ ∫ f dμk=∫ f dμ

[cf. Meyn and Tweedie (1993, pp. 311, 521)].
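For discrete (e.g., binned empirical) distributions, the total variation distance reduces to half the sum of absolute differences of the probabilities, and stochastic stability can be illustrated by comparing the empirical laws of a chain started from two very different initial conditions. The chain below (a contracting map with uniform noise) is an illustrative assumption, not a model from the paper.

```python
import random

def empirical_hist(samples, bins, lo, hi):
    """Empirical probabilities over equal-width bins on [lo, hi)."""
    counts = [0] * bins
    w = (hi - lo) / bins
    for s in samples:
        i = min(bins - 1, max(0, int((s - lo) / w)))
        counts[i] += 1
    n = len(samples)
    return [c / n for c in counts]

def tv_distance(p, q):
    """Total variation distance between two discrete distributions:
    sup_A |p(A) - q(A)| = (1/2) * sum_i |p_i - q_i|."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

random.seed(1)

def step(x):
    return 0.5 * x + random.uniform(-0.1, 0.1)

def sample_at(n, x0):
    x = x0
    for _ in range(n):
        x = step(x)
    return x

# Empirical law of X_30 from two distant starts: the chain forgets x0.
p = empirical_hist([sample_at(30, 5.0) for _ in range(2000)], 20, -0.3, 0.3)
q = empirical_hist([sample_at(30, -5.0) for _ in range(2000)], 20, -0.3, 0.3)
```

The two histograms should be close in total variation, up to Monte Carlo error, which is the empirical counterpart of convergence to a unique πε.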
The proof of the Theorem above—which is the core of this paper—is rather long and quite technical and we have relegated it to Appendix A. In economic terms, the consequence of the mathematical results listed above can be summarized in the following proposition.
PROPOSITION 1. Let equation (7) describe a problem of Type 1 (e.g., stochastic optimal growth), let equation (9) describe the corresponding deterministic core, and let the conditions of the Theorem hold. Then, the state variables of the perturbed system possess a unique invariant probability distribution, which is stochastically stable in the sense of Theorem 1.
Analogously, in the case of Type 2 problems (sunspots), we have the following proposition.
PROPOSITION 2. Let equation (7) describe a sunspot equilibrium, let equation (9) describe a (forward-moving) perfect-foresight equilibrium (PFE), and let the conditions of Theorem 1 hold. Then, there exists a stationary sunspot equilibrium (SSE), whose invariant probability distribution is unique and stochastically stable in the sense of Theorem 1.
Remark 1. Strong topological transitivity is satisfied trivially when the deterministic attractor Λ is a fixed point. It is also satisfied when the dynamics of map F are chaotic and mixing on Λ. As we shall see in Example 3, strong transitivity may also be verified when the attractor is quasiperiodic (aperiodic but not chaotic).
Remark 2. When the deterministic map is t.t. on Λ but not strongly t.t., Assumption 2 is violated, and the Markov process need not be aperiodic or converge to a unique probability distribution. This situation obviously occurs when Λ is periodic and, less obviously, when Λ consists of two or more chaotic sets mapped into each other cyclically by F (the so-called periodic, or nonmixing, chaos). However, this is not as serious a drawback as it seems, in view of the fact [cf. Meyn and Tweedie (1993, Proposition 5.4.6, p. 118)] that, for an irreducible Markov process X periodic with period d, the state space can be decomposed as

M=(D1∪D2∪…∪Dd)∪E,

where the d sets Di, i=1, …, d, are disjoint, absorbing, and irreducible for the process Xd generated by the probability transition kernel Pd (i.e., the dth iterate of P), and the process Xd on any of the sets Di is aperiodic. The residual set E is transient and therefore negligible.2 In this case, the distribution of the process Xd will converge to one of d probability measures πi (one for each set Di), and the choice among them will depend on the initial conditions.3

2A transient set E is negligible in the sense that the expected number of times that an infinite chain X starting in E returns to it is finite.

3For further details on this point, see Doob (1953, pp. 190–218) and Stokey and Lucas (1989, pp. 334–351).
Remark 3. When the deterministic map F is not t.t. on an attracting set Λ (e.g., Λ is decomposable into two invariant sets), Assumption 2 is violated, the Markov process is not irreducible, and Theorem 1 is no longer valid. However, for any reducible T-chain, there exists a finite decomposition of the state space

M=(H1∪…∪Hn)∪E,

where the sets Hk are disjoint, absorbing sets and the chain restricted to any of the Hk is uniformly ergodic, whereas E is a transient set. For each of the absorbing, irreducible sets Hk, there is a unique, invariant probability distribution, stochastically stable for appropriate initial conditions [cf. Tuominen and Tweedie (1979); Meyn and Tweedie (1993, p. 408)].
Remark 4. The hypothesis of independence of the exogenous perturbations is not essential. When perturbations are not independent, we can still write a stochastic difference equation in which there appear i.i.d. exogenous perturbations, but the equation need not be of first order. That is to say, we would have equations of the form

wt+1=T(wt, wt−1, …, wt−n+1, ξt+1),

where wt denotes the state variable, the sequence {ξt} is i.i.d., and the order n of the difference equation depends on the structure of the dependence of the perturbations. The resulting nth-order system can then be reduced to an equivalent first-order system by extending the state space through the introduction of appropriate auxiliary variables. This procedure is illustrated in Example 4.
A brief discussion of simple economic models of Type 1 and Type 2 will help the reader understand the nature and relevance of our results and the assumptions on which they are based. To avoid repetition, let us start with some general considerations concerning the four models that follow. First, the functional relationships commonly adopted in the models considered here guarantee that the “smoothness assumption,” Assumption 1, is satisfied. Second, the r.v.s considered below comply with Assumption 3. This does not pose any particular restriction on the economic primitives of the model and, as mentioned in the discussion of Assumption 3 above, it allows a wide choice of stochastic perturbations.
Consider the well-known one-good/two-sector optimal growth model in reduced form. In the deterministic version of the model, if we choose the single capital good k as the endogenous state variable, and the control variable is the amount y of output saved and invested, the map governing the dynamics of k along an optimal path coincides with the optimal policy function g; that is, we have

kt+1=g(kt).

It is known that, for a sufficiently small discount factor β (i.e., a sufficiently large discount rate), any C2 map can be a policy function for a problem of optimal growth satisfying the standard economic requirements—in a nutshell, convexity of technology and convexity of preferences [see Montrucchio (1986), Boldrin and Montrucchio (1986)]. In particular, there exist specifications of the return function f, satisfying those economic requirements, that yield the optimal policy map

g(k)=μk(1−k),

i.e., the “logistic” map much studied in the literature on chaotic dynamics [in the present context, besides the articles quoted above, see Deneckere and Pelikan (1986)].
Suppose now we introduce a stochastic perturbation such that a random proportion ξεt+1 of output saved at each instant t is wasted before it can be used as production input, but after the optimal choice of how much to save has been made. Consequently, we have

kt+1=(1−ξεt+1)g(kt)=(1−ξεt+1)μkt(1−kt), (12)

which is a special case of problem (P1) above, where the return function f(k, y) in this case can be interpreted as the “consumption frontier.”

We also assume that {ξεt} is a sequence of i.i.d. r.v.s, uniformly distributed over (0, a), 0<a<1. After a normalization, the standard deviation ε of the r.v. ξεt can be used as the index parameterizing the level of perturbations.

Putting F(kt)=μkt(1−kt) and G(kt, ξεt+1)=−ξεt+1μkt(1−kt), equation (12) can be written in the form of (7) as

kt+1=F(kt)+G(kt, ξεt+1). (13)
The dynamic behavior of the map F is extremely well documented in the mathematical literature on maps of the interval and we refer the reader to it for details [see, e.g., Whitley (1983), Sharkovsky et al. (1997)]. Broadly speaking, the properties of attractors depend on the single parameter μ, which is a decreasing function of the discount factor. In particular, it is known that, for μ∈(1, 3), attractors are all fixed points; for μ∈(3, μ∞), μ∞≈3.57, attractors are all periodic; for μ∈(μ∞, 4), there exist periodic, quasiperiodic, or chaotic attractors. We have already discussed the simpler cases (fixed-point and periodic attractors) in Remarks 1 and 2, and the quasiperiodic case is taken up in Example 3. Here, we concentrate on the case of attractors that are chaotic in the following sense: there exists a unique, absolutely continuous, ergodic F-invariant probability measure ρ with support in Λ, with respect to which F has a positive metric entropy (and a positive Lyapunov exponent of equal value). There are two possibilities here: (i) ρ is mixing, which implies that F is topologically mixing on Λ [see Katok and Hasselblatt (1995, p. 151)] and Assumption 2 is satisfied; (ii) F is topologically transitive, but not topologically mixing on Λ, and we have the case of periodic, or nonmixing, chaos, discussed in Remark 2.
Assumption 4 (forward accessibility) is verified because (see Appendix B)

∂kt+1/∂ξεt+1=−μkt(1−kt)≠0,

which is true for initial conditions 0<k0<1 (initial conditions k0=0 or k0=1 are excluded because, for μ∞<μ<4, they are not in the basin of attraction of Λ). Finally, for initial conditions 0<k0<1, we have 0<F(kt)<1, ∀t. This and the stated properties of the r.v. ξεt guarantee that, for kt∈B(Λ), the perturbation term G(kt, ξεt+1) is bounded and goes to zero with ε, and Assumption 5 is verified.
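Example 1 can be simulated directly: the perturbed logistic map (13) stays in (0, 1), and long-run statistics are essentially independent of the initial condition, as stochastic stability suggests. The parameter values μ=3.9 (in the chaotic range) and noise level a=0.02 below are illustrative choices of ours.

```python
import random

random.seed(42)
mu, a = 3.9, 0.02          # mu in the chaotic range; noise level a (illustrative)

def step(k):
    # k_{t+1} = (1 - xi_{t+1}) * mu * k_t * (1 - k_t),  xi ~ U(0, a)
    return (1.0 - random.uniform(0.0, a)) * mu * k * (1.0 - k)

def long_run(k0, burn=1000, n=20000):
    """Discard a burn-in, then record n states of the perturbed chain."""
    k = k0
    for _ in range(burn):
        k = step(k)
    out = []
    for _ in range(n):
        k = step(k)
        out.append(k)
    return out

ks = long_run(0.3)
```

Since μ/4<1 and 1−ξ∈(1−a, 1], the orbit remains in the economically meaningful region (0, 1), so the boundedness required by Assumption 5 holds along the simulation.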
The deterministic core of this example is the pure exchange OLG model discussed by Samuelson (1958), Gale (1973), and many others. Here, we use a modified version of the Benhabib and Day (1982) model. In this model, there is no production, but each agent receives endowments of a single, perishable consumption good. Because we concentrate on Gale's “classical case” (young are impatient and borrow from the old), for simplicity's sake we assume that only old agents receive a constant endowment w. At each time t, the young agent borrows a certain amount ct of the good and, when old, must pay back an amount w−gt+1 (and will accordingly consume an amount gt+1 of the good) that depends on the exchange rate between present and future consumption Rt+1. Finally, at each t, the good market clears; that is, w=ct+gt. (We would also need some intergenerational arrangements guaranteeing that young people's debts are always settled, as well as some ad hoc rules for the “time zero” of the model, but we cannot discuss them here.)
In the stochastic case, agents believe that in equilibrium the exchange rate is perfectly correlated with an extrinsic i.i.d. sunspot ξεt. We assume that ξεt is characterized by a probability distribution νε with open, bounded support on the real line including {0}, and that the bound goes to zero with ε (thus Assumption 5 is satisfied).
Assuming a separable utility function, U(ct, gt+1)=v(ct)+u(gt+1), the young agent's program is

max ct v(ct)+E[u(gt+1)|ct] s.t. gt+1=w−Rt+1ct, (14)

where the expectation is taken w.r.t. the measure νε and is conditional on ct. If agents' beliefs are self-fulfilling and markets always clear, i.e., ct=w−gt ∀t, we must have

v′(ct)=E[Rt+1u′(gt+1)|ct], (15)

where Rt+1=ct+1/ct and gt+1=w−ct+1. If v′ is invertible (and putting ξt=ξεt for simplicity's sake), condition (15) is satisfied for

ct+1=F(ct)+G(ct, ξεt+1). (16)

Notice that the deterministic core of (16) is ct+1=F(ct), corresponding to the perfect-foresight no-sunspot version of the pure exchange OLG model, and G(ct, 0)=0.
Let us now choose the utility functions v(c)=A−we−c and u(g)=g, corresponding, respectively, to constant absolute risk aversion and risk neutrality, with A a positive constant. In this case, F(c)=wce−c, and the perturbation term G is determined accordingly. F is a much-studied member of the class of unimodal maps of the interval. Notice that the interval Iw=[0, w/e] is invariant for F and therefore, if c0∈Iw, gt=w−ct will never become negative along a deterministic orbit. In the complete stochastic system, for every given w, the level of perturbations (the parameter ε) must be fixed so that gt≥0 at all times. For w>1, F has two nonnegative fixed points, namely, c=0, unstable, and c̄=ln w, which is asymptotically stable (with improper oscillations) for 1<w<e2. If the endowment is increased, at w=e2 a flip bifurcation occurs, leading to an initially stable period-2 cycle and then, if w is increased further, to a cascade of bifurcations with (initially) stable cycles of increasing periods. Analytical and numerical studies of the map F [see, e.g., May and Oster (1976)] indicate that, for larger values of the endowment w, complex (chaotic) attractors will appear. Thus, the typology of attractors of the map F is similar to that occurring for the logistic map discussed in Example 1, and we could repeat what we said there with regard to Assumption 2. (However, notice that, in this case, very large values of w will lead back to simpler, periodic attractors.) Other choices of the utility function, such as v(c)=ac−bc2 or u(c)=λ(c+b)1−a/(1−a), would lead to similar results [for details, see Benhabib and Day (1982)]. Notice that Assumption 4 is satisfied because the derivative of the right-hand side of (16) with respect to the sunspot term does not vanish.
This model differs from the one discussed in Example 2 in some important and related respects concerning both technology and agents' behavior. First, the single good can be both consumed and invested (with full depreciation after one period). There are no endowments and, at each time t, the single good is produced by current labor, supplied by young agents, and by capital (output minus consumption) invested at time t−1. Second, only old agents consume.
To fix ideas, let us consider the case in which (i) the technology is represented by a Leontief linear production function, yt=min[lt, bkt−1] (where y denotes output, k=y−c is capital, c is consumption, l is labor, and, for viable systems, b>1); (ii) young agents' preferences are represented by a separable utility function of Cobb–Douglas type, with utility elasticities α and β; and (iii) uncertainty is represented by means of random, i.i.d. perturbations of the future price of consumption. Under these assumptions, the equilibrium dynamics of consumption and labor supply are described by a stochastic dynamical system (17), where {ξεt} is an i.i.d. process and ξεt is a zero-mean r.v. with open, bounded support on the real line, including {0} and with bound going to zero with ε. If we denote by xt the two-dimensional vector variable (ct, lt), equation (17) has the same form as (1). The RHS can be split into a deterministic part F, corresponding to the perfect-foresight version of the model, and a stochastic perturbation G.
There exist two fixed points for the map F, namely, E1, located at the origin, and E2, in the positive orthant. E1 is always unstable. Stability of E2 depends on two parameters, that is, b, measuring productivity (the output/capital ratio), and β/α, the ratio between the utility elasticities. For sufficiently low values of b and β/α, E2 is locally asymptotically stable and its basin of attraction is a forward-invariant subset of the positive orthant. If we increase either (or both) of those parameters, E2 loses its stability through a Neimark–Sacker bifurcation leading to the appearance of an invariant, closed curve around E2 [for details, cf. Reichlin (1986), Medio (1992, Ch. 12)]. Numerical investigation suggests that for this model the curve is indeed stable. The dynamics on the curve can be periodic or quasiperiodic. Here, we concentrate on the quasiperiodic case. It is known that, under quite general conditions, in this case the dynamics of the map F on the invariant curve are homeomorphically equivalent to the fixed rotation of the circle, described by the map

θt+1=θt+ρ (mod 1),

with ρ the rotation number.
Consider first that, for ρ irrational, the map f is t.t. on the circle and so are all the maps fk for arbitrary k>1. Consequently, f is strongly t.t. on the circle and, because of the homeomorphic equivalence, so is F on the invariant attracting curve. Thus, Assumption 2 is verified. FA can now be ascertained by means of the techniques explained in Appendix B, by evaluating the determinant of the “controllability matrix” as described there. From an economic point of view, we are interested in solutions that stay in the subset M defined before. Therefore, for any initial conditions (l0, c0)∈M, we can always choose values of the perturbations belonging to the support of νε such that the determinant is nonzero and the controllability matrix has full rank. Consequently, FA is verified and Assumption 4 holds. Finally, notice that, given the assumptions on ξεt, and for all (ct, lt) in the basin of attraction, the perturbation term G defined before is bounded and goes to zero with ε. Thus Assumption 5 is verified.
This is a simple variation of the well-known one-good/one-sector model of optimal growth.4

4Here we use the version discussed by Cugno and Montrucchio (1998, pp. 178–179).

The technology is represented by

xt=eωt(kt)α, 0<α<1,

where c, x, k denote, respectively, consumption, output, and capital stock; yt=xt−ct is the control variable (saving) and kt=yt−1; and ωt is a technology shock observed after the optimal choice of how much to save has been made. In this case, we assume that the sequence of shocks {ωt} is generated by the equation

ωt+1=rωt+ξt+1,

with 0≤r<1, where {ξt} is a zero-mean, i.i.d. process and ωt and ξt+1 are independent. Therefore, for r>0, the sequence {ωt} is a stationary, but not independent, Markov chain.
Under the stated assumptions, the Bellman equation can be written as
where Ω is the support of ω′ and μ is its probability distribution, conditional on ω and, as usual, for a generic variable, we adopt the notation x=xt and x′=xt+1. Let us now try a solution V(x, ω)=mln x+n1ω+n2, with m, n1, n2 undetermined coefficients. Substituting into (20), we have
whence, using the facts that at each point in time, (i) x′=eω′(k′)α=eω′yα and (ii) ∫Ωω′μ(ω, dω′)=rω, we can write
Next, finding the value of y that maximizes the RHS of (22) and solving for the unknown coefficient m, we obtain the policy function y=g(x)=αβx. Finally, considering that for all t, yt=kt+1, we can write the stochastic difference equation of the model; that is,
or, taking logarithms,
where C=αln(αβ) is a constant. Moving (24) one period back in time, solving for ωt, and substituting into (24), we obtain
where B=C(1−r).
Equation (25) is a one-variable, second-order stochastic difference equation with an i.i.d. perturbation. By introducing appropriate auxiliary variables, it can be transformed into a dynamically equivalent two-variable, first-order equation. Putting
, where
, we have
which has the same form as (1), with the RHS already split into a deterministic and a stochastic part, and it satisfies Assumption 1. It also provides an illustration of our comment in Remark 3 concerning the non-essentiality of the assumption of independence of perturbations.
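The dynamics of (25) can be sketched numerically. The following minimal simulation assumes illustrative parameter values and an illustrative shock size (not taken from the text); it iterates the second-order form (25) and checks that the process fluctuates around the deterministic fixed point z* = C/(1−α), which follows from setting z[t+1] = z[t] = z[t−1] in (25) and using B = C(1−r):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameter and shock values (assumptions, not from the text):
alpha, beta, r = 0.3, 0.95, 0.5       # 0 < alpha < 1, 0 <= r < 1
C = alpha * np.log(alpha * beta)      # constant of equation (24)
B = C * (1.0 - r)                     # constant of equation (25)

# Equation (25) with z[t] the log of the state variable:
#   z[t+1] = B + (alpha + r) z[t] - alpha*r z[t-1] + eps[t].
T = 10_000
z = np.zeros(T)
eps = 0.01 * rng.standard_normal(T)   # small zero-mean i.i.d. shocks
for t in range(1, T - 1):
    z[t + 1] = B + (alpha + r) * z[t] - alpha * r * z[t - 1] + eps[t]

# The deterministic part is linear with roots alpha and r, both inside the
# unit circle, so z fluctuates around the fixed point z* = C / (1 - alpha).
z_star = C / (1.0 - alpha)
assert abs(z[T // 2:].mean() - z_star) < 0.05
```

The characteristic polynomial of the deterministic part is λ² − (α+r)λ + αr = (λ−α)(λ−r), which is why the stated conditions on α and r deliver stability.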
The deterministic part of (26) is linear and it has a unique fixed point located at the origin (actually it corresponds to a positive value of the state variable,
). Under the stated assumptions, and, in particular, the fact that 0<α<1 and 0≤r<1 (both absolutely reasonable in this context), the fixed point is always globally, exponentially asymptotically stable. Thus (26) satisfies (trivially) Assumption 2. Assumption 4 (forward accessibility) can again be proved by applying the same technique as in Example 3. Choosing k=2, the controllability matrix is
hence
and rank
as required. Finally, in view of the stability properties of the fixed point and the fact that the noise is additive, the results of Theorem 1 can be established under weaker conditions on the perturbing term
. In particular, Assumption 5 can be replaced by the condition
[for details, see Tong (1990), pp. 127–129].
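For the linear first-order form (26) with additive noise, the rank condition of Assumption 4 can also be checked directly. The sketch below (our own illustration, with assumed parameter values) writes the deterministic part as x′ = Ax + bε, with coefficients taken from (25), and verifies that the k=2 controllability matrix [b, Ab] has full rank for any admissible α, r:

```python
import numpy as np

# Illustrative parameter values (assumptions, not from the text):
alpha, r = 0.3, 0.5                   # 0 < alpha < 1, 0 <= r < 1

# Deterministic part of the first-order form (26), state (z[t], z[t-1]):
# z[t+1] = B + (alpha + r) z[t] - alpha*r z[t-1], with additive noise in
# the first coordinate only, i.e. x' = A x + b*eps.
A = np.array([[alpha + r, -alpha * r],
              [1.0,        0.0      ]])
b = np.array([[1.0],
              [0.0]])

# With additive noise, the k=2 controllability matrix reduces to [b, A b];
# its determinant equals 1 for every alpha, r, so rank 2 always holds.
ctrb = np.hstack([b, A @ b])
assert np.linalg.matrix_rank(ctrb) == 2
assert abs(np.linalg.det(ctrb) - 1.0) < 1e-9
```

Because the determinant is identically 1, no choice of the perturbation sequence is needed here: the rank condition holds from any initial condition.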
Apart from the general reference to the book by Meyn and Tweedie, mathematically the obvious reference is to Chan and Tong [see Tong (1990), in particular its Appendix 1, written by Chan; Chan and Tong (1994)], to whose work we would like to acknowledge our intellectual debt. In particular, these authors proved uniform ergodicity of a Markov chain under hypotheses similar to Assumptions 1 and 3–5 above [see Chan and Tong (1994)]. However, they assumed exponential stability of the deterministic attractor, an assumption considerably stronger than Assumption 2 and unnecessarily restrictive. As will be seen in Appendix A, our proof of Theorem 1 (and the choice of the Lyapunov function) is accordingly different.
Moreover, Chan and Tong do not relate these results on stochastic stability to the economic literature in general, or to the sunspot question in particular, and so we shall do it here briefly.
Consider first that, for irreducible, aperiodic Markov chains, uniform ergodicity is equivalent to (it implies and is implied by) the celebrated Doeblin condition. This condition requires that there exist a (finite-valued) measure ϕ on
, an integer n≥1, and a positive δ, such that
for every x∈M. Roughly speaking, this requirement means that there exists a measure ϕ such that the process X is not concentrated on ϕ-small sets.
5. The Doeblin condition, under the name of “Hypothesis D,” and its implications are extensively discussed by Doob (1953, pp. 190–234). See also Stokey and Lucas (1989, pp. 344–351), and Meyn and Tweedie (1993, Th. 16.0.2, pp. 384–385). (Meyn and Tweedie, however, use a somewhat different definition of the Doeblin condition, which is equivalent to the traditional one for irreducible Markov chains.)
Uniform ergodicity (or, equivalently, the Doeblin condition) also implies that the operator
, defined by
(where
is the space of finite probability measures) is quasi compact. Hence, uniformly ergodic chains are sometimes called quasi compact. A formal definition and a thorough discussion of the concepts of compactness and quasi compactness of operators are beyond our scope here. Broadly speaking, we can say that if the operator
on
is quasi compact and the associated Markov chain is weak Feller (which in the case of our model is implied by Assumptions 1 and 3), then the sequences generated by
, starting from any initial condition on
, converge to a unique invariant probability measure.
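A toy finite-state example (ours, not the paper's) illustrates the convergence property just described: for an irreducible, aperiodic chain, iterating the transition kernel drives any initial distribution to the unique invariant measure in total variation:

```python
import numpy as np

# A 3-state irreducible, aperiodic chain (an arbitrary illustrative kernel).
P = np.array([[0.5, 0.4, 0.1],
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])

# Unique invariant measure: normalized left eigenvector for eigenvalue 1.
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()

# Iterating the kernel drives any initial distribution to pi in
# total variation, at a geometric rate set by the second eigenvalue.
mu = np.array([1.0, 0.0, 0.0])
for _ in range(50):
    mu = mu @ P
assert 0.5 * np.abs(mu - pi).sum() < 1e-10
```

On a general (uncountable) state space the same conclusion is exactly what quasi compactness of the operator, together with the weak Feller property, delivers.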
The relation between the Doeblin condition and quasi compactness of operators, and the associated property of convergence of probability measures, was introduced in the economic literature by Carl Futia's (1982) excellent mathematical survey. Several applications of these ideas to economic problems are discussed by Stokey and Lucas (1989, chs. 11–13). Futia's results were also employed in the sunspot literature [cf. Farmer and Woodford (1997), originally circulated in 1984 as a CARESS working paper; Chiappori and Guesnerie (1991, p. 1708)].
A sufficient condition for uniform ergodicity of an irreducible, aperiodic T-chain is that the state space can be reduced to a compact invariant set [see Meyn and Tweedie (1993, Th. 16.2.5, p. 395)]. The compact set argument was apparently introduced in the economic literature by Blume (1982), generalized by Duffie et al. (1994), and applied in a number of contexts, including the sunspot models [cf. Chiappori and Guesnerie (1991, pp. 1708–1710)]. See also Stokey and Lucas (1989, ch. 12).
The distinctive advantage of the approach adopted in this paper is that we do not assume compactness of the state space (or any other equivalent condition), but deduce uniform ergodicity from assumptions directly concerning the deterministic core of the dynamical system, on the one hand, and its stochastic perturbations, on the other. In the context of the sunspots problem, this means that we relate those assumptions to the properties of the perfect-foresight equilibrium and the agents' random beliefs, respectively.
Step 1. The Regularity Assumption 1, the Density Assumption 3, and the Forward accessibility Assumption 4, together imply that the Markov chain X is a T-chain {cf. Meyn and Tweedie [1993, Prop. 7.1.2, pp. 152–154 (scalar case) and 7.1.5, p. 157 (multidimensional case)]}.6
Meyn and Tweedie do not give a full proof of Proposition 7.1.5, but it can be found in Meyn and Caines (1991).
Step 2. From step 1 and the assumptions,
we conclude that any point x∈Λ is reachable, in the sense that for every neighborhood N of x,
, y∈B(Λ). But from Meyn and Tweedie (1993, Prop. 6.2.1, p. 133), we gather that, if X is a T-chain and the state space contains a reachable point x*, then X is ψ-irreducible with ψ=T(x*, ·). The second part of Assumption 2 (F is strongly topologically transitive on Λ), together with the above argument, implies that, for any integer m, the Markov chain X(m) associated with the kernel Pm is also ψ-irreducible. Thus, the Markov chain X is aperiodic.
Step 3. The fact that X is a ψ-irreducible, aperiodic T-chain also implies that every compact set is petite [cf. Meyn and Tweedie (1993, Th. 6.2.5, p. 134)].
Step 4. To complete our proof, we now need to prove the following auxiliary lemma.7
Our converse lemma is an extension of Gordon's converse theorem (Gordon, 1972, Th. 3, p. 79). The main difference is that Gordon considers stability of a fixed point and we deal with the more general case of a compact attractor. (Of course, our dynamical system is autonomous and Gordon's is not, but this is irrelevant here.) We do not want to insist too much on the originality of this result. It is possible that a generalization of Gordon's theorem exists already in the very vast mathematical literature on stability, but we could not find it.
LEMMA 1 (Converse Stability Lemma). If a compact subset of the state space
is uniformly asymptotically stable for the deterministic dynamical system (9), then there exists a real scalar function V(x) which satisfies the following properties for x in any bounded set U⊆B(Λ):
Proof. First, we need to recall a lemma by Massera (1949, pp. 716–717),
8. This lemma is mentioned by Gordon in the quoted article, but its source is misquoted. See also Hahn (1963, p. 70).
converge. Notice that Γ∈C1 and therefore it is Lipschitzian on a bounded set.
This result can be extended to the discrete-time case [cf. Hahn (1989, p. 235) and Gordon (1972, p. 78)]. If σ*(t) is a nonincreasing function, the application of the Massera Lemma and the integral test for convergence of a series guarantees the uniform convergence of the sums
where we put ρ(t)=1, Γ is the same as in (A.1), but we now have
.
Next, let us consider that the scalar function
(where Fs denotes the sth iterate of F) is defined for t≥0, positive for x∉Λ, and nonincreasing. Moreover, because the set Λ is uniformly asymptotically stable for any initial condition x∈B(Λ), that function converges to 0 as t→∞. If we now put c=1,
, from the Massera Lemma and the integral test for sequences, we conclude that the infinite sums
converge.
Let us now define the scalar function
Obviously,
, and therefore, for any scalar k≥1, (A.3) converges.
Moreover, for any two values x1, x2∈U, we have
where MΓ, MF are the Lipschitz constants of the functions Γ and F, respectively. We have used here the known fact that, for a given set
, the map
is Lipschitzian with Lipschitz constant equal to one. If we now put
the infinite sum
converges and we can write
where M<∞ is a constant. Property (c) then obtains.
If we now choose x2 as the point in Λ closest to x1, we shall have d(x2, Λ)=0, V(x2)=0, d(x1, Λ)=‖x1−x2‖ and
and property (b) follows.
Moreover, consider that, obviously, V(x)≥Γ[d(x, Λ)] and property (a) follows. Finally, consider that ΔDETV(xt)=−Γ[d(xt, Λ)]<0 for x∉Λ, which establishes property (d).
Step 5. Applying the result just established to the complete stochastic system (7), we have almost surely
where M<∞ is a constant. If the perturbation term
is sufficiently small (i.e., ε is sufficiently small), we can find a constant δε>0 (which can be made as small as we please by reducing ε) such that
From (A.4), it follows that [V(x)>Mδε]⇒[d(x, Λ)>δε]. Then, from (A.6) and the fact that Γ is increasing in its argument, it follows that, for sufficiently small ε, we can find a positive scalar constant ηε>Mδε such that the set Oε≡{x|V(x)≤ηε}⊂B(Λ) is absorbing; ηε, too, can be made arbitrarily small by reducing ε.
We can now show that, for the Markov chain constrained to the absorbing set Oε, the function V(x) defined above satisfies the so-called “geometric drift condition” for stochastic stability [Condition V(4), Meyn and Tweedie (1993, pp. 255, 367)]. Taking expectations
9. These are conditional expectations for a given value of xt∈Oε. Therefore, we have
Consider now that, in view of Assumption 5,
. Next, define the scalar quantity βε=[Γ(δε)−aε]/ηε and the set Cε={x|d(x, Λ)≤δε}, and notice that: (i) we can always fix ε (and thereby δε and aε) so that βε>0, and (ii) Cε is compact and therefore, because X is a T-chain, petite (cf. Step 3 above). Hence, V(x) satisfies Condition V(4),
10. The fact that, in Meyn and Tweedie's definition of Condition V(4), the range of V(x) is [1, ∞], rather than [0, ∞], is irrelevant here.
where
denotes the characteristic function of the set Cε and bε<∞. Because Cε is petite, from Meyn and Tweedie (1993, Lemma 15.2.2, p. 367), we deduce that the function V is unbounded off petite sets; that is, the “level set” Oε defined above is also petite.
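For the reader's convenience, Condition V(4), the geometric drift condition, can be restated in its standard Meyn and Tweedie form using the notation of this proof (a restatement, not a new result):

```latex
% Geometric drift condition V(4) [Meyn and Tweedie (1993, ch. 15)],
% restated in the notation of the present proof: for all x in the
% absorbing set O_eps,
\[
  \int P(x,\mathrm{d}y)\,V(y)
  \;\le\; (1-\beta_{\varepsilon})\,V(x)
  \;+\; b_{\varepsilon}\,\mathbf{1}_{C_{\varepsilon}}(x),
  \qquad x \in O_{\varepsilon},
\]
% with beta_eps > 0, b_eps < infinity, and C_eps petite.
```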
Then, from Meyn and Tweedie (1993, Th. 16.2.2, pp. 390–391), establishing that a ψ-irreducible, aperiodic Markov chain is uniformly ergodic if (and only if) the state space is petite, we conclude that the Markov chain X generated by (7), restricted to the absorbing set Oε, is uniformly ergodic. This proves point (a) of the Theorem. Point (b) of the Theorem follows because Meyn and Tweedie (1993, Th. 16.0.2, pp. 384–385) prove that (a) is equivalent to (b) (the former implies and is implied by the latter).
Finally, point (c) of the Theorem follows from Meyn and Tweedie (1993, Th. 13.0.1, pp. 309–310; A. 3, p. 500; and Th. 13.3.3, p. 323), establishing that for an aperiodic, ergodic (“positive Harris”) Markov chain, any initial probability measure converges in the total variation norm to the unique, finite invariant measure.
In this Appendix, we provide some additional explanations of the concept of forward accessibility and briefly discuss a method for ascertaining FA in simple models, such as those of Examples 1–4. For further technical details and proofs, see Meyn and Tweedie (1993, pp. 150–155). Unexplained symbols and notions are as in the main text of the paper.
Let us first recall from the text that to each system of stochastic difference equations like (1) we can associate a control system (ACS) whose trajectories are determined by the equations
where
is the state reached at time k starting from x, given a certain sequence
. Therefore,
defines the set of all the states reachable from an initial state x for any admissible sequence of perturbations, and
is the set of all states reachable from x at some time in the future.
Consider now equation (B.1) and the (n×k) matrix
of the partial derivatives of the function Tk with respect to its ξ-arguments.
Then [cf. Meyn and Tweedie (1993, Prop. 7.1.4, p. 156)], a necessary and sufficient condition for an ACS defined by (B.1) to be FA is that, for each initial condition
, there exist an integer k≥1 and a sequence of given values
such that
The matrix
is called the controllability matrix.
For scalar models, that is, those for which
, the “rank condition” for the associated ACS reduces to
Notice that, to verify the rank condition, and thereby FA, we must find an integer k≥1 and an admissible sequence such that the condition holds for any initial condition.
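In simple examples, the controllability matrix can also be approximated numerically rather than computed in closed form. The sketch below is our own illustration (the function name and the example are hypothetical): it differentiates the k-step map Tk with respect to its ξ-arguments by central finite differences and applies the rank test, checked against the linear growth model above, for which the answer, the columns Ab and b, is known exactly:

```python
import numpy as np

def controllability_matrix(T_map, x0, xis, h=1e-6):
    """Approximate the (n x k) controllability matrix
    [dT_k/dxi_1, ..., dT_k/dxi_k] by central finite differences.
    T_map(x, xi) is one step of the associated control system and
    xis = (xi_1, ..., xi_k) is the chosen admissible sequence."""
    def trajectory(seq):
        x = np.asarray(x0, dtype=float)
        for xi in seq:
            x = np.asarray(T_map(x, xi), dtype=float)
        return x
    cols = []
    for i in range(len(xis)):
        up, dn = list(xis), list(xis)
        up[i] += h
        dn[i] -= h
        cols.append((trajectory(up) - trajectory(dn)) / (2.0 * h))
    return np.column_stack(cols)

# Consistency check on the linear system with additive noise, where the
# exact controllability matrix has columns (A b, b):
alpha, r = 0.3, 0.5                       # illustrative values
A = np.array([[alpha + r, -alpha * r], [1.0, 0.0]])
b = np.array([1.0, 0.0])
step = lambda x, xi: A @ x + b * xi       # one step of the ACS
Ck = controllability_matrix(step, x0=[0.1, 0.2], xis=[0.0, 0.0])
assert np.linalg.matrix_rank(Ck) == 2
```

For nonlinear maps the matrix depends on the initial condition and on the chosen sequence, which is precisely why the rank condition must be checked for every initial condition, as noted above.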
Let us conclude by actually calculating the controllability matrix for the two-dimensional model discussed in Example 3, the overlapping generations model with production.
If we now choose k=2, from equation (17) we have
Therefore, the controllability matrix is the (2×2) matrix
and rank
, iff
.
Financial support from the Italian Ministry of the University (MURST) and the National Council of Research (CNR) is gratefully acknowledged. The author is greatly indebted to Sean Meyn for a very stimulating e-mail correspondence and many insightful comments. Many thanks are also due to Alain Monfort and Sergio Invernizzi, whose comments prompted the author to tidy up the proof of Theorem 1, to an Associate Editor of MD, and two anonymous referees for many helpful comments and criticisms. The author is of course solely responsible for any remaining errors or misunderstandings.