1. Introduction
The mathematical theory of chemical reaction networks has attracted massive attention over the past two decades due to its wide-ranging applications in biology, chemistry, ecology, and epidemics [Reference Anderson and Kurtz8]. If a reaction system is well mixed and the numbers of molecules are very large, random fluctuations can be ignored and the evolution of the concentrations of all chemical species can be modeled deterministically as a set of ordinary differential equations based on the law of mass action. If the chemical species are present in low numbers, however, random fluctuations can no longer be ignored, and the evolution of the system is usually modeled stochastically as a continuous-time Markov chain on a high-dimensional lattice, which is widely known as a density-dependent Markov chain [Reference Ethier and Kurtz14]. The Kolmogorov forward equation for a density-dependent Markov chain is the well-known chemical master equation, which was first introduced by Delbrück [Reference Delbrück13]. At the center of the mathematical theory of chemical reaction networks is a limit theorem proved by Kurtz [Reference Kurtz31–Reference Kurtz33], which states that when the system size tends to infinity, the trajectories of the stochastic model of a reaction system will converge to those of the deterministic model over any compact time interval, whenever the initial condition converges. Thus far, stochastic reaction networks have served as a fundamental model for the single-cell stochastic gene expression dynamics of gene regulatory networks [Reference Hornos20, Reference Jia22–Reference Jia, Wang, Yin and Zhang25, Reference Kumar, Platini and Kulkarni30, Reference Peccoud and Ycart39, Reference Shahrezaei and Swain42]. Recently, the limit theorem of Kurtz has been generalized to stochastic gene regulatory networks with bursting dynamics [Reference Chen and Jia9, Reference Jia, Zhang and Qian26].
The limit theorem of Kurtz [Reference Kurtz32] can be viewed as the law of large numbers for stochastic reaction networks. The corresponding large deviation theory and the Freidlin–Wentzell-type metastability theory for stochastic reaction networks have also been studied by many authors [Reference Agazzi, Dembo and Eckmann1, Reference Agazzi and Mattingly3, Reference Anderson, Cappelletti, Kim and Nguyen5, Reference Lazarescu, Cossetto, Falasco and Esposito34, Reference Li and Lin36, Reference Shwartz and Weiss43] and were rigorously established by Agazzi et al. under the mass action kinetics [Reference Agazzi, Dembo and Eckmann2]. At the center of the metastability theory is a quantity called quasi-potential, which plays a crucial role in the analysis of the exit time and exit distribution from a basin of attraction, as well as the most probable transition paths between multiple attractors when the system size is large [Reference Olivieri and Vares38]. However, the quasi-potential is usually defined via an abstract variational expression, and hence it is very difficult to obtain a general analytical expression for it.
There are two types of reaction networks that should be distinguished: those satisfy detailed balance and those violate detailed balance. In terms of physical chemistry, detailed balance is a fundamental thermodynamic constraint for closed systems. If there is no sustained energy supply, then a chemical system, when it reaches the steady state, must satisfy detailed balance [Reference Qian40]. In the modeling of many biochemical systems, such as enzymes [Reference Cornish-Bowden12] and ion channels [Reference Sakmann and Neher41], detailed balance has become a basic requirement [Reference Alberty4, Reference Jia21]. However, in the literature, there are two different definitions of detailed balance for a chemical reaction network. From the deterministic perspective, detailed balance means that there is no net concentration flux between any pair of reversible reactions, in which case there is no chemical potential difference and thus the system is in chemical equilibrium. From the stochastic perspective, detailed balance means that there is no net probability flux between any pair of microstates on the high-dimensional nonnegative integer lattice, where each microstate is defined as the ordered tuple of concentrations of all chemical species. To distinguish between these, we refer to the former as deterministic detailed balance and the latter as stochastic detailed balance. Some authors have believed the two types of detailed balance to be equivalent [Reference Anderson, Craciun and Kurtz7]. However, Joshi [Reference Joshi27] pointed out recently that they are not equivalent; they are equivalent when different reactions lead to different species changes, while they are in general not equivalent for systems having two reactions that lead to the same species change—deterministic detailed balance implies stochastic detailed balance, and the opposite is not true.
In this paper, in addition to deterministic and stochastic detailed balance, we propose a third type of detailed balance, which is called local detailed balance. This new type of detailed balance characterizes the local asymptotic behavior of a reaction network as the system size tends to infinity. We prove that the three types of detailed balance (deterministic, stochastic, and local) are equivalent when different reactions lead to different species changes, while local detailed balance is even weaker than the other two when some different reactions lead to the same species change—stochastic detailed balance implies local detailed balance, and the opposite is not true. This is the first main result of the present paper. More importantly, under the condition of local detailed balance, we prove that a stochastic reaction network has a global potential that can be computed explicitly and concisely. The global potential reduces to the quasi-potential within each basin of attraction. In general, the quasi-potential is only defined within each basin of attraction. However, local detailed balance guarantees that the system has a global potential that can be defined over the whole space. This is the second main result of the present paper.
In [Reference Joshi27], Joshi gave the sufficient and necessary condition for deterministic detailed balance. While [Reference Joshi27] also provided a weaker sufficient condition for stochastic detailed balance, it is difficult to apply this condition in practice, because an infinite number of restrictions need to be verified. In this paper, we provide a simpler sufficient condition for stochastic detailed balance that is more applicable in practice. This new sufficient condition is imposed directly on rate constants, and only a finite number of restrictions need to be verified. This is the third main result of the present paper.
This paper is organized as follows. In Section 2, we recall the basic concepts of chemical reaction networks and state some preliminary results. In Section 3, we reveal the relationship among the four types of detailed balance and give some counterexamples. In Section 4, we show that a global potential exists for stochastic reaction networks satisfying local detailed balance and obtain an explicit expression for the global potential. The remaining sections are devoted to detailed proofs of the main theorems.
2. Model and preliminary results
Let $\mathbb{Z}_{\geq 0},\mathbb{R}_{\geq 0}$ , and $\mathbb{R}_{>0}$ denote the sets of nonnegative integers, nonnegative real numbers, and positive real numbers, respectively. Recall that a chemical reaction system is composed of a collection of chemical species $\{S_1,\dots,S_d\}$ and a family of reactions
where $\nu^j_i,\nu'^j_i\in\mathbb{Z}_{\geq 0}$ are the molecule numbers of $S_i$ consumed and created, respectively, in one instance of that reaction. For simplicity, we write $\nu_{i} = (\nu_i^1,\dots,\nu_{i}^d)$ and $\nu'_{\!\!i}=(\nu'^1_i,\dots,\nu'^d_{i})$ , which are called complexes. Then the reaction $R_i$ can be written more concisely as $\nu_i \rightarrow \nu'_{\!\!i}$ . Moreover, $\nu'_{\!\!i}-\nu_i$ is called the reaction vector of $R_i$ . Let
denote the collections of chemical species, complexes, and reactions, respectively. Then the ordered triple $\{\mathcal{S},\mathcal{C},\mathcal{R}\}$ is called a chemical reaction network.
A chemical reaction network is called reversible if for any reaction $R_i: \nu_i\rightarrow\nu'_{\!\!i}\in\mathcal{R}$ , there exists a reverse reaction $R^-_i:\nu'_{\!\!i}\rightarrow \nu_i\in\mathcal{R}$ [Reference Joshi27]. For any pair of reversible reactions $R_i$ and $R_i^-$ , we say that $R_i$ is a forward reaction if $\nu_i<\nu'_{\!\!i}$ , where the symbol $<$ is understood in the lexicographic order (i.e. $\nu_i<\nu'_{\!\!i}$ if and only if $\nu_i^j<\nu'^j_i$ for the first j at which $\nu_i^j$ and $\nu'^j_i$ differ); otherwise, $R_i$ is called a backward reaction. Throughout the paper, we assume that all reaction networks under consideration are reversible.
Most previous papers have assumed that different reactions have different reaction vectors. However, in many reaction networks, different reactions may have the same reaction vector. For example, the reaction $S_1\rightarrow S_2$ and the enzyme-catalyzed reaction $S_1+E\rightarrow S_2+E$ , where E denotes an enzyme, may coexist in a biochemical reaction system, with the latter having a larger reaction rate. To cover such systems, here we consider the more general case where each reaction vector may correspond to multiple reactions. For convenience, we introduce the following definition, which plays an important role in the present paper.
Definition 1. Two reactions are called equivalent if they have the same reaction vector.
From this definition, two equivalent reactions are either both forward or both backward. Following the notation in [Reference Joshi27], let $V(\mathcal{R})=\{\nu'_{\!\!i}-\nu_i|R_i \ \text{is a forward reaction}\}$ denote the collection of reaction vectors for forward reactions. Throughout the paper, the elements in $V(\mathcal{R})$ will be listed as $\omega_1,\cdots,\omega_r$ , where $1\leq r\leq N$ . For any $\omega_p = (\omega_p^1,\cdots,\omega_p^d)\in V(\mathcal{R})$ , we set
Then we can relabel the elements in $\mathcal{R}^+_p$ and $\mathcal{R}^-_p$ as
where $r_p$ represents the number of forward reactions with the same reaction vector $\omega_p$ ; we call $r_p$ the multiplicity of $\omega_p$ . Here we mainly focus on the case of $r_p > 1$ for some $1\leq p\leq r$ . For any $1\leq p \leq r$ and $1\leq l \leq r_p$ , we set $\nu_{pl}=(\nu^1_{pl},\dots,\nu^d_{pl})$ and $\nu'_{\!\!pl}=(\nu'^1_{\!\!pl},\dots,\nu'^d_{\!pl})$ .
We first recall the stochastic model of reaction networks. For each $1\leq j\leq d$ , let $N_j(t)$ denote the number of molecules of the chemical species $S_j$ at time t. Then the concentration of $S_j$ at time t is given by $X^V_j(t) = N_j(t)/V$ , where V is the system size. Let $X^V(t)=(X^V_1(t),\dots,X^V_d(t))$ denote the concentration process of all chemical species. At the mesoscopic level, the process $\{X^V(t):t\geq 0\}$ can be modeled by a continuous-time Markov chain on the d-dimensional nonnegative integer lattice
with transition rate matrix $Q^V = (q^V_{x,y})$ whose elements are defined as follows: for any $1\leq p\leq r$ and any $x\in E_V$ ,
where $|\nu|=\sum_{j=1}^d\nu_j$ is called the order of the complex $\nu = (\nu_1,\cdots,\nu_d)$ , and we write $x!=\prod_{j=1}^d x_j!$ for each vector $x = (x_1,\cdots,x_d)\in\mathbb{Z}^d_{\geq 0}$ . Let $\mathbb{P}^V_{x}(t)$ denote the probability of observing state $x\in E_V$ at time t. Then the evolution of the stochastic model is governed by the following chemical master equation:
We next recall the deterministic model of reaction networks. For each $1\leq j\leq d$ , let $x_j(t)$ denote the concentration of the chemical species $S_j$ at time t. At the macroscopic level, the concentration process $x(t)=(x_1(t),\dots,x_d(t))$ of all chemical species can be modeled by the following ordinary differential equation with mass action kinetics:
with $x_0$ denoting the initial concentration vector and
where we write $x^y=\prod_{j=1}^d x_j^{y_j}$ for any vectors $x,y\in\mathbb{R}^d_{\geq 0}$ . The relationship between the mesoscopic stochastic model and the macroscopic deterministic model is revealed by the following celebrated Kurtz theorem [Reference Kurtz31, Reference Kurtz32]: for any $\delta,T>0$ , whenever $x_0^V\in E_V$ and $x_0^V\rightarrow x_0\in\mathbb{R}^d_{\geq 0}$ ,
where $\mathbb{P}_{x_0^V}(\!\cdot\!) = \mathbb{P}(\!\cdot|X^V(0) = x_0^V)$ and $\|x\|$ denotes the Euclidean norm of $x\in \mathbb{R}^d$ . This implies that as the system size tends to infinity, the trajectories of the stochastic model will converge to those of the deterministic model on any compact time interval, whenever the initial value converges.
The limit theorem in (3) can be viewed as the law of large numbers for the stochastic model. The corresponding large deviation principle was proved recently by Agazzi et al. [Reference Agazzi, Dembo and Eckmann2, Theorem 1.6] and is stated as follows. The Hamiltonian of a stochastic reaction network is defined as
where $x\cdot y = \sum_{j=1}^dx_jy_j$ denotes the usual scalar product on $\mathbb{R}^d$ . The Lagrangian of a stochastic reaction network is then defined as the Legendre–Fenchel transform of the Hamiltonian with respect to the variable $\theta$ , namely
The Lagrangian is nonnegative because $L(x,y)\geq 0\cdot y-H(x,0) = 0$ . Moreover, it is not hard to prove that $L(x,y) = \infty$ for any $y\notin\mathrm{span}(V(\mathcal{R}))$ . This is because any $y\notin\mathrm{span}(V(\mathcal{R}))$ can be decomposed uniquely as $y = y_1+y_2$ , where $y_1\in\mathrm{span}(V(\mathcal{R}))$ and $0\neq y_2\in\mathrm{span}(V(\mathcal{R}))^\perp$ . Thus for any $K>0$ , we have $L(x,y)\geq Ky_2\cdot y-H(x,Ky_2) = K\|y_2\|^2$ , where we have used the fact that $H(x,\theta) = 0$ for any $\theta\in\mathrm{span}(V(\mathcal{R}))^\perp$ . Since K is arbitrarily chosen, we conclude that $L(x,y) = \infty$ .
To proceed, let $D_{[0,T]}(\mathbb{R}^d_{\geq 0})$ denote the space of càdlàg functions $\phi:[0,T]\rightarrow \mathbb{R}^d_{\geq 0}$ , equipped with the topology of uniform convergence. For any $x_0\in\mathbb{R}^d_{\geq 0}$ , let $I_{x_0,T}:D_{[0,T]}(\mathbb{R}^d_{\geq 0})\rightarrow[0,\infty]$ be the function defined as
Using the properties of the Lagrangian, it is easy to see that $I_{x_0,T}(\phi) = \infty$ if there exists $0\leq t\leq T$ such that $\phi(t)\notin x_0+\mathrm{span}(V(\mathcal{R}))$ . With this notation, Agazzi et al. proved the following result [Reference Agazzi, Dembo and Eckmann2, Theorem 1.6]: provided that the network is asiphonic and strongly endotactic (see [Reference Agazzi, Dembo and Eckmann2, Definitions 1.8 and 1.9] for detailed definitions), for any $x^V_0\in E_V$ and $x^V_0\rightarrow x_0$ , the law of the process $\{X^V(t):t\in [0,T]\}$ with $X^V(0) = x^V_0$ satisfies a large deviation principle with rate V and good rate function $I_{x_0,T}$ . The large deviation principle means that for any measurable set $\Gamma\subset D_{[0,T]}(\mathbb{R}^d_{\geq 0})$ , we have
where $\Gamma^o$ and $\bar\Gamma$ denote the interior and closure of $\Gamma$ , respectively. Combining (3) and (6), it is easy to see that $I_{x_0,T}(x)=0$ , where $x = x(t)$ is the solution of the deterministic model (1). The rate function can be used to define the following quasi-potential:
Intuitively, $W(x_0,y)$ represents the ‘cost’ for the stochastic reaction network to move from $x_0$ to y. It is easy to see that the quasi-potential is nonnegative and jointly continuous in $x_0$ and y [Reference Olivieri and Vares38]. Using the properties of the Lagrangian, it is easy to see that $W(x_0,y) = \infty$ if $y\notin x_0+\mathrm{span}(V(\mathcal{R}))$ .
Agazzi et al. [Reference Agazzi, Dembo and Eckmann2, Theorem 1.15] also deal with the Freidlin–Wentzell-type metastability theory for chemical reaction networks, where the quasi-potential plays a central role. For simplicity, we consider the case where the domain under consideration contains only one stable equilibrium point. Specifically, we assume that the following four conditions are satisfied:
-
(a) D is a bounded open domain in $\mathbb{R}^d_{\geq 0}$ with a piecewise $C^2$ boundary $\partial D$ .
-
(b) The point $c\in D$ is an asymptotically stable equilibrium point of the deterministic model (1).
-
(c) The set $\bar{D} = D\cup\partial D$ is attracted to c, which means that whenever $x_0\in\bar D$ , the solution of the deterministic model (1) starting from $x_0$ satisfies $x(t)\in D$ for each $t>0$ and $x(t)\rightarrow c$ as $t\rightarrow\infty$ .
-
(d) There exists a ball $B\subset\bar D$ such that for any $x\in B$ and $y\in\bar D$ , the set $\bar D$ contains the line segment between x and y.
It is easy to check that Assumptions A.3 and A.4 in [Reference Agazzi, Dembo and Eckmann2] are satisfied under these conditions. Then the Kurtz theorem implies that when V is sufficiently large, the trajectory of the stochastic model will stay in the domain D over any compact time interval with overwhelming probability. However, it is still possible for the system to escape from D. The mean exit time from D has the following asymptotic behavior:
where $\tau_V=\inf\{t\geq 0:X^V(t)\notin D\}$ denotes the exit time of $X^V$ from D. Moreover, if there is a unique $y_0\in\partial D$ such that
then for any $\delta>0$ , the exit position from D has the asymptotic behavior
and for any $\delta>0$ and $z_0\in\partial D$ , the exit distribution from D has the asymptotic behavior
Intuitively, when V is sufficiently large, the stochastic model will escape from D around a particular point $y_0\in\partial D$ at which the quasi-potential restricted to $\partial D$ attains its minimum.
3. Detailed balance for chemical reaction networks
In this section, we investigate the relationship among different types of detailed balance conditions for chemical reaction networks. Before stating our results, we first recall the definitions of deterministic and stochastic detailed balance for chemical reaction networks [Reference Joshi27].
Definition 2. We say that a reaction network satisfies deterministic detailed balance (or reaction network detailed balance [Reference Joshi27]) if there exists $c\in\mathbb{R}^d_{>0}$ such that
Here c is called a chemical equilibrium state of the reaction network.
Clearly, any chemical equilibrium state c is also an equilibrium point of the deterministic model (1), and thus it is also called a detailed balanced equilibrium point. It has been shown that for mass action kinetics, if one positive equilibrium point of the deterministic model is detailed balanced, then every positive equilibrium point is detailed balanced [Reference Feinberg15, Reference Joshi27].
Definition 3. We say that a reaction network satisfies stochastic detailed balance (or Markov chain detailed balance [Reference Joshi27]) if for any $V>0$ , there exists a probability measure $\pi^V = (\pi^V_x)$ on $E_V$ such that
Note that here we do not require $\pi^V$ to be a probability distribution.
A simple method of verifying stochastic detailed balance is to use the Kolmogorov criterion [Reference Kolmogoroff29], which states that a reaction network satisfies stochastic detailed balance if and only if for any $V>0$ , the transition rates satisfy the following Kolmogorov cycle condition:
for any finite number of states $x_1,\dots,x_n\in E_V$ . In other words, the Kolmogorov criterion states that a reaction network satisfies stochastic detailed balance if and only if for any $V>0$ , the product of the transition rates of the stochastic model along any cycle is equal to that along the reversed cycle.
Besides deterministic and stochastic detailed balance, we introduce another type of detailed balance which is defined as follows. This new type of detailed balance will play an important role in constructing the global potential of a chemical reaction network.
Definition 4.
-
(i) We say that a reaction network satisfies zero-order local detailed balance if for any integers $\xi_1,\dots,\xi_r$ satisfying $\sum_{p=1}^r \xi_p\omega_p = 0$ , we have
(9) \begin{equation}\sum_{p=1}^r \xi_p\log \frac{f^+_p(x)}{f^-_p(x)}=0,\;\;\;\text{for any}\,x\in \mathbb{R}_{>0}^d,\end{equation}where $f^+_p(x)$ and $f^-_p(x)$ are the functions defined in (2). -
(ii) We say that a reaction network satisfies first-order local detailed balance if for any $1\leq p,q\leq r$ with $p\neq q$ , we have
(10) \begin{equation}\omega_{q}\cdot\nabla\left(\log \frac{f^+_{p}(x)}{f^-_{p}(x)}\right)= \omega_{p}\cdot\nabla\left(\log \frac{f^+_{q}(x)}{f^-_{q}(x)}\right),\;\;\;\text{for any}\;x\in \mathbb{R}_{>0}^d.\end{equation} -
(iii) We say that a reaction network satisfies local detailed balance if it satisfies both zero-order and first-order local detailed balance.
Remark 1. The ideas behind the above definition are explained as follows. For any integers $\xi_1,\dots,\xi_r$ satisfying $\sum_{p=1}^r\xi_p\omega_p = 0$ , we can construct a cycle C in the integer lattice $\mathbb{Z}^d$ , which is given by
Here $\mathrm{sgn}(x)$ is the sign function, which takes the value of 1 if $x>0$ , takes the value of 0 if $x=0$ , and takes the value of $-1$ if $x<0$ . Obviously, for any $x^V\in E_V$ and $x^V\rightarrow x\in\mathbb{R}^d_{>0}$ , the cycle C can induce a cycle $x^V+C/V$ in $E_V$ around x, which becomes smaller as V increases. For convenience, let $\eta_1\rightarrow\eta_2\rightarrow\dots\rightarrow \eta_L\rightarrow\eta_1$ denote the induced cycle in $E_V$ , where $L = \sum_{p=1}^r|\xi_p|$ is the number of transitions in the cycle. If a reaction network satisfies stochastic detailed balance, then it follows from Kolmogorov’s cycle condition that
Note that the left-hand side of this equality is a function of $1/V$ . Since $f(1/V) = 0$ for all $V>0$ , we have
and
Roughly speaking, the condition (9) extracts the zero-order information of the equality (11) as $V\rightarrow\infty$ , and the condition (10) extracts the first-order information of the equality (11) as $V\rightarrow\infty$ . Since the induced cycle becomes smaller as V increases, (9) and (10) actually contain, respectively, the zero-order and the first-order local information of detailed balance around $x\in\mathbb{R}^d_{>0}$ .
It is a well-known result that deterministic detailed balance implies stochastic detailed balance for a chemical reaction network [Reference Joshi27, Theorem 5.9]. The following theorem reveals the relationship between stochastic and local detailed balance.
Theorem 1. If a reaction network satisfies stochastic detailed balance, then it also satisfies local detailed balance. In other words, stochastic detailed balance implies local detailed balance.
Proof. The proof of the theorem will be given in Section 5.
The next corollary follows immediately from Theorem 1 and [Reference Joshi27, Theorem 5.9].
Corollary 1. For a chemical reaction network, the following statements hold:
-
(a) Deterministic detailed balance implies stochastic detailed balance.
-
(b) Stochastic detailed balance implies local detailed balance.
-
(c) Local detailed balance implies zero-order local detailed balance.
The above corollary reveals the inclusion relationship among the four types of detailed balance—deterministic, stochastic, local, and zero-order local detailed balance—as illustrated in Figure 1. Deterministic detailed balance is the strongest and zero-order local detailed balance is the weakest. The following proposition reveals when the four types of detailed balance are equivalent.
Proposition 1. If a chemical network has no equivalent reactions, then the following statements are equivalent:
-
(a) The network satisfies deterministic detailed balance.
-
(b) The network satisfies stochastic detailed balance.
-
(c) The network satisfies local detailed balance.
-
(d) The network satisfies zero-order local detailed balance.
Proof. By Corollary 1, we only need to prove that (d) implies (a). If the network satisfies zero-order local detailed balance, for any integers $\xi_1,\cdots,\xi_r$ satisfying $\sum_{p=1}^r\xi_p\omega_p=0$ , we have
Since the network has no equivalent reactions, we have $r_p=1$ for any $1\leq p\leq r$ . This shows that
where $\log x = (\log x_1,\cdots,\log x_d)$ . Combining the above two equations shows that
Thus we conclude that for any integers $\xi_1,\cdots,\xi_r$ satisfying $\sum_{p=1}^r\xi_p(\nu'_{\!\!p1}-\nu_{p1}) = 0$ , we have
This is exactly the so-called Wegscheider cycle condition, which is widely known as the sufficient and necessary condition for deterministic detailed balance [Reference Gorban and Yablonsky17, Proposition 1]. Therefore, we have proved that (d) implies (a).
We have seen that if a chemical network has no equivalent reactions, then the four types of detailed balance are equivalent. For chemical networks having equivalent reactions, however, the four types of detailed balance are no longer equivalent, as can be seen from the following three counterexamples.
The first example [Reference Joshi27, Reference Vellela and Qian44] gives a reaction network that satisfies stochastic detailed balance but violates deterministic detailed balance.
Example 1. Consider the following well-known Schlögl model [Reference Vellela and Qian44]:
The stochastic model of this reaction network is a one-dimensional birth–death process and thus must satisfy stochastic detailed balance. Moreover, it is easy to check that deterministic detailed balance is satisfied if and only if $k^+_{11}/k^-_{11}=k^+_{12}/k^-_{12}$ [Reference Vellela and Qian44]. In other words, if $k^+_{11}/k^-_{11}\neq k^+_{12}/k^-_{12}$ , then deterministic detailed balance is violated.
The next example gives a reaction network that satisfies local detailed balance but violates stochastic detailed balance.
Example 2. Consider the following chemical reaction system:
By definition, the forward reactions are given by
and the backward reactions are given by
It is easy to see that the first two forward reactions have the same reaction vector $\omega_1=1$ , and the last three forward reactions also have the same reaction vector $\omega_2=2$ . The multiplicities of the two reaction vectors are given by $r_1=2$ and $r_2=3$ , respectively. We first prove that the system satisfies local detailed balance. Clearly, the two reaction vectors are linearly related by $\xi_1\omega_1+\xi_2\omega_2 = 0$ with $\xi_1 = 2$ and $\xi_2 = -1$ . It is easy to check that
Therefore, we have
which shows that zero-order local detailed balance is satisfied. Moreover, we have
which shows that first-order local detailed balance is also satisfied. We next prove that the system violates stochastic detailed balance. For each $V>0$ , consider the following cycle in $E_V$ :
The transition rates along this cycle and its reversed cycle are given by
Direct computation shows that
It is easy to check that the left-hand side of this equation is equal to 1 if and only if $k_1^+/k^-_1=k^+_2/k^-_2$ , which means that stochastic detailed balance is violated if $k_1^+/k^-_1\neq k^+_2/k^-_2$ .
The third example gives a reaction network that satisfies zero-order local detailed balance but violates local detailed balance.
Example 3. Consider the following chemical reaction system:
By definition, the forward reactions are given by
and the backward reactions are given by
The first two forward reactions have the same reaction vector $\omega_1=(1,0)$ , the next two forward reactions have the same reaction vector $\omega_2=(0,1)$ , and the last forward reaction has the reaction vector $\omega_3=(1,1)$ . The multiplicities of the three reaction vectors are given by $r_1 = r_2=2$ and $r_3=1$ , respectively. We first prove that the system satisfies zero-order local detailed balance. Clearly, the three reaction vectors are linearly related by $\xi_1\omega_1+\xi_2\omega_2+\xi_3\omega_3 = 0$ with $\xi_1 = \xi_2= 1$ and $\xi_3 = -1$ . It is easy to check that
Therefore, we have
which shows that zero-order local detailed balance is satisfied. On the other hand, it is easy to check that
Clearly, the left-hand sides of the above two equations are equal if and only if $k_1^+/k^-_1=k^+_2/k^-_2$ . In other words, if $k_1^+/k^-_1\neq k^+_2/k^-_2$ , then first-order local detailed balance is violated and thus the system does not satisfy local detailed balance.
In [Reference Joshi27, Theorem 4.2], Joshi gave the following sufficient and necessary condition for deterministic detailed balance.
Theorem 2. ([Reference Joshi27, Theorem 4.2]) A reaction network satisfies deterministic detailed balance if and only if the following two conditions are satisfied:
-
(a) For any $1\leq p\leq r$ , the rate constants of the reactions in $\mathcal{R}^+_p$ and $\mathcal{R}^-_p$ satisfy
\begin{equation*}\frac{k^+_{p1}}{k^-_{p1}} = \frac{k^+_{p2}}{k^-_{p2}} = \cdots = \frac{k^+_{pr_p}}{k^-_{pr_p}}.\end{equation*} -
(b) For any integers $\xi_1,\cdots,\xi_r$ satisfying $\sum_{p=1}^r\xi_p\omega_p=0$ , we have
(13) \begin{equation}\sum_{p=1}^r\xi_p\log\frac{k^+_{p1}}{k^-_{p1}} = 0.\end{equation}
While Joshi [Reference Joshi27, Theorem 5.14] also gave a sufficient condition for stochastic detailed balance, it is difficult to apply this condition in practice, because an infinite number of restrictions need to be verified. Here we give a simpler sufficient condition for stochastic detailed balance that is more applicable in practice. To state our sufficient condition, we need the following definition.
Definition 5. For each $1\leq p\leq r$ , we say that the reaction vector $\omega_p$ satisfies the orthogonality condition if
It is easy to see that if $r_p = 1$ , then the orthogonality condition is automatically satisfied for $\omega_p$ . The following theorem provides a new sufficient condition for stochastic detailed balance. This sufficient condition is imposed directly on rate constants and only a finite number of restrictions need to be verified.
Theorem 3. Suppose that the reaction vectors $\omega_1,\cdots,\omega_r$ are linearly independent. Suppose that for each $1\leq p\leq r$ , either one of the following two conditions is satisfied: The reaction vector $\omega_p$ satisfies the orthogonality condition.
-
(a) The reaction vector $\omega_p$ satisfies the orthogonality condition.
-
(b) The rate constants of the reactions in $\mathcal{R}^+_p$ and $\mathcal{R}^-_p$ satisfy
Then the reaction network satisfies stochastic detailed balance.
Proof. The proof of this theorem will be given in Section 6.
From Theorem 2, if the reaction vectors $\omega_1,\cdots,\omega_r$ are linearly independent, then (13) holds trivially and hence the reaction network satisfies deterministic detailed balance if and only if (14) is satisfied for all $1\leq p\leq r$ .
We next use Theorem 3 to construct more examples of chemical reaction networks that satisfy stochastic detailed balance but violate deterministic detailed balance.
Example 4. Consider the following chemical reaction system:
By definition, the forward reactions are given by
and the backward reactions are given by
The first two forward reactions have the same reaction vector $\omega_1=(1,0)$ , and the last two forward reactions have the same reaction vector $\omega_2=(1,-1)$ . The multiplicities of the two reaction vectors are given by $r_1 = r_2=2$ , and the two reaction vectors are linearly independent. Moreover, we have
It is easy to check that $(\nu_{22}^j-\nu^j_{21})\omega_1^j=0$ for $j=1,2$ . This shows that the orthogonality condition is satisfied for the reaction vector $\omega_2$ . By Theorem 3, the reaction network satisfies stochastic detailed balance if the condition (b) holds for $p = 1$ ; that is, $k^+_{11}/k^-_{11}=k^+_{12}/k^-_{12}$ . Thus, if $k^+_{11}/k^-_{11}=k^+_{12}/k^-_{12}$ but $k^+_{21}/k^-_{21}\neq k^+_{22}/k^-_{22}$ , then the system satisfies stochastic detailed balance but violates deterministic detailed balance.
4. Global potential for chemical reaction networks
In this section, we investigate the quasi-potential of chemical reaction networks, which plays a central role in the Freidlin–Wentzell-type metastability theory. The quasi-potential defined in (7) has two important features: (i) it is locally defined within each basin of attraction and in general cannot be globally defined over the whole space, and (ii) it is defined in the variational form, which is usually too complicated to be computed explicitly. The following theorem shows that, under the condition of local detailed balance, the quasi-potential not only can be defined globally over the whole space but also has an explicit and concise expression.
In the following, we always assume that the stochastic model $X^V$ starts from some $x^V_0\in E_V$ which satisfies $x^V_0\rightarrow x_0\in\mathbb{R}^d_{\geq0}$ as $V\rightarrow\infty$ . Recall that the critical points of a function $U\in C^1(\mathbb{R}^d_{>0})$ are those points in $\mathbb{R}^d_{>0}$ at which $\nabla U$ vanishes.
Theorem 4. Suppose that a reaction network satisfies local detailed balance. Let $\{\omega_{i_1},\cdots,\omega_{i_m}\}$ be an arbitrary basis of ${\rm span}(V(\mathcal{R}))$ and let $M = (\omega_{i_1}^T,\cdots,\omega^T_{i_m})$ be a $d\times m$ matrix, where m is the dimension of ${\rm span}(V(\mathcal{R}))$ . Let $F:\mathbb{R}^d_{>0}\rightarrow\mathbb{R}^d$ be a vector field defined as
Then the following five statements hold:
-
(a) The definition of the vector field F is independent of the choice of the basis of ${\rm span}(V(\mathcal{R}))$ . In addition, for any $1\leq p\leq r$ , we have
(16) \begin{equation}F(x)\cdot\omega_p =\log\frac{f^+_p(x)}{f^-_p(x)},\;\;\;x\in\mathbb{R}^d_{>0}.\end{equation} -
(b) The vector field F has a potential function $U\in C^\infty(\mathbb{R}^d_{>0})$ , i.e.,
(17) \begin{equation}F(x) = -\nabla U(x),\;\;\;x\in\mathbb{R}^d_{>0}\cap(x_0+\mathrm{span}(V(\mathcal{R}))).\end{equation} -
(c) The potential function U satisfies
\begin{equation*}\frac{{\rm d}}{{\rm d}t}U(x(t))\leq 0,\;\;\;t\geq 0,\end{equation*}where $x \,=\, x(t)$ is the solution of the deterministic model (1), and equality holds if and only if the deterministic model starts from any one of its equilibrium points. -
(d) The critical points of U within $\mathbb{R}^d_{>0}\cap(x_0+\mathrm{span}(V(\mathcal{R})))$ are also the equilibrium points of the deterministic model (1).
-
(e) Let $c\in\mathbb{R}^d_{>0}\cap(x_0+\mathrm{span}(V(\mathcal{R})))$ be an equilibrium point of the deterministic model (1). If $y\in\mathbb{R}^d_{>0}$ is attracted to c for the deterministic model (1), then
(18) \begin{equation}W(c,y) = U(y)-U(c),\end{equation}where W is the quasi-potential defined in (7).
Proof. The proof of this theorem will be given in Section 7.
It is easy to see that if two potential functions $U_1,U_2\in C^\infty(\mathbb{R}^d_{>0})$ both satisfy (17), i.e., if
then $U_1$ and $U_2$ must coincide with each other (up to a constant) on $\mathbb{R}^d_{>0}\cap(x_0+\mathrm{span}(V(\mathcal{R})))$ .
Definition 6. The potential function $U$ introduced in Theorem 4 is called the global potential of the reaction network.
Remark 2. In the classic Freidlin–Wentzell theory for randomly perturbed dynamical systems [Reference Freidlin and Wentzell16], it was shown that when detailed balance is satisfied, the system has a global potential that can be computed explicitly [Reference Freidlin and Wentzell16, Chapter 4, Theorem 3.1]. The above theorem is actually the counterpart of this result for stochastic reaction networks. It shows that under the condition of local detailed balance, we are able to construct a global potential of the reaction network, which is exactly the same as the quasi-potential (up to a constant) within each basin of attraction.
Remark 3. Combining Theorems 1 and 4, we can see that stochastic detailed balance ensures the existence of a global potential. From the deterministic perspective, the vector field F defined in (15) can be understood as the chemical force of the deterministic model. Recall that F has a potential function if and only if the line integral of F (the work exerted by the chemical force) along any smooth closed curve is zero. Similarly, from the stochastic perspective, the chemical force of the stochastic model between any two states $x,y\in E_V$ is defined as $\log q^V_{x,y}/q^V_{y,x}$ [Reference Zhang, Qian and Qian47, Theorem 2.5]. Then the Kolmogorov cycle condition guarantees that the work exerted by the chemical force along any cycle is zero; that is,
for any cycle $x_1\rightarrow x_2\rightarrow\cdots\rightarrow x_n\rightarrow x_1$ . This intuitively explains why stochastic detailed balance implies the existence of a global potential.
The following corollary shows that the global potential U satisfies the time-independent Hamilton–Jacobi equation.
Corollary 2. Suppose that a reaction network satisfies local detailed balance. Then we have
where $H(x,\theta)$ is the Hamiltonian defined in (4) and F(x) is the vector field defined in (15). In particular, we have
where U(x) is the global potential introduced in Theorem 4.
Proof. Since the network satisfies local detailed balance, it follows from Theorem 4(a) that
The rest of the proof follows immediately from Theorem 4(b).
If a reaction network satisfies deterministic detailed balance, then any detailed balanced equilibrium point $c\in\mathbb{R}^d_{>0}$ of the deterministic model (1) must also be complex balanced (see [Reference Anderson, Craciun and Kurtz7] for the detailed definition of this concept). Every complex balanced equilibrium point $c=(c_1,\dots,c_d)\in \mathbb{R}^d_{>0}$ can be used to construct a similar potential
which turns out to be a Lyapunov function of the deterministic model [Reference Anderson, Craciun, Gopalkrishnan and Wiuf6, Reference Horn and Jackson19]. The following theorem reveals the relationship between the global potential U introduced in Theorem 4 and the potential $\tilde{U}$ defined in (19).
Theorem 5. Suppose that a reaction network satisfies deterministic detailed balance with detailed balanced equilibrium point $c\in\mathbb{R}^d_{>0}$ . Let U be the global potential of the reaction network, and let $\tilde{U}$ be the Lyapunov function defined in (19). Then U coincides with $\tilde{U}$ (up to a constant) on $\mathbb{R}^d_{>0}\cap (x_0+{\rm span}(V(\mathcal{R})))$ .
Proof. Since $c\in\mathbb{R}^d_{>0}$ is a detailed balanced equilibrium point, it follows from (8) that
where $\log c=(\log c_1,\dots,\log c_d)$ . This shows that for any $x\in\mathbb{R}^d_{>0}$ ,
Moreover, it follows from Theorem 4(a), Theorem 4(b), and (12) that for any $x\in\mathbb{R}^d_{>0}\cap (x_0+{\rm span}(V(\mathcal{R})))$ ,
Combining the above two equations yields
Then the function $H=\tilde{U}-U$ must satisfy
For any $x,y\in\mathbb{R}^d_{>0}\cap (x_0+{\rm span}(V(\mathcal{R})))$ , let $\phi:[0,1]\rightarrow \mathbb{R}^d_{>0}$ be an arbitrary smooth curve satisfying
Then we have
where we have used the fact that $\dot{\phi}(\cdot)\in {\rm span}(V(\mathcal{R}))$ . This implies the desired result.
From the above theorem, U must coincide with $\tilde{U}$ on $\mathbb{R}^d_{>0}\cap (x_0+{\rm span}(V(\mathcal{R})))$ if the reaction network satisfies deterministic detailed balance. Moreover, Theorem 4(b) shows that U is the potential function of the vector field F. However, the following counterexample shows that $\tilde{U}$ is not necessarily the potential function of the vector field F, even the reaction network satisfies deterministic detailed balance.
Example 5. Consider the following chemical reaction system:
Clearly the system satisfies deterministic detailed balance and $V(\mathcal{R})) = \{(1,-1)\}$ . It is easy to check that the vector field F is given by
Suppose that $x_0 = (1,0)$ . Then there is a unique detailed balanced equilibrium point
The Lyapunov function $\tilde{U}$ associated with the equilibrium point c is given by
This shows that
It is easy to check that $-\nabla\tilde{U}\neq F$ on $\mathbb{R}^2_{>0}\cap (x_0+{\rm span}(V(\mathcal{R})))$ unless $(x_1,x_2) = c$ .
We next give an example showing the application of the results in this section.
Example 6. We revisit the chemical reaction system given in Example 4. Recall that the system satisfies stochastic detailed balance when $k^+_{11}/k^-_{11}=k^+_{12}/k^-_{12}$ . Here we assume that this condition is satisfied. Under this condition, it follows from Theorem 1 that the system also satisfies local detailed balance. Note that the reaction vectors in $V(\mathcal{R})$ are given by $\omega_1 = (1,0)$ and $\omega_2 = (1,-1)$ , which are linearly independent. Therefore, the matrix M is given by
and thus the vector field F is given by
where we have used the condition $k^+_{11}/k^-_{11}=k^+_{12}/k^-_{12}$ . It then follows from Theorem 4(b) that the system has a global potential which satisfies
Integrating the above equation gives the following explicit expression for the global potential:
where C is a constant which can be chosen so that the minimum of U is zero. By Theorem 4(d), any critical point $c = (c_1,c_2)\in\mathbb{R}^2_{>0}$ of the global potential U must satisfy $\nabla U(c)=0$ , that is,
Note that $c_2$ satisfies a cubic equation, which is capable of having three distinct positive real roots. In this case, the global potential U has two local minimum points $c_A$ and $c_C$ and one saddle point $c_B$ , as illustrated in Figure 2. It is easy to see that $c_A$ and $c_C$ are stable equilibrium points of the deterministic model (1) and $c_B$ is an unstable equilibrium point. Therefore, by applying Theorem 1, we have constructed a high-dimensional chemical reaction network that both satisfies stochastic detailed balance and displays multistability, i.e. multiple attractors for the deterministic model.
When V is large, a stochastic reaction network can transition between multiple attractors with low-probability events. In analogy to the classic Freidlin–Wentzell theory [Reference Freidlin and Wentzell16], if $x_0$ is in the basin of attraction of $c_A$ , then for any $\delta>0$ , it follows from [Reference Agazzi, Dembo and Eckmann2, Theorem 1.15] that the transition time between attractors has the following asymptotic behavior:
where $\sigma(B(c_C,\delta))$ denotes the hitting time of the ball centered at $c_C$ with radius $\delta$ . By Theorem 4(e), for any $y\in\mathbb{R}^2_{>0}$ staying in the basin of attraction of $c_A$ , we have
Taking $y\rightarrow c_B$ in the above equation and applying the continuity of the quasi-potential finally yields
Note that the right-hand side of this equation is independent of $c_C$ . This is because the trajectory of the stochastic model will be attracted by $c_C$ with very high speed once it has escaped from the basin of attraction of $c_A$ . Therefore, the hitting time of $B(c_C,\delta)$ is mainly determined by the exit time from the basin of attraction of $c_A$ .
Remark 4. The reaction networks in Examples 1 and 4 are called multistable systems, since they are capable of producing multiple positive equilibrium points of the deterministic model. So far, many results have been obtained to identify whether a deterministic reaction network admits multiple equilibrium points [Reference Conradi, Feliu, Mincheva and Wiuf10, Reference Conradi and Pantea11, Reference Joshi and Shiu28]. Here we mainly focus on the stochastic model, and our results can be applied to investigate the stochastic transitions between multiple attractors.
5. Proof of Theorem 1
Proof of Theorem 1. By the definition of the transition rates of the stochastic model, it is easy to see that for any $1\leq p\leq r$ , $x^V\in E_V$ , and $x^V\rightarrow x\in\mathbb{R}^d_{>0}$ , we have
This clearly shows that
To prove that the system satisfies local detailed balance, we first prove that zero-order local detailed balance is satisfied. For any sufficiently large V and any integers $\xi_1,\dots,\xi_r$ satisfying $\sum_{p=1}^r \xi_p\omega_p = 0$ , we construct the following cycle in $E_V$ :
where $\mathrm{sgn}(x)$ is the sign function, which takes the value of 1 if $x> 0$ , takes the value of 0 if $x=0$ , and takes the value of $-1$ if $x<0$ . Applying the Kolmogorov cycle condition to this cycle yields
Taking the limit as $V\rightarrow\infty$ in this equation and applying (21) yields
Taking logarithms on both sides yields
which shows that the system satisfies zero-order local detailed balance.
We next prove that first-order local detailed balance is satisfied. To this end, for any sufficiently large V and any $1\leq p,q\leq r$ with $p\neq q$ , we consider the following parallelogram cycle in $E_V$ :
Applying the Kolmogorov cycle condition to this cycle yields
Taking logarithms on both sides of this equation yields
By the mean value theorem, it is not hard to prove that
Combining the above two equations yields
which shows that first-order local detailed balance is also satisfied.
6. Proof of Theorem 3
To prove that a reaction network satisfies stochastic detailed balance, it suffices to prove that for any $V>0$ and any cycle $\eta_1\rightarrow\eta_2\rightarrow\cdots\rightarrow \eta_L\rightarrow\eta_1$ in $E_V$ , the Kolmogorov cycle condition
is satisfied. To this end, we need the following lemma.
Lemma 1. Under the conditions in Theorem 3, for any $x\in E_V$ and any integers $\xi_1,\cdots,\xi_r$ , whenever
and
for some $1\leq p\leq r$ , we have
where
Proof. We first discuss the relationship between the reactions that can occur at x and the reactions that can occur at $\tilde{x}$ . For each $1\leq p\leq r$ , let
be the family of reactions that belong to $\mathcal{R}^+_p$ and can occur at x. We claim that if $\omega_p$ satisfies the orthogonality condition, $\mathcal{R}^+_p(x)\neq\varnothing$ , and $\mathcal{R}^+_p(\tilde{x})\neq\varnothing$ , then $\mathcal{R}^+_p(x)=\mathcal{R}^+_p(\tilde{x})$ . To prove this, set
Since $\mathcal{R}^+_p(x)\neq\varnothing$ , there exists $1\leq l_1\leq r_p$ such that $Vx_j\geq\nu^j_{pl_1}$ for all $1\leq j\leq d$ . It then follows from the definition of $J_p$ that $Vx_j\geq\nu^j_{pl}$ for all $j\in J_p$ and $1\leq l\leq r_p$ . Similarly, since $\mathcal{R}^+_p(\tilde{x})\neq\varnothing$ , we conclude that $V\tilde{x}_j\geq \nu^j_{pl}$ for all $j\in J_p$ and $1\leq l\leq r_p$ . On the other hand, if $\omega_p$ satisfies the orthogonality condition, then $\omega^j_{q} = 0$ for all $j\notin J_p$ and $q\neq p$ . This shows that
for all $j\notin J_p$ . Therefore, for all $1\leq j\leq d$ and $1\leq l\leq r_p$ , we have proved that $Vx_j\geq \nu^j_{pl}$ holds if and only if $V\tilde{x}_j\geq \nu^j_{pl}$ holds. This clearly shows that $\mathcal{R}^+_p(x) = \mathcal{R}^+_p(\tilde{x})$ . We are now in a position to prove (22). For each $1\leq p\leq r$ , note that
where we have used the fact that $\omega_p-\nu'_{\!\!pl}=-\nu_{pl}$ for any $1\leq p\leq r$ and $1\leq l\leq r_p$ . Since
and
we have $\mathcal{R}^+_p(x)\neq\varnothing$ and $\mathcal{R}^+_p(\tilde{x})\neq\varnothing$ . This shows that $\mathcal{R}^+_p(x) = \mathcal{R}^+_p(\tilde{x})$ . Similarly, we have
To prove (22), we only need to prove
Since $\nu'_{\!\!pl}=\nu_{pl}+\omega_p$ for any $1\leq p\leq r$ and $1\leq l \leq r_p$ , we only need to prove
If $\omega_p$ does not satisfy the orthogonality condition, then (14) must hold. In this case, it is easy to see that
If $\omega_p$ satisfies the orthogonality condition, then we have
where the second and third equalities in (24) follow from the fact that $\nu^j_{pl_1}=\nu^j_{pl_2}$ for any $1\leq l_1,l_2\leq r_p$ and $j\in J_p$ and the fact that $\tilde{x}_j=x_j$ for any $j\notin J_p$ . Therefore, we have proved (23). This completes the proof.
We are now in a position to prove Theorem 3.
Proof of Theorem 3. Let $\eta_1\rightarrow\eta_2\rightarrow\cdots\rightarrow\eta_L\rightarrow\eta_1$ be an arbitrary cycle in $E_V$ . We first prove that there is a one-to-one correspondence between the transitions in the cycle. Since $\omega_1,\cdots,\omega_r$ are linearly independent, for each $1\leq p \leq r$ , the number of transitions resulting from the reaction vector $\omega_p$ in the cycle must be equal to the number of transitions resulting from the reaction vector $-\omega_p$ . (Otherwise, the reaction vector $\omega_p$ could be linearly expressed by other reaction vectors in $V(\mathcal{R})$ , which would contradict the fact that the elements in $V(\mathcal{R})$ are linearly independent.) We then pair the transitions resulting from the reaction vector $\omega_p$ with the transitions resulting from the reaction vector $-\omega_p$ in the following manner. First, we project the cycle onto the one-dimensional line spanned by $\omega_p$ , as illustrated in Figure 3. Note that the projection mentioned here means oblique projection rather than orthogonal projection. (Suppose that $\omega_1,\cdots,\omega_r$ are linearly independent vectors; if a vector $\omega$ can be linearly expressed by $\omega_1,\cdots,\omega_r$ as $\omega = a_1\omega_1+\cdots+a_r\omega_r$ , then the (oblique) projection of $\omega$ onto the direction of $\omega_p$ is simply defined as $a_p\omega_p$ .) The image of the projection onto the one-dimensional line is still a cycle, and only the transitions resulting from $\pm\omega_p$ exist in the projected cycle. Second, we decompose the transitions in the projected cycle into multiple floors, and we pair each transition with one of its reversed transitions located on the same floor, as depicted in Figure 3. This is always possible since the number of transitions resulting from $\omega_p$ on each floor must be equal to that resulting from $-\omega_p$ . Based on this method, each transition
can be paired with a transition
for some integers $\xi_1,\cdots,\xi_r$ . Obviously, this method establishes a one-to-one correspondence between the transitions in the cycle. Applying Lemma 1 to the paired transitions yields
This shows that
Thanks to the one-to-one correspondence between the transitions in the cycle, we finally obtain
Thus we have proved that the Kolmogorov cycle condition is satisfied for each cycle, which implies that the system satisfies stochastic detailed balance.
7. Proof of Theorem 4
To prove Theorem 4, we need some lemmas. The following lemma is exactly Theorem 4(a).
Lemma 2. Suppose that a reaction network satisfies zero-order local detailed balance. Then the vector field F defined in (15) is independent of the choice of the basis of ${\rm span}(V(\mathcal{R}))$ . Moreover, for any $1\leq p\leq r$ , we have
Proof of Lemma 2. Let $\{\omega_{i_1},\dots,\omega_{i_m}\}$ and $\{\omega_{j_1},\dots,\omega_{j_m}\}$ be two arbitrary bases of ${\rm span}(V(\mathcal{R}))$ , and let $M =(\omega^T_{i_1},\dots,\omega^T_{i_m})$ and $\tilde{M} = (\omega^T_{j_1},\dots,\omega^T_{j_m})$ be two matrices. Since the columns of M, as well as the columns of $\tilde{M}$ , are linearly independent, there exists an invertible matrix $A=(a_{pk})\in M_{m\times m}(\mathbb{\mathbb{R}})$ such that $\tilde{M}=MA$ , where $M_{m\times m}(\mathbb{\mathbb{R}})$ denotes the set of $m\times m$ matrices whose components are all real numbers. Since $\mathrm{rank}(M) = m$ , the matrix M must have an invertible $m\times m$ submatrix. Since the entries of M and $\tilde{M}$ are all rational numbers, using Cramer’s rule for computing the inverse matrix, it is easy to see that the entries of A are all rational numbers. Note that each column of $\tilde{M}$ can be linearly expressed by the columns of M as
Since the system satisfies zero-order local detailed balance and since $a_{pk}\in\mathbb{Q}$ for all $1\leq p,k\leq m$ , we obtain
Since $\mathrm{rank}(A) = \mathrm{rank}(A^TA)$ for an arbitrary real matrix A [Reference Zhang46, Section 3.5], it is easy to see that the matrices $M^TM$ and $\tilde{M}^T\tilde{M}$ are both invertible. Thus we obtain
This implies that the definition of the vector field F is independent of the choice of the basis of ${\rm span}(V(\mathcal{R}))$ . On the other hand, since $\{\omega_{i_1},\cdots,\omega_{i_m}\}$ is a basis of ${\rm span}(V(\mathcal{R}))$ , for each $1\leq p\leq r$ , the reaction vector $\omega_p$ can be linearly expressed by $\omega_{i_1},\cdots,\omega_{i_m}$ as
where $a = (a_1,\cdots,a_m)\in\mathbb{R}^m$ . This shows that
Since the system satisfies zero-order local detailed balance and since the entries of $(M^TM)^{-1}M^T\omega_p^T$ are all rational numbers, we immediately obtain
This completes the proof.
The following lemma is exactly Theorem 4(b).
Lemma 3. Suppose that a reaction network satisfies local detailed balance. Then the vector field F has a potential function $U\in C^\infty(\mathbb{R}^d_{>0})$ , i.e.,
Proof. For any $x\in x_0+{\rm span}(V(\mathcal{R}))$ , there exists $y = (y_1,\cdots,y_m)\in\mathbb{R}^m$ such that
Since $\{\omega_{i_1},\cdots,\omega_{i_m}\}$ is a basis of ${\rm span}(V(\mathcal{R}))$ , this equation establishes a one-to-one correspondence between $x\in\mathbb{R}^d_{>0}\cap(x_0+{\rm span}(V(\mathcal{R})))$ and $y\in E$ , where E is some convex open subset of $\mathbb{R}^m$ . We denote this correspondence by
The inverse of this affine transformation is then given by
For convenience, we introduce a function $h:E\rightarrow\mathbb{R}^m$ defined by
It is then easy to check that
Since the system satisfies first-order local detailed balance, we immediately obtain
To proceed, we construct the following smooth differential 1-form on E:
It then follows from (26) that
This shows that the differential form $\omega$ is closed. Since E is a convex open subset of $\mathbb{R}^m$ , it then follows from the Poincaré lemma [Reference Lee35, Theorem 17.14] that the kth de Rham cohomology of E vanishes for each $k\geq 1$ , which means that $\omega$ is exact. In other words, there exists a function $\tilde{U}\in C^\infty(E)$ such that
This clearly shows that $h = -\nabla\tilde{U}$ . Since $\tilde{U}$ is a smooth function on the open set $E\subset\mathbb{R}^m$ , we can extend it to a function $\tilde{U}\in C^\infty(\mathbb{R}^m)$ . To proceed, we define the potential function $U\in C^\infty(\mathbb{R}^d_{>0})$ as
For any $x\in\mathbb{R}^d_{>0}\cap(x_0+{\rm span}(V(\mathcal{R})))$ , straightforward computations show that
This completes the proof.
The following lemma is exactly Theorem 4(c)–(d).
Lemma 4. Suppose that a reaction network satisfies local detailed balance. Then
where $x = x(t)$ is the solution of the deterministic model (1), and equality holds if and only if the deterministic model starts from any one of its equilibrium points. Moreover, the critical points of U within $\mathbb{R}^d_{>0}\cap(x_0+\mathrm{span}(V(\mathcal{R})))$ are also the equilibrium points of the deterministic model.
Proof. Let $x = x(t)$ be the solution of the deterministic model (1). It then follows from Lemma 2 that
where the equality holds for some $t\geq 0$ if and only if $f_p^+(x(t))=f^-_p(x(t))$ for all $1\leq p\leq r$ , which implies that x(t) is an equilibrium point of the deterministic model (1). If the deterministic model does not start from any one of its equilibrium points, then x(t) is not an equilibrium point and thus the equality in (27) cannot be attained. Finally, if $c\in\mathbb{R}^d_{>0}\cap(x_0+\mathrm{span}(V(\mathcal{R})))$ is a critical point of U, then we have $F(c) = -\nabla U(c) = 0$ . It then follows from (16) that $f^+_p(c) = f^-_p(c)$ for each $1\leq p\leq r$ , which shows that c is an equilibrium point of the deterministic model.
The above lemma shows that the asymptotic stability of equilibrium points of the deterministic model (1) can be analyzed with the aid of the potential function U, according to the classic Lyapunov stability criterion [Reference Walter45, Chapter 30].
Lemma 5. Let $\{\omega_{i_1},\cdots,\omega_{i_m}\}$ be an arbitrary basis of ${\rm span}(V(\mathcal{R}))$ , and let $M = (\omega_{i_1}^T,\cdots,\omega^T_{i_m})$ be a $d\times m$ matrix. Then for any $x\in \mathbb{R}^d_{>0}$ and $y\in{\rm span}(V(\mathcal{R}))$ , we have
where $A = M(M^TM)^{-1}$ and L(x,y) is the Lagrangian defined in (5). Furthermore, there exists a unique $\theta = \theta_0\in\mathbb{R}^m$ at which the maximum is attained.
Proof. Let $\{v_1,\cdots,v_{d-m}\}$ be an arbitrary basis of ${\rm span}(V(\mathcal{R}))^\perp$ , and let
be two matrices. We first prove that the square matrix B is invertible. To this end, we consider the system of linear equations $Bz^T = 0$ . Since $Bz^T = 0$ , we have $M^Tz^T=0$ and $N^Tz^T=0$ . Since the columns of M and N constitute a basis of $\mathbb{R}^d$ , we conclude that $z = 0$ . Thus the system of linear equations $Bz^T = 0$ has only the zero solution, which implies that B is invertible.
For any $x\in\mathbb{R}^d_{>0}$ and $y\in{\rm span}(V(\mathcal{R}))$ , we have
where we have used the fact that B is invertible. Since $y\in {\rm span}(V(\mathcal{R}))$ , there exist $k_1,\cdots,k_m\in\mathbb{R}$ such that $y=k_1\omega_{i_1}+\dots+k_m\omega_{i_m}$ . Since the columns of M and the columns of N are orthogonal, we have
Similarly, we can prove that the last $d-m$ components of $\omega_pB^T$ are 0 for all $1\leq p \leq r$ . Thus the supremum in the last equality of (28) can be taken over the first m components of $\theta$ ; that is,
where
and $A = M(M^TM)^{-1}$ . Direct computations show that the Hessian matrix of the function $G(x,y,\theta)$ with respect to $\theta$ is given by
where
Since $A^T(\omega_{i_1}^T,\dots,\omega^T_{i_m}) = (M^TM)^{-1}M^TM = I$ and $g_p(x,\theta)>0$ for any $x\in\mathbb{R}^d_{>0}$ , it is easy to check that the Hessian matrix is negative definite. Therefore, $G(x,y,\theta)$ is a strictly concave function with respect to $\theta$ [Reference Niculescu and Persson37, Corollary 3.8.6].
We next prove that
Since $yA = (k_1,\cdots,k_m)$ , for any $\theta = (\theta_1,\cdots,\theta_m)$ , we have
Therefore, (30) follows directly from the fact that exponential functions grow much faster than linear functions.
Since $G(x,y,\theta)$ is a strictly concave function with respect to $\theta$ and since (30) holds, it follows that $G(x,y,\theta)$ must attain its maximum at a unique $\theta = \theta_0\in\mathbb{R}^m$ [Reference Hiriart-Urruty and LemarÉchal18, Chapter B, Theorem 4.1.1].
For any absolutely continuous trajectory $\phi:[0,T]\rightarrow \mathbb{R}^d_{>0}$ satisfying
we define
to be the line integral of the vector field F along the trajectory $\phi$ . If the system satisfies local detailed balance, then the vector field F has a potential function U. In this case, $S(\phi)$ can be represented by the potential function U as
which depends only on the endpoints of the trajectory $\phi$ and thus is ‘path-independent’. Moreover, we define the reversed trajectory of $\phi$ as
Lemma 6. Suppose that a reaction network satisfies local detailed balance. Then for any absolutely continuous trajectory $\phi:[0,T]\rightarrow \mathbb{R}^d_{>0}$ satisfying
we have
where $\phi^-$ is the reversed trajectory of $\phi$ .
Proof of Lemma 6. Since $\phi(\cdot)\in x_0+\mathrm{span}(V(\mathcal{R}))$ , we have $\dot\phi(\cdot)\in\mathrm{span}(V(\mathcal{R}))$ . It then follows from Lemma 5 that for each $0\leq t\leq T$ , there exists a unique $\theta = \theta_1(t)\in\mathbb{R}^m$ such that the following maximum is attained:
where $G(x,y,\theta)$ is the function defined in (29). Using Lemma 5 again shows that for each $0\leq t\leq T$ , there exists a unique $\theta = \theta_2(t)\in\mathbb{R}^m$ such that the following maximum is attained:
Since the maximum in (32) is attained at $\theta = \theta_1(t)$ , taking the derivatives of $G(\phi(t),\dot\phi(t),\theta)$ with respect to $\theta$ and evaluating at $\theta = \theta_1(t)$ yields
where $A = M(M^TM)^{-1}$ . By the proof of Lemma 5, $G(\phi(t),\dot\phi(t),\theta)$ is a strictly concave function with respect to $\theta$ . This shows that $\theta = \theta_1(t)$ is the unique solution of the following equation:
Similarly, since the maximum in (33) is attained at $\theta = \theta_2(t)$ , we obtain
To proceed, let
It then follows from (16) that
This clearly shows that
which implies that
Substituting this equation into (35), it is easy to check that $\theta = -\theta_2(t)-f(\phi(t))$ is also a solution of the equation (34). By the uniqueness of the solution of the equation (34), we immediately obtain
It follows from (32), (33), and (36) that
For the reversed trajectory $\phi^-$ , note that
Finally, we obtain
This completes the proof.
The following lemma is exactly Theorem 4(e).
Lemma 7. Suppose that a reaction network satisfies local detailed balance. Let $c\in\mathbb{R}^d_{>0}\cap(x_0+\mathrm{span}(V(\mathcal{R})))$ be an equilibrium point of the deterministic model (1). If $y\in\mathbb{R}^d_{>0}$ is attracted to c for the deterministic model (1), then
Proof. Let $\phi:[0,T]\rightarrow \mathbb{R}^d_{>0}$ be an arbitrary absolutely continuous trajectory satisfying
It thus follows from Lemma 6 that
where $\phi^-$ is the reversed trajectory of $\phi$ . Taking the infimum over $\phi$ on both sides of this equation yields
On the other hand, let $\phi_{y}(t)$ denote the trajectory of the deterministic model (1) starting from y. In addition, let
denote the reversed trajectory of $\phi_{y}$ over the interval [0,T]. Applying Lemma 6 again shows that
Moreover, let
be an absolutely continuous trajectory from c to $\phi_{y}(T)$ . Recall that for any $y\in{\rm span}(V(\mathcal{R}))$ , we have
where $A = M(M^TM)^{-1}$ . Since y is attracted to c, for any $\epsilon>0$ , we have $\|\phi_y(T)-c\|\leq \epsilon$ when T is sufficiently large. For convenience, set
For any $\theta = (\theta_1,\cdots,\theta_m)\in\mathbb{R}^m$ , whenever $\|x-c\|\leq \epsilon$ , we have
Thus for any $y = k_1\omega_{i_1}+\dots+k_m\omega_{i_m}\in\mathrm{span}(V(\mathcal{R}))$ , whenever $\|x-c\|\leq \epsilon$ , we have
where
It is easy to see that the function $f^\star$ is continuous on $\mathrm{span}(V(\mathcal{R}))$ . Since $\|\dot{\zeta_T}(t)\|\equiv 1$ and $f^\star$ is bounded on compact sets, there exists a constant $C_2>0$ such that for any $T\geq 0$ and $0\leq t\leq\|\phi_{y}(T)-c\|$ ,
Thus when T is sufficiently large, it follows from (38) that for any $0\leq t\leq\|\phi_{y}(T)-c\|$ ,
where we have used the fact that $\|\zeta_T(t)-c\|\leq\epsilon$ when T is sufficiently large. Finally, we obtain
Combining $\zeta_T$ and $\psi_T$ , we obtain an absolutely continuous trajectory from c to y. Therefore, we have
Since y is attracted to c, taking $T\rightarrow \infty$ in the above equation yields
where we have used the arbitrariness of $\epsilon$ . Finally, the desired result follows from (37) and (39).
Acknowledgements
We thank Prof. Vadim A. Malyshev and Sergey A. Pirogov for stimulating discussions via email, and we thank Prof. Hao Ge and Dr. Xiao Jin for providing some useful references. We are also grateful to the anonymous referees for their valuable comments and suggestions, which helped us greatly improve the quality of this paper. C. J. acknowledges support from the NSAF grant of the National Natural Science Foundation of China, under grant no. U1930402. D.-Q. J. is supported by the National Natural Science Foundation of China through grant no. 11871079.