1 Introduction
Let $(\beta _p)_{p \geqslant 2}$ be a sequence of non-negative numbers, which for simplicity we assume to contain only a finite number of nonzero elements, and, for every $r \in \mathbb {R}$ , let $\xi (r) := \sum _{p \geqslant 2} \beta _p r^p$ . For every integer $N \geqslant 1$ , let $P_N$ denote a probability measure on $\mathbb {R}^N$ , which we often (but not always) assume to be such that
We aim to study Gibbs measures built from the probability measure $P_N$ using as energy function the centered Gaussian field $(H_N(\sigma ))_{\sigma \in \mathbb {R}^N}$ with covariance
This Gaussian vector can be built explicitly using independent linear combinations of quantities of the form $\sum _{1 \leqslant i_1,\ldots ,i_p \leqslant N} J_{i_1,\ldots ,i_p} \sigma _{i_1} \cdots \, \sigma _{i_p}$ , where $(J_{i_1,\ldots ,i_p})$ are independent standard Gaussian random variables. The Gibbs measures thus obtained are often called mixed p-spin models, possibly with the qualifiers “spherical” or “with Ising spins” when $P_N$ is the uniform measure on the sphere in $\mathbb {R}^N$ or on $\{-1,1\}^N$ , respectively. The Sherrington–Kirkpatrick model corresponds to the case of Ising spins and $\xi (r) = \beta r^2$ . The Parisi formula is a self-contained description of the limit free energy
The identification of this limit was put on a rigorous mathematical footing in [19, 26, 31–33], after the fundamental insights reviewed in [20].
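The explicit construction described above is easy to make concrete. The following minimal sketch, with a hypothetical mixture $\beta_2 = \beta_3 = 1/2$ (so $\xi(r) = (r^2+r^3)/2$ and $\xi(1) = 1$) chosen purely for illustration, samples the field repeatedly and checks the covariance $\mathbb{E}[H_N(\sigma)H_N(\tau)] = N\xi(\sigma\cdot\tau/N)$ empirically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mixture: beta_2 = beta_3 = 1/2, so xi(r) = (r^2 + r^3)/2, xi(1) = 1.
betas = {2: 0.5, 3: 0.5}
xi = lambda r: sum(b * r ** p for p, b in betas.items())

def sample_H(N, sigmas):
    """One sample of the field at each configuration in `sigmas`, built as
    H_N(s) = sum_p sqrt(beta_p / N^(p-1)) sum_{i_1..i_p} J_{i_1..i_p} s_{i_1}...s_{i_p},
    so that E[H_N(s) H_N(s')] = N xi(s.s'/N)."""
    out = np.zeros(len(sigmas))
    for p, beta in betas.items():
        J = rng.standard_normal((N,) * p)  # i.i.d. standard Gaussian couplings
        for a, s in enumerate(sigmas):
            T = J
            for _ in range(p):             # contract one tensor index at a time
                T = T @ s
            out[a] += np.sqrt(beta / N ** (p - 1)) * float(T)
    return out

N = 10
sigma = rng.choice([-1.0, 1.0], size=N)
tau = sigma.copy(); tau[:3] *= -1.0        # overlap sigma.tau/N = 0.4
samples = np.array([sample_H(N, [sigma, tau]) for _ in range(20000)])
print(np.mean(samples[:, 0] * samples[:, 1]), N * xi(sigma @ tau / N))  # close
print(np.mean(samples[:, 0] ** 2), N * xi(1.0))                         # close
```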
The main goal of the present paper is to propose a new way to think about this result. This new point of view reveals a natural connection with the solution of a Hamilton–Jacobi equation posed in the space of probability measures on the positive half-line. For every metric space E, we denote by $\mathcal P(E)$ the set of Borel probability measures on E, and by $\delta _x$ the Dirac measure at $x \in E$ . We also set, with $\xi ^*$ denoting the convex dual of $\xi $ defined precisely below in (1.4),
Theorem 1.1 (Hamilton–Jacobi representation of Parisi formula)
Assume (1.1), fix the normalization $\xi (1) = 1$ , and let $(t,\mu ) \mapsto f(t,\mu ) : \mathbb {R}_+ \times \mathcal P_*(\mathbb {R}_+) \to \mathbb {R}$ be the solution of the Hamilton–Jacobi equation
where the function $\psi $ is described below in (3.1) and Proposition 3.1. For every $t \geqslant 0$ ,
Interestingly, the evolution equation in (1.2) depends on the correlation function $\xi $ but not on the measures $P_N$ , while, as will be seen below, the initial condition $\psi $ depends on the measures $P_N$ but not on $\xi $ . We postpone a precise discussion of the meaning of the equation (1.2), and start by explaining the background and motivations for looking for such a representation.
Recently, a new rigorous approach to the identification of limit free energies of mean-field disordered systems was proposed in [21, 22], inspired by [3, 4, 18]. The idea proposed there is to place the main emphasis on the fact that, after “enriching” the problem, we can identify the limit free energy as the solution of a Hamilton–Jacobi equation. At least for the problems considered there, one can show that finite-volume free energies already satisfy the same Hamilton–Jacobi equation, albeit only approximately. In particular, the approach allows for a convenient breakdown of the proof into two main steps: a first, more “probabilistic” part, which aims at showing that finite-volume free energies indeed satisfy an approximate Hamilton–Jacobi equation; and a second, more “analytic” part, which takes this information as input and concludes that the limit must solve the equation exactly.
The problems studied in [21, 22] relate to statistical inference. They possess a particular feature that enforces “replica symmetry,” and this allows for a complete resolution of the problem by adding only a finite number of extra variables to the problem. As is well known, this is not the case for mean-field spin glasses such as those considered here. The relevant Hamilton–Jacobi equation, if any, must therefore be set in an infinite-dimensional space.
The identity of this Hamilton–Jacobi equation is revealed by Theorem 1.1. The aim of the present paper is to demonstrate the presence of this structure, and we will therefore simply borrow formulas from the literature for the limit on the left side of (1.3), and check that the expressions found there agree with the right side of (1.3). Hence, I want to stress that Theorem 1.1 is a rephrasing of known results.
However, I believe that Theorem 1.1 can be useful in furthering our understanding by providing a new way for us to think about these results—see also [34] for general considerations on the relevance of such endeavors. In the long run, I hope indeed that this new interpretation of the Parisi formula will suggest a new and possibly more robust and transparent approach to the identification of the limit free energy of disordered mean-field systems. For this purpose, it will be important to rely on stability estimates for the Hamilton–Jacobi equation (1.2) (that is, estimates asserting that a function satisfying the equation approximately must be close to the true solution). This should leverage powerful approaches to the well-posedness of Hamilton–Jacobi equations such as the notions of viscosity or weak solutions, as exemplified in the finite-dimensional setting in [21] and [22], respectively. Since the purpose of the present paper is only to demonstrate the presence of the Hamilton–Jacobi structure, I will refrain from exploring this direction here. Since the completion of this work, partial results on bipartite models have been obtained in [23] using the idea uncovered here. In [23, Section 6], it is also argued that more standard variational approaches do not seem to be applicable for such models.
This Hopf–Lax formulation features an optimal transport problem involving the cost function $(x,y) \mapsto \xi ^*(x-y)$ , where $\xi ^*$ is the convex dual of $\xi $ defined by
Notice that the function $\xi $ is convex on $\mathbb {R}_+$ , and the precise way to interpret $\xi ^*$ is as the dual of the convex and lower semicontinuous function on $\mathbb {R}$ which coincides with $\xi $ on $\mathbb {R}_+$ and is $+\infty $ otherwise. (The function f of interest to us satisfies a monotonicity property which can be interpreted as $\partial _{\mu } f \geqslant 0$ in a weak sense, and thus modifying $\xi $ on $\mathbb {R} \setminus \mathbb {R}_+$ is irrelevant to the interpretation of (1.2)—see also [22] for a more precise discussion of this point in finite dimension, as well as Lemma 2.4 below.)
Optimal transport problems for measures on the real line are in some sense trivial, in that optimal couplings between pairs of measures can be realized jointly over all measures, and do not depend on the convex function $\xi ^*$ entering the definition of the cost function. Denoting, for every $\mu \in \mathcal P(\mathbb {R}_+)$ and $r \in [0,1]$ ,
and letting U be a uniform random variable over $[0,1]$ under $\mathbb {P}$ , we set
It is classical to verify that the law of $X_\mu $ under $\mathbb {P}$ is $\mu $ , and that for any two measures $\mu ,\nu \in \mathcal P_*(\mathbb {R}_+)$ , the law of the pair $(X_\mu , X_\nu )$ is an optimal transport plan for the cost function $(x,y) \mapsto \xi ^*(x-y)$ (see e.g., [35, Theorem 2.18 and Remark 2.19(ii)] or [2, Theorem 6.0.2]). As discussed above, for the purposes of this paper, we define the solution of (1.2) to be given by the Hopf–Lax formula
Although this will not be used here, one can give a brief and nonrigorous idea of the definition of the derivative $\partial _\mu $ formally appearing in (1.2) in the case when it applies to a sufficiently “smooth” function $g : \mathcal P_*(\mathbb {R}_+) \to \mathbb {R}$ : for each $\mu \in \mathcal P_*(\mathbb {R}_+)$ , we want $\partial _\mu g(\mu ,\cdot )$ to satisfy $\int\!\! \xi (\partial _\mu g(\mu ,\cdot )) \, \mathrm {d} \mu < \infty $ , and be such that, as $\nu \to \mu $ in $\mathcal P_*(\mathbb {R}_+)$ ,
where $\|Y\|_{L^*}$ denotes the $\xi ^*$ -Orlicz norm of a random variable Y, see [29],
From this informal definition, one can work out finite-dimensional approximations of the equation (1.2) by imposing, for instance, that only measures of the form $k^{-1} \sum _{\ell = 1}^k \delta _{x_\ell }$ are “permitted.” This brings us within the realm of finite-dimensional Hamilton–Jacobi equations and allows us, for instance, to verify the correspondence between the equation (1.2) and the Hopf–Lax formula (1.7) at the level of these approximations.
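As a quick numerical illustration of the quantile coupling described above (a sketch, taking $\xi(r) = r^2$ and randomly chosen atoms as illustrative stand-ins), one can check on empirical measures with equal masses that the comonotone pairing indeed realizes the optimal transport cost among all deterministic couplings:

```python
import numpy as np
from itertools import permutations

# Take xi(r) = r^2 on R_+, so xi*(s) = sup_{r >= 0}(r s - r^2) = (max(s, 0))^2 / 4.
xi_star = lambda s: np.maximum(s, 0.0) ** 2 / 4.0

rng = np.random.default_rng(1)
k = 6
mu = np.sort(rng.uniform(0.0, 2.0, k))   # atoms of mu, each of mass 1/k
nu = np.sort(rng.uniform(0.0, 2.0, k))   # atoms of nu, each of mass 1/k

# Quantile coupling (X_mu, X_nu) = (F_mu^{-1}(U), F_nu^{-1}(U)): it pairs the
# i-th smallest atom of mu with the i-th smallest atom of nu.
quantile_cost = np.mean(xi_star(mu - nu))

# Brute force over all deterministic couplings; for equal masses these are the
# extreme points of the transport polytope, so the minimum below is the optimum.
brute = min(np.mean(xi_star(mu - nu[list(p)])) for p in permutations(range(k)))
print(quantile_cost, brute)              # the two costs coincide
```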
We will in fact consider a richer family of finite-volume free energies than what appears on the left side of (1.3), parametrized by $(t,\mu ) \in \mathbb {R}_+ \times \mathcal P_*(\mathbb {R}_+)$ , and I expect that these free energies converge to $f(t,\mu )$ as N tends to infinity, where f is the solution of (1.2). In fact, I expect that a similar result holds for a much larger class of measures $P_N$ than those covered by the assumption of (1.1). A precise conjecture to this effect is presented in Section 2. The identification of the initial condition $\psi $ appearing in (1.2) is then discussed in Section 3. The proof of Theorem 1.1 is given in Section 4. Finally, finite-dimensional approximations of (1.2) are briefly explored in Section 5.
2 Conjecture for a general reference measure
The main goal of this section is to state a conjecture generalizing Theorem 1.1 to a wider class of measures $P_N$ than those appearing in (1.1). For simplicity, we retain the assumption that
If there exists some $R \in (0,\infty )$ such that for every N, the measure $P_N$ is supported in the ball $\{\sigma \in \mathbb {R}^N \ : \ |\sigma |^2 \leqslant R N\}$ , then one can without loss of generality reduce to the case in (2.1) by rescaling the function $\xi $ .
In order to gain some familiarity with Theorem 1.1 and its conjectured generalization, we start by illustrating the driving idea in simpler settings. Possibly the simplest demonstration of the idea of identifying limit free energies of mean-field systems as solutions of Hamilton–Jacobi equations concerns the analysis of the Curie–Weiss model, see e.g., [21, Section 1] (earlier references include [6, 25]). We give here another simple illustration for spin glasses in the high-temperature regime, which is similar to discussions in [18]. For every $t, h \geqslant 0$ , we consider the “enriched” free energy
where $z = (z_1,\ldots ,z_N)$ is a vector of independent standard Gaussians, independent of $H_N$ , and where $|\sigma |^2 = \sum _{i = 1}^N \sigma _i^2$ . Notice that under the assumptions of Theorem 1.1, we have $|\sigma |^2 = N$ and $\xi (N^{-1} |\sigma |^2) = 1$ . The terms $-Nt \xi \left ( N^{-1} |\sigma |^2 \right )$ and $-h|\sigma |^2$ inside the exponential in (2.2) are natural since they ensure that
(Observing that $H_N(\sigma )$ and $z\cdot \sigma $ are independent centered Gaussians of variance $N \xi \left ( N^{-1} |\sigma |^2 \right )$ and $|\sigma |^2$ respectively, this follows either by recognizing an exponential martingale, or by differentiating in t and h and using Gaussian integration by parts.) In the terminology of statistical physics, one may say that we have normalized the Hamiltonian so that the annealed free energy is always zero. The minus sign in front of the expression on the right side of (2.2) is also convenient since, by Jensen’s inequality, we thus have $\overline F_N^\circ \geqslant 0$ . One can check that
In the case when $\xi $ is convex over $\mathbb {R}$ , the right side of (2.3) is non-negative, and thus we already see that $\overline F_N^\circ $ is a supersolution of a simple Hamilton–Jacobi equation. Moreover, one can expect in many settings that the initial condition $\overline F_N^\circ (0,h)$ converges as N tends to infinity; for instance, when $P_N$ is the N-fold product measure $P_N = P_1^{\otimes N}$ , we have
where in this expression, the variable $z_1$ is a scalar standard Gaussian. Finally, if we expect the overlap $\sigma \cdot \sigma '$ to be concentrated around its expectation, which should be correct in a high-temperature region (that is, for t sufficiently small), then it should be that $\overline F_N^\circ $ converges to the solution of the equation $\partial _t f - \xi (\partial _h f) = 0$ .
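The following sketch illustrates this one-dimensional picture numerically, for Ising spins and the Sherrington–Kirkpatrick choice $\xi(r) = r^2$. It evaluates a one-dimensional Hopf–Lax formula over Dirac initial data, which is how (1.7) below specializes to this setting (an assumption of the illustration, as are the grids and the test point), with $\psi(h) = h - \mathbb{E} \log \cosh(\sqrt{2h}\, z_1)$ as suggested by the product-measure computation above, and then checks the equation $\partial_t f - \xi(\partial_h f) = 0$ by finite differences.

```python
import numpy as np

# Gauss-Hermite quadrature for expectations over a standard Gaussian z.
hn, hw = np.polynomial.hermite.hermgauss(80)
zs, ws = np.sqrt(2.0) * hn, hw / np.sqrt(np.pi)

# Limit initial condition for Ising spins: psi(h) = h - E log cosh(sqrt(2h) z).
def psi(h):
    return h - np.sum(ws * np.log(np.cosh(np.sqrt(2.0 * h) * zs)))

xi      = lambda r: r ** 2                         # SK normalization: xi(1) = 1
xi_star = lambda s: np.maximum(s, 0.0) ** 2 / 4.0  # dual of xi restricted to R_+

hp = np.linspace(0.0, 5.0, 5001)                   # grid of candidate h'
psi_vals = np.array([psi(v) for v in hp])

def f(t, h):
    """1-d Hopf-Lax formula: f(t,h) = sup_{h'} [psi(h') - t xi*((h' - h)/t)]."""
    return np.max(psi_vals - t * xi_star((hp - h) / t))

# Finite-difference check that d_t f = xi(d_h f) at a point of smoothness.
t, h, e = 0.1, 0.5, 1e-3
dt = (f(t + e, h) - f(t - e, h)) / (2 * e)
dh = (f(t, h + e) - f(t, h - e)) / (2 * e)
print(dt, xi(dh))   # the two values approximately agree
```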
However, as is well-known, the overlap $\sigma \cdot \sigma '$ is in fact not always concentrated around its mean value, and a more refined approach is necessary. In order to proceed, as in [19, 26, 31–33], we need to compare the system of interest with a much more refined “linear” system than $\sqrt {2h} z \cdot \sigma $ . We parametrize the more refined systems by a measure $\mu \in \mathcal P(\mathbb {R}_+)$ (and not $\mu \in \mathcal P([0,1])$ as experts may expect). It is much more convenient to describe this more refined system in the case when $\mu $ is a measure of finite support: we assume that for some integer $k \geqslant 0$ , there exist
such that
We represent the rooted tree with (countably) infinite degree and depth k by
where $\mathbb {N}^{0} = \{\emptyset \}$ , and $\emptyset $ represents the root of the tree. For every $\alpha \in \mathbb {N}^\ell $ , we write $|\alpha | := \ell $ to denote the depth of the vertex $\alpha $ in the tree $\mathcal A$ . For every leaf $\alpha = (n_1,\ldots ,n_k)\in \mathbb {N}^k$ and $\ell \in \{0,\ldots , k\}$ , we write
with the understanding that $\alpha _{| 0} = \emptyset $ . We also give ourselves a family $(z_{\alpha ,i})_{\alpha \in \mathcal A, 1 \leqslant i \leqslant N}$ of independent standard Gaussians, independent of $H_N$ , and we let $(v_\alpha )_{\alpha \in \mathbb {N}^k}$ denote a Poisson–Dirichlet cascade with weights given by the family $(\zeta _\ell )_{1 \leqslant \ell \leqslant k}$ . We refer to [26, (2.46)] for a precise definition, and briefly mention here the following three points. First, in the case $k = 0$ , we simply set $v_{\emptyset } = 1$ . Second, in the case $k = 1$ , the weights $(v_\alpha )_{\alpha \in \mathbb {N}}$ are obtained by normalizing a Poisson point process on $(0,\infty )$ with intensity measure $\zeta _1 x^{-1-\zeta _1} \, \mathrm {d} x$ so that $\sum _{\alpha } v_\alpha = 1$ . Third, for general $k \geqslant 1$ , the progeny of each nonleaf vertex at level $\ell \in \{0,\ldots , k-1\}$ is decorated with the values of an independent Poisson point process of intensity measure $\zeta _{\ell +1} x^{-1-\zeta _{\ell +1}} \, \mathrm {d} x$ ; the weight of a given leaf $\alpha \in \mathbb {N}^k$ is then calculated by taking the product of the “decorations” attached to each parent vertex, including the leaf vertex itself (but excluding the root); and finally, these weights over leaves are normalized so that their total sum is $1$ . We take this Poisson–Dirichlet cascade $(v_\alpha )_{\alpha \in \mathbb {N}^k}$ to be independent of $H_N$ and of the random variables $(z_\alpha )_{\alpha \in \mathcal A}$ . For every $\sigma \in \mathbb {R}^N$ and $\alpha \in \mathbb {N}^k$ , we set
where we write $z_{\alpha _{|\ell }} \cdot \sigma = \sum _{i = 1}^N z_{\alpha _{|\ell },i} \, \sigma _i$ . The random variables $(H_N^{\prime }(\sigma ,\alpha ))_{\sigma \in {\mathbb {R}^N}, \alpha \in \mathbb {N}^k}$ form a Gaussian family which is independent of $(H_N(\sigma ))_{\sigma \in \mathbb {R}^N}$ and has covariance
where we write, for every $\alpha , \beta \in \mathbb {N}^k$ ,
We define the “enriched” free energy as
and $\overline F_N(t,\mu ) := \mathbb {E} \left [ F_N(t,\mu ) \right ]$ . As for (2.2), we have normalized this expression so that, by Jensen’s inequality, we have $\overline F_N \geqslant 0$ . We first notice that this quantity can be extended to all $\mu \in \mathcal P_1(\mathbb {R}_+)$ by continuity.
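Before turning to this continuity statement, we note that the cascade described above is easy to simulate approximately. The following sketch truncates each Poisson point process to its n largest points (a truncation introduced here purely for illustration), using the classical fact that the points of a Poisson point process with intensity $\zeta x^{-1-\zeta}\,\mathrm{d}x$, listed in decreasing order, can be realized as $a_i^{-1/\zeta}$ with $(a_i)$ the arrival times of a rate-one Poisson process.

```python
import numpy as np

rng = np.random.default_rng(1)

def ppp_points(zeta, n):
    """Largest n points, in decreasing order, of a Poisson point process on
    (0, infty) with intensity zeta * x^(-1-zeta) dx: realized as a_i^(-1/zeta)
    for the arrival times a_1 < a_2 < ... of a rate-one Poisson process."""
    arrivals = np.cumsum(rng.exponential(size=n))
    return arrivals ** (-1.0 / zeta)

def cascade_weights(zetas, n):
    """Truncated Poisson-Dirichlet cascade, parameters 0 < zeta_1 < ... < zeta_k < 1:
    every vertex at level l-1 keeps its n heaviest children, decorated by an
    independent point process with parameter zeta_l; the unnormalized weight of
    a leaf is the product of the decorations along its path (root excluded), and
    the weights are normalized at the end."""
    w = np.ones(1)
    for zeta in zetas:
        children = np.stack([ppp_points(zeta, n) for _ in range(len(w))])
        w = (w[:, None] * children).ravel()
    return w / w.sum()

v = cascade_weights([0.3, 0.7], n=50)   # k = 2 levels, 50 children per vertex
print(v.shape, v.sum())                 # (2500,) 1.0
```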
Proposition 2.1 (Continuity and extension of $\overline F_N(t,\mu )$ )
Assume (2.1). For each $t \geqslant 0$ and $\mu , \mu ^{\prime } \in \mathcal P(\mathbb {R}_+)$ with finite support, we have
In particular, the mapping $\mu \mapsto \overline F_N(t,\mu )$ can be extended by continuity to the set
The proof of Proposition 2.1 makes use of the following two lemmas. The first one provides an explicit procedure for integrating the randomness coming from the Poisson–Dirichlet cascade. We refer to [26, Theorem 2.9] for a proof. (Notice that the indexation of the family $\zeta $ differs by one unit between here and [26].)
Lemma 2.2 (Integration of Poisson–Dirichlet cascades)
Assume (2.1), and fix $t \geqslant 0$ . For every $y_0, \ldots , y_k \in \mathbb {R}^N$ , define
and then recursively, for every $\ell \in \{1,\ldots , k\}$ ,
where, for every $\ell \in \{0,\ldots ,k\}$ , we write $\mathbb {E}_{y_{\ell }}$ to denote the integration of the variable $y_{\ell } \in \mathbb {R}^N$ along the standard Gaussian measure. We have
In order to state the second lemma, we introduce notation for the Gibbs measure associated with the free energy $F_N$ . That is, for every bounded measurable function $f : \mathbb {R}^N \times \mathbb {N}^k \to \mathbb {R}$ , we write
We usually simply write $\left \langle \cdot \right \rangle $ instead of $\left \langle \cdot \right \rangle _{t,\mu }$ unless there is a risk of confusion. Notice that the measure $\left \langle \cdot \right \rangle $ depends additionally on the realization of the Gaussian field $(H_N(\sigma ))$ and of the variables $(z_{\alpha })$ . By the definition of $F_N(t,\mu )$ , we have $\left \langle 1 \right \rangle = 1$ , and thus $\left \langle \cdot \right \rangle $ can be interpreted as a probability distribution on $\mathbb {R}^N \times \mathbb {N}^k$ . We also need to consider “replicated” pairs, denoted by $(\sigma ,\alpha ), (\sigma ^{\prime },\alpha ^{\prime }), (\sigma '', \alpha ''), \ldots $ , which are independent and are each distributed according to $\left \langle \cdot \right \rangle $ (conditionally on $(H_N(\sigma ))$ and $(z_\alpha )$ ). We keep writing $\left \langle \cdot \right \rangle $ to denote the tensorized measure, so that for instance, for every bounded measurable $f,g : \mathbb {R}^N \times \mathbb {N}^k \to \mathbb {R}$ , we have
The second lemma we need identifies the law of the overlap between $\alpha $ and $\alpha '$ under the Gibbs measure, after also averaging over the randomness coming from $(H_N(\sigma ))$ and $(z_\alpha )$ (averaging over $(z_\alpha )$ only would be sufficient).
Lemma 2.3 (Overlaps for the Poisson–Dirichlet variables)
Whenever the measure $\mu $ is of the form in (2.4)–(2.5), we have, for every $t \geqslant 0$ and $\ell \in \{0,\ldots ,k\}$ ,
Proof The argument can be extracted from [33], or by observing that the derivation of [26, (2.82)] applies as well to the measures considered here. A slightly adapted version of the latter argument is as follows. We fix $\ell \in \{0,\ldots , k\}$ , and let $(g_\beta )_{\beta \in \mathbb {N}^\ell }$ be a family of independent standard Gaussians, independent of any other random variable considered so far. For every $\alpha , \beta \in \mathbb {N}^k$ , we have
Recall the construction of the Poisson–Dirichlet cascade outlined in the paragraph above (2.6), see also [26, (2.46)], and denote by $w_\alpha $ the weights attributed to the leaves by taking the product of the “decorations” of the parent vertices, before normalization, as in [26, (2.45)], so that
By [26, (2.26)], for every $s \in \mathbb {R}$ , we have that
have the same law up to reorderings that preserve the tree structure: that is, we identify two families $(a_\alpha )_{\alpha \in \mathbb {N}^k}$ and $(b_\alpha )_{\alpha \in \mathbb {N}^k}$ whenever there exists a bijection $\pi : \mathbb {N}^k \to \mathbb {N}^k$ satisfying, for every $\alpha , \beta \in \mathbb {N}^k$ ,
We denote
and write $\left \langle \cdot \right \rangle _{t,\mu ,\ell ,s}$ to denote the measure defined as in (2.11) but with $v_\alpha $ replaced by $v_{\alpha ,\ell ,s}$ . By the invariance described above, Gaussian integration by parts, and (2.12), we have, for every $s \in \mathbb {R}$ ,
and, using the invariance once more, we can replace $\left \langle \cdot \right \rangle _{t,\mu ,\ell ,s}$ by $\left \langle \cdot \right \rangle _{t,\mu }$ in the last expression. We thus conclude that
which yields the desired result. ▪
Proof We decompose the proof into two steps.
Step 1. In this step, we give a consistent extension of the definition of $\overline F_N(t,\mu )$ to the case when the parameters in (2.4) may contain repetitions. More precisely, we give ourselves possibly repeating parameters
and let $\mu $ be the measure defined by (2.5). We show that the naive extension of the definition of $\overline F_N(t,\mu )$ obtained by simply ignoring the fact that there may be repetitions in the parameters in (2.13) yields the same result as the actual definition that was given using nonrepeating parameters. The first thing we need to do is extend the definition of the Poisson–Dirichlet cascade $(v_\alpha )_{\alpha \in \mathbb {N}^k}$ to the case when some values of $(\zeta _\ell )_{\ell \in \{1,\ldots ,k\}}$ may be equal to $0$ . Recall that for $\zeta _\ell \in (0,1)$ , the definition briefly described in the paragraph above (2.6) involves a Poisson point process of intensity measure $\zeta _{\ell } x^{-1-\zeta _\ell } \, \mathrm {d} x$ . In the case $\zeta _\ell = 0$ , we interpret this Poisson point process as consisting of a single instance of the value $1$ and then a countably infinite repetition of the value $0$ . This allows us to define the quantity on the right side of (2.7) for arbitrary values of the parameters in (2.13). The average of this quantity can be calculated using Lemma 2.2: the only point that needs to be added is that in the case $\zeta _\ell = 0$ , we interpret (2.10) as
From this algorithmic procedure, one can check that the result does not depend on whether or not there were repetitions in the parameters in (2.13). Indeed,on the one hand, when $\zeta _\ell = \zeta _{\ell +1}$ , we have
where $\mathbb {E}_{y_\ell ,y_{\ell +1}}$ denotes the averaging of the variables $y_\ell , y_{\ell +1}$ when sampled independently according to the standard Gaussian measure on $\mathbb {R}^N$ ; and under this measure, the sum
has the same law as
On the other hand, if $q_\ell = q_{\ell +1}$ , then the term indexed by $\ell +1$ in the sum on the right side of (2.9) vanishes, and
It is thus clear in both cases that removing repetitions does not change the value of the resulting quantity.
Step 2. Consider now two measures $\mu ,\mu ' \in \mathcal P(\mathbb {R}_+)$ of finite support. There exist $k \in \mathbb {N}$ , $(\zeta _\ell )_{0 \leqslant \ell \leqslant k}$ , $(q_\ell )_{0 \leqslant \ell \leqslant k}$ and $(q^{\prime }_\ell )_{0 \leqslant \ell \leqslant k}$ satisfying (2.5), (2.13),
Using this representation, we can rewrite the $L^1$ -Wasserstein distance between the measures $\mu $ and $\mu '$ as
Abusing notation, we denote
and proceed to compute $\partial _{q_\ell } \overline F_N(t,\zeta ,q)$ , for each $\ell \in \{0,\ldots ,k\}$ . For every $\sigma , \tau \in \mathbb {R}^N$ , $\alpha \in \mathbb {N}^k$ and $\beta \in \mathcal A$ , we have
For every $\ell \in \{0,\ldots ,k-1\}$ , we have
By (2.16) and Gaussian integration by parts, see e.g., [26, Lemma 1.1], we obtain
The same reasoning also shows that
so that the last identity in (2.17) is also valid for $\ell = k$ . In particular, for every $\ell \in \{0,\ldots ,k\}$ , we have by (2.1) that
and thus, by integration,
A comparison with (2.14) then yields the desired result. ▪
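Step 2 of this proof used that, for two finitely supported measures sharing the weights $(\zeta_{\ell+1} - \zeta_\ell)$ and carrying sorted atoms $(q_\ell)$ and $(q'_\ell)$, the $L^1$-Wasserstein distance is simply $\sum_\ell (\zeta_{\ell+1} - \zeta_\ell)\,|q_\ell - q'_\ell|$, since matching atoms level by level is exactly the quantile coupling. A quick numerical check of this identity, with arbitrary illustrative parameters:

```python
import numpy as np
from scipy.stats import wasserstein_distance

zeta = np.array([0.0, 0.3, 0.7])        # zeta_0 < zeta_1 < zeta_2, with zeta_3 = 1
w = np.diff(np.append(zeta, 1.0))       # masses zeta_{l+1} - zeta_l
q = np.array([0.1, 0.5, 0.9])           # atoms of mu  (sorted)
qp = np.array([0.2, 0.4, 1.3])          # atoms of mu' (sorted)

lhs = wasserstein_distance(q, qp, u_weights=w, v_weights=w)
rhs = np.sum(w * np.abs(q - qp))
print(lhs, rhs)   # equal: matching atoms level by level is an optimal coupling
```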
We can also use Lemma 2.2 to give a more precise meaning to the vaguely stated monotonicity claim of $\partial _{\mu } f \geqslant 0$ expressed in the paragraph below (1.4), already at the level of the functions $\overline F_N$ .
Lemma 2.4 Let $\zeta ,q$ be parameters as in (2.4), and let $\overline F_N(t,\zeta ,q)$ be as in (2.15). For every $\ell \leqslant \ell ' \in \{0,\ldots , k\}$ , we have
as well as
Remark 2.5 As the proof will make clear, we can make sense of the quantity
even when $\zeta _\ell = \zeta _{\ell +1}$ , by continuity. In view of (2.17) and Lemma 2.3, the monotonicity expressed in (2.19) can be rephrased as the statement that, for every $\ell \leqslant \ell '$ ,
where we understand that the conditioning is with respect to the measure $\mathbb {E} \left \langle \cdot \right \rangle $ .
Proof The main step of the proof is similar to that of [33, Proposition 14.3.2]; see also [3, 4, 18, 27]. We will rewrite the left side of (2.18) as an averaged overlap, taking Lemma 2.2 as a starting point, the subtle point being the identification of the correct measure with respect to which the average is taken. We start by introducing some notation. We let $X_k,X_{k-1},\ldots ,X_0$ be as in Lemma 2.2, and define $X_{-1} := \mathbb {E}_{y_0} \left [ X_0(y_0) \right ]$ . For every $\ell \leqslant m \in \{0,\ldots ,k\}$ , we write
We also write $\mathbb {E}_{y_{\geqslant \ell }}$ to denote the integration of the variables $y_{\ell }, \ldots , y_k$ along the standard Gaussian measure, and we write $\mathbb {E}_y$ as shorthand for $\mathbb {E}_{y_{\geqslant 0}}$ . Within the current proof (and only here), we abuse notation and use $\left \langle \cdot \right \rangle $ with a meaning slightly different from that in (2.11), namely,
Defining $F_N(t,\zeta ,q)$ as in (2.15) (substituting $\overline F_N$ by $F_N$ there), we will show that for every $\ell \in \{0,\ldots ,k\}$ ,
We decompose the proof of (2.20) into two steps, and then conclude in a last step.
Step 1. We show that, for every $\ell ,m \in \{0,\ldots ,k\}$ ,
We prove the result by decreasing induction on $\ell $ . Setting $D_{k+1,k} = 1$ , the result is obvious for $\ell = k+1$ . Let $\ell \in \{1,\ldots ,k\}$ , and assume that the statement (2.21) holds with $\ell $ replaced by $\ell + 1$ . Using (2.10), we obtain (2.21) itself. This proves (2.21) for every $\ell \in \{1,\ldots , k\}$ . The statement for $\ell = 0$ is then immediate (recall that $\zeta _0 = 0$ ). Similarly, for every $\ell ,m \in \{0,\ldots ,k\}$ with $m> \ell $ and $i \in \{1,\ldots , N\}$ , we have
where we write $y_m = (y_{mi})_{1 \leqslant i \leqslant N} \in \mathbb {R}^N$ . For $m \leqslant \ell $ , we clearly have $\partial _{y_{mi}} X_{\ell -1} = 0$ .
Step 2. Notice that, for every $m \in \{0,\ldots ,k-1\}$ ,
We are ultimately interested in understanding $\partial _{q_m} X_{-1}$ , which, in view of (2.21), prompts us to study, for every $i \in \{1,\ldots ,N\}$ ,
where we performed a Gaussian integration by parts to get the equality. (Recall that $D_{1k} = D_{0k}$ since $\zeta _0 = 0$ .) We have
and
We next derive from (2.22) that, for every $\ell ,m \in \{0,\ldots ,k\}$ with $m> \ell $ and $i \in \{1,\ldots , N\}$ ,
It thus follows that
and
with the understanding that $\mathbb {E}_{y_{\geqslant k+1}}$ is the identity map, $D_{k+1,k} = 1$ , and recalling that $\zeta _{k+1} = 1$ . Combining this with (2.24) and (2.25), we thus get that
Using this identity in conjunction with (2.21) and (2.23), we arrive at
This identity is also valid when $m = k$ , as can be checked by following the same argument. We can then write $D_{1k} = D_{1m} D_{m+1,k}$ , and use that $D_{1m}$ does not depend on $y_{m+1},\ldots ,y_k$ , to conclude that
Step 3. We now show that (2.20) implies the lemma. First, it is clear from (2.20)that $\partial _{q_\ell } \overline F_N \geqslant 0$ . Turning to (2.19), we observe that, for each $\ell \in \{0,\ldots , k-1\}$ , we have by Jensen’s inequality that
and therefore,
It thus follows that the sequence
is nondecreasing. By (2.20), this implies (2.19). ▪
We can now state the conjecture generalizing Theorem 1.1.
Conjecture 2.6 Assume (2.1) and that there exists a function $\psi : \mathcal P_*(\mathbb {R}_+) \to \mathbb {R}$ such that for every $\mu \in \mathcal P_*(\mathbb {R}_+)$ , $\overline F_N(0,\mu )$ converges to $\psi (\mu )$ as N tends to infinity. For every $t \geqslant 0$ and $\mu \in \mathcal P_*(\mathbb {R}_+)$ , we have
where $f : \mathbb {R}_+ \times \mathcal P(\mathbb {R}_+) \to \mathbb {R}$ solves the Hamilton–Jacobi equation in (1.2).
Recall that for the purposes of the present paper, we take the Hopf–Lax formula (1.7) as the definition of the solution to (1.2). In the case when $P_N$ is a product measure, this conjecture has now been proved in more recent work, see [24].
3 Convergence of initial condition
We now give two typical situations in which the convergence of $\overline F_N(0,\cdot )$ to some limit is valid. Whenever the limit exists, we write, for every $\mu \in \mathcal P_*(\mathbb {R}_+)$ ,
In agreement with Conjecture 2.6, the function $\psi $ is the initial condition we need to use for the Hamilton–Jacobi equation (1.2).
Proposition 3.1 (Convergence of initial condition)
(1) If the measure $P_N$ is of the product form $P_N = P_1^{\otimes N}$ , with $P_1$ of bounded support, then $\overline F_N(0,\cdot ) = \overline F_1(0,\cdot )$ .
(2) For every $\mu \in \mathcal P(\mathbb {R}_+)$ of compact support and $q \geqslant 0$ such that $\mu ([0,q]) = 1$ , let
(3.2) $$ \begin{align} \psi^\circ (\mu) := \inf \left\{ \int_0^q \frac{1}{b - 2\int_s^{q} \mu([0,r]) \, \mathrm{d} r} \, \mathrm{d} s + \frac 1 2 \left(b - 1 - \log b\right) - q \ : \ b > 2\int_0^q \mu([0,r]) \, \mathrm{d} r \right\}. \end{align} $$

The right side of (3.2) does not depend on the choice of q satisfying $\mu ([0,q]) = 1$ , and the mapping $\mu \mapsto \psi ^\circ (\mu )$ can be extended by continuity to $\mathcal P_1(\mathbb {R}_+)$ . Moreover, if the measure $P_N$ is the uniform measure on the sphere $\{\sigma \in \mathbb {R}^N \ : \ |\sigma |^2 = N\}$ , then for every $\mu \in \mathcal P_1(\mathbb {R}_+)$ , we have

(3.3) $$ \begin{align} \lim_{N \to \infty} \overline F_N(0,\mu) = \psi^\circ(\mu). \end{align} $$
Proof For part (1), we appeal to Lemma 2.2 and observe that, when $t = 0$ , the definition of $X_k$ given there becomes
Notice that the summands indexed by i are independent random variables under $\mathbb {E}_{y_k}$ , and this structure is preserved as we go down the levels, up to the definition of $X_0$ , where we end up with a sum of N terms that are deterministic and all equal to a constant which does not depend on N. This proves the claim (see also [26, (2.60)]).
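As an aside, the recursion of Lemma 2.2 is straightforward to implement in this product-measure setting. The sketch below does so for $N = 1$, $t = 0$, and Ising spins, using Gauss–Hermite quadrature; the normalization conventions (the compensator $-q_k$, the overall minus sign, and $\zeta_0 = 0$) are reconstructed from the surrounding discussion and should be treated as assumptions of the illustration.

```python
import numpy as np

# Gauss-Hermite quadrature for E[g(y)], y a standard Gaussian.
hn, hw = np.polynomial.hermite.hermgauss(60)
y_nodes, y_weights = np.sqrt(2.0) * hn, hw / np.sqrt(np.pi)

def psi_ising(zetas, qs):
    """F_1(0, mu) for P_1 = (delta_{-1} + delta_1)/2 and
    mu = sum_l (zeta_{l+1} - zeta_l) delta_{q_l} (with zeta_{k+1} = 1), via:
      X_k = log cosh(sum_l sqrt(2(q_l - q_{l-1})) y_l) - q_k,   q_{-1} = 0,
      X_{l-1} = (1/zeta_l) log E_{y_l} exp(zeta_l X_l)  (plain mean if zeta_l = 0),
    and finally F_1(0, mu) = -E_{y_0} X_0 = -X_{-1}  (assumed normalization)."""
    k = len(qs) - 1
    dq = np.diff(np.concatenate([[0.0], qs]))          # increments q_l - q_{l-1}
    def X(ell, s):                                     # s: field accumulated so far
        inc = s + np.sqrt(2.0 * dq[ell]) * y_nodes     # add sqrt(2 dq_l) y_l
        if ell == k:
            vals = np.log(np.cosh(inc)) - qs[k]
        else:
            vals = np.array([X(ell + 1, u) for u in inc])
        z = zetas[ell]                                 # zeta_0 = 0 by convention
        if z == 0.0:
            return float(np.sum(y_weights * vals))
        return float(np.log(np.sum(y_weights * np.exp(z * vals))) / z)
    return -X(0, 0.0)

# Example: mu = 0.4 delta_{0.2} + 0.6 delta_{0.8}: zeta = (0, 0.4), q = (0.2, 0.8).
print(psi_ising(np.array([0.0, 0.4]), np.array([0.2, 0.8])))  # nonnegative
```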
For part (2), we first verify that the right side of (3.2) does not depend upon the choice of $q \geqslant 0$ satisfying $\mu ([0,q]) = 1$ . Indeed, for every q satisfying $\mu ([0,q]) = 1$ , $q' \geqslant q$ and $b> 2 \int_0^{q'} \mu ([0,r]) \, \mathrm {d} r$ , we have
We thus obtain that
Taking the infimum over $b> 2 \int_0^{q'} \mu ([0,r]) \, \mathrm {d} r = 2(q'-q) + 2 \int_0^q \mu ([0,r]) \, \mathrm {d} r$ concludes the verification of the fact that the right side of (3.2) does not depend on the choice of q satisfying $\mu ([0,q]) = 1$ .
In order to verify the convergence in (3.3), we start by considering the case of a measure of finite support. In this case, we can follow the arguments leading to [30, Proposition 3.1] and obtain (3.3). The full result then follows by the continuity property of $\overline F_N$ , see Proposition 2.1. ▪
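For concreteness, the variational formula (3.2) is a one-dimensional convex minimization over b, and is easy to evaluate numerically. A sketch (the quadrature and the choice of test measure are illustrative):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def psi_circ(cdf, q, n=4000):
    """Numerical evaluation of (3.2): with G(s) := 2 * integral_s^q mu([0,r]) dr,
    psi0(mu) = inf_{b > G(0)} [ integral_0^q ds/(b - G(s)) + (b - 1 - log b)/2 - q ]."""
    s = np.linspace(0.0, q, n)
    c = cdf(s)                                   # r -> mu([0, r]) on the grid
    ds = s[1] - s[0]
    inc = 0.5 * (c[1:] + c[:-1]) * ds            # trapezoid increments
    G = 2.0 * np.concatenate([np.cumsum(inc[::-1])[::-1], [0.0]])
    def objective(b):
        integrand = 1.0 / (b - G)
        integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * ds)
        return integral + 0.5 * (b - 1.0 - np.log(b)) - q
    res = minimize_scalar(objective, bounds=(G[0] + 1e-8, G[0] + 50.0),
                          method="bounded")
    return res.fun

# mu = delta_0: cdf is constant equal to 1, the infimum is attained at b = 2q + 1,
# and psi_circ(delta_0) = 0, matching F_N(0, delta_0) = 0.
print(psi_circ(lambda s: np.ones_like(s), q=1.0))   # approximately 0
```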
It so happens that, at least in the case when $P_N$ is a product measure, the initial condition $\psi = \lim _{N \to \infty } \overline F_N(0,\cdot )$ can itself be described in terms of a Hamilton–Jacobi equation of second order [28]. We recall this fact in the proposition below for completeness, and so as to clarify the small modifications necessary to match the slightly different presentation explored in the present paper. As far as I understand, the fact that the initial condition admits such a representation seems to be unrelated to the (first-order) Hamilton–Jacobi structure explored in the rest of the paper. Somewhat surprisingly, this representation makes it less clear that $\psi \geqslant 0$ .
Proposition 3.2 (Initial condition as second-order HJ equation)
Assume that $P_N$ is the N-fold tensor product $P_N = P_1^{\otimes N}$ , with $P_1$ a measure of bounded support, and denote $\psi = \lim _{N \to \infty } \overline F_N(0,\cdot ) = \overline F_1(0,\cdot )$ . For every $\mu \in \mathcal P(\mathbb {R}_+)$ with compact support and $q \geqslant 0$ such that $\mu ([0,q]) = 1$ , letting $u_\mu : [0,q]\times \mathbb {R} \to \mathbb {R}$ be the solution of the backward-in-time equation
we have $\psi (\mu ) = - u_\mu (0,0)$ .
Proof We first verify that $u_\mu $ does not depend on the choice of $q \geqslant 0$ satisfying $\mu ([0,q]) = 1$ . More precisely, denoting by $u_{\mu ,q}$ the solution obtained for a given choice of such q, and letting $q' \geqslant q$ , we have that the solutions $u_{\mu ,q}$ and $u_{\mu ,q'}$ coincide on $[0,q] \times \mathbb {R}$ . Indeed, this is a consequence of the fact that, writing
we have
(Verifying this boils down to the observation that the second moment of $\sigma $ under the appropriate Gibbs measure is the sum of the variance and the square of the first moment.) Let $\mu $ be a measure of the form (2.4)–(2.5), and let $(B_t)$ be a standard Brownian motion. We define, for every $x \in \mathbb {R}$ ,
and then recursively, for every $\ell \in \{0,\ldots ,k\}$ and $s \in [q_{\ell -1},q_{\ell })$ ,
Recall that when $\ell = 0$ , we have $\zeta _0 = 0$ and we interpret the right side above as
By induction, we have that for every $\ell \in \{0,\ldots , k\}$ ,
where here $\mathbb {E}_{y_\ell }$ denotes the integration of the variable $y_\ell $ according to the standard scalar Gaussian measure, and for $\ell = 0$ , the right side above is interpreted as
By Lemma 2.2 with $t = 0$ and $N = 1$ , we deduce that $\overline F_1(0,\mu ) = - v_\mu (0,0)$ , and we have already seen in part (1) of Proposition 3.1 that $\overline F_N(0,\cdot ) = \overline F_1(0,\cdot )$ . Moreover, denoting, for every $\ell \in \{0,\ldots ,k\}$ and $s \in [q_{\ell -1},q_\ell )$ ,
we have $\partial _s w_\mu +\partial _x^2w_\mu = 0$ on $[q_{\ell -1}, q_\ell ) \times \mathbb {R}$ , with continuity at the junction times $s \in \{q_0,\ldots ,q_{k}\}$ , and a change of variables then gives that $v_\mu $ solves (3.4). This shows that Proposition 3.2 holds whenever $\mu $ is a measure of finite support. The general case can then be obtained by continuity in $\mu $ (the continuity of $\overline F_1(0,\cdot )$ is a consequence of Proposition 2.1; for that of $\mu \mapsto u_\mu (0,0)$ , one can start by verifying that $\|\partial _x u_\mu \|_{L^\infty }$ is bounded by an upper bound on the support of $P_1$ using the maximum principle). ▪
4 Proof of Theorem 1.1
In this section, we give the proof of Theorem 1.1. Recall that we interpret the solution of (1.2) as being given by theHopf–Lax formula in (1.7). The formula (1.7) simplifies slightly in the case when $\mu = \delta _0$ , and thus the statement of Theorem 1.1 can be reformulated as follows.
Proposition 4.1 (Hopf–Lax representation of Parisi formula)
Assume (1.1), and fix the normalization $\xi (1) = 1$ . For every $t> 0$ , we have
Proof We first focus on the case when $P_N$ is the uniform probability measure on $\{-1,1\}^N$ . We decompose the argument for this case into four steps.
Step 1. In this step, we recast the standard expression for the Parisi formula, borrowed from [26], in the following form:
On the right side, the notation $(t\xi ')(\nu )$ denotes the image of the measure $\nu $ under the mapping $r \mapsto t\xi '(r)$ . Let $\nu \in \mathcal P([0,1])$ be a measure with finite support containing the extremal points $0$ and $1$ . For some $k \in \mathbb {N}$ and parameters
we can represent this measure as
The reason for the perhaps slightly surprising choice of setting $q_{-1} = q_0 = 0$ is that we have chosen here to include a term associated with the root of $\mathcal A$ , at level $\ell = 0$ , in the definition (2.6), while a different choice was taken in [26]. (The motivation for this inconsequential difference is that it then covers more naturally the situation in (2.2) as a particular case. Relatedly, by default, the measures of finite support considered in [26] have an atom at zero.) In order to extract the free energy associated with the Hamiltonian $\sigma \mapsto \sqrt {2t} H_N(\sigma )$ from [26, Theorem 3.1], we need to replace $\xi $ by $2t\xi $ in [26, (3.3)]. With this modification in place, and recalling Lemma 2.2, we see that the quantity denoted $\mathbb {E} X_0$ in [26, (3.11)] can be rewritten as
On the other hand, by (3.1) and Proposition 3.1, we have
We thus deduce that the quantity denoted $\mathbb {E} X_0$ in [26, (3.11)] is
The finite-volume free energy is normalized slightly differently here and in [26]: there is a multiplicative factor of $2^{-N}$ hidden in the fact that $P_N$ is normalized to be a probability measure, and an additional minus sign, on the left side of (4.1). Combining these observations and appealing to [26, Theorem 3.1] and to Proposition 2.1 yields (4.1).
Step 2. We fix $\nu \in \mathcal P([0,1])$ , $t> 0$ , and define $\mu := (t\xi ')(\nu )$ to be the image of $\nu $ under the mapping $r \mapsto t\xi '(r)$ . In this step, we show that
By the definition of $\mu $ and a change of variables, we have
Recall that
Since $\xi '(0) = 0$ , for each $s> 0$ , the supremum above is achieved at some $r> 0$ , and calculating the derivative in r shows that it is in fact achieved at $r = s$ , since $\xi '$ is injective. That is, we have $\xi ^*(\xi '(s)) = s\xi '(s) - \xi (s)$ , and thus (4.2) holds.
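The identity $\xi^*(\xi'(s)) = s\xi'(s) - \xi(s)$ used here is easy to sanity-check numerically; a sketch with the hypothetical mixture $\xi(r) = (r^2 + r^3)/2$ (so that $\xi(1) = 1$), computing $\xi^*$ by brute force over a grid:

```python
import numpy as np

xi  = lambda r: 0.5 * r ** 2 + 0.5 * r ** 3     # hypothetical mixture, xi(1) = 1
dxi = lambda r: r + 1.5 * r ** 2                # derivative xi'
r_grid = np.linspace(0.0, 3.0, 300_001)

def xi_star(s):
    """xi*(s) = sup_{r >= 0} (r s - xi(r)), approximated by brute force on a grid."""
    return np.max(r_grid * s - xi(r_grid))

for s in (0.3, 0.7, 1.0):
    print(xi_star(dxi(s)), s * dxi(s) - xi(s))  # the two columns agree
```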
Step 3. In this step, we show that
We start by rewriting the last term in the supremum on the right side of (4.1), by appealing to the following integration by parts formula: for every $f \in L^1([0,1])$ ,
This formula itself is a consequence of Fubini’s theorem. We notice that
Recalling that we have fixed the normalization $\xi (1) = 1$ yields that
Combining this with (4.1), (4.2), and the fact that $\xi ' : [0,1] \to [0,\xi '(1)]$ is bijective, we obtain (4.3).
Step 4. In order to conclude the proof (in the case of Ising spins), it remains to show that the supremum on the right side of (4.3) does not increase if we remove the restriction that the support of the measure $\mu $ be in $[0,t \xi '(1)]$ . Let $\mu \in \mathcal P_*(\mathbb {R}_+)$ , and let $\widetilde \mu $ denote the image of $\mu $ under the mapping $r \mapsto r \wedge (t\xi '(1))$ , where we write $a \wedge b := \min (a,b)$ . We show that
By Proposition 2.1 and Fubini’s theorem, we have
On the other hand, by the definition of $\widetilde \mu $ , we have
and thus
Recall that $\xi ^*(\xi '(1)) = \xi '(1) - \xi (1)$ . By the definition of the convex dual, we also have that $\xi ^*(s) \geqslant s - \xi (1)$ . Hence, the integral on the right side of (4.7) is bounded from below by
Combining this with (4.6) yields (4.5) and thus completes the proof in the case of Ising spins.
Step 5. We show Proposition 4.1 in the case when $P_N$ is the uniform probability measure on the sphere $\{\sigma \in \mathbb {R}^N \ : \ |\sigma |^2 = N\}$ . Using [30, Corollary 4.1] and arguing as in Step 1, one can check that the formula (4.1) is also valid in this case. The rest of the argument carries over without modification. ▪
5 Finite-dimensional approximations
In this last section, we lightly touch upon the question of giving an intrinsic meaning to the Hamilton–Jacobi equation (1.2). This allows us to give some substance to the connection between this equation and the Hopf–Lax formula in (1.7).
There already exists a rich literature on Hamilton–Jacobi equations in infinite-dimensional Banach spaces, as well as on the Wasserstein space of probability measures or more general metric spaces; see in particular [10–12] for the former and [1, 7–9, 14–17] for the latter. I will refrain from engaging with these works here, and only discuss finite-dimensional approximations of the solution to (1.2).
A simple way to obtain a finite-dimensional approximation of (1.2) is to fix an integer $k \geqslant 1$ and restrict the space of allowed probability measures to those belonging to
A natural discretization of the formula (1.7) is then obtained by setting, for every $t \geqslant 0$ and $\mu \in \mathcal P^{(k)}(\mathbb {R}_+)$ ,
Abusing notation slightly, we also write, for every $x \in \mathbb {R}_+^{k}$ ,
We note the following elementary observation.
Lemma 5.1 For every $t \geqslant 0$ and $x \in \mathbb {R}_+^{k}$ , we have
Proof We introduce the notation
Notice first that the quantities $f^{(k)}(t,x)$ and $\psi ^{(k)}(x)$ are invariant under permutation of the coordinates of x. Hence, it suffices to prove the relation (5.1) under the additional assumption that $x \in \mathbb {R}^{k\uparrow }_+$ . It is clear that equality holds if, on the right side, we take the supremum over $y \in \mathbb {R}^{k\uparrow }_+$ only. We now verify that other orderings of a given vector y yield a larger value for the sum on the right side of (5.1). Indeed, fix $x \in \mathbb {R}^{k\uparrow }_+$ , $y \in \mathbb {R}^k$ , and assume that there exist $i < j \in \{1,\ldots ,k\}$ such that $y_i \geqslant y_j$ . By the convexity of $\xi ^*$ and the fact that $x_i \leqslant x_j$ , the function $u \mapsto \xi ^* \left ( \frac {u - x_i}{t} \right ) - \xi ^* \left ( \frac {u - x_j}{t} \right )$ is increasing. In particular,
and therefore
That is, whenever $i < j$ and $y_i \geqslant y_j$ , replacing y by the vector with the coordinates $y_i$ and $y_j$ interchanged can only reduce (or keep constant) the value of the quantity
By induction, this implies that replacing the vector y by the increasingly ordered sequence of coordinates of y can only reduce (or keep constant) the quantity above. ▪
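The ordering property established in Lemma 5.1 can also be checked numerically by brute force. The sketch below assumes that (5.1) reads $f^{(k)}(t,x) = \sup_{y} \big[ \psi^{(k)}(y) - \frac{t}{k} \sum_{\ell} \xi^*\big(\frac{y_\ell - x_\ell}{t}\big) \big]$, which is the form suggested by (1.7) for empirical measures; the function playing the role of $\psi^{(k)}$ is a hypothetical symmetric stand-in.

```python
import numpy as np
from itertools import product

xi_star = lambda s: np.maximum(s, 0.0) ** 2 / 4.0   # dual of xi(r) = r^2 on R_+
psi_k = lambda y: float(np.mean(np.sqrt(1.0 + y)))  # hypothetical symmetric psi^(k)

def f_k(t, x, grid):
    """Brute-force evaluation of sup_y [psi_k(y) - (t/k) sum_l xi*((y_l - x_l)/t)]
    over a product grid, both over all y and restricted to nondecreasing y;
    Lemma 5.1 predicts that the two values coincide."""
    k = len(x)
    best_all = best_sorted = -np.inf
    for y in product(grid, repeat=k):
        y = np.asarray(y)
        val = psi_k(y) - (t / k) * float(np.sum(xi_star((y - x) / t)))
        best_all = max(best_all, val)
        if np.all(np.diff(y) >= 0.0):
            best_sorted = max(best_sorted, val)
    return best_all, best_sorted

x = np.array([0.2, 0.5, 1.1])                    # nondecreasing, as in the proof
print(f_k(1.0, x, np.linspace(0.0, 3.0, 16)))    # the two values agree
```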
The convex dual of the mapping
is
It follows from Proposition 2.1 that $\psi ^{(k)}$ is Lipschitz continuous, and from this, one can show that $f^{(k)}$ is Lipschitz continuous in x and in t. In particular, the function $f^{(k)}$ is differentiable almost everywhere in $[0,\infty )\times \mathbb {R}_+^k$ . Following classical arguments, see e.g., [5] or [13, Theorem 3.3.5], we thus deduce from (5.1) that at every $(t,x) \in (0,\infty )\times (0,\infty )^k$ at which $f^{(k)}$ is differentiable, we have
This identification also uses that $\partial _{x_\ell } f^{(k)}(t,x) \geqslant 0$ , see (5.2). The latter property can be obtained as a consequence of the fact that $\partial _{x_\ell } \psi ^{(k)} \geqslant 0$ , which itself follows from Lemma 2.4.
We can now verify that the equation in (5.3) is formally consistent with a finite-dimensional interpretation of the Hamilton–Jacobi equation (1.2). In view of (1.8), and assuming “smoothness” of the function f, we must have, for every $\mu = k^{-1} \sum _{\ell = 1}^k \delta _{x_\ell } \in \mathcal P^{(k)}(\mathbb {R}_+)$ and $\ell \in \{1,\ldots , k\}$ that
and thus
We have thus obtained a formal relation between the Hamilton–Jacobi equation in (1.2) and that in (5.3), which itself can be rigorously connected with the Hopf–Lax formula (5.1)—see [21] and [22] on how to handle the boundary condition on $\partial (\mathbb {R}_+^k)$ in the contexts of viscosity solutions and weak solutions, respectively.
Acknowledgment
I would like to thank Dmitry Panchenko for useful comments, in particular for pointing out an error in an earlier argument for the validity of (2.18).