
The Parisi formula is a Hamilton–Jacobi equation in Wasserstein space

Published online by Cambridge University Press:  28 January 2021

Jean-Christophe Mourrat*
Affiliation:
DMA, Ecole normale supérieure, CNRS, PSL University, Paris, France; Courant Institute of Mathematical Sciences, New York University, New York, NY, USA

Abstract

The Parisi formula is a self-contained description of the infinite-volume limit of the free energy of mean-field spin glass models. We show that this quantity can be recast as the solution of a Hamilton–Jacobi equation in the Wasserstein space of probability measures on the positive half-line.

Type
Article
Copyright
© Canadian Mathematical Society 2021

1 Introduction

Let $(\beta _p)_{p \geqslant 2}$ be a sequence of non-negative numbers, which for simplicity we assume to contain only a finite number of nonzero elements, and, for every $r \in \mathbb {R}$ , let $\xi (r) := \sum _{p \geqslant 2} \beta _p r^p$ . For every integer $N \geqslant 1$ , let $P_N$ denote a probability measure on $\mathbb {R}^N$ , which we often (but not always) assume to be such that

(1.1) $$ \begin{align} \left\{\!\!\!\begin{array}{@{}l@{}l} & \text{either}\ P_N\ \text{is the uniform measure on}\ \{-1,1\}^N\ \text{for every}\ N \geqslant 1, \\ & \text{or}\ P_N\ \text{is the uniform measure on}\ \{\sigma \in \mathbb{R}^N \ : \ |\sigma|^2 = N\}\ \text{for every}\ N \geqslant 1. \end{array} \right. \end{align} $$

We aim to study Gibbs measures built from the probability measure $P_N$ using as energy function the centered Gaussian field $(H_N(\sigma ))_{\sigma \in \mathbb {R}^N}$ with covariance

$$ \begin{align*} \mathbb{E} \left[ H_N(\sigma) \, H_N(\tau) \right] = N \xi \left( \frac{\sigma \cdot \tau}{N} \right) \qquad (\sigma,\tau \in \mathbb{R}^N). \end{align*} $$

This Gaussian vector can be built explicitly using independent linear combinations of quantities of the form $\sum _{1 \leqslant i_1,\ldots ,i_p \leqslant N} J_{i_1,\ldots ,i_p} \sigma _{i_1} \cdots \, \sigma _{i_p}$ , where $(J_{i_1,\ldots ,i_p})$ are independent standard Gaussian random variables. The Gibbs measures thus obtained are often called mixed p-spin models, possibly with the qualifiers “spherical” or “with Ising spins” when $P_N$ is the uniform measure on the sphere in $\mathbb {R}^N$ or on $\{-1,1\}^N$ , respectively. The Sherrington–Kirkpatrick model corresponds to the case of Ising spins and $\xi (r) = \beta r^2$ . The Parisi formula is a self-contained description of the limit free energy

$$ \begin{align*} \lim_{N \to \infty} \frac 1 N \mathbb{E} \log \int\!\! \exp \left( H_N(\sigma) \right) \, \mathrm{d} P_N(\sigma). \end{align*} $$

The identification of this limit was put on a rigorous mathematical footing in [Reference Guerra19, Reference Panchenko26, 31–33], after the fundamental insights reviewed in [Reference Mézard, Parisi and Virasoro20].
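To make the explicit construction of $H_N$ described above concrete, here is a minimal numerical sketch (the function name and the dictionary encoding of $(\beta_p)$ are my own; the $O(N^p)$ tensor contraction is purely illustrative and not how one would compute at scale):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_hamiltonian(sigmas, betas):
    """Sample the mixed p-spin field H_N jointly at several configurations.

    sigmas: array of shape (m, N), the configurations sigma^1, ..., sigma^m.
    betas:  dict {p: beta_p} with finitely many nonzero coefficients.
    Uses H_N(sigma) = sum_p sqrt(beta_p) N^((1-p)/2)
                      sum_{i_1..i_p} J_{i_1...i_p} sigma_{i_1} ... sigma_{i_p},
    whose covariance is N * xi(sigma . tau / N) with xi(r) = sum_p beta_p r^p.
    """
    m, N = sigmas.shape
    H = np.zeros(m)
    for p, beta in betas.items():
        J = rng.standard_normal((N,) * p)  # one disorder tensor per p, shared
        for a in range(m):
            T = sigmas[a]
            for _ in range(p - 1):         # build the rank-p tensor sigma^{(x)p}
                T = np.multiply.outer(T, sigmas[a])
            H[a] += np.sqrt(beta) * N ** ((1 - p) / 2) * np.sum(J * T)
    return H

# e.g. the Sherrington-Kirkpatrick case xi(r) = beta * r^2:
sigma = rng.choice([-1.0, 1.0], size=(2, 10))
print(sample_hamiltonian(sigma, {2: 1.0}))
```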

The main goal of the present paper is to propose a new way to think about this result. This new point of view reveals a natural connection with the solution of a Hamilton–Jacobi equation posed in the space of probability measures on the positive half-line. For every metric space E, we denote by $\mathcal P(E)$ the set of Borel probability measures on E, and by $\delta _x$ the Dirac measure at $x \in E$ . We also set, with $\xi ^*$ denoting the convex dual of $\xi $ defined precisely below in (1.4),

$$ \begin{align*} \mathcal P_*(\mathbb{R}_+) := \left\{ \mu \in \mathcal P(\mathbb{R}_+) \ : \ \int_{\mathbb{R}_+} \xi^*(s) \, \mathrm{d} \mu(s) < \infty \right\} . \end{align*} $$

Theorem 1.1 (Hamilton–Jacobi representation of Parisi formula)

Assume (1.1), fix the normalization $\xi (1) = 1$ , and let $(t,\mu ) \mapsto f(t,\mu ) : \mathbb {R}_+ \times \mathcal P_*(\mathbb {R}_+) \to \mathbb {R}$ be the solution of the Hamilton–Jacobi equation

(1.2) $$ \begin{align} \left\{ \begin{aligned} & \partial_t f - \int\!\! \xi(\partial_\mu f) \, \mathrm{d} \mu = 0 & \quad \text{on } \mathbb{R}_+ \times \mathcal P_*(\mathbb{R}_+), \\ & f(0,\cdot) = \psi&\quad \text{on } \mathcal P_*(\mathbb{R}_+), \end{aligned} \right. \end{align} $$

where the function $\psi $ is described below in (3.1) and Proposition 3.1. For every $t \geqslant 0$ ,

(1.3) $$ \begin{align} \lim_{N \to \infty} -\frac 1 N\mathbb{E} \log \int\!\! \exp \left( \sqrt{2t} H_N(\sigma) - N t\right) \, \mathrm{d} P_N(\sigma) = f(t,\delta_0). \end{align} $$

Interestingly, the evolution equation in (1.2) depends on the correlation function $\xi $ but not on the measures $P_N$ , while, as will be seen below, the initial condition $\psi $ depends on the measures $P_N$ but not on $\xi $ . We postpone a precise discussion of the meaning of the equation (1.2), and start by explaining the background and motivations for looking for such a representation.

Recently, a new rigorous approach to the identification of limit free energies of mean-field disordered systems was proposed in [Reference Mourrat21, Reference Mourrat22], inspired by [Reference Barra, Del Ferraro and Tantari3, Reference Barra, Di Biasio and Guerra4, Reference Guerra18]. The idea proposed there is to place the main emphasis on the fact that after “enriching” the problem, we can identify the limit free energy as the solution of a Hamilton–Jacobi equation. At least for the problems considered there, one can show that finite-volume free energies already satisfy the same Hamilton–Jacobi equation, albeit only approximately. In particular, the approach allows for a convenient breakdown of the proof into two main steps: a first, more “probabilistic” part, which aims at showing that finite-volume free energies indeed satisfy an approximate Hamilton–Jacobi equation; and a second, more “analytic” part, which takes this information as input and concludes that the limit must solve the equation exactly.

The problems studied in [Reference Mourrat21, Reference Mourrat22] relate to statistical inference. They possess a particular feature that enforces “replica symmetry,” and this allows for a complete resolution of the problem by adding only a finite number of extra variables. As is well known, this is not the case for mean-field spin glasses such as those considered here. The relevant Hamilton–Jacobi equation, if any, must therefore be set in an infinite-dimensional space.

The identity of this Hamilton–Jacobi equation is revealed by Theorem 1.1. The aim of the present paper is to demonstrate the presence of this structure, and we will therefore simply borrow formulas from the literature for the limit on the left side of (1.3), and check that the expressions found there agree with the right side of (1.3). Hence, I want to stress that Theorem 1.1 is a rephrasing of known results.

However, I believe that Theorem 1.1 can be useful in furthering our understanding by providing a new way for us to think about these results; see also [Reference Thurston34] for general considerations on the relevance of such endeavors. In the long run, I indeed hope that this new interpretation of the Parisi formula will suggest a new and possibly more robust and transparent approach to the identification of the limit free energy of disordered mean-field systems. For this purpose, it will be important to rely on stability estimates for the Hamilton–Jacobi equation (1.2) (that is, estimates asserting that a function satisfying the equation approximately must be close to the true solution). This should leverage powerful approaches to the well-posedness of Hamilton–Jacobi equations such as the notions of viscosity or weak solutions, as exemplified in the finite-dimensional setting in [Reference Mourrat21] and [Reference Mourrat22], respectively. Since the purpose of the present paper is only to demonstrate the presence of the Hamilton–Jacobi structure, I will refrain from exploring this direction here. Since the completion of this work, partial results on bipartite models have been obtained in [Reference Mourrat23] using the idea uncovered here. In [Reference Mourrat23, Section 6], it is also argued that more standard variational approaches do not seem to be applicable for such models.

For the purposes of this paper, the solution of (1.2) will be defined through a Hopf–Lax formula, given in (1.7) below. This Hopf–Lax formulation features an optimal transport problem involving the cost function $(x,y) \mapsto \xi ^*(x-y)$ , where $\xi ^*$ is the convex dual of $\xi $ defined by

(1.4) $$ \begin{align} \xi^*(s) := \mathop{\mathrm{sup}}\limits_{r \geqslant 0} (rs - \xi(r)). \end{align} $$

Notice that the function $\xi $ is convex on $\mathbb {R}_+$ , and the precise way to interpret $\xi ^*$ is as the dual of the convex and lower semicontinuous function on $\mathbb {R}$ which coincides with $\xi $ on $\mathbb {R}_+$ and is $+\infty $ otherwise. (The function f of interest to us satisfies a monotonicity property which can be interpreted as $\partial _{\mu } f \geqslant 0$ in a weak sense, and thus modifying $\xi $ on $\mathbb {R} \setminus \mathbb {R}_+$ is irrelevant to the interpretation of (1.2); see also [Reference Mourrat22] for a more precise discussion of this point in finite dimension, as well as Lemma 2.4 below.)
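As a concrete illustration of (1.4) (a standard computation, not spelled out in the text): in the Sherrington–Kirkpatrick case $\xi(r) = \beta r^2$ , the supremum is attained at $r = s/(2\beta)$ when $s \geqslant 0$ and at $r = 0$ otherwise, so that

$$ \begin{align*} \xi^*(s) = \mathop{\mathrm{sup}}\limits_{r \geqslant 0} \left( rs - \beta r^2 \right) = \begin{cases} \dfrac{s^2}{4\beta} & \text{if } s \geqslant 0, \\ 0 & \text{if } s < 0. \end{cases} \end{align*} $$

In particular, $\xi^*$ vanishes on the negative half-line, in line with the fact that the values of $\xi$ on $\mathbb{R} \setminus \mathbb{R}_+$ are irrelevant here.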

Optimal transport problems for measures on the real line are in some sense trivial, in that the couplings between pairs of measures can be realized jointly over all measures, and do not depend on the convex function $\xi ^*$ entering the definition of the cost function. Denoting, for every $\mu \in \mathcal P(\mathbb {R}_+)$ and $r \in [0,1]$ ,

(1.5) $$ \begin{align} F^{-1}_\mu(r) := \inf \left\{ s \geqslant 0 \ : \ \mu \left( [0,s] \right) \geqslant r \right\} , \end{align} $$

and letting U be a uniform random variable over $[0,1]$ under $\mathbb {P}$ , we set

(1.6) $$ \begin{align} X_\mu := F^{-1}_\mu(U). \end{align} $$

It is classical to verify that the law of $X_\mu $ under $\mathbb {P}$ is $\mu $ , and that for any two measures $\mu ,\nu \in \mathcal P_*(\mathbb {R}_+)$ , the law of the pair $(X_\mu , X_\nu )$ is an optimal transport plan for the cost function $(x,y) \mapsto \xi ^*(x-y)$ (see e.g., [Reference Villani35, Theorem 2.18 and Remark 2.19(ii)] or [Reference Ambrosio, Gigli and Savaré2, Theorem 6.0.2]). As discussed above, for the purposes of this paper, we define the solution of (1.2) to be given by the Hopf–Lax formula

(1.7) $$ \begin{align} f(t,\mu) := \mathop{\mathrm{sup}}\limits_{\nu \in \mathcal P_*(\mathbb{R}_+)} \left( \psi(\nu) - t\, \mathbb{E} \left[ \xi^* \left( \frac{ X_{\nu} - X_\mu }{t} \right) \right] \right).\end{align} $$
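In numerical terms, the coupling in (1.5)–(1.6) is just inverse-CDF sampling from a shared uniform variable; a minimal sketch for measures of finite support (the helper name and the example measures are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def quantile_sample(atoms, probs, u):
    """Evaluate X_mu = F_mu^{-1}(u) as in (1.5)-(1.6).

    atoms: sorted atoms of mu = sum_i probs[i] * delta_{atoms[i]} on R_+.
    u:     array of samples of the uniform variable U on [0, 1].
    """
    cdf = np.cumsum(probs)
    # F^{-1}(u) = inf{s : mu([0, s]) >= u}: first index where cdf >= u
    return np.asarray(atoms)[np.searchsorted(cdf, u, side="left")]

u = rng.uniform(size=100_000)          # the SAME U for every measure
x_mu = quantile_sample([0.0, 0.5, 2.0], [0.5, 0.3, 0.2], u)
x_nu = quantile_sample([0.2, 1.0], [0.4, 0.6], u)
# (x_mu, x_nu) is then the monotone (optimal) coupling of mu and nu
print(np.mean(x_mu == 0.5))            # ~ 0.3, the mass of the atom at 0.5
```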

Although this will not be used here, one can give a brief and nonrigorous idea of the definition of the derivative $\partial _\mu $ formally appearing in (1.2) in the case when it applies to a sufficiently “smooth” function $g : \mathcal P_*(\mathbb {R}_+) \to \mathbb {R}$ : for each $\mu \in \mathcal P_*(\mathbb {R}_+)$ , we want $\partial _\mu g(\mu ,\cdot )$ to satisfy $\int\!\! \xi (\partial _\mu g(\mu ,\cdot )) \, \mathrm {d} \mu < \infty $ , and be such that, as $\nu \to \mu $ in $\mathcal P_*(\mathbb {R}_+)$ ,

(1.8) $$ \begin{align} g(\nu) = g(\mu) + \mathbb{E} \left[ \partial_\mu g(\mu,X_\mu)(X_\nu - X_\mu) \right] + o \left( \left\| X_\nu - X_\mu \right\|_{L^*} \right) , \end{align} $$

where $\|Y\|_{L^*}$ denotes the $\xi ^*$ -Orlicz norm of a random variable Y, see [Reference Rao and Ren29],

$$ \begin{align*} \|Y\|_{L^*} := \inf \left\{ t> 0 \ : \ \mathbb{E} \left[ \xi^*(t^{-1} Y) \right] \leqslant \xi^*(1) \right\} . \end{align*} $$

From this informal definition, one can work out finite-dimensional approximations of the equation (1.2) by imposing, for instance, that only measures of the form $k^{-1} \sum _{\ell = 1}^k \delta _{x_\ell }$ are “permitted.” This brings us within the realm of finite-dimensional Hamilton–Jacobi equations and allows one, for instance, to verify the correspondence between the equation (1.2) and the Hopf–Lax formula (1.7) at the level of these approximations.
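Concretely, a brute-force version of this finite-dimensional approximation of (1.7) can be written as follows (a sketch under my own conventions: $\psi$ and $\xi^*$ are supplied as callables, $\mu$ and the candidate measures $\nu$ are uniform over $k$ atoms, and $\nu$ ranges over a finite grid):

```python
import numpy as np
from itertools import product

def hopf_lax_discrete(psi, xi_star, mu_atoms, t, grid):
    """Brute-force Hopf-Lax formula (1.7) over nu = (1/k) sum_l delta_{x_l}.

    psi:      callable on sorted atom vectors, approximating psi(nu).
    xi_star:  vectorized callable, the convex dual of xi.
    mu_atoms: the k atoms of mu (also uniform over its atoms).
    grid:     finite set of candidate atom locations; cost is |grid|^k.
    """
    k = len(mu_atoms)
    mu_q = np.sort(mu_atoms)            # quantiles of mu at levels l/k
    best = -np.inf
    for nu in product(grid, repeat=k):
        nu_q = np.sort(nu)              # monotone coupling: match order statistics
        cost = t * np.mean(xi_star((nu_q - mu_q) / t))
        best = max(best, psi(nu_q) - cost)
    return best
```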

We will in fact consider a richer family of finite-volume free energies than what appears on the left side of (1.3), parametrized by $(t,\mu ) \in \mathbb {R}_+ \times \mathcal P_*(\mathbb {R}_+)$ , and I expect that these free energies converge to $f(t,\mu )$ as N tends to infinity, where f is the solution of (1.2). In fact, I expect that a similar result holds for a much larger class of measures $P_N$ than those covered by the assumption of (1.1). A precise conjecture to this effect is presented in Section 2. The identification of the initial condition $\psi $ appearing in (1.2) is then discussed in Section 3. The proof of Theorem 1.1 is given in Section 4. Finally, finite-dimensional approximations of (1.2) are briefly explored in Section 5.

2 Conjecture for a general reference measure

The main goal of this section is to state a conjecture generalizing Theorem 1.1 to a wider class of measures $P_N$ than those appearing in (1.1). For simplicity, we retain the assumption that

(2.1) $$ \begin{align} \text{the measure}\ P_N\ \text{is supported in the ball}\ \{\sigma \in \mathbb{R}^N \ : \ |\sigma|^2 \leqslant N\}. \end{align} $$

If there exists some $R \in (0,\infty )$ such that for every N, the measure $P_N$ is supported in the ball $\{\sigma \in \mathbb {R}^N \ : \ |\sigma |^2 \leqslant R N\}$ , then one can without loss of generality reduce to the case in (2.1) by rescaling the function $\xi $ .

In order to gain some familiarity with Theorem 1.1 and its conjectured generalization, we start by illustrating the driving idea in simpler settings. Possibly the simplest demonstration of the idea of identifying limit free energies of mean-field systems as solutions of Hamilton–Jacobi equations concerns the analysis of the Curie–Weiss model, see e.g., [Reference Mourrat21, Section 1] (earlier references include [Reference Brankov and Zagrebnov6, Reference Newman25]). We give here another simple illustration for spin glasses in the high-temperature regime, which is similar to discussions in [Reference Guerra18]. For every $t, h \geqslant 0$ , we consider the “enriched” free energy

(2.2) $$ \begin{align} \overline F_N^\circ(t,h) := -\frac 1 N\mathbb{E} \log \int\!\! \exp \big(& \sqrt{2t} H_N(\sigma) - Nt \xi \left( N^{-1} |\sigma|^2 \right) \nonumber\\ & + \sqrt{2h} z \cdot \sigma -h|\sigma|^2\big) \, \mathrm{d} P_N(\sigma), \end{align} $$

where $z = (z_1,\ldots ,z_N)$ is a vector of independent standard Gaussians, independent of $H_N$ , and where $|\sigma |^2 = \sum _{i = 1}^N \sigma _i^2$ . Notice that under the assumptions of Theorem 1.1, we have $|\sigma |^2 = N$ and $\xi (N^{-1} |\sigma |^2) = 1$ . The terms $-Nt \xi \left ( N^{-1} |\sigma |^2 \right )$ and $-h|\sigma |^2$ inside the exponential in (2.2) are natural since they ensure that

$$ \begin{align*} \mathbb{E} \left[ \exp \left( \sqrt{2t} H_N(\sigma) - Nt \xi \left( N^{-1} |\sigma|^2 \right) + \sqrt{2h} z \cdot \sigma - h|\sigma|^2\right) \right] = 1. \end{align*} $$

(Observing that $H_N(\sigma )$ and $z\cdot \sigma $ are independent centered Gaussians of variance $N \xi \left ( N^{-1} |\sigma |^2 \right )$ and $|\sigma |^2$ respectively, this follows either by recognizing an exponential martingale, or by differentiating in t and h and using Gaussian integration by parts.) In the terminology of statistical physics, one may say that we have normalized the Hamiltonian so that the annealed free energy is always zero. The minus sign in front of the expression on the right side of (2.2) is also convenient since, by Jensen’s inequality, we thus have $\overline F_N^\circ \geqslant 0$ . One can check that

(2.3) $$ \begin{align} \partial_t \overline F_N^\circ - \xi \left( \partial_h \overline F_N^\circ \right) = \mathbb{E} \left\langle \xi \left( \frac{\sigma \cdot \sigma'}{N} \right) \right\rangle - \xi \left(\mathbb{E} \left\langle \frac{\sigma \cdot \sigma'}{N} \right\rangle \right) . \end{align} $$
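For the reader's convenience, here is the computation behind (2.3), in sketch form. Differentiating in t and h and integrating the Gaussian fields by parts (with $\sigma'$ an independent replica), one finds

$$ \begin{align*} \partial_t \overline F_N^\circ = \mathbb{E} \left\langle \xi \left( \frac{\sigma \cdot \sigma'}{N} \right) \right\rangle \qquad \text{and} \qquad \partial_h \overline F_N^\circ = \mathbb{E} \left\langle \frac{\sigma \cdot \sigma'}{N} \right\rangle : \end{align*} $$

the “diagonal” terms produced by the integration by parts cancel exactly against the derivatives of the normalizing terms $-Nt \xi \left( N^{-1}|\sigma|^2 \right)$ and $-h|\sigma|^2$ , leaving only the replica terms above. Subtracting $\xi$ applied to the second identity from the first yields (2.3).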

In the case when $\xi $ is convex over $\mathbb {R}$ , the right side of (2.3) is non-negative, and thus we already see that $\overline F_N^\circ $ is a supersolution of a simple Hamilton–Jacobi equation. Moreover, one can expect in many settings that the initial condition $\overline F_N^\circ (0,h)$ converges as N tends to infinity; for instance, when $P_N$ is the N-fold product measure $P_N = P_1^{\otimes N}$ , we have

$$ \begin{align*} \overline F_N^\circ(0,h) = \overline F_1^\circ(0,h) = -\mathbb{E} \log \int_{\mathbb{R}} \exp \left( \sqrt{2h} \, z_1 \sigma - h \sigma^2 \right) \, \mathrm{d} P_1(\sigma),\end{align*} $$

where in this expression, the variable $z_1$ is a scalar standard Gaussian. Finally, if we expect the overlap $\sigma \cdot \sigma '$ to be concentrated around its expectation, which should be correct in a high-temperature region (that is, for t sufficiently small), then it should be that $\overline F_N^\circ $ converges to the solution of the equation $\partial _t f - \xi (\partial _h f) = 0$ .
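For orientation, the solution of this one-variable equation is given, at least formally, by the classical Hopf–Lax formula, the one-dimensional analogue of (1.7):

$$ \begin{align*} f(t,h) = \mathop{\mathrm{sup}}\limits_{h' \geqslant 0} \left( \psi(h') - t \, \xi^* \left( \frac{h'-h}{t} \right) \right), \qquad \text{where } \psi(h) := \lim_{N\to\infty} \overline F_N^\circ(0,h). \end{align*} $$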

However, as is well-known, the overlap $\sigma \cdot \sigma '$ is in fact not always concentrated around its mean value, and a more refined approach is necessary. In order to proceed, as in [Reference Guerra19, Reference Panchenko26, 31–33], we need to compare the system of interest with a much more refined “linear” system than $\sqrt {2h} z \cdot \sigma $ . We parametrize the more refined systems by a measure $\mu \in \mathcal P(\mathbb {R}_+)$ (and not $\mu \in \mathcal P([0,1])$ as experts may expect). It is much more convenient to describe this more refined system in the case when $\mu $ is a measure of finite support: we assume that for some integer $k \geqslant 0$ , there exist

(2.4) $$ \begin{align} 0 = \zeta_{0} < \zeta_1 < \cdots < \zeta_{k} < \zeta_{k+1} = 1, \qquad 0 = q_{-1} \leqslant q_0 < q_1 < \cdots < q_k < q_{k+1} = \infty \end{align} $$

such that

(2.5) $$ \begin{align} \mu = \sum_{\ell = 0}^{k} (\zeta_{\ell+1} - \zeta_{\ell}) \delta_{q_\ell}. \end{align} $$

We represent the rooted tree with (countably) infinite degree and depth k by

$$ \begin{align*} \mathcal A := \mathbb{N}^{0} \cup \mathbb{N} \cup \mathbb{N}^2 \cup \cdots \cup \mathbb{N}^k, \end{align*} $$

where $\mathbb {N}^{0} = \{\emptyset \}$ , and $\emptyset $ represents the root of the tree. For every $\alpha \in \mathbb {N}^\ell $ , we write $|\alpha | := \ell $ to denote the depth of the vertex $\alpha $ in the tree $\mathcal A$ . For every leaf $\alpha = (n_1,\ldots ,n_k)\in \mathbb {N}^k$ and $\ell \in \{0,\ldots , k\}$ , we write

$$ \begin{align*} \alpha_{| \ell} := (n_1,\ldots, n_\ell), \end{align*} $$

with the understanding that $\alpha _{| 0} = \emptyset $ . We also give ourselves a family $(z_{\alpha ,i})_{\alpha \in \mathcal A, 1 \leqslant i \leqslant N}$ of independent standard Gaussians, independent of $H_N$ , and we let $(v_\alpha )_{\alpha \in \mathbb {N}^k}$ denote a Poisson–Dirichlet cascade with weights given by the family $(\zeta _\ell )_{1 \leqslant \ell \leqslant k}$ . We refer to [Reference Panchenko26, (2.46)] for a precise definition, and briefly mention here the following three points. First, in the case $k = 0$ , we simply set $v_{\emptyset } = 1$ . Second, in the case $k = 1$ , the weights $(v_\alpha )_{\alpha \in \mathbb {N}}$ are obtained by normalizing a Poisson point process on $(0,\infty )$ with intensity measure $\zeta _1 x^{-1-\zeta _1} \, \mathrm {d} x$ so that $\sum _{\alpha } v_\alpha = 1$ . Third, for general $k \geqslant 1$ , the progeny of each nonleaf vertex at level $\ell \in \{0,\ldots , k-1\}$ is decorated with the values of an independent Poisson point process of intensity measure $\zeta _{\ell +1} x^{-1-\zeta _{\ell +1}} \, \mathrm {d} x$ , then the weight of a given leaf $\alpha \in \mathbb {N}^k$ is calculated by taking the product of the “decorations” attached to each parent vertex, including the leaf vertex itself (but excluding the root), and finally, these weights over leaves are normalized so that their total sum is $1$ . We take this Poisson–Dirichlet cascade $(v_\alpha )_{\alpha \in \mathbb {N}^k}$ to be independent of $H_N$ and of the random variables $(z_\alpha )_{\alpha \in \mathcal A}$ . For every $\sigma \in \mathbb {R}^N$ and $\alpha \in \mathbb {N}^k$ , we set

(2.6) $$ \begin{align} H_N^{\prime}(\sigma,\alpha) := \sum_{\ell = 0}^k \left(2q_{\ell} - 2q_{\ell-1}\right)^{\frac 1 2} z_{\alpha_{|\ell}} \cdot \sigma, \end{align} $$

where we write $z_{\alpha _{|\ell }} \cdot \sigma = \sum _{i = 1}^N z_{\alpha _{|\ell },i} \, \sigma _i$ . The random variables $(H_N^{\prime }(\sigma ,\alpha ))_{\sigma \in {\mathbb {R}^N}, \alpha \in \mathbb {N}^k}$ form a Gaussian family which is independent of $(H_N(\sigma ))_{\sigma \in \mathbb {R}^N}$ and has covariance

$$ \begin{align*} \mathbb{E} \left[ H_N^{\prime}(\sigma,\alpha) \, H_N^{\prime}(\tau,\beta) \right] = 2q_{\alpha \wedge \beta} \ \sigma \cdot \tau \qquad (\sigma,\tau \in \mathbb{R}^N, \ \alpha, \beta\in \mathbb{N}^k),\end{align*} $$

where we write, for every $\alpha , \beta \in \mathbb {N}^k$ ,

$$ \begin{align*} \alpha \wedge \beta := \mathop{\mathrm{sup}}\limits \{\ell \leqslant k \ : \ \alpha_{|\ell} = \beta_{|\ell} \}. \end{align*} $$
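The three-point description of the cascade above translates directly into a simulation; here is a minimal sketch using the standard realization of a Poisson point process with intensity $\zeta x^{-1-\zeta} \, \mathrm{d} x$ through arrival times $\Gamma_i^{-1/\zeta}$ (the truncation to n children per vertex is a simulation artifact, since the true trees have infinite degree; all names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def pd_cascade(zetas, n=30):
    """Truncated Poisson-Dirichlet cascade with weights zeta_1 < ... < zeta_k.

    Each vertex at level l has its n children decorated by the n largest atoms
    of a PPP with intensity zeta_{l+1} x^{-1-zeta_{l+1}} dx, realized as
    Gamma_i^(-1/zeta_{l+1}) with Gamma_i standard Poisson arrival times.
    Returns {leaf alpha in {0..n-1}^k : v_alpha}, normalized to sum to 1.
    Memory grows like n^k, so keep k and n small.
    """
    weights = {(): 1.0}                 # root carries no decoration
    for zeta in zetas:                  # zeta = zeta_{l+1} at level l
        new = {}
        for alpha, w in weights.items():
            gammas = np.cumsum(rng.exponential(size=n))
            for i, atom in enumerate(gammas ** (-1.0 / zeta)):
                new[alpha + (i,)] = w * atom
        weights = new
    total = sum(weights.values())
    return {alpha: w / total for alpha, w in weights.items()}

v = pd_cascade((0.4, 0.7), n=30)
print(sum(v.values()), max(v.values()))  # 1.0 and the largest leaf weight
```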

We define the “enriched” free energy as

(2.7) $$ \begin{align} &F_N(t,\mu) := -\frac 1 N \log \int\!\! \sum_{\alpha \in \mathbb{N}^k} \exp \Big( \sqrt{2t} H_N(\sigma) - N t\xi \left( N^{-1} |\sigma|^2 \right) \nonumber\\&\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad+ H_N^{\prime}(\sigma,\alpha) - q_k|\sigma|^2 \Big) \,v_\alpha \, \mathrm{d} P_N(\sigma), \end{align} $$

and $\overline F_N(t,\mu ) := \mathbb {E} \left [ F_N(t,\mu ) \right ]$ . As for (2.2), we have normalized this expression so that, by Jensen’s inequality, we have $\overline F_N \geqslant 0$ . We first notice that this quantity can be extended to all $\mu \in \mathcal P_1(\mathbb {R}_+)$ by continuity.

Proposition 2.1 (Continuity and extension of $\overline F_N(t,\mu )$ )

Assume (2.1). For each $t \geqslant 0$ and $\mu , \mu ^{\prime } \in \mathcal P(\mathbb {R}_+)$ with finite support, we have

(2.8) $$ \begin{align} \left| \overline F_N(t,\mu^{\prime}) - \overline F_N(t,\mu) \right| \leqslant \mathbb{E} \left[ |X_{\mu^{\prime}} - X_{\mu}| \right] . \end{align} $$

In particular, the mapping $\mu \mapsto \overline F_N(t,\mu )$ can be extended by continuity to the set

$$ \begin{align*} \mathcal P_1(\mathbb{R}_+) := \left\{ \mu \in \mathcal P(\mathbb{R}_+) \ : \ \int\!\! s \, \mathrm{d} \mu(s) < \infty \right\} . \end{align*} $$

The proof of Proposition 2.1 makes use of the following two lemmas. The first one provides an explicit procedure for integrating the randomness coming from the Poisson–Dirichlet cascade. We refer to [Reference Panchenko26, Theorem 2.9] for a proof. (Notice that the indexing of the family $\zeta $ differs by one unit between here and [Reference Panchenko26].)

Lemma 2.2 (Integration of Poisson–Dirichlet cascades)

Assume (2.1), and fix $t \geqslant 0$ . For every $y_0, \ldots , y_k \in \mathbb {R}^N$ , define

(2.9) $$ \begin{align} & X_k(y_0,\ldots,y_k) := \log \int\!\! \exp \big( \sqrt{2t} H_N(\sigma) - N t\xi \left( N^{-1} |\sigma|^2 \right) \nonumber\\&\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad + \sum_{\ell = 0}^k \left(2q_{\ell} - 2q_{\ell-1}\right)^{\frac 1 2} y_\ell \cdot \sigma - q_k |\sigma|^2 \big)\ \mathrm{d} P_N(\sigma), \end{align} $$

and then recursively, for every $\ell \in \{1,\ldots , k\}$ ,

(2.10) $$ \begin{align} X_{\ell-1}(y_0,\ldots,y_{\ell-1}) := \zeta_{\ell}^{-1} \log \mathbb{E}_{y_{\ell}} \exp \left( \zeta_{\ell} X_{\ell}(y_0,\ldots,y_{\ell}) \right) , \end{align} $$

where, for every $\ell \in \{0,\ldots ,k\}$ , we write $\mathbb {E}_{y_{\ell }}$ to denote the integration of the variable $y_{\ell } \in \mathbb {R}^N$ along the standard Gaussian measure. We have

$$ \begin{align*} -N \, \mathbb{E} \left[ F_N(t,\mu) \, \big \vert \, (H_N(\sigma))_{\sigma \in \mathbb{R}^N} \right] = \mathbb{E}_{y_0}\left[X_0(y_0)\right]. \end{align*} $$
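As a sanity check on this recursive structure, here is a small Monte Carlo sketch of Lemma 2.2, specialized to $N = 1$ , Ising spins, and $t = 0$ (so that no sampling of $H_N$ is required and the conditional expectation is deterministic). The conventions are mine, and the cost grows like $n^{k+1}$ , so this is only practical for very small k:

```python
import numpy as np

rng = np.random.default_rng(1)

def lemma22_recursion(zetas, qs, n=200):
    """Estimate E_{y_0}[X_0] = -F_1(0, mu) for N = 1 Ising spins at t = 0.

    zetas = (zeta_1, ..., zeta_k), qs = (q_0, ..., q_k) as in (2.4), so that
    X_k = log (1/2) sum_{s=+-1} exp(sum_l (2q_l - 2q_{l-1})^(1/2) y_l s - q_k),
    then X_{l-1} = zeta_l^{-1} log E_{y_l} exp(zeta_l X_l), as in (2.9)-(2.10).
    """
    k = len(qs) - 1
    incr = np.sqrt(2 * np.diff([0.0] + list(qs)))   # (2q_l - 2q_{l-1})^(1/2)

    def X(level, a):        # a = sum_{l <= level} incr[l] * y_l
        if level == k:
            return np.logaddexp(a - qs[-1], -a - qs[-1]) - np.log(2)
        ys = rng.standard_normal(n)                  # integrate out y_{level+1}
        vals = np.array([X(level + 1, a + incr[level + 1] * y) for y in ys])
        z = zetas[level]                             # this is zeta_{level+1}
        return np.log(np.mean(np.exp(z * vals))) / z

    return np.mean([X(0, incr[0] * y) for y in rng.standard_normal(n)])

print(lemma22_recursion(zetas=(0.5,), qs=(0.0, 1.0), n=200))
```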

In order to state the second lemma, we introduce notation for the Gibbs measure associated with the free energy $F_N$ . That is, for every bounded measurable function $f : \mathbb {R}^N \times \mathbb {N}^k \to \mathbb {R}$ , we write

(2.11) $$ \begin{align} &\left\langle f(\sigma,\alpha) \right\rangle_{t,\mu} := \exp \left( N F_N(t,\mu) \right)\nonumber\\ &\quad\times \int\!\! \sum_{\alpha \in \mathbb{N}^k} f(\sigma,\alpha) \exp \left(\sqrt{2t} H_N(\sigma) - N t\xi \left( N^{-1} |\sigma|^2 \right) + H_N^{\prime}(\sigma,\alpha) - q_k|\sigma|^2 \right) \nonumber\\ &\quad\times v_\alpha \, \mathrm{d} P_N(\sigma). \end{align} $$

We usually simply write $\left \langle \cdot \right \rangle $ instead of $\left \langle \cdot \right \rangle _{t,\mu }$ unless there is a risk of confusion. Notice that the measure $\left \langle \cdot \right \rangle $ depends additionally on the realization of the Gaussian field $(H_N(\sigma ))$ and of the variables $(z_{\alpha })$ . By the definition of $F_N(t,\mu )$ , we have $\left \langle 1 \right \rangle = 1$ , and thus $\left \langle \cdot \right \rangle $ can be interpreted as a probability distribution on $\mathbb {R}^N \times \mathbb {N}^k$ . We also need to consider “replicated” pairs, denoted by $(\sigma ,\alpha ), (\sigma ^{\prime },\alpha ^{\prime }), (\sigma '', \alpha ''), \ldots $ , which are independent and are each distributed according to $\left \langle \cdot \right \rangle $ (conditionally on $(H_N(\sigma ))$ and $(z_\alpha )$ ). We keep writing $\left \langle \cdot \right \rangle $ to denote the tensorized measure, so that for instance, for every bounded measurable $f,g : \mathbb {R}^N \times \mathbb {N}^k \to \mathbb {R}$ , we have

$$ \begin{align*} \left\langle f(\sigma,\alpha) \, g(\sigma',\alpha') \right\rangle = \left\langle f(\sigma,\alpha) \right\rangle \, \left\langle g(\sigma',\alpha') \right\rangle . \end{align*} $$

The second lemma we need identifies the law of the overlap between $\alpha $ and $\alpha '$ under the Gibbs measure, after also averaging over the randomness coming from $(H_N(\sigma ))$ and $(z_\alpha )$ (averaging over $(z_\alpha )$ only would be sufficient).

Lemma 2.3 (Overlaps for the Poisson–Dirichlet variables)

Whenever the measure $\mu $ is of the form in (2.4)-(2.5), we have, for every $t \geqslant 0$ and $\ell \in \{0,\ldots ,k\}$ ,

$$ \begin{align*} \mathbb{E} \left\langle \mathbf{1}_{\{\alpha \wedge \alpha'=\ell\}} \right\rangle_{t,\mu} = \zeta_{\ell+1} - \zeta_{\ell}. \end{align*} $$

Proof The argument can be extracted from [Reference Talagrand33], or by observing that the derivation of [Reference Panchenko26, (2.82)] applies as well to the measures considered here. A slightly adapted version of the latter argument is as follows. We fix $\ell \in \{0,\ldots , k\}$ , and let $(g_\beta )_{\beta \in \mathbb {N}^\ell }$ be a family of independent standard Gaussians, independent of any other random variable considered so far. For every $\alpha , \beta \in \mathbb {N}^k$ , we have

(2.12) $$ \begin{align} \mathbb{E} \left[ g_{\alpha_{|\ell}} \, g_{\beta_{|\ell}} \right] = \mathbf{1}_{\{ \alpha_{|\ell} = \beta_{|\ell} \}}. \end{align} $$

Recall the construction of the Poisson–Dirichlet cascade outlined in the paragraph above (2.6), see also [Reference Panchenko26, (2.46)], and denote by $w_\alpha $ the weights attributed to the leaves by taking the product of the “decorations” of the parent vertices, before normalization, as in [Reference Panchenko26, (2.45)], so that

$$ \begin{align*} v_\alpha = \frac{w_\alpha}{\sum_{\beta \in \mathbb{N}^k} w_{\beta}}. \end{align*} $$

By [Reference Panchenko26, (2.26)], for every $s \in \mathbb {R}$ , we have that

$$ \begin{align*} (w_\alpha, g_{\alpha_{|\ell}})_{\alpha \in \mathbb{N}^k} \quad \text{ and } \quad \left(w_\alpha \exp \left( s \left(g_{\alpha_{|\ell}} - \tfrac {s\zeta_\ell}{2} \right) \right),g_{\alpha_{|\ell}} - s\zeta_\ell\right)_{\alpha \in \mathbb{N}^k} \end{align*} $$

have the same law up to reorderings that preserve the tree structure: that is, we identify two families $(a_\alpha )_{\alpha \in \mathbb {N}^k}$ and $(b_\alpha )_{\alpha \in \mathbb {N}^k}$ whenever there exists a bijection $\pi : \mathbb {N}^k \to \mathbb {N}^k$ satisfying, for every $\alpha , \beta \in \mathbb {N}^k$ ,

$$ \begin{align*} a_{\alpha} = b_{\pi(\alpha)} \quad \text{and} \quad \pi(\alpha) \wedge \pi(\beta) = \alpha \wedge \beta. \end{align*} $$

We denote

$$ \begin{align*} v_{\alpha,\ell,s} := \frac{w_\alpha\exp \left( s g_{\alpha_{|\ell}} \right)}{\sum_{\beta \in \mathbb{N}^k} w_{\beta}\exp \left( s g_{\beta_{|\ell}} \right)}, \end{align*} $$

and write $\left \langle \cdot \right \rangle _{t,\mu ,\ell ,s}$ to denote the measure defined as in (2.11) but with $v_\alpha $ replaced by $v_{\alpha ,\ell ,s}$ . By the invariance described above, Gaussian integration by parts, and (2.12), we have, for every $s \in \mathbb {R}$ ,

$$ \begin{align*} 0 = \mathbb{E} \left\langle g_{\alpha_{|\ell}} \right\rangle_{t,\mu,\ell,0} = \mathbb{E} \left\langle g_{\alpha_{|\ell}} - s\zeta_\ell \right\rangle_{t,\mu,\ell,s} = s \, \mathbb{E}\left\langle 1-\mathbf{1}_{\{\alpha_{|\ell} = \alpha^{\prime}_{|\ell}\}} - \zeta_\ell \right\rangle_{t,\mu,\ell,s}, \end{align*} $$

and, using the invariance once more, we can replace $\left \langle \cdot \right \rangle _{t,\mu ,\ell ,s}$ by $\left \langle \cdot \right \rangle _{t,\mu }$ in the last expression. We thus conclude that

$$ \begin{align*} \mathbb{E} \left\langle \mathbf{1}_{\{\alpha_{|\ell} = \alpha^{\prime}_{|\ell}\}} \right\rangle_{t,\mu} = 1-\zeta_\ell, \end{align*} $$

which yields the desired result.    ▪

Proof of Proposition 2.1 We decompose the proof into two steps.

Step 1. In this step, we give a consistent extension of the definition of $\overline F_N(t,\mu )$ to the case when the parameters in (2.4) may contain repetitions. More precisely, we give ourselves possibly repeating parameters

(2.13) $$ \begin{align} 0 = \zeta_{0} \leqslant \zeta_1 \leqslant \cdots \leqslant \zeta_{k} < \zeta_{k+1} = 1, \qquad 0 = q_{-1} \leqslant q_0 \leqslant \cdots \leqslant q_k < q_{k+1} = \infty, \end{align}$$

and let $\mu $ be the measure defined by (2.5). We show that the naive extension of the definition of $\overline F_N(t,\mu )$ obtained by simply ignoring the fact that there may be repetitions in the parameters in (2.13) yields the same result as the actual definition that was given using nonrepeating parameters. The first thing we need to do is extend the definition of the Poisson–Dirichlet cascade $(v_\alpha )_{\alpha \in \mathbb {N}^k}$ to the case when some values of $(\zeta _\ell )_{\ell \in \{1,\ldots ,k\}}$ may be equal to $0$ . Recall that for $\zeta _\ell \in (0,1)$ , the definition briefly described in the paragraph above (2.6) involves a Poisson point process of intensity measure $\zeta _{\ell } x^{-1-\zeta _\ell } \, \mathrm {d} x$ . In the case $\zeta _\ell = 0$ , we interpret this Poisson point process as consisting of a single instance of the value $1$ and then a countably infinite repetition of the value $0$ . This allows us to define the quantity on the right side of (2.7) for arbitrary values of the parameters in (2.13). The average of this quantity can be calculated using Lemma 2.2: the only point that needs to be added is that in the case $\zeta _\ell = 0$ , we interpret (2.10) as

$$ \begin{align*} X_{\ell-1}(y_0,\ldots,y_{\ell-1}) := \mathbb{E}_{y_{\ell}} \left[ X_{\ell}(y_0,\ldots,y_{\ell}) \right] . \end{align*} $$

From this algorithmic procedure, one can check that the result does not depend on whether or not there were repetitions in the parameters in (2.13). Indeed, on the one hand, when $\zeta _\ell = \zeta _{\ell +1}$ , we have

$$ \begin{align*} X_{\ell-1} (y_0,\ldots,y_{\ell-1}) = \zeta_{\ell}^{-1} \log \mathbb{E}_{y_\ell,y_{\ell+1}} \exp \left( \zeta_\ell X_{\ell+1}(y_0,\ldots,y_{\ell+1}) \right) , \end{align*} $$

where $\mathbb {E}_{y_\ell ,y_{\ell +1}}$ denotes the averaging of the variables $y_\ell , y_{\ell +1}$ when sampled independently according to the standard Gaussian measure on $\mathbb {R}^N$ ; and under this measure, the sum

$$ \begin{align*} \left[(2q_\ell - 2q_{\ell-1})^{\frac 1 2} y_\ell + (2q_{\ell+1} - 2q_{\ell})^{\frac 1 2} y_{\ell+1}\right] \cdot \sigma \end{align*} $$

has the same law as

$$ \begin{align*} (2q_{\ell+1} - 2q_{\ell-1})^{\frac 1 2} y_\ell \cdot \sigma. \end{align*} $$

On the other hand, if $q_\ell = q_{\ell +1}$ , then the term indexed by $\ell +1$ in the sum on the right side of (2.9) vanishes, and

$$ \begin{align*} X_{\ell} (y_0,\ldots,y_{\ell}) = X_{\ell+1}(y_0,\ldots,y_{\ell+1}) . \end{align*} $$

It is thus clear in both cases that removing repetitions does not change the value of the resulting quantity.

Step 2. Consider now two measures $\mu ,\mu ' \in \mathcal P(\mathbb {R}_+)$ of finite support. There exist $k \in \mathbb {N}$ , $(\zeta _\ell )_{0 \leqslant \ell \leqslant k}$ , $(q_\ell )_{0 \leqslant \ell \leqslant k}$ and $(q^{\prime }_\ell )_{0 \leqslant \ell \leqslant k}$ satisfying (2.5), (2.13),

$$ \begin{align*} 0 = q_{-1}^{\prime} \leqslant q_0^{\prime} \leqslant q_1^{\prime} \leqslant \cdots \leqslant q_k^{\prime} < q^{\prime}_{k+1} = \infty, \quad \text{and} \quad \mu^{\prime} = \sum_{\ell= 0}^k (\zeta_{\ell+1} - \zeta_{\ell}) \delta_{q^{\prime}_\ell}. \end{align*} $$

Using this representation, we can rewrite the $L^1$ -Wasserstein distance between the measures $\mu $ and $\mu '$ as

(2.14) $$ \begin{align} \mathbb{E} \left[ | X_{\mu'} - X_{\mu}|\right] = \sum_{\ell = 0}^k \left( \zeta_{\ell+1} - \zeta_{\ell} \right) |q^{\prime}_\ell - q_\ell|. \end{align} $$

Abusing notation, we denote

(2.15) $$ \begin{align} \overline F_N(t,\zeta,q) := \overline F_N \left( t, \sum_{\ell = 0}^k (\zeta_{\ell+1} - \zeta_{\ell}) \delta_{q_\ell}\right), \end{align} $$

and proceed to compute $\partial _{q_\ell } \overline F_N(t,\zeta ,q)$ , for each $\ell \in \{0,\ldots ,k\}$ . For every $\sigma , \tau \in \mathbb {R}^N$ , $\alpha \in \mathbb {N}^k$ and $\beta \in \mathcal A$ , we have

(2.16) $$ \begin{align} \mathbb{E} \left[ H_N^{\prime}(\sigma,\alpha) z_{\beta} \cdot \tau\right] = \begin{cases} (2q_{\ell} - 2 q_{\ell-1})^{\frac 1 2} \, \sigma \cdot \tau & \text{if } \beta =\alpha_{| \ell} \text{ with } \ell \in \{0,1,\ldots,k\}, \\ 0 & \text{otherwise}. \end{cases} \end{align} $$

For every $\ell \in \{0,\ldots ,k-1\}$ , we have

$$ \begin{align*} \partial_{q_\ell} \overline F_N(t,\zeta,q) = -\frac 1 N \mathbb{E} \left\langle (2q_\ell - 2q_{\ell - 1})^{-\frac 1 2} z_{\alpha_{|\ell}}\cdot \sigma - (2q_{\ell+1} -2q_{\ell})^{-\frac 1 2} z_{\alpha_{|(\ell+1)}}\cdot \sigma \right\rangle. \end{align*} $$

By (2.16) and Gaussian integration by parts, see e.g. [Reference Panchenko26, Lemma 1.1], we obtain

(2.17) $$ \begin{align} \partial_{q_\ell} \overline F_N(t,\zeta,q) & = \frac 1 N \mathbb{E} \left\langle \left(\mathbf{1}_{\{\alpha_{|\ell} \, = \, \alpha^{\prime}_{|\ell}\}} - \mathbf{1}_{\{\alpha_{|(\ell+1)}= \alpha^{\prime}_{|(\ell+1)}\}} \right)\sigma \cdot \sigma' \right\rangle \\ \notag & = \frac 1 N\mathbb{E} \left\langle \mathbf{1}_{\{\alpha \wedge \alpha'=\ell\}} \, \sigma \cdot \sigma' \right\rangle. \end{align}$$

The same reasoning also shows that

$$ \begin{align*} \partial_{q_k} \overline F_N(t,\zeta,q) = \frac 1 N \mathbb{E} \left\langle \mathbf{1}_{\{\alpha = \alpha'\}} \sigma \cdot \sigma' \right\rangle, \end{align*} $$

so that the last identity in (2.17) is also valid for $\ell = k$ . In particular, for every $\ell \in \{0,\ldots ,k\}$ , we have by (2.1) that

$$ \begin{align*} \left|\partial_{q_\ell} \overline F_N(t,\zeta,q) \right|\leqslant \mathbb{E} \left\langle \mathbf{1}_{\{\alpha \wedge \alpha'=\ell\}} \right\rangle = (\zeta_{\ell+1} - \zeta_{\ell}),\end{align*} $$

and thus, by integration,

$$ \begin{align*} \left|\overline F_N(t,\zeta, q') - \overline F_N(t,\zeta,q)\right| \leqslant \sum_{\ell = 0}^k (\zeta_{\ell+1} - \zeta_\ell) \left| q^{\prime}_{\ell} - q_\ell \right|. \end{align*}$$

A comparison with (2.14) then yields the desired result.    ▪

We can also use Lemma 2.2 to give a more precise meaning to the vaguely stated monotonicity claim of $\partial _{\mu } f \geqslant 0$ expressed in the paragraph below (1.4), already at the level of the functions $\overline F_N$ .

Lemma 2.4 Let $\zeta ,q$ be parameters as in (2.4), and let $\overline F_N(t,\zeta ,q)$ be as in (2.15). For every $\ell \leqslant \ell ' \in \{0,\ldots , k\}$ , we have

(2.18) $$ \begin{align} \partial_{q_\ell} \overline F_N \geqslant 0, \end{align} $$

as well as

(2.19) $$ \begin{align} (\zeta_{\ell'+1} - \zeta_{\ell'}) \partial_{q_\ell} \overline F_N\leqslant (\zeta_{\ell+1} - \zeta_{\ell}) \partial_{q_{\ell'}} \overline F_N. \end{align} $$

Remark 2.5 As the proof will make clear, we can make sense of the quantity

$$ \begin{align*} (\zeta_{\ell+1} - \zeta_{\ell})^{-1} \partial_{q_\ell} \overline F_N \end{align*} $$

even when $\zeta _\ell = \zeta _{\ell +1}$ , by continuity. In view of (2.17) and Lemma 2.3, the monotonicity expressed in (2.19) can be rephrased as the statement that, for every $\ell \leqslant \ell '$ ,

$$ \begin{align*} \mathbb{E} \left\langle \sigma \cdot \sigma' \, \big \vert \, \alpha \wedge \alpha' = \ell \right\rangle \leqslant \mathbb{E} \left\langle \sigma \cdot \sigma' \, \big \vert \, \alpha\wedge \alpha' = \ell' \right\rangle, \end{align*} $$

where we understand that the conditioning is with respect to the measure $\mathbb {E} \left \langle \cdot \right \rangle $ .

Proof The main step of the proof is similar to that of [Reference Talagrand33, Proposition 14.3.2]; see also [Reference Barra, Del Ferraro and Tantari3, Reference Barra, Di Biasio and Guerra4, Reference Guerra18, Reference Panchenko and Talagrand27]. We will rewrite the left side of (2.18) as an averaged overlap, taking Lemma 2.2 as a starting point, the subtle point being the identification of the correct measure with respect to which the average is taken. We start by introducing some notation. We let $X_k,X_{k-1},\ldots ,X_0$ be as in Lemma 2.2, and define $X_{-1} := \mathbb {E}_{y_0} \left [ X_0(y_0) \right ]$ . For every $\ell \leqslant m \in \{0,\ldots ,k\}$ , we write

$$ \begin{align*} D_{\ell,m} = D_{\ell m} := \frac{\exp \left( \zeta_{\ell} X_\ell + \cdots + \zeta_{m} X_m \right)}{\mathbb{E}_{y_\ell} \left[ \exp \left( \zeta_\ell X_\ell \right) \right] \, \cdots \, \mathbb{E}_{y_m} \left[ \exp \left( \zeta_m X_m \right) \right] }. \end{align*} $$

We also write $\mathbb {E}_{y_{\geqslant \ell }}$ to denote the integration of the variables $y_{\ell }, \ldots , y_k$ along the standard Gaussian measure, and we write $\mathbb {E}_y$ as shorthand for $\mathbb {E}_{y_{\geqslant 0}}$ . Within the current proof (and only here), we abuse notation and use $\left \langle \cdot \right \rangle $ with a meaning slightly different from that in (2.11), namely,

$$ \begin{align*} &\left\langle f(\sigma) \right\rangle := \exp \left( -X_k \right) \\ &\quad\times \int\!\! f(\sigma) \exp \left( \sqrt{2t} H_N(\sigma) - N t\xi \left( N^{-1} |\sigma|^2 \right) +\sum_{\ell = 0}^k \left(2q_{\ell} - 2q_{\ell-1}\right)^{\frac 1 2} y_\ell \cdot \sigma - q_k |\sigma|^2 \right)\\&\quad\times \mathrm{d} P_N(\sigma). \end{align*} $$

Defining $F_N(t,\zeta ,q)$ as in (2.15) (substituting $\overline F_N$ by $F_N$ there), we will show that for every $\ell \in \{0,\ldots ,k\}$ ,

(2.20) $$ \begin{align} &\partial_{q_\ell} \mathbb{E} \left[ F_N(t,\zeta,q) \, \big \vert \, (H_N(\sigma))_{\sigma \in \mathbb{R}^N} \right] \nonumber\\ &\qquad\qquad\qquad\qquad\qquad= \frac {\zeta_{\ell+1} -\zeta_\ell} N \sum_{i = 1}^N \mathbb{E}_{y} \left[ \left(\mathbb{E}_{y_{\geqslant \ell+1}} \left[\left\langle \sigma_i \right\rangle D_{\ell+1,k} \right]\right)^2 D_{1\ell} \right]. \end{align} $$

We decompose the proof of (2.20) into two steps, and then conclude in a last step.

Step 1. We show that, for every $\ell ,m \in \{0,\ldots ,k\}$ ,

(2.21) $$ \begin{align} \partial_{q_{m}} X_{\ell-1} = \mathbb{E}_{y_{\geqslant \ell}} \left[ \left(\partial_{q_m} X_k\right) D_{\ell k} \right]. \end{align} $$

We prove the result by decreasing induction on $\ell $ . Setting $D_{k+1,k} = 1$ , the result is obvious for $\ell = k+1$ . Let $\ell \in \{1,\ldots ,k\}$ , and assume that the statement (2.21) holds with $\ell $ replaced by $\ell + 1$ . Using (2.10), we get $\partial_{q_m} X_{\ell-1} = \mathbb{E}_{y_\ell} \left[ \left( \partial_{q_m} X_\ell \right) D_{\ell\ell} \right]$ ; since $D_{\ell k} = D_{\ell\ell} D_{\ell+1,k}$ and $D_{\ell\ell}$ does not depend on $y_{\ell+1},\ldots,y_k$ , the induction hypothesis then yields (2.21) itself. This proves (2.21) for every $\ell \in \{1,\ldots , k\}$ . The statement for $\ell = 0$ is then immediate (recall that $\zeta _0 = 0$ ). Similarly, for every $\ell ,m \in \{0,\ldots ,k\}$ with $m> \ell $ and $i \in \{1,\ldots , N\}$ , we have

(2.22) $$ \begin{align} \partial_{y_{mi}} X_{\ell-1} = \mathbb{E}_{y_{\geqslant \ell}} \left[ \left(\partial_{y_{mi}} X_k \right) D_{\ell k} \right], \end{align} $$

where we write $y_m = (y_{mi})_{1 \leqslant i \leqslant N} \in \mathbb {R}^N$ . For $m \leqslant \ell $ , we clearly have $\partial _{y_{mi}} X_{\ell -1} = 0$ .

Step 2. Notice that, for every $m \in \{0,\ldots ,k-1\}$ ,

(2.23) $$ \begin{align} \partial_{q_m} X_k = \left\langle (2q_m - 2q_{m-1})^{-\frac 1 2} y_m \cdot \sigma - (2q_{m+1}-2q_m)^{-\frac 1 2} y_{m+1} \cdot \sigma \right\rangle. \end{align} $$

We are ultimately interested in understanding $\partial _{q_m} X_{-1}$ , which, in view of (2.21), prompts us to study, for every $i \in \{1,\ldots ,N\}$ ,

(2.24) $$ \begin{align} \mathbb{E}_y \left[ y_{mi} \left\langle \sigma_i \right\rangle D_{1k} \right] = \mathbb{E}_y \left[ \partial_{y_{mi}} \left(\left\langle \sigma_i \right\rangle D_{1k} \right)\right],\end{align} $$

where we performed a Gaussian integration by parts to get the equality. (Recall that $D_{1k} = D_{0k}$ since $\zeta _0 = 0$ .) We have

(2.25) $$ \begin{align} \partial_{y_{mi}} \left\langle \sigma_i \right\rangle = (2q_{m} - 2q_{m-1})^{\frac 12} \left(\left\langle \sigma_i^2 \right\rangle - \left\langle \sigma_i \right\rangle^2 \right)\end{align} $$

and

$$ \begin{align*} \partial_{y_{mi}} D_{1k} = \left(\sum_{\ell = m}^k \zeta_\ell \partial_{y_{mi}} X_{\ell} - \sum_{\ell = m+1}^k \zeta_{\ell} \frac{\mathbb{E}_{y_{\ell}} \left[\partial_{y_{mi}} X_{\ell}\exp \left( \zeta_\ell X_\ell \right) \right]}{\mathbb{E}_{y_\ell} \left[ \exp \left( \zeta_\ell X_\ell \right) \right] }\right) D_{1k}. \end{align*} $$

We next derive from (2.22) that, for every $\ell ,m \in \{0,\ldots ,k\}$ with $m> \ell $ and $i \in \{1,\ldots , N\}$ ,

$$ \begin{align*} \partial_{y_{mi}} X_{\ell-1} = (2q_m - 2q_{m-1})^{\frac 12} \, \mathbb{E}_{y_{\geqslant \ell}} \left[ \left\langle \sigma_i \right\rangle D_{\ell k} \right] . \end{align*} $$

It thus follows that

$$ \begin{align*} \frac{\mathbb{E}_{y_{\ell}} \left[\partial_{y_{mi}} X_{\ell} \exp \left( \zeta_\ell X_\ell \right) \right]}{\mathbb{E}_{y_\ell} \left[ \exp \left( \zeta_\ell X_\ell \right) \right] } =(2q_m - 2q_{m-1})^{\frac 1 2} \, \mathbb{E}_{y_{\geqslant \ell}} \left[ \left\langle \sigma_i \right\rangle D_{\ell k} \right] \end{align*} $$

and

$$ \begin{align*} \partial_{y_{mi}} D_{1k} & = (2q_m - 2q_{m-1})^{\frac 1 2} \left(\sum_{\ell = m}^k \zeta_\ell \mathbb{E}_{y_{\geqslant \ell+1}} \left[ \left\langle \sigma_i \right\rangle D_{\ell+1,k}\right] - \sum_{\ell = m+1}^k \zeta_{\ell} \mathbb{E}_{y_{\geqslant \ell}} \left[ \left\langle \sigma_i \right\rangle D_{\ell k} \right] \right) D_{1k} \\ & = (2q_m - 2q_{m-1})^{\frac 1 2} \left(\left\langle \sigma_i\right\rangle - \sum_{\ell = m}^k (\zeta_{\ell+1} - \zeta_\ell) \mathbb{E}_{y_{\geqslant \ell+1}} \left[ \left\langle \sigma_i \right\rangle D_{\ell+1,k} \right] \right) D_{1k}, \end{align*} $$

with the understanding that $\mathbb {E}_{y_{\geqslant k+1}}$ is the identity map, $D_{k+1,k} = 1$ , and recalling that $\zeta _{k+1} = 1$ . Combining this with (2.24) and (2.25), we thus get that

$$ \begin{align*} &(2q_m - 2q_{m-1})^{-\frac 1 2} \mathbb{E}_y \left[ y_{mi} \left\langle \sigma_i \right\rangle D_{1k} \right] \\ &\qquad\qquad\qquad= \mathbb{E}_y \left[ \left(\left\langle \sigma_i^2\right\rangle - \left\langle \sigma_i \right\rangle \sum_{\ell = m}^k (\zeta_{\ell+1} - \zeta_\ell) \mathbb{E}_{y_{\geqslant \ell+1}} \left[ \left\langle \sigma_i \right\rangle D_{\ell+1,k} \right] \right) D_{1k}\right] .\end{align*} $$

Using this identity in conjunction with (2.21) and (2.23), we arrive at

$$ \begin{align*} -\partial_{q_{m}} X_{-1} = (\zeta_{m+1} - \zeta_m) \sum_{i = 1}^N\mathbb{E}_y \left[ \mathbb{E}_{y_{\geqslant m+1}} \left[ \left\langle \sigma_i \right\rangle D_{m+1,k} \right]\left\langle \sigma_i \right\rangle D_{1k} \right] . \end{align*} $$

This identity is also valid when $m = k$ , as can be checked by following the same argument. We can then write $D_{1k} = D_{1m} D_{m+1,k}$ , and use that $D_{1m}$ does not depend on $y_{m+1},\ldots ,y_k$ , to conclude that

$$ \begin{align*} -\partial_{q_{m}} X_{-1} = (\zeta_{m+1} - \zeta_m) \sum_{i = 1}^N \mathbb{E}_{y} \left[ \left(\mathbb{E}_{y_{\geqslant m+1}} \left[\left\langle \sigma_i \right\rangle D_{m+1,k}\right]\right)^2 D_{1m} \right]. \end{align*} $$

By Lemma 2.2, this is (2.20).

Step 3. We now show that (2.20) implies the lemma. First, it is clear from (2.20) that $\partial _{q_\ell } \overline F_N \geqslant 0$ . Turning to (2.19), we observe that, for each $\ell \in \{0,\ldots , k-1\}$ , we have by Jensen’s inequality that

$$ \begin{align*} \left(\mathbb{E}_{y_{\geqslant \ell}} \left[\left\langle \sigma_i \right\rangle D_{\ell k}\right]\right)^2 & = \left( \mathbb{E}_{y_\ell} \left[\mathbb{E}_{y_{\geqslant \ell + 1}}\left[\left\langle \sigma_i \right\rangle D_{\ell+1,k}\right]D_{\ell\ell}\right]\right)^2 \\ & \leqslant \mathbb{E}_{y_\ell} \left[ \left(\mathbb{E}_{y_{\geqslant \ell + 1}} \left[\left\langle \sigma_i \right\rangle D_{\ell+1,k}\right] \right)^2D_{\ell\ell}\right] \end{align*} $$

and therefore,

$$ \begin{align*} \mathbb{E}_{y} \left[ \left(\mathbb{E}_{y_{\geqslant \ell}} \left[\left\langle \sigma_i \right\rangle D_{\ell k} \right]\right)^2 D_{1,\ell-1} \right] & \leqslant \mathbb{E}_{y} \left[\mathbb{E}_{y_\ell} \left[\left(\mathbb{E}_{y_{\geqslant \ell + 1}} \left[\left\langle \sigma_i \right\rangle D_{\ell+1,k}\right]\right)^2D_{\ell\ell}\right] D_{1,\ell-1} \right] \\ & = \mathbb{E}_{y} \left[\left(\mathbb{E}_{y_{\geqslant \ell+1}} \left[\left\langle \sigma_i \right\rangle D_{\ell+1,k} \right]\right)^2 D_{1\ell} \right]. \end{align*} $$

It thus follows that the sequence

$$ \begin{align*} \left(\sum_{i = 1}^N \mathbb{E}_{y} \left[ \left(\mathbb{E}_{y_{\geqslant \ell+1}} \left[\left\langle \sigma_i \right\rangle D_{\ell+1,k} \right]\right)^2 D_{1\ell} \right]\right)_{0\leqslant \ell \leqslant k} \end{align*} $$

is nondecreasing in $\ell $ . By (2.20), this implies (2.19).   ▪

We can now state the conjecture generalizing Theorem 1.1.

Conjecture 2.6 Assume (2.1) and that there exists a function $\psi : \mathcal P_*(\mathbb {R}_+) \to \mathbb {R}$ such that for every $\mu \in \mathcal P_*(\mathbb {R}_+)$ , $\overline F_N(0,\mu )$ converges to $\psi (\mu )$ as N tends to infinity. For every $t \geqslant 0$ and $\mu \in \mathcal P_*(\mathbb {R}_+)$ , we have

$$ \begin{align*} \lim_{N\to +\infty} \overline F_N(t,\mu) = f(t,\mu), \end{align*} $$

where $f : \mathbb {R}_+ \times \mathcal P_*(\mathbb {R}_+) \to \mathbb {R}$ solves the Hamilton–Jacobi equation in (1.2).

Recall that for the purposes of the present paper, we take the Hopf–Lax formula (1.7) as the definition of the solution to (1.2). In the case when $P_N$ is a product measure, this conjecture has now been proved in more recent work, see [Reference Mourrat and Panchenko24].

3 Convergence of initial condition

We now give two typical situations in which the convergence of $\overline F_N(0,\cdot )$ to some limit is valid. Whenever the limit exists, we write, for every $\mu \in \mathcal P_*(\mathbb {R}_+)$ ,

(3.1) $$ \begin{align} \psi(\mu) := \lim_{N\to \infty} \overline F_N(0,\mu). \end{align} $$

In agreement with Conjecture 2.6, the function $\psi $ is the initial condition we need to use for the Hamilton–Jacobi equation (1.2).

Proposition 3.1 (Convergence of initial condition)

  (1) If the measure $P_N$ is of the product form $P_N = P_1^{\otimes N}$ , with $P_1$ of bounded support, then $\overline F_N(0,\cdot ) = \overline F_1(0,\cdot )$ .

  (2) For every $\mu \in \mathcal P(\mathbb {R}_+)$ of compact support and $q \geqslant 0$ such that $\mu ([0,q]) = 1$ , let

    (3.2) $$ \begin{align} \psi^\circ (\mu) &:= \inf \left\{\int_0^q \frac 1 {b-2\int_s^{q} \mu([0,r]) \, \mathrm{d} r} \, \mathrm{d} s \right. \nonumber\\ &\quad\qquad\left. + \frac 1 2 \left(b - 1 - \log b\right) - q \ : \ b> 2\int_0^q \mu([0,r]) \, \mathrm{d} r \right\}. \end{align} $$
    The right side of (3.2) does not depend on the choice of q satisfying $\mu ([0,q]) = 1$ , and the mapping $\mu \mapsto \psi ^\circ (\mu )$ can be extended by continuity to $\mathcal P_1(\mathbb {R}_+)$ . Moreover, if the measure $P_N$ is the uniform measure on the sphere $\{\sigma \in \mathbb {R}^N \ : \ |\sigma |^2 = N\}$ , then for every $\mu \in \mathcal P_1(\mathbb {R}_+)$ , we have
    (3.3) $$ \begin{align} \lim_{N \to \infty} \overline F_N(0,\mu) = \psi^\circ(\mu). \end{align} $$

Proof For part (1), we appeal to Lemma 2.2 and observe that, when $t = 0$ , the definition of $X_k$ given there becomes

$$ \begin{align*} X_k(y_0,\ldots, y_k) = \sum_{i = 1}^N \log \int_{\mathbb{R}} \exp \left( \sum_{\ell = 0}^k \left( 2 q_{\ell} - 2 q_{\ell-1} \right)^{\frac 1 2} y_{\ell,i} \sigma_i - q_k\sigma_i^2\right) \, \mathrm{d} P_1(\sigma_i) . \end{align*} $$

Notice that the summands indexed by i are independent random variables under $\mathbb {E}_{y_k}$ , and this structure is preserved as we go down the levels, up to the definition of $X_0$ , where we end up with a sum of N terms that are deterministic and all equal to a constant which does not depend on N. This proves the claim (see also [Reference Panchenko26, (2.60)]).

For part (2), we first verify that the right side of (3.2) does not depend upon the choice of $q \geqslant 0$ satisfying $\mu ([0,q]) = 1$ . Indeed, for every q satisfying $\mu ([0,q]) = 1$ , $q' \geqslant q$ and $b> 2 \int_0^{q'} \mu ([0,r]) \, \mathrm {d} r$ , we have

$$ \begin{align*} &\int_0^{q'} \frac 1 {b-2\int_s^{q'} \mu([0,r]) \, \mathrm{d} r} \, \mathrm{d} s\\ &= \int_0^q \frac 1 {b-2(q'-q)-2\int_s^{q} \mu([0,r]) \, \mathrm{d} r} \, \mathrm{d} s + \frac 1 2\left(\log b - \log \left[ b-2(q'-q) \right]\right). \end{align*} $$

We thus obtain that

$$ \begin{align*} & \int_0^{q'} \frac 1 {b-2\int_s^{q'} \mu([0,r]) \, \mathrm{d} r} \, \mathrm{d} s + \frac 1 2 \left( b-1-\log b \right) - q' \\ & \qquad = \int_0^{q} \frac 1 {b-2(q'-q) - 2\int_s^{q}\mu([0,r]) \, \mathrm{d} r} \, \mathrm{d} s \\ & \qquad \qquad \qquad + \frac 1 2 \left( b-2(q'-q) - 1-\log \left[ b-2(q'-q) \right] \right) - q. \end{align*} $$

Taking the infimum over $b> 2 \int_0^{q'} \mu ([0,r]) \, \mathrm {d} r = 2(q'-q) + 2 \int_0^q \mu ([0,r]) \, \mathrm {d} r$ concludes the verification of the fact that the right side of (3.2) does not depend on the choice of q satisfying $\mu ([0,q]) = 1$ .

In order to verify the convergence in (3.3), we start by considering the case of a measure of finite support. In this case, we can follow the arguments leading to [Reference Talagrand30, Proposition 3.1] and obtain (3.3). The full result then follows by the continuity property of $\overline F_N$ , see Proposition 2.1.    ▪
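Once the inner integral is discretized, the infimum in (3.2) is a one-dimensional minimization; here is a small numerical sketch for measures of finite support (the quadrature, the search window for b, and all names are my own choices):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def psi_circ(atoms, probs, q, n=2000):
    """Numerical sketch of psi0(mu) in (3.2), mu = sum_i probs[i] delta_{atoms[i]}
    supported in [0, q]. Grid quadrature plus a bounded 1-D search over b."""
    s = np.linspace(0.0, q, n)
    ds = s[1] - s[0]
    cdf = np.array([sum(p for a, p in zip(atoms, probs) if a <= r) for r in s])
    # I[i] ~ 2 * int_{s_i}^{q} mu([0, r]) dr, via a right-endpoint Riemann sum
    I = 2 * ds * (np.cumsum(cdf[::-1])[::-1] - cdf)

    def objective(b):       # the bracketed expression in (3.2)
        return np.sum(ds / (b - I)) + 0.5 * (b - 1 - np.log(b)) - q

    # constraint: b > 2 int_0^q mu([0, r]) dr = I[0]; the upper bound is arbitrary
    res = minimize_scalar(objective, bounds=(I[0] + 1e-6, I[0] + 50.0),
                          method="bounded")
    return res.fun

print(psi_circ([0.0], [1.0], q=1.0))   # mu = delta_0
```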

It so happens that at least in the case when $P_N$ is a product measure, the initial condition $\psi = \lim _{N \to \infty } \overline F_N(0,\cdot )$ can itself be described in terms of a Hamilton–Jacobi equation of second order [Reference Parisi28]. We recall this fact in the proposition below for completeness, and so as to clarify the small modifications necessary to match the slightly different presentation explored in the present paper. As far as I understand, the fact that the initial condition admits such a representation seems to be unrelated to the (first-order) Hamilton–Jacobi structure explored in the rest of the paper. Somewhat surprisingly, this representation makes it less clear that $\psi \geqslant 0$ .

Proposition 3.2 (Initial condition as second-order HJ equation)

Assume that $P_N$ is the N-fold tensor product $P_N = P_1^{\otimes N}$ , with $P_1$ a measure of bounded support, and denote $\psi = \lim _{N \to \infty } \overline F_N(0,\cdot ) = \overline F_1(0,\cdot )$ . For every $\mu \in \mathcal P(\mathbb {R}_+)$ with compact support and $q \geqslant 0$ such that $\mu ([0,q]) = 1$ , letting $u_\mu : [0,q]\times \mathbb {R} \to \mathbb {R}$ be the solution of the backward-in-time equation

(3.4) $$ \begin{align} \left\{ \begin{aligned} & \partial_s u_\mu + \partial_x^2 u_\mu + \mu([0,s]) \, (\partial_x u_\mu)^2 = 0 & \text{on } [0,q] \times \mathbb{R},\\ & u_\mu(q,x) = \log \int_{\mathbb{R}}\exp \left( x \sigma - q |\sigma|^2 \right) \, \mathrm{d} P_1(\sigma) & \text{for } x \in \mathbb{R}, \end{aligned} \right. \end{align} $$

we have $\psi (\mu ) = - u_\mu (0,0)$ .

Proof We first verify that $u_\mu $ does not depend on the choice of $q \geqslant 0$ satisfying $\mu ([0,q]) = 1$ . More precisely, denoting by $u_{\mu ,q}$ the solution obtained for a given choice of such q, and letting $q' \geqslant q$ , we have that the solutions $u_{\mu ,q}$ and $u_{\mu ,q'}$ coincide on $[0,q] \times \mathbb {R}$ . Indeed, this is a consequence of the fact that, writing

$$ \begin{align*} \phi(s,x) := \log \int_{\mathbb{R}} \exp \left( x \sigma - s |\sigma|^2 \right) \, \mathrm{d} P_1(\sigma), \end{align*} $$

we have

$$ \begin{align*} \partial_s \phi + \partial_x^2 \phi + (\partial_x \phi)^2 = 0. \end{align*} $$

(Verifying this boils down to the observation that the second moment of $\sigma $ under the appropriate Gibbs measure is the sum of the variance and the square of the first moment.) Let $\mu $ be a measure of the form (2.4)–(2.5), and let $(B_t)$ be a standard Brownian motion. We define, for every $x \in \mathbb {R}$ ,

$$ \begin{align*} v_\mu(q_k,x) := \phi(q_k,x), \end{align*} $$

and then recursively, for every $\ell \in \{0,\ldots ,k\}$ and $s \in [q_{\ell -1},q_{\ell })$ ,

$$ \begin{align*} v_\mu(s,x) := \zeta_{\ell}^{-1} \log \mathbb{E} \exp \left[ \zeta_{\ell} v_\mu\left(q_{\ell}, B_{2q_{\ell}} - B_{2s} + x \right) \right] . \end{align*} $$

Recall that when $\ell = 0$ , we have $\zeta _0 = 0$ and we interpret the right side above as

$$ \begin{align*} \mathbb{E} \left[ v_\mu(q_0,B_{2q_0} - B_{2s} + x) \right]. \end{align*} $$

By induction, we have that for every $\ell \in \{0,\ldots , k\}$ ,

$$ \begin{align*} v_\mu(q_{\ell-1},x) = \zeta_\ell^{-1} \log \mathbb{E}_{y_{\ell}} \left[ \mathbb{E}_{y_{\ell+1}}^{\frac{\zeta_{\ell}}{\zeta_{\ell+1}}} \left[ \mathbb{E}_{y_{\ell+2}}^{\frac{\zeta_{\ell+1}}{\zeta_{\ell+2}}} \left[ \cdots \mathbb{E}_{y_{k}}^{\frac{\zeta_{k-1}}{\zeta_k}} \left[ \exp \left( \zeta_k \, \phi \left( q_k, x + \sum_{\ell' = \ell}^k \left( 2q_{\ell'} - 2q_{\ell'-1} \right)^{\frac 1 2} y_{\ell'} \right) \right) \right] \cdots \right] \right] \right] , \end{align*} $$

where here $\mathbb {E}_{y_\ell }$ denotes the integration of the variable $y_\ell $ according to the standard scalar Gaussian measure, and for $\ell = 0$ , the right side above is interpreted as

$$ \begin{align*} \mathbb{E}_{y_0} \left[ \zeta_{1}^{-1}\log \left( \mathbb{E}_{y_1}\left[\mathbb{E}_{y_2}^{\frac{\zeta_1}{\zeta_2}} \left[\cdots\right]\right] \right) \right]. \end{align*} $$

By Lemma 2.2 with $t = 0$ and $N = 1$ , we deduce that $\overline F_1(0,\mu ) = - v_\mu (0,0)$ , and we have already seen in part (1) of Proposition 3.1 that $\overline F_N(0,\cdot ) = \overline F_1(0,\cdot )$ . Moreover, denoting, for every $\ell \in \{0,\ldots ,k\}$ and $s \in [q_{\ell -1},q_\ell )$ ,

$$ \begin{align*} w_\mu(s,x) := \mathbb{E} \exp \left[ \zeta_\ell v_\mu\left(q_{\ell}, B_{2q_{\ell}} - B_{2s} + x \right) \right], \end{align*} $$

we have $\partial _s w_\mu +\partial _x^2w_\mu = 0$ on $[q_{\ell -1}, q_\ell ) \times \mathbb {R}$ , with continuity at the junction times $s \in \{q_0,\ldots ,q_{k}\}$ , and a change of variables then gives that $v_\mu $ solves (3.4). This shows that Proposition 3.2 holds whenever $\mu $ is a measure of finite support. The general case can then be obtained by continuity in $\mu $ (the continuity of $\overline F_1(0,\cdot )$ is a consequence of Proposition 2.1; for that of $\mu \mapsto u_\mu (0,0)$ , one can start by verifying that $\|\partial _x u_\mu \|_{L^\infty }$ is bounded by the supremum of $|\sigma |$ over the support of $P_1$ , using the maximum principle).    ▪
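For concreteness, here is a crude explicit finite-difference sketch of (3.4) in the Ising case $P_1 = \frac 1 2 (\delta_{-1} + \delta_1)$ , computing $\psi(\mu) = -u_\mu(0,0)$ for a finite-support $\mu$ (all discretization parameters are my own choices; a careful solver would treat the boundary and the quadratic nonlinearity more delicately):

```python
import numpy as np

def parisi_psi(atoms, probs, q=1.0, L=6.0, nx=241, ns=20000):
    """Solve (3.4) backward in s by explicit finite differences, Ising case.

    atoms, probs: a finite-support mu on [0, q] with mu([0, q]) = 1.
    Terminal condition: u(q, x) = log int exp(x*s - q*s^2) dP_1(s)
                                = log cosh(x) - q for P_1 uniform on {-1, 1}.
    Returns psi(mu) = -u_mu(0, 0).
    """
    x = np.linspace(-L, L, nx)
    dx = x[1] - x[0]
    s_grid = np.linspace(0.0, q, ns + 1)
    ds = s_grid[1] - s_grid[0]
    assert ds < 0.5 * dx ** 2, "explicit scheme stability constraint"
    m = lambda s: sum(p for a, p in zip(atoms, probs) if a <= s)  # mu([0, s])
    u = np.log(np.cosh(x)) - q
    for s in s_grid[:0:-1]:            # march from s = q down to s = ds
        ux = np.gradient(u, dx)
        uxx = np.gradient(ux, dx)
        u = u + ds * (uxx + m(s) * ux ** 2)   # u(s - ds, .) from u(s, .)
    return -u[nx // 2]                 # -u_mu(0, 0), since x[nx // 2] = 0

print(parisi_psi([0.3, 0.8], [0.6, 0.4]))
```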

4 Proof of Theorem 1.1

In this section, we give the proof of Theorem 1.1. Recall that we interpret the solution of (1.2) as being given by the Hopf–Lax formula in (1.7). The formula (1.7) simplifies slightly in the case when $\mu = \delta _0$ , and thus the statement of Theorem 1.1 can be reformulated as follows.

Proposition 4.1 (Hopf–Lax representation of Parisi formula)

Assume (1.1), and fix the normalization $\xi (1) = 1$ . For every $t> 0$ , we have

$$ \begin{align*} \lim_{N \to \infty} -\frac 1 N\mathbb{E} \log\int\!\! &\exp \left( \sqrt{2t} H_N(\sigma) - N t\right) \, \mathrm{d} P_N(\sigma) \\ &= \mathop{\mathrm{sup}}\limits_{\mu \in \mathcal{P}_{*}(\mathbb{R}_+)} \left( \psi(\mu) - t \int_{\mathbb{R}_+} \xi^*(t^{-1} s) \, \mathrm{d} \mu(s) \right) . \end{align*} $$

Proof We first focus on the case when $P_N$ is the uniform probability measure on $\{-1,1\}^N$ . We decompose the argument for this case into four steps.

Step 1. In this step, we recast the standard expression for the Parisi formula, borrowed from [Reference Panchenko26], in the following form:

(4.1) $$ \begin{align} &\lim_{N \to \infty} -\frac 1 N \mathbb{E} \log \int\!\! \exp \left( \sqrt{2t} H_N(\sigma) -Nt\right) \, \mathrm{d} P_N(\sigma) \nonumber\\ &\qquad\qquad=t +\mathop{\mathrm{sup}}\limits_{\nu \in \mathcal P(\left[0,1\right])}\left( \psi \left( (t\xi')(\nu) \right) -t\xi'(1) + t \int_0^1 s \xi''(s) \nu([0,s]) \, \mathrm{d} s \right) . \end{align} $$

On the right side, the notation $(t\xi ')(\nu )$ denotes the image of the measure $\nu $ under the mapping $r \mapsto t\xi '(r)$ . Let $\nu \in \mathcal P([0,1])$ be a measure with finite support containing the extremal points $0$ and $1$ . For some $k \in \mathbb {N}$ and parameters

$$ \begin{align*} 0 = \zeta_{0} < \zeta_1 < \cdots < \zeta_{k} < \zeta_{k+1} = 1, \qquad 0 = q_{-1} = q_0 < q_1 < \cdots < q_k = 1, \end{align*} $$

we can represent this measure as

$$ \begin{align*} \nu = \sum_{\ell = 0}^{k} (\zeta_{\ell+1} - \zeta_{\ell}) \delta_{q_\ell}. \end{align*} $$

The reason for the perhaps slightly surprising choice of setting $q_{-1} = q_0 = 0$ is that we have chosen here to include a term associated with the root of $\mathcal A$ , at level $\ell = 0$ , in the definition (2.6), while a different choice was taken in [Reference Panchenko26]. (The motivation for this inconsequential difference is that it then covers more naturally the situation in (2.2) as a particular case. Relatedly, by default, the measures of finite support considered in [Reference Panchenko26] have an atom at zero.) In order to extract the free energy associated with the Hamiltonian $\sigma \mapsto \sqrt {2t} H_N(\sigma )$ from [Reference Panchenko26, Theorem 3.1], we need to replace $\xi $ by $2t\xi $ in [Reference Panchenko26, (3.3)]. With this modification in place, and recalling Lemma 2.2, we see that the quantity denoted $\mathbb {E} X_0$ in [Reference Panchenko26, (3.11)] can be rewritten as

$$ \begin{align*} \mathbb{E} \log \left[ \sum_{\sigma \in \{\pm 1\}} \sum_{\alpha \in \mathcal A} v_\alpha \exp \left( \sum_{\ell = 1}^k \left(2t\xi'(q_{\ell}) - 2t\xi'(q_{\ell-1})\right)^{\frac 1 2}z_{\alpha_{|\ell}} \cdot \sigma \right)\right]. \end{align*} $$

On the other hand, by (3.1) and Proposition 3.1, we have

$$ \begin{align*} &\psi\left((t\xi')(\nu)\right) \\ &= -\mathbb{E} \log \left[ \frac 1 2 \sum_{\sigma \in \{\pm 1\}} \sum_{\alpha \in \mathcal A} v_\alpha \exp \left( \sum_{\ell = 1}^k\left(2t\xi'(q_\ell) - 2t\xi'(q_{\ell-1})\right)^{\frac 1 2} z_{\alpha_{|\ell}} \cdot \sigma -t\xi'(1)\right)\right]. \end{align*} $$

We thus deduce that the quantity denoted $\mathbb {E} X_0$ in [Reference Panchenko26, (3.11)] is

$$ \begin{align*} \log 2 - \psi\left((t\xi')(\nu)\right) + t \xi'(1). \end{align*} $$

The finite-volume free energy is normalized slightly differently here and in [Reference Panchenko26]: on the left side of (4.1), there is a multiplicative factor of $2^{-N}$ hidden in the fact that $P_N$ is normalized to be a probability measure, as well as an additional minus sign. Combining these observations and appealing to [Reference Panchenko26, Theorem 3.1] and to Proposition 2.1 yields (4.1).

Step 2. We fix $\nu \in \mathcal P([0,1])$ , $t> 0$ , and define $\mu := (t\xi ')(\nu )$ to be the image of $\nu $ under the mapping $r \mapsto t\xi '(r)$ . In this step, we show that

(4.2) $$ \begin{align} \int_{ \left[0,1\right] } \left( s\xi'(s) - \xi(s) \right) \, \mathrm{d} \nu(s) = \int_{\mathbb{R}_+} \xi^*(t^{-1} s) \, \mathrm{d} \mu(s). \end{align} $$

By the definition of $\mu $ and a change of variables, we have

$$ \begin{align*} \int_{\mathbb{R}_+} \xi^*(t^{-1} s) \, \mathrm{d} \mu(s) = \int_{\left[0,1\right]} \xi^*(\xi'(s)) \, \mathrm{d} \nu(s). \end{align*} $$

Recall that

$$ \begin{align*} \xi^*(\xi'(s)) = \mathop{\mathrm{sup}}\limits_{r \geqslant 0} \left( r\xi'(s) - \xi(r)\right). \end{align*} $$

Since $\xi '(0) = 0$ , for each $s> 0$ , the supremum above is achieved at some $r> 0$ , and calculating the derivative in $r$ shows that it is in fact achieved at $r = s$ , since $\xi '$ is injective. That is, we have $\xi ^*(\xi '(s)) = s\xi '(s) - \xi (s)$ , and thus (4.2) holds.
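As a simple sanity check of (4.2), consider the normalized Sherrington–Kirkpatrick choice $\xi (r) = r^2$ : then $\xi '(s) = 2s$ , and

$$ \begin{align*} \xi^*(\xi'(s)) = \mathop{\mathrm{sup}}\limits_{r \geqslant 0} \left( 2 s r - r^2 \right) = s^2 = s \, \xi'(s) - \xi(s), \end{align*} $$

in agreement with the identity just derived.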

Step 3. In this step, we show that

(4.3) $$ \begin{align} &\lim_{N \to \infty} -\frac 1 N\mathbb{E} \log\int\!\! \exp \left( \sqrt{2t} H_N(\sigma) - N t\right) \, \mathrm{d} P_N(\sigma)\nonumber \\ &\quad= \mathop{\mathrm{sup}}\limits \left \{\psi(\mu) - t \int_{\mathbb{R}_+} \xi^*(t^{-1} s) \, \mathrm{d} \mu(s) \ : \ \mu \in \mathcal P(\mathbb{R}_+), \ \operatorname{\mathrm{supp}} \mu \subseteq [0,t\xi'(1)] \right\} . \end{align} $$

We start by rewriting the last term in the supremum on the right side of (4.1), by appealing to the following integration by parts formula: for every $f \in L^1([0,1])$ ,

(4.4) $$ \begin{align} \int_0^1 f(r) \, \nu([0,r]) \, \mathrm{d} r = \int_{ \left[ 0,1 \right] } \int_r^1 f(u) \, \mathrm{d} u \, \mathrm{d} \nu(r). \end{align} $$
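Indeed, (4.4) follows from Fubini’s theorem: writing $\nu ([0,u]) = \int _{[0,1]} \mathbf {1}_{r \leqslant u} \, \mathrm {d} \nu (r)$ and exchanging the order of integration, we get

$$ \begin{align*} \int_0^1 f(u) \, \nu([0,u]) \, \mathrm{d} u = \int_{ \left[ 0,1 \right] } \int_0^1 f(u) \, \mathbf{1}_{u \geqslant r} \, \mathrm{d} u \, \mathrm{d} \nu(r) = \int_{ \left[ 0,1 \right] } \int_r^1 f(u) \, \mathrm{d} u \, \mathrm{d} \nu(r). \end{align*} $$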

We next notice that

$$ \begin{align*} \int_r^1 s \xi''(s) \, \mathrm{d} s = \xi'(1) - \xi(1)- \left(r\xi'(r) - \xi(r)\right). \end{align*} $$

Recalling that we have fixed the normalization $\xi (1) = 1$ yields that

$$ \begin{align*} \int_0^1 s \xi''(s) \nu([0,s]) \, \mathrm{d} s = \xi'(1) - 1 - \int_{ \left[ 0,1 \right] } \left( r \xi'(r) - \xi(r) \right) \, \mathrm{d} \nu(r). \end{align*} $$

Combining this with (4.1), (4.2), and the fact that $\xi ' : [0,1] \to [0,\xi '(1)]$ is bijective, we obtain (4.3).
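Explicitly, substituting this last identity into the supremum in (4.1) and using (4.2), the terms proportional to $t$ cancel:

$$ \begin{align*} t + \psi(\mu) - t\xi'(1) + t \left( \xi'(1) - 1 - \int_{ \left[ 0,1 \right] } \left( r \xi'(r) - \xi(r) \right) \mathrm{d} \nu(r) \right) = \psi(\mu) - t \int_{\mathbb{R}_+} \xi^*(t^{-1} s) \, \mathrm{d} \mu(s), \end{align*} $$

where $\mu = (t\xi ')(\nu )$ ; and, as $\nu $ ranges over $\mathcal P([0,1])$ , the measure $\mu $ ranges over the probability measures supported in $[0,t\xi '(1)]$ .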

Step 4. In order to conclude the proof (in the case of Ising spins), it remains to show that the supremum on the right side of (4.3) does not increase if we remove the restriction that the support of the measure $\mu $ be in $[0,t \xi '(1)]$ . Let $\mu \in \mathcal P_*(\mathbb {R}_+)$ , and let $\widetilde \mu $ denote the image of $\mu $ under the mapping $r \mapsto r \wedge (t\xi '(1))$ , where we write $a \wedge b := \min (a,b)$ . We show that

(4.5) $$ \begin{align} \psi(\mu) - t \int_{\mathbb{R}_+} \xi^*(t^{-1} s) \, \mathrm{d} \mu(s) \leqslant \psi(\widetilde \mu) - t \int_{\mathbb{R}_+} \xi^*(t^{-1} s) \, \mathrm{d} \widetilde \mu(s) .\end{align} $$

By Proposition 2.1 and Fubini’s theorem, we have

(4.6) $$ \begin{align} \notag \psi(\mu) - \psi(\widetilde \mu) \leqslant \mathbb{E} \left[ \left| X_\mu - X_{\widetilde \mu} \right| \right] & = \int_0^{+\infty} \left|\mu([0,r]) - \widetilde \mu([0,r])\right| \, \mathrm{d} r \\ \notag & = \int_{t\xi'(1)}^{+\infty} \mu((r,+\infty)) \, \mathrm{d} r \\ \notag & = \int_{t\xi'(1)}^{+\infty} \int_{\mathbb{R}_+}\mathbf{1}_{s \geqslant r} \, \mathrm{d} \mu(s) \, \mathrm{d} r\\ & = \int_{\mathbb{R}_+} \left(s - t\xi'(1) \right)\mathbf{1}_{s \geqslant t\xi'(1)}\, \mathrm{d} \mu(s). \end{align} $$

On the other hand, by the definition of $\widetilde \mu $ , we have

$$ \begin{align*} \int_{\mathbb{R}_+} \xi^*(t^{-1} s) \, \mathrm{d} \widetilde \mu(s) = \int_{\mathbb{R}_+} \xi^*\left((t^{-1} s) \wedge \xi'(1) \right) \, \mathrm{d} \mu(s), \end{align*} $$

and thus

(4.7) $$ \begin{align} &\int_{\mathbb{R}_+} \xi^*(t^{-1} s) \, \mathrm{d} \mu(s) - \int_{\mathbb{R}_+} \xi^*(t^{-1} s) \, \mathrm{d} \widetilde \mu(s) \nonumber\\ &\qquad\qquad\qquad\qquad\qquad\qquad=\int_{\mathbb{R}_+} \left(\xi^*(t^{-1}s) - \xi^*\left( \xi'(1)\right)\right)\mathbf{1}_{s \geqslant t\xi'(1)} \, \mathrm{d} \mu(s). \end{align} $$

Recall that $\xi ^*(\xi '(1)) = \xi '(1) - \xi (1)$ . By the definition of the convex dual, we also have that $\xi ^*(s) \geqslant s - \xi (1)$ . Hence, the integral on the right side of (4.7) is bounded from below by

$$ \begin{align*} \int_{\mathbb{R}_+} \left( t^{-1} s -\xi'(1) \right) \mathbf{1}_{s \geqslant t \xi'(1)} \, \mathrm{d} \mu(s). \end{align*} $$

Combining this with (4.6) yields (4.5) and thus completes the proof in the case of Ising spins.

Step 5. We show Proposition 4.1 in the case when $P_N$ is the uniform probability measure on the sphere $\{\sigma \in \mathbb {R}^N \ : \ |\sigma |^2 = N\}$ . Using [Reference Talagrand30, Corollary 4.1] and arguing as in Step 1, one can check that the formula (4.1) is also valid in this case. The rest of the argument carries over without modification.   ▪

5 Finite-dimensional approximations

In this last section, we lightly touch upon the question of giving an intrinsic meaning to the Hamilton–Jacobi equation (1.2). This allows us to give some substance to the connection between this equation and the Hopf–Lax formula in (1.7).

There already exists a rich literature on Hamilton–Jacobi equations in infinite-dimensional Banach spaces, as well as on the Wasserstein space of probability measures or more general metric spaces; see in particular [10–12] for the former and [Reference Ambrosio and Feng1, Reference Cardaliaguet7–Reference Cardaliaguet and Souquière9, 14–17] for the latter. I will refrain from engaging with these works here, and only discuss finite-dimensional approximations of the solution to (1.2).

A simple way to obtain a finite-dimensional approximation of (1.2) is to fix an integer $k \geqslant 1$ and restrict the space of allowed probability measures to those belonging to

$$ \begin{align*} \mathcal P^{(k)}(\mathbb{R}_+) := \left\{ \frac 1 {k} \sum_{\ell = 1}^k \delta_{x_\ell} \ : \ x_1, \ldots, x_k \geqslant 0 \right\} . \end{align*} $$

A natural discretization of the formula (1.7) is then obtained by setting, for every $t \geqslant 0$ and $\mu \in \mathcal P^{(k)}(\mathbb {R}_+)$ ,

$$ \begin{align*} f^{(k)}(t,\mu) := \mathop{\mathrm{sup}}\limits_{\nu \in \mathcal P^{(k)} (\mathbb{R}_+)} \left( \psi(\nu) - t \mathbb{E} \left[ \xi^* \left( \frac{X_\nu - X_\mu}{t} \right) \right]\right) . \end{align*} $$

Abusing notation slightly, we also write, for every $x \in \mathbb {R}_+^{k}$ ,

$$ \begin{align*} f^{(k)}(t,x) := f^{(k)} \left( t, \frac 1 {k} \sum_{\ell = 1}^k \delta_{x_\ell} \right) \quad \text{ and } \quad \psi^{(k)}(x) := \psi \left( \frac 1 {k} \sum_{\ell = 1}^k \delta_{x_\ell}\right). \end{align*} $$

We note the following elementary observation.

Lemma 5.1 For every $t \geqslant 0$ and $x \in \mathbb {R}_+^{k}$ , we have

(5.1) $$ \begin{align} f^{(k)}(t,x) = \mathop{\mathrm{sup}}\limits_{y \in \mathbb{R}^{k}_{+}} \left( \psi^{(k)}(y) - \frac t {k} \sum_{\ell = 1}^k \xi^* \left( \frac{y_\ell - x_\ell}{t} \right) \right) .\end{align} $$

Proof We introduce the notation

$$ \begin{align*} \mathbb{R}^{k\uparrow}_{+} := \left\{ x = (x_1,\ldots,x_k) \in \mathbb{R}_+^k \ : \ x_1 \leqslant \cdots \leqslant x_k \right\} . \end{align*} $$

Notice first that the quantities $f^{(k)}(t,x)$ and $\psi ^{(k)}(x)$ are invariant under permutations of the coordinates of $x$ . Hence, it suffices to prove the relation (5.1) under the additional assumption that $x \in \mathbb {R}^{k\uparrow }_+$ . It is clear that equality holds if, on the right side, we take the supremum over $y \in \mathbb {R}^{k\uparrow }_+$ only. We now verify that other orderings of a given vector $y$ yield a value at least as large for the sum on the right side of (5.1). Indeed, fix $x \in \mathbb {R}^{k\uparrow }_+$ , $y \in \mathbb {R}^k_+$ , and assume that there exist $i < j \in \{1,\ldots ,k\}$ such that $y_i \geqslant y_j$ . By the convexity of $\xi ^*$ and the fact that $x_i \leqslant x_j$ , the function $u \mapsto \xi ^* \left ( \frac {u - x_i}{t} \right ) - \xi ^* \left ( \frac {u - x_j}{t} \right )$ is nondecreasing. In particular,

$$ \begin{align*} \xi^* \left( \frac{y_i - x_i}{t} \right) - \xi^* \left( \frac{y_i - x_j}{t} \right) \geqslant \xi^* \left( \frac{y_j - x_i}{t} \right) - \xi^* \left( \frac{y_j - x_j}{t} \right),\end{align*} $$

and therefore

$$ \begin{align*} \xi^* \left( \frac{y_j - x_i}{t} \right) + \xi^* \left( \frac{y_i - x_j}{t} \right) \leqslant \xi^* \left( \frac{y_i - x_i}{t} \right) + \xi^* \left( \frac{y_j - x_j}{t} \right) .\end{align*} $$

That is, whenever $i < j$ and $y_i \geqslant y_j$ , replacing $y$ by the vector with the coordinates $y_i$ and $y_j$ interchanged can only reduce (or keep constant) the value of the quantity

$$ \begin{align*} \sum_{\ell = 1}^k \xi^* \left( \frac{y_\ell - x_\ell}{t} \right) . \end{align*} $$

By induction, this implies that replacing the vector $y$ by the increasingly ordered rearrangement of its coordinates can only reduce (or keep constant) the quantity above.    ▪
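In view of (5.1), the approximation $f^{(k)}$ can in principle be computed numerically. Below is a rough sketch under the normalized choice $\xi (r) = r^2$ , for which $\xi ^*(y) = (\max (y,0))^2/4$ . It reuses numpy and the hypothetical helper v0 from the sketch in Section 3, together with the sign convention $\psi (\mu ) = \overline F_1(0,\mu ) = -v_\mu (0,0)$ suggested by the proof of Proposition 3.2; the names psi_k and f_k are ours, and the local search only produces a heuristic value for the supremum.

```python
import numpy as np
from scipy.optimize import minimize

def xi_star(y):
    # Convex dual of xi(r) = r^2 over r >= 0: (y/2)^2 for y >= 0, else 0.
    return 0.25 * max(y, 0.0) ** 2

def psi_k(y):
    # psi^{(k)} at (1/k) sum_l delta_{y_l}: equal weights correspond to
    # zeta_l = l/k, with the atoms sorted increasingly.
    k = len(y)
    return -v0(qs=sorted(y), zetas=[l / k for l in range(k)])

def f_k(t, x):
    # Right side of (5.1), maximized over y in R_+^k by local search.
    k = len(x)
    def neg(y):
        penalty = (t / k) * sum(xi_star((yl - xl) / t) for yl, xl in zip(y, x))
        return -(psi_k(y) - penalty)
    res = minimize(neg, x0=np.asarray(x, dtype=float) + t,
                   bounds=[(0.0, None)] * k, method="L-BFGS-B")
    return -res.fun

print(f_k(1.0, [0.0, 0.0]))  # k = 2 approximation of the value at mu = delta_0
```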

The convex dual of the mapping

$$ \begin{align*} \left\{ \begin{array}{lcl} \mathbb{R}^k & \to & \mathbb{R} \\ x & \mapsto & \frac 1 k \sum_{\ell = 1}^k \xi^*(x_\ell) \end{array} \right. \end{align*} $$

is

(5.2) $$ \begin{align} \left\{ \begin{array}{lcl} \mathbb{R}^k & \to & \mathbb{R} \\ p & \mapsto & \left|\begin{array}{@{\,}ll} \frac 1 k \sum_{\ell = 1}^k \xi(k \, p_\ell) & \text{if } p \in \mathbb{R}_+^k\\ +\infty & \text{otherwise}. \end{array} \right. \end{array} \right. \end{align} $$
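Indeed, the supremum defining this dual splits across the coordinates: for every $p \in \mathbb {R}^k$ ,

$$ \begin{align*} \mathop{\mathrm{sup}}\limits_{x \in \mathbb{R}^k} \left( p \cdot x - \frac 1 k \sum_{\ell = 1}^k \xi^*(x_\ell) \right) = \frac 1 k \sum_{\ell = 1}^k \mathop{\mathrm{sup}}\limits_{x_\ell \in \mathbb{R}} \left( k \, p_\ell \, x_\ell - \xi^*(x_\ell) \right) = \frac 1 k \sum_{\ell = 1}^k \xi^{**}(k \, p_\ell). \end{align*} $$

Since $\xi ^*$ is defined in (1.4) as the convex dual of $\xi $ restricted to $\mathbb {R}_+$ , we have $\xi ^{**}(y) = \xi (y)$ for every $y \geqslant 0$ , while $\xi ^{**}(y) = +\infty $ for $y < 0$ , because $\xi ^*$ vanishes on $(-\infty ,0]$ ; this yields (5.2).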

It follows from Proposition 2.1 that $\psi ^{(k)}$ is Lipschitz continuous, and from this, one can show that $f^{(k)}$ is Lipschitz continuous in $x$ and in $t$ . In particular, the function $f^{(k)}$ is differentiable almost everywhere in $[0,\infty )\times \mathbb {R}_+^k$ . Following classical arguments (see, e.g., [Reference Benton5] or [Reference Evans13, Theorem 3.3.5]), we thus deduce from (5.1) that at every $(t,x) \in (0,\infty )\times (0,\infty )^k$ at which $f^{(k)}$ is differentiable, we have

(5.3) $$ \begin{align} \partial_t f^{(k)}(t,x) - \frac 1 k \sum_{\ell = 1}^k \xi \left( k \partial_{x_\ell} f^{(k)}(t,x) \right) = 0. \end{align} $$

This identification also uses that $\partial _{x_\ell } f^{(k)}(t,x) \geqslant 0$ ; see (5.2). The latter property can be obtained as a consequence of the fact that $\partial _{x_\ell } \psi ^{(k)} \geqslant 0$ , which itself follows from Lemma 2.4.

We can now verify that the equation in (5.3) is formally consistent with a finite-dimensional interpretation of the Hamilton–Jacobi equation (1.2). In view of (1.8), and assuming “smoothness” of the function $f$ , we must have, for every $\mu = k^{-1} \sum _{\ell = 1}^k \delta _{x_\ell } \in \mathcal P^{(k)}(\mathbb {R}_+)$ and $\ell \in \{1,\ldots , k\}$ , that

$$ \begin{align*} \partial_{\mu} f(t,\mu,x_\ell) = k\, \partial_{x_\ell} f^{(k)}(t,x), \end{align*} $$

and thus

$$ \begin{align*} \int\!\! \xi(\partial_{\mu} f) \, \mathrm{d} \mu = \frac 1 k \sum_{\ell = 1}^k \xi \left(k \, \partial_{x_\ell} f^{(k)} \right) . \end{align*} $$

We have thus obtained a formal relation between the Hamilton–Jacobi equation in (1.2) and that in (5.3), which itself can be rigorously connected with the Hopf–Lax formula (5.1)—see [Reference Mourrat21] and [Reference Mourrat22] on how to handle the boundary condition on $\partial (\mathbb {R}_+^k)$ in the contexts of viscosity solutions and weak solutions, respectively.

Acknowledgment

I would like to thank Dmitry Panchenko for useful comments, in particular for pointing out an error in an earlier argument for the validity of (2.18).

Footnotes

This work was partially supported by the ANR grants LSD (ANR-15-CE40-0020-03) and Malin (ANR-16-CE93-0003).

References

Ambrosio, L. and Feng, J., On a class of first order Hamilton-Jacobi equations in metric spaces. J. Differ. Equ. 256(2014), no. 7, 2194–2245.
Ambrosio, L., Gigli, N., and Savaré, G., Gradient flows in metric spaces and in the space of probability measures. 2nd ed., Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, Switzerland, 2008.
Barra, A., Del Ferraro, G., and Tantari, D., Mean field spin glasses treated with PDE techniques. Eur. Phys. J. B 86(2013), no. 7, 1–10.
Barra, A., Di Biasio, A., and Guerra, F., Replica symmetry breaking in mean-field spin glasses through the Hamilton-Jacobi technique. J. Stat. Mech. Theory Exp. 22(2010), no. 9, P09006.
Benton, S. H. Jr., The Hamilton-Jacobi equation. Academic Press, New York, NY, London, UK, 1977.
Brankov, J. G. and Zagrebnov, V. A., On the description of the phase transition in the Husimi-Temperley model. J. Phys. A 16(1983), no. 10, 2217–2224.
Cardaliaguet, P., Notes on mean field games. Technical report, 2010.
Cardaliaguet, P. and Quincampoix, M., Deterministic differential games under probability knowledge of initial condition. Int. Game Theory Rev. 10(2008), no. 1, 1–16.
Cardaliaguet, P. and Souquière, A., A differential game with a blind player. SIAM J. Control Optim. 50(2012), no. 4, 2090–2116.
Crandall, M. G. and Lions, P.-L., Hamilton-Jacobi equations in infinite dimensions. I. Uniqueness of viscosity solutions. J. Funct. Anal. 62(1985), no. 3, 379–396.
Crandall, M. G. and Lions, P.-L., Hamilton-Jacobi equations in infinite dimensions. II. Existence of viscosity solutions. J. Funct. Anal. 65(1986), no. 3, 368–405.
Crandall, M. G. and Lions, P.-L., Hamilton-Jacobi equations in infinite dimensions. III. J. Funct. Anal. 68(1986), no. 2, 214–247.
Evans, L. C., Partial differential equations. 2nd ed., Graduate Studies in Mathematics, 19, American Mathematical Society, Providence, RI, 2010.
Feng, J. and Katsoulakis, M., A comparison principle for Hamilton-Jacobi equations related to controlled gradient flows in infinite dimensions. Arch. Ration. Mech. Anal. 192(2009), no. 2, 275–310.
Feng, J. and Kurtz, T. G., Large deviations for stochastic processes, Mathematical Surveys and Monographs, 131, American Mathematical Society, Providence, RI, 2006.
Gangbo, W., Nguyen, T., and Tudorascu, A., Hamilton-Jacobi equations in the Wasserstein space. Methods Appl. Anal. 15(2008), no. 2, 155–183.
Gangbo, W. and Święch, A., Optimal transport and large number of particles. Discrete Contin. Dyn. Syst. 34(2014), no. 4, 1397–1441.
Guerra, F., Sum rules for the free energy in the mean field spin glass model. Fields Inst. Commun. 30(2001), 161.
Guerra, F., Broken replica symmetry bounds in the mean field spin glass model. Commun. Math. Phys. 233(2003), no. 1, 1–12.
Mézard, M., Parisi, G., and Virasoro, M., Spin glass theory and beyond: an introduction to the replica method and its applications. Vol. 9, World Scientific Publishing Company, 1987.
Mourrat, J.-C., Hamilton-Jacobi equations for mean-field disordered systems. Ann. Henri Lebesgue (2018), to appear.
Mourrat, J.-C., Hamilton-Jacobi equations for finite-rank matrix inference. Ann. Appl. Probab. 30(2020), no. 5, 2234–2260.
Mourrat, J.-C., Nonconvex interactions in mean-field spin glasses. Probab. Math. Phys. (2020), to appear.
Mourrat, J.-C. and Panchenko, D., Extending the Parisi formula along a Hamilton-Jacobi equation. Electron. J. Probab. 25(2020), no. 23, 1–17.
Newman, C., Percolation theory: a selective survey of rigorous results. In: Advances in multiphase flow and related problems (G. Papanicolaou, ed.), SIAM, Philadelphia, PA, 1986.
Panchenko, D., The Sherrington-Kirkpatrick model, Springer Monographs in Mathematics, Springer, New York, NY, 2013.
Panchenko, D. and Talagrand, M., Guerra’s interpolation using Derrida-Ruelle cascades. Unpublished manuscript, 2007. arXiv:0708.3641.
Parisi, G., A sequence of approximated solutions to the SK model for spin glasses. J. Phys. A 13(1980), no. 4, L115–L121.
Rao, M. M. and Ren, Z. D., Theory of Orlicz spaces, Monographs and Textbooks in Pure and Applied Mathematics, 146, Marcel Dekker, Inc., New York, NY, 1991.
Talagrand, M., Free energy of the spherical mean field model. Probab. Theory Related Fields 134(2006), no. 3, 339–382.
Talagrand, M., The Parisi formula. Ann. Math. 163(2006), no. 1, 221–263.
Talagrand, M., Mean field models for spin glasses. Vol. I, Ergebnisse der Mathematik und ihrer Grenzgebiete, 54, Springer-Verlag, Berlin, Germany, 2011.
Talagrand, M., Mean field models for spin glasses. Vol. II, Ergebnisse der Mathematik und ihrer Grenzgebiete, 55, Springer, Heidelberg, Germany, 2011.
Thurston, W. P., On proof and progress in mathematics. Bull. Amer. Math. Soc. 30(1994), no. 2, 161–177.
Villani, C., Topics in optimal transportation, Graduate Studies in Mathematics, 58, American Mathematical Society, Providence, RI, 2003.