Wall-to-wall optimal transport in two dimensions

Andre N. Souza; Ian Tobasco; Charles R. Doering

doi:10.1017/jfm.2020.42

Wall-to-wall optimal transport in two dimensions

Published online by Cambridge University Press: 28 February 2020

Andre N. Souza

Ian Tobasco and

Charles R. Doering

Show author details

Andre N. Souza*: Affiliation:
Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Ian Tobasco: Affiliation:
Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, 851 S. Morgan Street, Chicago, IL 60607-7045, USA
Charles R. Doering: Affiliation:
Department of Mathematics, University of Michigan, Ann Arbor, MI 48109-1043, USA Center for the Study of Complex Systems, University of Michigan, Ann Arbor, MI 48109-1042, USA Department of Physics, University of Michigan, Ann Arbor, MI 48109-1140, USA
*: †Email address for correspondence: andrenogueirasouza@gmail.com

Article contents

Abstract
Introduction
Theory
Gradient flow
Two-dimensional computations
Conditional upper bound
Summary and conclusions
References

Rights & Permissions

Abstract

Gradient ascent methods are developed to compute incompressible flows that maximize heat transport between two isothermal no-slip parallel walls. Parameterizing the magnitude of the velocity fields by a Péclet number $Pe$ proportional to their root-mean-square rate of strain, the schemes are applied to compute two-dimensional flows optimizing convective enhancement of diffusive heat transfer, i.e. the Nusselt number $Nu$ up to $Pe\approx 10^{5}$. The resulting transport exhibits a change of scaling from $Nu-1\sim Pe^{2}$ for $Pe<10$ in the linear regime to $Nu\sim Pe^{0.54}$ for $Pe>10^{3}$. Optimal fields are observed to be approximately separable, i.e. products of functions of the wall-parallel and wall-normal coordinates. Analysis employing a separable ansatz yields a conditional upper bound ${\lesssim}Pe^{6/11}=Pe^{0.\overline{54}}$ as $Pe\rightarrow \infty$ similar to the computationally achieved scaling. Implications for heat transfer in buoyancy-driven Rayleigh–Bénard convection are discussed.

JFM classification

Mathematical Foundations: Variational methods Flow Control: Mixing enhancement

Type: JFM Papers
Information: Journal of Fluid Mechanics , Volume 889 , 25 April 2020 , A34

DOI: https://doi.org/10.1017/jfm.2020.42 [Opens in a new window]
Copyright: © The Author(s), 2020. Published by Cambridge University Press

1 Introduction

Heat transfer via fluid advection is a critical component of atmospheric, oceanographic, geophysical and astrophysical dynamics, as well as being the basis of cooling systems in engineering applications. Numerous studies on how to design systems that achieve enhanced heat transfer by either manipulation of domain geometry or through the discovery of suitable flow structures have recently appeared in the literature (Alben Reference Alben2017; Toppaladoddi, Succi & Wettlaufer Reference Toppaladoddi, Succi and Wettlaufer2017; Marcotte et al. Reference Marcotte, Doering, Thiffeault and Young2018; Motoki, Kawahara & Shimizu Reference Motoki, Kawahara and Shimizu2018a,Reference Motoki, Kawahara and Shimizub). A particularly fruitful approach to discovering flow structures was first introduced by Hassanzadeh, Chini & Doering (Reference Hassanzadeh, Chini and Doering2014), where it was formulated via an optimal control approach.

The original motivation for Hassanzadeh et al. (Reference Hassanzadeh, Chini and Doering2014) was to develop a new tool for obtaining upper bounds on thermal transport by buoyancy-driven flows, e.g. for Rayleigh–Bénard convection. The analysis and derivation of upper bounds on transport properties plays a prominent role in expanding our knowledge of fundamental fluid dynamics and serves a complimentary role to other methods of inquiry, i.e. direct numerical simulations of the underlying equations of motion, invoking closure models to determine statistical properties, or postulating phenomenological models of turbulent transport. The first proof of upper bounds on heat transfer by Rayleigh–Bénard convection was achieved in Howard (Reference Howard1963). The complementary ‘background method’ was subsequently introduced in Doering & Constantin (Reference Doering and Constantin1996). Both approaches leverage certain bulk integral constraints derived from the equations of motion and yield bounds which apply to a strictly larger class of flows. It remains unknown whether the resulting bounds are realizable by buoyancy-driven flows.

Unlike those previous approaches, the wall-to-wall transport problem introduced in Hassanzadeh et al. (Reference Hassanzadeh, Chini and Doering2014) fully enforces the advection–diffusion equation for the temperature field pointwise in space and time. Admissible incompressible advecting flow fields do not (necessarily) satisfy an equation of motion, but are instead subject to a bulk integral intensity constraint, i.e. fixed finite magnitude of energy or enstrophy, and suitable boundary conditions. This allows consideration of the following question: amongst all possible incompressible flow fields subject to a fixed intensity constraint and relevant boundary conditions, which ones maximize thermal transport?

As usual, we model heat transport with the advection–diffusion equation

(1.1)

$$\begin{eqnarray}\unicode[STIX]{x2202}_{t}T+\unicode[STIX]{x1D735}\boldsymbol{\cdot }(\boldsymbol{u}T-\unicode[STIX]{x1D705}\unicode[STIX]{x1D735}T)=0,\end{eqnarray}$$

where the coefficient $\unicode[STIX]{x1D705}$ is the thermal diffusivity. The two-dimensional spatial domain is $\unicode[STIX]{x1D6FA}=[0,L_{x})\times [0,L_{z}]$ and the temperature field $T$ is periodic in the horizontal $x$ direction, ‘hot’ on the bottom boundary where $T(z=0)=T_{0}$ and ‘cool’ on the top boundary where $T(z=L_{z})=T_{1}$ with $T_{0}>T_{1}$. The advecting flow field $\boldsymbol{u}=u_{1}\hat{x}+u_{3}\hat{z}$ is divergence free ($\unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{u}=0$) with no penetration through the boundaries, i.e. $u_{3}(z=0)=0=u_{3}(z=L_{z})$. In this paper the velocity is restricted to satisfy no-slip boundary conditions on the top and bottom boundaries, $u_{1}(z=0)=0=u_{1}(z=L_{z})$. Both components are periodic in the horizontal $x$ direction. Other boundary conditions can be handled using similar methods. Initial ($t=0$) data for the temperature field are provided to formally pose the evolution problem for $t>0$. Using units $L_{z}$ and $L_{z}^{2}/\unicode[STIX]{x1D705}$ for space and time and changing the temperature $T\rightarrow (T-T_{0})/(T_{1}-T_{0})$ transforms the system to

(1.2)

$$\begin{eqnarray}\unicode[STIX]{x2202}_{t}T+\unicode[STIX]{x1D735}\boldsymbol{\cdot }(\boldsymbol{u}T-\unicode[STIX]{x1D735}T)=0,\end{eqnarray}$$

with $x\in [0,\unicode[STIX]{x1D6E4})$ where $\unicode[STIX]{x1D6E4}=L_{x}/L_{z}$, $z\in [0,1]$, and with $T(z=0)=1$ and $T(z=1)=0$. We consider (1.2) henceforth.

Given a flow field $\boldsymbol{u}$ the non-dimensional measure of thermal transport is the space- and long-time average of the convective heat flux in the vertical direction, i.e. the Nusselt number

(1.3)

$$\begin{eqnarray}Nu\{\boldsymbol{u}\}=\langle u_{3}T-\unicode[STIX]{x2202}_{z}T\rangle ,\end{eqnarray}$$

where $\langle \cdot \rangle$ denotes the space–time average

(1.4)

$$\begin{eqnarray}\langle f\rangle \equiv \limsup _{\unicode[STIX]{x1D70F}\rightarrow \infty }\frac{1}{\unicode[STIX]{x1D70F}\unicode[STIX]{x1D6E4}}\int _{0}^{\unicode[STIX]{x1D70F}}\int _{0}^{1}\int _{0}^{\unicode[STIX]{x1D6E4}}f(x,z,t)\,\text{d}x\,\text{d}z\,\text{d}t.\end{eqnarray}$$

The boundary conditions imply unit mean conductive heat flux, i.e. $\langle -\unicode[STIX]{x2202}_{z}T\rangle =1$.

In this work, the goal is to determine the largest possible value of $Nu$ as a function of $\boldsymbol{u}$, which we parameterize by a Péclet number defined as the root-mean-square vorticity (the square root of the mean enstrophy density) $Pe\equiv \langle |\unicode[STIX]{x1D735}\times \boldsymbol{u}|^{2}\rangle ^{1/2}$. Incompressibility and the boundary conditions imply that $Pe$ is simply related to the norm of $\unicode[STIX]{x1D735}\boldsymbol{u}$ and the mean-square rate of strain,

(1.5)

$$\begin{eqnarray}Pe^{2}=\langle |\unicode[STIX]{x1D735}\times \boldsymbol{u}|^{2}\rangle =\langle |\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle =2\langle |(\unicode[STIX]{x1D735}\boldsymbol{u})_{sym}|^{2}\rangle \equiv |\unicode[STIX]{x1D735}u_{1}|^{2}+|\unicode[STIX]{x1D735}u_{3}|^{2}.\end{eqnarray}$$

The optimal wall-to-wall transport problem is then to maximize $Nu$ as a function of $Pe$. Explicitly, we take on the task to

(1.6)

$$\begin{eqnarray}\displaystyle & \displaystyle \text{Maximize }\langle u_{3}T\rangle \text{subject to } & \displaystyle\end{eqnarray}$$

(1.7)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x2202}_{t}T+\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}T=\unicode[STIX]{x0394}T, & \displaystyle \nonumber\\ \displaystyle & \displaystyle \unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{u}=0,~\langle |\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle =Pe^{2}, & \displaystyle\end{eqnarray}$$

(1.8)

$$\begin{eqnarray}\displaystyle & \displaystyle \text{and the boundary conditions}. & \displaystyle\end{eqnarray}$$

Note in the wall-to-wall problem the velocity field $\boldsymbol{u}$ is not required to satisfy conservation of momentum. Nevertheless, in Hassanzadeh et al. (Reference Hassanzadeh, Chini and Doering2014) its solution was shown to inform bounds on buoyancy-driven driven transport; indeed, the original motivation for introducing the wall-to-wall problem was to find new upper bounds on the Nusselt number in Rayleigh–Bénard convection modelled by the Boussinesq equations

(1.9)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x2202}_{t}\boldsymbol{u}+\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\boldsymbol{u}+\unicode[STIX]{x1D735}p=Pr\unicode[STIX]{x0394}\boldsymbol{u}+Pr\,Ra\,T\,\hat{z}, & \displaystyle\end{eqnarray}$$

(1.10)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{u}=0, & \displaystyle\end{eqnarray}$$

and

(1.11)

$$\begin{eqnarray}\unicode[STIX]{x2202}_{t}T+\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}T=\unicode[STIX]{x0394}T.\end{eqnarray}$$

Here, the Prandtl number $Pr$ and the Rayleigh number $Ra$ are given by

(1.12a,b)

$$\begin{eqnarray}Pr=\unicode[STIX]{x1D708}/\unicode[STIX]{x1D705}\quad \text{and}\quad Ra=\frac{\unicode[STIX]{x1D6FC}g(T_{1}-T_{0})(L_{z})^{3}}{\unicode[STIX]{x1D705}\unicode[STIX]{x1D708}},\end{eqnarray}$$

where $\unicode[STIX]{x1D708}$ is the kinematic viscosity, $g$ is the acceleration due to gravity and $\unicode[STIX]{x1D6FC}$ is the thermal expansion coefficient. The challenge for Rayleigh–Bénard is to derive upper bounds on the convective heat transport of the form $Nu-1\leqslant f(Pr,Ra)$ that hold for all solutions of the Boussinesq system. The connection between the Péclet number $Pe$ and the Rayleigh number $Ra$ is obtained by dotting $\boldsymbol{u}$ into (1.9) and averaging over space and time (utilizing integration by parts with the given boundary conditions and (1.10)) to obtain the identity $Pe^{2}=Ra(Nu-1)$.

The flow field of the wall-to-wall problem is not constrained to satisfy the Navier–Stokes system, but free to be any incompressible flow field with finite $Pe$. As a result, any upper bound for the wall-to-wall problem implies an upper bound for convective heat transport among solutions of the Boussinesq equations. Indeed, let $(Nu)_{natural}$ denote the largest possible heat transport of the Boussinesq system and $(Nu)_{optimal}$ denote the largest possible heat transport of the wall-to-wall problem for the same Péclet number, then $(Nu)_{natural}\leqslant (Nu)_{optimal}$. Now, suppose that we find an upper bound of the form $(Nu-1)_{optimal}\leqslant cPe^{\unicode[STIX]{x1D6FD}}$ where $0<\unicode[STIX]{x1D6FD}<2$ and $0<c$ for the wall-to-wall problem, then

(1.13)

$$\begin{eqnarray}(Nu-1)_{natural}\leqslant (Nu-1)_{optimal}\leqslant cPe^{\unicode[STIX]{x1D6FD}}\end{eqnarray}$$

implies

(1.14)

$$\begin{eqnarray}\frac{Pe^{2}}{Ra}\leqslant (Nu-1)_{optimal}\leqslant cPe^{\unicode[STIX]{x1D6FD}},\end{eqnarray}$$

by utilizing the $Pe^{2}=Ra(Nu-1)_{natural}$ from (1.9). This, in turn, means $Pe\leqslant c^{1/(2-\unicode[STIX]{x1D6FD})}Ra^{1/(2-\unicode[STIX]{x1D6FD})}$ and hence, $(Nu-1)_{natural}\leqslant c^{2/(2-\unicode[STIX]{x1D6FD})}Ra^{\unicode[STIX]{x1D6FD}/(2-\unicode[STIX]{x1D6FD})}$. In summary, an upper bound of the form $(Nu-1)_{optimal}\leqslant cPe^{\unicode[STIX]{x1D6FD}}$ implies an upper bound of the form $(Nu-1)_{natural}\leqslant c^{2/(2-\unicode[STIX]{x1D6FD})}Ra^{\unicode[STIX]{x1D6FD}/(2-\unicode[STIX]{x1D6FD})}$. The stress-free results of Hassanzadeh et al. (Reference Hassanzadeh, Chini and Doering2014) were that $Nu\lesssim Pe^{10/17}$ or, in terms of Rayleigh–Bénard convection, $Nu\lesssim Ra^{5/12}$.

The optimal wall-to-wall transport problem is evidently non-convex – there may be many local maxima and global extrema – and only by evaluating the global maximum are we assured of an upper bound for Rayleigh–Bénard convection. Nevertheless, to make progress, we seek local maxima numerically in this paper and discuss the resulting flows. These flows are of interest in their own right as mechanisms to significantly enhance heat transport, e.g. as targets for control.

The rest of this paper is organized as follows. We first introduce Lagrange multipliers to implement the constraints of the wall-to-wall optimal transport problem and examine the structure of the resulting functional in § 2. From insights gained from manipulations of the functional, we develop time-stepping algorithms in § 3 to solve the Euler–Lagrange equations. Solutions to the saddle point conditions and resulting transport scalings are presented for time-independent two-dimensional flow fields with no-slip boundaries in § 4. Upon investigation of the numerical solutions we see that the fields are to a very high degree separable, i.e. the computed streamfunctions satisfy $\unicode[STIX]{x1D713}(x,z)\approx \unicode[STIX]{x1D719}(x)\unicode[STIX]{x1D6F9}(z)$ and similarly for the other fields. This numerical observation motivates an analytic examination of upper bounds on the wall-to-wall problem with an additional separable ansatz – which is apparently almost satisfied by solutions of the wall-to-wall Euler–Lagrange equations – in § 5 leading to conditional upper bounds on heat transport of the form $Nu\lesssim Pe^{6/11}$ or, in terms of Rayleigh number, $Nu\lesssim Ra^{3/8}$.

Along the way, we discuss the relationship of the wall-to-wall optimal transport problem to the background method of Doering & Constantin (Reference Doering and Constantin1996) in § 2.4, and to the Howard–Busse–Malkus problem (Malkus Reference Malkus1954; Howard Reference Howard1963; Busse Reference Busse1969) in § 2.6. The perspective developed in those sections inspires the design of a time-stepping algorithm for computing optimal flows, similar to that in Wen et al. (Reference Wen, Chini, Kerswell and Doering2015) for computing optimal background fields. Our methods of temporal and spatial discretization are described, respectively, in § 3.2 and appendix A. In particular, we find in § 3.1 that evolving the equations

(1.15)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D709}=\unicode[STIX]{x0394}\unicode[STIX]{x1D709}-\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}+u_{3}, & \displaystyle\end{eqnarray}$$

(1.16)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=-\unicode[STIX]{x0394}\unicode[STIX]{x1D702}-\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}, & \displaystyle\end{eqnarray}$$

(1.17)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\boldsymbol{u}=\unicode[STIX]{x1D707}\unicode[STIX]{x0394}\boldsymbol{u}-\unicode[STIX]{x1D709}\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}+\unicode[STIX]{x1D709}\hat{z}+{\textstyle \frac{1}{2}}\unicode[STIX]{x1D735}p, & \displaystyle\end{eqnarray}$$

(1.18)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{u} & \displaystyle\end{eqnarray}$$

forward in pseudo-time $\unicode[STIX]{x1D70F}$ (for a fixed constant $\unicode[STIX]{x1D707}$) subject to homogeneous boundary conditions for $\unicode[STIX]{x1D702}$ and $\unicode[STIX]{x1D709}$ and no-slip boundary conditions for $\boldsymbol{u}$ yields local maxima of the wall-to-wall problem. One of the many benefits of the algorithms described here is that optimal flow fields may be computed for other geometries, e.g. cylinders, given suitable Poisson and Stokes equations solvers.

2 Theory

In this section we utilize Lagrange multipliers to rewrite the wall-to-wall optimization problem as one of finding saddle points of a certain unconstrained functional ${\mathcal{F}}$. We then describe various manipulations that can be performed on ${\mathcal{F}}$ involving its maximization or minimization or both in §§ 2.1 and 2.3. This leads us to a direct comparison between the background method and the wall-to-wall problem in § 2.4. In particular, we conjecture that there exists a duality gap between the two problems in § 2.5 and provide a simple polynomial example to illustrate why one should expect the problems to be distinct. Additionally, we note the connection to the Howard–Busse–Malkus problem in § 2.6. All of these considerations aid us in the development of numerical algorithms for producing candidate optimizers in § 3, and in the proof of our conditional upper bounds in § 5.

We begin by introducing a new temperature variable $\unicode[STIX]{x1D703}=T-(1-z)$ and rewrite the advection–diffusion equation as

(2.1)

$$\begin{eqnarray}\unicode[STIX]{x2202}_{t}\unicode[STIX]{x1D703}+\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}=\unicode[STIX]{x0394}\unicode[STIX]{x1D703}+u_{3}.\end{eqnarray}$$

Next, we introduce Lagrange multipliers $\unicode[STIX]{x1D707}$ (a positive real number), $p(x,z,t)$ and $\unicode[STIX]{x1D711}(x,z,t)$ and the unconstrained functional

(2.2)

$$\begin{eqnarray}{\mathcal{F}}=\left\langle u_{3}\unicode[STIX]{x1D703}+\unicode[STIX]{x1D711}\left(-\unicode[STIX]{x2202}_{t}\unicode[STIX]{x1D703}-\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}+\unicode[STIX]{x0394}\unicode[STIX]{x1D703}+u_{3}\right)+\unicode[STIX]{x1D707}\left(Pe^{2}-|\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\right)+\unicode[STIX]{x1D735}p\boldsymbol{\cdot }\boldsymbol{u}\right\rangle ,\end{eqnarray}$$

the saddle points of which are candidates for the maximization problem (1.6). The variables $\unicode[STIX]{x1D711}$ and $p$ come equipped with their natural boundary conditions. Namely, we impose periodic and homogeneous boundary conditions in the $x$ and $z$ directions respectively for $\unicode[STIX]{x1D711}$, and the usual implicit boundary conditions for $p$.

Given initial data, one could search for time-dependent optimal flow fields of the wall-to-wall problem, but we restrict ourselves to time-independent flow fields for a number of reasons. First, steady fields are far easier to compute numerically and time dependence greatly expands the scope and range of our current endeavour. Second, in the context of simpler models such as the Lorenz equations (with a ‘heat transport’ functional analogous to the Nusselt number), time dependence was found to never increase transport (Souza & Doering Reference Souza and Doering2015a,Reference Souza and Doeringb). Third, preliminary attempts at computing time-dependent flow fields for the wall-to-wall problem have yielded essentially time-independent results, suggesting that time dependence may not play a role in significantly enhancing heat transport. More precisely, taking the initial condition of the temperature field to be the conductive state $1-z$, we found that the result of the time-dependent optimization was to move the conductive state into a (locally) optimal steady state, and to hold it there. Finally, for time-independent flows we are guaranteed that maximizers exist (and that the functional ${\mathcal{F}}$ is differentiable), while as of now there, is no such assurance for time-dependent flows.

With these considerations in mind, from this point on we focus on the time-independent functional

(2.3)

$$\begin{eqnarray}{\mathcal{F}}=\left\langle u_{3}\unicode[STIX]{x1D703}+\unicode[STIX]{x1D711}\left(-\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}+\unicode[STIX]{x0394}\unicode[STIX]{x1D703}+u_{3}\right)+\unicode[STIX]{x1D707}\left(Pe^{2}-|\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\right)+\unicode[STIX]{x1D735}p\boldsymbol{\cdot }\boldsymbol{u}\right\rangle ,\end{eqnarray}$$

where now the brackets $\langle \cdot \rangle$ are understood to give the spatial average only. We seek the saddle points of (2.3) and for this it will be useful to consider alternative coordinate systems or ‘constraint manifolds’ that pass through these. Many of the manipulations introduced below extend naturally to time-dependent and/or stress-free flow fields.

The first manipulation we perform is to change variables by $\unicode[STIX]{x1D703}=\unicode[STIX]{x1D709}+\unicode[STIX]{x1D702}$ and $\unicode[STIX]{x1D711}=\unicode[STIX]{x1D709}-\unicode[STIX]{x1D702}$, following the ‘symmetrization method’ described in Tobasco & Doering (Reference Tobasco and Doering2017) and Doering & Tobasco (Reference Doering and Tobasco2019). Various integrations by parts yield the functional

(2.4)

$$\begin{eqnarray}{\mathcal{S}}=\left\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}|^{2}+2\unicode[STIX]{x1D709}\boldsymbol{u}\boldsymbol{\cdot }\left(\hat{z}-\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}\right)-|\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}+\unicode[STIX]{x1D707}\left(Pe^{2}-|\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\right)+\unicode[STIX]{x1D735}p\boldsymbol{\cdot }\boldsymbol{u}\right\rangle .\end{eqnarray}$$

The advantage of the new $(\unicode[STIX]{x1D709},\unicode[STIX]{x1D702})$ coordinates lies in exposing the underlying geometry of the wall-to-wall problem. Indeed, the functional ${\mathcal{S}}$ is convex with respect to $\unicode[STIX]{x1D702}$ and concave with respect to $\boldsymbol{u}$ or $\unicode[STIX]{x1D709}$ when the other is held fixed, whereas the functional ${\mathcal{F}}$ fails to have such properties. It is a bit like choosing to study the polynomial expression $x^{2}-y^{2}$ instead of $(x+y)(x-y)$.

As a warm up to the manipulations that follow, let us consider this simple example a bit more and remark on how one might search for the saddle points of $s(x,y)=x^{2}-y^{2}$. Gradient ascent/descent procedures are problematic on their own, but can be successfully combined with constraints picking out certain curves. For example, the curve $y=0$ passes through the saddle point $(0,0)$ and the resulting function $s(x,0)=x^{2}$ can be minimized by gradient descent. Thinking procedurally, this particular constraint curve is found by taking the derivative of $s$ with respect to $y$ and setting the result equal to zero, i.e. $(\unicode[STIX]{x2202}s/\unicode[STIX]{x2202}y)=2y=0$.

Returning to the functional ${\mathcal{S}}$, we proceed in §§ 2.1 and 2.2 to derive various constraint manifolds that pass through its saddle points. We do so by setting the variations of ${\mathcal{S}}$ with respect to $\unicode[STIX]{x1D702}$, $\unicode[STIX]{x1D709}$, or $\boldsymbol{u}$ equal to zero and relating these to optimizations of ${\mathcal{S}}$. Then, in §§ 2.3 and 2.4 we show how the wall-to-wall optimal transport problem and the background method arise from two such optimization procedures, thereby producing insights into the relationship between the two.

2.1 Variations with respect to $\unicode[STIX]{x1D702}$

We start by taking the variation of ${\mathcal{S}}$ with respect to $\unicode[STIX]{x1D702}$ and setting it equal to zero. This results in the Euler–Lagrange equations

(2.5)

$$\begin{eqnarray}\unicode[STIX]{x0394}\unicode[STIX]{x1D702}=\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}\end{eqnarray}$$

for $\unicode[STIX]{x1D702}$. Substituting this back into ${\mathcal{S}}$ results in the constrained functional

(2.6)

$$\begin{eqnarray}{\mathcal{S}}_{\unicode[STIX]{x1D702}}\equiv \left\langle 2u_{3}\unicode[STIX]{x1D709}-|\unicode[STIX]{x1D735}\unicode[STIX]{x1D6E5}^{-1}\left(\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}\right)|^{2}-|\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}+\unicode[STIX]{x1D707}\left(Pe^{2}-|\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\right)+\unicode[STIX]{x1D735}p\boldsymbol{\cdot }\boldsymbol{u}\right\rangle .\end{eqnarray}$$

Stated differently, constraining the variable $\unicode[STIX]{x1D702}$ to satisfy (2.5) preserves the saddles of (2.4) and yields

(2.7)

$$\begin{eqnarray}{\mathcal{S}}_{\unicode[STIX]{x1D702}}=\min _{\unicode[STIX]{x1D702}}{\mathcal{S}}.\end{eqnarray}$$

2.2 Variations with respect to $\unicode[STIX]{x1D709}$ and $\boldsymbol{u}$

Next, we take the variation of ${\mathcal{S}}$ with respect to the variable $\unicode[STIX]{x1D709}$ and set it equal to zero. This yields

(2.8)

$$\begin{eqnarray}\unicode[STIX]{x0394}\unicode[STIX]{x1D709}=\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}-u_{3}\end{eqnarray}$$

and, after substituting it back into ${\mathcal{S}}$, we find

(2.9)

$$\begin{eqnarray}{\mathcal{S}}_{\unicode[STIX]{x1D709}}=\left\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}|^{2}+|\unicode[STIX]{x1D735}\unicode[STIX]{x1D6E5}^{-1}\left(\boldsymbol{u}\boldsymbol{\cdot }\left(\hat{z}-\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}\right)\right)|^{2}+\unicode[STIX]{x1D707}\left(Pe^{2}-|\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\right)+\unicode[STIX]{x1D735}p\boldsymbol{\cdot }\boldsymbol{u}\right\rangle .\end{eqnarray}$$

As before, the same functional is obtained by maximizing ${\mathcal{S}}$ in $\unicode[STIX]{x1D709}$, i.e.

$$\begin{eqnarray}{\mathcal{S}}_{\unicode[STIX]{x1D709}}=\max _{\unicode[STIX]{x1D709}}{\mathcal{S}}.\end{eqnarray}$$

Finally, we take variations over all incompressible flow fields $\boldsymbol{u}$. This produces the constrained functional

(2.10)

$$\begin{eqnarray}{\mathcal{S}}_{\boldsymbol{u}}=\left\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}|^{2}-|\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}+\unicode[STIX]{x1D707}Pe^{2}+\frac{1}{\unicode[STIX]{x1D707}}|\unicode[STIX]{x1D735}S^{-1}\left(\unicode[STIX]{x1D709}\hat{z}-\unicode[STIX]{x1D709}\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}\right)|^{2}\right\rangle ,\end{eqnarray}$$

where $S^{-1}\left(\unicode[STIX]{x1D709}\hat{z}-\unicode[STIX]{x1D709}\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}\right)$ denotes the unique flow field $\boldsymbol{u}$ solving

(2.11)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D707}\unicode[STIX]{x0394}\boldsymbol{u}=\unicode[STIX]{x1D709}\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}-\unicode[STIX]{x1D709}\hat{z}-{\textstyle \frac{1}{2}}\unicode[STIX]{x1D735}p, & \displaystyle\end{eqnarray}$$

(2.12)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{u}=0. & \displaystyle\end{eqnarray}$$

Note that

$$\begin{eqnarray}\displaystyle {\mathcal{S}}_{\boldsymbol{u}}=\max _{\boldsymbol{u}}{\mathcal{S}}. & & \displaystyle \nonumber\end{eqnarray}$$

These observations serve as a starting point in § 5 for establishing upper bounds on transport and motivate our choice of numerical methods in § 3. But first, let us see how the wall-to-wall problem comes out of these manipulations.

2.3 Finding the wall-to-wall problem

Consider the structure of the $S_{\unicode[STIX]{x1D702}}=\min _{\unicode[STIX]{x1D702}}{\mathcal{S}}$ and $S_{\unicode[STIX]{x1D709}}=\max _{\unicode[STIX]{x1D709}}{\mathcal{S}}$ functionals for a fixed incompressible velocity field $\boldsymbol{u}$ with enstrophy $\langle |\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle =Pe^{2}$. The functional $S_{\unicode[STIX]{x1D702}}$ is concave in $\unicode[STIX]{x1D709}$; likewise $S_{\unicode[STIX]{x1D709}}$ is convex in $\unicode[STIX]{x1D702}$. Thus, finding the maximum of the former with respect to $\unicode[STIX]{x1D709}$, or the minimum of the latter with respect to $\unicode[STIX]{x1D702}$, is equivalent to enforcing their Euler–Lagrange equations.

The Euler–Lagrange equation for $S_{\unicode[STIX]{x1D702}}$ in $\unicode[STIX]{x1D709}$ is

(2.13)

$$\begin{eqnarray}\unicode[STIX]{x0394}\unicode[STIX]{x1D709}=\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D6E5}^{-1}\left(\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}\right)-u_{3}.\end{eqnarray}$$

Similarly, for $S_{\unicode[STIX]{x1D709}}$ in $\unicode[STIX]{x1D702}$ we have

(2.14)

$$\begin{eqnarray}\unicode[STIX]{x0394}\unicode[STIX]{x1D702}=\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D6E5}^{-1}\left(\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}-u_{3}\right).\end{eqnarray}$$

These can be written more concisely as the system

(2.15)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x0394}\unicode[STIX]{x1D702}=\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}, & \displaystyle\end{eqnarray}$$

(2.16)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x0394}\unicode[STIX]{x1D709}=\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}-u_{3}, & \displaystyle\end{eqnarray}$$

obtained by setting

(2.17)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D702}=\unicode[STIX]{x1D6E5}^{-1}\left(\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}\right), & \displaystyle\end{eqnarray}$$

(2.18)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D709}=\unicode[STIX]{x1D6E5}^{-1}\left(\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}-u_{3}\right) & \displaystyle\end{eqnarray}$$

in (2.13) and (2.14). Hence,

(2.19)

$$\begin{eqnarray}Nu\{\boldsymbol{u}\}-1=\max _{\unicode[STIX]{x1D709}}\min _{\unicode[STIX]{x1D702}}{\mathcal{S}}=\min _{\unicode[STIX]{x1D702}}\max _{\unicode[STIX]{x1D709}}{\mathcal{S}}\end{eqnarray}$$

for each fixed velocity field $\boldsymbol{u}$.

The formula (2.19), which first appeared in Tobasco & Doering (Reference Tobasco and Doering2017) and was further analysed in Doering & Tobasco (Reference Doering and Tobasco2019), is an exact variational characterization of the Nusselt number. It allows the optimal wall-to-wall transport problem to be stated succinctly as

(2.20)

$$\begin{eqnarray}\max _{\boldsymbol{u}}Nu\{\boldsymbol{u}\}-1=\max _{\boldsymbol{u}}\max _{\unicode[STIX]{x1D709}}\min _{\unicode[STIX]{x1D702}}{\mathcal{S}}=\max _{\boldsymbol{u}}\min _{\unicode[STIX]{x1D702}}\max _{\unicode[STIX]{x1D709}}{\mathcal{S}},\end{eqnarray}$$

where $\boldsymbol{u}$ satisfies the given boundary conditions and intensity constraints. In particular, we note that gradient ascent may be applied with impunity to the constrained functional ${\mathcal{S}}_{\unicode[STIX]{x1D702}}$ to compute local maxima. This functional can also be used to prove lower bounds on the Nusselt number without having to solve the advection–diffusion equation. Indeed, plugging in any $\unicode[STIX]{x1D709}$ admissible in the previous manipulations into ${\mathcal{S}}_{\unicode[STIX]{x1D702}}$ yields the lower bound

(2.21)

$$\begin{eqnarray}{\mathcal{S}}_{\unicode[STIX]{x1D702}}\{\boldsymbol{u},\unicode[STIX]{x1D709}\}\leqslant Nu\{\boldsymbol{u}\}-1.\end{eqnarray}$$

In fact, reinterpreting for the moment angle brackets as space and time averages, we note that the bound (2.21) holds for time-independent flows as well, an observation that was exploited in Tobasco & Doering (Reference Tobasco and Doering2017) and Doering & Tobasco (Reference Doering and Tobasco2019) with ‘branching’ trial functions to prove the scaling $\max \,Nu\sim Pe^{2/3}$ up to logarithmic corrections as $Pe\rightarrow \infty$. We return to discuss this asymptotic result in the context of our numerical results much further below. Next, we consider the relationship between the wall-to-wall problem and the background method.

2.4 Finding the background method

Consider now the background method which guarantees the absolute upper bound

(2.22)

$$\begin{eqnarray}Nu-1\leqslant \min _{\unicode[STIX]{x1D702}(\boldsymbol{x}),\unicode[STIX]{x1D707}}\left\{\begin{array}{@{}ll@{}}\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}|^{2}\rangle +\unicode[STIX]{x1D707}Pe^{2}\quad & \text{if }\langle {\mathcal{Q}}[\boldsymbol{u},\unicode[STIX]{x1D709};\unicode[STIX]{x1D702}]\rangle \leqslant 0~~\forall \,\boldsymbol{u},\unicode[STIX]{x1D709},\\ \infty \quad & \text{otherwise},\end{array}\right.\end{eqnarray}$$

where

(2.23)

$$\begin{eqnarray}{\mathcal{Q}}[\boldsymbol{u},\unicode[STIX]{x1D709};\unicode[STIX]{x1D702}]=2\unicode[STIX]{x1D709}\boldsymbol{u}\boldsymbol{\cdot }\left(\hat{z}-\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}\right)-|\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}-\unicode[STIX]{x1D707}|\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}.\end{eqnarray}$$

The reader may not immediately recognize this as the familiar background method as it has been applied to Rayleigh–Bénard convection (Doering & Constantin Reference Doering and Constantin1996). Nevertheless, equation (2.22) does follow from applying the usual argument to the wall-to-wall problem. (The resulting bounds carry over to the time-dependent case.)

Let us recall the argument now. Starting with the advection–diffusion equation

(2.24)

$$\begin{eqnarray}\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}T=\unicode[STIX]{x0394}T\end{eqnarray}$$

and decomposing $T$ as $T=\unicode[STIX]{x1D709}+1-z+\unicode[STIX]{x1D702}$ yields

(2.25)

$$\begin{eqnarray}\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}+\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}=\unicode[STIX]{x0394}\unicode[STIX]{x1D709}+\unicode[STIX]{x0394}\unicode[STIX]{x1D702}+u_{3}.\end{eqnarray}$$

Multiplying through by $\unicode[STIX]{x1D709}$ and integrating by parts yields the balance relation

(2.26)

$$\begin{eqnarray}\langle \unicode[STIX]{x1D709}\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}+|\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}-u_{3}\unicode[STIX]{x1D709}+\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}\rangle =0.\end{eqnarray}$$

Now, utilizing

(2.27)

$$\begin{eqnarray}\langle wT\rangle =\langle |\unicode[STIX]{x1D735}T|^{2}\rangle -1=\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}+|\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}|^{2}+2\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}\rangle ,\end{eqnarray}$$

we subtract twice (2.26) from (2.27) to conclude that

(2.28)

$$\begin{eqnarray}Nu-1=\left\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}|^{2}+2\unicode[STIX]{x1D709}\boldsymbol{u}\boldsymbol{\cdot }\left(\hat{z}-\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}\right)-|\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}\right\rangle .\end{eqnarray}$$

Introducing a Lagrange multiplier $\unicode[STIX]{x1D707}/2$ for the enstrophy constraint $\langle |\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle =Pe^{2}$ yields

(2.29)

$$\begin{eqnarray}\displaystyle Nu-1 & = & \displaystyle \left\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}|^{2}+2\unicode[STIX]{x1D709}\boldsymbol{u}\boldsymbol{\cdot }\left(\hat{z}-\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}\right)-|\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}\right\rangle +\unicode[STIX]{x1D707}(Pe^{2}-\langle |\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle ),\end{eqnarray}$$

(2.30)

$$\begin{eqnarray}\displaystyle & = & \displaystyle \langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}|^{2}\rangle +\unicode[STIX]{x1D707}Pe^{2}+\langle {\mathcal{Q}}[\boldsymbol{u},\unicode[STIX]{x1D709};\unicode[STIX]{x1D702}]\rangle\end{eqnarray}$$

and upon performing the operations $\min _{\unicode[STIX]{x1D702}}\max _{\boldsymbol{u},\unicode[STIX]{x1D709}}$ we deduce (2.22).

2.5 A possible duality gap

We can now discuss the relationship between the background method and the wall-to-wall optimal transport problem. Combining the definition of ${\mathcal{S}}$ from (2.4) and the identity (2.29) we see that the background method bound (2.22) can be alternatively written as

(2.31)

$$\begin{eqnarray}\max _{\boldsymbol{u}}Nu-1\leqslant \min _{\unicode[STIX]{x1D702}}\max _{\boldsymbol{u},\unicode[STIX]{x1D709}}{\mathcal{S}}.\end{eqnarray}$$

On the left-hand side appears the wall-to-wall optimal transport problem, while on the right-hand side appears the background method. Note this inequality is consistent with the results of § 2.3 since in any case

(2.32)

$$\begin{eqnarray}\max _{\boldsymbol{u},\unicode[STIX]{x1D709}}\min _{\unicode[STIX]{x1D702}}{\mathcal{S}}\leqslant \min _{\unicode[STIX]{x1D702}}\max _{\boldsymbol{u},\unicode[STIX]{x1D709}}{\mathcal{S}}\end{eqnarray}$$

regardless of the definition of ${\mathcal{S}}$. Now, if ${\mathcal{S}}$ were convex in $\unicode[STIX]{x1D702}$ and jointly concave in $(\boldsymbol{u},\unicode[STIX]{x1D709})$ one would be led on general grounds via convex duality to conjecture that equality should hold between the left-hand and right-hand sides above, in which case the wall-to-wall problem and the background method would turn out to be equivalent. We instead propose that the opposite situation is true and that

(2.33)

$$\begin{eqnarray}\max _{\boldsymbol{u}}Nu-1\neq \min _{\unicode[STIX]{x1D702}}\max _{\boldsymbol{u},\unicode[STIX]{x1D709}}{\mathcal{S}}.\end{eqnarray}$$

A concurrent study, Ding & Kerswell (Reference Ding and Kerswell2019), has provided further evidence that such an inequality holds. This inequality still allows the possibility for these quantities to achieve the same asymptotic scaling as $Pe\rightarrow \infty$, a situation suggested for three-dimensional wall-to-wall optimal transport by the recent numerical scaling $\max \,Nu\sim Pe^{2/3}$ reported for a finite range of $Pe$ in Motoki et al. (Reference Motoki, Kawahara and Shimizu2018a). It would also be consistent with the dimension-independent logarithmic lower bound $\max \,Nu\geqslant C^{\prime }Pe^{2/3}/(\log Pe)^{4/3}$ proved for all large enough $Pe$ in Tobasco & Doering (Reference Tobasco and Doering2017) and Doering & Tobasco (Reference Doering and Tobasco2019).

Let us illustrate why the inequality in (2.33) may be true by considering how the previous manipulations operate on the polynomial

(2.34)

$$\begin{eqnarray}\displaystyle & \displaystyle p(\unicode[STIX]{x1D70F}_{1},\unicode[STIX]{x1D70F}_{2},\unicode[STIX]{x1D70F}_{3},u,v)=(\unicode[STIX]{x1D70F}_{1})^{2}+(\unicode[STIX]{x1D70F}_{2})^{2}+(\unicode[STIX]{x1D70F}_{3})^{2}+q(u,v,\unicode[STIX]{x1D70F}_{1},\unicode[STIX]{x1D70F}_{2},\unicode[STIX]{x1D70F}_{3}), & \displaystyle\end{eqnarray}$$

(2.35)

$$\begin{eqnarray}\displaystyle & \displaystyle q=\left[\begin{array}{@{}cc@{}}u & v\end{array}\right]\left[\begin{array}{@{}cc@{}}2(1-\unicode[STIX]{x1D70F}_{1}+\unicode[STIX]{x1D70F}_{2}) & \unicode[STIX]{x1D70F}_{3}\\ \unicode[STIX]{x1D70F}_{3} & 2(1-\unicode[STIX]{x1D70F}_{1}-\unicode[STIX]{x1D70F}_{2})\end{array}\right]\left[\begin{array}{@{}c@{}}u\\ v\end{array}\right], & \displaystyle\end{eqnarray}$$

which we see as analogous to (2.4). The variable $\unicode[STIX]{x1D70F}_{1}$ is analogous to the zeroth Fourier mode of $\unicode[STIX]{x1D702}$, while $\unicode[STIX]{x1D70F}_{2}$ and $\unicode[STIX]{x1D70F}_{3}$ are to the non-zero Fourier modes. (This polynomial was not derived as a modal truncation of ${\mathcal{S}}$. That would produce a more complicated example.) The variables $u$ and $v$ are analogous to $\unicode[STIX]{x1D709}$ and $\boldsymbol{u}$. The fact is that

(2.36)

$$\begin{eqnarray}\max _{u,v}\min _{\unicode[STIX]{x1D70F}_{1},\unicode[STIX]{x1D70F}_{2},\unicode[STIX]{x1D70F}_{3}}p=4/5<\min _{\unicode[STIX]{x1D70F}_{1},\unicode[STIX]{x1D70F}_{2},\unicode[STIX]{x1D70F}_{3}}\max _{u,v}p=1,\end{eqnarray}$$

and that the optimizers for the ‘background method’ $\min$$\max$ problem are not saddle points of $p$. In particular, the critical point equations

(2.37a-c)

$$\begin{eqnarray}\unicode[STIX]{x1D70F}_{1}=u^{2}+v^{2},\quad \unicode[STIX]{x1D70F}_{2}=v^{2}-u^{2},\quad \text{and}\quad \unicode[STIX]{x1D70F}_{3}=-uv,\end{eqnarray}$$

(2.38)

$$\begin{eqnarray}\left[\begin{array}{@{}c@{}}0\\ 0\end{array}\right]=\left[\begin{array}{@{}cc@{}}2(1-\unicode[STIX]{x1D70F}_{1}+\unicode[STIX]{x1D70F}_{2}) & \unicode[STIX]{x1D70F}_{3}\\ \unicode[STIX]{x1D70F}_{3} & 2(1-\unicode[STIX]{x1D70F}_{1}-\unicode[STIX]{x1D70F}_{2})\end{array}\right]\left[\begin{array}{@{}c@{}}u\\ v\end{array}\right],\end{eqnarray}$$

fail to be satisfied by solutions of the $\min$$\max$ problem.

To see why this is the case, consider the background method procedure wherein the maximum occurs first. If any of the eigenvalues of (2.35) are positive then the maximum over $u$ and $v$ yields infinity; thus, we must calculate the eigenvalues of (2.35) to see when this occurs. For fixed $(\unicode[STIX]{x1D70F}_{1},\unicode[STIX]{x1D70F}_{2},\unicode[STIX]{x1D70F}_{3})$, the eigenvalues of the matrix in (2.35) are $\unicode[STIX]{x1D706}=2(1-\unicode[STIX]{x1D70F}_{1})\pm \sqrt{4(\unicode[STIX]{x1D70F}_{2})^{2}+(\unicode[STIX]{x1D70F}_{3})^{2}}$. The only way these eigenvalues are non-positive is if $2(1-\unicode[STIX]{x1D70F}_{1})+\sqrt{4(\unicode[STIX]{x1D70F}_{2})^{2}+(\unicode[STIX]{x1D70F}_{3})^{2}}\leqslant 0$, or equivalently $2\unicode[STIX]{x1D70F}_{1}\geqslant 2+\sqrt{4(\unicode[STIX]{x1D70F}_{2})^{2}+(\unicode[STIX]{x1D70F}_{3})^{2}}$. Hence,

(2.39)

$$\begin{eqnarray}\max _{u,v}p(\unicode[STIX]{x1D70F}_{1},\unicode[STIX]{x1D70F}_{2},\unicode[STIX]{x1D70F}_{3},u,v)=\left\{\begin{array}{@{}ll@{}}(\unicode[STIX]{x1D70F}_{1})^{2}+(\unicode[STIX]{x1D70F}_{2})^{2}+(\unicode[STIX]{x1D70F}_{3})^{2}\quad & \text{ if }2\unicode[STIX]{x1D70F}_{1}\geqslant 2+\sqrt{4(\unicode[STIX]{x1D70F}_{2})^{2}+(\unicode[STIX]{x1D70F}_{3})^{2}},\\ \infty \quad & \text{otherwise}.\end{array}\right.\end{eqnarray}$$

It follows immediately that $\min \max p=1$, and that the minimizer satisfies $\unicode[STIX]{x1D70F}_{1}=1$ and $\unicode[STIX]{x1D70F}_{2}=\unicode[STIX]{x1D70F}_{3}=0$; however, such $\unicode[STIX]{x1D70F}_{1}$, $\unicode[STIX]{x1D70F}_{2}$ and $\unicode[STIX]{x1D70F}_{3}$ cannot be a saddle point of $p$: if $\unicode[STIX]{x1D70F}_{2}=\unicode[STIX]{x1D70F}_{3}=0$ then from (2.37) we see that $u^{2}=v^{2}$ and $uv=0$ so that $u=v=0$, but then the $\unicode[STIX]{x1D70F}_{1}$ equation cannot be satisfied since $\unicode[STIX]{x1D70F}_{1}=1\neq u^{2}+v^{2}$.

Proceeding in the reverse order we find that

(2.40)

$$\begin{eqnarray}\min _{\unicode[STIX]{x1D70F}_{1},\unicode[STIX]{x1D70F}_{2},\unicode[STIX]{x1D70F}_{3}}p=-(u^{2}+v^{2})^{2}-(u^{2}-v^{2})^{2}-(uv)^{2}+2(u^{2}+v^{2}).\end{eqnarray}$$

The minimizing $\unicode[STIX]{x1D70F}$ satisfy

(2.41a-c)

$$\begin{eqnarray}\unicode[STIX]{x1D70F}_{1}=u^{2}+v^{2},\quad \unicode[STIX]{x1D70F}_{2}=v^{2}-u^{2},\quad \text{and}\quad \unicode[STIX]{x1D70F}_{3}=-uv.\end{eqnarray}$$

The maximum over $u,v$ is given by $u=\pm \sqrt{2/5}$ and $v=\pm \sqrt{2/5}$. Hence, $\max \min p=4/5$.

While in this example the background method procedure (maximum followed by minimum) yields results that are incompatible with the saddle points of (2.34), the wall-to-wall procedure (minimum followed by maximum) does produce saddle points. Returning to the actual wall-to-wall optimal transport problem, we note that the optimal flow fields reported in § 4 exhibit non-trivial non-zero Fourier modes for the variable $\unicode[STIX]{x1D702}$, whereas in the background method optimizers must satisfy $\unicode[STIX]{x1D702}=\unicode[STIX]{x1D702}(z)$. Indeed, if $\unicode[STIX]{x1D702}$ satisfies the spectral stability constraint $\langle {\mathcal{Q}}\rangle \leqslant 0$ then so does its periodic average $\overline{\unicode[STIX]{x1D702}}$ in $x$, while by Jensen’s inequality $\langle |(\text{d}/\text{d}z)\overline{\unicode[STIX]{x1D702}}|^{2}\rangle \leqslant \langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}|^{2}\rangle$ with equality if and only if $\unicode[STIX]{x1D702}=\overline{\unicode[STIX]{x1D702}}$. These observations strongly suggest that the conjectured gap (2.33) between the wall-to-wall problem and the background method should hold and, in particular, that the spectral stability constraint should fail to be satisfied by the true saddle points of ${\mathcal{S}}$.

2.6 Comparison with the Howard–Busse–Malkus problem

We would be remiss if we did not additionally state the connection of the previous discussions on the wall-to-wall and background method approach with the classic Howard–Busse–Malkus approach put forth in Howard (Reference Howard1963). To see the connection between these, start with ${\mathcal{S}}$ and restrict attention to incompressible flows $\unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{u}=0$ with enstrophy $\langle |\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle =Pe^{2}$ and functions $\unicode[STIX]{x1D702}(z)$. At this point, it is useful to introduce notation for the horizontal average of a function,

(2.42)

$$\begin{eqnarray}\overline{f}\equiv \frac{1}{\unicode[STIX]{x1D6E4}}\int _{0}^{\unicode[STIX]{x1D6E4}}f(x,z)\,\text{d}x.\end{eqnarray}$$

Computing the minimum of ${\mathcal{S}}$ with respect to $\unicode[STIX]{x1D702}(z)$ yields the optimality condition

(2.43)

$$\begin{eqnarray}\frac{\text{d}^{2}}{\text{d}z^{2}}\unicode[STIX]{x1D702}=\overline{\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}}=\frac{\text{d}}{\text{d}z}\overline{u_{3}\unicode[STIX]{x1D709}}\end{eqnarray}$$

whose solution is

(2.44)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D702}(z)=\int _{0}^{z}\overline{u_{3}\unicode[STIX]{x1D709}}(z^{\prime })\,\text{d}z^{\prime }-z\int _{0}^{1}\overline{u_{3}\unicode[STIX]{x1D709}}(z^{\prime })\,\text{d}z^{\prime }=\int _{0}^{z}\overline{u_{3}\unicode[STIX]{x1D709}}(z^{\prime })\,\text{d}z^{\prime }-z\langle u_{3}\unicode[STIX]{x1D709}\rangle , & \displaystyle\end{eqnarray}$$

(2.45)

$$\begin{eqnarray}\displaystyle & \displaystyle \frac{\text{d}}{\text{d}z}\unicode[STIX]{x1D702}=\overline{u_{3}\unicode[STIX]{x1D709}}(z)-\langle u_{3}\unicode[STIX]{x1D709}\rangle . & \displaystyle\end{eqnarray}$$

Employing this relation in ${\mathcal{S}}$ yields

(2.46)

$$\begin{eqnarray}\min _{\unicode[STIX]{x1D702}(z)}{\mathcal{S}}=\left\langle 2u_{3}\unicode[STIX]{x1D709}-\left(\overline{u_{3}\unicode[STIX]{x1D709}}-\langle u_{3}\unicode[STIX]{x1D709}\rangle \right)^{2}-|\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}\right\rangle .\end{eqnarray}$$

Making the change of variables $\unicode[STIX]{x1D709}=\unicode[STIX]{x1D6FC}\unicode[STIX]{x1D70E}$ for a soon to be determined scalar $\unicode[STIX]{x1D6FC}$ transforms this to

(2.47)

$$\begin{eqnarray}\min _{\unicode[STIX]{x1D702}(z)}{\mathcal{S}}=\unicode[STIX]{x1D6FC}\left\langle 2u_{3}\unicode[STIX]{x1D70E}\right\rangle -\unicode[STIX]{x1D6FC}^{2}\left\langle \left(\overline{u_{3}\unicode[STIX]{x1D70E}}+\langle u_{3}\unicode[STIX]{x1D70E}\rangle \right)^{2}+|\unicode[STIX]{x1D735}\unicode[STIX]{x1D70E}|^{2}\right\rangle ,\end{eqnarray}$$

a quadratic function of $\unicode[STIX]{x1D6FC}$. Maximizing with respect to $\unicode[STIX]{x1D6FC}$ determines the optimal choice

(2.48)

$$\begin{eqnarray}\unicode[STIX]{x1D6FC}^{\ast }=\frac{\langle u_{3}\unicode[STIX]{x1D70E}\rangle }{\left\langle \left(\overline{u_{3}\unicode[STIX]{x1D70E}}-\langle u_{3}\unicode[STIX]{x1D70E}\rangle \right)^{2}+|\unicode[STIX]{x1D735}\unicode[STIX]{x1D70E}|^{2}\right\rangle },\end{eqnarray}$$

utilizing $\unicode[STIX]{x1D6FC}^{\ast }$ in the above and taking $\boldsymbol{u}=Pe(\boldsymbol{v}/\sqrt{\langle |\unicode[STIX]{x1D735}\boldsymbol{v}|^{2}\rangle })$ results in the multiplicative form of the functional

(2.49)

$$\begin{eqnarray}{\mathcal{M}}[\unicode[STIX]{x1D70E},\boldsymbol{v},Pe]=\frac{\left(\langle v_{3}\unicode[STIX]{x1D70E}\rangle \right)^{2}}{Pe^{-2}\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D70E}|^{2}\rangle \langle |\unicode[STIX]{x1D735}\boldsymbol{v}|^{2}\rangle +\langle \left(\overline{v_{3}\unicode[STIX]{x1D70E}}-\langle v_{3}\unicode[STIX]{x1D70E}\rangle \right)^{2}\rangle },\end{eqnarray}$$

exactly as in Howard (Reference Howard1963) under the appropriate retranscriptions. The functional in (2.49) is homogeneous with respect to $\unicode[STIX]{x1D70E}$ and $\boldsymbol{v}$, i.e. ${\mathcal{M}}[\unicode[STIX]{x1D706}\unicode[STIX]{x1D70E},\unicode[STIX]{x1D706}\boldsymbol{v},Pe]={\mathcal{M}}[\unicode[STIX]{x1D70E},\boldsymbol{v},Pe]$. It has in the past served as a starting point for the analysis of maximal heat transport, in particular leading in Howard (Reference Howard1963) to bounds on transport under the author’s assumptions of homogeneity and statistical similarity. The connection of this functional to the background method has been pointed out before (Kerswell Reference Kerswell1998).

However, we point out now that the resulting bounds on transport can be improved beyond those obtained using (2.49) due to the variational representation (2.19) of the wall-to-wall problem. After a similar series of manipulations utilizing all possible $\unicode[STIX]{x1D702}(x,z)$ instead of functions of $z$ alone, we deduce the improved formula

(2.50)

$$\begin{eqnarray}Nu-1=\max _{\unicode[STIX]{x1D70E}}\tilde{{\mathcal{M}}}[\unicode[STIX]{x1D70E},\boldsymbol{v};Pe],\end{eqnarray}$$

where

(2.51)

$$\begin{eqnarray}\tilde{{\mathcal{M}}}[\unicode[STIX]{x1D70E},\boldsymbol{v};Pe]=\frac{\left(\langle v_{3}\unicode[STIX]{x1D70E}\rangle \right)^{2}}{Pe^{-2}\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D70E}|^{2}\rangle \langle |\unicode[STIX]{x1D735}\boldsymbol{v}|^{2}\rangle +\langle \left|\unicode[STIX]{x1D735}\unicode[STIX]{x1D6E5}^{-1}\left(\boldsymbol{v}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D70E}\right)\right|^{2}\rangle }.\end{eqnarray}$$

Stated succinctly, the Howard–Busse–Malkus problem maximizes (2.49) and the wall-to-wall problem maximizes (2.51). The crucial difference between (2.49) and (2.51) lies in the fact that

(2.52)

$$\begin{eqnarray}\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D6E5}^{-1}(\boldsymbol{v}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D70E})|^{2}\rangle =\langle (\overline{v_{3}\unicode[STIX]{x1D70E}}-\langle v_{3}\unicode[STIX]{x1D70E}\rangle )^{2}\rangle +\text{non-negative terms}\end{eqnarray}$$

since $\langle \left(\overline{v_{3}\unicode[STIX]{x1D70E}}-\langle v_{3}\unicode[STIX]{x1D70E}\rangle \right)^{2}\rangle$ is just the zeroth Fourier mode contribution to $\langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D6E5}^{-1}(\boldsymbol{v}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D70E})|^{2}\rangle$. Hence,

(2.53)

$$\begin{eqnarray}\tilde{{\mathcal{M}}}[\unicode[STIX]{x1D70E},\boldsymbol{v};Pe]\leqslant {\mathcal{M}}[\unicode[STIX]{x1D70E},\boldsymbol{v};Pe]\end{eqnarray}$$

for all $(\unicode[STIX]{x1D70E},\boldsymbol{v})$ and the inequality is strict in most cases. Bounds on transport obtained using the functional $\tilde{{\mathcal{M}}}$ are, therefore, at least as tight as those that have been obtained using Howard’s functional ${\mathcal{M}}$. Heuristically, this occurs because the wall-to-wall problem is more constrained (and hence provides a tighter bound) since it fully enforces the steady advection–diffusion equation whereas the Howard–Busse–Malkus problem is less constrained since it only enforces the horizontally averaged version. We take (2.52) as additional evidence of the conjectured duality gap (2.33) between the wall-to-wall problem and the background method/Howard–Busse–Malkus approach.

3 Gradient flow

The theoretical developments of the previous sections give insight into numerical methods for computing the saddle points of the functionals ${\mathcal{F}}$ in (2.3) and ${\mathcal{S}}$ in (2.4). In this section, we exploit their structure to derive time-stepping methods for solving the Euler–Lagrange equations. An advantage of the approach adopted here is that it is only necessary to have a Poisson or Stokes solver to compute candidate optimizers to the wall-to-wall problem.

The Euler–Lagrange equations for the wall-to-wall problem are of the form

(3.1)

$$\begin{eqnarray}\mathbf{0}=\boldsymbol{f}(\boldsymbol{x}).\end{eqnarray}$$

To solve this numerically, we introduce a time derivative on the left-hand side,

(3.2)

$$\begin{eqnarray}\dot{\boldsymbol{x}}=\boldsymbol{f}(\boldsymbol{x}),\end{eqnarray}$$

transforming (3.1) into a dynamical system where every local maximum is an attracting fixed point. When applied to the functional ${\mathcal{S}}_{\unicode[STIX]{x1D702}}=\min _{\unicode[STIX]{x1D702}}{\mathcal{S}}$ from (2.6) this yields its local maximizers, thereby producing saddle points for ${\mathcal{S}}$.

Composing (3.1) with a locally invertible function such that $P(\mathbf{0})=\mathbf{0}$ and then introducing a time derivative defines an alternate system $\dot{\boldsymbol{x}}=P(\boldsymbol{f}(\boldsymbol{x}))$. The danger and boon of choosing such a preconditioner is that the stability of a fixed point may change: we could be computing local maxima, minima or saddles of our original function $\boldsymbol{f}$ in (3.1). For example, Newton’s method may be viewed as choosing the negative inverse Jacobian $J^{-1}(\boldsymbol{x})$ of $\boldsymbol{f}$, $P(\boldsymbol{f}(\boldsymbol{x}))=-J^{-1}(\boldsymbol{x})\boldsymbol{f}(\boldsymbol{x})$, along with a choice of optimal step size ($\unicode[STIX]{x0394}t=1$) upon temporal discretization. With regards to the functional ${\mathcal{F}}$, we will take the preconditioner approach implemented in a way that is similar to the algorithm in Wen et al. (Reference Wen, Chini, Kerswell and Doering2015). With regards to the Euler–Lagrange equations of (2.2) and (2.4) we ultimately solve

(3.3)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D711}}=\unicode[STIX]{x0394}\unicode[STIX]{x1D703}-\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}+u_{3}, & \displaystyle\end{eqnarray}$$

(3.4)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D703}}=\unicode[STIX]{x0394}\unicode[STIX]{x1D711}+\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D711}+u_{3}, & \displaystyle\end{eqnarray}$$

(3.5)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\boldsymbol{u}}=2\unicode[STIX]{x1D707}\unicode[STIX]{x0394}\boldsymbol{u}-\unicode[STIX]{x1D711}\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}+(\unicode[STIX]{x1D703}+\unicode[STIX]{x1D711})\hat{z}-\unicode[STIX]{x1D735}p, & \displaystyle\end{eqnarray}$$

(3.6)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}p}=\unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{u}, & \displaystyle\end{eqnarray}$$

(3.7)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D707}}=\langle Pe^{2}-|\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle & \displaystyle\end{eqnarray}$$

or equivalently

(3.8)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\frac{1}{2}\frac{\unicode[STIX]{x1D6FF}{\mathcal{S}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D709}}=\unicode[STIX]{x0394}\unicode[STIX]{x1D709}-\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}+u_{3}, & \displaystyle\end{eqnarray}$$

(3.9)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\frac{1}{2}\frac{\unicode[STIX]{x1D6FF}{\mathcal{S}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D702}}=-\unicode[STIX]{x0394}\unicode[STIX]{x1D702}+\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}, & \displaystyle\end{eqnarray}$$

(3.10)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\frac{1}{2}\frac{\unicode[STIX]{x1D6FF}{\mathcal{S}}}{\unicode[STIX]{x1D6FF}\boldsymbol{u}}=\unicode[STIX]{x1D707}\unicode[STIX]{x0394}\boldsymbol{u}-\unicode[STIX]{x1D709}\unicode[STIX]{x1D735}\unicode[STIX]{x1D702}+\unicode[STIX]{x1D709}\hat{z}+\frac{1}{2}\unicode[STIX]{x1D735}p, & \displaystyle\end{eqnarray}$$

(3.11)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{S}}}{\unicode[STIX]{x1D6FF}p}=\unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{u}, & \displaystyle\end{eqnarray}$$

(3.12)

$$\begin{eqnarray}\displaystyle & \displaystyle 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{S}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D707}}=\langle Pe^{2}-|\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle . & \displaystyle\end{eqnarray}$$

3.1 Gradient ascent methods

Here, we outline various methods for solving the Euler–Lagrange equations of the wall-to-wall problem described above. One method is to evolve

(3.13a-e)

$$\begin{eqnarray}0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D711}},\quad 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D703}},\quad \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\boldsymbol{u}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\boldsymbol{u}},\quad 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}p},\quad \text{and}\quad \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D707}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D707}}\end{eqnarray}$$

forward in time. This strictly enforces the advection–diffusion equation and its adjoint at every time step, utilizing $\unicode[STIX]{x1D6FF}{\mathcal{F}}/\unicode[STIX]{x1D6FF}\boldsymbol{u}$ to compute corrections to the flow field. This approach was taken in Motoki et al. (Reference Motoki, Kawahara and Shimizu2018b), and the rate limiting step is the solution of the advection–diffusion equation and its adjoint. In the present work, we take a different approach and compute numerical solutions to (3.3) and (3.8) utilizing two different algorithms.

The first of our algorithms involves a time-stepping procedure of the form

(3.14a-e)

$$\begin{eqnarray}\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D703}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D711}},\quad \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D711}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D703}},\quad \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\boldsymbol{u}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\boldsymbol{u}},\quad 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}p},\quad \text{and}\quad \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D707}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D707}}.\end{eqnarray}$$

This procedure may be understood as follows. For fixed $\boldsymbol{u}$, the equations for $\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D703}$ and $\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D711}$ evolve towards the steady state solutions $\unicode[STIX]{x1D6FF}{\mathcal{F}}/\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D711}=0$ and $\unicode[STIX]{x1D6FF}{\mathcal{F}}/\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D703}=0$, hence, these evolutions are a relaxation of fully solving the Euler–Lagrange equations for $\unicode[STIX]{x1D703}$ and $\unicode[STIX]{x1D711}$. The equation for $\unicode[STIX]{x1D707}$ guarantees that we flow towards a flow field with the desired enstrophy $Pe$. The last condition (for a fixed $\unicode[STIX]{x1D703}$ and $\unicode[STIX]{x1D711}$) evolves to a solution of the optimality condition. We enforce incompressibility at every time step and evolve all fields at once.

There is an alternative description of this algorithm in terms of ${\mathcal{S}}$. Focusing on the $\unicode[STIX]{x1D703}$ and $\unicode[STIX]{x1D711}$ equations, we see that evolving

(3.15a,b)

is equivalent to evolving

(3.16a,b)

$$\begin{eqnarray}\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D709}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{S}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D709}}\quad \text{and}\quad -\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D702}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{S}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D702}}\end{eqnarray}$$

after taking sums and differences (making use of $\unicode[STIX]{x1D703}=\unicode[STIX]{x1D709}+\unicode[STIX]{x1D702}$ and $\unicode[STIX]{x1D711}=\unicode[STIX]{x1D709}-\unicode[STIX]{x1D702}$) and rescaling time. Thus, our time-stepping procedure for $\unicode[STIX]{x1D703}$ and $\unicode[STIX]{x1D711}$ is equivalent to simultaneously applying gradient ascent for the concave variable $\unicode[STIX]{x1D709}$ and gradient descent for the convex variable $\unicode[STIX]{x1D702}$ in the ${\mathcal{S}}$ functional. This is similar to the philosophy of Wen et al. (Reference Wen, Chini, Kerswell and Doering2015). In fact, the only modification required to compute two-dimensional no-slip background fields would be to project the $\unicode[STIX]{x1D702}$ variable to the zeroth Fourier mode at each time step, by taking the $x$ average of the right-hand side of the $\unicode[STIX]{x1D702}$ equation.

The second time-stepping method involves computing the saddles of ${\mathcal{S}}$ via

(3.17a-d)

$$\begin{eqnarray}\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D709}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D709}},\quad 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D702}},\quad \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\boldsymbol{u}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\boldsymbol{u}},\quad \text{and}\quad 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}p},\end{eqnarray}$$

while fixing $\unicode[STIX]{x1D707}$. The resulting enstrophy depends implicitly on $\unicode[STIX]{x1D707}$. Enforcing $\unicode[STIX]{x1D6FF}{\mathcal{F}}/\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D702}=0$ at each time step yields gradient ascent for the local optima of (2.6), an unconstrained variational problem. The reason why this algorithm is efficient is due to the many existing algorithms for quick inversion of the Laplacian and the Helmholtz operator.

We found numerically that both (3.14) and (3.17) yield the same result. We also implemented various other ascent procedures with different preconditioners. For example, we evolved equations of the form

(3.18a-e)

$$\begin{eqnarray}-\unicode[STIX]{x0394}\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D703}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D711}},\quad -\unicode[STIX]{x0394}\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D711}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D703}},\quad \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\boldsymbol{u}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\boldsymbol{u}},\quad 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}p},\quad \text{and}\quad \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D707}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D707}}\end{eqnarray}$$

forward in time, as well as

(3.19a-e)

$$\begin{eqnarray}-\unicode[STIX]{x0394}\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D703}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D711}},\quad -\unicode[STIX]{x0394}\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D711}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D703}},\quad -\unicode[STIX]{x0394}\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\boldsymbol{u}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\boldsymbol{u}},\quad 0=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}p},\quad \text{and}\quad \unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D707}=\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D707}},\end{eqnarray}$$

evolving different components on different time scales. Amongst all of our results, the ones presented in § 4 maximize $Nu$ for a given $Pe$.

3.2 Temporal discretization

Each of the gradient ascent procedures described in § 3 are of the form

(3.20)

$$\begin{eqnarray}{\dot{x}}={\mathcal{L}}x+{\mathcal{N}}(x)+f,\end{eqnarray}$$

where $x$ is the state vector, ${\mathcal{L}}$ is an ‘easily’ invertible linear operator, ${\mathcal{N}}$ is a nonlinear operator and $f$ is a forcing function. We follow Viswanath & Tobasco (Reference Viswanath and Tobasco2013) and consider linear multi-step schemes as follows:

(3.21)

$$\begin{eqnarray}\frac{1}{\unicode[STIX]{x0394}\unicode[STIX]{x1D70F}}\left(\unicode[STIX]{x1D6FE}x^{n+1}+\mathop{\sum }_{j=0}^{s-1}a_{j}x^{n-j}\right)=\mathop{\sum }_{j=0}^{s-1}b_{j}{\mathcal{N}}(x^{n-j})+{\mathcal{L}}x^{n+1}+f^{n+1},\end{eqnarray}$$

where $s$ is the order of the time-stepping scheme, $a_{i}$ and $b_{i}$ are parameters and $\unicode[STIX]{x0394}t$ is the time-step size. The parameters values for orders $s=1,2$ and $3$ are

(3.22a-d)

$$\begin{eqnarray}s=1,\quad \unicode[STIX]{x1D6FE}=1,\quad a_{0}=-1,\quad b_{0}=1,\end{eqnarray}$$

(3.23a-f)

$$\begin{eqnarray}s=2,\quad \unicode[STIX]{x1D6FE}=3/2,\quad a_{0}=-2,\quad a_{1}=1/2,\quad b_{0}=2,\quad b_{1}=-1,\end{eqnarray}$$

(3.24a-h)

$$\begin{eqnarray}s=3,\quad \unicode[STIX]{x1D6FE}=11/6,\quad a_{0}=-3,\quad a_{1}=3/2,\quad a_{2}=-1/3,\quad b_{0}=3,\quad b_{1}=-3,\quad b_{2}=1.\end{eqnarray}$$

For example, with $s=1$ and the advection–diffusion equation,

(3.25)

$$\begin{eqnarray}\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D703}=-\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}+\unicode[STIX]{x0394}\unicode[STIX]{x1D703}+u_{3}\end{eqnarray}$$

we use

(3.26)

$$\begin{eqnarray}\left(\unicode[STIX]{x1D6E5}-\frac{1}{\unicode[STIX]{x0394}\unicode[STIX]{x1D70F}}\boldsymbol{I}\right)\unicode[STIX]{x1D703}^{n+1}=\boldsymbol{u}^{n}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}^{n}-w^{n}-\frac{1}{\unicode[STIX]{x0394}\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D703}^{n}.\end{eqnarray}$$

Thus for each time step we must solve a modified Poisson’s equation of the form

(3.27)

$$\begin{eqnarray}\left(\unicode[STIX]{x1D6E5}-c\boldsymbol{I}\right)\unicode[STIX]{x1D703}=f,\end{eqnarray}$$

where $c\geqslant 0$ and we have made the transcriptions

(3.28a-c)

$$\begin{eqnarray}\unicode[STIX]{x1D703}^{n+1}\mapsto \unicode[STIX]{x1D703},\quad \frac{1}{\unicode[STIX]{x0394}\unicode[STIX]{x1D70F}}\mapsto c,\quad \text{and}\quad \boldsymbol{u}^{n}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}^{n}-w^{n}-\frac{1}{\unicode[STIX]{x0394}t}\unicode[STIX]{x1D703}^{n}\mapsto f.\end{eqnarray}$$

Analogous discretizations are used for $\unicode[STIX]{x1D711},\unicode[STIX]{x1D702}$ and $\unicode[STIX]{x1D709}$. For updating the optimality condition with $s=1$ one option is to use

(3.29)

$$\begin{eqnarray}\displaystyle & \displaystyle \left(\unicode[STIX]{x1D707}\unicode[STIX]{x1D6E5}-\frac{1}{\unicode[STIX]{x0394}\unicode[STIX]{x1D70F}}\boldsymbol{I}\right)\boldsymbol{u}^{n+1}=-\left(\unicode[STIX]{x1D711}^{n}+\unicode[STIX]{x1D703}^{n}\right)\hat{e}_{3}+\unicode[STIX]{x1D711}^{n}\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}^{n}-\frac{1}{\unicode[STIX]{x0394}\unicode[STIX]{x1D70F}}\boldsymbol{u}^{n}+\unicode[STIX]{x1D735}p^{n+1}, & \displaystyle\end{eqnarray}$$

(3.30)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{u}^{n+1}=0. & \displaystyle\end{eqnarray}$$

Each time step involves solving a modified Stokes equation

(3.31)

$$\begin{eqnarray}\displaystyle & \displaystyle \left(\unicode[STIX]{x1D6E5}-c\boldsymbol{I}\right)\boldsymbol{u}=\boldsymbol{f}+\unicode[STIX]{x1D735}p, & \displaystyle\end{eqnarray}$$

(3.32)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{u}=0, & \displaystyle\end{eqnarray}$$

where $c\geqslant 0$ and we have made the transcriptions

(3.33a-d)

$$\begin{eqnarray}\boldsymbol{u}^{n+1}\mapsto \boldsymbol{u},\quad p^{n+1}\mapsto p,\quad \frac{1}{\unicode[STIX]{x0394}\unicode[STIX]{x1D70F}}\mapsto c,\quad \text{and}\quad \left(\unicode[STIX]{x1D711}^{n}+\unicode[STIX]{x1D703}^{n}\right)\hat{e}_{3}+\unicode[STIX]{x1D711}^{n}\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}^{n}-\frac{1}{\unicode[STIX]{x0394}\unicode[STIX]{x1D70F}}\boldsymbol{u}^{n}\mapsto \boldsymbol{f}.\end{eqnarray}$$

We solve these boundary value problems by utilizing a pseudo-spectral method, where the wall-bounded direction is represented by Chebyshev polynomials and the periodic directions are represented by Fourier series; however, our use of Chebyshev polynomials utilizes spectral integration in the same way as Viswanath (Reference Viswanath2015). The Stokes equation (3.31) was solved utilizing the Kleiser–Schumann algorithm (Kleiser & Schumann Reference Kleiser, Schumann and Hirschel1980). See appendix A for details regarding implementation. A theoretical discussion of the approach is in Viswanath (Reference Viswanath2015). Let us highlight some of the benefits of spectral integration here:

(i) Memory efficiency: the total memory occupied is linear in the number of points required to describe the state variable $\boldsymbol{u}$.
(ii) Speed: solving the linear equations involves only tridiagonal matrices which are put into lower–upper form at the beginning of the gradient ascent procedure.
(iii) Discretization accuracy: everything is represented using Fourier and Chebyshev modes, allowing for spectral accuracy.
(iv) Machine accuracy: utilizing spectral integration in factored form (see appendix A) allows us to avoid taking wall-bounded derivatives and only take derivatives in periodic direction, without passing to the nodal domain.

At this point, all the pieces are in place to compute flow fields maximizing heat transfer. By utilizing gradient ascent or other time-marching schemes of § 3.1, it is possible to adapt existing Rayleigh–Bénard codes to find steady maximizing flow fields and compare their thermal transfer to that of natural flows.

4 Two-dimensional computations

For every decade of Péclet number we computed approximately 20 logarithmically spaced solutions to the Euler–Lagrange equations (3.3) and (3.8). We then performed numerical differentiation of $\log Nu$ and $\log \unicode[STIX]{x1D6E4}$ with respect to $\log Pe$ to examine (local) scaling relations. The largest enstrophy satisfied $Pe\approx 10^{5}$, with an $x$ Fourier, $z$ Chebyshev grid size of $512\times 1025$. The time-stepping code slowed substantially at larger $Pe$.

In addition to numerical continuation from small Péclet number ($Pe\approx 0$), we started at different points in function space in an attempt to find more optimal solutions. We utilized solutions to other related variational problems as ‘educated guesses’ for new flows at which we started our gradient ascent procedure. We computed solutions to the Euler–Lagrange equations for fixed aspect ratio $\unicode[STIX]{x1D6E4}$ as well as optimizing over all $\unicode[STIX]{x1D6E4}$. The solutions presented here achieved the largest transport.

Figure 1 shows visualizations of the temperature field and the streamfunction with no-slip boundary conditions. In the low Péclet number regime, the solutions are convection cells while at higher $Pe$ ‘recirculation zones’ develop on the bottom and top boundaries. Additionally, the optimal aspect ratio of a cell size shrinks with ever increasing $Pe$. For a fixed aspect ratio, optimal solutions contain a multiplicity of convection cells. One of the benefits of our numerical approach is the automatic computation of the optimal domain size with little additional overhead.

Figure 1. Optimal no-slip solutions for different enstrophy budgets in a single cell. The black contour lines are the streamlines and the colours represent the temperature field. From left to right, top to bottom the Péclet numbers are $4.0\times 10^{-1},4.0\times 10^{0},4.0\times 10^{1},4.0\times 10^{2},4.0\times 10^{3},4.0\times 10^{4}$. The domain size in the horizontal direction $x$ shrinks as the enstrophy budget increases. (a) $Pe=4\times 10^{-1}$, (b) $Pe=4\times 10^{0}$, (c) $Pe=4\times 10^{1}$, (d) $Pe=4\times 10^{2}$, (e) $Pe=4\times 10^{3}$, (f) $Pe=4\times 10^{4}$.

Figure 2. Computed optimal Nusselt number ($Nu$) and aspect ratio ($\unicode[STIX]{x1D6E4}$) as a function of the enstrophy budget ($Pe$), for no-slip boundary conditions. (a) Log–log plot of $Pe$ versus $Nu$-1. (b) Log–log plot of $Pe$ versus $\unicode[STIX]{x1D6E4}$. Panels (c) and (d) show slopes of (a) and (b), respectively. The last slope for the (c) plot is $0.544$ and the last slope for the (d) plot is $-0.371$. The largest computed $Pe=2.5\times 10^{5}$ corresponding to $\unicode[STIX]{x1D707}=1.4\times 10^{-9}$.

In figure 2 we report the $Nu$-$Pe$ and $\unicode[STIX]{x1D6E4}$-$Pe$ relations and local scaling relations for the best known optimizers (for no-slip boundary conditions). After leaving the ‘linear’ regime where $Nu\sim Pe^{2}$ we enter a ‘fully nonlinear’ regime where $Nu\sim Pe^{0.54}$. In view of the results of Tobasco & Doering (Reference Tobasco and Doering2017) and Doering & Tobasco (Reference Doering and Tobasco2019) this scaling cannot persist for the globally maximal transport at sufficiently large $Pe$. Nevertheless, it does appear to be optimal for the computed range of Péclet number in two spatial dimensions. (The $Nu\sim Pe^{0.54}$ optimal transport scaling in this regime was first reported in Souza (Reference Souza2016) and thereafter confirmed by independent computation by Motoki et al. (Reference Motoki, Kawahara and Shimizu2018b) and Ding & Kerswell (Reference Ding and Kerswell2019).)

In the linear regime there is little change in the optimal aspect ratio, i.e. the optimal $\unicode[STIX]{x1D6E4}\approx \text{constant}$ while in the fully nonlinear regime a non-trivial $\unicode[STIX]{x1D6E4}\sim Pe^{-0.37}$ scaling emerges. Interestingly, the local exponent exhibits a somewhat oscillatory relaxation rather than a perhaps more expected monotonic convergence.

4.1 Singular value decomposition

Given the structure of the solutions depicted in figure 1 one may wonder if the corresponding fields are to first approximation separable. With regards to the streamfunction $\unicode[STIX]{x1D713}$ defined by $(-\unicode[STIX]{x2202}_{z}\unicode[STIX]{x1D713},\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D713})=(u_{1},u_{3})$, this would mean

(4.1)

$$\begin{eqnarray}\unicode[STIX]{x1D713}(x,z)\approx \unicode[STIX]{x1D6F9}(z)\unicode[STIX]{x1D719}(x).\end{eqnarray}$$

We investigate this possibility by minimizing the functional

(4.2)

$$\begin{eqnarray}A[\unicode[STIX]{x1D6F9},\unicode[STIX]{x1D719}]=\frac{1}{\unicode[STIX]{x1D6E4}}\int _{0}^{\unicode[STIX]{x1D6E4}}\int _{0}^{1}(\unicode[STIX]{x1D713}(x,z)-\unicode[STIX]{x1D6F9}(z)\unicode[STIX]{x1D719}(x))^{2}\,\text{d}x\,\text{d}z\end{eqnarray}$$

subject to appropriate constraints on $\unicode[STIX]{x1D6F9}(z)$ and $\unicode[STIX]{x1D719}(x)$, and similarly for other fields such as $\unicode[STIX]{x1D702}$ or $\unicode[STIX]{x1D709}$. (For example, $\unicode[STIX]{x1D719}(x)$ should be periodic and $\unicode[STIX]{x1D6F9}(z)$ should satisfy no-slip boundary conditions.) If the minimal value is sufficiently small, then we may say that the flow fields are nearly separable.

Upon discretization,

(4.3)

$$\begin{eqnarray}\displaystyle A[\unicode[STIX]{x1D6F9},\unicode[STIX]{x1D719}]\approx \mathop{\sum }_{i,j}(\unicode[STIX]{x1D713}_{ij}-\unicode[STIX]{x1D6F9}_{j}\unicode[STIX]{x1D719}_{i})^{2}w_{ij},\end{eqnarray}$$

(4.4a-c)

$$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D713}_{ij}=\unicode[STIX]{x1D713}(x_{i},z_{j}),\quad \unicode[STIX]{x1D719}_{i}=\unicode[STIX]{x1D719}(x_{i}),\quad \text{and}\quad \unicode[STIX]{x1D6F9}_{j}=\unicode[STIX]{x1D6F9}(z_{j}),\end{eqnarray}$$

where $x_{i}=\unicode[STIX]{x1D6E4}i/n$ for $i=0,\ldots ,n-1$ and $z_{j}={\textstyle \frac{1}{2}}(1+\cos (\unicode[STIX]{x03C0}j/m))$ for $j=0,\ldots ,m$ are the collocation points of the Fourier and Chebyshev discretizations and $w_{ij}$ are weights for approximating the integral by a discrete sum. Spectral accuracy for the approximation (4.3) was obtained by setting

(4.5a-c)

$$\begin{eqnarray}w_{ij}=\unicode[STIX]{x0394}x_{i}\unicode[STIX]{x0394}z_{j}\quad \text{with}\quad \unicode[STIX]{x0394}x_{i}=\frac{1}{n}\quad \text{and}\quad \unicode[STIX]{x0394}z_{j}=\frac{1}{2}\sin (a_{j})\frac{1}{m}\mathop{\sum }_{\ell =1}^{m-1}\sin (\ell a_{j})\frac{(1-\cos (\ell \unicode[STIX]{x03C0}))}{\ell },\end{eqnarray}$$

where $a_{j}=\unicode[STIX]{x03C0}j/m$ for $j=0,\ldots ,m$, see Boyd (Reference Boyd2001). Since we are dealing with the average value of the integral there is no factor of $\unicode[STIX]{x1D6E4}$ in $\unicode[STIX]{x0394}x_{i}$.

In the discrete problem corresponding to (4.3) we minimized the weighted Frobenius norm of the matrix $\unicode[STIX]{x1D713}_{ij}$ with respect to an outer product decomposition, hence, for $w_{ij}=1$, the solution to the discrete problem is to take $\unicode[STIX]{x1D6F9}$ and $\unicode[STIX]{x1D719}$ to be equal to the largest singular vectors of the matrix $\unicode[STIX]{x1D713}_{ij}$ multiplied by the largest singular value. For $w_{ij}\neq 1$, we rescaled the problem taking $\unicode[STIX]{x1D713}_{ij}^{\prime }=\unicode[STIX]{x1D713}_{ij}\sqrt{w_{ij}}$ and found the largest singular vectors of $\unicode[STIX]{x1D713}_{ij}^{\prime }$. The difference between uniform and non-uniform $w_{ij}$ may be interpreted as minimizing with respect to different weighted energy norms and, in our numerical experiments, we found little qualitative difference between utilizing one over the other. All singular value decompositions reported here were computed using $w_{ij}=1$.

Figure 3 shows the first three modes in the singular value decomposition of the streamfunction $\unicode[STIX]{x1D713}$. Figure 5 shows the same for the $\unicode[STIX]{x1D709}$ field and the vertical velocity field $u_{3}$. The modes with smaller singular values may be viewed as a preliminary manifestation of a branching-like pattern, perhaps similar to the ones constructed in Tobasco & Doering (Reference Tobasco and Doering2017) and Doering & Tobasco (Reference Doering and Tobasco2019). The structures corresponding to the largest singular value are the direct analogue for the wall-to-wall problem of the single wavenumber solutions for the Howard–Busse–Malkus problem produced in Howard (Reference Howard1963).

As it turns out, these dominant structures carry ${\approx}99\%$ of the overall heat transport in the computed range of $Pe$. More precisely, writing

(4.6)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D713}(x,z)\approx \unicode[STIX]{x1D6F9}(z)\unicode[STIX]{x1D719}^{(1)}(x), & \displaystyle\end{eqnarray}$$

(4.7)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D709}(x,z)\approx \unicode[STIX]{x1D6EF}(z)\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D719}^{(2)}(x), & \displaystyle\end{eqnarray}$$

(4.8)

$$\begin{eqnarray}\displaystyle & \displaystyle N_{1}=\langle (\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D713})\unicode[STIX]{x1D709}\rangle , & \displaystyle\end{eqnarray}$$

(4.9)

$$\begin{eqnarray}\displaystyle & \displaystyle N_{2}=\langle \unicode[STIX]{x1D6F9}(\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D719}^{(1)})\unicode[STIX]{x1D6EF}\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D719}^{(2)}\rangle , & \displaystyle\end{eqnarray}$$

our solutions achieved $(N_{1}-N_{2})/N_{1}\leqslant 0.01$ uniformly over all computed Péclet numbers, see figure 4. (This error was even less for smaller Péclet numbers.) The fact that (4.8) is equal to $Nu-1$ for solutions to the wall-to-wall problem follows from the Euler–Lagrange equation (3.8).

Figure 3. The streamfunction $\unicode[STIX]{x1D713}$ and its first three modes at $Pe\approx 2.4\times 10^{4}$. The modes are ordered left to right by the magnitude of their singular values starting with the largest.

Figure 4. The quantity $|N_{1}-N_{2}|/N_{2}\times 100$ as a function of $Pe$. The value remains below $1\%$ throughout the range of numerically computed values of $Pe$.

Figure 5. The field $\unicode[STIX]{x1D709}=\unicode[STIX]{x1D703}+\unicode[STIX]{x1D711}$ (top row), the vertical velocity $u_{3}$ (bottom row) and their first three modes at $Pe\approx 2.4\times 10^{4}$. The outer products are ordered with respect to their singular values.

Figure 6. The first singular vectors in the singular value decomposition of the $\unicode[STIX]{x1D709}$ field (a,b) and the vertical velocity $u_{3}$ (c,d). The respective products of these singular vectors yields the first approximations in figure 5. The Péclet number ${\approx}2.4\times 10^{4}$.

Figure 6 depicts the first singular vectors for $\unicode[STIX]{x1D709}$ and $u_{3}$. While in the Howard–Busse–Malkus problem the corresponding $x$ dependence is perfectly sinusoidal, for the wall-to-wall problem the $x$ dependences of the first singular vectors resemble Jacobi elliptic functions. The similarity between $\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D719}^{(1)}$ and $\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D719}^{(2)}$ in figure 6 motivates the ‘separable ansatz’

(4.10)

$$\begin{eqnarray}\displaystyle & \displaystyle u_{1}(x,z)=-\unicode[STIX]{x1D6F9}^{\prime }(z)\unicode[STIX]{x1D719}(x), & \displaystyle\end{eqnarray}$$

(4.11)

$$\begin{eqnarray}\displaystyle & \displaystyle u_{3}(x,z)=\unicode[STIX]{x1D6F9}(z)\unicode[STIX]{x1D719}^{\prime }(x), & \displaystyle\end{eqnarray}$$

(4.12)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D709}(x,z)=\unicode[STIX]{x1D6EF}(z)\unicode[STIX]{x1D719}^{\prime }(x), & \displaystyle\end{eqnarray}$$

which we consider further in § 5.1 below. There, we derive conditional upper bounds extending the single wavenumber analysis from Howard (Reference Howard1963) to the wall-to-wall problem. We will see that the functional from § 2.6 can be bounded from above by $Pe^{6/11}=Pe^{0.\overline{54}}$ amongst all separable ansatzes, in accord with the numerically computed $Nu\sim Pe^{0.54}$ scaling.

5 Conditional upper bound

As discussed in the previous section, the flow and temperature fields found to maximize heat transport are nearly separable and achieve the scaling $Nu\sim Pe^{0.54}$ within the range of computed $Pe$. Furthermore, most of the heat transport scaling was attained via a separable approximation. We now make use of the theoretical developments from § 2 to prove the upper bound $Nu-1\leqslant CPe^{0.\overline{54}}$ under the assumption of a perfectly separable ansatz. We do this by rederiving the Howard functional from Howard (Reference Howard1963) (as in § 2.6) and assuming a separable ansatz at the outset. It should be emphasized that the arguments given below do not yield a fully ansatz-free explanation of the observed numerical scaling from the previous section since it must still be proved that all solutions in the range $Pe\lesssim 10^{5}$ are indeed separable or, what is more likely the case, that their non-separable part contributes negligibly to transport.

5.1 Upper bounds within a separable ansatz

To obtain upper bounds on the Nusselt number within a separable ansatz, we start by recalling that the Howard–Busse–Malkus problem bounds the wall-to-wall problem from above. Indeed, as was shown in §§ 2.3 and 2.6,

(5.1)

$$\begin{eqnarray}\max _{\boldsymbol{u}}Nu-1=\max _{\boldsymbol{u},\unicode[STIX]{x1D709}}\min _{\unicode[STIX]{x1D702}}{\mathcal{S}}=\max _{\boldsymbol{v},\unicode[STIX]{x1D70E}}\tilde{{\mathcal{M}}}\leqslant \max _{\boldsymbol{v},\unicode[STIX]{x1D70E}}{\mathcal{M}}\end{eqnarray}$$

since the minimum is only made smaller upon enlarging the class of admissible functions. Thus, we are led to consider the functional

(5.2)

from § 2.6, while also restricting ourselves to the separable ansatz

(5.3)

$$\begin{eqnarray}\displaystyle & \displaystyle u_{1}(x,z)=-\unicode[STIX]{x1D6F9}^{\prime }(z)\unicode[STIX]{x1D719}(x), & \displaystyle\end{eqnarray}$$

(5.4)

$$\begin{eqnarray}\displaystyle & \displaystyle u_{3}(x,z)=\unicode[STIX]{x1D6F9}(z)\unicode[STIX]{x1D719}^{\prime }(x), & \displaystyle\end{eqnarray}$$

(5.5)

$$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D709}(x,z)=\unicode[STIX]{x1D6EF}(z)\unicode[STIX]{x1D719}^{\prime }(x), & \displaystyle\end{eqnarray}$$

where $^{\prime }$ denotes an ordinary derivative. Note, while the argument that follows is similar to those of Howard (Reference Howard1963) and Doering & Constantin (Reference Doering and Constantin1996), it is more general in scope since here we allow $\unicode[STIX]{x1D719}(x)$ to be any periodic function rather than some well-chosen Fourier mode. Our choice to study this more general case is motivated by the previous discussion of numerical results, which showed how, in the wall-to-wall problem, non-sinusoidal $\unicode[STIX]{x1D719}(x)$ arise. Our goal now is to bound the functional ${\mathcal{M}}$ from above under the assumptions of separability (5.3)–(5.5). Again, we note that this does not provide the unconditional upper bound on ${\mathcal{M}}$ required to deduce bounds on $\text{max }Nu$ according to (5.1). What we lack is a proof that (5.3)–(5.5) are indeed valid assumptions over the given range of $Pe$. Nevertheless, we proceed.

Note that the velocity field is incompressible, and we normalize $\unicode[STIX]{x1D719}$ by requiring

(5.6)

$$\begin{eqnarray}\overline{(\unicode[STIX]{x1D719}^{\prime })^{2}}=1.\end{eqnarray}$$

Given the homogeneity of the functional we may impose that

(5.7)

$$\begin{eqnarray}\langle u_{3}\unicode[STIX]{x1D709}\rangle =\int _{0}^{1}\unicode[STIX]{x1D6F9}(z)\unicode[STIX]{x1D6EF}(z)\,\text{d}z=1.\end{eqnarray}$$

Given these, we seek to bound the denominator of (5.2) from below so as to produce the desired upper bound.

Note the identities

(5.8)

$$\begin{eqnarray}\displaystyle & \displaystyle \langle u_{3}\unicode[STIX]{x1D709}\rangle =\int _{0}^{1}\unicode[STIX]{x1D6F9}(z)\unicode[STIX]{x1D6EF}(z)\,\text{d}z, & \displaystyle\end{eqnarray}$$

(5.9)

$$\begin{eqnarray}\displaystyle & \displaystyle \langle |\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle =\int _{0}^{1}\left(\overline{\unicode[STIX]{x1D719}^{2}}(\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}+2(\unicode[STIX]{x1D6F9}^{\prime })^{2}+\overline{(\unicode[STIX]{x1D719}^{\prime \prime })^{2}}(\unicode[STIX]{x1D6F9})^{2}\right)\,\text{d}z, & \displaystyle\end{eqnarray}$$

(5.10)

$$\begin{eqnarray}\displaystyle & \displaystyle \langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}\rangle =\int _{0}^{1}(\unicode[STIX]{x1D6EF}^{\prime })^{2}+\overline{(\unicode[STIX]{x1D719}^{\prime \prime })^{2}}\unicode[STIX]{x1D6EF}^{2}\,\text{d}z, & \displaystyle\end{eqnarray}$$

(5.11)

$$\begin{eqnarray}\displaystyle & \displaystyle \langle (\overline{u_{3}\unicode[STIX]{x1D709}}-\langle u_{3}\unicode[STIX]{x1D709}\rangle )^{2}\rangle =\int _{0}^{1}\left(\unicode[STIX]{x1D6F9}\unicode[STIX]{x1D6EF}-1\right)^{2}\,\text{d}z, & \displaystyle\end{eqnarray}$$

where for convenience we have abbreviated $\int _{0}^{1}f(z)\,\text{d}z$ as $\int f$. Using these, we obtain the lower bound

(5.12)

$$\begin{eqnarray}\displaystyle \langle |\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle \langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}\rangle & {\geqslant} & \displaystyle \overline{\unicode[STIX]{x1D719}^{2}}\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}+\overline{\left(\unicode[STIX]{x1D719}^{\prime \prime }\right)^{2}}\int \unicode[STIX]{x1D6F9}^{2}\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}+\overline{\unicode[STIX]{x1D719}^{2}}\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\overline{\left(\unicode[STIX]{x1D719}^{\prime \prime }\right)^{2}}\int \unicode[STIX]{x1D6EF}^{2}\nonumber\\ \displaystyle & & \displaystyle +\,\left(\overline{(\unicode[STIX]{x1D719}^{\prime \prime })^{2}}\right)^{2}\int \unicode[STIX]{x1D6F9}^{2}\int \unicode[STIX]{x1D6EF}^{2}\end{eqnarray}$$

(5.13)

$$\begin{eqnarray}\displaystyle & {\geqslant} & \displaystyle \overline{\unicode[STIX]{x1D719}^{2}}\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}+2\sqrt{\overline{\left(\unicode[STIX]{x1D719}^{\prime \prime }\right)^{2}}\int \unicode[STIX]{x1D6F9}^{2}\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}\overline{\unicode[STIX]{x1D719}^{2}}\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\overline{\left(\unicode[STIX]{x1D719}^{\prime \prime }\right)^{2}}\int \unicode[STIX]{x1D6EF}^{2}}\nonumber\\ \displaystyle & & \displaystyle +\,\left(\overline{(\unicode[STIX]{x1D719}^{\prime \prime })^{2}}\right)^{2}\int \unicode[STIX]{x1D6F9}^{2}\int \unicode[STIX]{x1D6EF}^{2},\end{eqnarray}$$

by an elementary Young’s inequality. The interpolation inequality

(5.14)

$$\begin{eqnarray}\left[\overline{(\unicode[STIX]{x1D719}^{\prime \prime })^{2}}\right]\left[\overline{\unicode[STIX]{x1D719}^{2}}\right]\geqslant \left(\overline{\unicode[STIX]{x1D719}\unicode[STIX]{x1D719}^{\prime \prime }}\right)^{2}=\left(\overline{(\unicode[STIX]{x1D719}^{\prime })^{2}}\right)^{2}=1\end{eqnarray}$$

follows by combining Cauchy–Schwarz, integration by parts, and the normalization (5.6). Similarly,

(5.15)

$$\begin{eqnarray}\int \unicode[STIX]{x1D6EF}^{2}\int \unicode[STIX]{x1D6F9}^{2}\geqslant 1,\end{eqnarray}$$

by Cauchy–Schwarz and the net flux constraint (5.7).

Proceeding, we deduce from (5.14) and (5.15) that

(5.16)

$$\begin{eqnarray}\displaystyle \langle |\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\rangle \langle |\unicode[STIX]{x1D735}\unicode[STIX]{x1D709}|^{2}\rangle & {\geqslant} & \displaystyle \frac{1}{\overline{(\unicode[STIX]{x1D719}^{\prime \prime })^{2}}}\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\!\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}+2\sqrt{\overline{\left(\unicode[STIX]{x1D719}^{\prime \prime }\right)^{2}}\!\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\!\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}}+(\overline{(\unicode[STIX]{x1D719}^{\prime \prime })^{2}})^{2}\nonumber\\ \displaystyle & & \displaystyle\end{eqnarray}$$

(5.17)

$$\begin{eqnarray}\displaystyle & {\geqslant} & \displaystyle \inf _{a>0}\left\{\frac{1}{a^{2}}\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}+a\sqrt{4\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}}+a^{4}\right\}\end{eqnarray}$$

(5.18)

$$\begin{eqnarray}\displaystyle & = & \displaystyle \frac{9}{2^{4/3}}\left(\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}\right)^{2/3}.\end{eqnarray}$$

Utilizing Howard’s lemma (the estimate referred to as such in Doering & Constantin (Reference Doering and Constantin1996), equation (6.26)), we can bound the remaining term by

(5.19)

$$\begin{eqnarray}\int \left(\unicode[STIX]{x1D6F9}\unicode[STIX]{x1D6EF}-1\right)^{2}\geqslant \frac{16}{15}\left(\frac{3}{2}\right)^{1/2}\left(\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}\right)^{-1/4}.\end{eqnarray}$$

Taking $\unicode[STIX]{x1D6FF}=\left(\int (\unicode[STIX]{x1D6F9}^{\prime \prime })^{2}\int (\unicode[STIX]{x1D6EF}^{\prime })^{2}\right)^{-1/4}$, we see that the denominator of the functional ${\mathcal{M}}$ given in (5.2) – in the separable ansatz – is bounded below by

(5.20)

$$\begin{eqnarray}\frac{16}{15}\left(\frac{3}{2}\right)^{1/2}\unicode[STIX]{x1D6FF}+\frac{9Pe^{-2}}{2^{4/3}}\frac{1}{\unicode[STIX]{x1D6FF}^{8/3}}\geqslant \left(\frac{16}{15}\left(\frac{3}{2}\right)^{1/2}\right)^{8/11}\left(\frac{9Pe^{-2}}{2^{4/3}}\right)^{3/11}\left(\left(\frac{3}{8}\right)^{-3/11}+\left(\frac{3}{8}\right)^{8/11}\right)\end{eqnarray}$$

after minimizing over $\unicode[STIX]{x1D6FF}$.

Thus, ${\mathcal{M}}$ is bounded from above by a constant multiple of $Pe^{6/11}=Pe^{0.\overline{54}}$ in the separable ansatz (5.3)–(5.5), in accord with the numerically computed transport scaling of $Pe^{0.54}$. While this is suggestive of the bound $Nu-1\leqslant 0.324Pe^{6/11}$ (roughly a factor of two larger than the numerical prefactor ${\approx}0.15$), it remains to be shown that the separable ansatz holds up to terms negligible in their transport for the given range of $Pe$.

6 Summary and conclusions

We developed and applied time-stepping methods for solving the wall-to-wall problem for steady flows. Along the way we developed theoretical tools illuminating connections between the wall-to-wall problem and both the background method of Doering & Constantin (Reference Doering and Constantin1996) and Howard’s classic formulation of heat transport bounds for Rayleigh–Bénard convection from Howard (Reference Howard1963). We computed optimal steady two-dimensional flow fields with no-slip boundary conditions whose heat transport scaling is $Nu\sim Pe^{0.54}$ for Péclet numbers between $10^{3}$ and $10^{5}$. Upon examining these two-dimensional flows, we computed their singular value decomposition, revealing concentration onto one dominant mode. This motivated us to consider the consequences of the additional assumption that the flow fields are separable. Within such an ansatz, we proved a conditional upper bound scaling as $Pe^{6/11}$ corresponding to the $Ra^{3/8}$ ‘single wavenumber’ upper bound scaling for Rayleigh–Bénard with no-slip boundary conditions.

In light of Tobasco & Doering (Reference Tobasco and Doering2017) and Doering & Tobasco (Reference Doering and Tobasco2019), however, these scalings for the optimal wall-to-wall problem cannot persist for global optimizers as $Pe\rightarrow \infty$. There, it was proved that wall-to-wall optimal transport in two and three dimensions, obeying no-slip or stress-free boundary conditions, satisfies $\max \,Nu\geqslant C^{\prime }Pe^{2/3}/(\log Pe)^{4/3}$ within logarithms of the a priori upper bound $Nu\leqslant C\,Pe^{2/3}$. Curiously, this lower bound coincides to a degree with the numerical results in the range of Péclet numbers explored: logarithmic differentiation yields

(6.1)

$$\begin{eqnarray}\frac{\text{d}\log (Nu-1)}{\text{d}\log Pe}\approx \frac{2}{3}-\frac{4}{3}\frac{1}{\log Pe}\approx 0.55\text{ at }Pe=10^{5},\end{eqnarray}$$

not far from the numerically computed $Nu\sim Pe^{0.54}$ result. Although at first we found no direct numerical evidence for the branching patterns constructed in Tobasco & Doering (Reference Tobasco and Doering2017) and Doering & Tobasco (Reference Doering and Tobasco2019), a singular value decomposition revealed branching in its inception. We wonder exactly how large $Pe$ must be for the higher modes to play a significant role in optimizing heat transport in a two-dimensional fluid layer.

The three-dimensional computations reported in Motoki et al. (Reference Motoki, Kawahara and Shimizu2018a) show a stark difference in flow structures and transport scaling as compared to the two-dimensional flows for both stress-free boundaries from Hassanzadeh et al. (Reference Hassanzadeh, Chini and Doering2014) and no-slip boundaries from § 4 of the present paper (see also Motoki et al. (Reference Motoki, Kawahara and Shimizu2018b)). Three-dimensional versions of branching appear to achieve the optimal scaling $Nu\sim Pe^{2/3}$ already for $Pe\in [10^{3},10^{4}]$ with a prefactor within $10\%$ of the background upper bound computations from Plasting & Kerswell (Reference Plasting and Kerswell2003). We wonder if this $10\%$ difference is a manifestation of our proposed duality gap between the wall-to-wall problem and the background method bounds. In any case, the evidence is that optimal flows take advantage of the presence of a third spatial dimension to maximize thermal transport.

We conclude with a list of five fundamental questions remaining for the optimal wall-to-wall transport problem:

(i) Are steady flows optimal?
(ii) Is it possible to prove the a priori upper bound $Nu\leqslant CPe^{2/3}/\log (Pe)^{4/3}$ for two-dimensional flows as $Pe\rightarrow \infty$, and does the upper bound $Nu\leqslant CPe^{6/11}$ hold for moderate $Pe$ instead?
(iii) Do there exist three-dimensional flow fields achieving $Nu\sim Pe^{2/3}$ as $Pe\rightarrow \infty$?
(iv) What do optimally transporting flows look like in other geometries, such as cylinders or domains with holes?
(v) Do structural properties of optimal flows resemble those of buoyancy-driven steady and/or statistically stationary turbulent Rayleigh–Bénard convection?

Acknowledgements

We thank R. Kerswell and D. Viswanath for many illuminating conversations and the anonymous reviewers for many helpful suggestions. This work was supported in part by NSF Awards DMS-1344199 (A.N.S.), DGE-0813964 and DMS-1812831 (I.T.), DMS-1515161 and DMS-1813003 (C.R.D.), a Van Loo Postdoctoral Fellowship (I.T.) and a Guggenheim Foundation Fellowship (C.R.D.).

Declaration of interests

The authors report no conflict of interest.

Appendix A. Spatial discretization

In what follows we provide details of implementing spectral integration with regards to solving the Poisson and Stokes equations with respect to our boundary conditions and domain.

A.1 Spectral methods

Taking the Fourier transform of (3.27) and (3.31) in the horizontal directions leads to the following set of ordinary differential equations to solve

(A 1)

$$\begin{eqnarray}(D^{2}-\unicode[STIX]{x1D6FD}_{n\ell }^{2})\unicode[STIX]{x1D703}_{n\ell }=f_{n\ell },\end{eqnarray}$$

and

(A 2)

$$\begin{eqnarray}\displaystyle & \displaystyle \left(D^{2}-\unicode[STIX]{x1D6FD}_{n\ell }^{2}\right)\boldsymbol{u}_{n\ell }=\boldsymbol{f}_{n\ell }+Dp_{n\ell }\hat{z}+\imath k_{n}p_{n\ell }\hat{x}+\imath \unicode[STIX]{x1D705}_{\ell }p_{n\ell }{\hat{y}} & \displaystyle\end{eqnarray}$$

(A 3)

$$\begin{eqnarray}\displaystyle & \displaystyle \imath k_{n}u_{n\ell }+\imath \unicode[STIX]{x1D705}_{\ell }v_{n\ell }+Dw_{n\ell }=0, & \displaystyle\end{eqnarray}$$

for $n,\ell \in \mathbb{Z}$, where $D$ the derivative in the vertical direction, and

(A 4a-e)

$$\begin{eqnarray}k_{n\ell }^{2}=(k_{n})^{2}+(\unicode[STIX]{x1D705}_{\ell })^{2},\quad \unicode[STIX]{x1D6FD}_{n\ell }^{2}=k_{n\ell }^{2}+c,\quad k_{n}=\frac{n\unicode[STIX]{x03C0}}{\unicode[STIX]{x1D6E4}_{1}},\quad \unicode[STIX]{x1D705}_{\ell }=\frac{\ell \unicode[STIX]{x03C0}}{\unicode[STIX]{x1D6E4}_{2}},\quad \text{and}\quad \imath =\sqrt{-1}.\end{eqnarray}$$

The square root denotes the principle branch. Although we could discretize the spatial coordinates using Chebyshev matrices, we instead use spectral integration. This method of solving boundary value problems of the form

(A 5)

$$\begin{eqnarray}(D-k)y=f\end{eqnarray}$$

subject to boundary conditions has numerous advantages over the differentiation matrix approach as in Trefethen (Reference Trefethen2001). With spectral integration, the operators that must be inverted have bounded condition numbers and are banded matrices as opposed to dense matrices with unbounded condition numbers.

Instead of solving for functions on $z\in [0,1]$, it is more convenient to take $z\in [-1,1]$ and then convert the results back to the original domain. Solutions in the $z\in [0,1]$ domain – denoted by the subscript $1$ as in $\unicode[STIX]{x1D703}_{1},\boldsymbol{u}_{1}$ – are related to solutions in the $z\in [-1,1]$ domain – denoted by the subscript $2$ as in $\unicode[STIX]{x1D703}_{2},\boldsymbol{u}_{2}$ – via the following relations

(A 6a-e)

$$\begin{eqnarray}\unicode[STIX]{x1D703}_{1}={\textstyle \frac{1}{2}}\unicode[STIX]{x1D703}_{2},\quad \boldsymbol{u}_{1}=2\boldsymbol{u}_{2},\quad Pe_{1}=4Pe_{2},\quad (Nu-1)_{1}=(Nu-1)_{2},\quad \unicode[STIX]{x1D707}_{1}=\unicode[STIX]{x1D707}_{2}/16.\end{eqnarray}$$

When performing calculations we use the $[-1,1]$ domain but report results in terms of the original $z\in [0,1]$ domain.

Computing averages (such as the Nusselt number) can be achieved with spectral accuracy. As mentioned in Boyd (Reference Boyd2001), the trapezoidal rule is spectrally accurate for periodic functions and for bounded domains there are quadrature weight formulas both in terms of the Chebyshev nodal and modal values. Furthermore, all products are computed by taking the inverse transforms, multiplying in real space and converting back to spectral space, as is common for pseudo-spectral methods.

A.2 Spectral integration

We have seen that gradient ascent for the wall-to-wall problem reduces to solving differential equations of the form

(A 7)

$$\begin{eqnarray}(D-k)y=f,\end{eqnarray}$$

where $k\in \boldsymbol{R}$ and $z\in [-1,1]$. The differential equation has the solution

(A 8)

$$\begin{eqnarray}y(z)=C\text{e}^{kz}+\text{e}^{-kz}\int _{-1}^{z}\text{e}^{kx}f(x)\,\text{d}x,\end{eqnarray}$$

where $C$ enforces boundary or integral constraints. This reduces the problem to quadrature and indeed the method that we adopt implicitly constructs the solution in this manner, as noted in Viswanath (Reference Viswanath2015).

To solve (A 7) we use a modern form of spectral integration developed by Viswanath (Reference Viswanath2015). The general principle is remarkably simple. First compute the homogeneous solution

(A 9)

$$\begin{eqnarray}(D-k)y^{h}=0,\end{eqnarray}$$

and then the particular solution

(A 10)

$$\begin{eqnarray}(D-k)y^{p}=f,\end{eqnarray}$$

so that the general solution is then a linear combination of the particular and homogeneous solution

(A 11)

$$\begin{eqnarray}y=Cy^{h}+y^{p},\end{eqnarray}$$

where $C$ is a constant that enforces boundary conditions or integral constraints. This basic decomposition of the general solution of an ordinary differential equation serves as the primary building block in the construction of solutions to (3.27) and (3.31).

We now discuss how to construct $y^{h}$ and $y^{p}$ in the domain $z\in [-1,1]$. First write $y$ as Chebyshev expansion of the form

(A 12)

$$\begin{eqnarray}y(z)=\frac{y_{0}}{2}P_{0}+\mathop{\sum }_{n=1}^{\infty }y_{n}P_{n}(z),\end{eqnarray}$$

where $P_{n}(x)$$n=0,1,2,\ldots$ are the Chebyshev polynomials defined by

(A 13)

$$\begin{eqnarray}P_{n}(z)=\cos (n\cos ^{-1}(z)).\end{eqnarray}$$

The factor of $1/2$ in front of the $y_{0}$ term is standard and convenient for formulas later on.

Let ${\mathcal{T}}_{n}(y)$ denote the $n$th Chebyshev coefficient of $y$, e.g.

(A 14)

$$\begin{eqnarray}\displaystyle & \displaystyle {\mathcal{T}}_{n}(y)=y_{n} & \displaystyle\end{eqnarray}$$

(A 15)

$$\begin{eqnarray}\displaystyle & \displaystyle {\mathcal{T}}_{n}\left(\int y\right)=\left\{\begin{array}{@{}ll@{}}0\quad & \text{ for }n=0,\\ {\displaystyle \frac{y_{n-1}-y_{n+1}}{2n}}\quad & \text{ for }n>0,\end{array}\right. & \displaystyle\end{eqnarray}$$

where $\int y$ here denotes a particular anti-derivative of $y$. The coefficients may be computed by the linear operator

(A 16)

$$\begin{eqnarray}y_{n}=\frac{2}{\unicode[STIX]{x03C0}}\int _{-1}^{1}y(x)\frac{P_{n}(x)}{\sqrt{1-x^{2}}}\,\text{d}x.\end{eqnarray}$$

Instead of working with (A 7) directly we work with the equation in integral form

(A 17)

$$\begin{eqnarray}y-k\int y=\int f+C,\end{eqnarray}$$

where $C$ is a constant of integration. An important observation is that if $f=Dg$ for some function $g$, then we do not need to differentiate $g$ to write it in the form (A 17) but rather we can directly use

(A 18)

$$\begin{eqnarray}y-k\int y=g+C.\end{eqnarray}$$

In other words, we can avoid numerically differentiating $g$.

Use the relation (A 15) to construct (A 17) as a system of equations for the Chebyshev coefficients, i.e.

(A 19)

$$\begin{eqnarray}\displaystyle & \displaystyle {\mathcal{T}}_{n}(y)-k{\mathcal{T}}_{n}\left(\int y\right)={\mathcal{T}}_{n}\left(\int f\right)+{\mathcal{T}}_{n}\left(C\right), & \displaystyle\end{eqnarray}$$

(A 20)

$$\begin{eqnarray}\displaystyle & \displaystyle \Rightarrow & \displaystyle \nonumber\\ \displaystyle & \displaystyle y_{0}=2C\quad \text{for }n=0, & \displaystyle\end{eqnarray}$$

(A 21)

$$\begin{eqnarray}\displaystyle & \displaystyle y_{n}-k\frac{y_{n-1}-y_{n+1}}{2n}=\frac{f_{n-1}-f_{n+1}}{2n}\quad \text{for }n>0. & \displaystyle\end{eqnarray}$$

The $f_{n}$ are the Chebyshev coefficients of the forcing function $f$. Note that any choice of $C$ yields a particular solution to the problem, but does not enforce the proper boundary conditions. This is an infinite-dimensional tridiagonal system of equations for the Chebyshev coefficients of $y$. In order to be amenable to computation any such system of equations must be truncated. Thus we assume that the solution $y$ and forcing function $f$ are well represented by a finite truncation. The use of this finite representation has additional benefits. The Chebyshev spectral coefficients of $y$ are related to the nodal values of $y$ evaluated at $\cos (\unicode[STIX]{x03C0}j/(N-1))$ for $j=0,1,\ldots ,N-1$ via the fast-cosine transform. This allows us to quickly convert spectral coefficients into real space, and herein lies one of the many advantages of using a Chebyshev series.

We are now in a position to show how to construct numerical homogeneous and particular solutions $y^{h}$ and $y^{p}$. To construct $y^{h}$ we solve the following problem:

(A 22)

$$\begin{eqnarray}(D-k)v=\frac{k}{2},\end{eqnarray}$$

subject to ${\mathcal{T}}_{0}(v)=0$. The general solution to (A 22) is

(A 23)

$$\begin{eqnarray}\displaystyle & \displaystyle v=Cy^{h}+v^{p}, & \displaystyle\end{eqnarray}$$

(A 24)

$$\begin{eqnarray}\displaystyle & \displaystyle v^{p}=-1/2, & \displaystyle\end{eqnarray}$$

(A 25)

$$\begin{eqnarray}\displaystyle & \displaystyle y^{h}(z)=\text{e}^{kz}, & \displaystyle\end{eqnarray}$$

where $v^{p}=-1/2$ is the particular solution. The condition ${\mathcal{T}}_{0}(v)=0$ guarantees that $C\neq 0$ since

(A 26)

$$\begin{eqnarray}C{\mathcal{T}}_{0}(y^{h})={\mathcal{T}}_{0}(v-v^{p})={\mathcal{T}}_{0}(v)-{\mathcal{T}}_{0}(v^{p})=0-(-1)=1\end{eqnarray}$$

and ${\mathcal{T}}_{0}(v^{h})\neq 0$. Thus the homogeneous solution is $y^{h}=v+1/2$. Denoting the Chebyshev series of $v$ by $v_{n}$ for $n=0,\ldots ,N-1$, we find $v$ by solving the system of equations

(A 27)

$$\begin{eqnarray}\displaystyle & \displaystyle v_{0}=v_{N-1}=0, & \displaystyle\end{eqnarray}$$

(A 28)

$$\begin{eqnarray}\displaystyle & \displaystyle v_{n}-k\frac{v_{n-1}-v_{n+1}}{2n}=\frac{f_{n-1}-f_{n+1}}{2n}\quad \text{for }0<n<N-1. & \displaystyle\end{eqnarray}$$

We set the $N-1$ Chebyshev coefficient of $v$ and $f$ to zero for convenience. For the construction of the particular solution $y^{p}$ we solve (A 10) subject to ${\mathcal{T}}_{0}(y^{p})=0$. Thus we solve the system of equations

(A 29)

$$\begin{eqnarray}\displaystyle & \displaystyle v_{0}=v_{N-1}=0, & \displaystyle\end{eqnarray}$$

(A 30)

$$\begin{eqnarray}\displaystyle & \displaystyle v_{n}-k\frac{v_{n-1}-v_{n+1}}{2n}=\frac{f_{n-1}-f_{n+1}}{2n}\quad \text{for }0<n<N-1. & \displaystyle\end{eqnarray}$$

Now that we have constructed the homogeneous and particular solutions we can enforce boundary conditions. Given $y(a)=b$ for $a\in [-1,1]$ the value of the constant $C$ is determined,

(A 31)

$$\begin{eqnarray}y(a)=Cy^{h}(a)+y^{p}(a)\Rightarrow C=\frac{b-y^{p}(a)}{y^{h}(a)}.\end{eqnarray}$$

Typically, we enforce the boundary conditions at the endpoint $z=\pm 1$ for which we have readily available formulas to compute the values of $y^{h}$ and $y^{p}$ in terms of their Chebyshev coefficients, see Boyd (Reference Boyd2001).

We now show how to construct solutions to the second-order equation

(A 32)

$$\begin{eqnarray}(D^{2}-k^{2})y=f.\end{eqnarray}$$

For this problem we have two homogeneous solutions and one particular solution, so that the general solution to this differential equation is of the form

(A 33)

$$\begin{eqnarray}y=C_{1}y^{h_{1}}+C_{2}y^{h_{2}}+y^{p}.\end{eqnarray}$$

We solve this as a system of two different equations

(A 34)

$$\begin{eqnarray}\displaystyle & \displaystyle (D-k)v=f, & \displaystyle\end{eqnarray}$$

(A 35)

$$\begin{eqnarray}\displaystyle & \displaystyle (D+k)y=v. & \displaystyle\end{eqnarray}$$

First, we find the particular and homogeneous solution to

(A 36)

$$\begin{eqnarray}(D-k)v=f,\end{eqnarray}$$

as described previously so that we have

(A 37)

$$\begin{eqnarray}v=Cv^{h_{1}}+v^{p}.\end{eqnarray}$$

Then, we find $y^{h_{2}}$ and $y^{p}$ by solving

(A 38)

$$\begin{eqnarray}\displaystyle & \displaystyle (D+k)y^{h_{2}}=y^{h_{1}}, & \displaystyle\end{eqnarray}$$

(A 39)

$$\begin{eqnarray}\displaystyle & \displaystyle (D+k)y^{p}=v^{p}, & \displaystyle\end{eqnarray}$$

subject to ${\mathcal{T}}_{0}(y^{h_{2}})=0$ and ${\mathcal{T}}_{0}(y^{p})=0$ via spectral integration for the particular solution. Note that $y^{h_{2}}$ constructed in this manner is linearly independent of $y^{h_{1}}$. To enforce boundary conditions we must now invert a matrix for the coefficients $C_{1}$ and $C_{2}$. For example for boundary condition enforced at the endpoints $z=\pm 1$ we have

(A 40)

$$\begin{eqnarray}\left[\begin{array}{@{}cc@{}}y^{h1}(-1) & y^{h2}(-1)\\ y^{h1}(1) & y^{h2}(1)\end{array}\right]\left[\begin{array}{@{}c@{}}C_{1}\\ C_{2}\end{array}\right]=\left[\begin{array}{@{}c@{}}y(-1)-y^{p}(-1)\\ y(1)-y^{p}(1)\end{array}\right]\end{eqnarray}$$

for the coefficients $a$ and $b$.

Once we have our solution $y$, we find derivatives without recourse to differentiation matrices or Fourier transform methods. Indeed, in the first-order case, i.e. for $y$ that satisfy

(A 41)

$$\begin{eqnarray}(D-k)y=f,\end{eqnarray}$$

we find the derivative by simply rearranging the equation

(A 42)

$$\begin{eqnarray}Dy=f+ky.\end{eqnarray}$$

Hence, differentiation is obtained by merely summing the forcing function and the solution multiplied by $k$. A similar approach can be used for constructing the second derivatives.

We now have all the pieces to solve the modified Poisson’s equation (3.27). Solving the modified Stokes equation (3.31) is more complicated and is the subject of the next section.

A.3 Kleiser–Schumann algorithm

The Kleiser–Schumann algorithm is a method for solving the modified Stokes problem (3.31), see Kleiser & Schumann (Reference Kleiser, Schumann and Hirschel1980), Viswanath & Tobasco (Reference Viswanath and Tobasco2013). Since the modified Stokes problem is linear, we solve it wavenumber by wavenumber. From now on we drop the subscript $n\ell$ with the understanding that the modified Stokes problem must be solved for all values of $n$ and $\ell$ individually. For now we consider the $k\neq 0$ case and discuss how to handle this mode separately at the end.

The single wavenumber problem is to solve

(A 43)

$$\begin{eqnarray}\displaystyle & \displaystyle (D^{2}-\unicode[STIX]{x1D6FD}^{2})\boldsymbol{u}=\boldsymbol{f}-\hat{\unicode[STIX]{x1D735}}p, & \displaystyle\end{eqnarray}$$

(A 44)

$$\begin{eqnarray}\displaystyle & \displaystyle \hat{\unicode[STIX]{x1D735}}\boldsymbol{\cdot }\boldsymbol{u}=0. & \displaystyle\end{eqnarray}$$

Taking the divergence of the first equation yields the following equation for $p$:

(A 45)

$$\begin{eqnarray}(D^{2}-k^{2})p=\hat{\unicode[STIX]{x1D735}}\boldsymbol{\cdot }\boldsymbol{f}.\end{eqnarray}$$

As we have seen before, we may write the general solution to this problem as the sum of two homogeneous terms and a particular solution

(A 46)

$$\begin{eqnarray}p=C_{1}p^{h_{1}}+C_{2}p^{h_{2}}+p^{p}.\end{eqnarray}$$

Hence, we may write the equation for $\boldsymbol{u}$ as

(A 47)

$$\begin{eqnarray}(D^{2}-\unicode[STIX]{x1D6FD}^{2})\boldsymbol{u}=\boldsymbol{f}-\hat{\unicode[STIX]{x1D735}}p^{p}-C_{1}\hat{\unicode[STIX]{x1D735}}p^{h_{1}}-C_{2}\hat{\unicode[STIX]{x1D735}}p^{h_{2}}.\end{eqnarray}$$

We split the problem of finding solutions to $\boldsymbol{u}$ into three parts. Let $\boldsymbol{u}^{i}$ for $i=1,2,3$ be solutions to

(A 48)

$$\begin{eqnarray}\displaystyle & \displaystyle (D^{2}-\unicode[STIX]{x1D6FD}^{2})\boldsymbol{u}^{1}=\hat{\unicode[STIX]{x1D735}}p^{h_{1}}, & \displaystyle\end{eqnarray}$$

(A 49)

$$\begin{eqnarray}\displaystyle & \displaystyle (D^{2}-\unicode[STIX]{x1D6FD}^{2})\boldsymbol{u}^{2}=\hat{\unicode[STIX]{x1D735}}p^{h_{2}}, & \displaystyle\end{eqnarray}$$

(A 50)

$$\begin{eqnarray}\displaystyle & \displaystyle (D^{2}-\unicode[STIX]{x1D6FD}^{2})\boldsymbol{u}^{3}=\boldsymbol{f}-\hat{\unicode[STIX]{x1D735}}p^{p}, & \displaystyle\end{eqnarray}$$

where each $\boldsymbol{u}^{i}$ for $i=1,2,3$ satisfies the boundary conditions for $\boldsymbol{u}$. With this we write $\boldsymbol{u}$ as

(A 51)

$$\begin{eqnarray}\boldsymbol{u}=\boldsymbol{u}^{3}-C_{1}\boldsymbol{u}^{1}-C_{2}\boldsymbol{u}^{2}.\end{eqnarray}$$

To find $C_{1}$ and $C_{2}$ we use auxiliary conditions derived by enforcing incompressibility on the boundary.

For no-slip boundary conditions we have $\unicode[STIX]{x2202}_{z}w(z=\pm 1)=0$ and for stress-free boundary conditions $\unicode[STIX]{x2202}_{zz}w(z=\pm 1)=0$. That is to say, equation (A 51) applies to each component hence to the vertical velocity $w$

(A 52)

$$\begin{eqnarray}w=w^{3}-C_{1}w^{1}-C_{2}w^{2}.\end{eqnarray}$$

For example, with no-slip boundary conditions we apply the vertical derivative $D$ to both sides and solve the following system of equations for $C_{1}$ and $C_{2}$:

(A 53)

$$\begin{eqnarray}\left[\begin{array}{@{}cc@{}}Dw^{1}(z=1) & Dw^{2}(z=1)\\ Dw^{1}(z=-1) & Dw^{2}(z=-1)\end{array}\right]\left[\begin{array}{@{}c@{}}C_{1}\\ C_{2}\end{array}\right]=\left[\begin{array}{@{}c@{}}Dw^{3}(z=1)\\ Dw^{3}(z=-1)\end{array}\right].\end{eqnarray}$$

For fixed time-step sizes, the matrices are precomputed and factorized only once. Furthermore $\boldsymbol{u}^{1}$ and $\boldsymbol{u}^{2}$ is precomputed. Hence at each time step we only need to solve for $\boldsymbol{u}^{3}$ and the coefficients $C_{1}$ and $C_{2}$.

For this problem, the $k=0$ case must be handled separately as well. This is due to the fact that the homogeneous solution for the pressure is of the form $p^{h_{1}}=1/2$ and $p^{h_{2}}=z/2$. Letting $\boldsymbol{u}^{1}=(u^{1},v^{1},w^{1})$, the $k=0$ case implies that $\boldsymbol{u}^{1}=0$ for no-slip boundary conditions and stress-free boundary conditions. For $\unicode[STIX]{x1D6FD}=0$, one has $(u^{1},v^{1},w^{1})=(D_{1},D_{2},0)$ in the stress-free case where $D_{1}$ and $D_{2}$ are arbitrary constants. By specifying that $(u^{1},v^{1})$ are mean zero we may set these arbitrary constants to zero. The constant $C_{1}$ becomes a free parameter that does not affect the physical flow field $\boldsymbol{u}$. We may choose this value such that the average of $p$ is zero, but this is by no means necessary. Now, we must find $C_{2}$. From incompressibility and from the boundary conditions we see that $Dw=0\Rightarrow w=0$ which implies $C_{2}w^{2}=w^{3}$. (Note that for $\unicode[STIX]{x1D6FD}=0$ we have $w^{2}=(z^{2}-1)/4$ and for $\unicode[STIX]{x1D6FD}\neq 0$ we have $w^{2}=-(1/2\unicode[STIX]{x1D6FD}^{2})+(\sinh (\unicode[STIX]{x1D6FD})\cosh (\unicode[STIX]{x1D6FD}z)/\unicode[STIX]{x1D6FD}^{2}\sinh (2\unicode[STIX]{x1D6FD}))$.) Taking the derivative we find that $Dw^{3}(z\pm 1)/Dw^{2}(z\pm 1)=C_{2}$. This may appear overconstrained, but since we are guaranteed that that the functions $w^{2}$ and $w^{3}$ are proportional to one another, we could take either conditions to evaluate the constant $C_{2}$.

Appendix B. Optimality condition for domain size

Prior work (Hassanzadeh et al. Reference Hassanzadeh, Chini and Doering2014) indicates that there seems to be an optimal domain size in the periodic direction $x\in [0,\unicode[STIX]{x1D6E4}]$. The derivative of the functional ${\mathcal{F}}$ with respect to domain size $\unicode[STIX]{x1D6E4}$, is easiest to compute by rewriting ${\mathcal{F}}=\langle {\mathcal{L}}\rangle$ with

(B 1)

$$\begin{eqnarray}{\mathcal{L}}=u_{3}\unicode[STIX]{x1D703}+\unicode[STIX]{x1D711}\boldsymbol{u}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}-\unicode[STIX]{x1D735}\unicode[STIX]{x1D711}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D703}+\unicode[STIX]{x1D711}u_{3}+\unicode[STIX]{x1D707}\left(Pe^{2}-|\unicode[STIX]{x1D735}\boldsymbol{u}|^{2}\right)+\unicode[STIX]{x1D735}p\boldsymbol{\cdot }\boldsymbol{u}.\end{eqnarray}$$

Differentiating this functional with respect to $\unicode[STIX]{x1D6E4}$ evaluated on a solution to the Euler–Lagrange equations yields

(B 2)

$$\begin{eqnarray}\displaystyle & \displaystyle \frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6E4}}=-\frac{1}{\unicode[STIX]{x1D6E4}}\langle {\mathcal{F}}\rangle +\frac{1}{\unicode[STIX]{x1D6E4}}\int _{0}^{1}{\mathcal{H}}^{x}\,\text{d}z, & \displaystyle\end{eqnarray}$$

(B 3)

$$\begin{eqnarray}\displaystyle & \displaystyle -{\mathcal{H}}^{x}=-{\mathcal{L}}-\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D703}\unicode[STIX]{x1D711}u_{1}+\unicode[STIX]{x1D707}|\unicode[STIX]{x2202}_{x}\boldsymbol{u}|^{2}-p\unicode[STIX]{x2202}_{x}u+2\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D703}\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D719}. & \displaystyle\end{eqnarray}$$

The latter part is the spatial Hamiltonian density associated with ${\mathcal{L}}$. It may seem that $(\unicode[STIX]{x1D6FF}{\mathcal{F}}/\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6E4})$ is a function of $x$, but the Lagrangian ${\mathcal{L}}$ has no explicit spatial dependence, hence, the quantity $\int _{0}^{1}{\mathcal{H}}^{x}\,\text{d}z$ is independent of the horizontal variable $x$. In other words $\int _{0}^{1}{\mathcal{H}}^{x}\,\text{d}z=\langle {\mathcal{H}}^{x}\rangle$. This allows us to simplify the formula and compute

(B 4)

$$\begin{eqnarray}\frac{\unicode[STIX]{x1D6FF}{\mathcal{F}}}{\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6E4}}=\frac{1}{\unicode[STIX]{x1D6E4}}\langle \unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D703}\unicode[STIX]{x1D711}u_{1}+\unicode[STIX]{x1D707}|\unicode[STIX]{x2202}_{x}\boldsymbol{u}|^{2}-p\unicode[STIX]{x2202}_{x}u+2\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D703}\unicode[STIX]{x2202}_{x}\unicode[STIX]{x1D719}\rangle .\end{eqnarray}$$

(If this calculation is unfamiliar to the reader we refer to Souza (Reference Souza2016) but the essence of the calculation comes from the first exercise of Feynman & Hibbs (Reference Feynman and Hibbs1965).)

The optimal domain size condition for ${\mathcal{S}}$ is similar but not exactly the same due to a redefinition of the pressure term.

We include an aspect ratio flow of the form $\unicode[STIX]{x2202}_{\unicode[STIX]{x1D70F}}\unicode[STIX]{x1D6E4}=(\unicode[STIX]{x1D6FF}{\mathcal{F}}/\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6E4})$ in addition to the previous time-stepping procedures. There are subtleties associated with implementing this flow to make the procedure successful. Empirically, it is found to be useful to withhold evolution of the aspect ratio until one was ‘close’ to a solution of the Euler–Lagrange equations. Furthermore, taking a time step in tandem with the other evolutions seemed prohibitively slow, thus, the evolution is only taken every $n$th time step with respect to the other evolutions, where $n=5$ works well for our purposes. The overhead of changing the domain slows down computations by a negligible amount once these modifications are implemented.

References

Alben, S. 2017 Improved convection cooling in steady channel flows. Phys. Rev. Fluids 2, 104501.CrossRef Google Scholar

Boyd, J. P. 2001 Chebyshev and Fourier Spectral Methods, 2nd edn. Dover.Google Scholar

Busse, F. H. 1969 On Howard’s bound for heat transport by turbulent convection. J. Fluid Mech. 37, 457–477.CrossRef Google Scholar

Ding, Z. & Kerswell, R. R.2019 Exhausting the background approach for bounding the heat transport in Rayleigh–Bénard convection. J. Fluid Mech. (submitted).Google Scholar

Doering, C. R. & Constantin, P. 1996 Variational bounds on energy dissipation in incompressible flows. III. Convection. Phys. Rev. E 53, 5957–5981.Google Scholar PubMed

Doering, C. R. & Tobasco, I. 2019 On the optimal design of wall-to-wall heat transport. Commun. Pure Appl. Maths 72, 2385–2448.CrossRef Google Scholar

Feynman, R. P. & Hibbs, A. R. 1965 Quantum Mechanics and Path Integrals, 1st edn. McGraw-Hill.Google Scholar

Hassanzadeh, P., Chini, G. P. & Doering, C. R. 2014 Wall to wall optimal transport. J. Fluid Mech. 751, 627–662.CrossRef Google Scholar

Howard, L. N. 1963 Heat transport by turbulent convection. J. Fluid Mech. 17, 405–432.CrossRef Google Scholar

Kerswell, R. R. 1998 Unification of variational principles for turbulent shear flows: the background method of Doering-Constantin and the mean-fluctuation formulation of Howard-Busse. Physica D 121, 175–192.Google Scholar

Kleiser, L. & Schumann, U. 1980 Treatment of incompressibility and boundary conditions in 3-D numerical spectral simulations of plane channel flows. In Proceedings of the 3rd Conference on Numerical Methods in Fluid Mechanics (ed. Hirschel, E. H.), pp. 165–173. Springer.CrossRef Google Scholar

Malkus, W. V. R. 1954 Discrete transitions in turbulent convection. Proc. R. Soc. Lond. 225, 185–195.Google Scholar

Marcotte, F., Doering, C. R., Thiffeault, J.-L. & Young, W. R. 2018 Optimal heat transfer and optimal exit times. SIAM J. Appl. Maths 78 (1), 591–608.CrossRef Google Scholar

Motoki, S., Kawahara, G. & Shimizu, M. 2018a Maximal heat transfer between two parallel plates. J. Fluid Mech. 851, R4.CrossRef Google Scholar

Motoki, S., Kawahara, G. & Shimizu, M. 2018b Optimal heat transfer enhancement in plane Couette flow. J. Fluid Mech. 835, 1157–1198.CrossRef Google Scholar

Plasting, S. C. & Kerswell, R. R. 2003 Improved upper bound on the energy dissipation rate in plane Couette flow: the full solution to Busse’s problem and the Constantin–Doering–Hopf problem with one-dimensional background field. J. Fluid Mech. 477, 363–379.CrossRef Google Scholar

Souza, A. N.2016 An optimal control approach to bounding transport properties of thermal convection. Doctoral thesis, University of Michigan.Google Scholar

Souza, A. N. & Doering, C. R. 2015a Maximal transport in the Lorenz equations. Phys. Lett. A 379, 518–523.CrossRef Google Scholar

Souza, A. N. & Doering, C. R. 2015b Transport bounds for a truncated model of Rayleigh–Bénard convection. Physica D 308, 26–33.CrossRef Google Scholar

Tobasco, I. & Doering, C. R. 2017 Optimal wall-to-wall transport by incompressible flows. Phys. Rev. Lett. 118, 264502.CrossRef Google Scholar PubMed

Toppaladoddi, S., Succi, S. & Wettlaufer, J. S. 2017 Roughness as a route to the ultimate regime of thermal convection. Phys. Rev. Lett. 118, 074503.CrossRef Google Scholar PubMed

Trefethen, L. N. 2001 Spectral Methods in MATLAB, 1st edn. SIAM.Google Scholar

Viswanath, D. 2015 Spectral integration of linear boundary value problems. J. Comput. Appl. Maths 290, 159–173.CrossRef Google Scholar

Viswanath, D. & Tobasco, I. 2013 Navier–Stokes solver using Green’s functions I: channel flow and plane Couette flow. J. Comput. Phys. 251, 414–431.CrossRef Google Scholar

Wen, B., Chini, G. P., Kerswell, R. R. & Doering, C. R. 2015 Time-stepping approach for solving upper-bound problems: Application to two-dimensional Rayleigh-Bénard convection. Phys. Rev. E 92 (4), 043012.CrossRef Google Scholar PubMed