A generalised-Lagrangian-mean model of the interactions between near-inertial waves and mean flow

J.-H. Xie; J. Vanneste

doi:10.1017/jfm.2015.251

Published online by Cambridge University Press: 04 June 2015

J.-H. Xie*: Affiliation:
School of Mathematics and Maxwell Institute for Mathematical Sciences, University of Edinburgh, Edinburgh EH9 3FD, UK
J. Vanneste: Affiliation:
School of Mathematics and Maxwell Institute for Mathematical Sciences, University of Edinburgh, Edinburgh EH9 3FD, UK
*Email address for correspondence: J.H.Xie@ed.ac.uk

Abstract

Wind forcing of the ocean generates a spectrum of inertia–gravity waves that is sharply peaked near the local inertial (or Coriolis) frequency. The corresponding near-inertial waves (NIWs) are highly energetic and play a significant role in the slow, large-scale dynamics of the ocean. To analyse this role, we develop a new model of the non-dissipative interactions between NIWs and balanced motion. The model is derived using the generalised-Lagrangian-mean (GLM) framework (specifically, the ‘glm’ variant of Soward & Roberts, J. Fluid Mech., vol. 661, 2010, pp. 45–72), taking advantage of the time-scale separation between the two types of motion to average over the short NIW period. We combine Salmon’s (J. Fluid Mech., vol. 719, 2013, pp. 165–182) variational formulation of GLM with Whitham averaging to obtain a system of equations governing the joint evolution of NIWs and mean flow. Assuming that the mean flow is geostrophically balanced reduces this system to a simple model coupling Young & Ben Jelloul’s (J. Mar. Res., vol. 55, 1997, pp. 735–766) equation for NIWs with a modified quasi-geostrophic (QG) equation. In this coupled model, the mean flow affects the NIWs through advection and refraction; conversely, the NIWs affect the mean flow by modifying the potential-vorticity (PV) inversion – the relation between advected PV and advecting mean velocity – through a quadratic wave term, consistent with the GLM results of Bühler & McIntyre (J. Fluid Mech., vol. 354, 1998, pp. 301–343). The coupled model is Hamiltonian and its conservation laws, for wave action and energy in particular, prove illuminating: on their basis, we identify a new interaction mechanism whereby NIWs forced at large scales extract energy from the balanced flow as their horizontal scale is reduced by differential advection and refraction so that their potential energy increases. 
A rough estimate suggests that this mechanism could provide a significant sink of energy for mesoscale motion and play a part in the global energetics of the ocean. Idealised two-dimensional models are derived and simulated numerically to gain insight into NIW–mean-flow interaction processes. A simulation of a one-dimensional barotropic jet demonstrates how NIWs forced by wind slow down the jet as they propagate into the ocean interior. A simulation assuming plane travelling NIWs in the vertical shows how a vortex dipole is deflected by NIWs, illustrating the irreversible nature of the interactions. In both simulations energy is transferred from the mean flow to the NIWs.

JFM classification

Geophysical and Geological Flows: Quasi-geostrophic flows; Mathematical Foundations: Variational methods; Geophysical and Geological Flows: Waves in rotating fluids

Type: Papers
Information: Journal of Fluid Mechanics, Volume 774, 10 July 2015, pp. 143–169

DOI: https://doi.org/10.1017/jfm.2015.251
Copyright: © 2015 Cambridge University Press

1. Introduction

Near-inertial waves (NIWs), that is, inertia–gravity waves with frequencies close to the local Coriolis frequency $f_{0}$, play an important role in the dynamics of the ocean (e.g. Fu 1981). They account for almost 50 % of the wave energy (e.g. Ferrari & Wunsch 2009) and thus make a strong contribution to processes associated with inertia–gravity waves such as diapycnal mixing, vertical motion and primary production. Several features explain their dominance (Garrett 2001): their minimum frequency in the inertia–gravity-wave spectrum, the low frequency of the atmospheric winds that generate them, the presence of turning latitudes, nonlinear interactions (Medvedev & Zeitlin 1997), and the transfer of tidal energy through parametric subharmonic instability (Young, Tsang & Balmforth 2008).

In view of their large energy, it is natural to expect that NIWs affect the large-scale circulation of the ocean. One possibility is that they do so through enhanced diapycnal mixing in the regions of the ocean where they dissipate (e.g. Wunsch & Ferrari 2004). Another, more remarkable perhaps, is that they alter the slow, balanced oceanic circulation directly through wave–mean-flow interaction processes. Gertz & Straub (2009) put forward the idea that NIWs provide an energy sink for this circulation. Their numerical simulations suggest that this process may be significant and, along with other mechanisms including bottom and surface friction (e.g. Duhaut & Straub 2006; Nikurashin, Vallis & Adcroft 2013) and loss of balance (e.g. Danioux et al. 2012; Vanneste 2013), help resolve the long-standing puzzle posed by the dissipation of the (inverse energy-cascading) balanced oceanic flow.

The aim of the present paper is to develop a theoretical tool that enables a detailed analysis of the interactions between NIWs and balanced flow. So far, theoretical modelling has focused on the impact of the balanced flow on NIWs. Under the assumption that NIW scales are much smaller than mean-flow scales, a WKB approach can be applied (Mooers 1975a,b); it shows in particular that the vorticity of the balanced flow shifts the frequency of NIWs away from $f_{0}$ (Kunze 1985). Young & Ben Jelloul (1997) (referred to as YBJ hereafter) derived an asymptotic model based on the frequency separation between NIWs and balanced motion which, in contrast, makes no assumption of separation between the NIW and flow spatial scales. Their model is therefore well suited to examining the realistic scenario of NIWs forced by atmospheric winds at horizontal scales larger than those of the ocean flow. The YBJ model describes the slow modulation of the NIW fields about their oscillations at the fast frequency $f_{0}$. It neatly isolates the main mechanisms whereby the balanced flow and stratification influence NIWs: advection, dispersion and refraction.

In this paper, we extend the YBJ model to account for the feedback of the NIWs on the balanced flow. Specifically, we derive a new model that couples the YBJ model with a modified quasi-geostrophic (QG) model. The modification – a change in the relation between the advected potential vorticity (PV) and advecting velocity involving quadratic wave terms – captures this feedback. As detailed below, we work in the framework of non-dissipative generalised-Lagrangian-mean theory (GLM: see e.g. Bühler 2009). Bühler & McIntyre (1998, for short waves) and Holmes-Cerfon, Bühler & Ferrari (2011, for waves of arbitrary spatial scales) showed that the change in the PV–velocity relationship is a generic conclusion of this theory which interprets some of the quadratic wave terms as PV contributions associated with the wave pseudomomentum (see also Bühler & McIntyre 2005, Bühler 2009 and Salmon 2013).

We pay close attention to the conservation laws satisfied by the coupled model. These turn out to be particularly important: based on the conservation of NIW action (in fact, the NIW kinetic energy divided by $f_{0}$ ) and total energy alone, we identify a novel mechanism providing a sink of energy for the balanced flow. In this mechanism, the reduction in the horizontal scale of NIWs that results from advection and refraction is accompanied by an increase in the NIW potential energy and, consequently, a decrease in the energy of the balanced flow.

A key to the derivation of wave–mean-flow models of the kind we develop is to separate the motion between mean and wave contributions, relying on the time-scale separation to define the mean as an average over the inertial period $2{\rm\pi}/f_{0}$. The GLM theory of Andrews & McIntyre (1978) offers a general framework for this separation and for the systematic derivation of equations governing the coupled wave–mean dynamics (see Bühler 2009 for an account). The theory has achieved notable successes but suffers from a deficiency in that the (Lagrangian) mean velocity it defines is divergent even for an incompressible fluid. Soward & Roberts (2010) proposed a variant of GLM, termed ‘glm’, which yields a divergence-free mean velocity. Because it is convenient, we adopt this approach in the main body of the paper but show in an appendix that the same leading-order model can also be obtained from standard GLM. We also adopt a variational approach that ensures that conservation laws and their link to symmetries are preserved when the primitive equations are reduced asymptotically (see e.g. Grimshaw 1984, Salmon 1988 and Holm, Schmah & Stoica 2009). Specifically, we derive the Lagrangian-mean and perturbation equations by introducing a wave–mean decomposition of the flow map into the primitive-equation Lagrangian, following closely the method proposed by Salmon (2013) (see Gjaja & Holm 1996 for a related approach). Because the wave component consists of rapidly oscillating NIWs, the resulting Lagrangian can be averaged in time in the manner of Whitham (1974).
Variations with respect to the mean-flow map (or rather its inverse) and to the NIW amplitude then lead to a coupled primitive-equation–YBJ system; applying a QG approximation reduces this system to a simple, energy-conserving YBJ–QG coupled model. (See Vanneste 2014 for a related variational derivation of the original YBJ equation.)

The paper is organised as follows. The coupled YBJ–QG model is introduced without a derivation in § 2. Some key properties of the model and the key scaling assumptions underlying its derivation are also discussed there. The derivation itself is carried out in § 3, which also records the complete primitive-equation–YBJ model. The Hamiltonian structure of the YBJ–QG model and associated conservation laws are presented in § 4. Sections 3 and 4 are technical; the reader mainly interested in applications can skip them and move directly to § 5, which considers the possible implications of the wave–mean-flow interactions represented in the model for ocean energetics. In § 6 we examine two simplified models deduced from the full YBJ–QG model assuming certain symmetries. These models are two-dimensional and hence easily amenable to numerical simulations. We take advantage of this and present the results of two sets of simulations demonstrating (i) the slow-down of a one-dimensional barotropic jet by NIWs, and (ii) the deflection of a vortex dipole under the influence of vertically travelling NIWs. The paper concludes with a brief discussion in § 7. Appendices A–C provide details of some of the computations and alternative derivations.

2. Coupled model

2.1. Model

We start with the hydrostatic–Boussinesq equations written in the form

(2.1a )

$$\begin{eqnarray}\displaystyle & \partial _{t}u+\boldsymbol{u}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}u+w\partial _{z}u-(f_{0}+{\it\beta}y)v=-\partial _{x}p, & \displaystyle\end{eqnarray}$$

(2.1b )

$$\begin{eqnarray}\displaystyle & \partial _{t}v+\boldsymbol{u}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}v+w\partial _{z}v+(f_{0}+{\it\beta}y)u=-\partial _{y}p, & \displaystyle\end{eqnarray}$$

(2.1c )

$$\begin{eqnarray}\displaystyle & {\it\theta}=\partial _{z}p, & \displaystyle\end{eqnarray}$$

(2.1d )

$$\begin{eqnarray}\displaystyle & \boldsymbol{{\rm\nabla}}\boldsymbol{\cdot }\boldsymbol{u}+\partial _{z}w=0, & \displaystyle\end{eqnarray}$$

(2.1e )

$$\begin{eqnarray}\displaystyle & \partial _{t}{\it\theta}+\boldsymbol{u}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}{\it\theta}+w\partial _{z}{\it\theta}=0, & \displaystyle\end{eqnarray}$$

where $\boldsymbol{u}=(u,v)$ is the horizontal velocity, $w$ is the vertical velocity, $p$ is the pressure, and ${\it\theta}$ is the buoyancy, defined as $-g$ times the density variations relative to a constant density ${\it\rho}_{0}$ (e.g. Vallis 2006). We have used the ${\it\beta}$-plane approximation to write the Coriolis parameter as $f_{0}+{\it\beta}y$, with constant $f_{0}$ and ${\it\beta}$. Throughout the paper, $\boldsymbol{{\rm\nabla}}=(\partial _{x},\partial _{y})$ denotes the horizontal gradient.

Inertial oscillations are characterised by a linear balance between inertia and the Coriolis force in (2.1a ) and (2.1b ) and thus satisfy

(2.2a,b )

$$\begin{eqnarray}\partial _{t}u-f_{0}v=0\quad \text{and}\quad \partial _{t}v+f_{0}u=0.\end{eqnarray}$$

The solution can be written in complex form as

(2.3)

$$\begin{eqnarray}u+\text{i}v=M_{z}\,\text{e}^{-\text{i}f_{0}t}\end{eqnarray}$$

for some complex amplitude $M(x,y,z)$ . Here we follow YBJ in writing this amplitude as a $z$ -derivative so that the vertical velocity, deduced from the incompressibility condition (2.1d ), takes the simple form

(2.4)

$$\begin{eqnarray}w=-M_{s}\text{e}^{-\text{i}f_{0}t}+\text{c.c.},\end{eqnarray}$$

where $s=x+\text{i}y$ , $\partial _{s}=(\partial _{x}-\text{i}\partial _{y})/2$ , and $\text{c.c.}$ denotes the complex conjugate of the preceding term. The position $\boldsymbol{x}=(x,y,z)$ of fluid particles in the inertial field (2.3) and (2.4) can be obtained by integration. If this position is written as

(2.5)

$$\begin{eqnarray}\boldsymbol{x}=\boldsymbol{X}+{\bf\xi},\end{eqnarray}$$

the displacement ${\bf\xi}=({\it\xi},{\it\eta},{\it\zeta})$ satisfies

(2.6a,b )

$$\begin{eqnarray}{\it\xi}+\text{i}{\it\eta}={\it\chi}_{z}\,\text{e}^{-\text{i}f_{0}t}\quad \text{and}\quad {\it\zeta}=-{\it\chi}_{s}\text{e}^{-\text{i}f_{0}t}+\text{c.c.}\end{eqnarray}$$

where ${\it\chi}=\text{i}M/f_{0}$ in the linear approximation. The mean position $\boldsymbol{X}$ can be regarded as an integration constant identifying the fluid particle, and the displacement ${\bf\xi}$ and amplitude ${\it\chi}$ can be thought of as functions of $\boldsymbol{X}$ .
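The relation between (2.2), (2.3) and (2.6) can be checked numerically. The sketch below is a minimal illustration, not part of the derivation: it assumes a single fluid particle with a frozen amplitude $M_{z}$, a value of $f_{0}$ chosen for illustration, and hypothetical function names. It verifies by centred differences that $u+\text{i}v$ obeys the complex form of (2.2), $\partial _{t}(u+\text{i}v)=-\text{i}f_{0}(u+\text{i}v)$, and that the displacement built from ${\it\chi}_{z}=\text{i}M_{z}/f_{0}$ has the velocity (2.3) as its time derivative.

```python
import cmath

F0 = 1.0e-4  # assumed Coriolis frequency in s^-1 (illustrative mid-latitude value)


def velocity(M_z, t, f0=F0):
    """Complex horizontal velocity u + i v of (2.3)."""
    return M_z * cmath.exp(-1j * f0 * t)


def displacement(M_z, t, f0=F0):
    """Complex horizontal displacement xi + i eta of (2.6), using chi_z = i M_z / f0."""
    chi_z = 1j * M_z / f0
    return chi_z * cmath.exp(-1j * f0 * t)


def centred_derivative(func, t, dt=1.0):
    """Second-order centred difference in time (dt in seconds)."""
    return (func(t + dt) - func(t - dt)) / (2.0 * dt)


def residuals(M_z, t, f0=F0):
    """Relative residuals of the inertial balance and of d(displacement)/dt = velocity."""
    vel = velocity(M_z, t, f0)
    # (2.2a,b) in complex form: d(u + i v)/dt = -i f0 (u + i v)
    balance = centred_derivative(lambda s: velocity(M_z, s, f0), t) + 1j * f0 * vel
    # the displacement must have the velocity field as its time derivative
    kinematic = centred_derivative(lambda s: displacement(M_z, s, f0), t) - vel
    return abs(balance) / (f0 * abs(vel)), abs(kinematic) / abs(vel)


if __name__ == "__main__":
    M_z = 0.1 + 0.05j  # hypothetical velocity amplitude in m/s
    print(residuals(M_z, t=3600.0))
    print(abs(displacement(M_z, 0.0)))  # radius of the inertial circle in metres
```

With these assumed values the particle describes an inertial circle of radius $|M_{z}|/f_{0}$, of the order of a kilometre.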

For NIWs propagating in a flow, the description leading to (2.6) is overly simplified. However, it can be extended to capture the two-way interactions between the NIWs and the flow: this is achieved by regarding $\boldsymbol{X}$ as a suitably defined, time-dependent Lagrangian-mean position (in fact, a mean map $\boldsymbol{X}(\boldsymbol{a},t)$ mapping the particle labelled by $\boldsymbol{a}$ to its mean position at time $t$), and by taking the amplitude ${\it\chi}(\boldsymbol{X},t)$ to be a function of both time and mean position in typical GLM fashion (e.g. Bühler 2009). The main achievement of this paper is the derivation of equations governing the joint evolution of the NIW amplitude ${\it\chi}$ and of the mean map $\boldsymbol{X}(\boldsymbol{a},t)$ or, rather, of the corresponding Lagrangian-mean velocity.

We leave the details of this derivation for the next section and present here the final equations. These are particularly simple when the Lagrangian-mean flow is assumed to be QG and hence derived from a streamfunction ${\it\psi}$ according to $(\bar{\boldsymbol{u}}^{L},\bar{w}^{L})=(\boldsymbol{{\rm\nabla}}^{\bot }{\it\psi},0)$ , with $\boldsymbol{{\rm\nabla}}^{\bot }=(-\partial _{y},\partial _{x})$ . In this approximation, and using $\boldsymbol{x}$ rather than $\boldsymbol{X}$ to denote the independent spatial variables (the mean positions), the coupled model takes the form

(2.7a )

$$\begin{eqnarray}\displaystyle & \displaystyle {\it\chi}_{zzt}+(\partial ({\it\psi},{\it\chi}_{z}))_{z}+\text{i}{\it\beta}y{\it\chi}_{zz}+\frac{\text{i}}{2}\left(\left(\frac{N^{2}}{f_{0}}+{\it\psi}_{zz}\right){\rm\nabla}^{2}{\it\chi}+{\rm\nabla}^{2}{\it\psi}\,{\it\chi}_{zz}-2\boldsymbol{{\rm\nabla}}{\it\psi}_{z}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}{\it\chi}_{z}\right)=0, & \displaystyle \nonumber\\ \displaystyle & & \displaystyle\end{eqnarray}$$

(2.7b )

$$\begin{eqnarray}\displaystyle & q_{t}+\partial ({\it\psi},q)=0, & \displaystyle\end{eqnarray}$$

where $\partial (\cdot ,\cdot )$ denotes the two-dimensional Jacobian (with $\partial (\,f,g)=f_{x}g_{y}-g_{x}f_{y}$), and $N$ is the Brunt–Väisälä frequency, which generally depends on $z$ and is defined by $N^{2}=\bar{{\it\theta}}_{z}$, with $\bar{{\it\theta}}$ the background stratification.

The first equation can be recognised as a version of the YBJ model, specifically their complete (3.2) rather than the simplified model given by their (1.2). It is supplemented by the boundary conditions at the top and bottom boundaries $z=z^{\pm }$ ,

(2.8)

$$\begin{eqnarray}{\it\chi}=\text{const.}^{\pm }\quad \text{at}~z=z^{\pm },\end{eqnarray}$$

ensuring a vanishing NIW vertical velocity there. The second equation is the material conservation of the QG potential vorticity (QGPV) $q$ . This is related to the streamfunction ${\it\psi}$ and to ${\it\chi}$ through

(2.9)

$$\begin{eqnarray}q={\it\beta}y+{\rm\Delta}{\it\psi}+\frac{\text{i}f_{0}}{2}\partial ({\it\chi}_{z}^{\ast },{\it\chi}_{z})+f_{0}G({\it\chi}^{\ast },{\it\chi}),\end{eqnarray}$$

where

(2.10)

$$\begin{eqnarray}{\rm\Delta}={\rm\nabla}^{2}+\partial _{z}(f_{0}^{2}/N^{2}\partial _{z})\end{eqnarray}$$

is the familiar QGPV operator,

(2.11)

$$\begin{eqnarray}G({\it\chi}^{\ast },{\it\chi})={\textstyle \frac{1}{4}}(2|\boldsymbol{{\rm\nabla}}{\it\chi}_{z}|^{2}-{\it\chi}_{zz}{\rm\nabla}^{2}{\it\chi}^{\ast }-{\it\chi}_{zz}^{\ast }{\rm\nabla}^{2}{\it\chi}),\end{eqnarray}$$

and $\text{}^{\ast }$ denotes complex conjugate. In the usual way, (2.9) should be interpreted as an inversion equation which relates the streamfunction ${\it\psi}$ and hence the advecting velocity $\boldsymbol{{\rm\nabla}}^{\bot }{\it\psi}$ to the dynamical variables, here $q$ and ${\it\chi}$ . This inversion necessitates boundary conditions. In the vertical direction they are provided by the advection of the Lagrangian-mean buoyancy at the top and bottom boundaries, that is,

(2.12)

$$\begin{eqnarray}\partial _{t}{\it\theta}^{\pm }+\partial ({\it\psi}^{\pm },{\it\theta}^{\pm })=0,\quad \text{where}~{\it\psi}^{\pm }={\it\psi}|_{z=z^{\pm }}~\text{and}~{\it\theta}^{\pm }=f_{0}{\it\psi}_{z}|_{z=z^{\pm }}.\end{eqnarray}$$

For horizontally periodic or unbounded domains, as assumed in what follows, (2.7)–(2.12) define the new model completely. The YBJ equation (2.7a) describes the weak dispersion that arises from a finite horizontal scale (through the term $\text{i}N^{2}{\rm\nabla}^{2}{\it\chi}/(2f_{0})$) and also the various effects that the mean flow has on the NIWs: advection (term $(\partial ({\it\psi},{\it\chi}_{z}))_{z}$), and refraction by the mean vorticity (term $\text{i}{\rm\nabla}^{2}{\it\psi}\,{\it\chi}_{zz}/2$) and by vertical shear (term $-\text{i}\boldsymbol{{\rm\nabla}}{\it\psi}_{z}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}{\it\chi}_{z}$). The simple QGPV equation (2.7b) governs the mean flow. Here the effect of the NIWs is a modification of the relation between ${\it\psi}$ and $q$ by the quadratic wave terms in (2.9). This structure is expected from GLM theory, which interprets the quadratic wave terms as a PV contribution stemming from the wave pseudomomentum (Bühler & McIntyre 1998; Holmes-Cerfon et al. 2011).
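To make the quadratic wave terms of (2.9) concrete, the following sketch evaluates them pointwise for an assumed vertically planar field ${\it\chi}={\it\phi}(x,y)\exp (\text{i}mz)$, with ${\it\phi}$ a sum of horizontal plane waves. The mode amplitudes, wavenumbers and function names are hypothetical and non-dimensional. For this particular ansatz a short calculation suggests that the wave contribution reduces to $f_{0}m^{2}[{\rm\nabla}^{2}|{\it\phi}|^{2}/4-\text{Im}({\it\phi}_{x}^{\ast }{\it\phi}_{y})]$, which is real and independent of $z$; the code compares this reduced form with a term-by-term evaluation of (2.9) and (2.11).

```python
import cmath


def phi_and_derivs(x, y, modes):
    """phi, phi_x, phi_y and the horizontal Laplacian of phi for
    phi = sum_j a_j exp(i (kx_j x + ky_j y))."""
    phi = phi_x = phi_y = lap = 0j
    for a, kx, ky in modes:
        e = a * cmath.exp(1j * (kx * x + ky * y))
        phi += e
        phi_x += 1j * kx * e
        phi_y += 1j * ky * e
        lap -= (kx * kx + ky * ky) * e
    return phi, phi_x, phi_y, lap


def wave_pv(x, y, z, modes, m, f0):
    """Quadratic wave terms of (2.9), (i f0/2) d(chi_z^*, chi_z) + f0 G,
    evaluated term by term for chi = phi(x, y) exp(i m z)."""
    phi, phi_x, phi_y, lap = phi_and_derivs(x, y, modes)
    phase = cmath.exp(1j * m * z)
    czx = 1j * m * phi_x * phase   # d/dx of chi_z
    czy = 1j * m * phi_y * phase   # d/dy of chi_z
    czz = -m * m * phi * phase     # chi_zz
    lap_chi = lap * phase          # horizontal Laplacian of chi
    # Jacobian with the convention d(f, g) = f_x g_y - g_x f_y
    jac = czx.conjugate() * czy - czx * czy.conjugate()
    G = 0.25 * (2.0 * (abs(czx) ** 2 + abs(czy) ** 2)
                - czz * lap_chi.conjugate() - czz.conjugate() * lap_chi)
    return 0.5j * f0 * jac + f0 * G


def wave_pv_planar(x, y, modes, m, f0, h=1.0e-4):
    """Reduced form f0 m^2 [lap(|phi|^2)/4 - Im(phi_x^* phi_y)], with the
    Laplacian of |phi|^2 approximated by a five-point stencil of spacing h."""
    def amp2(xx, yy):
        return abs(phi_and_derivs(xx, yy, modes)[0]) ** 2
    lap_amp2 = (amp2(x + h, y) + amp2(x - h, y) + amp2(x, y + h)
                + amp2(x, y - h) - 4.0 * amp2(x, y)) / h ** 2
    _, phi_x, phi_y, _ = phi_and_derivs(x, y, modes)
    return f0 * m * m * (0.25 * lap_amp2 - (phi_x.conjugate() * phi_y).imag)


if __name__ == "__main__":
    modes = [(1.0, 1.0, 0.0), (0.5 + 0.2j, -1.0, 2.0)]  # hypothetical (a, kx, ky)
    print(wave_pv(0.3, -0.7, 1.1, modes, m=2.0, f0=1.0))
    print(wave_pv_planar(0.3, -0.7, modes, m=2.0, f0=1.0))
```

The two evaluations should agree up to the finite-difference error of the stencil, and the imaginary part of the first should vanish, as required for a contribution to the real field $q$.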

2.2. Some properties

An important feature of the coupled model is its conservation laws. The model conserves the total energy

(2.13)

$$\begin{eqnarray}\mathscr{H}=\frac{1}{2}\int \left(|\boldsymbol{{\rm\nabla}}{\it\psi}|^{2}+\frac{f_{0}^{2}}{N^{2}}{\it\psi}_{z}^{2}+f_{0}{\it\beta}y|{\it\chi}_{z}|^{2}+\frac{N^{2}}{2}|\boldsymbol{{\rm\nabla}}{\it\chi}|^{2}\right)\text{d}\boldsymbol{x},\end{eqnarray}$$

and the wave action

(2.14)

$$\begin{eqnarray}\mathscr{A}=\frac{f_{0}}{2}\int |{\it\chi}_{z}|^{2}\,\text{d}\boldsymbol{x}.\end{eqnarray}$$

The wave action can be recognised as the kinetic energy of the NIWs divided by $f_{0}$. Its conservation does not follow from an analogous conservation in the hydrostatic–Boussinesq equations; rather it stems from an adiabatic invariance associated with the large time-scale separation between the fast oscillations of the NIWs and the slow evolution of their amplitude and of the mean flow (see Cotter & Reich 2004). Since, in the NIW limit, the leading-order wave energy is entirely kinetic and their frequency is $f_{0}$, the familiar form of wave action, namely the ratio of wave energy to frequency, reduces to (2.14). The conservation of $\mathscr{H}$ is directly inherited from the energy conservation for the hydrostatic–Boussinesq equations. The first two terms in (2.13) are recognised as the QG kinetic and potential energy associated with the mean flow. The third term is associated with the ${\it\beta}$-effect. The fourth and final term can be interpreted as the time-averaged potential energy of the NIWs; indeed, using the vertical displacement in (2.6) and denoting averaging over the wave time scale $f_{0}^{-1}$ by $\langle \cdot \rangle$, we compute this as

(2.15)

$$\begin{eqnarray}\left\langle \int \frac{N^{2}{\it\zeta}^{2}}{2}\,\text{d}\boldsymbol{x}\right\rangle =\frac{1}{4}\int N^{2}|\boldsymbol{{\rm\nabla}}{\it\chi}|^{2}\,\text{d}\boldsymbol{x}.\end{eqnarray}$$

Here the left-hand side is the standard expression for the quadratic part of the potential energy in a Boussinesq fluid in terms of vertical displacements (e.g. Holliday & McIntyre 1981). The total energy in the model could alternatively be defined as $\mathscr{H}+f_{0}\mathscr{A}$. However, since $f_{0}\mathscr{A}$, which is much larger than $\mathscr{H}$, is conserved independently, and $\mathscr{H}$ is the Noetherian conserved quantity associated with time invariance (see § 4), our separation appears more natural.
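The averaging identity (2.15) can be illustrated numerically. The sketch below is a minimal check under stated assumptions: a doubly periodic square $[0,2{\rm\pi})^{2}$ at a fixed depth, constant $N^{2}$, and an amplitude ${\it\chi}$ made of a few hypothetical plane waves with integer wavenumbers. The average $\langle \cdot \rangle$ is taken over equally spaced phases of the factor $\text{e}^{-\text{i}f_{0}t}$, and the horizontal integrals use a uniform grid, which is exact for such trigonometric polynomials.

```python
import cmath
import math


def chi_gradient(x, y, modes):
    """(chi_x, chi_y) for chi = sum_j a_j exp(i (kx_j x + ky_j y))."""
    chi_x = chi_y = 0j
    for a, kx, ky in modes:
        e = a * cmath.exp(1j * (kx * x + ky * y))
        chi_x += 1j * kx * e
        chi_y += 1j * ky * e
    return chi_x, chi_y


def potential_energy(modes, N2=1.0, n=16, n_phase=8):
    """Both sides of (2.15) per unit depth on the periodic square [0, 2 pi)^2.

    Returns (lhs, rhs): the phase-averaged integral of N^2 zeta^2 / 2 and
    the integral of N^2 |grad chi|^2 / 4, using an n x n uniform grid."""
    h = 2.0 * math.pi / n
    lhs = rhs = 0.0
    for i in range(n):
        for j in range(n):
            chi_x, chi_y = chi_gradient(i * h, j * h, modes)
            chi_s = 0.5 * (chi_x - 1j * chi_y)  # d/ds = (d/dx - i d/dy) / 2
            zeta_sq = 0.0
            for p in range(n_phase):
                phase = cmath.exp(-2j * math.pi * p / n_phase)
                zeta = -2.0 * (chi_s * phase).real  # -chi_s e^{-i f0 t} + c.c.
                zeta_sq += zeta * zeta / n_phase
            lhs += 0.5 * N2 * zeta_sq * h * h
            rhs += 0.25 * N2 * (abs(chi_x) ** 2 + abs(chi_y) ** 2) * h * h
    return lhs, rhs


if __name__ == "__main__":
    modes = [(1.0, 1, 0), (0.3 - 0.4j, 2, -1), (0.5j, -1, 3)]  # hypothetical (a, kx, ky)
    print(potential_energy(modes))
```

Pointwise, $\langle {\it\zeta}^{2}\rangle =2|{\it\chi}_{s}|^{2}$ differs from $|\boldsymbol{{\rm\nabla}}{\it\chi}|^{2}/2$ by a Jacobian-like term; the two sides agree only after horizontal integration, which is why the check is carried out on integrals.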

The energy and action are not the only conserved quantities for the coupled model. Clearly, the enstrophy and more generally the integrals

(2.16)

$$\begin{eqnarray}\int f(q)\,\text{d}\boldsymbol{x}\end{eqnarray}$$

of arbitrary functions $f$ of the PV are conserved, as in the standard QG model. In fact, as we discuss in § 4, the coupled model is Hamiltonian and additional conservation laws (e.g. linear and angular momentum) can be derived using Noether’s theorem.

2.3. Scaling assumptions

Our derivation of the coupled model relies on a number of approximations which we now detail. The parameters characterising the mean flow are the Burger and Rossby numbers

(2.17a,b )

$$\begin{eqnarray}\mathit{Bu}=\frac{N^{2}H^{2}}{f_{0}^{2}L^{2}}\quad \text{and}\quad \mathit{Ro}=\frac{U_{QG}}{f_{0}L},\end{eqnarray}$$

where $L$ and $H$ are the mean-flow horizontal and vertical scales, and $U_{QG}$ is a typical mean velocity. These parameters are taken to satisfy $\mathit{Bu}=O(1)$ and $\mathit{Ro}\ll 1$ in accordance with QG theory. The NIWs are characterised by two parameters analogous to $\mathit{Bu}$ and $\mathit{Ro}$ , namely

(2.18a,b )

$$\begin{eqnarray}{\it\epsilon}=\frac{Nk}{f_{0}m}\quad \text{and}\quad {\it\alpha}=\frac{U_{NIW}}{f_{0}L},\end{eqnarray}$$

where $k$ and $m$ are typical horizontal and vertical wavenumbers, and $U_{NIW}$ is a typical NIW horizontal velocity. The parameter ${\it\epsilon}$ , which measures the relative frequency shift of NIWs compared with $f_{0}$ , is small: ${\it\epsilon}\ll 1$ . In the YBJ model (2.7a ), dispersion and mean-flow effects have similar orders of magnitudes provided that $\mathit{Ro}=O({\it\epsilon}^{2})$ , which we also assume. Note that this makes no specific assumption about the relative size of the wave and mean horizontal scales, which can be taken to satisfy $kL=O(1)$ (provided that $mH=O({\it\epsilon}^{-1})\gg 1$ ).
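The role of ${\it\epsilon}$ as a measure of the frequency shift can be illustrated with a plane wave. The sketch below assumes a resting mean flow (${\it\psi}=0$), ${\it\beta}=0$, constant $N$ and ${\it\chi}\propto \exp [\text{i}(kx+mz-{\it\omega}t)]$, for which (2.7a) gives the slow frequency ${\it\omega}=N^{2}k^{2}/(2f_{0}m^{2})={\it\epsilon}^{2}f_{0}/2$. This is compared with the hydrostatic inertia–gravity dispersion relation $(f_{0}^{2}+N^{2}k^{2}/m^{2})^{1/2}$. The function names and numerical values are illustrative assumptions, not values taken from the paper.

```python
import math


def ybj_frequency_shift(N, f0, k, m):
    """Slow frequency of a plane wave in (2.7a) with psi = 0, beta = 0, constant N."""
    return N ** 2 * k ** 2 / (2.0 * f0 * m ** 2)


def hydrostatic_frequency(N, f0, k, m):
    """Hydrostatic inertia-gravity wave frequency sqrt(f0^2 + N^2 k^2 / m^2)."""
    return math.sqrt(f0 ** 2 + (N * k / m) ** 2)


def relative_error(N, f0, k, m):
    """Relative error of the YBJ shift against the exact shift from f0."""
    exact_shift = hydrostatic_frequency(N, f0, k, m) - f0
    return abs(ybj_frequency_shift(N, f0, k, m) - exact_shift) / exact_shift


if __name__ == "__main__":
    # Hypothetical values giving epsilon = N k / (f0 m) = 0.1
    N, f0, k, m = 1.0e-2, 1.0e-4, 1.0e-5, 1.0e-2
    print(ybj_frequency_shift(N, f0, k, m) / f0)  # epsilon^2 / 2
    print(relative_error(N, f0, k, m))            # of order epsilon^2
```

Under these assumptions the relative frequency shift is ${\it\epsilon}^{2}/2$, and the YBJ approximation to it carries a relative error of order ${\it\epsilon}^{2}$.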

The parameter ${\it\alpha}$ controls the NIW amplitude. We choose its scaling relative to $\mathit{Ro}$ in order that the NIW feedback affects the mean motion at the same order as nonlinear vorticity advection. This imposes that

(2.19)

$$\begin{eqnarray}\mathit{Ro}=O({\it\alpha}^{2}),\quad \text{hence}~{\it\alpha}=O({\it\epsilon}).\end{eqnarray}$$

This scaling indicates that $U_{NIW}/U_{QG}={\it\alpha}^{-1}\gg 1$, as is relevant to strong NIWs generated by intense storms (e.g. D’Asaro et al. 1995). It leads to a mean equation that is a modification of the QG equation by wave effects. Had a smaller wave amplitude been assumed in order to model quieter conditions, say by taking $\mathit{Ro}=O({\it\alpha})$, the wave effects would have been an $O(\mathit{Ro})$-factor smaller than advection in (2.7b) and of comparable order to balanced corrections to quasi-geostrophy (see Zeitlin, Reznik & Ben Jelloul 2003). Because these corrections do not alter the qualitative properties of the QG model, we prefer the scaling (2.19) to retain a model that is as simple as possible. In spite of their relatively large amplitudes, the NIWs remain described by the YBJ equation, which neglects all wave–wave interactions. This is justified on the grounds that these interactions are remarkably weak for NIWs: first because NIW triads cannot be resonant, and second because of a cancellation of the cubic terms associated with resonant quartets (Falkovich, Kuznetsov & Medvedev 1994; Zeitlin et al. 2003).
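The scaling relations (2.17)–(2.19) can be made concrete with a small helper. The sketch below simply evaluates the four non-dimensional parameters from dimensional inputs; the function name and the sample values are hypothetical, chosen so that the parameters land exactly on the assumed scaling, and are not observational data.

```python
def scaling_parameters(N, f0, L, H, U_qg, U_niw, k, m):
    """Non-dimensional parameters of (2.17) and (2.18) from dimensional inputs (SI units)."""
    return {
        "Bu": (N * H / (f0 * L)) ** 2,
        "Ro": U_qg / (f0 * L),
        "epsilon": N * k / (f0 * m),
        "alpha": U_niw / (f0 * L),
    }


# Hypothetical values: N and f0 in s^-1, L and H in m, velocities in m/s,
# wavenumbers k and m in m^-1.
EXAMPLE = dict(N=1.0e-2, f0=1.0e-4, L=1.0e5, H=1.0e3,
               U_qg=0.1, U_niw=1.0, k=1.0e-5, m=1.0e-2)

if __name__ == "__main__":
    p = scaling_parameters(**EXAMPLE)
    for name, value in p.items():
        print(name, value)
    # Ratio U_niw / U_qg compared with 1 / alpha, as implied by (2.19)
    print(EXAMPLE["U_niw"] / EXAMPLE["U_qg"], 1.0 / p["alpha"])
```

With these assumed inputs, $\mathit{Bu}=1$, $\mathit{Ro}=10^{-2}$ and ${\it\epsilon}={\it\alpha}=10^{-1}$, so that $\mathit{Ro}={\it\alpha}^{2}={\it\epsilon}^{2}$, $mH={\it\epsilon}^{-1}$ and $U_{NIW}/U_{QG}={\it\alpha}^{-1}$.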

It is however important to note that our model is not fully consistent from an asymptotic viewpoint. The assumption of two different aspect ratios for NIWs and mean flow – implied by the conditions ${\it\epsilon}\ll 1$ and $\mathit{Bu}=O(1)$ and best thought of as resulting from a disparity in vertical scales, $mH\gg 1$ – is not generally consistent. Indeed, small-scale NIWs generally lead to small-scale wave terms in (2.9) and hence to a pair $q$ and ${\it\psi}$ that varies on the wave scale (with a vertically planar NIW field ${\it\chi}\propto \exp (\text{i}mz)$ a notable exception: see § 6.2). A consistent assumption would be to take $\mathit{Bu}=O({\it\epsilon}^{2})\ll 1$. But this assumption is less relevant to most of the ocean; it leads to a different balanced dynamics, namely frontal dynamics, with negligible wave–mean interactions (Zeitlin et al. 2003).

While the model is heuristic, we regard it as valuable for its simplicity and because it respects key properties including conservation laws. The variational derivation of the wave–mean equations as detailed in the next sections makes this possible. This derivation starts with that of a coupled YBJ–primitive-equation model ((3.14)–(3.16) below) which makes no assumption of quasi-geostrophy for the mean flow. This model, naturally more complex than (2.7), is asymptotically consistent provided that ${\it\alpha}\ll \mathit{Ro}^{1/2}/{\it\epsilon}^{1/2}$ so that the wave–wave interactions are negligible. It could serve as a basis to obtain a balanced model for the mean flow that is more accurate than QG and/or with relaxed assumptions on $\mathit{Bu}$ so as to be fully consistent with the YBJ equation.

3. Derivation of the coupled model

We follow Salmon (2013) in deriving the Lagrangian-mean and wave equations from a variational formulation of the fluid equations rather than from the equations themselves. This is advantageous since it guarantees that the wave–mean model inherits conservation laws from the original hydrostatic–Boussinesq model. While Salmon (2013) develops a general theory making no specific assumptions on the form of the perturbations to the mean flow, we focus on NIWs, assuming that the displacements ${\bf\xi}$ satisfy (2.6). With this assumption, which can be viewed as a form of closure relying on a hypothesis of small wave amplitude, a natural step is to average the Lagrangian in the manner of Whitham (1974) to obtain a reduced Lagrangian that is a functional of the mean map $\boldsymbol{X}$ and of the NIW amplitude ${\it\chi}$. This is described in § 3.1. Variations with respect to $\boldsymbol{X}$ (or rather its inverse) and ${\it\chi}$ are carried out in § 3.2 to obtain the mean and wave (YBJ) equations.

3.1. Lagrangian and wave–mean decomposition

The hydrostatic–Boussinesq equations (2.1) can be derived from the Lagrangian

(3.1)

$$\begin{eqnarray}\mathscr{L}[\boldsymbol{x},p]=\int \left(\frac{1}{2}({\dot{x}}^{2}+{\dot{y}}^{2})-\left(f_{0}y+\frac{1}{2}{\it\beta}y^{2}\right){\dot{x}}+{\it\theta}z+p\left(\left|\frac{\partial \boldsymbol{x}}{\partial \boldsymbol{a}}\right|-1\right)\right)\text{d}\boldsymbol{a},\end{eqnarray}$$

where $\boldsymbol{a}=(a,b,{\it\theta})$ are particle labels, with the (materially conserved) buoyancy ${\it\theta}$ taken as third component, and $\boldsymbol{x}(\boldsymbol{a},t)$ is the flow map (e.g. Salmon 2013). The pressure $p(\boldsymbol{x},t)$ is a Lagrange multiplier enforcing the incompressibility constraint. Following standard GLM practice, we introduce the mean-map $\boldsymbol{X}(\boldsymbol{a},t)$ and displacement ${\bf\xi}(\boldsymbol{x},t)$, with

(3.2)

$$\begin{eqnarray}\boldsymbol{x}(\boldsymbol{a},t)=\boldsymbol{X}(\boldsymbol{a},t)+{\bf\xi}(\boldsymbol{X}(\boldsymbol{a},t),t).\end{eqnarray}$$

Following Salmon (2013), we regard the Lagrangian as a functional of the inverse of the mean-flow map, $\boldsymbol{a}(\boldsymbol{X},T)=\boldsymbol{X}^{-1}(\boldsymbol{X},t)$, with $T=t$. Using the chain rule, (3.1) can be shown to take the form

(3.3)

$$\begin{eqnarray}\displaystyle \mathscr{L}[\boldsymbol{a},{\bf\xi},p] & = & \displaystyle \int \left(J\left(\frac{1}{2}((U+D_{T}{\it\xi})^{2}+(V+D_{T}{\it\eta})^{2})-\left(f_{0}(Y+{\it\eta})+\frac{1}{2}{\it\beta}(Y+{\it\eta})^{2}\right)\right.\right.\nonumber\\ \displaystyle & & \displaystyle \left.\left.\times \,(U+D_{T}{\it\xi})+{\it\theta}(Z+{\it\zeta})\vphantom{\frac{1}{2}}\right)+p(\boldsymbol{X})\left(\left|\frac{\partial (\boldsymbol{X}+{\bf\xi})}{\partial \boldsymbol{X}}\right|-J\right)\right)\text{d}\boldsymbol{X},\end{eqnarray}$$

where $D_{T}=\partial _{T}+\boldsymbol{U}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}$, with $\boldsymbol{U}=\dot{\boldsymbol{X}}=\bar{\boldsymbol{u}}^{L}$ the Lagrangian-mean velocity and $\boldsymbol{{\rm\nabla}}_{3}$ the three-dimensional gradient with respect to $\boldsymbol{X}$, and $J=|\partial \boldsymbol{a}/\partial \boldsymbol{X}|$ is the Jacobian of the inverse mean map. In this expression, $\boldsymbol{U}$ should be thought of as a differential function of $\boldsymbol{a}(\boldsymbol{X},T)$; an explicit form for it is obtained from the material invariance of the labels, $D_{T}\boldsymbol{a}=0$, as

(3.4a−c )

$$\begin{eqnarray}U=-\frac{1}{J}\frac{\partial (a,b,{\it\theta})}{\partial (T,Y,Z)},\quad V=-\frac{1}{J}\frac{\partial (a,b,{\it\theta})}{\partial (X,T,Z)},\quad W=-\frac{1}{J}\frac{\partial (a,b,{\it\theta})}{\partial (X,Y,T)}.\end{eqnarray}$$
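The Jacobian formulas (3.4a−c) can be tried on a simple example. The sketch below assumes a hypothetical label field corresponding to solid-body rotation at rate ${\it\Omega}$ about the vertical axis, with labels equal to the particle positions at $T=0$, so that the expected velocity is $(-{\it\Omega}Y,{\it\Omega}X,0)$. Derivatives of the labels are approximated by centred differences; all function names are illustrative.

```python
import math


def labels(X, Y, Z, T, omega):
    """Hypothetical label field a(X, T) for solid-body rotation at rate omega:
    labels are the positions at T = 0, with the third label equal to Z."""
    c, s = math.cos(omega * T), math.sin(omega * T)
    return (X * c + Y * s, -X * s + Y * c, Z)


def det3(rows):
    """Determinant of a 3 x 3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def label_derivatives(point, omega, h=1.0e-6):
    """Centred-difference derivatives of (a, b, theta) with respect to X, Y, Z, T.
    Returns a dict mapping each variable name to a 3-tuple of derivatives."""
    derivs = {}
    for idx, name in enumerate("XYZT"):
        plus, minus = list(point), list(point)
        plus[idx] += h
        minus[idx] -= h
        fp, fm = labels(*plus, omega), labels(*minus, omega)
        derivs[name] = tuple((p - q) / (2.0 * h) for p, q in zip(fp, fm))
    return derivs


def jacobian(derivs, variables):
    """d(a, b, theta)/d(variables): rows are labels, columns are variables."""
    cols = [derivs[v] for v in variables]
    return det3([[cols[j][i] for j in range(3)] for i in range(3)])


def mean_velocity(point, omega):
    """Lagrangian-mean velocity (U, V, W) from (3.4a-c); point = (X, Y, Z, T)."""
    d = label_derivatives(point, omega)
    J = jacobian(d, "XYZ")
    return (-jacobian(d, "TYZ") / J,
            -jacobian(d, "XTZ") / J,
            -jacobian(d, "XYT") / J)


if __name__ == "__main__":
    print(mean_velocity((1.0, 2.0, 0.5, 0.3), omega=0.7))
```

For this area-preserving label field $J=1$, and each velocity component is obtained by replacing the corresponding spatial column of the Jacobian matrix by the time derivative of the labels.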

We next introduce the expansion

(3.5)

$$\begin{eqnarray}{\bf\xi}={\bf\xi}^{(1)}+{\bf\xi}^{(2)}+\cdots \!,\end{eqnarray}$$

of the NIW displacement into the Lagrangian (3.3), with $|{\bf\xi}^{(n)}|=O({\it\alpha}^{n})$. Retaining only terms in ${\it\alpha}^{n},\,n\leqslant 2$, which amounts to linearising the NIW dynamics, and averaging leads to the Lagrangian

(3.6)

$$\begin{eqnarray}\displaystyle \langle \mathscr{L}\rangle & = & \displaystyle \int \left(\frac{1}{2}J(U^{2}+V^{2}+2UD_{T}\langle {\it\xi}^{(2)}\rangle +2VD_{T}\langle {\it\eta}^{(2)}\rangle +\langle (D_{T}{\it\xi}^{(1)})^{2}\rangle +\langle (D_{T}{\it\eta}^{(1)})^{2}\rangle )\right.\nonumber\\ \displaystyle & & \displaystyle -\,J\left(f_{0}Y+\frac{1}{2}{\it\beta}Y^{2}\right)(U+D_{T}\langle {\it\xi}^{(2)}\rangle )-J(f_{0}+{\it\beta}Y)(\langle {\it\eta}^{(2)}\rangle U+\langle {\it\eta}^{(1)}D_{T}{\it\xi}^{(1)}\rangle )\nonumber\\ \displaystyle & & \displaystyle -\,J\frac{1}{2}{\it\beta}\langle ({\it\eta}^{(1)})^{2}\rangle U+J{\it\theta}Z+J{\it\theta}\langle {\it\zeta}^{(2)}\rangle +P\boldsymbol{{\rm\nabla}}_{3}\nonumber\\ \displaystyle & & \displaystyle \left.\times \left(\langle {\bf\xi}^{(2)}\rangle -\frac{1}{2}\langle {\bf\xi}^{(1)}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}{\bf\xi}^{(1)}\rangle \right)+P(1-J)\right)\text{d}\boldsymbol{X},\end{eqnarray}$$

where $\langle \cdot \rangle$ denotes the average. It is standard in GLM theories for this average to be defined as an arbitrary ensemble average. Here, a natural ensemble is that formed by a family of NIWs differing by a phase shift. Thus, an ensemble parameter ${\it\gamma}\in [0,2{\rm\pi}]$ is introduced in (2.6) to obtain the ensemble of leading-order wavefields

(3.7a,b )

$$\begin{eqnarray}{\it\xi}^{(1)}+\text{i}{\it\eta}^{(1)}={\it\chi}_{Z}\,\text{e}^{-\text{i}(f_{0}t+{\it\gamma})}\quad \text{and}\quad {\it\zeta}^{(1)}=-{\it\chi}_{S}\text{e}^{-\text{i}(f_{0}t+{\it\gamma})}+\text{c.c.}\end{eqnarray}$$

with $S=X+\text{i}Y$ and $\partial _{S}=(\partial _{X}-\text{i}\partial _{Y})/2$. When there is a time-scale separation between the (fast) oscillation at frequency $f_{0}$ and the (slow) evolution of the amplitude ${\it\chi}$, averaging over ${\it\gamma}$ amounts to averaging over the fast time scale $f_{0}^{-1}$. Thus the ensemble average becomes physically relevant, and it leads to an averaged dynamics identical to that obtained by explicit perturbation expansions, as demonstrated by Whitham (1974). Note that our notation ${\bf\xi}^{(1)}(\boldsymbol{x},t)$ does not make the dependence of ${\bf\xi}^{(1)}$ on the ensemble parameter ${\it\gamma}$ explicit; our compact notation is justified by the fact that the parameter ${\it\gamma}$ disappears completely from the problem after the (Whitham) average has been performed. Note also that the truncation of the Lagrangian (3.6) to $O({\it\alpha}^{2})$ can be regarded as a closure in which the nonlinearity of wave dynamics is neglected.
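The phase average used here can be illustrated with a short numerical sketch. It evaluates the horizontal part of the ansatz (3.7) at a single point, with a hypothetical value of ${\it\chi}_{Z}$ and a non-dimensional $f_{0}=1$ (both are assumptions made for illustration only), and averages over equally spaced values of ${\it\gamma}$. Only the fast time derivative is retained, so that $\partial _{t}{\it\xi}^{(1)}=f_{0}{\it\eta}^{(1)}$ and $\partial _{t}{\it\eta}^{(1)}=-f_{0}{\it\xi}^{(1)}$.

```python
import cmath
import math


def phase_average(func, n=360):
    """Average func(gamma) over n equally spaced phases in [0, 2*pi)."""
    return sum(func(2 * math.pi * j / n) for j in range(n)) / n


f0 = 1.0            # inertial frequency, non-dimensionalised (assumption)
t = 12.345          # arbitrary fixed time
chi_z = 0.7 - 0.4j  # hypothetical local value of chi_Z


def xi(gamma):
    """Real part of chi_Z * exp(-i (f0 t + gamma)), i.e. xi^(1)."""
    return (chi_z * cmath.exp(-1j * (f0 * t + gamma))).real


def eta(gamma):
    """Imaginary part of chi_Z * exp(-i (f0 t + gamma)), i.e. eta^(1)."""
    return (chi_z * cmath.exp(-1j * (f0 * t + gamma))).imag


mean_xi = phase_average(xi)                           # linear term
mean_xi_sq = phase_average(lambda g: xi(g) ** 2)      # quadratic term
mean_xi_eta = phase_average(lambda g: xi(g) * eta(g))

# Fast-time contributions to the wave part of the averaged Lagrangian:
# kinetic term (1/2) <xi_t^2 + eta_t^2> and Coriolis term -f0 <eta xi_t>.
kinetic = 0.5 * phase_average(
    lambda g: (f0 * eta(g)) ** 2 + (f0 * xi(g)) ** 2)
coriolis = -f0 * phase_average(lambda g: eta(g) * f0 * eta(g))
print(mean_xi, mean_xi_sq, kinetic + coriolis)
```

In this sketch the linear average vanishes, the quadratic average equals $|{\it\chi}_{Z}|^{2}/2$ independently of $t$, and the fast-time kinetic and Coriolis contributions cancel, which is consistent with only the slow derivative $D_{T}{\it\chi}_{Z}$ appearing in the wave terms of (3.10).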

To derive (3.6), we have used that $\langle {\bf\xi}^{(1)}\rangle =0$ , that $\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }{\bf\xi}^{(1)}=0$ (stemming from the divergence-free property of NIWs), and that

(3.8)

$$\begin{eqnarray}\left|\frac{\partial (\boldsymbol{X}+{\bf\xi})}{\partial \boldsymbol{X}}\right|=1+\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }{\bf\xi}^{(2)}+\frac{1}{2}\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }({\bf\xi}^{(1)}\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }{\bf\xi}^{(1)}-{\bf\xi}^{(1)}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}{\bf\xi}^{(1)})+O({\it\alpha}^{3}),\end{eqnarray}$$

as well as integration by parts. Importantly, we do not assume that $\langle {\bf\xi}^{(2)}\rangle =0$ as is standard in GLM theory. Instead, we follow Soward & Roberts’ (2010) glm prescription, which ensures that the mean motion is divergence-free. As detailed in appendix A, at the order we consider, this prescription amounts to taking

(3.9)

$$\begin{eqnarray}\langle {\bf\xi}^{(2)}\rangle ={\textstyle \frac{1}{2}}\langle {\bf\xi}^{(1)}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}{\bf\xi}^{(1)}\rangle .\end{eqnarray}$$

Thus $\langle {\bf\xi}^{(2)}\rangle \neq 0$ takes a value slaved to ${\bf\xi}^{(1)}$ (which contains terms in both $\text{e}^{\pm \text{i}(f_{0}t+{\it\gamma})}$ ) and hence to ${\it\chi}$ . As (3.8) indicates, this ensures that the map $\boldsymbol{X}\mapsto \boldsymbol{X}+{\bf\xi}$ from mean to perturbed position is volume-preserving: since the map $\boldsymbol{a}\mapsto \boldsymbol{X}+{\bf\xi}$ is volume-preserving, this is also true for the map $\boldsymbol{a}\mapsto \boldsymbol{X}$ , so the Lagrangian-mean velocity is divergence-free.
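The second-order expansion (3.8) can be checked pointwise with a small numerical sketch. It relies on the identity $\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }({\bf\xi}\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }{\bf\xi}-{\bf\xi}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}{\bf\xi})=(\text{tr}\,\unicode[STIX]{x1D63C})^{2}-\text{tr}(\unicode[STIX]{x1D63C}^{2})$, where $\unicode[STIX]{x1D63C}$ denotes the displacement-gradient matrix (our notation). The matrices `g1` and `g2` below are random, hypothetical stand-ins for the gradients of ${\bf\xi}^{(1)}/{\it\alpha}$ and ${\bf\xi}^{(2)}/{\it\alpha}^{2}$; `g1` is made traceless to mimic $\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }{\bf\xi}^{(1)}=0$.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def trace(m):
    return m[0][0] + m[1][1] + m[2][2]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def random_matrix(rng):
    return [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]


def jacobian_error(alpha, g1, g2):
    """|det(I + alpha*g1 + alpha^2*g2) - expansion truncated at alpha^2|."""
    full = [[(1.0 if r == c else 0.0) + alpha * g1[r][c] + alpha ** 2 * g2[r][c]
             for c in range(3)] for r in range(3)]
    second = trace(g2) + 0.5 * (trace(g1) ** 2 - trace(matmul(g1, g1)))
    approx = 1.0 + alpha * trace(g1) + alpha ** 2 * second
    return abs(det3(full) - approx)


rng = random.Random(1)
g1 = random_matrix(rng)
shift = trace(g1) / 3.0
for d in range(3):
    g1[d][d] -= shift  # enforce a divergence-free first-order displacement
g2 = random_matrix(rng)

errors = {alpha: jacobian_error(alpha, g1, g2) for alpha in (1e-1, 1e-2, 1e-3)}
print(errors)
```

The remainder decreases rapidly as ${\it\alpha}$ is reduced, in line with the $O({\it\alpha}^{3})$ estimate in (3.8).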

At this point, we can substitute the NIW-ansatz (3.7) into (3.6) to obtain the averaged Lagrangian in terms of $\boldsymbol{a}$ , $P$ and ${\it\chi}$ . This leads to

(3.10)

$$\begin{eqnarray}\displaystyle \langle \mathscr{L}\rangle & = & \displaystyle \int \left(\frac{1}{2}J(U^{2}+V^{2})-J\left(f_{0}Y+\frac{1}{2}{\it\beta}Y^{2}\right)U+J{\it\theta}Z\right.\nonumber\\ \displaystyle & & \displaystyle +\,J\left(-\frac{\text{i}f_{0}}{4}({\it\chi}_{Z}D_{T}{\it\chi}_{Z}^{\ast }-{\it\chi}_{Z}^{\ast }D_{T}{\it\chi}_{Z})-\frac{1}{2}f_{0}{\it\beta}Y|{\it\chi}_{Z}|^{2}\right)\nonumber\\ \displaystyle & & \displaystyle \left.+\,J(-f_{0}YD_{T}\langle {\it\xi}^{(2)}\rangle -f_{0}\langle {\it\eta}^{(2)}\rangle U+{\it\theta}\langle {\it\zeta}^{(2)}\rangle )+P(1-J)\vphantom{\frac{1}{2}}\right)\text{d}\boldsymbol{X}.\end{eqnarray}$$

To obtain this expression, we have retained only wave terms that are $O(1)$ or $O({\it\alpha}^{2}/\mathit{Ro})$ relative to the size $U_{QG}^{2}$ of the first term, assuming that ${\it\beta}L/f=O(\mathit{Ro})$, so that only a single wave term involving ${\it\beta}$ remains. Note that the linearisation of the NIW dynamics entailed by ignoring cubic terms in $\langle \mathscr{L}\rangle$ can be justified: averaging eliminates cubic terms in ${\bf\xi}^{(1)}$, leaving cubic terms involving higher harmonics (with frequency $2f$), whose size can be estimated as ${\it\epsilon}{\it\alpha}^{4}/\mathit{Ro}^{2}=O({\it\alpha})$. The absence of resonant cubic terms has been noted by Falkovich et al. (1994) and Zeitlin et al. (2003) and is related to the possible elimination of advective nonlinearities by means of Lagrangian coordinates (Falkovich et al. 1994; Hunter & Ifrim 2013).

The Lagrangian (3.10) governs the NIW–mean flow system: when (3.7) and (3.9) are used to express ${\bf\xi}^{(2)}$ explicitly as

(3.11a )

$$\begin{eqnarray}\displaystyle & \langle {\it\xi}^{(2)}\rangle ={\textstyle \frac{1}{4}}({\it\chi}_{Z}{\it\chi}_{ZS}^{\ast }-{\it\chi}_{S}{\it\chi}_{ZZ}^{\ast })+\text{c.c.}, & \displaystyle\end{eqnarray}$$

(3.11b )

$$\begin{eqnarray}\displaystyle & \displaystyle \langle {\it\eta}^{(2)}\rangle =\frac{\text{i}}{4}({\it\chi}_{Z}{\it\chi}_{ZS}^{\ast }-{\it\chi}_{S}{\it\chi}_{ZZ}^{\ast })+\text{c.c.}, & \displaystyle\end{eqnarray}$$

(3.11c )

$$\begin{eqnarray}\displaystyle & \langle {\it\zeta}^{(2)}\rangle ={\textstyle \frac{1}{2}}(-{\it\chi}_{Z}{\it\chi}_{SS^{\ast }}^{\ast }+{\it\chi}_{S}{\it\chi}_{ZS^{\ast }}^{\ast })+\text{c.c.}, & \displaystyle\end{eqnarray}$$

$\mathscr{L}$ is a functional of $\boldsymbol{a}$, ${\it\chi}$ and $P$ from which primitive equations for the mean flow coupled to a YBJ-like equation for the NIWs can be derived systematically. This is carried out in the next subsection, § 3.2. The reduced QG model (2.7) is then derived in § 3.3.

3.2. Coupled YBJ–primitive-equation model

Taking the variation ${\it\delta}P$ of the action $\int \langle \mathscr{L}\rangle \,\text{d}t$ with the Lagrangian (3.10) and using (3.11), we obtain

(3.12)

$$\begin{eqnarray}J=1,\end{eqnarray}$$

confirming that the mean map is volume-preserving. Thus the Lagrangian-mean velocity is divergence-free:

(3.13)

$$\begin{eqnarray}\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }\boldsymbol{U}=0.\end{eqnarray}$$

The mean equations of motion can now be obtained from the stationarity of $\int \langle \mathscr{L}\rangle \,\text{d}t$ with respect to variations ${\it\delta}\boldsymbol{a}$. It is convenient to use the energy–momentum formalism as proposed by Salmon (2013). Computations detailed in appendix B lead to the momentum equations in the form

(3.14a )

$$\begin{eqnarray}\displaystyle D_{T}U-(f_{0}+{\it\beta}Y)V+\partial _{X}P & = & \displaystyle \frac{\text{i}f_{0}}{2}(D_{T}{\it\chi}_{Z}{\it\chi}_{XZ}^{\ast }-D_{T}{\it\chi}_{Z}^{\ast }{\it\chi}_{XZ})-\frac{1}{2}f_{0}{\it\beta}\partial _{X}(Y|{\it\chi}_{Z}|^{2})\nonumber\\ \displaystyle & & \displaystyle +~f_{0}\langle D_{T}{\it\eta}^{(2)}-U{\it\eta}_{X}^{(2)}+V{\it\xi}_{X}^{(2)}\rangle +{\it\theta}\langle {\it\zeta}_{X}^{(2)}\rangle ,\end{eqnarray}$$

(3.14b )

$$\begin{eqnarray}\displaystyle D_{T}V+(f_{0}+{\it\beta}Y)U+\partial _{Y}P & = & \displaystyle \frac{\text{i}f_{0}}{2}(D_{T}{\it\chi}_{Z}{\it\chi}_{YZ}^{\ast }-D_{T}{\it\chi}_{Z}^{\ast }{\it\chi}_{YZ})-\frac{1}{2}f_{0}{\it\beta}\partial _{Y}(Y|{\it\chi}_{Z}|^{2})\nonumber\\ \displaystyle & & \displaystyle +~f_{0}\langle -D_{T}{\it\xi}^{(2)}-U{\it\eta}_{Y}^{(2)}+V{\it\xi}_{Y}^{(2)}\rangle +{\it\theta}\langle {\it\zeta}_{Y}^{(2)}\rangle ,\end{eqnarray}$$

(3.14c )

$$\begin{eqnarray}\displaystyle -{\it\theta}+\partial _{Z}P & = & \displaystyle \frac{\text{i}f_{0}}{2}(D_{T}{\it\chi}_{Z}{\it\chi}_{ZZ}^{\ast }-D_{T}{\it\chi}_{Z}^{\ast }{\it\chi}_{ZZ})-\frac{1}{2}f_{0}{\it\beta}\partial _{Z}(Y|{\it\chi}_{Z}|^{2})\nonumber\\ \displaystyle & & \displaystyle +~f_{0}\langle -U{\it\eta}_{Z}^{(2)}+V{\it\xi}_{Z}^{(2)}\rangle +{\it\theta}\langle {\it\zeta}_{Z}^{(2)}\rangle .\end{eqnarray}$$

These are completed by the buoyancy equation

(3.15)

$$\begin{eqnarray}D_{T}{\it\theta}=0,\end{eqnarray}$$

which expresses that ${\it\theta}$ is a label. The left-hand sides of (3.13)–(3.15) recover the hydrostatic–Boussinesq equations (2.1) for the mean flow; the right-hand sides, which can be written completely in terms of ${\it\chi}$ , describe the impact of the NIWs on the mean flow.

Taking the variation ${\it\delta}{\it\chi}^{\ast }$ of the Lagrangian (3.10) after using (3.11) for ${\bf\xi}^{(2)}$ leads to the wave equation

(3.16)

$$\begin{eqnarray}\displaystyle & & \displaystyle (D_{T}{\it\chi}_{Z})_{Z}+\text{i}{\it\beta}Y{\it\chi}_{ZZ}+\frac{\text{i}}{2}((V{\it\chi}_{Z})_{ZS}-(V{\it\chi}_{S})_{ZZ}-(V{\it\chi}_{ZS^{\ast }})_{Z}+(V{\it\chi}_{ZZ})_{S^{\ast }})\nonumber\\ \displaystyle & & \displaystyle \quad +\,\frac{1}{2}((U{\it\chi}_{Z})_{ZS}-(U{\it\chi}_{S})_{ZZ}+(U{\it\chi}_{ZS^{\ast }})_{Z}-(U{\it\chi}_{ZZ})_{S^{\ast }})\nonumber\\ \displaystyle & & \displaystyle \quad +\,\frac{\text{i}}{f_{0}}(-({\it\theta}{\it\chi}_{Z})_{SS^{\ast }}+({\it\theta}{\it\chi}_{S})_{ZS^{\ast }}+({\it\theta}{\it\chi}_{SS^{\ast }})_{Z}-({\it\theta}{\it\chi}_{ZS})_{S^{\ast }})=0.\end{eqnarray}$$

This equation can be interpreted as a generalisation of the YBJ equations which makes no assumption that the mean flow is QG or steady.

Together, (3.14)–(3.16) constitute a closed model for the joint evolution of the wave and the mean flow. This model is complex and we prefer to focus our analysis on its QG approximation introduced in § 2 and derived in the next subsection. It is nonetheless worth noting that the full model has two simple conservation laws. The first is obtained by multiplying (3.16) by ${\it\chi}^{\ast }$ and adding the complex conjugate of the resulting equation. Integrating over space and making liberal use of integration by parts yields the wave-action conservation

(3.17)

$$\begin{eqnarray}\frac{\text{d}}{\text{d}t}\int |{\it\chi}_{Z}|^{2}\,\text{d}\boldsymbol{X}=0.\end{eqnarray}$$

This conservation law is associated with the obvious symmetry ${\it\chi}\mapsto \text{e}^{\text{i}{\it\gamma}}{\it\chi},\,{\it\gamma}\in \mathbb{R}$, of the Lagrangian (3.10) and can therefore also be obtained from Noether’s theorem (e.g. Goldstein 1980) in the form

(3.18)

$$\begin{eqnarray}\frac{\text{d}}{\text{d}t}\int \!\left(\text{i}{\it\chi}\frac{{\it\delta}}{{\it\delta}{\it\chi}_{T}}-\text{i}{\it\chi}^{\ast }\frac{{\it\delta}}{{\it\delta}{\it\chi}_{T}^{\ast }}\right)\langle \mathscr{L}\rangle \,\text{d}\boldsymbol{X}=0,\end{eqnarray}$$

thus justifying the terminology of action. The second conservation law is that of energy. It is best obtained from the Lagrangian (3.10). The general form of the conserved energy, associated with the symmetry $T\mapsto T+{\it\delta}T$, also follows from Noether’s theorem. This yields the energy in the form

(3.19)

$$\begin{eqnarray}\int \left(a_{T}^{i}\frac{{\it\delta}}{{\it\delta}a_{T}^{i}}+{\it\chi}_{T}\frac{{\it\delta}}{{\it\delta}{\it\chi}_{T}}+{\it\chi}_{T}^{\ast }\frac{{\it\delta}}{{\it\delta}{\it\chi}_{T}^{\ast }}-1\right)\langle \mathscr{L}\rangle \,\text{d}\boldsymbol{X},\end{eqnarray}$$

which implies that the energy is readily deduced from $\mathscr{L}$ using the following rules: terms that are quadratic in $\boldsymbol{U}$ (and hence in $a_{T}^{i}$ ) or ${\it\chi}_{T}$ are retained, terms that are linear are omitted, and terms that contain no time derivatives change sign. So the energy conservation reads

(3.20)

$$\begin{eqnarray}\frac{\text{d}}{\text{d}t}\int \left(\frac{1}{2}(U^{2}+V^{2})-{\it\theta}(Z+\langle {\it\zeta}^{(2)}\rangle )+\frac{1}{2}f_{0}{\it\beta}Y|{\it\chi}_{Z}|^{2}\right)\text{d}\boldsymbol{X}=0\end{eqnarray}$$

using $J=1$. This is a remarkably simple expression in which the effect of the waves arises only through the potential-energy term $-{\it\theta}\langle {\it\zeta}^{(2)}\rangle$ and the ${\it\beta}$-term. Surprisingly, perhaps, it is simpler than the analogous energy that is conserved in the (uncoupled) YBJ model (Vanneste 2014).
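The rules used above to read the energy off the Lagrangian can be illustrated on a one-degree-of-freedom toy problem. The Lagrangian below is hypothetical and chosen only because it contains one term of each type: quadratic in the velocity (like the mean kinetic energy), linear in the velocity (like the Coriolis term) and velocity-free (like ${\it\theta}Z$). The energy is computed from the single-variable analogue of (3.19), $E=\dot{q}\,\partial L/\partial \dot{q}-L$.

```python
def lagrangian(q, qdot):
    """Toy Lagrangian with quadratic, linear and velocity-free terms."""
    quadratic = 0.5 * qdot ** 2   # retained in the energy
    linear = -3.0 * q * qdot      # drops out of the energy
    no_velocity = 2.0 * q         # changes sign in the energy
    return quadratic + linear + no_velocity


def noether_energy(q, qdot, h=1e-6):
    """E = qdot * dL/dqdot - L, with dL/dqdot from a centred difference."""
    dl_dqdot = (lagrangian(q, qdot + h) - lagrangian(q, qdot - h)) / (2 * h)
    return qdot * dl_dqdot - lagrangian(q, qdot)


q, qdot = 0.8, -1.3
energy = noether_energy(q, qdot)
# Applying the three rules by hand: keep the quadratic term, omit the
# linear term, flip the sign of the velocity-free term.
expected = 0.5 * qdot ** 2 - 2.0 * q
print(energy, expected)
```

The two numbers agree to within the finite-difference error, mirroring the way (3.20) follows from (3.10).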

3.3. Quasi-geostrophic approximation

We now derive an approximation to the mean and wave equations in the QG limit $\mathit{Ro}\rightarrow 0$. The standard QG model cannot be derived in a simple manner from the variational formulation of the primitive equations (see Bokhove, Vanneste & Warn 1998 or Oliver 2006, however), and the same difficulty arises here. We therefore derive the QG approximation of the mean equations directly from the momentum equations (3.14), retaining a variational argument for the wave part only. That the approximations made in both parts of the model are consistent is confirmed by the fact that the resulting coupled model has a Hamiltonian structure, as discussed in § 4.

In the QG approximation, the buoyancy is decomposed into a $Z$ -dependent mean part and a perturbation according to

(3.21)

$$\begin{eqnarray}{\it\theta}=\bar{{\it\theta}}(Z)+{\it\theta}^{\prime }=\int ^{Z}N^{2}(z)\,\text{d}z+{\it\theta}^{\prime }.\end{eqnarray}$$

To leading order in $\mathit{Ro}$ , the mean equations (3.14) then reduce to

(3.22)

$$\begin{eqnarray}\displaystyle & f_{0}V=\partial _{X}(P-\bar{{\it\theta}}\langle {\it\zeta}^{(2)}\rangle ), & \displaystyle\end{eqnarray}$$

(3.23)

$$\begin{eqnarray}\displaystyle & -f_{0}U=\partial _{Y}(P-\bar{{\it\theta}}\langle {\it\zeta}^{(2)}\rangle ), & \displaystyle\end{eqnarray}$$

(3.24)

$$\begin{eqnarray}\displaystyle & \displaystyle {\it\theta}^{\prime }=\partial _{Z}\left(P-\bar{{\it\theta}}\langle {\it\zeta}^{(2)}\rangle -\int ^{Z}\text{d}z\int ^{z}N^{2}(z^{\prime })\,\text{d}z^{\prime }\right)+N^{2}\langle {\it\zeta}^{(2)}\rangle , & \displaystyle\end{eqnarray}$$

and are recognised as expressing geostrophic and hydrostatic balance. This leads to the introduction of a streamfunction ${\it\psi}$ such that

(3.25a−c )

$$\begin{eqnarray}U=-{\it\psi}_{Y},\quad V={\it\psi}_{X}\quad \text{and}\quad {\it\theta}^{\prime }=f_{0}{\it\psi}_{Z}+N^{2}\langle {\it\zeta}^{(2)}\rangle .\end{eqnarray}$$

Using this, the buoyancy conservation becomes

(3.26)

$$\begin{eqnarray}D_{T}^{0}(f_{0}{\it\psi}_{Z}+N^{2}\langle {\it\zeta}^{(2)}\rangle )+N^{2}W=0,\end{eqnarray}$$

where $D_{T}^{0}=\partial _{T}+\partial ({\it\psi},\cdot )$ .

A closed equation for ${\it\psi}$ can now be derived from (3.14) and (3.26) in a familiar way: taking the horizontal curl of (3.14a ) and (3.14b ) and keeping terms up to $O(U^{2}/L^{2})$ , we obtain

(3.27)

$$\begin{eqnarray}D_{T}^{0}\left({\it\beta}Y+V_{X}-U_{Y}+\frac{\text{i}f_{0}}{2}\partial ({\it\chi}_{Z}^{\ast },{\it\chi}_{Z})+f_{0}(\langle {\it\xi}_{X}^{(2)}\rangle +\langle {\it\eta}_{Y}^{(2)}\rangle )\right)-f_{0}W_{Z}=0.\end{eqnarray}$$

Substituting (3.26) to eliminate $W$ leads to the conservation equation

(3.28)

$$\begin{eqnarray}D_{T}^{0}q=0,\quad \text{where}~q={\it\beta}Y+{\rm\Delta}{\it\psi}+\frac{\text{i}f_{0}}{2}\partial ({\it\chi}_{Z}^{\ast },{\it\chi}_{Z})+f_{0}\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }\langle {\bf\xi}^{(2)}\rangle ,\end{eqnarray}$$

with ${\rm\Delta}$ defined in (2.10), is the QGPV. A direct computation using (3.11) gives the last term explicitly as

(3.29)

$$\begin{eqnarray}\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }\langle {\bf\xi}^{(2)}\rangle =G({\it\chi}^{\ast },{\it\chi}),\end{eqnarray}$$

with the symmetric bilinear operator $G$ defined in (2.11). Replacing $\boldsymbol{X}$ by $\boldsymbol{x}$ as independent variable reduces the QGPV equation (3.28) to the form announced in (2.7b ). An alternative derivation based on PV conservation and valid for an arbitrary definition of the Lagrangian average is presented in appendix C. The vertical boundary conditions (2.12) associated with the QGPV equation are derived by applying the no-normal-flow condition $W=0$ at $z=z^{\pm }$ to (3.26) and noting from (3.9) that $\langle {\it\zeta}^{(2)}\rangle =0$ at $z^{\pm }$ follows from the fact that ${\it\zeta}^{(1)}=0$ there.

The NIW equation associated with (3.28) is best derived by introducing the geostrophic and hydrostatic conditions into the averaged Lagrangian (3.10), then taking variations with respect to ${\it\chi}$ or ${\it\chi}^{\ast }$. The wave part of the Lagrangian is readily found from (3.10) to be

(3.30)

$$\begin{eqnarray}\displaystyle \langle \mathscr{L}\rangle _{NIW} & = & \displaystyle \int \left(-\frac{\text{i}f_{0}}{4}({\it\chi}_{Z}D_{T}^{0}{\it\chi}_{Z}^{\ast }-{\it\chi}_{Z}^{\ast }D_{T}^{0}{\it\chi}_{Z})-\frac{1}{2}f_{0}{\it\beta}Y|{\it\chi}_{Z}|^{2}\right.\nonumber\\ \displaystyle & & \displaystyle \left.-~f_{0}{\it\psi}\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }\langle {\bf\xi}^{(2)}\rangle +\int ^{Z}N^{2}(z)\,\text{d}z\,\langle {\it\zeta}^{(2)}\rangle \right)\,\text{d}\boldsymbol{X},\end{eqnarray}$$

where we have used that $J=1$ , integration by parts, and neglected a term in $\langle {\it\zeta}^{(2)}\rangle ^{2}$ . The terms depending on ${\bf\xi}^{(2)}$ can now be written in terms of ${\it\chi}$ using (3.29) and the observation that

(3.31)

$$\begin{eqnarray}\langle {\it\zeta}^{(2)}\rangle ={\textstyle \frac{1}{2}}\partial _{Z}\langle ({\it\zeta}^{(1)})^{2}\rangle +\cdots ={\textstyle \frac{1}{4}}\partial _{Z}|\boldsymbol{{\rm\nabla}}{\it\chi}|^{2}+\cdots \!,\end{eqnarray}$$

where $\cdots \,$ denotes the horizontal divergence of an irrelevant vector. This simplifies (3.30) into

(3.32)

$$\begin{eqnarray}\langle \mathscr{L}\rangle _{NIW}=-\!\!\int \!\left(\frac{\text{i}f_{0}}{4}({\it\chi}_{Z}D_{T}^{0}{\it\chi}_{Z}^{\ast }-{\it\chi}_{Z}^{\ast }D_{T}^{0}{\it\chi}_{Z})+\frac{1}{2}f_{0}{\it\beta}Y|{\it\chi}_{Z}|^{2}+f_{0}{\it\psi}G({\it\chi}^{\ast }\!,{\it\chi})+\frac{1}{4}N^{2}|\boldsymbol{{\rm\nabla}}{\it\chi}|^{2}\!\right)\text{d}\boldsymbol{X}.\end{eqnarray}$$

To take the variations of the corresponding action, it is convenient to introduce the symmetric bilinear operator ${\hat{G}}$ dual to $G$ in the sense that

(3.33)

$$\begin{eqnarray}\int {\it\psi}G({\it\chi}^{\ast },{\it\chi})\,\text{d}\boldsymbol{X}=\int {\it\chi}^{\ast }{\hat{G}}({\it\psi},{\it\chi})\,\text{d}\boldsymbol{X}.\end{eqnarray}$$

The variation ${\it\delta}{\it\chi}^{\ast }$ then gives

(3.34)

$$\begin{eqnarray}(D_{T}^{0}{\it\chi}_{Z})_{Z}+\text{i}{\it\beta}Y{\it\chi}_{ZZ}+\frac{\text{i}N^{2}}{2f_{0}}{\rm\nabla}^{2}{\it\chi}-2\text{i}{\hat{G}}({\it\psi},{\it\chi})=0.\end{eqnarray}$$

From its definition and (2.11), ${\hat{G}}({\it\psi},{\it\chi})$ is calculated to be

(3.35)

$$\begin{eqnarray}{\hat{G}}({\it\psi},{\it\chi})={\textstyle \frac{1}{4}}(2\boldsymbol{{\rm\nabla}}{\it\psi}_{Z}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}{\it\chi}_{Z}-{\rm\nabla}^{2}{\it\psi}{\it\chi}_{ZZ}-{\it\psi}_{ZZ}{\rm\nabla}^{2}{\it\chi})\end{eqnarray}$$

and is recognised as the negative of YBJ’s bracket $[[\cdot ,\cdot ]]$ . Introducing (3.35) into (3.34), dropping the superscript 0 from $D_{T}^{0}$ and replacing $\boldsymbol{X}$ by $\boldsymbol{x}$ leads to the YBJ equation in the form (2.7a ).

4. Conservation laws and Hamiltonian structure

We now derive conservation laws satisfied by the coupled model (2.7). We start with the conservation law identified in YBJ: multiplying (2.7a ) by ${\it\chi}^{\ast }$ and integrating yields

(4.1)

$$\begin{eqnarray}\int \!\left(-{\it\chi}_{z}^{\ast }\partial _{t}{\it\chi}_{z}+{\it\psi}\partial ({\it\chi}_{z}^{\ast },{\it\chi}_{z})-\text{i}{\it\beta}y|{\it\chi}_{z}|^{2}-\frac{\text{i}N^{2}}{2f_{0}}|\boldsymbol{{\rm\nabla}}{\it\chi}|^{2}-2\text{i}{\it\psi}G({\it\chi}^{\ast },{\it\chi})\right)\text{d}\boldsymbol{x}=0,\end{eqnarray}$$

after using integration by parts. Adding the complex conjugate and using the symmetry of $G$ and antisymmetry of $\partial (\cdot ,\cdot )$ gives

(4.2)

$$\begin{eqnarray}\frac{\text{d}}{\text{d}t}\int |{\it\chi}_{z}|^{2}\,\text{d}\boldsymbol{x}=0.\end{eqnarray}$$

Thus, the wave action $\mathscr{A}$ defined in (2.14) is conserved. This conservation law is identical to that obtained for the YBJ–primitive-equation model in (3.17) and, as checked below using the Hamiltonian structure of the YBJ–QG model, also associated with an invariance with respect to phase shifts of the amplitude ${\it\chi}$ .

Next we derive an energy conservation law. Multiplying the QGPV equation (2.7b ) by ${\it\psi}$ , integrating and using the definition (2.9) of $q$ gives

(4.3)

$$\begin{eqnarray}\displaystyle & & \displaystyle \int \left(\frac{1}{2}\partial _{t}\left(|\boldsymbol{{\rm\nabla}}{\it\psi}|^{2}+\frac{f_{0}^{2}}{N^{2}}{\it\psi}_{z}^{2}\right)-\frac{\text{i}f_{0}{\it\psi}}{2}(\partial ({\it\chi}_{zt}^{\ast },{\it\chi}_{z})+\partial ({\it\chi}_{z}^{\ast },{\it\chi}_{zt}))\right.\nonumber\\ \displaystyle & & \displaystyle \quad \left.-\,f_{0}{\it\psi}(G({\it\chi}_{t}^{\ast },{\it\chi})+G({\it\chi}^{\ast },{\it\chi}_{t}))\vphantom{\left(|\boldsymbol{{\rm\nabla}}{\it\psi}|^{2}+\frac{f_{0}^{2}}{N^{2}}{\it\psi}_{z}^{2}\right)}\right)\text{d}\boldsymbol{x}=0.\end{eqnarray}$$

Multiplying the YBJ equation (2.7a ) by $\text{i}f_{0}\partial _{t}{\it\chi}^{\ast }/2$ , integrating and adding the complex conjugate gives

(4.4)

$$\begin{eqnarray}\displaystyle & & \displaystyle \int \left(\frac{\text{i}f_{0}{\it\psi}}{2}(\partial ({\it\chi}_{zt}^{\ast },{\it\chi}_{z})+\partial ({\it\chi}_{z}^{\ast },{\it\chi}_{zt}))+\frac{f_{0}{\it\beta}y}{2}\partial _{t}|{\it\chi}_{z}|^{2}\right.\nonumber\\ \displaystyle & & \displaystyle \quad \left.+\,\frac{N^{2}}{4}\partial _{t}|\boldsymbol{{\rm\nabla}}{\it\chi}|^{2}+f_{0}{\it\psi}(G({\it\chi}_{t}^{\ast },{\it\chi})+G({\it\chi}^{\ast },{\it\chi}_{t}))\right)\text{d}\boldsymbol{x}=0,\end{eqnarray}$$

where the relation (3.33) between $G$ and ${\hat{G}}$ is used. Adding (4.3) and (4.4) leads to

(4.5)

$$\begin{eqnarray}\frac{\text{d}}{\text{d}t}\int \frac{1}{2}\left(|\boldsymbol{{\rm\nabla}}{\it\psi}|^{2}+\frac{f_{0}^{2}}{N^{2}}{\it\psi}_{z}^{2}+f_{0}{\it\beta}y|{\it\chi}_{z}|^{2}+\frac{1}{2}N^{2}|\boldsymbol{{\rm\nabla}}{\it\chi}|^{2}\right)\text{d}\boldsymbol{x}=0,\end{eqnarray}$$

and hence to the conservation of the energy $\mathscr{H}$ in (2.13). This energy conservation can be recognised as the QG approximation of the primitive-equation energy conservation (3.20): the first two terms are the usual QG approximation of the mean kinetic and potential energy; the third term is unchanged; the fourth term is an approximation to $-{\it\theta}\langle {\it\zeta}^{(2)}\rangle$ obtained by noting that ${\it\theta}\approx \int ^{z}N^{2}(z^{\prime })\,\text{d}z^{\prime }$ and using (3.31).

The coupled model (2.7) is in fact Hamiltonian. The Hamiltonian structure (e.g. Shepherd 1990), which can be obtained by inspection, is conveniently written using the amplitude of the horizontal NIW displacement ${\it\phi}={\it\chi}_{z}$, its complex conjugate ${\it\phi}^{\ast }$, $q$ and ${\it\theta}^{\pm }$ as dynamical variables. Grouping these in a vector ${\bf\phi}$, it can be checked that the governing equations (2.7) are recovered from

(4.6)

$$\begin{eqnarray}{\bf\phi}_{t}=\unicode[STIX]{x1D645}\frac{{\it\delta}\mathscr{H}}{{\it\delta}{\bf\phi}},\end{eqnarray}$$

where

(4.7)

$$\begin{eqnarray}\unicode[STIX]{x1D645}=\left(\begin{array}{@{}ccccc@{}}0 & -2\text{i}/f_{0} & 0 & 0 & 0\\ 2\text{i}/f_{0} & 0 & 0 & 0 & 0\\ 0 & 0 & -\partial (q,\cdot ) & 0 & 0\\ 0 & 0 & 0 & (N^{+})^{2}f_{0}^{-1}\partial ({\it\theta}^{+},\cdot ) & 0\\ 0 & 0 & 0 & 0 & -(N^{-})^{2}f_{0}^{-1}\partial ({\it\theta}^{-},\cdot )\end{array}\right)\end{eqnarray}$$

and the Hamiltonian is

(4.8)

$$\begin{eqnarray}\mathscr{H}=\frac{1}{2}\int \left(|\boldsymbol{{\rm\nabla}}{\it\psi}|^{2}+\frac{f_{0}^{2}}{N^{2}}|{\it\psi}_{z}|^{2}+f_{0}{\it\beta}y|{\it\phi}|^{2}+\frac{N^{2}}{2}\left|\boldsymbol{{\rm\nabla}}\int ^{z}{\it\phi}(\tilde{z})\,\text{d}\tilde{z}\right|^{2}\right)\text{d}\boldsymbol{x}.\end{eqnarray}$$

The streamfunction ${\it\psi}$ is here regarded as a functional of $q$ and ${\bf\phi}$ defined by

(4.9)

$$\begin{eqnarray}{\it\psi}={\rm\Delta}^{-1}\left(q-{\it\beta}y-\frac{\text{i}f_{0}}{2}\partial ({\it\phi}^{\ast },{\it\phi})-f_{0}G\left(\int ^{z}{\it\phi}(\tilde{z})^{\ast }\,\text{d}\tilde{z},\int ^{z}{\it\phi}(\tilde{z})\,\text{d}\tilde{z}\right)\right)\end{eqnarray}$$

with

(4.10)

$$\begin{eqnarray}{\it\psi}_{z}|_{z=z^{\pm }}=f_{0}^{-1}{\it\theta}^{\pm }\end{eqnarray}$$

following (2.12).

The Hamiltonian structure provides a systematic route to the derivation of conservation laws using Noether’s theorem. We note that the Hamiltonian flow associated with the wave action $\mathscr{A}=f_{0}\int |{\it\phi}|^{2}\,\text{d}\boldsymbol{x}/2$ , namely $\unicode[STIX]{x1D645}{\it\delta}\mathscr{A}/{\it\delta}{\bf\phi}$ , is $(-\text{i}{\it\phi},\text{i}{\it\phi}^{\ast },0,0,0)^{\text{T}}$ . This is recognised as the generator of the continuous transformation ${\it\phi}\mapsto {\it\phi}\exp (-\text{i}{\it\gamma}),\,{\it\gamma}\in \mathbb{R}$ , an obvious symmetry of $\mathscr{H}$ . The invariance of $\mathscr{H}$ with respect to translations and horizontal rotations gives rise to conserved linear and angular momenta. For instance, the conserved $x$ -momentum is readily shown to be

(4.11)

$$\begin{eqnarray}\displaystyle \mathscr{M}_{x} & = & \displaystyle \int \left(\frac{\text{i}f_{0}}{4}({\it\phi}^{\ast }{\it\phi}_{x}-{\it\phi}_{x}^{\ast }{\it\phi})-qy\right)\text{d}\boldsymbol{x}+f_{0}\int ((N^{+})^{-2}{\it\theta}^{+}-(N^{-})^{-2}{\it\theta}^{-})y\,\text{d}x\text{d}y\nonumber\\ \displaystyle & = & \displaystyle \int U\,\text{d}\boldsymbol{x}.\end{eqnarray}$$

Additional conserved quantities are of course the same Casimir invariants as in three-dimensional QG dynamics, namely the volume integrals of arbitrary functions of $q$ and surface integrals of arbitrary functions of ${\it\theta}^{\pm }$ (Shepherd 1990).
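The link between the wave action and phase shifts stated in this section can be verified with a few lines of complex arithmetic. The sketch below works at a single point, with hypothetical values of $f_{0}$ and ${\it\phi}$, and applies the wave block of the operator in (4.7) to the functional derivatives of $\mathscr{A}=f_{0}\int |{\it\phi}|^{2}\,\text{d}\boldsymbol{x}/2$, treating ${\it\phi}$ and ${\it\phi}^{\ast }$ as independent variables.

```python
import cmath

f0 = 1.0e-4       # inertial frequency (illustrative value)
phi = 0.3 + 0.9j  # hypothetical value of phi = chi_z at one point

# Functional derivatives of A = (f0 / 2) * integral of phi * phi^*.
dA_dphi = 0.5 * f0 * phi.conjugate()
dA_dphi_star = 0.5 * f0 * phi

# Wave block of the operator in (4.7) acting on (dA/dphi, dA/dphi^*).
flow_phi = (-2j / f0) * dA_dphi_star
flow_phi_star = (2j / f0) * dA_dphi

# Generator of phi -> phi * exp(-i gamma), from a centred difference
# in gamma about gamma = 0.
h = 1e-6
generator = (phi * cmath.exp(-1j * h) - phi * cmath.exp(1j * h)) / (2 * h)
print(flow_phi, generator)
```

Both computations return $-\text{i}{\it\phi}$ for the first component, so that in this sketch the Hamiltonian flow of $\mathscr{A}$ coincides with an infinitesimal phase rotation.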

5. Implications

We now discuss some implications of the conservation of energy (2.13) and action (2.14) for ocean dynamics. First, we note that the action conservation implies that the NIW amplitude remains zero if it is initially so: thus spontaneous generation of NIWs is impossible in this model, unsurprisingly since it is expected to be exponentially small in $\mathit{Ro}$ (Vanneste 2013) and thus much smaller than neglected terms. Second, the energy conservation indicates that the decrease in NIW scales induced by the ${\it\beta}$-effect in the absence of a flow, ${\it\psi}=0$, is necessarily accompanied by an equatorward drift of the NIWs, consistent with WKB results (Garrett 2001).

A third, more striking, conclusion is that the conservation laws show unambiguously that oceanic NIWs forced by atmospheric winds provide an energy sink for the mean flow. To see how, consider NIWs forced at some initial time $t=0$ with horizontal scales large enough that ${\it\chi}_{0}={\it\chi}(t=0)$ has negligible horizontal gradient, i.e. $\boldsymbol{{\rm\nabla}}{\it\chi}_{0}\approx 0$. This is a reasonable approximation since NIWs are generated by atmospheric storms whose scales are ten or more times the scale of oceanic eddies. Initially, NIWs make no contribution to the energy $\mathscr{H}$, which then purely consists of the mean-flow energy. As time progresses, the advection and refraction of the waves by the mean flow lead to a scale cascade in the NIW field, producing horizontal scales similar to, or smaller than, the eddy scale. As a result, $|\boldsymbol{{\rm\nabla}}{\it\chi}|$ grows since $|{\it\chi}|$ is constrained by wave-action conservation. According to (2.13), the contribution of $|\boldsymbol{{\rm\nabla}}{\it\chi}|^{2}$ to the energy must be balanced by a decrease in the energy of the mean flow. Physically, the mechanism for this energy exchange is clear: as the horizontal scale of the NIWs decreases, their potential energy increases, necessarily at the expense of the mean energy since the NIW kinetic energy $f_{0}\mathscr{A}$ is conserved. This mechanism can be suggestively termed ‘stimulated wave generation’ to distinguish it from spontaneous generation (ruled out in our model) and complete an electromagnetic analogy (e.g. Berestetskii, Lifshitz & Pitaevskii 1982).

The explicit form of (2.13) and (2.14) enables us to make quantitative predictions. Suppose that the NIWs initially have a typical vertical scale $m_{0}^{-1}$ , corresponding for example to the depth of the mixed layer. Suppose too that at some final time $t$ , the various processes governing their dynamics have led to typical horizontal and vertical scales $k^{-1}$ and $m^{-1}$ and to typical amplitudes $|{\it\chi}|$ . The conservation of wave action (2.14) implies that

(5.1)

$$\begin{eqnarray}\frac{f_{0}m_{0}^{2}}{2}|{\it\chi}_{0}|^{2}\approx \frac{f_{0}m^{2}}{2}|{\it\chi}|^{2}.\end{eqnarray}$$

Correspondingly, the kinetic energy of the NIW per unit volume, $\mathscr{K}_{NIW}\approx f_{0}^{2}m^{2}|{\it\chi}|^{2}/2$ remains unchanged. The potential energy, on the other hand, increases from 0 to $\mathscr{P}_{NIW}\approx N^{2}k^{2}|{\it\chi}|^{2}/4$ . We therefore conclude that the NIWs extract from the mean flow an energy

(5.2)

$$\begin{eqnarray}-\mathscr{E}_{QG}=\mathscr{P}_{NIW}=\frac{N^{2}k^{2}}{2f_{0}^{2}m^{2}}\mathscr{K}_{NIW}=\frac{{\it\epsilon}^{2}}{2}\mathscr{K}_{NIW}\end{eqnarray}$$

per unit volume. Because the dispersion relation of NIWs is ${\it\omega}=f_{0}(1+{\it\epsilon}^{2}/2)$ (as follows from the dispersion term in (2.7a ) or from a Taylor expansion of the inertia–gravity-wave frequency ${\it\omega}=(f_{0}^{2}+N^{2}k^{2}/m^{2})^{1/2}$), ${\it\epsilon}^{2}/2$ can also be rewritten as ${\rm\Delta}{\it\omega}/f_{0}$, the relative frequency shift away from $f_{0}$.
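The scalings in (5.2) and the equivalence between ${\it\epsilon}^{2}/2$ and the relative frequency shift can be made concrete with a short calculation. The values of $f_{0}$, $N$ and of the wavelengths below are assumptions chosen for illustration, not values taken from the text, and the hydrostatic inertia–gravity-wave frequency is taken as $(f_{0}^{2}+N^{2}k^{2}/m^{2})^{1/2}$ with ${\it\epsilon}=Nk/(f_{0}m)$.

```python
import math

f0 = 1.0e-4               # inertial frequency in s^-1 (assumed)
N = 5.0e-3                # buoyancy frequency in s^-1 (assumed)
k = 2 * math.pi / 50.0e3  # horizontal wavenumber, 50 km wavelength (assumed)
m = 2 * math.pi / 100.0   # vertical wavenumber, 100 m wavelength (assumed)
chi = 1.0                 # NIW amplitude |chi|; it cancels in the ratio

eps = N * k / (f0 * m)

# Kinetic and potential NIW energies per unit volume, as in section 5.
kinetic = 0.5 * f0 ** 2 * m ** 2 * chi ** 2
potential = 0.25 * N ** 2 * k ** 2 * chi ** 2
ratio = potential / kinetic  # should equal eps^2 / 2

# Exact hydrostatic frequency against the near-inertial approximation.
omega_exact = math.sqrt(f0 ** 2 + N ** 2 * k ** 2 / m ** 2)
omega_niw = f0 * (1.0 + 0.5 * eps ** 2)
relative_shift = omega_exact / f0 - 1.0
print(eps, ratio, relative_shift)
```

For these assumed values ${\it\epsilon}$ is small, the energy ratio equals ${\it\epsilon}^{2}/2$, and the exact relative frequency shift differs from ${\it\epsilon}^{2}/2$ only at $O({\it\epsilon}^{4})$.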

Since one of the main open questions in ocean dynamics concerns the dissipation of mesoscale energy, it is natural to ask whether the mechanism we have identified could be a significant contributor. Assuming that the process of NIW generation followed by their cascade to small scale occurs in a continuous fashion, (5.2) can be turned into an expression for the power rate extracted from the mean flow,

(5.3)

$$\begin{eqnarray}-\dot{\mathscr{E}}_{QG}=\frac{{\it\epsilon}^{2}}{2}\dot{\mathscr{K}}_{NIW},\end{eqnarray}$$

where $\dot{\mathscr{K}}_{NIW}$ is the power injected into NIWs by winds. Integrating over the whole ocean, this power is estimated as 0.6 TW in Wunsch & Ferrari (2004). It is unclear what a realistic value of ${\it\epsilon}^{2}/2$ might be: if we take $k$ and $m$ as representative of typical NIWs, ${\it\epsilon}^{2}/2={\rm\Delta}{\it\omega}/f_{0}$ can be interpreted as the width of the inertial peak relative to $f_{0}$ , and a value of ${\it\epsilon}^{2}/2=0.2$ is plausible. This leads to a sink of 0.12 TW, comparable, for instance, with the 0.1 TW estimated for the dissipation caused by bottom drag (Wunsch & Ferrari 2004). There is considerable uncertainty in these estimates, however, in particular because it is not clear what the final values of $k$ and $m$ ought to be and whether the impact of NIWs is restricted to the upper parts of the ocean. Furthermore, the scale cascade can be expected to lead to values of ${\it\epsilon}^{2}/2$ that are not small, e.g. through the mechanism of wave capture (Badulin & Shrira 1993; Bühler & McIntyre 2005), which suggests that ${\it\epsilon}$ stabilises at $O(1)$ values. While our model ceases to be valid then – and the crucial feature of conserved wave kinetic energy ceases to hold – one can expect energy to be transferred from mean flow to the waves throughout the cascading process. Our argument above, necessarily limited to ${\it\epsilon}\ll 1$ , may therefore underestimate the amount of energy extracted from the mean flow. It would certainly be valuable to test the efficiency of the process through detailed numerical simulations.
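The arithmetic behind this estimate is easy to reproduce. The snippet below is a minimal sketch of (5.3) using only the figures quoted in the text; the function and variable names are illustrative choices, not part of the paper.

```python
import math

def mean_flow_sink_tw(wind_power_tw, half_eps_sq):
    """Power extracted from the mean flow, i.e. the right-hand side of (5.3), in TW."""
    return half_eps_sq * wind_power_tw

WIND_POWER_TW = 0.6    # wind power input to NIWs (Wunsch & Ferrari 2004), as quoted
HALF_EPS_SQ = 0.2      # epsilon^2 / 2, the value taken as plausible in the text
BOTTOM_DRAG_TW = 0.1   # bottom-drag dissipation quoted for comparison

sink_tw = mean_flow_sink_tw(WIND_POWER_TW, HALF_EPS_SQ)
implied_eps = math.sqrt(2.0 * HALF_EPS_SQ)   # epsilon implied by the chosen epsilon^2 / 2

print(f"sink = {sink_tw:.2f} TW, ratio to bottom drag = {sink_tw / BOTTOM_DRAG_TW:.1f}")
print(f"implied epsilon = {implied_eps:.2f}")
```

Since the estimate is linear in ${\it\epsilon}^{2}/2$, any revision of that value rescales the sink proportionally. The value of ${\it\epsilon}$ implied by ${\it\epsilon}^{2}/2=0.2$ is not particularly small, which is in line with the caveats stated above.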

6. Two-dimensional models

In this section we discuss two two-dimensional models that are deduced from the YBJ–QG model under certain symmetry assumptions. These models are useful to the study of NIW-mean interactions in a simplified context.

6.1. Slice model

Neglecting the ${\it\beta}$ -effect, we consider solutions that are independent of $y$ . This reduces (2.7) to

(6.1a )

$$\begin{eqnarray}\displaystyle & \displaystyle {\it\chi}_{zzt}+\frac{\text{i}N^{2}}{2f_{0}}{\it\chi}_{xx}+\frac{\text{i}}{2}({\it\psi}_{xx}{\it\chi}_{zz}+{\it\psi}_{zz}{\it\chi}_{xx}-2{\it\psi}_{xz}{\it\chi}_{xz})=0, & \displaystyle\end{eqnarray}$$

(6.1b )

$$\begin{eqnarray}\displaystyle & \displaystyle \partial _{t}\left({\it\psi}_{xx}+\partial _{z}\left(\frac{f_{0}^{2}}{N^{2}}{\it\psi}_{z}\right)+\frac{f_{0}}{4}(2|{\it\chi}_{xz}|^{2}-{\it\chi}_{zz}{\it\chi}_{xx}^{\ast }-{\it\chi}_{zz}^{\ast }{\it\chi}_{xx})\right)=0. & \displaystyle\end{eqnarray}$$

Because advection disappears, (6.1b ) can be integrated in time to provide the streamfunction in terms of ${\it\chi}$ , leaving (6.1a ) as the sole prognostic equation.
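For concreteness, the time-integrated form of (6.1b ) can be sketched as follows. The symbol $q_{0}$ is introduced here purely for illustration and denotes the initial value of the bracketed quantity in (6.1b ); it does not appear in the original derivation.

```latex
\begin{align}
\psi_{xx} + \partial_z\!\left(\frac{f_0^2}{N^2}\,\psi_z\right)
  &= q_0(x,z) - \frac{f_0}{4}\left(2|\chi_{xz}|^2
     - \chi_{zz}\chi_{xx}^{*} - \chi_{zz}^{*}\chi_{xx}\right), \\
q_0(x,z) &\equiv \left.\left[\psi_{xx}
     + \partial_z\!\left(\frac{f_0^2}{N^2}\,\psi_z\right)
     + \frac{f_0}{4}\left(2|\chi_{xz}|^2
     - \chi_{zz}\chi_{xx}^{*} - \chi_{zz}^{*}\chi_{xx}\right)\right]\right|_{t=0}.
\end{align}
```

At each time the streamfunction then follows from an elliptic inversion of the left-hand side, with the right-hand side known from ${\it\chi}$ and the initial data.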

We illustrate the interest of this model by presenting the result of a numerical simulation examining the impact of NIWs on a barotropic mean flow using a set-up based on that of Balmforth, Llewellyn-Smith & Young (1998). In this set-up, NIWs initialised near the surface propagate vertically as a result of their interactions with the one-dimensional mean flow

(6.2)

$$\begin{eqnarray}\boldsymbol{{\rm\nabla}}^{\bot }{\it\psi}=(0,U_{QG}\sin (2{\rm\pi}x/L)),\end{eqnarray}$$

where $L$ is the length of the domain. The coupled model enables us to study the feedback of the NIWs on this mean flow.

We carried out simulations using a pseudospectral implementation of (6.1), with a domain $(x,z)\in [0,L]\times [-H,0]$ , where $L=80~\text{km}$ and $H=4\,200~\text{m}$ . The Coriolis frequency is taken as $f_{0}=10^{-4}~\text{s}^{-1}$ and a constant Brunt–Väisälä frequency $N=8\times 10^{-3}~\text{s}^{-1}$ is used, somewhat smaller than that in Balmforth et al. (1998). The maximum mean velocity is $U_{QG}=0.08~\text{m}~\text{s}^{-1}$ . The NIWs are initially confined within the mixed layer with a characteristic depth $H_{m}=50~\text{m}$ , with the form ${\it\chi}_{0z}=U_{\text{NIW}}\exp (-(z/H_{m})^{2})$ where $U_{\text{NIW}}=0.8~\text{m}~\text{s}^{-1}$ . The corresponding dimensionless parameters are $\mathit{Ro}=0.01$ , ${\it\alpha}=0.1$ and ${\it\epsilon}=0.05$ , so $\mathit{Ro}^{1/2}={\it\alpha}\approx {\it\epsilon}$ , consistent with our scaling assumptions.
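The quoted dimensionless parameters can be recovered from the dimensional ones. The sketch below assumes the definitions $\mathit{Ro}=U_{QG}/(f_{0}L)$, ${\it\alpha}=U_{\text{NIW}}/(f_{0}L)$ and ${\it\epsilon}=NH_{m}/(f_{0}L)$ (that is, $k\sim 1/L$ and $m\sim 1/H_{m}$); these are assumptions chosen because they reproduce the stated values, and the function name is hypothetical.

```python
def slice_model_parameters(L, H_m, f0, N, U_qg, U_niw):
    """Return (Ro, alpha, epsilon) under the assumed definitions in the lead-in."""
    ro = U_qg / (f0 * L)        # Rossby number of the mean flow
    alpha = U_niw / (f0 * L)    # wave amplitude parameter
    eps = N * H_m / (f0 * L)    # N k / (f0 m) with k ~ 1/L and m ~ 1/H_m (assumed)
    return ro, alpha, eps

# Dimensional values quoted for the slice-model simulation (SI units).
ro, alpha, eps = slice_model_parameters(
    L=80e3, H_m=50.0, f0=1e-4, N=8e-3, U_qg=0.08, U_niw=0.8)

print(f"Ro = {ro:.3g}, alpha = {alpha:.3g}, epsilon = {eps:.3g}, Ro^(1/2) = {ro ** 0.5:.3g}")
```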

Figure 1 shows the evolution of the change in mean energy, wave potential energy and total energy from their initial values in a 14-day simulation. Here, the mean energy and wave potential energies are the two terms

(6.3a,b )

$$\begin{eqnarray}\frac{1}{2}\int \left({\it\psi}_{x}^{2}+\frac{f_{0}^{2}}{N^{2}}{\it\psi}_{z}^{2}\right)\text{d}\boldsymbol{x}\quad \text{and}\quad \frac{N^{2}}{4}\int |{\it\chi}_{x}|^{2}\,\text{d}\boldsymbol{x},\end{eqnarray}$$

which make up the constant total energy. The figure confirms that, overall, NIWs act as an energy sink for the mean flow. The net energy transfer from mean flow to NIWs is concentrated within the first five days; afterwards, the energy exchange is much smaller and its sign alternates. The NIW amplitude $|{\it\chi}_{z}|$ and the change in the mean velocity $V={\it\psi}_{x}$ are shown in figure 2. The NIW feedback results in a slowing down of the mean flow, consistent with the energy loss and collocated with the NIW wavepacket. An important feature of the mean-flow evolution is that it is reversible: at each location, the flow velocity returns to its initial value once the NIWs have propagated away. This is a particular feature of the slice model, specifically of the diagnostic relation existing between the mean flow and the NIW amplitude. We next consider another two-dimensional model in which the NIW–mean-flow interactions lead to an irreversible behaviour.

Figure 1. Energy exchange in the slice model: the changes in the mean energy (solid line), NIW potential energy (dashed line) and total energy (dotted line) are shown as functions of time. These energy changes are normalised by the initial mean-flow energy in the mixed layer, $z\in [-50,0]~\text{m}$ . The increase of NIW potential energy is offset by a mean energy loss, resulting in a total energy that is conserved up to a small hyperviscous dissipation added for numerical stability.

Figure 2. Wave amplitude $|{\it\chi}_{z}|$ (a–c) and change in the mean velocity $V={\it\psi}_{x}$ (d–f) in the slice model; $|{\it\chi}_{z}|$ and $V$ are non-dimensionalised by ${\it\alpha}L$ and $U_{QG}$ , respectively. The downward propagating NIWs induce a mean flow change, which slows down the original mean flow. Times: (a,d) $t=17.4$ days, (b,e) $t=34.7$ days, (c,f) $t=52.1$ days.

6.2. Vertically plane wave

A simple two-dimensional model in the $(x,y)$ plane is obtained by assuming that the wavefield takes the form of a plane wave in the vertical, that is, ${\it\chi}_{z}={\it\varphi}(x,y,t)\text{e}^{\text{i}mz}$ for some complex function ${\it\varphi}$ and vertical wavenumber $m$ . This is consistent with a barotropic mean flow ${\it\psi}={\it\psi}(x,y,t)$ . Introducing this restricted form of the solution into the coupled model (2.7) reduces it to

(6.4a )

$$\begin{eqnarray}\displaystyle & \displaystyle \partial _{t}{\it\varphi}+\partial ({\it\psi},{\it\varphi})+\text{i}{\it\beta}y{\it\varphi}-\frac{\text{i}N^{2}}{2m^{2}f_{0}}{\rm\nabla}^{2}{\it\varphi}+\frac{\text{i}}{2}{\rm\nabla}^{2}{\it\psi}\,{\it\varphi}=0, & \displaystyle\end{eqnarray}$$

(6.4b )

$$\begin{eqnarray}\displaystyle & \partial _{t}q+\partial ({\it\psi},q)=0, & \displaystyle\end{eqnarray}$$

where

(6.5)

$$\begin{eqnarray}q={\it\beta}y+{\rm\nabla}^{2}{\it\psi}+\frac{\text{i}f_{0}}{2}\partial ({\it\varphi}^{\ast },{\it\varphi})+\frac{f_{0}}{4}{\rm\nabla}^{2}|{\it\varphi}|^{2}.\end{eqnarray}$$
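A quick consistency check on the dispersive term in (6.4a ) is to linearise about a state of rest. The sketch below assumes ${\it\psi}=0$, ${\it\beta}=0$ and a plane-wave amplitude; the horizontal wavenumbers $(k,l)$ and the symbol $\Delta\omega$ for the frequency shift relative to $f_{0}$ are notational choices made here.

```latex
\begin{align}
\varphi &= \hat{\varphi}\,\mathrm{e}^{\mathrm{i}(kx+ly-\Delta\omega\,t)}
\quad\Longrightarrow\quad
-\mathrm{i}\,\Delta\omega\,\varphi
  + \frac{\mathrm{i}N^2}{2m^2f_0}\,(k^2+l^2)\,\varphi = 0, \\
\Delta\omega &= \frac{N^2(k^2+l^2)}{2m^2f_0}
  = \frac{\epsilon^2}{2}\,f_0,
\qquad \epsilon = \frac{N(k^2+l^2)^{1/2}}{f_0\,m}.
\end{align}
```

This recovers the frequency shift ${\rm\Delta}{\it\omega}/f_{0}={\it\epsilon}^{2}/2$ used in § 5.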

As an illustration, we consider the propagation of a vorticity dipole in a NIW field on the $f$ -plane ( ${\it\beta}=0$ ). We carry out simulations initialising the streamfunction ${\it\psi}$ to match the vorticity

(6.6)

$$\begin{eqnarray}{\it\omega}={\rm\nabla}^{2}{\it\psi}=\left\{\begin{array}{@{}ll@{}}\displaystyle \frac{2{\it\kappa}U}{\text{J}_{0}({\it\kappa}a)}\text{J}_{1}({\it\kappa}r)\sin {\it\theta},\quad & r<a\\ 0,\quad & r>a,\end{array}\right.\end{eqnarray}$$

of the Lamb (1932) dipole propagating at speed $U$ in the $y$ direction. Here $(r,{\it\theta})$ are polar coordinates, $a$ characterises the spatial scale of the dipole, $\text{J}_{n}$ are the Bessel functions of the first kind of order $n$ , and ${\it\kappa}$ is determined by solving the matching condition $\text{J}_{1}({\it\kappa}a)=0$ .
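The matching condition is straightforward to solve numerically. The following is a minimal, dependency-free sketch: it evaluates integer-order Bessel functions from Bessel's integral and finds the first positive zero of $\text{J}_{1}$ by bisection. Taking the first zero (rather than a higher one) is an assumption, as are the function names, the bracket $[3,4.5]$ and the sample values $a=40~\text{km}$ and $U=0.05~\text{m}~\text{s}^{-1}$, which are borrowed from the simulation described in the text.

```python
import math

def bessel_j(n, x, panels=200):
    """Integer-order J_n(x) from Bessel's integral, using the trapezoid rule."""
    h = math.pi / panels
    g = lambda t: math.cos(n * t - x * math.sin(t))
    total = 0.5 * (g(0.0) + g(math.pi))
    for i in range(1, panels):
        total += g(i * h)
    return total * h / math.pi

def first_zero_j1(lo=3.0, hi=4.5, tol=1e-12):
    """First positive zero of J_1, located by bisection on an assumed bracket."""
    f_lo = bessel_j(1, lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j(1, mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def lamb_vorticity(r, theta, U, a, kappa):
    """Vorticity (6.6): Bessel profile inside r < a, zero outside."""
    if r >= a:
        return 0.0
    amplitude = 2.0 * kappa * U / bessel_j(0, kappa * a)
    return amplitude * bessel_j(1, kappa * r) * math.sin(theta)

a = 40e3                      # dipole radius in metres (value from the simulation)
kappa = first_zero_j1() / a   # kappa such that J_1(kappa * a) = 0
print(f"kappa * a = {kappa * a:.6f}, kappa = {kappa:.4e} m^-1")
print(lamb_vorticity(0.5 * a, 0.5 * math.pi, 0.05, a, kappa))
```

Because $\text{J}_{1}({\it\kappa}a)=0$, the vorticity defined this way vanishes continuously at $r=a$.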

We carry out a numerical simulation in a periodic domain of size $500~\text{km}\times 500~\text{km}$ using a pseudospectral method. Because of the periodisation, the vorticity (6.6) does not exactly correspond to that of a dipole steadily propagating at speed $U$ ; however, for the dipole size $a=40~\text{km}$ that we take, the differences are minor. We take the other parameters to be $U=0.05~\text{m}~\text{s}^{-1}$ , $f_{0}=10^{-4}~\text{s}^{-1}$ , and $N=0.01~\text{s}^{-1}$ . Taking $L=a$ gives a Rossby number $\mathit{Ro}=0.0125$ . The initial wave amplitude is chosen as the Gaussian

(6.7)

$$\begin{eqnarray}{\it\varphi}=A\text{e}^{-(k_{0}(y-y_{0}))^{2}},\end{eqnarray}$$

where $A=1.5~\text{km}$ , $k_{0}=2\times 10^{-5}~\text{m}^{-1}$ and $y_{0}=250~\text{km}$ . This implies that ${\it\alpha}=A/L=0.0375$ and $U_{NIW}=0.15~\text{m}~\text{s}^{-1}$ . The vertical scale of the wave is taken as $m=0.02~\text{m}^{-1}$ , so ${\it\epsilon}=0.125$ . We therefore have that $\mathit{Ro}<{\it\alpha}<{\it\epsilon}\approx \mathit{Ro}^{1/2}$ , consistent with our scaling. The initial position of the dipole ( $r=0$ ) and wavepacket (maximum of $|{\it\varphi}|$ ) are $(0.5,\,0.3)$ and $y=0.5$ when distances are normalised by the domain size of $500~\text{km}$ .

We report the results of an integration time of $t=1.5\times 10^{7}~\text{s}\approx 173~\text{days}$ , within which the dipole travels about 1.5 times the domain size. The changes in mean and wave energies (normalised by the initial mean energy) are shown as functions of time in figure 3. As in the slice model, the increase of NIW energy is compensated by a loss of mean-flow energy. Using (5.2) and ${\it\epsilon}\approx 0.1$ , we can estimate the relative mean energy change to be about 0.05, in agreement with the numerical results. The initial and final streamfunction ${\it\psi}$ and wave amplitude $|{\it\varphi}|$ are shown in figure 4. This also shows the trajectories of the vorticity maximum and minimum as an indication of the dipole’s trajectory. The NIWs, which partly concentrate in the anticyclonic core of the dipole through a well-established mechanism (e.g. Danioux, Vanneste & Bühler 2015, and references therein), have an obvious impact on the mean flow: instead of propagating in a straight line $x=\text{const.}$ , the dipole deforms and is deflected to the left. This illustrates the irreversible nature of the wave–mean-flow interactions when, unlike in the slice model, the PV is not constant. The phenomenon is reminiscent of the deflection of dipoles observed by Snyder et al. (2007) in simulations of the spontaneous generation of inertia–gravity waves by dipoles; there is a possible connection that might be worth exploring.
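The set-up numbers for the dipole run can be cross-checked in a few lines. This is a sketch under stated assumptions: the horizontal wavenumber entering ${\it\epsilon}$ is taken as $k=1/a$ and the wave velocity as $U_{NIW}=f_{0}A$, both chosen because they reproduce the quoted values; variable names are illustrative.

```python
SECONDS_PER_DAY = 86400.0

# Dimensional values quoted for the dipole simulation (SI units).
a, domain = 40e3, 500e3      # dipole radius and periodic domain size
U, f0, N = 0.05, 1e-4, 0.01  # dipole speed, Coriolis and buoyancy frequencies
A, m = 1.5e3, 0.02           # wave amplitude (a length) and vertical wavenumber
t_end = 1.5e7                # integration time

ro = U / (f0 * a)            # Rossby number with L = a
alpha = A / a                # wave amplitude parameter
u_niw = f0 * A               # wave velocity scale (assumed U_NIW = f0 * A)
eps = N / (f0 * m * a)       # N k / (f0 m) with k = 1/a (assumed)
days = t_end / SECONDS_PER_DAY
crossings = U * t_end / domain   # distance travelled at speed U, in domain sizes

print(f"Ro = {ro}, alpha = {alpha}, U_NIW = {u_niw} m/s, epsilon = {eps}")
print(f"t = {days:.1f} days, distance = {crossings:.2f} domain sizes")
```

The distance estimate uses the unperturbed speed $U$ and so ignores the deformation and deflection of the dipole reported in the simulation.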

Figure 3. The same as figure 1 but for the simulation of a vortex dipole propagating in a field of vertically travelling NIWs. The energy changes are normalised by the initial mean-flow energy.

Figure 4. NIW-dipole interaction: initial (a,c) and final (b,d) streamfunction ${\it\psi}$ (a,b) and NIW amplitude $|{\it\varphi}|$ (c,d). Both ${\it\psi}$ and $|{\it\varphi}|$ have been normalised by their maximum value at the initial time. The trajectories of the vorticity maximum and minimum shown by the thick black lines in (b) indicate the motion of the dipole during the simulation.

7. Discussion

In this paper we derive and study a model of the interactions between slow balanced motion and fast NIWs in the ocean. The model is obtained within the GLM framework (e.g. Bühler 2009) or, more precisely, its glm variant (Soward & Roberts 2010), and neglects dissipative effects. In its simplest form (2.7), the model consists of the YBJ model of NIW propagation (Young & Ben Jelloul 1997) coupled with a modified QG equation. As expected from general GLM theory (Bühler & McIntyre 1998; Bühler 2009; Salmon 2013), the modification consists solely in a change in the relation between streamfunction and PV which adds to the standard QGPV a quadratic wave contribution. (A comparison between averaging formalisms – glm, GLM and others – in appendix C shows that this wave contribution arises as the sum of the curl of a pseudomomentum, a wave-induced mean-stratification change and a mean-density change, with the exact form of each term depending on the formalism.) Thus NIWs impact the dynamics of PV by changing its advection in what is, in general, an irreversible manner. The assumption that the waves are near-inertial leads to drastic simplifications, reducing the wave part of the dynamics to the YBJ equation for a single (complex) amplitude ${\it\chi}$ evolving on the same time scale as the balanced flow.

Our YBJ–QG coupled model can be thought of as providing a parametrisation of NIW effects, with the fast NIWs regarded as a subgrid phenomenon in time. In this view, the YBJ equation is an asymptotically motivated closure for the NIWs: it provides enough information about the NIWs to compute their impact on the balanced flow. We emphasise that the derivation relies on a scale separation in time only and does not assume that the waves have small spatial scales, unlike previous applications of GLM (Gjaja & Holm 1996; Bühler & McIntyre 1998). This is crucial for NIWs since they are forced by atmospheric winds at horizontal scales that are much larger than the oceanic mesoscales. It is also practically convenient since the YBJ and QG equations can be solved numerically on the same grid, so that the coupled model requires only about three times as much computational effort as the standard QG equation.

As discussed in § 2, the model is not fully consistent asymptotically. This is because the different aspect ratios it assumes for NIWs and balanced motion, specifically $m/k={\it\epsilon}^{-1}N/f_{0}\gg N/f_{0}$ and $L/H=O(N/f_{0})$ , cannot be expected to persist: the feedback of the NIWs implies that their aspect ratio is imprinted onto the balanced flow, leading to an increase in $L/H$ and potentially to a breakdown of the assumption of order-one Burger number that underlies the QG approximation. In practice this may not be significant: the NIWs contribute to the QG velocity $\boldsymbol{{\rm\nabla}}^{\bot }{\it\psi}$ through a term that is twice as smooth in the vertical as the NIW amplitude ${\it\chi}_{z}$ itself (because of the Helmholtz inversion in (2.9)). As a result, short vertical fluctuations in ${\it\chi}_{z}$ have a limited impact on $\boldsymbol{{\rm\nabla}}^{\bot }{\it\psi}$ . Furthermore, in the case of locally planar NIWs, it is the envelope scale that is imprinted onto $\boldsymbol{{\rm\nabla}}^{\bot }{\it\psi}$ rather than the (much shorter) wavelength. Finally, the existence of a coupled YBJ–primitive-equation model with conservation laws for PV, energy and action analogous to those of the YBJ–QG model suggests that conclusions inferred from the latter model are robust. Nonetheless, it might be desirable to treat the difference in the vertical scales $H$ and $m^{-1}$ in a fully consistent way by applying a multiscale method in space as well as in time. It is unclear, however, whether a model derived in this manner would be significantly different from the YBJ–QG model.

In this paper we discuss some qualitative aspects of the interactions between balanced flow and NIWs in the ocean, mostly based on the remarkably simple action and energy conservation laws of the YBJ–QG model. The conservation of action implies the complete absence of spontaneous NIW generation in the model, consistent with the expected exponentially small size of this phenomenon (Vanneste 2013). The conservation laws further indicate that NIWs forced at large scales by atmospheric winds provide an energy sink for the oceanic balanced motion through a mechanism that can be termed ‘stimulated wave generation’. This is potentially significant: several mechanisms have been proposed to explain the dissipation of mesoscale energy but it is far from clear whether they are efficient enough to balance the flux imposed by the energy source (mainly baroclinic instability). We offer a rough estimate of the power extracted from the mean flow by the mechanism we have identified; this suggests that further consideration is worthwhile. More reliable estimates would require intensive numerical simulations of the YBJ–QG or of the primitive equations and are well beyond the scope of this paper.

Acknowledgements

The authors thank O. Bühler, E. Danioux, D. N. Straub, S. M. Taylor, G. L. Wagner and W. R. Young for valuable discussions. W. R. Young proposed the term ‘stimulated generation’ for the mechanism described in § 5. This research is funded by the UK Natural Environment Research Council (grant NE/J022012/1). J.-H.X. acknowledges financial support from the Centre for Numerical Algorithms and Intelligent Software (NAIS).

Appendix A. glm average

In glm, the map from mean to perturbed positions is written in terms of a divergence-free vector field, ${\bf\nu}(\boldsymbol{X},t)$ say, as

(A 1)

$$\begin{eqnarray}\boldsymbol{X}+{\bf\xi}(\boldsymbol{X},t)=\text{e}^{{\bf\nu}}\boldsymbol{X}.\end{eqnarray}$$

Here the exponential denotes the flow map generated by ${\bf\nu}$ , that is, defining $\boldsymbol{x}(s)$ as the solution of

(A 2)

$$\begin{eqnarray}\frac{\text{d}}{\text{d}s}\boldsymbol{x}(s)={\bf\nu}(\boldsymbol{x}(s),t),\quad \text{where}~\boldsymbol{x}(0)=\boldsymbol{X}\end{eqnarray}$$

and $t$ is regarded as a fixed parameter, $\text{e}^{{\bf\nu}}\boldsymbol{X}=\boldsymbol{x}(1)$ . The glm average is then defined by the condition

(A 3)

$$\begin{eqnarray}\langle {\bf\nu}\rangle =0,\end{eqnarray}$$

which replaces GLM’s condition $\langle {\bf\xi}\rangle =0$ (Soward & Roberts 2010; note that we use the symbol ${\bf\nu}$ for the vector field denoted by ${\it\eta}$ in their paper). The divergence-free property of ${\bf\nu}$ ensures that (A 1) preserves volume. For small perturbations ${\it\alpha}\ll 1$ , it is easy to relate ${\bf\xi}$ to ${\bf\nu}$ order-by-order in ${\it\alpha}$ . Expanding ${\bf\xi}$ according to (3.5) and, similarly, ${\bf\nu}$ according to ${\bf\nu}={\bf\nu}^{(1)}+{\bf\nu}^{(2)}+\cdots \,$ , we can use (A 1) to write

(A 4)

$$\begin{eqnarray}{\bf\xi}={\bf\nu}+{\textstyle \frac{1}{2}}{\bf\nu}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}{\bf\nu}+\cdots ={\bf\nu}^{(1)}+\left({\textstyle \frac{1}{2}}{\bf\nu}^{(1)}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}{\bf\nu}^{(1)}+{\bf\nu}^{(2)}\right)+\cdots \,.\end{eqnarray}$$

Identifying the first two orders in ${\it\alpha}$ yields

(A 5a,b )

$$\begin{eqnarray}{\bf\nu}^{(1)}={\bf\xi}^{(1)}\quad \text{and}\quad {\bf\nu}^{(2)}={\bf\xi}^{(2)}-{\textstyle \frac{1}{2}}{\bf\nu}^{(1)}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}{\bf\nu}^{(1)}.\end{eqnarray}$$

The condition (A 3) then becomes

(A 6)

$$\begin{eqnarray}\langle {\bf\xi}^{(2)}\rangle ={\textstyle \frac{1}{2}}\langle {\bf\nu}^{(1)}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}{\bf\nu}^{(1)}\rangle ={\textstyle \frac{1}{2}}\langle {\bf\xi}^{(1)}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}{\bf\xi}^{(1)}\rangle .\end{eqnarray}$$
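The expansion (A 4) of the flow map can be checked numerically on a toy problem. The sketch below is illustrative only: it uses a one-dimensional field ${\it\nu}(x)={\it\alpha}\sin x$ (which cannot be divergence-free, so it tests the series (A 4) and not the volume-preservation property), integrates (A 2) with a fourth-order Runge–Kutta scheme, and compares the displacement with the first- and second-order truncations. All names and values are hypothetical.

```python
import math

def flow_map_displacement(nu, X, steps=200):
    """xi = x(1) - X, where dx/ds = nu(x) and x(0) = X, integrated with RK4."""
    h = 1.0 / steps
    x = X
    for _ in range(steps):
        k1 = nu(x)
        k2 = nu(x + 0.5 * h * k1)
        k3 = nu(x + 0.5 * h * k2)
        k4 = nu(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x - X

def expansion_errors(alpha, X=1.0):
    """Errors of the truncations xi ~ nu and xi ~ nu + (1/2) nu nu' at position X."""
    nu = lambda x: alpha * math.sin(x)
    dnu = lambda x: alpha * math.cos(x)
    xi = flow_map_displacement(nu, X)
    first = nu(X)
    second = first + 0.5 * nu(X) * dnu(X)
    return abs(xi - first), abs(xi - second)

err1, err2 = expansion_errors(0.05)
_, err2_half = expansion_errors(0.025)
print(f"first-order error  = {err1:.3e}")
print(f"second-order error = {err2:.3e}")
print(f"error ratio when alpha is halved = {err2 / err2_half:.2f}")
```

Halving ${\it\alpha}$ should reduce the second-order truncation error by a factor close to eight, consistent with a remainder of order ${\it\alpha}^{3}$ in (A 4).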

Appendix B. Mean dynamics

Following Salmon (2013), the equations governing the mean dynamics are derived from the energy–momentum equations

(B 1)

$$\begin{eqnarray}\frac{\partial }{\partial X^{j}}\left(a_{R}^{i}\frac{\partial \langle L\rangle }{\partial a_{X^{j}}^{i}}\right)=\frac{\partial \langle L\rangle }{\partial R}-\left.\frac{\partial \langle L\rangle }{\partial R}\right|_{expl}^{{\it\chi}}\end{eqnarray}$$

applied to the density $\langle L\rangle$ associated with the Lagrangian equation (3.10) (i.e. $\langle L\rangle$ is the integrand in the expression of $\langle \mathscr{L}\rangle$ ). In the energy–momentum equations, $(X^{0},X^{1},X^{2},X^{3})=(T,X,Y,Z)$ , $(a^{1},a^{2},a^{3})=(a,b,{\it\theta})$ and Einstein’s summation convention is used; $R$ can be taken to be $T$ , leading to an energy equation, or $X$ , $Y$ or $Z$ , leading to the corresponding momentum equations. The sub- and superscript ‘expl’ and ${\it\chi}$ attached to the last term in (B 1) indicate derivatives of the terms that depend explicitly on $R$ , treating the dependence introduced by ${\it\chi}$ as such an explicit dependence; in other words, the right-hand side of (B 1) collects derivatives associated with the mean flow only.

To keep expressions compact, we make the following definitions:

(B 2a )

$$\begin{eqnarray}\displaystyle A & \equiv & \displaystyle \frac{1}{J}\frac{{\it\delta}\langle \mathscr{L}\rangle }{{\it\delta}U}=U-\left(f_{0}Y+\frac{1}{2}{\it\beta}Y^{2}\right)+A^{\prime }\nonumber\\ \displaystyle & = & \displaystyle U-\left(f_{0}Y+\frac{1}{2}{\it\beta}Y^{2}\right)-\frac{\text{i}f_{0}}{4}({\it\chi}_{Z}{\it\chi}_{ZX}^{\ast }-{\it\chi}_{Z}^{\ast }{\it\chi}_{ZX})-f_{0}Y\langle {\it\xi}^{(2)}\rangle _{X}-f_{0}\langle {\it\eta}^{(2)}\rangle ,\end{eqnarray}$$

(B 2b )

$$\begin{eqnarray}\displaystyle B & \equiv & \displaystyle \frac{1}{J}\frac{{\it\delta}\langle \mathscr{L}\rangle }{{\it\delta}V}=V+B^{\prime }\nonumber\\ \displaystyle & = & \displaystyle V-\frac{\text{i}f_{0}}{4}({\it\chi}_{Z}{\it\chi}_{ZY}^{\ast }-{\it\chi}_{Z}^{\ast }{\it\chi}_{ZY})-f_{0}Y\langle {\it\xi}^{(2)}\rangle _{Y},\end{eqnarray}$$

(B 2c )

$$\begin{eqnarray}\displaystyle C & \equiv & \displaystyle \frac{1}{J}\frac{{\it\delta}\langle \mathscr{L}\rangle }{{\it\delta}W}=C^{\prime }\nonumber\\ \displaystyle & = & \displaystyle -\frac{\text{i}f_{0}}{4}({\it\chi}_{Z}{\it\chi}_{ZZ}^{\ast }-{\it\chi}_{Z}^{\ast }{\it\chi}_{ZZ})-f_{0}Y\langle {\it\xi}^{(2)}\rangle _{Z},\end{eqnarray}$$

(B 2d )

$$\begin{eqnarray}\displaystyle E & \equiv & \displaystyle \frac{{\it\delta}\langle L\rangle }{{\it\delta}J}=\frac{1}{2}(U^{2}+V^{2})-\left(f_{0}Y+\frac{1}{2}{\it\beta}Y^{2}\right)U+{\it\theta}Z+P+E^{\prime }\nonumber\\ \displaystyle & = & \displaystyle \frac{1}{2}(U^{2}+V^{2})-\left(f_{0}Y+\frac{1}{2}{\it\beta}Y^{2}\right)U+{\it\theta}Z+P\nonumber\\ \displaystyle & & \displaystyle -\,\frac{\text{i}f_{0}}{4}({\it\chi}_{Z}D_{T}{\it\chi}_{Z}^{\ast }-{\it\chi}_{Z}^{\ast }D_{T}{\it\chi}_{Z})-\frac{1}{2}f_{0}{\it\beta}Y|{\it\chi}_{Z}|^{2}\nonumber\\ \displaystyle & & \displaystyle -\,f_{0}YD_{T}\langle {\it\xi}^{(2)}\rangle -f_{0}\langle {\it\eta}^{(2)}\rangle U+{\it\theta}\langle {\it\zeta}^{(2)}\rangle ,\end{eqnarray}$$

where $A^{\prime }$ , $B^{\prime }$ , $C^{\prime }$ and $E^{\prime }$ group the NIW contributions. Note that $(A^{\prime },B^{\prime },C^{\prime })$ is the wave pseudomomentum. The terms in the energy–momentum tensor (B 1) for $R=T$ can then be written as

(B 3)

$$\begin{eqnarray}\displaystyle a_{R}^{i}\frac{\partial \langle L\rangle }{\partial a_{T}^{i}} & = & \displaystyle a_{R}^{i}\frac{\partial U^{j}}{\partial a_{T}^{i}}\frac{\partial \langle L\rangle }{\partial U^{j}}\nonumber\\ \displaystyle & = & \displaystyle -\frac{1}{J}\frac{\partial \langle L\rangle }{\partial U}\frac{\partial (a,b,{\it\theta})}{\partial (R,Y,Z)}-\frac{1}{J}\frac{\partial \langle L\rangle }{\partial V}\frac{\partial (a,b,{\it\theta})}{\partial (X,R,Z)}-\frac{1}{J}\frac{\partial \langle L\rangle }{\partial W}\frac{\partial (a,b,{\it\theta})}{\partial (X,Y,R)}\nonumber\\ \displaystyle & = & \displaystyle -A\frac{\partial (a,b,{\it\theta})}{\partial (R,Y,Z)}-B\frac{\partial (a,b,{\it\theta})}{\partial (X,R,Z)}-C\frac{\partial (a,b,{\it\theta})}{\partial (X,Y,R)}\end{eqnarray}$$

when (3.4) is used. Similarly, for $R=X,\,Y,\,Z$ , we obtain

(B 4a )

$$\begin{eqnarray}\displaystyle & \displaystyle a_{R}^{i}\frac{\partial \langle L\rangle }{\partial a_{X}^{i}}=-B\frac{\partial (a,b,{\it\theta})}{\partial (R,T,Z)}-C\frac{\partial (a,b,{\it\theta})}{\partial (R,Y,T)}+(E-UA-VB-WC)\frac{\partial (a,b,{\it\theta})}{\partial (R,Y,Z)}, & \displaystyle \nonumber\\ \displaystyle & & \displaystyle\end{eqnarray}$$

(B 4b )

$$\begin{eqnarray}\displaystyle & \displaystyle a_{R}^{i}\frac{\partial \langle L\rangle }{\partial a_{Y}^{i}}=-A\frac{\partial (a,b,{\it\theta})}{\partial (T,R,Z)}-C\frac{\partial (a,b,{\it\theta})}{\partial (X,R,T)}+(E-UA-VB-WC)\frac{\partial (a,b,{\it\theta})}{\partial (X,R,Z)}, & \displaystyle \nonumber\\ \displaystyle & & \displaystyle\end{eqnarray}$$

(B 4c )

$$\begin{eqnarray}\displaystyle & \displaystyle a_{R}^{i}\frac{\partial \langle L\rangle }{\partial a_{Z}^{i}}=-A\frac{\partial (a,b,{\it\theta})}{\partial (T,Y,R)}-B\frac{\partial (a,b,{\it\theta})}{\partial (X,T,R)}+(E-UA-VB-WC)\frac{\partial (a,b,{\it\theta})}{\partial (X,Y,R)}. & \displaystyle \nonumber\\ \displaystyle & & \displaystyle\end{eqnarray}$$

Using (B 3) and (B 4), the momentum equations are derived from (B 1) with $R=X,Y,Z$ in the form

(B 5a )

$$\begin{eqnarray}\displaystyle & -D_{T}A+E_{X}=AU_{X}+BV_{X}+CW_{X}+(Z+\langle {\it\zeta}^{(2)}\rangle ){\it\theta}_{X}, & \displaystyle\end{eqnarray}$$

(B 5b )

$$\begin{eqnarray}\displaystyle & -D_{T}B+E_{Y}=AU_{Y}+BV_{Y}+CW_{Y}+(Z+\langle {\it\zeta}^{(2)}\rangle ){\it\theta}_{Y}, & \displaystyle\end{eqnarray}$$

(B 5c )

$$\begin{eqnarray}\displaystyle & -D_{T}C+E_{Z}=AU_{Z}+BV_{Z}+CW_{Z}+(Z+\langle {\it\zeta}^{(2)}\rangle ){\it\theta}_{Z}. & \displaystyle\end{eqnarray}$$

Introducing the explicit forms (B 2) of $A,B,C$ and $E$ leads, after simplifications, to (3.14).

Appendix C. Alternative derivation

In this appendix we show that the QGPV equation (2.7b ) can be obtained directly from PV conservation. In this procedure GLM, glm and indeed any definition of the average $\langle {\bf\xi}^{(2)}\rangle$ gives the same leading-order dynamics because the associated mean-flow maps are $O({\it\alpha}^{2})$ close. The wave contributions to the mean dynamics come from different sources depending on the definition of the average, but their total effect is the same.

We start from the general Lagrangian equation (3.6). Taking the variation with respect to $P$ , we obtain

(C 1)

$$\begin{eqnarray}J=1+\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }\left(\langle {\bf\xi}^{(2)}\rangle -{\textstyle \frac{1}{2}}\langle {\bf\xi}^{(1)}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}{\bf\xi}^{(1)}\rangle \right).\end{eqnarray}$$

The relabelling symmetry of Lagrangian equation (3.6) gives PV conservation

(C 2)

$$\begin{eqnarray}D_{T}\left(\frac{\boldsymbol{{\rm\nabla}}{\it\theta}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}\times \boldsymbol{A}}{J}\right)=0,\end{eqnarray}$$

where $\boldsymbol{A}=(A,\,B,\,C)$ are defined as in (B 2) but with the Lagrangian equation (3.6) in place of (3.10) (Salmon 2013).

Under QG scaling and using the buoyancy equation (3.24) to replace $W$ in the above equation, we obtain

(C 3)

$$\begin{eqnarray}D_{T}^{0}\left(\frac{N^{2}(B_{X}-A_{Y})+f_{0}{\it\theta}_{Z}^{\prime }}{J}\right)-\frac{f_{0}}{N^{2}}D_{T}^{0}({\it\theta}^{\prime }(N^{2})_{Z})=0,\end{eqnarray}$$

where ${\it\theta}$ follows the definition (3.21). By substituting

(C 4)

$$\begin{eqnarray}B_{X}-A_{Y}=f_{0}+{\it\beta}Y+{\rm\nabla}^{2}{\it\psi}+\frac{\text{i}f_{0}}{2}\partial ({\it\chi}_{Z}^{\ast },{\it\chi}_{Z})+f_{0}\langle \partial _{x}{\it\xi}^{(2)}+\partial _{y}{\it\eta}^{(2)}\rangle ,\end{eqnarray}$$

and (C 1), we obtain the modified QGPV equation

(C 5)

$$\begin{eqnarray}D_{T}^{0}\left(f_{0}+{\it\beta}Y+{\rm\nabla}^{2}{\it\psi}+\partial _{Z}\left(\frac{f_{0}^{2}}{N^{2}}\partial _{Z}{\it\psi}\right)+\frac{\text{i}f_{0}}{2}\partial ({\it\chi}_{Z}^{\ast },{\it\chi}_{Z})+\frac{f_{0}}{2}\boldsymbol{{\rm\nabla}}\boldsymbol{\cdot }\langle {\bf\xi}^{(1)}\boldsymbol{\cdot }\boldsymbol{{\rm\nabla}}_{3}{\bf\xi}^{(1)}\rangle \right)=0,\end{eqnarray}$$

identical to (2.7b ) since the last term is equal to $fG({\it\chi}^{\ast },{\it\chi})$ . Note that the cancellation of the second-order mean displacements (term $\boldsymbol{{\rm\nabla}}_{3}\boldsymbol{\cdot }\langle {\bf\xi}^{(2)}\rangle$ ) indicates that this equation is independent of the specific averaging used to define the Lagrangian mean. In contrast, the individual wave contributions to the QGPV, namely the curl of the pseudomomentum (wave terms in (C 4)), the buoyancy term $N^{2}f_{0}\langle \partial _{Z}{\it\zeta}^{(2)}\rangle$ and the density correction (divergence in (C 1)) depend on the averaging used. A relation reducing to (C 5) for NIWs was derived by Holmes-Cerfon et al. (2011, their (3.10)) using GLM theory.

References

Andrews, D. G. & McIntyre, M. E. 1978 An exact theory of nonlinear waves on a Lagrangian-mean flow. J. Fluid Mech. 89 (4), 609–646.

Badulin, S. I. & Shrira, V. I. 1993 On the irreversibility of internal-wave dynamics due to wave trapping by mean flow inhomogeneities. Part 1. Local analysis. J. Fluid Mech. 251, 21–53.

Balmforth, N., Llewellyn-Smith, S. G. & Young, W. R. 1998 Enhanced dispersion of near-inertial waves in an idealised geostrophic flow. J. Mar. Res. 56, 1–40.

Berestetskii, V. B., Lifshitz, E. M. & Pitaevskii, L. P. 1982 Quantum Electrodynamics, 2nd edn. Pergamon.

Bokhove, O., Vanneste, J. & Warn, T. 1998 A variational formulation for barotropic quasi-geostrophic flows. Geophys. Astrophys. Fluid Dyn. 88, 67–79.

Bühler, O. 2009 Waves and Mean Flows. Cambridge University Press.

Bühler, O. & McIntyre, M. E. 1998 On non-dissipative wave–mean interactions in the atmosphere or oceans. J. Fluid Mech. 354, 301–343.

Bühler, O. & McIntyre, M. E. 2005 Wave capture and wave-vortex duality. J. Fluid Mech. 534, 67–95.

Cotter, C. J. & Reich, S. 2004 Adiabatic invariance and applications: from molecular dynamics to numerical weather prediction. BIT 44, 439–455.

Danioux, E., Vanneste, J. & Bühler, O. 2015 On the concentration of near-inertial waves in anticyclones. J. Fluid Mech. 773, R2.

Danioux, E., Vanneste, J., Klein, P. & Sasaki, H. 2012 Spontaneous inertia–gravity-wave generation by surface-intensified turbulence. J. Fluid Mech. 699, 153–173.

D’Asaro, E. A., Eriksen, C. C., Levine, M. D., Paulson, C. A., Niiler, P. & Meurs, P. V. 1995 Upper-ocean inertial currents forced by a strong storm. Part I. Data and comparisons with linear theory. J. Phys. Oceanogr. 25, 2909–2936.

Duhaut, T. H. A. & Straub, D. N. 2006 Wind stress dependence on ocean surface velocity: implications for mechanical energy input to ocean circulation. J. Phys. Oceanogr. 36, 202–211.

Falkovich, G., Kuznetsov, E. & Medvedev, S. 1994 Nonlinear interaction between long inertio-gravity and Rossby waves. Nonlinear Process. Geophys. 1, 168–171.

Ferrari, R. & Wunsch, C. 2009 Ocean circulation kinetic energy: reservoirs, sources, and sinks. Annu. Rev. Fluid Mech. 41, 253–282.

Fu, L.-L. 1981 Observations and models of inertial waves in the deep ocean. Rev. Geophys. Space Phys. 19, 141–170.

Garrett, C. 2001 What is the ‘near-inertial’ band and why is it different from the rest of the internal wave spectrum? J. Phys. Oceanogr. 31, 962–971.

Gertz, A. & Straub, D. N. 2009 Near-inertial oscillations and the damping of midlatitude gyres: a modeling study. J. Phys. Oceanogr. 39, 2338–2350.

Gjaja, I. & Holm, D. D. 1996 Self-consistent Hamiltonian dynamics of wave mean-flow interaction for a rotating stratified incompressible fluid. Physica D 98, 343–378.

Goldstein, H. 1980 Classical Mechanics, 2nd edn. Addison-Wesley.

Grimshaw, R. 1984 Wave action and wave–mean flow interaction, with application to stratified shear flows. Annu. Rev. Fluid Mech. 16, 11–44.

Holliday, D. & McIntyre, M. E. 1981 On potential energy density in an incompressible stratified fluid. J. Fluid Mech. 107, 221–225.

Holm, D. D., Schmah, T. & Stoica, C. 2009 Geometric Mechanics and Symmetry. Oxford University Press.

Holmes-Cerfon, M., Bühler, O. & Ferrari, R. 2011 Particle dispersion by random waves in the rotating Boussinesq system. J. Fluid Mech. 670, 150–175.

Hunter, J. K. & Ifrim, M. 2013 A quasi-linear Schrödinger equation for large amplitude inertial oscillations in a rotating shallow fluid. IMA J. Appl. Maths 78, 777–796.

Kunze, E. 1985 Near-inertial wave propagation in geostrophic shear. J. Phys. Oceanogr. 15, 544–565.

Lamb, H. 1932 Hydrodynamics, 6th edn. Cambridge University Press.

Medvedev, S. B. & Zeitlin, V. 2007 Turbulence of near-inertial waves in the continuously stratified fluid. Phys. Lett. A 371 (3), 221–227.

Mooers, C. N. K. 1975a Several effects of a baroclinic current on the cross-stream propagation of inertial-internal waves. Geophys. Fluid Dyn. 6, 245–275.

Mooers, C. N. K. 1975b Several effects of baroclinic currents on the three-dimensional propagation of inertial-internal waves. Geophys. Fluid Dyn. 6, 277–284.

Nikurashin, M., Vallis, G. K. & Adcroft, A. 2013 Routes to energy dissipation for geostrophic flows in the Southern Ocean. Nat. Geosci. 6, 48–51.

Oliver, M. 2006 Variational asymptotics for rotating shallow water near geostrophy: a transformational approach. J. Fluid Mech. 551, 197–234.

Salmon, R. 1988 Hamiltonian fluid mechanics. Annu. Rev. Fluid Mech. 20, 225–256.

Salmon, R. 2013 An alternative view of generalized Lagrangian mean theory. J. Fluid Mech. 719, 165–182.

Shepherd, T. G. 1990 Symmetries, conservation laws and Hamiltonian structure in geophysical fluid dynamics. Adv. Geophys. 32, 287–338.

Snyder, C., Muraki, D., Plougonven, R. & Zhang, F. 2007 Inertia–gravity waves generated within a dipole vortex. J. Atmos. Sci. 64, 4417–4431.

Soward, A. M. & Roberts, P. H. 2010 The hybrid Euler–Lagrange procedure using an extension of Moffatt’s method. J. Fluid Mech. 661, 45–72.

Vallis, G. K. 2006 Atmospheric and Oceanic Fluid Dynamics: Fundamentals and Large-Scale Circulation. Cambridge University Press.

Vanneste, J. 2013 Balance and spontaneous generation in geophysical flows. Annu. Rev. Fluid Mech. 45, 147–172.

Vanneste, J. 2014 Deriving the Young–Ben Jelloul model of near-inertial waves by Whitham averaging. arXiv:1410.0253.

Whitham, G. B. 1974 Linear and Nonlinear Waves. Wiley.

Wunsch, C. & Ferrari, R. 2004 Vertical mixing, energy, and the general circulation of the oceans. Annu. Rev. Fluid Mech. 36, 281–314.

Young, W. R. & Ben Jelloul, M. 1997 Propagation of near-inertial oscillations through a geostrophic flow. J. Mar. Res. 55 (4), 735–766.

Young, W. R., Tsang, Y.-K. & Balmforth, N. J. 2008 Near-inertial parametric subharmonic instability. J. Fluid Mech. 607, 25–49.

Zeitlin, V., Reznik, G. M. & Ben Jelloul, M. 2003 Nonlinear theory of geostrophic adjustment. Part 2. Two-layer and continuously stratified primitive equations. J. Fluid Mech. 491, 207–228.

Figure 1. Energy exchange in the slice model: the changes in the mean energy (solid line), NIW potential energy (dashed line) and total energy (dotted line) are shown as functions of time. These energy changes are normalised by the initial mean-flow energy in the mixed layer, $z\in [-50,0]~\text{m}$. The increase of NIW potential energy is offset by a mean energy loss, resulting in a total energy that is conserved up to a small hyperviscous dissipation added for numerical stability.

Figure 2. Wave amplitude $|\chi_{z}|$ (a–c) and change in the mean velocity $V=\psi_{x}$ (d–f) in the slice model; $|\chi_{z}|$ and $V$ are non-dimensionalised by $\alpha L$ and $U_{QG}$, respectively. The downward-propagating NIWs induce a mean-flow change, which slows down the original mean flow. Times: (a,d) $t=17.4$ days, (b,e) $t=34.7$ days, (c,f) $t=52.1$ days.

Figure 3. The same as figure 1 but for the simulation of a vortex dipole propagating in a field of vertically travelling NIWs. The energy changes are normalised by the initial mean-flow energy.

