1. Introduction
We are interested in the horizontal spreading of particles along stratification surfaces due to the advection by random small-amplitude gravity waves. On the one hand this process poses a fundamental problem in geophysical fluid dynamics in its own right (e.g. Herterich & Hasselmann Reference Herterich and Hasselmann1982; Sanderson & Okubo Reference Sanderson and Okubo1988; Weichman & Glazman Reference Weichman and Glazman2000; Balk, Falkovich & Stepanov Reference Balk, Falkovich and Stepanov2004; Balk Reference Balk2006), and on the other hand it might also be relevant to explain part of the actual tracer diffusion observed at very small horizontal scales in the ocean (e.g. Ledwell, Watson & Law Reference Ledwell, Watson and Law1998; Garrett Reference Garrett2006). The simplest quantity that can quantify this process is the one-particle diffusivity $D$ of Taylor (Reference Taylor1921), which in a horizontally isotropic situation can be computed based on a single Cartesian velocity component $U(t)$ , say, which the Lagrangian velocity following a fluid particle. If $U(t)$ is a stationary zero-mean random function with covariance function
Here $\mathbb{E}$ denotes probabilistic expectation and (1.1) makes obvious how the problem hinges on the statistics of the Lagrangian velocity field $U(t)$ .
In a recent paper, Bühler, Grisouard & Holmes-Cerfon (Reference Bühler, Grisouard and Holmes-Cerfon2013) (hereafter BGHC13), it was pointed out that the scaling of $D$ with wave amplitude $a\ll 1$ depends crucially on the presence of explicit wave damping. Specifically, in the unforced and undamped case studied in rotating shallow water in Bühler & Holmes-Cerfon (Reference Bühler and Holmes-Cerfon2009) and in a three-dimensional rotating Boussinesq system in Holmes-Cerfon, Bühler & Ferrari (Reference Holmes-Cerfon, Bühler and Ferrari2011), the diffusivity scaling was $D=O(a^{4})$ , whilst in the forced–dissipative nonlinear numerical simulations presented in BGHC13 it was clearly found that $D=O(a^{2})$ . This significant difference could be satisfactorily explained by considering toy models for the forced–dissipative wave dynamics based on stochastic differential equations (SDEs). In these toy models a single wave mode with frequency ${\it\omega}$ was forced by white noise and dissipated by linear damping. Such linear models can be solved exactly and this allowed the computation of $D$ directly from the theory, and good agreement with the numerical simulations was found.
However, as a model for the real ocean internal wave field, say, white-noise forcing and linear damping are of course poor models for the actual physical processes that underpin the spectral energy cascade through a broad spectrum of internal waves (e.g McComas & Bretherton Reference McComas and Bretherton1977; Munk Reference Munk, Warren and Wunsch1981). Hence the present paper investigates the robustness of the simple linear models by considering nonlinear variants of the damping function associated with a single wave mode. Crudely, such a damping function seeks to capture the nonlinear energy transfer from the given mode to all other modes of the wave spectrum, and there is no particularly good reason to assume that such a transfer should be linear in wave amplitude.
Our aim here is to investigate this issue by considering a family of nonlinear stochastic wave models (see (3.1) in § 3 below) that have an adjustable degree of nonlinearity indicated by a parameter $p\geqslant 0$ . The technically difficult part is that nonlinear SDEs cannot be solved directly anymore, so the covariance function $C(t)$ must be found by other means in order to compute $D$ . In this paper we investigate both an eigenfunction expansion based on the Kolmogorov backward equation (KBE) in § 2.1, which allows $C(t)$ to be computed exactly, and also a simple (but surprisingly accurate) approximation for $C(t)$ in § 2.2.
Overall, we find that the role of the nonlinearity strongly affects the scaling of $D$ with wave amplitude $a\gg 1$ , but that its explicit impact on $D$ can be removed if one chooses the expected energy flux and the expected kinetic wave energy as variables to describe the wave mode statistics. In fact, rather unexpectedly, in this way the linear theory can be generalized to nonlinear damping functions. This leads to a simple new expression for $D$ (see (3.7) and (3.10) in § 3 below) that generalizes the earlier results in BGHC13 and indicates their robustness. This simple new formula also allows the oceanic situation to be looked at in quantitative detail, as will be shown in § 4. Concluding remarks are offered in § 5.
2. Analysis of a nonlinear Ornstein–Uhlenbeck process
To introduce the analysis tools we first consider a nonlinear variant of the classical Ornstein–Uhlenbeck (OU) process in one and two dimensions and thereafter add the important physical feature of wave oscillations to the two-dimensional equations in § 3. The one-dimensional OU-process for a Lagrangian velocity component $U(t)$ is defined by the SDE
Here $W(t)$ is a standard Wiener process (Brownian motion) and the parameters ${\it\gamma}>0$ and ${\it\beta}\neq 0$ measure the strength of the damping and of the random forcing. The exponent $p\geqslant 0$ describes the form of the nonlinear damping term and for $p=0$ the process (2.1) reduces to the linear OU-process studied in BGHC13. For $p>0$ the instantaneous damping rate varies as ${\it\gamma}|U|^{p}$ and consequently the dimensions of ${\it\gamma}$ depend on $p$ (see (2.8a−d ) below). We will be most interested in the cases $p=1$ (quadratic nonlinearity) and $p=2$ (quadratic instantaneous damping rate). The former is suggested by the elementary quadratic nonlinearity of the fluid equations, whilst the latter is often invoked in models for a spectrum of inertia–gravity waves in the ocean (e.g. McComas & Bretherton Reference McComas and Bretherton1977; Munk Reference Munk, Warren and Wunsch1981).
In the linear case $p=0$ a closed-form solution for $U(t)$ is available, but this does not extend to the nonlinear case $p>0$ . Hence indirect methods are required in order to compute the covariance $C(t)$ and the diffusivity $D$ . First, for all values of $p\geqslant 0$ a stationary state of (2.1) can be readily computed (see § A.1), which is described by the probability density function (p.d.f.)
Here ${\it\Gamma}(\cdot )$ is the gamma function (see (A 3)). Compared to the linear case $p=0$ , the p.d.f. for $p>0$ is flattened near the origin and then decreases more sharply for larger values of $u$ , which indicates that the intermittency of $U$ depends on $p$ . The stationary moments of $U$ are
In particular, the variance
Notably, if $m=2+p$ then the identity ${\it\Gamma}(x+1)=x{\it\Gamma}(x)$ yields the simple result
This is the fluctuation–dissipation relation belonging to (2.1) and it can be derived directly from the SDE by multiplying it by $U$ to form an equation for the energy budget. It is peculiar to stochastic calculus that in this derivation one must retain quadratic terms such as $\text{d}U\,\text{d}U$ in order to get the correct answer (e.g. Gardiner Reference Gardiner1997). This yields
after using (2.1) and Itô’s formula in the form $\text{d}W\,\text{d}W=\text{d}t$ . The stationary expectation of (2.6) then yields (2.5) because $\mathbb{E}[U\,\text{d}W]=0$ . This makes it obvious that ${\it\beta}^{2}/2$ is the expected generation rate of energy per unit time, whilst ${\it\gamma}\mathbb{E}[|U|^{2+p}]$ is the expected energy dissipation rate per unit time, which balances the generation rate in a stationary state. The conventional notation for the energy dissipation rate is ${\it\epsilon}$ , so
This provides a simple and accurate formula that relates the amplitude of $U$ monotonically to the parameter ${\it\beta}^{2}/{\it\gamma}$ .
Before moving on we note how dimensional analysis can be used to define length and time scales $(L,T)$ from $({\it\beta},{\it\gamma})$ . Because $U$ is a velocity, this yields
The intrinsic velocity and diffusivity scales then follow as
For fixed variance and hence fixed $L/T=O(a)$ the diffusivity $D$ is simply proportional to $1/{\it\gamma}$ , which extends to all $p\geqslant 0$ , a result quoted for the linear OU-process in BGHC13. Interestingly, in the case of quadratic damping rate $p=2$ , the diffusivity becomes independent of the velocity amplitude. However, the nonlinear OU-process of course lacks any wave-like oscillations and hence not much relevance to oceanic waves can be read into these scaling relations.
2.1. Eigenfunction expansion for the covariance function
The exact covariance function can be determined via an eigenfunction expansion of the generator of the SDE (2.1), which is the differential operator
The generator $L$ is less frequently used in physical applications than its adjoint $L^{\dagger }$ , which appears in the Fokker–Planck equation (a.k.a. the Kolmogorov forward equation)
for the p.d.f. ${\it\rho}(u,t)$ of the process. The invariant p.d.f. ${\it\rho}(u)$ in (2.2) is in the null-space of the adjoint: $L^{\dagger }{\it\rho}=0$ . From a mathematical perspective the generator $L$ is fundamental and has a number of attractive properties (e.g. Oksendal Reference Oksendal2013), the most important of which for the present purpose is its role in the KBE. For an autonomous SDE such as (2.1) the KBE takes the following form: consider a function $f(u,t)$ defined for $t\geqslant 0$ by the initial-value problem
It then follows from probability theory that
In other words, if $f(u,t)$ solves the KBE problem in (2.12) then it equals the expected value of $g(U(t))$ at time $t\geqslant 0$ conditioned on $U(0)=u$ at the initial time $t=0$ . Here $g(\cdot )$ is an arbitrary function of $u$ . Hence the KBE allows time-dependent expected values to be computed directly, i.e. without finding the time-dependent p.d.f. of the process first, which is a significant simplification.
This can be used to compute $C(t)$ as follows. We first use the KBE to compute the expected value of $U(t)$ conditioned on starting at a specific location $U(0)=u$ . Thereafter we aggregate this result (times $u$ ) over all values of $u$ according to the invariant measure. In equations, for $t\geqslant 0$ this means
where $f(u,t)$ solves (2.12) with $g(u)=u$ . The actual computation of $f$ utilizes the eigenfunctions $y_{k}(u)$ and eigenvalues ${\it\lambda}_{k}$ of $L$ as defined by
Multiplication by ${\it\rho}(u)$ and using $L^{\dagger }{\it\rho}=0$ puts (2.15) into self-adjoint form:
This is a standard Sturm–Liouville problem with weight ${\it\rho}(u)$ , so the eigenfunctions are orthogonal under the inner product with that weight and can be normalized such that
From here it is straightforward to find $(y_{k},{\it\lambda}_{k})$ numerically by using a suitable software package such as MATSLISE for matlab (Ledoux, Daele & Berghe Reference Ledoux, Daele and Berghe2005). Because ${\it\rho}(u)$ decays rapidly at large $|u|$ this can be done using a finite computational domain.
Each eigenfunction generates a separable solution $y_{k}(u)\exp (-{\it\lambda}_{k}t)$ of the KBE and therefore $f(u,t)$ can be expanded as
From the second part of (2.14) the covariance and the diffusion are then given by
The main result (2.19) shows that the covariance is always a sum of exponential functions, albeit with different decay rates given by the eigenvalues ${\it\lambda}_{k}$ . Numerical results of this eigenfunction expansion for $p=1$ and $p=2$ are given in table 1 and illustrated in figure 1. The sum for $C(t)$ in (2.19) converges rapidly and this suggests a simple approximation based on a single exponential function $C_{\ast }(t)$ , say. This is pursued in the next section.
2.2. Exponential approximation for the covariance function
A natural idea for an approximate covariance function would be to truncate the sum in (2.19) after just one term. However, this would have little practical advantage because one would still need the eigenfunction expansion to compute the parameters $a_{1}$ and ${\it\lambda}_{1}$ . Also, such an approximation would not capture the exact variance $C(0)$ of the process, which is the sum over all $a_{k}^{2}$ . It is therefore a better idea to pose an exponential approximation for $C(t)$ at the outset and fit its parameters directly from the behaviour of the exact $C(t)$ near $t=0$ . This side-steps the need for an eigenfunction expansion altogether and turns out to be surprisingly accurate as well. We therefore consider
with suitable constants $(A,B)$ determined from the behaviour of $C(t)$ near $t=0$ . Specifically, we require
where the slope in the second equation is evaluated as $t\rightarrow 0+$ , say. This is necessary because $C^{\prime }$ jumps at $t=0$ . The exact slope at the origin can be computed by combining a Taylor expansion for small $\text{d}t>0$ with the fluctuation–dissipation relation (2.7) as follows:
The corresponding approximation for the auto-correlation time is
Of course, the functional expressions for $D_{\ast }$ and ${\it\tau}_{\ast }$ are merely as one would expect from dimensional analysis based on the kinetic energy $\mathbb{E}[U^{2}]/2$ and on the energy dissipation ${\it\epsilon}$ , but the interesting point is that these formulae yield the correct diffusivity values to within a few percent, as was seen in § 2.1. Notably, there are no pre-factors involving the nonlinearity parameter $p\geqslant 0$ , so in this approximation two processes with different values of $p$ but equal values of $\mathbb{E}[U^{2}]$ and ${\it\epsilon}$ will yield the same diffusivity. This is in fact a robust finding even for wave-like processes, as we shall see.
The comparison of $C(t)$ and $C_{\ast }(t)$ in figure 1 shows that the latter matches the former very well, with a small underestimate for large values of $t$ . Overall we find that the exponential approximation (2.24a,b ) is surprisingly accurate, with errors less than 5 % or so. This unexpected accuracy of the exponential approximation $C_{\ast }(t)$ allows us to simplify a great number of relations below, but we did check that using the exact $C(t)$ would yield essentially the same results in all cases we considered.
2.3. Two-dimensional nonlinear OU-process
A natural two-dimensional isotropic generalization of the process (2.1) is
Here $\boldsymbol{U}=(U,V)$ is the two-dimensional horizontal velocity vector and $(W_{1},W_{2})$ are two independent Wiener processes. No new dimensional parameters have been introduced so the dimensional analysis leading to (2.8a−d ) and below remains unchanged. The nonlinear drift is a gradient flow with isotropic potential
and hence the two-dimensional invariant p.d.f. has the form (e.g Gardiner Reference Gardiner1997)
Clearly, ${\it\rho}(u,v)$ depends only on the velocity magnitude $r=\sqrt{u^{2}+v^{2}}$ , but it does not factorize and hence $U$ and $V$ are not independent, except in the linear case $p=0$ . For example, the conditional expectations
for large $|u|$ . The remaining analysis works much as it did for the one-dimensional model. First, the stationary moments of $U$ are (see § A.1)
The variance is
Second, the fluctuation–dissipation relation is
Here ${\it\epsilon}$ is the energy dissipation per kinetic energy component $U^{2}/2$ or $V^{2}/2$ , so the total energy dissipation associated with $(U^{2}+V^{2})/2$ is twice that, or ${\it\beta}^{2}=2{\it\epsilon}$ . Third, the approximate covariance $C_{\ast }$ and diffusion $D_{\ast }$ take the same functional forms as in (2.24a,b ). This is because (2.22) adjusts to
using the fluctuation–dissipation relation once more. Finally, the KBE (2.12) now applies to functions $f(\boldsymbol{u},t)$ in an obvious fashion:
The eigenfunction expansion is now based on the two-dimensional generator
which in polar coordinates takes the form
The eigenfunctions of $L$ are separable in $(r,{\it\theta})$ so we can consider functions of the form $y_{k}(r)\exp (\text{i}n{\it\theta})$ with integer $n$ , which yields
Again, multiplication with the weight $r{\it\rho}(r)$ puts this equation into self-adjoint form:
The boundary condition at $r=0$ is
For any given $n$ the radial eigenfunctions are orthogonal and normalized such that
In order to compute the expected value of $U$ we consider
which is just the $n=1$ mode. Hence
and the $y_{k}$ are the eigenfunctions of (2.38) with $n=1$ . These are again easily found numerically. The covariance function sought is
We have computed the eigenfunction expansion for $p=1$ and $p=2$ but found the same result as in the one-dimensional case: the series for the diffusivity in (2.45) converges rapidly, the exponential approximation to it is accurate to within 5 %, and the exponential covariance $C_{\ast }(t)$ slightly underestimates the true $C(t)$ for large values of $t$ .
The summary of this investigation is that the exponential approximation gives an excellent prediction of the true covariance in both the one-dimensional and two-dimensional cases and that the value of the nonlinearity parameter $p\geqslant 0$ does not enter the expressions for $C_{\ast }(t)$ and $D_{\ast }$ when these quantities are expressed in terms of $\mathbb{E}[U^{2}]$ and ${\it\epsilon}$ . Notably, ${\it\epsilon}$ is the energy dissipation rate per energy component $U^{2}/2$ , so in the two-dimensional case the total dissipation range is $2{\it\epsilon}$ .
3. Two-dimensional nonlinear process with wave oscillation
We now augment the two-dimensional nonlinear process (2.26a ) with a simple model for a wave-like oscillation with frequency ${\it\omega}\geqslant 0$ :
In the context of inertia–gravity waves this model can be viewed either as a model for a clockwise inertial oscillation based on a Coriolis parameter $f={\it\omega}$ , or as a generic evolution equation for a wave mode in spectral space with frequency ${\it\omega}$ , say. The former is directly relevant to storm-forced oscillations in the upper ocean whilst the latter is relevant to nonlinear forced–dissipative inertia–gravity wave models as in § 4 of BGHC13. Either way, in terms of the intrinsic scales $(L,T)$ based on $({\it\beta},{\it\gamma})$ in (2.8a−d ), the new frequency ${\it\omega}$ introduces a non-dimensional parameter ${\it\omega}T$ , which significantly weakens the conclusions that can be drawn from dimensional analysis alone.
3.1. Covariance reduction to the non-oscillatory case
The linear case $(p=0)$ was considered in BGHC13 and there the outcome was that the covariance of $U$ for ${\it\omega}\neq 0$ was simply $\cos ({\it\omega}t)$ times the covariance of $U$ for ${\it\omega}=0$ . However, this result was derived using the exact solution for the linear system, which does not extend to the nonlinear case. Hence we need to check carefully how the nonlinearity affects the situation. First of all, for all values of ${\it\omega}$ the system (3.1) has the same invariant measure (2.28) because the oscillatory drift $({\it\omega}V,-{\it\omega}U)$ is along lines of constant potential ${\it\Phi}$ in (2.27). Hence although the Fokker–Planck equation for (3.1) contains new terms, these terms are zero if evaluated on the invariant measure (2.28). For further analysis it is convenient to replace $\boldsymbol{U}$ by the back-rotated complex variable
The system (3.1) then takes the form
Here the second equality holds in the sense that the unitary factor $\exp (\text{i}{\it\omega}t)$ can be absorbed without loss of generality in the definition of the complex-valued white noise. This is because the generator of the respective diffusions is identical (see § A.2). But this means that $(A_{r},A_{i})$ satisfy exactly the ${\it\omega}=0$ nonlinear equations (2.26a ), for which we know how to solve for the covariance structure. Hence we can write
where the subscripts indicate whether the expectation is taken with respect to the dynamics with ${\it\omega}\neq 0$ or with ${\it\omega}=0$ . The key question is whether $\mathbb{E}_{0}[U(0)V(t)]=0$ . In the linear case studied in BGHC13 this term was zero because if ${\it\omega}=0$ then $U$ and $V$ were independent zero-mean functions. In the nonlinear case $p>0$ they are not independent, but it is still true that their cross-covariance is zero at all time lags. This can either be seen by inspection of the symmetries of the ${\it\omega}=0$ equations, or it can be derived formally from the KBE (2.34). In the latter case we would integrate a solution starting from $f(r,{\it\theta},0)=v=r\sin {\it\theta}$ against a term $u=r\cos {\it\theta}$ in (2.43), which vanishes after integration over ${\it\theta}$ . The upshot is that
holds exactly for all values of $p$ . From (2.43) we find that
This infinite sum for $D$ generalizes the single-term expression (3.10) derived in BGHC13 for the linear case. The corresponding approximate versions are
is the energy damping time scale and ${\it\epsilon}={\it\beta}^{2}/2$ . As before, the dependence on the nonlinearity parameter $p$ has disappeared when the model parameters $({\it\beta},{\it\gamma})$ are eliminated in favour of $\mathbb{E}[U^{2}]$ and ${\it\epsilon}$ . Notably, if ${\it\omega}\neq 0$ then the auto-correlation time
is always shorter than the energy damping time scale ${\it\tau}^{E}$ .
3.2. High-frequency waves
The practically relevant regime in the ocean is that of high-frequency waves in the sense that the wave period is short compared with the energy decay time scale of the wave field. This corresponds to
and then we obtain the limiting forms
Finally, the fluctuation–dissipation relation (2.32) applied to (3.10) yields the simple scalings
with respect to the $O(a)$ wave amplitude. This also recovers the linear results of BGHC13 if $p=0$ .
4. Ocean internal waves
We now investigate how these simple stochastic models for wave-induced diffusion might apply to the kind of small-scale horizontal diffusion in the ocean that was first observed in the NATRE experiment (Ledwell et al. Reference Ledwell, Watson and Law1998) and subsequently confirmed in a number of additional observational campaigns (e.g. Sundermeyer & Ledwell Reference Sundermeyer and Ledwell2001; Shcherbina et al. Reference Shcherbina, Sundermeyer, Kunze, D’Asaro, Badin, Birch, Anne-Marie, Callies, Kuebel, Brandy and Claret2014). A suitable reference point is the Garrett–Munk (GM) spectrum for the observed background ocean spectrum of internal waves (e.g. Munk Reference Munk, Warren and Wunsch1981).
Three salient elements are important here: first, there is a systematic nonlinear energy flux ${\it\epsilon}_{GM}$ , say, that flows through the modal components of the GM-spectrum from small to large vertical wavenumbers. There are several different physical mechanisms for this flux, such as resonant triad interactions or the differential advection of small-scale waves by large-scale waves, but the upshot is that the energy flux ${\it\epsilon}_{GM}$ eventually leads to instability of the highest vertical wavenumber waves and therefore to wave breaking and wave dissipation. The small vertical wavenumber waves do not break, but their oscillatory dynamics is of course affected by the energy flux ${\it\epsilon}_{GM}$ flowing through them. This is the kind of nonlinear energy flux through the spectrum that is modelled by the forced–dissipative balance in our model.
Second, all evidence suggests that ${\it\epsilon}_{GM}$ is quadratic in the wave energy density $E_{GM}$ , say, of the GM-spectrum. In other words, the instantaneous damping rate of the wave field is roughly proportional to its energy. In the context of our model this selects the value $p=2$ for the nonlinearity parameter in (3.1) and consequently $D=O(a^{4})$ according to (3.11c ). This is the same scaling of $D$ with wave amplitude as in the unforced inviscid weakly nonlinear study in Holmes-Cerfon et al. (Reference Holmes-Cerfon, Bühler and Ferrari2011), but the pre-factor is very different, as we shall see.
Third, typical values for $E_{GM}$ and ${\it\epsilon}_{GM}$ at equilibrium in the main ocean thermocline are (e.g. Munk Reference Munk, Warren and Wunsch1981; Watanabe & Hibiya Reference Watanabe and Hibiya2005)
This, together with ${\it\omega}\geqslant f$ , shows that the high-frequency regime (3.9) is clearly relevant here, because the energy decay time is of the order of a week, which is long compared to the wave periods.
Combing these three elements we assume that there is a common ${\it\epsilon}\propto {\it\epsilon}_{GM}$ for the modal components of the wave spectrum, that $p=2$ , and that (3.10) together with (4.1) suggests that the wave-induced particle diffusion is dominated by the near-inertial waves ${\it\omega}\approx f$ . Notably, in the GM-spectrum the near-inertial waves are also the most energetic, but the present argument is based on the constant spectral energy flux ${\it\epsilon}$ rather than the variable spectral energy density. Let us consider only the diffusion induced by waves with ${\it\omega}\approx f$ . To use (3.10) for the diffusion we need to relate ${\it\epsilon}$ to ${\it\epsilon}_{GM}$ . For near-inertial waves the wave energy is dominated by the horizontal kinetic energy and therefore ${\it\epsilon}_{GM}$ is the energy flux associated with $U^{2}/2+V^{2}/2$ . Our ${\it\epsilon}$ is the energy flux associated with $U^{2}/2$ alone, so by isotropy we can argue that
using $f=0.0001~\text{s}^{-1}$ and (4.1). On the one hand, this is larger by a factor of ten or so than the $O(a^{4})$ inviscid estimate in Holmes-Cerfon et al. (Reference Holmes-Cerfon, Bühler and Ferrari2011) (or to the similarly small estimate based on shear dispersion, see Young, Rhines & Garrett Reference Young, Rhines and Garrett1982). On the other, it is still somewhat modest when compared to the observed small-scale diffusivity value of $1{-}2~\text{m}^{2}~\text{s}^{-1}$ in the NATRE experiment (Ledwell et al. Reference Ledwell, Watson and Law1998).
An alternative formula for $D$ that links it directly to the diapycnal diffusivity ${\it\kappa}$ can also be derived, although this runs the risk of making quantities look related that physically have little to do with each other. This is based on the often-used empirical formula
where ${\it\Gamma}_{m}$ is the mixing efficiency of stratified turbulence and $N$ is the buoyancy frequency. Using the typical values ${\it\kappa}=10^{-5}~\text{m}^{2}~\text{s}^{-1}$ , ${\it\Gamma}_{m}=0.2$ , and $N=0.01~\text{s}^{-1}$ then recovers the numerical estimate for $D$ in (4.2).
As noted above, this makes it look as if there was a close link between shear dispersion and the dispersion process considered here, because modulo the pre-factor (4.3) turns out to be identical to the deep-ocean version for shear dispersion quoted in § 4d of Young et al. (Reference Young, Rhines and Garrett1982). But we believe this is not the case. For example, one could imagine a situation where wave energy sloshes back and forth between different wave modes in a conservative fashion, say between the members of a resonant internal wave triad. In this situation there would be zero shear dispersion because there is no vertical diapycnal diffusion to begin with. However, the process described in this paper would lead to a non-zero horizontal dispersion because the wave–wave interactions upset the exquisite reversibility of the linear wave motion, with consequent particle dispersion in the horizontal.
Finally, it is tempting to speculate on what would happen to these estimates if one were to take the diffusivity due to waves with other frequencies than ${\it\omega}=f$ into account. For example, if this were a purely additive process then one could add up contributions of the type ${\it\epsilon}/{\it\omega}^{2}$ for each octave in frequency between ${\it\omega}=f$ and ${\it\omega}=N$ . However, the rapid decay of $1/{\it\omega}^{2}$ with increasing ${\it\omega}$ suggests that this would not make a material difference to the estimate in (4.2); numerically we found that it might increase the estimate by some 30 % or so, which is not a big difference.
Overall it seems that internal-wave-induced dispersion can make a noticeable but modest contribution to the observed small-scale horizontal diffusivity in the ocean, leaving the bulk of the phenomenon to be attributed to other processes such as the small-scale tail of the potential-vorticity-controlled balanced flow (e.g. Polzin & Ferrari Reference Polzin and Ferrari2004). Interestingly, as suggested by a referee, the stochastic models studied here with frequency ${\it\omega}=0$ might be a useful modelling tool for the diffusion achieved by such balanced flows.
5. Concluding remarks
In hindsight it is easy to see the connection between the $D=O(a^{2})$ scaling of BGHC13 and the $D=O(a^{4})$ scaling obtained here: in the high-frequency regime the former estimate for $D$ was multiplied by the linear damping rate and divided by the wave frequency squared. The present estimate allows for a nonlinear amplitude-dependent damping rate of size $1/{\it\tau}^{E}=O(a^{p})$ (cf. (3.11a )), and hence for $p=2$ the scaling $D=O(a^{4})$ is consistent with the BGHC13 result. As noted before, this is the same scaling as in the inviscid theory in Holmes-Cerfon et al. (Reference Holmes-Cerfon, Bühler and Ferrari2011), but with a much larger pre-factor.
The main restriction of the present toy model is of course that it considers only a single wave mode in isolation, with the presence of all other modes crudely represented by white-noise forcing and nonlinear damping. It would be attractive to have a firm theoretical approach at hand to improve on this, but applications of many-mode theories such as weak wave turbulence theory have been difficult to carry out successfully for internal waves, especially in the presence of the strong Coriolis forces that are inevitable in oceanic applications. This remains an open theoretical question.
We have experimented with direct nonlinear numerical simulations of the rotating shallow-water equations under the constraint of uniform potential vorticity in order to discern the relevant statistics for a single wave mode in the spectrum, but this has not led to clear results so far.
An important theoretical point in this connection is that the wave–wave interactions that are crudely modelled by the nonlinear damping term in (3.1) do not impact on the potential vorticity of the fluid at all. This is in stark contrast to viscous wave dissipation or nonlinear wave breaking, both of which of course do change the potential vorticity locally (e.g. Bühler Reference Bühler2000). A consistent theoretical many-mode model that encompasses both wave and vortex modes would therefore only allow for actual wave dissipation on the smallest scales to impact on the potential-vorticity-controlled vortex flow. In other words, the bulk of the wave energy cascade through a spectrum of internal waves is invisible to the potential vorticity even though there is effective forcing and damping of each wave mode in the cascade. Clearly, it would be highly desirable to have a simple ocean spectral wave and vortex model that is functioning under these constraints. This would be useful not just for studies of diffusivity, but also for the exchange of energy between waves and vortices in general.
Acknowledgements
We thank M. Holmes-Cerfon for useful discussions and the anonymous referees for several helpful suggestions. Financial support under the United States National Science Foundation grant DMS-1312159 is gratefully acknowledged.
Appendix A
A.1. Computations involving invariant measures
The Fokker–Planck equation for the p.d.f. ${\it\rho}(u,t)$ belonging to (2.1) is (e.g. Gardiner Reference Gardiner1997)
and it follows that (2.2) is its time-independent equilibrium solution. The value of $Z$ follows from the integral identity
which is valid for $a,b>0$ and $c>-1$ . Here ${\it\Gamma}(\cdot )$ is the gamma function defined by
To derive (2.30) for the two-dimensional system (2.26a ) we start with polar coordinates such that $u=r\cos {\it\theta}$ and hence
The azimuthal integral is
and the radial part is an instance of (A 2). Combining with the second equation in (2.28) then yields (2.30).
A.2. Equivalent diffusions in (3.3)
We ignore the identical drift terms. If $n$ random variables are forced by $m$ noise terms via
then the generator is given by (e.g. Oksendal Reference Oksendal2013)
In the present case $m=n=2$ , $(X_{1},X_{2})=(A_{r},A_{i})$ , and hence (3.3) holds because