1. Introduction
Flow control is a challenging problem both from the academic perspective and in terms of concrete applications, where problems such as laminar-to-turbulent transition, drag reduction and flow-induced vibration are of practical concern (Rowley et al. Reference Rowley, Williams, Colonius, Murray and Macmynowski2006; Kim & Bewley Reference Kim and Bewley2007; Bagheri & Henningson Reference Bagheri and Henningson2011; Luhar, Sharma & McKeon Reference Luhar, Sharma and McKeon2014; Brunton & Noack Reference Brunton and Noack2015; Jin, Illingworth & Sandberg Reference Jin, Illingworth and Sandberg2020). The difficulties arise from the nonlinearity of the flow dynamics, which is often circumvented by considering a linearized system. However, it is then not clear how the neglected nonlinear terms should be modelled, and effective methods to determine the optimal location of sensors and actuators are still a topic of research. Among the linear control strategies presently available, inverse feed-forward, wave-cancellation, optimal and robust control are popular choices. Throughout this work, by optimal and robust control we mean the corresponding linear control approaches.
In flows dominated by convection, wave cancellation has been proposed as a simple-but-effective control strategy. Waves are identified by upstream sensors, and actuators situated between the sensors and targets act to minimize perturbations at the target location (Sasaki et al. Reference Sasaki, Tissot, Cavalieri, Silvestre, Jordan and Biau2016). Although the approach is not guaranteed to be causal, in so far as computation of the actuation signal may require future sensor readings, causality can be imposed by ignoring the non-causal part of the control kernel. Such truncation has been shown to closely reproduce optimal control when there is sufficient distance between sensors, actuators and targets (Sasaki et al. Reference Sasaki, Morra, Fabbiane, Cavalieri, Hanifi and Henningson2018a). While truncation can otherwise significantly reduce the effectiveness of the controller (Brito et al. Reference Brito, Morra, Cavalieri, Araújo, Henningson and Hanifi2021), the approach has been successfully used in numerous studies (Hanson et al. Reference Hanson, Bade, Belson, Lavoie, Naguib and Rowley2014; Sasaki et al. Reference Sasaki, Tissot, Cavalieri, Silvestre, Jordan and Biau2018b; Maia et al. Reference Maia, Jordan, Cavalieri, Martini, Sasaki and Silvestre2021). A similar approach was used by Luhar et al. (Reference Luhar, Sharma and McKeon2014), where opposition control based on the resolvent operator was performed. Adaptive control strategies are often similar to wave-cancellation approaches, but use additional downstream sensors to adapt the control law to changes in the flow (Fabbiane et al. Reference Fabbiane, Semeraro, Bagheri and Henningson2014; Simon et al. Reference Simon, Fabbiane, Nemitz, Bagheri, Henningson and Grundmann2016).
Optimal control, on the other hand, minimizes a quadratic cost functional (Bagheri et al. Reference Bagheri, Henningson, Hœpffner and Schmid2009b), frequently associated with the mean perturbation energy. This approach provides a maximal reduction of perturbation energy and has been used to control flow instabilities (Bewley & Liu Reference Bewley and Liu1998) or to reduce the receptivity of a flow to disturbances (Barbagallo, Sipp & Schmid Reference Barbagallo, Sipp and Schmid2009; Semeraro et al. Reference Semeraro, Bagheri, Brandt and Henningson2011; Juillet, Schmid & Huerre Reference Juillet, Schmid and Huerre2013; Juillet, McKeon & Schmid Reference Juillet, McKeon and Schmid2014; Morra et al. Reference Morra, Sasaki, Hanifi, Cavalieri and Henningson2020; Sasaki et al. Reference Sasaki, Morra, Cavalieri, Hanifi and Henningson2020), to delay transition to turbulence, for instance. However, the robustness of optimal control may be hindered by feedback between actuators and sensors and/or by small errors in the flow model. In some cases, such issues may even cause the control law to further destabilize the system. While the model can be accurately known in some scenarios, for instance, when simulating transitional flows subject to small disturbances, applications in off-design conditions or in cases where nonlinear dynamics are important will generally result in a reduction of the accuracy with which the linear model represents the physical system. Robust control allows a balance to be struck between cost-functional reduction and control robustness: at the cost of achieving a lower energy reduction than optimal control, robust control can tolerate larger modelling errors. This kind of approach has been used in several recent studies (Dahan, Morgans & Lardeau Reference Dahan, Morgans and Lardeau2012; Jones et al. Reference Jones, Heins, Kerrigan, Morrison and Sharma2015; Jin et al. Reference Jin, Illingworth and Sandberg2020). Robust control is particularly important for the control of oscillator systems, where actuators are frequently situated upstream of sensors (Bagheri et al. Reference Bagheri, Henningson, Hœpffner and Schmid2009b). For this kind of configuration, optimal control is typically not robust (Schmid & Sipp Reference Schmid and Sipp2016).
There are, however, many scenarios where optimal control is robust. Schmid & Sipp (Reference Schmid and Sipp2016) showed that this is the case for amplifier flows, where sensors are typically located upstream of actuators. Semeraro et al. (Reference Semeraro, Bagheri, Brandt and Henningson2011) arrived at a similar conclusion for the control of boundary layer disturbances, where the optimal control was found to be robust to changes in Reynolds numbers and pressure gradients. Our study is focused on this scenario: the development of optimal control for amplifier flows.
The usual approach to obtain optimal control laws involves solution of Riccati equations. The computational cost grows rapidly with the number of degrees of freedom (DOFs) of the system. As problems in fluid mechanics typically have many DOFs, methods based on the solution of Riccati equations are impractical. A standard way of dealing with this issue is to base control design on reduced-order models (ROMs) (Bagheri et al. Reference Bagheri, Henningson, Hœpffner and Schmid2009b). Several bases have been used to obtain ROMs, such as proper orthogonal decomposition modes (Noack & Eckelmann Reference Noack and Eckelmann1993), eigenmodes (Åkervik et al. Reference Åkervik, Hœpffner, Ehrenstein and Henningson2007), and balanced modes (Bagheri et al. Reference Bagheri, Henningson, Hœpffner and Schmid2009b; Barbagallo et al. Reference Barbagallo, Sipp and Schmid2009). The eigensystem realization algorithm (ERA) (Juang & Pappa Reference Juang and Pappa1985) was shown to be equivalent to a ROM based on balanced modes, with only a fraction of the costs when external disturbances are low rank (Ma, Ahuja & Rowley Reference Ma, Ahuja and Rowley2011). A drawback of such techniques is that control laws obtained from these ROMs are not guaranteed to be optimal when applied to the full system. Model reduction with balanced modes has upper-error bounds for modelling open-loop systems, but even when the open-loop system is accurately represented by the ROM, a control law based on the ROM can be ineffective when applied to the original problem (Åström & Murray Reference Åström and Murray2010, p. 349).
Several approaches exist for obtaining estimation and control laws for the full system. Optimal estimation and control gains for the full system can be obtained iteratively (Semeraro et al. Reference Semeraro, Pralits, Rowley and Henningson2013; Luchini & Bottaro Reference Luchini and Bottaro2014). This, however, requires integration of a large auxiliary system for real-time application, which adds significant cost when used in a numerical simulation and is likely unfeasible for experimental implementation. Alternatively, the control law can be reduced a posteriori (design-then-reduce), instead of being designed based on a ROM (reduce-then-design). Another approach applicable to the full system is to estimate the state using an ensemble Kalman filter (Colburn, Cessna & Bewley Reference Colburn, Cessna and Bewley2011). However, this requires the integration of multiple realizations of the full system, which adds significant computational cost when applied to numerical solutions.
Control strategies are highly dependent on actuator and sensor placement, and the optimal choice is often unclear. As ROMs are frequently derived for a specific set of sensors and actuators, studies of the role of their placement for control often rely on the derivation of multiple ROMs. An example is the study of Belson et al. (Reference Belson, Semeraro, Rowley and Henningson2013), in which an ERA ROM was required for each sensor position. This illustrates how the reliance on ROMs can make studies of sensor and actuator placement costly. As sensor and actuator positioning substantially impacts the control strategy (Ilak & Rowley Reference Ilak and Rowley2008; Illingworth, Morgans & Rowley Reference Illingworth, Morgans and Rowley2011; Belson et al. Reference Belson, Semeraro, Rowley and Henningson2013; Freire et al. Reference Freire, Cavalieri, Silvestre, Hanifi and Henningson2020), it is of interest to be able to obtain control laws without having to rely on ROMs for each possible choice of sensor and actuator.
Most approaches used for flow control rely on simplified assumptions about the forcing, which is frequently modelled as white-in-time noise. Although complex spatiotemporal forcing colour can be used in the Kalman-filter framework, the approach requires use of an expanded system that filters a white-noise input to create a coloured noise, with the extra assumption that the forcing cross-spectral density (CSD) is a rational function of the frequency (Åström & Wittenmark Reference Åström and Wittenmark2013). Simplified forcing-colour models are thus typically used to avoid this complexity. However, it has been shown that the use of realistic spatiotemporal forcing colour is crucial for accurate estimation of complex flows (Chevalier et al. Reference Chevalier, Hœpffner, Bewley and Henningson2006; Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020; Amaral et al. Reference Amaral, Cavalieri, Martini, Jordan and Towne2021). This is an indication that control can be considerably enhanced if realistic forcing models are used.
Another approach to estimation and control is possible using the Wiener–Hopf formalism. An optimal non-causal estimation method, known as a Wiener filter (Wiener Reference Wiener1942; Barrett & Moir Reference Barrett and Moir1987), and optimal control strategies, known as Wiener regulators (Youla, Jabr & Bongiorno Reference Youla, Jabr and Bongiorno1976; Grimble Reference Grimble1979; Moir & Barrett Reference Moir and Barrett1989), can be obtained based on CSDs between sensors, actuators and flow states. Although well described in the control literature, their potential for flow control has not been appropriately explored. To the best of the authors’ knowledge, Martinelli (Reference Martinelli2009) is the only work in which the formalism has been used for flow control. Difficulties in solving the associated Wiener–Hopf problems limited that study to the use of one sensor and one actuator, and the significant cost of converging the CSDs probably hindered further application of the method.
Wiener–Hopf problems appear when causality constraints are imposed on the estimation and control kernels. The Wiener–Hopf method has been used in the fluid mechanics community to obtain solutions to linear problems with spatial discontinuities, such as acoustic scattering by edges (Noble Reference Noble1959; Peake Reference Peake2004). Solutions to this class of problem are typically based on a factorization of the Wiener–Hopf kernel into components that are regular on the upper/lower halves of the complex frequency plane. Such factorization can be achieved analytically only for scalar problems (Crighton & Leppington Reference Crighton and Leppington1970; Peake Reference Peake2004) and for some special classes of matrices (Daniele Reference Daniele1978; Rawlins & Williams Reference Rawlins and Williams1981). Lacking general analytical solutions, several numerical approaches can be used (Tuel Reference Tuel1968; Daniele & Lombardi Reference Daniele and Lombardi2007; Atkinson & Shampine Reference Atkinson and Shampine2008; Kisil Reference Kisil2016). The potential of the method for estimation and control comes from the fact that the size of the matrices to be factorized scales with the number of sensors and actuators, in contrast to control strategies based on solutions of algebraic Riccati equations that scale with the system's size.
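Schematically, such a factorization writes a matrix kernel $\hat{{{\boldsymbol{\mathsf{K}}}}}(\omega)$ as
\[
\hat{{{\boldsymbol{\mathsf{K}}}}}(\omega) = \hat{{{\boldsymbol{\mathsf{K}}}}}_-(\omega)\,\hat{{{\boldsymbol{\mathsf{K}}}}}_+(\omega),
\]
with each factor (and its inverse) regular in one half of the complex-frequency plane; the notation here is generic and is made precise in § 2 and Appendix A.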
While the use of a linear control strategy for nonlinear systems has clear limitations, i.e. it is expected to work only when the disturbances around a reference flow can be reasonably approximated by a linear model, there are many examples in the literature (several of which are cited above) that show how the approach is useful in a broad variety of flows.
The objective of our study is to obtain a method for real-time estimation and control that avoids some of the drawbacks of previous approaches. To achieve this, we strategically combine three pre-existing tools: the Wiener–Hopf regulator for flow control (Martinelli Reference Martinelli2009); a numerical method to solve matrix Wiener–Hopf problems (Daniele & Lombardi Reference Daniele and Lombardi2007); and matrix-free methods for obtaining the resolvent operator of large systems (Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020, Reference Martini, Rodríguez, Towne and Cavalieri2021; Farghadan et al. Reference Farghadan, Towne, Martini and Cavalieri2021). Combining these tools provides a novel method that can, for the first time, simultaneously handle complex forcing colour and be applied directly to large systems without the need for a priori model reduction, allowing parametric investigation of sensor/actuator placement at low cost when forcing/targets are low rank. To the best of the authors’ knowledge, this is also the first time that the Wiener–Hopf regulator is constructed from first principles, i.e. from the linearized equations of motion and a model of the forcing, as in the linear quadratic Gaussian (LQG) framework. This construction provides physical insights on the structures that can be estimated and controlled. The approach can also be used to improve wave-cancelling strategies (Li & Gaster Reference Li and Gaster2006; Fabbiane et al. Reference Fabbiane, Semeraro, Bagheri and Henningson2014; Sasaki et al. Reference Sasaki, Morra, Fabbiane, Cavalieri, Hanifi and Henningson2018a; Brito et al. Reference Brito, Morra, Cavalieri, Araújo, Henningson and Hanifi2021; Maia et al. Reference Maia, Jordan, Cavalieri, Martini, Sasaki and Silvestre2021), minimizing the effect of kernel truncation.
The method can be viewed as an extension of the resolvent-based estimation methods recently developed by Towne, Lozano-Durán & Yang (Reference Towne, Lozano-Durán and Yang2020) and Martini et al. (Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020). In the former study, a resolvent-based approach was developed to estimate space–time flow statistics from limited measurements. The central idea is to use the measurements to approximate the nonlinear terms that act as a forcing of the linearized Navier–Stokes equations, which in turn provide an estimate of the flow state upon application of the resolvent operator in the frequency domain. The latter study extends this resolvent-based methodology to obtain optimal estimates of the time-varying flow state and forcing. The estimator is derived in terms of transfer functions between the measurements and the forcing terms and can be written in terms of the resolvent operator and the CSD of the forcing. However, these methods are nominally non-causal, making them appropriate for flow reconstruction but limiting their applicability for flow control.
The method developed in the current paper follows that of Martini et al. (Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020), but with additional constraints to enforce causality. It is these constraints that lead to the aforementioned Wiener–Hopf problem. The causality of the new resolvent-based estimator makes it applicable for real-time estimation, and we use a similar approach to develop an optimal resolvent-based controller. The resulting estimation and control methods are thus obtained directly for the full-rank flow system, without requiring ROMs, and they make use of the spatiotemporal forcing statistics, thus avoiding the simplified forcing assumptions that can lead to significant reductions in performance.
The paper is structured as follows. The derivation of optimal estimation and control kernels based on the Wiener–Hopf formalism are constructed in § 2, and solutions are compared to those obtained from the algebraic Riccati equation using a linearized Ginzburg–Landau problem. Implementation of the method using numerical integration of the linearized system and using experimental data are described in § 3. An application to flow over a backward-facing step is presented in § 4. Final conclusions are drawn in § 5. An introduction to Wiener–Hopf problems (which appear in § 2) and their solution is presented in Appendix A.
2. Estimation and control using Wiener–Hopf methods
In what follows, we first define the linear system considered and then derive optimal estimation and control kernels based on the Wiener–Hopf approach. Optimality here is defined in terms of quadratic cost functionals. A procedure to recover estimation and control gains from the proposed framework is also presented. The derivation for full-state control is presented in Appendix B.
2.1. System definition
We consider the linear time-invariant system
where $\boldsymbol {u} \in \mathbb {C}^{n_u}$ represents the flow state, $\boldsymbol {f} \in \mathbb {C}^{n_f}$ is an unknown stochastic forcing, which can represent external disturbances and/or nonlinear interactions (McKeon & Sharma Reference McKeon and Sharma2010), $\boldsymbol {a}\in \mathbb {C}^{n_a}$ represents flow actuation used for control, $\boldsymbol {y} \in \mathbb {C}^{n_y}$ is a set of system observables, $\boldsymbol {n} \in \mathbb {C}^{n_y}$ is measurement noise and $\boldsymbol {z} \in \mathbb {C}^{n_z}$ is a set of targets for the control problem, for instance, perturbations at a given position or surface loads, to be minimized. The system evolution is described by the matrix ${{\boldsymbol{\mathsf{A}}}} \in \mathbb {C}^{n_u\times n_u}$, representing the linearized Navier–Stokes operator. The matrices ${{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {f}}} \in \mathbb {C}^{n_u\times n_f}, {{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {a}}} \in \mathbb {C}^{n_u\times n_a}, {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {y}}} \in \mathbb {C}^{n_y\times n_u}$ and ${{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}} \in \mathbb {C}^{n_z\times n_u}$ determine the spatial support of external disturbances, actuators, sensors and targets, as in Bagheri, Brandt & Henningson (Reference Bagheri, Brandt and Henningson2009a). Forcing and sensor noise are modelled as stochastic zero-mean processes with two-point space–time correlations given by
with ${\dagger}$ representing the adjoint operator using a suitable inner product and $\langle \cdot \rangle$ representing the ensemble average. We emphasize that these forcing and noise statistics are more general than those assumed in the derivation of the Kalman filter and LQG control, in which they must be uncorrelated in time. A frequency-domain representation of the two-point correlation is given by the CSDs
Note that this differs from the standard definition, i.e. $\langle \hat {\boldsymbol {f}} \hat {\boldsymbol {f}}^{H} \rangle$, with $H$ representing the conjugate transpose of the matrices; this modified definition simplifies the derivations that follow.
Throughout this work, we assume the system to be stable, that is, all eigenvalues $\lambda$ of ${{\boldsymbol{\mathsf{A}}}}$, i.e. $\lambda$ for which ${{\boldsymbol{\mathsf{A}}}} \hat {\boldsymbol {u}} = -\mathrm {i} \lambda \hat {\boldsymbol {u}}$ has non-trivial solutions, lie in the lower half-plane. As discussed in § 1, optimal control strategies are best suited for amplifier flows, which satisfy this stability requirement.
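Collecting these definitions, the system can be written in the standard state-space form (reproduced here as a sketch consistent with the definitions above):
\[
\frac{\mathrm{d}\boldsymbol{u}(t)}{\mathrm{d}t} = {{\boldsymbol{\mathsf{A}}}}\,\boldsymbol{u}(t) + {{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {f}}}\,\boldsymbol{f}(t) + {{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {a}}}\,\boldsymbol{a}(t), \qquad
\boldsymbol{y}(t) = {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {y}}}\,\boldsymbol{u}(t) + \boldsymbol{n}(t), \qquad
\boldsymbol{z}(t) = {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}}\,\boldsymbol{u}(t).
\]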
It is useful to split (2.1) into two systems, one of which is driven by the forcing and includes sensor noise,
and another noiseless system driven by actuation only,
The original system can be recovered adding the variables with subscripts ‘1’ and ‘2’, that is,
Figures 1 and 2 illustrate these systems.
The derivations that follow make use of the frequency-domain form of these equations. Taking a Fourier transform of (2.6) and (2.7) yields the input–output relationships
where
is the resolvent operator. Note that the above input–output relations are exact within the linearized model used here; the approach thus has the same limitations as any linear modelling of the original nonlinear system, i.e. it treats the nonlinear interactions as exogenous stochastic noise. This framework is shared by all linear control laws developed to date.
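For reference, with the Fourier convention implied by the stability discussion above, the resolvent operator and the input–output relations can be sketched as
\[
{{\boldsymbol{\mathsf{R}}}}(\omega) = (-\mathrm{i}\omega{{\boldsymbol{\mathsf{I}}}} - {{\boldsymbol{\mathsf{A}}}})^{-1},
\]
\[
\hat{\boldsymbol{y}}_1 = {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {y}}}{{\boldsymbol{\mathsf{R}}}}{{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {f}}}\hat{\boldsymbol{f}} + \hat{\boldsymbol{n}}, \qquad
\hat{\boldsymbol{z}}_1 = {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}}{{\boldsymbol{\mathsf{R}}}}{{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {f}}}\hat{\boldsymbol{f}}, \qquad
\hat{\boldsymbol{y}}_2 = {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {y}}}{{\boldsymbol{\mathsf{R}}}}{{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {a}}}\hat{\boldsymbol{a}}, \qquad
\hat{\boldsymbol{z}}_2 = {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}}{{\boldsymbol{\mathsf{R}}}}{{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {a}}}\hat{\boldsymbol{a}},
\]
and the shorthands ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}} = {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {y}}}{{\boldsymbol{\mathsf{R}}}}{{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {f}}}$, ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}$, ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {a}}$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {a}}$, defined analogously, are the input–output operators referred to in § 3.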
2.2. The estimation problem
Before deriving the optimal causal estimator that we seek, we first briefly review the derivation of the optimal non-causal estimator (Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020). We will see later that the two cases are closely related but that imposing causality leads to the appearance of an additional term. We focus on the uncontrolled estimation problem, i.e. $\boldsymbol {a}(t) = 0$, thus making (2.1) and (2.6) equivalent.
The optimal estimator minimizes the cost functional
defined in terms of the estimation error
where $\boldsymbol {z}$ is the target, i.e. the quantity to be estimated, and $\tilde {\boldsymbol {z}}$ is its estimate and where $\langle\, \cdot\, \rangle$ represents the expected value. Note that full-state estimation is recovered using ${{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}}={{\boldsymbol{\mathsf{I}}}}$.
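In explicit form, the cost functional just described can be written as (a sketch consistent with the definitions above)
\[
J = \big\langle\, (\boldsymbol{z}(t)-\tilde{\boldsymbol{z}}(t))^{{\dagger}}\,(\boldsymbol{z}(t)-\tilde{\boldsymbol{z}}(t)) \,\big\rangle .
\]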
We seek an estimate of the targets as a linear combination of sensor readings, in the form
where the subscript $nc$ is a reminder that the kernel, ${{\boldsymbol{\mathsf{T}}}}_{z,nc} \in \mathbb {C}^{n_z\times n_y }$, is, in general, non-causal, and $\hat {{{\boldsymbol{\mathsf{T}}}}}_{z,nc}$ is its Fourier transform. Expanding (2.14) leads to
In the above equations, and henceforth, the frequency dependence is omitted for clarity. The minimum is found by taking the derivative of the cost function (2.17) with respect to $\hat {{{\boldsymbol{\mathsf{T}}}}}_{z,nc}^{{\dagger} }(\omega )$ and setting it to zero. The optimal estimation kernel is obtained as the solution of
where
Note that the forcing CSD $\hat {{{\boldsymbol{\mathsf{F}}}}}$ appears explicitly in the equation, and is thus naturally handled by the approach.
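For orientation, the estimate is formed in the frequency domain as $\hat{\tilde{\boldsymbol{z}}} = \hat{{{\boldsymbol{\mathsf{T}}}}}_{z,nc}\hat{\boldsymbol{y}}$, and the optimality condition has the structure of the classical non-causal Wiener filter; written in standard (conjugate-transpose) CSD notation rather than the modified convention of § 2.1, this can be sketched as
\[
\hat{{{\boldsymbol{\mathsf{T}}}}}_{z,nc} = \langle \hat{\boldsymbol{z}}\hat{\boldsymbol{y}}^{H} \rangle\, \langle \hat{\boldsymbol{y}}\hat{\boldsymbol{y}}^{H} \rangle^{-1},
\]
with $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ playing the role of the sensor CSD and the right-hand-side term that of the cross-spectrum between targets and sensors (cf. § 3.3).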
Causality of the estimation kernel can be enforced in (2.14) using Lagrange multipliers (Martinelli Reference Martinelli2009), as
where the Lagrange multipliers ${\boldsymbol {\varLambda }}_-,{\boldsymbol {\varLambda }}_+ \in \mathbb {C}^{n_y\times n_z}$ are required to be zero for $t > 0$, thus enforcing the condition ${{{\boldsymbol{\mathsf{T}}}}_{z,c}(t<0)=0 }$. The subscript $c$ is used to emphasize the causal nature of this kernel.
The functionals $J$ and $J'$ differ only by linear terms in the estimation kernel, and thus it is straightforward to see that taking the derivative of $J'$ and setting it to zero leads to
This apparently simple equation hides significant complexities: it is a single equation with two unknowns, ${\hat {{{\boldsymbol{\mathsf{T}}}}}_{z,c}}$ and ${\hat {\boldsymbol {\varLambda }}}$. Nevertheless, it admits a unique solution due to the requirements that ${{{\boldsymbol{\mathsf{T}}}}_{z,c}(t<0)=0 }$ and ${\boldsymbol {\varLambda }(t>0)=0 }$, which in the frequency domain impose restrictions on ${\hat {{{\boldsymbol{\mathsf{T}}}}}_{z,c}}$ and ${\hat {\boldsymbol {\varLambda }}}$, namely that these quantities are regular on the upper and lower halves of the complex plane, respectively. This restriction means that the values of ${\hat {{{\boldsymbol{\mathsf{T}}}}}_{z,c}}$ and ${\hat {\boldsymbol {\varLambda }}}$ at different frequencies are related in a non-trivial way, and thus cannot be chosen independently. Equation (2.22), together with these regularity constraints, constitutes a Wiener–Hopf problem. As discussed in § 1, analytical solutions are known only for special cases, and thus we resort to numerical methods. An introduction to Wiener–Hopf problems and to numerical methods for their solution is presented in Appendix A. The solution of (2.22), once inverse Fourier transformed, is the optimal causal estimation kernel for the linear system at hand.
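To fix ideas, the generic structure of such problems and of their solution by factorization can be sketched as follows (a schematic statement, not the specific form of (2.22)): given $\hat{{{\boldsymbol{\mathsf{N}}}}}(\omega)$ and $\hat{{{\boldsymbol{\mathsf{K}}}}}(\omega)$, find $\hat{{{\boldsymbol{\mathsf{X}}}}}_+$ regular in the upper half-plane and $\hat{\boldsymbol{\varLambda}}_-$ regular in the lower half-plane such that
\[
\hat{{{\boldsymbol{\mathsf{X}}}}}_+\,\hat{{{\boldsymbol{\mathsf{K}}}}} + \hat{\boldsymbol{\varLambda}}_- = \hat{{{\boldsymbol{\mathsf{N}}}}}.
\]
With a multiplicative factorization $\hat{{{\boldsymbol{\mathsf{K}}}}} = \hat{{{\boldsymbol{\mathsf{K}}}}}_+\hat{{{\boldsymbol{\mathsf{K}}}}}_-$, in which each factor and its inverse are regular in the half-plane indicated by its subscript, the solution is
\[
\hat{{{\boldsymbol{\mathsf{X}}}}}_+ = \big[ \hat{{{\boldsymbol{\mathsf{N}}}}}\hat{{{\boldsymbol{\mathsf{K}}}}}_-^{-1} \big]_+ \hat{{{\boldsymbol{\mathsf{K}}}}}_+^{-1},
\]
where $[\,\cdot\,]_+$ denotes the additive part that is regular in the upper half-plane; obtaining the factorization is the step that, for matrix kernels, must in general be performed numerically.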
In previous studies (Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020; Amaral et al. Reference Amaral, Cavalieri, Martini, Jordan and Towne2021) we have shown that using the spatiotemporal forcing statistics considerably improves the accuracy of the estimation of a turbulent channel flow. The estimation method presented here preserves the ability to handle these complex forcing models, while being applicable in real time via the simple integration of
which is similar to (2.16a,b), but with integration restricted to positive $\tau$, implying that only present and past sensor measurements are used for estimation.
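In discrete time, this causal convolution can be applied in real time using only a short buffer of past sensor readings. The following Python/NumPy sketch illustrates the idea; all names, shapes and the sampling step are illustrative and not part of the paper's implementation.

```python
import numpy as np

# Minimal sketch of real-time application of a causal estimation kernel:
# z_tilde(t) ~ sum_k T_{z,c}(k*dt) y(t - k*dt) * dt, using only present and
# past sensor readings. Tzc has shape (n_tau, n_z, n_y); y_buffer stores the
# most recent n_tau sensor readings, newest first.
def causal_estimate(Tzc, y_buffer, dt):
    return dt * np.einsum('kij,kj->i', Tzc, y_buffer)

# At every time step, shift the buffer and prepend the newest measurement:
# y_buffer = np.vstack((y_new[None, :], y_buffer[:-1]))
# z_tilde  = causal_estimate(Tzc, y_buffer, dt)
```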
2.3. Partial-knowledge control
Control can be divided into full-knowledge control, where the full state of the flow is assumed to be known, and partial-knowledge control, where the flow state needs to be estimated from limited, noisy sensor readings. In this section we derive the optimal partial-knowledge control; the full-knowledge case is considered in Appendix B for completeness.
Analogous to the estimation problem, we seek a control law that constructs an actuation signal as a linear function of sensor readings,
where $\boldsymbol {\varGamma }' \in \mathbb {C}^{ n_a\times n_y }$ is the control kernel and $\hat {\boldsymbol {\varGamma }}'$ is its Fourier transform. However, the derivation of the kernel is simplified if instead the actuation is expressed only in terms of $\boldsymbol {y}_1$, i.e. ignoring the influence of the actuator on the sensor,
where again $\boldsymbol {\varGamma } \in \mathbb {C}^{ n_a\times n_y }$ and $\hat {\boldsymbol {\varGamma }}$ its Fourier transform.
This implies no loss of generality: as $\boldsymbol {y}_2$ is a function only of the previous actuation, which is known, it can be computed using the actuator impulse response and subtracted from $\boldsymbol {y}$ to obtain $\boldsymbol {y}_1$. The approach is equivalent to the formalism of an internal model control (Morari & Zafiriou Reference Morari and Zafiriou1989). The control kernel $\boldsymbol {\varGamma }$ is then chosen so as to minimize a cost functional that trades off the expected value of targets and actuation,
where ${{\boldsymbol{\mathsf{P}}}} \in \mathbb {C}^{ n_a\times n_a }$ is a positive-definite matrix containing actuation penalties. For simplicity, we do not include a state cost matrix. This does not imply any loss of generality, as any state cost matrix ${{\boldsymbol{\mathsf{Q}}}} \in \mathbb {C}^{ n_u\times n_u }$, which must be positive definite, can be absorbed into the definition of the targets as ${{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}}'= {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}} {{\boldsymbol{\mathsf{Q}}}}_c$, where ${{\boldsymbol{\mathsf{Q}}}} = {{\boldsymbol{\mathsf{Q}}}}_c {{\boldsymbol{\mathsf{Q}}}}_c^{H}$ is a Cholesky decomposition of ${{\boldsymbol{\mathsf{Q}}}}$.
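In explicit form, the trade-off just described can be sketched as
\[
J = \big\langle \boldsymbol{z}^{{\dagger}}(t)\,\boldsymbol{z}(t) \big\rangle + \big\langle \boldsymbol{a}^{{\dagger}}(t)\,{{\boldsymbol{\mathsf{P}}}}\,\boldsymbol{a}(t) \big\rangle ,
\]
consistent with the definitions above; the frequency-domain form used in the derivation follows from Parseval's theorem.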
Using the identity $\textrm {Tr}(\boldsymbol {\varPhi }\boldsymbol {\varPsi })=\textrm {Tr}(\boldsymbol {\varPsi }\boldsymbol {\varPhi } )$, valid for any matrices $\boldsymbol {\varPsi }$ and $\boldsymbol {\varPhi }$ with suitable sizes, the functional is rewritten as
The functional is expanded using
Before further expanding these terms, we define
As shown in Appendix B, these terms are the counterparts of $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ for the full-knowledge control problem. It is now straightforward to show that
The cost functional can then be expressed as
where the subscript ‘nc’ was added to emphasize the non-causal nature of the control that will be obtained. Using the cyclic property of the trace and isolating terms with $\hat {\boldsymbol {\boldsymbol {\varGamma }}}_{nc}^{{\dagger} }$ , we obtain
where $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}} = \hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}^{{\dagger} } + {{\boldsymbol{\mathsf{P}}}}$ was used. The minimum of $J$ is found by differentiating (2.36) with respect to $\hat {\boldsymbol {\boldsymbol {\varGamma }}}_{nc}^{{\dagger} }$, leading to
which can be solved for the optimal control kernel as
From (2.38), it can be seen that the optimal non-causal control kernel combines the optimal non-causal estimation of the targets ($\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}} \hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}^{-1}$, the solution of (2.18)) with the optimal full-knowledge, non-causal control described in Appendix B ($\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}^{-1}\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$, the solution of (B3)), which acts to minimize these targets using the actuation inputs.
As this control kernel is, in general, not causal, it cannot be used in real-time applications. It does, however, provide upper bounds for the effectiveness of causal control, and it is the basis for the control method proposed by Sasaki et al. (Reference Sasaki, Tissot, Cavalieri, Silvestre, Jordan and Biau2018b), where the kernel is truncated to its causal part. This approach was also successfully applied to experiments (Brito et al. Reference Brito, Morra, Cavalieri, Araújo, Henningson and Hanifi2021; Maia et al. Reference Maia, Jordan, Cavalieri, Martini, Sasaki and Silvestre2021). However, truncating the non-causal control kernel is, in general, sub-optimal.
To obtain the optimal causal control law, causality is again enforced via Lagrange multipliers. The modified cost functional reads as
where the subscript $c$ emphasizes the causal nature of the control that will be obtained. Just as in the estimation problem, the linear terms added to $J$ contribute an additional term when the derivative of $J'$ is taken. The control kernel is now given by the solution of
Again, as in the estimation problem, the requirements that ${\boldsymbol {\varGamma }(t<0)=0}$ and ${\boldsymbol {\varLambda }_-(t>0)=0}$ make this a well-posed problem, specifically another Wiener–Hopf problem. The procedure to solve this problem is equivalent to the one used for the estimation problem, and is presented in Appendix A.
The control kernel $\boldsymbol {\varGamma }'$ can be now recovered from $\boldsymbol {\varGamma }$. The expression for the actuation
can be rewritten as
thus recovering $\boldsymbol {\varGamma }'$.
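Explicitly, since $\hat{\boldsymbol{y}}_2 = {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {a}}\hat{\boldsymbol{a}}$ with ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {a}} = {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {y}}}{{\boldsymbol{\mathsf{R}}}}{{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {a}}}$, one consistent way of writing this recovery (a sketch, following the sign conventions of the definitions above) is
\[
\hat{\boldsymbol{a}} = \hat{\boldsymbol{\varGamma}}\,\big(\hat{\boldsymbol{y}} - {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {a}}\hat{\boldsymbol{a}}\big)
\;\;\Longrightarrow\;\;
\hat{\boldsymbol{a}} = \big({{\boldsymbol{\mathsf{I}}}} + \hat{\boldsymbol{\varGamma}}{{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {a}}\big)^{-1}\hat{\boldsymbol{\varGamma}}\,\hat{\boldsymbol{y}},
\qquad
\hat{\boldsymbol{\varGamma}}' = \big({{\boldsymbol{\mathsf{I}}}} + \hat{\boldsymbol{\varGamma}}{{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {a}}\big)^{-1}\hat{\boldsymbol{\varGamma}}.
\]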
The closed-loop control diagram is illustrated in figure 3, where the relation between $\hat {\boldsymbol {\boldsymbol {\varGamma }}}'$ and $\hat {\boldsymbol {\boldsymbol {\varGamma }}}$ is shown. As $\boldsymbol {y}_2$ is a function only of the previous actuation, it can be computed in real time and subtracted from $\boldsymbol {y}$ to obtain $\boldsymbol {y}_1$. This procedure is by construction included in ${\boldsymbol {\varGamma }}'$.
Note that it is the process of removing the actuator's response from the sensor readings that can lead to instabilities if the feedback is not accurately modelled (Belson et al. Reference Belson, Semeraro, Rowley and Henningson2013). For amplifier flows, the actuators are typically located downstream of the sensors, the feedback tends to be small, and the optimal control is thus robust (Schmid & Sipp Reference Schmid and Sipp2016). This explains the successful use of optimal control in many studies (Semeraro et al. Reference Semeraro, Bagheri, Brandt and Henningson2011; Barbagallo et al. Reference Barbagallo, Dergham, Sipp, Schmid and Robinet2012; Morra et al. Reference Morra, Sasaki, Hanifi, Cavalieri and Henningson2020; Sasaki et al. Reference Sasaki, Morra, Cavalieri, Hanifi and Henningson2020; Maia et al. Reference Maia, Jordan, Cavalieri, Martini, Sasaki and Silvestre2021). As our method targets amplifier flows, the issue of robustness is not further addressed.
2.4. Recovering Kalman and LQR control gains
We now demonstrate that gain matrices for the Kalman-filter estimation (${{\boldsymbol{\mathsf{L}}}}$) and LQR control (${{\boldsymbol{\mathsf{K}}}}$) can be recovered from the Wiener–Hopf formalism. As both methods have the same optimality properties, they amount to different approaches for obtaining the same result if applied to a system satisfying the common assumptions in the derivation of Kalman filters and LQR control, such as white-in-time disturbances, i.e. $\hat {{{\boldsymbol{\mathsf{F}}}}}(\omega ) = {{\boldsymbol{\mathsf{F}}}}_0$.
From the Kalman-filter estimation equation (Åström & Wittenmark Reference Åström and Wittenmark2013),
the impulse response of the $i$th sensor, $y_j(t) =\delta _{j,i}\delta (t)$, is such that $\tilde {\boldsymbol {u}}(0^{+}) = {{\boldsymbol{\mathsf{L}}}}_i$, where ${{\boldsymbol{\mathsf{L}}}}_i$ is the $i$th row of ${{\boldsymbol{\mathsf{L}}}}$. Thus, from (2.16a,b),
The LQR control gains can be recovered by emulating an initial condition $\boldsymbol {u}_0$ at $t=0$, which can be accomplished by setting ${{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {f}}}={{\boldsymbol{\mathsf{I}}}}$ and using $\boldsymbol {f}(t) = \boldsymbol {u}_0 \delta (t)$, or equivalently, $\hat {\boldsymbol {z}}_1 = {{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}}{{\boldsymbol{\mathsf{R}}}} \boldsymbol {u}_0$. From the LQR framework,
Since $\boldsymbol {u}(t<0)=0$, it follows that $\boldsymbol {a}(t<0) =0$. Comparing with the solution of (A16), (B6) and (2.10), we have
Comparing (2.45) and (2.46), we have ${{\boldsymbol{\mathsf{K}}}} = \boldsymbol {\varPi }_c(0)$.
Knowledge of LQR gains ${{\boldsymbol{\mathsf{K}}}}$ and the Kalman filter ${{\boldsymbol{\mathsf{L}}}}$ may be useful to understand regions of the flow that require accurate estimation to obtain effective control strategies, as discussed by Freire et al. (Reference Freire, Cavalieri, Silvestre, Hanifi and Henningson2020).
2.5. Validation using a linearized Ginzburg–Landau problem
We here compare the kernels and gains computed with the present method to those obtained with the Kalman-filter and the LQR approaches. We use a linearized Ginzburg–Landau problem, for which standard tools can be used for the computation of estimation and control gains without resorting to model reduction. Such models have been used in many previous studies (Bagheri et al. Reference Bagheri, Henningson, Hœpffner and Schmid2009b; Lesshafft Reference Lesshafft2018; Cavalieri, Jordan & Lesshafft Reference Cavalieri, Jordan and Lesshafft2019). The model reads as
and we use the same parameters as in Martini et al. (Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020), namely: $U=6 , \gamma =1-\mathrm {i}$ and $\mu (x)=\beta \mu _c(1-x/20)$, where $\beta =0.1$ was used and $\mu _c=U^{2} \textrm {Re} (\gamma ) / |\gamma |^{2}$ is the critical value for onset of absolute instability (Bagheri et al. Reference Bagheri, Henningson, Hœpffner and Schmid2009b).
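For reference, the linearized Ginzburg–Landau model used here is conventionally written as (a sketch of the standard form implied by the parameters above)
\[
\frac{\partial u}{\partial t} = \left( -U\frac{\partial}{\partial x} + \gamma\frac{\partial^{2}}{\partial x^{2}} + \mu(x) \right) u + f,
\]
with the operator in parentheses playing the role of ${{\boldsymbol{\mathsf{A}}}}$ after spatial discretization.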
For this validation, we use $\hat {{{\boldsymbol{\mathsf{F}}}}}={{\boldsymbol{\mathsf{I}}}}$, two sensors located at $x=5$ and $20$, an actuator at $x=15$, and a target at $x=30$. Figure 4 compares gains obtained from the proposed approach and from the algebraic Riccati equations. Both approaches produce identical gains, indicating their equivalence when white-noise forcing is assumed. Estimation and control kernels are shown in figure 5, again showing the equivalence between the two methods. The Kalman and LQG kernels are obtained as follows. From (2.16a,b), the state estimation is obtained as
with the estimation kernel given by
The LQG kernel is obtained from
with the solution reading as
and where the control kernel is given by
Figures 4 and 5 show that the Riccati-based and proposed approaches provide the same results, illustrating their equivalence.
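For readers who wish to reproduce the Riccati-based reference gains, the following Python sketch uses SciPy's algebraic Riccati solver; a generic, real, stable system stands in for the discretized Ginzburg–Landau operator, and all names, sizes and weights are illustrative rather than those used in the paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Minimal sketch of the Riccati-based reference gains (LQR gain K and Kalman
# gain L). A generic stable real system stands in for the discretized
# Ginzburg-Landau operator; all quantities are illustrative.
rng = np.random.default_rng(1)
n_u, n_y, n_a = 40, 2, 1
A  = -2.0 * np.eye(n_u) + rng.standard_normal((n_u, n_u)) / np.sqrt(n_u)
Ba = rng.standard_normal((n_u, n_a))
Cy = rng.standard_normal((n_y, n_u))
Cz = rng.standard_normal((1, n_u))

W = np.eye(n_u)            # disturbance covariance (white-in-time forcing, B_f = I)
V = 1e-2 * np.eye(n_y)     # sensor-noise covariance
Q = Cz.T @ Cz              # state cost built from the target
P = 1e-2 * np.eye(n_a)     # actuation penalty

# LQR gain K (a = -K u), from A^T X + X A - X Ba P^{-1} Ba^T X + Q = 0
X = solve_continuous_are(A, Ba, Q, P)
K = np.linalg.solve(P, Ba.T @ X)

# Kalman gain L, from the dual Riccati equation
Y = solve_continuous_are(A.T, Cy.T, W, V)
L = Y @ Cy.T @ np.linalg.inv(V)
```

The kernels and gains obtained with the Wiener–Hopf approach can then be compared with K and L, as in figures 4 and 5.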
3. Implementation
The size of the Wiener–Hopf problems (2.22) and (2.40) is independent of the size of the linear system $n_{u}$. The dominant cost to solve these problems is the Wiener–Hopf factorization of $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$, matrices whose sizes scale with $n_a$ and $n_y$, respectively. Accordingly, a solution can be obtained with low cost for arbitrarily large systems, as long as the number of sensors and actuators remains reasonable. However, the coefficient matrices of the Wiener–Hopf problems are functions of the resolvent operator (2.13), and their construction thus requires the inversion of matrices of size $n_u$, which is unfeasible for large systems.
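To illustrate what the factorization step involves, the following Python/NumPy sketch performs the classical multiplicative factorization for a scalar, real and positive spectrum; matrix kernels require the iterative schemes cited in § 1 (e.g. Daniele & Lombardi Reference Daniele and Lombardi2007), and everything shown here is illustrative.

```python
import numpy as np

# Minimal sketch (scalar case): factorize a positive scalar spectrum K(w) as
# K = K_minus * K_plus, with each factor regular in one half of the complex-
# frequency plane, by splitting log K according to the sign of the time
# variable obtained from an inverse FFT.
def scalar_wiener_hopf_factor(K):
    n = K.size
    c = np.fft.ifft(np.log(K))              # "time-domain" coefficients of log K
    cp = np.zeros_like(c); cm = np.zeros_like(c)
    cp[0] = 0.5 * c[0]; cm[0] = 0.5 * c[0]  # split the mean equally
    cp[1:n // 2] = c[1:n // 2]              # positive times -> one factor
    cm[n // 2:] = c[n // 2:]                # negative times -> the other factor
    return np.exp(np.fft.fft(cm)), np.exp(np.fft.fft(cp))

# consistency check on a smooth, positive spectrum
w = 2 * np.pi * np.fft.fftfreq(1024, d=0.05)
K = 1.0 / (1.0 + w**2) + 0.1
K_minus, K_plus = scalar_wiener_hopf_factor(K)
assert np.allclose(K_minus * K_plus, K)
```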
In § 3.1 we show a method to construct the coefficients of the Wiener–Hopf problems efficiently, with an approach to incorporate effective forcing models presented in § 3.2. In § 3.3 we show that the terms in the equations correspond to CSD matrices that can be obtained directly from numerical and physical experiments. The latter technique allows application of the tools developed in this paper when adjoint solvers are not available or in experimental set-ups.
3.1. Matrix-free implementation
In this section we apply methods developed in previous works (Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020, Reference Martini, Rodríguez, Towne and Cavalieri2021) to construct the terms $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}} , \hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}} , \hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ without explicitly constructing the resolvent operator, which would be prohibitive in any practical scenario. The approach uses time-domain solutions of the linearized equations to obtain the action of the resolvent operator on a vector, which in turn is used to reconstruct the required input–output operators.
As an example, assume that the action of ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$ on a vector $\hat {\boldsymbol {v}}_d$ can be efficiently computed. The operator can be constructed column-wise by setting $\hat {\boldsymbol {v}}_d=\boldsymbol {e}_i$, where $\boldsymbol {e}_i$ is an element of the canonical basis for the forcing space: the resulting vector ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}\boldsymbol {e}_i$ provides the $i$th column of ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$. If $n_f$ is small, ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$ can be recovered by repeating the procedure for $i=1,\ldots, n_f$, and subsequently used for the construction of $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$.
To obtain the action of the resolvent operator on a vector for all resolved frequencies simultaneously, we use an approach based on the transient-response method, developed in Martini et al. (Reference Martini, Rodríguez, Towne and Cavalieri2021). The action of ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}$ on a vector $\hat {\boldsymbol {v}}_d(\omega )$ is obtained using the system
where (3.2a–c) is obtained from a Fourier transform of (3.1a–c). Here $\hat {\boldsymbol {v}}_d$ is regarded as an input and $\hat {\boldsymbol {r}}_d$ and $\hat {\boldsymbol {s}}_d$ as outputs of the system, with their dimensions implied by the context. Using $\boldsymbol {v}_d(t)=\boldsymbol {e}_i\delta (t)$ ensures that the canonical basis is used for all frequencies. A time-domain solution of (3.1a–c) will be referred to as a forcing direct run. An actuator direct run is obtained by replacing ${{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {f}}}$ with ${{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {a}}}$, and provides the action of ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {a}}$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {a}}$ on a given vector.
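As an illustration of a forcing direct run, the following Python/NumPy sketch recovers one column of ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$ at all resolved frequencies from a single impulse response; the small generic system, the explicit-Euler time marching and all names are illustrative only.

```python
import numpy as np

# Minimal sketch: obtain the i-th column of R_yf = C_y (-i w I - A)^{-1} B_f
# for all resolved frequencies by time marching an impulse response of the
# direct system and taking an FFT of the sensor readings.
rng = np.random.default_rng(0)
n_u, n_y, n_f = 50, 2, 3
A  = -np.eye(n_u) + 0.5 * rng.standard_normal((n_u, n_u)) / np.sqrt(n_u)  # stable toy operator
Bf = rng.standard_normal((n_u, n_f))
Cy = rng.standard_normal((n_y, n_u))

dt, nt = 0.01, 4096
i = 0                                    # forcing direction e_i
u = Bf[:, i].copy()                      # state just after the impulse B_f e_i delta(t)
Y = np.zeros((n_y, nt))
for k in range(nt):                      # explicit Euler marching of du/dt = A u
    Y[:, k] = Cy @ u
    u = u + dt * (A @ u)

omega = 2 * np.pi * np.fft.fftfreq(nt, dt)
Ryf_col = dt * np.fft.fft(Y, axis=1)     # transfer function with numpy's e^{-i w t} convention

# check against direct inversion at one frequency (feasible only for small systems);
# with the paper's opposite Fourier sign convention, the same data correspond to -omega.
m = 10
ref = Cy @ np.linalg.solve(1j * omega[m] * np.eye(n_u) - A, Bf[:, i])
# Ryf_col[:, m] and ref agree to within the time-marching discretization error.
```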
Similarly, the action of the operators ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger} }$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {a}}^{{\dagger} }$ on a vector is constructed by time marching the adjoint system,
referred to here as a sensor adjoint run. A target adjoint run is obtained by replacing ${{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {y}}}$ with ${{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}}$ and can be used to obtain the action of ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}^{{\dagger} }$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {a}}^{{\dagger} }$.
The action of more complex terms can be obtained by solving (3.1a–c) and (3.3a–c) in succession. For example, if the output $\boldsymbol {r}_a$ of a sensor adjoint run is an input of a direct run, then the outputs of the latter will be given by ${ \hat {\boldsymbol {r}}_d = {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}{{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger} }\hat {\boldsymbol {v}}_a}$ and ${ \hat {\boldsymbol {s}}_d = {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}{{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger} }\hat {\boldsymbol {v}}_a }$. The resulting system is referred to as a sensor adjoint-direct run. Following similar procedures, direct-adjoint, direct-adjoint-direct and adjoint-direct-adjoint runs are constructed. The terms whose action is obtained from each of these runs are illustrated in figure 6.
In practice, to link together direct and adjoint runs as described above, checkpoints of the output of one time integration are saved to disk and subsequently read and interpolated in the following run to be used as forcing terms. Details of this procedure, as well as the impact of different interpolation strategies, are described by Martini et al. (Reference Martini, Rodríguez, Towne and Cavalieri2021), to which we refer the reader. Alternatively, Farghadan et al. (Reference Farghadan, Towne, Martini and Cavalieri2021) developed an approach that minimizes the data to be retained, using streaming Fourier sums to obtain the action of the operator on a discrete set of frequencies.
The most effective approach for constructing the Wiener–Hopf problems depends on the rank of the forcing and targets. In the following sections, we outline the best approach for four possible scenarios and discuss the associated computational cost and the types of parametric studies that can be easily conducted in each case.
3.1.1. Low-rank $\boldsymbol{\mathsf{B}}_f$ and $\boldsymbol{\mathsf{C}}_z$
In this scenario, the terms ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}, {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {a}}^{{\dagger} }$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}^{{\dagger} }$ are small matrices and can be obtained using $n_f$ direct and $n_z$ adjoint runs. With those terms, $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}} , \hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}} , \hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ can be constructed.
Note that as ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$ is obtained from state readings in (3.1a–c), storing snapshots of $\boldsymbol {q}(t)$ allows for computing ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$ for any ${{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {y}}}$ from inexpensive data post-processing. Using the same strategy in (3.3a–c) allows ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {a}}^{{\dagger} }$ to be computed for any ${{{\boldsymbol{\mathsf{B}}}}_{\boldsymbol {a}}}$. It is thus possible to inexpensively compute the control kernels for any sensor and actuator when the forcing and targets are low rank.
3.1.2. High-rank $\boldsymbol{\mathsf{B}}_f$ and low-rank $\boldsymbol{\mathsf{C}}_z$
In this scenario, ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}$ are large matrices, and it is impractical to construct and manipulate them. Instead, ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}} {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger} }$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}} {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger} }$, which are still small matrices, are obtained from $n_y$ sensor adjoint-direct runs, while ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {a}}^{{\dagger} }$ can be constructed from $n_z$ target adjoint runs, as in § 3.1.1.
These terms can then be used to construct $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}} , \hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}} , \hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$, assuming that ${{\boldsymbol{\mathsf{F}}}}={{\boldsymbol{\mathsf{I}}}}$. A strategy to efficiently include complex forcing models in this approach will be detailed in § 3.2. As the sensor position is an input of the sensor adjoint-direct run, it becomes expensive to perform parametric sensor studies in this scenario. However, actuator placement studies are still inexpensive.
3.1.3. Low-rank $\boldsymbol{\mathsf{B}}_f$ and high-rank $\boldsymbol{\mathsf{C}}_z$
This scenario is similar to the one in § 3.1.2, with the role of sensors and actuators reversed. Here ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$ is constructed from $n_f$ forcing direct runs, and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {a}}$ is obtained from $n_a$ actuator direct runs. Since $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ are both large matrices, the product $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}} \hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ is constructed directly from ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {a}}^{{\dagger} } {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}$, which can be obtained from $n_f$ forcing direct-adjoint runs, and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger} }$, which is obtained directly from ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$.
In this scenario, parametric studies of actuator placement are costly, while such studies for the sensors are inexpensive.
3.1.4. High-rank $\boldsymbol{\mathsf{B}}_f$ and $\boldsymbol{\mathsf{C}}_z$
Finally, in this scenario, the product $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}} \hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ cannot be broken into the product of small matrices as before, and thus has to be constructed directly using $n_y$ sensor adjoint-direct-adjoint runs or $n_a$ actuator direct-adjoint-direct runs. The term $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ ($\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$) is constructed from $n_y$ ($n_a$) sensor adjoint-direct (actuator direct-adjoint) runs. Again, we have assumed here that ${{\boldsymbol{\mathsf{F}}}}={{\boldsymbol{\mathsf{I}}}}$.
In this scenario, parametric studies of both sensors and actuators are expensive.
3.1.5. Summary
A brief summary of the costs of each scenario is presented in table 1. Note that the descriptions above focused on the solution of the Wiener–Hopf problem, which provides $\boldsymbol {\varGamma }$. In some scenarios, supplementary runs are required to obtain ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {a}}$, which is needed to construct $\boldsymbol {\varGamma }'$ as in (2.42). This term can be obtained using sensor adjoint runs or actuator direct runs. We also report the extra runs required to obtain ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}^{{\dagger} }{{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}$, which will be used to obtain offline estimates of the control performance in § 3.5.
3.2. Using coloured forcing statistics
Previous results show that incorporating accurate coloured forcing statistics in the model is important to obtain accurate estimates of the flow state (Chevalier et al. Reference Chevalier, Hœpffner, Bewley and Henningson2006; Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020; Amaral et al. Reference Amaral, Cavalieri, Martini, Jordan and Towne2021). We now detail how to incorporate forcing colour for each of the scenarios described in the previous sections.
In the scenarios described in §§ 3.1.1 and 3.1.3, since the term ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}$ is a small matrix, forcing colour can easily be included a posteriori in the construction of $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$. In the scenarios of §§ 3.1.2 and 3.1.4, where ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}{{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger} }$ is directly obtained from a time-domain solution, the inclusion of forcing colour requires time-domain convolutions of the output of the sensor adjoint system with the forcing cross-correlation matrices, before its use as an input of the direct system. However, the convolution of a large matrix (${{\boldsymbol{\mathsf{F}}}}$) with a vector (the output of the adjoint system) is typically unfeasible. To circumvent this limitation, the approach proposed by Martini et al. (Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020) can be used. An estimated forcing CSD, obtained from low-rank flow measurements, is used to construct $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$. This can be done without explicit construction of $\hat {{{\boldsymbol{\mathsf{F}}}}}$, as will be shown next.
From a set of $n_{y'}$ auxiliary sensor readings, $\boldsymbol {y}'$, obtained as ${{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {y}}}'\boldsymbol {u}$, which we assume to contain the set of sensors that will be used for estimation and/or control ($\boldsymbol {y}$), and their CSD, ${{\boldsymbol{\mathsf{Y}}}}'$, the estimated forcing colour is obtained as
where
is a transfer function that estimates forcing from measurements with a prior assumption of white-noise forcing (Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020; Towne et al. Reference Towne, Lozano-Durán and Yang2020).
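With a white-noise prior and in standard (conjugate-transpose) notation, one consistent form of this transfer function and of the resulting forcing-CSD estimate is (a sketch, not the paper's exact expressions)
\[
\hat{{{\boldsymbol{\mathsf{T}}}}}'_f = {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}'\boldsymbol {f}}^{{\dagger}}\big( {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}'\boldsymbol {f}}{{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}'\boldsymbol {f}}^{{\dagger}} + \hat{{{\boldsymbol{\mathsf{N}}}}}' \big)^{-1}, \qquad
\hat{{{\boldsymbol{\mathsf{F}}}}}' = \hat{{{\boldsymbol{\mathsf{T}}}}}'_f\, {{\boldsymbol{\mathsf{Y}}}}'\, \hat{{{\boldsymbol{\mathsf{T}}}}}'^{\,{\dagger}}_f,
\]
where $\hat{{{\boldsymbol{\mathsf{N}}}}}'$ denotes the CSD of the noise on the auxiliary sensors (a symbol introduced here for illustration); this makes explicit why only products involving ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}'\boldsymbol {f}}$ are needed, as discussed next.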
Estimates for $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ using the estimated forcing CSD, $\hat {{{\boldsymbol{\mathsf{F}}}}}'$, referred to here as $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}'$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}} '$, read as
Note that $\hat {{{\boldsymbol{\mathsf{T}}}}}' _f \in \mathbb {C}^{n_f \times n_{y'}}$, and that $n_f$ was assumed to be large. To avoid operations with large matrices, the compound term
which has size $n_y \times n_{y'}$, can be constructed from ${ {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}} {{\boldsymbol{\mathsf{R}}}}^{{\dagger} }_{\boldsymbol {y}'\boldsymbol {f}} \in \mathbb {C}^{n_z\times n_{y'}}}$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}'\boldsymbol {f}} {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}'\boldsymbol {f}}^{{\dagger} } \in \mathbb {C}^{n_{y'}\times n_{y'}}$. As the extra terms ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}'\boldsymbol {f}}$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}'\boldsymbol {f}} {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}'\boldsymbol {f}}^{{\dagger} }$ can be obtained with the matrix-free approach described above, the time-domain convolution can be replaced by a few additional time-domain solutions of the linearized problems and inexpensive post-processing.
3.3. Adjointless method
We now show that the terms needed to construct the estimation and control kernels can be obtained from experimental data alone. We first write the sensors and targets in terms of the external disturbances,
Their CSDs are then obtained from the forcing statistics (Towne et al. Reference Towne, Lozano-Durán and Yang2020) as
The non-causal estimation is obtained via the transfer function ${{\boldsymbol{\mathsf{S}}}}_{y,z} {{\boldsymbol{\mathsf{S}}}}_{y,y}^{-1}$. The terms $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and ${{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}} \hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ can be constructed from measurements of the uncontrolled system, as they correspond to sensor CSDs (${{{\boldsymbol{\mathsf{S}}}}}_{y_1,y_1}$) and the cross-spectra between sensors and targets (${{{\boldsymbol{\mathsf{S}}}}}_{y_1,z_1}$), respectively. As the terms $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ can be obtained from the actuators’ impulse response, all the required terms can be obtained using physical or numerical experiments. This not only provides a simple framework to obtain optimal control based on a data-driven approach, but it also relaxes the requirement of an adjoint solver to obtain these control strategies for large systems, either by using readings from the nonlinear problem, as done by Martinelli (Reference Martinelli2009), or from the direct linearized problem excited by stochastic forcing.
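In standard (conjugate-transpose) CSD notation, the relations just used can be sketched as
\[
\hat{{{\boldsymbol{\mathsf{S}}}}}_{yy} = {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}\hat{{{\boldsymbol{\mathsf{F}}}}}{{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger}} + \hat{{{\boldsymbol{\mathsf{N}}}}}, \qquad
\hat{{{\boldsymbol{\mathsf{S}}}}}_{zy} = {{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}\hat{{{\boldsymbol{\mathsf{F}}}}}{{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger}}, \qquad
\hat{{{\boldsymbol{\mathsf{T}}}}}_{z,nc} = \hat{{{\boldsymbol{\mathsf{S}}}}}_{zy}\hat{{{\boldsymbol{\mathsf{S}}}}}_{yy}^{-1},
\]
where $\hat{{{\boldsymbol{\mathsf{N}}}}}$ denotes the sensor-noise CSD; the index ordering differs slightly under the modified CSD convention of § 2.1. Both CSDs can be estimated directly from time series of $\boldsymbol{y}$ and $\boldsymbol{z}$, which is what makes the data-driven construction possible.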
The adjointless approach presented here is in fact the classical application of the Wiener regulator, which is constructed directly from sensor/target CSDs, with the derivation presented here showing that the two approaches, the classical and the resolvent-based, are equivalent when the exact forcing model is used. However, statistical convergence of the CSDs is considerably more expensive, and typically less accurate, than their construction from first principles (§ 3.1); the first-principles route is thus preferable to a numerical experiment when an adjoint solver is available. For physical experiments, where obtaining a long time series is typically inexpensive, the method presented here provides an efficient way to obtain a data-driven optimal control law.
3.4. The role of sensors for the estimation problem
Insights into the problem of sensor placement can be obtained from further analysis of (3.11), focusing the discussion on non-causal estimation and white-noise forcing for simplicity. Sensor and target sensitivities to the forcing terms are given by ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger} }$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}^{{\dagger} }$. We define two forcing subspaces, $\hat {{{\boldsymbol{\mathsf{W}}}}}_y$ and $\hat {{{\boldsymbol{\mathsf{W}}}}}_z$, spanned by the rows of ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {y}\boldsymbol {f}}^{{\dagger} }$ and ${{\boldsymbol{\mathsf{R}}}}_{\boldsymbol {z}\boldsymbol {f}}^{{\dagger} }$. Defining $\hat {\boldsymbol {w}}_{i,y}$ and $\hat {\boldsymbol {w}}_{i,z}$ to be orthogonal bases for these subspaces, the forcing CSD can be decomposed into four different subspaces:
• $\mathcal {F}_{y,z}$: spanned by components given by $\hat {\boldsymbol {w}}_{i,y}\hat {\boldsymbol {w}}_{i,z}^{{\dagger} } + \hat {\boldsymbol {w}}_{i,z}\hat {\boldsymbol {w}}_{i,y}^{{\dagger} }$, corresponds to forcing components correlated with responses in both $\boldsymbol {y}$ and $\boldsymbol {z}$,
• $\mathcal {F}_{y}$: the subspace spanned by components $\hat {\boldsymbol {w}}_{i,y}\hat {\boldsymbol {w}}_{i,y}^{{\dagger} }$ that are orthogonal to $\mathcal {F}_{y,z}$, containing forcing components correlated with responses in $\boldsymbol {y}$ but not in $\boldsymbol {z}$,
• $\mathcal {F}_{z}$: the subspace spanned by components $\hat {\boldsymbol {w}}_{i,z}\hat {\boldsymbol {w}}_{i,z}^{{\dagger} }$ that are orthogonal to $\mathcal {F}_{y,z}$, containing forcing components correlated with responses in $\boldsymbol {z}$ but not in $\boldsymbol {y}$,
• $\mathcal {F}^{\perp }$: the complement of the three above subspaces, containing forcing components that are correlated with neither $\boldsymbol {y}$ nor $\boldsymbol {z}$.
Defining $\hat {{{\boldsymbol{\mathsf{F}}}}}_{y}$ as the projection of $\hat {{{\boldsymbol{\mathsf{F}}}}}$ onto $\mathcal {F}_y$, with similar definitions for the other subspaces, (3.11) can be rewritten as
Each of these subspaces plays a different role in the estimation. The subspace $\mathcal {F}_y$ generates responses at the sensors but not at the target and, thus, does not provide any useful information for constructing the estimates, instead entering the problem in the same way as sensor noise. On the other hand, the subspace $\mathcal {F}_{y,z}$ generates responses at both sensors and targets and, thus, the response to this forcing measured by the sensors can be used to estimate the corresponding response of the flow at the target locations. This is the term that is effectively used for target estimation. Finally, the subspace $\mathcal {F}_{z}$ corresponds to forces that produce responses at the target but not at the sensors. Accordingly, the responses associated with it cannot be estimated.
This analysis has a direct relation with the concept of observable forces discussed by Towne et al. (Reference Towne, Lozano-Durán and Yang2020): only target components that are excited by forcing components that also generate readings on the sensors can be estimated and, as a consequence, controlled. Martini et al. (Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020) discussed how correlations between different forcing components can allow the estimation of responses to non-observable forcing components, which was shown to considerably improve turbulent flow estimation. Note that, in general, non-causal estimation/control is needed to exploit the full correlation, i.e. to estimate/control all the target components that are correlated with the sensors. The causality constraint allows real-time application at the price of degraded estimation/control performance.
To optimize the non-causal estimation, one should seek to minimize ${{\boldsymbol{\mathsf{F}}}}_z$ and to reduce the effect of ${{\boldsymbol{\mathsf{F}}}}_y$ on the estimation of ${{\boldsymbol{\mathsf{F}}}}_{y,z}$. Although causality imposes extra restrictions, which further complicate the problem, the above discussion provides insights into the sensor placement problem, which remains an active topic of research.
3.5. Discussion
As previously stated, the Wiener–Hopf formalism proposed here provides large computational savings when compared to tools based on algebraic Riccati equations. While the latter require solutions of Riccati equations for matrices of size $n_u$, the former only requires factorization of matrices of size $n_y$ and $n_a$, and can thus be applied to large systems. The largest cost of the approach is associated with construction of the terms that define the Wiener–Hopf problem. Our matrix-free methods allow the construction of these terms using direct and adjoint simulations or experimental data (as shown in § 3.3), making the proposed approach widely applicable.
When the terms are obtained numerically, the construction of control laws for different sensor/actuator configurations is very cheap when forcing/targets are low rank. The performance of a given control law, ${\boldsymbol {\varGamma }}$, e.g. the causal or non-causal optimal control, can be evaluated offline from the sensor/target CSDs, allowing an efficient investigation of sensor/actuator placement. The target CSDs (${{\boldsymbol{\mathsf{S}}}}_{z,z}$) of a system controlled with the control kernel $\boldsymbol {\varGamma }$ can be obtained from the uncontrolled target CSDs (${{\boldsymbol{\mathsf{S}}}}_{z_1,z_1}$) and the terms found in the Wiener–Hopf equations. From (2.31)–(2.34),
In the next section this expression will be used to quickly estimate the control performance for several actuator locations.
The cost of the method is dominated by the direct and adjoint runs used to construct the matrix coefficients in the Wiener–Hopf equations, and the cost of actually solving the Wiener–Hopf problem is comparatively negligible. This is because the cost of the direct and adjoint runs scales (often linearly) with the problem dimension $n_{u}$, while the cost of the Wiener–Hopf problem is independent of the problem dimension and instead scales with the number of sensors and actuators, which are typically small. As a point of reference, the direct and adjoint runs for the example problem in § 4 took around one hour, while solving the Wiener–Hopf problem took minutes on a laptop computer.
The costs of the current method are thus considerably smaller than those of the iterative method proposed by Semeraro et al. (Reference Semeraro, Pralits, Rowley and Henningson2013). For a case with a single sensor and actuator and low-rank forcing and targets, they report approximately $40$ iterations to converge both estimation and control gains. With the proposed approach, only two runs are necessary, reducing the cost by a factor of $20$ while also enabling parametric studies of sensor and actuator placement, offline performance estimation of these configurations and the handling of complex force models.
The kernels obtained in § 2.5, and those that will be shown in § 4, are smooth and decay rapidly for large values of $\tau$. The convolutions required for estimation and control can therefore be computed efficiently with a finite number of points using standard numerical quadrature and can thus be used in implementations with limited memory and computational power, as in experimental applications (Fabbiane et al. Reference Fabbiane, Simon, Fischer, Grundmann, Bagheri and Henningson2015; Brito et al. Reference Brito, Morra, Cavalieri, Araújo, Henningson and Hanifi2021; Maia et al. Reference Maia, Jordan, Cavalieri, Martini, Sasaki and Silvestre2021).
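As an illustration of this point, the following sketch evaluates the actuation signal in real time by a finite, discretized convolution of a causal control kernel with the most recent sensor readings; the array shapes and the simple rectangle-rule quadrature are assumptions made for the example.

```python
import numpy as np

def actuation(Gamma, y_hist, dt):
    """Evaluate a(t) = int_0^T Gamma(tau) y(t - tau) dtau by discrete quadrature.

    Gamma  : (n_tau, n_a, n_y) samples of the causal kernel at tau = 0, dt, ...,
             truncated where the kernel has decayed
    y_hist : (n_tau, n_y) sensor history, y_hist[k] = y(t - k*dt)
    dt     : sampling interval of the kernel and of the sensor readings
    """
    # Rectangle-rule quadrature of the convolution; higher-order weights
    # could be used, but the smooth, decaying kernels make this adequate.
    return dt * np.einsum('kij,kj->i', Gamma, y_hist)
```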
4. Estimation and control of the flow over a backward-facing step
In this section we illustrate the potential of the tools developed in this paper using an amplifier flow in which the level of nonlinearity can be easily adjusted. This allows a smooth transition between linear (linearized Navier–Stokes equations, or small perturbation amplitude in the nonlinear system) and nonlinear (larger perturbation amplitude in the nonlinear system) problems.
We study the two-dimensional flow over a backward-facing step with Reynolds number $Re=500$ based on the step height. In order to model disturbances coming from the upstream channel-like flow, we consider high-rank disturbances localized upstream of the step, a case significantly more challenging than the similar problem studied by Hervé et al. (Reference Hervé, Sipp, Schmid and Samuelides2012), who considered rank-1 disturbances. The flow is illustrated in figure 7, which also shows the region over which the disturbances are applied. The inflow condition is laminar Poiseuille flow, and quantities are made non-dimensional with respect to the maximal inflow velocity and the step height. The system linearization is performed around the steady solution of the unforced nonlinear problem. Note that for large disturbances, mean-flow distortion can arise, which can be partially accounted for by linearizing the system around the mean flow (Högberg, Bewley & Henningson Reference Högberg, Bewley and Henningson2003; Chevalier et al. Reference Chevalier, Hœpffner, Bewley and Henningson2006; Sipp & Lebedev Reference Sipp and Lebedev2007; Leclercq et al. Reference Leclercq, Demourant, Poussot-Vassal and Sipp2019). However, the choice of base flow is a generic issue for any linear method and not specific to our particular formulation, so we do not explore this further.
The Navier–Stokes equations are solved using the spectral-element code Nek5000 (Fischer & Patera Reference Fischer and Patera1989; Fischer Reference Fischer1998), which uses $n$th-order Lagrangian interpolants within each element to solve a weak formulation of the incompressible Navier–Stokes equations, and in which the method proposed here was implemented. The resulting open-source code is available at . The domain was represented by 600 elements, each discretized by fifth-order polynomials. Time integration was performed using a non-dimensional time step of $3\times 10^{-2}$. For the nonlinear integration, used to obtain the base flow, an inflow with a Poiseuille profile was imposed at the leftmost boundary, an outflow condition was imposed on the rightmost boundary, and no-slip conditions were imposed on all other boundaries. For the linear runs, Dirichlet boundary conditions for the velocity fluctuations were used on all boundaries. The flow is globally stable and, thus, the base flow is obtained by simple time marching of the Navier–Stokes equations until the time derivative is smaller than $10^{-10}$. When present, external forcing is generated by a pseudo-random number generator, whose seed is initialized to the same value on each processing unit, making uncontrolled and controlled runs comparable as long as the same time step and number of cores are used. Although this is straightforward for the linear problem, CFL constraints rendered fixed time steps impractical when large perturbations are present, and, thus, equivalent time steps for different runs could not be obtained. For such cases, representative snapshots will be presented instead.
The dynamics of the linearized system can be summarized as follows. Upstream of the step the flow is Poiseuille-like and only exhibits spatially decaying waves. Once these waves reach the shear layer downstream of the separation, they excite Kelvin–Helmholtz instability waves, which undergo significant growth before the end of the recirculation bubble that forms in the wake of the step.
All sensors, actuators and targets considered here have Gaussian spatial support, given by $\exp ({-{(x-x_c)^{2}}/{2\sigma _x^{2}}-{(y-y_c)^{2}}/{2\sigma _y^{2}}})$, with $\sigma _x=0.2$ and $\sigma _y=0.1$. All sensors and actuators act in the streamwise direction only. Unless explicitly stated, $\hat{\boldsymbol{\mathsf{N}}}$/${{\boldsymbol{\mathsf{P}}}}$ is taken as a constant diagonal matrix, with entries equal to $10^{-2}$ of the maximum value of $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}} / \hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ without noise/penalty. The ability of the control law to suppress disturbances is typically a monotonic function of the sensor noise/actuator penalty, asymptotically reaching a minimum value for small enough values of these quantities. However, if very small values are used, numerical ill-conditioning can affect this trend. These effects are discussed by Sasaki et al. (Reference Sasaki, Morra, Fabbiane, Cavalieri, Hanifi and Henningson2018a). The values used here guarantee effective factorization, avoiding numerical ill-conditioning of the factorization problem (§ A.2.2). For most of the configurations studied, these values yield results that approach the zero cost/noise limit.
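For reference, the spatial support above can be evaluated on a computational grid as in the following sketch; the grid arrays and centre locations are placeholders chosen for illustration.

```python
import numpy as np

def gaussian_support(x, y, xc, yc, sx=0.2, sy=0.1):
    # Gaussian spatial support used for sensors, actuators and targets,
    # exp(-(x - xc)^2/(2 sx^2) - (y - yc)^2/(2 sy^2)), applied to the
    # streamwise component only; x, y are grid coordinate arrays.
    return np.exp(-(x - xc)**2 / (2 * sx**2) - (y - yc)**2 / (2 * sy**2))
```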
In the following sections, control of the linearized problem is first investigated, where the role of the number and placement of sensors and actuators is studied. As we have previously explored the non-causal estimation problem (Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020), we focus on the causal control of disturbances, which depends on accurate causal estimation. Next, the same control strategies are obtained from a numerical experiment, as proposed in § 3.3, showing that the method can be applied in experimental and adjoint-less scenarios. Finally, we illustrate the application of control laws obtained for the linearized system to the nonlinear equations.
4.1. Control of the linearized problem
We initially focus on a linearized system disturbed by high-rank external forces, designing control strategies to minimize readings from low-rank targets. We perform a limited study of sensor placement but an extensive study of actuator placement. These results will then inform the choice of sensors and actuators for the full-rank target scenario.
4.1.1. Control with a single sensor
We first consider the use of a single sensor (middle circle in figure 7) and investigate the placement of one actuator for control (locations indicated by crosses in figure 7). Three control approaches are explored: non-causal control (nc), truncated non-causal control (tnc) and optimal causal control (c). Truncated non-causal control retains only the causal part of the optimal non-causal kernel, corresponding to the approach used in previous works (Sasaki et al. Reference Sasaki, Tissot, Cavalieri, Silvestre, Jordan and Biau2018b; Brito et al. Reference Brito, Morra, Cavalieri, Araújo, Henningson and Hanifi2021), which was found, in some cases, to be a good approximation of optimal causal control, pointing to the wave-cancelling nature of optimal control strategies in these scenarios (Sasaki et al. Reference Sasaki, Morra, Fabbiane, Cavalieri, Hanifi and Henningson2018a). Non-causal control cannot be used in real-time applications, but it provides an upper bound for the performance of any linear control strategy and, thus, can be useful to evaluate different sensor and actuator placements.
As we use high-rank forcing and a low-rank target, the configuration corresponds to scenario (ii) described in § 3.1. A total of three time-domain solutions were obtained, corresponding to one adjoint-direct system for the sensors, from which the estimation terms ${{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}$ and ${{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}$ are constructed, and one adjoint-direct system for the targets, from which the impulse response of any actuator can be computed. All the following results were obtained via inexpensive post-processing of the resulting data.
Representative control kernels for different actuator placements are shown in figure 8. The causal kernel is seen to converge to the non-causal kernel for large $\tau$ when the actuator is not far upstream of the sensor. When the non-causal control law uses non-causal information, the causal control law typically exhibits a spike at $\tau =0$. This behaviour can be understood as the control law using the most recent information to compensate for future information, which is not available. Similar behaviour was reported by Morra et al. (Reference Morra, Sasaki, Hanifi, Cavalieri and Henningson2020).
To compare the different control strategies, (3.13) is used to compute the target's power spectral density (PSD) and expected energy, i.e. the integral of the PSD, with the results presented in figure 9. All control strategies converge approximately to the same PSD reduction when the actuator is located downstream of the sensor. Not surprisingly, non-causal control provides the highest PSD reduction for all configurations, with little dependence on the actuator placement. As the non-causal control law can be solved independently for each frequency, the PSDs obtained with this control are lower than those of the uncontrolled case at all frequencies. Different trends are observed for the causal and the truncated non-causal (TNC) controls. Comparing the target PSDs for the causal and non-causal cases in figure 9(b), the non-causal control provides the lowest PSD at all frequencies. The causal control increases the target PSD at lower frequencies ($|\omega |<0.2$), which is, however, compensated by the reduction at the other frequencies. Such a compromise between PSD reductions at different frequencies is unavoidable in causal control strategies, due to constraints imposed by causality. Figure 10 compares the kernels in the frequency domain and shows that each approach better represents the non-causal kernel in different frequency ranges. A comparison of figures 9(b) and 10 shows that the causal control better reproduces the non-causal control for the most relevant frequencies, i.e. those which provide a higher reduction in the target PSD, explaining its superior performance.
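A hedged sketch of this comparison is given below: the expected target energy is obtained by integrating the PSD over frequency, and the ratio of controlled to uncontrolled energies is then used to rank configurations. The sampled frequency and PSD arrays are placeholders.

```python
import numpy as np

def expected_energy(omega, psd):
    # Expected target energy as the integral of the PSD over frequency,
    # evaluated here with a simple trapezoidal rule.
    return np.sum(0.5 * (psd[1:] + psd[:-1]) * np.diff(omega))

def energy_fraction(omega, psd_controlled, psd_uncontrolled):
    # Fraction of the expected target energy remaining under control, used
    # to compare control strategies and actuator placements.
    return expected_energy(omega, psd_controlled) / expected_energy(omega, psd_uncontrolled)
```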
Figure 9(d) shows that the three approaches exhibit different trends as the actuator is moved upstream. While actuator placement has a negligible influence on the non-causal control, it significantly affects the TNC control, leading to an increase in the target energy for some configurations. Causal control, on the other hand, remains efficient even with the actuator located upstream of the sensor, with the reduction in the PSD degraded by $50\,\%$ only when the actuator is located approximately 4.5 step heights upstream of the sensor.
Two different effects can explain these observations. Structures that emerge from the upstream disturbances can have significant coherence lengths, of the order of two convective time units, as later reported in figure 15. This coherence allows the sensor to partially estimate their upstream components, which can thus be cancelled by the actuator. It is speculated that a similar mechanism is responsible for the control of a jet (a convectively unstable flow, like the present example) using downstream sensors, obtained by Bychkov et al. (Reference Faranosov, Belyaev, Bychkov, Kopiev, Kopiev, Moralev and Kazansky2019) and Kopiev et al. (Reference Kopiev, Faranosov, Kopiev, Bychkov, Moralev and Kazansky2019). An extreme example is a flow excited by harmonic, rank-1 forcing: the forces and the flow response have infinite coherence lengths and can thus be estimated and controlled based on only a few sensor readings, from which amplitudes and phases are extracted.
Another possible mechanism is the exploitation of different control routes by the causal control. As the flow studied here is incompressible, the actuation has an instantaneous, although possibly small, effect throughout the flow. This response is typically negligible for the non-causal control, which favours more effective mechanisms, but it may be the only route available to the causal control when the actuator is located far upstream, and it is exploited if low actuation penalties are used. This interpretation is supported by the dotted lines in figure 9, where the actuation penalty was drastically reduced: for upstream actuators, a significant reduction of the target PSDs is observed for the causal control, but only a small effect is observed for the non-causal control. Actuator placements upstream of the sensors tend to be less robust to unmodelled dynamics (Belson et al. Reference Belson, Semeraro, Rowley and Henningson2013), and this configuration is typically not used for amplifier flows (Schmid & Sipp Reference Schmid and Sipp2016). Thus, upstream actuators are not investigated further in this work.
4.1.2. Control with multiple sensors
For a similar control configuration, Hervé et al. (Reference Hervé, Sipp, Schmid and Samuelides2012) reported higher perturbation reductions than those obtained here using a single sensor and actuator. However, the cited work considered only rank-1 disturbances. The present work deals with a more complex forcing scenario, with several waves exciting the Kelvin–Helmholtz instability. The identification of such multiple waves invariably requires multiple sensors; we thus explore this scenario. Four additional time-domain solutions were necessary for the results presented next, corresponding to adjoint-direct systems for the added sensors, the top and bottom circles in figure 7.
Figure 11 shows the expected target energy using the three sensors. A comparison with results obtained using one sensor identifies two trends: (i) lower target PSDs are observed, indicating that the additional sensors are indeed necessary to identify the multiple incoming waves and to accurately estimate the target readings; (ii) a larger distance between the actuator and the sensors is required for TNC to reproduce the optimal control strategies. Comparing the kernels when using one and three sensors in figure 12, it can be seen that non-causal control for three sensors is more ‘spread’ in $\tau$ and, thus, has more signal content for $\tau <0$. It is then expected that its truncation leads to a more significant degradation of the control.
4.1.3. Full-rank target control
From figures 9 and 11, we conclude that the actuator should be located a short distance downstream of the sensors. With sensor and actuator placements defined, control kernels for full-rank targets are constructed. This required the following additional time-domain solutions: the last equation in the sensor direct-adjoint-direct problem for each of the sensors, and the full direct-adjoint system for each actuator. In total, nine additional time-domain solutions were used.
Figure 13 compares the control kernels for a rank-1 and a full-rank ${{{\boldsymbol{\mathsf{C}}}}_{\boldsymbol {z}}}$. The close match between the two kernels indicates that downstream of the step the problem can be well approximated by a rank-1 model, and that the high-rank behaviour of the problem is indeed restricted to the receptivity of the Kelvin–Helmholtz modes to channel disturbances. This scenario is further supported by the results presented in figure 14, where flow snapshots and time series of the perturbation norm are shown for several configurations of sensors, actuators and targets. Control laws designed for full-rank targets and multiple actuators do lead to perturbation-energy reductions, but the largest improvement is provided by the use of multiple sensors.
4.2. Adjoint-less and empirical application
Here we illustrate a data-driven application of the proposed method. The necessary data can be obtained from experimental set-ups, from numerical solutions of the nonlinear problem or from the direct linearized problem disturbed by stochastic forcing. We focus here on the latter scenario, with the use of a single sensor and target. The generalization to more complex configurations, with more sensors, actuators and targets, is straightforward. From sensor and target time series, the auto- and cross-correlations are computed in the time domain. As described previously in connection with (3.11), the terms $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ are equivalent to the corresponding CSDs.
Figure 15 compares estimates of ${{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}$ and ${{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}$, obtained ‘empirically’ from the numerical experiment with different data lengths and ‘analytically’ with time marching of the direct and adjoint equations. The resulting control kernels are also compared. Convergence trends are shown using the error, defined as
where the $L^{2}$ norm is used. The terms $\tilde {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and ${{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}} }$ denote, respectively, the empirical and analytical terms, with similar expressions for the other quantities. The trends suggest a convergence rate scaling with $\Delta T^{-1/2}$, where $\Delta T$ is the time-series length used to estimate ${{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}$ and ${{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}$.
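A minimal sketch of such a convergence check is given below, assuming the empirical and analytical terms are available as sampled arrays; plotting the error against $\Delta T$ on logarithmic axes should then reveal the $-1/2$ slope.

```python
import numpy as np

def relative_error(G_empirical, G_analytical):
    # Normalized L2 error between an empirically estimated term (e.g. G_l,
    # G_r or the resulting control kernel) and its analytical counterpart
    # obtained from direct/adjoint time marching.
    return np.linalg.norm(G_empirical - G_analytical) / np.linalg.norm(G_analytical)
```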
Although significantly more expensive, and typically less accurate, than results obtained with adjoint solvers, the use of numerical experiments extends the applicability of the method to virtually any scenario and solver, and also allows control laws to be derived directly from experiments, where long time series can typically be obtained to improve convergence of the required correlations. Note that if experimental data are used, sensor noise is already present in the data. However, if noise levels are small, adding a small amount of noise may be required to better condition the factorization of $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$.
4.3. Estimation and control of the nonlinear system
Two control configurations are investigated for the nonlinear problem. As shown in §§ 4.1.2 and 4.1.3, the control kernels for full- and low-rank targets provide similar results, and we thus focus on the latter. We consider two scenarios, one using one sensor, one actuator and one target, and another using three of each. Although including statistics of the nonlinear terms within $\hat {{{\boldsymbol{\mathsf{F}}}}}$ in the nonlinear case would likely improve the results (Amaral et al. Reference Amaral, Cavalieri, Martini, Jordan and Towne2021), we chose not to pursue this so that the control law remains fixed as the forcing amplitudes increase, simplifying comparisons.
A series of nonlinear simulations was performed using spatially and temporally white forcing in the upstream region indicated in figure 7, with different forcing amplitudes defined by the root-mean-square (RMS) value $\epsilon$, i.e. $\hat {{{\boldsymbol{\mathsf{F}}}}} = \epsilon ^{2} {{\boldsymbol{\mathsf{I}}}}$. For reference, $\epsilon =10^{-2}$ leads to perturbations that reach 10–20 % of the base-flow velocity at the targets and, thus, is well into the nonlinear regime. As the control and estimation kernels are constructed to minimize the expected value of their respective cost functionals, a comparison should be based on an ensemble of simulations. Here we assume ergodicity of the system, and the ensemble averaging is replaced by time averaging. This assumption is validated using multiple runs for some configurations. The results for the linear problem presented in this section used an outflow condition on the rightmost boundary, in order to properly compare the linear and nonlinear systems. The estimation and control laws should be based on a linear system that represents the nonlinear problem; however, for this problem, the impact of the different boundary conditions used in the linear and nonlinear systems is negligible, as reported next.
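As an illustration of the forcing used here, the sketch below draws seeded, spatially and temporally white samples scaled to a prescribed RMS amplitude $\epsilon$; the array sizes are placeholders, and the actual forcing is applied on the Nek5000 grid points of the upstream region.

```python
import numpy as np

def white_forcing(n_points, n_steps, eps, seed=0):
    # Spatially and temporally white forcing with RMS amplitude eps, a discrete
    # surrogate for F_hat = eps^2 * I. Fixing the seed makes controlled and
    # uncontrolled runs directly comparable, as discussed in the text.
    rng = np.random.default_rng(seed)
    return eps * rng.standard_normal((n_steps, n_points))
```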
For small forcing amplitudes, the dynamics is dominated by linear mechanisms and, thus, it is expected that the performance of the estimation and control laws will be the same as demonstrated in the previous sections. Focusing initially on the control problem, representative snapshots of the flow and time evolution of perturbation norms for controlled and uncontrolled nonlinear systems are shown in figure 16. For the lower external forcing RMS, control of the nonlinear system is similar to the control of the linearized problem but, for larger amplitudes, it degrades. The perturbation norm, when normalized by the forcing RMS, reduces for larger forcing amplitudes due to nonlinear saturation. Estimation and control performances are measured as the non-estimated/controlled target energy fraction,
Figure 17 shows $\mathcal {E}_{est,con}$ as a function of $\epsilon$. Performances for the linear and the nonlinear systems are equivalent when small forcing amplitudes are used but, for the latter, performance degrades for larger forcing amplitudes. The trend is also observed in figure 18, where time-series samples of a target are shown for the uncontrolled problem, its causal estimate and the controlled problem. Comparing different runs, indicated by the black dots in figure 17, it can be observed that the spread is low for low forcing amplitudes and increases for higher amplitudes, for which the control performance is degraded.
Control and estimation performances for the linear problems are equivalent, indicating that all estimated perturbations are effectively controlled. A similar result is observed for the nonlinear problem disturbed by small external forcing. Estimation deteriorates for $\epsilon \approx 10^{-3}$, while control remains effective up to ${\epsilon \approx 10^{-2}}$. Because the Kelvin–Helmholtz mode amplifies upstream disturbances, small but finite disturbances can grow to exhibit significant nonlinear dynamics. Such nonlinear effects, which are not accounted for by the linear approach used, degrade estimation performance. For the controlled system, these perturbations are cancelled before they are amplified and, thus, the linear assumption remains valid over a larger range of external forcing amplitudes.
The degraded performance of the approaches presented here when applied to the nonlinear system disturbed with high-amplitude forcing is due to the saturation of the uncontrolled flow and to the violation of the assumption that the disturbances evolve linearly. From figure 16(j), comparing the normalized perturbation norm for the scenario with $\epsilon =10^{-2}$ to those with lower values of $\epsilon$, it can be seen that not only is the perturbation norm for the controlled problem larger, but the norm of the uncontrolled problem is also smaller. Both of these factors impact $\mathcal {E}_{con}$, as defined in (4.2a,b). The nonlinear interactions can be treated as additional external forcing terms in the linearized equation (McKeon & Sharma Reference McKeon and Sharma2010; Karban et al. Reference Karban, Bugeat, Martini, Towne, Cavalieri, Lesshafft, Agarwal, Jordan and Colonius2020). In this context, the degraded performance can be related to inadequacies of the forcing model. In particular, the nonlinear interactions are not, in general, zero mean when the linearization is performed around a base flow, thus violating one of the assumptions used to construct the control laws. An alternative is to use a linearization around the mean flow, for which the nonlinear terms are, by definition, zero mean. In previous works (Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020; Amaral et al. Reference Amaral, Cavalieri, Martini, Jordan and Towne2021) it was shown that a more representative modelling of these forcing terms can lead to significant improvements in estimation and, thus, potentially to improvements in control for larger disturbances. Control formulations using a linearization around the mean flow may follow the methods of Högberg et al. (Reference Högberg, Bewley and Henningson2003) or Leclercq et al. (Reference Leclercq, Demourant, Poussot-Vassal and Sipp2019), with various mean flows used in successive linearizations of the system as control gains are progressively increased. Integrating these approaches into the control law is beyond the scope of this study and will be considered in future work.
5. Conclusions
Causal resolvent-based estimation and control methods based on the Wiener–Hopf framework have been presented. The approach extends the non-causal resolvent-based estimation methods developed by Towne et al. (Reference Towne, Lozano-Durán and Yang2020) and Martini et al. (Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020) to causal estimation and control, and is obtained by combining three different tools: the Wiener-regulator framework (Martinelli Reference Martinelli2009), matrix-free methods to obtain the action of the resolvent operator (Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020, Reference Martini, Rodríguez, Towne and Cavalieri2021) and numerical methods to solve matrix Wiener–Hopf problems (Daniele & Lombardi Reference Daniele and Lombardi2007). The resulting method is directly applicable to large systems without model reduction or simplified forcing assumptions, requiring only low-rank sensor and actuator set-ups, which is the case in any practical configuration. Computational costs are orders of magnitude lower than those of previous approaches for full-rank systems (Semeraro et al. Reference Semeraro, Pralits, Rowley and Henningson2013). If low-rank forcing/targets are used, inexpensive exploration of virtually any sensor/actuator configuration can be performed using only data post-processing, allowing the optimization of sensor and actuator placements. Control of systems with high-rank forcing and targets is also obtained. The ability to deal with high-rank targets avoids a possible bias of the control law towards the specific location of a given low-rank target; instead, fluctuations across the domain of interest may be minimized.
Using an open-source implementation of the proposed method, control of the flow over a backward-facing step is investigated. The flow is disturbed by high-rank forcing, making this test case considerably more challenging than a previous study that focused on rank-1 forcing (Hervé et al. Reference Hervé, Sipp, Schmid and Samuelides2012). Downstream of the step the flow is low rank, dominated by a Kelvin–Helmholtz instability wave, which is reflected in the fact that the control strategy is only slightly altered whether targets are rank 1, rank 3 or full rank, and whether one or multiple actuators are used. Considerable gains were obtained using multiple sensors, which is explained by the presence of several modes in the upstream Poiseuille-like flow that excite the downstream Kelvin–Helmholtz waves.
Obtaining optimal controllers without model reduction circumvents possible performance losses of ROM-derived controllers when applied to the full system (Åström & Murray Reference Åström and Murray2010, p. 349). Moreover, the present method naturally handles forcing that is coloured in space and time, as the forcing CSD, $\hat {{{\boldsymbol{\mathsf{F}}}}}$, can be arbitrarily specified for each frequency. This is likely crucial for the control of turbulent flows, as the behaviour of coherent structures depends strongly on how they are forced (Chevalier et al. Reference Chevalier, Hœpffner, Bewley and Henningson2006; Towne, Schmidt & Colonius Reference Towne, Schmidt and Colonius2018; Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020; Morra et al. Reference Morra, Nogueira, Cavalieri and Henningson2021; Nogueira et al. Reference Nogueira, Morra, Martini, Cavalieri and Henningson2021), but the difficulties involved in estimating and using coloured forcing models for control have hindered their use in previous applications. The method presented here, together with the forcing estimation methods presented in previous work (Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020), overcomes these difficulties, allowing affordable full-rank, fully coloured controllers to be used.
This work also sheds light on the wave-cancellation behaviour of optimal control of shear flows (Sasaki et al. Reference Sasaki, Morra, Fabbiane, Cavalieri, Hanifi and Henningson2018a), which is equivalent to the TNC control presented here. The approach is optimal whenever the non-causal control kernels do not rely on future information, i.e. while not designed to be causal, they are causal. This is the case for flows dominated by downstream-travelling modes, such as jets and boundary layers, provided that there is sufficient spacing between sensors, actuators and targets. Reducing the distance between these elements, which may enhance control, generates non-causal components in the control kernel and degrades the performance of the TNC approach, particularly if several sensors/actuators are used. This effect can be considerably reduced with the optimal causal control strategy developed here.
Funding
E.M. acknowledges financial support by CAPES grant 88881.190271/2018-01. J.J. and A.T. were supported by the Air Force Office of Scientific Research (AFOSR) grant no. FA9550-20-1-0214. A.V.G.C. was supported by CNPq grant 313225/2020-6.
Declaration of interests
The authors report no conflict of interest.
Appendix A. Wiener–Hopf problems
Analysing linear differential equations in the Fourier domain has the advantage of decoupling different frequencies for which solutions can be obtained independently. This property has been exploited in a previous study (Martini et al. Reference Martini, Cavalieri, Jordan, Towne and Lesshafft2020) to achieve optimal non-causal estimation, i.e. estimation based on both past and future sensor readings. Imposing causality, however, couples different frequencies, which must then be solved for simultaneously. The resulting equations frequently lead to Wiener–Hopf problems. We here provide a brief introduction to this class of problems and a numerical method to solve them.
A.1. Introduction
Wiener–Hopf equations appear when solving problems with restrictions applied on half-domains. One example is the inversion of the half-convolution, given by
with ${{\boldsymbol{\mathsf{W}}}}_+(\tau ) \in \mathbb {C}^{n_a\times n_z}$ considered as the unknown matrix function to be determined. The terms ${{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}(t) \in \mathbb {C}^{n_a\times n_a}$ and ${{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}(t) \in \mathbb {C}^{n_a\times n_z}$ are known matrix functions. In § 2, $n_a$ and $n_z$ corresponded, respectively, to the number of actuators and targets used for control, ${{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}$ and ${{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}$ are related to the actuator transfer functions and ${{\boldsymbol{\mathsf{W}}}}_+ \in \mathbb {C}^{n_a\times n_z}$ to the optimal full-knowledge control. Solving (A1) allows for the construction of full-knowledge control laws, as will be demonstrated later.
The frequency-domain representation of (A1) is obtained by first constructing an equation valid for $t<0$. Extending (A1) for negative times will in general break the equality of the equation. An a priori unknown term ${{\boldsymbol{\mathsf{W}}}}_- \in \mathbb {C}^{n_a\times n_z}$ is thus added to preserve the equality, as
The integral can be extended to $-\infty$ by requiring ${{\boldsymbol{\mathsf{W}}}}_+(t<0)=0$. Similarly, requiring ${{\boldsymbol{\mathsf{W}}}}_-(t>0)=0$ allows (A1) and (A2) to be added, leading to
which can be expressed in the frequency domain as
where
with similar expressions for the other variables. Although this is a single equation for two variables (${{\boldsymbol{\mathsf{W}}}}_-$ and ${{\boldsymbol{\mathsf{W}}}}_+$), the restriction that these variables be zero on different temporal half-domains ensures that the problem is well posed.
Provided ${{\boldsymbol{\mathsf{W}}}}_+$ is bounded for $t \to \infty$, the requirement that ${{\boldsymbol{\mathsf{W}}}}_+(t<0) =0$ is equivalent to restricting $\hat {{{\boldsymbol{\mathsf{W}}}}}_+(\omega )$ to be regular in the upper half of the complex plane. This equivalence can be observed from its inverse Fourier transform,
which can be computed for $t<0$ by closing the contour around the upper half-plane and using the residue theorem. As $\mathrm {e}^{-\mathrm {i} \omega t}\to 0$ for $|\omega | \to \infty$ if $t<0$ and $\Im (\omega )>0$, the integral on the upper contour closure is zero and, as neither $\hat {{{\boldsymbol{\mathsf{W}}}}}_+(\omega )$ nor the exponential function has poles in the upper half-plane, the integral vanishes for all $t<0$. A similar argument holds for ${{\boldsymbol{\mathsf{W}}}}_-$ and $t>0$, with the contour being closed from below. Henceforth, plus and minus subscripts are used to label functions that are regular in the upper and lower halves of the complex frequency plane, respectively. Both frequency- and time-domain representations of functions will be used interchangeably for convenience or clarity.
Other related Wiener–Hopf problems read as
and
where $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}} \in \mathbb {C}^{n_y\times n_y}$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}} \in \mathbb {C}^{n_z\times n_y}$.
A.2. Solving Wiener–Hopf problems
To obtain optimal causal estimation and partial-knowledge control, the Wiener–Hopf problems (2.22) and (2.40) need to be solved. In what follows, formal solutions for these equations are presented. These solutions are based on the factorization of the kernels into plus and minus components. As analytical factorization is known only for special cases, a numerical method to factorize matrix functions, tailored for functions that are known only numerically, is presented.
Before proceeding, we define the two types of factorizations that will be used: additive and multiplicative. Multiplicative factorization of a matrix function $\hat {{{\boldsymbol{\mathsf{D}}}}}$ reads as
while an additive factorization reads as
where all factors have the same size as the original matrix, and multiplicative factorizations are only defined for square matrices. To differentiate these two types of factorization, multiplicative factors will have the subscripts applied directly to them, as in (A9), and additive factors will be presented with the subscripts applied outside the parentheses, as in (A10).
These factorizations are not unique. For a multiplicative factorization as in (A9), a new valid factorization is constructed as $\hat {{{\boldsymbol{\mathsf{H}}}}}_{{{{\boldsymbol{\mathsf{l}}}}}_-} {{\boldsymbol{\mathsf{J}}}}$ and ${{\boldsymbol{\mathsf{J}}}}^{-1} {{\boldsymbol{\mathsf{H}}}}_{{{{\boldsymbol{\mathsf{l}}}}}_+}$, for any constant and invertible matrix ${{\boldsymbol{\mathsf{J}}}}$. Likewise, new additive factorizations are obtained by adding a constant to the plus factor and subtracting it from the minus factor.
Any multiplicative factorization can be used to solve Wiener–Hopf problems with the methods presented in this work, and, thus, we do not impose any extra condition to make it unique. However, we restrict additive factorizations to standard factorizations (Noble Reference Noble1959; Daniele & Zich Reference Daniele and Zich2014), that is,
Note that the multiplicative factorization is also known as spectral factorization (Claerbout Reference Claerbout1976) in the signal processing community and is frequently expressed in terms of the $Z$, instead of the Fourier, transform. Typical methods to obtain this factorization are the root method, which provides an analytical factorization if the poles of the kernel are known, and the Levinson algorithm, which is based on recursion to solve a deconvolution problem. Any of these methods can in principle be used for the solution of the Wiener–Hopf problems presented here. In this work we use the strategy proposed by Daniele & Lombardi (Reference Daniele and Lombardi2007).
A.2.1. Formal solution
To obtain a solution for (A4), a multiplicative factorization of the kernel $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$,
is used. After manipulation, (A4) becomes
Using an additive factorization of $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_-^{-1}(\omega ) \hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}(\omega )$, (A13) is rewritten as
Equation (A14) is in Wiener–Hopf form, with only plus (minus) functions on the left(right)-hand side. Thus, the left- and right-hand sides are analytical functions in the upper and lower complex half-planes for $\omega$, respectively. Solution of the Wiener–Hopf equation amounts to stating that the left- and right-hand sides are analytical continuations of each other, which allows us to define a single function of $\omega$ that is analytical everywhere.
That each side of the equation contains only plus or minus terms suggests that each side can be solved independently, as, loosely speaking, each side is an equation for positive/negative times only. To formalize this idea, we make use of the following assumptions:
(i) $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}(\omega )$ is bounded and positive definite,
(ii) $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}(\omega )$ has no poles on the real line,
(iii) $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}(\pm \infty )\to 0$,
and define
Assumption (i) guarantees that $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_\pm$ is invertible, and that $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_-^{-1}$ does not create any poles on the right-hand side of (A14). Assumption (ii) guarantees that (A14) is valid on a strip around the real axis, ${-\epsilon <\Im (\omega )<\epsilon }$. As the left-/right-hand sides of (A14) are the analytical continuations of the equation from this strip into the upper/lower half-planes, these two functions and $\mathcal {L}(\omega )$ are regular everywhere. Since they are bounded, by Liouville's theorem, they are also constant. Finally, assumption (iii) and the use of a standard additive factorization in (A14) guarantee that the left-hand side of (A14) goes to zero for $\omega \to \infty$ and, thus, that $\mathcal {L}(\omega ) =0$.
The solution of (A4) is obtained as
In this work, assumptions (i)–(iii) are satisfied by construction. The kernels are constructed from a Hermitian quadratic form of the resolvent operator, to which a constant and Hermitian positive-definite matrix is added, thus guaranteeing assumption (i). The restriction to stable systems guarantees that the resolvent operator has no poles on the real line and, thus, neither do the kernels, guaranteeing assumption (ii). Finally, as the term $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ is a linear function of the resolvent operator, and since ${{\boldsymbol{\mathsf{R}}}} \propto 1/\omega$ for $\omega \to \infty$, assumption (iii) is also guaranteed.
Two other Wiener–Hopf problems, (A7) and (A8), are used in this study. The first, which appears when solving for the full-knowledge control kernel in Appendix B, reads as
where $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}} \in \mathbb {C}^{n_y\times n_y}, \hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}} \in \mathbb {C}^{n_z\times n_y}$ and $\hat {{{\boldsymbol{\mathsf{W}}}}}_\pm \in \mathbb {C}^{n_z\times n_y}$ are matrix functions. Making assumptions for $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$ similar to those made for $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ and $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{r}}}}}}$, solutions are obtained as
where $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}(\omega )$ has a multiplicative factorization with a different convention from that of $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}} (\omega )$, given by
The second problem, which appears when solving for the partial-knowledge optimal control kernel in § 2.3, reads as
where now $\hat {{{\boldsymbol{\mathsf{W}}}}}_\pm \in \mathbb {C}^{n_a\times n_y}$. With the same assumptions as before, solutions are given by
A.2.2. Numerical Wiener–Hopf factorizations
An analytical expression for additive factorization reads as (Noble Reference Noble1959)
with integration contours indented below or above the pole at $\omega$ for $+$ and $-$ functions, respectively. However, when the factors are desired numerically, additive factorizations can be easily obtained using Fourier transforms. Applying an inverse transform, the time-domain representation of the function is obtained. This representation is then split into its plus (minus) component by multiplication with a Heaviside step function, i.e. setting to zero all values for $t<0$ ($t>0$). A Fourier transform is then used to recover the frequency-domain representation. The functions thus obtained constitute a standard factorization: if the original function is smooth, i.e. its spectral content goes to zero for high enough frequencies, so will that of the factors calculated with the procedure just described.
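The following sketch implements this FFT-based additive split for a kernel sampled on a uniform frequency grid. It is a minimal illustration under stated assumptions: the samples are taken to be in numpy's FFT ordering, the transform sign conventions must match those used for the kernels, and the $t=0$ sample is arbitrarily assigned to the plus factor (the split is not unique in any case).

```python
import numpy as np

def additive_factorization(D_hat):
    """Standard additive split D_hat ~= D_plus + D_minus of a sampled kernel.

    D_hat : (n_w, n, n) array of matrix-kernel samples on a uniform frequency
            grid in numpy FFT ordering (zero frequency first).
    The plus (minus) factor has a time-domain representation that vanishes
    for t < 0 (t > 0) and is therefore regular in the upper (lower) half-plane.
    """
    n_w = D_hat.shape[0]
    D_t = np.fft.ifft(D_hat, axis=0)            # time-domain representation
    t_sign = np.fft.fftfreq(n_w)                # positive for t > 0, negative for t < 0
    plus_mask = (t_sign >= 0)[:, None, None]    # Heaviside step function
    D_plus_t = D_t * plus_mask                  # vanishes for t < 0
    D_minus_t = D_t * ~plus_mask                # vanishes for t >= 0
    return np.fft.fft(D_plus_t, axis=0), np.fft.fft(D_minus_t, axis=0)
```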
Multiplicative factorization for scalar problems can be reduced to an additive factorization using a logarithm function to convert multiplication into addition (Noble Reference Noble1959; Peake Reference Peake2004), with the factorization reading as
This procedure, however, requires that the quantities commute, which is not generally the case when $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$ is a matrix. Analytical Wiener–Hopf factorizations are only known for special classes of matrices (Daniele Reference Daniele1978), and an analytical method for the factorization of general matrices is still unknown.
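For the scalar case, a hedged sketch of this logarithm-based (spectral) factorization, reusing the additive split above, is given below. It assumes a real, strictly positive scalar kernel sampled on the same grid; for a complex-valued kernel a continuous branch of the logarithm would be needed, and any constant offset of the kernel is simply absorbed into one of the factors, which is permissible given the non-uniqueness discussed above.

```python
import numpy as np

def scalar_multiplicative_factorization(h_hat):
    # Scalar spectral factorization h_hat = h_minus * h_plus via the logarithm:
    # log(h) is additively split and the factors are exponentiated. Assumes
    # h_hat is real and strictly positive on the sampled frequency grid.
    log_h = np.log(h_hat).reshape(-1, 1, 1)            # treat as a 1x1 matrix kernel
    log_plus, log_minus = additive_factorization(log_h)
    return np.exp(log_minus[:, 0, 0]), np.exp(log_plus[:, 0, 0])   # (h_minus, h_plus)
```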
In this work we use a method similar to that described by Daniele & Lombardi (Reference Daniele and Lombardi2007) to obtain multiplicative matrix factorizations for kernels that are known numerically, rather than analytically.
The multiplicative factorization, satisfying (A12), can be obtained from $n_a$ independent solutions of
as
where $n_a$ is the size of the square matrix $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$. That is, matrices $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_-$ and $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_+^{-1}$ have vectors $\hat {\boldsymbol {w}}_{i,\mp }$ as columns, respectively.
To obtain solutions of (A25), we divide it by $\omega -\omega _0$, with $\Im (\omega _0)<0$, and integrate along a line that crosses the real axis and closes around the lower half-plane. Defining $\hat {\boldsymbol {x}}_i = \hat {\boldsymbol {w}}_{i,+}/ (\omega -\omega _0)$, (A25) becomes a Fredholm integral equation of the second kind (Daniele & Lombardi Reference Daniele and Lombardi2007),
Note that (A28) has only one unknown, $\hat {\boldsymbol {x}}$, while (A25) has two, $\hat {\boldsymbol {w}}_{i,-}$ and $\hat {\boldsymbol {w}}_{i,+}$. The integration of the unknown term $\hat {\boldsymbol {w}}_{i,-}(\omega )$ is carried out with the residue theorem, leading to $\hat {\boldsymbol {w}}_{i,-}(\omega _0)$, which is constant and can be arbitrarily specified; the canonical basis vectors ($\hat {\boldsymbol {w}}_{i,-}(\omega _0) =\boldsymbol {e}_i$) are a natural choice. The parameter $\omega _0$ can be arbitrarily chosen, although different values can change convergence requirements for the numerical solution of the equation. As discussed by Daniele & Lombardi (Reference Daniele and Lombardi2007), $\omega _0$ introduces an apparent singularity in the equation that, while not impacting the analytical solutions, can lead to numerical instabilities for approximate numerical solutions. Values close to the real axis lead to a right-hand side that has sharp variations and, thus, requires finer frequency discretization to be resolved, whereas values excessively far from the real axis cause the left-hand side to have significant values on a larger domain, thus requiring the discretization of a larger frequency range. Daniele & Lombardi (Reference Daniele and Lombardi2007) suggest choosing $\omega _0$ such that it corresponds to singularities of the physical problem under study, but in the context of the problem studied here the choice is not obvious. It is thus necessary to check convergence using different values of $\omega _0$ and/or different frequency discretizations.
Daniele & Lombardi (Reference Daniele and Lombardi2007) discretized (A28) to construct a matrix representing its right-hand side. The numerical solution was obtained by solving the resulting linear problem. Deformation of the integration path into the complex $\omega$ plane was used to improve the convergence rate of the solutions whenever the kernel had poles close to the real line. Similarly, Atkinson & Shampine (Reference Atkinson and Shampine2008) used different integration weights and collocation points to deal with such singularities. Throughout this work, we focus on kernels that are obtained numerically and, thus, only available on the real frequency line. Deformation of the integration path is thus unfeasible, and convergence is obtained by refining the frequency discretization.
Solving (A28) directly requires the solution of a linear problem of size $n_a^{2} n_\omega$, where $n_\omega$ is the number of frequency points used, and $n_a$ the size of the square matrix $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}$. This approach becomes impractical for the frequency discretization required for convergence of the results in this study. Instead, we rewrite (A28) as
where
is the Hilbert transform of $\hat {\boldsymbol {x}}(\omega )$. Hilbert transforms can be efficiently computed numerically using fast Fourier transforms (Todoran, Holonec & IAKAB Reference Todoran, Holonec and IAKAB2008), and, thus, the left-hand side of (A29) can be obtained without the construction of the matrix that represents it. The problem is thus well suited for solutions via iterative methods, such as GMRES (generalized minimal residual), used here.
As mentioned by Zhou et al. (Reference Zhou, Yang, Liu and Yang2009), using Fourier transforms to compute Hilbert transforms can lead to significant errors at the extremities of the signal, due to the implicit assumption of periodicity. The signals thus need to be zero padded to avoid such errors. Due to the slow decay of the term $1/(\omega -u)$, large paddings can be necessary. Padding to 20 times the signal length has been used throughout this study.
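A minimal sketch of these two ingredients is given below: an FFT-based, zero-padded Hilbert transform and a matrix-free GMRES solve via `scipy.sparse.linalg.LinearOperator`. The transform sign convention and the exact form of the left-hand-side operator of (A29), which is not reproduced here, are assumptions to be supplied by the user.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def hilbert_transform(x, pad_factor=20):
    # FFT-based Hilbert transform of a sampled (possibly complex) function of
    # omega, zero padded to reduce the periodicity error discussed in the text.
    n = x.size
    xp = np.concatenate([x, np.zeros((pad_factor - 1) * n, dtype=complex)])
    tau = np.fft.fftfreq(xp.size)                 # conjugate variable; only its sign is used
    Hxp = np.fft.ifft(-1j * np.sign(tau) * np.fft.fft(xp))   # sign convention is an assumption
    return Hxp[:n]

def solve_fredholm(apply_lhs, rhs):
    # Matrix-free GMRES solution of the discretized Fredholm equation (A29).
    # apply_lhs(x) must return the action of the left-hand-side operator,
    # built from hilbert_transform above according to (A29).
    op = LinearOperator((rhs.size, rhs.size), matvec=apply_lhs, dtype=complex)
    x, info = gmres(op, rhs)
    return x
```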
Using $n_a$ linearly independent solutions, $\hat {{{\boldsymbol{\mathsf{H}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_\pm (\omega )$ can be obtained from
A factorization with the order of plus and minus functions exchanged, i.e. $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}} (\omega) = \hat{{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_+ (\omega ) \hat{{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_- (\omega)$, can be obtained via the same method using an auxiliary matrix ${\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}'(\omega ) = \hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}^{*} (\omega )}$, where $^{*}$ represents complex conjugation. From a factorization of $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}'(\omega )$, the desired factorization is obtained as $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_+ (\omega ) = \hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_-'^{*} (\omega )$ and $\hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_-(\omega ) = \hat {{{\boldsymbol{\mathsf{G}}}}_{{{\boldsymbol{\mathsf{l}}}}}}_+'^{*} (\omega )$.
A.2.3. Convergence
To assess the convergence of the method, we compare the control kernels obtained for the Ginzburg–Landau system described in § 2.5. A normalized error is defined as
where $\boldsymbol {\varGamma }'_c$ is the kernel computed using the Wiener–Hopf approach and $\boldsymbol {\varGamma }'_{lqg}$ is obtained as in (2.53). A time interval $[-T,T ]$ was discretized with points spaced by $\Delta t$, corresponding to a sampling frequency $\omega _s= 2{\rm \pi} /\Delta t$ and a frequency resolution of $\Delta \omega = 2{\rm \pi} /T$.
The normalized error is shown in figure 19 for different values of $T$ and $\Delta t$. The cost of the factorization scales linearly with the number of points in the frequency discretization. The linear convergence of the kernel with $\Delta t$ is a consequence of its discontinuity at $\tau =0$: this is the convergence rate of a Fourier series for discontinuous functions. Note that when actuators are close to the sensor, the discontinuity is stronger, leading to the larger errors seen when the actuator is at $x=7$. The convergence trend is nevertheless unaffected. Convergence with respect to the domain size $T$ is very fast and, thus, has a small impact on the overall cost.
The effect of having multiple sensors and actuators on the factorization is explored in figure 20. Adding sensors/actuators makes the requirements on $T$ and $\Delta t$ for a given accuracy more stringent, but does not significantly affect the convergence trends. It is also seen that the cost scales linearly with the number of points used for the time/frequency discretization. The increase in cost with $n_a$ and $n_y$ is due to the need to perform matrix multiplications, whose cost scales with the square of the matrix size. In all scenarios, good accuracy is obtained within a few minutes on a standard laptop. Note also that this cost does not scale with the size of the system and, thus, remains roughly the same for any system. When applied to complex flows, the factorization is thus orders of magnitude less costly than the time-domain solutions of the direct and adjoint problems described in § 3.1.
Appendix B. The full-knowledge control problem
Complementing the optimal estimation (§ 2.2) and partial-knowledge control (§ 2.3), we here present the derivation of the optimal full-knowledge control.
Analogous to the procedure used in § 2.2, optimal control is obtained by minimizing a cost functional given by
Full-knowledge control implies that the current system state is known, from which $\boldsymbol {z}_1$ can be computed. If external forcing is present, $\boldsymbol {z}_1$ needs to be updated at each time instant. An actuation, $\boldsymbol {a}_{nc}(t)$, that minimizes the cost functional can be obtained in terms of $\boldsymbol {z}_1$ alone. Expanding terms in (B1) gives
and differentiation with respect to $\hat {\boldsymbol {a}}_{nc}^{{\dagger} }(\omega )$ leads to
where
As in the estimation problem, causality, i.e. $\boldsymbol {a}_c(t<0)=0$, can be enforced with Lagrange multipliers. Taking the derivative of the modified cost functional given by
with respect to $\boldsymbol {a}_c^{{\dagger} }$ yields the Wiener–Hopf problem
which has the structure of (A4) and solution given by (A16).