1. Introduction
The improved confinement of modern stellarators is largely attributed to the numerical optimization of the magnetic field. With the technique pioneered by Nührenberg & Zille (Reference Nührenberg and Zille1988), the shape of the plasma boundary, $S_P$, is optimized for certain properties of the magnetohydrodynamic (MHD) equilibrium,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn1.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn2.png?pub-status=live)
where $S_P = \partial V_P$,
$p(\psi )$ is the prescribed pressure profile and
$\hat {\boldsymbol {n}}$ is the unit normal. For the past three decades, stellarator optimization has largely proceeded with this fixed-boundary approach with codes such as STELLOPT (Lazerson et al. Reference Lazerson, Schmitt and Zhu2020) and ROSE (Drevlak et al. Reference Drevlak, Beidler, Geiger, Helander and Turkin2018), resulting in the W7-X (Beidler et al. Reference Beidler, Grieger, Herrnegger, Harmeyer, Kisslinger, Lotz, Maassberg, Merkel, Nührenberg and Rau1990), HSX (Anderson et al. Reference Anderson, Almagri, Anderson, Matthews, Talmadge and Shohet1995) and NCSX (Zarnstorff et al. Reference Zarnstorff, Berry, Brooks, Fredrickson, Fu, Hirshman, Hudson, Ku, Lazarus and Mikkelsen2001) configurations. We use the term ‘fixed-boundary’ to describe this approach as the equilibrium is computed with a specified boundary, although the boundary is varied throughout the optimization.
Although there has been significant success with this approach, there are several ways to improve the performance of fixed-boundary optimization. Specifically, the incorporation of derivative information could be transformative. Fixed-boundary optimization has previously relied on derivative-free methods, such as genetic algorithms (Miner et al. Reference Miner, Valanju, Hirshman, Brooks and Pomphrey2001), differential evolution (Mynick, Pomphrey & Ethier Reference Mynick, Pomphrey and Ethier2002) and Brent's algorithm (Drevlak et al. Reference Drevlak, Beidler, Geiger, Helander and Turkin2018), or derivative-based methods with finite-difference gradients. While global derivative-free algorithms may prevent the optimization from terminating in local minima, they are only effective for smaller problems (Nocedal & Wright Reference Nocedal and Wright2006). Gradient-based optimization with finite-difference gradients requires excessive function evaluations in high-dimensional spaces and suffers from error that enters due to the choice of step size. If the step size is too small, the error is dominated by round-off error, and if it is too large it is dominated by nonlinearity (Sauer Reference Sauer2012). Owing to the requirement of excessive function evaluations and the unreliability of the gradient information, gradient-based optimization with finite-difference derivatives is not always effective. In this work, we present the first example of analytic gradient-based, fixed-boundary optimization of stellarator equilibria.
With a local gradient descent approach, each iteration reduces to a one-dimensional line search (Nocedal & Wright Reference Nocedal and Wright2006); thus, the further incorporation of derivative information eliminates restrictions on the size of the optimization space. There is some evidence from the machine learning community that overparameterization of the space can accelerate optimization (Oymak & Soltanolkotabi Reference Oymak and Soltanolkotabi2018). Therefore, it is possible that increasing the Fourier resolution of the plasma boundary may similarly eliminate local minima.
There are several ways that this derivative information can be obtained. For sufficiently simple figures of merit, the objective can be analytically differentiated and implemented by hand. Alternatively, the derivatives can be obtained programmatically using automatic differentiation tools. When a given objective function depends on the solution of a set of equations, such as the MHD equilibrium equations (1.1), the derivatives can be obtained more efficiently using an adjoint method. With this technique, the solution of only one additional equation, known as the adjoint equation, is required. Once the adjoint solution is obtained, the derivative of a given objective can be obtained with respect to any optimization parameter, eliminating the need to solve a perturbed set of equations. In this way, the cost associated with obtaining a high-dimensional gradient is reduced significantly. In this work, we employ adjoint-based gradients of functions which depend on the MHD equilibrium equations. This adjoint method results from a generalized self-adjointness property of the MHD force operator (Antonsen, Paul & Landreman Reference Antonsen, Paul and Landreman2019). This technique has been demonstrated for computing the shape derivatives of several figures of merit relevant for stellarator design, including the magnetic well, rotational transform, and magnetic ripple (Paul et al. Reference Paul, Antonsen, Landreman and Cooper2020). Each of these objective functions have been included in modern stellarator designs (Beidler et al. Reference Beidler, Grieger, Herrnegger, Harmeyer, Kisslinger, Lotz, Maassberg, Merkel, Nührenberg and Rau1990; Anderson et al. Reference Anderson, Almagri, Anderson, Matthews, Talmadge and Shohet1995; Zarnstorff et al. Reference Zarnstorff, Berry, Brooks, Fredrickson, Fu, Hirshman, Hudson, Ku, Lazarus and Mikkelsen2001; Henneberg et al. Reference Henneberg, Drevlak, Nührenberg, Beidler, Turkin, Loizu and Helander2019). We also employ analytic gradients for objectives that do not depend on the MHD equilibrium equations, such as the volume and properties of the surface curvatures.
There have been several recent applications of derivative information to other related problems in stellarator design. The FOCUS (Zhu et al. Reference Zhu, Hudson, Song and Wan2018) and FOCUSADD (McGreivy, Hudson & Zhu Reference McGreivy, Hudson and Zhu2021) codes optimize coil shape to be consistent with a given plasma boundary with gradients obtained from analytic and automatic differentiation methods, respectively. Our work is distinct from the FOCUS approach, as we use gradients to optimize properties of the equilibrium rather than using gradients to optimize coils in order to match the boundary of a given equilibrium. Adjoint methods have recently been applied to directly optimize coils for quasi-symmetry near the magnetic axis in a vacuum field (Giuliani et al. Reference Giuliani, Wechsung, Cerfon, Stadler and Landreman2020). In contrast, our approach can be applied to optimize equilibria with arbitrary pressure. Furthermore, we have developed adjoint methods for direct optimization of coil shapes for properties of an MHD equilibrium (Antonsen et al. Reference Antonsen, Paul and Landreman2019; Paul et al. Reference Paul, Antonsen, Landreman and Cooper2020), although its application to optimization is not presented in this work. Adjoint methods have also been used to optimize the local magnetic field for neoclassical properties (Paul et al. Reference Paul, Abel, Landreman and Dorland2019) and the coil winding surface for properties of the current potential (Paul et al. Reference Paul, Landreman, Bader and Dorland2018).
We discuss the new gradient-based fixed-boundary optimization tool in § 2. In § 3, we present regularization terms for fixed-boundary optimization, including a constraint on the curvature of the boundary that prevents self-intersection. In § 4, we present several optimization demonstrations, including obtaining a low magnetic shear configuration (§ 4.1), a configuration with a magnetic well (§ 4.2), and a configuration with quasi-symmetry near the magnetic axis (§ 4.3). We conclude in § 5.
2. Overview of ALPOpt optimization tool
As with the STELLOPT and ROSE codes, the ALPOptFootnote 1 tool optimizes the boundary of VMEC (Hirshman & Whitson Reference Hirshman and Whitson1983) equilibria. The VMEC code obtains solutions of (1.1) under the assumption of nested toroidal magnetic surfaces. The plasma boundary is described by a set of Fourier coefficients of the cylindrical coordinates, $\{R_{m,n}^c,Z_{m,n}^s\}$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn3.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn4.png?pub-status=live)
where $\theta$ is a poloidal angle,
$\phi$ is the cylindrical toroidal angle and
$N_P$ is the number of field periods. Therefore, the optimization space is taken to be the set of coefficients
$\{R_{m,n}^c, Z_{m,n}^s\}$. The pressure profile
$p(\psi )$ and another function of flux, either the rotational transform
$\iota (\psi )$ or enclosed toroidal current
$I_T(\psi )$, are prescribed and fixed. (For all of the examples in this work,
$I_T(\psi )$ is fixed.) The optimization code interfaces with VMEC through python, and optimization is performed with the scipyFootnote 2 and NLOPT (Johnson Reference Johnson2014) packages.
The optimization tool takes advantage of the adjoint method for obtaining the shape gradient of MHD equilibria (Antonsen et al. Reference Antonsen, Paul and Landreman2019; Paul et al. Reference Paul, Antonsen, Landreman and Cooper2020). Here the perturbation to the magnetic field due to a perturbation of the boundary is expressed as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn5.png?pub-status=live)
where the displacement vector satisfies $\boldsymbol {\xi }_1 \boldsymbol {\cdot } \hat {\boldsymbol {n}} = \delta \boldsymbol {x} \boldsymbol {\cdot } \hat {\boldsymbol {n}}$ on the boundary for a given normal perturbation to the surface and
$\delta \iota _1(\psi )$ is the perturbation to the rotational transform profile that may arise due to the constraint of fixed
$I_T(\psi )$. This perturbed magnetic field can be related to an adjoint perturbed magnetic field,
$\delta \boldsymbol {B}_2$, through a generalized self-adjointness relation,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn6.png?pub-status=live)
where the linearized force operator is
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn7.png?pub-status=live)
The adjoint approach is as follows: rather than directly computing the perturbed magnetic field $\delta \boldsymbol {B}_1$ by perturbing the boundary of a fixed-boundary equilibrium or solving a set of linearized equilibrium equations, the adjoint magnetic field is computed. The adjoint magnetic field has no perturbation to the boundary (
$\boldsymbol {\xi }_2 \boldsymbol {\cdot } \hat {\boldsymbol {n}} = 0$) but may have a perturbation to the toroidal current profile (
$\delta I_{T,2}(\psi )$) or a bulk force perturbation (
$\boldsymbol {F}_2$). In this work, we consider figures of merit whose derivatives can be computed with a perturbation to the toroidal current profile or a bulk force perturbation which takes the form of the gradient of a scalar function of flux or the divergence of an anisotropic pressure tensor.
Rather than consider a set of linearized equations, we add a small perturbation to a nonlinear VMEC equilibrium in the form of a perturbation to the pressure profile, toroidal current profile, or an anisotropic pressure tensor. In the case of the addition of a pressure tensor, the ANIMEC (Cooper et al. Reference Cooper, Hirshman, Merazzi and Gruber1992) code is used to evaluate the adjoint equilibrium. The resulting shape gradient, $\mathcal {G}$, of a given objective function
$f$, is defined through,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn8.png?pub-status=live)
Here $\delta f(S_P;\delta \boldsymbol {x})$ is the shape derivative of
$f$ with respect to a normal perturbation of the surface,
$\delta \boldsymbol {x} \boldsymbol {\cdot } \hat {\boldsymbol {n}}$. Given the shape gradient, which quantifies the local sensitivity to normal perturbations of the surface, the derivatives with respect to the parameters
$\varOmega = \{R_{m,n}^c,Z_{m,n}^s\}$ are computed,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn9.png?pub-status=live)
2.1. Managing code failures
The equilibrium code may return with an error for a given plasma boundary. There are many reasons for these failures, such as the flux coordinate Jacobian becoming ill-conditioned or the number of iterations exceeding the maximum prescribed value. When such a failure is experienced during the optimization, the objective function is set to an arbitrarily large value (e.g. $10^{12}$) to enforce an effective constraint. The approach of assigning a very large fictitious objective value at unevaluable points is common in the optimization literature (Rasheed, Hirsh & Gelsey Reference Rasheed, Hirsh and Gelsey1997; Emmerich et al. Reference Emmerich, Giotis, Özdemir, Bäck and Giannakoglou2002) and is employed in the STELLOPT code (Lazerson et al. Reference Lazerson, Schmitt and Zhu2020).
As encountering such unevaluable points makes the parameter space non-smooth, it is prudent to try to avoid code failures by placing additional constraints on the parameter space. Such a technique has been employed in the optimization of aircraft by checking that simulation outputs match physical model assumptions, such as the drag coefficient being non-negative (Gelsey Reference Gelsey1995; Gelsey, Schwabacher & Smith Reference Gelsey, Schwabacher and Smith1998). In § 3, we present constraints which prevent the surface from self-intersecting or from obtaining large curvature, leading to an ill-conditioned Jacobian and code failure. Even in the presence of such constraints, unevaluable points may still be present. Here the objective function becomes discontinuous, which is problematic for optimizers which assume function continuity. In this work we employ the Broyden–Fletcher–Goldfarb–Shanno (BFGS) quasi-Newton method with an Armijo–Wolfe line search, which assumes $C^2$ continuity of the objective function (Nocedal & Wright Reference Nocedal and Wright2006). Although convergence of the BFGS algorithm is not guaranteed for non-smooth problems, it has been observed that a reasonable approximation to the optimum is most often achieved (Lemaréchal Reference Lemaréchal1982). With the standard Armijo–Wolfe line search method, the gradient need not be evaluated unless the objective function satisfies the sufficient decrease criterion. Thus the gradient does not need to be evaluated at unevaluable points. In this way, as long as the line search never returns a point where the objective is not differentiable, the BFGS method is well defined. Although there are specialized quasi-Newton methods for non-smooth objective functions (Lewis & Overton Reference Lewis and Overton2013), we have obtained acceptable results with a standard BFGS method.
3. Regularization terms
3.1. Preventing surface self-intersection
We now describe a constraint which prevents self-intersection of the plasma boundary. Given a surface described by the cylindrical coordinates $R(\theta ,\phi )$ and
$Z(\theta ,\phi )$, self-intersection of the boundary may occur if either:
(a)
$R(\theta ,\phi )< 0$ at any point; or
(b) the planar curve
$\boldsymbol {x}_{\phi _0}(\theta ) = \{ R(\theta ,\phi _0), Z(\theta ,\phi _0)\}$ is self-intersecting for any
$\phi _0$.
Condition $(a)$ can be avoided using a penalty objective of the form,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn10.png?pub-status=live)
where $w_R$ is a weight which sets the gradient length scale of the objective,
$R_\textrm {min}$ is the minimum allowable major radius and
$A_P$ is the area of
$S_P$.
Condition $(b)$ can be avoided by introducing a constraint on the global radius of curvature (Gonzalez & Maddocks Reference Gonzalez and Maddocks1999; Walker Reference Walker2016) of each of the curves,
$\boldsymbol {x}_{\phi _0}(\theta )$, which will now be defined. The radius of the unique circumcircle containing any three points
$\boldsymbol {x}_1$,
$\boldsymbol {x}_2$, and
$\boldsymbol {x}_3$ (figure 1), can be computed from
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn11.png?pub-status=live)
where $\mathcal {A}(\boldsymbol {x}_1,\boldsymbol {x}_2,\boldsymbol {x}_3)$ is the area of a triangle with vertices
$\boldsymbol {x}_1$,
$\boldsymbol {x}_2$, and
$\boldsymbol {x}_3$. Under the assumption that these points lie on a non-self-intersecting smooth curve,
$\mathcal {C}$ such that
$\boldsymbol {x}_1 = \boldsymbol {x}(l_1)$,
$\boldsymbol {x}_2 = \boldsymbol {x}(l_2)$, and
$\boldsymbol {x}_3 = \boldsymbol {x}(l_3)$, then the radius of the circumcircle satisfies
$\rho (\boldsymbol {x}(l_1),\boldsymbol {x}(l_2),\boldsymbol {x}(l_2)) \le \rho (\boldsymbol {x}(l_1),\boldsymbol {x}(l_2),\boldsymbol {x}(l_3))$. This limiting case can be computed from
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn12.png?pub-status=live)
where $\hat {\boldsymbol {t}}$ is the unit tangent. This is the radius of the circle that passes through
$\boldsymbol {x}(l_1)$ and is tangent to
$\mathcal {C}$ at
$\boldsymbol {x}(l_2)$, which we define as the self-contact function between these points,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn13.png?pub-status=live)
the radius of the so-called point-tangent circle (figure 1).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_fig1.png?pub-status=live)
Figure 1. (a) The circumcircle containing $\boldsymbol {x}_1$,
$\boldsymbol {x}_2$ and
$\boldsymbol {x}_3$ with radius
$\rho (\boldsymbol {x}_1,\boldsymbol {x}_2,\boldsymbol {x}_3)$. (b) The points
$\boldsymbol {x}_1$,
$\boldsymbol {x}_2$, and
$\boldsymbol {x}_3$ lie on a closed elliptical curve
$\mathcal {C}$ (blue). The point tangent radius for points
$\boldsymbol {x}_1$ and
$\boldsymbol {x}_2$ and points
$\boldsymbol {x}_1$ and
$\boldsymbol {x}_3$ are displayed. The global radius of curvature at
$\boldsymbol {x}_1$ is
$\mathcal {S}_{\mathcal {C}}(\boldsymbol {x}_1,\boldsymbol {x}_2)$.
In the limit that $l_2, l_3 \rightarrow l_1$, we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn14.png?pub-status=live)
where $\rho _{\mathcal {C}}(\boldsymbol {x}(l_1))$ is the local radius of curvature of
$\mathcal {C}$ at
$\boldsymbol {x}(l_1)$. We define the global radius of curvature as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn15.png?pub-status=live)
This can be thought of as a generalization of the local radius of curvature, as we have the inequality $0 \le \rho _G(\boldsymbol {x}) \le \rho _{\mathcal {C}}(\boldsymbol {x})$. At a given point, the minimizer in (3.6) will either be a point where
$(\boldsymbol {x}_1 - \boldsymbol {x}_2) \boldsymbol {\cdot } \hat {\boldsymbol {t}}(\boldsymbol {x}_2) = 0$ or
$\rho _{\mathcal {C}}(\boldsymbol {x}_2) = \mathcal {S}_{\mathcal {C}}(\boldsymbol {x}_1,\boldsymbol {x}_2)$ (Smutny Reference Smutny2004). In other words, at a given point
$\boldsymbol {x}_1$, the point on the curve that has the smallest self-contact function containing
$\boldsymbol {x}_1$ will be where the local radius of curvature is equal to the point-tangent radius or the displacement of the points
$(\boldsymbol {x}_1 - \boldsymbol {x}_2)$ is orthogonal to the curve at
$\boldsymbol {x}_2$.
If the inequality
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn16.png?pub-status=live)
is satisfied, this implies that the curve can be ‘thickened’ to a tube without self-contact surrounding $\mathcal {C}$ such that at any given point, the cross-section of the tube is a circle in the plane perpendicular to the local tangent vector of radius
$\mathcal {R}$ (Gonzalez & Maddocks Reference Gonzalez and Maddocks1999). For this reason, the concept of the global radius of curvature has been employed for the shape optimization of finite-thickness knots (Gonzalez & Maddocks Reference Gonzalez and Maddocks1999; Carlen et al. Reference Carlen, Laurie, Maddocks and Smutny2005; Walker Reference Walker2016).
We enforce the inequality constraint (3.7) to prevent the self-intersection of every planar curve $\boldsymbol {x}_{\phi _0}(\theta )$ with a penalty function of the form
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn17.png?pub-status=live)
where $\mathcal {R}$ is the minimum allowable global radius of curvature and
$w_{\mathcal {S}}$ is a weight function. In addition to preventing self-intersection of the boundary, this objective improves the regularity of the surface by reducing the curvature of the poloidal cross-sections. A demonstration of this penalty function is presented in § 3.3.
3.2. Surface curvature constraint
While constraining the self-contact function reduces the curvature of the cross-sections of the plasma boundary, we include additional terms which improve the regularity of the plasma boundary. First, we place an effective constraint on the magnitude of the principal curvatures,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn18.png?pub-status=live)
by including a penalty function of the form,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn19.png?pub-status=live)
A similar constraint on the principal curvatures is employed in the ROSE code (Drevlak et al. Reference Drevlak, Beidler, Geiger, Helander and Turkin2018). We assume the convention that $\kappa _1 \le \kappa _2$ and
$\kappa _{1,2}<0$ indicates concavity.
We additionally add a penalty on the smallest principal curvature,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn20.png?pub-status=live)
As large concavity is associated with coil complexity (Paul et al. Reference Paul, Landreman, Bader and Dorland2018), this regularization becomes critical in the two-staged optimization approach. Also, regularity of the boundary improves the convergence of the VMEC code. If the boundary is not close to being a star domain, indicating that there exists a coordinate axis such that the line segment connecting the axis and any point on the boundary is contained within the boundary, then the solver may fail to initialize a guess for the magnetic axis. For highly shaped boundaries, the coordinate surfaces may begin to overlap. Thus, the inclusion of these regularization terms prevents code failures during optimization and improves the convergence toward the optimum. This advantage of curvature penalization is highlighted in the magnetic well optimization in § 4.2.
3.3. Constrained volume optimization
To demonstrate the procedure described in § 3.1 for avoiding self-intersection, we define an objective function,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn21.png?pub-status=live)
where $V_P$ is the volume enclosed by the plasma boundary. For this example, we take
$\lambda _{\mathcal {S}} = \lambda _{R} = 10^3$,
$R_\textrm {min} = \mathcal {R} = 0.2$ m,
$w_R = w_{\mathcal {S}} = 10^{-2}$,
$\lambda _{\kappa } = 1$,
$\kappa _{\textrm {max},1} = \kappa _{\textrm {max},2} = 10\ \textrm {m}^{-1}$ and
$w_k = 1$. Here the volume is computed using a surface integral upon application of the divergence theorem,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn22.png?pub-status=live)
to avoid discretizing the volume.
We begin with a stellarator whose boundary is defined by a rotating ellipse,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn23.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn24.png?pub-status=live)
where $a$ is the semi-major axis and
$b$ is the semi-minor axis. For this example, we take
$R_0 = 5$,
$a =2$,
$b= 1$ and
$N_P=3$. We optimize
$f(S_P)$ with respect to modes
$m \le 3$ and
$|n| \le 3$ using the BFGS algorithm provided by SciPy. (This optimization algorithm is used for all demonstrations in this work.)
We compare the surfaces obtained with and without ($\lambda _{\mathcal {S}} = \lambda _{R} = 0$) the constraint terms, shown in figure 2. Without the constraint, the boundary begins to self-intersect. Although the volume is not well-defined for this surface, the discretized surface integral given by (3.13) is reduced from its initial value of
$197.39\text{ to }4.18\ \textrm {m}^{3}$ by minimizing the projection of the normal vector in the
$\hat {\boldsymbol {Z}}$ direction. With the constraint, the surface is reduced to an axisymmetric torus with a nearly circular cross-section centred at a major radius
$R_0 = 0.43\ \textrm {m}$ with an averaged minor radius of
$0.24\ \textrm {m}$ and a volume of
$0.50\ \textrm {m}^3$.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_fig2.png?pub-status=live)
Figure 2. Cross-sections of the initial (blue) and optimized surfaces obtained by minimizing the objective function given by (3.12). The surface obtained without the constraints ($\lambda _R = \lambda _{\mathcal {S}} = 0$) becomes self-intersecting (red), and the surfaced obtained with the constraints remains regular (green).
In figure 3 we display the value of the global radius of curvature (3.6) for the initial and final surfaces. On the initial surface, $\rho _G$ is minimized at the endpoints of the semi-major axis where the global radius of curvature matches the local radius of curvature. The maximum value of
$\rho _G$ is obtained in the region near the endpoint of the ellipse's semi-minor axis, where the point-tangent radius is minimized for
$\boldsymbol {x}_2$ on the opposite side of the semi-minor axis. The optimized boundary features decreased values of
$\rho _G$ on the outboard side, where
$\rho _G$ matches the local radius of curvature. We note that the minimum value of
$\mathcal {S}_{\mathcal {C}}$ on the optimized slightly violates the constraint at 0.15 m owing to the penalty formulation.
4. Optimization demonstrations
4.1. Target rotational transform
We now discuss a demonstration of optimization to obtain a target rotational transform profile. We consider a target function which quantifies the difference between the rotational transform and a desired profile, $\iota _{\text {target}}(\psi )$,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn25.png?pub-status=live)
in order to obtain a low shear stellarator. The gradient of this objective function is obtained by computing an equilibrium with a perturbed toroidal current profile,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn26.png?pub-status=live)
as described in Antonsen et al. (Reference Antonsen, Paul and Landreman2019). We present a benchmark problem in appendix A to demonstrate that the gradient-based optimizer can converge to a known minimum of an objective involving $f_{\iota }$ in a 2D space.
Although the VMEC code assumes nested surfaces such that the magnetic field is integrable, the solution to Laplace's equation with the prescribed boundary $S_P$ will not generally have continuously nested surfaces. We can, however, aim to obtain a vacuum field with a value of the rotational transform that improves the integrability of the ‘real’ vacuum field. (Note that although we can run VMEC with prescribed
$\iota (\psi )$ rather than
$I_T(\psi )$, it will not generally be a vacuum field.) We choose
$\iota _{\text {target}}$ to be the ‘most irrational’ noble between
$p/q = 0/1$ and
$p'/q'=1/2$ (Greene, MacKay & Stark Reference Greene, MacKay and Stark1986),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn27.png?pub-status=live)
where $\gamma = (\sqrt {5} + 1)/2$ is the golden mean. This is a noble irrational, indicating that its path in the Farey tree is eventually alternating. At a given level in the Farey tree, the path that changes direction is said to be ‘most irrational’ if a cantorus with this rotation number has the smallest flux of field line trajectories (Meiss Reference Meiss1992). Thus, the value of
$\iota _{\text {target}}$ is chosen to have the smallest flux for rotation numbers between
$p/q = 0/1$ and
$p'/q' = 1/2$. According to the Kolmogorov–Arnold–Moser (KAM) theorem, invariant circles with sufficiently irrational frequencies persist under small perturbations. Thus, choosing such a value for the rotational transform is likely to result in a large volume of magnetic surfaces. We remark that moderate shear may be desirable for certain stellarator design studies. Our adjoint formalism is quite flexible, and the choice of
$\iota _{\text {target}}$ could be modified accordingly.
We define our objective function to be
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn28.png?pub-status=live)
where $\lambda _{\mathcal {S}} = \lambda _{R} = 10^3$,
$R_\textrm {min} = \mathcal {R} = 0.2\ \textrm {m}$ and
$w_R = w_{\mathcal {S}} = 10^{-2}$. We begin with a surface given by a rotating ellipse with a non-planar axis,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn29.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn30.png?pub-status=live)
with $a = 2$,
$b = 1$,
$N_P = 3$,
$R_0 = 5$ and
$R_{0,1} = -0.5$. We consider a vacuum field with
$p(\psi ) = I_T(\psi ) = 0$.
To investigate the benefit of increasing the dimensionality of the optimization space, we optimize with respect to the boundary harmonics $m \le 3$,
$|n| \le 3$. We then use this result to optimize with respect to
$m \le 4$,
$|n| \le 4$ and then with respect to
$m \le 5$,
$|n| \le 5$. In comparison with the result of the initial optimization in the low-dimensional space, we are able to reduce the objective function by 70 % (figure 4). We obtain a rotational transform profile which very closely matches the target value with an objective value of
$f_{\iota } = 4.93 \times 10^{-10}$. If the optimization space is further increased, the optimum is reduced by less than 1 %. For the further analysis of the optimization in this section, we present results from the optimization with respect to the modes
$m \le 5$ and
$|n| \le 5$.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_fig4.png?pub-status=live)
Figure 4. (a) The $f_{\iota }$ objective value (4.1), (b) the L2 norm of the
$f(S_P)$ objective (4.4) gradient and (c) the initial and final profiles of the rotational transform. In (a,b) the optimization was performed in a staged approach: first with respect to modes
$m \le 3$ and
$|n| \le 3$, then with respect to
$m \le 4$ and
$|n| \le 4$ and, finally, with respect to
$m \le 5$ and
$|n| \le 5$.
We note that we were not able to reduce the gradient norm to the requested tolerance of $10^{-8}$. This result can be attributed to approximations made in computing the gradient with the adjoint method. As discussed in Antonsen et al. (Reference Antonsen, Paul and Landreman2019) and Paul et al. (Reference Paul, Antonsen, Landreman and Cooper2020), the adjoint equations are derived under the assumption that the VMEC code satisfies MHD force balance (1.1). However, there is always some residual error in force balance due to discretization error and the assumption of continuously nested flux surfaces. For these calculations, we converged to a force balance tolerance of
$10^{-11}$ with 99 flux surfaces and mode numbers
$|m| \le 11$,
$|n| \le 11$. These resolution parameters were chosen to strike a balance between an accurate adjoint solution and efficiency of the optimization. Preliminary calculations indicate that increasing the resolution parameters over what was used in this work does not cause a significant change in the optimum. Furthermore, the linear adjoint solution is approximated by adding a small perturbation to the nonlinear MHD force balance. This technique is effectively a forward-difference approximation of the adjoint equation, which introduces an error that scales with the magnitude of the perturbation. When there are small errors in the gradient, the computed gradient may no longer be a descent direction, making convergence difficult near the optimum (Dekeyser Reference Dekeyser2014).
Nonetheless, we effectively eliminate the magnetic shear throughout the volume. In figure 5 we display the initial and optimized boundaries (solid) along with an interior surface at $\psi /\psi _0 = 0.06$ (dashed) and the magnetic axis (star). We can consider some of the features of the optimized surface in consideration of the expression for the rotational transform near the magnetic axis (Mercier Reference Mercier1964; Helander Reference Helander2014), which indicates that rotating ellipticity and torsion of the axis contribute to the on-axis transform in a vacuum field. We see that the interior surfaces become slightly more elliptical in order to increase
$\iota$ near the axis, whereas the ellipticity of the boundary is slightly decreased, becoming more square. However, the torsion of the magnetic axis is maintained.
To analyse the effect of low shear on the integrability of the field, we compute the vacuum field using the SPEC code (Hudson et al. Reference Hudson, Dewar, Dennis, Hole, McGann, Von Nessi and Lazerson2012). The SPEC calculations are performed with a single volume with Beltrami parameter $\mu = 0$ such that the magnetic field satisfies
$\boldsymbol {\nabla } \times \boldsymbol {B} = 0$ with a Neumann boundary condition (1.1b). In figure 6 we show a Poincaré section for the initial field, which has a small island chain at the
$\iota = 3/8$ resonance. With the optimized boundary, we eliminate this island chain and obtain a large volume of nested surfaces.
4.2. Magnetic well
We next consider an objective function which aims to obtain a magnetic well,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn31.png?pub-status=live)
with $w_1(\psi ) = \exp \left ({-(\psi /\psi _0-1.0)^2/0.1^2}\right )$ and
$w_2(\psi ) = \exp \left ({-(\psi /\psi _0)^2/0.1^2}\right )$. When
$V''(\psi ) < 0$, a magnetic well is said to be present, which provides a stabilizing term in the Mercier criterion for interchange modes (Mercier & Luc Reference Mercier and Luc1974). We can consider
$f_w$ to be a normalized finite-difference approximation of
$V''(\psi )$; thus minimization of
$f_w$ is performed in order to achieve a magnetic well. A similar objective is employed in the ROSE code (Drevlak et al. Reference Drevlak, Beidler, Geiger, Helander and Turkin2018), computed from integration along a field line. Some experimental observations point toward a correlation between the existence of a magnetic hill and the onset of fluctuations (Castellano et al. Reference Castellano, Jiménez, Hidalgo, Pedrosa, Fraguas, Pastor, Herranz and Alejaldre2002). Nonetheless, we remark that the existence of a magnetic well does not always imply ideal MHD stability (Watanabe et al. Reference Watanabe, Sakakibara, Narushima, Funaba, Narihara, Tanaka, Yamaguchi, Toi, Ohdachi and Kaneko2005).
The gradient of this objective function is obtained by computing an equilibrium with a perturbed pressure profile,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn32.png?pub-status=live)
as described in Paul et al. (Reference Paul, Antonsen, Landreman and Cooper2020).
We define our objective function to be
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn33.png?pub-status=live)
with $\lambda _{\mathcal {S}} = 10^{-3}$,
$\lambda _{R} = 10^{2}$,
$w_{\mathcal {S}} = w_{R} = 10^{-2}$,
$\kappa _{\textrm {max}} \equiv \kappa _{\textrm {max},1} = \kappa _{\textrm {max},2} = 7\ \textrm {m}^{-1}$,
$w_{\kappa } = 1$,
$\lambda _{\bar {\kappa }}= \lambda _V = 1$ and
$\lambda _{\kappa } = 10^{2}$. Here
$V_{P}^{\text {init}}$ is the volume of the initial surface. To determine the importance of the maximum curvature regularization term, we optimize with four sets of parameters as described in table 1. The values
$\lambda _{\kappa } = 10^2$ and
$\kappa _{\textrm {max}} = 7\ \textrm {m}^{-1}$ were chosen to balance the curvature and well metrics.
Table 1. Parameters used for the optimization of the magnetic well and the resulting values of the optimum objective function (4.8).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_tab1.png?pub-status=live)
In addition to the regularization terms, we include a term in the objective function which penalizes a change in the volume, as an increase in the inverse aspect ratio can yield a Shafranov shift (§ 3.7 in Wesson & Campbell Reference Wesson and Campbell2011). This shift in the flux surfaces toward a smaller major radius implies that the volume of a flux surface increases less rapidly than its cross-sectional area. Furthermore, the flux through a surface increases more rapidly than its cross-sectional area because the geometric centre is moving into a region of increased field strength (assuming the field is mostly toroidal). Thus, the volume increases less rapidly than the flux, and a negative value of $V''(\psi )$ can be achieved (Taylor Reference Taylor1965).
We begin with a boundary given by a rotating ellipse (A 2) with $a = 2$,
$b = 1$,
$R_0 = 5$ and
$R_{0,1} = 0$. We optimize the boundary with respect to the modes
$m \le 10$ and
$|n| \le 10$. The equilibrium is computed with
$p(\psi ) = 0$ and
$I_T(\psi ) = 0$ such that a vacuum field is considered. We arrive at the boundary given in figure 7. With the addition of the volume constraint, the aspect ratio remains roughly constant (3.54 for the initial boundary and 3.46 for the optimized boundary), so that the well is provided by the shaping of the boundary rather than the Shafranov shift that arises due to the inverse aspect ratio. We note that the optimized surface features triangularity that rotates from outward-pointing with horizontal elongation to inward-pointing with vertical elongation.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_fig7.png?pub-status=live)
Figure 7. Cross-sections of the boundary (orange) optimized for the magnetic well (4.8) with regularization on the curvature $(\lambda _{\kappa } = 10^2, \lambda _{\bar {\kappa }} = 1)$. Several interior surfaces are shown, along with the location of the magnetic axis (blue star) and axis normal vector (black arrow). The optimized boundary features rotating triangularity and elongation.
To demonstrate the effect of the curvature regularization terms, we examine the optimization with $\lambda _{\kappa } = \lambda _{\bar {\kappa }} = 0$. We converge to the boundary shown in figure 8. As can be seen, regions of large curvature are exhibited, including a dimple-like feature on the inboard side and a concave ‘pinching’ feature. Again the aspect ratio remains roughly constant (3.46).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_fig8.png?pub-status=live)
Figure 8. Cross-sections of the boundary optimized for the magnetic well (4.8) without regularization on the principal curvature $(\lambda _{\kappa } = \lambda _{\bar {\kappa }} = 0)$. Several interior surfaces are shown, along with the location of the magnetic axis (blue star) and axis normal vector (black arrow). The optimized boundary features a ‘dimple’ feature at the
$\phi = 0$ plane and regions of increased concavity.
We can understand the geometric dependence of the magnetic well by considering the expression for $V''(\psi )$ on the axis that arises from the near-axis expansion in the inverse coordinate representation (Landreman & Jorge Reference Landreman and Jorge2020),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn34.png?pub-status=live)
where $\varphi _B$ is the Boozer toroidal angle and we have made the assumption of a vacuum field. The field strength near the axis is expanded as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn35.png?pub-status=live)
where $r \propto \sqrt {\psi }$ is the effective minor radius and
$\vartheta _B$ is the Boozer poloidal angle.
As can be seen from (A34) in Landreman & Sengupta (Reference Landreman and Sengupta2019), the poloidally independent shift in the flux surface in the normal direction, $X_{20}$, increases proportional to
$B_{20}$. Here, the normal vector is
$\hat {\boldsymbol {n}} = \boldsymbol {r}''(l)/|\boldsymbol {r}''(l)|$, where
$\boldsymbol {r}(l)$ is the position vector along the magnetic axis parameterized by the length. In the axisymmetric limit, a positive value of
$X_{20}$ indicates a net shift of the surfaces’ geometric centre toward a smaller major radius. This correlation between
$X_{20}$ and
$B_{20}$ is consistent with the statement that a Shafranov shift provides a magnetic well. In figures 7 and 8 we display the magnetic axis (blue star) and normal vector (black arrow) along with the shapes of several magnetic surfaces for the optimized configurations. While both the unconstrained and constrained optima feature a net shift in the geometric centre of the flux surfaces in the normal direction, this is achieved with vastly different shaping of the boundary. Interestingly, both configurations feature negative triangularity at the
$\phi = 0.5$
$(2{\rm \pi} /N_P)$ plane.
We note that $B_{20}$ arises owing to other shaping components of the surface (Landreman & Sengupta Reference Landreman and Sengupta2019). In axisymmetry, negative values of
$B_{2c}$ contribute to positive (outward-pointing) triangularity (
$X_{2c}<0$) ((B11) in Landreman Reference Landreman2021). Assuming stellarator symmetry, negative values of
$B_{2c}$ contribute to positive values of
$B_{20}$ if the surface is vertically elongated, and positive values of
$B_{2c}$ contribute to positive values of
$B_{20}$ if the surface is horizontally elongated. This implies that positive triangularity coupled with vertical elongation or negative triangularity coupled with horizontal elongation contributes to the magnetic well in axisymmetry. These trends are consistent with the stability analysis of oblate plasmas with negative triangularity (Pogutse & Yurchenko Reference Pogutse and Yurchenko1982; Kesner, Ramos & Gang Reference Kesner, Ramos and Gang1995; Medvedev et al. Reference Medvedev, Kikuchi, Villard, Takizuka, Diamond, Zushi, Nagasaki, Duan, Wu and Ivanov2015). In three dimensions, the connection between
$B_{20}$ and
$B_{2c}$ is more complicated (Landreman & Sengupta Reference Landreman and Sengupta2019, (A41) and (A42)). Nonetheless, we note that the magnetic well near the axis arises at second order in the expansion parameter, which includes the effects of ellipticity and triangularity. Thus, it is not surprising that the optimized boundary with curvature constraints features rotating ellipticity and triangularity.
In comparing the convergence of the optimization with and without the curvature constraints (figure 9), we see that the presence of the curvature objective prevents some failures of the VMEC code. Each function evaluation that resulted in a VMEC failure is visualized as a ‘spike’ that extends above $|f_w - f^{\text {opt}}_w| = 1$, as we artificially set the value of the objective function to
$10^{12}$ at these points. Although the constrained optimization still features some VMEC failures (12 versus 19), the inclusion of the curvature constraints improves the convergence toward the optimum, and a deeper magnetic well is achieved throughout the volume.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_fig9.png?pub-status=live)
Figure 9. Convergence of the optimization of the objective function (4.8) (a) with the inclusion of the curvature constraints ($\lambda _{\kappa } = 10^2, \lambda _{\bar {\kappa }} = 1$) and (b) without the curvature constraints (
$\lambda _{\kappa } = \lambda _{\bar {\kappa }} = 0$). (c) With the inclusion of the curvature constraint, we achieve a deeper magnetic well throughout the volume.
4.3. Quasi-symmetry
We next consider optimization for quasi-symmetry near the magnetic axis. This is quantified through the objective,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn36.png?pub-status=live)
where $\bar {B} = \int _{V_P} \,\textrm {d}^3 x \, w(\psi ) B$ and we take
$w(\psi ) = \exp \left ({-(\psi /\psi _0)^2/0.1^2}\right )$. This objective aims to make the field strength constant on the magnetic axis, which is a feature of quasi-axisymmetric and quasi-helically symmetric equilibria. A similar objective function has also been included in optimization for energetic particle confinement (Drevlak et al. Reference Drevlak, Geiger, Helander and Turkin2014). The gradient of this objective function is obtained by computing an equilibrium with the addition of an anisotropic pressure tensor,
$\boldsymbol {\nabla } \boldsymbol {\cdot } \boldsymbol {P} = p_{\|} \hat {\boldsymbol {b}}\hat {\boldsymbol {b}} + p_{\perp } (\boldsymbol {I} -\hat {\boldsymbol {b}}\hat {\boldsymbol {b}})$ with
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn37.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn38.png?pub-status=live)
as described in Paul et al. (Reference Paul, Antonsen, Landreman and Cooper2020).
We take our objective function to be
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn39.png?pub-status=live)
where $\iota _{\text {target}}(\psi )$ is taken to be the initial rotational transform profile. We also take
$\lambda _{\iota } = 1$,
$\lambda _{\bar {\kappa }} = 10^{-3}$,
$\lambda _{\kappa } = 5 \times 10^{-2}$,
$\kappa _{\textrm {max},1} = \kappa _{\textrm {max},2} = 10\ \textrm {m}^{-1}$,
$w_{\kappa } = 1$,
$w_{\mathcal {S}} = w_R = 10^{-1}$,
$\lambda _R = 10^3$,
$\lambda _{\mathcal {S}} = 10^{-1}$,
$\mathcal {R} = 0.5\ \textrm {m}$ and
$R_\textrm {min} = 0.2\ \textrm {m}$. The additional constraint on the rotational transform is required to prevent the surface from becoming axisymmetric to reduce the toroidal ripple on the axis.
We begin with a rotating ellipse boundary with torsion, given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn40.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn41.png?pub-status=live)
with $R_0 = 5$,
$R_{0,1} = R_{0,2} = 0.3$,
$a = 2$,
$b = 1$ and
$N_P = 3$. We perform optimization at
$\beta = 1.2\,\%$ with the profiles shown in figure 10. We optimize with respect to the modes
$m \le 10$ and
$|n| \le 10$ in a staged approach: first optimizing with respect to
$m \le 3$ and
$|n| \le 3$, then with respect to
$m \le 6$ and
$|n| \le 6$, then with respect to the full set of modes.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_fig10.png?pub-status=live)
Figure 10. (a) The pressure and (b) the integrated toroidal current profiles used for the optimization of quasi-symmetry on the axis (4.13).
We display the magnetic field strength on the initial and optimized boundaries in figure 11. We note the initial configuration features a large toroidal variation of the field strength, with increased field strength near the ‘corners’ that arise owing to the large torsion of the axis. Although the axis ripple figure of merit (4.11) reduces the toroidal variation of the field strength on the axis, which must vanish in both quasi-axisymmetry and quasi-helical symmetry, we find that the optimized magnetic field is driven closer to quasi-axisymmetry.
We remark that a design based solely on the objective (4.11) may be limited in that quasi-symmetry is not targeted away from the axis. We therefore quantify the departure from quasi-symmetry in the two configurations in two additional ways. The first is through the figure of merit,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn42.png?pub-status=live)
where $r_{\text {minor}}(\psi ) = \sqrt {A(\psi )/{\rm \pi} }$ is the averaged minor radius of a flux surface where
$A(\psi )$ is the toroidally averaged cross-sectional area. (This is the same definition of the minor radius used in the VMEC code.) The flux-surface average of a quantity
$Q$ is
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn43.png?pub-status=live)
where $\sqrt {g} = (\boldsymbol {\nabla } \psi \times \boldsymbol {\nabla } \theta \boldsymbol {\cdot } \boldsymbol {\nabla } \phi )^{-1}$ is the flux coordinate Jacobian. Equation (4.15) is a normalized figure of merit which employs the triple product form for quasi-symmetry (Helander Reference Helander2014; Rodriguez, Helander & Bhattacharjee Reference Rodriguez, Helander and Bhattacharjee2020),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn44.png?pub-status=live)
and allows us to quantify the quasi-symmetry error without specifying the helicity of the symmetry. We also perform a Boozer coordinate ($\psi$,
$\vartheta _B$,
$\varphi _B$) transformation (Sanchez et al. Reference Sanchez, Hirshman, Ware, Berry and Spong2000) to obtain the Fourier harmonics of the field strength,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn45.png?pub-status=live)
The deviation from quasi-axisymmetry can then be quantified as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn46.png?pub-status=live)
Although quasi-symmetry was only targeted on the axis, we reduce the quasi-symmetry error throughout the volume with respect to both metrics without introducing a reduction in the rotational transform (figure 12).
5. Conclusions
In this work, we have provided the first example of gradient-based, fixed-boundary optimization of stellarator equilibria. We provide examples of several equilibria obtained with the ALPOpt tool, including a vacuum field with ultra-low magnetic shear. Furthermore, we have identified regularization terms for fixed-boundary optimization that prevent self-intersection and reduce the surface curvature. These regularization terms improve the convergence toward the optimum and may reduce the required coil complexity. Another approach to reducing coil complexity would be to incorporate metrics related to the properties of the current potential solution on a uniformly offset winding surface (Carlton-Jones, Paul & Dorland Reference Carlton-Jones, Paul and Dorland2020).
The availability of derivative information enables the optimization in high-dimensional spaces. We present the optimization of the boundary with respect to the set of boundary harmonics $m \le 10$,
$|n| \le 10$ (441 modes). This dimensionality is significantly larger than that of previous optimized efforts. For example, ESTELL was optimized with respect to the modes
$m \le 4$,
$|n| \le 4$ (Drevlak et al. Reference Drevlak, Brochard, Helander, Kisslinger, Mikhailov, Nührenberg, Nührenberg and Turkin2013) and NCSX was optimized with respect to
$m \le 6$ and
$|n| \le 4$ (Zarnstorff et al. Reference Zarnstorff, Berry, Brooks, Fredrickson, Fu, Hirshman, Hudson, Ku, Lazarus and Mikkelsen2001). Such an increase in the optimization space may enable further refinement of an optimum, as demonstrated in § 4.1.
Gradient information of equilibrium quantities is obtained using an adjoint method (Antonsen et al. Reference Antonsen, Paul and Landreman2019; Paul et al. Reference Paul, Antonsen, Landreman and Cooper2020) which requires solving a modified equilibrium problem. For the objectives presented in this work, the rotational transform (§ 4.1), the magnetic well (§ 4.2) and near-axis quasi-symmetry (§ 4.3), the adjoint equilibrium problem requires the addition of a perturbation to the toroidal current profile, pressure profile and the addition of an anisotropic pressure tensor of a specific form. For other figures of merit, the adjoint equilibrium problem requires the addition of a bulk force of a different form, such as an anisotropic pressure that cannot be handled by the variational principle employed by the ANIMEC code (Cooper et al. Reference Cooper, Hirshman, Merazzi and Gruber1992). In this case, rather than approximating the linearized adjoint problem by adding a perturbation to the nonlinear equilibrium solution, a linearized equilibrium solution can be computed. We have demonstrated this technique for computing the shape gradient of the magnetic well for axisymmetric equilibria (Paul Reference Paul2020). In generalizing this approach to 3D equilibria, there are additional challenges that arise. As regular singular points occur at every surface where the rotational transform resonates with a mode included in the spectrum for the solution vector, additional care must be taken in regularizing the equations. To avoid these difficulties associated with 3D MHD equilibria, the adjoint equations for a vacuum or force-free equilibrium model could instead be considered.
We have presented several example configurations obtained with the set of figures of merit for which derivative information is available. Although similar objective functions can be minimized with derivative-free methods using codes such as STELLOPT and ROSE, the availability of derivative information is known to improve convergence toward the optimum, especially in high-dimensional spaces. A direct comparison between the adjoint approach and derivative-free methods is left for future work. Upon further advancement of adjoint methods, we hope that these numerical advances will enable the identification of equilibria of experimental relevance. To conclude, we anticipate many extensions of this work, such as applying the same adjoint principle for free-boundary optimization.
Acknowledgements
The authors would like to acknowledge C. Zhu for the development of a Python interface with the VMEC code and S. R. Hudson for enlightening discussions and help with the SPEC calculations.
Editor Per Helander thanks the referees for their advice in evaluating this article.
Funding
This work was supported by the US Department of Energy through grants DE-FG02-93ER-54197 and DE-FC02-08ER-54964. This work was also supported by a grant from the Simons Foundation (560651, ML). Some of the computations presented in this paper have used resources at the National Energy Research Scientific Computing Center (NERSC).
Declaration of interests
The authors report no conflict of interest.
Appendix A. Target rotational transform optimization
In this section, we perform a benchmark to demonstrate the convergence of the gradient-based optimization to a known minimum in a 2D space using the following objective function,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn47.png?pub-status=live)
defined with $\iota _{\text {target}} = 0.618034$ with the definition in (4.1). Although this target function is not physically relevant, as there remains residual error in the rotational transform profile, it allows us to easily identify a local optimum where
$f_{\iota } = 0.06$.
We begin with the boundary,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn48.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_eqn49.png?pub-status=live)
with $R_0 = 5$,
$R_{1,0} = 1.5$ ,
$R_{1,1} = 0.6$,
$Z_{1,0} = 1.5$ and
$Z_{1,1} = -0.6$. We consider a vacuum field with profiles
$p(\psi ) = I_T(\psi ) = 0$. We identify the minimum of
$\tilde {f}_{\iota }$ with respect to
$R_{1,1}$ and
$Z_{1,1}$ by performing a scan over the local space. We achieve convergence to this optimum in nine function evaluations (six BFGS iterations). In figure 13 we present the convergence of the objective function in the 2D optimization space, with the optimum denoted by the blue star. We are able to reduce the L2 gradient norm to
$5.13 \times 10^{-13}$.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210402171640829-0890:S0022377821000283:S0022377821000283_fig13.png?pub-status=live)
Figure 13. (a) Convergence of BFGS optimization beginning at the orange star toward optimum function value at $R_{1,1} = 0.46$ and
$Z_{1,1} = -0.47$ (blue star). Each function evaluation is denoted by a red dot. (b) Convergence of the L2 norm of the gradient of the objective function (A 1).