Hyperparameters optimization of neural network using improved particle swarm optimization for modeling of electromagnetic inverse problems

Debanjali Sarkar; Taimoor Khan; Fazal Ahmed Talukdar

doi:10.1017/S1759078721001690

Published online by Cambridge University Press: 17 December 2021

Debanjali Sarkar: Affiliation:
Department of Electronics and Communication Engineering, National Institute of Technology Silchar, Silchar, India
Taimoor Khan*: Affiliation:
Department of Electronics and Communication Engineering, National Institute of Technology Silchar, Silchar, India
Fazal Ahmed Talukdar: Affiliation:
Department of Electronics and Communication Engineering, National Institute of Technology Silchar, Silchar, India
*: Author for correspondence: Taimoor Khan, E-mail: ktaimoor@ieee.org

Abstract

Optimization of the hyperparameters of an artificial neural network (ANN) usually involves a trial-and-error approach, which is not only computationally expensive but also fails to find a near-optimal solution most of the time. To design a better-optimized ANN model, evolutionary algorithms are widely utilized to determine hyperparameters. This work proposes hyperparameter optimization of the ANN model using an improved particle swarm optimization (IPSO) algorithm. The ANN hyperparameters considered are the number of hidden layers, the number of neurons in each hidden layer, the activation function, and the training function. The proposed technique is validated using inverse modeling of two meander line electromagnetic bandgap (EBG) unit cells and a slotted ultra-wideband antenna loaded with EBG structures. Three other evolutionary algorithms, viz. hybrid PSO, conventional PSO, and genetic algorithm, are also adopted for the hyperparameter optimization of the ANN models for comparative analysis. The performance of all the models is evaluated using quantitative assessment parameters, viz. mean square error, mean absolute percentage deviation, and coefficient of determination (R²). The comparative investigation establishes the accurate and efficient prediction capability of the ANN models tuned using IPSO compared to other evolutionary algorithms.

Keywords

Artificial neural network (ANN); electromagnetic bandgap (EBG); genetic algorithm (GA); monopole antenna; particle swarm optimization (PSO); ultra-wideband (UWB)

Type: Antenna Design, Modelling and Measurements
Information: International Journal of Microwave and Wireless Technologies , Volume 14 , Issue 10 , December 2022 , pp. 1326 - 1337

DOI: https://doi.org/10.1017/S1759078721001690
Copyright: Copyright © The Author(s), 2021. Published by Cambridge University Press in association with the European Microwave Association

Introduction

Rapid evolution has occurred in the field of wireless communication over recent years. Ultra-wideband (UWB) wireless technologies, intended for short-distance communication, have gained popularity since the Federal Communications Commission (FCC) released 7.5 GHz of unlicensed band for commercial applications. According to the FCC rulings, a UWB device occupies a fractional bandwidth (FBW) greater than 20% and can commercially operate in the frequency range of 3.1–10.6 GHz. As UWB systems utilize a huge amount of bandwidth, they must share the spectrum with other wireless services and applications. To address spectrum access coordination, UWB regulatory organizations from various countries developed their own UWB radio spectrum regulations as listed in Table 1 [Reference Nikookar and Prasad1]. UWB antennas, being the key component of UWB systems, have wide impedance bandwidth, compact size, low cost, and a uniform omnidirectional radiation pattern, and provide a high-speed data rate. Several narrow frequency bands operate within the UWB frequency band, such as WiMAX (3.3–3.7 GHz), C-band (downlink 3.7–4.2 GHz, uplink 5.925–6.425 GHz), WLAN (5.15–5.35 GHz and 5.725–5.825 GHz), and X-band (downlink 7.25–7.75 GHz, uplink 7.9–8.4 GHz). The communicating devices operating in these narrow bands may interfere with the devices operating in UWB. To avoid this electromagnetic interference, UWB antennas are designed with band rejection characteristics by etching slots on the patch or ground plane, using parasitic components, or using tuning stubs. Electromagnetic band-gap (EBG) structures are being introduced to nullify the mutual coupling interference and independently regulate the notch band. EBG structures are periodic arrangements of metal conductors and dielectric material that suppress transmission over a specific bandwidth in a certain frequency band. EBG structures exhibit stopband characteristics that can direct the radiation of the antenna and prevent the scattering of surface waves [Reference Ghahremani, Ghobadi, Nourinia, Ellis, Alizadeh and Mohammadi2–Reference Dalal and Dhull8].
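To make the 20% fractional bandwidth threshold concrete, the short sketch below evaluates FBW for the full 3.1–10.6 GHz allocation and for the WiMAX band quoted above. The formula used (bandwidth divided by the centre frequency) is the commonly used definition and is an assumption here, since the text does not state it; the function name is illustrative.

```python
def fractional_bandwidth(f_low_ghz: float, f_high_ghz: float) -> float:
    """Bandwidth divided by centre frequency, returned as a fraction.

    Assumed definition: FBW = 2 (f_H - f_L) / (f_H + f_L).
    """
    centre = (f_high_ghz + f_low_ghz) / 2.0
    return (f_high_ghz - f_low_ghz) / centre


fbw_uwb = fractional_bandwidth(3.1, 10.6)    # full UWB allocation
fbw_wimax = fractional_bandwidth(3.3, 3.7)   # WiMAX band from the text

print(f"UWB   FBW: {100 * fbw_uwb:.1f}%")
print(f"WiMAX FBW: {100 * fbw_wimax:.1f}%")
```

Under this definition the full allocation lies far above the 20% threshold, while the WiMAX band lies below it, which is why the latter is treated as a narrowband interferer.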

Table 1. UWB Regulation standards of different countries

Several electromagnetic (EM) simulators such as IE3D, FEKO, HFSS, and CST Studio are available for the analysis and synthesis of electromagnetic devices. These simulators are based on the finite difference method (FDM) and finite element method (FEM) for solving differential equations, and on the method of moments (MoM) for solving integral equations of electromagnetic problems. However, the huge computational time and resources required by these EM simulators are their main constraints. To overcome these limitations, computational intelligence (CI) techniques have proved to be an alternative solution. CI techniques are being extensively used in the field of microwave engineering for modeling and optimization of microwave structures. These techniques are efficient for solving intensive nonlinear problems as they require limited computational resources. Artificial neural networks (ANNs) have been widely used in modeling complex EM structures because of their ability to learn from experience and their accurate predictions compared to numerical techniques [Reference Wang and Fang9–Reference Sarkar, Khan and Laskar14].

ANNs are characterized by hyperparameter values which are responsible for defining the network structure, regularization parameters, and learning rate. The performance of the NN model depends on the optimal selection of these user-specified hyperparameter values. Usually, the hyperparameters are selected using trial-and-error or by grid search. Grid search is a hyperparameter tuning method based on scanning each possible combination of parameters and evaluating the NN model accordingly [Reference Bergstra, Bardenet, Bengio and Kegl15]. However, these traditional approaches are time-consuming and not feasible for assessing a higher number of hyperparameters. Random search outperforms the traditional techniques in terms of efficiency as it tries to find a global optimum. Each hyperparameter is assigned a statistical distribution from which values are randomly sampled. The drawback of random search is that it has a higher rate of convergence and yields high variance during computation [Reference Bergstra and Bengio16]. Evolutionary algorithms have been widely adopted to find the optimum parameter values of the network model. In [Reference Assuncao, Lourenco, Machado and Ribeiro17, Reference Zhang, Wang, Liu, Du and Lu18], a genetic algorithm (GA) has been used to configure the ANN topology and tune the network weights and biases. The optimal regularization hyperparameters and activation functions of a multi-layer NN model are determined using GA in [Reference Itano, Sousa and Hernandez19, Reference Wang, Roger, Sarrazin and Perrault20]. GA has also been used to optimize the architecture of a convolutional neural network (CNN) in [Reference Johnson, Valderrama, Valle, Crawford, Soto and Nanculef21]. Particle swarm optimization (PSO) has also been adopted to determine the optimal hyperparameters for designing a CNN-LSTM network, a deep neural network (DNN), and a back-propagation network [Reference Kim and Cho22–Reference Sun, Li, Liu, Liu, Wang and Tan25].
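The difference between grid search and random search can be illustrated with a small sketch. The search space below is purely hypothetical (the ranges actually used in this work are those of Table 2), and the helper names are illustrative; the point is only that the grid grows multiplicatively with each added hyperparameter, while random search draws a fixed budget of configurations.

```python
import itertools
import random

# Hypothetical search space, chosen only for illustration.
search_space = {
    "hidden_layers": [1, 2, 3],
    "neurons_per_layer": list(range(5, 31, 5)),  # 5, 10, ..., 30
    "activation": ["tansig", "logsig"],
    "training": ["trainlm", "trainbr"],
}


def grid(space):
    """Yield every combination of hyperparameter values (grid search)."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def random_sample(space, n, seed=0):
    """Draw n configurations, sampling each hyperparameter independently."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in space.items()} for _ in range(n)]


grid_configs = list(grid(search_space))
random_configs = random_sample(search_space, n=10)
print(len(grid_configs), "grid configurations;", len(random_configs), "random draws")
```

Each configuration would still have to be trained and scored, so the cost of grid search scales with the full product of the value counts.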

In PSO, each particle's movement is dictated by its local best position in order to reach the global best position by computing its fitness. However, if a particle does not improve its fitness, the previous velocity vector has an effect on the next velocity vector. In order to overcome this, a mutation mechanism is applied to the velocities of the particles that are unable to locate a better position [Reference Zaharis, Gravas, Yioultsis, Laziridis, Glover, Skeberis and Xenos26]. The PSO algorithm based on velocity mutation has been utilized for optimizing the geometry of a reconfigurable antenna array and a complementary split-ring resonator [Reference Gravas, Zaharis, Lazaridis, Yioultsis, Kantartzis, Antonopoulos, Chochliouros and Xenos27, Reference Reddaf, Djerfaf, Ferroudji, Boudjerda, Hamdi-Chérif and Bouchachi28]. A fractal antenna and a log-periodic dipole array have also been optimized using this algorithm for obtaining low S ₁₁ and a high-gain radiation pattern, respectively [Reference Gravas, Sifakis, Zaharis, Lazaridis and Xenos29, Reference Zaharis, Gravas, Lazaridis, Glover, Antonopoulos and Xenos30]. The contribution of this work is the integration of the PSO algorithm based on velocity mutation with ANN, which has been conceptualized for the first time to the best of the authors' knowledge. In the presented work, ANN-based modeling is proposed for inverse modeling of two EBG unit cells and an EBG loaded antenna. ANN modeling for the two EBG unit cells is implemented to predict the geometrical parameters from the resonant frequency and its corresponding reflection coefficient. These two unit cells are incorporated into a slotted monopole antenna to realize penta notched band characteristics. The geometrical parameters of the proposed antenna are predicted from the UWB impedance bandwidth and multiband notch frequencies of the antenna. Hyperparameters of the NN models, such as the number of hidden layers, the number of hidden neurons in each layer, the activation function, and the training function, are tuned using the PSO algorithm based on the velocity mutation mechanism, termed in this work as improved PSO (IPSO).

Inverse modeling

Multilayer perceptron

ANN is a computational model for surrogate modeling aimed at reducing computational time and resources. It bypasses the requirement of lengthy and tedious analysis and mathematical calculations. An NN consists of artificial neurons in multiple layers and maps the input data to the target output. Neurons of adjacent layers are connected with each other, and the connections carry weights that excite or inhibit the input signals. One of the widely used ANN architectures is the multilayer perceptron (MLP), consisting of an arbitrary number of hidden layers between an input and an output layer. MLP is a feed-forward network and is mostly implemented for supervised learning problems. It learns the correlation between the input and target output using backpropagation. In the forward pass, the input data flows through the intermediate layers to the output layer. The error between the target and predicted output is calculated. The partial derivatives of the error function with respect to the weights and biases are backpropagated through the network. Weights and biases are adjusted in each iteration until the network reaches a state of convergence.

The input-output relationship of the generalized MLP shown in Fig. 1 can be expressed as,

(1)$$y_o = f^{l+1}\left( \sum_{p=1}^{q} w_{o,p}^{(l+1)} \left( f^{l}\left( \sum_{k=1}^{s} w_{p,k}^{(l)} \left( \cdots \left( f^{1}\left( \sum_{i=1}^{n} w_{j,i}^{(1)} x_i + b_j^{(1)} \right) \right) \cdots \right) + b_p^{(l)} \right) \right) + b_o^{(l+1)} \right)$$

where x_i is the input to the i^th neuron and y_o is the output of the o^th neuron of the MLP; f¹, f^l, and f^(l+1) represent the activation functions of the first hidden layer, the l^th hidden layer, and the output layer, respectively; $w_{j, i}^{( 1 ) }$ denotes the connection weights between the i^th input neuron and the j^th hidden neuron; $w_{p, k}^{( l ) }$ denotes the connection weights between the k^th and p^th hidden neurons; $w_{o, p}^{( {l + 1} ) }$ denotes the connection weights between the p^th hidden neuron and the o^th output neuron; $b_j^{( 1 ) }$, $b_p^{( l ) }$, and $b_o^{( {l + 1} ) }$ represent the biases of the first hidden layer, the l^th hidden layer, and the output layer, respectively.
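The nested structure of eq. (1) is simply a repeated application of "weighted sum plus bias, then activation", one layer at a time. The following is a minimal sketch of that forward pass in plain Python. The weights are hand-picked illustrative values, not trained ones, and the linear output activation `purelin` is an assumption; `tansig` and `logsig` are written out from their usual definitions (hyperbolic tangent and logistic sigmoid).

```python
import math


def tansig(z):
    """Hyperbolic tangent sigmoid."""
    return math.tanh(z)


def logsig(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def purelin(z):
    """Linear (identity) activation, assumed here for the output layer."""
    return z


def layer(inputs, weights, biases, f):
    """One layer: f(sum_i w[j][i] * x[i] + b[j]) for every neuron j."""
    return [f(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(x, layers):
    """Apply the layers in order; each entry is (weights, biases, activation)."""
    out = x
    for weights, biases, f in layers:
        out = layer(out, weights, biases, f)
    return out


# A tiny 2-2-1 network with illustrative (untrained) parameters.
net = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0], tansig),  # hidden layer
    ([[1.0, 2.0]], [0.5], purelin),                    # output layer
]
y = mlp_forward([1.0, 1.0], net)
print(y)
```

Adding hidden layers only lengthens the `net` list, which is why the number of layers and the neurons per layer can be treated as tunable hyperparameters.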

Fig. 1. Generalized architecture of MLP.

An MLP can be created with multiple hidden layers and a large number of neurons in each layer. However, an increase in the number of hidden layers and neurons may lead to a loss of generalization capability [Reference Karystinos and Pados31]. Conversely, too few hidden layers and neurons fail to map complex input-output relationships and lead to underfitting. To achieve faster convergence, proper selection of the activation function is necessary. Several training functions can be used to train the MLP, and the choice has a significant influence on its performance. Over the years, the trial-and-error approach has been widely used to find the hyperparameters of an MLP. In the proposed work, we explore the capabilities of IPSO to determine an appropriate MLP architecture (MLP-IPSO) with a better hyperparameter configuration.

Design of MLP-IPSO

PSO is a search-based optimization algorithm inspired by the social behavior of swarms. An initial population of particles with random positions and velocities is selected and moved around in a multidimensional search space. The movement of each particle is governed by its local best position to reach the global best position. The proximity of the particle to the global best is measured using a fitness function. The position and velocity of each particle are updated at each iteration until a global optimum solution is achieved. However, if a particle fails to achieve better fitness, then the previous velocity vector affects the next velocity vector. In order to avoid that, velocity mutation is introduced in this IPSO algorithm [Reference Zaharis, Gravas, Yioultsis, Laziridis, Glover, Skeberis and Xenos26].

Let the MLP configuration be represented as λ = (λ _N, λ _H) where λ _N denotes the network architecture and λ _H denotes the hyperparameter configuration respectively. The goal of the optimization is to tune λ _H = (λ _D, λ _C) ∈ K where K is the decision set of all hyperparameters. The hyperparameters to be optimized are the number of hidden layers, the number of neurons in each hidden layer, activation function, and training function. The discrete hyperparameters λ _D include the number of hidden layers and the number of neurons in each hidden layer. Activation function and training function are categorical or non-ordinal hyperparameters λ _C. Every λ _N is trained using corresponding λ _H in order to minimize err _MLP[(λ _N, λ _H)] which is calculated as,

(2)$$err_{MLP}[(\lambda_N, \lambda_H)] = \frac{\sum_{i=1}^{N} \left[ f_p(x_i) - f_t(x_i) \right]^2}{N}$$

where f _p(x _i) is the predicted model output, f _t(x _i) is the target output, and N is the number of samples.

For MLP-IPSO, an initial population of particles with random position and velocity is selected. Each particle p_i has its position x _i ∈ S and velocity v _i ∈ S, where S is the search space of all the hyperparameters. After population initialization, each particle p_i determines an MLP configuration, p_i = (net_IPSO, H_IPSO), where net_IPSO and H_IPSO denote the MLP architecture and hyperparameter configuration, respectively. The corresponding MLP model is then trained using the cross-validation technique [Reference Kwok and Yeung32]. The k-fold cross-validation technique is widely applied to divide the dataset into k random independent subsets. k-1 subsets are used for training purposes and the remaining subset is used for testing. Over the k runs, each subset is held out exactly once. The final performance of the model is then assessed by averaging the k recorded errors. The modeling performance of the MLP is assessed using the mean square error (MSE) given by eq. 2.
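The k-fold procedure described above can be sketched as follows. This is a generic illustration rather than the authors' implementation: the `fit` and `predict` callables are hypothetical placeholders for training and evaluating an MLP, and a trivial "predict the training mean" model stands in for the network so that the example runs on its own.

```python
import random


def mse(predicted, target):
    """Mean square error of eq. (2) for scalar outputs."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(xs, ys, k, fit, predict, seed=0):
    """Hold out each fold once, train on the rest, and average the k errors."""
    errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [predict(model, xs[i]) for i in held_out]
        errors.append(mse(preds, [ys[i] for i in held_out]))
    return sum(errors) / k


# Stand-in model (hypothetical): always predicts the mean training target.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


xs = list(range(10))
ys = [2.0 * x for x in xs]
cv_error = cross_validate(xs, ys, k=5, fit=fit_mean, predict=predict_mean)
print(cv_error)
```

In the tuning loop, a value such as `cv_error` would serve as the score of the MLP configuration encoded by one particle.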

The current fitness of p_i is compared with the previous best solution of the particle's position, pbest, which is updated if the current fitness is larger. Similarly, the global best solution of the swarm's position, gbest, is updated if the current fitness value is larger than the previous gbest. After updating pbest and gbest, the position and velocity of each particle are then modified using,

(3)$$\eqalign{v_i^{iter + 1} & = k\{ {v_i^{iter} + \varphi_1rand^{iter}( {\,pbest_i^{iter} -x_i^{iter} } ) }\cr & \quad+ \varphi_2rand^{iter}( {gbest_i^{iter} -x_i^{iter} } ) \}} $$

(4)$$x_i^{iter + 1} = x_i^{iter} + v_i^{iter + 1} $$

where x_i and v_i represent the n-dimensional position and velocity of the i^th particle, respectively; iter denotes the iteration index; rand represents random numbers uniformly distributed in [0,1]; φ₁ and φ₂ denote the cognitive and social coefficients, respectively; and k is the constriction coefficient calculated as,

(5)$$k = \displaystyle{2 \over {\left\vert {2-\varphi -\sqrt {\varphi^2-4\varphi } } \right\vert }}$$

where, φ = φ₁ + φ₂.
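Equations (3)–(5) translate almost directly into code. The sketch below computes the constriction coefficient and performs one velocity and position update. Two points are assumptions rather than statements from the text: the values φ₁ = φ₂ = 2.05 used in the example are a common choice in the PSO literature, and an independent random number is drawn for every dimension.

```python
import math
import random


def constriction(phi1, phi2):
    """Constriction coefficient k of eq. (5), with phi = phi1 + phi2."""
    phi = phi1 + phi2
    if phi < 4.0:
        raise ValueError("phi1 + phi2 must be at least 4 for a real square root")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


def pso_step(x, v, pbest, gbest, phi1, phi2, rng):
    """One update of eqs. (3) and (4) for a single particle."""
    k = constriction(phi1, phi2)
    new_v = [
        k * (vd
             + phi1 * rng.random() * (pb - xd)    # cognitive pull toward pbest
             + phi2 * rng.random() * (gb - xd))   # social pull toward gbest
        for xd, vd, pb, gb in zip(x, v, pbest, gbest)
    ]
    new_x = [xd + vd for xd, vd in zip(x, new_v)]
    return new_x, new_v


k_example = constriction(2.05, 2.05)
print(f"k = {k_example:.4f}")

rng = random.Random(0)
x, v = pso_step([0.0, 0.0], [0.1, -0.1], [1.0, 1.0], [2.0, 2.0], 2.05, 2.05, rng)
print(x, v)
```

With φ = 4.1 the coefficient evaluates to roughly 0.73, so every update shrinks the combined velocity term instead of letting it grow unchecked.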

At the end of iteration iter, if the i^th particle is not able to improve its fitness, then its velocity components $v_i^{iter}$ are mutated by a factor F_g, given as [Reference Zaharis, Gravas, Yioultsis, Laziridis, Glover, Skeberis and Xenos26],

(6)$$F_g = ( {0.6 + 0.1g} ) ( {2rand-1} ) $$

where g denotes the number of iterations for a particle with no fitness improvement. In this case, the velocity of the particle is updated using [Reference Zaharis, Gravas, Yioultsis, Laziridis, Glover, Skeberis and Xenos26],

(7)$$\eqalign{v_i^{iter + 1} & = k\{ {F_gv_i^{iter} + \varphi_1rand^{iter}( {\,pbest_i^{iter} -x_i^{iter} } ) }\cr& \quad+ \varphi_2rand^{iter}( {gbest_i^{iter} -x_i^{iter} } ) \}} $$
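The velocity mutation of eqs. (6) and (7) differs from the standard update only in the factor applied to the previous velocity. A minimal sketch is given below; drawing a fresh F_g for every velocity component is an assumption of this sketch, as is the value of k used in the example, and the function names are illustrative.

```python
import random


def mutation_factor(g, rng):
    """F_g of eq. (6); g counts iterations without fitness improvement."""
    return (0.6 + 0.1 * g) * (2.0 * rng.random() - 1.0)


def ipso_velocity(x, v, pbest, gbest, k, phi1, phi2, g, rng):
    """Eq. (3) when the particle improved (g == 0), eq. (7) otherwise."""
    new_v = []
    for xd, vd, pb, gb in zip(x, v, pbest, gbest):
        # A stagnating particle has its previous velocity rescaled and
        # possibly reversed, since F_g lies in [-(0.6 + 0.1 g), 0.6 + 0.1 g).
        carried = vd if g == 0 else mutation_factor(g, rng) * vd
        new_v.append(k * (carried
                          + phi1 * rng.random() * (pb - xd)
                          + phi2 * rng.random() * (gb - xd)))
    return new_v


rng = random.Random(1)
v_next = ipso_velocity([0.0, 0.0], [1.0, -2.0], [0.5, 0.5], [1.0, 1.0],
                       k=0.7298, phi1=2.05, phi2=2.05, g=3, rng=rng)
print(v_next)
```

Because the bound on F_g grows with g, a particle that has stagnated for longer receives a larger random perturbation of its velocity.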

The process is continued until the maximum number of iterations is reached. The search space of all the hyperparameters to be optimized by IPSO is listed in Table 2. The optimal solution obtained by the algorithm is taken as the tuned hyperparameters of the MLP. Figure 2 represents the flowchart of the proposed MLP-IPSO approach.
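Since PSO moves particles through a continuous space while the hyperparameters are discrete or categorical, each position has to be mapped to a valid MLP configuration before training. The text does not describe how this mapping is done, so the following is only one plausible sketch: the position layout, the bounds `MAX_LAYERS` and `NEURON_RANGE`, and the candidate lists are all hypothetical (the actual search space is the one in Table 2).

```python
# Hypothetical candidate sets and bounds, for illustration only.
ACTIVATIONS = ["tansig", "logsig"]
TRAINERS = ["trainlm", "trainbr"]
MAX_LAYERS = 3
NEURON_RANGE = (1, 30)


def clamp_round(value, low, high):
    """Round to the nearest integer and clip into [low, high]."""
    return max(low, min(high, int(round(value))))


def decode(position):
    """Map a continuous particle position to an MLP configuration.

    Assumed layout: [n_layers, n_1, ..., n_MAX_LAYERS, activation_idx, training_idx].
    """
    n_layers = clamp_round(position[0], 1, MAX_LAYERS)
    neurons = [clamp_round(p, *NEURON_RANGE) for p in position[1:1 + n_layers]]
    act_idx = clamp_round(position[1 + MAX_LAYERS], 0, len(ACTIVATIONS) - 1)
    trn_idx = clamp_round(position[2 + MAX_LAYERS], 0, len(TRAINERS) - 1)
    return {
        "hidden_neurons": neurons,          # one entry per hidden layer
        "activation": ACTIVATIONS[act_idx],
        "training": TRAINERS[trn_idx],
    }


config = decode([2.2, 12.7, 8.1, 25.0, 0.3, 0.9])
print(config)
```

Clipping keeps particles that drift outside the search space mapped to valid configurations; entries for unused hidden layers are simply ignored.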

Table 2. Search space for hyperparameter optimization

Fig. 2. Flowchart of MLP-IPSO.

Application examples

In this section, three inverse NN models based on the MLP-IPSO approach are proposed for two EBG unit cells and an EBG loaded antenna.

EBG unit cells

Two symmetrical defected spiral lines, and two L-shaped defected lines with two defected spiral lines, are employed to design EBG ₁. EBG ₂ is designed using four symmetrical spiral lines connected from the center. Both the EBG unit cells, shown in Fig. 3, are designed on FR4 dielectric substrate with a thickness of 1.6 mm using Ansys HFSS. The geometrical parameters of EBG ₁ (a ₁, b ₁, c ₁, d ₁, e ₁, a ₂, b ₂, c ₂, d ₂, e ₂, l) and height of the dielectric substrate (h) are varied to obtain the resonance frequency and its corresponding |S ₁₁|. Extensive parametric analysis has been performed on each parameter to obtain the training and testing datasets. Similarly for EBG ₂, different geometrical parameters of the spiral line (a, b ₁, b ₂, c, d, e, f, g), and height of the dielectric substrate (h) are varied using parametric analysis to obtain the datasets. Table 3 lists out the sampling procedure for generating the datasets. A total of 1220 and 1098 datasets are obtained for EBG ₁ and EBG ₂, respectively. The simulated performance of both EBG ₁ and EBG ₂ is shown in Fig. 4. EBG ₁ gives dual resonance at 5.7 and 11.1 GHz, respectively whereas EBG ₂ is resonating at 7.5 GHz. The Brillouin-zone based dispersion diagrams of EBG ₁ and EBG ₂ are shown in Fig. 5. For EBG ₁, two bandgaps are exhibited between 5.66–5.85 GHz and 10.8–11.5 GHz, and the bandgap for EBG ₂ falls between 7.21 and 7.79 GHz. A comparative study between the EBG structures and the existing literature in terms of types of EBG design, size, and bandgaps is conducted as listed in Table 4.

Fig. 3. EBG Unit Cells (a) EBG ₁, (b) EBG ₂.

Fig. 4. Simulated performance of EBG structures.

Fig. 5. Dispersion diagram of (a) EBG ₁, (b) EBG ₂.

Table 3. Sampling approach for dataset generation of EBG structures

Table 4. Comparison of EBG structures with existing literature

For modeling EBG ₁, and EBG ₂, two NN models, NN ₁, and NN ₂ are proposed. NN ₁ has been implemented for obtaining 12-dimensional response [G ₁] of EBG ₁ for the 4-dimensional input [I ₁] where [G ₁] = [a ₁, b ₁, c ₁, d ₁, e ₁, a ₂, b ₂, c ₂, d ₂, e ₂, l, h] and [I ₁] = [f_r ₁, |S₁₁|₁, f_r ₂, |S₁₁|₂]. NN ₂ is proposed for EBG ₂, where 2-dimensional excitation [I ₂] is processed for getting 9-dimensional response [G ₂]. Here, [I ₂] = [f_r, |S₁₁|] and [G ₂] = [a, b ₁, b ₂, c, d, e, f, g, h].

EBG loaded UWB antenna

A conventional UWB antenna with a rectangular radiating surface and a partial ground plane is taken as the reference antenna. The simulated performance of the antenna is shown in Fig. 6, and it can be observed that the antenna achieves an impedance bandwidth from 2.9 to 10.5 GHz. For obtaining triple-band notch characteristics, three modified U-shaped slots, as shown in Fig. 7, are incorporated onto the radiating surface. The effect of each slot on the notch frequencies is validated by individually simulating each modified U-shaped slot. The slot positions are optimized in such a way that they reject three interfering bands intended for ISM, radar surveillance, and WiMAX applications.

Fig. 6. Simulated performance.

Fig. 7. Proposed EBG loaded antenna design.

To obtain the datasets of the reference UWB antenna, a parametric analysis is performed by varying four geometrical variables, L_P, W_P, W_f, and L_g. Slot ₁ is introduced on the radiator and parametric analysis is performed by varying L ₁ and W ₁. Slot ₂ is then incorporated on the radiating surface keeping Slot ₁ at its optimized position. Parametric analysis is done on Slot ₂ by varying L ₂, and W ₂. Finally, Slot ₃ is placed keeping Slot ₁ and Slot ₂ at their optimized place and parametric analysis is performed on L ₃, and W ₃. For validating the effectiveness of EBG structures in rejecting frequency bands, the two EBG unit cells are incorporated onto the slotted antenna. As shown in Fig. 7, EBG ₁ is etched from the radiator at a distance of d ₁ and d ₂ from Slot ₁. The three modified U-shaped slots are kept at their optimized position and d ₁ and d ₂ are varied for obtaining the datasets. Similarly, EBG ₂ is introduced at a distance of p ₁ from the feed line as depicted in Fig. 7. Parametric variation of p ₁ and p ₂ is performed by keeping other geometrical variables at their optimized values.

Figure 7 is the final proposed antenna geometry, with two meander line EBG cells, EBG ₁ and EBG ₂, which produces penta notch-band characteristics. The sampling method to obtain the datasets is listed in Table 5. Figure 6 shows that the antenna satisfies the bandwidth requirement of UWB applications from f_c ₁ = 2 GHz to f_c ₂ = 10.74 GHz and the notches are achieved at 2.33, 2.83, 3.35, 3.87, and 5.87 GHz. The prototype of the optimized antenna geometry is fabricated using FR4 substrate sheet of 1.6 mm thickness for validating its performance. The fabricated prototype and the far-field measurement setup are shown in Fig. 8. The measured |S ₁₁| result of the fabricated prototype obtained using a vector network analyzer (VNA) is compared with the simulated results in Fig. 9. A good agreement between the simulated and measured performance characteristics is achieved. The slight deviations might be due to fabrication tolerances, soldering effect, or human error during the process of fabrication and/or measurement. The simulated and measured normalized radiation patterns for the antenna in H-plane and E-plane are plotted in Fig. 10. The antenna achieves an omnidirectional radiation pattern in the H-plane, and a dipole-like radiation pattern in the E-plane at the working frequencies of 4.2, 6.9, and 9.5 GHz. Figure 11 depicts that the antenna achieved stable group delay and relatively flat gain. The sudden transitions validate the radiation prohibition at the notch bands. For validating the creation of notched bands by etching slots and loading EBG structures, the surface current distribution at five notch frequencies of the antenna is plotted in Fig. 12. From the figure, it is clear that the current distribution is strongly confined around the respective slots at 2.33, 2.83, and 3.35 GHz. Besides, at 3.87, and 5.87 GHz, the surface current remains concentrated near the EBG ₁ and EBG ₂, respectively.

Fig. 8. Fabricated prototype and the measurement setup in anechoic chamber.

Fig. 9. Simulated and measured |S ₁₁| comparison.

Fig. 10. Radiation pattern comparisons (a) H-Plane, (b) E-Plane.

Fig. 11. Simulated group delay and gain.

Fig. 12. Surface current distribution.

Table 5. Sampling approach for dataset generation of EBG loaded UWB antenna

A comparative study between the presented geometry and existing literature is conducted in terms of the performance characteristics, viz. impedance bandwidth, the number of notches obtained, rejected frequency, notch bands, and antenna size, as listed in Table 6. Although [Reference Rahman, Jahromi, Mirjavadi and Hamouda33] and [Reference Yadav, Abegaonkar, Koul, Tiwari and Bhatnagar3] have proposed more compact structures, they have achieved only single and dual notches, respectively. The suggested geometry is found to be more miniaturized than [Reference Ghahremani, Ghobadi, Nourinia, Ellis, Alizadeh and Mohammadi2, Reference Lee, Yang and Cho4, Reference Li, Hei, Feng and Shi34–Reference Iqbal, Smida, Mallat, Islam and Kim36], and has also obtained notch characteristics at five frequency bands.

Table 6. Comparison of optimized design with existing literature

For modeling the proposed antenna structure, NN ₃ has been presented to predict the 14-dimensional output [G ₃] from the 7-dimensional input [I ₃], where [G ₃] = [L _P, W _P, W _f, L _g, L ₁, W ₁, L ₂, W ₂, L ₃, W ₃, d ₁, d ₂, p ₁, p ₂] and [I ₃] = [f_c ₁, f_c ₂, f_n ₁, f_n ₂, f_n ₃, f_n ₄, f_n ₅].

Computed performance

To verify the effectiveness of the MLP-IPSO approach described in the Inverse modeling section, the performance of the three proposed models is evaluated in this section. The optimal hyperparameters obtained after the models are tuned using IPSO are listed in Table 7. It is observed that two hidden layers prove to be optimal for all the models. The optimal activation function is tansig for NN ₁ and NN ₃, and logsig for NN ₂. Trainbr proves to be the optimal training function for NN ₁ and NN ₂, whereas for NN ₃, trainlm is the optimal training function.

Table 7. Optimized hyperparameter values.

The convergence of MLP-IPSO is shown in Fig. 13. The MSE calculated using eq. 2, measured on the entire training dataset, is plotted against the number of iterations to obtain the convergence characteristics. The effectiveness of the algorithm for NN ₁, NN ₂, and NN ₃ is clearly depicted in Fig. 13, as the curve consistently converges toward the minimum. Tables 8 and 9 give a comparison between the simulated and computed performance of all the models. It is observed from Table 8 that an error percentage of less than 1.5% is achieved for all the output parameters of both EBG ₁ and EBG ₂. For the EBG loaded antenna, the error is below 1.1% for all the output parameters, as shown in Table 9.

Fig. 13. Convergence curve.

Table 8. Performance comparison of EBG ₁ and EBG ₂

Table 9. Performance comparison of EBG loaded UWB antenna

For a comparative analysis, the three MLP models are also tuned using other evolutionary algorithms, viz. hybrid PSO (HPSO) [Reference Jin and Samii37], conventional PSO [Reference Sengupta, Basak and Peters38], and GA [Reference Mirjalili, Song Dong, Sadiq, Faris, Mirjalili, Song Dong and Lewis39]. The parameters of the algorithms mentioned in Table 10 are selected based on an initial parametric analysis performed with each algorithm individually. The performance of all the models has been analyzed in terms of MSE, mean absolute percentage deviation (MAPD), and coefficient of determination (R ²). The training MSE of three MLP models tuned using IPSO, HPSO, PSO, and GA are depicted in Fig. 14. Training MSE of NN ₁ and NN ₃ is the largest when tuned using PSO followed by GA, HPSO, and IPSO. NN ₂ when tuned using HPSO and PSO has achieved almost equal training MSE. It is observed from the figure, for all the MLP models, IPSO has achieved the least MSE compared to other algorithms.

Fig. 14. MSE performance.

Table 10. Settings of the evolutionary algorithms

The accuracy of the models is also evaluated using MAPD and R ²; mathematical formulae for which are given as,

(8)$$MAPD = \frac{100\%}{N} \sum_{i=1}^{N} \left\vert \frac{f_p(x_i) - f_t(x_i)}{f_t(x_i)} \right\vert$$

(9)$$R^2 = \left( \frac{N \sum_{i=1}^{N} f_p(x_i) f_t(x_i) - \sum_{i=1}^{N} f_p(x_i) \sum_{i=1}^{N} f_t(x_i)}{\sqrt{\left[ N \sum_{i=1}^{N} \left( f_p(x_i) \right)^2 - \left( \sum_{i=1}^{N} f_p(x_i) \right)^2 \right] \left[ N \sum_{i=1}^{N} \left( f_t(x_i) \right)^2 - \left( \sum_{i=1}^{N} f_t(x_i) \right)^2 \right]}} \right)^2$$

MAPD measures prediction accuracy as the relative deviation from the true values, whereas R ² assesses the ability of the model to adequately fit the data. As shown in Table 11, the training and testing MAPD of the MLP-IPSO models are lower than those of the MLP-HPSO, MLP-PSO, and MLP-GA models. MAPD of NN ₁, and NN ₂ is the worst when the hyperparameters are tuned using PSO, whereas, in the case of NN ₁, training MAPD is the least when IPSO is used for tuning the hyperparameters, followed by HPSO, PSO, and GA. R ² of all the models, listed in Table 11, indicates a strong correlation between the target and predicted output. Although R ² of all the models is greater than 0.97, training and testing R ² are better for MLP-IPSO models as compared to MLP-HPSO, MLP-PSO, and MLP-GA models.
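Both metrics are straightforward to compute from paired predicted and target values. The sketch below follows eqs. (8) and (9), taking the absolute value of each relative deviation as the name of the metric implies; the sample values are made up for illustration and are not results from this work.

```python
import math


def mapd(predicted, target):
    """Mean absolute percentage deviation (eq. 8), in percent."""
    n = len(target)
    return 100.0 / n * sum(abs((p - t) / t) for p, t in zip(predicted, target))


def r_squared(predicted, target):
    """Coefficient of determination as the squared correlation (eq. 9)."""
    n = len(target)
    sp, st = sum(predicted), sum(target)
    spt = sum(p * t for p, t in zip(predicted, target))
    spp = sum(p * p for p in predicted)
    stt = sum(t * t for t in target)
    numerator = n * spt - sp * st
    denominator = math.sqrt((n * spp - sp * sp) * (n * stt - st * st))
    return (numerator / denominator) ** 2


# Illustrative values only.
target = [1.0, 2.0, 4.0]
predicted = [1.1, 1.9, 4.2]
print(f"MAPD = {mapd(predicted, target):.2f}%")
print(f"R^2  = {r_squared(predicted, target):.4f}")
```

Note that MAPD is undefined when a target value is zero, and that R² in this form equals 1 for any exact linear relation between predictions and targets, so it is best read alongside MSE and MAPD.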

Table 11. Performance metrics comparison

Conclusion

This paper presented an IPSO algorithm to optimize the hyperparameters of an ANN-based model for solving inverse electromagnetics problems. The proposed algorithm is validated with two EBG unit cells and an EBG loaded slotted monopole antenna. The geometrical parameters of the two EBG unit cells are predicted using the resonant frequency and corresponding |S ₁₁| as the input to the NN models. To predict the geometrical parameters of the EBG loaded slotted antenna, on the other hand, the cut-off frequencies of the UWB frequency band and the notch frequencies are considered as the input for the NN model. Hyperparameters of the different NN model configurations investigated in this work have also been optimized using HPSO, PSO, and GA. The performance of all the models is compared on the basis of MSE, MAPD, and R ². MLP-IPSO models have outperformed the models tuned using the other evolutionary algorithms in terms of these statistical metrics. The computed results of MLP-IPSO models are observed to be in close agreement with the simulated results. Though ANN has been widely used to predict the performance of microwave components for a long time, hyperparameter optimization of ANN for such a problem has not been addressed to the best of the authors' knowledge.

Acknowledgement

This work was supported by the Science and Engineering Research Board (SERB), Department of Science and Technology (DST), Govt. of India (GoI) under research grant No. SB/S3/EECE/093/2016.

Debanjali Sarkar is a research scholar in the Department of Electronics and Communication Engineering at NIT Silchar. She received her B.E. degree in electronics and communication engineering and her M.Tech. degree with a specialization in digital electronics and communication systems from Visvesvaraya Technological University. Her current research interests include the design and modeling of printed ultra-wideband components using computational intelligence-based algorithms.

Taimoor Khan has been an assistant professor in the Department of Electronics and Communication Engineering, NIT Silchar, since 2014. Prior to that, he worked for more than 15 years in several organizations. His active research interests include printed microwave components, ultra-wideband technology, microwave energy harvesting, and computational intelligence paradigms in electromagnetics. He has published over eighty-six research articles in international SCI/SCIE journals and conference proceedings of repute. He has successfully executed one major and two minor projects funded by SERB, AICTE, and MHRD, and is currently executing two international collaborative SPARC and VAJRA research projects with QU Canada and CSUN USA. He has also been awarded the prestigious IETE-Prof SVC Aiya Memorial Award for his outstanding contribution for the year 2020.

Fazal Ahmed Talukdar received the B.E. degree in electrical engineering from the Regional Engineering College, Silchar, India, in 1987, the M.Tech. degree in energy studies from the IIT Delhi, New Delhi, India, in 1993, and the Ph.D. degree in power electronics from Jadavpur University, Kolkata, India, in 2003. He has been with the NIT Silchar, since 1991, where he is currently a professor with the Department of Electronics and Communication Engineering. His current research interests include power electronics, signal processing, and analog circuits. Dr. Talukdar is a member of the Project Review Steering Group, Ministry of Electronics and Information Technology. He is a fellow of the Institution of Engineers (India) and a Life Member of the Indian Society for Technical Education.
