1. Introduction
Nature offers some of the best displays of group coordination: flocks of birds, schools of fish and herds of animals. The collective motion of a group of animals follows no planned script but emerges from the instantaneous decisions and responses of individual members [Reference Lewis, Zhang, Hengster-Movric and Das1]. In artificial settings, these members can be represented as intelligent mobile agents. Research interest in mobile agents [Reference Cao, Yu, Ren and Chen2, Reference Haghighi and Cheah3, Reference Rubenstein, Cornejo and Nagpal4, Reference Ota5, Reference Tan and Zheng6] is growing rapidly, as collective groups of agents are able to perform tasks that individual agents cannot. Recent research in this area has focused on the motion and control planning of cooperative mobile ground robots and aerial systems such as quadrotors [Reference Raj, Raghuwaiya, Sharma and Vanualailai7, Reference Raj, Raghuwaiya and Vanualailai8, Reference Pradhan, Nandi, Hui, Roy and Rodrigues9, Reference Chung, Paranjape, Dames, Shen and Kumar10], which may provide services such as coverage, search and rescue, delivery, tracking and inspection [Reference Xie and Liu11, Reference Trotta, Felice, Montori, Chowdhury and Bononi12, Reference Rizk, Awad and Tunstel13, Reference Ali and Mailah14, Reference Queralta, Taipalmaa, Pullinen, Sarker, Gia, Tenhunen, Gabbouj, Raitoharju and Westerlund15].
Coverage control [Reference Cortes, Martinez, Karatas and Bullo16, Reference Ali and Mailah17, Reference Mohseni, Doustmohammadi and Menhaj18] is a benchmark problem and is the topic of interest in this paper. In general, the coverage control task aims to optimally deploy a group of mobile agents over an area of interest or environment. Control laws distribute agents to different parts of the coverage environment, usually partitioned into Voronoi cells. A well-known method for coverage problems is the gradient descent method [Reference Cortes, Martinez, Karatas and Bullo16], which proposes continuous and discrete-time algorithms that converge to the centroids of the Voronoi partition of the search domain. Similar approaches using Voronoi partitions for coverage control have been applied successfully [Reference Schwager, Rus and Slotine19, Reference Kwok and Martìnez20, Reference Schwager, Rus and Slotine21]. Other methods such as feedback control [Reference Hussein and Stipanovic22], receding horizon control [Reference Mohseni, Doustmohammadi and Menhaj18] and game theory method [Reference Ruan, Wang, Chen, Xu, Yang, Jiang, Zhang and Xu23] have also been used.
In all the above-mentioned methods, the global task is visible and known to all agents, and individual agents are aware of all other agents' locations, either partially or fully. The drawback of these methods is that collecting and storing all this information requires a high communication data volume. Moreover, with the global task and its progress known to individual agents, an attack on any one agent can compromise the mission's security. The methods above model the mobile agents' cooperative behaviour using either deterministic control techniques [Reference Arslan, Guralnik and Koditschek24, Reference Dimarogonas, Frazzoli and Johansson25] or stochastic techniques [Reference Bandyopadhyay, Chung and Hadaegh26, Reference Li, Feng, Ehrhard, Shen, Cobos, Zhang, Elamvazhuthi, Berman, Haberland and Bertozzi27, Reference Elamvazhuthi and Berman28]. In deterministic techniques, the output is predictable and fully determined by the system input and initial conditions. In stochastic techniques, the outputs vary with the inherent randomness in the system, which may stem from the system dynamics, switching network topologies, or communication issues such as time delays. Stochastic techniques are a better representation of the natural world but are considerably more complicated.
The broadcast control (BC) scheme [Reference Azuma, Yoshimura and Sugie29] and its variant, pseudo-perturbation-based broadcast control (PBC) [Reference Ito, Kamal, Yoshimura and Azuma30], are recently proposed stochastic optimisation methods that have been applied to multi-agent coordination tasks. The BC scheme solves multi-agent problems using a one-to-all communication protocol that replaces standard agent-to-agent communication. The goal is to complete the mission by observing group performance and sending the same signal to all agents indiscriminately. The agents require no special hardware, nor do they expend energy on information transmission. Although a BC/PBC system uses minimal hardware and a reduced communication volume, the broadcast scheme is susceptible to broadcast-specific challenges: because every broadcast signal reaches all agents, a signal intended to correct one agent also perturbs the others and can result in undesirable configurations. The PBC scheme converges twice as fast as the BC scheme and displays improved control performance. Both schemes overcome the drawbacks of the earlier-mentioned methods, as they involve no communication among agents; furthermore, the global mission is unknown to the individual agents.
In this paper, we introduce a novel variant of the BC scheme termed multi-step broadcast control (MBC) for coverage control of mobile agents with non-uniform density distribution in the study environment. Unlike the previous BC schemes studied for uniform coverage control, this paper considers non-uniformity in the coverage area and compares the performance of our proposed scheme against the previous BC schemes.
The proposed MBC scheme is inspired by model predictive control (MPC) theory, where the control strategy is based on a prediction horizon into the future. To the best of our knowledge, this is the first work that employs multi-step broadcast control of multi-agent systems in a detailed non-uniform coverage task study. The proposed MBC scheme takes multiple predictive steps along a horizon to predict future variables several steps ahead in an environment. The immediate step is given the highest weight, and the weight decreases as the number of steps increases. More agents are deployed to dense sections (higher density) than to sparsely populated sections (lower density) to provide equitable services over an environment with uneven density distribution. Using multi-step forward views, dense sections become known to MBC agents earlier than under the existing BC schemes, which are limited to a one-step forward view (PBC uses multiple one-step views in different directions). Thus, the proposed MBC scheme increases the stochastic gradient accuracy much faster and outperforms the existing BC schemes in both coverage task achievement and deployment efficiency.
This paper is organised as follows. In Section 2, the Voronoi-based coverage control problem is described and formulated. We review the existing BC schemes and our proposed MBC scheme in Section 3. Section 4 presents the numerical example to demonstrate the effectiveness of the MBC scheme in coverage tasks with an uneven density distribution. Lastly, we present our conclusions and directions for future research in Section 5.
2. Coverage problem formulation
Consider an area with sections of varying density, some densely populated and some sparsely populated. To fairly and equitably serve the population within the bounded area, multiple mobile agents need to be deployed optimally, depending on the number of available agents and the population density pattern over the area (Fig. 1). A portion of the area is allocated to each agent so that the entire area is fully covered by all the agents together. Such a deployment of multiple agents is known as the coverage task. Traditionally, coverage problems involve deploying a set of agents to provide an equal coverage distribution over the bounded area, with agents distributed at equal distances from each other. In this paper, however, the final distribution of agents varies according to population density; that is, not all agents are placed at equal distances from each other. For example, more agents need to be deployed to dense sections than to sparsely populated sections.
The objective function for the coverage task is [Reference Cortes, Martinez, Karatas and Bullo16]
where f is the coverage performance function. The term q represents the uniformly distributed points in the environment Q, and $\phi$ is a weighting function that determines the relative importance of points in Q. Also, $x_i = [x_1,x_2,\ldots, x_n]^T \in \mathbb{R}^n$ is the position of the i-th agent in the domain, and N is the total number of agents. The aim is to minimise the objective function J(x(t)), which is indicative of the global coverage performance of the multi-agent system, given as:
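For orientation, equation (1) follows the standard locational-optimisation objective of Cortes et al. [Reference Cortes, Martinez, Karatas and Bullo16], which may be sketched as follows (the paper's exact notation may differ slightly):

```latex
J(x) \;=\; \sum_{i=1}^{N} \int_{V_i} f\!\left(\lVert q - x_i \rVert\right)\,\phi(q)\,\mathrm{d}q ,
\qquad f(r) = r^{2} \ \text{(a common choice)} ,
```

where $V_i$ is the Voronoi cell of agent i, so each point q is served by its nearest agent.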
The minimum of J is achieved when the N agents are placed optimally in the space. The Voronoi tessellation method [Reference Fortune31], which partitions the entire area into sub-areas, is used to compute and visualise the coverage achievement. Generated by the points $(p_1,\ldots, p_n)$, the optimal partition of Q follows the Voronoi partition $V(P)=\{V_1,\ldots, V_n\}$, given by:
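To illustrate how the objective and the Voronoi partition interact, the following minimal Python sketch (not the authors' code) evaluates a discretised version of the coverage objective: each grid point is assigned to its nearest agent, which is exactly a discrete Voronoi partition, and contributes its weighted squared distance to J. The grid, density $\phi$, and the choice $f(r)=r^2$ are illustrative assumptions.

```python
import numpy as np

def coverage_objective(agents, grid, phi):
    """Discretised coverage objective.

    Each grid point q is assigned to its nearest agent (a discrete
    Voronoi partition) and contributes phi(q) * ||q - x_i||^2 to J.
    agents: (N, 2) agent positions; grid: (M, 2) sample points;
    phi: (M,) density weights.
    """
    d2 = ((grid[:, None, :] - agents[None, :, :]) ** 2).sum(axis=2)  # (M, N)
    nearest = d2.min(axis=1)  # squared distance to the closest agent
    return float((phi * nearest).sum())

# Usage: uniform density on a 10x10 grid; spread agents beat colocated ones.
xs = np.linspace(0.0, 9.0, 10)
qx, qy = np.meshgrid(xs, xs)
grid = np.stack([qx.ravel(), qy.ravel()], axis=1)
phi = np.ones(len(grid))
J_spread = coverage_objective(np.array([[2.0, 4.5], [7.0, 4.5]]), grid, phi)
J_stacked = coverage_objective(np.array([[4.5, 4.5], [4.5, 4.5]]), grid, phi)
```

Spreading the agents strictly lowers J because every grid point is served by its nearest agent rather than a single shared location.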
3. Proposed scheme: Multi-step Broadcast Control (MBC)
3.1. Review of existing BC schemes
In this section, the BC [Reference Azuma, Yoshimura and Sugie29] and PBC [Reference Ito, Kamal, Yoshimura and Azuma30] schemes are reviewed. The BC multi-agent scheme consists of a global controller, $G_C$ , with N agents, each denoted as $A_{i}, i=1,2,\ldots N$ , and the respective local controllers, $L_{i}$ . The connectivity between the agents and the controllers is shown in Fig. 2.
In this paper, a step is defined as the physical move at time t; the next physical move is at $t+1$, and so forth. In BC, at each consecutive step, agents alternate between taking a randomly generated move and a deterministic move (Fig. 3(a)). The global controller ${G_C}$ computes the collective performance of the whole system at each step by evaluating the objective function, J(x). Using the difference between the objective function values of the random and deterministic moves, ${G_C}$ calculates and broadcasts a scalar control signal, $\sigma_B$, to all agents. Using $\sigma_B$, the local controller $L_i$ determines the control action ${u_i}(t)$. The state equation of an agent is represented as follows:
where $x_i(t)\in \mathbb{R}^n$ is the position in the n-dimensional space and $u_i(t)\in \mathbb{R}^n$ is the control input.
The local controller $L_i$ of agent $A_i$ produces the control signal based on the information received from ${G_C}$ , as follows:
where $\zeta_i(t)\in \mathbb{R}^{n_{\zeta}}$ and $u_i(t)\in \mathbb{R}^n$ are the state and output of the local controller, $L_i$ , respectively. The initial state of $L_i$ is set to $\zeta_i(0)=0$ . $\sigma_B(t)\in \mathbb{R}$ is the broadcast signal provided by $G_C$ .
The controller functions $\alpha\,{:}\,\mathbb{R}^{n_{\zeta}} \times \mathbb{R}^{n_{\sigma_B}} \times \mathbb{N}\rightarrow \mathbb{R}^{n_{\zeta}}$ and $\beta\,{:}\, \mathbb{R}^{n_{\zeta}} \times \mathbb{R}^{n_{\sigma_B}} \times \mathbb{N}\rightarrow \mathbb{R}^n$ are defined as follows:
where $\zeta_i(t)\,{:\!=}\,[\zeta_{i1}(t)^T,\zeta_{i2}(t)]^T$, with $\zeta_{i1} \in \{-1, 1\}^n$ and $\zeta_{i2} \in \mathbb{R}$. $\zeta_{i1}^{[-1]}$ denotes the element-wise inverse of $\zeta_{i1}$, namely $\zeta_{i1}^{[-1]}\,{:\!=}\,[1/\zeta_{i1,1},\ldots, 1/\zeta_{i1,n}]^T$. Also, $\zeta_{i1}(t)=\Delta_i(t-1)$ and $\zeta_{i2}(t)=\sigma_B(t-1)$. The symbol $\Delta_{i}(t) = [ \Delta_{i,1}(t),\ldots,\Delta_{i,n}(t)]^T\in \{-1, 1\}^n$ denotes a vector of random variables, where each $\Delta_{i,j}(t)$ $(i \in \{1,\ldots,N\}, j \in \{1,\ldots,n\}, t \in \{0,1,\ldots\})$ follows a Bernoulli distribution with outcomes $\pm1$ of equal probability. Additionally, $a(t)\in \mathbb{R}_+$ and $c(t)\in \mathbb{R}_+$ are the time-varying gains of controller $L_i$.
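The Bernoulli perturbation vector $\Delta_i(t)$ and its element-wise inverse are straightforward to generate; the sketch below (illustrative only) also highlights a convenient property: because every entry is $\pm 1$, the vector equals its own element-wise inverse.

```python
import numpy as np

rng = np.random.default_rng(0)

def bernoulli_perturbation(n):
    """Draw Delta in {-1, +1}^n, each entry +/-1 with equal probability."""
    return rng.choice([-1.0, 1.0], size=n)

delta = bernoulli_perturbation(4)
delta_inv = 1.0 / delta  # element-wise inverse [1/Delta_1, ..., 1/Delta_n]
# Since each entry is +/-1, the inverse coincides with delta itself.
```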
The broadcast signal from $G_C$ is described as:
where
is the collective state of all agents in the system. As time t tends to infinity, the multi-agent system approaches the optimal J as follows:
where $J\,{:}\, \mathbb{R}^{nN} \rightarrow \mathbb{R}$. The optimisation process is based on the gradient of the objective function with respect to each individual action. However, the objective function is not known to the individual agents; this is an advantageous feature of the BC scheme, which accomplishes a task without revealing the task details or objective to the individual agents. In the BC system, agents take random and deterministic control actions alternately: at even time steps $t\in \{0,2,4,\ldots\}$ agents take random moves, and at odd time steps $t\in \{1,3,5,\ldots\}$ they take deterministic moves, as shown in Fig. 3(a).
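To make the alternating random/deterministic pattern concrete, here is a hypothetical single-agent, centralised sketch of one BC cycle on a known toy objective (in the actual scheme each $L_i$ sees only the scalar $\sigma_B$; this loop is a simplification for intuition, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(1)

def bc_cycle(x, J, a, c):
    """One BC cycle: an even-step physical random move, then an odd-step
    deterministic move driven only by the scalar broadcast sigma_B."""
    delta = rng.choice([-1.0, 1.0], size=x.shape)
    x_rand = x + c * delta                      # even step: random physical move
    sigma_B = J(x_rand) - J(x)                  # broadcast by G_C
    return x_rand - a * (sigma_B / c) / delta   # odd step: deterministic move

# Usage: drive a toy quadratic objective towards its minimum at the origin.
J = lambda x: float((x ** 2).sum())
x = np.array([5.0, 5.0])
for t in range(200):
    x = bc_cycle(x, J, a=0.05, c=0.5 / (t + 1) ** 0.3)
```

Each cycle costs one physical random move, which is exactly the overhead PBC later removes.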
In BC, the random movement of agents incurs a movement cost and considerable time for convergence. To address this, Ito et al. [Reference Ito, Kamal, Yoshimura and Azuma30] introduced the PBC scheme, in which each agent takes multiple virtual random moves (Fig. 3(b)) within a single physical step; K is the total number of virtual random moves taken by an agent within a step. An agent thereby omits the costly physical random move taken under the BC scheme, so PBC agents accomplish in a single step what BC agents require two steps to achieve.
The state-space equation of agent i in PBC is given as:
where $\hat{x}_{i}^{(k)}(t+1)$ and $\hat{u}_{i}^{(k)}(t)$ are the virtual predictive state and the virtual input of agent $A_i$ at each step. The global controller, $G_C$, calculates $\hat{u}_{i}^{(k)}(t)$ and $\hat{x}_{i}^{(k)}(t+1)$ using ( $x_i(t)$ , $\Delta_i^{(k)}(t))$ sent from agent $A_i$. Additionally, $\hat{x}^{(k)}$ and $\Delta^{(k)}(t)$ are given as $\hat{x}^{(k)}\,{:\!=}\,[\hat{x}_1^{(k)T},\ldots, \hat{x}_N^{(k)T}]^T\in \mathbb{R}^{nN}$ and $\Delta^{(k)}(t) \,{:\!=}\, [\Delta_1^{(k)}(t)^T,\ldots,\Delta_N^{(k)}(t)^T]^T.$
The $G_C$ in PBC calculates the objective function, J for all virtual steps taken as follows:
where ${\sigma_P}^{(k)}(t) \,{:\!=}\, J(\hat{x}^{(k)}(t + 1))-J(x(t))$ and $\sigma_P(t) = [{\sigma_{P}}^{(1)}(t),\ldots,{\sigma_{P}}^{(K)}(t)]^T$. In PBC, the local controller, $L_i$, determines its state, $\zeta_{i}(t)$, and the control output, $u_{i}(t)$, as:
where $\Delta_{i}^{(k)[-1]}$ denotes the element-wise inverse of vector $\Delta_{i}^{(k)}$. Compared to (7), note that the term $c(t)\Delta_i(t)$ (which represents the physical random action) is removed in PBC.
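Under the same toy simplifications as before (single centralised loop, known quadratic J; not the authors' implementation), the PBC update can be sketched as the arithmetic mean of K virtual one-step SPSA-like terms, with the physical random move removed:

```python
import numpy as np

rng = np.random.default_rng(2)

def pbc_update(x, J, a, c, K):
    """PBC-style update: K virtual one-step perturbations from x, no
    physical random move; the control is the arithmetic mean of the
    K SPSA-like terms."""
    g = np.zeros_like(x)
    for _ in range(K):
        delta = rng.choice([-1.0, 1.0], size=x.shape)
        sigma = J(x + c * delta) - J(x)  # virtual move, evaluated by G_C
        g += (sigma / c) / delta
    return x - (a / K) * g

# Usage on a toy quadratic: averaging K estimates reduces estimator noise.
J = lambda x: float((x ** 2).sum())
x = np.array([5.0, 5.0])
for t in range(200):
    x = pbc_update(x, J, a=0.05, c=0.5 / (t + 1) ** 0.3, K=5)
```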
In PBC, the global controller, $G_C$ , also receives the predicted random state $\zeta_{i}$ from the local controller, $L_i$ , whereas in BC, $L_i$ does not share its state with $G_C$ . Even with this extra cost of communication, PBC converges to the optimal point twice as fast as BC. To date, the PBC framework has been effectively applied for real-time coordination of vehicles at a merging road [Reference Ito, Kamal, Yoshimura and Azuma32].
In the PBC framework, each agent's virtual random action is limited to a single step ahead of its current location. This does not allow efficient anticipation of uncertain features of the task environment lying more than a single step ahead, such as a varying density distribution. To overcome this limitation, we propose a new scheme in the next section.
3.2. Main result: MBC scheme
This section presents the novel scheme proposed in this paper, the MBC scheme, which incorporates a predictive multi-step move along a horizon within a single physical step. The multiple steps taken are useful for predicting uncertainties several steps ahead, such as a dense section to move towards. Figure 4 depicts the concept of the action taken under the proposed scheme. The inspiration behind this concept comes from MPC theory: like MPC, the proposed multi-steps are driven along a horizon to anticipate and account for future events when optimising the current iteration.
In the case of MBC, K denotes the maximum number of predictive virtual steps taken along the horizon. At each iteration, an agent provides its current state $x_i(t)$ and $\Delta_i^{(k)}, k=0,1,2,\ldots,K-1$, and, using these, the global controller, $G_C$, calculates the future virtual states $\hat{x}_i^{(k)}$ with $\hat{x}_i^{(0)}=x_i(t)$ as:
where $\hat{u}_i^{(k)}$ is the corresponding virtual input. We define the collective states of all agents at k as $\hat{x}^{(k)} \,{:\!=}\,[\hat{x}_1^{(k)T},\ldots, \hat{x}_N^{(k)T}]^T\in \mathbb{R}^{nN}$ , resulting from corresponding collective random variables $\Delta^{(k)}(t) \,{:\!=}\, [\Delta_1^{(k)}(t)^T,\ldots,\Delta_N^{(k)}(t)^T]^T.$ The global controller computes the broadcast signal, $\sigma_{M}$ , through a compounding method as follows:
where ${\sigma_M}^{(k)}(t)\,{:\!=}\,J(\hat{x}^{(k+1)}(t))-J(\hat{x}^{(k)}(t))$ and $\sigma_M(t)=[{\sigma_{M}}^{(0)}(t),\ldots,{\sigma_{M}}^{(K-1)}(t)]^T$ . The state, $\zeta_{i}(t)$ , and the control output, $u_{i}(t)$ , of the local controller, $L_i$ , are given as:
where $\Delta_{i}^{(k)[-1]}$ denotes the element-wise inverse of vector $\Delta_{i}^{(k)}$ and $\lambda \in (0,1)$ is the discount factor of a weighted average. Through the use of $\lambda$, the immediate action is given the highest weight, and the weight decreases gradually as the number of steps increases. The weights are reduced in this way because the probability of an accurate prediction in an uncertain environment decreases as the forward-prediction horizon lengthens. While the arithmetic average (as used in PBC) is a quick general representation of a data set, the weighted average in (18)–(20) is more descriptive of the specific problem and can yield better accuracy.
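A minimal sketch of the MBC update logic under the same toy simplifications (centralised loop, known quadratic J; the discount normalisation shown is our illustrative choice and may differ from the exact form of (18)–(20)):

```python
import numpy as np

rng = np.random.default_rng(3)

def mbc_update(x, J, a, c, K, lam):
    """MBC-style update: K compounding virtual steps along a horizon.
    Step k's SPSA-like term sigma^(k)/c * Delta^(k)^{-1} is weighted by
    lam**k, so near-term predictions dominate; a single physical move
    follows the weighted average."""
    x_hat = x.copy()
    g = np.zeros_like(x)
    w_sum = 0.0
    for k in range(K):
        delta = rng.choice([-1.0, 1.0], size=x.shape)
        x_next = x_hat + c * delta        # compound from the previous virtual state
        sigma = J(x_next) - J(x_hat)      # sigma_M^(k)
        g += (lam ** k) * (sigma / c) / delta
        w_sum += lam ** k
        x_hat = x_next                    # the horizon rolls forward
    return x - (a / w_sum) * g

# Usage on a toy quadratic objective.
J = lambda x: float((x ** 2).sum())
x = np.array([5.0, 5.0])
for t in range(200):
    x = mbc_update(x, J, a=0.05, c=0.5 / (t + 1) ** 0.3, K=10, lam=0.8)
```

Unlike the PBC sketch, each virtual step here starts from the previous virtual state, so later terms probe the objective several steps away from x.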
The computing complexity of BC schemes can be expressed in terms of communication volume. The MBC scheme has a communication volume similar to that of PBC when the number of steps, K, is the same. Ito et al. [Reference Ito, Kamal, Yoshimura and Azuma33] show that the communication volume of the PBC scheme is about half that of centralised unicast protocols. BC, on the other hand, has a slightly lower communication volume than both MBC and PBC, as it does not transmit the random variables for the K virtual steps. Nevertheless, both MBC and PBC have better control performance than the BC scheme, as the random physical action is eliminated and the agents' states converge more quickly.
3.3. Convergence analysis of proposed MBC
In this subsection, we present an analysis on the convergence of the proposed MBC scheme. The convergence proof presented herein is similar in nature to that of BC [Reference Azuma, Yoshimura and Sugie29] and PBC [Reference Ito, Kamal, Yoshimura and Azuma30] and is summarised with appropriate modifications.
Theorem 1. (Convergence of the MBC Scheme) Consider the multi-agent system shown in Fig. 2, an objective function J(x), and the MBC scheme in (18), (19) and (20) with $K>1$. If the following conditions (c1)–(c3) hold, then x(t) in (9) converges to a (possibly sample-path-dependent) solution set of $\partial_x J(x)=0$ with probability 1.
(c1) The objective function $J(x): \mathbb{R}^{nN} \rightarrow \mathbb{R}$ is defined as:
(21) \begin{equation}J(x(t))\,{:\!=}\,\rho(\|x\|)J_{obj}(x) +(1-\rho(\|x\|))x^Tx,\end{equation}(22) \begin{equation}\rho(\|x\|)\,{:\!=}\, \begin{cases} 1 & (\|x\| \leq l_1)\\ 0 & (\|x\| \geq l_2) \end{cases}\end{equation}where J(x) and $J_{obj}(x)$ in (1) are non-negative and $C^2$ continuous on $\mathbb{R}^{nN}$, and there exists a solution to $\partial_x J(x)=0$. $x^Tx$ is a non-decreasing function of the distance from the origin, and $\rho$ is the switching function whose thresholds $l_1$ and $l_2$ specify the environment.
(c2) The compact connected internally chain transitive invariant sets of the gradient system $\dot{z}(\tau) = -\partial_z J(z(\tau))$ are included in the solution set of $\partial_z J(z) = 0$ (i.e., of $\partial_x J(x) = 0$), and there exists an asymptotically stable equilibrium of the gradient system, where $z(\tau) \in \mathbb{R}^{nN}$ and stability is in the Lyapunov sense [Reference Azuma, Yoshimura and Sugie29].
(c3) $\lim_{t\to\infty} a(t) =0$, $\sum_{t=0}^{\infty} a(t)=\infty$, $\lim_{t\to\infty} c(t) =0$, $\sum_{t=0}^{\infty} (a(t)/c(t))^2<\infty$ and $\sum_{t=0}^{\infty}(a(t)c(t))^2<\infty$.
Remark 2 Convergence of the proposed MBC scheme is proven by showing that the state transition of the MBC scheme converges to $\partial_x J(x)=0$, which implies that the system x(t) in (9) reaches a local minimum for the coverage problem.
Remark 3 Condition (c1) holds as $J_{obj}$ in (1) is twice differentiable and $x^Tx$ is a quadratic potential function, so J(x(t)) is sufficiently smooth. Condition (c2) holds as the Hessian matrix of J(x) is nonsingular at each x satisfying $\partial_x J(x)=0$ [Reference Robinson34]. Condition (c3) governs the decay rates of the gains a(t) and c(t); in Section 4 (Numerical Simulations), the gains are designed as in (33) to satisfy this condition.
Proof. The local controller output, $u_i(t)$, of BC in (7) performs a two-stage state transition as follows:
where
hold for $t\in \{0,2,4,\ldots\}$ . Then the system (23) becomes
where the stochastic function $g(t,\Delta)$ is defined as:
If J(x) is a $C^2$ continuous function and $c(t) \rightarrow 0$ (as implied by conditions (c1) and (c3)), then the expected value of $g(t,\Delta(t))$ (the expectation E in (28)) reduces to a stochastic approximation of the gradient of J(x), following simultaneous perturbation stochastic approximation (SPSA) [Reference Spall35], as follows:
Similarly, following (26), the multi-step state transition of the MBC scheme is given by:
Next, we define the following function e(t) for brief notation:
Substituting (28) and (30) into (29), as $c(t) \rightarrow 0$ the transition of x(t) under the MBC scheme is given by:
The system (31) is identical to the stochastic approximation algorithm in equation (A.1) (Appendix A.1) of Azuma et al. [Reference Azuma, Yoshimura and Sugie29]. Lemma 2 of Azuma et al. [Reference Azuma, Yoshimura and Sugie29] shows that (A.1) converges to a (possibly sample-path-dependent) compact connected internally chain transitive invariant set of the gradient system $\dot{z}(\tau) = -\partial_z J(z(\tau))$. Conditions (c1) and (c3) imply the conditions for the almost-sure convergence of Lemma 2 [Reference Azuma, Yoshimura and Sugie29] for (30); thus, if (c1) and (c3) hold, (31) converges to a compact connected internally chain transitive invariant set of $\dot{z}(\tau) = -\partial_z J(z(\tau))$, and if (c2) also holds, (31) converges to $\partial_x J(x)=0$, which completes the proof.
Remark 4 Convergence to a solution set to $\partial_x J(x)=0$ shows that the proposed MBC scheme (18)–(20) is our solution to the coverage problem in (10).
Remark 5 MBC is similar to PBC in that it excludes random physical motion. The difference is that the virtual random motion is now designed to be multi-step; furthermore, these predictive multi-steps proceed along the horizon and compound on each other. The MBC scheme can increase the accuracy of gradient estimation in coverage tasks with a non-uniform density distribution, for example, differing population density. The state transition of PBC is given by:
Comparing (29) and (32), it is clear that the MBC scheme utilises a weighted average, whereas PBC uses the arithmetic mean based on central tendency.
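As a numerical aside illustrating the SPSA property underlying both transitions, the following Monte-Carlo check (illustrative only, on a toy quadratic J) shows the averaged one-step estimator approaching the true gradient as $c \rightarrow 0$:

```python
import numpy as np

rng = np.random.default_rng(4)

def spsa_gradient(J, x, c, samples=20000):
    """Average many one-step SPSA estimates
    (J(x + c*Delta) - J(x)) / (c * Delta); for small c the mean
    approaches the true gradient of J at x."""
    g = np.zeros_like(x)
    for _ in range(samples):
        delta = rng.choice([-1.0, 1.0], size=x.shape)
        g += (J(x + c * delta) - J(x)) / c / delta
    return g / samples

J = lambda x: float((x ** 2).sum())
x = np.array([1.0, -2.0])
g_hat = spsa_gradient(J, x, c=1e-3)  # true gradient of ||x||^2 is 2*x
```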
4. Numerical simulations
The number of agents is set to $N=25$ in the two-dimensional state space, that is, $n=2$. The size of the coverage workspace is set to 200 by 200 square units. Initially, all agents are uniformly distributed within the x-coordinate range (80:120) and y-coordinate range (0:40). The terminal simulation time is set to 5000 iterations, and the number of multi-steps used is $K=10$. The controller gains a and c are set as follows:
for $t \in \{0,1,2,\ldots\}$ where $a_0>0$ and $c_0>0$ hold to satisfy $a(t)>0$ and $c(t)>0$ .
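Since the gain design (33) is not reproduced here, the following hypothetical power-law schedule is one standard SPSA-style choice that satisfies condition (c3); the exponents 0.9 and 0.2 are our illustrative assumptions, not necessarily the paper's values:

```python
def gains(t, a0=1.2, c0=14.5, alpha=0.9, gamma=0.2):
    """Hypothetical power-law gains: a(t) -> 0 with sum a(t) = inf
    (since alpha <= 1); c(t) -> 0; sum (a/c)^2 < inf since
    2*(alpha - gamma) > 1; and sum (a*c)^2 < inf since
    2*(alpha + gamma) > 1."""
    a = a0 / (t + 1) ** alpha
    c = c0 / (t + 1) ** gamma
    return a, c

a_early, c_early = gains(0)
a_late, c_late = gains(4999)
```

Both gains decay to zero, with a(t) decaying faster so that the ratio a(t)/c(t) is square-summable.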
The performance study evaluating the effectiveness of MBC against BC and PBC is conducted through three scenarios: the first with a single dense section, the second with two dense sections, and the third with an increased number of agents, from 15 to 25.
4.1. Coverage with a single-dense section
The performance of MBC for a single-dense section is discussed first. For this scenario, the values of $a_0$ and $c_0$ are 1.2 and 14.5, respectively; $a_v$, $a_p$ and $c_p$ are set to 15.5, 0.7 and 0.16, respectively, for all scenarios. Figure 5 shows the corresponding agent distributions at iterations 500, 1500, 2500 and 5000 for all three schemes: BC is shown in the left column, PBC in the middle and the proposed MBC in the right column. As the number of iterations increases, the agents disperse from their initial positions and distribute themselves equally over the environment. The number of agents in the dense section increases as the iterations progress. The dense shaded section holds two agents by iteration 5000 for all three schemes; however, MBC achieves this first, by iteration 1500, followed by PBC by iteration 2500 and lastly BC by iteration 5000. This shows that the schemes achieve the same allocated number of agents in the dense section, making them comparable for the convergence study.
Figure 6 shows the evolution of the objective function for BC, PBC and MBC. Both PBC and MBC exhibit a similarly steep initial gradient descent compared to BC; after the initial drop, however, MBC shows the quickest convergence to the final value, followed by PBC and finally BC. This confirms that the proposed MBC shows better convergence performance than the PBC scheme, based on the difference in execution between (23) and (24). The performance of BC differs markedly from both PBC and MBC, as BC takes two physical steps to accomplish what PBC and MBC achieve in a single physical step with multiple virtual steps. Therefore, the subsequent scenarios compare only PBC and MBC. (Note: BC is able to reach the same performance, albeit much more slowly.)
4.2. Coverage with two dense sections
The earlier section's coverage study can be expanded to multiple dense sections; here, two dense sections are studied. For this study, the value of $a_0$ is 0.55 and $c_0$ is 12.5.
Figure 7 compares the agent distributions of PBC (left column) and MBC (right column) at iterations 1000, 2000 and 4000. At iteration 1000, both PBC and MBC have each acquired a single agent in the right dense section. By iteration 2000, MBC has allocated two agents to each of the dense sections, while PBC manages two agents only in the left dense section. By iteration 4000, however, PBC has caught up and converged similarly to MBC.
The objective function in Fig. 8 shows these results clearly: MBC converges faster, by iteration 1500, while PBC converges at about iteration 3000. This shows that MBC outperforms PBC in a coverage task with double-dense sections.
4.3. Coverage for an increased number of agents
The two scenarios discussed above, for single- and double-dense sections, utilise 15 agents. This section studies the performance of the proposed MBC scheme with an increased number of agents. Here, 25 agents are used, and their initial positions now span five rows: two extra rows of agents are added to the existing three rows. The initial gains used are $a_0=0.3$ and $c_0=7.5$.
With an increased number of agents for coverage with a single-dense section, the agent distributions of PBC and MBC are compared in Fig. 9, captured at iterations 1500, 2500 and 4000. Intuitively, keeping the dense section the same size while increasing the number of agents should allocate more agents to that section. Figure 9 confirms this: at convergence, there are three agents in the dense section. By iteration 1500, MBC has two agents in the dense section compared with one for PBC; at iteration 2500, MBC has three agents within the dense section, while PBC has two. By iteration 4000, both methods show similar distributions. Figure 10 presents the corresponding objective functions, where the MBC scheme again displays quicker convergence than PBC.
In addition, the initial BC scheme proposed by Azuma et al. [Reference Azuma, Yoshimura and Sugie29] was tested experimentally on a uniform coverage task using mobile ground robots, and the results showed that the numerical scheme transfers to physical execution. On the same note, the MBC scheme is expected to perform experimentally with similar success, as it follows a similar physical execution protocol; the only differences are the additional virtual computation load and the reduced number of physical steps taken by each robot. This, alongside the numerical simulations conducted in this paper, supports the proposed scheme for implementation in coverage control tasks.
5. Conclusions and future work
In this paper, the MBC scheme has been proposed to overcome the limited one-step forward view of the existing BC schemes by introducing multiple predictive random steps along a horizon. The proposed scheme has been evaluated against the existing BC schemes in a coverage task over a region with uneven population density. Numerical simulations show that the MBC scheme outperforms the BC and PBC schemes in task accomplishment and deployment efficiency. MBC converges the quickest because its gradient is computed from multiple steps along the horizon, so dense sections emerge to the agents earlier than under BC and PBC, which use only a one-step forward view. As the MBC scheme utilises broadcast-based communication, in contrast to recent algorithms based on agent-to-agent communication protocols [Reference Li, Yuan, Wang and Dong36, Reference Chen, de Marina and Cao37, Reference Flores-Resendiz and Aranda-Bricaire38, Reference Yousuf, Khan and Noor39, Reference Güzel, Ajabshir, Nattharith, Gezer and Can40], the comparison study is restricted to the previous BC schemes. In the future, we intend to extend the MBC scheme to practical applications of multi-agent systems.
Conflicts of interest
None.
Financial support
This research is partly supported by the Fundamental Research Grant of the Ministry of Higher Education, Malaysia, FRGS/1/2016/TK04/MUSM/02/1, and the Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Scientific Research (C) 20K04531.
Ethical considerations
None.
Authors’ contributions
S. D., M. K. and C. T. designed the study. S. D. performed the numerical simulations, data analysis and wrote the manuscript with input and feedback from C. T., M. K. and M. S. All authors reviewed the results and approved the final version of the manuscript.