1. INTRODUCTION
Flexible manufacturing systems can be adjusted to produce different products and/or to change the order in which products are made. Thus, they bring both versatility and efficiency to the production process. This is precisely the reason for the increasing interest in this kind of system: for some time now, the manufacturing industry has been demanding flexible manufacturing systems more and more often as an alternative to traditional rigid production systems.
In the setting dealt with in this work, we consider a single machine that has several slots into which different tools can be loaded. Each slot admits just one tool, and each job executed on that machine requires a particular set of tools to be completed. Jobs are executed sequentially; therefore, each time a job is to be processed, the corresponding tools must be loaded in the machine magazine. The number of slots available in this magazine is obviously limited. Because the total number of tools required to process all jobs is in general larger than the number of slots in the magazine, it may be necessary at some point to perform a tool switch, that is, to remove a tool from the magazine and insert another one in its place. In this context, tool management is a challenging task that directly influences the efficiency of flexible manufacturing systems: an inadequate schedule of jobs and/or a poor tool switching policy may result in excessive delays for reconfiguring the machine.
Although the order of tools in the magazine is often irrelevant, the need to perform a tool switch does depend on the order in which the jobs are executed. The tool switching problem (ToSP) consists of finding an appropriate sequence in which jobs will be executed, and an associated sequence of tool loading/unloading operations that minimizes the number of tool switches in the magazine. Clearly, this problem is particularly interesting when the time needed to change a tool is a significant part of the processing time of all jobs, so that the tool switching policy significantly affects the performance of the system. Examples of the problem can be found in areas as diverse as electronics manufacturing, the metalworking industry, computer memory management, and aeronautics, among others (Belady, 1966; Bard, 1988; Tang & Denardo, 1988; Privault & Finke, 1995; Shirazi & Frizelle, 2001).
It must be noted that the ToSP is an extremely hard problem, whose difficulty scales up with the number of jobs, the number of tools, and the magazine capacity. As later described in Section 2.1, methods ranging from exact integer linear programming (ILP) techniques to constructive heuristics have already been applied to the problem with moderate success. The reason is clear: the ToSP has been proved to be NP-hard when the magazine capacity is higher than 2 (which is the usual case), and thus exact methods are inherently limited. In this context, the use of alternative techniques that might eventually overcome this limitation has been explored. In particular, the use of metaheuristic techniques (Blum & Roli, 2003) can be considered. These techniques utilize high-level strategies to combine basic heuristics, and their most distinctive feature is their ability to escape from local optima (or extrema). They thus have global optimization capabilities, although they cannot in general provide optimality proofs for the solutions they obtain. Nevertheless, when adequately crafted, they will likely provide optimal or nearly optimal solutions to a wide range of continuous and combinatorial optimization problems.
Amaya et al. (2008) recently proposed three metaheuristics to tackle the ToSP: a simple local search (LS) scheme based on hill climbing (HC), a genetic algorithm (GA), and a memetic algorithm (MA; Moscato & Cotta, 2003, 2007; Krasnogor & Smith, 2005) based on the hybridization of the two latter methods. This MA produced very good results compared with a very efficient method [a beam search (BS) heuristic; Zhou et al., 2005] that had generated high-quality results on a number of ToSP instances. That seminal work paves the way for considering other memetic approaches to the ToSP, based on the use of other recombination operators, other LS techniques, and partial Lamarckianism, as well as the utilization of alternative neighborhood structures. This has been done here, along with an extensive empirical evaluation that includes a meticulous statistical comparison among 27 algorithms. Our analysis highlights the appropriateness of attacking the ToSP via metaheuristic, in particular memetic, approaches, and yields a sound ranking of techniques for the problem, providing useful insights on its heuristic resolution.
2. BACKGROUND
Before formally describing the ToSP, let us first give an overview of the problem and its variants, and review related work.
2.1. Related work
The ToSP is a combinatorial optimization problem that involves scheduling a number of jobs on a single machine such that the resulting number of tool switches is kept to a minimum. We focus here on the uniform case of the ToSP, in which there is one magazine, no job requires more tools than the magazine capacity, and the slot size is constant. To the best of our knowledge, the first reference to the uniform ToSP can be found in the literature as early as the 1960s (Belady, 1966); since then, the uniform ToSP has been tackled via many different techniques. The late 1980s were especially fruitful for the problem (ElMaraghy, 1985; Bard, 1988; Kiran & Krason, 1988; Tang & Denardo, 1988). In particular, Tang and Denardo (1988) proposed an ILP formulation of the problem, and Bard (1988) formulated the ToSP as a nonlinear integer program with a dual-based relaxation heuristic. More recently, Laporte et al. (2004) proposed two exact algorithms: a branch and bound approach and a linear programming-based branch and cut algorithm. The latter is based on a new ILP formulation with a better linear relaxation than the one previously proposed by Tang and Denardo (1988). An alternative definition of the problem was formulated by Ghiani et al. (2007), who demonstrated that the ToSP is a symmetric sequencing problem; under this perspective, the authors enriched the branch and bound algorithm proposed by Laporte et al. (2004) with this new formulation, obtaining a significant computational improvement.
Despite the moderate success of exact methods, it must be noted that they are inherently limited, because Oerlemans (1992) and Crama et al. (1994) formally proved that the ToSP is NP-hard for C > 2, where C is the magazine capacity, that is, the number of tools it can accommodate. This limitation was already highlighted by Laporte et al. (2004), who reported that their algorithm was capable of dealing with instances with 9 jobs but provided very low success ratios for instances with more than 10 jobs. Some ad hoc heuristics have been devised in response to this complexity barrier; we refer to Amaya et al. (2008) for an overview of these. The use of metaheuristics has also been considered recently. In addition to the work of Amaya et al. (2008) mentioned before, LS methods such as tabu search (TS) have been proposed (Hertz & Widmer, 1993; Al-Fawzan & Al-Sultan, 2003). Among these, we find the approach presented by Al-Fawzan and Al-Sultan (2003) particularly interesting because of the quality of the results obtained; they defined three different versions of TS that arose from the inclusion of different algorithmic mechanisms such as long-term memory and oscillation strategies. We will return to this approach and describe it in more detail later, because it has been included in our experimental comparison.
A different and very interesting approach has been described by Zhou et al. (2005), who proposed a BS algorithm. BS is a derivative of branch and bound that uses a breadth-first traversal of the search tree and incorporates a heuristic choice to keep at each level only the best (according to some quality measure) β nodes, where β is the so-called beamwidth. This sacrifices completeness but provides a very effective heuristic search approach. This method provided good results, better, for example, than those of Bard's heuristics, and will also be included in the experimental comparison.
Note that the ToSP admits a number of variants. In this work we focus on the uniform ToSP (cf. Section 2.2), but the problem can be augmented by posing additional constraints on the tools or on the magazine; in this case, one refers to the so-called nonuniform ToSP (Crama et al., 2007). For example, it might be the case that different tools require slots of different sizes (or more than one slot); this was precisely the case addressed by Tzur and Altman (2004), who considered one magazine with slots of variable size and, by means of an integer-programming heuristic, tackled the three types of decisions needed to solve the problem, that is, how to select the job sequence, which tools to switch before each processing operation, and where to locate each tool in the magazine. An additional variant of the ToSP considers multiple magazines. Several proposals for solving this variant can be found in the literature; for instance, Kashyap and Khator (1994) analyzed the control rules for tool selection in a flexible manufacturing system with multiple magazines and used a particular policy to determine tool requirements. Błażewicz and Finke (1994) considered two-level nested scheduling problems (i.e., the part-machine scheduling problem, and the resource allocation and sequencing problem) and described some concrete models and solution procedures. In addition, Hong-Bae et al. (1999) described several algorithms (e.g., greedy search based techniques, as well as tool grouping-based methods) to solve the problem with a number of identical magazines, each with a particular capacity. A more general case with parallel machines and different magazine capacities was considered by Keung et al. (2001). From a wider perspective, Hop (2005) presented the ToSP as a hierarchical structure that can be analyzed under four different assumptions: variable size for the tools/slots, jobs requiring more tools than the magazine capacity, partial or complete job splitting, and (non)concurrent tool changes/job changes. All variants of the problem considered under these assumptions were proven to be NP-complete.
2.2. Formulation of the uniform ToSP
In light of the informal description of the uniform ToSP given before, there are two major elements in the problem: a machine M and a collection of jobs J = {J_1, … , J_n} to be processed. Regarding the latter, the relevant information that will drive the optimization process is the tool requirement of each job. We assume that there is a set of tools T = {τ_1, … , τ_m}, and that each job J_i requires a certain subset T(J_i) ⊆ T of tools to be processed. As for the machine, we will just consider one piece of information: the capacity C of the magazine (i.e., the number of available slots).
Given the previous elements, we can formalize the ToSP as follows: let a ToSP instance be represented by a pair I = 〈C, A〉, where C denotes the magazine capacity and A is an m × n binary matrix that defines the tool requirements for executing each job, that is, A_{ij} = 1 if and only if tool τ_i is required to execute job J_j, and 0 otherwise.
We assume that C < m; otherwise, the problem is trivial. The solution to such an instance is a sequence 〈J_{i_1}, … , J_{i_n}〉 (where i_1, … , i_n is a permutation of the numbers 1, … , n) determining the order in which the jobs are executed, and a sequence T_1, … , T_n of tool configurations (T_k ⊆ T) determining which tools are loaded in the magazine at each time. Note that for this sequence of tool configurations to be feasible, it must hold that T(J_{i_j}) ⊆ T_j.
Let ℕ_h = {1, … , h} henceforth. We will index jobs (respectively, tools) with integers from ℕ_n (respectively, ℕ_m). An ILP formulation of the ToSP is shown below, using two sets of zero-one decision variables:
• x_{jk} = 1 if job j ∈ ℕ_n is assigned to position k ∈ ℕ_n in the sequence, and 0 otherwise; see Eqs. (2) and (3).
• y_{ik} = 1 if tool i ∈ ℕ_m is in the magazine at instant k ∈ ℕ_n, and 0 otherwise; see Eq. (4).
Processing each job requires a particular collection of tools to be loaded in the magazine. It is assumed that no job requires more tools than the magazine capacity, that is, ∑_{i=1}^m A_{ij} ≤ C for all j ∈ ℕ_n. Tool requirements are reflected in Eq. (5). Following Bard (1988), we assume the initial condition y_{i0} = 1 for all i ∈ ℕ_m. This initial condition amounts to the initial loading of the magazine not being counted as part of the cost of the solution (in fact, no actual switching is required for this initial load). The objective function F(·) counts the number of switches that have to be done for a particular job sequence; see Eq. (1). We assume that the cost of each tool switch is constant and unitary.
$$\min F(y) = \sum_{j=1}^{n} \sum_{i=1}^{m} y_{ij}\,(1 - y_{i,j-1}), \tag{1}$$

$$\forall j \in \mathbb{N}_n : \sum_{k=1}^{n} x_{jk} = 1, \tag{2}$$

$$\forall k \in \mathbb{N}_n : \sum_{j=1}^{n} x_{jk} = 1, \tag{3}$$

$$\forall k \in \mathbb{N}_n : \sum_{i=1}^{m} y_{ik} \leq C, \tag{4}$$

$$\forall j, k \in \mathbb{N}_n \quad \forall i \in \mathbb{N}_m : A_{ij}\, x_{jk} \leq y_{ik}, \tag{5}$$

$$\forall j, k \in \mathbb{N}_n \quad \forall i \in \mathbb{N}_m : x_{jk},\, y_{ik} \in \{0, 1\}. \tag{6}$$
Recall that the general definition shown above corresponds to the uniform ToSP, in which each tool fits in just one slot.
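To make the formulation concrete, the following minimal sketch assembles the model in Python with the PuLP modeling library (an assumption of this sketch; any ILP layer would do). Because the product y_{ij}(1 − y_{i,j−1}) in Eq. (1) is nonlinear, a standard linearization via auxiliary switch variables z_{ik} ≥ y_{ik} − y_{i,k−1} is used; the function name and the representation of A as a list of m rows are ours, not from any published implementation.

```python
# Illustrative sketch (not a published implementation) of the uniform ToSP ILP.
import pulp

def build_tosp_ilp(A, C):
    m, n = len(A), len(A[0])  # m tools, n jobs; A[i][j] = 1 iff tool i serves job j
    prob = pulp.LpProblem("uniform_ToSP", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("x", (range(n), range(n)), cat=pulp.LpBinary)
    y = pulp.LpVariable.dicts("y", (range(m), range(n + 1)), cat=pulp.LpBinary)
    z = pulp.LpVariable.dicts("z", (range(m), range(1, n + 1)), lowBound=0)

    # Linearized Eq. (1): z[i][k] >= y[i][k] - y[i][k-1] counts tool insertions
    prob += pulp.lpSum(z[i][k] for i in range(m) for k in range(1, n + 1))
    for i in range(m):
        prob += y[i][0] == 1                      # free initial load (Bard, 1988)
        for k in range(1, n + 1):
            prob += z[i][k] >= y[i][k] - y[i][k - 1]
    for j in range(n):
        prob += pulp.lpSum(x[j][k] for k in range(n)) == 1          # Eq. (2)
    for k in range(n):
        prob += pulp.lpSum(x[j][k] for j in range(n)) == 1          # Eq. (3)
        prob += pulp.lpSum(y[i][k + 1] for i in range(m)) <= C      # Eq. (4)
        for j in range(n):
            for i in range(m):
                if A[i][j]:
                    prob += x[j][k] <= y[i][k + 1]                  # Eq. (5)
    return prob  # calling prob.solve() recovers an optimal job sequence via x
```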
2.3. The ToSP as a machine-loading problem
The ToSP can be divided into three subproblems (Tzur & Altman, 2004): the first is machine loading, which consists of determining the sequence of jobs; the second is tool loading, which consists of determining which tool to switch (if a switch is needed) before processing a job; the third is slot loading, which consists of deciding where (i.e., in which slot) to place each tool. Because we are considering the uniform ToSP, the third subproblem does not apply (all slots are identical, and the order of tools is irrelevant). Therefore, only two subproblems have to be taken into account: machine loading and tool loading. In the following we will show that the tool loading subproblem can be optimally solved if the sequence of jobs is known beforehand. This is very important for optimization purposes, because it means that the search effort can be concentrated on the machine loading stage.
As already mentioned, the cost of switching a tool is considered constant (the same for all tools) in the uniform ToSP, the relevant decision being whether or not a tool is to be loaded in the magazine at any given time (were the size of the tools not uniform, the location of the tools in the magazine would also be relevant). Under this assumption, if the job sequence is fixed, the optimal tool switching policy can be determined in polynomial time using a greedy procedure termed Keep Tool Needed Soonest (KTNS; Bard, 1988; Tang & Denardo, 1988). This procedure works as follows:
• At any instant, insert all the tools that are required for the current job.
• If one or more tools must be inserted and there are no vacant slots in the magazine, keep the tools that are needed soonest. Let J = 〈J_{i_1}, … , J_{i_n}〉 be the job sequence, and let T_k ⊆ ℕ_m be the tool configuration at time k. Let Ξ_{jk}(J) be defined as

$$\Xi_{jk}(J) = \min \{\, k' \in \mathbb{N}_n \mid k' > k,\ \tau_j \in T(J_{i_{k'}}) \,\},$$

that is, the next instant after time k at which tool τ_j will be needed again given sequence J (by convention, Ξ_{jk}(J) = n + 1 if the tool is never needed again). If a tool has to be removed, the tool τ_{j*} maximizing Ξ_{jk}(J) is chosen; that is, we remove the tool whose next usage is furthest in time.
The importance of this policy is that, as mentioned before, given a job sequence, KTNS obtains the optimal number of tool switches for it. Therefore, we can concentrate on the machine loading subproblem and use KTNS as a subordinate procedure to solve the subsequent tool loading subproblem. As an aside, the tool loading problem is NP-hard in the nonuniform ToSP, even if the job sequence is known and unit loading/unloading costs are assumed (Crama et al., 2007).
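For illustration, the following Python sketch implements the KTNS policy just described; given a job sequence, it returns the associated optimal number of switches and can thus act as the objective function for all the metaheuristics discussed below. The function and variable names are ours; req is assumed to map each job to its tool set, and the initial load is not charged.

```python
def ktns(sequence, req, C):
    """Optimal switch count for a fixed job sequence (uniform ToSP)."""
    n = len(sequence)
    magazine = set()
    switches = 0
    for pos, job in enumerate(sequence):
        for tool in sorted(req[job] - magazine):
            if len(magazine) < C:
                magazine.add(tool)   # free slot: no switch charged (magazine filling)
                continue
            def next_use(t):         # Xi_{jk}(J): next instant at which t is needed
                return next((k for k in range(pos + 1, n)
                             if t in req[sequence[k]]), n + 1)
            # Evict the loaded tool, not needed by the current job,
            # whose next usage is furthest in time.
            victim = max((t for t in magazine if t not in req[job]), key=next_use)
            magazine.remove(victim)
            magazine.add(tool)
            switches += 1
    return switches
```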
2.4. An illustrative example
To illustrate the formal definition of the problem given in the previous subsections, let us present a small example. Let there be a machine with a magazine capacity C = 4, and let there be n = 10 jobs requiring a total number of m = 9 tools. More precisely, let the requirement matrix be the one indicated in Table 1.
Table 1. Example of tool requirement matrix
Note: Each cell A_{ij} indicates whether a particular job j requires tool i (•) or not (○).
Now, let us assume we have the job sequence 〈1, 6, 3, 7, 5, 2, 8, 4, 9, 10〉. The initial loading of the magazine must thus comprise the tools required by job 1, namely, T(1) = {2, 3, 6}. Because there are still free slots in the magazine, these are loaded with tools required by the next job in the sequence (job 6; this means tool 1 is also loaded; see Fig. 1).
Fig. 1. An example of the application of the Keep Tool Needed Soonest policy. The tool requirements for each job are those indicated in Table 1. Slots in the magazine are denoted by circles (each row depicting the state of the magazine at a given time step). Black circles denote a tool switch. Finally, the sequence of jobs is given by the dark squares, and the cumulative number of switches is indicated on the right side of the figure. [A color version of this figure can be viewed online at journals.cambridge.org/aie]
Job 1 can thus be executed, and so can job 6, without any tool switch. The next job is number 3, which requires tools {2, 6, 7}. Tools 2 and 6 are already in the magazine but 7 is not, so a tool must be unloaded to make room for it. Two options are available for this purpose: tools 1 and 3. The KTNS policy determines that tool 3 has to be replaced, because the next time it will be required is when serving job 2 at position 6 in the sequence, whereas tool 1 is required again for job 5 at position 5 in the sequence. Job 7 comes next and requires tools {6, 8}. Tool 8 is then loaded, replacing tool 7 (required again by job 9 at time step 9); the other candidates for replacement were tool 2 (required by job 2 at time 6) and tool 1 (required by job 5 at time 5). Now, job 5 requires tools {1, 5, 9}, and only tool 1 is loaded, so a double switch is required. The candidates to be replaced at this point are tool 2 (required by job 2 at time 6), tool 6 (not required again), and tool 8 (required again by job 8 at time 7). Therefore, tools 6 and 8 are replaced. Job 2 comes next and requires tools {2, 3, 5, 9}, among which only tool 3 is not loaded. In this case the only possibility is replacing tool 1 by tool 3. Proceeding to job 8, tools {5, 8, 9} are needed, so tool 8 enters the magazine replacing tool 3 (not required again in the future; the same holds for tool 2, so it is irrelevant which of the two is removed). Getting to job 4, tool 4 is required in addition to tool 9 (already loaded). The former enters the magazine replacing tool 8 (again, not required afterward, much like tool 2; tool 5 is, however, required later by job 10 at time 10). The last but one job is number 9, needing tools {4, 7}. Because tool 4 is already in the magazine, only tool 7 has to be loaded, replacing either tool 2 or tool 9 (neither of them required again in the future). Finally, job 10 is completed using tools {4, 5}, already in the magazine, so no new switch is required.
3. SOLVING THE ToSP WITH METAHEURISTICS
Let us now describe the metaheuristics considered to tackle the ToSP. To do so, Section 3.1 deals with general issues of representation and neighborhood structure, whereas algorithm-dependent issues are described in Sections 3.2 and 3.3.
3.1. Representation and neighborhood structure
The use of metaheuristics to solve the ToSP requires determining in each case how solutions will be represented and what the structure of the underlying search space will be. For the techniques considered in this work, these turn out to be general issues, which we address here. According to the discussion in Section 2.3, the role of the metaheuristics will be to determine an optimal (or near optimal) job sequence such that the total number of switches is minimized. Therefore, a permutational encoding arises as the natural way to represent solutions. Thus, a candidate solution for a specific ToSP instance I = 〈C, A〉 is simply a permutation π = 〈π_1, … , π_n〉 ∈ ℙ_n, where π_i ∈ ℕ_n, and ℙ_n is the set of all permutations of the elements in ℕ_n.
Having defined the representation, we now turn our attention to the neighborhood structure. This will be a central ingredient in the local search-based metaheuristics considered, both when used as stand-alone techniques or when embedded within other search algorithms. Permutations are amenable to different neighborhood structures. We focused on the following two:
1. The well-known swap neighborhood 𝒩_swap, in which two permutations are neighbors if they differ in exactly two positions of the sequence; that is, for a permutation π ∈ ℙ_n,

$$\mathcal{N}_{\mathrm{swap}}(\pi) = \{\, \pi' \in \mathbb{P}_n \mid H(\pi, \pi') = 2 \,\},$$

where H(π, π′) = n − ∑_{i=1}^{n} [π_i = π′_i] is the Hamming distance between sequences π and π′ (the number of positions in which the sequences differ), and [·] is the Iverson bracket (i.e., [P] = 1 if P is true, and [P] = 0 otherwise). Given the permutational nature of the sequences, this implies that the contents of the two differing positions have been swapped.

2. The block neighborhood 𝒩_block, a generalization of the swap neighborhood in which a permutation π′ is a neighbor of permutation π if the former can be obtained from the latter via a random block swap, performed as follows (see the sketch after this list):

(a) A block length b_l ∈ ℕ_{n/2} is uniformly selected at random.

(b) The starting point of the block, b_s ∈ ℕ_{n−2b_l}, is subsequently selected at random.

(c) Finally, an insertion point b_i is selected such that b_s + b_l ≤ b_i ≤ n − b_l, and the segments 〈π_{b_s}, … , π_{b_s+b_l−1}〉 and 〈π_{b_i}, … , π_{b_i+b_l−1}〉 are swapped.

Obviously, if the block length is b_l = 1, the operation reduces to a simple position swap, but this is typically not the case.
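A possible Python rendition of both moves follows (0-based indices; names are ours). The sampling bounds mirror steps (a) to (c) and guarantee that the two exchanged segments do not overlap.

```python
import random

def swap_move(perm, rng=random):
    """Uniformly sample a neighbor from the swap neighborhood."""
    p = list(perm)
    i, j = rng.sample(range(len(p)), 2)
    p[i], p[j] = p[j], p[i]
    return p

def block_swap_move(perm, rng=random):
    """Random block swap: exchange two non-overlapping segments of equal length."""
    p, n = list(perm), len(perm)
    bl = rng.randint(1, n // 2)            # block length, step (a)
    bs = rng.randint(0, n - 2 * bl)        # start of the first block, step (b)
    bi = rng.randint(bs + bl, n - bl)      # start of the second block, step (c)
    p[bs:bs + bl], p[bi:bi + bl] = p[bi:bi + bl], p[bs:bs + bl]
    return p
```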
Having defined the neighborhood structures, the next step is deploying LS-based procedures on them. This is described in the next subsection.
3.2. LS metaheuristics for the ToSP
LS metaheuristics are based on exploring the neighborhood of a certain "current" solution, using some specific decision-making procedure to determine when and where within this neighborhood the search is to be continued. Thus, LS can typically be modeled as a trajectory in the search space, that is, an ordered sequence of solutions such that neighboring solutions in this sequence differ in some small amount of information. The quality of the solutions in this sequence does not have to be monotonically increasing in general. Indeed, the ability to perform "downhill" moves, that is, to move to a solution of lower quality than the current one, is a crucial feature of most LS metaheuristics, allowing them to escape from local extrema and hence endowing them with global optimization capabilities. Furthermore, the dynamics of some LS techniques cannot even be modeled as a simple trajectory in the search space, because additional mechanisms can be used to resume the search from a different point when stagnation is detected.
The first LS technique considered is classical exhaustive steepest ascent HC. Given a current solution π, its neighborhood is fully explored, and the best solution found is taken as the new current solution, provided it is better than the current one (ties are broken randomly). If no such neighboring solution exists, the search is considered stagnated, and can be restarted from a different initial point.
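Schematically, and using the conventions of this paper (lower switch counts are better, so the ascent in quality is a descent in cost), the procedure could be coded as follows; the interface, the deterministic tie handling, and the evaluation budget handling are our own simplifications:

```python
def hill_climb(start, neighbors, fitness, max_evals):
    """Exhaustive steepest HC with an evaluation budget (minimization).

    `neighbors(s)` yields the full neighborhood of s; `fitness` could be,
    e.g., the KTNS-based switch count sketched earlier.
    """
    current, f_cur, evals = start, fitness(start), 1
    while evals < max_evals:
        best, f_best = None, f_cur
        for cand in neighbors(current):
            f = fitness(cand)
            evals += 1
            if f < f_best:           # strict improvement (first best kept)
                best, f_best = cand, f
        if best is None:
            break                    # stagnation: restart handled by the caller
        current, f_cur = best, f_best
    return current, f_cur
```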
The basic HC scheme suffers when confronted with a rugged search landscape, which keeps the search trapped in low-quality local optima. To escape from these, a mechanism for accepting strictly nonimproving moves has to be incorporated. One of the most classical proposals to this end is simulated annealing (SA; Kirkpatrick et al., 1983). Inspired by the physical process of thermal cooling and residual strain relief in metals, SA uses a probabilistic criterion to accept a neighbor as the current solution. This criterion is based on Boltzmann's law and is parameterized by a so-called temperature value (recall the analogy with thermal cooling). More precisely, let Δf be the fitness difference between the tentative neighbor and the current solution (in this case, a negative value if the neighbor is better than the current solution), and let T be the current temperature. Then, the neighboring configuration is accepted with probability P given by
$$P = \begin{cases} 1, & \text{if } \Delta f < 0, \\ e^{-\Delta f / (k_{\mathrm{B}} T)}, & \text{otherwise,} \end{cases}$$

where k_B is Boltzmann's constant (which can be ignored in practice by considering an appropriate scaling for the temperature). The current temperature T modulates this acceptance probability (if T is high, larger energy increases are allowed). The temperature is decreased from its initial value T_0 to a final value T_k < T_0 via a process termed the cooling schedule. Two classical cooling schedules are geometric cooling, that is, T_{i+1} = γT_i for some γ < 1, and arithmetic cooling, that is, T_{i+1} = T_i − ɛ for some ɛ > 0. These are, however, somewhat simplistic strategies, nowadays superseded by more sophisticated cooling schedules that adaptively modify the temperature in response to the evolution of the search. To be precise, we have also considered an approach based on adaptive cooling and reheating (cf. Elmohamed et al., 1998).
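In code, the acceptance criterion amounts to a few lines; in this sketch k_B is folded into the temperature scaling, as noted above:

```python
import math
import random

def sa_accept(delta_f, T, rng=random):
    """Metropolis acceptance: improving moves (delta_f < 0) always pass;
    worsening moves pass with probability exp(-delta_f / T)."""
    return delta_f < 0 or rng.random() < math.exp(-delta_f / T)
```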
The idea underlying the use of adaptive cooling is to keep the system close to equilibrium by decreasing the temperature according to a search-state-dependent variable termed the specific heat. This variable measures the variability of the cost of states at a given temperature; higher values indicate that it will take longer to reach equilibrium, and hence slower cooling is required. Following Huang et al. (1986), the next temperature is thus calculated as
$$T_{i+1} = T_i\, e^{-\eta T_i / \bar{\sigma}(T_i)},$$
where η is a tunable parameter and σ̄(T_i) is a smoothed version of σ(T_i), the standard deviation of the cost at temperature T_i, computed as (Otten & van Ginneken, 1989; Diekmann et al., 1993)
$$\bar{\sigma}(T_{i+1}) = (1 - \nu)\,\sigma(T_{i+1}) + \nu\,\bar{\sigma}(T_i)\,\frac{T_{i+1}}{T_i}.$$
The parameter ν tunes the learning rate and is generally set to 0.95. As to reheating, it is invoked whenever the search is deemed stagnated (after n_ι evaluations without improvement, where n_ι is a parameter). In that case, the temperature is reset to
$$T_{i+1} = \kappa f_{\mathrm{B}} + T(C_{\mathrm{H}}^{\max}),$$
where κ is a parameter, f_B is the cost of the best solution so far, and T(C_H^max) is the temperature at which the specific heat C_H(T) = σ²(T)/T² took its maximum value.
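The two update rules translate directly into code. The following sketch (our names; η and ν as in the text) assumes that σ(T) is estimated from the costs of the solutions sampled at the current temperature:

```python
import math
import statistics

def next_temperature(T_i, sigma_bar_i, eta=1e-4):
    # Adaptive cooling step: T_{i+1} = T_i * exp(-eta * T_i / sigma_bar(T_i)).
    return T_i * math.exp(-eta * T_i / sigma_bar_i)

def smoothed_sigma(costs_at_T_next, sigma_bar_i, T_next, T_i, nu=0.95):
    # sigma_bar(T_{i+1}) = (1 - nu)*sigma(T_{i+1}) + nu*sigma_bar(T_i)*T_{i+1}/T_i,
    # with sigma(T_{i+1}) estimated from the costs sampled at T_{i+1}.
    sigma_next = statistics.pstdev(costs_at_T_next)
    return (1 - nu) * sigma_next + nu * sigma_bar_i * (T_next / T_i)
```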
The last LS scheme considered is TS (Glover, 1989a, 1989b). TS is a sophisticated extension of basic HC in which the best neighboring solution is chosen as the next configuration, even if it is worse than the current one. To prevent cycling, that is, the search returning to the same point after a few steps (consider, e.g., that a solution y may be the best neighbor of x and vice versa), a tabu list of movements is kept. Hence, a neighboring solution is accepted only if the corresponding move is not tabu. The tabu status of a move is not permanent: it only lasts for a number of search steps, a value termed the tabu tenure. This value can be fixed for all moves and/or the whole search process, or can differ across moves or across stages of the search. Furthermore, an aspiration criterion may be defined, so that the tabu status of a move can be overridden if a certain condition holds (e.g., improving the best known solution).
The TS method considered in this work is based on the proposal of Al-Fawzan and Al-Sultan (2003). Different TS schemes were defined and compared therein, the best one turning out to be a TS algorithm featuring long-term memory and strategic oscillation. The first feature refers to the maintenance of a long-term memory, in this case measuring the frequency of application of each move; the basic idea is to diversify the search by penalizing neighbors attainable via frequent moves. As to the strategic oscillation mechanism, it refers to a procedure for switching between the two neighborhoods defined in Section 3.1. A deterministic criterion based on switching the neighborhood structure after a fixed number of iterations was reported by Al-Fawzan and Al-Sultan (2003) to perform better than a probabilistic criterion (i.e., choosing the neighborhood structure at each step according to a certain probability distribution). No aspiration criterion is used in this algorithm.
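The following compact sketch conveys the structure of such a TS: the frequency-based long-term memory enters as a penalty on the move evaluation, and no aspiration criterion is used. The penalty weight and the representation of moves as hashable objects are our own simplifications; strategic oscillation corresponds to switching the moves generator between the two neighborhoods every few iterations.

```python
from collections import defaultdict

def tabu_search(start, fitness, moves, apply_move, max_evals,
                tenure=5, penalty=1.0):
    """Sketch of TS with frequency-based long-term memory (minimization)."""
    current, f_cur = start, fitness(start)
    best, f_best = current, f_cur
    tabu = {}                      # move -> last iteration at which it is tabu
    freq = defaultdict(int)        # long-term memory: move application counts
    it, evals = 0, 1
    while evals < max_evals:
        it += 1
        candidates = []
        for mv in moves(current):
            if tabu.get(mv, 0) >= it:
                continue           # tabu move; no aspiration criterion is used
            cand = apply_move(current, mv)
            f = fitness(cand)
            evals += 1
            candidates.append((f + penalty * freq[mv], f, cand, mv))
        if not candidates:
            break
        _, f_cur, current, mv = min(candidates, key=lambda c: c[0])
        tabu[mv] = it + tenure     # forbid this move for `tenure` iterations
        freq[mv] += 1
        if f_cur < f_best:
            best, f_best = current, f_cur
    return best, f_best
```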
3.3. A population-based attack to the ToSP
Unlike LS methods, population-based techniques maintain a pool of candidate solutions, which are used to generate new candidate solutions, not just by neighborhood search but also via higher-arity procedures such as recombination, in which two or more solutions (appropriately termed parents) are combined to create new solutions (Bäck, 1996). Although the relevance of recombination versus neighborhood search has always been debated (Reeves, 1994), a common criticism being that, unless adequately crafted to the problem at hand, recombination may reduce to pure macromutation (Jones, 1995), it is widely accepted that recombination can play a crucial role in information mixing, as well as in the balance between exploitation and exploration (Prügel-Bennett, 2010).
The first population-based approach considered is a steady-state GA: a single solution is generated in each generation and inserted in the population, replacing the worst individual. Selection is done by binary tournament. As to recombination, many operators have been defined in the literature (see, among others, Oliver et al., 1987; Starkweather et al., 1991; Cotta & Troya, 1998; Larrañaga et al., 1999). We have opted in this work for uniform cycle crossover (Cotta & Troya, 1998), an operator based on the manipulation of positional information. To be precise, it is a generalization of cycle crossover in which all cycles are first identified and subsequently mixed at random. Notice that this operator ensures the new solution contains no exogenous positional information (each position is taken from one of the parents). As to mutation, we have considered the use of random block swap moves, as described in Section 3.1.
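A possible implementation of this operator is sketched below (not a published implementation): the cycles of the two parents are identified in the usual way, and each whole cycle is inherited from a parent chosen by a coin flip, so every position of the child holds a value that one of the parents had at that very position.

```python
import random

def uniform_cycle_crossover(p1, p2, rng=random):
    """Sketch of uniform cycle crossover on two parent permutations."""
    n = len(p1)
    pos_in_p1 = {v: i for i, v in enumerate(p1)}
    child = [None] * n
    visited = [False] * n
    for start in range(n):
        if visited[start]:
            continue
        cycle, i = [], start          # collect one cycle of positions
        while not visited[i]:
            visited[i] = True
            cycle.append(i)
            i = pos_in_p1[p2[i]]      # position in p1 of p2's value at i
        src = p1 if rng.random() < 0.5 else p2
        for i in cycle:               # inherit the whole cycle from one parent
            child[i] = src[i]
    return child
```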
On the basis of this GA, we have defined a number of MAs. MAs are hybrid methods based on the synergistic combination of ideas from different search techniques, most prominently from LS and population-based search. The term memetic stems from the notion of a meme, a concept coined by Dawkins (1976) to denote an analogue of the gene in the context of cultural evolution. Indeed, information manipulation is much more flexible in MAs, thanks to the usage of algorithmic add-ons such as LS, exact techniques, and so forth. It must be noted that although the connection to cultural evolution is sometimes overstressed in the literature, it is useful for departing from biologically constrained thinking, which turns out to be very restrictive at times. As a matter of fact, the initial developments in MAs by Moscato (1989) did not emanate from a biological metaphor, but from the idea of maintaining a population of cooperating/competing search agents, for which a combination of evolutionary algorithms and LS was just a convenient instantiation (LS for encapsulating search agents, and an evolutionary algorithm for encapsulating cooperation, via recombination, and competition, via selection and replacement). See Moscato and Cotta (2010) for an up-to-date overview of these techniques.
The MAs considered in this work have been built by endowing the GA with each of the LS schemes previously defined, that is, each of the algorithms (HC, SA, TS) described in Section 3.2. Although in some early MAs LS was performed on every generated individual, this is not necessarily the best choice (Sudholt, 2009). Indeed, partial Lamarckianism (Houck et al., 1997), namely, applying LS only to a fraction of the individuals, can result in better performance. The individuals to which LS will be applied can be selected in many different ways (Nguyen et al., 2007). We have considered a simple approach in which LS is applied to any individual with probability p_LS; in case of application, the improvement uses up to LSevals evaluations (or, in the case of HC, runs until it stagnates, whichever comes first; see Algorithm 1 in Appendix A).
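Putting the pieces together, one steady-state generation of the resulting MA could look as follows (a sketch under the parameterization above; the operator arguments are placeholders for the procedures already described, and fitness would be the KTNS-based evaluation):

```python
import random

def memetic_step(population, fitness, crossover, mutate, local_search,
                 p_ls=0.01, ls_evals=200, rng=random):
    """One steady-state MA generation: binary tournament selection,
    recombination, mutation, partial-Lamarckian LS, worst replacement."""
    def tournament():
        a, b = rng.sample(population, 2)
        return min(a, b, key=fitness)
    child = mutate(crossover(tournament(), tournament()))
    if rng.random() < p_ls:                       # partial Lamarckianism
        child = local_search(child, max_evals=ls_evals)
    worst = max(range(len(population)), key=lambda i: fitness(population[i]))
    population[worst] = child                     # replace the worst individual
    return population
```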
The underlying idea of this MA is to combine the intensifying capabilities of the embedded LS method with the diversifying features of population-based search; that is, the population will spread over the search space, providing starting points for a deeper local exploration. As generations go by, promising regions will start to be spotted, and the search will concentrate on them. Ideally, this combination should be synergistic (this will depend on the particulars of the combination, such as the intensity, frequency, and depth of LS and its interplay with the underlying evolutionary dynamics; Sudholt, 2009), providing better results than either the GA or the LS techniques by themselves. Empirical evidence of this will be provided in the next section.
4. EXPERIMENTAL RESULTS
The experiments have been performed considering five different basic algorithms: the BS algorithm presented by Zhou et al. (2005), three LS methods (HC, TS, and SA), and a GA. From these, a large number of algorithm variants were devised and tested; for instance, in the case of BS, five different values, from 1 up to 5, were considered for the beamwidth. Finally, memetic approaches based on the combination of the GA with each of the LS methods have been considered.
Regarding LS methods, we consider HC, TS, and three variants of SA: with arithmetic cooling (SAA), with geometric cooling (SAG), and with adaptive cooling and reheating (SAR). Note also that the exploration of the whole neighborhood becomes more and more costly as the number of jobs increases; for example, for 50 jobs, the number of swap neighbors of a given candidate is 1225, not to mention the even higher number of block neighbors. In a fixed computational budget scenario, this implies the allocated computational effort can be quickly consumed. For this reason, we have opted for also taking into account LS versions in which a partial exploration of the neighborhood is done by obtaining a fixed-size random sample. To be precise, the size of this sample has been chosen to be αn, that is, proportional to the number of jobs (the value α = 4 has been used). The notation HCP and HCF (respectively, TSP and TSF) is used to indicate the HC variant (respectively, TS variant) in which the neighborhood is partially or fully explored (in the case of TS, the full exploration refers just to the swap neighborhood, because the block neighborhood has a huge size). Other details of each particular LS method are as follows. In the case of HC, the search is restarted from a different initial point if stagnation takes place before consuming the allotted number of evaluations. As to SA, the initial temperature T_0 has been chosen so that the initial acceptance rate is approximately 50% (this has been done by obtaining offline a small sample of random solutions to measure the average fitness difference θ, and taking T_0 = 1.44θ). The cooling parameter (either geometric or arithmetic) has been chosen so that a final temperature T_k = 0.1 is reached within the number of evaluations allocated to the corresponding instance. As for adaptive cooling and reheating, we use η = 10⁻⁴, ν = 0.95, κ = T_0/f_0 (where f_0 is the mean cost of random solutions), and n_ι = 20. Finally, regarding TS, the tabu tenure is 5, and the number of iterations on each neighborhood for performing strategic oscillation is 3. This corresponds to the setting used by Al-Fawzan and Al-Sultan (2003).
As to the GA (and subsequently the MAs), an elitist model replacing the worst individual of the population (popsize = 30, p_X = 1.0, p_M = 1/n, where n is the number of jobs, that is, the number of genes per individual) with binary tournament selection has been utilized. As mentioned in Section 3.3, mutation is done by applying a random block swap, and recombination uses uniform cycle crossover. Finally, regarding the MAs, we conducted preliminary experiments considering p_LS ∈ {0.001, 0.01, 0.1, 1.0} and LSevals ∈ {100, 200, … , 1000} to analyze parameter sensitivity; the best results were obtained for LSevals equal to 200 and 1000, and for p_LS = 0.01, and thus our MAs were run with these values. Overall, this means 12 different versions of MAs, that is, those resulting from the hybridization of the GA with each of the six LS schemes pointed out above, fixing LSevals to the two values mentioned before. The notation MAxxyy is used, where xx stands for a particular LS technique, and yy ∈ {02, 10} indicates LSevals = 200 and LSevals = 1000, respectively.
As far as we know, no standard benchmark exists for this problem (at least publicly available). For this reason, we have selected a wide set of problem instances considered in the literature (Bard, 1988; Hertz et al., 1998; Al-Fawzan & Al-Sultan, 2003; Zhou et al., 2005); to be precise, 16 instances have been selected, with the number of jobs, number of tools, and machine capacity ranging in [10, 50], [9, 60], and [4, 30], respectively. Table 2 shows the different problem instances chosen for the experimental evaluation, where a specific instance with n jobs, m tools, and machine capacity C is labeled Cζnm.
Table 2. Problem instances considered in the experimental evaluation
Note: The minimum and maximum number of tools required over all jobs is indicated, as well as the works from which the problem instance was obtained: 1, Al-Fawzan and Al-Sultan (2003); 2, Bard (1988); 3, Hertz et al. (1998); 4, Zhou et al. (2005).
Five different data sets (i.e., tool requirement matrices) were generated randomly per instance; all data sets are available at http://www.unet.edu.ve/jedgar/ToSP/ToSP.htm. Each data set was generated with the constraint, already imposed in previous works such as Hertz et al. (1998), that no job is covered by any other job, in the sense that for no two different jobs i and j, T(J_i) ⊆ T(J_j). Were this the case, job i could be removed from the problem instance, because scheduling it immediately after job j would result in no tool switching. This consideration was also taken into account by Bard (1988) and Zhou et al. (2005).
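For reproducibility, the following sketch shows one way to sample tool requirement data under this noncovering constraint via rejection sampling; the parameter names and the tool-set representation are ours, and t_max should not exceed the magazine capacity C:

```python
import random

def random_job_tool_sets(n, m, t_min, t_max, seed=0):
    """Sample n jobs' tool sets over m tools so that no set contains another.

    Assumes parameters for which mutually noncovering sets are easy to find;
    otherwise the rejection loop may run for a long time.
    """
    rng = random.Random(seed)
    jobs = []
    while len(jobs) < n:
        cand = frozenset(rng.sample(range(m), rng.randint(t_min, t_max)))
        if any(cand <= j or j <= cand for j in jobs):
            continue               # candidate covers or is covered: reject it
        jobs.append(cand)
    return jobs
```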
All algorithms (except BS, see below) have been run 10 times per data set (i.e., 50 runs per problem instance), for a maximum of maxevals = φn(m − C) evaluations per run (with φ > 0). Preliminary experiments on the value of φ showed that φ = 100 is an appropriate value that keeps an acceptable balance between solution quality and computational cost. Regarding the BS algorithm, because of its deterministic nature, just one run per data set (and per value of the beamwidth) has been done; the algorithm was allowed to run until exhaustion of the search tree. Tables 3 and 4 show the results obtained, grouped by problem instance.
Table 3. Results of the genetic algorithm (GA), beam search (BSβ) for several values (1 ≤ β ≤ 5) of the beamwidth β, and the different versions of hill climbing (HC), tabu search (TS), and simulated annealing (SA)
Note: Best results (in terms of the best solution average) are underlined and in boldface. Av, solution average; SD, standard deviation.
Table 4. Results of the variants of memetic algorithm (MA) considered
Note: Best results (in terms of the best solution average) are underlined and in boldface. Av, solution average; SD, standard deviation.
A first observation regarding the results is that TSP performs better than the remaining nonhybrid techniques. In addition, HCF performs better on average than the BS versions in most of the instances (exactly in 13 out of 16). However, HCP is not as competitive as its full-exploration counterpart; note, for example, that the performance of HCP degrades as the instances grow larger. This is not surprising, because such larger instances are likely to exhibit a much more rugged multimodal landscape, and basic LS schemes suffer in these scenarios. In this case, BS adjusts better than HCP to this curse of dimensionality, given its pseudopopulation-based functioning (it is not truly population based in the sense that no set of multiple full solutions is maintained, although it does indeed keep a population of constructive paths), which modulates the greediness of the branch selection mechanism. Observe that, in general, BS exhibits a very robust behavior in all its versions and shows a competitive performance with respect to the rest of the techniques (especially in larger instances of the problem, i.e., for C ≥ 15). SA with adaptive cooling and reheating significantly improves on the performance of SAA and SAG (which do not generally provide competitive results with respect to the rest of the techniques). These comparatively better results of SAR with respect to SAA and SAG, as well as the better results of TSP with respect to the remaining LS-based techniques and the GA, highlight the need for adaptive strategies to traverse the search space of the ToSP effectively. As to the GA, it offers a robust performance given that rather standard parameters have been used. It actually provides very good results, especially in the smaller instances of the problem (i.e., for C < 10), and exhibits a good overall performance (very competitive with respect to HC, SA, and BS, as well as TSF). The GA shows an irregular performance in some instances, though; in particular, its performance worsens as the number of jobs increases (i.e., n ≥ 40). This scaling difficulty in the case of the GA reflects the intricacy of the search landscape of the ToSP, and the problem it poses for a pure population-based approach when fine-tuning good solutions for larger sizes. Although the GA can be good at jumping among different basins of attraction, identifying the corresponding local optima requires stronger intensification of the search. Such an intensification capability can be provided by integrating a LS method within a population-based technique, using the memetic approaches defined above.
Inspecting the results of the hybrid local/population-based techniques (i.e., the memetic approaches) shown in Table 4, it can be seen that these often provide better results than their constituent parts (with the exception of the MATS* versions). For instance, notice that despite the poor performance of SAA and SAG, the MASAA/MASAG variants are still capable of performing better than most LS techniques, although the combination does not reach synergistic value, because the results are comparable to those of the GA alone. A similar consideration can be made with respect to the MATS* variants, although in this case their performance drops below that of the constituent parts. This may be because the increased computational cost of a potentially larger search trajectory does not pay off (in other words, the TS scheme has good diversification characteristics that result in good performance as a stand-alone technique, but it does not contribute enough intensification to be effective within a MA). Finally, observe that the hybridization of the GA with HCP (i.e., MAHCP*) provides the best overall results, even better than the combination of the GA with HCF, despite the fact that HCF performs much better than HCP as a stand-alone technique. The reason may be that, when used for local improvement, the full exploration scheme in hill climbing demands a higher computational cost to produce a move in the search space.
In order to analyze the significance of the results and obtain a global perspective on how the algorithms compare to each other, we have used a rank-based approach. To do so, we have computed the rank r_ji of each algorithm j on each instance i (rank 1 for the best and rank k for the worst, where k = 27 is the number of algorithms; in case of ties, an average rank is awarded). The distribution of these ranks is shown in Figure 2. Next, we have used two well-known nonparametric statistical tests (Lehmann & D'Abrera, 1998) to compare ranks:
• Friedman test (Friedman, 1937): we compute the Friedman statistic

$$\chi_F^2 = \frac{12N}{k(k+1)} \left[\, \sum_{j=1}^{k} R_j^2 - \frac{k(k+1)^2}{4} \right],$$

where R_j is the mean rank of algorithm j across all N instances. The result is compared with the χ² distribution with k − 1 degrees of freedom.

• Iman–Davenport test (Iman & Davenport, 1980): a less conservative test based on the Friedman statistic,

$$F_F = \frac{(N-1)\,\chi_F^2}{N(k-1) - \chi_F^2}.$$

In this case, the result is compared with the F distribution with k − 1 and (k − 1)(N − 1) degrees of freedom.
Fig. 2. The rank distribution of each algorithm across all instances. As usual, each box comprises the second and third quartiles of the distribution, the median is marked with a vertical line, whiskers span 1.5 times the interquartile range, and outliers are indicated with a plus sign. [A color version of this figure can be viewed online at journals.cambridge.org/aie]
The results are shown in Table 5. As seen in the first row, the statistic values obtained are clearly higher than the critical values, and therefore the null hypothesis, namely, that all algorithms are equivalent, can be rejected. Because there are algorithms with markedly poor performance, we have repeated the test with the top seven algorithms (i.e., the MAs incorporating HC and SAR, and TSP), whose performance places them in a cluster separate from the remaining algorithms (cf. Fig. 2). Again, the statistical test is passed, thus indicating significant differences in their ranks at the α = 0.01 level.
Table 5. Results of Friedman and Iman–Davenport tests for α = 0.01
Subsequently, we have focused on these top seven algorithms and performed Holm's test (Holm, 1979) in order to determine whether significant differences exist with respect to a control algorithm (in this case MAHCP02, the algorithm with the best mean rank). To do so, we compute the following z statistic for the ith algorithm:
$$z = (R_i - R_0) \Big/ \sqrt{\frac{k(k+1)}{6N}}.$$
Then, we determine the corresponding p value for a normal distribution and sort the algorithms by increasing p value. These p values are compared with an adjusted critical value α/i, where α is the significance level and i is the algorithm's position when counting down from the largest p value (i = k − 1 for the lowest p value, i = 1 for the highest; recall that one algorithm is used as the control, and hence there are only k − 1 comparisons). Tests are done sequentially for increasing p values until the null hypothesis cannot be rejected at some step; from that point on, the null hypothesis is retained for all algorithms with larger p values. The results are shown in Table 6. Note that, with the exception of MAHCP10, for which statistical significance can only be established at the 82% level, the test is passed at the 99% confidence level for all algorithms with respect to MAHCP02. This is a robust result that indicates a clear trend of superiority of MAHCP02 over the remaining approaches.
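These rank-based computations are straightforward to reproduce. The sketch below implements the Friedman and Iman–Davenport statistics and the z statistic of Holm's test exactly as written above, with two-sided normal p values obtained via the complementary error function (function names are ours):

```python
import math

def friedman_iman_davenport(ranks):
    """ranks[i][j]: rank of algorithm j on instance i (an N x k matrix)."""
    N, k = len(ranks), len(ranks[0])
    R = [sum(row[j] for row in ranks) / N for j in range(k)]          # mean ranks
    chi2 = 12 * N / (k * (k + 1)) * (sum(r * r for r in R)
                                     - k * (k + 1) ** 2 / 4)          # Friedman
    f_f = (N - 1) * chi2 / (N * (k - 1) - chi2)                       # Iman-Davenport
    return chi2, f_f

def holm_p_value(R_i, R_0, k, N):
    """Two-sided p value for comparing algorithm i against the control."""
    z = (R_i - R_0) / math.sqrt(k * (k + 1) / (6 * N))
    return math.erfc(abs(z) / math.sqrt(2))                           # 2*(1 - Phi(|z|))
```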
Table 6. Results of Holm's test using memetic algorithm MAHCP02 as the control algorithm (α = 0.01)
5. CONCLUSIONS AND FUTURE WORK
We have tackled the uniform ToSP with different techniques and shown how metaheuristics can be very well suited to solving the problem. To be precise, we have conducted an extensive empirical evaluation of three different LS heuristics (HC, SA, and TS), GAs, and MAs. The experimentation has included the BS method described by Zhou et al. (2005), because it had been demonstrated to be especially effective compared to previously published techniques. The results show that metaheuristics provide encouraging results and are capable of improving on the results obtained by BS.
Starting on a general note, one of the main conclusions to be extracted from the results is the versatility and effectiveness of MAs as a search paradigm. They constitute a natural framework in which different heuristics can be seamlessly integrated into a single optimization engine. Thus, MAs should not be regarded as competitors of existing approaches; on the contrary, it is much more appropriate to regard them as integrators: whenever single metaheuristics start to reach their limits, the use of MAs is in order to overcome these limitations.
Focusing now on each of the techniques considered, the experimental results indicate that TS is the most effective LS technique among the proposals considered. Its ability to traverse the search space escaping from local optima, and the enhanced exploration capabilities provided by the use of a strategic oscillation mechanism, are crucial for this. Regarding the GA, the particular recombination operator utilized, uniform cycle crossover, has shown the relevance of processing positional information to create new tentative sequences. A similar consideration can be made with respect to the choice of both the LS technique to be embedded in the MA and its neighborhood exploration policy. Regarding the first issue, the MA endowed with HC has yielded the best results, improving on both the GA and the remaining LS techniques as stand-alone methods, and thus providing evidence of the synergy of the combination. The reason why MAHC* behaves better than MATS* can be found in the better trade-off between search intensification and computational cost provided by the former. Although TS can provide improved solutions with respect to HC, its role when embedded within an MA is different, because it has to share exploration duties with the underlying GA. Hence, the savings in computational effort obtained by removing some of this diversification capability from the local searcher (which can thus focus on intensifying the search in promising regions) result in a net gain for the hybrid approach. This guideline seems generalizable to other related engineering problems (e.g., single machine total weighted tardiness; Maheswaran et al., 2005; see also Cotta & Fernández, 2007) in which simple and more intensive local improvement strategies perform adequately.
Regarding the choice of the scheme for exploring the neighborhood in the local improvement process embedded in an MA, the less computationally demanding option of considering a sample instead of the whole neighborhood produces better results on the ToSP. Again, this is due to the balance between the computational cost of LS and the potentially attainable gain in solution quality. The interplay between the LS and population-based components of the MA demands that the former be applied at a low rate and with moderate intensity.
In connection with this last issue, and as an avenue for further research, it would be interesting to explore in more detail the intensification/diversification balance within the MA. In this work we have leaned toward a more explorative combination by using a blind recombination operator in the GA. It would be worth exploring other models, though, for example, by incorporating an intense exploration of the dynastic potential (i.e., the set of possible children) of the solutions being recombined. Ideas from local branching (Fischetti & Lodi, 2003) or from dynastically optimal recombination (Cotta & Troya, 2003; Gallardo et al., 2007) could be used here. We also plan to analyze new instances and variants of the problem (Błażewicz & Finke, 1994; Kashyap & Khator, 1994; Hong-Bae et al., 1999) in the future.
ACKNOWLEDGMENTS
We thank the reviewers for their useful comments. The second and third authors are partially supported by Spanish MCINN under project NEMESIS (TIN2008-05941) and Junta de Andalucía under project TIC-6083.
Jhon Edgar Amaya is currently a PhD student in computer science at the University of Málaga under the supervision of Dr. Carlos Cotta and Dr. Antonio J. Fernández-Leiva. He is also a member of the High Performance Computing Laboratory of the Universidad Nacional Experimental del Tachira. He received an electronic engineer degree in 1997 from the Universidad Nacional Experimental del Tachira and an MS degree in computation in 2003 from the Universidad de Los Andes. Jhon's research interests are evolutionary computation and applications of artificial intelligence to practical problems.
Carlos Cotta is an Associate Professor in the School of Computer Science at the University of Málaga. He received MS and PhD degrees in computer science from the University of Málaga in 1994 and 1998, respectively. His research interests comprise areas such as metaheuristics (in particular evolutionary computation), combinatorial optimization, and bioinformatics. Dr. Cotta is involved in the technical organization and program committees of major conferences in the field of evolutionary computation and has coedited several books on knowledge-driven computing, adaptive metaheuristics, and evolutionary combinatorial optimization.
Antonio J. Fernández-Leiva is an Associate Professor in the School of Computer Science at the University of Málaga. He received BS and MS degrees in computer science from the University of Málaga in 1991 and 1995, respectively. In 2002, he obtained his PhD degree from the University of Málaga under the supervision of Dr. Patricia M. Hill at Leeds University, where he spent long periods of time over 4 years. His area of research comprises areas such as the implementation of constraint programming languages and the attainment of hybrid optimization techniques that involve evolutionary algorithms.