Hindawi Publishing Corporation Mathematical Problems in Engineering Volume 2011, Article ID 390593,20 pages doi:10.1155/2011/390593

Research Article

A Hybrid Differential Evolution and Tree Search Algorithm for the Job Shop Scheduling Problem

Rui Zhang1 and Cheng Wu2

1 School of Economics and Management, Nanchang University, Nanchang 330031, China

2 Department of Automation, Tsinghua University, Beijing 100084, China

Correspondence should be addressed to Rui Zhang, r.zhang@ymail.com Received 15 July 2011; Accepted 26 August 2011 Academic Editor: Furong Gao

Copyright © 2011 R. Zhang and C. Wu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The job shop scheduling problem (JSSP) is a notoriously difficult problem in combinatorial optimization. In terms of the objective function, most existing research has been focused on the makespan criterion. However, in contemporary manufacturing systems, due-date-related performances are more important because they are essential for maintaining a high service reputation. Therefore, in this study we aim at minimizing the total weighted tardiness in JSSP. Considering the high complexity, a hybrid differential evolution (DE) algorithm is proposed for the problem. To enhance the overall search efficiency, a neighborhood property of the problem is discovered, and then a tree search procedure is designed and embedded into the DE framework. According to the extensive computational experiments, the proposed approach is efficient in solving the job shop scheduling problem with total weighted tardiness objective.

1. Introduction

The job shop scheduling problem (JSSP) has been known as a very stubborn combinatorial optimization problem since the 1950s. In terms of computational complexity, JSSP is NP-hard in the strong sense [1]. Therefore, even for very small JSSP instances, it is by no means easy to guarantee the optimal solution. In recent years, the metaheuristics—such as genetic algorithm (GA) [2, 3], tabu search (TS) [4, 5], particle swarm optimization (PSO) [6, 7], and ant colony optimization (ACO) [8, 9]—have clearly become the research focus in practical optimization methods for solving JSSPs.

Remarkably, most recent research concentrates on the hybridization of two or more heuristics to solve JSSP. For example, ACO is hybridized with TS in [10]; GA is hybridized with VND (variable neighborhood decent) in [11]; a hybrid algorithm based on PSO, VNS

(variable neighborhood search), and SS (scatter search) is proposed in [12] for solving a biobjective JSSP. The underlying motivation for constructing hybrid algorithms is that different search mechanisms and neighborhood structures are usually complementary to each other (i.e., when one fails, the other may be effective). Therefore, a combination of several optimizers is likely to promote the probability of finding optimal solutions. A common strategy is to hybridize an algorithm good at large-scale exploration (diversification) with an algorithm good at fine-scale exploitation (intensification), which provides reliable searching ability for tackling complex solution spaces.

In a general JSSP instance, a set of n jobs J = {Jj }n=1 are to be processed on a set of m machines M = {Mk }m=1 under the following basic assumptions.

(i) There is no machine breakdown.

(ii) No preemption of operations is allowed.

(iii) The transportation time and the setup time can be neglected.

(iv) Each machine can process at most one job at a time.

(v) Each job may be processed by at most one machine at a time.

Each job has a fixed processing route which traverses all the machines in a predetermined order, and the manufacturing process of a job on a machine is called an operation. The duration time of each operation is fixed and known. Besides, a preset due date dj and a preset weight Wj are given for each job. Due date is the preferred latest finishing time of a job, so completion after this specific time will result in losses such as a worsened reputation in customers. Weights reflect the importance level of the orders from different customers, larger values suggesting higher strategic importance. If we use Cj to denote the completion time ofjob j, the objective function of JSSP can be makespan (Cmax = maxn=1{Cj}),

maximum lateness (Lmax = maxn=1 {Cj - dj}), total weighted tardiness (TWT = ^n=1 WjTj, where Tj = max{0, Cj - dj}), number of tardy jobs (U = ^n=1 Uj, where Uj = 1 if Cj > dj and Uj = 0 otherwise), and so forth.

Up till now, most research on JSSP has been focused on the makespan criterion. However, due-date-related performances are becoming more significant in the make-to-order manufacturing environment nowadays. In this sense, the total weighted tardiness measure better reflects the critical factors that affect the profits of a firm. Meanwhile, from the theoretical perspective, total weighted tardiness is more difficult to optimize than makespan. According to the concepts of computational complexity, minimizing Cmax is only a special case of minimizing TWT, which means the complexity of solving JSSP with total weighted tardiness objective is much greater than that of solving JSSP with makespan objective. Below we will use the abbreviation "TWT-JSSP" to denote the job shop scheduling problem with total weighted tardiness objective.

The rest of this paper is organized as follows. Section 2 provides a brief review on existing solution methods for TWT-JSSP and the differential evolution algorithm. Section 3 discusses the mathematical model of TWT-JSSP and a neighborhood property. Section 4 describes the design of a hybrid differential evolution algorithm for solving TWT-JSSP. Section 5 presents the computational results. Finally, Section 6 concludes the paper.

2. Literature Review

2.1. Existing Algorithms for TWT-JSSP

The contributions on TWT-JSSP are relatively rare in the literature. The only exact solution methodology is the branch-and-bound algorithm proposed by Singer and Pinedo [13], while all the following surveyed methods belong to the heuristic category.

Reference [14] presents efficient dispatching rules for sequencing the operations in TWT-JSSP, the most powerful one being the ATC (apparent tardiness cost) rule. In [15,16], the authors propose modified shifting bottleneck heuristics, in which the subproblems are solved by dispatching rules (the basic ATC rule or the BATCS rule for complex job shops). The large step random walk (LSRW) algorithm is designed in [17] and aimed at the TWT-JSSP with release times (Jm|rj1 ^ WjTj). References [2, 3] present hybrid genetic algorithms for TWT-JSSP. Reference [18] presents a rolling horizon approach for TWT-JSSP, in which each time window is scheduled with a modified shifting bottleneck heuristic. Reference [19] presents a tabu search algorithm for the generalized TWT-JSSP with release times and precedence constraints. Recently, a new meta-heuristic called electromagnetic algorithm (EM) [20] has been explored and tested on TWT-JSSP.

Although these algorithms are effective, they have not fully utilized the inherent properties of TWT-JSSP. Hence, they will not perform satisfactorily when faced with large-scale instances of the problem. Exploring the structural properties (including neighborhood properties) of TWT-JSSP is an important research direction, which also constitutes the motivation of this study.

2.2. The Differential Evolution Algorithm

The differential evolution (DE) algorithm, which was first proposed by Storn and Price [21] in the mid-1990s, is a relatively new evolutionary optimizer. Characterized by a novel mutation operator, the algorithm has been found to be a powerful tool for continuous function optimization [22]. Due to its easy implementation, quick convergence and robustness, the DE algorithm is becoming increasingly popular in recent years. A wide range of successful applications have been reported, such as the design of fixed-structure robust controllers [23], space trajectory optimization [24], multiarea economic dispatch [25], and exergoeconomic analysis and optimization [26] .

Despite the low computational complexity, DE has also been shown to have some weaknesses. In particular, DE is good at exploring the search space and locating the promising region, but it is slow at exploiting the high-quality solutions [27] . Recently, some researchers try to improve the performance of DE by hybridizing it with other local search-based algorithms. In [28], the authors propose the 2-Opt based DE (2-Opt DE) which is inspired by 2-Opt algorithms to accelerate DE. They show that 2-Opt DE can outperform the original DE in terms of solution accuracy and convergence speed. In [29], the authors incorporate the orthogonal design method into DE to accelerate its convergence rate. They show that the proposed approach outperforms the classical DE in terms of the quality, speed, and stability of the final solutions.

Due to its continuous feature, the traditional DE algorithm cannot be directly applied to scheduling problems with inherent discrete nature. Indeed, in canonical DE, each solution is represented by a vector of floating-point numbers. But for scheduling problems, each solution is a permutation of integers. To address this issue, two kinds of approaches can be identified in the literature.

(1) A transformation scheme is established to convert permutations into real numbers and vice versa [30,31]. In this way, we only need to add a few lines to the encoding and decoding procedures, and it is not necessary to change the implementation of DE itself. The advantage is that the search mechanism of DE is well preserved, while the disadvantage is the redundancy in the mapping from real to permutation spaces.

(2) The mutation and crossover operators in DE are modified to suit the permutation representation [32, 33]. Clearly, the difficulty is how to redesign these operators such that the characteristics of high-quality solutions can be inherited and exploited. The design of operators should be problem dependent and thus requires a specific analysis of the optimization problem. Therefore, the lack of generality is a disadvantage of this approach.

Despite the success on permutation flow shop scheduling problems, the application of DE to JSSP has rarely been reported. The job shop model is more complex because the processing sequences of each machine need to be optimized separately. To tackle such a difficult problem, we design a tree-based local search module which can enhance the exploitation ability of DE. To our knowledge, this is the first attempt that DE is applied to TWT-JSSP.

3. The Mathematical Model and Neighborhood Properties of TWT-JSSP

3.1. The Mathematical Model and Its Duality

We utilize the concept of disjunctive graph for formulating TWT-JSSP [34]. In the graph G(N,A,E), N = O U {0} = {0,1,2,...,n x m} is the set of nodes, where O = {1,...,n x m} corresponds to the operation set of the JSSP instance. Node 0 stands for a dummy operation which starts before all the real operations, while the starting time and the processing time of this dummy operation are both zero. A is the set of conjunctive arcs, and each conjunctive arc indicates the processing order of two operations belonging to the same job. Meanwhile, node 0 is connected with the first operation of each job with a separate conjunctive arc. In other words, if we use F(O) to denote the set of the first operations of all jobs, then for all fj e F(O), (0,fj) e A. E = UEk is the set of disjunctive arcs, where Ek represents the disjunctive arcs related with machine k. Each disjunctive arc connects two operations that should be processed by the same machine, but the processing order of the two operations is yet to be determined.

Under the disjunctive graph representation, TWT-JSSP can be described as a mixed-integer linear disjunctive programming model:

min TWT(S) = £ wU]TU]

uj eU(O)

s.t. (a) sh + pi1 < Si2 V(M2) e A,

(b) (si1 + pi1 < sh) v (si2 + pi2 < sh) V(M2>e Ek, k = 1,...,m, (31)

(c) Tuj > suj + pu, - duj Vuj e U(O),

(d) Tu, > 0 Vuj e U(O).

In the above formulation, si denotes the starting time of operation i, and the decision variable S is a vector consisting of the starting times of all the operations. pi denotes the required processing time of operation i. For the dummy operation 0, we set s0 = p0 = 0. Uj represents the last operation of job j, and thus U(O) = {u1,u2,...,un} denotes the set of ultimate operations of all the jobs. For any operation i, di and wi, respectively, denote the due date and the weight of the job to which operation i belongs. For the ultimate operation uj of job j, TUj is the tardiness of job j.

From the perspective of disjunctive graphs, finding a feasible solution to JSSP is equivalent to determining the directions of all the disjunctive arcs such that, for any (i1/i2) e E, only one of the two disjunctive inequalities in constraint (b) is satisfied. Now, let us suppose the directions of all disjunctive arcs have been determined, and we use a to denote the set of directed disjunctive arcs. In this situation, the problem (3.1) becomes a linear programming model, that is,

TWTa(S)= £ w»,Tu

u, eU(O)

s.t. (a) Si2 - Si1 > ph V(Z1,Z2) e A,

(b) Si2 - sh > ph V(i1, ¿2) e o,

(c) Tuj - Su, > puj - du, Vuj e U(O),

(d) TUi > 0 Vu, e U(O).

As suggested by [19], the dual problem of the linear program (3.2) is a maximum cost flow problem:

TCo(F)= 2 Pi1 Fhri2 + 2 (puj - du^Fujr0

(i1,i2)eAuo uj eU(O)

s.t. (e) FUj,0 < wUj Vuj e U(O),

(f) X F^ii - X Fit = 0 Vi e O,

£(£,i)eAuo Z:(i,Z)eAuouU

(g) Fi1ii2 > 0 V(i1,i2) e A u o u U.

In the above formulation, Fi1ii2 represents the flow over the arc (i1,i2); U = {(u1,0),(u2,0),...,(un,0)} = {(Uj,0)}n=1 is a newly defined arc set, each arc pointing from the last operation of job j to node 0; TC denotes the total cost of the flows.

Theorem 3.1. For the maximum cost flow problem (3.3), there exists an optimal solution F* which satisfies the following: each node (except node 0) has at most one incoming arc with nonzero flow.

Proof. See Appendix A. □

3.2. A Neighborhood Property for the Swap of Adjacent Operations

We know from the previous subsection that, given a feasible a (i.e., the directions of all disjunctive arcs), the schedule is determined, and meanwhile, a network flow graph is associated with the current schedule. The task of neighborhood search is to modify certain

parts of a to get a', so that the total weighted tardiness can be reduced, that is, TWTjmm < TWT™n (or equivalently, TC^ < TCax).

Definition 3.2 (block). A sequence of operations in a critical path is called a block if (3.1) it contains at least two operations and (3.2) the sequence includes a maximum number of operations that are consecutively processed by the same machine.

If we want to improve the schedule under the current a, we should only consider modifying the disjunctive arcs that satisfy the following two conditions: (1) the arc belongs to a certain block and (2) the arc carries positive flow in the dual network. The latter condition implies that this disjunctive arc is on the critical path of at least one tardy job, so altering this arc can possibly reduce the TWT.

Now we consider whether swapping two adjacent operations in a block can really improve the TWT. Under a given a, suppose the operations (1,2,...,z) constitute a block in the corresponding schedule, and the associated network flows are partially shown in Figure 1. The amount of flow on the outgoing disjunctive arc of operation i is denoted by xi and suppose xi > 0, for all i = 1,2,...,z - 1. We assume this dual solution satisfies the condition described in Theorem 3.1, that is, each node can have at most one incoming arc with positive flow. In this case, the input flows of these nodes all come from the input disjunctive arcs, so the input conjunctive arcs of these nodes must carry zero flow and thus they are not marked in the figure. However, each of the nodes can still have an outgoing arc with positive flow. For simplicity, these outgoing arcs with possibly positive flow are drawn in solid arrows in the figure, and the amount of flow is denoted by Fi > 0, respectively

After executing the SWAP operator, we can construct a new feasible solution to the dual problem by adjusting the amount of flows on each arc. In the new network, the flow on the outgoing disjunctive arc of operation i is denoted by yi. In order to enable accurate analysis, we keep all the Fi values constant in the process of adjusting the local flows within the block (so that the flow equilibrium outside the considered scope will not be affected).

Theorem 3.3. Suppose a block with consecutive positive flows contains the following operations: (1,2,...,a,fi,...,z).Ifthe condition

is satisfied, then swapping the two operations a and ¡5 will not lead to improvement on the objective function (TWT).

Therefore, the function of Theorem 3.3 is that it helps to exclude some nonimproving moves in the local search process, so that the optimization efficiency can be improved. However, such a greedy mechanism must be combined with a large-scale randomized search (like DE) in order not to get trapped by local optima.

4. The Hybrid DE Algorithm for TWT-JSSP

4.1. The DE Optimization Framework

Like other evolutionary optimizers, DE is a population-based stochastic global optimizer. In DE, each individual in the population is represented by a D-dimensional real vector

Pa > Pl

Proof. See Appendix B.

xi = (xi,1, xir2/..., Xiro), i = 1,..., SN, where SN is the population size. In each iteration, DE employs the mutation and crossover operators to generate new candidate solutions and then applies a one-to-one selection policy to determine whether the offspring or the parent can survive to the next generation. This process is repeated until a preset termination criterion is met.

The standard DE algorithm can be described as follows. Step 1 (Initialization). Randomly generate a population of SN solutions, {xi,..., xSN}. Step 2 (Mutation). For i = i,..., SN, generate a mutant solution vi as follows:

Vi = Xbest + F x (xn - Xr2), (4.1)

where Xbest denotes the best solution in the current population; ri and r2 are randomly selected from {1,..., SN} such that ri = r2 /i; F > 0 is a weighting factor.

Step 3 (Crossover). For i = 1,..., SN, generate a trial solution ui as follows:

( Vij if ¿j < CR or j = rj,

li,j = i

Uijj = \ (j = 1.....D (4.2)

^ Xi,j otherwise,

where rj is an index randomly selected from {1,...,D} to guarantee that at least one dimension of the trial solution ui differs from its parent xi; lj is a random number generated from the uniform distribution M[0,1]; CR e [0,1] is the crossover parameter.

Step 4 (Selection). If ui is better than xi, let xi = ui.

Step 5. If the termination condition is not satisfied, go back to Step 2.

According to the algorithm description, DE has three important parameters, that is, SN, F, and CR. In order to ensure a good performance of DE, the setting of these parameters should be reasonably adjusted based on specific optimization problems.

It is worth noting that there exist other variants of the DE algorithm with respect to the mutation and crossover method [35]. In fact, the above procedure is noted as "DE/best/1/bin" in the literature. If we change the mutation policy, we can obtain the "DE/rand/1/*" variant: vi = xri + F x (xr2 - xr3) where the base solution (xri) is also randomly chosen from the population.

4.2. The Encoding and Decoding Schemes

The encoding scheme used here is based on the random key representation and the smallest position value (SPV) rule. Each solution is described by n x m continuous values, and in the decoding process, this set of values will be transformed to a permutation of operations by the SPV rule.

Formally, let Xi = (xir1,Xir2,...,xi/nxm) denote the ith solution, where xi/d is the position value of the ith solution with respect to the dth dimension (d = 1,...,n x m). To decode a solution, the SPV rule is applied to sort the position values within a solution and then determine the relative positions of the corresponding operations. This process is exemplified in Table 1 for a problem containing 6 operations (suppose N = {1,..., 6}). In this example, the smallest position value (-0.99) resides in the second dimension, and thus, the operation "2" comes first in the resulting sequence (the third row of the table). After sorting all these dimension values constituting solution xi, the operation sequence ni = (2,5,4,1,6,3) can be obtained.

Then, the decoded operation sequence ni can be used to build an active schedule for TWT-JSSP. The detailed schedule building algorithm is described as follows.

Input. An operation sequence n.

Stepl. Let Q(1) = O = {1,...,nm} (the set of all operations), R(1) = F(O) = {f1,...,fn} (the set of first operations of each job). Set t = 1.

Step 2. Find the operation i* = argminieR(t) {ri + pi}, and let m* be the index of the machine on which this operation should be processed. Use B(t) to denote all the operations from R(t) which should be processed on machine m*.

Step 3. Delete from B(t) the operations that satisfy ri > ri* + pi*.

Step 4. Find the operation o which belongs to B(t) and meanwhile ranks first in n. Schedule operation o on machine m* at the earliest possible time.

Step 5. Let Q(t + 1) = Q(t) \ {o}, R(t + 1) = R(t) \ {o} U {JS(o)}, where JS(o) is the immediate job successor of operation o.

Step 6. If Q(t + 1) / 0, set t ^ t + 1 and go to Step 2. Otherwise, the decoding procedure is terminated.

In the above description, the release time ri equals the completion time of the immediate job predecessor of operation i. So (ri + pi) is the earliest possible completion time of operation i. Q(t) represents the set of operations yet to be scheduled at iteration t, while R(t) represents the set of ready operations (whose job predecessors have all been scheduled) at iteration t.

Table 1: Illustration of the decoding process using SPV.

Dim. d

1.80 2

-0.99 5

3.01 4

0.72 1

-0.45 6

4.3. The Local Search Module Based on Tree Search

As mentioned in Section 2.2, DE alone does not have satisfactory "exploitation" ability. In order to cope with complex search spaces, it is usually required that a local search module be embedded into the general framework of DE. In this paper, we devise a tree-based local search optimizer by borrowing ideas from the filter-and-fan algorithm in [36]. In each iteration of DE, the local search is carried out for the best e% of solutions in the current population immediately after the selection phase. Thus, e is an important parameter for adjusting the frequency of local search and achieving a balance between exploration and exploitation.

In order to improve a selected solution, the proposed tree search algorithm generates compound neighborhoods for the solution based on the SWAP operator. The algorithm searches in a breadth-first manner, but unlike the beam search heuristic, each tree node represents a complete schedule rather than a partial schedule. The tree is gradually expanded by applying the SWAP operator iteratively. In each trial, the pair of operations to be swapped is randomly chosen from the critical blocks in the current schedule. To avoid repeated search, the reverse of the previous SWAP operator is prohibited. For example, if the current schedule is obtained by swapping (a, ¡5), then a swap on (p, a) (if it is in the critical block again) should be tabooed in the immediate expansion from the current node. In the search process, we can utilize the relevant neighborhood property (Theorem 3.3) to promote the efficiency. In other words, when the theorem predicts that a certain move will not lead to improvement, then the action is canceled and another neighborhood move will be tried.

The proposed tree search algorithm is heuristic in nature, because it is computationally infeasible to enumerate all the possible swaps of the critical operations. Indeed, the algorithm only makes n2 trails for each solution other than the "root" solution. Meanwhile, we need to apply a pruning strategy in the breadth-first search process in order to control the computational time. In particular, only the n1 best solutions on each level of the tree will be exploited in the subsequent trials. We implement the tree search algorithm using the queue data structure as follows.

Input. A selected base solution a. Step 1. Let l = 1. Create an empty queue.

Step 2. Try applying the SWAP operator on n1 different locations in the critical blocks of a. Let the produced n1 solutions enter the queue. Denote the best solution among these by a*.

Step 3. Perform the following steps for n1 times.

(3.1) Take out the first solution in the queue, denoted by af.

(3.2) Try applying the SWAP operator on n2 different locations in the critical blocks of af. Let the produced n2 solutions enter the queue.

Step 4. Retain the best n solutions in the queue, and delete the rest ni(n - 1) ones. Denote the currently best solution in the queue as a*. If TWT(o*) < TWT(o*), let o* = a*.

Step 5. Let I ^ I + 1. If I < L, go back to Step 3.

Step 6. Output o*.

According the above description, the tree has n nodes on level 1, each of which will be expanded to n2 nodes on level 2. Thus, there are totally n x n2 nodes on level 2. However, only the best n nodes among these have a chance to be exploited and further expanded to level 3, while the rest will be abandoned. In this way, the number of nodes being considered on each level is controlled at n x n2 and will not increase exponentially. After expanding to L levels, the algorithm is terminated.

For example, if we set n = 3, n2 = 2, and L = 4, the entire tree structure may look like Figure 2. The starting solution is denoted by o0. On each level from I = 2, three promising solutions are expanded further while the other three are discarded. The best solution found by this local search endeavor may appear on the last level.

The complexity of the tree search algorithm can be briefly analyzed as follows. There are roughly n x L solutions that need to be expanded in the tree search process. For each expansion, the algorithm should first find the critical paths related with the tardy jobs. According to the Bellman's algorithm [37], this can be done in O(nm) time. Then, the neighborhood operator has to be applied for approximately n n2 x L times in the tree search process. Therefore, the computational complexity of the algorithm can be described as O(n^_Lnm + n^2L).

Compared with the extensive exploration behavior of DE, the proposed tree search algorithm works in a greedy manner. First, each neighborhood move is performed on the critical blocks of the corresponding schedule, because only such moves are possible to produce improvements. Second, the neighborhood property is utilized to exclude some unpromising moves when there are more than n2 candidates to choose from. Third, on each level of the tree, only n best solutions from the totally n x n2 will be further considered. These features make the tree search algorithm extremely concentrated, which provides a complementary mechanism to DE's search process. Thus, using the tree search as an embedded local search module is beneficial for enhancing the exploitation capability and the overall performance of the hybrid DE.

5. The Computational Results

5.1. The Test Problems and Parameter Setting

In order to test the performance of the proposed hybrid DE (abbreviated as HDE hereinafter), randomly generated TWT-JSSP instances with different sizes are used in the computational experiment. For a specific problem size, the processing route of each job is a random permutation of the m machines, and the required processing time of each operation follows a uniform distribution U[1,99] and takes only integer values. The due date of each job is set based on the total processing time of the job as dj = [u x f x ^ie0j pt\. In this expression, u ~ M[1,1.1 x max(1,n/m}] is a random number uniformly distributed in the interval [1,1.1 x max{1,n/m}], and Oj denotes the set of operations that constitute job j. f e {1.1,1.3,1.5} is a coefficient that reflects the tightness level of the due date setting. The weight of each job (integer values) follows a uniform distribution, that is, U[1,10].

Figure 2: A possible tree structure under n = 3, n2 = 2, and L = 4.

In order to reduce random errors in the computation, we randomly generate 5 different instances for each due date tightness and scale. In particular, 10 problem sizes (from 100 operations to 500 operations) and 3 due date levels (f = 1.5 for loose due dates, f = 1.3, for moderate due dates and f = 1.1 for tight due dates) are tested, so the total number of instances is 150. The first two columns of Tables 2, 3 and 4 list the scales of the 10 instance sets. We compare the performance of the proposed HDE with the hybrid genetic algorithm GLS for TWT-JSSP [2]. The algorithms have been implemented using Visual C++ 2010 and tested on a platform of Intel Core i5-750 2.67 GHz, 3 GB RAM, and Windows 7.

In this experiment, the parameters of HDE are set as follows.

(i) The DE parameters: SN = 50, F ~ U(0.5,1.0) (the "dither" strategy [38]), CR = 0.9.

(ii) The local search parameters: e% = 50%, n = 18, n = 9, L = 14.

(iii) Termination criterion: the best-so-far solution has not been updated for 50 generations (or controlled by the external computational time limit).

5.2. The Results and Discussions

The computational results are processed in the following way before listed in Tables 2, 3 and 4. For each instance i, HDE and GLS are, respectively, run for 10 independent times. The best objective value obtained in the 10 runs by algorithm a (a e {HDE, GLS}) is denoted by TWTj(a), the worst denoted by TWT^(a), and the mean denoted by TWT^(a).

Next, we can calculate the relative objective values by taking TWT^(HDE) as reference: RTWTb(a) = TWTb(a)/TWTb(HDE), RTWT^(a) = TWT^(a)/TWTb(HDE), RTWT^(a) = TWTJn(a)/TWTb (HDE).

Finally, when the above steps have been performed for each instance in the considered instance set, we calculate the average relative values (over this set) as RTWTb(a) = (1/5) £5=1RTWTb(a), RTWTro(a) = (1/5) ^5=1 RTWT^(a), and RTWTm(a) = (1/5) XiU RTWX^(a). In this way, the computational results are summarized in the tables with respect to each instance set.

Meanwhile, with respect to the first instance of each set under f = 1.3 (marked with "#-1" where # represents the index of instance sets), we record the average computational time (over 10 independent runs) of the two algorithms and plot the data as bar graphs in Figure 3.

Mathematical Problems in Engineering Table 2: The computational results under loose due dates (f = 1.5).

Instance Size (n x m) HDE GLS

set no. RTWTb RTWTw RTWTm RTWTb RTWTw RTWTm

1 10 x 10 1.000 1.150 1.021 1.002 1.187 1.052

2 20 x 5 1.000 1.128 1.085 1.004 1.136 1.073

3 10 x 20 1.000 1.149 1.099 1.008 1.144 1.117

4 20 x 10 1.000 1.129 1.065 1.010 1.113 1.098

5 20 x 15 1.000 1.120 1.054 1.000 1.118 1.075

6 50 x 6 1.000 1.126 1.072 1.017 1.103 1.087

7 20 x 20 1.000 1.115 1.034 1.013 1.119 1.068

8 40 x 10 1.000 1.119 1.062 1.027 1.122 1.076

9 50 x 10 1.000 1.103 1.027 1.015 1.130 1.100

10 100 x 5 1.000 1.121 1.078 1.026 1.126 1.092

Table 3: The computational results under moderate due dates (f = 1.3).

Instance Size (n x m) HDE GLS

set no. RTWTb RTWTw RTWTm RTWTb RTWTw RTWTm

1 10 x 10 1.000 1.181 1.090 1.085 1.248 1.124

2 20 x 5 1.000 1.146 1.099 1.084 1.266 1.151

3 10 x 20 1.000 1.165 1.117 1.010 1.205 1.144

4 20 x 10 1.000 1.200 1.153 1.005 1.306 1.178

5 20 x 15 1.000 1.218 1.101 1.094 1.279 1.154

6 50 x 6 1.000 1.164 1.115 1.095 1.252 1.139

7 20 x 20 1.000 1.184 1.110 1.087 1.324 1.167

8 40 x 10 1.000 1.166 1.086 1.086 1.214 1.118

9 50 x 10 1.000 1.147 1.096 1.010 1.262 1.130

10 100 x 5 1.000 1.195 1.086 1.080 1.329 1.159

Table 4: The computational results under tight due dates (f = 1.1).

Instance Size (n x m) HDE GLS

set no. RTWTb RTWTw RTWTm RTWTb RTWTw RTWTm

1 10 x 10 1.000 1.235 1.112 1.092 1.249 1.123

2 20 x 5 1.000 1.207 1.113 1.083 1.253 1.158

3 10 x 20 1.000 1.262 1.115 1.028 1.343 1.132

4 20 x 10 1.000 1.235 1.135 1.091 1.318 1.181

5 20 x 15 1.000 1.252 1.139 1.033 1.274 1.179

6 50 x 6 1.000 1.213 1.119 1.061 1.329 1.180

7 20 x 20 1.000 1.219 1.126 1.099 1.329 1.195

8 40 x 10 1.000 1.180 1.124 1.097 1.367 1.216

9 50 x 10 1.000 1.194 1.137 1.023 1.359 1.213

10 100 x 5 1.000 1.256 1.162 1.010 1.324 1.209

1-1 2-1 3-1 4-1 5-1 6-1 7-1 8-1 9-1 10-1 Instance no.

■ HDE

■ GLS

Figure 3: Comparison of computational time.

The following comments can be made according to the results displayed in Tables 2-4 and Figure 3.

(1) For most instances, the gap between the best and the worst solutions obtained by HDE is smaller than the gap obtained by GLS, which implies that HDE performs more robustly in different executions and for different scheduling instances.

(2) When the due dates are tighter (f = 1.3 or 1.1), we find that the solution quality of GLS is considerably inferior to that of HDE. This is because under tight due dates, many jobs are prone to be tardy, and the optimization difficulty increases systematically. In this situation, the proposed tree search mechanism exhibits greater advantage. The search is more effective because the neighborhood moves are conducted on the critical blocks and a part of unpromising moves are eliminated with simple calculations. To certain extent, HDE has overcome the blindness of traditional local search, and thus it can access the high-quality solutions with larger probability.

(3) By comparing the computational time, we see that, for most instances, the consumed time of HDE is less than that of GLS. More remarkably, the increasing speed of computational time with instance size is slower on the part of HDE. This is because the mutation and crossover based on random key representation in HDE is more efficient than the crossover and mutation based on operation sequence representation in GLS. Meanwhile, the phenomenon also verifies the fact that the tree-based local search is able to accelerate the convergence of DE (note that the termination criterion of HDE is decided by the convergence status).

5.3. Influence of the Parameter Settings

The following experiments are designed for observing the impact of parameter settings on the final solution quality of HDE. A 10 x 10 instance with f = 1.3 is used in these experiments. The termination criterion adopted here is an exogenously given computational time limit. When one parameter is being tested, the other parameters are all fixed at their recommended values given in Section 5.1.

The population size (SN) is one of the key parameters for the HDE algorithm. If SN is large, many solutions need to be maintained in the population, which increases the computational burden of solution decoding and evaluation. If SN is small, more generations can be implemented within the DE framework, but the limited population diversity will impair the effectiveness of mutation and crossover. Therefore, under a given computational time limit, the selection of SN can influence the overall optimization performance. In this experiment, we fix the available computational time at three different levels and then identify the most suitable value of SN in each scenario. The time limit is set as 20 sec (tight), 40 sec (moderate), and 60 sec (loose), respectively. The computational results are displayed in Figure 4, where the vertical axis gives the average objective value obtained from 20 independent executions of the proposed HDE under each SN.

According to the results, the selection of SN has a noticeable impact on the solution quality. Generally, the solution quality will deteriorate if SN is either too small or too large. But the best SN varies with different time constraints. If the time budget is tight or moderate, the best population size is 40 ~ 50. If the time resource is abundant, the best population size is 60 ~ 70. Thus, setting SN = 50 is reasonable for ordinary uses.

Another important parameter for HDE is e%, the percentage of solutions that undergo the local search process. A reasonable selection of e will result in an effective balance between exploration and exploitation. The computational results for this experiment are displayed in Figure 5.

According to the results, the setting of e has a considerable impact on the solution quality, especially when the computational time is scarce (20 sec). A small e means that only a few solutions in each generation can be improved by the local search, which has little effect on the entire population. A large e suggests that too much time is consumed on local search, which may reduce the normal function of DE.

The best setting of e under each constraint level is 70 (for tight time budget), 60 (for moderate time budget) and 40 (for loose time budget). When the exogenous restriction on computational time is tight, DE has to rely on frequent local search to find good solutions. This is because, in the short term, the tree-based local search is more efficient than DE's mechanism (mutation and crossover) in improving a solution. However, the price to pay is possibly a premature convergence of the optimization process due to the greedy nature of the tree search. On the other hand, when the computational time is sufficient, DE will prefer a larger number of generations to conduct a systematic exploration of the solution space. In this case, the local search need not be used very frequently, otherwise the steady searching process may be disturbed. Overall, a recommended value for e is 50.

Finally, we focus on the local search parameters n1, n2, and L. Following the suggestion of [36], we fix the relationship between n and n as n = 2^2 and then test the influence of L and n1 on the final solution quality. The tested ranges are L e (8,10,12,...,24}, n e (10,12,14,...,26}, with a step size of 2, which leads to 81 combinations. Again, we choose the first instance of each set (#-1) under f = 1.3 for this experiment. The computational time limit is set as 0.6 nm sec for an n x m instance. The results are displayed as relative values in Figure 6.

The following comments can be made according to the results.

(1) The tree search parameters produce a remarkable effect on the final solutions of HDE, which confirms that such a local search optimizer is effective in improving the performance of DE. When L and n both take small values, the tree search process does not have chance to examine a sufficient number of neighborhood solutions

40 -35 -30

10 20 30 40 50 60 70 80 90 100 SN

20s -■- 40s 60s

Figure 4: The influence of parameter SN.

20s -■- 40s -A- 60s

Figure 5: The influence of parameter e.

before termination, which results in solution quality decline. On the other hand, when L and n1 are both set large, the overall performance also worsens because the exploitation is consuming too much time and crowds out the exploration process of DE.

(2) The value of L (representing the depth of the search tree) and the value of ni (representing the width of the search tree) should be coordinated in order to guarantee satisfactory solution quality. When the tree is growing too wide (resp., deep) but not sufficiently deep (resp., wide), the overall effectiveness will

Figure 6: The influence of parameters L and ^1(= 2^2).

deteriorate. Based on the experiments, the recommended settings are L = 14 and n\ = 18 (thus ^2 = 9).

6. Conclusion

In this paper, we propose a hybrid differential evolution algorithm for the job shop scheduling problem with the objective of minimizing total weighted tardiness. Based on the mathematical programming models, we show a neighborhood property for the swap of adjacent operations in a critical block, which can be used to exclude some nonimproving neighborhood moves. Then, a tree-based local optimizer is designed and embedded into the DE algorithm in order to promote the exploitation function. The tree search algorithm generates compound neighborhood for the selected solution and therefore it helps DE to exploit the relatively high-quality solutions. Finally, the computational results verify the effectiveness and efficiency of the proposed hybrid approach.

The future research can be carried out from the following aspects.

(1) It is worthwhile to investigate the new and more effective neighborhood properties for TWT-JSSP. This could provide a deeper insight into the inherent nature of TWT-JSSP and facilitate the design of metaheuristics like DE.

(2) It is worthwhile to consider other encoding schemes and mutation/crossover mechanisms of the DE algorithm (such as the discrete DE), which may be more suitable for the application of the neighborhood properties.

Appendices

A. Proof of Theorem 3.1

Proof. We prove the theorem by construction.

First, it is noticeable that the unit cost of each arc in A u a is equal to the processing time of the operation that is connected to the tail of the arc, that is, cil/i2 = pil, which means the largest cost path from node 0 to node Uj is exactly the same as the critical path from operation 0 to operation Uj, and the total cost of this path equals the length of the critical path.

Then, because constraint (f) requires that the total input flow should equal the total output flow for every node, a feasible solution F to this problem must be composed of several cycle flows. However, we know that G(N,A u a) does not contain any cycles, so each cycle in F must contain at least one arc from U. On the other hand, we can see that the arcs in U all point to the same node 0, and because a cycle cannot include duplicate nodes, it is clear that each cycle in F contains at most one arc from U. Finally, it is concluded that each cycle contains exactly one arc from U.

Based on the previous discussions, we can construct an optimal solution F* by the following steps.

Step 1. Initialize the flow on each arc to be 0, and let j = 1.

Step 2. Find the unique critical path from operation 0 to Uj, denoted by P*(0,Uj) (the uniqueness is guaranteed by a simple rule detailed in the following).

Step 3. If the total unit cost (i.e., length) of the path P*(0,Uj) plus the unit cost of arc (uj, 0) is greater than 0, then add a flow amount of wUj to arc (Uj, 0) and each arc in P*(0,Uj) (notice that the capacity limit for arc (Uj, 0) is wUj).

Step 4. Set j ^ j + 1. If j < n, then return to Step 2; otherwise, terminate the algorithm.

In such a constructed solution F*, each node except 0 has at most one positive-flow incoming arc. This is because the flows are only distributed on the arcs belonging to critical paths, and meanwhile, only one critical path is considered from node 0 to any other node i.

The last issue is how to maintain the uniqueness of the critical path to a certain node. Here we use a rule called "machine predecessor first." Indeed, when looking for a critical path from 0 to Uj, we begin from Uj and move backward. In each step, we must select an operation whose completion time equals the starting time of the current operation i from its immediate job predecessor JP(i) and its immediate machine predecessor MP(i). Then, if CjP(i) = si and Cmp© = si both hold, the rule requires that the machine predecessor should be selected as the next current operation. This tie-breaking rule guarantees the uniqueness of the critical path to any node. □

B. Proof of Theorem 3.3

Proof. As Figure 1 shows, the flows in the initial network are denoted by xi (xi > 0), so we have the flow equilibrium condition:

Xa - Fß + Xß, Xß - Fß+i + Xß+i.

Now we are swapping the operations a and p. After the SWAP operation is performed, the flows on the relevant arcs are denoted by y. So the following equilibrium equations must be satisfied:

ya-1 = Fp + yp,

yp = Fa + ya, (B.2)

ya = Fp+1 + yp+1.

It is noticed that, in order to keep the equilibrium of the remaining nodes a - a (1 < a < a) and p + b (1 <b < z - p), we have ya-a = xa-a and yp+b-\ = xp+b-1. Solving (B.2) together with (B.1) yields

yp = Xa-1 + Xp - Xa,

ya = Xp.

We can ensure yp > 0, because Theorem 3.1 implies xa-1 > xa.

The difference in the total cost of the flows [yi] and the original flows [xi] is

AC = C{y) - C(x) = ^piyi - XPixi

i=a i=a

= paxp + pp(xa-1 + xp - xa) - Paxa - Ppxp (B.4)

= Pp(xa-1 - xa) - Pa(xa - xp) = PpFa - PaFp.

Since the flows outside this block are all kept unchanged, the difference in the total cost of the whole network resulted from executing SWAP(a, p) is the same as AC calculated for this subnetwork, that is, TC(y) - TC(x) = C(y) - C(x).

Therefore, if AC > 0, then TC^T > TC(y) > TC(x) = TCmax (a denotes the original set of directed disjunctive arcs while a' denotes the new arc set obtained after performing SWAP (a, p)). The last "=" is because we assume the original network is related with the optimal solution to the dual problem under a. In fact, TCmax = TWTmin and TC^T = TWTmin. So it concludes that, when AC > 0 (^ Fa/Pa > Fp/Pp), swapping a and p will not improve the current solution (TWTmin > TWTmin). □

Acknowledgment

This work is supported by the National Natural Science Foundation of China under Grant nos. 61104176, 60874071.

References

[1] J. K. Lenstra, A. H. G. R. Rinnooy Kan, and P. Brucker, "Complexity of machine scheduling problems," Annals of Discrete Mathematics, vol. 1, pp. 343-362,1977.

I. Essafi, Y. Mati, and S. Dauzere-Peres, "A genetic local search algorithm for minimizing total weighted tardiness in the job-shop scheduling problem," Computers and Operations Research, vol. 35, no. 8, pp. 2599-2616, 2008.

H. Zhou, W. Cheung, and L. C. Leung, "Minimizing weighted tardiness of job-shop scheduling using a hybrid genetic algorithm," European Journal of Operational Research, vol. 194, no. 3, pp. 637-649, 2009. E. Nowicki and C. Smutnicki, "An advanced tabu search algorithm for the job shop problem," Journal of Scheduling, vol. 8, no. 2, pp. 145-159, 2005.

J. Q. Li, Q. K. Pan, P. N. Suganthan, and T. J. Chua, "A hybrid tabu search algorithm with an efficient neighborhood structure for the flexible job shop scheduling problem," The International Journal of Advanced Manufacturing Technology, vol. 52, no. 5-8, pp. 683-697, 2011.

D. Y. Sha and C.-Y. Hsu, "A hybrid particle swarm optimization for job shop scheduling problem," Computers & Industrial Engineering, vol. 51, no. 4, pp. 791-808, 2006.

G. Moslehi and M. Mahnam, "A Pareto approach to multi-objective flexible job-shop scheduling problem using particle swarm optimization and local search," International Journal of Production Economics, vol. 129, no. 1, pp. 14-22, 2011.

M. Seo and D. Kim, "Ant colony optimisation with parameterised search space for the job shop scheduling problem," International Journal of Production Research, vol. 48, no. 4, pp. 1143-1154, 2010. L. N. Xing, Y. W. Chen, P. Wang, Q. S. Zhao, and J. Xiong, "A knowledge-based ant colony optimization for flexible job shop scheduling problems," Applied Soft Computing Journal, vol. 10, no. 3, pp. 888-896, 2010.

K. L. Huang and C. J. Liao, "Ant colony optimization combined with taboo search for the job shop scheduling problem," Computers and Operations Research, vol. 35, no. 4, pp. 1030-1046, 2008. J. Gao, L. Sun, and M. Gen, "A hybrid genetic and variable neighborhood descent algorithm for flexible job shop scheduling problems," Computers & Operations Research, vol. 35, no. 9, pp. 2892-2907, 2008.

R. Tavakkoli-Moghaddam, M. Azarkish, and A. Sadeghnejad-Barkousaraie, "A new hybrid multi-objective Pareto archive PSO algorithm for a bi-objective job shop scheduling problem," Expert Systems with Applications, vol. 38, no. 9, pp. 10812-10821, 2011.

M. Singer and M. Pinedo, "A computational study of bound techniques for the total weighted tardiness in job shops," IIE Transactions, vol. 30, no. 2, pp. 109-118,1998.

E. Kutanoglu and I. Sabuncuoglu, "An analysis of heuristics in a dynamic job shop with weighted tardiness objectives," International Journal of Production Research, vol. 37, no. 1, pp. 165-187,1999. S.J. Mason, J. W. Fowler, and W. M. Carlyle, "A modified shifting bottleneck heuristic for minimizing total weighted tardiness in complex job shops," Journal of Scheduling, vol. 5, no. 3, pp. 247-262, 2002. L. Monch and R. Driefiel, "A distributed shifting bottleneck heuristic for complex job shops," Computers and Industrial Engineering, vol. 49, no. 3, pp. 363-380, 2005.

S. Kreipl, "A large step random walk for minimizing total weighted tardiness in a job shop," Journal of Scheduling, vol. 3, no. 3, pp. 125-138, 2000.

M. Singer, "Decomposition methods for large job shops," Computers & Operations Research, vol. 28, no. 3, pp. 193-207, 2001.

K. M. J. Bontridder, "Minimizing total weighted tardiness in a generalized job shop," Journal of Scheduling, vol. 8, no. 6, pp. 479-496, 2005.

R. Tavakkoli-Moghaddam, M. Khalili, and B. Naderi, "A hybridization of simulated annealing and electromagnetic-like mechanism for job shop problems with machine availability and sequence-dependent setup times to minimize total weighted tardiness," Soft Computing, vol. 13, no. 10, pp. 995-1006, 2009.

R. Storn and K. Price, "Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces," Journal of Global Optimization, vol. 11, no. 4, pp. 341-359,1997. E. Mezura-Montes, J. Velazquez-Reyes, and C. A. Coello Coello, "Modified differential evolution for constrained optimization," in Proceedings of the IEEE Congress on Evolutionary Computation, pp. 25-32, July 2006.

L. Wang and L.-P. Li, "Fixed-structure Hot controller synthesis based on differential evolution with level comparison," IEEE Transactions on Evolutionary Computation, vol. 15, no. 1, pp. 120-129, 2011. M. Vasile, E. Minisci, and M. Locatelli, "An inflationary differential evolution algorithm for space trajectory optimization," IEEE Transactions on Evolutionary Computation, vol. 15, no. 2, pp. 267-281, 2011.

[25] M. Sharma, M. Pandit, and L. Srivastava, "Reserve constrained multi-area economic dispatch employing differential evolution with time-varying mutation," International Journal of Electrical Power and Energy Systems, vol. 33, no. 3, pp. 753-766, 2011.

[26] V. C. Mariani, L. S. Coelho, and P. K. Sahoo, "Modified differential evolution approaches applied in exergoeconomic analysis and optimization of a cogeneration system," Expert Systems with Applications, vol. 38, no. 11, pp. 13886-13893, 2011.

[27] N. Noman and H. Iba, "Accelerating differential evolution using an adaptive local search," IEEE Transactions on Evolutionary Computation, vol. 12, no. 1, pp. 107-125, 2008.

[28] C.-W. Chiang, W.-P. Lee, and J.-S. Heh, "A 2-Opt based differential evolution for global optimization," Applied Soft Computing Journal, vol. 10, no. 4, pp. 1200-1207, 2010.

[29] W. Gong, Z. Cai, and L. Jiang, "Enhancing the performance of differential evolution using orthogonal design method," Applied Mathematics and Computation, vol. 206, no. 1, pp. 56-69, 2008.

[30] G. Onwubolu and D. Davendra, "Scheduling flow shops using differential evolution algorithm," European Journal of Operational Research, vol. 171, no. 2, pp. 674-692, 2006.

[31] B. Qian, L. Wang, R. Hu, W.-L. Wang, D.-X. Huang, and X. Wang, "A hybrid differential evolution method for permutation flow-shop scheduling," The International Journal of Advanced Manufacturing Technology, vol. 38, no. 7, pp. 757-777, 2008.

[32] Q.-K. Pan, M. F. Tasgetiren, and Y.-C. Liang, "A discrete differential evolution algorithm for the permutation flowshop scheduling problem," Computers & Industrial Engineering, vol. 55, no. 4, pp. 795-816, 2008.

[33] L. Wang, Q.-K. Pan, P. N. Suganthan, W.-H. Wang, and Y.-M. Wang, "A novel hybrid discrete differential evolution algorithm for blocking flow shop scheduling problems," Computers &Operations Research, vol. 37, no. 3, pp. 509-520, 2010.

[34] S. D. Wu, E. S. Byeon, and R. H. Storer, "A graph-theoretic decomposition of the job shop scheduling problem to achieve scheduling robustness," Operations Research, vol. 47, no. 1, pp. 113-124,1999.

[35] K. V. Price, R. M. Storn, and J. A. Lampinen, Differential Evolution: A Practical Approach to Global Optimization, Springer, Berlin, Germany, 2005.

[36] C. Rego and R. Duarte, "A filter-and-fan approach to the job shop scheduling problem," European Journal of Operational Research, vol. 194, no. 3, pp. 650-662, 2009.

[37] R. Bellman, "On a routing problem," Quarterly of Applied Mathematics, vol. 16, no. 1, pp. 87-90,1958.

[38] U. K. Chakraborty, Advances in Differential Evolution, Springer, Berlin, Germany, 2008.

Copyright of Mathematical Problems in Engineering is the property of Hindawi Publishing Corporation and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.