Machine Scheduling to Minimize Makespan and Maximum Just-in-Time Deviations

This paper studies the unrelated parallel machine scheduling problem with three minimization objectives - makespan, maximum earliness, and maximum tardiness (MET- UPMSP). The last two objectives combined are related to just-in- time (JIT) performance of a solution. Three hybrid algorithms are presented to solve the MET-UPMSP: reactive GRASP with path relinking, dual-archived memetic algorithm (DAMA), and SPEA2. In order to improve the solution quality, min-max matching is included in the decoding scheme for each algorithm. An experiment is conducted to evaluate the performance of the three algorithms, using 100 (jobs) x 3 (machines) and 200 x 5 problem instances with three combinations of two due date factors - tight and range. The numerical results indicate that DAMA performs best and GRASP performs second for most problem instances in three performance metrics: HVR, GD, and Spread. The experimental results also show that incorporating min-max matching into decoding scheme significantly improves the solution quality for the two population-based algorithms. It is worth noting that the solutions produced by DAMA with matching decoding can be used as benchmark to evaluate the performance of other algorithms.


INTRODUCTION
In production scheduling, management concerns are often multi-dimensional.In order to reach an acceptable compromise, one has to measure the quality of a solution on all important criteria.This concern has led to the development of multicriterion scheduling [1].During scheduling, consideration of several criteria will provide the decision maker with a more practical solution.In production scheduling, objectives under considerations often include system utilization or makespan, total machining cost or workload, JIT related costs (earliness and tardiness penalties), total weighted flow time, and total weighted tardiness.The goal of total weighted flow time is to lower the work-in-process inventory cost during the production process, while the goal of just-in-time is to minimize producer and customer dissatisfactions towards delivery due dates.
Parallel machine models are a generalization of single machine scheduling, and a special case of flexible flow shop.Parallel machine models can be classified into three cases: identical, uniform, and unrelated (UPMSP).In the UPMSP case, machine i may finish job 1 quickly but will require much longer with job 2; on the other hand, machine j may finish job 2 quickly but will take much longer with job 1.In practice, UPMSPs are often encountered in production environments; for instance, injection modeling and LCD manufacturing [2], wire bonding workstation in integrated-circuit packaging manufacturing [3], etc.Moreover, many manufacturing processes are flexible flow shops (FFS) which are composed of UPMSP at each stage: PCB assembly and fabrication [4][5][6], ceramic tile manufacturing Ruiz and Maroto [7].Jungwattanakit et al. [8] proposed a genetic algorithm (GA) for FFS with unrelated parallel machines and a weighted sum of two objectives -makespan and number of tardy jobs.The numerical results indicate that the GA outperforms dispatching rule-based heuristics.Davoudpour and Ashrafi [9] employed a greedy random adaptive search procedure (GRASP) to solve the FFS with a weighted sum of four objectives.
Over the years, UPMSPs with a single objective have been widely studied.For a survey of parallel machine scheduling on various objectives and solution methods, we refer to Logendran et al. [10] and Allahverdi et al. [11].In contrast, there are relatively few studies on UPMSPs considering multiple objectives.T'kindt et al. [12] studied an UPMSP glass bottle manufacturing, with the aim of simultaneously optimizing workload balance and total profit.Cochran et al. [13] introduced a two-phase multi-population genetic algorithm to solve multi-objective parallel machine scheduling problems.Gao [14] proposed an artificial immune system to solve the UPMSPs to simultaneously minimizing the makespan, total earliness and tardiness penalty.For further references regarding multicriteria UMPSPs, refer to Hoogeveen [1].
In this paper, we consider a multi-objective unrelated parallel machine scheduling problems aiming to simultaneously minimize three objectivesmakespan, maximum earliness, and maximum tardiness.Hereafter we shall refer to this problem as MET-UPMSP, where the latter two objectives are used to evaluate the just-in-time performance of a schedule.This paper is organized as follows: Section 2 describes the problem MET-UPMSP; Section 3 presents the algorithms for MET-UPMSP; Section 4 introduces several performance metrics and analyzes experimental results; Section 5 provides concluding remarks.www.ijarai.thesai.org

II. PROBLEM DESCRIPTION
The MET-UPMSP has the following features: (1) the problem contains M unrelated parallel machines and J jobs; (2) each job has its own due date, and may also have a different processing time depending on the machine assigned; (3) each machine is allowed to process one job at a time, where the processing is non-preemptive; (4) setup times are job sequenceand machine-dependent.The following are notations and mathematical model for the MET-UPMSP.
A. Notations: m: machine index, m = 1,…, M j: job index, j = 1,…, J p jm : processing time of job j on machine m s ijm : setup time of job j following job i on machine m d j : due date of job j B. Decision variables: x ijm = 1 if both jobs i and j are processed on machine m, and job i immediately precedes job j; otherwise, x ijm = 0.
In the model, equation (1) shows the three objectives.Constraint set (2) restricts job sequence and machine assignment.Constraint set (3) ensures that each machine has the first job.Constraint set (4) specifies the relationships between the finish and start times of jobs processed on the same machine, where M big is a sufficiently large number; the inequality is invalid if jobs i and j are not processed on the same machine and/or job i does not immediately precede job j.
Constraint (5) specifies that the production makespan must not be smaller than the finish time of any job.Constraint set (6) defines the tardiness of a job and the maximum tardiness of all jobs.Constraint set (7) defines the earliness of a job and the maximum earliness among all jobs.Constraint set The MET-UPMSP is strongly NP-hard since the single machine scheduling problem with the objective of minimizing makespan, 1 | s jk | C max , is strongly NP-hard.
Where is the processing time of job j on machine m, d j is the due date of job j, ̅ and ̅ are the average processing time of the remaining jobs if they are processed on machine m, k 1 is the due-date related scaling parameter and k 2 the setup time related scaling parameter.D is the estimated makespan ( ̅ ̅ ) , where  is the total number of jobs divided by the total number of machines, ̅ is the mean setup time, and = 0.

1) Construction of the RCL
The greedy functions defined above are the larger the better.At any GRASP iteration step, a job j is selected using roulette method from the restricted candidate list (RCL), in which each element has a greedy function value within the interval, [ ( ) ] where , .

2) Reactive GRASP
In the construction phase, reactive GRASP is used, rather than basic GRASP.Prais and Ribeiro [21] showed that using a single fixed value for RCL parameter often hinders finding a high-quality solution, which could be found if another value www.ijarai.thesai.org was used.Another drawback of the basic GRASP is the lack of learning from previous searches.In our Reactive GRASP, a set of parameter values {0.05, 0.1, 0.3, 0.5} is chosen.Originally, each i value is used to find constructive solutions for a predetermined number of times.Let Nd* be the current largest nadir distance, and A i the current average nadir distance for i .Define q i = A i /Nd * .Then the probability of i being chosen is = /∑ .
An experimental result indicates that reactive GRASP outperforms basic GRASP for any fixed value in {0.05, 0.1, 0.3, 0.5}.In the experiment, three instances of problem size 200 x 5 were generated for each of the three due date parameters: (, R) = (0.2, 0.8), (0.5, 0.5), and (0.8, 0.2).Each instance has ten replication runs, and each run has 25 restarts, each of which performs 30 local search iterations.Afterward, the average nadir distance of the ten replication runs for each instance is computed, and then the average and standard deviation of results.The result shows that the average nadir distance of the reactive GRASP is larger than that of basic GRASP for MET-UPMSP.

3) Nadir distance
The nadir point in the objective space is computed as follows: where [ ⁄ ] is the smallest integer which is not smaller than ⁄ , is the sequence that ranks all job processing times on machine m in decreasing order, p( ( )) is the k-th largest processing time for machine m, is the sequence that ranks all job setup times on machine m in decreasing order, ( ( )) is the k-th sequence setup time.
, M}, which is the maximum job due date less the shortest processing time and smallest setup time.

{ |
, where D is the estimated makespan.
The nadir distance of a solution with objective vector a is defined as the Euclidean distance between a and nadir point.The neighborhood solution will replace current solution is the nadir distance of the former is greater than that of the latter.

4) Local Search
Given a current solution (CS), a neighborhood solution (NS) is generated as follows: In the CS, select the group-machine pair having the smallest nadir distance, and randomly select another group from the remaining groups.Each group first determines the number of jobs based on a random integer from [1, 0.25 J/M]; then randomly select a job set from the two groups for swapping.For each single-machine scheduling, apply 3-opt local search for a number of times.To determine whether NS will replace CS, the following rule is used: If NS dominates CS, set CS = NS; if CS dominates NS, leave CS unchanged; if NS and CS do not dominate each other, then set the one with a larger nadir distance to be the CS.
To enhance the local search improvement on solution quality, min-max matching is employed.The following describes this matching technique for a partition of jobs {G k : k = 1,…, M}.
Step 1: For each group-machine pair, {G j , M k }, apply 3-opt to obtain a local optimal solutions with respect to nadir distance, and then compute the corresponding three objectives.Thus, we can obtain an M by M matrix where each element has three objective values ( ).
Step 2: Apply min-max matching to each individual objective in the matrix.Let , , be the corresponding optimal values; let , , be the maximum values for the three objectives, respectively.Let SC = {C max | Step 3: For each configuration of ( ) with SE, ST, assign a very large value to the cells (f 1 , f 2 , f 3 ) in the matrix where f 1 > C max , f 2 > E max , and f 3 > T max .Apply maximum cardinality matching to the resulting matrix.If the maximum matrix is equal to M, then set S = S  {( ) .
Step 4: Compare all elements in S based on Pareto domination.
Let P be the set of all non-dominated elements in S. Output the set P.

5) Path-Relinking
The iterative two-phase process of GRASP aims to generate a set of diversified Pareto local optimal solutions that will be stored in an archive.In the final phase, path relinking is applied using these Pareto local optimal solutions to further refine the solution quality.At each iteration, an initiating solution and a guiding solution are drawn from the current archive to perform a PR operation.
Let be the number of solutions in the archive.Thus, there are Q-1 adjacent solutions.For each pair of adjacent solutions (x i , x i+1 ), backward and forward relinking search procedures will be applied.For each relinking path, a sequence of {1/p, 2/p, …, p-1/p} is selected and one point crossover operation is performed based on the position of the encoding list at 1/p, …, p-1/p.Each bi-directional path relinking search will calculate 2(p-1) solutions.The choice for p will be determined by the number of solutions used in performance comparison of the three algorithms.In GRASP, p is set to 5.
An experiment is conducted to determine the parameter settings for (number of restarts, number of PRs).The experiment tests three problem instances with (, R) = (0.8, 0.2).Each instance has four combination levels on (number of restarts, number of PRs), and each level is solved with 10 replications.Each restart and each PR will generate 100 solutions.The four combination levels will be compared using the average nadir distance based on 2,000 solutions for each replication.The experimental results indicate that (restart, PR) = (15, 5) and (10, 10) yield approximately the same average www.ijarai.thesai.orgnadir distance.Thus, policy (15,5) is selected for our GRASP in solving the MET-UPSMP.

B. Dual-Archived Memetic Algorithm (DAMA)
DAMA is a variant of SPEA2.It differs from SPEA2 in three aspects: (1) population evolves with two archiveselite and inferior, using competitive strategy to produce the population of next generation; (2) fuzzy C-means [22] is applied to maintain archive size; (3) min-max matching is included in decoding scheme.The proposed parallel archived evolutionary algorithm is termed memetic algorithm since minmax matching will serve as an effective local search to improve solution quality for decoding scheme.

1) Encoding and decoding schemes
DAMA and SPEA2 adopt random key list (RKL) as their encoding scheme.For each RKL, the integral value of a cell represents the group to which the job is assigned, and the decimal value ranks job processing order.Fig. 1 presents an example of RKL for 7 jobs on two machines.In the example, the initial processing sequence of jobs for the first group G 1 = {5, 2, 6, 3}, and for G 2 = {4, 1, 3} according to their decimal values in the RKL.Then the 3-opt local refinement is applied to generate a neighborhood solution for each group-machine pair using nadir distance to decide the current representative solution.The procedure is repeated until a pre-specified number of 3-opt operations have been reached.For the 3-opt local search process of group G k with machine m, nadir point is defined as follows: Besides maintaining one efficient archive (EA t ) at each generation t to assist in algorithm convergence, the DAMA uses an inefficient archive (IA t ) to prevent premature convergence, and enable the memetic algorithm to explore solutions in an extensive space.At each generation, two parallel memetic procedures collectively produce the subsequent population: one procedure applies memetic operation (recombination followed by min-max matching) on the union of GP t and EA t , and the other procedure applies memetic operation to the union of GP t and IA t .In the recombination operation, each cell of the child will take the value from parent 1 if the sum of the two parents' decimal values in the same cells exceeds one; otherwise, it will take the value from parent 2. The following illustrates the DAMA algorithm.

2) Fitness assignment
Generally, the fitness assignment for (GP t EA t ) follows SPEA2 [18] on minimization problems, and the fitness assignment for (GP t IA t ) follows SPEA2 on maximization problems.The fitness assignment considers domination and diversity factors.For DAMA, a modification is made on diversity measure because the problem under study is discrete.

IV. NUMERICAL RESULTS
An experiment was conducted to investigate the performance of the proposed algorithms.All algorithms were coded in Visual Studio C++.NET 2008, and implemented on a computer with Intel (R) core (TM) i5-2400@3.1 GHz and 4 GB DDR3.

A. Parameter settings
Population and archive sizes of DAMA and SPEA2 are N = 20, ̅ = 20, maximum iterations = 100, no. of restarts = 7.The competitive ratio of DAMA is r 0 = 0.9.For GRASP, we set (no. restart, no.PR) = (15,5).All algorithms were executed 10 replications for each instance.The performances of algorithms with min-max matching are compared based on the same number of matching iterations.Finally, the effect of including min-max matching in the decoding scheme will also be discussed.

B. Generating test instances
Two problem sizes are considered in this experiment: 100 (jobs) x 3 (machines), and 200 x 5. We shall refer to the former as large size and the latter as moderate size.For each problem size, three test sets each consisting of three instances, were generated according to Lee and Pinedo [19].Each test instance is denoted by four characters: AB0n.The first character A www.ijarai.thesai.orgrepresents problem size: "L" for large and "M" for moderate.The second character B represents due date tightness: "L" for loose due date factors (, R) = (0.2, 0.8), "M" for moderate (, R) = (0.5, 0.5), and "T" for tight (, R) = (0.8, 0.2).Finally, the last two characters 0n represent the problem instance index.The larger the problem size, the more complex the problem; the tighter the due date factors, the more difficult the problem.Thus, "LM" problems will be the easiest to solve and "LT" problems will be the most difficult.Table 1 shows the data sets.

C. Performance metrics
When developing an algorithm to solve multi-objective optimization problems, diverse evaluation techniques are required to measure algorithm performance.Generally speaking, performance metrics are classified into three categories: Proximity, Diversity, and both.The following are several metrics used in our research.

1) Proximity
This metric evaluates the total distance between the local Pareto optimal front generated by an algorithm and globally Pareto-optimal front.We consider a commonly used proximity metric, GD (generational distance).

GD(A) = ∑ ⁄
where A is the set of non-dominated solutions generated by algorithm, |A| is the number of solutions, and d i is the distance of objective values of solution i to the nearest Pareto front point.

2) Diversity
Diversity is another important characteristic for measuring the quality of a non-dominated set.One popular metric for diversity is Spread [23], which calculates a relative minimum distance between local Pareto-optimal front elements.This metric also considers the extent of the spread and requires a reference Pareto front set P r to be computed.For threeobjective problems, Spread will be computed using minimum spanning tree which involves three shortest distances from the local Pareto-optimal front elements A to the three planes.
is the shortest distance from A to X-Y, X-Z, and Y-Z planes, ∑ is the total distance of the minimum spanning tree for A, and ̅ is the mean distance counting all |A| + 2 arcs.

3) Proximity and diversity
Zitzler and Thiele [24] introduced a hypervolume (HV) metric which can measure both proximity and diversity.A nadir point is required to calculate the HV metric.It is clear to observe that if point a dominates point b, then the volume of a must be greater than that of b.Let A = {a 1 ,…,a q }.The better the quality of A in proximity and diversity, the larger the HV of A.

HV = volume (⋃
), where h i is the hypercube of in A.
For three-objective case, the following formula can be applied to calculate HV for A. Let v( ) be the volume of .
The calculation will be time-consuming if the set A contains a large number of elements.In our algorithms, the archive size is limited to 20.The computation time is acceptable.HVR(A) (hypervolume rate) is defined as ( ) ( ) ⁄ where P r is the reference Pareto front set obtained by comparing the local non-dominated solutions produced by all algorithms.

D. Performance comparisons
TABLEs II and III present the HVR performance of SPEA2, DAMA, and GRASP on medium-and large-sized problem instances.In the tables, the symbol " " in () represents the performance where the min-max matching technique is not used, and "" represents the performance where matching technique is applied.For example, the values 37.4(63.8)located in ML column and DAMA(M) row of TABLE II indicate that HVR is 37.4% for DAMA without matchingbased decoding, and HVR is improved to 63.8% for DAMA with matching.From TABLEs II and III, SPEA2 and DAMA with matching-based decoding considerably improve solution quality.However, GRASP does not reveal much advantage when matching is applied.For example, in MT instances, SPEA2 improves HVR from 27.5% to 84.9%, DAMA from 28.2% to 88.6%, but GRASP only from 56.5% to 59.7%.
GRASP performs best among all algorithms without matching, and there is little improvement for GRASP without matching.This indicates that GRASP is able to produce high quality solutions.However, for SPEA2 and DAMA, the effect of matching is significant, particularly for tight due-date instances.In summary, DAMA with matching (DAMA_M) is superior to the others in terms of HVR metric.TABLEs IV-V display GD performance of the algorithms.For 100 x 3 instances (TABLE VI), DAMA_M performs best for all three types of instances.GRASP_M is little better than GRASP, but both perform well for ML.For 200 x 5 instances, GRASP_M performs best for LL and LM instances.However, for LT instances, DAMA_M is superior in GD performance.From the entries of TABLE VII, we can conclude that SPEA2_M, DAMA_M, GRASP, and GRASP_M produce local solutions which are close to the reference set.The value behind the sign "" is standard deviation.TABLEs VI and VII present the Spread performance of the algorithms.Spread measures the diversity of the local solutions generated by an algorithm.A small Spread value indicates that the local solutions are more uniformly distributed.For 100 x 3 instances, DAMA_M generates more evenly distributed local solutions than the other algorithms.GRASP_M performs second best.For 200 x 5 instances, DAMA_M is superior to the others.In contrast, SPEA2_M performs next and generates Spread values closest to the best for every type of instances.From the entries of TABLEs VI and VII, we observe that using matching decoding will produce better distributed local solutions than not using.The gap of the Spread values is significant when problem size increases.

V. CONCLUSION
Parallel machine scheduling are often observed in production environment, and the goal that production management wishes to achieve is often multi-fold.This paper studies unrelated parallel machine scheduling problems with three minimization objectives: makespan, maximum earliness, and maximum tardiness.
Three algorithms are presented to solve this problem: GRASP, DAMA, and SPEA2.Our numerical results indicate that GRASP outperforms the other two algorithms without the min-max matching technique, but the performance improvement is not significant when the min-max matching is used.In contrast, the two population-based algorithms, SPEA2 and DAMA, including min-max matching in the decoding scheme will significantly improve the solution quality.Although the DAMA with matching-based decoding scheme requires more computation time, it will produce high quality solutions, which can be used as comparison standard to evaluate the performance of other algorithms.
4 + 10/ .The parameters k 1 and k 2 can be regarded as functions of three factors: (1) the due date tightness factor ; (2) the due date range factor R; (3) the setup time severity factor = ̅ / ̅ k 1 = 4.5 + R for R  0.5 and k 1 = 6 -2R for R 0.5  2 Fitness assignment: Calculate fitness values of individuals in (GP t EA t ) and (GP t IA t ), respectively.Step 3 Generate population GP t+1 : Step 3.1 Perform crossover on (GP t EA t ): Produce [r N] offspring from (GP t EA t ) by crossover operation using binary tournament for mating selection.Decode each offspring.Step 3.2 perform (GP t IA t ): Produce N -[r N] offspring from (GP t EA t ) by the same method in Step 3.1.Step 4: Update of EA t+1 and IA t+1 Step 4.1: Compute F 1 (GP t+1 ) and copy into EA t ; update EA t+1 .If |EA t+1 | > ̅ , trim EA t+1 to size ̅ by FCM.Step 4.2: Compute F L (GP t+1 ) and copy it into IA t , update IA t+1 .If If |IA t+1 | > ̅ , trim IA t+1 to size ̅ by FCM.Step 5: Compute r t+1 according to the following equation.r t+1 = |(crossover on (GP t EA t )r t  F 1 (GP t+1 )| /(|F 1 (GP t+1 )|+) Step 6: t = t+1; if t = T, proceed to Step 7; otherwise, return to Step 2. Step 7: If the number of restarts is not over, proceed to Step 0; otherwise, output global non-dominated set A from all EA T .

TABLE VI .
SPREAD OF ALGORITHMS ON 100 X 3 TEST SETS