Clustering Nodes and Discretizing Movement to Increase the Effectiveness of HEFA for a CVRP

The Capacitated Vehicle Routing Problem (CVRP) is an important problem in transportation and industry. It is challenging for optimization algorithms, and a global optimum solution is not easy to reach. Hence, many researchers combine two or more optimization algorithms based on swarm intelligence to overcome the drawbacks of a single algorithm. In this research, a CVRP optimization model based on a discrete hybrid evolutionary firefly algorithm (DHEFA), which contains two main processes of clustering and optimization, is proposed. Evaluations on three CVRP cases show that DHEFA produces an average effectiveness of 91.74%, which is higher than that of the original FA, with a mean effectiveness of 87.95%. This result indicates that clustering the nodes effectively reduces the problem space, and that the DHEFA quickly searches for the optimum solution in those partial spaces.
Keywords: Swarm intelligence; capacitated vehicle routing problem; firefly algorithm; differential evolution; hybrid evolutionary firefly algorithm

The choice of optimization algorithm largely determines the quality of a VRP solution, and finding the global optimum solution can take a long time. Both deterministic and probabilistic optimization algorithms have specific drawbacks. Deterministic algorithms guarantee a global optimum solution, but they take a long time. In contrast, probabilistic algorithms are commonly fast, but they do not always produce a global optimum solution. In practice, probabilistic algorithms are preferred for their short processing time.
HEFA has been reported to perform well on many optimization problems [28]. Hence, in this paper, a discrete version of HEFA, called DHEFA, is exploited to develop a CVRP optimization model. A new idea of HEFA-based clustering is also proposed to make the model more effective in searching for the minimum-cost route.
Next, the fundamental theory of CVRP and HEFA is described in Section II. The proposed models of HEFA-based clustering and DHEFA-based optimization are then explained in more detail in Section III. After that, Section IV discusses the simulation results. Finally, Section V provides the conclusion and the plan for further work.

II. FUNDAMENTAL THEORY
VRP is a combinatorial optimization problem that extends the Traveling Salesman Problem (TSP) [3]. Its basic form is the Capacitated VRP (CVRP) [7]. The CVRP adds a capacity constraint to the search for optimum vehicle schedules: each visited node has a load that must be accommodated, and each vehicle has a maximum capacity that cannot be exceeded. Hence, the optimum solution depends not only on the vehicle schedule but also on the load that each vehicle can carry. The total distance of a schedule is formulated as
$$x_{tot} = \sum_{i=1}^{k} \sum_{j=1}^{c_i} x_j \tag{1}$$
where $x_{tot}$ is the total distance, $k$ is the number of vehicles, $c_i$ is the number of nodes contained in the $i$th vehicle, and $x_j$ is the route traversed by the $j$th vehicle.
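As an illustration of how a candidate schedule might be evaluated, the following minimal sketch computes the total distance and checks the capacity constraint. It assumes Euclidean distances and a single depot (node id 0) that every vehicle leaves from and returns to. The function names `route_length`, `total_distance`, and `is_feasible` are hypothetical and are not taken from the original work.

```python
import math

def route_length(route, coords, depot=0):
    """Length of one vehicle's closed tour: depot -> nodes in order -> depot.

    Assumption: Euclidean distance and a single depot with id `depot`.
    """
    stops = [depot] + list(route) + [depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(stops, stops[1:]))

def total_distance(routes, coords, depot=0):
    """Sum of the tour lengths of all vehicles."""
    return sum(route_length(r, coords, depot) for r in routes)

def is_feasible(routes, demand, max_cap):
    """True if no vehicle carries more than the maximum capacity."""
    return all(sum(demand[n] for n in r) <= max_cap for r in routes)

# Toy instance: depot 0 and three customers, served by two vehicles.
coords = {0: (0, 0), 1: (3, 0), 2: (3, 4), 3: (0, 4)}
demand = {1: 4, 2: 5, 3: 3}
routes = [[1, 2], [3]]
print(total_distance(routes, coords), is_feasible(routes, demand, max_cap=10))
```
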

A. Firefly Algorithm
FA is inspired by the movements of fireflies looking for a partner, which are based on two things: the attraction between fireflies and the intensity of light. The light intensity is basically the value of the objective function. However, the perceived light intensity is not the same in every place. Therefore, in [20] the light intensity of one firefly as seen by another is formulated as
$$I = I_0 e^{-\gamma r^2} \tag{2}$$
where $I_0$ is the value of fitness, $\gamma$ is the light absorption coefficient, and $r$ is the scalar distance between the chasing individual and the individual being pursued.
Just like the light intensity, the attraction is dynamic since it changes with distance. The greater the distance between the fireflies, the smaller the attraction. Hence, the attractiveness function is formulated as
$$\beta = \beta_0 e^{-\gamma r^2} \tag{3}$$
where $\beta_0$ is the initial attractiveness value between two individuals, which is generally set to 1. In the original version for continuous-problem optimization, the distance is calculated using the Euclidean distance as
$$r = \lVert x - y \rVert = \sqrt{\sum_{d=1}^{n} (x_d - y_d)^2} \tag{4}$$
where both $x$ and $y$ are $n$-dimensional vectors.
Meanwhile, the movement of firefly $i$ towards a brighter firefly $j$ is calculated using the formula
$$x_i(t+1) = x_i(t) + \beta\,(x_j(t) - x_i(t)) + \alpha\,(\mathrm{rand} - \tfrac{1}{2}) \tag{5}$$
where $\beta$ is the attractiveness value, $\alpha$ is a randomization parameter with a value from 0 to 1, and $\mathrm{rand}$ is a uniform random number in $[0, 1]$.
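The attractiveness and movement rules can be sketched as follows for the continuous case. This is only an illustration under stated assumptions: the exponential decay with $r^2$ and the random term `alpha * (rand - 0.5)` follow the commonly used form of the standard FA, and the names `attractiveness` and `move_towards` are hypothetical. The default parameter values are the ones reported later in the evaluation.

```python
import math
import random

def attractiveness(r, beta0=1.0, gamma=0.95):
    """Attractiveness decays exponentially with the squared distance r."""
    return beta0 * math.exp(-gamma * r * r)

def move_towards(x_i, x_j, alpha=0.2, beta0=1.0, gamma=0.95, rng=random):
    """Move firefly x_i towards the brighter firefly x_j.

    Assumption: the random term is alpha * (rand - 0.5) per dimension,
    as in the commonly used form of the standard FA.
    """
    r = math.dist(x_i, x_j)                      # Euclidean distance
    beta = attractiveness(r, beta0, gamma)
    return [a + beta * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(x_i, x_j)]

# A firefly at the origin is pulled towards a brighter one at (1, 1).
print(move_towards([0.0, 0.0], [1.0, 1.0]))
```

With `alpha=0` the step is deterministic, which makes the pull towards the brighter firefly easy to inspect.
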

B. Differential Evolution (DE)
DE is one of the Evolutionary Algorithms (EAs) [28], whose key processes are mutation, crossover, and selection. The mutation process in DE builds a velocity vector from the difference of two randomly chosen vectors. This velocity vector then drives a third vector, distinct from those two, to a new position. The DE mutation formula is represented as
$$v_i(t+1) = x_i(t) + F\,(x_k(t) - x_j(t)) \tag{6}$$
where $v_i(t+1)$ is the mutant vector, $x_i(t)$ is the old vector, and $F\,(x_k(t) - x_j(t))$ is the difference of two random vectors taken from other individuals, scaled by the factor $F$.
Meanwhile, the crossover scheme in HEFA is simply represented, element by element, as [28]
$$u_i(t+1) = \begin{cases} v_i(t+1) & \text{if } \mathrm{rand} \le c_r \\ x_i(t) & \text{otherwise} \end{cases} \tag{7}$$
where $u_i(t+1)$ is the crossover result, obtained by exchanging elements between the old vector $x_i(t)$ and the mutant vector $v_i(t+1)$, and $c_r$ is the crossover rate, a constant threshold that decides whether an element is crossed over.
The selection process is then formulated as
$$x_i(t+1) = \begin{cases} u_i(t+1) & \text{if } f(u_i(t+1)) \text{ is better than } f(x_i(t)) \\ x_i(t) & \text{otherwise} \end{cases} \tag{8}$$
where $f$ is the fitness function. This process only selects between the old vector $x_i(t)$ and the crossover result $u_i(t+1)$ based on their fitness values.
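The three DE steps can be illustrated with the short sketch below. It assumes a minimization problem, so a lower `cost` means a better fitness. The scale factor default `f=0.5` and the helper names `de_mutation`, `de_crossover`, and `de_selection` are hypothetical choices for illustration only.

```python
import random

def de_mutation(x_i, x_j, x_k, f=0.5):
    """Mutant vector: old vector plus the scaled difference of two others."""
    return [a + f * (c - b) for a, b, c in zip(x_i, x_j, x_k)]

def de_crossover(x_i, v_i, cr=0.5, rng=random):
    """Take each element from the mutant with probability cr, else keep the old one."""
    return [v if rng.random() <= cr else x for x, v in zip(x_i, v_i)]

def de_selection(x_i, u_i, cost):
    """Keep the trial vector only if it is better (assumption: lower cost is better)."""
    return u_i if cost(u_i) < cost(x_i) else x_i

# One DE step on a toy cost function (sum of squares).
cost = lambda v: sum(e * e for e in v)
x_i, x_j, x_k = [1.0, 1.0], [0.0, 2.0], [2.0, 4.0]
v_i = de_mutation(x_i, x_j, x_k)      # [2.0, 2.0]
u_i = de_crossover(x_i, v_i)
print(de_selection(x_i, u_i, cost))
```
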

C. Hybrid Evolutionary Firefly Algorithm
When building a good program of collective intelligence, the balance between exploitation and exploration plays an important role. High exploitation makes the program converge too quickly, which is known as premature convergence, and consequently the program fails to find the best solution (global optimum). In contrast, too much exploration keeps the program from converging to a global optimum, and the program tends to behave like a random search.
In FA, the balance between exploitation and exploration is mainly regulated by the values of γ and α. The α is responsible for the exploration process in the FA and is usually given a small value. A small α keeps the FA from behaving like a random search. At the same time, however, the exploration area becomes smaller, as illustrated in Fig. 1. A small radius α limits the movement of FA exploration. Each firefly, drawn as a dark blue circle, cannot explore areas outside its population. In cases where the solution space is greater than the radius of the distribution of fireflies, some areas within the solution space cannot be reached.
Nevertheless, this exploration problem can be solved using DE. Fig. 2 shows that moving based on other vectors gives DE a significant exploration radius. With this broad reach, DE can explore even outside the population area. This feature is one of the reasons why DE can complement the FA.
HEFA is a combination of FA and DE. This algorithm was introduced by Afnizanfaizal Abdullah in [28]. Its movement process is quite simple. HEFA divides the firefly population into two parts based on their fitness values. The half of the population with high fitness values exploits using the FA, while the rest, with poor fitness values, explores using the DE scheme. The experimental results in [28] indicate that HEFA performs well on complex problems and nonlinear biological models.
The proposed DHEFA-based CVRP optimization model is illustrated in Fig. 3. It receives a dataset of nodes. First, the dataset is clustered using a HEFA. The produced optimum centroids are then exploited to initialize a population of fireflies in a DHEFA, where each firefly represents a candidate route. Finally, the DHEFA searches for a minimum-cost route as the best solution.
The most challenging step in this optimization problem is dividing the nodes among the available vehicles. This division can be done in a purely random way, by selecting nodes in sequence until the maximum vehicle capacity is reached, and so on. However, clustering the nodes into n clusters, where n is the number of available vehicles, is preferable since clustering tends to reduce the total distance traveled by each vehicle.
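The fitness-based split that HEFA applies to its population can be sketched as follows. The sketch assumes a minimization problem, so a lower `cost` means a fitter firefly, and it gives the extra individual to the DE half when the population size is odd. Both choices, as well as the name `split_population`, are illustrative assumptions rather than details taken from [28].

```python
def split_population(population, cost):
    """Split fireflies into a fitter half (FA moves) and a weaker half (DE moves).

    Assumption: lower cost means better fitness; with an odd population
    size the extra individual goes to the DE half.
    """
    ranked = sorted(population, key=cost)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]

# Toy population where each "firefly" is just its own cost.
fa_half, de_half = split_population([3, 1, 4, 2], cost=lambda v: v)
print(fa_half, de_half)
```
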

A. Dataset of Nodes
The dataset used in this research is the Augerat et al. Set B. Three instances are used: B-n50-k8, which contains 50 nodes with eight vehicles; B-n66-k9, which contains 66 nodes with nine vehicles; and B-n78-k10, which contains 78 nodes with ten vehicles. None of the instances provides an assignment of nodes to vehicles, which is important since it affects the total distance traveled by each vehicle. Therefore, a clustering procedure is needed to develop the optimization model.

B. HEFA-based Clustering
A HEFA-based clustering is exploited here since HEFA has been reported to give a high performance. It is expected to produce a cluster that is as dense as possible for each vehicle, since the denser the cluster, the lower the total distance for the vehicle. At the beginning of an iteration, each firefly contains a random vector whose size is twice the number of vehicles. Each pair of vector elements in a firefly represents the coordinates of one centroid in the form (x, y).
All centroid coordinates are then used to produce a fitness value obtained from the objective function. Half of the firefly population moves to pursue the best fitness value from its perspective, while the rest moves quasi-randomly in search of a better fitness value. Once all fireflies have moved, they update their respective fitness values. This is repeated until the stop condition is reached, and the HEFA returns the best firefly, which has the highest fitness value. An example of HEFA-based clustering of a set of sixty nodes into three clusters is illustrated in Fig. 4. The coordinates of all centroids produced by the best firefly are then used to determine the cluster of each node. All nodes in a cluster are visited by one particular vehicle.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 4, 2020
The objective function is simply designed here using the sum of squared Euclidean distances (SSE). This function calculates the total squared distance of all nodes to their respective centroids, which is formulated as
$$SSE = \sum_{j=1}^{k} \sum_{x_i \in c_j} \lVert x_i - c_j \rVert^2 \tag{9}$$
where $k$ is the number of clusters, $x_i$ is the $i$th node, and $c_j$ is the $j$th cluster, represented by its centroid.
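The firefly encoding and the SSE objective can be sketched as follows. The sketch assumes that each node belongs to its nearest centroid, which is how the SSE is usually evaluated for centroid-based clustering. The names `decode_centroids`, `nearest_centroid`, and `sse` are hypothetical.

```python
def decode_centroids(firefly):
    """Read a flat vector [x1, y1, x2, y2, ...] as a list of (x, y) centroids."""
    return [(firefly[i], firefly[i + 1]) for i in range(0, len(firefly), 2)]

def nearest_centroid(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    px, py = point
    dists = [(px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centroids]
    return dists.index(min(dists))

def sse(firefly, nodes):
    """Sum of squared distances from every node to its nearest centroid."""
    centroids = decode_centroids(firefly)
    total = 0
    for px, py in nodes:
        cx, cy = centroids[nearest_centroid((px, py), centroids)]
        total += (px - cx) ** 2 + (py - cy) ** 2
    return total

# Four nodes on a line and a firefly that encodes two centroids.
nodes = [(0, 0), (2, 0), (10, 0), (12, 0)]
firefly = [1, 0, 11, 0]
print(sse(firefly, nodes))
```

A lower SSE means denser clusters, so a HEFA that maximizes fitness could, for example, use the negated SSE as its fitness value.
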
Once the optimum clusters are generated, each vehicle is checked for a load that exceeds the maximum capacity. Nodes in an overloaded vehicle are then redistributed to the nearest underloaded vehicle, as illustrated in Fig. 5. There, the vehicle load Cap in cluster $c_2$ exceeds the maximum capacity MaxCap, so the vehicle looks for the closest vehicle to which one or more nodes can be redistributed. The node that is closest to cluster $c_1$ is selected and moved to cluster $c_1$.
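The redistribution step can be sketched as follows. This is a simplified variant under stated assumptions, not the exact procedure of the paper: in each step it moves the node of the overloaded cluster that lies closest to the centroid of any cluster with enough spare capacity, and it assumes positive demands. The name `redistribute` and its arguments are hypothetical.

```python
import math

def redistribute(clusters, centroids, coords, demand, max_cap):
    """Move nodes out of overloaded clusters into nearby clusters with spare capacity.

    Simplifying assumption: each step moves the (node, target cluster) pair
    with the smallest node-to-centroid distance among all feasible targets.
    """
    clusters = [list(c) for c in clusters]       # work on a copy

    def load(cluster):
        return sum(demand[n] for n in cluster)

    for src in range(len(clusters)):
        while load(clusters[src]) > max_cap:
            best = None                          # (distance, node, target index)
            for node in clusters[src]:
                for dst in range(len(clusters)):
                    if dst == src or load(clusters[dst]) + demand[node] > max_cap:
                        continue
                    d = math.dist(coords[node], centroids[dst])
                    if best is None or d < best[0]:
                        best = (d, node, dst)
            if best is None:
                raise ValueError("no cluster has enough spare capacity")
            _, node, dst = best
            clusters[src].remove(node)
            clusters[dst].append(node)
    return clusters

# Cluster 0 carries 15 units with a capacity of 10, so one node must move.
coords = {1: (0, 0), 2: (1, 0), 3: (4, 0), 4: (10, 0)}
demand = {1: 5, 2: 5, 3: 5, 4: 5}
centroids = [(1.5, 0.0), (10.0, 0.0)]
print(redistribute([[1, 2, 3], [4]], centroids, coords, demand, max_cap=10))
```
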

C. DHEFA-based Optimization
Finally, a minimum-cost route is searched for using DHEFA. Since determining the route is a discrete problem (a sequence of nodes to be visited), the HEFA has to be redesigned into a discrete model. In [29], a discrete firefly algorithm (DFA) with a high performance is proposed. In this paper, the discrete model of FA is designed by following the concept of DFA.
At the beginning of the iteration, a firefly in DHEFA consists of a random vector whose length equals the total number of nodes and whose elements are the non-repeating integers from one to the total number of nodes. This vector is divided into as many parts as there are vehicles, where the nodes contained in each vehicle follow the clusters resulting from the previous HEFA-based clustering and the redistribution procedure. An example of the firefly representation is illustrated in Fig. 6.
HEFA uses a distance that is determined by the difference between two firefly vectors, while DHEFA calculates the distance as the number of differing elements between two fireflies (also known as the Hamming distance [30]), as illustrated in Fig. 7. Another difference is the movement of fireflies. This movement does not add the distance to the followed firefly to the ith firefly vector as in Equation 5, but instead uses an insertion function. This function takes a random node and swaps it with another random node [29], as illustrated in Fig. 8. In this CVRP case, the insertion is restricted to two nodes in the same cluster since the vectors in fireflies are divided by the number of vehicles, and two elements in two different vehicles cannot be exchanged. Therefore, when a random element is chosen in k i , the second random element must also be in k i . This exchange is carried out as many times as the Hamming distance × γ.
Just like HEFA, the movement of a firefly in DHEFA also depends on its fitness value, which is calculated using Equation (5). Half of the firefly population chases the best fireflies from its perspective, while the rest moves randomly, expecting to get better fitness values. All fireflies then update their fitness values to be compared in the next iteration. When the stopping condition is reached, the best firefly is chosen as the minimum-cost solution, as illustrated in Fig. 9.
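The Hamming distance and the within-cluster swap movement can be sketched as follows. The sketch makes several assumptions that are not specified in the text: each vehicle is represented by a `(start, end)` index range in the firefly vector, the number of swaps is the Hamming distance × γ rounded to the nearest integer, and vehicles with fewer than two nodes are skipped. The names `hamming`, `swap_within_segment`, and `move` are hypothetical.

```python
import random

def hamming(a, b):
    """Number of positions at which two fireflies differ."""
    return sum(1 for p, q in zip(a, b) if p != q)

def swap_within_segment(firefly, segments, rng=random):
    """Swap two random nodes that belong to the same vehicle.

    `segments` holds one (start, end) index range per vehicle (assumption).
    """
    new = list(firefly)
    eligible = [s for s in segments if s[1] - s[0] >= 2]
    if not eligible:
        return new
    start, end = rng.choice(eligible)
    i, j = rng.sample(range(start, end), 2)
    new[i], new[j] = new[j], new[i]
    return new

def move(firefly, target, segments, gamma=0.95, rng=random):
    """Apply round(Hamming distance * gamma) within-vehicle swaps."""
    steps = round(hamming(firefly, target) * gamma)
    for _ in range(steps):
        firefly = swap_within_segment(firefly, segments, rng)
    return firefly

# Five nodes split over two vehicles: positions 0-2 and positions 3-4.
segments = [(0, 3), (3, 5)]
firefly, target = [1, 2, 3, 4, 5], [3, 2, 1, 5, 4]
print(hamming(firefly, target), move(firefly, target, segments))
```

Because every swap stays inside one index range, the assignment of nodes to vehicles produced by the clustering step is preserved.
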

IV. RESULTS AND DISCUSSION
In this research, the proposed DHEFA-based model is evaluated and compared with the original FA-based model using three cases of CVRP. Each experiment is run five times, and the results are averaged to give a more reliable statistical result. In each case, an effectiveness metric is used to measure how close the obtained optimum-cost route is to the real global optimum-cost route from the dataset. In this evaluation, both FA and DHEFA use the same parameter settings: γ = 0.95, α = 0.2, and c r = 0.5. The results are listed in Table I. In all cases, DHEFA produces higher effectiveness than the original FA. In the CVRP case of B-n50-k8, with 50 nodes and eight vehicles, DHEFA produces an average effectiveness of 94.25%, while the original FA gives 90.23%. In the CVRP case of B-n66-k9, which contains 66 nodes and nine vehicles, DHEFA also reaches a higher average effectiveness of 93.84%, while the FA obtains 92.27%.
Meanwhile, in the CVRP case of B-n78-k10, with 78 nodes and ten vehicles available, DHEFA reaches an average effectiveness of 87.13%, while the original FA yields only 81.36%. Thus, over the three cases, DHEFA reaches a higher average effectiveness of 91.74% than the original FA, which obtains 87.95%.
This effectiveness of DHEFA is strongly supported by the procedure of clustering nodes. Dividing the nodes into clusters reduces the problem space to several smaller areas so that the optimization can be applied to each part separately. Hence, the research objective stated in Section I has been reached.
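The overall averages follow directly from the per-instance values reported in the text, as the short check below illustrates. The dictionary layout and variable names are only for illustration.

```python
from statistics import mean

# Average effectiveness (%) per instance, as reported in the text.
dhefa = {"B-n50-k8": 94.25, "B-n66-k9": 93.84, "B-n78-k10": 87.13}
fa = {"B-n50-k8": 90.23, "B-n66-k9": 92.27, "B-n78-k10": 81.36}

dhefa_mean = round(mean(dhefa.values()), 2)   # overall DHEFA average
fa_mean = round(mean(fa.values()), 2)         # overall FA average
gain = round(dhefa_mean - fa_mean, 2)         # difference in percentage points

print(dhefa_mean, fa_mean, gain)
```

The largest per-instance gap, 5.77 percentage points, occurs on the largest instance, B-n78-k10.
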

V. CONCLUSION
The proposed model of DHEFA-based CVRP optimization reaches an average effectiveness of 91.74%. This result is better than that of the original FA, which gives a mean effectiveness of 87.95%. It indicates that the proposed clustering increases the effectiveness of DHEFA. The explanation is that dividing the nodes into clusters reduces the problem space to several smaller areas so that the optimization can be applied to each part separately. In the future, an advanced redistribution procedure can be introduced to ensure that all vehicles carry fair loads and do not violate the maximum capacity.