Load Balancing based on Bee Colony Algorithm with Partitioning of Public Clouds

Cloud computing is an emerging trend in the IT industry that provides new opportunities to control costs associated with the creation and maintenance of applications. Of prevalent issues in cloud computing, load balancing is a primary one as it has a significant impact on efficiency and plays a leading role in improved management. In this paper, by using a heuristic search technique called the bee colony algorithm, tasks are balanced across virtual machines such that their waiting time in the queue is minimized. In the proposed model, the cloud is partitioned into several sectors with many nodes as resources of distributed computing. Furthermore, the indices of speed and cost are considered to prioritize virtual machines. The results of a simulation show that the proposed model outperforms prevalent algorithms as it balances the prioritization of tasks on the virtual machines as well as the entire cloud system and minimizes the waiting times of tasks in the queue. It also reduces the completion time of tasks in comparison with the HBB-LB, WRR, and FCFS algorithms.

Keywords: Cloud computing; load balancing; bee colony algorithm; public cloud; cloud partitioning


I. INTRODUCTION
Cloud computing provides ways of presenting IT services in a similar manner to public utility companies. In simple terms, cloud computing is a new approach to using computing resources. The cloud is a group of distributed nodes that supply resources, hardware, and software over the network based on user demand. The increasingly strong presence of such companies as Microsoft, Google, and Amazon in the arena of cloud computing indicates its rapid development and influence on IT.
New changes or concepts in the technological world can lead to problems and complications, and cloud computing is no exception. It poses many challenges to experts in the field. One of these is load balancing in the cloud. Load balancing is crucial in computer science, and has attracted a considerable amount of research. Many techniques have been employed for load balancing, such as the genetic algorithm, bees' algorithm, neural networks, and distributed research. Load balancing algorithms make decisions about allocating resources to tasks and coordinating among them. The aim of load balancing is to share resources among tasks within the system such that for every resource, there is an equal number of tasks to be completed, which minimizes the total time needed [3].

Algorithms for load balancing and resource management can be categorized into three groups (Wu, Wang & Xie, 2013 [7]; Yan, Wang, Chang & Lin, 2007 [9]). The first consists of algorithms for static load balancing. In such algorithms, decisions concerning load balancing are made at compile time. The advantage of static load balancing is its simple implementation and low overhead, as there is no need to permanently monitor nodes to assess the efficiency of the system. These algorithms work well when the loads on virtual machines change only slightly. Therefore, they are not appropriate for cloud and grid computing environments, where the load on the network varies at every point in time. The second category consists of algorithms for dynamic load balancing. In these algorithms, the distribution of load among nodes changes, and they use the given information to make decisions about load distribution. The third category consists of hybrid algorithms, which combine static and dynamic algorithms and switch between them when necessary.
Many papers have addressed optimization problems [10]-[19], [24]-[26]. These papers optimize their respective problems by presenting formal methods and fitness functions. In [18], the authors presented a new distributed method for reducing the energy used for communication between nodes and the coordinator. In [24], [25], and [26], the authors applied optimization methods to meso-scale materials. Saffar Ardabili and Aghayi [20] evaluated the efficiency scores of decision-making units (DMUs) in the presence of undesirable outputs. Aghayi et al. [22] measured efficiency using a common set of weights in the presence of uncertainty, based on robust optimization. Aghayi [23] proposed an approach to obtain the cost efficiency of DMUs with fuzzy data. Aghayi and Maleki [21] measured the efficiency of bank branches in Ardabil, Iran, using robust optimization theory and undesirable outputs. Rostamy-Malkhalifeh and Aghayi [27] suggested a method for calculating overall profit efficiency with uncertainty modeled as fuzzy data. Aghayi and Ghelejbeigi [28] presented an improvement of cost efficiency based on resource allocation. Aghayi [29] computed the revenue efficiency of DMUs with undesirable and fuzzy data. Salehpour and Aghayi [30] calculated the maximum revenue efficiency under price uncertainty.
In this study, a method for cloud partitioning is proposed that is also used to investigate the load on systems in heterogeneous environments, with the aim of reducing the time needed for scheduling and other tasks. The proposed method is dynamic. In this method, the load balancing algorithm can represent each resource according to its capabilities and accessibility to task services, which enhances the efficiency of the cloud system. To balance the load, a load balancing technique for cloud computing environments inspired by the HBB-LB algorithm (Babu & Krishna, 2013 [1]) is used.
The rest of this paper is organized as follows: in Section II, prevalent scheduling algorithms are introduced. Section III describes the proposed scheduling algorithm, Section IV details the simulation used to test it and the results, and Section V contains the conclusions of this study and recommendations for future work in the area.

II. REVIEW OF LOAD BALANCING ALGORITHMS
One method for load balancing involves the efficient use of virtual machines. It was proposed by Domanal and Reddy (2013) [2], and distributes load across accessible virtual machines to guarantee stable use of resources or virtual machines in the cloud system. This is in contrast to active load balancing, which involves the proper distribution of load within the system to solve the problem of inefficient use of virtual machines in other algorithms. However, the service time following load balancing has not yet been studied.

In 2013, Babu and Krishna [1] proposed a load balancing algorithm inspired by the food-finding behavior of honey bees, which is used on the Web. The aim is to reach a balanced load within virtual machines by maximizing capability. Moreover, it balances the prioritization of tasks in virtual machines such that the waiting times of tasks in the queue are minimized. However, this algorithm is impractical for dependent tasks.

Ren, Lan, and Yin (2012) [5] proposed a dynamic load balancing algorithm based on the migration of virtual machines within the cloud computing environment. This algorithm contains a unit to monitor excessive loads, one for diagnosis, and a unit for load scheduling. The load monitoring unit collects the load information pertaining to a group of virtual machines and the resource server (calculating the load and updating it). The database information for this algorithm is collected according to a trigger strategy based on fractal methods. It determines the time of migration from an overloaded virtual machine in the system. In this method, operating capability is maximized by using unemployed nodes in the system. Moreover, the overhead of load balancing is minimal. However, in this method, only the load is studied.

TeraScaler ELB was proposed by Wu et al. (2013) [7] based on the prediction of elastic load balancing for resource management in cloud computing. In this algorithm, virtual machines are added or removed according to the analysis and prediction of the given load and its history. The algorithm of ELB resource management is regularly implemented through two events:
• The load balancer regularly collects resource information from the back-end server.
• The load balancer determines whether there is a request to remove the back-end virtual machine according to the collected resource information from the back-end server.
In 2012, Nishant et al. [4] proposed an algorithm for distributing workloads among the nodes of a cloud using ant colony optimization. According to this study, ants can move in two directions: forward and backward. In the forward direction, if an ant faces an overloaded node, it moves forward. In the backward direction, if an ant faces an overloaded node after having faced an under-loaded node, it moves backward. The main duty of ants is to redistribute tasks. In this approach, the ants repeatedly update their pheromones during all their moves. They also identify the tasks of nodes and find their way among different types of nodes.

In 2013, Xu, Pang, and Fu [8] proposed a load balancing model based on cloud partitioning for public clouds in different geographical locations. This method renders load balancing easier in extremely large and complicated environments. Clouds have a main controller to choose the proper sectors for tasks, and the balancer chooses the best strategy for load balancing per sector of the cloud. Sectors can be unemployed, normal, or overloaded. The load balancer of a sector decides how to assign tasks to nodes of normal or unemployed sectors. In this algorithm, the features of virtual machines are not considered.

Soni and Kalra (2014) [6] proposed a central load balancer to balance loads among virtual machines in a cloud data center. This algorithm distributes the load among heterogeneous virtual machines based on hardware configuration and their states in the cloud data center. This method can balance the load quickly and reliably in cloud computing environments by using all virtual machines based on their computing capacities. The central load balancer communicates with all users and virtual machines present in the cloud data center through a data center controller, which also analyzes the table of values containing the identifiers, states, and priorities of virtual machines. It searches for the virtual machine with the highest priority to allocate user requests. The data center controller allocates the requests to the identifier of the virtual machine as presented by the central load balancer.
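The table lookup performed by the central load balancer described above can be illustrated with a minimal sketch. It assumes that a state table row holds an identifier, an availability flag, and a numeric priority, and that a larger number means a higher priority; the names `VmEntry` and `pick_vm` are hypothetical and do not come from Soni and Kalra [6].

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VmEntry:
    """One row of the (assumed) state table: identifier, state, priority."""
    vm_id: int
    available: bool   # assumed encoding of the VM state
    priority: float   # assumed: larger value = higher priority


def pick_vm(table: List[VmEntry]) -> Optional[int]:
    """Return the id of the available VM with the highest priority, or None."""
    candidates = [vm for vm in table if vm.available]
    if not candidates:
        return None
    return max(candidates, key=lambda vm: vm.priority).vm_id
```

A busy virtual machine is skipped even when its priority is the highest, which matches the description that both state and priority are consulted.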
By studying research on the load balancing of tasks and resources, it can be concluded that more research has been conducted on load balancing in heterogeneous environments. Existing algorithms have some drawbacks in the cloud setting, and this study attempts to alleviate some of them with solutions for responding to requests quickly and managing virtual machines properly.

III. PROPOSED METHOD
In this method, load balancing in cloud computing environments, inspired by the algorithm that mirrors the food-finding behavior of honey bees (HBB-LB), is used. This algorithm not only balances load, but also considers the priority of tasks removed from virtual machines due to overload. Load balancing techniques are effective for reducing the time needed to answer and service requests. The load balancing of non-exclusive independent tasks on virtual machines is an important aspect of task scheduling in cloud computing. The load must be distributed evenly across virtual machines so that each machine is used efficiently.
In the artificial bee colony (ABC) algorithm, several types of bees operate in a search space. The bee that is randomly chosen to search is called the scout bee. It determines the location of sources of food and nectar.

A. Scheduling System Model
Because a cloud computing environment is a large space with many users and service providers, tasks are not predictable, and the capacity of each virtual machine is different. In the proposed method, the cloud is partitioned into several sectors. When the environment is very large, balancing the load within the entire system is difficult. In this method, to balance the load in smaller sectors, the load balancing algorithm inspired by the behavior of honey bees is used.
The aim of load balancing algorithms is to balance the load among virtual machines to maximize operating capability. The proposed algorithm balances tasks on virtual machines as well as the entire cloud system, so that the waiting time of tasks in the queue is minimized.
Fig. 1 depicts the proposed method. The user presents tasks to the cloud system, which contains a data center for independent tasks. The tasks are presented to the system and, prior to execution, the processing time of each task is calculated (or the processing times of tasks are estimated through mathematical models) and the characteristics of all tasks are identified. The service provider has a controller and a state table that records the features of all virtual machines. Moreover, based on the load of each virtual machine, the controller divides the system into three general sectors (overloaded, under-loaded, and balanced sectors). If the load on a virtual machine changes while it performs tasks, it can be moved from one sector to another. Tasks are assigned based on the magnitude of loads of the virtual machines and processing times. Moreover, the indices of speed and cost are considered for the prioritization of the virtual machines. In the following, all stages of this process are discussed.

B. Mechanism of Partitioning the Cloud System
In the cloud model, the infrastructure is considered an IaaS service that provides users with virtual resources. The cloud is partitioned into several sectors. A cloud may contain a large number of nodes in different geographical locations. Partitioning the cloud leads to better use. Fig. 2 shows details of a cloud system that has been partitioned into several separate sectors. It is worth mentioning that each area can be partitioned into several sectors. Heterogeneous virtual machines are used in various areas, and are managed by a central controller. The methods of management may vary by cloud service provider. When the environment is large, partitioning based on load balancing limits the search environment for assigning tasks. The cloud has a main controller that selects appropriate sectors for input tasks. The balancer of each sector of the cloud selects the best strategy for load balancing (Xu et al., 2013) [8]. The load states of all virtual machines per data center are stored in the central controller, which handles the information for each sector, collects the information of each node, and selects the best strategy for partitioning virtual machines and assigning tasks. This information is updated repeatedly.

When a task enters the cloud, the first stage involves selecting the proper sector. To partition the data center, it is necessary to calculate the load of each virtual machine. The controller searches the sectors of the cloud and investigates the overloaded, underloaded, and balanced states. Therefore, the cloud environment is partitioned into overloaded, underloaded, and balanced sectors. The method of calculating load per virtual machine is explained in Table I. Following partitioning, the tasks are presented to the underloaded sectors. To prioritize virtual machines, speed and cost are used. It is worth mentioning that after each stage of assigning tasks, the load on the system is updated.

In this cloud system, for each sector, a processor is used to monitor and investigate the load on the virtual machines. If the magnitude of load on a virtual machine changes, it is moved to another sector. The values of $L_{VM_k,t}$ and $L_{VM_k,t+1}$ indicate the loads on virtual machine $k$ at times $t$ and $t + 1$, respectively. The load difference over a time interval may show a change in the load on a virtual machine in the overloaded or balanced sectors. When diffL is zero, there are no load changes in virtual machines and no changes in the sectors. However, if it is lower than zero, virtual machines are moved from one sector to another.

TABLE I. PSEUDO CODE USED TO DETERMINE CHANGES IN LOAD
1) diffL = $L_{VM_k,t+1} - L_{VM_k,t}$
2) If (diffL = 0) no change in the magnitude of load
3) Else if (diffL < 0) the virtual machine can be moved from one sector to another.
4) End

In Table II, the parameters used in this study are introduced.
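The check in Table I can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sector_move_allowed` and the tolerance `tol` are hypothetical, and the case of a positive difference, which the text does not describe, is assumed here to leave the virtual machine in its sector.

```python
def sector_move_allowed(load_t: float, load_t_next: float, tol: float = 1e-9) -> bool:
    """Decide whether a VM may change sector, following Table I.

    load_t and load_t_next are the loads of the same VM at times t and t + 1.
    """
    diff_l = load_t_next - load_t
    if abs(diff_l) <= tol:
        # diffL = 0: no change in the magnitude of load, no sector change.
        return False
    if diff_l < 0:
        # diffL < 0: the VM can be moved from one sector to another.
        return True
    # diffL > 0 is not covered by the text; assumed to keep the VM in place.
    return False
```

The tolerance avoids comparing floating-point loads for exact equality.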

C. Calculating Processing Time
In HBB-LB, it is hypothesized that $VM = \{VM_1, VM_2, VM_3, \ldots, VM_m\}$ is a collection of $m$ virtual machines with no links between them that operate in parallel and must process $n$ tasks. Tasks are shown as a collection $T = \{T_1, T_2, T_3, \ldots, T_n\}$. Independent tasks are non-exclusive, and are scheduled on virtual machines. The virtual machines that process the tasks are the underloaded virtual machines in the data center. Makespan is the time taken for task completion, and is shown in (1) (Babu & Krishna, 2013). The processing time of task $i$ on virtual machine $j$ is $P_{ij}$, and is calculated through (2) (Li, Xu, Zhao, Dong & Wang, 2011). The processing time of all tasks on virtual machine $j$ is $P_j$, and is obtained through (3):

$$P_j = \sum_{i=1}^{n} P_{ij}, \qquad j = 1, 2, \ldots, m \tag{3}$$

The processing times of all tasks on a virtual machine must be smaller than or equal to the maximum completion time. Therefore, by minimizing $CT_{max}$, (4) is obtained:

$$\sum_{i=1}^{n} P_{ij} \le CT_{max}, \qquad j = 1, 2, \ldots, m \tag{4}$$

From (3) and (4), (5) and (6) are obtained:

$$P_j \le CT_{max}, \qquad j = 1, 2, \ldots, m \tag{5}$$

$$CT_{max} = \max\left\{ \max_{i=1}^{n} CT_i,\; \max_{j=1}^{m} \sum_{i=1}^{n} P_{ij} \right\} \tag{6}$$
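As a small worked illustration of (3) and (6), the following sketch sums the per-task processing times on each virtual machine and then takes the larger of the longest task completion time and the largest per-machine total. The matrix layout (`p[i][j]`, with zero where task i is not placed on machine j) and the function names are assumptions made for this example only.

```python
from typing import List


def vm_processing_times(p: List[List[float]]) -> List[float]:
    """P_j = sum over tasks i of P_ij, as in (3); p[i][j] is task i on VM j."""
    n_vms = len(p[0])
    return [sum(row[j] for row in p) for j in range(n_vms)]


def ct_max(p: List[List[float]], completion_times: List[float]) -> float:
    """CT_max as in (6): max of the task completion times and the VM totals."""
    return max(max(completion_times), max(vm_processing_times(p)))


# Three tasks on two VMs: tasks 0 and 2 run on VM 0, task 1 runs on VM 1.
p_example = [[2.0, 0.0], [0.0, 3.0], [4.0, 0.0]]
ct_example = [2.0, 3.0, 6.0]  # hypothetical completion times CT_i
```

With these numbers the per-machine totals are 6.0 and 3.0, so every $P_j$ stays at or below $CT_{max} = 6.0$, as (5) requires.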

D. Calculating Capacity of Virtual Machines
The capacity of a given virtual machine and the capacity of all virtual machines are calculated through (7) and (8), respectively, following the HBB-LB algorithm. The total capacity of all virtual machines is equal to the capacity of the data center. $C_j$ is the capacity of virtual machine $j$ and $C$ is the capacity of all virtual machines. Moreover, $Pe_{num_j}$, $Pe_{mips_j}$, and $VM_{bw_j}$ respectively indicate the number of processors in $VM_j$, the millions of instructions per second of the processors of $VM_j$, and the bandwidth of $VM_j$.
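A possible reading of the capacity calculation is sketched below. The per-machine formula used here (number of processors times MIPS, plus bandwidth) is the form commonly attributed to HBB-LB and is an assumption of this sketch, as are the class and function names; only the statement that the total capacity is the sum over all virtual machines comes directly from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vm:
    pe_num: int      # number of processors in the VM
    pe_mips: float   # MIPS of each processor
    bw: float        # bandwidth of the VM


def vm_capacity(vm: Vm) -> float:
    """Assumed form of C_j: pe_num * pe_mips + bw."""
    return vm.pe_num * vm.pe_mips + vm.bw


def total_capacity(vms: List[Vm]) -> float:
    """C: the capacity of the data center, the sum of all C_j."""
    return sum(vm_capacity(vm) for vm in vms)
```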

E. Calculating Load on Virtual Machines
Load includes all tasks assigned to a virtual machine. Unbalanced load is a problem that negatively affects cloud systems. Heterogeneous and unequal distributions of load among the virtual machines of a cloud system can create this problem. As some processors may be overloaded and others unemployed, load balancing increases the efficiency of distributed systems. This happens when tasks on nodes with heavy loads are moved to other nodes for processing. The algorithms proposed for load balancing are heuristic and varied.
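The load on a virtual machine can be calculated as the number of tasks at time $t$ in the queue of virtual machine $j$ divided by the servicing speed of virtual machine $j$ at time $t$, as in (9). The magnitude of the load on all virtual machines in a data center is calculated through (10) (Babu & Krishna, 2013) [1]:

$$L_{VM_j,t} = \frac{N(T,t)}{S(VM_j,t)} \tag{9}$$

$$L = \sum_{j=1}^{m} L_{VM_j} \tag{10}$$

Here, $N(T,t)$ is the number of tasks in the queue of $VM_j$ at time $t$ and $S(VM_j,t)$ is the servicing speed of $VM_j$ at time $t$. The processing time of a virtual machine, the processing time of all virtual machines, and the standard deviation of load are calculated through (11), (12), and (13), respectively:

$$PT_j = \frac{L_{VM_j}}{C_j} \tag{11}$$

$$PT = \frac{L}{C} \tag{12}$$

$$\delta = \sqrt{\frac{1}{m} \sum_{j=1}^{m} \left(PT_j - PT\right)^2} \tag{13}$$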
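The quantities in (9) to (13) can be computed as in the sketch below. It assumes that the processing time of a virtual machine is its load divided by its capacity, that the processing time of all virtual machines is the total load divided by the total capacity, and that the standard deviation is the population form over the m machines, following HBB-LB; the function names are hypothetical.

```python
import math
from typing import List


def vm_load(n_tasks: int, service_rate: float) -> float:
    """Load of one VM: queued tasks divided by its servicing speed."""
    return n_tasks / service_rate


def load_std_dev(loads: List[float], capacities: List[float]) -> float:
    """Standard deviation of per-VM processing times around the overall PT."""
    pt = [l / c for l, c in zip(loads, capacities)]   # assumed PT_j = L_j / C_j
    pt_all = sum(loads) / sum(capacities)             # assumed PT = L / C
    m = len(pt)
    return math.sqrt(sum((x - pt_all) ** 2 for x in pt) / m)
```

Equal loads on equal machines give a deviation of zero, while a skewed distribution gives a positive value that can then be compared with the threshold.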
If $\delta$ for a virtual machine is less than or equal to the threshold $T_s$, which lies in $[0, 1]$ ($\delta \le T_s$), the system is balanced. Otherwise, it is unbalanced, and may or may not have extra load. If the magnitude of load in a virtual machine is greater than the permissible capacity, the virtual machine is overloaded and load balancing is impossible. When the load on a system is balanced, tasks are assigned to virtual machines. Table III shows the pseudo code of the load balancing algorithm:

Input: the set of tasks; the set of VMs.
While there are tasks in the list do
    for all VMs of a host do
        Calculate $C_j$ and $L_{VM_j,t}$ of every VM.
        Calculate $PT_j$ and $PT$ of every VM.
        Calculate $\delta$.
        If $\delta \le T_s$ then
            System is balanced; send task to partition.
            Calculate $Pos_j$ of every VM.
            Calculate $P_j$ and $CT_i$ of every task.
        …
        If $L >$ maximum capacity
            Load balancing is not possible.
        …
    end for
end while
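The decision logic of Table III, together with the priority index $Pos_j$ defined in Section III-F, can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: the enum, the function names, the coefficient names `alpha` and `beta`, and the rule that the virtual machine with the largest $Pos_j$ is chosen are all hypothetical.

```python
from enum import Enum
from typing import Dict, Tuple


class State(Enum):
    BALANCED = "balanced"
    UNBALANCED = "unbalanced"
    OVERLOADED = "overloaded"   # load balancing is not possible


def classify(delta: float, threshold: float, load: float, max_capacity: float) -> State:
    """Apply the two tests of Table III: delta <= Ts, then L > maximum capacity."""
    if delta <= threshold:
        return State.BALANCED
    if load > max_capacity:
        return State.OVERLOADED
    return State.UNBALANCED


def priority(speed: float, cost: float, alpha: float, beta: float) -> float:
    """Pos_j as a weighted sum of the speed and cost indices."""
    return alpha * speed + beta * cost


def pick_vm(vms: Dict[str, Tuple[float, float]], alpha: float, beta: float) -> str:
    """Assumed selection rule: the VM with the largest Pos_j receives the task."""
    return max(vms, key=lambda k: priority(*vms[k], alpha, beta))
```

With a negative cost coefficient, a slightly slower but much cheaper virtual machine can outrank a faster, more expensive one, which is the trade-off the two indices are meant to express.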

F. Calculating Priority of Virtual Machines
Following partitioning, the priority of underloaded virtual machines is calculated through (14). The priority per virtual machine is set according to speed and cost. $Pos$, $Speed$, and $Cost$, respectively, indicate the priority per node in the virtual machine, the speed per node in the virtual machine, and the cost per node in it. The coefficients $\alpha$ and $\beta$ indicate the importance of the speed and cost indices, respectively. As users assign varying priorities to the two indices, the values of $\alpha$ and $\beta$ change in the interval $[-1, 1]$.

$$Pos_j = \alpha \cdot Speed_j + \beta \cdot Cost_j \tag{14}$$

IV. EVALUATION AND RESULTS OF SIMULATION
The accuracy of the proposed method was tested in a CloudSim simulation, and the performance of the HBB-LBP algorithm was assessed in comparison with the HBB-LB, FCFS, and WRR algorithms. For the simulation, a data center and four groups of 10, 20, 30, and 40 tasks were used.
Fig. 3 shows the average time for task completion before and after load balancing in the HBB-LBP algorithm (the proposed algorithm) with different numbers of tasks. The x-axis shows the number of tasks and the y-axis their completion times in seconds. The completion times of tasks decreased after load balancing, and the algorithm performed better as the number of tasks increased. Fig. 4 shows a comparison between the completion times of tasks for HBB-LBP, HBB-LB, FCFS, and WRR. The x-axis shows the number of tasks and the y-axis their completion times. According to the results, the HBB-LBP algorithm was the most efficient. Fig. 5 shows a comparison between the average response times of HBB-LBP, HBB-LB, FCFS, and WRR. In general, the HBB-LBP method reduced the completion times of tasks and their response times in comparison with the three other algorithms. According to Table V, it reduced completion times by 9.02%, 44.08%, and 51.97% compared with HBB-LB, WRR, and FCFS, respectively. Moreover, HBB-LBP reduced response time by 13.80%, 44.34%, and 63.94% in comparison with HBB-LB, WRR, and FCFS, respectively.

V. CONCLUSION
In this study, a method to partition a cloud system and investigate system load in heterogeneous environments was proposed. In the proposed model, the cloud is partitioned into several sectors and a load balancing method is used in the smaller sectors. This was inspired by the food-finding behavior of honey bees. This load balancing algorithm can consider each resource based on its capabilities and accessibility to task service providers, which increases the efficiency of the cloud system. Moreover, the indices of speed and cost are considered for the prioritization of virtual machines. Therefore, the proposed algorithm selects efficient resources for performing tasks based on these indices and the amount of load on the resources, which minimizes the time needed to service all tasks and balances system load.
For future research, the effect of fixed cost as well as time can be evaluated, and a pricing model for user payments to cloud service providers can be organized. Moreover, cloud system modeling can be accomplished through reliable models and a hierarchical scheduler that considers the reliability of an application. This kind of load balancing can be expanded for independent tasks, and the algorithm can be improved to include other factors pertaining to service quality. Moreover, the magnitude of load within the entire cloud system can be investigated, and migration can be used to distribute load between underloaded and overloaded resources.

Fig. 3. Completion times of tasks before and after load balancing in the proposed algorithm.

Fig. 4. Comparison among the completion times of tasks in the algorithms.

Fig. 5. Comparison among average times of response by the algorithms.

TABLE II. DEFINITIONS OF THE PARAMETERS IN THE EQUATIONS

TABLE III. PSEUDOCODE OF LOAD BALANCING ALGORITHM

Input: the set of tasks; the set of VMs.
While there are tasks in the list do
Table IV lists the values of the simulation.

TABLE IV. VALUES OF SIMULATION

TABLE V. INDICES USED TO ASSESS THE PROPOSED ALGORITHM (HBB-LBP) AND THE PERCENTAGE OF IMPROVEMENT