Multi-level Hierarchical Controller Assisted Task Scheduling and Resource Allocation in Large Cloud Infrastructures

The rapid emergence of cloud computing technologies has pushed academia and industry to pursue Quality-of-Service (QoS) oriented solutions that ensure optimal network performance in terms of both Service Level Agreement (SLA) provision and energy efficiency. Most existing solutions employ Virtual Machine (VM) migration to perform dynamic resource allocation, but they fail to address the key problem of SLA-sensitive scheduling, which demands timely and reliable task migration. VM consolidation can help achieve energy efficiency along with dynamic resource allocation; however, classical heuristic methods, which are often criticized for local minima and premature convergence, do not guarantee an optimal solution, especially over large cloud infrastructures. Motivated by these problems, this paper develops a robust, improved metaheuristic model based on the Ant Colony System to achieve task scheduling and resource allocation. CloudSim-based simulation over different PlanetLab cloud traces showed that the proposed task-scheduling model outperforms other heuristic variants in terms of negligible SLA violation, minimum downtime, minimum energy consumption and higher number of migrations, which makes it suitable for realistic cloud computing purposes.

Keywords: Task-scheduling; VM-migration; improved ant colony system; SLA assurance; energy-efficient consolidation


I. INTRODUCTION
In the last few years, the rapid rise of advanced software systems and decentralized computing environments has broadened the horizon for a new paradigm named cloud computing. Cloud computing has emerged as a technology that serves decentralized, scalable services to a very large number of users for data-driven and query-driven computation and information services. It can be characterized as an array of network-enabled services facilitating quality-of-service (QoS) assured, scalable and personalized computing solutions at low cost [1][2][3]. The ability to serve decentralized data or computing infrastructure independent of geographical boundaries makes cloud computing essential for meeting contemporary and next-generation industrial and personal computing demands [2]. Cloud computing has been applied as a key technology to serve civic purposes, the financial sector, industries, government agencies, the scientific community, diverse business houses, etc. To serve these stakeholders, cloud services are classified into three key types: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Irrespective of the service type, fulfilling QoS in cloud computing has always remained a challenge. To meet these service demands, the industry must provide decentralized storage infrastructure, often called data centers; however, with the exponential rise in computing demands and non-linear demand or usage patterns, existing solutions often suffer disrupted performance or connectivity, which in turn impacts overall QoS. Typically, a Service Level Agreement (SLA) delivered by a Cloud Service Provider commits it to QoS support for its customers while maintaining reliable services with high scalability, reliability and continuity over operating periods [4,5].
Retaining SLA under highly dynamic load demands and usage patterns across a very large, globally distributed user base is a challenging task.
A cloud infrastructure mainly encompasses physical machines (also called servers or hosts), virtual machines (VMs) and allied controllers. The hosts primarily provide computing capability and memory, while VMs function as containers holding different independent tasks. A large cloud infrastructure may consist of multiple hosts, where each host can have multiple VMs carrying different parallel-computing tasks. Due to the dynamism in the resource demands of each task, a VM might experience an exceedingly large resource demand that cannot be met by the physical machine or host to which it is currently attached. In such a case, the VM carrying multiple tasks must be migrated to a suitable host that can provide sufficient resources to the associated tasks for SLA assurance and QoS provision. However, migration may incur significant traversal time or scheduling-related delay, impacting downtime and hence overall performance. Because demand is uncertain, tasks or allied VMs may have to traverse the network according to the prevailing overloading and under-loading conditions, which can increase downtime as well as QoS violations. On the other hand, the cloud is energy-intensive and requires energy minimization; simultaneous dynamic resource allocation, task scheduling and energy minimization therefore turn out to be a complex NP-hard problem [1][2][3][4][5]. Given the heterogeneous demand types in the cloud, the load on each VM might vary with task types and demand density over the operating period. Therefore, random host selection or even the classical bin-packing models are not appropriate: such classical methods might give rise to overloading or underloading conditions, and hence can impact both SLA and energy efficiency.
In this context, this research proposes a "Multi-Level Hierarchical Controller Assisted Dynamic Task Scheduling and Resource Allocation Model for Large Cloud Infrastructures", which involves a hybrid evolutionary concept named Improved Ant Colony System (I-ACS) to achieve SLA assurance with energy efficiency. The proposed model is developed using the CloudSim platform, where simulation over PlanetLab cloud trace data revealed the superiority of the proposed model over major existing approaches in terms of downtime, SLA violation, number of migrations and energy consumption.
The remaining sections are organized as follows. Section II discusses the literature pertaining to SLA-oriented and energy-efficient task-scheduling methods. Section III discusses the proposed method, followed by Section IV, which provides results and discussion. The overall conclusion and allied inferences are presented in Section V. References are provided at the end of the manuscript.

II. RELATED WORK
Afzal et al. [6] focus on a load-balancing-based, heuristic-assisted task scheduling concept under static or dynamic load conditions. However, classical static resource allocation, which employs a first-come-first-serve method, is not suitable under dynamic load conditions. Pradhan et al. [7] discuss modifications to round-robin methods, which the authors applied to reduce waiting time. Moges et al. [8] focused on energy efficiency as the key concept for task scheduling. To reduce energy exhaustion, the authors proposed a VM consolidation concept that shuts down underutilized hosts and removes hotspots. However, the classical use of bin-packing-based consolidation could not address latency and QoS degradation issues. In addition to the power improvement, the work suggested performing consolidation scheduling in such a manner that it retains a lower task response time to meet SLA demands. To achieve this, the authors suggested a modified bin-packing-based consolidation.
Syed Arshad Ali et al. [9] implemented task scheduling using a resource-aware min-min algorithm, where task scheduling was performed on the basis of server load to minimize makespan. Mosa et al. [10], on the other hand, emphasized load balancing in the cloud by distributing the workload dynamically across a cloud infrastructure with multiple nodes. The authors applied utility functions and a GA heuristic model to optimize VM allocation, energy consumption and SLA violations. Jyothi S et al. [18] and Bhaskar R et al. [19] discussed numerous key challenges in dynamic load management in heterogeneous cloud environments. The authors proposed a heterogeneity-aware dynamic application provisioning model to reduce energy consumption in cloud environments. Doppa et al. [11] designed a self-aware framework to adjust or optimize resources and SLA. However, DVFS-based methods are not suitable for a heterogeneous cloud network with dynamic load conditions. In addition to the SLA expectations, the authors of [12-13] focused on resource allocation while maintaining lower computation and energy exhaustion. Li et al. [12] designed a directed acyclic graph (DAG) model to perform priority-bound task scheduling. In the DAG construction, the nodes represent the tasks, while the edges represent the allied messages among jobs [14][15][16]. Tang et al. [14] applied a DAG-based workflow where tasks were prioritized based on their respective sizes to perform resource allocation. Zhu et al. [17] and Jyothi et al. [18] performed task scheduling on different multiprocessing environments, which can be posed as an NP-hard optimization problem. Considering this as motivation, dynamic task scheduling and resource allocation are performed in this work by applying the concepts of a co-evolution system and a multi-population strategy to a metaheuristic method, namely ACO.

III. SYSTEM MODEL
This section describes the proposed model and its implementation, including the multi-controller assisted overload and underload detection, VM selection and the proposed Improved Ant Colony System (I-ACS) based task scheduling.
Task scheduling or allied VM migration can be initiated according to the heterogeneous task demands, and hence a controller can migrate one or multiple VMs to suitable hosts (via consolidation) while retaining SLA performance and energy efficiency. The proposed model introduces multi-layered controller units that dynamically monitor the VMs and the demands of the allied tasks in order to stochastically predict those demands; the global controller then performs scheduling in advance to avoid any SLA violation, QoS compromise or energy exhaustion. The overall proposed model encompasses four key steps:
Step-1: Hierarchical multi-layered controller assisted cloud monitoring
Step-2: Underload and overload detection
Step-3: Minimum Migration Time (MMT) oriented VM selection
Step-4: Improved ACS (I-ACS) assisted S-DTS
The details of the overall proposed model are given in the subsequent sections.
Hierarchical Multi-layer Controller assisted Cloud Monitoring. Cloud infrastructures typically accommodate a very large number of independent tasks executing on assigned VMs, and therefore experience exceedingly high demand dynamism. In other words, the different tasks connected to each VM face non-linear traffic demands and might require dynamic resources to continue their operation. Under such a scenario, a VM encompassing single or multiple tasks might exhibit non-linear resource demand, causing a host or physical machine to become underutilized or overloaded. Consequently, it might significantly impact the overall performance and SLA reliability of the system. Demand-sensitive resource allocation or task scheduling is therefore a must. To achieve it, the proposed method applies a new Hierarchical Multi-Layered Controller (HMLC) design, which monitors the demands or resource utilization pattern of each task connected to a VM. The proposed HMLC model encompasses a local controller and a global controller, designed to perform task scheduling or dynamic resource allocation so as to preserve SLA, QoS and energy efficiency. To perform task-level resource utilization assessment, the local controller (LC) measures resource utilization per VM and dynamically reports it to the global controller (GC), which makes stochastic prediction-based task-scheduling decisions in advance.
As shown in Fig. 1, the proposed local controller unit operates over each VM, accommodating multiple tasks. It acts as an autonomous VMM manager that measures resource utilization dynamically and reports it to the global controller so that tasks can be reallocated dynamically. Additionally, the proposed controller mechanism enables dynamic underload/overload detection and (proactive) avoidance. On detecting a hotspot or an overloaded PM, the local controller executes the VM selection mechanism (discussed in a subsequent section) and selects the VM to be offloaded from the overloaded host. To preserve SLA assurance for the offloaded VM and allied tasks, the proposed model introduces a new and robust dynamic VM scheduling model that is designed to ensure optimal task scheduling and allied VM migration without affecting SLA performance. To achieve it, the global controller proactively retrieves VM and host information from the local controller and, by executing the proposed I-ACS concept, schedules VM placement or migration in advance so as to keep the SLA intact. After offloading the selected VM from a host, the local controller updates the node parameters and reports them to the global controller for further decision making. To achieve SLA assurance and energy efficiency, a dynamic threshold-based underload and overload detection unit is applied first. The details are given as follows.

A. Underload and Overload Detection
To cope with dynamic resource demands and the allied scheduling tasks, the load condition of each task and its associated host is examined, which helps in identifying under-loaded and overloaded nodes in the network. To ensure SLA-sensitive and energy-efficient scheduling, once a node is detected as under-loaded, either a specific VM (including all connected tasks) or all of its VMs are offloaded and migrated to other suitable hosts. This approach not only helps in optimal resource allocation but also saves significant energy. On the other hand, on detecting an overloaded host, the proposed model offloads tasks or allied VM(s) and migrates them to another suitable host, while ensuring that the migration neither overloads the target host nor impacts SLA performance.

1) Underload detection:
A host whose load is lower than a predefined minimum workload or resource utilization is referred to as an underloaded host. In order to save energy, once a host with under-utilized resources is identified, its connected VMs or allied tasks are strategically migrated to other host(s). This scheduling or migration takes place in such a manner that it does not overload other nodes or hosts. In line with the concept of VM consolidation, once all VMs have been successfully migrated to other hosts, the source host is shut down to save energy. The task-migration or allied resource allocation strategy schedules the migration so that it causes neither SLA violation, nor energy exhaustion, nor overload on the target host. To guarantee SLA provision, the source host remains active (ON) until the target host(s) hold all of the migrated VMs and their allied tasks.
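The evacuation logic described above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the paper's implementation: host loads and VM demands are CPU fractions of identically sized hosts, `upper_threshold` is a hypothetical utilization cap for target hosts, and `plan_evacuation` is an illustrative name.

```python
def plan_evacuation(source, host_loads, vm_demands, upper_threshold=0.8):
    """Plan the evacuation of an underloaded host.

    host_loads: {host_name: current CPU load as a fraction of capacity}
    vm_demands: {vm_name: CPU demand} for the VMs running on `source`
    Returns {vm_name: target_host} if every VM fits elsewhere, else None.
    """
    # Candidate targets are all hosts except the one being evacuated.
    loads = {h: load for h, load in host_loads.items() if h != source}
    plan = {}
    # Place the largest VMs first so that they are not left without room.
    for vm, demand in sorted(vm_demands.items(), key=lambda kv: -kv[1]):
        fitting = [h for h, load in loads.items()
                   if load + demand <= upper_threshold]
        if not fitting:
            return None  # evacuation would overload a target: keep source ON
        # Prefer the most loaded host that still fits, to pack hosts tightly.
        target = max(fitting, key=lambda h: loads[h])
        loads[target] += demand
        plan[vm] = target
    return plan
```

The source host would be switched off only when a complete plan is returned and all migrations have finished; a `None` result means the host stays active.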
2) Adaptive threshold-sensitive host overload detection: To detect an overloaded host (with VMs containing independent tasks), a stochastic prediction-assisted approach is applied. In this approach, the load of each host node is assessed periodically, which eventually helps detect an overloaded node. Each host's resource (i.e., CPU or MIPS) utilization is measured to assess whether the host is overloaded. Most existing task-scheduling approaches apply a static threshold to detect an overloaded host. However, IaaS platforms often experience dynamic loads over the operating period, and different tasks consume different resources at different time instants. Therefore, a static threshold is not suitable for overload detection. Here, a dynamic CPU utilization assessment method (cumulative CPU utilization per VM over multiple independently processing tasks) is applied to perform overload detection. More specifically, in this method the CPU utilization threshold is adjusted dynamically on the basis of the changes in continuous CPU utilization. It assumes that a higher fluctuation in the usage pattern calls for a lower upper CPU utilization threshold.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 11, 2021, www.ijacsa.thesai.org
In general, a higher value of such non-linear resource utilization indicates an overloaded condition, up to 100% resource utilization. To cope with the exceedingly high dynamism in the cloud network, a hybrid concept that combines inter-service (task) relations with variation information is applied to achieve dynamic thresholding. This hybrid concept exploits task-level resource utilization and its cumulative impact as the eventual load to perform overload detection.
More specifically, the interquartile range (IQR) and a modified local regression method are applied to measure dynamic CPU utilization and eventually predict an adaptive threshold. The IQR algorithm follows a statistical dispersion approach that represents the difference between the first and the third quartiles, as depicted in equation (1). The IQR value is then used in equation (2) to obtain the upper threshold of the CPU utilization.
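As an illustration of IQR-based thresholding, the following minimal Python sketch assumes the form commonly used in the VM consolidation literature, where the upper threshold is one minus a safety factor times the IQR of recent utilization samples. The factor `safety=1.5`, the function names and the clamping to [0, 1] are assumptions made for this sketch; equations (1) and (2) define the exact form used in the paper.

```python
import statistics


def iqr_upper_threshold(cpu_history, safety=1.5):
    """Adaptive upper CPU-utilization threshold from recent samples in [0, 1]."""
    if len(cpu_history) < 4:
        raise ValueError("need at least 4 utilization samples")
    # Quartiles of the recent utilization history.
    q1, _, q3 = statistics.quantiles(cpu_history, n=4, method="inclusive")
    iqr = q3 - q1  # statistical dispersion: third minus first quartile
    # Larger fluctuation (larger IQR) gives a lower threshold.
    threshold = 1.0 - safety * iqr
    return min(1.0, max(0.0, threshold))


def is_overloaded(current_utilization, cpu_history, safety=1.5):
    """True when the current utilization exceeds the adaptive threshold."""
    return current_utilization > iqr_upper_threshold(cpu_history, safety)
```

A perfectly stable host keeps a threshold of 1.0, whereas a host whose utilization swings widely is flagged at a much lower utilization level.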
Under dynamic load conditions, fluctuations in resource utilization for the same ongoing task can significantly affect the upper threshold estimation (2). Any inaccuracy in threshold estimation might cause wrong resource allocation and allied task migration, which in turn can affect overall SLA performance. Therefore, this paper formulates a new dual-level threshold estimation model, which first applies IQR-based estimation and subsequently applies a local linear regression (LRR) method. In the proposed model, LRR fits a (utilization) trend polynomial to the preceding CPU utilization values, obtained as per (3) for each observation value.
Now, from the measured observation values, the next observation value, $\hat{g}(x_{k+1})$, is estimated. To decide whether to offload a host, the following condition is applied.
In condition (4), the tolerance parameter signifies the maximum level of tolerance of a host, and the time bound is the maximum time required to migrate a VM (containing one or multiple independently executing tasks) from the host. Classical local regression is often limited under highly dynamic value changes, and it performs poorly in the presence of outliers introduced by leptokurtic or heavy-tailed distributions. Considering this, the classical least-squares (LR) algorithm is modified with a bisquare model. LR iteratively improves an initial fit, for which the tricube weights are obtained using a Tricube Weight Function (TWF). The fitting parameters obtained at each observation $x_i$ are used to compute the fitted value $\hat{y}_i$, and the residual $\hat{\varepsilon}_i = y_i - \hat{y}_i$ is then estimated. With the estimated values of $\hat{\varepsilon}_i$ and $s$, (5) yields a factor called the robustness factor $R_i$.
Every observation value is assigned a robustness factor $R_i$. In (6), $B(\cdot)$ represents the bisquare weight function and $s$ represents the Median Absolute Deviation (MAD) used in the least-squares fitting; $B(\cdot)$ is obtained as per (6).
In (5), $s$ is obtained as per (7), i.e., $s = \operatorname{median}\,|\hat{\varepsilon}_i|$. Thus, applying (4) to the estimated trend line and the predicted next value, a host is identified as overloaded when the inequalities in (4) (between the predicted and the observed values) are satisfied. Once an overloaded host is identified, the local controller informs the global controller and meanwhile identifies the VMs to be migrated to another resource-sufficient (optimal) host node. In the literature, researchers have often selected an arbitrary VM for migration; however, for SLA-sensitive task migration, such approaches may lead to SLA violation or QoS compromise, especially due to increased downtime or even complete task or transaction failure. A distinct VM selection unit is therefore necessary. The VM selection method applied is described next.
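For the regression-based overload test, a minimal Python sketch is given below. It assumes the standard robust local regression recipe: a tricube-weighted line fit over the recent history, residuals scaled by six times their median absolute value, and bisquare re-weighting before refitting. The names `predict_next_utilization` and `is_host_overloaded`, the default `safety=1.2` and the number of robust iterations are assumptions of this sketch, and the migration-time part of condition (4) is omitted.

```python
from statistics import median


def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 for |u| < 1, otherwise 0."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def bisquare(u):
    """Bisquare weight: (1 - u^2)^2 for |u| < 1, otherwise 0."""
    u = abs(u)
    return (1 - u ** 2) ** 2 if u < 1 else 0.0


def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    if sxx == 0:
        return my, 0.0
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def predict_next_utilization(history, robust_iters=2):
    """Predict the next CPU utilization from recent samples."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least 2 utilization samples")
    xs = list(range(1, n + 1))
    # Tricube weights favour recent samples; dividing by n keeps every
    # weight strictly positive.
    base = [tricube((n - x) / n) for x in xs]
    a, b = weighted_line_fit(xs, history, base)
    for _ in range(robust_iters):
        residuals = [y - (a + b * x) for x, y in zip(xs, history)]
        s = median(abs(r) for r in residuals)  # median absolute residual
        if s == 0:
            break  # perfect fit: nothing to down-weight
        # Robustness weights shrink the influence of outlying samples.
        ws = [w0 * bisquare(r / (6 * s)) for w0, r in zip(base, residuals)]
        a, b = weighted_line_fit(xs, history, ws)
    return a + b * (n + 1)


def is_host_overloaded(history, safety=1.2):
    """Flag overload when the scaled prediction reaches full utilization."""
    return safety * predict_next_utilization(history) >= 1.0
```

A steadily rising history is flagged before the host actually saturates, while a flat history at low utilization is not.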

B. SLA Oriented Minimum Migration Time-based VM Selection
In order to preserve SLA while guaranteeing minimum downtime, the minimum migration time (MMT) based VM selection method is applied. In other words, once an overloaded host is identified, only the VM that requires the minimum migration time is migrated. Since higher downtime can lead to higher losses, lower downtime is favorable, and MMT is therefore adopted as the VM selection policy. This approach suits SLA preservation as well as reliable cloud service provision. The migration time of each task and allied VM connected to the overloaded host or PM is estimated. The VMs are then sorted by their respective migration times, and the VM with the minimum migration delay is chosen first for migration to the target host. This method enables delay-resilient migration over cloud platforms. Once the VM to be migrated is selected, the local controller passes all allied details to the global controller, which employs a robust improved ACS heuristic concept to perform VM placement or migration. Although the proposed VM migration or allied task scheduling concept resembles a VM consolidation problem, considering real-time task characteristics, the classical metaheuristic is improved not only to alleviate local minima and premature convergence but also to ensure timely and SLA-centric task migration or allied resource allocation.
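The MMT selection policy described above can be sketched as follows. This minimal Python sketch assumes that migration time is approximated by the VM's memory footprint divided by the available network bandwidth, and that VMs are removed in ascending migration-time order until the host falls back under its threshold. The `VM` fields, function names and units are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    ram_mb: float      # memory to transfer during migration (megabytes)
    cpu_share: float   # fraction of the host CPU used by this VM


def migration_time(vm, bandwidth_mbps):
    """Approximate migration time in seconds: memory size / bandwidth."""
    return vm.ram_mb * 8.0 / bandwidth_mbps


def select_vms_mmt(vms, threshold, bandwidth_mbps):
    """Pick VMs to offload from an overloaded host, shortest migration first."""
    remaining = sorted(vms, key=lambda vm: migration_time(vm, bandwidth_mbps))
    selected = []
    # Keep offloading until the remaining VMs fit under the threshold.
    while remaining and sum(vm.cpu_share for vm in remaining) > threshold:
        selected.append(remaining.pop(0))
    return selected
```

For example, with VMs of 4096 MB, 1024 MB and 2048 MB on a host at 90% utilization and a threshold of 80%, only the 1024 MB VM is selected, since removing it alone resolves the overload.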
The following section discusses the proposed I-ACS model for task-scheduling over a large Infrastructure as a Service (IaaS) cloud platform.

C. Task-Migration Problem Definition
Let the set of operating physical machines (hosts) be $H = \{h_1, h_2, \ldots, h_m\}$, where $h_j$ denotes the $j$-th host, $1 \le j \le m$. Similarly, let the set of VMs, each containing multiple autonomously operating tasks, be $V = \{v_1, v_2, \ldots, v_n\}$, where each VM is connected to a certain host.
Let $v_{ij}$ be the $i$-th VM connected to the $j$-th host, and let $x_{ij}$ be a binary variable indicating whether host $j$ carries the $i$-th VM. Let $C_j$ be the resource capacity (in terms of CPU utilization) of the $j$-th host, and let $d_i$ be the resource demanded by the $i$-th VM. The total load at a host can then be characterized as the total load caused by all VMs and allied tasks running on it. Let $T$ be the observation period, split into sub-intervals of length $\Delta t$, where each time slot $t$ represents one such sub-interval. Over $T$, the CPU utilization at a host is estimated using (8).
In (8), $u_j(t)$ refers to the CPU utilization of host $h_j$ collected over a given period. The average CPU utilization is estimated using (9).
In (9), $T/\Delta t$ states the total number of sub-intervals over the observation period. Let $P_j(t)$ be the power of the $j$-th host over time slot $t$; the power status can then be obtained from the CPU utilization value.
The energy consumed by the $j$-th host from the last time interval to the current time interval, $E_j(t)$, is estimated as per (10).
Based on the host consumption hypothesis, for any host $h_j$ with CPU utilization $u_j(t)$, the energy consumed can be obtained as per (11).
In (11), $E_{\mathrm{idle}}$ signifies the portion of energy consumed when the host $h_j$ is in the idle state, while $E_{\mathrm{full}}$ refers to the energy consumed by $h_j$ when it is fully utilized. Moreover, $u_j(t)$ presents the CPU utilization of $h_j$ over the time slot $t$. Applying this mechanism, the resource utilization is estimated dynamically for each host, and the resource demand of each task or allied VM is estimated correspondingly. The resource consumption of all active hosts, $E(t)$, from the last time interval to the current instant is estimated as per (12). The key goal of the task migration or VM allocation problem is to obtain the set of VM-host mappings in which each targeted VM is placed on a suitable host without impacting SLA performance or energy exhaustion. The resource allocation is performed in such a manner that the proposed scheduling model attains minimal resource exhaustion $E(t)$, subject to the conditions in (13). With this motive, this research develops a new improved ACS heuristic model for SLA-centric task scheduling and the allied VM or resource allocation strategy. The details of the planned I-ACS model are given in the following section.
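The energy bookkeeping described above can be illustrated with a minimal Python sketch. It assumes the widely used linear power model, in which a host draws its idle power plus a share of the remaining range proportional to CPU utilization, and it accumulates energy as power times the length of each time slot. The parameter names `p_idle`, `p_max` and `interval_s`, the units (watts, seconds, joules) and the function names are assumptions of this sketch rather than values taken from equations (8)-(12).

```python
def host_power(utilization, p_idle, p_max):
    """Linear power model: idle power plus a utilization-proportional part."""
    return p_idle + (p_max - p_idle) * utilization


def energy_consumed(utilization_samples, interval_s, p_idle, p_max):
    """Energy in joules over consecutive time slots of equal length."""
    return sum(host_power(u, p_idle, p_max) * interval_s
               for u in utilization_samples)


def total_energy(active_hosts, interval_s, p_idle, p_max):
    """Sum the energy of all active hosts; switched-off hosts are omitted."""
    return sum(energy_consumed(samples, interval_s, p_idle, p_max)
               for samples in active_hosts.values())
```

Because an idle host still draws `p_idle`, emptying and switching off an underloaded host removes its whole contribution from the total, which is what makes consolidation attractive.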

D. Improved-ACS based Task Scheduling
Unlike classical heuristic models, a hybrid ACS algorithm is applied for task scheduling or allied VM allocation. The VM scheduling model proposed in this paper is based on the well-known heuristic model named ACO, in which multiple agents estimate the solution likelihood in iterative cycles. During this process, they communicate indirectly by depositing a chemical substance called pheromone on the paths they traverse. Because the intended task scheduling or VM placement does not employ the notion of a path, in the proposed model the ants deposit pheromone on each task (or VM)-host pair within a pheromone matrix. The ants retrieve the VMs in each cycle and start forming local solutions by means of a probabilistic decision rule that describes the desirability for an ant to select a specific VM (MMT-based VM selection) as the next one to pack into its current host. The higher the amount of pheromone and heuristic information associated with a VM-host pair, the higher the probability that it will be selected for migration. Fig. 2 presents the solution formation for a single ant. In this mechanism, the ant starts with four VMs, calculates the probability for each VM using the probabilistic decision rule, and begins allocating the selected VMs to each selected host according to the estimated probabilities. Once a host is completely occupied by the migrated tasks or allied VMs, the proposed model identifies a new host on the basis of the corresponding likelihood.
This process continues until all tasks or VMs are assigned to suitable hosts. During optimal solution estimation, in each cycle or iteration, the local solutions are assessed and the one requiring the minimum number of hosts is selected as the new global best solution. Subsequently, the pheromone matrix is updated to simulate pheromone evaporation and to reinforce the VM-host pairs belonging to the best solution. To achieve this, ACS implements the pheromone update rule. In the proposed ACS-based task-migration model, I-ACS is triggered during VM assignment to the target host. It outputs a solution comprising a VM-to-host map while maintaining no (or negligible) SLA violation and maximum host shutdown (to achieve energy efficiency).
This research emphasizes optimizing classical ACO to avoid local minima and premature convergence and to enhance solution diversity for dynamic cloud resource optimization. The proposed model encompasses key optimizations such as a multi-population strategy, a co-evolution concept, dynamic pheromone update and dynamic pheromone diffusion. Such measures enable the proposed model to balance convergence rate against solution diversity, which supports better VM scheduling or placement decisions. Moreover, they enable swift computation, which is effective for large-scale mega-cloud infrastructures. To achieve this, the proposed model splits the overall optimization problem into multiple sub-problems and, to avoid local minima and premature convergence (i.e., to avoid being trapped in local optima), splits the ant population into two categories: the elite population and the normal ant population. This process not only increases computational efficiency (i.e., a higher convergence rate) but also retains swift identification of the global optimum. Additionally, a dynamic pheromone update mechanism is incorporated to enhance optimization ability over large network sizes. Similarly, pheromone diffusion is intended to let the pheromone released by ants at a certain point gradually affect a certain range of adjacent regions.
For large-scale cloud infrastructure, the concept of co-evolution (population-based as well as diffusion-based) is applied, which helps exchange local information among the various sub-populations to achieve dynamic information sharing. Each VM is treated as a component encompassing operating tasks, and the concept of co-evolution can therefore enable dynamic decisions without causing SLA violations. Thus, the implementation of the overall proposed model can ensure energy-efficient as well as QoS-oriented task scheduling across a large-scale cloud infrastructure.
Before discussing the overall proposed improved ACS model for the intended task scheduling or VM migration, the classical ACO model is given as follows.

E. Probabilistic Decision Rule
In the VM allocation strategy, ACS first defines the likelihood of an ant selecting a VM for migration to a specific host using (14).
In (14), the parameter $\tau_{i,j}$ states the pheromone-based attractiveness of migrating (attaching) VM $i$ to host $j$. Similarly, the parameter $\eta_{i,j}$ represents the VM's heuristic information. The remaining exponents, such as $\alpha$, are applied to place more emphasis either on the pheromone or on the heuristic information. Moreover, (14) involves the total number of VMs, each encompassing single or multiple tasks, that are suitable to be attached to the current host $j$. The overall utilized memory or capacity of the current host is estimated as the sum of all resources requested by the VMs connected to it. Here, the task scheduling or allied VM placement is accomplished by means of a parameter that is estimated as per (15).
Once the value of (15) is estimated, a d-dimensional demand vector is created, which is subsequently mapped to a scalar value. Here, the L1-norm is applied to perform this mapping between the VM and the host so as to make migration decisions.
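To make the decision rule concrete, the following minimal Python sketch assumes the usual ACO form, in which the probability of choosing a VM is proportional to its pheromone value raised to a power `alpha` times its heuristic value raised to a power `beta`, normalized over the feasible VMs. The heuristic shown here, the inverse L1-norm of the capacity left on the host after placing the VM, is one plausible reading of the description above; `beta`, `eps` and the function names are assumptions of this sketch.

```python
def l1_heuristic(capacity, used, demand, eps=1e-9):
    """Inverse L1-norm of the d-dimensional residual capacity after placement.

    capacity, used, demand: sequences with one entry per resource dimension.
    A tighter fit leaves less residual capacity and scores higher.
    """
    residual = sum(abs(c - (u + d)) for c, u, d in zip(capacity, used, demand))
    return 1.0 / (residual + eps)


def selection_probabilities(candidates, pheromone, heuristic,
                            alpha=1.0, beta=2.0):
    """Probability of picking each feasible VM for the current host.

    pheromone, heuristic: {vm: positive value} for the current host.
    """
    if not candidates:
        return {}
    scores = {vm: pheromone[vm] ** alpha * heuristic[vm] ** beta
              for vm in candidates}
    total = sum(scores.values())
    return {vm: score / total for vm, score in scores.items()}
```

With equal pheromone, a VM that nearly fills the host receives a much higher probability than one that leaves most of the capacity unused, which steers ants toward dense packings.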

F. Pheromone Trail Update
Once the initial solution construction is performed, the pheromone trails on all VM-host pairs are updated, which helps global solution retrieval. Classically, ACS systems apply the MAX-MIN Ant System (MMAS), and therefore only the ant with the best solution in an iteration is permitted to deposit pheromone. Thus, the pheromone update is performed as per (16).
In (16), the parameter $\rho$ plays a vital role in simulating pheromone evaporation; a higher value of $\rho$ results in an increased rate of evaporation. Additionally, a few target VM-host pairs require reinforcement, and therefore $\Delta\tau^{best}_{i,j}$ is defined as the best pheromone amount deposited in each iteration for that VM-host pair. In other words, $\Delta\tau^{best}_{i,j}$ states the amount of pheromone added or deposited to the edge $(i, j)$. In this manner, the VM-host pairs belonging to the best solution are reinforced and gain higher attraction. In ACS-based task scheduling or resource allocation, the VMs and host nodes are considered as input, along with the respective demanded resource capacity and total capacity. Furthermore, control parameters such as $\alpha$ are initialized, and the initial pheromone trail for each VM (task)-host pair is set to an initial value. nCycles represents the number of iterations. In each iteration, an ant initiates a host set and performs the solution retrieval process. With these initialized parameters, the ACS model performs task migration or VM allocation to the different suitable hosts while maintaining optimal SLA and higher energy efficiency. A snippet of the classical ACS-based task scheduling is given in Algorithm 1.
As depicted in Algorithm 1, once the best host solution is identified, the proposed global controller schedules the VM (containing task(s)) to the selected host, and this process continues until all tasks or allied VMs are assigned a suitable host to continue their respective functions. In classical ACS-based optimization methods, the ACS algorithm exploits positive feedback and parallel computing to perform optimization. However, the majority of ACS solutions suffer from local minima and premature convergence problems, especially due to the complexity of estimating the optimal control parameters.
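The update rule can be illustrated with a minimal Python sketch. It assumes the standard MAX-MIN style update: every trail first evaporates by a factor of one minus the evaporation rate, the pairs in the best solution then receive a deposit, and the result is clamped to a fixed range. Taking the deposit as the inverse of the number of hosts used, as well as the names and defaults `rho`, `tau_min` and `tau_max`, are assumptions of this sketch.

```python
def update_pheromone(tau, best_solution, rho=0.1, tau_min=0.01, tau_max=10.0):
    """Evaporate all trails and reinforce the pairs of the best solution.

    tau: {(vm, host): pheromone value}, updated in place and returned.
    best_solution: iterable of (vm, host) pairs from the best ant.
    """
    best_pairs = set(best_solution)
    hosts_used = len({host for _, host in best_pairs})
    # Fewer hosts in the best solution means a larger reward per pair.
    deposit = 1.0 / hosts_used if hosts_used else 0.0
    for pair in tau:
        value = (1.0 - rho) * tau[pair]          # evaporation
        if pair in best_pairs:
            value += deposit                      # reinforcement
        tau[pair] = min(tau_max, max(tau_min, value))  # MAX-MIN bounds
    return tau
```

Clamping keeps every VM-host pair selectable, which limits early stagnation on a single solution.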
However, a few approaches such as co-evolution, derived from the co-evolutionary phenomenon in nature, have emerged as potential alternatives to classical optimization solutions. These approaches employ decomposition and coordination to split a complex problem into multiple small but interacting optimization sub-problems. Each sub-problem is optimized separately and contributes to the eventual overall solution. Thus, a strategic implementation of a multi-population strategy along with co-evolution can improve overall performance. Combined with ACS, multi-population generation, co-evolution, an improved pheromone update and pheromone diffusion can achieve relatively better performance. Such approaches can also greatly help in avoiding local minima and premature convergence in the ACS system.

Algorithm 1 ACS-based
The intended improvement in convergence rate can significantly help in avoiding local optima, and hence more precise resource allocation can be accomplished. This approach is well suited to the large-scale task-scheduling and allied resource allocation problem in cloud (IaaS) infrastructure. In line with the ACS optimization requirements stated above, the proposed model applies a multi-generation concept that splits the complete population of ants into two broad categories: elite ants and common ants. Moreover, it introduces a new and robust pheromone update mechanism to enhance the optimization capacity of ACS for the task scheduling and allied VM migration control at hand. Subsequently, a novel pheromone diffusion model is applied that controls the pheromone released by ants at specific points, which in turn influences adjacent regions so that the solution is optimized faster.
In addition, the proposed co-evolution concept exchanges information amongst the different sub-populations for better information sharing. These enhancements are intended to achieve more efficient, faster and more accurate task-scheduling over the cloud to meet real-world cloud demands. A detailed discussion of these improvements and their implementation towards the S-DTS purpose is given in the subsequent sections.

G. Multi-Population Generation Mechanism
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 11, 2021, www.ijacsa.thesai.org
As discussed in the previous section, the classical ACS model applies merely one kind of population (i.e., ants) to retrieve new solutions. In this process, classical methods apply predefined, fixed values of the ant-colony size, convergence parameter and selection parameter to control solution estimation. However, in dynamic applications such as the cloud computing problem at hand, it is highly complex and challenging to estimate a suitable set of parameters that yields enhanced performance with a swift convergence rate. Such limitations often result in premature convergence, which makes the classical model inferior for task-scheduling in cloud infrastructure. To alleviate such problems, a multi-population concept is applied that splits the entire population of ants into two categories: elite ants and common ants. The elite ants retrieve information from the solution archive, which helps in generating solutions by means of a Gaussian kernel function assisted likelihood selection model. More specifically, the elite ants possess a set of distinct parameters that help them enhance the convergence rate. In contrast, the common ants generate new solutions at a relatively slower speed by means of a single Gaussian function. To achieve this, the common ants employ the mean value of each dimension, which helps in avoiding local optima. The proposed model applies the Gaussian function in (18) to generate common ants. In (18), the Gaussian function used for common-ant generation is defined per dimension in terms of the sample value and the obtained standard deviation, with the average value of the solutions in that dimension as its mean.
Here, a constant is employed to control the convergence rate of the common ants. The proposed model thus enables the common ants to enlarge the search space sufficiently, which eventually helps improve the global search ability.
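A hedged Python sketch of common-ant generation follows. It samples each dimension from a single Gaussian centred on the archive mean of that dimension. How the standard deviation is obtained is not recoverable from the text, so scaling the archive's population standard deviation by a constant `xi` is an assumption made here, as are the function names.

```python
import math
import random
import statistics
from typing import List, Sequence


def gaussian_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a single Gaussian with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def sample_common_ant(
    archive: Sequence[Sequence[float]],
    xi: float,
    rng: random.Random,
) -> List[float]:
    """Draw one common-ant solution, dimension by dimension."""
    dims = len(archive[0])
    ant: List[float] = []
    for d in range(dims):
        column = [solution[d] for solution in archive]
        mu = statistics.fmean(column)            # mean value of this dimension
        sigma = xi * statistics.pstdev(column)   # assumed spread, scaled by the constant xi
        ant.append(rng.gauss(mu, sigma))
    return ant
```

A larger `xi` widens the sampled region and favours exploration; a smaller `xi` concentrates the common ants around the archive mean and speeds up convergence.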

H. Multi-level Pheromone Update
In the majority of classical ACS solutions, the key challenge is the pheromone update. To alleviate this limitation, the proposed ACS solution applies two different pheromone update mechanisms: the local pheromone update and the global pheromone update. The proposed multi-level pheromone update method is described as follows:

I. Local Pheromone Update
In the proposed model, before the optimization is executed (i.e., at the first iteration), the pheromone deposited on each edge (signifying a VM-host pair) is the same constant value. The local pheromone update is executed on the edge of each visited VM-host pair once any ant completes the current iteration. Similar to the classical ACS model, it updates the local pheromone using (21).
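For reference, the local rule of the classical ACS decays the trail on a visited edge back toward its initial value. The sketch below shows only that classical rule; whether (21) matches it term by term is an assumption, and the names `local_update`, `xi` and `tau0` are illustrative.

```python
def local_update(tau_edge: float, tau0: float, xi: float) -> float:
    """Classical ACS local rule: pull a visited edge toward the initial trail tau0.

    xi in (0, 1) is the local decay coefficient (hypothetical name).
    """
    if not 0.0 < xi < 1.0:
        raise ValueError("xi must lie strictly between 0 and 1")
    return (1.0 - xi) * tau_edge + xi * tau0
```

For example, with `tau0 = 0.1` and `xi = 0.1`, an edge carrying 1.0 drops to 0.91 after one visit, which makes the same VM-host pair slightly less attractive to the next ant and encourages diversity within an iteration.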

J. Global Pheromone Update
Once all ants complete one iteration and obtain a solution set, the global pheromone update is applied to the visited nodes. Unlike the classical pheromone update model (17), the proposed model performs the pheromone update as per (22).

K. Pheromone Diffusion
In the conventional approach, the ants (agents) apply a single pheromone release mechanism. This can merely influence subsequent ants that pass the same point; it does not guide the ant search within a specific range of neighboring regions, and therefore limits overall optimization performance. Building on the multi-level pheromone update model discussed above, the pheromone diffusion concept is applied to enhance performance. The likelihood of finding superior solutions is typically higher in the neighborhood of a good solution than in other regions. Hence, pheromone diffusion lets the pheromone released by an agent at a certain point gradually influence a specific range of adjoining regions. As a result, the other ants (elite ants) avoid searching in the vicinity of a poor solution and tend to search in the neighborhood of a better solution. This improves not only time performance but also the accuracy of the selected solution in each iteration. Mathematically, the pheromone update and diffusion concept are presented as per (24-25).
In (25), the first parameter refers to the total number of estimated solutions in the current iteration, while the second refers to the guiding pheromone concentration left on the source object. The remaining parameter represents the correlation distance between the two objects.
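The diffusion idea can be illustrated with a small Python sketch. Equations (24-25) are not recoverable from the text, so the linear fall-off with distance used here, together with the names `diffuse`, `radius` and `distance`, is purely an assumption chosen to show the mechanism: pheromone released at one point also raises the trails of objects within a limited neighborhood.

```python
from typing import Callable, Dict


def diffuse(
    tau: Dict[str, float],
    source: str,
    deposit: float,
    distance: Callable[[str, str], float],
    radius: float,
) -> Dict[str, float]:
    """Spread a deposit made at `source` to every object closer than `radius`.

    The share received decreases linearly with distance (assumed kernel).
    """
    updated = dict(tau)
    for node in tau:
        d = distance(source, node)
        if d < radius:
            updated[node] += deposit * (1.0 - d / radius)
    return updated
```

With a radius of 2, an object at distance 1 from the source receives half of the deposit, and an object at distance 3 is left unchanged, so later ants are drawn toward the neighborhood of a good solution rather than only to the exact point.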

L. The Co-Evolution
Unlike classical evolutionary computing approaches, co-evolution is an improved concept that enables higher biological diversity by emphasizing the dependencies between organisms themselves and between organisms and their environment during the evolution process. Functionally, it employs evolution theory to construct a competitive or cooperative relation among two or more populations, so that optimization performance is enhanced through the interaction of multiple populations. It also focuses on exploiting the interaction amongst the different sub-populations, which influence one another and co-evolve to attain superior optimization performance. In the proposed ACS solution, the co-evolution concept is applied to realize information interaction amongst the different sub-populations and thereby yield better optimization performance. Thus, the improved ACS model described above is implemented for dynamic task scheduling and allied resource allocation. The results obtained from simulation and their inferences are discussed in the following sections.

IV. RESULTS AND DISCUSSION
Since ensuring SLA/QoS-centric task migration while preserving energy-efficiency is an NP-hard problem, a new Improved ACS model (I-ACS) for VM migration scheduling is applied. Unlike classical heuristic methods, including the conventional ACS or ACO, the proposed method applies multi-population generation with co-evolution and a dynamic pheromone update capacity. This approach is intended not only to improve overall scheduling efficiency but also to alleviate the problems of local minima and premature convergence. Together, these measures achieve SLA-sensitive and energy-efficient task scheduling in large-scale cloud infrastructure. The details of the simulation environment are given as follows.

A. Experimental Setup
To simulate the overall proposed model, the CloudSim simulation environment and allied benchmark tool are considered. The programs were developed in the Java programming language, and emulation was performed on the Java Eclipse platform. The higher scalability, ease of implementation and realistic problem realization were the reasons behind the selection of CloudSim-based simulation.
In the cloud configuration setup, each host is characterized in terms of its memory utilization and the performance of its Central Processing Unit (CPU). CPU performance is expressed in Million Instructions Per Second (MIPS), signifying the resource used or demanded by each task and the resource available on a host. Moreover, the memory (RAM) utilization and bandwidth information of each host as well as each VM are monitored continuously to ensure QoS- and SLA-oriented task scheduling.
To assess the effectiveness of the proposed task-migration or VM allocation model, multiple real-time cloud-computing traces obtained from the CoMon data project, a PlanetLab simulation benchmark (cloud trace) dataset, are used. The employed dataset comprised the cloud traffic and allied CPU utilization traces from more than 1000 VMs and allied autonomous tasks, where the different VMs were located at different locations. The considered benchmark data encompassed the cloud traces over 10 randomly selected days in March and April 2011. In the considered dataset, the CPU utilization measurement interval was fixed at five minutes. The simulated system architecture consisted of two heterogeneous server types with dual-core CPUs: an HP ProLiant ML110 G4 (Intel Xeon 3040, 2 cores at 1860 MHz, 4 GB RAM) and an HP ProLiant ML110 G5 (Intel Xeon 3075, 2 cores at 2660 MHz, 4 GB RAM), which together represent a heterogeneous cloud environment. Each server's frequency is mapped onto a MIPS specification: the HP ProLiant ML110 G4 server was mapped to 1860 MIPS and the HP ProLiant ML110 G5 server to 2660 MIPS. Each server was equipped with 1 Gbps network bandwidth. To assess the efficacy of the proposed task migration or VM allocation (i.e., resource allocation) model, performance is measured in terms of SLA violation (often called SLAV), SLA downtime, number of migrations and energy consumption. Before discussing the empirical outcomes, the different SLA-sensitive performance variables are described as follows:

B. The Cost of Task-Scheduling or VM Migration
Undeniably, the key intent behind task migration or allied VM migration is to satisfy QoS or SLA demands. This requires the proposed scheduling model to ensure minimum SLA violation (SLAV) and maximum migration with minimum downtime. Moreover, maintaining lower energy consumption has always been a dominant demand on cloud infrastructures. Typically, the SLAV or downtime probability relies primarily on factors such as the resource demand or memory expected by the different tasks operating on the VMs and the number of memory disks updated over varied execution periods. Under dynamic workload scenarios, the average performance degradation caused by downtime is nearly 10% of the overall CPU utilization. Each VM migration introduces a certain SLAV, and therefore minimizing migrations while maintaining SLA performance can be vital. However, sustaining a higher number of task migrations without causing any SLAV can also be suitable for real-world applications, and seems more realistic under resource-constrained scenarios with exceedingly high dynamism. Practically, the migration period relies on the total amount of memory used by the tasks at a certain VM and the available network bandwidth. The migration period for a specific VM can be estimated as per (26).
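As a worked illustration of this relation, the migration period can be computed as memory divided by bandwidth. The Python sketch below assumes memory in megabytes and bandwidth in megabits per second, hence the factor of 8; the function name and the 1024 MB VM size in the usage note are hypothetical.

```python
def migration_time_s(vm_memory_mb: float, bandwidth_mbps: float) -> float:
    """Estimated migration period: memory used by the VM over available bandwidth.

    vm_memory_mb   : memory in use by the VM's tasks, in megabytes
    bandwidth_mbps : available network bandwidth, in megabits per second
    """
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth must be positive")
    return (vm_memory_mb * 8.0) / bandwidth_mbps
```

For a hypothetical VM using 1024 MB over a 1 Gbps (1000 Mbps) link, the estimate is 1024 × 8 / 1000 = 8.192 s, so VMs with a smaller memory footprint are cheaper to migrate.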
In (26), the migration period is expressed in terms of the memory employed by the VM and the available bandwidth. Here, the focus is on reducing SLAV by keeping the migration time at its minimum (MMT) to avoid downtime. The overall performance degradation during the targeted task-scheduling was assessed as per (27).
In (27), the degradation term signifies the overall performance degradation during the task migration or VM allocation from one host to another. It is computed from the initial migration (start) time, the overall time spent during migration, and the overall CPU utilization of the node over that period.
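The degradation estimate can be approximated numerically as a fixed fraction of the CPU utilization accumulated over the migration window. The Python sketch below assumes the 10% fraction mentioned in the text and a rectangle-rule sum over equally spaced utilization samples; the function and parameter names are illustrative.

```python
from typing import Sequence


def migration_degradation(
    cpu_samples_mips: Sequence[float],
    interval_s: float,
    fraction: float = 0.1,
) -> float:
    """Approximate degradation as `fraction` of the CPU work done during migration.

    cpu_samples_mips : utilization samples taken from migration start to end
    interval_s       : spacing between consecutive samples, in seconds
    """
    accumulated = sum(cpu_samples_mips) * interval_s  # rectangle-rule integral
    return fraction * accumulated
```

For instance, two samples of 500 and 600 MIPS taken 5 s apart accumulate 5500 million instructions, of which 550 are counted as degradation.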

C. SLAV Metrics
Considering the SLA objective in cloud infrastructure, the performance of the proposed task scheduling or VM migration model is examined in terms of different SLAV parameters. QoS and SLA demands are commonly expressed through characteristics such as the minimum throughput and the maximum response time that the migration model must deliver. Functionally, these performance parameters change with the application demands and allied scheduling modalities. The overall SLAV is defined as the disparity between the MIPS demanded by the tasks or VMs and the MIPS actually assigned over the lifetime of the VM (28).
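One common way to express this disparity is the unmet MIPS divided by the total requested MIPS across the active VMs. The Python sketch below follows that form; normalizing by the total request is an assumption, since the exact form of (28) is not recoverable here, and the names are hypothetical.

```python
from typing import Sequence


def sla_violation(requested_mips: Sequence[float], allocated_mips: Sequence[float]) -> float:
    """Share of requested MIPS that was not allocated, summed over all VMs."""
    if len(requested_mips) != len(allocated_mips):
        raise ValueError("one allocation is needed per request")
    total_requested = sum(requested_mips)
    if total_requested == 0:
        return 0.0
    # Only shortfalls count; over-allocation does not offset another VM's deficit.
    shortfall = sum(max(0.0, r - a) for r, a in zip(requested_mips, allocated_mips))
    return shortfall / total_requested
```

If two VMs request 1000 and 2000 MIPS and receive 1000 and 1900 MIPS, the shortfall is 100 out of 3000 MIPS, i.e., about 3.3%.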
In (28), the sum runs over the total number of active VMs. This work considered MIPS information as well as CPU utilization. Here, the CPU utilization refers to the memory demands that could not be assigned when demanded. In the proposed method, two distinct SLA metrics are considered for performance analysis: the duration over which the active host nodes experienced 100% CPU utilization, called Overload Time Fraction (OTF), and the performance degradation caused by VM migrations (PDM). The values of OTF and PDM are estimated using equations (29-30).
In (29-30), the averages are taken over the total number of active hosts and the total number of active VMs, respectively. For each host, the overload term is the total time period over which that host experienced complete (i.e., 100%) resource utilization, giving rise to SLAV. For each VM, the degradation term is the performance degradation of that VM due to migration, and the overall CPU capacity demanded by the cumulative tasks at that VM also appears in (30). Since OTF and PDM each characterize SLAV distinctly, both metrics are combined into a unified performance parameter named SLAV, which is defined as (31).

$$\mathrm{SLAV} = \mathrm{OTF} \times \mathrm{PDM} \qquad (31)$$
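The combined metric can be sketched in Python as below. OTF is taken as the mean, over active hosts, of overloaded time divided by active time, and PDM as the mean, over VMs, of migration-induced degradation divided by demanded CPU capacity; these per-item ratios follow the usual definitions of such metrics and are an assumption about (29-30), as are the function names.

```python
from typing import Sequence, Tuple


def overload_time_fraction(hosts: Sequence[Tuple[float, float]]) -> float:
    """hosts: (time at 100% utilization, total active time) per active host."""
    if not hosts:
        raise ValueError("at least one active host is required")
    return sum(overloaded / active for overloaded, active in hosts) / len(hosts)


def perf_degradation_migration(vms: Sequence[Tuple[float, float]]) -> float:
    """vms: (degradation due to migration, total CPU demanded) per VM."""
    if not vms:
        raise ValueError("at least one VM is required")
    return sum(degraded / demanded for degraded, demanded in vms) / len(vms)


def slav(hosts: Sequence[Tuple[float, float]], vms: Sequence[Tuple[float, float]]) -> float:
    """Unified SLA-violation metric: product of OTF and PDM."""
    return overload_time_fraction(hosts) * perf_degradation_migration(vms)
```

With two hosts overloaded for 30 s and 0 s out of 300 s each (OTF = 0.05) and two VMs degraded by 10 and 30 units out of 1000 demanded (PDM = 0.02), the combined SLAV is 0.001.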
A detailed discussion of the simulated performance outcomes in terms of the above SLA performance metrics, downtime and energy is given as follows. In major existing studies such as [1][2][3][4][5], the authors focused on assessing resource scheduling performance with parameters such as makespan and scheduling time; however, they did not assess whether their approach satisfies the SLA. Beyond makespan or scheduling time, a real-world cloud infrastructure, especially IaaS, often demands minimum or even negligible downtime and SLAV. Moreover, assessing suitability in terms of energy is equally significant. Therefore, in this research the performance of the proposed system is examined in terms of the following parameters: (1) number of VM migrations, (2) SLA violation (SLAV), (3) SLA performance degradation, (4) SLA violation per active host, (5) host shut-down, and (6) energy consumption.
Amongst the above stated performance metrics, 2, 3 and 4 represent the robustness of the scheduling methods towards SLA assurance or QoS. Metrics 1 and 5 represent the scalability of the proposed cloud model, while 7 indicates swiftness. Metrics 1, 3 and 5 are, however, highly interdependent. Similarly, the 6th performance metric indicates the energy efficiency of the proposed model. For an SLA-oriented solution, a task scheduler must sustain a greater number of migrations while maintaining negligible SLAV, SLAV per active host and scheduling time. In addition, a higher number of active hosts shut down indicates the energy-conservation ability of the proposed model. The performance of the proposed I-ACS model is also compared with other recent approaches, although these methods reported their performance in different terms such as makespan or time over varying tasks. Their scheduling methods are taken as the foundation, and task migration is performed under the hypothesis that each VM carries a single operating task, so that task migration can be realized as a classical VM-consolidation or migration problem. With this hypothesis, three existing approaches, described in [2], [3] and [4], are implemented.
Velliangiri et al. [2] focused on improving a heuristic model to achieve better performance and to avoid local minima and premature convergence. In this regard, the authors [2] designed a Hybrid Electro Search with GA (HESGA) algorithm for task-scheduling. To achieve better performance, the authors applied GA to obtain the local optimal solution, while the Electro Search algorithm was applied to improve the global optimal solution. However, the authors did not address the dynamism of resource demands under uncertain, heterogeneous (dynamic) clouds. Unlike [2], where the authors applied static threshold-based hotspot detection, the proposed work applies an IQR-LRR based stochastic prediction concept for overloading detection to cope with the exceedingly dynamic cloud environment; this enabled timely task-scheduling and hence preserved SLA performance. Recently, Liu et al. [3] proposed an improved GA-based collaborative scheduling concept for cloud infrastructure. With the same intent as [2] and the proposed I-ACS model, the authors [3] targeted avoiding local minima and convergence problems for better scheduling.
Xiang et al. [4] recently proposed the Greedy-ACO algorithm for workflow scheduling in heterogeneous cloud environments. A large body of existing literature discusses heuristic-based task scheduling, VM consolidation and VM migration; however, these three recent methods are considered because they not only perform task-scheduling but also address drawbacks of the major existing methods, such as local minima and premature convergence.
Recalling that the considered cloud traces or benchmark data were taken from PlanetLab datasets, the proposed model (as well as the existing methods [2][3][4]) was simulated over multiple test instances. To generalize the performance over these test instances or cases, the average performance is considered. The outputs obtained in terms of the different SLA metrics are given as follows.
Fig. 3 presents the number of VM migrations by the different techniques. The proposed I-ACS model shows a higher number of task migrations, indicating superior scalability. This is further confirmed by the minimum SLA violation and downtime depicted in Fig. 4 to Fig. 6. The literature hypothesizes that maintaining a lower number of migrations avoids any likelihood of SLAV; the proposed model shows the contrary, affirming that superior SLA performance can be achieved even with a higher number of migrations. Since each VM was considered as one autonomously operating task in the proposed model, scheduling a larger number of tasks shows the superior scalability of the proposed method. It affirms the robustness of the proposed model for realistic mega data center applications.
Fig. 4 presents the SLA violation, here called SLAV. The proposed I-ACS model performs better than the other existing approaches, and far better than the classical ACO algorithm. This performance enhancement can be attributed to the multiple-generation, dynamic pheromone update and co-evolution concepts. Statistically, the I-ACS model exhibited almost 0.03% SLA violation, which shows its robustness. A similar performance was observed in terms of SLA performance degradation per host (Fig. 5). As depicted in Fig. 5, the proposed method outperforms the other heuristic-based scheduling methods.
The HESGA [2] and improved GA [3] algorithms were developed with an aim similar to the proposed I-ACS concept, with the key focus on alleviating local minima and premature convergence, and hence these approaches showed better performance than the classical ACO-based scheduling. However, these methods [2][3], due to the lack of adaptive overloading or hotspot detection and dynamic scheduling (performed using multiple controller-based systems), were found inferior to the proposed model. A similar performance was found in SLA per active host (Fig. 6). Observing the overall performance, the proposed multi-controller assisted I-ACS based task-scheduling model achieves better SLA performance and eventual QoS to meet major cloud computing demands. In terms of execution time, Fig. 6 reveals that the proposed I-ACS model is superior in terms of the SLA time per active host (seconds), signifying very small or near-tolerable downtime. The comparative outcomes also reveal that the proposed model shows almost 18% lower downtime than the other heuristic-based approaches.
Regarding the number of hosts shut down, Fig. 7 reveals that the proposed I-ACS based task-scheduling model exhibits a higher number of host shut-downs, signifying better energy-efficiency and optimal resource utilization. Fig. 8 affirms this: the proposed I-ACS model exhibited almost 8% lower energy consumption than the classical ACO-based scheduling. In Fig. 8, the energy consumption of the GA variants is relatively higher. This could be because of the predefined stopping criterion (200 generations), which may have required more computation time and hence higher energy consumption. Thus, considering the overall performance outputs, the proposed I-ACS based model achieves superior performance over other existing (recent) heuristic-based task-scheduling systems or resource allocation (i.e., VM migration) methods. The overall research conclusion and its related inferences are given in the subsequent section.

V. CONCLUSION
The research work primarily focused on improving task-scheduling and allied dynamic resource allocation to meet SLA-centric cloud services. To meet contemporary as well as future demands including QoS, SLA assurance and energy-efficiency, the proposed work introduced multiple enhancements at different levels of computation. The proposed model applied multi-controller strategies, where the use of local controllers enabled task-level resource utilization assessment and stochastic prediction-based overloading or underloading detection, avoiding possible downtime. The proposed local controller applied a minimum migration time based VM selection strategy that greatly helped timely task-migration scheduling. Finally, exploiting the task and possible target host information, the proposed model employs an I-ACS model based on improved multi-population generation, adaptive or dynamic pheromone update and co-evolution, which performs dynamic task migration or allied resource scheduling. The overall proposed I-ACS model not only enabled superior task migration but also avoided possible local minima and convergence problems. As a result, the proposed solution exhibited superior performance in terms of minimum SLA violation, minimum downtime, lower energy consumption and a higher number of task migrations.