Intrusion Detection and Prevention Systems as a Service in Cloud-based Environment

Intrusion Detection and Prevention Systems (IDPSs) are typically standalone, complex hardware appliances that are expensive to purchase, change, and manage. The emergence of Network Function Virtualization (NFV) and Software Defined Networking (SDN) mitigates these challenges by delivering middlebox functions as virtual instances. Moreover, cloud computing has become a very cost-effective model for sharing large-scale services in recent years. Features such as portability, isolation, live migration, and customizability of virtual machines for high-performance computing have attracted enterprise customers to move their in-house IT data centers to the cloud. In this paper, we formulate the Intrusion Detection and Prevention System Placement (IDPSP) problem and introduce a model called the Incremental Mobile Facility Location Problem (IMFLP) to study it. Moreover, we propose a novel and efficient solution called Adaptive Facility Location (AFL) to solve the optimization problem introduced in the IMFLP model. The effectiveness of our solution is evaluated through realistic simulation studies and compared with other popular online facility location algorithms.


I. INTRODUCTION
Cloud computing has become a cost-effective model for sharing large-scale services in recent years. Its success is due to the attractive features offered by the underlying virtualization concept, including portability, isolation, live migration, and customizability of virtual machines. Popular examples of cloud-based services are Microsoft Azure, Google App Engine, and Amazon Elastic Compute Cloud (EC2). Cloud services are generally categorized into three areas: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). In SaaS, a third-party provider hosts the customer's applications over the Internet (e.g., Rackspace and SAP Business ByDesign). In the PaaS model, both hardware and software are provided and hosted by a third party (e.g., Google App Engine and Microsoft Windows Azure). Finally, IaaS refers to providing virtualized computing resources, usually in terms of VMs (e.g., Amazon EC2, GoGrid, and Flexiscale).
Intrusion Detection and Prevention Systems are an essential defensive measure against a range of attacks [44,47]. In an enterprise networked system, IDPSs examine packets sent over the network, trigger alerts when malicious content is discovered, and defend against attacks when prevention mode is active. Most security issues in cloud systems are inherited from current enterprise networks [34]. Traditional distributed IDPSs are the best practice for providing security in large-scale networks. However, the deployment of distributed IDPSs in cloud systems raises many challenges due to the diversity of cloud services and the complexity of the cloud infrastructure [43].
Network Functions Virtualization (NFV) [1] [2] promises a reprieve from the vertically integrated hardware middlebox model followed for decades, by advocating the use of software Network Functions (NFs) running on commodity hardware. This means reduced acquisition and operational costs, flexible programmability, and easier management [31] [42]. Another, orthogonal idea is Software Defined Networking (SDN), which advocates flexible programmability in the network. This is done by separating the control plane from the data plane and centralizing the logical control of the network. SDN simplifies the overall management of the network by allowing deeper programmability of the networking devices. Leveraging SDN in environments where NFV is used can lead to several interesting use cases. The high-precision control of forwarding elements (switches) provided by SDN can be used to orchestrate traffic patterns between various appliances and NFs across a data center [22]. In recent years, the cloud has become a mature platform for deploying scalable and cost-effective services. With huge growth forecasts, the public cloud industry has grown to become a multi-billion dollar industry [6]. Combining the agility of the cloud with the flexibility of Virtualized Network Functions (VNFs) and the fine-grained control of SDN can bring about a new class of cloud-based services for IDPSs [13].
In this paper, we introduce a model in which infrastructure providers support Virtual Intrusion Detection and Prevention Systems (IDPSs) as a Service (IDPSaaS) by leveraging NFV, SDN, and the cloud. IDPSaaS services can be enabled or disabled for a tenant's Virtual Machines (VMs) on demand and can be scaled up or down to cope with their service workloads. Moreover, the deployment of multiple IDPS instances motivates an interesting challenge, which we call the Intrusion Detection and Prevention Systems Placement problem (IDPSP). In order to study the IDPSP problem, we propose the Incremental Mobile Facility Location Problem (IMFLP), based on the online facility location problem. IMFLP takes into account online actions, such as live migrations in the cloud, which are ignored in almost all of the existing models [21]. To the best of our knowledge, this is the first time that the online version of the facility location problem has been used to study the placement of IDPSs. Furthermore, we present an efficient solution, called Adaptive Facility Location (AFL), for the optimization problem defined in this model. By employing online actions, such as migrations and switches, this solution adjusts the placement of IDPS instances to adapt efficiently to changes in service demands. The effectiveness of our solution is evaluated through realistic simulation studies and empirically compared with several popular online facility location algorithms.
The remainder of this paper is organized as follows. In section II, we formulate the IDPSP problem and present the IMFLP model for studying this problem. We present AFL in section III and conduct experiments to evaluate this algorithm in section IV. The related works are discussed in section V. Finally, we conclude and discuss future work in section VI.

II. PROBLEM FORMULATION
As mentioned before, the placement module receives an event for the arrival or departure of a demand and, using the information and functions supported by the management module, adjusts the placement of facilities. In section II-A, we introduce the Intrusion Detection and Prevention System Placement problem (IDPSP). In section II-B, we formally define our model of the facility location problem, which can be used for modeling the IDPSP problem.

A. Intrusion Detection and Prevention System Placement Problem (IDPSP)
Without loss of generality, we introduce this problem through an example. Suppose that an infrastructure provider offers an IDPSaaS service. From the client's point of view, her VMs can be installed at any time, and the IDPSaaS service can be requested and enabled for her VMs at any moment. Moreover, VMs differ and place various service workloads on the IDPS instances (IDPSInsts). Let us call each unit of a VM's workload a demand. Thus, we can view the problem as dynamic demands that should be served by multiple IDPSInsts.
From the viewpoint of the infrastructure provider, enabling this service incurs certain installation, operational, and management costs. The installation cost includes the cost of the resources consumed on the host machine on which an IDPSInst is installed, and the cost of certain messages between the controller and the host. In our system, all IDPSInsts are identical, and therefore the installation cost is the same for all IDPSInsts. The operational cost consists of the traffic processing delay cost and the cost of steering the traffic to the IDPSInst and then to the destination VM. It can be shown that the cost of steering the traffic is related to the distance between the IDPSInst and the VM. Finally, the management cost includes the cost of certain statistics collection and synchronization messages between the controller and the IDPSInsts. The management cost is related to the cost of the shortest path between the controller(s) and the IDPSInst. Optimizing the management cost is similar to the placement of SDN controllers [8] [29], and is outside the scope of the current paper.
Considering Figure 1, suppose that a VM exists on host a. As illustrated in Figure 1(a), when there is no IDPSInst enabled (the service-less case), the Internet traffic travels the shortest path from the core switch r to the host a through an intermediate switch m. Let d(r, a) represent the cost of the shortest path between r and a. In the service-less case, the cost of traffic traversal is d(r, a) = d(r, m) + d(m, a). On the other hand, as shown in Figure 1(b), when the IDPSInst is installed on a host b (the IDPSInst-enabled case), extra costs are paid. A certain amount of b's resources is allocated to the IDPSInst, and certain control messages are exchanged between the controller and host b (the installation cost). This installation cost is independent of where the IDPSInsts are located and depends only on the number of IDPSInsts. Moreover, the IDPSInst adds a certain processing delay t, and the traffic travels a longer path (the operational cost). The delay t is independent of where the IDPSInst is placed and depends on how much traffic is assigned to it. Additionally, the traffic is steered from core switch r to host b, and from host b to host a. In this case, the cost of the traffic steering is d(r, b) + d(b, a), which exceeds the service-less cost by d(a, b), the cost of the shortest path between the two hosts.
There is another dimension that makes the problem even more complicated. Assignments of demands to IDPSInsts are not irrevocable decisions, and demands can be reassigned to other IDPSInsts. However, these reassignments are not free of charge: they are associated with certain costs related to routing reconfiguration and to transferring the source IDPSInst's internal state to the destination IDPSInst [22]. Furthermore, as more demands are assigned to an IDPSInst over time, this IDPSInst can migrate to another location in order to minimize its distance to the VMs and subsequently reduce the operational costs; however, migrations are not free either and are associated with a certain cost. Any model describing this problem must consider the dynamic nature of the problem, the optimization of the installation and operational costs of the IDPSInsts, and the possibility of assignment switches and IDPSInst migrations.
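The steering-cost argument of the Figure 1 example can be checked numerically. The sketch below assumes a hypothetical four-node topology (core switch r, intermediate switch m, hosts a and b) with unit link costs; the helper name `shortest_path_cost` and the link weights are illustrative assumptions, not part of the original system.

```python
import heapq

def shortest_path_cost(graph, src, dst):
    """Dijkstra on an undirected weighted graph given as {node: {neighbor: weight}}."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            return cost
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < dist.get(nxt, float("inf")):
                dist[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return float("inf")

# Hypothetical topology (assumption): r - m, with hosts a and b both attached to m.
graph = {
    "r": {"m": 1.0},
    "m": {"r": 1.0, "a": 1.0, "b": 1.0},
    "a": {"m": 1.0},
    "b": {"m": 1.0},
}

def d(p, q):
    return shortest_path_cost(graph, p, q)

service_less = d("r", "a")            # d(r, m) + d(m, a)
with_idps = d("r", "b") + d("b", "a")  # traffic steered through the IDPSInst on b
extra = with_idps - service_less       # expected to equal d(a, b)
```

With hosts a and b at the same level and a symmetric metric, `extra` coincides with `d("a", "b")`, which is why the operational cost can be expressed through the distance between the IDPSInst and the VM.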

B. Incremental Mobile Facility Location Model
In this section, we introduce a new model of the facility location problem, called the Incremental Mobile Facility Location Problem (IMFLP), to study the IDPSP problem. Before describing our model, we briefly explain why a new model of this problem needs to be formulated. The details of other existing models are discussed in section V-B.
The offline model of the facility location problem has been studied comprehensively in the literature [15,9,40,16]. Unfortunately, it cannot describe IDPSP, because this model requires demands and their locations to be known in advance, whereas in IDPSP, VMs are installed at any moment and their demands are therefore not known beforehand. In other words, assignments of demands to IDPSInsts are made without knowledge of future demands. Hence, the online model of this problem should be used. However, the existing online models in the literature (as discussed in section V-B) are not representative of our problem, so we design a new model. Our IMFLP model relaxes certain constraints of these models and resolves their limitations in describing the IDPSP problem, so that migrations and assignment switches can be modeled.
We describe our model of facility location problem by defining the space and metrics, facilities, demands, and allowed actions.
Space and metrics. We are given a connected weighted graph M = (V, E) representing the architecture of the data center network, where V denotes the set of nodes (switches or hosts), and E : V × V → R^+ represents the set of weighted network links. V_hosts ⊂ V represents the host nodes in which demands and facilities can reside. The cost of the shortest path between two nodes p, q ∈ V is denoted by d(p, q). We also use the notation d(V′, p) to denote the cost of the shortest path between a node p ∈ V and the closest node in a subset V′ ⊂ V. Moreover, for r ∈ R^+, let B(p, r) = {q ∈ V_hosts | d(p, q) ≤ r} denote the host nodes within distance r of the node p (the points that lie inside or on the ball with center p and radius r). We assume that the distance metric is symmetric and satisfies the triangle inequality.
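The metric notions above map directly onto shortest-path computations. The following is a minimal sketch, assuming the network is given as an adjacency dictionary; the function names `distances_from`, `dist_to_set`, and `ball`, as well as the small example topology, are hypothetical and only illustrate d(p, q), d(V′, p), and B(p, r).

```python
import heapq

def distances_from(graph, src):
    """Single-source shortest-path costs d(src, .) via Dijkstra."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist[node]:
            continue  # stale heap entry
        for nxt, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < dist.get(nxt, float("inf")):
                dist[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return dist

def dist_to_set(graph, subset, p):
    """d(V', p): cost from p to the closest node of the subset V'."""
    dist = distances_from(graph, p)
    return min((dist.get(q, float("inf")) for q in subset), default=float("inf"))

def ball(graph, hosts, p, r):
    """B(p, r): host nodes within distance r of p."""
    dist = distances_from(graph, p)
    return {q for q in hosts if dist.get(q, float("inf")) <= r}

# Hypothetical topology (assumption): switch s above edge switches e1 and e2;
# hosts h1, h2 under e1 and host h3 under e2; all links have unit cost.
graph = {
    "s": {"e1": 1.0, "e2": 1.0},
    "e1": {"s": 1.0, "h1": 1.0, "h2": 1.0},
    "e2": {"s": 1.0, "h3": 1.0},
    "h1": {"e1": 1.0},
    "h2": {"e1": 1.0},
    "h3": {"e2": 1.0},
}
hosts = {"h1", "h2", "h3"}
near_h1 = ball(graph, hosts, "h1", 2.0)
closest = dist_to_set(graph, {"h2", "h3"}, "h1")
```

Because the graph is undirected, the resulting metric is symmetric, and shortest-path costs always satisfy the triangle inequality.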
Facility. In IMFLP, a facility represents a VNF instance and is uncapacitated. The location of a facility z in the space is identified by γ(z) ∈ V_hosts. We use the terms open and install interchangeably for the installation of a facility. In addition, the notation C(z) represents the set of demands that are assigned to a facility z (z's cluster).
Demand. A demand u denotes a unit of service workload of a VM. Similar to a facility, the location of a demand is given by γ(u) ∈ V_hosts, which is the node on which the VM resides. We use the term arrive to denote that a new demand from a VM should be served. We also assume that each VM has a correct number of demands.
Allowed Actions. In IMFLP, the following actions are allowed:
• A facility can be opened in any node p ∈ V_hosts at any time by paying the installation cost f ∈ R^+. A facility can also migrate to another location at the migration cost k ∈ R^+. We assume that k < f. Moreover, a facility can be closed at any time, and its installation cost is refunded. However, if any demands are assigned to that facility, they must be switched to other facilities, and for each switch the switch cost described next is paid.
• A demand is allowed to arrive and leave at any time in any node p ∈ V_hosts. The migration of a VM can be modeled by the departure of its demands and their arrival at the destination node. Furthermore, a demand assignment can be switched to another facility by paying the switch cost h ∈ R^+. We assume that h ≤ k < f.
Additional notation. Please note that, for a demand u and a facility z, we simply write d(z, u) instead of d(γ(z), γ(u)) to represent their distance. In addition, we define (x − y)^+ = max(0, x − y) for x, y ∈ R^+.
The model is described as follows. Upon the arrival or departure of a demand u_t at time t (the input of our model), a new facility ω or a set of facilities can be opened, closed, or migrated. Likewise, a subset of demands can be switched to other facilities. Therefore, the following costs are defined at time t:
1) Total installation cost (C_ins) represents the installation cost of a set of facilities F_t at time t. Here, |F_t| denotes the number of facilities.
2) Total operational cost (C_op) represents the operational cost of a set of facilities F. As shown in equation 2, this cost is defined based on the shortest paths between facilities and their assigned demands.
3) Total migration cost (C_mig) is the cost of the migrations of a set of facilities from the start time until time t. In equation 3, γ_i(z) represents the location of facility z at time i. Please note that the indicator term is 1 if the location of z changes between times i − 1 and i; otherwise it is 0.
4) Total switch cost (C_sw) denotes the switch cost of a set of demands L_t at time t. Here, φ_i(u) represents the facility that demand u is assigned to at time i. Note that the indicator term is 1 if the assignment of u changes between times i − 1 and i; otherwise it is 0.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 9, No. 7, 2018
The objective of the optimization problem in the IMFLP formulation is to minimize the overall cost (C_overall) as defined in equation 5.
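The displayed equations 1 to 5 are not legible in this copy of the text. A reconstruction that agrees with the prose definitions above is sketched below; it is an assumption, not necessarily the authors' exact formulation. In particular, the indicator notation, the summation ranges, and the omission of the parameter g mentioned in section IV are guesses, and the first assignment of a newly arrived demand is assumed not to count as a switch.

```latex
\begin{align}
C_{ins}(t) &= f \cdot |F_t| \tag{1}\\
C_{op}(t) &= \sum_{z \in F_t} \sum_{u \in C_t(z)} d(z, u) \tag{2}\\
C_{mig}(t) &= k \sum_{i=1}^{t} \sum_{z \in F_i \cap F_{i-1}}
              \mathbb{1}\left[\gamma_i(z) \neq \gamma_{i-1}(z)\right] \tag{3}\\
C_{sw}(t) &= h \sum_{i=1}^{t} \sum_{u \in L_i \cap L_{i-1}}
              \mathbb{1}\left[\phi_i(u) \neq \phi_{i-1}(u)\right] \tag{4}\\
C_{overall}(t) &= C_{ins}(t) + C_{op}(t) + C_{mig}(t) + C_{sw}(t) \tag{5}
\end{align}
```

Under this reading, the installation and operational terms depend only on the configuration at time t, whereas the migration and switch terms accumulate over the whole history, which is what makes reversible decisions costly.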
The IDPSP problem can be reduced to the optimization problem defined in the IMFLP model. This optimization problem is NP-hard (the facility location problem is NP-hard, and our online model is even more complicated than the original problem). Motivated by this observation, we developed an online algorithm for the IMFLP model.

III. ADAPTIVE FACILITY LOCATION (AFL)
In this section, we propose our solution, Adaptive Facility Location (AFL), for the optimization problem introduced in the IMFLP model. We introduce two novel algorithms that use the simple idea of profit and loss for handling a demand arrival and a demand departure.
However, before describing our solution, we justify our selection over other candidate approaches. In the area of SDN, some of the most common approaches for modeling optimization problems are linear programming [28,49], simulated annealing [48,36], and Markov approximation [30,41]. The linear programming approach solves an offline problem and is not descriptive enough to model the dynamic and online nature of this kind of problem. In addition, linear programming is known to be slow. To deal with the dynamic nature of these optimization problems, simulated annealing and Markov approximation are used. In simulated annealing techniques, an offline problem is again defined at each step; these techniques are known to get trapped in local minima and might suffer from a bad initial state. Finally, Markov chain techniques might also be affected by a bad initial state and by slow convergence to the steady state.

A. Demand Arrival
Two functions, namely the migration potential and the installation potential, are defined to represent how far facilities and assignments of demands are from the optimal or stable configuration, and how much profit is gained by the installation or migration of a facility, respectively. Then, by comparing this profit with the cost of certain actions (the loss), AFL decides which action is applied.
Installation potential function (Pot_ins) is defined in equation 6. This function represents how much of the current cost can be reduced by the installation of a facility at a node p ∈ V_hosts. In this equation, u_t denotes a newly arrived demand at time t. F_{t−1} and L_{t−1} represent the sets of opened facilities and demands at time t − 1, just before the arrival of u_t. The first term computes the profit of assigning u_t to a facility installed at node p compared with assigning u_t to the closest facility in F_{t−1}. The second term shows how much the operational cost related to some demands is reduced if these demands are switched to a facility installed at node p (recall that each switch incurs the switch cost h).
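Since equation 6 is not legible in this copy, the following is a hedged sketch of an installation potential that matches the two terms described above: the gain for the new demand plus the positive gains of demands worth switching after paying h. The function names, the representation of demands and facilities by their locations, and the one-dimensional metric used in the example are all assumptions made for illustration.

```python
def pos(x):
    """(x)^+ = max(0, x), following the paper's additional notation."""
    return max(0.0, x)

def pot_ins(p, u, facilities, assignments, d, h):
    """Assumed form of Pot_ins(p) for a new demand at location u.

    facilities:  locations of the open facilities F_{t-1}
    assignments: list of (demand location v, facility location phi(v)) for L_{t-1}
    d:           symmetric distance function d(x, y)
    h:           switch cost
    """
    # First term: serving u from p instead of from the closest open facility.
    nearest = min((d(z, u) for z in facilities), default=float("inf"))
    gain_new = pos(nearest - d(p, u))
    # Second term: demands whose saving exceeds the switch cost h.
    gain_switch = sum(pos(d(z, v) - d(p, v) - h) for v, z in assignments)
    return gain_new + gain_switch

# Hypothetical example on a line metric (assumption): one facility at 0.
line = lambda x, y: abs(x - y)
value = pot_ins(p=9, u=10, facilities=[0], assignments=[(8, 0), (9, 0), (1, 0)], d=line, h=1)
```

In this example the demands at 8 and 9 contribute 6 and 8, the demand at 1 contributes nothing, and the new demand contributes 9, so the potential is 23; with f = 6 the installation would be judged profitable.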

Migration potential function (Pot_mig) is defined in equation 7. This function describes how much the migration of a facility z from its current location to a node p reduces its total operational cost when a newly arrived demand u_t is assigned to z as well. This function can also be interpreted in another sense. Each demand v ∈ C_{t−1}(z) attempts to reduce its cost by pulling facility z toward its own location γ(v). If a newly arrived demand u_t is to be assigned to z, u_t also tries to pull facility z toward itself. The potential function Pot_mig(z, p) represents how much more stable z becomes by migrating from γ(z) to p. In other words, after the migration z is close enough to each demand v ∈ C_{t−1}(z) and closer to u_t than the facilities in F_{t−1}, including z itself.
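Equation 7 is likewise not legible here. The sketch below is one plausible reading of the description: the signed change in service cost of z's cluster when z moves to p, plus the gain of serving the new demand from p rather than from the closest existing facility. Whether the original uses signed or clipped terms is unknown, so the signed sum, the function name `pot_mig`, and the line metric are assumptions.

```python
def pot_mig(z, p, u, facilities, cluster, d):
    """Assumed form of Pot_mig(z, p) on the arrival of a demand at location u.

    z:          current location of the facility considered for migration
    p:          candidate destination
    facilities: locations of the open facilities F_{t-1} (contains z)
    cluster:    locations of the demands in C_{t-1}(z)
    d:          symmetric distance function d(x, y)
    """
    # Net change for demands already served by z (may be negative for some of them).
    cluster_gain = sum(d(z, v) - d(p, v) for v in cluster)
    # Gain for the new demand relative to the closest facility before the move.
    nearest = min(d(x, u) for x in facilities)
    return cluster_gain + (nearest - d(p, u))

# Hypothetical example on a line metric (assumption).
line = lambda x, y: abs(x - y)
value = pot_mig(z=0, p=3, u=6, facilities=[0, 20], cluster=[1, 2], d=line)
```

Here the cluster is indifferent to the move (one demand gains 1, the other loses 1), while the new demand gains 3, so the potential is 3 and the migration pays off only if k is smaller than 3.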
Algorithm 1 shows the AFL algorithm (for the sake of simplicity, we drop the subscript t, but we emphasize that the presented algorithm is run at time t). By exploiting the aforementioned functions, AFL attempts to improve the current placement of facilities and the current assignment of demands. Upon the arrival of a new demand u, AFL considers three actions:
1) Installation action: Installation of one new facility in the best place with the best possible switches.
2) Migration action: Migrating one of the existing facilities (the best one) without any demand switches and assigning u to this facility.
3) Assignment action: Assigning u to the nearest existing facility.
As shown in Algorithm 1, AFL computes the installation and migration potentials. By comparing the computed values, AFL applies the best action. For the installation action, AFL calculates the installation potential Pot_ins for every point p within distance f of u (B(u, f)). AFL selects the best point ω_ins, which maximizes Pot_ins. If AFL decides to apply this action, it switches the neighboring demands to ω_ins whenever such a switch reduces the service cost and the reduction in service cost is larger than the switch cost h.
For the migration action, AFL computes the migration potential Pot_mig for each facility z and for each point p in the space V_hosts. Eventually, AFL chooses the best facility ω_mig and the point ρ to which it should migrate, such that Pot_mig is maximized.
Ultimately, AFL decides which action is applied. The installation action is considered first. If it is beneficial (Pot_ins(ω_ins) − f > 0) and its profit is not lower than that of the best migration action (Pot_mig(ω_mig, ρ) − k), AFL applies the installation action. Otherwise, the best migration action is considered. If this migration is beneficial (Pot_mig(ω_mig, ρ) − k > 0), AFL migrates ω_mig to ρ and assigns u to it. Otherwise, AFL only assigns u to the nearest existing facility.
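The comparison order described above can be written as a small decision function. This is a sketch under the assumption that the two best potentials have already been computed; the function name and the returned labels are hypothetical, and the non-strict comparison between the two profits follows the condition in Algorithm 1.

```python
def choose_arrival_action(p_ins, p_mig, f, k):
    """Pick the action for a demand arrival from the best potentials.

    p_ins: best installation potential, Pot_ins at the best point
    p_mig: best migration potential, Pot_mig for the best facility and point
    f, k:  installation and migration costs
    """
    install_profit = p_ins - f
    migrate_profit = p_mig - k
    if install_profit > 0 and install_profit >= migrate_profit:
        return "install"
    if migrate_profit > 0:
        return "migrate"
    return "assign"  # fall back to the nearest existing facility

# Example with the cost values used in the experiments (f = 6, k = 2).
actions = [
    choose_arrival_action(23, 3, f=6, k=2),
    choose_arrival_action(5, 3, f=6, k=2),
    choose_arrival_action(5, 1, f=6, k=2),
]
```

The three calls illustrate the three branches: a large installation potential triggers an installation, a modest migration potential triggers a migration when installation does not pay off, and otherwise the demand is simply assigned.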

B. Demand Departure
Similar to the case of a demand arrival, AFL defines closing potential and migration potential functions to represent how far the current configuration of the facility whose demand departs is from the stable configuration. Let u_t denote a demand departing at time t, and let z = φ(u_t) represent the facility to which u_t was connected at time t − 1, just before the departure.
Closing potential function (Pot_cls) is defined in equation 8. This function denotes the profit of closing a facility and switching its demands to the closest facilities.
Migration potential function (Pot_mig) for the departure of a demand is defined by equation 9. It can be interpreted in exactly the same way as the migration potential for a demand arrival.
For a departure, AFL considers two actions:
1) Closing action: Closing facility z and assigning each of its demands to the closest facility in F_{t−1} \ {z}.
2) Migration action: Migrating facility z to another location to serve C_{t−1}(z) \ {u_t} more efficiently.
Algorithm 2 represents AFL's algorithm for handling a demand departure. For the sake of simplicity, we omit the subscript t from the notation. AFL computes the closing potential of facility z and its best migration potential, and compares them to choose an action.
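Equation 8 is not legible in this copy, so the following sketch assumes a closing potential equal to the negated extra cost of re-homing z's remaining demands (longer paths plus one switch each), with the refunded installation cost f added separately, as in the condition p_cls + f > 0 of Algorithm 2. The names `pot_cls` and `choose_departure_action`, the location-based representation, and the line metric are assumptions.

```python
def pot_cls(z, departing, cluster, facilities, d, h):
    """Assumed closing potential of the facility at z, excluding the refund f.

    cluster:    locations of demands assigned to z (includes `departing`)
    facilities: locations of all open facilities (includes z)
    """
    remaining = list(cluster)
    remaining.remove(departing)  # the departing demand needs no further service
    others = [x for x in facilities if x != z]
    extra = 0.0
    for v in remaining:
        nearest_other = min((d(x, v) for x in others), default=float("inf"))
        extra += nearest_other - d(z, v) + h  # longer path plus one switch
    return -extra

def choose_departure_action(p_cls, p_mig, f, k):
    """Closing is considered first, then migration, else only remove the demand."""
    if p_cls + f > 0 and p_cls + f >= p_mig - k:
        return "close"
    if p_mig - k > 0:
        return "migrate"
    return "remove_only"

# Hypothetical example on a line metric (assumption): facilities at 0 and 10.
line = lambda x, y: abs(x - y)
busy = pot_cls(z=10, departing=12, cluster=[12, 9], facilities=[0, 10], d=line, h=1)
idle = pot_cls(z=10, departing=12, cluster=[12], facilities=[0, 10], d=line, h=1)
```

In the first case, re-homing the demand at 9 costs more than the refund of f = 6, so closing is rejected; in the second case the facility is left without demands and closing it recovers f at no extra cost.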

IV. EXPERIMENTS
We evaluated the effectiveness of our placement algorithm in several simulation studies. We implemented our AFL algorithm in a discrete event simulator and compared it to five other popular algorithms, namely FFL [19], AFL [18], OPTFL [17], RFL [38], and SNFL [20]. The details of these algorithms are discussed in section V. The OPTFL, AFL, and FFL algorithms have certain input parameters. We ran these algorithms for various parameter values and did not observe a substantial difference. Ultimately, their input parameters were set to the values suggested by their authors, specifically for OPTFL α = 10 [17], for AFL α = 18, β = 8.0, ψ = 4.0 [18], and finally for FFL x = 19  8 [19].
In the last decade, a tremendous amount of research has been done in search of an efficient and inexpensive data center network (DCN) architecture. Several architectures, such as fat-trees [3], VL2 [24], Portland [39], BCube [25], and DCell [26], have been proposed to address different challenges of current DCN architectures such as scalability, agility, and reconfigurability. For the experiments, we select the fat-tree architecture of Al-Fares et al. [3]. This architecture is one of the well-known DCN architectures [27] [37] [7]. Fat-trees are more scalable and reliable than conventional tree-based architectures. This topology allows us to leverage identical cheap commodity switches in all communication layers. In theory, the over-subscription ratio of this rearrangeable architecture is 1 : 1, which means that this architecture is non-blocking; however, in practice, preventing packet reordering might make it difficult to guarantee a non-blocking network. The fat-tree topology proposed in [3] is a k-ary tree in which k denotes the number of ports and the number of pods. This topology connects homogeneous switches, each with k ports. As depicted in Figure 2, the Al-Fares fat-tree consists of three switch layers. At the highest level, there are (k/2)^2 core switches. Each core switch is connected to all k pods (the i-th port of a core switch is connected to pod i). A pod contains k switches (k/2 aggregation switches and k/2 edge switches). At the second level, each aggregation switch is connected to k/2 core switches upward and k/2 edge switches downward. Furthermore, each aggregation switch is only connected to edge switches that are in the same pod. At the third level, each edge switch is linked to k/2 hosts downward and k/2 aggregation switches upward. There are k^3/4 hosts, which are located at the leaves of this architecture. For all experiments, the over-subscription ratio was set to 1 : 1, which means that this architecture is non-blocking.
The demands are generated randomly (only at the leaves) from uniform and normal distributions. The mean and standard deviation of the normal distribution were set to 0.5 and 0.1, respectively; each generated value was multiplied by the number of leaves, and a demand was generated at the resulting position. Moreover, in all experiments, the value of parameter g was set to 1. All algorithms receive one demand at a time and reconfigure the placement of the facilities to serve this demand upon its arrival. The costs of installation, migration, and switching are collected for all algorithms. For each configuration, the average of 10 tests is reported as the final result. We have conducted three experiments to evaluate the behavior of AFL under different circumstances.
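The element counts of the k-ary fat-tree described above follow directly from k. The sketch below computes them; the function name `fat_tree_sizes` and the dictionary keys are illustrative assumptions.

```python
def fat_tree_sizes(k):
    """Element counts of a three-layer k-ary fat-tree built from k-port switches."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("k must be a positive even number of ports")
    half = k // 2
    return {
        "pods": k,
        "core_switches": half * half,        # (k/2)^2
        "aggregation_switches": k * half,    # k/2 per pod
        "edge_switches": k * half,           # k/2 per pod
        "hosts": k * half * half,            # k^3/4, i.e. k/2 hosts per edge switch
    }

sizes_16 = fat_tree_sizes(16)
```

For k = 16 this yields 64 core switches and 1024 hosts, the largest space used in the experiments.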

A. Impact of Number of Demands
In this experiment, the impact of the number of demands on the behavior of AFL is examined. As depicted in Figure 3, five tests with 1024, 2048, 3072, 4096, and 5120 demands are conducted for the uniform (Figure 3(a)) and normal (Figure 3(b)) distributions. We assign 6, 2, and 1 to f, k, and h, respectively. The idea behind choosing these values is that we assume that the cost of installing a facility f is always greater than the cost of migration k and switching h, and that the cost of migration is equal to or greater than the cost of switching. Moreover, the space is the fat-tree with 1024 hosts. For each test, each algorithm receives one demand at a time and returns a placement of facilities. As shown in Figure 3, the total costs of all algorithms for the uniform distribution are considerably greater than for the normal case. The reason is that in the uniform case the demands spread over more hosts, whereas with the normal distribution demands tend to arrive at the middle hosts. As depicted, AFL outperforms all other algorithms in all cases for both distributions. The average overall cost of AFL is 11.82% and 14.46% lower than that of the second-best algorithm for the uniform and normal distributions, respectively.

B. Impact of Number of Hosts
In this experiment, the impact of the number of hosts is studied. Fat-trees with 64 (k = 8), 250 (k = 10), 432 (k = 12), 686 (k = 14), and 1024 (k = 16) hosts are generated and employed as the space for each test. In each test, 1024 demands are generated from uniform and normal distributions. Similar to the previous experiment, the values of f, k, and h are set to 6, 2, and 1, respectively.
Figure 4 depicts the results of this experiment. Figures 4(c) and 4(d) represent the results for the uniform and normal distributions, respectively. Similar to the previous experiment, the total cost in the uniform case is noticeably greater than in the normal case. As shown, AFL outperforms the other algorithms for both distributions and in all cases. The total cost of AFL is lower than that of the second-best algorithm by an average of 14.16% and 15.66% for the uniform and normal distributions, respectively.
The different costs of AFL for the uniformly generated demands are shown in Table I. Facility and service costs are the most significant part of the overall cost. As the number of demands increases, the migration and switching costs increase. In the case of 64 demands, AFL does not migrate or switch; however, when the number of demands increases, AFL migrates certain facilities and switches some of the demands in order to reduce the total cost. For instance, the switch and migration costs are 3.45% of the total cost for 1024 demands. This means that, by paying a small amount of migration and switch cost, AFL saves a significant amount of facility and service cost. In the case of 1024 points, AFL pays 2914.2 less than the second-best algorithm while paying an extra 441.5 in migration and switch cost on average.

C. Impact of Cost Parameters
In this experiment, the impact of the cost parameters (f, k, and h) is investigated. For all tests, the space is fixed to the fat-tree with 1024 hosts (16-ary tree). Note that in this k-ary tree, the maximum distance between two points is 6 (please note that we fixed the value of g to 1 in all experiments). Similar to the previous experiment, demands and facilities are located at the hosts. In particular, we vary the costs of installation, switching, and migration to investigate the impact of these parameters on the performance of our algorithm compared to the others. We require the cost of switching h to be always less than or equal to the cost of migration k, which in turn must be less than the facility installation cost f (i.e., h ≤ k < f). We run several tests for low, medium, and high values of f, k, and h. Specifically, for the facility cost f, 2, 4, and 6 are considered as the low, medium, and high values, respectively. For the migration cost k, the values 1, 3, and 5, and for the switch cost h, the values 1, 2, and 4 are selected as the low, medium, and high values, respectively. Ultimately, 10 different configurations of values for the cost parameters are examined. Furthermore, we select the number of demands from 1024, 2048, 3072, 4096, and 5120 and randomly place the demands on the leaves based on the normal and uniform distributions. Figure 5 shows the overall cost of our algorithm compared to the others when changing the installation, switch, and migration cost parameters. However, as can be seen in Figures 5(c), 5(d), and 5(e), as the number of demands increases, AFL again outperforms the other algorithms in these configurations as well. It seems that, as the number of demands increases, AFL converges to a more stable configuration and performs more efficiently.
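The count of 10 configurations is consistent with filtering all combinations of the listed low, medium, and high values by the constraint h ≤ k < f. The sketch below enumerates them; that the authors' ten configurations are exactly these is an assumption, and the three-digit labels follow the f, k, h digit convention used on the x-axis of Figure 5.

```python
from itertools import product

F_VALUES = (2, 4, 6)  # installation cost: low, medium, high
K_VALUES = (1, 3, 5)  # migration cost: low, medium, high
H_VALUES = (1, 2, 4)  # switch cost: low, medium, high

def valid_configurations():
    """All (f, k, h) combinations that satisfy h <= k < f."""
    return [
        (f, k, h)
        for f, k, h in product(F_VALUES, K_VALUES, H_VALUES)
        if h <= k < f
    ]

configs = valid_configurations()
labels = ["{}{}{}".format(f, k, h) for f, k, h in configs]
```

The list contains one configuration for f = 2, three for f = 4, and six for f = 6, including the label 652 used as an example in the note to Figure 5.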

D. Evaluation of Demand Departure
In this experiment, we examine the behavior of our solution for demand departures. Because we did not find any online algorithm in the literature that considers demand departures, we compared our algorithm with a well-known offline greedy facility location algorithm with an approximation ratio of 1.61 [32]. The same configuration as in the experiments on the impact of the number of demands and the number of hosts (sections IV-A and IV-B) is used: a fat-tree with 1024 hosts, and values of 6, 2, 1, and 1 for the parameters f, k, h, and g, respectively.

V. RELATED WORK

A. Existing Systems
Network Function management solutions in the existing literature can be classified into two separate groups: 1) Systems that deal with NFs that are deployed on pre-designated static hardware. These include systems such as CoMB [45], SIMPLE [42], xOMB [5], and PLayer [33]. 2) Systems that deal with VM-based NF deployments, such as Stratos [23].
Static NF deployments are a step up from traditional NFs, introducing software-based NFs within pre-placed commodity or specialized hardware. This gives such NFs the ability to use the best of both the software and hardware worlds: multiple NFs can co-exist on the same high-speed hardware and work in a coordinated manner to provide superior performance [45]. Since the NF itself is in software, it is easy to update and maintain. The hardware can also evolve independently of the software, as the hardware and servers on which the NFs are hosted can be upgraded and replaced. This comes at a price: the location of the NF, due to its rigid placement, might not always be ideal. The demand for NFs is not always uniformly distributed within the data center [23].

B. Facility Location Problem
Facility Location Problem (FLP) is one of the well-known problems in location theory. This classical optimization problem is concerned with the optimal locations of certain facilities so as to minimize the cost of providing service to demands [46] in the offline setting. This problem is known to be NP-hard, and several approximation algorithms have been developed for it [46]. The best known algorithm is proposed by Li et al. [35] and achieves a 1.488 approximation ratio. In addition, several models of this problem in the offline setting have been defined in the literature. Farahani and Hekmatfar [15] provided an extensive review of different models of offline FLP, and Boloori Arabani and Farahani [10] gave an overview of the dynamic models. Unfortunately, none of the above models is applicable to our problem, because they do not consider the online nature of the problem. Hence, we focus on online models of FLP. Fotakis [21] gave an overview of the online models of FLP and identified two major models of the online version of FLP, namely the Online Facility Location Problem (OnFLP) and the Incremental Facility Location Problem (IncFLP). In addition, some other relaxed versions are discussed here. Table II presents the well-known algorithms and their competitive ratios for all models.
Notes for Table II: (a) the complexity of processing the m-th demand; (b) n is the number of all demands; (c) the number of demands at time t; (d) d_max is the maximum distance in the space; (e) F_max denotes the maximum number of facilities opened by the algorithm.
1) Online Facility Location Problem: Meyerson et al. [38] for the first time designed the Online Facility Location

Note: The x-axis in the figures represents different values for f, k, and h as the first, second, and third digit, respectively. For instance, 652 represents 6, 5, and 2 for f, k, and h, respectively.
Fig. 5. The total costs of the algorithms for the different values of f, k, and h for 1024 leaves

In the IDPSInst-enabled case of Figure 1(b), the cost of the traffic steering is d(r, b) + d(b, a) = d(r, m) + 2d(m, b) + d(m, a) (we assume that the shortest-path cost is symmetric). By deducting the cost of the service-less case, the extra cost in the IDPSInst-enabled case is 2d(m, b). Because a and b are at the same level (host level), d(m, b) = d(m, a), and therefore the extra cost is 2d(m, b) = d(m, b) + d(m, a) = d(a, b), which is the cost of the shortest path between host a containing the VM and host b containing the IDPSInst.

Fig. 1. The comparison of the traffic path

Figures 5(a), 5(b), 5(c), 5(d), and 5(e) represent the results of tests for 1024, 2048, 3072, 4096, and 5120 demands, respectively. As shown, for all configurations of cost values

Algorithm 1 AFL-Demand Arrival
F ← ∅; L ← ∅;
for all new demand u do
    L ← L ∪ {u};
    ρ_ins ← arg max_{p ∈ B(u, f)} {Pot_ins(p)};
    p_ins ← Pot_ins(ρ_ins);
    ω_mig, ρ_mig ← arg max_{z ∈ F, p ∈ V_hosts \ {γ(z)}} {Pot_mig(z, p)};
    p_mig ← Pot_mig(ω_mig, ρ_mig, u);
    if (p_ins − f > 0) ∧ (p_ins − f ≥ p_mig − k) then
        ω_ins ← open a facility at ρ_ins;
        F ← F ∪ {ω_ins};
        Switch facility of each demand v ∈ L \ {u} if d(φ(v), v) > d(ρ_ins, v) + h;
        Assign u to the nearest facility;
    else if p_mig − k > 0 then
        Assign u to ω_mig;
        Migrate ω_mig to point ρ_mig;
    else
        Assign u to the nearest facility;

Algorithm 2 AFL-Demand Departure
z ← φ(u);
p_cls ← Pot_cls(z);
ρ_mig ← arg max_{p ∈ V_hosts} {Pot_mig(z, p)};
p_mig ← Pot_mig(z, ρ_mig);
if (p_cls + f > 0) ∧ (p_cls + f ≥ p_mig − k) then
    Switch facility of each demand v ∈ C(z) \ {u} to the closest facility in F \ {z};
    Close z;
else if p_mig − k > 0 then
    Migrate z to ρ_mig;
Remove u from L;

For a departing demand u, AFL computes the closing potential of the facility z serving u and the best migration potential. First, AFL considers the closing action. If closing z is beneficial (p_cls + f > 0) and is at least as profitable as the migration action (p_cls + f ≥ p_mig − k), AFL applies this action. Otherwise, the migration of z is considered, and if this action is profitable (p_mig − k > 0), AFL migrates z to ρ_mig. If neither the closing nor the migration action is beneficial, AFL only removes demand u from the list of demands.

TABLE I: AFL's Costs
Note: We omit the word cost from the headers. For instance, by Facility we mean Facility Cost.

TABLE II: Algorithms for OnFLP and IncFLP models