ReCSDN: Resilient Controller for Software Defined Networks

Software Defined Networking (SDN) is an emerging network paradigm that provides central control over the network. Although this simplifies network management and makes efficient use of network resources, it introduces new threats to network reliability and scalability. A single centralized controller is a single point of failure, and it may become a performance bottleneck as processing overhead increases. Distributed SDN controller platforms improve reliability and scalability to some extent; however, they remain vulnerable to Distributed Denial of Service (DDoS) attacks, specifically on the control plane. We believe there is a need for a distributed controller framework that can provide service continuity without performance degradation under excessive network traffic or DDoS attacks on the controller. In this paper, we address these vulnerabilities of the SDN control plane. We propose and implement an efficient and Resilient Controller for Software Defined Networks (ReCSDN). The framework detects and mitigates DDoS attacks in a timely manner and ensures continuity of services without performance degradation. We created an experimental test bed using Mininet and deployed ReCSDN on top of an Open Network Operating System (ONOS) cluster to confirm the viability of our approach. The experimental results show that, with ReCSDN, the control plane not only withstands excessive network load but also continues to provide services in case of a controller failure.

Keywords: Software Defined Networking (SDN); SDN controller security; Distributed Denial of Service (DDoS) attack; load balancing; SDN controller cluster; Open Network Operating System (ONOS)


I. INTRODUCTION
The Software Defined Networking (SDN) paradigm has revolutionized traditional networking by separating the control plane from the data plane. With this separation, control logic is implemented in a logically centralized controller and network switches become simple forwarding devices [1]. This decoupling provides several benefits, including easier network management, increased visibility into the network, programmability, efficient use of network resources, and dynamic updating of network policies [2], [3]. The centralized control plane has global knowledge of the network, which enables effective resource management. Moreover, network policies can be easily configured and modified via software applications running on top of the controller. Customized network applications can be developed and deployed directly without any vendor dependency [4], [5].
Nevertheless, the core benefits that drive the hype around SDN are also the main causes of concern. The centralized control plane that provides critical advantages over traditional networking has introduced new threat vectors. First and foremost, it can become a single point of failure [6]. The controller becomes the core of the network, and any attack on it, such as a DDoS attack, can bring down the whole network. Many approaches, such as primary-backup replication mechanisms and distributed controller platforms [7], exist to address this critical reliability issue. However, these approaches have numerous issues of their own, which makes this an open research problem [8], [9].
Second, the controller may turn out to be a performance bottleneck as the network size increases [8]. Whenever a new flow is initiated in the network, the OpenFlow switch forwards it to the controller, which decides a suitable forwarding path. Similarly, all unknown flows that are not recognized by the switch are sent to the controller for processing. As the network grows, the number of traffic flows increases and controller performance suffers. Various controller load-balancing schemes [10] have been proposed to improve the performance of centralized controller platforms. However, due to their limited capabilities, the problem remains an open research area.
Many researchers have explored the new threat vectors introduced by SDN [11], [12]. Several attacks on SDN networks, including DDoS attacks, and their mitigation strategies have been studied [13]-[15]. However, very limited work has been done to detect and mitigate attacks specifically on SDN controllers [6]. Also, most of this work has been done for centralized controllers such as Floodlight [16]-[18] and POX [19], [20].
In view of the above-mentioned limitations, we see a need to explore load balancing and DDoS attack vulnerabilities in distributed SDN controller platforms. We also need a framework that can detect excessive load on controllers and ensure continuity of services without performance degradation.
To this end, we propose and describe the Resilient Controller for Software Defined Networks (ReCSDN), which addresses the above-mentioned problems. ReCSDN is a novel framework that is built on top of a distributed controller environment. It provides a reliable, efficient and resilient control plane that not only overcomes the single point of failure problem but also ensures service continuity without performance degradation. ReCSDN detects excessive network traffic coming to the controller and provides a load-balancing mechanism that ensures that controller performance is not degraded. Excessive traffic may be generated by a DDoS attack on the controller or by flash crowds. In either case, the objective of ReCSDN is to provide fault tolerance and service continuity while maintaining performance quality. ReCSDN also ensures that network latency remains consistent and does not increase as the number of distributed controllers increases.
The main contributions of this paper are summarized below:
• We propose and implement ReCSDN, a reliable, efficient and resilient framework for SDN.
• We perform extensive experiments using Mininet and ONOS [21], a distributed SDN controller platform, to test the effectiveness of our framework.
• We detect and mitigate DDoS attacks on the SDN controller effectively.
• We ensure quality-of-service performance by providing appropriate load balancing among controllers.
• We provide fault tolerance through the timely use of backup controllers.
The rest of this paper is organized as follows. Section II comprises three sub-sections: the first two briefly introduce Software Defined Networking and ONOS, and the third gives a detailed review of existing research on SDN security. The proposed architecture and its implementation are discussed in Section III, followed by the threat model in Section IV. The experimental setup and results are presented in Section V and Section VI, respectively. Section VII concludes the paper.

II. BACKGROUND AND RELATED WORK
We have divided this section into three sub-sections. The first sub-section highlights the motivation for this research and the benefits of software defined networking over traditional networking. The next sub-section discusses ONOS, followed by the related work.

A. Towards Software Defined Networks
Building and administering a computer network is an onerous task. Managing networks involves many challenges, such as heterogeneity of network elements [22], vendor dependency [23], lack of centralized control, and lack of programmability. Moreover, configuring complex, dynamic networks is even more difficult because there is no automated mechanism for defining centralized policies. This creates scalability and configuration issues that hinder innovation in traditional networks [24]. The network administrator has to configure each network device individually to apply network policies [25]. As the size of the network increases, the number of devices also increases, thereby increasing the administrative overhead. SDN addresses the above-mentioned issues by separating the control plane and the data plane, as shown in Fig. 1. In SDN, the control plane manages the entire network centrally [12]. The major objective is to provide centralized control over the entire network, so that all control processes and services are separated from the data forwarding tasks. Hence, the software that controls the network is decoupled from the devices that implement it [26]. Switches become simple forwarding devices that work according to the policies defined by the controller. Many open source SDN controllers have been developed, including POX [27], NOX [28], Beacon [29] and Floodlight [30]. More recently, distributed SDN controller platforms such as ONOS and OpenDaylight [31], [32] have been developed to cater to the needs of large enterprise networks. We briefly discuss ONOS in the next sub-section. Apart from open source controllers, major industry vendors such as HP [33], [34] and Brocade [35] have also developed proprietary SDN controllers.
Although SDN has been gaining immense popularity since its inception, it is no silver bullet. SDN comes with its own set of vulnerabilities that were not present in traditional networks. Since the adoption of SDN in network infrastructures, many researchers have been questioning its security [36], [37]. The centralized control plane, its most prominent feature, has also become the major point of concern. Adversaries can launch a DDoS attack on the SDN control plane, leading to service degradation or a complete network shutdown. Similarly, the performance, scalability and reliability of SDN have not been thoroughly investigated yet.

B. Open Network Operating System
ONOS (Open Network Operating System) is an open source project hosted by The Linux Foundation. The software is written in Java and provides support for distributed SDN applications atop the Apache Karaf OSGi container, as shown in Fig. 2. The first version of ONOS was released in 2014. ONOS is a distributed platform for SDN networks that caters to the needs of enterprise networks. Its key features include scalability, high performance and high availability. ONOS is designed to operate as a cluster of nodes so that it can withstand the failure of individual nodes. ONOS overcomes the limitations of centralized SDN controllers like POX, NOX and Floodlight. It provides a high-level abstraction to application programmers, giving developers a platform for writing novel applications that run on top of ONOS. Its model can be extended by programming a variety of applications. ONOS is used today in a variety of applications ranging from multilayer network control to datacenters [38]. Major use cases of ONOS include CORD (Central Office Re-architected as a Datacenter) [39], [40], Multi-Layer Network Control, Migrating MPLS Network, and Global Research Network Development. ONOS also lists partner-driven use cases such as Huawei Agile L3 VP, Huawei Enterprise CPE, DirectTV Multicast and NEC Transport SDN.
To ensure strong consistency, ONOS adopted the Atomix framework after its v1.4 release. Atomix is a framework for solving common distributed computing problems and uses the Raft consensus algorithm [41] to keep cluster nodes consistent. In contrast with Hazelcast [42], Atomix favors consistency over availability. As a result, Atomix ensures that data is never lost, even under network partitioning or complete failure.
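Raft commits an update only once a majority of cluster nodes has acknowledged it, so the number of node failures a cluster can absorb follows directly from its size. The sketch below works this out for the cluster sizes used later in the experiments (1, 3, 5 and 7 nodes). The function names are illustrative and are not part of ONOS or Atomix.

```python
def raft_quorum(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority of the cluster."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Nodes that may fail while a majority can still commit updates."""
    return cluster_size - raft_quorum(cluster_size)


if __name__ == "__main__":
    for n in (1, 3, 5, 7):
        print(f"{n} nodes: quorum {raft_quorum(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
    # 1 nodes: quorum 1, tolerates 0 failure(s)
    # 3 nodes: quorum 2, tolerates 1 failure(s)
    # 5 nodes: quorum 3, tolerates 2 failure(s)
    # 7 nodes: quorum 4, tolerates 3 failure(s)
```

This arithmetic is why odd cluster sizes are the usual choice: going from 3 to 4 nodes raises the quorum without raising the number of tolerated failures.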

C. Related Work
The security of SDN has been a point of concern since its adoption [37]. Many researchers have questioned the security of SDN itself [12], while others have proposed SDN-based security solutions [43], [44]. DDoS attack detection in SDN using the entropy variation technique was presented in [6], [18]. Niyaz et al. [45] proposed a deep learning based multi-vector DDoS detection system. Fonseca et al. [46] designed CPRecovery using a component-based organization. Another technique, AVANT-GUARD [14], relies on completion of the TCP handshake. Hong et al. [47] proposed TopoGuard, which focuses on attacks over the data plane communication channel. Braga et al. [13] classified flows using self-organizing maps. An inference-relation, context-based technique was presented by Aleroud et al. [48]; it uses contextual similarity with existing attack patterns to identify DoS in an OpenFlow infrastructure. Cui et al. [49] performed attack detection using neural network techniques. Botelho et al. [50] replicated the shared database holding the whole network state to improve reliability.
The majority of the research discussed above is based on a centralized SDN controller. A few researchers have implemented replication between a master and a backup controller: when the master controller fails, the backup controller becomes the active controller. In contrast to existing research, we have developed a resilient framework for a distributed controller environment. We emulated our network using ONOS. In our approach, all controllers in a cluster are active. If any controller is attacked, its load is distributed to the other controllers within the network. The controllers share flow and switch information consistently. Moreover, previous work used different SDN controllers [51], such as POX [27], NOX [28], Beacon [29], and Floodlight [30], but the ONOS [21] controller was not explored for attack detection. In this paper, we create a distributed environment using ONOS controllers with Mininet emulation to detect DDoS flooding attacks on the controller.
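To make the entropy-variation idea cited in the related work ([6], [18]) concrete, the following minimal sketch computes the normalized Shannon entropy of packet destinations in a window and flags the window when the entropy drops, which is what happens when traffic concentrates on one target. It illustrates the general technique only; it is not the detection method of the cited papers or of ReCSDN, and the threshold, window size and function names are assumptions.

```python
import math
from collections import Counter


def normalized_entropy(destinations):
    """Shannon entropy of the destination distribution, scaled to [0, 1]."""
    counts = Counter(destinations)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))


def looks_like_attack(destinations, threshold=0.5, min_packets=10):
    """Flag a window whose traffic is concentrated on few destinations.

    threshold and min_packets are assumed values for illustration.
    """
    if len(destinations) < min_packets:
        return False  # too little traffic to judge
    return normalized_entropy(destinations) < threshold


if __name__ == "__main__":
    normal = ["h1", "h2", "h3", "h4"] * 5              # evenly spread traffic
    flood = ["victim"] * 47 + ["h2", "h3", "h4"]       # one dominant target
    print(looks_like_attack(normal))  # False
    print(looks_like_attack(flood))   # True
```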

III. PROPOSED APPROACH
This section presents the design of the Resilient Controller for Software Defined Networks (ReCSDN). ReCSDN efficiently detects and mitigates DDoS attacks on the control plane. It is capable of providing fault-tolerant and consistent services to the network without performance degradation. ReCSDN detects excessive network traffic coming to the controller and uses a load-balancing mechanism that ensures the reliability and performance of the control plane. The ReCSDN module runs on top of a distributed controller platform. It monitors the processing load of each controller and ensures that the load is distributed to other controllers in the cluster before any controller reaches its full capacity.
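The monitor-then-redistribute behaviour described above can be sketched as follows. This is a simplified illustration under assumptions, not the ReCSDN implementation: the `Controller` class, the per-switch load counters and the "move the busiest switch to the least-loaded controller" policy are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    name: str
    threshold: int  # hypothetical per-controller load limit
    switches: dict = field(default_factory=dict)  # switch id -> pending flow requests

    @property
    def load(self) -> int:
        return sum(self.switches.values())


def rebalance(controllers):
    """Move switches away from controllers whose load has reached the threshold."""
    moves = []
    for src in controllers:
        # Hand over the busiest switch first, but always keep at least one.
        while src.load >= src.threshold and len(src.switches) > 1:
            switch = max(src.switches, key=src.switches.get)
            demand = src.switches[switch]
            # Only consider targets that stay below their own threshold.
            candidates = [c for c in controllers
                          if c is not src and c.load + demand < c.threshold]
            if not candidates:
                break  # nobody can absorb this switch
            dst = min(candidates, key=lambda c: c.load)
            dst.switches[switch] = src.switches.pop(switch)
            moves.append((switch, src.name, dst.name))
    return moves


if __name__ == "__main__":
    c1 = Controller("c1", 100, {"s1": 60, "s2": 50, "s3": 10})
    c2 = Controller("c2", 100, {"s4": 20})
    c3 = Controller("c3", 100)
    print(rebalance([c1, c2, c3]))      # [('s1', 'c1', 'c3')]
    print(c1.load, c2.load, c3.load)    # 60 20 60
```

A real deployment would read load figures from the controller platform and change switch mastership through its APIs; the sketch only shows the decision logic.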

Fig. 3. ReCSDN modules: Role Assignment Engine, Policy Engine, Threshold Detector, and Load-Balancing Engine.

ReCSDN consists of four modules, as depicted in Fig. 3. The Policy Engine is used to configure the number of active and backup controllers within a cluster. The threshold for each controller is also set using the Policy Engine. The threshold value indicates the tolerance level of a controller, beyond which its performance may degrade. The Threshold Detector module therefore monitors the state of the controller to ensure that the load of the active controller is distributed by the Load-Balancing Engine before the threshold is crossed. The Role Assignment Engine is used to assign the master/backup status to controllers within a cluster.

IV. THREAT MODEL
The scope of this work is DDoS attacks on the SDN controller. In such an attack, adversaries may use compromised nodes to send unknown flows to the OpenFlow switches. These flows are not recognized by the OpenFlow switches and are sent to the controller for further processing. Thus, the controller is overwhelmed by the huge number of illegitimate packets and either halts completely or suffers performance degradation.
We consider two threat vectors that target the SDN control plane in our threat model. Both are based on generating flows that are not recognized by the switches, thereby targeting the SDN controller and the communication channel between the SDN control plane and data plane. Fig. 4 depicts the threat model. During a DDoS attack, multiple hosts generate fake or forged traffic. Such traffic flows are not recognized by OpenFlow switches and are forwarded to the controller, which must decide a suitable forwarding path. This scenario not only depletes controller resources but also exhausts the communication channel between the controller and the network.

V. EXPERIMENTAL SETUP
To determine the viability of our approach, we set up a test bed on a server with an Intel Core i7 3.67 GHz processor and 16 GB RAM running Ubuntu 14.04.5. We emulated the DDoS attack scenario on a controller using Mininet and an ONOS cluster, with the ReCSDN module deployed on top of the ONOS cluster. We included different types of legitimate traffic (TCP, UDP and ICMP) to build a realistic scenario. The D-ITG tool [52] was used to generate the traffic and to collect performance metrics. The metrics include delay, jitter and the number of lost packets.
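As an illustration of the three metrics named above, the sketch below derives average delay, jitter and packet loss from per-packet send and receive timestamps. The definitions are common ones chosen for the example (jitter as the mean absolute difference between consecutive packet delays) and are an assumption, not necessarily the exact formulas D-ITG uses; the `summarize` function is hypothetical.

```python
def summarize(sent, received):
    """Compute delay, jitter and loss.

    sent / received: dicts mapping packet sequence number -> timestamp (seconds).
    A packet missing from `received` counts as lost.
    """
    delays = [received[seq] - sent[seq] for seq in sorted(sent) if seq in received]
    lost = len(sent) - len(delays)
    avg_delay = sum(delays) / len(delays) if delays else 0.0
    # Jitter: mean absolute change in delay between consecutive delivered packets.
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0
    return {"avg_delay": avg_delay, "jitter": jitter, "lost": lost}


if __name__ == "__main__":
    sent = {1: 0.0, 2: 1.0, 3: 2.0, 4: 3.0}
    received = {1: 0.010, 2: 1.020, 4: 3.015}  # packet 3 never arrived
    print(summarize(sent, received))
```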
To create the DDoS attack scenario on a controller, a huge number of new flow requests was generated. When a new flow is received by an OpenFlow switch, it is not recognized and is forwarded to the SDN controller, which decides the transmission path. An increase in the number of new flow requests increases the processing overhead of the controller, leading to performance degradation or complete denial of service. The ReCSDN module monitors the state of the network and the controllers and ensures that the load of a controller is distributed to other controllers in the network before the threshold is reached. ReCSDN provides a fault tolerance mechanism by using backup controllers. These backup controllers are active controllers that can also be used for load balancing in case of a DDoS attack or flash crowds.
We conducted extensive experiments, discussed in the next section, to evaluate the performance and reliability of ReCSDN.

VI. RESULTS AND ANALYSIS
One of the key characteristics of ReCSDN is resiliency. We exploited the distributed architecture of ONOS to build a fault-tolerant environment. We created a cluster of ONOS controllers that provided multiple backups for each active controller; more backup controllers provide more fault tolerance. As ReCSDN is specifically developed to work with a distributed controller cluster, a key aspect of characterizing its performance is to analyze and compare performance at various scales. We created several scenarios to measure the response time as the number of controllers in a cluster scales from 1 node to 3, 5, and 7 nodes. We observed that increasing the number of controllers within a cluster adds no measurable overhead: the response time remains below 0.1 ms. Fig. 5 depicts the result of this experiment.
To evaluate the effect of increasing the number of controllers in a ReCSDN cluster on latency, we conducted multiple experiments, increasing the number of controllers from 1 to 3, 5 and 7. We generated a constant amount of TCP traffic for each experiment and recorded delay and jitter. The network traffic comprised a huge number of unknown flows. ReCSDN ensured that load was distributed among the other controllers before the master controller was overwhelmed. As the number of controllers in the cluster increases, the delay decreases, as shown in Fig. 6.
The latency decreased due to the consistent load distribution among the controllers. The overall performance of the network improved because ReCSDN enabled load balancing before the maximum capacity of a controller was reached.

To determine the maximum number of flows a single controller can process, we performed a stress test. We flooded the controller with new flow requests, generated by pushing random intents. Intents are high-level policies that are translated by the ONOS Intent Framework into installable forwarding rules. We repeatedly pushed 2000 intents until the controller halted. Fig. 7 illustrates the capacity of a single controller. In our experiment, as the intent count reached 1600, the controller stopped responding. However, the capacity and performance of a controller depend on the configuration of the physical machine on which it is running. After repeating the experiment a number of times, we chose 15000 as the threshold value for the next ReCSDN experiment on this configuration. Nonetheless, the threshold value can be reconfigured using the Policy Engine of ReCSDN whenever required.
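The stress-test procedure (push batches until the controller stops responding, repeat, then pick a threshold below the observed limit) can be sketched as below. Everything here is a stand-in for illustration: `FakeController` simulates a controller with a fixed capacity instead of talking to ONOS, and the batch size, capacities and 10% safety margin are assumed values, not figures from the experiments.

```python
class FakeController:
    """Simulated controller that stops accepting intents beyond its capacity."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.installed = 0

    def push(self, count: int) -> bool:
        if self.installed + count > self.capacity:
            return False  # stands in for "controller stopped responding"
        self.installed += count
        return True


def find_saturation(push_batch, batch_size: int, max_batches: int = 1000) -> int:
    """Push batches until one fails; return the number of intents accepted."""
    installed = 0
    for _ in range(max_batches):
        if not push_batch(batch_size):
            break
        installed += batch_size
    return installed


def pick_threshold(saturation_points, margin_percent: int = 10) -> int:
    """Choose a threshold safely below the lowest saturation point observed."""
    return min(saturation_points) * (100 - margin_percent) // 100


if __name__ == "__main__":
    # Three repeated runs against simulated controllers of slightly different capacity.
    runs = [find_saturation(FakeController(cap).push, 500)
            for cap in (5200, 5000, 5600)]
    print(runs)                  # [5000, 5000, 5500]
    print(pick_threshold(runs))  # 4500
```

Taking the minimum over repeated runs is a conservative choice; it keeps the threshold below the worst case seen on the given hardware.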
After determining the threshold value, we launched a DDoS attack on the SDN controller by pushing unknown flows into the network. We created a three-controller ReCSDN cluster and started pushing intents gradually. As we moved from 1000 to 40,000 intents, the ReCSDN control plane remained active without performance degradation, as shown in Fig. 8. The master ReCSDN controller distributed the load to ensure continuity of service. We also generated legitimate traffic on the network during the attack. There were no packet losses and the response time remained consistent throughout. ReCSDN is capable of providing resiliency not only in case of a DDoS attack but also in case of controller failure. It improves network performance by timely load distribution among the controllers.

VII. CONCLUSION AND FUTURE WORK
Software Defined Networking (SDN) is an emerging network paradigm that provides central control over the network. Although centralized control is one of the major advantages of SDN, it also raises critical concerns, including a single point of failure in case of attacks. The central control can also become a bottleneck that affects the network's overall quality of service.
In this paper, we highlighted the security threats specific to centralized control, that is, the SDN control plane. We addressed the control plane issues of performance bottleneck and single point of failure.
To improve the performance and fault tolerance of SDN, we proposed and implemented a resilient framework, ReCSDN. Our proposed solution not only detects excessive network traffic coming towards an SDN controller but also provides a mechanism to ensure continuity of services in case of a DDoS attack. ReCSDN uses a load-balancing strategy that invokes backup controllers in the ReCSDN cluster to distribute and manage the load without performance degradation. We performed extensive experiments by emulating the network using Mininet and implementing ReCSDN on top of ONOS. The experiments show that the proposed framework provides resiliency and more consistent performance. Although our results are specific to the ONOS controller, the methodology we presented is general and can be applied to any distributed controller platform. In future, we intend to experiment with a larger number of controllers.

Fig. 5. Effect of adding backup controllers on response time. The test was performed with a varying number of backup controllers: 1, 3, 5, and 7.

Fig. 6. Delay decreases as the number of controllers in the ReCSDN cluster increases.

Fig. 7. Stress test for checking controller processing capability. The red indicator shows the controller resource saturation point.