A Multi-layer Machine Learning-based Intrusion Detection System for Wireless Sensor Networks

With the increasing reliance on the Internet and the shift of most businesses toward providing remote services, the burden of protecting the network and detecting attacks quickly has become more significant, as the attack surface and the number of cyberattacks grow in turn. Most current intrusion detection models for Wireless Sensor Networks (WSNs) that use machine learning methods to identify previously unseen attacks utilize a single layer of detection, meaning that a costly algorithm must be run before any suspicious activity is detected. In this paper, we propose a multi-layer intrusion detection framework for WSNs in which we adopt a defense-in-depth security strategy, where two layers of detection are deployed. The first layer is located on the network edge, where the sensors are distributed; it uses a Naive Bayes classifier for real-time decision making on the inspected packets. The second layer is located on the cloud and utilizes a Random Forest multiclass classifier for an in-depth analysis of the inspected packets. The results demonstrate that our proposed multi-layer detection model gives relatively high performance in terms of TPR, TNR, FPR, and FNR, while also achieving high Precision rates of 100%, 90.4%, 99.5%, 97%, and 99.9% for the Normal, Flooding, Scheduling, Grayhole, and Blackhole classes, respectively.
Keywords: Intrusion detection; wireless sensor networks; machine learning; defence in depth strategy


I. INTRODUCTION
The emergence of wireless devices, especially in Wireless Sensor Networks (WSNs), together with the rapid spread of Internet of Things technology, has led to a dramatic increase of the attack surface, leaving networks exposed to various types of attacks [1]. For this reason, intrusion detection methods with high stability, efficiency, and adaptability are urgently needed to protect such networks. At present, traditional wireless network intrusion detection methods suffer from limitations such as low detection accuracy, low precision, and a high false positive rate [2]. Therefore, there is a growing need for a more accurate and efficient intrusion detection framework to improve the quality of intrusion detection in the wireless sensor network environment.
Nowadays, the application of artificial intelligence methods to intrusion detection systems has become one of the most important research fields, especially the use of machine learning algorithms. Additionally, some studies apply other methods, including neural networks [3], [4], [5], genetic algorithms [6], [7], and deep learning techniques [8], [9], [10].
Most of the current frameworks proposed for detecting intrusions in wireless sensor networks, such as the works in [1], [2], [3], deal with the network as a whole; thus, they tend to propose one layer of detection, even though WSNs consist of a considerable number of sensors distributed over a large area. Therefore, our target in this paper is to divide the task of detecting network intrusions between two detection layers. In the first layer, a simple classifier with a very low computational cost (i.e. Naive Bayes) is used to filter the malicious traffic and pass it to the second layer, in which more extensive processing is carried out by a multi-class Random Forest classifier [11].
In the last few years, many approaches have been proposed to design intrusion detection systems for wireless sensor networks. The authors in [12] introduced an evolutionary mechanism to extract intrusion detection rules. In order to extract diverse rules and control the number of rule sets, rules are checked and extracted according to the distance between rules in the same type of rule set and rules in different types of rule sets.
Likewise, Sun et al. [13] proposed WSN-NSA, an intrusion detection model for wireless sensor networks based on an improved V-detector algorithm. The V-detector algorithm is adapted by changing the detector generation rules and optimizing the detectors, and principal component analysis is used to reduce the detection features. Similarly, Tajbakhsh et al. [14] proposed an intrusion detection model based on fuzzy association rules, which uses these rules to construct classifiers and uses matching metrics to evaluate the compatibility of any new sample with the different rule sets.
Singh et al. [15] proposed an advanced hybrid intrusion detection system (AHIDS) that automatically detects wireless sensor network attacks. Moreover, the authors in [16] proposed a method that uses the synthetic minority oversampling technique (SMOTE) to balance the dataset and then trains a random forest classifier for intrusion detection. The simulations were conducted on a benchmark intrusion dataset, and the accuracy of the random forest algorithm reached 92.39%, which is higher than that of the other algorithms compared.
The rest of this paper is organized as follows. Section 2 reviews related work and provides some background. Section 3 describes our proposed multi-layer detection model. Section 4 presents the implementation and the experimental results obtained from our proposed model. The analysis and discussion of the results are given in Section 5. Finally, conclusions and future work are presented in Section 6.

II. RELATED WORKS AND BACKGROUND
WSNs face threats and security issues during the transmission of data packets between their elements. This is mainly due to the vulnerable nature of WSNs, as these types of networks have a considerable number of sensor nodes that are prone to attack and exposed to severe kinds of threats. From previous studies, we found that such issues have been tackled by anomaly detection methods [17], [18], [19] and misuse detection methods [20], [21]. The authors in [22] proposed an anomaly detection framework for heterogeneous WSNs using real data. They combined two different approaches: a short-term approach, which locally analyzes the data sensed by individual nodes, and a long-term approach, which compares data coming from several heterogeneous sensors over the network. The proposed framework demonstrated that combining the short-term and long-term approaches can reduce the drawbacks of using each of them separately and gives better performance.
The authors of [1] presented an intrusion detection method for wireless networks based on an improved Convolutional Neural Network (ICNN), first pre-processing the network traffic data and then using the ICNN to model that data. Their results show an improved accuracy and a higher true positive rate of intrusion detection, as well as a lower false positive rate compared with the other models. In the work presented by [23], an approach for jamming detection in WSNs is proposed based on cooperation, using the feedback received from the connected neighboring nodes. The model uses two techniques, a connected mechanism and an extended mechanism. The results show that this model is more effective when applied to a hierarchical protocol such as the Multi-Parent hierarchical protocol.
Another intrusion detection model based on deep learning was proposed by [2]. They built a Deep Belief Network (DBN) combined with multiple restricted Boltzmann machines (RBMs), in addition to using a support vector machine (SVM) in training the model. Their experimental results showed that the proposed detection model improved the detection accuracy. An intelligent WSN intrusion detection approach was introduced by [24], which was shown to mitigate the attacks efficiently. They proposed an Artificial Neural Network classifier with a Multilayer Perceptron (ANN-MLP), evaluated using holdout and 10-fold cross-validation methods, and also built their own dataset specialized for WSN attacks. Their results showed that the highest accuracy values were obtained with one hidden layer; however, their approach was mainly based on one detection layer that applies a very computationally expensive learning method.

III. THE PROPOSED MULTI-LAYER DETECTION MODEL
In this paper, we propose a framework for intrusion detection in WSNs that is shielded by a defence-in-depth strategy, leading to increased security of the working system as a whole. Fig. 1 shows an overview of the system, where the two protection layers are represented as the Edge-based Method and the Cloud-based Method; both layers deploy a machine learning algorithm to facilitate the identification of previously unseen network attacks. This work extends our recent research paper [25]. The following subsections describe the deployed methods in detail:

A. First Detection Layer: Naive Bayes-based Method
In order to avoid complexity and to avoid overwhelming the first detection layer, we chose to implement a binary classifier in which the traffic is classified as either normal or malicious only [26], [27]. We have used the Naive Bayes algorithm as the basis of the classifier because its simplicity and computational efficiency make it a promising choice for real-time decision making on the inspected packets.
The Naive Bayes classifier is based on the well-known Bayes' theorem, and it is particularly suited to high-dimensional datasets [28]. Despite its relative simplicity, this classifier works very well in many complex real-world conditions and may outperform more sophisticated classification methods. The Naive Bayes model lets each attribute contribute equally and independently to the final decision, which makes it more computationally efficient than many other classifiers.
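As a brief illustration of the independence assumption described above, the decision rule can be written in its standard textbook form. The symbols are introduced here for illustration only and do not appear in the original text: x_1, ..., x_n denote the features of an inspected packet and c denotes the class.

```latex
\hat{c} \;=\; \arg\max_{c \,\in\, \{\text{normal},\ \text{malicious}\}} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

Because the likelihood factorizes into one term per feature, scoring a packet costs a number of operations proportional to the number of features, which is consistent with the low computational cost sought for the edge layer.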

B. Second Detection Layer: Random Forest-based Method
As discussed in the previous subsection, the first layer classifies the monitored traffic as either normal or malicious, with no further details about the attack type; this is because, on that layer, we are mainly seeking simplicity and time efficiency in the decision-making process. However, as the second detection layer is located on the cloud and mainly handles the suspicious traffic, there are fewer constraints on the available resources, meaning that more complex algorithms and a more thorough analysis can be carried out. Therefore, a multi-class Random Forest (RF) classifier has been used to confirm the malicious intent of the traffic; the classifier has also been used to identify the type of the launched attack, thus providing guidelines for choosing the appropriate defence mechanism.
A Random Forest classifier is composed of a set of decision trees, where every tree provides a vote on each sample's class. At the end of the classification, the class with the most votes is selected as the likely class. The aggregation approach followed in this classifier is based on Breiman's concept of bagging with randomly selected features on each generated bag, thus creating a set of varied decision trees [29]. Decision trees, which are constructed during the classification task of the Random Forest classifier, are supervised learning algorithms that are used to address both classification and regression tasks. They derive rules from training samples represented by a set of attributes, and these rules can be easily interpreted as they are visualized as a tree-like graph.
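The division of labour between the two layers can be sketched as a simple cascade. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the function `cascade_classify`, the two toy stand-in classifiers, and the two-value packet representation are all hypothetical.

```python
from typing import Callable, List, Sequence

Packet = Sequence[float]  # hypothetical packet representation: a feature vector


def cascade_classify(
    packets: List[Packet],
    edge_is_malicious: Callable[[Packet], bool],
    cloud_attack_type: Callable[[Packet], str],
) -> List[str]:
    """Label each packet; only edge-flagged packets reach the cloud model."""
    labels = []
    for packet in packets:
        if edge_is_malicious(packet):
            # Second layer: costly multi-class analysis, suspicious traffic only.
            labels.append(cloud_attack_type(packet))
        else:
            # First layer cleared the packet; no cloud round trip is needed.
            labels.append("Normal")
    return labels


# Hypothetical stand-ins for the trained models (not the paper's classifiers).
def toy_edge_filter(packet: Packet) -> bool:
    return packet[0] > 10  # cheap binary test on a single feature


def toy_cloud_model(packet: Packet) -> str:
    if packet[0] <= 12:
        return "Normal"  # the cloud layer may overturn an edge false alarm
    return "Flooding" if packet[1] == 1 else "Blackhole"


packets = [(2, 0), (11, 0), (40, 1), (25, 0)]
labels = cascade_classify(packets, toy_edge_filter, toy_cloud_model)
print(labels)  # ['Normal', 'Normal', 'Flooding', 'Blackhole']
```

In this sketch the expensive model is invoked for three of the four packets only because the toy data is attack-heavy; with mostly normal traffic, the share of packets forwarded to the cloud would be correspondingly small.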
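The bagging-and-voting idea described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the classifier used in this paper: one-split trees (stumps) stand in for full decision trees, each stump is thresholded at the feature mean, and all names and the toy data are hypothetical.

```python
import random
from collections import Counter


def majority(labels, fallback=None):
    """Most common label, or `fallback` when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else fallback


def fit_stump(rows, labels, feature):
    """One-split tree on a single feature, thresholded at the feature mean."""
    threshold = sum(row[feature] for row in rows) / len(rows)
    overall = majority(labels)
    left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
    right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
    return feature, threshold, majority(left, overall), majority(right, overall)


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        feature = rng.randrange(len(rows[0]))        # random feature choice
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest


def predict_forest(forest, row):
    """Every tree votes; the class with the most votes wins."""
    return majority([predict_stump(stump, row) for stump in forest])


# Toy, well-separated data (hypothetical feature values).
rows = [(0, 1), (1, 0), (1, 1), (0, 0), (9, 10), (10, 9), (9, 9), (10, 10)]
labels = ["Normal"] * 4 + ["Attack"] * 4
forest = fit_forest(rows, labels)
print(predict_forest(forest, (1, 0)), predict_forest(forest, (9, 10)))
```

Individual stumps trained on an unlucky bootstrap sample may vote incorrectly; the majority vote is what makes the ensemble robust to such cases.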

IV. IMPLEMENTATION AND EXPERIMENTAL RESULTS
Python 3.7 has been used to implement the proposed framework, together with the latest version of scikit-learn, an open-source machine learning library [24]. For testing purposes, we have used WSN-DS, a dataset generated mainly for intrusion detection systems in wireless sensor networks.
A number of metrics have been utilized to assess and evaluate the performance of the implemented system. Those metrics can be described briefly as follows:
• True positive (TP): the number of network connections correctly identified as attacks.
• True negative (TN): the number of network connections correctly identified as normal connections.
• False positive (FP): the number of network connections incorrectly identified as attacks.
• False negative (FN): the number of network connections incorrectly identified as normal connections.
Those terms have been used to derive different evaluation metrics, i.e. the True Positive Rate (TPR), True Negative Rate (TNR), False Positive Rate (FPR), and False Negative Rate (FNR); in addition, they have been used to calculate the Precision (P), as follows:
$$TPR = \frac{TP}{TP + FN}, \qquad TNR = \frac{TN}{TN + FP}$$
$$FPR = \frac{FP}{FP + TN}, \qquad FNR = \frac{FN}{FN + TP}$$
$$Precision = \frac{TP}{TP + FP}$$
To establish the feasibility of the proposed approach and to determine its accuracy, we have used a dataset generated mainly for evaluating intrusion detection systems in wireless sensor networks (referred to as the WSN-DS) [24]. The dataset consists of 19 features monitored during normal and abnormal scenarios, where in the latter various numbers and types of Denial of Service (DoS) attacks were simulated (i.e. Blackhole, Grayhole, Flooding, and Scheduling (TDMA) attacks). Table I gives an overall view of the WSN-DS dataset features, including their description.
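The rates named in this section can be computed directly from the four counts. The sketch below uses the standard definitions (for example, TPR = TP / (TP + FN) and Precision = TP / (TP + FP)) and treats one class at a time as the positive class; this one-vs-rest reading for the multi-class case, the function name, and the toy labels are assumptions made for illustration.

```python
def one_vs_rest_rates(y_true, y_pred, positive):
    """TPR, TNR, FPR, FNR and Precision with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "TPR": ratio(tp, tp + fn),
        "TNR": ratio(tn, tn + fp),
        "FPR": ratio(fp, fp + tn),
        "FNR": ratio(fn, fn + tp),
        "Precision": ratio(tp, tp + fp),
    }


# Toy labels (hypothetical, not taken from WSN-DS).
y_true = ["Normal", "Normal", "Flooding", "Flooding", "Blackhole", "Normal"]
y_pred = ["Normal", "Flooding", "Flooding", "Flooding", "Blackhole", "Normal"]
rates = one_vs_rest_rates(y_true, y_pred, "Flooding")
print(rates)
```

For the Flooding class in this toy example the counts are TP = 2, TN = 3, FP = 1, and FN = 0, so TPR is 1.0, FPR is 0.25, and Precision is 2/3. Note that TPR + FNR = 1 and TNR + FPR = 1 always hold, so a lower TNR necessarily implies a higher FPR.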

A. First-Layer Results and Discussions
As the main purpose of the first layer is to identify the abnormal traffic with the least resources possible, we used the Mutual Information (MI) algorithm to quantify the importance of each feature (as seen in Fig. 2) and thereby select the most relevant ones. MI is widely known as a good indicator of the relevance between variables, and it is commonly used in the area of AI as a feature selection algorithm [30], [31]. Fig. 2 shows the computed MI score for each feature, where the higher the score, the more important the feature.
Based on some preliminary tests, we have found that choosing the best three features, as ranked by MI, gives the highest classification performance. Fig. 3 (a & b) shows the classification accuracy when including the best three features and all 19 features provided by WSN-DS, respectively. Thus, the top three features, ADV_S, Is_CH, and Join_S, have been used as input to the Naive Bayes classifier in order to filter the malicious traffic and pass it to the second protection layer for further examination. It can be seen from Fig. 3 (a) that a 99% detection accuracy of the abnormal activities has been achieved with the use of three features only, while maintaining a low usage of computational resources; the Area Under the Curve (AUC), a commonly used statistic for the overall performance of a classification method, is shown in Fig. 4.
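To make the ranking step concrete, the following sketch scores discrete features by their mutual information with the class label and keeps the top k. It is a simplified illustration: the plug-in estimator below assumes discrete feature values, whereas continuous features would need a different estimator (scikit-learn, for instance, offers `mutual_info_classif`). The feature names and data are hypothetical and are not WSN-DS columns.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def top_k_features(columns, labels, k):
    """Rank feature columns by MI with the labels and keep the best k."""
    scores = {name: mutual_information(vals, labels) for name, vals in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: 0 = normal, 1 = malicious.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "f_noise": [0, 1, 0, 1, 0, 1, 0, 1],    # independent of the label
    "f_partial": [0, 0, 0, 1, 1, 1, 1, 0],  # agrees with the label 6 times out of 8
    "f_copy": [0, 0, 0, 0, 1, 1, 1, 1],     # identical to the label
}
top, scores = top_k_features(columns, labels, k=2)
print(top)  # ['f_copy', 'f_partial']
```

Here `f_copy` reaches the maximum possible score for a balanced binary label (ln 2, about 0.693 nats), while `f_noise` scores zero and is discarded.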
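As a short illustration of the AUC statistic mentioned above, the sketch below computes it from classifier scores using its rank interpretation: the probability that a randomly chosen malicious sample receives a higher score than a randomly chosen normal one, with ties counted as one half. The function name and the toy scores are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical malicious-class scores from a binary classifier.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))  # 0.889
```

In this toy case eight of the nine positive-negative pairs are ordered correctly, giving an AUC of 8/9; a value of 1.0 indicates perfect separation and 0.5 corresponds to random scoring.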

B. Second-Layer Results and Discussions
On the second detection layer, the malicious traffic is examined further; thus, a multi-class classification using the RF classifier is performed to identify the specific type of the attack, so that the appropriate defence mechanism can be chosen. The classification results obtained by the RF classifier are shown in Fig. 5; it can be seen that a relatively high performance was achieved, as illustrated in Table II. Such a high detection performance allows more concrete countermeasures to be adopted automatically by the system. Generally, the aim of an IDS is to obtain a high precision [32], as this measure shows how many of the cases predicted as intrusive are actually correct. On that basis, when we compare the performance obtained with the RF classifier in this paper with a previous work that used the same dataset, e.g. [24], it can be clearly seen that a higher precision has been achieved: the precision values in the previous work were 73%, 90%, 99.5%, 91.1%, and 99% for the Blackhole, Flooding, Scheduling, and Grayhole attacks and the Normal case (without attacks), respectively. A comparison of the performance metrics between the previous work [24] and our proposed model is illustrated in Figs. 6 and 7, which show an improvement in the values of TPR, TNR, FPR, FNR, and Precision compared to the previous work.
However, Fig. 6 shows one case where our proposed work has achieved a slightly lower value, namely the TNR of the Normal packets. Consequently, the FPR derived from the Normal packets becomes higher. This means that more packets will be inspected further and flagged as malicious although they do not carry any harmful intent. This case could be costly (in terms of the time spent during the investigation); however, it would not be as expensive as failing to identify a malicious packet and recognising it as a Normal one instead.
Moreover, our work provides other advantages inherited from the use of the RF classifier (rather than the artificial neural network used in [24]), such as the fact that it is considered less computationally expensive than an ANN classifier. The use of the RF classifier also strengthens the security of the system as a whole in that it provides interpretability and transparency of the results, as shown in Fig. 8, where a tree generated by the RF classifier can be easily interpreted; the resulting rules could also be investigated further using tools such as [33]. Such properties are very important in the analysis of attacks and in the optimisation and handling of system errors [34]. Most importantly, the proposed work employs a layered defence mechanism that enhances security by providing an extra protection layer to defend the whole system in cases where the first layer has been bypassed or has failed as a result of ever-changing attack techniques and the increasing threat landscape.
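The interpretability argument can be illustrated by printing the rules of one tree from a trained forest. This is a minimal sketch assuming scikit-learn is installed; the toy data and the feature names `msgs_sent` and `msgs_received` are hypothetical and are not WSN-DS features, and the output is not the tree shown in Fig. 8.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

# Toy, well-separated data (hypothetical feature values).
X = [[0, 1], [1, 0], [1, 1], [0, 0], [9, 10], [10, 9], [9, 9], [10, 10]]
y = ["Normal"] * 4 + ["Attack"] * 4

forest = RandomForestClassifier(n_estimators=5, max_depth=2, random_state=0)
forest.fit(X, y)

# Each fitted sub-tree can be rendered as nested if/else rules.
# Sub-trees store encoded class indices, so leaves show numbers;
# forest.classes_ maps those indices back to the label names.
rules = export_text(
    forest.estimators_[0], feature_names=["msgs_sent", "msgs_received"]
)
print(rules)
print(list(forest.classes_))
```

Reading such rules tree by tree is practical for shallow trees; for deep forests, inspecting a few representative trees or the feature importances is usually more manageable.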

VI. CONCLUSIONS
Intrusion detection in wireless sensor networks is a very challenging task. The majority of current WSN intrusion detection models use machine learning methods, but they apply only one method to the whole network. In this paper, we propose a multi-layer framework for intrusion detection in WSNs that increases network security. Our proposed model consists of two consecutive protection layers. The first layer is located on the edge of the network, where the sensors are located. It uses a Naive Bayes classifier to classify the traffic as normal or malicious, which achieves simplicity and time efficiency in the decision-making process. The second layer is located on the cloud and mainly handles the suspicious traffic by using a multi-class Random Forest classifier.
The implementation results demonstrate that our proposed multi-layer protection model improved the values of TPR, TNR, FPR, and FNR, in addition to achieving high Precision rates of 100%, 90.4%, 99.5%, 97%, and 99.9% for the Normal, Flooding, Scheduling, Grayhole, and Blackhole classes, respectively, whereas the previous work reported 99.8%, 90.4%, 99.5%, 91.1%, and 73% for the same classes. Nevertheless, the results in Fig. 6 show one case where our proposed work has achieved a slightly lower value, namely the TNR of the Normal packets. Consequently, the FPR derived from the Normal packets becomes higher. This means that more packets will be inspected further and flagged as malicious although they do not carry any harmful intent. This case could be costly (in terms of the investigation time); however, it would not be as expensive as failing to identify a malicious packet and recognising it as a Normal one instead.
As future work, we plan to improve the performance of our multi-layer detection model for WSNs by using a deep learning technique in the second layer, where the larger number of attack types has to be distinguished.