Predicting DOS-DDOS Attacks: Review and Evaluation Study of Feature Selection Methods based on Wrapper Process

Nowadays, cybersecurity attacks are becoming increasingly sophisticated and present a growing threat to individuals and to the private and public sectors, especially the Denial Of Service (DOS) attack and its variant, the Distributed Denial Of Service (DDOS) attack. Traditional mitigation solutions for these dangerous threats suffer from several limits and performance issues. To overcome these limitations, Machine Learning (ML) has become one of the key techniques to enrich, complement and enhance traditional security solutions. In this context, we focus on one of the key processes that improve and optimize ML models for predicting DOS-DDOS attacks: the DOS-DDOS feature selection process, and particularly the wrapper process. By studying different DOS-DDOS datasets, algorithms and results of several research projects, we have reviewed and evaluated the impact of the wrapper strategies used on the number of DOS-DDOS features and on many metrics commonly used to evaluate DOS-DDOS prediction models built on the optimized DOS-DDOS features. In this paper, we present three dashboards that are essential to understand the performance of three wrapper strategies commonly used in DOS-DDOS ML systems: heuristic search algorithms, meta-heuristic search and random search methods. Based on this review and evaluation study, we observe that wrapper strategies, algorithms and DOS-DDOS features with a relevant impact can be selected to improve existing DOS-DDOS ML solutions.
Keywords: DOS-DDOS attacks; feature selection; wrapper process; machine learning


I. INTRODUCTION
With the exponential growth in the number of Internet users, network traffic now generates massive amounts of data. These data come from individuals and from private and public organizations. Moreover, the highly complex Internet architecture and its interdependencies suffer from different vulnerabilities, threats and risks ([1], [2]). Consequently, attackers find an impressive number of vulnerable systems [3].
Nowadays, cybersecurity attacks are becoming increasingly sophisticated, particularly infrastructure attacks, which make security analysis systems more vulnerable to failures [1]. Among the best known of these threats are the Denial Of Service (DOS) attack and its variant, the Distributed Denial Of Service (DDOS) attack ([4], [5]). These serious and dangerous attacks violate the availability of information systems, which is a pillar of information security ([6], [5]). The attackers target computer systems, network devices, services and web applications to consume their CPU power, bandwidth, memory and processing time ([7], [3]).
The DDOS attack has the same purpose, but differs in that it uses multiple intermediate networks between the attacker and the target ([7], [8]). This technique allows the attacker to amplify the attack by orchestrating the simultaneous sending of an excessive number of unwanted requests to the victim in order to overload its computing capacity.
To deal with these DOS-DDOS attacks, some traditional mechanisms are deployed such as firewalls, software updates, antivirus, Intrusion Detection Systems (IDS), etc.
However, many challenges and limits hinder these traditional techniques [6]. To overcome these limitations and drawbacks, Machine Learning (ML) techniques can be used as artificial intelligence systems to enrich, complement and enhance the traditional security experiences.
One of the key pre-processing phases for the success of these DOS-DDOS ML models is feature selection. This process selects the most representative DOS-DDOS characteristics from the initial DOS-DDOS dataset by eliminating those that are redundant or insignificant. Consequently, the resulting feature subset improves the execution time, the detection rate and the accuracy of the DOS-DDOS models used.
In this context, this investigation presents a review and evaluation study related to the prediction of DOS-DDOS attacks based on one of the effective methods to select relevant DOS-DDOS features: the wrapper process. This paper is organized as follows: In Section 2 we study some traditional mitigation solutions and their limits. Section 3 describes the interest of using machine learning (ML) in DOS-DDOS attacks prevention. Section 4 exposes the impact of feature selection on DOS-DDOS machine learning projects. In Section 5 we review and evaluate recent and relevant feature selection results obtained by using three commonly used wrapper strategies: heuristic search algorithms, meta-heuristic search and random search methods. Finally, Section 6 presents our conclusions.
Anomaly detection supervises the behavior of network traffic and alerts the system at the slightest deviation from normal behavior. This method can detect new forms of attacks, but it generates many false positives and, for some forms of attacks, does not give clear information about the malicious events. Moreover, it is not feasible for an IDS to handle high-dimensional variables. Consequently, this technique can reduce the efficiency and the speed of intrusion detection ([15], [16], [17]).
In addition to the limitations and drawbacks mentioned above, traditional techniques are hindered by many other challenges [6]. For example, many traditional security strategies are not sufficient to protect information systems against new forms of DOS-DDOS attacks, need extra storage and computational resources because of the high volume of network traffic, lack information about attack sources, and are unable to detect and prevent many DOS-DDOS attacks in real time.
To overcome these drawbacks, Machine Learning has become one of the key techniques to enrich and complement these traditional security experiences. In the paragraph below we discuss briefly the benefits that can be attained by using ML-techniques in DOS-DDOS attacks prevention.

III. THE USE OF MACHINE LEARNING IN DOS-DDOS ATTACKS PREVENTION
Machine Learning (ML) is an evolutionary field of Artificial Intelligence (AI) composed of a set of rules, methods and functions [18]. Applied to deal with many challenges in DOS-DDOS attacks, ML algorithms can learn from DOS-DDOS datasets and discover hidden knowledge from them [19].
By finding interesting DOS-DDOS patterns in DOS-DDOS training data, ML algorithms make it possible to predict and prevent many recent forms of DOS-DDOS behavior.
Contrary to traditional security solutions, ML models are powerful tools that can analyze high-dimensional DOS-DDOS traffic in real time [20], classify the behavior of the traffic to distinguish normal from abnormal activity, and predict DOS-DDOS attacks with high accuracy before they happen.
Based on the DOS-DDOS security modeling process (Fig. 1) and on many common algorithms such as K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Random Forest (RF) and Naïve Bayes (NB), many recent research projects have shown other important prevention benefits of ML algorithms compared to the existing traditional solutions ([1], [12], [21]).
Feature selection is one of the critical pre-processing steps needed to achieve and improve the benefits mentioned above. In the paragraphs below, we summarize the benefits of this process.
Feature selection is one of the most critical pre-processing steps in building DOS-DDOS Machine Learning (ML) models. It is the first and crucial phase to improve the prediction accuracy and the detection rate and to reduce the execution time of DOS-DDOS models [22].
According to Bindra et al. [23], feature selection methods allow the DOS-DDOS security systems to distinguish DOS-DDOS attacks by using a minimum number of the most important features from network streams.
Applied to DOS-DDOS ML algorithms, feature selection focuses on selecting small and concise sets of DOS-DDOS characteristics to describe the ML models [24]. It prevents the features used from containing redundant information (features correlated with other features) and noisy information about DOS-DDOS attacks, without losing useful information. Consequently, it reduces the high memory requirements of security systems based on ML models ([25], [26], [27]).
Generally, existing DOS-DDOS ML security systems use three main categories of feature selection approaches: Filter, Wrapper and Hybrid methods [28].
The Filter methods are based on statistical measures that evaluate the relevance of DOS-DDOS features independently of any machine learning algorithm [27]. Because they are faster and computationally cheaper, these methods are often used on high-dimensional DOS-DDOS traffic ([29], [30]). However, evaluating each feature individually cannot take into consideration the correlation between DOS-DDOS features. Consequently, the final DOS-DDOS subset can contain redundancy, because some DOS-DDOS features can have the same ranking.
The Wrapper strategies use a predetermined algorithm and its performance to assess the optimal subset of DOS-DDOS features [31]. They are executed as an iterative process: at each iteration, a new subset of DOS-DDOS features is generated and evaluated by the classification algorithm [32]. The selection criterion is principally the cross-validation accuracy obtained on the DOS-DDOS training data [33].
The Hybrid method applies a filter method followed by a wrapper approach, which offers the advantages of the two previous methods. It exploits their different criteria in different search stages [34].
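The wrapper criterion described above can be illustrated with a minimal sketch. It assumes a 1-nearest-neighbour classifier and leave-one-out cross-validation, neither of which is prescribed by the reviewed studies, and it uses a small invented table (`X`, `y`) in which only the first column carries class information. The helper name `loo_accuracy` is hypothetical.

```python
# Toy records: (packet_rate, noise_a, noise_b); label 1 = attack, 0 = normal.
# All values are invented for illustration; they come from no real dataset.
X = [
    (0.9, 5.0, 1.0), (0.8, 1.0, 9.0), (1.0, 9.0, 4.0), (0.7, 2.0, 7.0),
    (0.1, 6.0, 2.0), (0.2, 1.5, 8.0), (0.0, 8.5, 3.0), (0.3, 2.5, 6.0),
]
y = [1, 1, 1, 1, 0, 0, 0, 0]


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `subset`."""
    hits = 0
    for i, row in enumerate(X):
        others = [k for k in range(len(X)) if k != i]
        # Squared Euclidean distance restricted to the candidate feature subset.
        nearest = min(others, key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in subset))
        hits += y[nearest] == y[i]
    return hits / len(X)


# The wrapper criterion: each candidate subset is scored by the classifier itself.
for subset in [(0,), (1,), (2,), (0, 1, 2)]:
    print(subset, loo_accuracy(X, y, subset))
```

On this toy table the single informative column scores higher than the full feature set, which is the effect a wrapper search tries to exploit. Any classifier and any validation scheme could replace the ones assumed here.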

A. Objective of the Study
To detect and prevent DOS-DDOS attacks accurately, wrapper methods are one of the most effective strategies to identify informative DOS-DDOS feature subsets in high-dimensional DOS-DDOS network streams. This approach to feature selection is often adopted in security solutions based on ML tasks. Indeed, an increasing number of research projects have shown that many wrapper strategies can have an important impact on the accuracy, detection rate and execution time of existing DOS-DDOS ML systems.
In this context, we decided to focus our attention on the assessment of the performance of many DOS-DDOS experiments based on wrapper strategies and machine learning algorithms.
By studying different DOS-DDOS datasets, algorithms and recent results of several research projects, we review and assess the impact of many recent wrapper strategies applied to predicting DOS-DDOS attacks. We take a more focused look at the impact of these strategies on the number of DOS-DDOS features, the detection rates, the execution times and the accuracies of DOS-DDOS attack prediction.
We present three dashboards that are essential to understand the performance of three wrapper strategies commonly used in DOS-DDOS ML systems: heuristic search algorithms, meta-heuristic search and random search methods.

B. Review and Evaluation Study of Feature Selection Methods based on Wrapper Process
1) Used Datasets:
To evaluate the performance of the wrapper strategies used in DOS-DDOS machine learning models, we start our review by studying relevant DOS-DDOS datasets commonly used by several DOS-DDOS research projects. These datasets are described below.
The Knowledge Discovery and Data Mining (KDD'99) dataset was built from the synthetic data captured in DARPA'98. This dataset is mainly composed of redundant records, which forces ML algorithms to learn less about infrequent records than about redundant ones. The unequal distribution of attacks between the training and testing phases also makes cross-validation more complicated. This dataset is composed of four main families of attacks and forty-one features.
The NSL_KDD was created to overcome the limits of the KDD'99 [35]. However, like the KDD'99, its main disadvantage is that it does not include modern low-footprint attack scenarios.
The UNSW_NB15 is composed of nine attack families and forty-nine features. It includes a hybrid of real modern normal behaviors and synthetic attack activities [35].
The Cyber Range Lab of the Australian Centre for Cyber Security (ACCS) dataset is mainly composed of hybrid modern normal activities and attack behaviors. It is composed of forty-seven features [36].

2) Used model evaluation metrics:
To evaluate the reviewed DOS-DDOS wrapper strategies, we have selected different metrics [37], including Classification Accuracy (Acc), Detection Rate (DR) and Recall. The formulas associated with these metrics are given in (1).

3) Impact of used DOS-DDOS datasets and algorithms on the wrapper process:
Generally, the performance of DOS-DDOS prediction models based on the wrapper process depends strongly on the ML algorithms and datasets used. As shown in Table I, the experiments based on the NB, C4.5 and RF algorithms and the UNSW_NB15 dataset carried out by Bellouch et al. (2018) [39] have shown that the prediction accuracy obtained by RF (Acc_RF = 99.94%) is better than that of C4.5 (Acc_C4.5 = 95.82%) and SVM (Acc_SVM = 92.28%). The NB algorithm shows lower accuracy (Acc_NB = 74.19%) than RF, C4.5 and SVM.
The Bayesian Network (BN) algorithm used in the experiment of Katkar and Kulkarni [40] achieved good accuracy (Acc_BN = 99.68%) in detecting DOS-DDOS attacks thanks to its capacity for detecting anomalies in a multi-class setting [41].
By comparing the experiments carried out by Jalill et al. [38] and Katkar and Kulkarni [40], we have observed that the SVM algorithm predicts DOS-DDOS attacks more accurately on the UNSW_NB15 dataset than on the KDD'99 dataset (Acc_SVM_UNSW_NB = 92.28% > Acc_SVM_KDD = 62.5%). According to W. Xingzhu [42], this important difference is caused by the redundant records in the KDD'99 dataset and by the slower training of SVM on high-dimensional datasets.
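The metrics compared in this section can all be derived from the four confusion-matrix counts. The sketch below uses the standard textbook definitions, which is an assumption: the exact formulas of the reviewed studies may differ in detail. The function names and the ten example labels are hypothetical. The False Alarm Rate (FAR) and the specificity (Sp), which appear in the results reported in the following subsections, are included for convenience.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives, treating `positive` as 'attack'."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth != positive and pred != positive:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Standard definitions (assumed, not taken from the reviewed papers)."""
    return {
        "Acc": (tp + tn) / (tp + fp + tn + fn),  # classification accuracy
        "DR": tp / (tp + fn),                    # detection rate, identical to recall
        "FAR": fp / (fp + tn),                   # false alarm rate
        "Sp": tn / (tn + fp),                    # specificity = 1 - FAR
    }


# Invented example: 4 attack records and 6 normal records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
print(metrics(*confusion_counts(y_true, y_pred)))
```

With these definitions DR and Recall coincide, and Sp and FAR always sum to one, so reporting one of each pair is enough.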

4) DOS-DDOS feature selection based on wrapper process and heuristic search algorithms:
Based on heuristic functions or cost measures, wrapper strategies using heuristic search algorithms optimize and iteratively improve the process of DOS-DDOS feature selection [43].
Many heuristic searches such as SFS (Sequential Forward Search), SBS (Sequential Backward Search), LRS (Plus L Minus R Selection), RELR (Random Effect Logistic Regression) and GFR (Gradual Feature Removal) have been used by many recent research projects to solve the problem of DOS-DDOS feature selection accurately.
We discuss these projects in the paragraph below. At the end of this subsection, we present our first dashboard (Tables IIA, IIB, IIC) to summarize and compare the performances of these strategies.
As an example of wrapper strategies based on heuristic search algorithms, we can cite the investigation of Kavitha and Chrita (2010) [44]. In this study, the authors used the Best First Search (BFS) method. They selected two subsets composed of seven and fourteen DOS-DDOS features, respectively. They applied four classification algorithms: ID3, J48, NB and One R. These experiments have shown that ID3 and J48 using the subset of fourteen DOS-DDOS features have the highest accuracy (Acc = 99%). One R and NB performed well in execution time (T = 0.5 s) with only seven features. The NB classifier achieved the highest specificity, with Sp_NB = 99% using seven features and Sp_NB = 100% using fourteen features.
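To make the heuristic strategies named in this subsection concrete, the following is a minimal sketch of Sequential Forward Search. It is not the procedure of any cited study: the scoring function (leave-one-out accuracy of a 1-nearest-neighbour classifier), the stopping rule (stop when no added feature improves the score) and the invented toy table are all assumptions.

```python
# Toy records: (packet_rate, noise_a, noise_b); label 1 = attack, 0 = normal.
# Invented values; only the first column separates the two classes.
X = [
    (0.9, 5.0, 1.0), (0.8, 1.0, 9.0), (1.0, 9.0, 4.0), (0.7, 2.0, 7.0),
    (0.1, 6.0, 2.0), (0.2, 1.5, 8.0), (0.0, 8.5, 3.0), (0.3, 2.5, 6.0),
]
y = [1, 1, 1, 1, 0, 0, 0, 0]


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `subset`."""
    hits = 0
    for i, row in enumerate(X):
        others = [k for k in range(len(X)) if k != i]
        nearest = min(others, key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in subset))
        hits += y[nearest] == y[i]
    return hits / len(X)


def sequential_forward_search(X, y, score):
    """Greedily add the feature that most improves `score`; stop when none does."""
    n_features = len(X[0])
    selected, best = [], 0.0
    while len(selected) < n_features:
        candidates = [j for j in range(n_features) if j not in selected]
        scored = [(score(X, y, selected + [j]), j) for j in candidates]
        top_score, top_j = max(scored, key=lambda pair: pair[0])
        if top_score <= best:  # no strict improvement: keep the current subset
            break
        selected.append(top_j)
        best = top_score
    return selected, best


print(sequential_forward_search(X, y, loo_accuracy))
```

Sequential Backward Search follows the same pattern in reverse: it starts from the full feature set and removes, at each step, the feature whose removal hurts the score least.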

5) DOS-DDOS feature selection based on wrapper process and meta-heuristic search:
Meta-heuristics are new optimization methods used in DOS-DDOS feature selection problems to provide near-optimal solutions [34]. These methods are based on two main search strategies [58]. The first strategy guarantees a global and efficient search for a solution to the DOS-DDOS feature selection problem. The second strategy improves the feature selection solutions found.
Important research projects have applied meta-heuristic strategies to solve the problem of DOS-DDOS feature selection. In the paragraphs below we discuss the important results of these investigations. At the end of this subsection, we present our second dashboard (Tables IIIA, IIIB, IIIC) to summarize and compare the performances of these strategies.
As an example of relevant research projects based on the wrapper process and meta-heuristic search, we can cite the investigation of Jun Wang et al. [59]. In this study, the ABC-SVM approach was adopted as the wrapper feature selection process. This wrapper strategy selected the five best DOS-DDOS features from the KDD'99 dataset and found the best parameters for the SVM classifier. This method achieved good accuracy (Acc_SVM_(5 features) = 99.92%) and improved the execution time (T_SVM_(5 features) = 12.20 s).
Alomari and Ali Othman (2012) [60] used an approach based on the Bees Algorithm (BA) as a wrapper feature selection method with the SVM classifier. This experiment selected six DOS-DDOS features from the KDD'99 dataset. They compared BA-SVM with other methods and concluded that their method achieved a high detection rate and accuracy (DR_SVM_(6 features) = 90.22%, Acc_SVM_(6 features) = 93.36%) in detecting attacks, with a low FAR (FAR_SVM_(6 features) = 4.56%).
De La Hoz et al. (2014) [61] used a multi-objective procedure based on the NSGA-II algorithm as wrapper feature selection to reduce the complexity of the Growing Hierarchical Self-Organising Maps (GHSOM) algorithm. This wrapper method selected twenty-five representative features. The Jaccard index, one of the objectives optimized by NSGA-II, is evaluated after training the GHSOM. Their proposal improved the accuracy compared to the baseline model (Acc_(25 features) = 99.5% > Acc_(42 features) = 96.02%).
Gaikwad and Thool (2015) [63] used a Genetic Algorithm as wrapper feature selection, which selected fifteen features. The authors used two classifiers, Partial Decision Tree (PART) and C4.5, and applied Bagging to both. This experiment has shown that using PART with Bagging enhanced the accuracy but increased the execution time (Acc_Bagging_PART = 99.71% > Acc_PART = 77.79%, T_Bagging_PART = 1589 s > T_PART = 274 s). On the other hand, using C4.5 with Bagging decreased the accuracy and drastically increased the execution time (Acc_Bagging_C4.5 = 77.86% < Acc_C4.5 = 79.08%, T_Bagging_C4.5 = 1795 s > T_C4.5 = 176.05 s).
Wang Xingzhu (2015) [42] combined ACO with a feature-weighting SVM. This wrapper strategy selected the ten most important DOS-DDOS features.
Tables IIIA, IIIB and IIIC summarize and compare the performances of all the wrapper and meta-heuristic strategies discussed above.
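As an illustration of how a meta-heuristic can drive a wrapper, the following is a minimal genetic algorithm sketch over feature bit masks. It does not reproduce any of the studies above: the synthetic data generator, the 1-nearest-neighbour fitness, the population size, the mutation rate and every function name are assumptions chosen to keep the example short.

```python
import random


def make_toy_data(rng, n_rows=20, n_noise=5):
    """Hypothetical data: column 0 separates the classes, the rest is noise."""
    X, y = [], []
    for i in range(n_rows):
        label = i % 2
        X.append([label + rng.uniform(-0.2, 0.2)] + [rng.uniform(0, 10) for _ in range(n_noise)])
        y.append(label)
    return X, y


def fitness(X, y, mask):
    """Leave-one-out 1-NN accuracy on the features switched on in `mask`."""
    subset = [j for j, bit in enumerate(mask) if bit]
    if not subset:
        return 0.0  # an empty subset cannot classify anything
    hits = 0
    for i, row in enumerate(X):
        others = [k for k in range(len(X)) if k != i]
        nearest = min(others, key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in subset))
        hits += y[nearest] == y[i]
    return hits / len(X)


def genetic_search(X, y, rng, pop_size=12, generations=15, p_mut=0.1):
    n = len(X[0])
    score = lambda m: fitness(X, y, m)
    # Random bit masks, plus the all-features mask as a baseline individual.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    best = max(pop, key=score)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = children
        best = max(pop, key=score)
    return best, score(best)


rng = random.Random(0)
X, y = make_toy_data(rng)
mask, acc = genetic_search(X, y, rng)
print("selected mask:", mask, "accuracy:", acc)
```

Because the all-features mask is part of the initial population and the best individual is always carried over, the returned subset is never scored lower than the full feature set. Other meta-heuristics such as ABC, BA or ACO differ in how new candidate masks are generated, but they use the classifier score in the same way.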

6) DOS-DDOS feature selection based on wrapper process and Random search methods:
Random search methods are applied in DOS-DDOS feature selection projects to evaluate DOS-DDOS features by random sampling around the problem region. These stochastic methods are mainly used to solve global optimization problems [71].
Many important research projects have used the wrapper process with random search methods to optimize DOS-DDOS feature subsets. We discuss these projects in the paragraphs below. At the end of this subsection, we present our third dashboard (Table IV) to summarize and compare the performances of these strategies.
As an example of these investigations, we can cite the study of Lin et al. (2012) [72], which combined Simulated Annealing (SA) with the SVM algorithm to obtain the best feature subset. This experiment selected the twenty-three best DOS-DDOS features, which were evaluated using SA as the random search and the C4.5 decision tree as the classifier. Compared to the initial set of features, the selected subset achieved a high accuracy of 99.96%.
The authors of [74] proposed an IDS model that combined the Binary Firefly (BFA) method with the Naïve Bayes (NB) classifier on the NSL_KDD dataset. Unlike the Firefly (FA) algorithm, the BFA is initialized with a binary sequence. This model was iterated two hundred times with fifteen selected features and achieved better accuracy than with all the features used (Acc_(25 features) = 94.83% > Acc_(42 features) = 89.9%).
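The random search idea can be sketched with a small simulated annealing loop over feature bit masks. This is an illustrative sketch only, not the setup of Lin et al.: it assumes a 1-nearest-neighbour scorer in place of SVM or C4.5, a synthetic data generator, and arbitrary values for the number of steps, the starting temperature and the cooling factor. All function names are hypothetical.

```python
import math
import random


def make_toy_data(rng, n_rows=20, n_noise=5):
    """Hypothetical data: column 0 separates the classes, the rest is noise."""
    X, y = [], []
    for i in range(n_rows):
        label = i % 2
        X.append([label + rng.uniform(-0.2, 0.2)] + [rng.uniform(0, 10) for _ in range(n_noise)])
        y.append(label)
    return X, y


def fitness(X, y, mask):
    """Leave-one-out 1-NN accuracy on the features switched on in `mask`."""
    subset = [j for j, bit in enumerate(mask) if bit]
    if not subset:
        return 0.0
    hits = 0
    for i, row in enumerate(X):
        others = [k for k in range(len(X)) if k != i]
        nearest = min(others, key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in subset))
        hits += y[nearest] == y[i]
    return hits / len(X)


def simulated_annealing(X, y, rng, steps=200, t_start=0.5, cooling=0.98):
    n = len(X[0])
    current = [1] * n  # start from the full feature set
    current_score = fitness(X, y, current)
    best, best_score = current[:], current_score
    temp = t_start
    for _ in range(steps):
        candidate = current[:]
        j = rng.randrange(n)
        candidate[j] = 1 - candidate[j]  # random neighbour: flip one feature
        cand_score = fitness(X, y, candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        temp *= cooling  # geometric cooling schedule
    return best, best_score


rng = random.Random(1)
X, y = make_toy_data(rng)
mask, acc = simulated_annealing(X, y, rng)
print("selected mask:", mask, "accuracy:", acc)
```

Accepting some worse subsets early, while the temperature is high, is what lets the search escape local optima. As the temperature falls, the loop behaves more and more like a plain hill climber.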

VI. CONCLUSION
Nowadays, cybersecurity attacks grow over time, especially the Denial of Service attack (DOS) and its variant Distributed Denial of Service (DDOS). These famous attacks continue to threaten private and public activities everywhere.
Dealing with these threats by using Machine Learning (ML) models can hold a great promise in DOS-DDOS security systems. By learning from and identifying a large amount of network traffic, these predictive models can efficiently handle the DOS-DDOS threats and overcome several limits and performance issues of the traditional security solutions.
One of the key preprocessing phases for the success and optimization of these DOS-DDOS cybersecurity intelligence models is the feature selection step, particularly feature selection based on wrapper strategies.
Using wrapper techniques significantly improved the selection of relevant DOS-DDOS features and enhanced the performance of many existing ML solutions.
In this paper, we have built on this previous work by studying different DOS-DDOS datasets, algorithms and the results of several research projects. We have reviewed and evaluated the impact of many important wrapper strategies used by existing DOS-DDOS security systems.
We have summarized the findings in three dashboards that are essential to understand the performance of three wrapper strategies commonly used in DOS-DDOS ML models: heuristic search algorithms, meta-heuristic search and random search methods.
This study shows that wrapper strategies, algorithms and DOS-DDOS features with a relevant impact can be selected to improve existing DOS-DDOS ML solutions.
ACKNOWLEDGMENT
I would like to express my sincere gratitude to my Professors and my family for the continuous support of my study and related research, for their patience, motivation, and immense knowledge.