A Paradigm for DoS Attack Disclosure using Machine Learning Techniques

Abstract: Cybersecurity is one of the main concerns of governments, businesses, and even individuals, because a vast number of attacks target their core assets. One of the most dangerous attacks is the Denial of Service (DoS) attack, whose primary goal is to make resources unavailable to legitimate users. In general, Intrusion Detection and Prevention Systems (IDPS) hinder DoS attacks using advanced techniques. This study uses machine learning techniques to develop a model for detecting DoS attacks. The suggested detection model was investigated on the NSL-KDD dataset using the Naive Bayes, K-Nearest Neighbor, Decision Tree, and Support Vector Machine algorithms. The Accuracy, Recall, Precision, and Matthews Correlation Coefficient (MCC) metrics are used to compare these four techniques. In general, all techniques perform well with the proposed model. However, the Decision Tree technique outperformed all the other techniques in all four metrics, while the Naive Bayes technique showed the lowest performance.


I. INTRODUCTION
The world is currently living in the digital era, which has produced many services and applications to make life easier. One of the primary concerns of these services and applications is security [1]. Companies and even individuals live a nightmare due to the number of cyberattacks. More than 61,000 website attacks are blocked every day. In addition, around 24,000 malicious mobile applications are blocked every day on application stores [2]. One of the most dangerous cyberattacks is the Denial of Service (DoS) attack, whose main goal is to make a resource unavailable to its intended users. DoS attacks are increasing rapidly; the number of DoS attacks worldwide is expected to reach 15.4 million by 2023 [3].
Intrusion Detection and Prevention Systems (IDPS) are among the techniques available to counteract a DoS attack. IDPS is software/hardware that observes and inspects system events in order to sense and warn of unauthorized efforts to access system resources in real-time or near real-time. IDPS detects intrusion by either searching for a pre-defined pattern in the traffic or by observing anomalies of what is considered normal traffic for the network or host [4]. IDPS should be equipped with smart and self-learning techniques to detect zero-day DoS attacks. Machine learning is a subfield of artificial intelligence that encompasses a number of techniques for accomplishing this goal [5].
As the name implies, machine learning systems improve automatically through experience and by using existing data, which makes them suitable for detecting zero-day DoS attacks. Machine learning can be supervised, unsupervised, or semi-supervised. Generally, supervised learning algorithms operate on structured and labeled data similar to that used by the IDPS [6] [7]. Hence, the fundamental aim of this research is to suggest a paradigm for identifying suitable supervised machine learning algorithms for detecting DoS attacks via IDPS. This paper is structured as follows. Section 2 covers the topics fundamental to this work: the NSL-KDD dataset, machine learning techniques, the min-max scaler, and K-Fold Cross-Validation. Section 3 discusses related works that have employed machine learning approaches to detect DoS attacks. Section 4 discusses the proposed DoS attack detection model. Finally, Section 5 concludes the paper and discusses the scope for future work.

II. BACKGROUND
This section discusses the basic concepts related to this work. It gives a brief description of the NSL-KDD dataset and of the machine learning techniques used in this article. Finally, it discusses the algorithms used to pre-process the data and to validate the results.

A. NSL-KDD Dataset
The NSL-KDD dataset is a processed version of KDD-CUP99, from which the records that adversely affect the evaluated systems have been removed. The NSL-KDD dataset still has some problems; however, it is still considered an adequate benchmark dataset that helps security developers investigate intrusion detection techniques. The number of records in the NSL-KDD dataset is sufficient to run the experiments and evaluate the results of different techniques. Table I shows the number of records in the NSL-KDD dataset according to the attack type. The NSL-KDD dataset has four different attack types. This paper is only interested in the DoS attack, and all records of the other attacks are removed.

B. Machine Learning Techniques that are used in this Article
Supervised machine learning deals with data sets that contain both inputs and the corresponding desired outputs. Within supervised learning, classification algorithms are used when the outputs are discrete, that is, restricted to a limited set of values. The most common classification algorithms are Naive Bayes, K-Nearest Neighbors (KNN), Decision Tree, and Support Vector Machines (SVM) [7] [10] [11] [12].
1) Naive Bayes: Naive Bayes is a simple classification technique based on Bayes' theorem. It assumes that the features are independent of one another; the presence of any feature is unrelated to that of any other feature. It is widely regarded as an effective classification algorithm and produces models that train and predict quickly. In Naive Bayes, the features make independent and equal contributions to the outcome. Equation 1 shows the probabilistic expression used in Bayes' theorem [7] [10].
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \qquad (1)$$
where $P(A \mid B)$ is the posterior probability of class $A$ given the observed features $B$, $P(B \mid A)$ is the likelihood of the features given the class, $P(A)$ is the prior probability of the class, and $P(B)$ is the probability of the features.
2) K-NN: K-NN is one of the most important and extensively used machine learning algorithms. As the name implies, K-NN finds the K points nearest to the target point, where K is the number of neighbors. Then, it predicts the output of the target point from these neighbor points. K can be constant or vary based on the local density of points. A common rule of thumb sets K to the square root of the number of records in the dataset. The Euclidean distance is one of the distance measures K-NN uses to find the neighbor points. Equation 2 shows the formula of the Euclidean distance [7] [11].
$$\text{Euclidean distance between } X \text{ and } Y = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \qquad (2)$$
where $x_i$ and $y_i$ are the $i$-th feature values of $X$ and $Y$, and $n$ is the number of features.
3) Decision Tree: The decision tree technique creates an upside-down tree to represent the classification model. It is easy to understand and visualize, and it requires little data preparation. The tree consists of nodes that symbolize the dataset's features, branches that symbolize the decision rules, and leaves that symbolize the classes, as shown in Fig. 1. The decision tree follows if-else (True/False) tests to move from one node to the next until it reaches a leaf [7] [12].
4) SVM: SVM is a widely used supervised learning approach for classification. The SVM technique plots the data items as points in a space that is split into categories. Then, it finds the hyperplane that distinctly separates the points of the different categories. The SVM technique should choose the hyperplane with the maximum distance to the nearest data points of each category. This gives a more accurate classification for any new data points. Fig. 2 clarifies the SVM technique [7] [10].
www.ijacsa.thesai.org
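As an illustration of how K-NN uses the Euclidean distance, the following is a minimal sketch in plain Python. It is not the implementation used in this paper: the function names `euclidean_distance` and `knn_predict`, the toy records, and the choice of K = 3 are all assumptions made for the example.

```python
import math
from collections import Counter


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_predict(train_points, train_labels, target, k=3):
    """Predict the label of `target` by majority vote among its k nearest neighbors."""
    # Pair the distance from each training point to the target with that point's label.
    distances = [
        (euclidean_distance(point, target), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points and return the most common label among them.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, already-scaled records (hypothetical values, not taken from NSL-KDD).
points = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
labels = ["normal", "normal", "DoS", "DoS", "DoS"]
prediction = knn_predict(points, labels, target=(0.15, 0.15), k=3)
```

In this toy case two of the three nearest neighbors are labeled normal, so the target point is classified as normal traffic.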

C. Min-max Scaler
Most machine learning techniques perform better when the features are on similar scales. In many cases, the values within a dataset span a wide range and, thus, should be scaled. The Min-max scaler is one of the most widely used techniques for scaling data into a range that suits machine learning techniques. By default, the Min-max scaler returns a value between 0 and 1, using Equation 3.
$$Z_{new} = \frac{Z - Z_{min}}{Z_{max} - Z_{min}} \qquad (3)$$
where $Z_{new}$ is the new derived value, $Z$ is the original value, $Z_{min}$ is the minimum value of the feature, and $Z_{max}$ is the maximum value of the feature [13].
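The min-max scaling described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `min_max_scale`, the sample values, and the handling of a constant feature (mapped to 0.0) are choices made for this example rather than details given in the paper.

```python
def min_max_scale(values):
    """Scale a list of numbers into [0, 1] using (z - z_min) / (z_max - z_min)."""
    z_min, z_max = min(values), max(values)
    span = z_max - z_min
    if span == 0:
        # A constant feature has no spread; mapping it to 0.0 is a choice made here.
        return [0.0 for _ in values]
    return [(z - z_min) / span for z in values]


# Hypothetical values for a single feature column.
src_bytes = [0, 250, 500, 1000]
scaled = min_max_scale(src_bytes)
```

The smallest value of the feature maps to 0, the largest maps to 1, and every other value falls proportionally in between.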

D. K-Fold Cross-Validation
In machine learning, K-Fold Cross-Validation is used to validate the results of a model. It is widely used because it is simple, easy to understand, and, more importantly, reduces the bias of the model's performance estimate. Using the K-Fold Cross-Validation method, the data is split into k groups. The model is trained on k-1 groups, and the remaining group is used to validate it. The procedure is repeated k times so that each group serves as the validation group once [14].
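The splitting step of K-Fold Cross-Validation can be sketched as follows. This is a minimal illustration, not the paper's code: the generator name `k_fold_indices` is hypothetical, the folds are contiguous, and no shuffling is applied, which is a simplification (in practice the records are often shuffled before splitting).

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Ten hypothetical records split into five folds.
folds = list(k_fold_indices(n_samples=10, k=5))
```

Each record index appears in exactly one test fold, so after k iterations every record has been used for validation once and for training k-1 times.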

III. RELATED WORK
This section discusses related work on detecting DoS attacks using machine learning approaches.  [15].
Another article that used machine learning techniques for DoS attack detection was proposed by Zhe W., Wei C., and Chunlin L. However, the model proposed in that work is designed specifically for smart grid technology. The authors investigated three machine learning techniques for protecting the smart grid: SVM, Decision Tree, and Naive Bayes. The data is first collected from the network, then certain features are selected from the dataset, and principal component analysis is used for dimensionality reduction. Accuracy, precision, recall, and F1 score were used to evaluate the suitability of the machine learning techniques for the proposed model. When the three techniques were examined on the KDD99 dataset, SVM outperformed the others in detecting DoS attacks on smart grid technology [16].
He Z., Zhang T., and Lee R. B. have advocated the use of machine learning techniques to detect DoS attacks originating in the cloud. The proposed system investigated four different DoS attack techniques: SSH brute-force, ICMP flooding, DNS reflection, and TCP SYN attacks. The method uses statistical data from the hypervisor of the cloud server and from the virtual machines to prevent attack packets from being sent out to the external network. The authors implemented a prototype of the proposed detection system under real cloud settings. The cloud comprises six servers (labeled S0 to S5), each of which hosts many virtual machines. Several machine learning techniques were used in the proposed system, including SVM Linear Kernel, SVM RBF Kernel, SVM Poly Kernel, Decision Tree, Naive Bayes, and Random Forest. Among the investigated techniques, SVM Linear Kernel outperformed the other techniques in detecting DoS attacks sourced from the cloud [17].

IV. PROPOSED DOS ATTACK DETECTION MODEL
This section outlines the suggested model for detecting DoS attacks. First, the NSL-KDD dataset will be processed to be prepared for training and testing the proposed model. Then, the proposed DoS attack detection model will be introduced in detail.

A. Data Preprocessing
Data preprocessing is a set of operations applied to the data to prepare the dataset for machine learning. As discussed below, data transformation and normalization are the two preprocessing operations applied to the NSL-KDD dataset in this paper [8] [18].
1) Data transformation: The NSL-KDD dataset contains numerical and nominal data, as shown in Table II. One of the first steps in data preprocessing is transformation, which converts all data to numerical values so that the machine learning techniques can be applied. Three nominal features in the NSL-KDD dataset have been transformed to numeric values: protocol type, service, and flag. These features have been converted using the label encoding method [19]. Label encoding changes each value to a number between zero and the number of classes minus one, as shown in Table III. Tables IV and V show samples of the NSL-KDD dataset before and after the transformation operation. In addition, the output column in the NSL-KDD dataset contains four different types of attacks, each of which has several sub-types. All the attack sub-types have been removed except for the DoS sub-types, which are the target of this paper. Then, all DoS sub-types have been relabeled as DoS attack, so that the output column contains only two outputs: DoS attack and normal data. These two outputs have also been converted from nominal into numeric data using the label encoding method. As a result, the output column contains 0 representing the DoS attack and 1 representing normal data.
2) Data normalization: Normalization is an essential step in data preprocessing. Normalization techniques convert values that span a large range into a compatible scale. This enhances the performance of the machine learning techniques and leads to more accurate results. The NSL-KDD dataset contains several features whose values span a large range and need to be normalized. This study has applied the Min-max scaler technique (as discussed above), which scales the values of a feature between 0 and 1 [7] [13]. Table VI shows a sample of the NSL-KDD dataset after normalization. Fig. 3 illustrates the NSL-KDD dataset data preprocessing steps.
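The label encoding used in the data transformation step can be illustrated with a short sketch. The helper names `fit_label_encoder` and `encode` and the sample values are hypothetical, and assigning codes in sorted order is an assumption of this example; the paper only states that each value is mapped to a number between zero and the number of classes minus one.

```python
def fit_label_encoder(values):
    """Map each distinct nominal value to an integer in 0 .. n_classes - 1."""
    # Sorting makes the mapping deterministic; the ordering itself is an assumption.
    return {value: code for code, value in enumerate(sorted(set(values)))}


def encode(values, mapping):
    """Replace every nominal value with its integer code."""
    return [mapping[value] for value in values]


# Hypothetical sample of the protocol type feature.
protocol_type = ["tcp", "udp", "icmp", "tcp"]
protocol_mapping = fit_label_encoder(protocol_type)
protocol_codes = encode(protocol_type, protocol_mapping)

# Hypothetical sample of the output column after relabeling.
output = ["normal", "DoS", "DoS", "normal"]
output_mapping = fit_label_encoder(output)
output_codes = encode(output, output_mapping)
```

With sorted ordering, the output column happens to be encoded as 0 for the DoS attack and 1 for normal data, which matches the encoding reported in this section.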

B. DoS Attack Detection
This section contains a detailed discussion of our DoS attack detection model. As discussed earlier, the NSL-KDD dataset has been preprocessed to prepare it for the machine learning techniques. First, the NSL-KDD dataset was filtered to contain only normal traffic and the sub-attack types that constitute DoS attacks. These sub-attack types are: Back, land, Neptune, pod, smurf, teardrop, mailbomb, processtable, udpstorm, apache2, and worm. Then, all these sub-attack types were labeled as DoS attack in the output column. Table VII shows the number of records of each sub-attack type and, in total, of the DoS attack. As such, the NSL-KDD dataset now contains only DoS attack and normal traffic data. Then, the nominal features, including the output column, were transformed using the label encoding technique. After that, the NSL-KDD dataset was normalized using the Min-max scaler technique (as discussed above). At this point, the NSL-KDD dataset is preprocessed and ready for the machine learning techniques to be applied. The resulting NSL-KDD dataset was used to train and test the suggested DoS attack detection model.
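The filtering and relabeling step described above could look roughly like the following sketch. The record layout (a feature vector paired with a label string), the function name `keep_dos_and_normal`, and the sample records are assumptions made for illustration; only the list of sub-attack names comes from the text, lowercased here for comparison.

```python
# Sub-attack names as listed in the text, lowercased for comparison.
DOS_SUBTYPES = {
    "back", "land", "neptune", "pod", "smurf", "teardrop",
    "mailbomb", "processtable", "udpstorm", "apache2", "worm",
}


def keep_dos_and_normal(records):
    """Drop non-DoS attacks and collapse every DoS sub-type into one 'DoS' label."""
    filtered = []
    for features, label in records:
        name = label.lower()
        if name == "normal":
            filtered.append((features, "normal"))
        elif name in DOS_SUBTYPES:
            filtered.append((features, "DoS"))
        # Any other attack label is outside the scope of the model and is skipped.
    return filtered


# Hypothetical records: (feature vector, original label).
records = [
    ([0.1, 0.0], "normal"),
    ([0.9, 1.0], "neptune"),
    ([0.4, 0.2], "ipsweep"),  # a label that is not in the DoS list
    ([0.8, 0.7], "smurf"),
]
prepared = keep_dos_and_normal(records)
```

After this step the output column holds only two values, DoS and normal, which is the binary target that the classifiers are trained on.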
The resulting NSL-KDD dataset (after data preprocessing) contains well-labeled data. In addition, the output variable is categorical: DoS attack or normal data. Therefore, classification algorithms from supervised machine learning are used in the proposed DoS attack detection model. Accordingly, the machine learning techniques used in the proposed model are Naive Bayes, KNN, Decision Tree, and SVM. The technique with the best performance measures, as shown below, will be determined for the proposed system. The K-Fold Cross-Validation technique has been used to validate the proposed model, with the NSL-KDD dataset divided into five groups. In each iteration, four groups are used to train the machine learning technique, and the remaining group is used to test it. In this way, each group serves as the test set exactly once, so every record in the dataset is used for testing. After training and testing, the most suitable machine learning technique for detecting DoS attacks was determined. Consequently, the traffic is analyzed using a high-performance machine learning technique that distinguishes between normal traffic and DoS attack traffic. Fig. 4 clarifies the proposed DoS attack detection model.
Four measures, all based on the elements of the confusion matrix, have been employed to evaluate the proposed system: Accuracy, Recall, Precision, and the Matthews Correlation Coefficient (MCC). In the following, TP and TN denote the numbers of attack and normal records that are classified correctly, FP denotes the number of normal records classified as attacks, and FN denotes the number of attack records classified as normal. Accuracy is the ratio of correctly classified records to the total number of classified records and can be calculated using Equation 4. Recall is the ratio of attack records that are correctly predicted as attacks to the total number of actual attack records and can be calculated using Equation 5. Precision is the ratio of attack records that are correctly predicted as attacks to the total number of records predicted as attacks and can be calculated using Equation 6. MCC is a measure of the quality of a classification with two classes; the closer its value is to 1, the more accurate the classification. MCC can be calculated using Equation 7 [7] [9] [20] [21].
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (4)$$
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (5)$$
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (6)$$
$$\text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (7)$$
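The four evaluation measures can be computed directly from the confusion-matrix counts, as in the sketch below. It uses the standard definitions of accuracy, recall, precision, and MCC with the DoS attack as the positive class; the function names, the toy labels, and the convention of returning 0 when a denominator is zero are assumptions of this example, not details from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive="DoS"):
    """Count TP, TN, FP, FN, treating `positive` as the attack class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred != positive and truth != positive:
            tn += 1
        elif pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def evaluate(tp, tn, fp, fn):
    """Return accuracy, recall, precision, and MCC from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "mcc": mcc}


# Hypothetical ground truth and predictions for eight records.
y_true = ["DoS", "DoS", "DoS", "normal", "normal", "normal", "normal", "DoS"]
y_pred = ["DoS", "DoS", "normal", "normal", "normal", "DoS", "normal", "DoS"]
scores = evaluate(*confusion_counts(y_true, y_pred))
```

In a cross-validation setting, these scores would typically be computed for each test fold and then averaged per technique before the techniques are compared.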

VI. CONCLUSION
DoS is a hazardous attack that threatens governments, businesses, and individuals. New techniques for launching DoS attacks emerge continuously, and mitigating them requires an adaptive system. This paper developed a new paradigm for disclosing DoS attacks using machine learning approaches. The proposed model's primary objective is to mitigate existing and newly discovered DoS attack types. Several machine learning techniques were investigated with the proposed model. Among these techniques, the Decision Tree technique has shown the highest performance: its Accuracy, Recall, Precision, and MCC with the proposed model are 99.891%, 99.904%, 99.912%, and 99.964%, respectively. Therefore, the proposed detection model is promising for mitigating newly emerging DoS attack types.