Network Intrusion Detection System based on Generative Adversarial Network for Attack Detection

Abstract: The Intrusion Detection System (IDS) is the main element in preventing malicious traffic on the network. With the help of Deep Learning algorithms, the ability of an IDS to detect network threats is improving quickly. As a result, attackers are finding new ways to evade identification. Polymorphic attacks are attractive to attackers because they can bypass the IDS. The Generative Adversarial Network (GAN) is a method proven to generate various forms of data. It is becoming popular among security researchers because it can produce data that is indistinguishable from the original data. This work proposes a model to generate DDoS attacks using a GAN. Several techniques are used to update the feature selection that identifies the attack and to generate polymorphic data. The data changes its feature profile in every cycle to test whether the IDS can detect the new version of the attack data. Simulation results from the proposed model show that, under constantly changing attack profiles, defensive systems that rely on incremental learning remain exposed to new attacks.


I. INTRODUCTION
The Internet is used in many fields, such as data transfer and e-learning, and its growth has impacted all aspects of life. This increasing usage of the Internet raises concerns about network security and calls for constant improvement in securing Internet technologies against various attacks. Examples of these attacks include DDoS attacks, man-in-the-middle attacks, phishing, password-based attacks, SQL injection, and many more. Network vulnerabilities can cause damage to small or large organizations. According to one survey, 98% of businesses in the UK depend on Information Technology services. Over 43% of small-scale and 72% of large-scale organizations suffered from cyber-attacks in the past years. Many tools are available to prevent cyber-security attacks, including but not limited to Intrusion Detection Systems (IDS), Intrusion Prevention Systems (IPS), anti-malware, Network Access Control, and firewalls. Among those, one of the most commonly used and effective tools is the Intrusion Detection System. An IDS analyzes data traffic to distinguish malicious traffic from normal traffic and generates a warning so that the necessary precautions can be taken to avoid damage [1]. As network attacks and security have developed, detection and prevention systems have improved. Artificial intelligence (AI) is now widely used in security tools such as the IDS [2], and attackers have begun to use AI techniques to carry out malicious attacks [3] [4]. AI and deep learning algorithms require a large amount of data to train and test the models. Some techniques can be used to produce large datasets for malware detection [5] [6] and security orchestration [7].
One of the frameworks for generating adversarial data is the Generative Adversarial Network (GAN). It is an architecture of two neural networks: the Generator and the Discriminator. The Generator uses gradient descent or the response from the Discriminator to generate adversarial data. The Discriminator distinguishes between the original and the adversarial data. The Generator and the Discriminator compete in this way and, in the end, the Generator produces synthetic or adversarial data [8]. GANs have been utilized in research to generate various types of datasets, such as images [9], sound [10], text [11], and network attack data [12].

II. RELATED WORKS
With recent developments in deep learning, intrusion detection systems are becoming more advanced. However, there is limited research testing the integrity of these advanced IDSs against adversarial data.
In [13], the authors created a framework that generates adversarial malware using a GAN to bypass the detection system. The research targets a black-box malware detector because most attackers are unaware of the detection techniques used in the detection system. Instead of directly attacking the black-box detector, the researchers created a model that can observe the target system with corresponding data. This model then computes gradients from the GAN to create adversarial malware data. With this technique, the authors achieved a model accuracy of around 98%.
This section covers some previous works on generating adversarial attack data using the Wasserstein GAN (WGAN). The WGAN model was introduced in [14] and extends the traditional GAN with an alternate method of training the Generator. In WGAN, the Discriminator acts as a critic and provides a score that indicates how real or fake the generated data is.
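The critic score described above can be illustrated with a minimal sketch of the standard Wasserstein objectives. This is not the code used in this work: the function names and the example scores are assumptions chosen for illustration, and the constraint that WGAN places on the critic (weight clipping in [14]) is omitted.

```python
from statistics import fmean


def critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss.

    Minimizing this value pushes the critic to score real samples
    higher than generated (fake) samples.
    """
    return fmean(fake_scores) - fmean(real_scores)


def generator_loss(fake_scores):
    """Wasserstein generator loss.

    Minimizing this value pushes the generator to produce samples
    that the critic scores highly, i.e. that look real.
    """
    return -fmean(fake_scores)


# Hypothetical critic outputs for a small batch (illustrative values only).
real_scores = [0.9, 1.1, 1.0]
fake_scores = [-0.5, -0.3, -0.4]

print("critic loss:", critic_loss(real_scores, fake_scores))
print("generator loss:", generator_loss(fake_scores))
```

Unlike the probability output of a traditional GAN discriminator, these scores are unbounded, which is why they are read as a degree of realness and not as a class label.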
The authors of [12] proposed a method that uses WGAN to generate a malicious file that a detection system classifies as a regular file. They achieved an accuracy of around 99%, showing that their method can generate adversarial malicious files that bypass the detection system.
A recent study in [15] uses Wasserstein GAN to generate simulated attack data. According to the authors, many tools can generate simulated attack data. However, this process can take a long time and a lot of resources. Using the proposed technique, they produced millions of connection records with just one device and within a short period. They used the KDD Cup 1999 dataset as the training set. Their experiment suggests that, compared to GAN, the Wasserstein GAN learns faster and generates better results.
A paper published by Ring et al. [16] proposed a method that produces flow-based attack data using Wasserstein GAN. This research uses the CIDDS dataset to train and test the proposed method. The authors note that flow-based datasets contain categorical features such as IP addresses and port numbers, which a GAN is unable to process. They therefore proposed a method to preprocess the categorical data and transform it into continuous data. Lastly, they used several techniques to evaluate the quality of the adversarial data. Results suggest that it is possible to generate realistic network data using this method.
This research aims to create a framework that detects attacks using a GAN, motivated by the recently published work of Lin et al. [17].
• This work begins with feature selection using SHAP, which identifies the most critical features in the dataset that contribute to a DDoS attack.
• The next goal is to generate adversarial data using the selected feature set and evaluate whether the IDS can detect the adversarial attack, followed by training the IDS with the produced adversarial data.
• This work proposes a polymorphic engine that updates the feature profile of the attack by manual feature update and automated feature update.
• This work conducted a comprehensive simulation and analyzed the results to compare the Reinforcement Learning method against the manual feature profile attacks, and presents how many cycles an attacker can bypass an IDS with polymorphic adversarial DDoS attacks.
The primary objective of the Generative Model is to learn the unknown probability distribution of the population from which the training observations are sampled. The most popular GAN architectures are DCGAN [18], Conditional GAN [19], BiGAN [20], and Cycle GAN [21].

A. Datasets and Feature Selection
Datasets: This work uses CIC-IDS2017, a dataset released by the Canadian Institute for Cybersecurity and published in [22] by Lashkari et al., which, according to the authors, supersedes the datasets generated earlier by the institute. CICIDS2017 consists of eight different files that contain regular traffic and attack traffic data, covering various types of attacks along with the normal network flow. The dataset also covers the standard protocols such as HTTP, HTTPS, FTP, SSH, and email protocols. It contains more than 70 features that are important as per the latest network standards, most of which were not available in previously known datasets.
Feature Selection: Feature selection is an essential aspect of the Deep Learning technique. SHAP (SHapley Additive exPlanations) [23] is one of the newer feature selection techniques. Its goal is to quantify the contribution of each feature to the predicted value. Two critical measures of feature importance are consistency and accuracy, and the authors of [23] argue that SHAP satisfies both. SHAP values are based on Shapley values, a concept from game theory. The idea behind Shapley values is that the outcome of every possible combination (or coalition) of features needs to be examined to determine the importance of a single feature. This is expressed mathematically in Equation 1:
g(z') = \phi_0 + \sum_{j=1}^{M} \phi_j z'_j    (1)
Here, g represents the overall result of the Shapley values, z' \in \{0, 1\}^M is a coalition vector, M is the maximum coalition size, and \phi_j represents the contribution of feature j towards the final output. The authors describe the coalition vector as simplified features. In the coalition vector, 0 means the corresponding feature is "not present" and 1 means it is "present." Equation 1 ranges over the power set of the features, which can be drawn as a tree, as shown in the figure. Each node represents a coalition of features. Edges represent the inclusion of a feature that was not present in the previous coalition. Each coalition in the power set of the features is evaluated to find the most critical features in the dataset.
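The coalition idea above can be made concrete with a minimal sketch that computes exact Shapley values by enumerating every coalition. It is an illustration only and not the SHAP library or the procedure used in this work: the feature names and the toy value function are hypothetical, and exhaustive enumeration is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley value of each feature.

    `value` maps a frozenset of feature names (a coalition) to a number,
    e.g. the model output when only those features are present.
    """
    n = len(features)
    phi = {}
    for j in features:
        others = [f for f in features if f != j]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature j when it joins coalition s.
                total += weight * (value(s | {j}) - value(s))
        phi[j] = total
    return phi


def toy_value(coalition):
    """Hypothetical model output for a coalition (illustrative numbers)."""
    score = 0.0
    if "flow_duration" in coalition:
        score += 0.5
    if "packet_rate" in coalition:
        score += 0.3
    if "flow_duration" in coalition and "packet_rate" in coalition:
        score += 0.2  # interaction between the two features
    return score


features = ["flow_duration", "packet_rate", "dst_port"]
phi = shapley_values(features, toy_value)
for name in features:
    print(name, round(phi[name], 3))
```

In this toy setting the interaction term is shared equally between the two interacting features, the unused feature receives zero, and the values sum to the output of the full coalition.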
Fig. 2 shows the results obtained by running the SHAP explainability model on the CICIDS2017 data file. It lists the essential features responsible for the DDoS attack, ordered from most to least important. The dark red color represents a higher impact of a feature on the output value, and the blue color represents a lower impact.

B. Adversarial Attack Generation using Wasserstein GAN
The methodology used in this research involves a Generative Adversarial model that produces adversarial attacks, training the IDS with earlier generated polymorphic datasets, a polymorphic engine that generates polymorphic DDoS attacks, and using the polymorphic data to attack the IDS. DDoS attack data from CICIDS2017 [22] is used to generate the adversarial attack: it is combined with a random noise vector of the same size as the selected features from the dataset to train the model. The Generator is a feed-forward neural network that consists of 5 linear layers. The input layer contains as many neurons as the number of selected features, and the output layer consists of 1 neuron, as shown in Fig. 3. The input layer receives the number of features required by the experiment, and the output layer generates the desired data. The Generator has 3 hidden layers, which is optimal for this scenario: the results showed that fewer layers underfit the training data and more layers overfit it.
In the next step, the generated adversarial attack, combined with the benign or normal network flow data, is fed to the Intrusion Detection System. The IDS classifies the traffic and sends the predicted labels, which reflect the detection success rate, to the Discriminator, as shown in Fig. 4. The Discriminator then sends its critique to the Generator through backpropagation so that, in the next cycle, the Generator can improve the production of adversarial DDoS attacks. The IDS consists of 4 layers, of which the input and output layers consist of 2 neurons each. Two hidden layers are sufficient because the IDS only decides whether the test data is an attack or benign.
A signature-based black-box intrusion detection system is used to test the detection rate of the adversarial DDoS attacks. The reason for this choice is that, most of the time, the type of attack detection system is unknown to attackers. Attackers rely on the responses received from the detection system, so a black-box IDS is the right choice for this model, as shown in Fig. 5. The input layer of the Discriminator accepts two types of data from the black-box IDS. The output layer provides two critiques, one for the Generator and one for the Discriminator itself. The loss functions [17] used to calculate the loss for the Generator and the Discriminator are shown in Equation 2 as follows.
Figure 6 depicts whether the generated adversarial data is a DDoS attack (abnormal) or normal. Here, P_G represents the penalty to the Generator, computed over the attack vector and the noise vector. E is the value calculated over the random inputs to the model. S_attack represents the attack data. A lower penalty means that the model is performing well and produces attack datasets that can bypass the IDS. The penalty to the Discriminator is shown in Equation 3.
Here, P_D represents the penalty to the Discriminator. "E" is the overall calculated feature value of the model's attack datasets. "A" is the actual feature value of the benign and the attack data. A lower penalty means that the Discriminator performs well. It calculates whether the generated data is closer to the DDoS attack data or to the benign (regular) data.
Algorithm 1 shows the process represented in Fig. 5. This section specifies the details of the learning process of the Generator and how it produces adversarial data. If the Generator continuously generated random data, the data would be meaningless and could change the entire network flow data. The Generator therefore needs to produce data that maintains the intensity of an attack. To ensure that, the values of the features with higher SHAP values, as shown in Fig. 2, must be kept constant.
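The constraint described above, keeping the high-SHAP features fixed while the remaining features vary, can be sketched as follows. This is a simplified stand-in and not the trained Generator: Gaussian noise replaces the neural network, and the feature names, values, and `noise_scale` parameter are hypothetical.

```python
import random


def perturb_attack_record(record, functional_features, noise_scale=0.1, rng=None):
    """Return a copy of `record` with noise added only to non-functional features."""
    if rng is None:
        rng = random.Random()
    adversarial = {}
    for name, value in record.items():
        if name in functional_features:
            # Attack-defining feature (high SHAP value): keep it unchanged
            # so the record keeps the intensity of the attack.
            adversarial[name] = value
        else:
            # Non-functional feature: free to vary between cycles.
            adversarial[name] = value + rng.gauss(0.0, noise_scale)
    return adversarial


# Hypothetical, already-normalized DDoS record (illustrative names and values).
record = {
    "flow_duration": 0.82,
    "packet_rate": 0.95,
    "init_win_bytes": 0.10,
    "idle_mean": 0.05,
}
functional = {"flow_duration", "packet_rate"}

adversarial = perturb_attack_record(record, functional, rng=random.Random(0))
for name, value in adversarial.items():
    print(name, round(value, 3))
```

Passing a seeded `random.Random` instance makes the perturbation reproducible, which is convenient when comparing detection rates across cycles.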
Here is a sample of how the Generator produces an adversarial attack with the proposed technique. In this diagram, the darker shade marks the values of the features that contribute to the attack, whereas non-highlighted values are those of regular, non-attack features.
This research work considered three inputs to train the IDS: normal or benign data, new adversarial data, and previously generated adversarial data. The IDS learns about the adversarial data and tries to detect the DDoS attack data. Algorithm 2 outlines the overall process.
In the above methods shown in Fig. 9, the research work assumed that an attacker would manually modify the feature profile and train the model with the new feature profile every time the IDS detects a polymorphic attack. This study considered using only a total of 20 features that were provided by the SHAP method.
www.ijacsa.thesai.org (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 11, 2021
It will be challenging to keep changing the feature profile manually if the study uses more than 20 features. As an alternative, a Reinforcement Learning method has been used to automate the feature profile selection for generating a polymorphic attack, as shown in Fig. 10. Reinforcement Learning is an ML-based technique that retrains the algorithm following a trial-and-error approach. The agent in this architecture evaluates the current IDS attack detection score. The agent then takes an action and receives feedback from the IDS. Positive feedback is a reward, and negative feedback is a penalty to the agent. The following algorithm explains the process. The overall process of generating a polymorphic attack is explained in Algorithm 5 and Fig. 11.

Algorithm 5
Require: Input - use any five features with a high impact score and any five with the lowest score from the shortlisted features.
Ensure: shortlisted features and five normal features.
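The reward-and-penalty loop described for the Reinforcement Learning agent can be sketched with a simple epsilon-greedy, bandit-style agent. This is an assumption-laden illustration and not the algorithm used in this work: the class and function names, the reward values of +1 and -1, and the `toy_detection_rate` stand-in for the IDS are all hypothetical.

```python
import random


class FeatureSelectionAgent:
    """Epsilon-greedy agent that learns which feature to add to the attack profile."""

    def __init__(self, candidate_features, epsilon=0.2, learning_rate=0.5, rng=None):
        self.q = {f: 0.0 for f in candidate_features}  # estimated value per feature
        self.epsilon = epsilon
        self.learning_rate = learning_rate
        self.rng = rng if rng is not None else random.Random()

    def choose(self, profile):
        """Pick an unused feature, or None when no unused feature remains."""
        available = [f for f in self.q if f not in profile]
        if not available:
            return None
        if self.rng.random() < self.epsilon:
            return self.rng.choice(available)  # explore
        return max(available, key=self.q.get)  # exploit

    def update(self, feature, reward):
        """Move the estimate for `feature` towards the observed reward."""
        self.q[feature] += self.learning_rate * (reward - self.q[feature])


def run_cycles(agent, initial_profile, detection_rate):
    """Extend the profile one feature per cycle until no features remain."""
    profile = list(initial_profile)
    while True:
        feature = agent.choose(profile)
        if feature is None:
            break
        before = detection_rate(profile)
        profile.append(feature)
        after = detection_rate(profile)
        # Reward when the IDS detects less after the change, penalty otherwise.
        reward = 1.0 if after < before else -1.0
        agent.update(feature, reward)
    return profile


def toy_detection_rate(profile):
    """Hypothetical stand-in for the IDS: features f2 and f4 lower detection."""
    evasive = {"f2", "f4"}
    return max(0.0, 0.9 - 0.3 * len(evasive.intersection(profile)))


agent = FeatureSelectionAgent(["f1", "f2", "f3", "f4"], rng=random.Random(0))
profile = run_cycles(agent, ["f0"], toy_detection_rate)
print(profile)
print(agent.q)
```

In a full setup the detection rate would come from querying the black-box IDS with freshly generated attack data, and the learned estimates would guide feature choices in later runs.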

F. Performance Evaluation
To evaluate the performance and the results of this work, the following metrics were used, where TP, TN, FP, and FN denote True Positives, True Negatives, False Positives, and False Negatives.
Accuracy - the fraction of correctly classified data relative to the total processed data.
Accuracy = \frac{TP + TN}{TP + TN + FP + FN}
Precision - the ratio between True Positive values and all the positive values returned by the Deep Learning model.
Precision = \frac{TP}{TP + FP}
Recall - the ratio between correctly detected attack samples and all actual attack samples, that is, the ratio between True Positives and the sum of True Positives and False Negatives.
Recall = \frac{TP}{TP + FN}
F1-Score - the harmonic mean of precision and recall.
F1 = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}
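As a worked example of the four metrics, the sketch below computes them from confusion-matrix counts. The function name and the counts are hypothetical; in practice, the equivalent scikit-learn functions (`accuracy_score`, `precision_score`, `recall_score`, `f1_score`) could be used on the predicted labels instead.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 90 attacks detected, 10 missed,
# 80 benign flows passed, 20 benign flows flagged as attacks.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The guards against zero denominators return 0.0 for undefined cases, which is one common convention and an assumption here.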

IV. RESULTS AND DISCUSSION
The experimental setup was implemented using libraries such as PyTorch, Scikit-learn, Pandas, NumPy, and Matplotlib. Hyperparameters are essential properties that define the characteristics of the training process of the Deep Learning model.

A. Attack Generation
The first step of the research is to generate an adversarial DDoS attack that evades the black-box IDS. As seen in Fig. 12, the Generator initially produces data that is unable to bypass the IDS. However, after training for 100 epochs, the Generator learns to create adversarial data that deceives the IDS.

B. Training IDS by Adversarial DDoS Information
This section describes the detection results of the IDS after training. As shown in Fig. 13, the IDS struggles to detect the attacks in the initial cycles. However, after training for 100 epochs, it detects almost all the attacks.

C. Polymorphic Adversarial DDoS Attack Generation
This section illustrates the detection rate of the Black Box IDS under the generation of polymorphic adversarial attacks.
In the first experiment, new features were selected manually to produce polymorphic attacks. For this test, a limited set of features from the dataset was used. The following is the initial result using Algorithm 3. In Fig. 14, the red graph shows a polymorphic attack being generated and sent towards the black-box IDS. As seen, the polymorphic attack can deceive the IDS. The green graph depicts the training of the IDS with earlier generated polymorphic adversarial DDoS datasets. After 100 epochs, the IDS detects the polymorphic adversarial DDoS attack. The following result shows all the cycles of polymorphic attacks on the IDS. The Generator uses the same combination of features to generate attacks until the IDS detects all the previous attacks.
Each data point in Fig. 15 depicts the IDS detection rate. Once the IDS detects all the previous versions of the polymorphic DDoS attack that use the same feature set (as seen in Fig. 15), new predefined features are selected manually for the Generator, which then generates a new polymorphic adversarial DDoS attack. For this test, only a group of 10 features was used.
In the next test, the research work followed Algorithm 4 to revise the attack and generate a polymorphic adversarial DDoS attack. This experiment began with ten features to generate polymorphic attack data. To generate a new polymorphic attack, two new features were added to the existing attack data, and a total of 20 features was used. Fig. 16 shows the first result of the initial polymorphic attack. Each data point in Fig. 17 depicts the IDS detection rate.
The first two experiments focus on testing whether the Generator can produce polymorphic adversarial DDoS attack data when the feature profile is updated manually. After confirming that this is possible, the next step is to select features automatically and manipulate the attack feature profile to generate polymorphic adversarial attack data. To automate this task, the Reinforcement Learning technique has been applied. It receives the IDS detection rate and learns to select new features, add them to the old feature set, and create a new feature set. This experiment also indicates the number of times a Generator can produce polymorphic adversarial DDoS data. To examine this, four sets of feature combinations were used, one for each test, to generate the automated polymorphic adversarial DDoS attack.
• The first test includes a total of 40 features from the dataset
• The second test includes a total of 50 features from the dataset
• The third test includes a total of 60 features from the dataset
• The fourth test includes a total of 76 features from the dataset...
The above experiments begin with ten features, of which five are functional features with a high impact score and five are normal (benign) features.
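The feature-profile bookkeeping used in the manual experiments above (start with five high-impact and five low-impact features, then add two features per new polymorphic attack) can be sketched as follows. The function names, the placeholder feature names, and the choice to add features in ranking order are assumptions for illustration only.

```python
def initial_profile(ranked_features, n_high=5, n_low=5):
    """Pick the n_high most and n_low least important features.

    `ranked_features` is ordered from most to least important,
    for example by SHAP value.
    """
    low_start = len(ranked_features) - n_low
    return ranked_features[:n_high] + ranked_features[low_start:]


def extend_profile(profile, ranked_features, step=2):
    """Add the next `step` unused features, in ranking order, to the profile."""
    unused = [f for f in ranked_features if f not in profile]
    return profile + unused[:step]


# Hypothetical ranking of 20 shortlisted features (placeholder names).
ranked = [f"feature_{i:02d}" for i in range(20)]

start = initial_profile(ranked)
profile = list(start)
cycles = 0
while len(profile) < len(ranked):
    # In the experiments, this step happens once the IDS has learned
    # to detect attacks built from the current profile.
    profile = extend_profile(profile, ranked)
    cycles += 1

print(len(start), "initial features")
print(cycles, "profile updates until all", len(ranked), "features are used")
```

Under these assumptions the number of possible profile updates is bounded by the number of unused features, which matches the observation that a larger feature pool allows more polymorphic attacks to be generated.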

D. Test Evaluation
Table I lists the overall values of Precision, Recall, and F1-score for each test. In all the results shown in Figs. 18, 19, 20, 21, and 22, the polymorphic adversarial DDoS attack successfully evades the IDS; the orange bars indicate that a polymorphic attack weakens once the IDS detects it. Counting the red bars shows how many times the Generator produced a polymorphic attack in each cycle. Fig. 19 suggests that when the Generator uses a small number of features, more than 90% of the polymorphic attacks evade the IDS. From these figures, it is clear that using fewer features to generate a polymorphic attack gives a higher evasion rate but fewer opportunities to generate further polymorphic attacks. Figs. 20, 21, and 22 suggest that, initially, more than 90% of the polymorphic attacks can evade the IDS. However, the results suggest that if the Generator uses more features to generate a polymorphic DDoS attack, the success rate gets lower each time. Comparing all the results confirms that when fewer features are used to generate polymorphic adversarial DDoS attacks, the attack success rate stays at an acceptable level, whereas when more features are used, the attack success rate declines after a certain number of cycles. Table II lists the total runtime for each experiment.

V. CONCLUSIONS AND FUTURE WORK
This work proposed a framework that creates polymorphic adversarial DDoS attacks from the CICIDS2017 dataset using a Wasserstein GAN. To generate polymorphic attacks, three different techniques that change the feature profile of the attack have been proposed. In the first two techniques, new features are selected manually each time to generate polymorphic adversarial attacks. Furthermore, to automate the feature selection, a Reinforcement Learning technique has been applied. In each technique, the Generator creates polymorphic attacks until no new features remain to choose from in the feature set.
From the results, it has been observed that the Generator can produce polymorphic adversarial DDoS attacks. The results also show that when a small number of manually selected features is used to create a polymorphic attack, the attacks deceive the IDS with a success rate of more than 90%.
In the future, it could be interesting to consider other variants of GAN, such as DCGAN, Conditional GAN, BiGAN, and Cycle GAN, to generate adversarial network attack data and evaluate the detection systems. A limitation of this research is that it focused on generating only one type of attack. As every attack has different functional features, it would be difficult to use a single Generator to create other types of attacks. It would therefore be interesting to use a separate Generator for each type of attack and evaluate the performance of the IDS against all types of polymorphic adversarial network attacks.