Network Anomaly Detection via Clustering and Custom Kernel in MSVM

Multiclass Support Vector Machines (MSVM) have been applied to build classifiers that support network intrusion detection. Despite their high generalization accuracy, the learning time of MSVM classifiers remains a concern when they are applied in network intrusion detection systems. This paper speeds up the learning of MSVM classifiers by reducing the number of support vectors. We propose the KMSVM method, which combines the K-means clustering technique with a custom kernel in MSVM. Experiments performed on the KDD99 dataset show that the KMSVM method can both speed up the learning time of classifiers, by reducing the number of support vectors, and improve the detection rate on the testing dataset.

Keywords: IDS; K-means; MSVM; RBF; KDD99; Custom Kernel.


INTRODUCTION
An intrusion detection system is designed so that any kind of malicious activity in a computer network and its resources can be identified and monitored [1]. Intrusion Detection Systems (IDS) are computer programs that try to perform intrusion detection by comparing observable behavior against suspicious patterns, preferably in real time. Intrusion is primarily a network-based activity [2]. The primary aims of an IDS are monitoring and analyzing both user and system activities, analyzing system configurations and vulnerabilities, assessing system and file integrity, recognizing patterns typical of attacks, analyzing abnormal activity patterns, and tracking user policy violations, in order to protect the availability, confidentiality and integrity of critical networked information systems.

IDSs can be classified based on which events they monitor, how they collect information, and how they deduce from that information that an intrusion has occurred. An IDS that operates on a single workstation is known as a host-based intrusion detection system (HIDS). A HIDS adds a targeted layer of security to particularly vulnerable or essential systems; it monitors audit trails and system logs for suspicious behavior [3]. A network-based IDS, in contrast, monitors network traffic for particular network segments or devices and analyzes network, transport, and application protocols to identify suspicious activity.

Misuse detection uses the "signatures" of known attacks, that is, patterns extracted from known intrusions [5], to identify a matched activity as an attack instance. It has a low false positive rate and is accurate on known attacks, but it cannot detect novel attacks: it lacks the ability to identify intrusions that do not fit a pre-defined signature, and is therefore not adaptive [4].

A. BINARY CLASS SUPPORT VECTOR MACHINE
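The basic principle of SVM is to find the optimal linear hyperplane in the feature space that maximally separates the two target classes. The hyperplane that separates the two classes can be defined as:

$$w \cdot x + b = 0$$

Assume a training set $\{(x_i, y_i)\}_{i=1}^{k}$ with $x_i \in R^n$ and $y_i \in \{-1, +1\}$. Here $x_i$ is a sample, $k$ is the number of samples, $n$ is the input dimension, and $w$ and $b$ are nonzero constants [6] [7]. Thus, the problem can be described as:

$$\min_{w, b} \; \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad y_i (w \cdot x_i + b) \ge 1, \quad i = 1, 2, \dots, k \qquad (1)$$

This is a quadratic programming (QP) problem. To solve it, we introduce the Lagrangian:

$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{k} \alpha_i \left[ y_i (w \cdot x_i + b) - 1 \right] \qquad (2)$$

According to the Kuhn-Tucker conditions, we obtain

$$w = \sum_{i=1}^{k} \alpha_i y_i x_i, \qquad \sum_{i=1}^{k} \alpha_i y_i = 0 \qquad (3)$$

with the Lagrange multipliers $\alpha_i \ge 0$ for all $i = 1, 2, \dots, k$. So the dual of equation (1) is:

$$\max_{\alpha} \; \sum_{i=1}^{k} \alpha_i - \frac{1}{2} \sum_{i=1}^{k} \sum_{j=1}^{k} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) \quad \text{subject to} \quad \alpha_i \ge 0, \quad \sum_{i=1}^{k} \alpha_i y_i = 0 \qquad (4)$$

For this problem, we also have the complementarity condition $\alpha_i \left[ y_i (w \cdot x_i + b) - 1 \right] = 0$. So the optimal separating hyperplane is the following indicator function:

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{k} \alpha_i y_i (x_i \cdot x) + b \right) \qquad (5)$$

We can obtain the value of the vector $w$ from (3). A nonlinear problem can be solved by mapping the original variables $x$ into a high-dimensional feature space with a map $\Phi$. Suppose that an input vector $x \in R^d$ is transformed to a feature vector $\Phi(x)$ by a map $\Phi: R^d \to H$; then we can find a function $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$, so we can replace the inner product between two vectors $(x_i, x_j)$ by $K(x_i, x_j)$, and the QP problem expressed by (4) becomes:

$$\max_{\alpha} \; \sum_{i=1}^{k} \alpha_i - \frac{1}{2} \sum_{i=1}^{k} \sum_{j=1}^{k} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{subject to} \quad \alpha_i \ge 0, \quad \sum_{i=1}^{k} \alpha_i y_i = 0 \qquad (6)$$

The optimal separating hyperplane (5) can be rewritten as:

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{k} \alpha_i y_i K(x_i, x) + b \right) \qquad (7)$$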
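To make the kernel form of the indicator function concrete, the following minimal sketch evaluates sgn(Σ α_i y_i K(x_i, x) + b) for a given set of support vectors. The RBF kernel, the function names and all numeric values (support vectors, multipliers, bias, γ) are illustrative assumptions, not values from the experiments in this paper.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """RBF kernel K(u, v) = exp(-gamma * ||u - v||^2); gamma here is illustrative."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def decision_value(x, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """Compute sum_i alpha_i * y_i * K(x_i, x) + b over the support vectors."""
    total = sum(
        alpha * y * kernel(sv, x)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    return total + bias

def classify(x, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """Indicator function: +1 or -1 according to the sign of the decision value."""
    value = decision_value(x, support_vectors, labels, alphas, bias, kernel)
    return 1 if value >= 0 else -1

# Hypothetical toy model: two support vectors, one per class.
# The multipliers satisfy the dual constraint sum_i alpha_i * y_i = 0.
SUPPORT_VECTORS = [(0.0, 0.0), (2.0, 2.0)]
LABELS = [-1, 1]
ALPHAS = [1.0, 1.0]
BIAS = 0.0
```

For instance, `classify((1.9, 2.1), SUPPORT_VECTORS, LABELS, ALPHAS, BIAS)` is dominated by the nearby positive support vector. Only vectors with nonzero α_i contribute to the sum, which is why reducing the number of support vectors shortens evaluation time.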

B. MULTICLASS SUPPORT VECTOR MACHINE
The multiclass classification problem is commonly solved by decomposition into several binary problems for which the standard SVM can be used. The MSVM can be constructed in two ways: One-Against-All (OAA) and One-Against-One (OAO). The OAO approach to multiclass classification has been shown to perform better than OAA. For k classes, the OAO method constructs k(k − 1)/2 classifiers, where each one is trained on data from two classes. For the training data from the i-th and j-th classes, we solve the following binary classification problem:

$$\min_{w^{ij},\, b^{ij},\, \xi^{ij}} \; \frac{1}{2} (w^{ij})^T w^{ij} + C \sum_{t} \xi_t^{ij}$$
$$\text{subject to} \quad (w^{ij})^T \Phi(x_t) + b^{ij} \ge 1 - \xi_t^{ij} \;\; \text{if } y_t = i, \qquad (w^{ij})^T \Phi(x_t) + b^{ij} \le -1 + \xi_t^{ij} \;\; \text{if } y_t = j, \qquad \xi_t^{ij} \ge 0 \qquad (8)$$

After all k(k − 1)/2 classifiers are constructed, we use the following voting strategy at test time: if $\operatorname{sgn}\left((w^{ij})^T \Phi(x) + b^{ij}\right)$ says x is in the i-th class, then the vote for the i-th class is increased by one. Otherwise, the vote for the j-th class is increased by one. Then we predict that x is in the class with the largest vote. If two classes have identical votes, we simply select the one with the smaller index. In practice we solve the dual of Eq. (8), whose number of variables is the same as the number of data points in the two classes. Hence, if on average each class has l/k data points (l being the total number of training points), we have to solve k(k − 1)/2 quadratic programming problems, each with about 2l/k variables.
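The voting strategy described above can be sketched as follows. This is a minimal illustration: `pairwise_decision` is a hypothetical callable standing in for the trained (i, j) binary classifiers, and the one-dimensional toy decision rule based on `CENTERS` is an assumption made only so the example runs.

```python
from itertools import combinations

def count_pairwise_classifiers(k):
    """Number of binary classifiers built by One-Against-One for k classes."""
    return k * (k - 1) // 2

def oao_predict(x, classes, pairwise_decision):
    """Predict the class of x by majority vote over all (i, j) classifiers.

    pairwise_decision(i, j, x) must return True when the (i, j) classifier
    assigns x to class i, and False when it assigns x to class j.
    """
    classes = sorted(classes)
    votes = {c: 0 for c in classes}
    for i, j in combinations(classes, 2):
        if pairwise_decision(i, j, x):
            votes[i] += 1
        else:
            votes[j] += 1
    # Largest vote wins; ties are broken in favor of the smaller class index.
    return min(classes, key=lambda c: (-votes[c], c))

# Hypothetical stand-in for trained classifiers: each class has a 1-D center,
# and the (i, j) classifier picks whichever center is closer to x.
CENTERS = [0.0, 1.0, 2.0, 3.0, 4.0]

def toy_decision(i, j, x):
    return abs(x - CENTERS[i]) <= abs(x - CENTERS[j])
```

With the five KDD99 classes, `count_pairwise_classifiers(5)` gives ten binary classifiers, each trained only on the data of its two classes.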

C. CUSTOM KERNEL AND SUPPORT VECTOR MACHINE

D. K-MEANS ALGORITHM [21]
K-means is a centroid-based clustering algorithm with low time complexity and fast convergence, which is very important in intrusion detection because of the large size of network traffic audit datasets. Each cluster in the profile can be expressed simply as a centroid and an influence radius. So a profile record can be represented in the following format:

(Centroid, radius, type)
Centroid is the center vector of the cluster, radius is the influence range of the cluster (expressed as a Euclidean distance from the centroid), and type is the cluster's category, e.g. normal or attack. We can determine whether a vector is in the cluster simply by computing the distance between the vector and the centroid and comparing it with the radius. If the distance is less than the radius, we consider that the vector belongs to the cluster, and we can then label the vector with the cluster's type. Therefore, a search of the whole profile involves only a few simple distance calculations, which means we can process the data rapidly.

Of course, not all clusters can serve as the profile. Some may include both normal and attack examples and are clearly not fit for the profile, so it is necessary to select clusters according to a strategy. A majority example is an example that belongs to the most frequent class in the cluster, and the purity of a cluster is the proportion of its examples that are majority examples. The higher the purity, the better the cluster serves as a profile. A cluster with low purity contains many examples of different types, so we do not select such clusters for the profile; instead, we use them as the training set for the classifier. After the clusters are selected, we put them into the profile repository, whose basic contents are the centroid, radius and type. Here, we use the type of the majority examples in a cluster as the whole cluster's type, regardless of the minority examples.
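The profile selection and lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions: purity is computed as the fraction of majority examples in a cluster, the `min_purity` threshold is a hypothetical parameter (the text does not give a value), and the function names are chosen only for this example.

```python
import math
from collections import Counter

def purity(cluster_labels):
    """Return (majority type, fraction of examples that have that type)."""
    majority_type, majority_count = Counter(cluster_labels).most_common(1)[0]
    return majority_type, majority_count / len(cluster_labels)

def build_profile(clusters, min_purity=0.8):
    """Split clusters into profile records and leftover cluster indices.

    clusters: list of (centroid, radius, labels_of_member_examples).
    Returns (profile, leftover), where profile holds (centroid, radius, type)
    records and leftover lists the clusters kept as classifier training data.
    """
    profile, leftover = [], []
    for index, (centroid, radius, labels) in enumerate(clusters):
        cluster_type, value = purity(labels)
        if value >= min_purity:
            profile.append((centroid, radius, cluster_type))
        else:
            leftover.append(index)
    return profile, leftover

def lookup(vector, profile):
    """Return the type of the first cluster whose radius covers the vector."""
    for centroid, radius, cluster_type in profile:
        if math.dist(vector, centroid) < radius:
            return cluster_type
    return None  # not covered by the profile; hand over to the classifier
```

A vector that falls inside no profile cluster returns `None`, which corresponds to the case where the classifier trained on the low-purity clusters has to decide.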

III. PROPOSED KMSVM MODEL
To separate attacks from legitimate activities, all machine-learning-based intrusion detection technologies have two main phases: a training procedure and a detection procedure. As shown in Fig. 2, in the training procedure of KMSVM, K-means is used to extract the optimal discriminative support vectors from the whole training data. The MSVM decision function depends only on the support vectors; other vectors, far from the decision boundary, are useless to the MSVM.
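The compression idea behind this training procedure can be sketched as follows: the training data is replaced by K-means cluster centers before the SVM is trained. This is a sketch under assumptions, not the exact procedure of the paper: K-means is run separately on each class so that every center inherits a label, the compression rate is read as the ratio of original points to centers, and the initial centers are evenly spaced samples so that the example is deterministic.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm on a list of equal-length tuples."""
    step = len(points) / k
    # Assumption: deterministic start from evenly spaced samples.
    centers = [points[int(i * step)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[c]
            for c, g in enumerate(groups)
        ]
    return centers

def compress(samples, labels, compression_rate):
    """Replace each class by roughly len(class) / compression_rate centers."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    compressed, compressed_labels = [], []
    for y, pts in by_class.items():
        k = max(1, round(len(pts) / compression_rate))
        for center in kmeans(pts, k):
            compressed.append(center)
            compressed_labels.append(y)
    return compressed, compressed_labels
```

The compressed samples and labels would then be passed to the SVM trainer in place of the original data, so the number of QP variables, and hence of candidate support vectors, shrinks by about the compression rate.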

A. KMSVM ALGORITHM
Step 1: Three input parameters are selected: the kernel parameter γ, the penalty factor C, and the compression rate CR.
Step 2: The K-means clustering algorithm is run on the original data, and all cluster centers are regarded as the compressed data for building classifiers.
Step 3: SVM classifiers are built on the compressed data.
Step 4: The three input parameters are adjusted by the heuristic searching strategy proposed in this paper, according to a tradeoff between the testing accuracy and the response time.

Table 1: RBF kernel (confusion matrix)
Table 2: Polynomial kernel (confusion matrix)
Table 3: Custom kernel (confusion matrix)

Tables 1, 2 and 3 show that the RBF kernel gives good results for the normal, probe and DoS classes, the polynomial kernel gives the best results for the R2L and DoS attacks, and the custom kernel gives the best result for the DoS class. We used the complete dataset for training and testing, and the multiclass support vector machine achieved an accuracy of 73% on the whole testing dataset.

V. CONCLUSION AND FUTURE WORK
Many kernel functions can be used for intrusion detection. Among these, we conducted experiments using the RBF, polynomial and custom kernel functions with MSVM, and found that the RBF kernel function performs better for intrusion detection. The overall performance of the MSVM on the five classes (four types of attack plus normal) could be improved by combining the above three kernels into one.
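One possible way to combine several kernels into one, offered here only as a hedged sketch of the future-work idea, is a weighted sum: a sum of valid kernels with nonnegative weights is again a valid kernel. The weights, the kernel parameters and the function names below are illustrative assumptions. The custom kernel is not specified in this text, so it is left as a slot that any callable of two vectors could fill.

```python
import math

def rbf_kernel(u, v, gamma=0.001):
    """RBF kernel exp(-gamma * ||u - v||^2); the default gamma is illustrative."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def polynomial_kernel(u, v, degree=2, coef0=1.0):
    """Polynomial kernel (u . v + coef0) ** degree; parameters are illustrative."""
    return (sum(a * b for a, b in zip(u, v)) + coef0) ** degree

def combine_kernels(kernels, weights):
    """Return K(u, v) = sum_m weights[m] * kernels[m](u, v).

    Nonnegative weights keep the combination a valid (positive semidefinite)
    kernel whenever every component kernel is valid.
    """
    if len(kernels) != len(weights):
        raise ValueError("one weight per kernel is required")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative")

    def combined(u, v):
        return sum(w * k(u, v) for k, w in zip(kernels, weights))

    return combined

# Hypothetical usage: equal weights on RBF and polynomial; a custom kernel
# would be appended to the list together with its own weight.
combined_kernel = combine_kernels([rbf_kernel, polynomial_kernel], [0.5, 0.5])
```

How the weights should be chosen, for example per class or by cross-validation, is left open here.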

Figure 1. The optimal linear hyperplane (SV = support vector).

Step 5: Return to Step 1 to test the new combination of input parameters, and stop if the combination is acceptable according to testing accuracy and response time.
Step 6: KMSVM classifiers are represented as the formula in equation (8).

IV. DATASET AND EXPERIMENTS
The KDD Cup 1999 uses a version of the data on which the 1998 DARPA Intrusion Detection Evaluation Program was performed. Each instance in the KDD Cup 1999 datasets contains 41 features that describe a connection. Features 1-9 are the basic features of a packet, 10-22 are content features, 23-31 are traffic features and 32-41 are host-based features. There are 38 different attack types in the training and test data together, and the data falls into five classes: normal, probe, denial of service (DoS), remote to local (R2L) and user to root (U2R) [14]. In this experiment we use a Pentium IV (3 GHz) processor with 512 MB RAM, running Windows XP (SP2), with SVM multiclass [15]. The experiment uses the RBF [16] [17] [18], polynomial and custom kernel functions for intrusion detection (multiclass classification) with parameters g=0.001, c=0.01, q=50, n=40. Results are shown in Tables 1 to 3.
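The feature grouping of a KDD Cup 1999 record described above can be expressed as a small lookup. This is a minimal sketch: the constant and function names are assumptions for illustration, and a record is taken to be a plain sequence of 41 values with the class label already removed.

```python
# 1-based feature index ranges, as listed in the text.
FEATURE_GROUPS = [
    (range(1, 10), "basic"),
    (range(10, 23), "content"),
    (range(23, 32), "traffic"),
    (range(32, 42), "host-based"),
]

CLASSES = ("normal", "probe", "DoS", "R2L", "U2R")

def feature_group(index):
    """Return the group name of a 1-based feature index."""
    for index_range, name in FEATURE_GROUPS:
        if index in index_range:
            return name
    raise ValueError(f"feature index {index} is outside 1..41")

def split_record(record):
    """Split a 41-value record into its four feature groups."""
    if len(record) != 41:
        raise ValueError("a KDD Cup 1999 record has 41 features")
    return {
        name: [record[i - 1] for i in index_range]
        for index_range, name in FEATURE_GROUPS
    }
```

Splitting a record this way makes it easy to train or inspect a kernel on one feature group at a time.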