Online Incremental Rough Set Learning in Intelligent Traffic System

In the last few years, vehicle to vehicle communication (V2V) technology has been developed to improve the efficiency of traffic communication and road accident avoidance. In this paper, we have proposed a model for online rough sets learning vehicle to vehicle communication algorithm. This model is an incremental learning method, which can learn data object-by-object or class-by-class. This paper proposed a new rules generation for vehicle data classifying in collaborative environments. ROSETTA tool is applied to verify the reliability of the generated results. The experiments show that the online rough sets based algorithm for vehicle data classifying is suitable to be executed in the communication of traffic environments. The implementation of this model on the objectives’ (cars’) rules that define parameters for the determination of the value of communication, and for reducing the decision rules that leads to the estimation of their optimal value. The confusion matrix is used to assess the performance of the chosen model and classes (Yes or No). The experimental results show the overall accuracy (predicted and actual) of the proposed model. The results show the strength of the online learning model against offline models and demonstrate the importance of the accuracy and adaptability of the incremental learning in improving the prediction ability. Keywords—Vehicle to vehicle communication; online learning; rough sets theory; intelligent traffic system


INTRODUCTION
Nowadays, road accidents are one of the major problems in modern societies that lead to death. The increase of travel time is a main reason for increasing traffic accidents, fuel consumption and increased pollution [1], [2]. Road safety field is on focus by researchers to detect traffic congestions and, thereby, to offer solutions.
The Intelligent Transport System (ITS) is a technology to achieve safe roads and comfortable driving, by reducing accidents and delay [3]. In recent years, a research area in the road safety called vehicular network offers a possible solution that allows a communication and information exchange between vehicles, which is called vehicle to vehicle (V2V) communication, or between vehicles and road infrastructure, which is called vehicle to infrastructure (V2I) communication, [4] as shown in Fig. 1.
Since its development, Rough sets theory has been able to devise a computationally efficient and mathematically sound techniques handling imprecision in decision making [5]. The optimal solutions without losing any information can be found by using the rough sets theory which find reducts the rules for training the dataset and classifying the test set [6]. The remainder of this paper is organized as follows. In Section 2, we describe basic concepts of rough sets; in Section 3 we describe architecture and the feasibility decision table of our model. In Section 4, we present the implementation of our proposed model and show the results. Finally, we conclude our paper at Section 5.

II. BASIC CONCEPTS OF ROUGH SETS
Rough sets theory (RST) is a mathematical tool that is developed by Pawlak in 1982 [7]. In this theory, the data is collected in a table, called a decision table. Rows of concepts on rough sets theory are reviewed as follows:

Definition 1 (Information system):
Is the 4-tuple [8], [9] (U, A, C, D) where U consists of objects and A consists of attributes, the subsets C and D are called condition attribute set and decision attribute set, respectively. Every a ∈ A corresponds to the function a: U →Va where Va is the value set of a.
Definition 2 (Indiscernibility relation): Let =( , ) be an information system, and B ⊆ . we define the Bindiscernibility relation as [8], [10]: , then x and y are indiscernible by attributes from B. The equivalence classes of the Bindiscernibility relation of an object x is denoted by[ ] ind(B). www.ijacsa.thesai.org

Definition 3 (Lower and upper approximation):
Two fundamental concepts of rough sets are the lower and upper approximations of sets (which are a classification of the domain of interest into disjoint categories) in Fig. 2 [8], [9]. Given a set B ⊆ A, the lower and upper approximations of a set X ⊆ U are defined by, respectively,  [8]- [10].The positive region POS B (X) is defined by:

Definition 4 (Lower approximation and positive region):
If an object x  POS B (X), then it belongs to target set X certainly.

Definition 5 (Upper approximation and negative region):
The negative region BND B (X) is defined by [8]- [10]: If an object x NEG B (X), then it cannot be determined whether the object x belongs to target set X or not.

Definition 6 (Boundary region):
The boundary region is the difference between upper and lower approximations of a set X [8]- [10]: If an object x BND B (X) it doesn't belong to target set X certainly.

III. PROPOSED MODEL
Continuous data streams reflect continuous environmental changes, raising the need for online learning to adapt to changing conditions [11]. In this section we present the proposed model that is online rough sets learning vehicle to vehicle communication algorithm. The proposed algorithm is a methodology which uses rough sets theory to compute accurate objects for new (rules) data streams from online traffic environments. Vehicles can detect potential issues on the road and alert nearby users about incoming dangers and reduce the risk of accidents as shown in Fig. 3.

A. Model Algorithm
In our method online learning rough sets theory in vehicle to vehicle (V2V) communication environment determines objects (cars) one by one and satisfying any object to any rule or addition new rules. Our model use GPS and a wireless LAN module to allow cars to communicate within a range of 100to-300 meters.
The proposed model is designed to work based on the following algorithm:

Algorithm: Proposed model Algorithm
Input: Incrementally arriving objects in traffic environment.
Output: Optimal decision of communication.
Step 1. Initial data have one object and set of rules R.
Step 2. F= ϕ {List all best of objects} Step 3. f  get new object (car) Step 4: Determine object and rule r Step 5. If f ⸦ F then the rule r is satisfying to any exited rules Step 6. Otherwise compute a decision table, generate reduct and generate rules R.
Step 7. Update online data streaming Step 8. Repeat steps 3-7 until finish all objects.

B. Decision Table of the Model
The rough sets theory has been developed for knowledge discovery in databases and experimental data sets. An attribute-oriented rough sets technique reduces the computational complexity of learning processes and eliminates the unimportant or irrelevant attributes so that the knowledge discovery in database or in experimental data sets can be efficiently learned [12].
The Rough Sets analysis of data involved calculation of reducts from data, derivation of rules from reducts, rule evaluation and prediction processes. The rosetta rough sets toolkit was employed to carry out reducts and generate decision rules. The reducts were created from our selected data are revealed in Fig. 4. We used the Johnhon's reducer algorithm and the equal binning decretized method. www.ijacsa.thesai.org A data set can be represented as a decision table, which is used to specify what conditions lead to decisions. A decision table is defined as T= (U,C,D) where U is the set of objects in the table, C is the set of the condition attributes and D is the set of the decision attributes as shown in Fig. 4. The features are: decision, the speed of cars, the range between cars, the directions of the cars, the color and the type of cars. Fig. 4 can be used to decide whether a car has a Yes or No decision according to its features (e.g., the speed, the range and the directions). For example, the first row of this table specifies that the speed of the car is 2, with 13 range, -1 direction, green color and a truck type. The rows in this To evaluate the ability of our model to learn incrementally, we conducted experiments using 10 objects in different behaviors. Data are divided into two parts: training and testing sets. The very first task is to find reducts and rules. This Data set contains 7 attributes including the decision attribute which may be Yes or No, and there are 100 objects (or) records in this data and with no missing attribute values. The value and meaning of condition and decision attributes is shown in Fig. 4 as true (Yes) class, or false (NO) class.

IV. EXPERIMENT, RESULTS AND DISCUSSION
In the experiment, we have evaluated the data with ROSETTA software. ROSETTA is a toolkit application which allows the analysis of tabular data using the rough sets methodology to implement Johnson's algorithm rough sets for attribute selection. Rosetta is an open source collection of C++ classes and routines used for data mining and machine learning in general and particularly for rough sets theory [13].
The toolkit follows some important procedures for producing the accurate result.

A. Implementation Process
The steps are: importing data from any valid data source (e.g. Excel format), applying the binary splitting algorithm in the imported data to split the original dataset into training and test data, removing the missing values, and finally applying the reduction and classification algorithms. The reduction algorithm is used to compute the reduct set and the classification algorithm is used to reduct rules and compute the classification result.
The input data set is divided into two parts with the 0.9, 0.8, 0.7, 0.6 and 0.5 split factor. The first part is known as training data set and the other one is known as test data set. The training set was reduced by using Johnson's reduction algorithm [14], which uses greedy search to find one reduct. In Table I, numbers of reduct sets produced through the application of Johnson's reduction algorithm are illustrated. The Johnson's reduction algorithm produced 9 combinations of reduct sets. An example of a rule obtained from reducts is shown in Table I. A full training dataset of each dataset object is used to train the classifiers to build the classification models that were evaluated on the test data of the same objects. The table shows the average accuracy for the predicted accuracy and the accuracy sensitivity as 50.59% and 65.87%, respectively.
The testing1 is training-testing of 90-10, testing 2 is training-testing of 80-20, testing 3 is training-testing of 70-30, testing 4 is training-testing of 60-40 and testing 5 is trainingtesting of 50-50. The classification results of original object are shown in Table II.
The reduction rule explains the rule support, stability, length, coverage and accuracy. Each row of the reduction rule is called descriptors (Attribute→ value). The left hand side of the rule is called the antecedent and right hand side of the rule is called consequent. This reduction rule result used in the classification process. This rule is used to make the confusion matrix.

A. Results and Discussion
Table II exhibits the classification accuracy of original object. To find the percentage of accuracy, dataset has been changed as training set and testing set according to the mentioned ratio.
For more evaluation for the model's capability to learn incrementally, we conducted experiments using different testing types this process was repeated 10 times in different (10 objects) cars. The ten different testing results (classification accuracy) for our model are shown in Table III. Our model has the capability to learn new objects (cars) from data streams in online environments and can accurately detect the appropriate car to communicate in road traffic.    Rough sets have been employed here to remove redundant conditional attributes from discrete-valued datasets, while retaining their information content. This approach has been applied to aid classification of online traffic environment, with very promising results.
Using online rough sets in our model allows efficient updates and avoids the process of retaining the whole data, a major disadvantage of offline models, when new data are coming, and it is most helpful when dealing with big data.   Vehicle to vehicle communication has offered several solutions for traffic problems that reduce the danger of collision. This research aimed to design a prototype online rough sets learning in traffic vehicular communication. Optimal communication vehicles were decided by using online learning rough sets to evaluate the percentage of accuracy decision (Yes or No). Generally, online learning data requires an updatable model. That is, the model should in some objects "evolve" in response to the streaming data.
This study presents an incremental learning algorithm which learns new classes in online environments, allowing our model to be updatable and evolve to detect new objects (cars). The model attempts to learn the rules of objects (cars) where the following vehicle notifies the cars.
Our model uses ROSETTA tool in rough sets data analysis to emphasize the classification, in prediction of the learning. Several tests were done by changing the training and the testing dataset ratio. The confusion matrix is used to assess performance of chosen model and classes (Yes or No). The experimental results show that overall accuracy (predicted and actual accuracy) of the object is evolved in our proposed online model.
The limitation of our model is that the run time is not perfect which affects the classification accuracy. More experiments are needed in future.