A New Clustering Algorithm for Live Road Surveillance on Highways based on DBSCAN and Fuzzy Logic

Video streaming over Vehicular Ad Hoc Networks (VANETs) is a promising technique that has gained great importance in the last few years. The highly dynamic topology of VANETs makes high-quality video streaming very challenging. Clustering and cluster head selection techniques are used to provide the most useful camera views to the vehicles. Because overly frequent camera view changes can be annoying, we propose a new stable clustering algorithm that ensures a stable live road surveillance service without interruptions for vehicles that do not have enough vision area. The proposed solution integrates Density-Based Spatial Clustering of Applications with Noise (DBSCAN) with Fuzzy Logic Control (FLC). DBSCAN is used to form the clusters, while FLC is used to find the best cluster head for each cluster. The algorithm uses density parameters for DBSCAN, and relative speed, acceleration, leadership degree, and vision area for fuzzy logic. The proposed algorithm has been compared with another clustering scheme and showed better results in terms of cluster lifetime and vehicle status change.
Keywords: Vehicular ad hoc networks (VANETs); V2V; intelligent transportation systems; clustering algorithms; road surveillance; DBSCAN algorithm; fuzzy logic control

cluster maintenance and coordinating the transmission among CMs in the same way as an infrastructure wireless access point.
In this paper, we propose a new algorithm that aims to achieve stable clusters and find a suitable CH for vehicles that want to learn the road conditions via a video surveillance service. The DBSCAN technique is used to form the clusters, while Fuzzy Logic Control (FLC) is used to select the best CH. The elected CH is responsible for providing video surveillance of the road conditions to all CMs in the same cluster, using the on-board camera installed inside the vehicle. Our proposed scheme has been compared with the Effective-Vision-Area-Based Clustering Algorithm with the Adaptive Video-Streaming Technique (EVAC-AV) and proved effective in increasing the cluster lifetime. This paper is organized as follows. Section II presents the literature review. Section III describes the proposed clustering approach. Section IV presents the simulation environment and the methodology. The performance evaluation and results are introduced in Section V. Finally, Section VI concludes the paper.

II. RELATED WORK
The clustering mechanism is an effective technique used to streamline critical functions such as media access management, quality of service achievement, and bandwidth allocation. In general, the nodes in clustering algorithms have three states: CH state, normal state (NS), and CM state. These terms may vary between articles, but the notions are the same. The CH is the focal point of the cluster and is elected to coordinate it, while NS represents the state of a node that does not belong to any cluster. When such a node joins a cluster, it becomes a CM. Fig. 1 shows the topology of three clusters, each of which elects a single CH, and clarifies how the different nodes are formed and grouped.
Due to the significance of the issues that clustering addresses, many clustering methods have been proposed lately in the context of VANETs. Most of them aim to achieve network stability. Mehmood et al. [19] employed knowledge of the traffic flow in addition to several metrics, such as the degree of connectivity, node position, direction, and speed variation, to form stable clusters. The naïve Bayesian probabilistic estimation technique is used to enhance cluster stability and increase the CH lifetime. The proposed technique was compared with other algorithms and showed improvements in cluster and CH lifetime. Despite the efficiency of the ANTSC algorithm in selecting the CH and increasing the cluster lifetime, it was evaluated for one particular scenario, so it remains unclear whether it can be used in different scenarios. Moreover, the naïve Bayesian probabilistic estimation requires real datasets for each zone, which makes it inapplicable when such datasets are lacking.
The authors in [20] proposed a new clustering algorithm to select the most suitable CH based on FLC. A blend of several metrics, such as speed, distance, acceleration, and direction, was considered as input to the proposed cluster head selection algorithm. The results showed that the developed fuzzy logic (FL) based Cluster Head Selection Algorithm (CHSA) increased the stability of the CH and generally improved performance in various VANET scenarios.
The Fuzzy-Based Cluster-Management System (FBCMS) has been proposed in [21]. Two models of this system have been created, each with different parameters for selecting the most appropriate cluster head. The first model uses three parameters as fuzzy logic inputs: group speed, relative acceleration, and security. The second model uses four metrics: the same three as the first model plus the degree of connectivity. However, using the location of vehicles in relation to a fixed RSU as one of the parameters to determine the CH in a highly dynamic environment like a VANET could have a negative impact on the stability of clusters and may lead to frequent network disconnection, especially on highways.
The authors of [22] proposed a novel clustering scheme that depends on the average speed of vehicles and its standard deviation to increase the cluster lifetime. Two clustering patterns have been introduced, which rely mainly on the principle of the normal (Gaussian) distribution and the relative speed. The calculated residence time of vehicular nodes in a cluster is used as a stability criterion. The first pattern represents a highly stable cluster formed only by vehicles whose speeds lie within one standard deviation of the mean (i.e., only about 68% of the vehicles are permitted to form this cluster). The cluster head is elected from the vehicles whose speeds are close to the average cluster speed. The second pattern aims to group about 95% of the vehicles in one cluster by selecting only the vehicles whose speed deviation is at most twice the standard deviation (≤ 2σ). The analysis showed that the second pattern is less stable than the first one. These two metrics (average speed and standard deviation) alone are not enough to establish stable clusters and select the optimal cluster heads for them, since other parameters, such as acceleration and position, should also be taken into account.
581 | P a g e www.ijacsa.thesai.org
Regarding video streaming and live road surveillance, EVAC-AV [23] has been proposed as a clustering solution. A cluster is initiated when a vehicle disseminates a request to join a cluster in order to obtain a live road surveillance service; the vehicles ahead of it are then triggered to calculate their vision area. If a vehicle's vision area is larger than a predefined vision area threshold, it is considered a candidate CH.
Using the largest vision area as the single parameter for determining the best cluster head is not enough to create a stable cluster, especially since the video streaming service is the one most affected by changes and reclustering. Furthermore, the other algorithms that aim to provide stable clusters depend mainly on RSUs as a key parameter, which makes them difficult to apply in highway environments that lack V2I technologies. Therefore, this paper proposes a new stable V2V clustering algorithm entitled "A New Clustering Algorithm for Live Road Surveillance on Highways Based on DBSCAN and Fuzzy Logic". This algorithm aims to create stable clusters based on the density of vehicles on the road by using DBSCAN to form clusters and FLC to select the optimal CH. DBSCAN constructs the clusters without having to rely on RSUs as other algorithms do, while FLC is used to select the best cluster head.

III. PROPOSED ALGORITHM
This paper presents a new algorithm that can detect and form a cluster automatically when the density of vehicles increases, and that selects the optimal CH. The strength of our algorithm is derived from the integration of the DBSCAN algorithm with fuzzy logic control. We assume that all vehicles are fitted with OBUs so that they can handle IEEE 802.11p as a Dedicated Short Range Communications (DSRC) system. Each vehicle broadcasts its information by sending a Cooperative Awareness Message (CAM), which is a single-hop broadcast communication. The CAM is sent periodically at regular time intervals called T_update. Based on these messages, which carry the speed, position, and direction, each vehicle senses its current neighboring vehicles and updates its Neighbor Table. Our proposed algorithm aims to provide a stable density-based clustering technique on a highway and consists of two phases: the cluster configuration phase and the cluster head selection phase.
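As a rough illustration of the Neighbor Table update described above, the following Python sketch stores the latest CAM received from each neighbor and drops entries that have gone silent. The `Cam` fields mirror the speed, position, and direction mentioned in the text; the class names, the `max_missed` expiry rule, and the integer direction flag are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Cam:
    """Hypothetical CAM payload carrying the fields named in the text."""
    vehicle_id: str
    timestamp: float   # reception time in seconds
    position: tuple    # (x, y) in metres
    speed: float       # metres per second
    direction: int     # +1 or -1 along the highway (assumed encoding)


class NeighborTable:
    """Keeps the most recent CAM per neighbor (illustrative only)."""

    def __init__(self, t_update, max_missed=3):
        # Assumption: a neighbor is forgotten after max_missed silent
        # T_update intervals. The paper does not specify an expiry rule.
        self.ttl = t_update * max_missed
        self.entries = {}

    def on_cam(self, cam):
        # A newer CAM from the same vehicle overwrites the older entry.
        self.entries[cam.vehicle_id] = cam

    def purge(self, now):
        # Drop neighbors whose last CAM is older than the time-to-live.
        self.entries = {
            vid: cam for vid, cam in self.entries.items()
            if now - cam.timestamp <= self.ttl
        }

    def neighbors(self, now):
        # Return the identifiers of the currently known neighbors.
        self.purge(now)
        return sorted(self.entries)
```

With `t_update = 1.0`, a neighbor heard at t = 0 s is still listed at t = 3 s and is removed by t = 4.5 s under the assumed expiry rule.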

A. Overview of DBSCAN
Clustering is one of the most commonly used and powerful unsupervised learning mechanisms in machine learning. It is a useful tool that aims to classify the input data into sets depending on some similarity calculations. Clustering algorithms are categorized into groups such as partitional algorithms, density-based algorithms, and hierarchical algorithms [24]. Among them, DBSCAN has been selected for our proposed algorithm because it has many features that make it more suitable than other clustering techniques. DBSCAN is an effective density-based clustering algorithm for spatial data systems due to its ability to discover clusters with arbitrary shapes in one scan, unlike, for example, K-means, which needs many iterations to find the clusters. It is also able to detect outliers, and it does not need the number of clusters to be predetermined. In DBSCAN, the distance between two points is determined by a distance metric, such as the Euclidean distance. However, two parameters must be specified in DBSCAN: Epsilon (ε) and Minimum Points (MinPts). ε represents the maximum distance between two points, which means that if the distance between two points is lower than or equal to ε, these points are considered neighbors. MinPts represents the minimum number of neighbors a point must have within ε; it is used to identify whether the point is a core point, border point, or noise point [25].

B. Vehicle Vision Area
Vision area plays an essential role in defining the cluster topology and electing the CH because the CH is responsible for providing live video surveillance to all vehicles located behind it that do not have enough vision area. No vehicle can be nominated to be a CH if it does not have a sufficient vision area. Therefore, the vehicles are classified into two classes: (i) vehicles that have vision area (V_vision), which can be potential CHs; (ii) vehicles that do not have vision area (V_no-vision), which can be possible CMs. We assume each vehicle has a camera mounted on the vehicle dashboard to capture live road conditions. The Distance Threshold (D_th) is used to define the vision area. A vehicle is a V_vision if the distance between it and the adjacent front vehicle in the same lane is at least D_th; if the distance is less than D_th, it is considered a V_no-vision. It is worth noting that the D_th value is the same as the value of the ε parameter used in the DBSCAN algorithm, which represents the safety distance that gives the driver sufficient time to make an appropriate decision when overtaking or in an emergency such as a sudden incident.
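The classification above can be sketched in a few lines of Python. This is a minimal illustration, assuming that a gap of at least D_th to the front vehicle in the same lane means sufficient vision, that a vehicle with no vehicle ahead in its lane has an unobstructed view, and that positions are given as longitudinal coordinates in metres. The function name and data layout are hypothetical.

```python
def classify_vision(positions_by_lane, d_th):
    """Label each vehicle as "V_vision" or "V_no-vision".

    positions_by_lane: {lane_id: {vehicle_id: longitudinal position in m}}
    d_th: distance threshold in metres
    """
    labels = {}
    for vehicles in positions_by_lane.values():
        # Sort each lane from the front-most vehicle to the rear-most one.
        ordered = sorted(vehicles.items(), key=lambda kv: kv[1], reverse=True)
        for idx, (vid, pos) in enumerate(ordered):
            if idx == 0:
                # Assumption: the lane leader has nothing blocking its view.
                labels[vid] = "V_vision"
                continue
            gap = ordered[idx - 1][1] - pos  # distance to the front vehicle
            labels[vid] = "V_vision" if gap >= d_th else "V_no-vision"
    return labels
```

For example, with D_th = 50 m, a vehicle 30 m behind its front neighbor is labeled V_no-vision, while one 70 m behind is labeled V_vision.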
Where y is any vehicle in the neighborhood of vehicle x and D represents the DBSCAN dataset. Three types of nodes are defined in DBSCAN:
1) Core Node: The vehicle x is considered a core node if |N_ε(x)| ≥ MinPts.
2) Border Node: The vehicle x is considered a border node if |N_ε(x)| < MinPts, but one of the nodes in N_ε(x) is a core node.
3) Noise Node: The vehicle x is considered a noise node if |N_ε(x)| < MinPts and none of the nodes in N_ε(x) is a core node.
After DBSCAN has found a core node, the remaining adjacent vehicular nodes are checked consecutively to identify the next core node. If another nearby node is a core point, the cluster domain is extended. DBSCAN continues this process until no more core points can be found. Fig. 3 shows how the cluster is established after being originated by vehicle x. All the vehicles are included in the group except the vehicle considered a noise node. Note that the vehicle in the center of the blue circle is a core node, the vehicle in the center of the yellow circle is a border node, and the vehicle in the center of the red circle is a noise node. If a suitable CH exists, a vehicle without sufficient vision area sends a request to join the cluster and becomes a CM. If more than one CH is in front of it, it selects the closest one.
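The node types and the cluster expansion described above can be illustrated with a compact, textbook-style DBSCAN in Python. This is a sketch under stated assumptions rather than the authors' implementation: it uses the Euclidean distance, counts a vehicle as a member of its own ε-neighborhood (the common convention), and takes the example values ε = 50 m and MinPts = 3 from the simulation settings. The function names are hypothetical.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return (labels, roles): cluster id per point (-1 = none) and node type."""
    n = len(points)
    neighbors = [region_query(points, i, eps) for i in range(n)]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not is_core[i] or labels[i] != -1:
            continue  # only an unassigned core node can start a cluster
        labels[i] = cluster
        frontier = [i]
        while frontier:
            p = frontier.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if is_core[q]:
                        # A neighboring core node extends the cluster domain.
                        frontier.append(q)
        cluster += 1
    roles = [
        "core" if is_core[i] else ("border" if labels[i] != -1 else "noise")
        for i in range(n)
    ]
    return labels, roles


# Vehicles as (longitudinal position, lateral lane offset) in metres.
vehicles = [(0, 0), (30, 3.5), (60, 0), (100, 7), (400, 0)]
labels, roles = dbscan(vehicles, eps=50, min_pts=3)
```

In this toy layout the first four vehicles end up in one cluster, with the two middle vehicles acting as core nodes and the outer two as border nodes, while the isolated vehicle at 400 m is treated as noise.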

D. Cluster Head Selection (CHS) Phase
Cluster Head Selection (CHS) plays a significant role in cluster stability, which in turn represents one of the performance criteria in VANETs. The CHS process starts after cluster creation, and only the V_vision vehicles in the cluster enter the CHS phase as Candidate Cluster Heads (CCHs). FLC is the technique used to find the most suitable CH in the cluster. Fuzzy logic is an effective multi-characteristic decision technique because of its ability to trade off significance against precision. Three parameters are considered in the CHS phase: Cluster Speed (CS), Vehicle Acceleration (VAcc), and Leadership Degree (LD). CS is determined by calculating the average speed of the vehicles in the cluster. LD is a value between 0 and 1 that shows whether the CCH is eligible to be a CH. A value of 0 means that the CCH is at the back of the cluster (all potential CMs are in front of it) and is therefore not eligible to be a CH, while 1 means that the CCH is at the front of the cluster and has the highest degree of eligibility to be a CH. The LD metric is calculated for each CCH from N and V_B, where N is the number of vehicular nodes in the cluster and V_B is the number of V_no-vision vehicles behind the CCH in the cluster.
These three metrics are fuzzified using the fuzzy logic system. Fig. 4 shows our CHS system. As shown in this figure, three parameters (CS, VAcc, LD) are considered as inputs of the fuzzification system. The function of the fuzzification system is to convert the actual values of the input parameters into fuzzy sets by using membership functions. There are many types of membership functions. In our CHS system, we have utilized triangular and trapezoidal membership functions, as shown in Fig. 5, because they are more efficient in real-time applications [26]. The term sets of CS, VAcc, and LD are defined respectively as:
leaves the cluster when it overtakes the CH and becomes in front of it, so the video streaming service from the CH becomes useless. In this case, it will send a Leave message to its CH.
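To make the fuzzification step concrete, the sketch below implements generic triangular and trapezoidal membership functions in Python and applies them to an LD value. The term names ("low", "medium", "high") and all breakpoints are hypothetical placeholders chosen for illustration; the paper's actual term sets and the shapes in Fig. 5 are not reproduced here.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership with a <= b <= c <= d.

    Setting a == b or c == d gives a left or right shoulder.
    """
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical term set for the Leadership Degree (LD) input on [0, 1].
LD_TERMS = {
    "low": lambda x: trapezoidal(x, 0.0, 0.0, 0.2, 0.5),
    "medium": lambda x: triangular(x, 0.2, 0.5, 0.8),
    "high": lambda x: trapezoidal(x, 0.5, 0.8, 1.0, 1.0),
}


def fuzzify(x, terms):
    """Map a crisp input to its degree of membership in each linguistic term."""
    return {name: mu(x) for name, mu in terms.items()}
```

With these placeholder breakpoints, an LD of 0.5 belongs fully to "medium", and an LD of 0.35 is split roughly evenly between "low" and "medium". The same two functions could be reused for CS and VAcc with their own term sets.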

IV. TOOLS AND METHODOLOGY
Our proposed clustering algorithm has been evaluated using MATLAB R2017b, while the mobility of vehicles has been simulated by the Simulation of Urban Mobility (SUMO). Our algorithm is designed for highways, so a 5 km highway is modeled in SUMO. The road consists of six lanes, three lanes for each direction, on which 50 vehicles moving in the same direction at different speeds were deployed. The speed of the vehicles ranges from 60 to 120 km/h. The standard D_th is 50 m, which is the distance approved as the safety distance between vehicles when the speed of vehicles is 100 km/h [23]. In our work, we use different D_th values to confirm the efficiency of our proposed algorithm. Concerning the DBSCAN parameters, MinPts is set to the number of lanes in the same direction, and ε is set to the safety distance, so if D_th is 50 m, then ε is 50 m. The main parameters applied in the simulation are listed in Table II.
MATLAB and SUMO have been connected by TraCI (Traffic Control Interface), which creates a TCP connection between them. SUMO acts as a server (TraCI-Server) and MATLAB as a client (TraCI-Client). The performance of the proposed clustering algorithm was evaluated by comparing our results with the results of the EVAC-AV algorithm. We chose EVAC-AV, introduced in the related work section, as the benchmark algorithm because it is the only clustering algorithm that is based on vision area estimations and uses V2V mode. The other algorithms have been excluded from the comparison because they differ in terms of purpose, parameters, and calibration. Our aim is to improve the performance and increase the stability of CHs as well as to decrease the number of vehicle status changes. The following two performance metrics were used:
• Average cluster lifetime: It is defined as how long each cluster lasts continuously. A higher value of this measure denotes better stability.
• Vehicle status change: It is defined as the number of status changes per vehicle during its lifetime.
Fig. 6 and Fig. 7 demonstrate the distribution of CHs, CMs, and NSs during the simulation time when D_th = 50 m and 70 m, respectively. They show a high number of NSs in each time step. These vehicles remain in NS because they have enough vision area and do not need to join any cluster, and because they are moving in an area with a low density of vehicles, so they are detected as noise nodes (NSs) when DBSCAN forms the clusters. This is considered a virtue because our proposed algorithm aims to form clusters and provide stable CHs only for the V_no-vision vehicles that need video streaming to know the road conditions. Therefore, we classified the NSs into NS_no-vision and NS_vision to calculate the exact number of vehicles in NS that do not have enough vision area and do not find a CH. Fig. 8 shows the number of NS_vision and NS_no-vision during the simulation time. The results showed that the percentage of remaining NS_no-vision is 2% during the simulation time when D_th = 50 m (Fig. 8 (a)), while it is up to 8% when D_th = 70 m (Fig. 8 (b)).
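The two performance metrics defined above could be computed from simulation logs roughly as follows. This is a minimal Python sketch assuming that each cluster is logged as a (start, end) time pair in seconds and that each vehicle's status (CH, CM, or NS) is recorded once per time step; the function names and log formats are hypothetical and are not taken from the paper's MATLAB code.

```python
def average_cluster_lifetime(cluster_intervals):
    """Mean continuous lifetime over clusters given as (start, end) in seconds."""
    durations = [end - start for start, end in cluster_intervals]
    return sum(durations) / len(durations) if durations else 0.0


def status_changes_per_vehicle(status_log):
    """Mean number of status changes per vehicle.

    status_log: {vehicle_id: [status at each time step]}, where each status
    is one of "CH", "CM", or "NS".
    """
    if not status_log:
        return 0.0
    total = sum(
        sum(1 for prev, cur in zip(seq, seq[1:]) if prev != cur)
        for seq in status_log.values()
    )
    return total / len(status_log)
```

A vehicle logged as NS, CM, CM, CH contributes two status changes, and one that stays in NS throughout contributes none.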
Our proposed algorithm is compared with EVAC-AV in terms of cluster lifetime and number of clusters. As shown in Fig. 9, the cluster lifetime increased in comparison to the EVAC-AV algorithm for the different D_th values. At D_th = 50 m, the average cluster lifetime of our algorithm was 51 s, while that of EVAC-AV was 18.84 s during the simulation time. At D_th = 70 m, the average cluster lifetime of our algorithm was 50.57 s, whereas that of EVAC-AV was 26.52 s. Also, the number of clusters has been more than halved compared to the EVAC-AV algorithm at the different D_th values, as shown in Fig. 10.
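The reported lifetimes can be turned into relative gains with simple arithmetic. The short Python sketch below only restates the four figures quoted above; the helper name is hypothetical.

```python
def lifetime_gain(ours, baseline):
    """Return (ratio, percentage increase) of one lifetime over a baseline."""
    ratio = ours / baseline
    return ratio, (ours - baseline) / baseline * 100.0


# Average cluster lifetimes in seconds, as reported in the text:
# D_th in metres -> (proposed algorithm, EVAC-AV).
reported = {50: (51.0, 18.84), 70: (50.57, 26.52)}

for d_th, (ours, evac) in reported.items():
    ratio, percent = lifetime_gain(ours, evac)
    print(f"D_th = {d_th} m: {ratio:.2f}x the EVAC-AV lifetime (+{percent:.1f}%)")
```

This gives a lifetime roughly 2.7 times that of EVAC-AV at D_th = 50 m and roughly 1.9 times at D_th = 70 m.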
Regarding vehicle status change, it was measured by calculating the rate of total status changes of the vehicles for different D_th values. The results have been compared with EVAC-AV under the same conditions. As displayed in Fig. 11 and Fig. 12, our proposed algorithm shows better results by decreasing the rate of vehicle status change.

Vol. 11, No. 8, 2020

VI. CONCLUSION
Video streaming provides drivers with substantial information for safety, emergency, and entertainment purposes. Clustering algorithms can be used as effective methods to improve and organize the work of the network. In this paper, a new clustering algorithm is proposed for road surveillance. It is characterized not only by the ability to detect and form clusters when the density increases on the highway but also by finding the optimal CH for each cluster. The merits of this algorithm come from merging DBSCAN with FLC. DBSCAN is responsible for forming the cluster when it is triggered by a vehicle that needs video streaming to know its road condition. The CH is responsible for providing live road surveillance to all vehicles in the cluster. Fuzzy logic control is used to select the CH based on three metrics: cluster speed, acceleration, and leadership degree. Our proposed algorithm uses the vision area as a crucial parameter, so no vehicle can be nominated to be a CH if it does not have a sufficient vision area. Simulation results showed that our proposed algorithm produces fewer CHs and clusters and less variation in vehicle status. Additionally, our proposed algorithm increases the cluster lifetime in comparison with the EVAC-AV algorithm, which means the road surveillance service will be more efficient and more stable. In future work, we would like to add further metrics such as link connectivity duration.