Hybrid Deformable Convolutional with Recurrent Neural Network for Optimal Traffic Congestion Prediction: A Fuzzy Logic Congestion Index System

Abstract: In the field of Intelligent Transportation Systems (ITS), traffic congestion is an important problem. Traffic blockage affects quality of life, travel time, the national economy, and the mobility of people. In ITS, traffic congestion information is collected and analyzed, and methods to prevent congestion are derived from the predictions. However, handling such huge volumes of data is still challenging. The rapid increase in vehicle usage and road construction has resulted in traffic congestion. Various studies have been carried out in ITS to realize traffic management systems that use few resources. Real-time traffic services have been implemented to prevent congestion in existing areas; these services provide high accuracy, but at high expense. This paper develops a new technique to predict traffic congestion using improved deep learning approaches. First, the benchmark dataset is gathered and pre-processed by removing bad data, organizing the raw data, and filling null values. The optimized weighted features are then selected from the pre-processed data by a new meta-heuristic Hybrid Jaya Harris Hawk Optimization (HJHHO) algorithm. Congestion parameters such as the speed reduction rate, very low speed rate, and volume to capacity ratio of vehicles are predicted by the proposed Improved Deformable Convolutional Recurrent Network (IDCRN) prediction model. These predicted measures are passed to a fuzzy inference system for congestion index computation. The experimental analysis shows that the proposed method reduces the error rate compared with other deep learning and machine learning approaches.


I. INTRODUCTION
Population growth and rapid urbanization have drastically increased traffic congestion in most developing cities and large areas all over the world. Congestion also increases both road rage incidents and commute time. Consequently, accidents are increasing due to high traffic congestion [1]. Traffic management has therefore become a highly significant research topic. Congestion can be prevented either by developing the transportation infrastructure, which is expensive, or by organizing feasible traffic schemes, such as analyzing blocking patterns or predicting traffic congestion in the short term, which can be more effective for road networks within a short time [2].
Timely congestion prediction models focus on traffic factors such as speed, volume, and traffic stream, but their coverage of groups of roads, one-way roads, and especially small rural roads is limited by the unavailability of data. This limitation in predicting road networks causes trouble for both traffic agencies and commuters [3]. Researchers use data gathered from sensors such as CCTV cameras and road sensors. These road sensors are located throughout the road network, and even in vehicle networks, to operate on all routes. This type of data is inconvenient because installation, operation, and maintenance are difficult, and it is expensive because third parties require permission to access the data [4]. Real-time traffic information, like the average speed and blocking level on a section of roadway, is provided by web services such as the Seoul Transportation Operation and Information Service (TOPIS), Google Traffic, Baidu Map, and Bing Map [5]. These web services are freely accessible and offer congestion-related information for the majority of cities all over the world, but only a few studies are based on them. Such data requires multiple time-series inputs, and processing numerous input traffic images is complex and costly [6].
In past years, researchers followed data-driven methods to develop mathematical and statistical models that detect time-series relations in traffic data, also known as parametric methods [7]. Existing works mainly relied on assumptions of stationarity and linearity to capture prediction trends, as in the smoothing model, error component method, and historical average method [8]. The Autoregressive Integrated Moving Average (ARIMA) model decomposes seasonal patterns and long-term trends to detect the pattern. However, it is unable to focus on the mean value of the time series and is incapable of predicting extremes. Researchers began to concentrate on nonparametric machine learning models due to the limitations of parametric models [9]. A nonparametric model depends on the training data to establish the number of parameters and the model structure [10]. The machine learning approach called K-Nearest Neighbor (KNN) predicts the traffic stream by searching the historical database for the nearest neighbors matching the present data. The Support Vector Machine (SVM) approach minimizes the structural risk and has advantages for high-dimensional data and small samples [11]. The Bayesian Network (BN) approach [...]
• To select the weighted features with the HJHHO algorithm by optimizing the weights and selecting the features from the pre-processed data through minimizing the variance of the features.
• To compute the congestion index with a well-performing fuzzy inference system that takes the speed reduction rate, very low speed rate, and volume to capacity ratio as inputs.
• To compare the performance of the suggested model with other existing approaches by evaluating different measures to examine the outcomes of the proposed model.
The remaining sections of this paper are organized as follows. Section II presents the related works. Section III describes the architectural view of optimal traffic congestion prediction with the new hybrid heuristic algorithm and fuzzy logic system. Section IV describes the enhanced traffic congestion prediction with the new optimized weighted feature selection. Section V describes the optimal traffic congestion prediction by hybrid deep learning, with traffic congestion index computation by the fuzzy model. Section VI presents the results and discussion. Section VII gives the conclusion.

A. Related Work
In 2018, Tseng et al. [18] applied Apache Storm in a real-time traffic congestion prediction scheme by analyzing various data such as rainfall volume, road density, and traffic incidents. The new SVM-based Real-Time Highway Traffic Congestion Prediction (SRHTCP) method gathered traffic incident data reported on roadways from the Taiwan police broadcasting service, and weather reports were collected from the Taiwan Central Weather Bureau. Fuzzy set theory was used to estimate the traffic congestion in real time with regard to rainfall volume, road density, and road traffic incidents. The road speed for the next time period was predicted by the SRHTCP method by analyzing the weather data and traffic flow data. The SRHTCP scheme achieved better prediction accuracy than prediction models based on the weighted exponential moving average approach.
In 2020, Ranjan et al. [19] proposed three techniques to predict traffic congestion, such as an effective and low-cost data acquisition scheme that captures snapshots of traffic congestion from the online open-source web service TOPIS, and a hybrid deep learning approach that predicts the traffic congestion level by extracting temporal and spatial information. The proposed model analyzed the temporal and spatial relationships for predicting the traffic congestion level efficiently and effectively. The suggested model outperformed other Deep Neural Network (DNN) models.
In 2020, Shin et al. [20] proposed a deep learning approach to predict traffic congestion and, in addition, to correct missing spatial and temporal values. The traffic data were pre-processed with an outlier removal method using the Median Absolute Deviation (MAD). The missing spatial and temporal values were then corrected using spatial and temporal trends and pattern data. The proposed prediction model was combined with LSTM for learning the time-series data. The suggested model was compared with other existing models and achieved better prediction results.
In 2019, Zhao et al. [21] proposed an optimized GRU architecture to predict the speed of trucks on urban roads under non-periodic congested situations. The driving data of trucks on Beijing roads was gathered. To get rid of unwanted data, the data were pre-processed and screened, and then the sequence of traffic speed was extracted. The learning rate was not adjusted by the Stochastic Gradient Descent (SGD) weight optimization algorithm, so the weights in the proposed GRU model were optimized by Adadelta, RMSprop, and Adam. The accuracy of the suggested model was verified in four scenarios: accident, workday, rainy day, and weekend.
www.ijacsa.thesai.org
In 2020, Zaki et al. [22] suggested a new Hidden Markov Model (HMM) to determine the traffic states in two-dimensional (2D) space during peak hours. The proposed model captured the variance in traffic pattern data using the contrast and mean speed. The proposed model reduced the prediction error compared with other neuro-fuzzy and HMM approaches.
In 2019, Wena et al. [23] suggested a Hybrid Temporal Association Rules Mining model to predict traffic congestion. The traffic states were predicted using the DBSCAN algorithm, which has suitable rules for analyzing the traffic congestion on roadways. The temporal association rules in traffic states were extracted using a genetic algorithm based temporal association rules mining algorithm. A classification process was used to predict the traffic congestion level. The prediction and simulation tests were studied on roadways of different sizes. The simulation results showed that the suggested model predicted the traffic congestion with high precision.
In 2016, Li et al. [24] proposed an adaptive real-time prediction model. This scheme comprised a traffic pattern recognition algorithm based on an adaptive threshold calibration method, adaptive K-means clustering, and a 2D speed prediction method. The patterns in the traffic data were obtained by the adaptive K-means clustering. The adaptive threshold calibration method was applied to recognize and predict the traffic congestion. The results showed that the adaptive K-means clustering recognized the traffic patterns better than the Gaussian mixture method. The proposed model performed well in real-time traffic congestion prediction and gained better accuracy.
In 2019, Song et al. [25] applied the k-means clustering algorithm to classify the spatio-temporal distribution of congested roadways. Then, the spatio-temporal features were mined to extract the potential factors using a geo-detector. The congested patterns were selected from six inter-regional and intra-regional roads on weekdays. Public facilities like tourist spots, hospitals, and green spaces were often congested in both off-peak and peak hours. The results suggested that roads built in high-density areas could reduce repeated trips in the city center. The land use plan involved a detailed design of the environment to enhance various travel modes in order to reduce traffic congestion and increase the effectiveness of traffic. More advanced techniques were applied to land use planning and traffic congestion based on multi-source real-time data.

B. Problem Statement
Multi-source data are collected and evaluated to predict and prevent traffic congestion in transportation systems. Training on all of this data is still challenging. Various studies have been carried out to handle this type of huge road data and to prevent real-time traffic congestion using different techniques. Table I shows the features and challenges of traditional congestion prediction methods on traffic flow. SVM [18] returns better prediction accuracy, and the car speed of the next time period is predicted by the SRHTCP model. However, the open datasets used are not verified using the t-test technique. CNN and LSTM [19] train the image data using a large resolution on a smaller resource, and better performance is achieved with respect to computing time. Still, the performance of the model is not enhanced by including external factors such as weather information for every road. LSTM [20] solves the long-term dependency problem and learns the time-series features associated with the traffic data. Yet, it does not enhance the accuracy in urban areas and low-speed regions. GRU [21] offers an efficient information service for truck drivers and simultaneously meets the efficiency and accuracy of matching. Still, its applicability to real traffic systems is not considered. HMM [22] enhances the recovery vehicle count in a specific stretch of road at particular times and also supports enhancement strategies and longer-term traffic management such as ramp metering. However, the technique of contrast is not verified on extra datasets. GA [23] minimizes the correlation in the environments and also predicts the traffic congestion with a low error. Yet, it does not automatically choose the best parameter values. Adaptive K-means clustering [24] is insensitive to the actual congestion probability and also enhances the prediction performance in real time.
However, it does not consider external influencing factors like mega-events and weather for enhancing the traffic mode recognition. The mining technique [25] improves the accessibility for distinct travel modes and also identifies the potential urban form factors and the traffic congestion hotspots. Still, it does not use novel approaches to derive useful knowledge for improved decision making. Thus, it is necessary to introduce novel deep learning methods for predicting the congestion in the traffic flow management system in order to reduce the error and enhance the accuracy of the overall system.

TABLE I. FEATURES AND CHALLENGES OF TRADITIONAL CONGESTION PREDICTION METHODS ON TRAFFIC FLOW

| Author [citation] | Methodology | Features | Challenges |
|---|---|---|---|
| Tseng et al. [18] | SVM | Better prediction accuracy is returned. The car speed of the next time period is predicted by the SRHTCP model. | The open datasets used are not verified using the t-test technique. |
| Ranjan et al. [19] | CNN and LSTM | Better performance is achieved with respect to the computing time. The image data is trained using a large resolution on a smaller resource. | The performance of the model is not enhanced by including external factors such as weather information for every road. |
| Shin et al. [20] | LSTM | The time-series features associated with the traffic data are learnt. The long-term dependency problem is solved. | The accuracy is not enhanced in urban areas and low-speed regions. |
| Zhao et al. [21] | GRU | It simultaneously meets the efficiency and accuracy of matching. An efficient information service is offered for truck drivers. | The applicability is not considered for real traffic systems. |
| Zaki et al. [22] | HMM | It supports enhancement strategies and longer-term traffic management such as ramp metering. The recovery vehicle count is enhanced in a specific stretch of road at particular times. | The technique of contrast is not verified on extra datasets. |
| Wena et al. [23] | GA | The traffic congestion is predicted with a low error. The correlation in the environments is minimized. | The best parameter values are not automatically chosen. |
| Li et al. [24] | Adaptive K-means clustering | The prediction performance in real time is enhanced. It is insensitive to the actual congestion probability. | External influencing factors like mega-events and weather are not considered for enhancing the traffic mode recognition. |
| Song et al. [25] | Mining technique | It identifies the potential urban form factors and the traffic congestion hotspots. The accessibility is improved for distinct travel modes. | Novel approaches are not used to derive useful knowledge for improved decision making. |

A. Proposed Model and Description
Traffic congestion is one of the most important issues all over the world. The current infrastructure is unable to cope with new traffic applications. Limited space and construction activities contribute to traffic congestion. Because of traffic blockage, fuel costs and the travel time of employees and delivery workers are affected. Traffic congestion is defined as the state in which the number of vehicles surpasses the capacity of the roadway at peak time. Congestion indicators are mostly used to evaluate traffic congestion on urban road routes. Millions of people are affected by traffic congestion. It also causes noise and air pollution in the surroundings. The impact of traffic blocking is associated with fuel price rises, environmental issues, and transit costs. Various studies have been carried out to overcome traffic congestion. Timely prediction of traffic congestion in real time can prevent unnecessary blockage. Deep learning and machine learning approaches have been implemented to predict traffic congestion. Machine learning-based models are more popular than other nonparametric models. They analyze traffic patterns with few restrictions and give better prediction results. Deep learning approaches are discussed for predicting real-time traffic congestion. Traffic data are huge and therefore difficult to train on. For this reason, various techniques have been developed to train on huge data volumes and to enhance the prediction accuracy. The architectural diagram for the proposed traffic congestion prediction is given in Fig. 1.
The proposed traffic congestion prediction model covers five main phases: (a) data collection, (b) pre-processing, (c) feature selection, (d) prediction, and (e) congestion index computation. The benchmark dataset is gathered from Radar Traffic Counts, which is publicly available. The dataset is pre-processed in three steps: removal of bad data, organizing the raw data, and filling the null values. In this pre-processing phase, the raw data are cleaned by eliminating the unwanted data and filling the missing data. The pre-processed data is input to the optimized weighted feature selection phase, where the weighted feature selection is enhanced by the hybrid HJHHO algorithm, which optimizes the weights and the features. The selected features are given to the prediction phase. The parameters are predicted with the DCN and RNN architectures, whose hyperparameters (hidden neuron count, learning rate, and epoch count) are optimized using the HJHHO algorithm. As the prediction uses a hybridization of two deep learning models with architecture optimization, the proposed model is termed IDCRN. The main purpose of the proposed prediction is to minimize the error rate and maximize the prediction performance. The predicted parameters, namely the speed reduction rate, very low speed rate, and volume to capacity ratio, are obtained from IDCRN. Finally, the predicted parameters are input to the fuzzy inference system for congestion index computation, which yields a congestion level of low, moderate, high, or very high.

B. Dataset Description
The benchmark dataset, Radar Traffic Counts, is collected from "https://data.austintexas.gov/Transportation-and-Mobility/Radar-Traffic-Counts/i626-g7ub" (access date: 24-12-2021). The traffic speed and count are gathered from various Wavetronix radar sensors in the city of Austin. The dataset contains hourly vehicle traffic records, with 6.83 million rows and 17 columns. 70% of the data is used for training and 30% is used for testing.
The collected input data is denoted as $T_f^{ip}$, where the term $f$ indicates the total number of gathered data.

C. Data Pre-processing
The process of converting the collected raw data into a suitable format is known as data pre-processing. The gathered raw data are pre-processed to remove unwanted data and eliminate noise before use. Pre-processing of the raw data is essential for accurate analysis. In this work, the pre-processing phase consists of removing bad data, organizing the raw data, and filling the null values.
Removing bad data: The input data $T_f^{ip}$ is given to the bad data removal process. Data cleaning, or removal of bad data, is the process of removing unwanted data from the gathered dataset. The conversion of data from one structure or format to another is known as data transformation. It is essential to remove the bad data before use. In this process, corrupted, duplicate, and missing data are removed from or fixed in the dataset. The presence of duplicate or unwanted data produces errors in the prediction performance. The data after bad data removal is denoted as $T_f^{bad}$.
Organizing the raw data: The gathered data are mostly unorganized or non-systematic, and are known as raw data. The deconstruction analysis method is used to manipulate or organize the data. The obtained raw data are in the form of recorded values, and the systematic process of organizing them is referred to as raw data organization.
Filling the null values: Missing traffic data cause two kinds of problems. First, the data are missing in certain time periods and locations, although the entire data is necessary for the prediction and modelling of transportation. Second, statistical information is lost, which distorts the traffic patterns. The null values are filled with appropriate methods. When the missing values are to be predicted, 0 or NA is used in place of the missing values. Filling the null values yields robust data models. Finally, the pre-processed data is denoted as $T_f^{pre}$, and it is passed to the feature selection phase.
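The three pre-processing steps can be illustrated with a minimal sketch in plain Python. It is only an illustration under stated assumptions: the field names `timestamp`, `volume`, and `speed` are hypothetical and do not come from the dataset schema, negative readings are assumed to count as corrupted data, and null values are filled with 0 as mentioned in the text.

```python
def preprocess(records, value_fields=("volume", "speed")):
    """Clean a list of traffic records (dicts) in three steps.

    Field names are hypothetical placeholders, not the dataset's real columns.
    """
    seen = set()
    cleaned = []
    for rec in records:
        # Step 1: bad-data removal. Rows without a timestamp, rows with
        # negative readings (assumed corrupted), and duplicates are dropped.
        if rec.get("timestamp") is None:
            continue
        if any(rec.get(k) is not None and rec[k] < 0 for k in value_fields):
            continue
        key = (rec["timestamp"],) + tuple(rec.get(k) for k in value_fields)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(dict(rec))  # copy so the input is left untouched

    # Step 2: organizing the raw data, here by sorting on the timestamp.
    cleaned.sort(key=lambda r: r["timestamp"])

    # Step 3: filling null values, here with 0.
    for rec in cleaned:
        for k in value_fields:
            if rec.get(k) is None:
                rec[k] = 0
    return cleaned
```

Other fill strategies, such as interpolation between neighboring hours, could be substituted in step 3 without changing the overall structure.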

A. Optimal Weighted Feature Selection
The pre-processed data [...] In Eq. (2), the term $var$ refers to the variance and $objfn_1$ is the objective function. The variance is defined as "a statistical measurement of the spread between numbers in a data set. More specifically, variance measures how far each number in the set is from the mean and thus from every other number in the set", as given in Eq. (3).
Here, the term $y_d$ denotes the $d$th data feature, $y$ represents the data features, and $r$ denotes the total number of data features. Consider the optimal weighted features as [...] The representation of the optimal weighted feature selection is given in Fig. 2.
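As a rough illustration of a variance-based objective for weighted feature selection, the sketch below scores one candidate solution. The objective of Eq. (2) is not available in this text, so the form used here is an assumption: each selected feature is multiplied by its weight, the population variance of the weighted features is taken per record, and the mean over all records is the value to be minimized. The function names are hypothetical.

```python
from statistics import pvariance


def weighted_features(row, selected, weights):
    """Multiply each selected feature of one record by its weight."""
    return [weights[k] * row[d] for k, d in enumerate(selected)]


def variance_objective(rows, selected, weights):
    """Assumed objective: mean per-record variance of the weighted features."""
    per_row = [pvariance(weighted_features(row, selected, weights)) for row in rows]
    return sum(per_row) / len(per_row)
```

An optimizer such as HJHHO would search over `selected` and `weights` for the candidate with the smallest objective value.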

B. Proposed HJHHO
The proposed HJHHO algorithm is used to select the optimized weighted features from the data by optimizing the feature selection and the weights, and to improve the prediction performance of the hybrid IDCRN by optimizing the hidden neuron count, learning rate, and epoch count of DCN and the hidden neuron count of RNN.
HHO [26] is a flexible algorithm that is used in various optimization problems to attain optimal solutions. It has high efficiency and has been used to predict failure probability using key factors. The exploration phase is maximized using compound agents. However, the HHO algorithm suffers from challenges like the population diversity problem and local optima in high-dimensional problems. To overcome these challenges in HHO, the JA algorithm is used. The JA [27] algorithm is simple to implement and contains few parameters in a single phase. It is applied to resolve various optimization problems and finds optimal solutions within less computational time. The proposed hybrid HJHHO algorithm addresses these optimization problems and is more efficient than other optimization algorithms. If the escaping energy satisfies $B \geq 1$, the position is updated using the exploration phase of HHO; otherwise, the position is updated using the JA algorithm.
The HHO algorithm is inspired by the exploring and attacking behavior of Harris hawks. It is a gradient-free, population-based optimization technique that can be applied to all kinds of optimization problems. The exploration phase of HHO is utilized in HJHHO. Harris hawks can spot and track their prey with their powerful eyes, but the prey is often not visible. The hawk waits at a desert spot for several hours to observe, monitor, and detect the prey. The candidate solutions are denoted as Harris hawks, and the best candidate solution is referred to as the intended prey. The hawks wait in a location to detect the prey according to two strategies. The term $r$ denotes the chance of each perching strategy: the hawks perch depending on the positions of their family members and the rabbit, or, if the condition $r \geq 0.5$ holds, they perch on random trees, as given in Eq. (4).
The flowchart of the designed HJHHO algorithm is given in Fig. 3.
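The switching rule between the HHO exploration phase and the JA update can be sketched as follows. This is a minimal sketch, not the authors' implementation: because the update equations are not available in this text, the standard published HHO exploration rules and the standard Jaya update are used as stand-ins. The magnitude test `abs(b) >= 1` follows standard HHO, and the greedy replacement step and the function name `hjhho_minimize` are assumptions.

```python
import random


def hjhho_minimize(objective, lower, upper, dim, pop_size=10, max_iter=10, seed=0):
    """Hybrid sketch: HHO exploration when |B| >= 1, Jaya update otherwise."""
    rng = random.Random(seed)
    clip = lambda v: min(max(v, lower), upper)
    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    history = [min(fit)]
    for t in range(max_iter):
        best = list(pop[fit.index(min(fit))])    # the "rabbit"
        worst = list(pop[fit.index(max(fit))])
        mean = [sum(x[j] for x in pop) / pop_size for j in range(dim)]
        for i in range(pop_size):
            x = pop[i]
            # Escaping energy B decays linearly over the iterations (standard HHO).
            b = 2.0 * rng.uniform(-1.0, 1.0) * (1.0 - t / max_iter)
            if abs(b) >= 1.0:
                if rng.random() >= 0.5:
                    # Perch relative to a randomly chosen hawk.
                    hawk = pop[rng.randrange(pop_size)]
                    r1, r2 = rng.random(), rng.random()
                    cand = [hawk[j] - r1 * abs(hawk[j] - 2.0 * r2 * x[j])
                            for j in range(dim)]
                else:
                    # Perch relative to the rabbit and the mean of the family.
                    r3, r4 = rng.random(), rng.random()
                    cand = [(best[j] - mean[j]) - r3 * (lower + r4 * (upper - lower))
                            for j in range(dim)]
            else:
                # Jaya update: move toward the best and away from the worst.
                cand = [x[j] + rng.random() * (best[j] - abs(x[j]))
                        - rng.random() * (worst[j] - abs(x[j]))
                        for j in range(dim)]
            cand = [clip(v) for v in cand]
            f = objective(cand)
            if f < fit[i]:  # greedy replacement (an assumption of this sketch)
                pop[i], fit[i] = cand, f
        history.append(min(fit))
    k = fit.index(min(fit))
    return pop[k], fit[k], history
```

The defaults of 10 iterations and a population of 10 mirror the experimental setup reported in the paper; the objective passed in would be the feature selection or hyperparameter tuning objective.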

A. Proposed IDCRN
The proposed IDCRN is a hybridization of improved DCN and RNN, which is used to predict the traffic congestion parameters in the proposed model. The main function of IDCRN is to predict the data, with the hidden neuron count, learning rate, and epoch count of DCN and the hidden neuron count of RNN optimized by the HJHHO algorithm. The proposed IDCRN minimizes the MAE and RMSE measures, thus reducing the error rate. The DCN [28] architecture is used to extract the deep features and predict the data. DCN introduces an offset vector for each sampling location. However, the generalization ability of DCN can be reduced to some extent. In addition, RNNs are used in time series prediction because they retain information about previous inputs and are capable of remembering information over time. However, the computation of RNN is slow and it is difficult to train. To overcome the drawbacks of DCN and RNN, the hybrid IDCRN model is introduced. The parameters related to the traffic congestion index are predicted using IDCRN.
DCN is used to extract the input feature maps, where the offset field is calculated with convolution networks. The offset obtained from the additional convolution network is input to the original convolution network. The kernels thereby sample at freely placed locations, so the sampling locations are not restricted to the standard grid. Each offset location is learned by the network rather than fixed by the convolutional kernels. End-to-end spatial transformations are effectively and easily realized by DCN. To analyze the 2D convolutions, [...] Here, the term $r_s + \Delta r_s$ represents the irregular offset location, in which $\Delta r_s$ is typically fractional. The bilinear interpolation is given in Eq. (10).
Here, the term $r = r_o + r_s + \Delta r_s$ indicates the fractional location, $t$ enumerates all spatial locations in the feature map $y$, and the term $K(\cdot, \cdot)$ denotes the bilinear interpolation kernel.
The bilinear interpolation performs linear interpolation in the Y direction and the X direction. The kernel $K$ is separated into two one-dimensional kernels, as given in Eq. (11).
Hence, the weighted feature data are predicted using the DCN architecture.
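Sampling at fractional offset locations can be made concrete with a small sketch. It follows the standard deformable convolution formulation, in which the bilinear kernel is the product of two one-dimensional kernels of the form max(0, 1 - |a - b|); this form is an assumption here because the equations are not available in this text. In a real DCN the offsets are produced by an additional convolution layer, whereas here they are passed in by hand, and the function names are hypothetical.

```python
import math


def bilinear_sample(feature_map, row, col):
    """Sample a 2D feature map at a fractional location (zero outside the map)."""
    height, width = len(feature_map), len(feature_map[0])
    r0, c0 = math.floor(row), math.floor(col)
    value = 0.0
    # Only the four integer neighbors have a non-zero kernel weight.
    for t_row in (r0, r0 + 1):
        for t_col in (c0, c0 + 1):
            if 0 <= t_row < height and 0 <= t_col < width:
                weight = (max(0.0, 1.0 - abs(row - t_row))
                          * max(0.0, 1.0 - abs(col - t_col)))
                value += weight * feature_map[t_row][t_col]
    return value


def deformable_conv_at(feature_map, kernel, offsets, out_row, out_col):
    """3x3 deformable convolution output at one location.

    offsets holds nine (d_row, d_col) pairs, one per kernel tap, in
    row-major order.
    """
    total = 0.0
    tap = 0
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            off_row, off_col = offsets[tap]
            total += kernel[d_row + 1][d_col + 1] * bilinear_sample(
                feature_map, out_row + d_row + off_row, out_col + d_col + off_col)
            tap += 1
    return total
```

With all offsets set to zero, the function reduces to an ordinary 3x3 convolution at that location, which is a convenient sanity check.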
RNN is another type of Artificial Neural Network (ANN), which is applied to the prediction of sequential traffic data by using previous traffic data. [...] The RNN model reduces the loss function, and the errors in the predictions are compared with the actual values. The loss function is denoted as $w$, with regard to the output state at the given time $xx$. The correlation of gradient parameters is given here.
Finally, the prediction outcomes are obtained from RNN. The optimized weighted features are input to DCN and RNN, and parameters such as the speed reduction rate, volume to capacity ratio, and very low speed rate are predicted. The average of the two prediction models is used to perform the traffic congestion prediction.

B. Objective Function for IDCRN-based Parameter Prediction
The proposed IDCRN-based parameter prediction approach is used to predict traffic congestion. The data are predicted using DCN and RNN. Parameters such as the volume, time, and speed of vehicles are input to IDCRN. The output parameters obtained from IDCRN, namely the volume to capacity ratio, speed reduction rate, and very low speed rate, are passed to the fuzzy inference system to compute the congestion index of the traffic flow. The main aim of the proposed prediction model is to optimize the hidden neuron count, learning rate, and epoch count of DCN and the hidden neuron count of RNN by minimizing the MAE and RMSE, as given in Eqs. (17) to (19). In these equations, the term $xx$ represents the actual value and $yy$ denotes the forecasted value. The congestion measures, namely the speed reduction rate, volume to capacity ratio, and very low speed rate, are input to the fuzzy inference system for congestion index computation; each of these parameters characterizes the traffic flow. The fuzzy inference gives a lower error rate. The architectural diagram for IDCRN-based parameter prediction is given in Fig. 4.
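The error measures used in the objective can be sketched in a few lines. MAE and RMSE follow their standard definitions; the way they are combined (a plain sum) and the function names are assumptions, since the exact form of the objective in Eq. (17) is not available in this text. The averaging of the DCN and RNN outputs follows the description of IDCRN given earlier.

```python
import math


def mae(actual, forecast):
    """Mean absolute error between actual and forecasted values."""
    return sum(abs(x - y) for x, y in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error between actual and forecasted values."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(actual, forecast)) / len(actual))


def average_predictions(dcn_pred, rnn_pred):
    """Combine the two models by averaging their outputs element-wise."""
    return [(a + b) / 2.0 for a, b in zip(dcn_pred, rnn_pred)]


def idcrn_objective(actual, dcn_pred, rnn_pred):
    """Assumed fitness for hyperparameter tuning: MAE + RMSE of the average."""
    forecast = average_predictions(dcn_pred, rnn_pred)
    return mae(actual, forecast) + rmse(actual, forecast)
```

In a tuning loop, each candidate set of hyperparameters would be used to train both networks, and the value of `idcrn_objective` on held-out data would serve as the fitness to minimize.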

C. Congestion Index Computation by Fuzzy Inference System
Each congestion measure has individual advantages and drawbacks. Congestion is an incident caused by various factors, and these factors are reflected in different measures. Measuring traffic congestion is significant for capturing passenger perception. The boundary perceived by each passenger differs with the travel situation. Given these limitations, a fuzzy inference system is incorporated to handle the indistinct boundaries of a set and to solve problems involving uncertainty. The values of the input parameters are calculated and categorized into groups, the stages of traffic congestion are determined, and finally the congestion index is computed.
Input parameters: The input parameters are obtained from the proposed IDCRN and then combined to form a single fuzzy measure. The three parameters, namely the speed reduction rate, volume to capacity ratio, and very low speed rate, relate to the traffic volume, the travel time, and the roadway capacity. These three parameters are calculated separately from the gathered data and combined according to the rules of the fuzzy inference system. The traffic condition is represented by these three input parameters through the volume to capacity ratio, the average speed, and the variation in speed.
Speed reduction rate: The speed reduction rate measures how much the vehicle speed drops under traffic congestion. The congestion on various routes during peak and non-peak situations is compared using the speed reduction rate, as given in Eq. (20).

$$\text{Speed reduction rate} = \frac{NPS - PS}{NPS} \quad (20)$$

In Eq. (20), the term $NPS$ denotes the non-peak flow speed and $PS$ denotes the peak flow speed. The obtained value ranges over [0, 1], where 1 is the worst state, reached when the peak flow speed is close to 0, and 0 is the best state, reached when the peak flow speed is equal to or larger than the non-peak flow speed.
Very low speed rate: It is defined as the share of the total travel time that is spent travelling at very low speed, as given in Eq. (21).

$$\text{Very low speed rate} = \frac{T_{SD}}{T_T} \quad (21)$$

In Eq. (21), the term $T_{SD}$ represents the time spent in delay and the term $T_T$ denotes the total travel time. The obtained value ranges over [0, 1], where 0 denotes the best state with no delay and 1 denotes the worst state with delayed travel time. Delay refers to travel at a speed lower than 5 km/h.
Volume to capacity ratio: It quantifies the traffic load during peak-hour situations. The volume is measured in vehicles per hour and is related to the capacity of the roadway, as given in Eq. (22).

$$\text{Volume to capacity ratio} = \frac{V_P}{C_R} \quad (22)$$

In Eq. (22), the term $V_P$ denotes the volume of vehicles in the peak hour and $C_R$ represents the capacity of the roadway. The obtained value ranges over [0, 3], where a value closer to 0 is the best condition, in which the volume to capacity ratio is minimized, and the value 3 is the worst condition, when the huge volume of [...]
Output parameters: The output is termed the congestion index, which is grouped into four levels: low, moderate, high, and very high. The condition is given on a scale from 0 to 1, where 0 denotes good and 1 denotes bad. The congestion condition of the four levels is determined in terms of this scale value.
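The three input measures of Eqs. (20) to (22) can be computed with a short sketch. The function and parameter names are hypothetical. Clamping the speed reduction rate to [0, 1] follows the statement that 0 is the best state when the peak speed is equal to or larger than the non-peak speed, and capping the volume to capacity ratio at 3 is an assumption based on the stated range [0, 3].

```python
def speed_reduction_rate(non_peak_speed, peak_speed):
    """Eq. (20): (NPS - PS) / NPS, clamped to the range [0, 1]."""
    rate = (non_peak_speed - peak_speed) / non_peak_speed
    return min(max(rate, 0.0), 1.0)


def very_low_speed_rate(time_spent_in_delay, total_travel_time):
    """Eq. (21): share of the travel time spent below 5 km/h."""
    return time_spent_in_delay / total_travel_time


def volume_to_capacity_ratio(peak_volume, roadway_capacity):
    """Eq. (22): peak-hour volume over roadway capacity, capped at 3."""
    return min(peak_volume / roadway_capacity, 3.0)
```

For example, a road with a non-peak speed of 60 km/h and a peak speed of 30 km/h has a speed reduction rate of 0.5.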
Rules: The three inputs are combined and the traffic congestion measure is evaluated with a fuzzy inference system. The speed reduction rate is denoted as S, the very low speed rate as R, the volume to capacity ratio as V, and the congestion as C. Here, S, R, V and C denote the degree of congestion. In each rule, the IF part is known as the antecedent and the THEN part is known as the consequent. Exemplary representations of the generated fuzzy rules are given below.
• IF the speed reduction rate is low, AND the volume to capacity ratio is low, AND the very low speed rate is low, THEN the congestion is low.
• IF the speed reduction rate is high, AND the volume to capacity ratio is high, AND the very low speed rate is high, THEN the congestion is very high.
In the same way, a total of 54 rules are considered, among which three intensity conditions apply to the combination of the three categories: volume to capacity ratio, speed reduction rate, and very low speed rate. Among these rules, 38 suitable rules are evaluated to obtain the required output. In this way, the traffic congestion index is computed. The architectural illustration of the congestion index computation with the fuzzy inference system is given in Fig. 5.
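The two exemplary rules can be sketched as a small Mamdani-style fuzzy inference step in Python. This is only an illustration under stated assumptions: the linear membership functions, the output sets on the [0, 1] congestion scale, the min/max operators, and the centroid defuzzification are all assumed here, and the function names are hypothetical. The rule base of the paper is larger and its membership functions may differ.

```python
import numpy as np


def low(x, top):
    """Assumed 'low' membership: 1 at 0, falling linearly to 0 at `top`."""
    return np.clip((top - x) / top, 0.0, 1.0)


def high(x, top):
    """Assumed 'high' membership: 0 at 0, rising linearly to 1 at `top`."""
    return np.clip(x / top, 0.0, 1.0)


def congestion_index(s, r, v):
    """Evaluate the two exemplary rules and defuzzify with the centroid.

    s, r lie in [0, 1]; v lies in [0, 3]. Returns NaN if neither rule fires.
    """
    c = np.linspace(0.0, 1.0, 101)                      # congestion index scale
    out_low = np.clip((0.5 - c) / 0.5, 0.0, 1.0)        # assumed 'low' output set
    out_very_high = np.clip((c - 0.5) / 0.5, 0.0, 1.0)  # assumed 'very high' output set

    # AND is modelled with min over the three antecedents.
    w_low = min(low(s, 1.0), low(v, 3.0), low(r, 1.0))
    w_very_high = min(high(s, 1.0), high(v, 3.0), high(r, 1.0))

    # Clip each consequent by its rule strength, then aggregate with max.
    aggregated = np.maximum(np.minimum(w_low, out_low),
                            np.minimum(w_very_high, out_very_high))
    if aggregated.sum() == 0.0:
        return float("nan")                             # no rule fired
    return float((c * aggregated).sum() / aggregated.sum())


print(congestion_index(0.2, 0.2, 0.6))   # mostly 'low' inputs
print(congestion_index(0.8, 0.8, 2.4))   # mostly 'high' inputs
```

Because only two rules are modelled, mixed inputs such as a low S with a high R fire neither rule; a complete rule base would cover these cases with the moderate and high classes.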

D. Output of Congestion Index by Fuzzy Inference System
The output of the congestion index by the fuzzy inference system, in terms of membership functions, is given in Fig. 6.

A. Experimental Setup
The proposed traffic congestion prediction model was implemented in Python, and the experimental results were evaluated. The maximum number of iterations was 10 and the population size was 10. The proposed model was compared, in terms of different measures, with heuristic algorithms such as GWO-IDCRN [29], WOA-IDCRN [30], HHO-IDCRN [26], and JA-IDCRN [27], and with prediction models such as LSTM [31], CNN [32], RNN [33], and CNN-RNN [34].

B. Performance Metrics
The performance measures used for traffic congestion prediction are given here.
1) L1-Norm "is the sum of the magnitudes of the vectors in a space. It is the most natural way of measure distance between vectors that is the sum of absolute difference of the components of the vectors" as shown in Eq. (23).
2) L2-Norm "is also known as the Euclidean norm. It is the shortest distance to go from one point to another" as shown in Eq. (24).
3) L-infinity Norm "is the vector space of essentially bounded measurable functions with the essential supreme norm and only the largest element has any effect" as shown in Eq.
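The three norms can be illustrated on a prediction error vector with a short Python sketch. It assumes that the norms are applied to the difference between observed and predicted values; the function name and the sample values are hypothetical.

```python
import numpy as np


def error_norms(y_true, y_pred):
    """Return the L1, L2 and L-infinity norms of the prediction error vector."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return {
        "l1": float(np.linalg.norm(err, 1)),         # sum of absolute errors
        "l2": float(np.linalg.norm(err, 2)),         # Euclidean length of the error
        "linf": float(np.linalg.norm(err, np.inf)),  # largest absolute error
    }


# Hypothetical observed and predicted values.
norms = error_norms([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
print(norms)
```

For this sample the error vector is (1, 0, -2), so the L1-norm adds both deviations while the L-infinity norm reports only the largest one.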

C. Performance Analysis on Heuristic Algorithms
The performance of the suggested HJHHO-IDCRN algorithm, compared with other meta-heuristic algorithms by varying the learning percentage, is given in Fig. 7. At learning percentage 60, the MAE of the proposed HJHHO-IDCRN algorithm is 1% better than GWO-IDCRN, 3% better than WOA-IDCRN, 1% better than HHO-IDCRN and 4% better than JA-IDCRN. The MEP measure also attains better results than the other heuristic algorithms: at learning percentage 40, the proposed HJHHO-IDCRN algorithm is 3% superior to GWO-IDCRN, 2% superior to WOA-IDCRN, 4% superior to HHO-IDCRN and 2% superior to JA-IDCRN. The suggested algorithm thus reduces the error rate and obtains higher prediction accuracy than the other existing algorithms.

D. Performance Measures on Prediction Models
The performance of the proposed HJHHO-IDCRN algorithm, compared with other prediction models by varying the learning percentage, is given in Fig. 8. At learning percentage 55, the L-infinity norm measure of the proposed HJHHO-IDCRN algorithm is 12% better than LSTM, 16% better than CNN, 11% better than RNN and 16% better than CNN-RNN. At learning percentage 75, the MAE measure of the HJHHO-IDCRN algorithm is 2% better than LSTM, 1% better than CNN, 3% better than RNN and 5% better than CNN-RNN. The MEP measure also attains better results than the other prediction models: at learning percentage 85, the proposed HJHHO-IDCRN algorithm is 8% superior to LSTM, 4% superior to CNN, 5% superior to RNN and 6% superior to CNN-RNN. The proposed algorithm thus achieves higher prediction accuracy than the other prediction models and reduces the error rate.

E. Overall Performance Analysis on Algorithms
The overall performance analysis of traffic congestion prediction with HJHHO-IDCRN is given in Table II. The SMAPE measure of the proposed HJHHO-IDCRN algorithm is 5% better than GWO-IDCRN, 3% better than WOA-IDCRN, 1% better than HHO-IDCRN and 6% better than JA-IDCRN. The RMSE measure of the proposed HJHHO-IDCRN algorithm is 15% superior to GWO-IDCRN, 11% superior to WOA-IDCRN, 14% superior to HHO-IDCRN and 9% superior to JA-IDCRN. The MAE measure of the proposed HJHHO-IDCRN algorithm is 3% better than GWO-IDCRN, 1% better than WOA-IDCRN, 5% better than HHO-IDCRN and 4% better than JA-IDCRN. Therefore, the proposed HJHHO-IDCRN algorithm obtains better results than the other heuristic algorithms on all the evaluated measures.
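As an illustration of how such error measures and percentage comparisons could be computed, the following Python sketch uses the standard textbook definitions of MAE, RMSE and SMAPE. It is an assumption that these match the definitions used in the experiments; the helper `relative_improvement`, the function names and the sample values are hypothetical and do not reproduce any value from Table II.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def smape(y_true, y_pred):
    """Symmetric MAPE in percent (one common variant; assumes no 0/0 pairs)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ratio = 2.0 * np.abs(y_true - y_pred) / (np.abs(y_true) + np.abs(y_pred))
    return float(100.0 * np.mean(ratio))


def relative_improvement(baseline_error, proposed_error):
    """Percentage by which the proposed error is lower than the baseline error."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Hypothetical observed and predicted values.
y_true = [10.0, 20.0, 30.0]
y_pred = [12.0, 18.0, 33.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), smape(y_true, y_pred))
print(relative_improvement(2.0, 1.9))  # a baseline error of 2.0 reduced to 1.9
```

Under this reading, a model is "5% better" than a baseline on an error measure when its error is 5% lower than the baseline error.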

F. Overall Performance Analysis on Prediction Models
The overall performance measures of the proposed HJHHO-IDCRN algorithm are compared with various prediction models, and the prediction measures are given in Table III. The MASE measure of the proposed HJHHO-IDCRN algorithm is 2% better than LSTM, 4% better than CNN, 3% better than RNN and 6% better than CNN-RNN. The MEP measure of the proposed HJHHO-IDCRN algorithm is 11% better than LSTM, 22% better than CNN, 16% better than RNN and 15% better than CNN-RNN. The MEP measures had gained higher results over other prediction models, thus the proposed HJHHO-IDCRN algorithm is 15% superior to LSTM, 22% superior to CNN, 15% superior to RNN and 16% superior to CNN-RNN. The suggested HJHHO-IDCRN algorithm gives better prediction performance and reduces the error rate.

G. Results on Traffic Congestion Index
The obtained congestion parameter results are given in Table IV. The three parameters capture the real condition of traffic congestion. The results show that when one input parameter is high while the other two are in a lower state, the congestion condition is moderate. When all three parameters are in a higher state, the result is high congestion. On the other hand, when all three parameters are in a lower state, the congestion index takes very low values. The proposed model gives accurate results in traffic congestion prediction when compared with other existing methods.

VII. CONCLUSION
In this work, a hybrid deep learning approach with congestion index computation by a fuzzy model was implemented to predict traffic congestion. The traffic data was gathered from a publicly available database. The dataset was pre-processed in three phases: removing bad data, organizing the raw data, and filling the null values. The weighted feature selection was performed with the proposed HJHHO algorithm by optimizing the weights and minimizing the variance. The prediction phase was carried out with the proposed IDCRN. The main purpose of the prediction phase was to minimize the MAE and RMSE measures by optimizing the hidden neuron count, learning rate and epoch count of the DCN, and the hidden neuron count of the RNN. The congestion parameters obtained from IDCRN, namely the speed reduction rate, the volume to capacity ratio and the very low speed rate, were subjected to a fuzzy inference system for congestion index computation. Finally, the proposed HJHHO-IDCRN algorithm improved the prediction performance. The performance analysis showed that the MAE measure of the HJHHO-IDCRN algorithm was 5% better than GWO-IDCRN, 3% better than WOA-IDCRN, 1% better than HHO-IDCRN and 6% better than JA-IDCRN. Thus, the suggested HJHHO-IDCRN algorithm attained better traffic congestion prediction. In future work, the proposed model will be run on different datasets of different road networks, and it will furthermore be deployed in route planning solutions.