Texture and Color Descriptor Features-based Vacant Parking Space Detection using K-Nearest Neighbors



I. INTRODUCTION
The demand for vehicle parking space is rising with the number of people owning vehicles, causing an imbalance between the demand for and supply of parking spaces. In addition, the number of parking spaces at a given location is fixed, while the demand varies at different times of the day. During rush hours, people may not be able to locate a parking space even after spending significant time looking for one. Efficient utilization of the available parking space can be a key solution in this regard, for which motorists need real-time information on the available spaces in nearby parking areas. In this context, automatic vacant parking space detection is an active research topic in computer vision. Vision-based automatic vacant parking space detection can help to find available parking spaces at a given location using images from cameras already installed in a parking area.
Previous researchers proposed various approaches to vacant parking space detection, i.e., sensor-based detection [1,2], counter-based detection [3,4], and vision-based detection [5,6,7]. Wireless sensors were previously used to develop parking monitoring systems, but sensor-based methods require deployment and maintenance of the sensors, which adds cost [2]. Counter-based methods keep track of the number of vehicles entering and exiting the parking area; however, they cannot provide the location of a free parking space, which limits their range of applications [4]. Although some earlier research criticized vision-based approaches because of expensive camera equipment and the need to deal with a large amount of data [6], a vision-based vacant parking space detection system can easily be deployed using the existing security cameras of a parking area without additional hardware [7]. This research focuses on detecting vacant parking spaces using texture analysis based on the Gray Level Co-Occurrence Matrix (GLCM), also known as the Grey Tone Spatial Dependency Matrix, together with RGB color descriptors. The proposed method was validated using the PKLot dataset [7], a large dataset of three parking areas captured under different weather conditions and at different times of the day. The PKLot dataset was previously used to validate other methods, i.e., CarNet [8], mAlexNet [9], and Deep CNN [10], which are compared with the proposed method on various performance metrics.
The rest of the paper is structured as follows: Section II reviews critical previous research, Section III elaborates the proposed methodology, Section IV presents the experimental results and analysis, and Section V gives concluding remarks.

II. BACKGROUND STUDY
Several studies have addressed the parking space detection problem while keeping three factors in mind, i.e., robustness, deployment effort, and maintenance cost. Previous research has resulted in different methods, i.e., multicamera vehicle detection [11], drone-based and aerial image analysis [12,13], image descriptor-based [14], geometric features-based [15], edge-based [16], plane-based [17], convolutional neural network [18,19], and sensor network [2] [20]. Some of the previous research was based on wireless sensor networks. Sensors are placed in parking spaces to transmit data to a server in order to detect whether a space is occupied or unoccupied [2] [20]. However, because of sensor cost and maintenance, existing sensor-based methods were not feasible in large parking areas. Currently, researchers are more focused on vision-based parking space detection, where images of the parking area are taken using a camera. Research in [21] distinguished two types of approaches for image-based systems: car-driven and space-driven. In the car-driven approach, the main objective is to detect cars as objects. However, when a camera is placed at an angle, images of vehicles near the camera and far from the camera differ significantly in quality, as distant vehicles occupy fewer pixels. As a result, the detection of vehicles becomes complex. In the space-driven approach, the main objective is the detection of the space itself. The space-driven approach is less complex than the car-driven approach because spaces at a particular place are more similar to each other than vehicles are. However, the challenge of the space-driven approach is to create methods that are robust across different parking areas and scenarios.
Research in [15] proposed a geometric feature-based method to detect vacant parking spaces, where a line segment detection algorithm was used to build a line-clustering method consisting of several parallel separating lines at a fixed distance and one guideline. False lines were removed and then the guideline was detected using a learning-based method. This method performed better than the single bird's eye view method proposed by research in [15]. However, this method was unable to meet real-time processing requirements. Research in [13] used aerial images and line detection combined with selective filtering calculated from the prevalence of line length and angle. Their algorithm aimed at automated detection of parking space regions in parking lot images for collecting parking occupancy information. This method has four advantages, i.e., it provided sufficiently good results for automatic region extraction, was fast, covered a large area, and can be used for automatic segmentation. However, some lines were not detected by that method due to light and shadow variance. Research in [22] used a deep convolutional network with a "Siamese architecture" to learn robust features of the parking spaces, which eliminated inter-object occlusion and increased performance in various illumination conditions. They also used three input patches of a single parking space and trained the network to classify parking occupancy. The method proposed in [22] was better than the method proposed in [13] because [13] relied on a drone coupled with a line detection algorithm, which was not robust, and whose deployment and maintenance would be more difficult than using single or multiple fixed cameras. In addition, research in [22] performed better in different illumination conditions, whereas in [13] light and shadow variance caused problems in line detection.
Research in [11] proposed a method based on dilated convolutional neural networks. They claimed their model to be more robust than the methods proposed in [14] and [16]. Research in [8] used the PKLot dataset created by research in [7] and compared the results with other methods. There was a significant difference when a model was trained on one parking lot and tested on other parking lots. Results in [5] showed that their method was more robust, as its accuracy did not fall like that of previous research when training and testing on different datasets. In this context, research in [7] used two texture-based features, i.e., local binary patterns and local phase quantization. A Support Vector Machine (SVM) was trained in their research to detect vacant parking spaces and reached an accuracy of 99% when the model was trained and tested on the same dataset. However, for other datasets, they achieved a best accuracy of 89% using textural features. The method proposed in [5] performed better than research in [7] in terms of robustness. Experimental results in [8] showed that the accuracy of their proposed method did not fall like that of [7] when training and testing on different datasets.
Parking space detection methods need to be robust to various illumination and environmental conditions such as sunny, cloudy, and rainy weather. In addition, these methods need to be easily deployable, and their maintenance should be cost-effective. The method proposed in [15] was able to provide real-time parking space detection. However, that method cannot be used for large parking areas, because outdoor parking areas contain a large number of parking spaces. Research in [13] used aerial images for the detection, but using a drone is not cost-effective and not suitable for daily use, as a drone has a very limited battery life and a high maintenance cost. Research in [22] used fixed cameras to detect vacant parking spaces; they used three input patches of a single parking space with a CNN and were able to achieve good accuracy. However, research in [7] used only a single patch and texture-based features for the detection, which was claimed to perform better than research in [22]. Research in [7] and research in [18] lack robustness because, despite providing good accuracy on a single parking area, their accuracy decreased when multiple parking areas were used. This research proposes an efficient method for parking space detection using texture and color-based features to make the detection robust. In addition, the proposed method is easily deployable and has low maintenance costs, as it can be deployed using the cameras that already exist in parking areas.

III. PROPOSED METHODOLOGY
The proposed method aims to detect vacant parking spaces under different weather and illumination conditions of the day. It mainly focuses on the extraction of features that reflect the difference between unoccupied and occupied parking spaces. The overall methodology consists of four steps, i.e., acquisition of images for processing, segmentation of parking spaces using a fixed mask and preprocessing of the segmented images, color descriptor-based and texture-based feature extraction, and detection of parking spaces using the supervised machine learning algorithm k-nearest neighbors. The proposed method is depicted in Fig. 1.

A. Input Image
This research used the PKLot dataset of three different parking areas. According to research in [7], a single camera covering the whole parking area was enough for the detection of vacant parking spaces. The dataset was annotated with the locations of the parking spaces and their occupancy status. For real-life use, camera calibration followed research in [4] [23]. The PKLot dataset consists of three subsets for three parking areas, i.e., PUCPR, UFPR04, and UFPR05, with images of three different weather conditions (sunny, cloudy, and rainy) for each parking area. Sample images of the PKLot dataset are shown in Table I.

B. Parking Space Segmentation
A parking area may consist of many parking spaces. For this reason, automatic segmentation of parking spaces would produce extra computational overhead and make the process unsuitable for real-time use. For faster segmentation, a fixed mask covering all spaces was created manually once, as was done in [7] and [24], which allowed them to achieve very fast segmentation of parking spaces. The fixed mask uses the coordinates of the parking spaces. After placing the camera, the coordinates of the parking spaces were collected to crop the patches of the parking spaces out of the images. The rationale is that, for a fixed camera, the parking spaces are static; only a vehicle will move into or out of a parking space. Two copies of a segmented parking space were created for two types of feature extraction, i.e., color feature-based and texture feature-based extraction.

For color-based feature extraction, the preprocessing step splits a segmented RGB parking space into the three channels R, G, and B, as the intensity of the pixels also varies across channels. The R, G, and B channels of an occupied parking space show more variation than those of an unoccupied parking space, and there is a significant difference between the occupied and unoccupied R, G, and B channels, so they can be used as features to distinguish between occupied and unoccupied parking spaces.

After the color-based features, texture-based features were extracted. Parking spaces far from the camera appear smaller than those close to it, which makes the extraction of structural or geometrical features more complex and less effective, whereas texture-based features can still be used to find the similarities and dissimilarities between occupied and unoccupied parking spaces in that scenario. For texture-based feature extraction, images were converted from RGB to grayscale, followed by median filtering. The color feature extraction and texture feature extraction are explained comprehensively in the next section.
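The preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MATLAB implementation used in this research: the function names, the `(top, left, bottom, right)` box layout of the fixed mask, the BT.601 grayscale weights, and the 3-by-3 median window are all hypothetical choices, since the text does not specify them.

```python
from statistics import median_low


def crop_patch(image, box):
    """Cut one parking space out of a frame.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    `box` is an assumed (top, left, bottom, right) entry of the fixed mask.
    """
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def split_channels(patch):
    """Return the R, G and B planes of the patch as three 2-D lists."""
    return tuple([[px[c] for px in row] for row in patch] for c in range(3))


def to_gray(patch):
    """Luminance-weighted grayscale (BT.601 weights, an assumption)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in patch]


def median_filter(gray, size=3):
    """Median filter with a size-by-size window, clipped at the borders."""
    h, w, half = len(gray), len(gray[0]), size // 2
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - half), min(h, y + half + 1))
                      for i in range(max(0, x - half), min(w, x + half + 1))]
            # median_low always returns a value that occurs in the window
            out_row.append(median_low(window))
        out.append(out_row)
    return out
```

One copy of the cropped patch would go through `split_channels` for the color features, the other through `to_gray` and `median_filter` for the texture features.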

C. Color Descriptors and Texture-based Features
This research used texture information of the parking space collected from the Gray Level Co-Occurrence Matrix. Combining it with color-based features helped to achieve higher accuracy, and the features had similar values whether the parking space was near or far from the camera, leading to a more robust methodology compared with previous research. In this section, color descriptor-based features and texture-based features are extracted. For the color descriptor-based features, there is a significant difference between occupied and unoccupied RGB channels. Using Eq. (1), the mean value of each channel is calculated to estimate the significant difference between occupied and unoccupied RGB channels.
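Eq. (1) is not reproduced in this text. Assuming it is the ordinary arithmetic mean of one channel, with $N$ (a symbol introduced here for illustration) denoting the number of pixels in the segmented patch and $c$ ranging over the three channels, it would read:

```latex
\mu_c = \frac{1}{N} \sum_{j=1}^{N} k_j , \qquad c \in \{R, G, B\} \tag{1}
```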
where the mean is taken over all pixels of a channel and k_j is the intensity value of the j-th pixel. For texture-based features, the Gray Level Co-Occurrence Matrix (GLCM) was used to derive statistical values of the images. The GLCM is constructed from gray images. It has as many rows and columns as there are tones or gray levels in the grayscale image. The GLCM counts how often a gray tone or intensity occurs next to another one in adjacent pixels of the input image. Fig. 2 shows the calculation of the GLCM of an 8-tone grayscale image, but in this research the GLCM was calculated for 256-tone grayscale images. The GLCM can extract certain texture properties from the spatial distribution of the gray image. In the proposed method, four statistical texture features were computed from the GLCM, i.e., Contrast, Correlation, Energy, and Homogeneity. These texture features were extracted based on Eq. (2) to Eq. (5) [26].
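Eq. (2) to Eq. (5) are not reproduced in this text. The standard definitions of the four statistics for a normalized GLCM, as used by common implementations such as MATLAB's `graycoprops`, are sketched below. They are assumed, not confirmed, to be the forms taken from [26]. The correlation is written with a single mean and variance, which holds when the GLCM is symmetric so that the row and column statistics coincide.

```latex
\begin{align}
\text{Contrast}    &= \sum_{m=1}^{L} \sum_{n=1}^{L} (m - n)^{2} \, Q_{mn} \tag{2} \\
\text{Correlation} &= \sum_{m=1}^{L} \sum_{n=1}^{L} \frac{(m - \mu)(n - \mu) \, Q_{mn}}{\sigma^{2}} \tag{3} \\
\text{Energy}      &= \sum_{m=1}^{L} \sum_{n=1}^{L} Q_{mn}^{2} \tag{4} \\
\text{Homogeneity} &= \sum_{m=1}^{L} \sum_{n=1}^{L} \frac{Q_{mn}}{1 + |m - n|} \tag{5}
\end{align}
```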
where L is the number of intensity levels, Q_mn is the GLCM element at (m, n), μ is the GLCM mean, and σ² is the intensity variance. Thus, the proposed method extracts multiple features and detects vacant parking spaces by exploiting the correlation among these features.
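As an illustration of how these texture features can be obtained, the following sketch builds a normalized GLCM for one pixel offset and derives the four statistics from it. It is a plain-Python approximation under assumptions: the horizontal offset of one pixel, the symmetric counting, and the function names `glcm` and `texture_features` are choices made here, not details taken from this research, which used MATLAB.

```python
def glcm(gray, levels, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    `gray` is a 2-D list of integer tones in range(levels) and must contain
    at least one pixel pair along the chosen offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(gray), len(gray[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                a, b = gray[y][x], gray[ny][nx]
                counts[a][b] += 1
                if symmetric:
                    counts[b][a] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def texture_features(q):
    """Contrast, correlation, energy and homogeneity of a normalized GLCM."""
    idx = range(len(q))
    cells = [(m, n, q[m][n]) for m in idx for n in idx]
    mu = sum(m * p for m, n, p in cells)
    var = sum((m - mu) ** 2 * p for m, n, p in cells)
    cov = sum((m - mu) * (n - mu) * p for m, n, p in cells)
    return {
        "contrast": sum((m - n) ** 2 * p for m, n, p in cells),
        # correlation is undefined for a constant image (zero variance)
        "correlation": cov / var if var > 0 else float("nan"),
        "energy": sum(p ** 2 for m, n, p in cells),
        "homogeneity": sum(p / (1 + abs(m - n)) for m, n, p in cells),
    }
```

Appending the three channel means to these four values would give a seven-element feature vector per parking space, which is one plausible reading of the feature set described in this section.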

D. Vacant Parking Space Detection
The proposed method extracted features, represented as numeric values, from many images of the PKLot subsets under three different weather conditions. Several classifiers, i.e., logistic regression, support vector machine, k-nearest neighbors, weighted k-nearest neighbors, and linear discriminant, were tested to check how they perform on the features extracted by this research. Among all the classifiers, the weighted k-nearest neighbors algorithm worked best for the classification of parking spaces. The parameters used for the model are shown in Table II.
Due to its better performance and low complexity, this research used weighted k-nearest neighbors (KNN) as the classifier. Because all the extracted features are numerical, using KNN makes training and testing the model less complex. The weighted KNN model was trained and tested with the features extracted using the proposed methodology. As a supervised machine learning algorithm needs labeled data, the occupancy labels were collected from the PKLot dataset [7]. As a result of the proposed method, classified parking spaces were marked green when unoccupied and red when occupied, as shown in Fig. 3.

IV. EXPERIMENTAL RESULTS AND DISCUSSION
This research was validated using several phases during experimentation shown in Fig. 4.

A. Experimental Set Up
The proposed method was tested in an environment with a hardware configuration of Intel(R) Core(TM) i7 CPU 540 @ 3.07GHz (4 CPUs), ~3.3GHz processor, 16GB RAM, the Windows 10 64-bit operating system, and MATLAB R2019a. MATLAB provides a good environment for rapid prototyping and debugging. Its image processing and machine learning toolboxes provide an efficient set of optimized functions that speed up the workflow. The proposed method takes the parking area image as input and outputs the classified parking spaces marked green and red.

B. Datasets
The proposed method was validated using the PKLot dataset [7]. A total of 653,169 observations were extracted from the dataset, and 72,000 observations were used as training samples. The composition of the 72,000 training observations is shown in Table III.

1) Distance metric: Euclidean distance was used to estimate the distance between the K neighbors [27]. Neighbors are m-by-n data vectors, where m denotes the number of training samples and n denotes the number of features used.
The distance was measured using

$$d_i = \sqrt{\sum_{j=1}^{y} \left( x_{s,j} - x_{t,j} \right)^{2}}$$

where d_i is the distance between neighbors, and x_s and x_t are the 1-by-y data vectors of the source and destination.
2) Distance weight: The weight associated with each training sample is estimated as the squared inverse of the distance, $w_i = 1 / d_i^{2}$.
3) Break ties: The smallest value was used to break the tie when there was an equal number of neighbors with similar values.
4) Number of nearest neighbors: K = 10 nearest neighbors were used to decide the class of new data.

5) Nearest neighbor search method:
A k-dimensional tree (Kd-tree) was used to search for the neighbors [28] instead of an exhaustive search, in order to search faster. The Kd-tree method recursively divides an n-by-k data matrix and distributes the n points into a binary tree of k dimensions. Hence, the model grows a multi-dimensional Kd-tree using the associated weights and Euclidean distance with a bucket size of 60, which is the maximum number of points in a leaf node [29].
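The parameters listed above can be illustrated with a small weighted KNN sketch. Several points are assumptions made for illustration: an exhaustive search is used here in place of the Kd-tree (it returns the same neighbors, only more slowly), ties are broken by the smallest class label as one reading of the tie-breaking rule, and a query that coincides exactly with a training sample simply takes that sample's label to avoid a division by zero. The function names are hypothetical.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_knn_predict(train_x, train_y, query, k=10):
    """Classify `query` by a squared-inverse-distance weighted vote.

    Exhaustive search is an assumption of this sketch; the paper uses a
    Kd-tree to find the same k nearest neighbors faster.
    """
    neighbors = sorted(
        (euclidean(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    # Assumed handling of exact matches, whose weight would be infinite
    exact = [y for d, y in neighbors if d == 0.0]
    if exact:
        return min(exact)
    votes = defaultdict(float)
    for d, y in neighbors:
        votes[y] += 1.0 / d ** 2  # squared inverse distance weight
    best = max(votes.values())
    # Assumed tie rule: the smallest class label among the tied classes wins
    return min(label for label, v in votes.items() if v == best)
```

With labels 0 and 1 standing for the two occupancy classes, a single close neighbor can outvote several distant ones, which is the effect the squared inverse weight is meant to produce.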

D. Evaluation Parameters
The proposed method was validated with the test sets using various metrics, i.e., Confusion Matrix, Accuracy, Area under Curve (AUC), Error, Precision Rate, Recall Rate, Processing Time, and Processing Speed in Frames per Second, as shown in Fig. 5.

1) Specification of the classification of a parking space: This research used two possible outcomes to denote the status of a parking space, i.e., occupied and unoccupied. A space is occupied when a vehicle is parked in it; otherwise it is unoccupied. These two statuses are the classes of the prediction model. The true class represents the actual, known status of a parking space, and the predicted class represents the status predicted by the trained model, as shown in Table VI to Table IX.

2) Performance metrics: The proposed method estimated the confusion matrix, which holds the information about the actual and predicted status of the samples [30]. The performance of the trained model is commonly evaluated using the data of the confusion matrix. The True Positive Rate (TPR) and False Positive Rate (FPR) were used to estimate the Receiver Operating Characteristic (ROC) curve and the Area Under Curve (AUC) [31,32]. The ROC was plotted by placing the FPR on the x-axis and the TPR on the y-axis for several different observations. The values of FPR and TPR range from 0.0 to 1.0. A method with the best prediction skill is represented by a curve that goes from the bottom left corner to the top left corner and then towards the top right corner of the ROC plot [32]. This research also used the precision rate, which denotes the rate of correct positive predictions among all positive predictions. The processing time, which denotes the time required to process one frame, was also estimated. In addition, the processing speed in patches per second is calculated from the number of patches and the time taken to process them. In this context, the processing speed in frames per second represents the number of frames processed in one second. The other performance metrics used to validate the proposed method are listed in Table IV together with the corresponding equations.

E. Experimental Results
The proposed method achieved an accuracy of 99.47% during training, which indicates better performance compared with the state-of-the-art. Table V depicts the confusion matrix of the trained model and Fig. 6 illustrates the AUC of the trained model. The total number of Positive samples (unoccupied spaces) used for the trained model was 36,967 and the total number of Negative samples (occupied spaces) was 35,033. Out of the total Positive samples, 36,743 samples were correctly classified by the trained model, and 224 unoccupied spaces were classified as occupied. Out of the total Negative samples (occupied spaces), 34,895 spaces were classified as Negative (occupied), and 138 samples were identified as Positive (unoccupied). Hence, the numbers of FP and FN are very small compared to TP and TN, which leads to higher accuracy compared with existing research.
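The relationship between the four confusion-matrix counts and the reported metrics can be made concrete with a short sketch. The formulas are the standard definitions of these metrics; the function name `confusion_metrics` is hypothetical, and the counts plugged in are the training counts quoted in the paragraph above, with unoccupied taken as the positive class.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Standard rates derived from a binary confusion matrix."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "recall": tp / (tp + fn),               # true positive rate
        "false_positive_rate": fp / (fp + tn),
        "precision": tp / (tp + fp),
    }


# Training counts from the text: unoccupied is the positive class
training = confusion_metrics(tp=36743, fn=224, tn=34895, fp=138)
for name, value in training.items():
    print(f"{name}: {value:.4%}")
```

Accuracy and error rate always sum to one, which is a quick sanity check when reading the per-subset figures.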

Name of Metrics | Equation
Accuracy [30,33] | (TP + TN) / (TP + TN + FP + FN)
True Positive Rate or Recall Rate [30,34] | TP / (TP + FN)
False Positive Rate [30,33,35] | FP / (FP + TN)
Error Rate [30,31,36] | (FP + FN) / (TP + TN + FP + FN)
Precision Rate [30,31,37] | TP / (TP + FP)

Fig. 6 depicts the ROC curve and AUC for the trained model [32]. The curve shown is the ROC curve, the shaded area is the AUC, which is ≈1, and the point represents the model used. Based on the AUC value, the proposed method performed well on the training samples.
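How an ROC curve and its AUC arise from classifier scores can be sketched as follows. This is an illustration only: the score is assumed to be any per-sample confidence for the positive (unoccupied) class, for example the share of the weighted vote, which the text does not specify, and the AUC is computed with the trapezoidal rule. The function names are hypothetical.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs from sweeping a threshold over the scores.

    `labels` holds 1 for positive and 0 for negative samples; both classes
    must be present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # emit a point once all samples sharing this score are counted
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A classifier that ranks every positive sample above every negative one produces the curve through the top left corner described in the text, and its AUC is exactly 1.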
Table VI shows the confusion matrix of the tested samples from the PUCPR parking area images. A total of 399,118 samples were extracted from PUCPR to test the proposed method; each sample represents a parking space labeled either Unoccupied or Occupied. For the PUCPR images, out of [...] samples that were occupied, the proposed method predicted 183,890 samples as occupied and 1,667 samples as unoccupied. A total of 88,266 samples from the UFPR04 parking area images were used to test the proposed method and estimate the confusion matrix shown in Table IX. For the UFPR04 images, 48,700 samples were classified as unoccupied [...]; of the images that were occupied, the proposed method predicted 38,754 samples as occupied and 300 samples as unoccupied. A total of 165,785 samples from the UFPR05 parking area images were extracted to estimate the confusion matrix shown in Table X [...] samples that were unoccupied. Out of a total of [...] images that were occupied, the proposed method classified 96,648 samples as occupied and 778 samples as unoccupied.

The number of observations used from the different parking areas for validating the model, along with the prediction speed in observations per millisecond, is shown in Table VII. The prediction speed was on average 23.01 observations (obs) per millisecond (ms). The accuracy of the proposed method for PUCPR, UFPR04, and UFPR05 was 99.21%, 99.91%, and 99.29% respectively, as shown in Table VIII. The accuracy on the three subsets does not fluctuate and stays above ≈99%. The error rates are 0.79%, 0.9%, and 0.71%, which on average stays at around ≈0.98%. The precision rate, i.e., the rate of correct positive predictions among all positive predictions, is on average 99.16%, and the recall rate, i.e., the rate of correct positive predictions among all actual positives, is on average 98.7% for the proposed method.
The number of patches, or parking spaces, used from the different parking areas is shown in Table IX. The PUCPR subset images cover many parking spots in a single image, but in the experimentation 102 parking spots were used from each image of PUCPR, 31 parking spaces from UFPR04, and 44 parking spaces from UFPR05. Processing a single image of PUCPR, UFPR04, and UFPR05 with all of its patches required 0.14, 0.15, and 0.17 seconds respectively. The processing speed in patches per second was calculated from the number of patches and the time taken to process them. The processing speed in frames per second denotes the number of frames processed per second, as shown in Table IX. When the number of patches increased, the processing speed in frames per second decreased because more processing time was required.
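The two speed measures follow directly from the patch counts and per-image times quoted above, as the short sketch below shows. The function name `throughput` is hypothetical, and the values it derives are plain arithmetic on the quoted figures; the entries of Table IX may differ from them, for instance through rounding or a different measurement basis.

```python
def throughput(patches_per_image, seconds_per_image):
    """Return (patches per second, frames per second) for one subset."""
    patches_per_second = patches_per_image / seconds_per_image
    frames_per_second = 1.0 / seconds_per_image
    return patches_per_second, frames_per_second


# Patch counts and per-image processing times as quoted in the text
subsets = [("PUCPR", 102, 0.14), ("UFPR04", 31, 0.15), ("UFPR05", 44, 0.17)]
for name, patches, seconds in subsets:
    pps, fps = throughput(patches, seconds)
    print(f"{name}: {pps:.1f} patches/s, {fps:.2f} frames/s")
```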

F. Comparison with Previous Research Performance
The proposed method achieved an average accuracy of 99.47%, whereas De Almeida et al. [7] achieved an average accuracy of 91.5% using the same PKLot dataset, as shown in Table X. De Almeida et al. [7] used only texture-based features from PKLot, whereas the proposed method used texture and color descriptor features, which resulted in better performance. CarNet by Nurullayev et al. [8] and AlexNet by Amato et al. [9] reached accuracies of 97.04% and 96.74% respectively, which are lower than that of the proposed method. Deep CNN by Valipour et al. [10] provided an AUC (Area Under Curve) of 0.9994, while the AUC achieved by the proposed method is 1, as shown in Fig. 6. This indicates that the proposed method performed better than previous research in terms of accuracy and AUC.
The proposed method requires 0.15 seconds to process a single patch or single parking space, as shown in Table XI. It processed 1.048 frames in 1 second, where a frame consisting of 50 parking spaces is taken as the baseline for comparison with previous methods. AlexNet by Amato et al. [9] required 0.3 seconds to process a single patch and processed only 0.06 frames in 1 second, which is slower than the proposed method. The Deep CNN method by Valipour et al. [10] required 0.22 seconds to process a single patch and processed only 0.1 frames per second. So, the proposed method performed better in terms of processing speed than AlexNet by Amato et al. [9] and the Deep CNN method by Valipour et al. [10]. The proposed method achieved higher accuracy than previous research, as shown in Fig. 8, due to the features used for training, i.e., Contrast, Correlation, Energy, Homogeneity, and color descriptors of the red, green, and blue channels. In addition, the proposed method uses the squared inverse of the distance as a weight to make the classes more separable during training. The proposed method required less processing time per patch than previous methods such as AlexNet by Amato et al. [9] and the Deep CNN method by Valipour et al. [10], as shown in Fig. 9(a). It also processed more frames per second than AlexNet by Amato et al. [9] and the Deep CNN method by Valipour et al. [10], where each method was considered to process frames of 50 parking spaces, as shown in Fig. 9(b). The proposed method extracts features from a patch to make a prediction, whereas Amato et al. [9] and Valipour et al. [10] used a convolutional neural network in which the whole patch needs to be convolved in each convolution layer, which requires more time than the proposed method.
The proposed method shows robustness in classifying unseen data. According to the validation results, given enough training data, the proposed method is expected to work in new parking areas. In addition, the proposed method is lightweight and can easily be deployed on the server side, or a single Raspberry Pi 3 or Raspberry Pi 4 can be used to run it in real time.

V. CONCLUSION
This research proposes a hybrid method to detect vacant parking spaces using texture-based features and color descriptors. Features are calculated from the gray-level co-occurrence matrix and RGB color descriptors to distinguish between occupied and unoccupied spaces. The proposed method demonstrated efficiency in terms of accuracy, processing speed, and other performance metrics. The proposed model achieved an average accuracy above 99%, including validation with unseen data. Experimentation was done on a benchmark dataset, the PKLot parking image dataset, which contains images of three different parking areas, with the images of each subset taken under three different weather conditions at different times of the day. For a desired parking area, the proposed method can be used to provide prior knowledge of the available parking spaces, so that drivers are expected to be able to locate a parking space efficiently. In addition, the proposed method is expected to decrease traffic load and air pollution. Another use case of the proposed method is the detection of illegal parking, which will be investigated in the future. Furthermore, an investigation will be done to apply the proposed method at night and optimize it to work for both day and night conditions.
Of the 72,000 training observations, 24,000 are used from each of the parking areas, i.e., Pontifical Catholic University of Paraná - Parking Lot (PUCPR - PKLot), Federal University of Parana - Parking Lot (PKLot - UFPR04) from the 4th floor, and Federal University of Parana - Parking Lot (PKLot - UFPR05) from the 5th floor. These 24,000 observations are randomly sampled from the collection of observations for individual parking areas and weather conditions.

Fig. 8. Accuracy of the proposed method and previous research methods.

TABLE II. PARAMETERS USED FOR WEIGHTED KNN

Fig. 3. Classified parking spaces UFPR05-Sunny.

TABLE III. NUMBER OF SAMPLES

Looking for unoccupied space is considered a positive interest, and occupied space is considered a negative interest. Classifying a Positive sample as Positive is a True Positive (TP), classifying a Negative sample as Negative is a True Negative (TN), classifying a Negative sample as Positive is a False Positive (FP), and classifying a Positive sample as Negative is a False Negative (FN).

TABLE V. CONFUSION MATRIX OF TRAINED KNN MODEL WITH 10-FOLD CROSS-VALIDATION AND K=10
True class | Predicted Unoccupied | Predicted Occupied
Unoccupied | 36743 | 224
Occupied | 138 | 34895

Fig. 6. Area Under Curve (AUC) at training.

TABLE VII. NUMBER OF OBSERVATIONS TESTED AND PREDICTION SPEED

TABLE VIII. ACCURACY AND DIFFERENT PERFORMANCE MEASURES

TABLE XI. COMPARISON WITH PREVIOUS RESEARCH IN TERMS OF PROCESSING SPEED