Multileveled ALPR using Block-Binary-Pixel-Sum Descriptor and Linear SVC

Abstract: Automatic license plate recognition (ALPR) is an essential component of security and surveillance. ALPR mainly aims to detect and prevent crime and fraud; it also plays an important role in traffic monitoring. This paper proposes an algorithm for recognizing the license plate (LP) characters of a car. The work is designed in multiple levels for more accurate LP recognition: at level 1 the algorithm produced 93.5% accuracy, and at level 3 it gives 96% accuracy. For training and testing, LP images were taken from the Medialab cars dataset, the Kaggle car dataset and Google map images. The images in the dataset were captured at various angles and under varying illumination. LP recognition is performed using the Block-Binary-Pixel-Sum (BBPS) descriptor and Linear Support Vector Classification (SVC). The proposed algorithm is novel and, compared with state-of-the-art methods, reaches 96% accuracy with a minimal average processing time of 0.42 milliseconds.


I. INTRODUCTION
ALPR is an active and popular topic in image processing [1], and developments in license plate recognition have focused largely on English license plates [2]. ALPR plays an important role in intelligent transport systems [3] [4] [5], border control, toll collection and traffic management [6] [7], and it is used in information and communication technology [8]. Computer vision plays a major part in extracting the Region of Interest (ROI) from an image [9]. License plate recognition is done by extracting the number plate from images [3], where the ROI is the license plate of a vehicle. The main task is to extract the characters from the license plate [10], and each vehicle number plate carries a different font style and size [11]. Surveillance cameras were earlier installed only in important places such as malls, hospitals and sensitive locations, but they are now installed almost everywhere, and people also install them at their own premises for safety. Many algorithms exist for ALPR, but accuracy still suffers from factors such as images captured at night and tilted or blurred images, so there is a need for an efficient algorithm that improves accuracy. Existing algorithms use various pre-processing steps followed by feature extraction, machine learning and deep learning techniques. Pre-processing must be applied before features are extracted from an image. It includes converting the input image to a grayscale image and enhancing the details present in the image in order to highlight the ROI and prune the non-ROI. Morphological operations such as erosion, dilation, tophat and blackhat are then applied to enrich the features of the input image, which supports system proficiency [12]. For recognizing characters from an image, the popular method of optical character recognition with Tesseract is used.
An ALPR system follows a sequence of steps. Many algorithms exist for ALPR, but they still fail to recognize license plates accurately, so there is a need for a better algorithm that recognizes the license plate of a vehicle accurately even when the image is blurred or captured in uneven lighting [13]. The proposed work addresses these disadvantages: the recognizer is trained on characters extracted from real license plate images, which leads to higher precision, and the work is organized in multiple levels to improve accuracy. Existing algorithms leave several problems unsolved: they struggle with images captured at night or in dark environments, with license plates photographed from a far distance, and with skewed and blurred images. They also produce false positive and false negative results for many license plates and fail to distinguish between characters with similar strokes. The proposed algorithm is designed to overcome these flaws and to recognize license plates accurately against complex backgrounds.

II. RELATED WORK
ALPR plays an important part in border control, security enforcement and dealing with vehicle-related crime. Many ALPR systems make use of deep learning techniques based on a convolutional neural network (CNN). The CNN assigns importance to numerous features of the image and differentiates them from each other. According to researchers, CNNs work well for LP character recognition [3]. ALPR has received much attention for English license plates. ALPR also addresses alarming traffic conditions where road-safety measures are inadequate. An ALPR architecture that predicts vehicle regions prior to the Vehicle License Plate (VLP) has been proposed to eliminate false positives, resulting in higher VLP prediction accuracy [2].
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 13, No. 5, 2022
ALPR plays an essential part in current transport systems such as traffic monitoring and vehicle violation detection. In real-world scenarios, LP recognition faces many challenges and is degraded by uncontrolled interference such as weather or lighting conditions. Machine learning based ALPR solutions have been proposed to resolve such challenges in recent years. However, most of these algorithms do not yield convincing results, since their results are evaluated on small or simple datasets that lack varied environments, or they require powerful hardware to achieve a practical frames-per-second rate in real-world applications [14]. In ALPR, the license plate location is determined first. In the second phase, the image is enhanced by filtering with a Gaussian function. Next, edges in the image are located so that the LP location can be spotted. Tilt correction, plate rotation and affine transformation are then applied if the input image was captured in a tilted or slanted position. Finally, the LP characters are extracted from the LP images using neural networks [15].
The ALPR system [16] plays a vital role in Intelligent Transport Systems (ITS), whose major goal is traffic control. ALPR for vehicles is a main part of ITS. Conventional ALPR sends images to a server for LP recognition. To decrease delays and bandwidth usage during image transmission, an Edge-AI-based Real-time ALPR (ER-ALPR) was proposed, in which an AGX XAVIER embedded system is placed at the edge, next to the camera, to obtain real-time image input on the AGX edge device and to enable real-time automatic LP candidate recognition.
To identify LP characters and styles in a specific setting, the ER-ALPR scheme applies the following methods: (1) image pre-processing; (2) You Only Look Once v4-Tiny (YOLOv4-Tiny) [17] for LP prediction; (3) a virtual judgment line for determining whether a license plate frame has passed; (4) the proposed modified YOLOv4 (M-YOLOv4) for LP candidate recognition; and (5) a logic auxiliary judgment scheme for refining LP recognition. The ER-ALPR system was tested in real-life environments in Taiwan. Many state-of-the-art methodologies exist for recognizing text in various environments, but they still struggle to differentiate between similar characters and to recognize license plate characters in complex environments.

III. THE PROPOSED WORK
The proposed algorithm consists of three levels, and each level has different phases.
Level 1: Training phase
Input: Load various input font images.
• Initialize the BBPS descriptor
• Train the character and digit classifiers separately using Linear SVC to recognize LP accurately
Output: Dump the character and digit classifiers to file (pickle file).
Testing phase
Input: Load the pickle files (both character and digit classifiers) and dataset images.
The aim of the proposed work is a powerful algorithm that recognizes license plate candidates accurately. The BBPS descriptor is used to extract the features of license plate characters. Initially, various font images were downloaded from the web and used to train the classifier. Linear SVC is used to train the digit and character classifiers separately to obtain higher accuracy. The training dataset was constructed by segmenting LP characters from the dataset images and storing them in labelled folders, which are used in the testing phase.
2) Block-Binary-Pixel-Sum Descriptor (BBPS):
The BBPS descriptor [18] is used for license plate character recognition. BBPS works by separating an image into non-overlapping M x N pixel blocks. When the descriptor is applied, the image is subdivided in three ways: into 3 x 3 regions, into 2 x 3 regions and into 3 x 2 regions, as shown in Fig. 2. In BBPS the target size of the image is fixed: each ROI is resized to a canonical size so that all images keep a consistent representation and quantification of each character from the dataset images. The input image is converted to a binary image, with pixels corresponding to the character having an intensity > 0 and pixels corresponding to the background set to 0.
3) Training Classifiers:
A license plate has two groups of characters, letters followed by digits. To obtain better accuracy, the digit and character classifiers are trained separately. Linear SVC is used to train each classifier for classification or prediction. The character set contains the letters a-z and the digits 0-9. In the training phase, all font images are loaded for training the classifier; each is converted to a grayscale image and a threshold operator is applied so that the image looks like Fig. 3.
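The BBPS description above can be illustrated with a minimal, dependency-free sketch. It rests on several assumptions that the text does not confirm: the three subdivisions are read as grids of rows x columns, each block contributes the fraction of its pixels that are foreground, and the image has already been resized to its canonical size and thresholded. The function name `bbps_features` is hypothetical.

```python
def bbps_features(binary, grids=((3, 3), (2, 3), (3, 2))):
    """Return one foreground ratio per block, for every grid layout.

    binary: 2-D list of pixel intensities, foreground > 0, background == 0.
    grids:  (rows, cols) layouts; the three defaults mirror the 3 x 3,
            2 x 3 and 3 x 2 subdivisions named in the text (assumed to be
            grid shapes, not block sizes in pixels).
    """
    height, width = len(binary), len(binary[0])
    features = []
    for rows, cols in grids:
        for r in range(rows):
            # Integer boundaries so the blocks tile the image without overlap.
            y0, y1 = r * height // rows, (r + 1) * height // rows
            for c in range(cols):
                x0, x1 = c * width // cols, (c + 1) * width // cols
                area = (y1 - y0) * (x1 - x0)
                foreground = sum(
                    1
                    for y in range(y0, y1)
                    for x in range(x0, x1)
                    if binary[y][x] > 0
                )
                # Normalise the pixel sum by the block area.
                features.append(foreground / area)
    return features


# A 6 x 6 toy "character": left half foreground, right half background.
toy = [[255] * 3 + [0] * 3 for _ in range(6)]
vector = bbps_features(toy)
print(len(vector))  # 9 + 6 + 6 values, one per block
```

The resulting vector has a fixed length for every character image, which is what allows a linear classifier to be trained on it.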
After thresholding, contour detection is applied to the image, the contours are sorted from left to right, and a bounding box is computed for each of the sorted contours. For each sorted contour the ROI (the character region) is extracted and passed to the BBPS descriptor. The training image has letters followed by digits, so a contour index below 26 denotes a letter and any other index a digit. The respective letters and digits are appended to the appropriate labelled lists. Each letter and digit is classified using linear SVC: it fits the data loaded into it and provides a best-fit hyperplane that separates, or classifies, the data fed to it. The fitted linear SVC model, which holds the data together with its label values, is dumped into a pickle file; at the end of the training phase two separate pickle files are created, one for letters and another for digits. Pickle is used for serializing and de-serializing an object structure, which allows program state to be maintained across sessions or data to be transported over a network. Here both classifiers are dumped to pickle files in order to pass the trained models to the testing phase.
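The bookkeeping in this training phase (left-to-right ordering, the index-below-26 rule and the pickle round trip) can be sketched as follows. This is an illustrative sketch only: the helper names, the `(x, y, w, h)` box layout and the dictionary structure are assumptions, and the classifier fit itself is deliberately left out so the example stays dependency-free. In a full pipeline the two returned sets would each be passed to a linear SVC before dumping.

```python
import pickle
import string

# Indices 0-25 are the letters a-z, indices 26-35 the digits 0-9.
ALPHABET = string.ascii_lowercase + string.digits


def sort_left_to_right(boxes):
    """Order (x, y, w, h) bounding boxes by their x coordinate."""
    return sorted(boxes, key=lambda box: box[0])


def split_samples(samples):
    """Split (class_index, feature_vector) pairs into letter and digit sets."""
    letters = {"data": [], "labels": []}
    digits = {"data": [], "labels": []}
    for index, features in samples:
        target = letters if index < 26 else digits
        target["data"].append(features)
        target["labels"].append(ALPHABET[index])
    return letters, digits


def dump_model(model, path):
    """Serialize any picklable object (e.g. a fitted classifier) to disk."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Restore an object written by dump_model."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Keeping letters and digits in separate sets mirrors the two separate pickle files produced at the end of level 1.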

4) Testing Phase:
In this phase the two pickle files created in the training phase are loaded along with the testing dataset. For the localized license plate images [12], the algorithm loops over the bounding boxes of the LP regions and applies the BBPS descriptor to each character extracted from the LP region [19]. LP images always carry letters followed by digits, so the character classifier is applied first to predict the LP letters, followed by the digit classifier. If at least one character was found, i.e., len(chars) > 0, the center of the LP region is computed so that the detected string can be printed alongside the LP region. Fig. 6 depicts the work flow of the level 1 testing phase. In Fig. 13 a digit is misclassified (1 is recognized as 9), so an enhanced algorithm is needed to correct such wrong predictions. To overcome falsely predicted characters and to improve accuracy, the classifiers were trained using real-time license plate images. License plate characters were extracted from the license plate dataset and labelled as sample characters, both the digit and character classifiers were re-trained on top of the BBPS feature representations from this more "real-world" dataset, and testing was done with the re-trained classifiers, which led to higher character identification accuracy.

B. The Proposed Work - Level 2
• To improve accuracy, the classifiers are trained using real-time license plate images.
• LP characters are extracted from our license plate dataset and labelled as sample characters.
• Both the digit and character classifiers are re-trained on top of the BBPS feature representations from the more "real-world" dataset.
• Testing is done with the re-trained classifiers, which leads to higher character identification accuracy.
The aim of retraining is higher accuracy; here the classifier is retrained on real-world character examples from the LP dataset. The LP dataset used for training is a real-world LP car image dataset, so the classifier can recognize the LP characters of real-world vehicles. Fig. 7 depicts the work flow of level 2. Starting from the localized, bounding-box images [12], the second level first randomly selects 50% of the images for training and 50% for testing. Each localized LP image with its bounding box is loaded [12], and the algorithm loops over the LP regions and extracts the LP characters [19]. Each character is displayed, and the program waits for a key press so that the character can be labelled manually, as in Fig. 8. If the ` (backtick/tilde) key is pressed, the character is ignored; this is built in to discard falsely detected characters or "noise" in the LP image. Any other key press confirms the character: when the key matching the character shown on screen is pressed, the character is stored with the correct label, as in Fig. 9. Once the LP character has been labelled, it is written to disk using a directory structure of filename followed by extension, as in Fig. 10.
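The level 2 bookkeeping (the random 50/50 split and the decision of where each labelled character goes) can be sketched as below. This is a hedged illustration: the function names, the fixed seed, the five-digit counter and the `.png` extension are assumptions, and the actual key capture and image writing (done with an image library in practice) are omitted.

```python
import os
import random

IGNORE_KEY = "`"  # backtick: discard a falsely detected character


def split_half(paths, seed=42):
    """Randomly assign half of the image paths to training, half to testing."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)  # seeded only for repeatability
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def label_output_path(root, key, counts):
    """Return the file path for a character labelled with `key`.

    Returns None when the ignore key was pressed. Otherwise the label
    directory is created if needed and a running per-label counter
    (kept in the `counts` dict) produces the file name.
    """
    if key == IGNORE_KEY:
        return None
    label_dir = os.path.join(root, key)
    os.makedirs(label_dir, exist_ok=True)
    counts[key] = counts.get(key, 0) + 1
    return os.path.join(label_dir, f"{counts[key]:05d}.png")
```

One directory per label keeps the level 2 output directly usable as the labelled sample set that level 3 loads.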
Once the characters are extracted from the license plate, pressing the appropriate key while labelling saves each character to its folder, as in Fig. 10. Compared with the contrived LP font samples of the previous level, these labelled LP characters are far more representative of real-world LP characters, as in Fig. 11. In level 2 the labelled character directory was created from real-time LP images. The font images downloaded from the web and used for training in level 1 fail to give exact recognition in some cases because web font images differ considerably from real LP images, so training the classifier on real LP images gives good accuracy when testing on real LP images.
In level 3, features are extracted with BBPS from the LP dataset images, and the character and digit classifiers are retrained. The BBPS descriptor is applied to the labelled directory gathered as the output of level 2 to extract the features of the license plate characters, and the letter and digit classifiers are re-trained to recognize license plate characters accurately. As a first step in level 3, the BBPS descriptor is initialized and the sample directory created in level 2 is loaded; each image is loaded and pre-processed, and finally BBPS features are extracted from the sample images. As in level 1, the digit and character classifiers are then trained, updated and dumped into separate pickle files.

IV. DATASET
Recognizing LP candidates requires a large dataset. Publishing large collections of license plates together with any information about where the plates were captured, including Global Positioning System (GPS) coordinates, noticeable landmarks, street signs, etc., is considered an invasion of privacy in many countries [12]. This makes data collection hard from an ALPR standpoint, even though Google, through its Street View initiative, has amassed one of the largest collections of license plates in the world. The Google Street View cars have driven countless miles around the world, passing millions of cars and trucks with license plates clearly visible; however, the published images are blurred and the license plate characters look smudged [19]. For the proposed work, dataset images were taken from the Medialab repository, which is maintained by a national university in Greece, from the Kaggle car dataset and from Google map images. The proposed work was trained and tested with 139 Medialab images, 55 Kaggle car dataset images and 20 Google map images, 214 images in total. The dataset images were localized [12], sliced and segmented [19]; the proposed work is a continuation of the previously proposed license plate localization [12] and segmentation and scissoring [19]. The algorithm is built using the BBPS descriptor and implemented in Anaconda Python. The experiment was run on a laptop with an Intel Core i5 at 2.70 GHz, 8 GB RAM and 64-bit Windows 10. At execution time all images were rescaled to a size of 640 pixels with unchanged aspect ratio. The proposed algorithm gives very good accuracy compared with existing works.
Existing works struggle to recognize license plates in blurred or skewed images captured in uneven lighting and fail to recognize license plate characters in images taken from a longer distance; the proposed algorithm handles all of these cases, which makes it novel and better performing than state-of-the-art methodologies. The proposed algorithm is simple to understand and easy to implement in a real-time environment, executing in 0.42 milliseconds. It is designed to overcome the flaws of existing methodologies. Here the license plates were successfully recognized, whereas in the left-side image of Fig. 13 all candidates were recognized correctly except the second character candidate, which is recognized as N instead of M. In Fig. 13 all character candidates were recognized correctly except a digit '1', which is recognized as '9'. These misclassifications occur because the candidates share similar features. In level 1, training of LP characters was done using different font images downloaded from the internet; level 1 yields good accuracy but fails in a few cases, so to overcome such flaws the classifier is trained on characters gathered from the dataset images, as in Fig. 14. Fig. 15 depicts the license plate images wrongly predicted in level 1 that are correctly recognized in level 3.
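The rescaling step can be illustrated with a small helper that computes the output dimensions. The text does not say which side is set to 640 pixels; this sketch assumes it is the width, and the function name `resized_dimensions` is hypothetical.

```python
def resized_dimensions(width, height, target_width=640):
    """Scale (width, height) so the width becomes target_width.

    The same scale factor is applied to the height, which keeps the
    aspect ratio unchanged. Assumes the 640-pixel size refers to width.
    """
    scale = target_width / width
    return target_width, max(1, round(height * scale))


print(resized_dimensions(1280, 960))  # (640, 480)
```

The returned pair can then be handed to whatever image library performs the actual resampling.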
[12] and characters [19]
draw bounding box around recognized license plate
loop over the characters
    display the character
    if actual != segmented character
        print(ignore)
        continue
    construct the path to the output directory
    if output directory not exist, create directory
    write the labeled character to file

Level 3: Advanced training phase
initialize the BBPS descriptor
initialize the data and labels for the alphabet and digits
loop over the sample dataset created in level 2
    extract the images
    loop over the images
        load the character, convert it to grayscale
        preprocess it
        if the character is digit
            append digit data and label
        otherwise
            append alphabet data and label
train both advanced character and digit classifier
dump the character and digit classifier to file

Advanced testing phase
Input: Load advanced character and digit classifier
call testing phase()

VII. PERFORMANCE EVALUATION
In level 1 of the proposed work a few LP character candidates were predicted wrongly, so to overcome such flaws training images were created using the Medialab, Kaggle car dataset and Google map images. To evaluate the performance of the proposed system accurately, it was trained and tested with images from three datasets, captured at various angles and under various lighting conditions. Following the work on a Tunisian dataset [20], accuracy for LP recognition is calculated as the sum of true positives and true negatives divided by the sum of true positive, true negative, false positive and false negative bounding boxes. Table 1 lists the dataset images used; the performance of the proposed system is measured in terms of precision, recall and f-measure. The proposed algorithm recognizes the dataset images more accurately than the existing algorithms, including images captured in challenging environments and at challenging angles. The performance of the proposed algorithm is presented in table format.
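The metrics named in this section follow directly from the four counts. The sketch below uses the standard definitions of accuracy, precision, recall and F-measure; the function names and the example counts are hypothetical and are not results from the paper.

```python
def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def accuracy(tp, tn, fp, fn):
    """(TP + TN) / (TP + TN + FP + FN)."""
    return safe_div(tp + tn, tp + tn + fp + fn)


def precision(tp, fp):
    """TP / (TP + FP)."""
    return safe_div(tp, tp + fp)


def recall(tp, fn):
    """TP / (TP + FN)."""
    return safe_div(tp, tp + fn)


def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return safe_div(2 * p * r, p + r)


# Hypothetical counts, for illustration only.
tp, tn, fp, fn = 90, 70, 10, 30
p, r = precision(tp, fp), recall(tp, fn)
print(accuracy(tp, tn, fp, fn), p, r, f_measure(p, r))
```

Guarding the divisions keeps the helpers usable for a level or dataset in which a count happens to be zero.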
If the license plate is correctly predicted from a license plate image, the result is a True Positive (TP); if a non-license-plate region is recognized as a license plate, it is a False Positive (FP). Since the proposed work is done in multiple levels, performance evaluation has been done for each level independently in Tables 2, 3.

\[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{1} \]

\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{2} \]

\[ \text{F-measure} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{3} \]

In the prediction phase, if the license plate character candidate and the predicted character candidate are the same, the result is a true positive; if a non-LP region is predicted as an LP candidate, it is a false positive; if a non-license-plate region is not recognized as a license plate region, it is a true negative; and if a license plate region is not recognized as a license plate region, it is a false negative. The equations for precision, recall and f-measure are given in Eqs. (1), (2) and (3). Precision, recall and f-measure are metrics used to test the performance of an algorithm. Precision is the relevant measure when false positives must be kept low. Recall complements precision and is the relevant measure when false negatives must be kept low. The F1 measure is used when a balance between precision and recall is sought, and when the class distribution is uneven, i.e., there are more actual negatives. Table 5 gives the precision, recall and f-measure scores of the proposed work. Table 6 compares the accuracy of the proposed methodology with existing works. The proposed work was evaluated with popular datasets: Medialab, Kaggle and Google map images. All images were used both for training the classifier and for testing license plates.
ALPR is an active topic and many algorithms exist for it, but they still struggle to recognize license plate characters in complex environments. By overcoming the drawbacks of existing systems, a novel and efficient algorithm is presented here that recognizes license plate characters accurately in complex environments.
In the proposed work, license plates are recognized from Medialab and Kaggle car dataset images and from Google map images. Many existing algorithms for LP recognition fail to provide good accuracy due to factors such as uneven lighting, tilted capture positions and blurred images. Existing algorithms are tedious, require labelled data, and are expensive and time-consuming to compute. The proposed algorithm is novel and built using the BBPS descriptor and linear SVC; it is fast, runs easily in real time, is simple to comprehend and inexpensive to compute, and executes in 0.42 milliseconds. In level 1 the proposed work produced 93.5% accuracy. In level 2 the training dataset images were gathered, which gave 98.3% in recognizing and training the dataset images that are used as input to level 3. Level 3 produced better results than level 1 and other existing methodologies. As future work, license plates in regional languages will be recognized.