Handwritten Arabic Characters Recognition using a Hybrid Two-Stage Classifier

Handwritten Arabic character recognition presents a major challenge to researchers in the field of pattern recognition. Arabic characters are highly cursive, and many of them have a similar appearance. For example, the only difference between some characters of the alphabet is the presence of a number of dots above or below the main character shape. This paper proposes a system for isolated off-line handwritten Arabic character recognition that uses the Discrete Cosine Transform (DCT) as the feature extraction method and a two-stage hybrid classifier. The two stages are a Support Vector Machine (SVM) and a neural network (NN). The first stage is a two-class SVM classifier that classifies a character as either having dot(s) or not having dot(s). The output of this stage is appended to the feature vector of the character as an extra feature. The extended feature vector is fed to a multi-class neural network model to classify the character. The proposed approach is tested on a database of Arabic handwritten characters called AlexU Isolated Alphabet (AIA9K), containing 8,737 character images. In the experiments, the first-stage classifier achieved a recognition accuracy of 99.14%. The proposed two-stage hybrid classifier obtained an average recognition accuracy of 91.84% over all characters of the Arabic alphabet.

Keywords: Arabic character recognition; Support Vector Machine (SVM); neural network (NN); hybrid classifier


I. INTRODUCTION
Optical character recognition (OCR) is an important field for offline handwriting recognition systems. Offline handwriting recognition systems differ from online handwriting recognition systems [1] [2]. In certain contexts, the ability to handle large amounts of handwritten script data is invaluable. One such application is the automated transcription of script in old documents, which must take into account the complex and irregular nature of the writing [3]. Arabic optical character recognition is still primitive and is developing slowly compared to that of other languages [4].
The main challenge in Arabic script recognition systems originates from the cursive nature of the characters. Moreover, some characters have two to four different forms depending on their position in the word. Several characters are connected with complementary parts above, below, or inside them. In addition, many Arabic characters are similar in structure and morphology, which makes them difficult to recognize, particularly those characters that have dots. To distinguish some characters from each other, the Arabic language uses one, two, or three dots above or below the main shape of the character. These characters are: (ب, ن, ت, ث, ج, خ, ف, ق, غ, ض, ظ, ش, ذ, ي, ز). Omitting any of the dots causes the character to be misinterpreted. In addition, some people handwrite these dots as dashes, which creates further difficulties for a recognition system. The Arabic alphabet is used for writing different languages such as Persian, Urdu, and Jawi [5]. It consists of 28 letters, most of which are written in a cursive manner. Most Arabic letters have several shapes, which correspond to the placement of the character within a word: at the beginning, in the middle, or at the end.
In automated optical character recognition systems, the choice of feature extraction method may be the most important factor in obtaining high recognition accuracy [6]. AlKhateeb et al. [6] proposed an approach for recognizing handwritten Arabic words that uses the Discrete Cosine Transform (DCT) as the feature extraction method. The resulting features are used to train a neural network for classification.
Lawagali, Bouridane, Angelova, & Ghassemlooy [7] compared the effectiveness of DCT and the Discrete Wavelet Transform (DWT) in capturing the features of handwritten Arabic characters. The authors built a new dataset containing 5600 characters covering all Arabic characters. To compare the two feature extraction methods, a neural network model was built and implemented. The experimental results showed that the DCT-based feature extraction method outperformed the DWT-based one.
Furthermore, recognizing Arabic handwritten text is a difficult task because Arabic characters have complex forms and writing style varies greatly from one person to another. The aim of this research is to confirm the feasibility of using a multi-stage classifier for recognizing offline handwritten isolated Arabic characters. We believe that each stage of the classifier allows partial recognition and reduces overall misclassification errors.
The rest of the paper is organized as follows. Section II gives a brief overview of related work. The proposed technique is presented in Section III. Section IV presents the details of the experiments and discusses the results. Finally, Section V closes with a conclusion.

II. RELATED WORK
Several techniques have been proposed for offline Arabic handwriting recognition [8]. The techniques vary in the type of classifier used. Some of them use a single classifier, while others try to benefit from more than one classifier by constructing a multi-stage hybrid classifier. Most of the techniques implement neural networks in addition to other classifiers. In this section, we review related work that uses single and multi-stage classifiers to build recognition systems for offline handwritten Arabic characters. In addition, we present the results of those techniques that have used the same dataset as this paper.
Torki, Hussein, Elsallamy, Fayyaz, and Yaser [10] presented a comparative study of window-based descriptors applied to handwriting recognition of the Arabic alphabet. The study gives a detailed empirical assessment of different descriptors with several classifiers. The purpose was to evaluate different window-based descriptors as feature extraction methods. They used the AlexU Isolated Alphabet (AIA9K) dataset with different descriptors from the literature, namely, HOG, SIFT, SURF, LBP, and GIST. The paper presented a comparative evaluation of four common classifiers on the chosen descriptors, namely, Logistic Regression, Linear SVM, Nonlinear SVM, and Artificial Neural Networks. The proposed system obtained a recognition accuracy rate of 72.64% for NN and 70.05% for SVM with SURF descriptors.
Alijla and Abu Kwaik [11] proposed a recognition system for online handwriting of isolated Arabic characters, suitable for hand-held applications. The proposed system uses feedforward and backpropagation neural networks as the main classifier. The system employs online feature extraction methods including Number of Segments and Letter Direction. The system also used Density, Aspect Ratio and Character Alignment as the offline features and arranged the characters into four groups according to the number of segments in the Arabic character. The system is designed with four neural networks, one for each group of characters. The system achieved a recognition accuracy of 95.7% on a dataset of untrained writers.
Ali, Shaout, and Elhafiz [12] proposed a two-phase classifier to recognize offline handwritten Arabic characters. The two-phase system is based on dividing the characters into two groups according to their similarity. In the second phase, a specific classifier for each character group is used to classify the character within its group. The proposed system uses an NN for both classification phases. The feature extraction method used in the system is Principal Component Analysis (PCA), which extracts a feature vector of 95 values. The proposed system was applied to a private dataset and achieved a recognition accuracy rate of 93%.
Abed & Alasad [13] suggested an approach for the identification of isolated Arabic characters using error back propagation neural networks (EBPANN). The neural network was optimized to recognize 12 characters and achieved a recognition accuracy rate of 93.61%.
Al-Boeridi and Ahmad [14] demonstrated the performance of a hybrid off-line handwriting recognition (OFHR) system for bank cheques written in the Malay language. The proposed recognition system used two individual classifiers, namely, NN and SVM. The authors concluded that these two classifiers gave exceptional results, but that the hybrid method is difficult to implement and takes longer to obtain satisfactory results. The experimental results show that the NN has the higher recognition rate at 99.06%, compared with 97.15% for the SVM.
Al-Jubouri and Abusaimeh [15] proposed a two-stage classifier to recognize handwritten Arabic characters. The first stage uses a Support Vector Machine classifier that classifies the characters into two groups, namely, characters with dot(s) and characters without dots. The second stage uses a neural network classifier. The experiment was conducted on a dataset of 2927 character images from the IFN-ENIT dataset with no character segmentation. The proposed approach used the Discrete Wavelet Transform (DWT) and curvelet feature extraction methods. The experimental results showed a recognition accuracy rate of 92.2%.
Younis [16] presented a deep neural network, based on a Convolutional Neural Network (CNN) model, to solve the problem of recognizing offline handwritten Arabic characters. The deep CNN was tested on two datasets, AIA9K and AHCD. The accuracies for the two datasets were 94.8% and 97.6%, respectively.

III. PROPOSED TECHNIQUE
Classification is the process of identifying and recognizing objects and assigning them to categories. The main objective of using an SVM in the proposed system is to separate the characters with dots from those without dots. This separation of characters into two classes makes it easier for the second-stage NN classifier to recognize the individual character. The distinction between the two groups significantly reduces the error rate in recognizing some characters within the system. In other words, the probability of confusing characters that are similar in shape is reduced when the classification is combined with a good feature extraction method, such as DCT [17].
The choice of feature extraction method is the most important step in achieving high recognition accuracy in automatic recognition systems. One such method is the 2D Discrete Cosine Transform (DCT), which converts image data into its elementary frequency components by calculating a set of coefficients and storing them in a 2D matrix. The low-frequency coefficients are located in the top left corner of the matrix and the high-frequency coefficients in the bottom right corner. The ability of the DCT to pack the energy of the image into a few low-frequency coefficients is considered one of its main characteristics [18].
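For reference, one common orthonormal form of the 2D DCT (DCT-II) of an N x N image can be written as follows. The symbols f, C, N, and α are notation introduced here for illustration, and the normalization is an assumption; the paper does not state which variant was used.

```latex
C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1}
         f(x,y)\,
         \cos\!\left[\frac{(2x+1)\,u\,\pi}{2N}\right]
         \cos\!\left[\frac{(2y+1)\,v\,\pi}{2N}\right],
\qquad
\alpha(k) =
\begin{cases}
  \sqrt{1/N}, & k = 0 \\
  \sqrt{2/N}, & 1 \le k \le N-1
\end{cases}
```

Here f(x, y) is the pixel intensity, and u and v index frequency along the two image axes. With this scaling, C(0,0) in the top left corner is proportional to the mean intensity of the image, and the indices grow toward the high-frequency coefficients in the bottom right corner.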
The Support Vector Machine is a well-known classifier that has been used extensively in many industrial applications [19]. SVMs have gained considerable interest in the research community and have proven to have many characteristics that are useful in machine learning applications.
The neural network is a classifier that is used extensively in many pattern recognition applications, including image recognition, speech recognition, and text recognition [20]. This paper focuses on using a multi-class NN within a two-stage Arabic character recognition system. Any multi-class problem can be defined by a three-tuple (S, T, C), where S represents an n-dimensional feature space, T is a training dataset that is a subset of S, and C is a set of class labels [21]. Each element e in T is associated with a class label c ∈ C, where the number of class labels is greater than 2. In the training phase, the NN is trained on T to produce a model function F that maps any given feature vector x ∈ S to a class label, F(x) = c, where c ∈ C.
A multi-class NN classifier maps the input features from the feature space into the output space. The NN classifier consists mainly of three types of layers: input, hidden, and output. Neural networks are characterized by their topology, which is determined by the learning algorithm and the neuron characteristics. NNs have been applied to the problem of recognizing both printed and handwritten Arabic characters, and various classification methods combined with various feature extraction methods have been proposed. In this paper, a multi-layer perceptron backpropagation (BP) NN [22] is used for training and then for classification of handwritten Arabic characters. The input layer of the NN is fed with the training feature set T, while the output layer produces the class of the tested input.
This research explores the classification capabilities of both the SVM and the NN to produce an intelligent off-line Arabic handwritten character recognition system. The major steps of the proposed classification system are shown in Fig. 1. They comprise a feature extraction step and a two-stage classifier, which are explained in the following subsections.

A. Feature Extraction Phase
The Discrete Cosine Transform (DCT) [23] is used as the feature extraction method for the alphabet character images. Using DCT as a feature extraction technique can remove redundancy from the image data and yield a more effective representation of the character image as a set of numerical values [24]. In handwritten text, the features represent the useful information extracted from the characters, and this information is then used to classify them. The DCT transforms an image from the spatial domain to the frequency domain. This transformation helps reduce redundancy and concentrates the energy of the image in a very limited frequency range. Hence, the DCT converts the image data into elementary frequency components (i.e., coefficients). In the coefficient matrix produced by the 2D DCT, the low-value coefficients are located at the bottom right corner and the high-value coefficients at the upper left corner. The high-value coefficients are the most important ones, as they can be used to represent the image and to reconstruct the original image with some loss of quality. For this reason, DCT is used in the JPEG lossy image compression algorithm [25].
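The transformation from the spatial domain to the frequency domain can be illustrated with a minimal sketch in plain Python. It assumes the orthonormal DCT-II scaling, applied to rows and then to columns; the function names (dct_1d, dct_2d, energy) and the 8x8 toy image are hypothetical and are not taken from the paper, which reports using MATLAB.

```python
import math


def dct_1d(signal):
    """Orthonormal DCT-II of a 1D sequence (assumed normalization)."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(value * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, value in enumerate(signal))
        out.append(scale * total)
    return out


def dct_2d(image):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in image]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # cols[v][u] holds coefficient (u, v); transpose back to row-major order.
    return [list(r) for r in zip(*cols)]


def energy(matrix):
    """Sum of squared entries of a 2D list."""
    return sum(v * v for row in matrix for v in row)


# Toy 8x8 binary "character": a vertical stroke with a short horizontal bar.
toy = [[0.0] * 8 for _ in range(8)]
for r in range(1, 7):
    toy[r][3] = 1.0
for c in range(3, 6):
    toy[6][c] = 1.0

coeffs = dct_2d(toy)
top_left = [row[:4] for row in coeffs[:4]]
print("total energy:", round(energy(coeffs), 6))
print("energy in the 4x4 low-frequency corner:", round(energy(top_left), 6))
```

With the orthonormal scaling, the total energy of the coefficients equals the total energy of the pixels, so comparing the two printed values shows how much of the image energy falls in the upper left (low-frequency) corner for this toy image.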
The input to the DCT is a 32x32-pixel black-and-white image of a character. The 2D DCT produces a 32x32 two-dimensional matrix of coefficients. These coefficients are considered an accurate representation of the original image; however, the transformation makes it easier to discard redundant information. The DCT coefficients representing the image are reduced to a smaller set of values that hold most of the energy in the image. The feature vector of an image is generated by extracting the higher coefficient values from the matrix produced by the 2D DCT. These coefficients constitute the minor diagonal elements of the matrix. The coefficients are read from the matrix in a zigzag fashion and stored in a one-dimensional feature vector, as shown in Fig. 2.
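The zigzag reading described above can be sketched as follows. This is an illustration only: it assumes the JPEG-style traversal of the minor diagonals starting from the upper left corner, and the function names zigzag_indices and zigzag_features are hypothetical. The value 560 is the feature-vector length reported in the paper; the exact order of coefficients in Fig. 2 may differ.

```python
def zigzag_indices(n):
    """Return (row, col) pairs of an n x n matrix in zigzag order.

    Each minor diagonal holds the cells with row + col == d. Following the
    JPEG convention (an assumption here), even diagonals are walked from
    bottom-left to top-right and odd diagonals in the opposite direction.
    """
    order = []
    for d in range(2 * n - 1):
        lowest_row = max(0, d - (n - 1))
        highest_row = min(d, n - 1)
        rows = range(lowest_row, highest_row + 1)
        if d % 2 == 0:
            rows = reversed(rows)
        order.extend((i, d - i) for i in rows)
    return order


def zigzag_features(matrix, count):
    """Read the first `count` coefficients of a square matrix in zigzag order."""
    order = zigzag_indices(len(matrix))
    return [matrix[i][j] for i, j in order[:count]]


# Placeholder 32x32 "coefficient" matrix whose entries encode their position.
demo = [[32 * i + j for j in range(32)] for i in range(32)]
features = zigzag_features(demo, 560)
print(len(features), features[:6])
```

Because the scan starts at the upper left corner, truncating it after a fixed number of entries keeps the low-frequency coefficients and drops those toward the bottom right corner.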
Extensive experiments were carried out using MATLAB [26] to find the DCT coefficients that are the most representative features of a character image. The coefficients chosen were those sufficient to reconstruct the original image when the inverse DCT is performed, rather than all coefficients of the image. The total number of these coefficients, which represent the minor-diagonal elements of the 32x32-pixel image, is 560. This number was determined by empirical testing, with the goal of reconstructing perceivable characters from a minimum number of coefficients. These features are used in the training and testing phases of the system.

B. Training Phase
The training phase for the SVM is shown in Fig. 3. The purpose of training is to produce an SVM model that can later differentiate between letters with dot(s) and letters without dot(s). During training, the SVM is fed with the feature vectors of all characters. As mentioned before, the feature vector consists of the n DCT coefficients that represent a character.
During the training phase of the neural network, as shown in Fig. 4, the inputs are two manually separated subsets of images, corresponding to letters with dot(s) and letters without dot(s). The images are fed to the feature extraction step, which uses DCT to generate feature vectors for both types of letters. To distinguish between the two subsets, the feature vector is extended with an extra value, which is either 1 or 2, corresponding to letters with dots and letters without dots, respectively. The feature vector length is now 561. The extended feature vectors are then fed to the neural network, which runs a feedforward back-propagation algorithm for training.
145 | Page www.ijacsa.thesai.org

C. Testing Phase
During the testing phase of the system, the SVM is used to classify the character as either a character with dot(s) or a character without dot(s). When the SVM classifies a character, the output class value of 1 or 2, corresponding to a character with dot(s) and a character without dot(s), respectively, is appended to the original feature vector of that character. The NN is then fed with the extended feature vector to give the final class of the character, as shown in Fig. 5.
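The data flow of the testing phase can be sketched as follows. The two classifiers are passed in as plain callables; toy_dot_classifier and toy_char_classifier are placeholder stand-ins for the trained SVM and NN, not the paper's models, and all function names are hypothetical. Only the class values 1 and 2 and the feature lengths 560 and 561 come from the text.

```python
WITH_DOTS = 1      # class value for characters with dot(s)
WITHOUT_DOTS = 2   # class value for characters without dot(s)


def extend_features(features, dot_class):
    """Append the first-stage class value to the DCT feature vector."""
    if dot_class not in (WITH_DOTS, WITHOUT_DOTS):
        raise ValueError("dot_class must be 1 (with dots) or 2 (without dots)")
    return list(features) + [float(dot_class)]


def classify_character(features, dot_classifier, char_classifier):
    """Two-stage prediction: dot/no-dot decision first, then the character."""
    dot_class = dot_classifier(features)            # stage 1 (SVM in the paper)
    extended = extend_features(features, dot_class)
    return char_classifier(extended)                # stage 2 (NN in the paper)


def toy_dot_classifier(features):
    """Placeholder rule standing in for a trained SVM."""
    return WITH_DOTS if sum(features) > 0 else WITHOUT_DOTS


def toy_char_classifier(extended):
    """Placeholder standing in for a trained NN; looks only at the extra value."""
    return "baa" if extended[-1] == WITH_DOTS else "alif"


sample = [0.0] * 560  # a 560-value DCT feature vector, all zeros for the demo
print(classify_character(sample, toy_dot_classifier, toy_char_classifier))
```

Because the second stage sees only the extended vector, any error made by the first stage is passed on as an incorrect extra feature, which is why the accuracy of the first stage matters for the final result.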

IV. EXPERIMENTS AND RESULTS
The dataset used in the experiments is a novel dataset called AlexU Isolated Alphabet (AIA9K). The database was built and proposed by researchers at Alexandria University, Egypt [9]. It contains 8,737 valid samples of the 28 letters of the Arabic alphabet. The handwritten characters were written by 107 volunteer Arabic writers from among the students of the Faculty of Engineering at Alexandria University. Each writer wrote the Arabic characters three times on a form. All the characters were scanned from the forms at a resolution of 300 dpi.
To verify the proposed approach, three experiments were implemented and carried out using MATLAB version R2016a [26]. The first experiment was performed to train the SVM classifier and test its classification accuracy. In this experiment, the dataset was divided into 60% for training and 40% for testing. The second experiment was intended to test the performance of a standalone neural network classifier using the original dataset. The third experiment was conducted to measure the performance of the proposed two-stage classifier. In this experiment, the dataset was divided into 70% for training, 15% for validation, and 15% for testing.
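The two partitions can be sketched as follows. The paper gives only the percentages, so the random shuffling, the fixed seed, and the rounding rule (fractions rounded down, remainder assigned to the last part) are assumptions, and split_indices is a hypothetical helper.

```python
import random


def split_indices(n_samples, fractions, seed=0):
    """Shuffle sample indices and cut them into consecutive parts.

    `fractions` should sum to 1. Every part except the last gets
    int(fraction * n_samples) samples; the last part takes the remainder.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: random, non-stratified split
    parts, start = [], 0
    for fraction in fractions[:-1]:
        size = int(fraction * n_samples)
        parts.append(indices[start:start + size])
        start += size
    parts.append(indices[start:])
    return parts


N_SAMPLES = 8737  # number of character images in AIA9K, as reported

svm_train, svm_test = split_indices(N_SAMPLES, [0.60, 0.40])
nn_train, nn_val, nn_test = split_indices(N_SAMPLES, [0.70, 0.15, 0.15])

print(len(svm_train), len(svm_test))
print(len(nn_train), len(nn_val), len(nn_test))
```

A stratified split, which keeps the proportion of each letter the same in every part, would be a reasonable alternative; the paper does not say which was used.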
The first experiment was conducted to test the recognition accuracy of the SVM classifier. The role of the SVM model is to classify an individual character as either a character with dot(s) or a character without dot(s). The recognition accuracy of this model is shown in Table I. The results are very promising, and the overall recognition accuracy achieved is 99.14%. It is noted that more than one-third of the characters are correctly recognized. The lowest recognition accuracy was obtained for the letter "Daad" (ض).
The second experiment was conducted to test the performance of a standalone neural network, in which the network was trained and tested on the original dataset. The recognition accuracy for individual characters in this experiment is shown in Table II. The best recognition accuracy of 96.57% was obtained for the character "Alif" (ا), while the worst recognition accuracy of 79.75% was obtained for the character "Thaa" (ث). The low recognition accuracy for "Thaa" (ث) is due to the great similarity between the way people write this character and the way they write other, similar characters. The overall recognition accuracy of the standalone neural network classifier over all alphabet characters is 88.5%.
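Per-character and overall accuracies of the kind reported in the tables can be computed as in the sketch below. The labels and predictions are made-up toy values, not results from the paper, and the function names are hypothetical. The sketch also shows that the accuracy over all samples and the mean of the per-character accuracies are different quantities; the paper does not state which of the two its overall figures represent.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of correctly recognized samples for each true class."""
    totals, correct = Counter(), Counter()
    for truth, prediction in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    return {label: correct[label] / totals[label] for label in totals}


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of all samples that were correctly recognized."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Toy data for illustration only.
truth = ["alif", "alif", "baa", "baa", "thaa", "thaa", "thaa", "thaa"]
predicted = ["alif", "alif", "baa", "taa", "thaa", "taa", "taa", "thaa"]

by_class = per_class_accuracy(truth, predicted)
micro = overall_accuracy(truth, predicted)
macro = sum(by_class.values()) / len(by_class)

for label, accuracy in sorted(by_class.items()):
    print(f"{label}: {accuracy:.2%}")
print(f"accuracy over all samples: {micro:.2%}")
print(f"mean of per-character accuracies: {macro:.2%}")
```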
The third experiment was implemented to test the performance of the proposed approach. First, the feature vectors of the test dataset are fed to the SVM model, which produces one of the two aforementioned classes, either with dot(s) or without dot(s). The recognized class value is then appended to the feature vector of that particular character. The extended feature vector is then fed to the NN stage for final classification. The maximum recognition rate obtained is 97.51%, while some characters were difficult to recognize because they were incorrectly classified by the SVM stage. As shown in Table III, the characters "Miim" (م) and "Baa" (ب) have the highest recognition rate of 97.51%, while the character "Thaa" (ث) has the lowest recognition rate. The best classification accuracy in the proposed approach is obtained for those characters that were well recognized by the SVM, which positively affects the final NN classifier. This indicates that the proposed approach recognizes characters more effectively than the standalone NN classifier.

V. CONCLUSIONS
This paper proposed a recognition system for isolated offline handwritten Arabic alphabet characters. The proposed system employs the DCT as the feature extraction method and uses both a Support Vector Machine and a neural network in a two-stage hybrid arrangement. The reason for using a two-stage classifier is to overcome the main limitations of a traditional single-stage classifier. The first-stage SVM classifier, which classifies the characters into one of two classes, namely, characters with dot(s) and characters without dot(s), achieved a recognition accuracy of 99.14%. The notion behind this approach is to make it easier for the neural network stage to classify each character after it has been identified as either with dot(s) or without dot(s). The experimental results showed that the recognition accuracy of the neural network stage depends highly on the accuracy of the first-stage classifier. That is, a misclassification in the first stage subsequently affects the result of the final stage. Despite this, the proposed two-stage hybrid approach achieved a recognition accuracy of 91.84%. Furthermore, the experimental results showed that the two-stage hybrid classifier outperforms a standalone neural network classifier. Further investigation is needed to enhance the proposed approach by employing different feature extraction methods, applying the hybrid approach to different datasets, and possibly using different types of classifiers.