EMCC: Enhancement of Motion Chain Code for Arabic Sign Language Recognition

In this paper, an algorithm for Arabic sign language recognition is proposed. The proposed algorithm facilitates communication between deaf and non-deaf people. A possible way to achieve this goal is to enable computer systems to visually recognize hand gestures from images. In this context, a criterion called Enhancement Motion Chain Code (EMCC), which uses a Hidden Markov Model (HMM) at word level for Arabic sign language recognition (ArSLR), is introduced. This paper focuses on recognizing, at word level, the Arabic sign language used by the deaf community. Experiments on real-world datasets showed the reliability and suitability of the proposed algorithm for Arabic sign language recognition. The experimental results show a gesture recognition error rate of 1.2% across different signs, which is lower than that of the competing method.

Keywords: image analysis; sign language recognition; hand gestures; HMM; hand geometry; MCC


I. INTRODUCTION
The language deficiencies experienced by deaf people make it difficult for them to translate thoughts and feelings into words and phrases that others can understand. Non-deaf people translate ideas into audible words, whereas deaf people translate ideas into visual signs through finger and hand movements.
Normally, there is no problem when deaf persons communicate with each other using their common sign language. The problem appears when a deaf person wants to communicate with a non-deaf person. Usually both become frustrated in a very short time [1]. Since deaf people began using sign language, they have created specific languages among themselves. The signs of these languages were the only form of communication between deaf people. Within the diversity of deaf cultures, signing developed into complete languages. As a result, there has been interest in recognizing human hand gestures.
The goal of sign language recognition is to provide an accurate mechanism for converting sign gestures into speech or meaningful text, so that communication between the deaf and non-deaf communities becomes possible. Sign language is not uniform across the world but differs from country to country. Attempts to unify the sign language within individual countries, such as Jordan, Egypt and Saudi Arabia, have been carried out to support the deaf members of each community [2].
Many researchers have worked on hand gesture recognition in sign languages such as the Dutch Sign Language, the American Sign Language (ASL) [3], the Australian Sign Language (Auslan) [4], and the Chinese Sign Language (CSL) [5], whereas the Arabic Sign Language (ArSL) has received less attention [6].
In this section, we focus on previous research on sign language gesture recognition, especially Arabic sign language (ArSL) recognition. Sign language recognition can be classified as signer-independent or signer-dependent, according to its sensitivity to the signer. Most previous studies on sign languages are based on either a vision-based method or a glove-based method [7]. In the glove-based method, the person needs to wear special electronic devices, such as gloves or markers. The vision-based method, in contrast, uses image processing to recognize gestures without imposing any limitation on the user, and supplies the system with data related to the motion and shape of the hand [6].
Cyber gloves are used in most previous work on sign language recognition. The work in [8] developed a system that depends on power gloves; it recognizes a set of 95 isolated Australian Sign Language signs with an accuracy of 80%. The work in [9] developed a system that recognizes 262 isolated signs with an accuracy of 91.3% using HMM. The use of cyber gloves or other input devices interferes with recognition and is very difficult to run in real time [10]. Several sign language recognition systems based on vision methods have been presented [2,10,11,12,13,14,15,16]. Some of these works recognize the Arabic alphabet, such as [2], which created an automatic translation system for gestures of the manual alphabet in Arabic sign language. It does not rely on visual markings or gloves. Its feature extraction phase has only two stages: edge detection and feature vector creation. It used a multilayer perceptron (MLP) classifier and a minimum distance classifier (MDC) to detect only 15 of the 28 characters.
In [11], a system for the recognition and translation of numbers was designed. The system is composed of four main phases: pre-processing, feature extraction, interpolation, and classification. The extracted features are scale invariant, which makes the system more flexible. The experimental results revealed that the system was able to recognize signs representing the numbers from one to nine based on the minimum Euclidean distance between the numbers.
The work in [12] investigated appearance-based features for vision-based sign language recognition. It does not depend on segmentation of the input images and uses the image itself as a feature. The system used a combination of features including PCA, hand trajectory, hand position, and hand velocity. The RWTH-BOSTON-104 database was used, with grey-scale images at a reduced frame size of 195x165 pixels, downscaled to 32x32 pixels.
In [13], a system for the recognition and translation of Arabic letters was designed. The system depends on the position of the inner circle on the hand contour and divides the rectangle surrounding the hand shape into 16 zones. The extracted features are scale invariant. Experiments revealed that the system was able to recognize Arabic letters based on hand geometry, with a recognition rate of 81.6% for the different signs of the Arabic alphabet. The work in [14] used an Adaptive Neuro-Fuzzy Inference System (ANFIS) to visually recognize 30 Arabic sign language alphabet signs, with a recognition rate of 93.55%. The work in [15] built an ArSL system based on polynomial classifiers and measured its performance on collected ArSL data. It collected 30 ArSL letters using gloves marked with six different colours at different regions, as shown in Fig. 1 [15]. The recognition rate was 93.41%.
The work in [16] introduced two new features for American Sign Language recognition: kurtosis position and principal component analysis (PCA). PCA was used as a descriptor that represents features of the image and provides a measure of hand orientation and hand configuration; it had previously been used in sign language recognition for dimensionality reduction. Kurtosis position was used as a local feature for measuring edges and reflecting the position of articulation. The work also used the motion chain code, which represents the movement of the hand, as a feature. The system input is a sign from the RWTH-BOSTON-50 database, and the recognition error rate of the output is 10.90%.
In this paper, the motion chain code used in [16] is enhanced through the EMCC algorithm in order to recognize Arabic sign language. When EMCC was applied to forty different Arabic words, as in Fig. 2, the results showed an improvement over the MCC algorithm.
The rest of the paper is organized as follows. Section two presents the motion chain code (MCC). Section three explains the HMM classifier. Section four presents the proposed system. Section five shows the experimental data. Section six explains the experimental results. Section seven presents the conclusions.

II. MOTION CHAIN CODE (MCC)
This method provides a representation of the hand trajectory. It is a sequence of numbers from {0,1,2,3,4} that represent the motion directions of the hand: zero for no motion, one for up, two for left, three for down, and four for right [17], as in Fig. 2. The chain code is extracted from the relative motion of the hand by subtracting the centroids of the hand in two frames.
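The chain code extraction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of [17]: it assumes image coordinates in which y grows downward, takes the dominant axis of the centroid displacement as the motion direction, and uses a hypothetical `min_shift` threshold (in pixels) to decide that there is no motion. The function name `motion_chain_code` is also illustrative.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # hand centroid (x, y) in image coordinates


def motion_chain_code(centroids: List[Point], min_shift: float = 2.0) -> List[int]:
    """Map consecutive hand centroids to codes 0..4.

    0 = no motion, 1 = up, 2 = left, 3 = down, 4 = right.
    min_shift is an assumed threshold, not a value from the paper.
    """
    codes = []
    for (x0, y0), (x1, y1) in zip(centroids, centroids[1:]):
        dx, dy = x1 - x0, y1 - y0  # relative motion between two frames
        if max(abs(dx), abs(dy)) < min_shift:
            codes.append(0)
        elif abs(dy) >= abs(dx):
            # Image y grows downward, so a negative dy means "up".
            codes.append(1 if dy < 0 else 3)
        else:
            codes.append(2 if dx < 0 else 4)
    return codes


trajectory = [(100, 100), (100, 90), (90, 90), (90, 100), (100, 100), (100, 100.5)]
chain = motion_chain_code(trajectory)
```

With only five symbols, diagonal movements collapse onto the nearest axis, which is one reason a richer code such as EMCC may separate signs better.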

III. HMM CLASSIFIER
HMM is used as a classifier for speech [18] and is also used in sign language recognition systems. In HMM-based approaches, the information of each sign is modelled by a different HMM. The model that gives the highest likelihood is selected as the best model, and the test sign is classified as the sign of that model [19]. An HMM is characterized by the following elements:
• A set of N states, with transitions from each state to another state. It is denoted by Eq. 1:
$S = \{s_1, s_2, \ldots, s_N\}$ (1)
• The state transition probability distribution $A = \{a_{ij}\}$, whose elements represent the transition probability from each state to another state. The state transition coefficients have the properties given in Eq. 2 and Eq. 3:
$a_{ij} \geq 0, \quad 1 \leq i, j \leq N$ (2)
$\sum_{j=1}^{N} a_{ij} = 1, \quad 1 \leq i \leq N$ (3)
• The observation symbol probability distribution in state j, $B = \{b_j(k)\}$, whose elements represent the probability of a certain observation occurring in a particular state, $1 \leq j \leq N$, $1 \leq k \leq M$, where M is the number of observation symbols $O_1, O_2, \ldots, O_M$.
• The initial state distribution $\pi = \{\pi_i\}$, $1 \leq i \leq N$.

The input is a sequence of video frames $X_t$, t = 1, ..., T, where T is the number of video frames, and the output is the maximum recognition probability $W_i$, i = 1, ..., N, where N is the number of signs, which corresponds to the detected sign.
The system components are described in the following subsections: subsection 4.1 presents skin detection and background removal, subsection 4.2 presents face and hand isolation, subsection 4.3 presents hand and face position detection, and subsection 4.4 presents the proposed Enhancement of Motion Chain Code (EMCC) and HMM.

A. Skin Detection and Background Removal
The algorithm adopts skin colour detection as the first step [20]. The YCbCr colour space transform is used because it is faster than other approaches [21,22]. The algorithm calculates the average luminance $Y_{avg}$ of the input image as given in Eq. 4.
$Y_{avg} = \sum_{i,j} y_{i,j}$ (4)
where $y_{i,j} = 0.3R + 0.6G + 0.1B$ is normalized to the range 0 to 255, and i, j are the indices of the pixel in the image. According to $Y_{avg}$, the algorithm calculates the compensated image $C_{i,j}$ using Eq. 5 and Eq. 6 [20]. It should be noted that the algorithm compensates only the R and G colour components to reduce computation. Because chrominance (Cr) represents human skin well, the algorithm considers only the Cr factor of the colour space transform, which further reduces the computation. Cr is defined in Eq. 7 [22]:
$Cr = 0.5R' - 0.419G' - 0.081B$ (7)
Accordingly, the human skin binary matrix $S_{ij}$ can be obtained, where '0' is the white point and '1' is the black point. The algorithm then applies a filter with a 5 × 5 mask. First, the algorithm segments $S_{ij}$ into 5 × 5 blocks and counts how many white points are in each block. Then, every point of a 5 × 5 block is set to a white point when the number of white points is greater than half the total number of points. Otherwise, if the number of black points is more than half, the 5 × 5 block is changed to a completely black block, as shown in Fig. 4.
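The per-pixel quantities and the 5 × 5 block filter described above can be sketched as follows. This is an illustrative sketch, not the authors' MATLAB code: the function names are hypothetical, the colour compensation of Eq. 5 and Eq. 6 and the Cr threshold that produces the binary matrix are omitted because they are not given in the text, and leaving a block unchanged on an exact tie (possible only for partial blocks at the image border) is an assumption.

```python
from typing import List

Mask = List[List[int]]  # 0 = white point, 1 = black point, as in the text


def luminance(r: float, g: float, b: float) -> float:
    """Per-pixel luminance y = 0.3R + 0.6G + 0.1B (the term summed in Eq. 4)."""
    return 0.3 * r + 0.6 * g + 0.1 * b


def chrominance_cr(r: float, g: float, b: float) -> float:
    """Cr = 0.5R' - 0.419G' - 0.081B (Eq. 7); r and g are the compensated values."""
    return 0.5 * r - 0.419 * g - 0.081 * b


def block_majority_filter(mask: Mask, block: int = 5) -> Mask:
    """Set each block x block tile to its majority value."""
    if not mask:
        return []
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for top in range(0, rows, block):
        for left in range(0, cols, block):
            cells = [(i, j)
                     for i in range(top, min(top + block, rows))
                     for j in range(left, min(left + block, cols))]
            white = sum(1 for i, j in cells if mask[i][j] == 0)
            black = len(cells) - white
            if 2 * white > len(cells):
                fill = 0
            elif 2 * black > len(cells):
                fill = 1
            else:
                continue  # exact tie: assumed to stay unchanged
            for i, j in cells:
                out[i][j] = fill
    return out


# A 5 x 10 mask: the left tile is mostly white, the right tile mostly black.
left = [0] * 13 + [1] * 12
right = [0] * 3 + [1] * 22
mask = [left[r * 5:(r + 1) * 5] + right[r * 5:(r + 1) * 5] for r in range(5)]
filtered = block_majority_filter(mask)
```

Because the tiles do not overlap, the filter touches each pixel once, which fits the text's emphasis on reducing computation.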

B. Face and Hand Isolating
The algorithm tracks the objects in each image. It neglects small objects and detects the largest objects as the hands and the face. The algorithm isolates the hand and face as in Fig. 5. After detecting the skin colour and removing the background, the positions of the face and hands can be isolated and detected, as in Fig. 6. Figure 5 shows the detected skin with the background removed. The image contains a right hand and a face. The algorithm detects the hand and the face by the position and shape of each. Fig. 6 shows the isolation of the face and hands, followed by the isolation of the right hand to detect the letter.
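One way to realise "neglect the small objects and keep the largest ones" is connected-component labelling on the binary skin matrix, sketched below. The paper does not state how objects are extracted, so the 4-connectivity, the `skin_value`, `keep` and `min_area` parameters, and the function names are all assumptions made for illustration.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, column)


def largest_regions(mask: List[List[int]], skin_value: int = 1,
                    keep: int = 3, min_area: int = 1) -> List[List[Pixel]]:
    """Return up to `keep` largest 4-connected regions of skin_value pixels."""
    if not mask:
        return []
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != skin_value or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:  # breadth-first flood fill of one region
                i, j = queue.popleft()
                pixels.append((i, j))
                for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not seen[ni][nj] and mask[ni][nj] == skin_value):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            if len(pixels) >= min_area:  # drop small objects
                regions.append(pixels)
    regions.sort(key=len, reverse=True)
    return regions[:keep]


def centroid(pixels: List[Pixel]) -> Tuple[float, float]:
    """Centroid as (x, y) = (mean column, mean row)."""
    n = len(pixels)
    return (sum(j for _, j in pixels) / n, sum(i for i, _ in pixels) / n)


demo = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
]
top_two = largest_regions(demo, keep=2)
```

Deciding which of the retained regions is the face and which are the hands would still need the position and shape cues mentioned in the text; this sketch covers only the size-based selection.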

C. Hands and Face position detection
Figure 6 shows the detected skin with the background removed. The image contains two hands and a face. The algorithm detects the hands and the face by the position and shape of each. Figure 7 shows three images at times {t-1, t, t+1}. The algorithm detects the hand position in each of them, {Ut-1, Ut, Ut+1}, to recognize the changes of hand position from frame to frame in the video sequence.

D. Proposed Enhancement of Motion Chain Code (EMCC) and HMM
The EMCC algorithm depends on two factors, as in Fig. 8: the column number and the angle direction.

Fig. 9. The proposed algorithm for calculating the observation vector and using HMM to train and test the signs

Figure 9 shows the proposed algorithm for determining the observation (O) of the detected sign or word. The first step detects the column number of the hand in the first frame of the sign (O1) and the hand position P(x1, y1). For each frame of a sign, the hand position P(xi, yi) is detected. If the hand position stays in the same column, the angle between the hand positions in the previous frame and the current frame is calculated to determine the observation number (Oi) from Fig. 8(b). If the hand position moves to another column, the observation number (Oi) is the number of that column, as in Fig. 8(a). After calculating the observations of the sign, the HMM algorithm is applied.
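The observation rule described above can be sketched as follows. Several details are assumptions, because the column layout and the direction numbering appear only in the figures: columns 9 to 13 are modelled as five equal-width vertical strips of a 640-pixel-wide frame, directions 1 to 8 are 45-degree sectors numbered counter-clockwise starting from "right", `atan2` replaces the plain arctangent to resolve the quadrant, and frames with no displacement emit no symbol. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # hand position (x, y), y growing downward


def column_number(x: float, frame_width: float = 640.0,
                  first_column: int = 9, n_columns: int = 5) -> int:
    """Assumed layout: equal-width strips numbered first_column upward."""
    strip = frame_width / n_columns
    index = min(max(int(x // strip), 0), n_columns - 1)
    return first_column + index


def direction_number(prev: Point, curr: Point) -> int:
    """Assumed numbering: 1 = right, then counter-clockwise in 45-degree steps."""
    dx = curr[0] - prev[0]
    dy = prev[1] - curr[1]  # flip so that upward motion is positive
    theta = math.degrees(math.atan2(dy, dx)) % 360.0
    return int(((theta + 22.5) % 360.0) // 45.0) + 1


def emcc_observations(positions: List[Point], frame_width: float = 640.0) -> List[int]:
    """Column number on a column change, direction number otherwise."""
    if not positions:
        return []
    obs = [column_number(positions[0][0], frame_width)]  # O1
    for prev, curr in zip(positions, positions[1:]):
        prev_col = column_number(prev[0], frame_width)
        curr_col = column_number(curr[0], frame_width)
        if curr_col != prev_col:
            obs.append(curr_col)
        elif curr != prev:  # assumption: a stationary hand emits nothing
            obs.append(direction_number(prev, curr))
    return obs


track = [(50, 200), (60, 200), (60, 180), (140, 180), (130, 190)]
sequence = emcc_observations(track)
```

Mixing coarse column symbols with fine direction symbols gives an alphabet of 13 values, compared with the five values of the plain chain code.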

V. EXPERIMENTAL DATA
To tune and test the proposed system, an Arabic sign database, the EMCC database (EMCCDB), was generated as follows. The EMCCDB corpus consists of 40 Arabic words, as in Fig. 10. The words were signed by three signers: one female and two male. All of the signers are dressed differently, and the brightness of their clothes differs.
The video frames of the database are sampled at 30 frames per second, and the frame size is 640 x 480 pixels. The implementation is carried out using the settings in Table 1. The prototype is implemented using Windows-based MATLAB (R2013a).
Step 1: Detect the column number of the right-hand position (O1) in the first frame (from column 9 to column 13), as in Fig. 8(a).
Step 2: Detect the hand position in the first frame, P(x1, y1).
Step 3: Detect the hand position in frame(i), P(xi, yi).
Step 4: If P(xi, yi) and P(xi−1, yi−1) are not in the same column, then determine the column number of the right-hand position (Oi) in frame(i) (from column 9 to column 13). Else determine the direction number (Oi) from the angle θi between the two positions, as in Fig. 8(b).
Step 5: If frame(i) is not the end frame, then go to Step 3; otherwise continue with Step 6.
Step 6: Train the HMM for each sign, λ = (A, B, π), to maximize P(O|λ).
Step 7: To test a sign: given the observation sequence O = O1 O2 ... On and a model λ = (A, B, π) for each sign, compute P(O|λ) for each sign. The target sign is the one with the maximum P(O|λ).
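The testing step, computing P(O|λ) for each sign model and choosing the maximum, can be sketched with the standard forward algorithm. This is a sketch under assumptions rather than the authors' MATLAB implementation: observation symbols are taken to be 0-based indices into the columns of B (EMCC symbols 1 to 13 would need to be shifted by one), the models are assumed to be already trained, and no scaling is applied, so long sequences may underflow. The names `forward_likelihood` and `classify` are illustrative.

```python
from typing import Dict, List, Sequence, Tuple

# lambda = (A, B, pi): transition matrix, emission matrix, initial distribution
Model = Tuple[List[List[float]], List[List[float]], List[float]]


def forward_likelihood(obs: Sequence[int], model: Model) -> float:
    """P(O | lambda) by the forward algorithm (unscaled)."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    A, B, pi = model
    n = len(pi)
    # Initialisation: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(O_{t+1})
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    # Termination: P(O | lambda) = sum_i alpha_T(i)
    return sum(alpha)


def classify(obs: Sequence[int], models: Dict[str, Model]) -> str:
    """Return the name of the model with the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, models[name]))


two_state = ([[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]], [0.6, 0.4])
toy_models = {
    "sign_a": ([[1.0]], [[0.9, 0.1]], [1.0]),  # mostly emits symbol 0
    "sign_b": ([[1.0]], [[0.1, 0.9]], [1.0]),  # mostly emits symbol 1
}
likelihood = forward_likelihood([0, 1], two_state)
label = classify([0, 0, 1], toy_models)
```

For real sequences, working in log space or rescaling alpha at each step is the usual way to avoid numerical underflow.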

VI. EXPERIMENTAL RESULT
For the purpose of comparison, MCC [16] was applied to the EMCCDB database, where it achieves an error rate of 38.15%, while EMCC achieves an error rate of 1.2%. Figure 11 shows the EMCC performance for every sign in EMCCDB versus the MCC performance. The total improvement in recognition rate is 36.95%. Table 2 compares EMCC with previous work on Arabic sign language recognition. In [16], MCC was applied to the American Sign Language, with an error rate of 34.54% on the RWTH-BOSTON-50 database. The implementation in [16] was carried out using the following:
• Number of words: 30.
• Number of videos: 110.
The work in [26] used HMM to represent 50 words with a Gaussian skin colour model. It detects the signer's face, which acts as a reference for the hand movements, and uses a region growing method. The signer wore gloves coloured orange and yellow. The recognition rate of this system was 98%.
Shanableh [27] introduced the k-nearest neighbours (KNN) algorithm to develop an Arabic sign recognition system. The signer wore coloured gloves. The correct recognition rate of this system was 87%.
Zaki [28] used a new feature to recognize Arabic alphabet and number signs via HMM. The proposed algorithm divided the rectangle surrounding the hand shape into zones; the best number of zones was 16. The HMM observation was created by sorting the zone numbers in ascending order according to the number of white pixels in each zone. Experimental results show that the algorithm achieves a 100% recognition rate with minimum execution time at 16 zones with 19 states. EL-Bendary [2] used a minimum distance classifier (MDC) and a multilayer perceptron (MLP) classifier to detect only 15 letters, with recognition rates of 91.7% and 83.7%, respectively. Jarrah et al. [14] used gloves marked with six different colours; the system used polynomial classifiers to recognize 30 letters with a recognition rate of 93.41%. Reference [15] did not use gloves and used ANFIS to recognize 30 letters with a recognition rate of 93.55%.
Assaleh et al. [13] recognized Arabic letters based on hand geometry, and the recognition rate for the different signs of the Arabic alphabet was 81.6%. This system can reach a 100% recognition rate with an increasing number of zones and number of states. These results are summarized in Table 2.

VII. CONCLUSIONS
In this paper, a new algorithm called Enhancement Motion Chain Code (EMCC) for Arabic sign language recognition is presented. The phases of the proposed algorithm include skin detection, background exclusion, face and hands extraction, hand and face position detection, feature extraction, and classification using a Hidden Markov Model (HMM). Experimental results show that the proposed algorithm achieves a 1.2% error rate, compared to the 38.15% error rate of the competing algorithm.

Fig. 3. Proposed system architecture

IV. PROPOSED SYSTEM
The proposed system, as shown in Fig. 3, consists of six phases: skin detection, background removal, face and hand isolation, hand and face position detection, Enhancement Motion Chain Code (EMCC), and the Hidden Markov Model (HMM) classifier.

Fig. 5. Skin colour detection and background removal

Fig. 5 shows the resulting image shapes after skin detection and background removal [23].

Fig. 7. Hand position detection and tracking

• Column number: the algorithm detects the column number that contains the hand position, as in Fig. 8(a).
• Angle direction: the algorithm detects the angle direction by calculating the angle between the hand positions in consecutive frames, as in Fig. 8(b).
The algorithm calculates the observation number from the change of column number and the angle direction of the hand position in each frame of the video sequence.

Fig. 8. (a) The column distribution, (b) the eight directions of the right-hand motion, where the dotted lines represent the decision boundaries between different directions, (c) a sample EMCC: 9 1 2 3 1 10 3 8

Table 1
• Number of words: 40.
• Number of videos: 1288.
• Number of training videos: 1045.
• Number of testing videos: 243.
• Average videos per word: 32.2.
• Average training videos per word: 26.125.
• Percentage of training videos: 81.13%.
• Percentage of testing videos: 18.87%.
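The averages and percentages listed for the database follow directly from the word and video counts, as the short check below illustrates. The function name `split_summary` and the dictionary keys are illustrative only.

```python
from typing import Dict


def split_summary(words: int, training_videos: int,
                  testing_videos: int) -> Dict[str, float]:
    """Derive the per-word averages and the train/test split percentages."""
    total = training_videos + testing_videos
    return {
        "total_videos": total,
        "videos_per_word": total / words,
        "training_videos_per_word": training_videos / words,
        "training_percent": round(100.0 * training_videos / total, 2),
        "testing_percent": round(100.0 * testing_videos / total, 2),
    }


summary = split_summary(words=40, training_videos=1045, testing_videos=243)
```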

In the Else branch of Step 4, when the hand stays in the same column:
Calculate $\theta_i = \arctan\left(\frac{y_i - y_{i-1}}{x_i - x_{i-1}}\right)$
Determine the direction number (Oi) from θi, as in Fig. 8(b).
End if