Indian Sign Language Recognition Using Eigen Value Weighted Euclidean Distance Based Classification Technique

Sign Language Recognition is one of the fastest growing fields of research today, and many new techniques have been developed in this area recently. In this paper, we propose a system that uses Eigen value weighted Euclidean distance as a classification technique for recognition of various signs of Indian Sign Language. The system comprises four parts: Skin Filtering, Hand Cropping, Feature Extraction and Classification. Twenty-four signs were considered, each with ten samples; thus a total of two hundred forty images were considered, for which a recognition rate of 97 percent was obtained.


I. INTRODUCTION
Sign Language is a well-structured code of gestures, in which every gesture has a meaning assigned to it. Sign Language is the only means of communication for deaf people. With the advancement of science and technology, many techniques have been developed not only to minimize the problems of deaf people but also to apply gesture recognition in other fields. Much research has been done on sign languages such as American Sign Language, British Sign Language and Japanese Sign Language, but very little work has been done on Indian Sign Language recognition to date.
Finding an experienced and qualified interpreter every time is a very difficult task, and is also unaffordable. Moreover, people who are not deaf rarely try to learn sign language to interact with deaf people. This becomes a cause of isolation of deaf people. But if a computer can be programmed to translate sign language into text, the gap between hearing people and the deaf community can be minimized.
We have proposed a system that is able to recognize the various alphabets of Indian Sign Language for Human-Computer interaction, giving more accurate results in the least possible time. It will not only benefit the deaf and dumb people of India but could also be used in various applications in the technology field.
II. LITERATURE REVIEW
Different approaches have been used by different researchers for the recognition of various hand gestures, implemented in different fields. These include vision-based approaches, data-glove-based approaches, soft-computing approaches such as Artificial Neural Networks, Fuzzy Logic and Genetic Algorithms, and others such as PCA and Canonical Analysis. These approaches can be divided into three broad categories: hand segmentation approaches, feature extraction approaches and gesture recognition approaches. A few of these works are discussed in this paper.
Saengsri [13], in his paper on Thai Sign Language Recognition, used the '5DT Data Glove 14 Ultra' data glove, fitted with 14 sensors: 10 sensors on the fingers and the remaining 4 between the fingers, measuring flexures and abductions respectively. However, the accuracy rate was only 94%. Kim [14] used the 'KHU-1' data glove, which comprised three accelerometer sensors, a Bluetooth module and a controller, and extracted features such as the joints of the hand. He performed the experiment for only 3 gestures, and the process was very slow. Weissmann [15] used a CyberGlove, which measured features such as thumb rotation, the angle between neighboring fingers and wrist pitch. A limitation was that the system could recognize only single-hand gestures.
There have been many approaches for feature extraction, such as PCA, Hit-Miss Transform, the Principal Curvature-Based Region (PCBR) detector, 2-D Wavelet Packet Decomposition (WPD), etc. In [1][16][17][18], Principal Component Analysis (PCA) was used for extracting features for the recognition of various hand gestures. Kong [16] segmented 3-D images into lines and curves, and PCA was then used to determine features such as direction of motion, shape, position and size. Lamar [17], in his paper on American and Japanese alphabet recognition, used PCA to extract features such as the position of the finger, the shape of the finger and the direction of the image, described by the mean, Eigen values and Eigen vectors respectively. The limitations were that the accuracy rate obtained was a low 93% and that the system could recognize gestures of only a single hand. Kapuscinski [2] proposed the Hit-Miss transform for extracting features such as orientation and hand size by computing the central moments. The accuracy rate obtained was 98%, but the system lacked proper skin filtering under changes in illumination. The Generic Fourier Descriptor and Generic Cosine Descriptor are used in [19] for feature extraction, as they are rotation, translation and scale invariant. Rotation of the input hand image leads to a shift of the hand in polar space; rotation invariance is obtained by considering only the magnitude of the Fourier coefficients. Translational invariance is achieved by using the centroid as the origin, and scale invariance is obtained from the ratio of magnitude to area. Only 15 different hand gestures were considered in that paper. Rekha [9] extracted texture, shape and finger features of the hand in the form of edges and lines with the PCBR detector, which is otherwise a very difficult task because of changes in illumination, color and scale. The accuracy rate obtained was 91.3%.
After the features were extracted, suitable classifiers were used to recognize the gestures. Various gesture recognition approaches have been used by different researchers, such as Support Vector Machines (SVM), Artificial Neural Networks (ANN), Genetic Algorithms (GA), Fuzzy Logic, Euclidean distance, Hidden Markov Models (HMM), etc. ANNs were used in [13] and [17] for recognizing gestures. Saengsri [13] used the Elman Back Propagation Neural Network (ENN) algorithm, which consisted of an input layer with 14 nodes corresponding to the sensors in the data glove, an output layer with 16 nodes equal to the number of symbols, and a hidden layer with 30 nodes, which is the total of the input and output nodes. A gesture was recognized by identifying the maximum-value class from the ENN. The recognition rate obtained was 94.44%. A difficulty in this paper was that only single-handed signs were considered. Lamar [17] used an ANN comprising an input layer with 20 neurons, and hidden and output layers each with 42 neurons. The back-propagation algorithm was used, and after training the neural network one output neuron was activated, giving the recognized gesture. Gopalan [1] used a Support Vector Machine for classification. Linearly non-separable data becomes separable when an SVM is used, as the data is projected into a higher-dimensional space, thus reducing error. Kim [20], in his paper on the recognition of Korean Sign Language, used Fuzzy Logic. Fuzzy sets were considered, where each set represented a speed of the moving hand; they were mathematically given by ranges such as small, medium, negative medium, large, positive large, etc. The accuracy rate obtained was 94%, and the difficulty they faced was heavy computation.
We have thus proposed a system that overcomes the difficulties faced by these various works. Our proposed system is able to recognize two-hand gestures with an improved accuracy rate of 97%. Moreover, the experiments were carried out with bare hands and the computational time was low, thus removing the difficulties caused by the use of hand gloves with sensors.

A. Eigen value and Eigen vector
Eigen values and Eigen vectors arise from linear transformations. Eigen vectors are the directions along which a linear transformation acts by stretching, compressing or flipping, and Eigen values give the factor by which that compression or stretching occurs. In data analysis, the Eigen vectors of the covariance matrix are found. They form a set of basis functions that describe the variability of the data, and they also define a coordinate system in which the covariance matrix becomes diagonal, so that the new coordinates are uncorrelated. The more Eigen vectors retained, the more information is obtained from the linear transformation. Eigen values measure the variance of the data along the new coordinate axes. For compression of the data, only a few significant Eigen values are selected, which reduces the dimension of the data and allows it to be compressed. Mathematically, it is explained in (1).
If x is a column vector with n rows and A is a square matrix with n rows and n columns, then the matrix product Ax results in another vector y with n rows. When these two vectors are parallel, i.e. Ax = λx (λ being a real number), then x is an eigenvector of A and the scaling factor λ is the corresponding eigenvalue.
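The defining relation Ax = λx above can be checked numerically. The following sketch (illustrative only, not part of the paper) computes the eigenvalues and eigenvectors of a small symmetric matrix and verifies that each eigenvector satisfies the relation:

```python
import numpy as np

# Illustrative matrix (an assumption for this example), symmetric so
# that np.linalg.eigh applies, as it does for covariance matrices.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# eigh returns eigenvalues in ascending order and eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(A)

# Verify the defining relation A x = lambda x for every pair.
for i in range(len(eigvals)):
    x = eigvecs[:, i]
    assert np.allclose(A @ x, eigvals[i] * x)
```

Because the covariance matrix used later in the paper is symmetric, the same `eigh` routine applies there as well.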

IV. PROPOSED SYSTEM
The block diagram of the proposed system is given in Fig. 1, which comprises mainly four phases: Skin Filtering, Hand Cropping, Feature Extraction and Classification. Some of the database images considered for the proposed system are also shown.

A. Skin Filtering
The first phase of our proposed system is skin filtering of the input image, which separates the skin-colored pixels from the non-skin-colored pixels. This method is very useful for the detection of hands, faces, etc. The steps carried out for skin filtering are given in Fig. 3. The input RGB image is first converted to an HSV image; the motive for this step is that the RGB color space is very sensitive to changes in illumination. The HSV color space separates three components: Hue, meaning the set of pure colors within a color space; Saturation, describing the grade of purity of a color; and Value, giving the relative lightness or darkness of a color. Fig. 4 shows the different components of the HSV color model. The HSV image is then filtered and smoothened, finally giving an image that comprises only skin-colored pixels. Now, along with the hand, other objects in the surroundings may also have skin-like color, such as shadows, wood, dress, etc. To eliminate these, we take the biggest binary linked object (BLOB), which keeps only the region comprising the biggest set of linked skin-colored pixels. Results obtained after skin filtering are given in Fig. 5.
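The skin-filtering stage described above can be sketched as follows. This is a hedged illustration, not the paper's implementation: the HSV threshold values below are illustrative assumptions rather than the authors' tuned parameters, and the BLOB step is implemented with a simple 4-connected flood fill.

```python
import numpy as np
from collections import deque

def rgb_to_hsv(img):
    """Vectorised RGB (floats in 0..1) to HSV; returns (h, s, v) arrays."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    maxc, minc = img.max(axis=-1), img.min(axis=-1)
    delta = maxc - minc
    v = maxc
    s = np.where(maxc > 0, delta / np.maximum(maxc, 1e-12), 0.0)
    d = np.maximum(delta, 1e-12)
    h = np.where(maxc == r, ((g - b) / d) % 6,
        np.where(maxc == g, (b - r) / d + 2, (r - g) / d + 4)) / 6.0
    h = np.where(delta == 0, 0.0, h)  # hue undefined for grays
    return h, s, v

def largest_blob(mask):
    """Keep only the largest 4-connected component (the biggest BLOB)."""
    labels = -np.ones(mask.shape, dtype=int)
    best_label, best_size, current = -1, 0, 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx] != -1:
            continue
        size, q = 0, deque([(sy, sx)])
        labels[sy, sx] = current
        while q:  # breadth-first flood fill of one component
            y, x = q.popleft()
            size += 1
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and labels[ny, nx] == -1):
                    labels[ny, nx] = current
                    q.append((ny, nx))
        if size > best_size:
            best_label, best_size = current, size
        current += 1
    if best_size == 0:
        return np.zeros_like(mask, dtype=bool)
    return labels == best_label

def skin_filter(img):
    h, s, v = rgb_to_hsv(img)
    # Illustrative skin range: lowish hue, moderate saturation, bright enough.
    mask = (h < 0.14) & (s > 0.1) & (s < 0.8) & (v > 0.2)
    return largest_blob(mask)
```

In a real system the thresholds would be tuned on sample skin data, and a morphological smoothing pass would typically precede the BLOB selection.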

B. Hand Cropping
The next phase is cropping of the hand. For recognition of different gestures, only the hand portion up to the wrist is required, so the unnecessary part is clipped off using this hand-cropping technique. The significance of hand cropping is that we can detect the wrist and hence eliminate the undesired region; once the wrist is found, the fingers can easily be located, as they lie in the region opposite the wrist. The steps involved in this technique are as follows:
• Once the wrist is detected, its position can easily be found out.
• Then the minimum and maximum positions of the white pixels in the image are found in all the other directions. Thus we obtain Xmin, Ymin, Xmax, Ymax, one of which is the wrist position.
• The image is then cropped along these coordinates, as used in [5].
A few images after hand cropping are shown in Fig. 6.
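The bounding-box step above can be sketched as follows. This sketch assumes the binary skin mask from the previous stage as input; the wrist-detection logic itself is not reproduced here, only the min/max coordinate cropping.

```python
import numpy as np

def crop_hand(mask):
    """Crop a binary mask to the bounding box of its white pixels.

    The extreme coordinates correspond to Xmin, Ymin, Xmax, Ymax in the
    text, one of which coincides with the detected wrist position.
    """
    ys, xs = np.nonzero(mask)          # coordinates of all white pixels
    y_min, y_max = ys.min(), ys.max()
    x_min, x_max = xs.min(), xs.max()
    return mask[y_min:y_max + 1, x_min:x_max + 1]
```

Applied to the skin-filtered mask, this yields the cropped hand region that is fed to feature extraction.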

C. Feature Extraction
After the desired portion of the image is cropped, the feature extraction phase is carried out. Here, Eigen values and Eigen vectors are found from the cropped image. The mathematical steps for finding the Eigen values and Eigen vectors in our proposed system are:
• The input data is assumed to be X. In our paper, the cropped image, of dimension 50 by 50, is taken as the input.
• The mean μ of the input X is found out as μ = (1/N) Σ Xi.
• Then the covariance matrix C of the input X is found out, given mathematically by C = (1/N) Σ (Xi − μ)(Xi − μ)T.
• The Eigen vectors and the Eigen values are computed from the covariance matrix C.
• Finally, the Eigen vectors are arranged so that the corresponding Eigen values are in decreasing order.
In our project, only five significant Eigen vectors out of 50 have been considered, because the Eigen values beyond these were very small and so could be neglected. This provides advantages such as data compression and reduction of the data dimension without much loss of information, reducing the original variables to a lower number of orthogonal, non-correlated synthesized variables.
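The steps above can be sketched numerically. This is one plausible reading of the pipeline, stated as an assumption: the 50×50 cropped image is treated as 50 row observations of 50 variables, the column-wise mean is removed, and the five eigenvectors with the largest eigenvalues of the covariance matrix are kept.

```python
import numpy as np

def eigen_features(img, k=5):
    """Return the k largest eigenvalues and eigenvectors of the covariance."""
    X = img.astype(float)              # assumed 50 x 50 cropped image
    mu = X.mean(axis=0)                # mean of the input data
    Xc = X - mu                        # centre the data
    C = (Xc.T @ Xc) / X.shape[0]       # 50 x 50 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]  # arrange in decreasing eigenvalue order
    return eigvals[order][:k], eigvecs[:, order][:, :k]
```

Keeping only `k=5` components mirrors the paper's observation that the remaining eigenvalues are small enough to neglect.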

D. Classifier
A classifier is needed in order to recognize the various hand gestures. In our paper, we have designed a new classification technique, the Eigen value weighted Euclidean distance between Eigen vectors, which involves two levels of classification.
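A minimal sketch of such a classifier follows. The paper names the metric but this section does not give its formula, so the weighting below (eigenvalues used as weights on per-eigenvector Euclidean distances, with the nearest database entry winning) is an assumption about one plausible form, not the authors' exact two-level scheme.

```python
import numpy as np

def weighted_distance(eigvals, vecs_a, vecs_b):
    """Eigenvalue-weighted Euclidean distance between two eigenvector sets.

    vecs_a, vecs_b: matrices whose columns are the k most significant
    eigenvectors; eigvals: the k corresponding eigenvalues (the weights).
    """
    per_vector = np.linalg.norm(vecs_a - vecs_b, axis=0)
    return float(np.sum(eigvals * per_vector))

def classify(test_feats, database):
    """Return the label of the database entry nearest to the test image."""
    eigvals, vecs = test_feats
    return min(database,
               key=lambda label: weighted_distance(eigvals, vecs,
                                                   database[label]))
```

Weighting by the eigenvalues lets the directions that explain the most variance dominate the comparison, which is the intuition behind the metric's name.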

V. RESULTS AND DISCUSSIONS
Different images were tested, and the new classification technique was found to show 97% accuracy. Some images tested against other database images are given in the following tables, where two levels of classification were used to identify the gestures. Table I shows the Level 1 classification for different test images, and Table II shows the Level 2 classification. A comparison between the first and second levels of classification is made in Table III; the success rate improved from 87% to 97% with the use of the Eigen value weighted Euclidean distance between Eigen vectors as the classification technique. From these experiments, we can say that we have designed a system able to recognize different alphabets of Indian Sign Language, removing difficulties faced by previous works, with an improved recognition rate of 97%. The time taken to process an image was 0.0384 seconds. Table IV gives a brief comparative study between our work and other related works.

VI. CONCLUSION AND FUTURE WORK
The proposed system was implemented in MATLAB version 7.6 (R2008a); the supporting hardware was an Intel® Pentium® CPU B950 @ 2.10 GHz machine with Windows 7 Home Basic (64-bit), 4 GB RAM and an external 2 MP camera. A system was designed for Indian Sign Language recognition. It was able to handle different static alphabets of Indian Sign Language by using Eigen value weighted Euclidean distance between Eigen vectors as the classification technique. We have tried to improve the recognition rate compared to previous works, and achieved a success rate of 97%. Moreover, we have considered both hands in our paper. As we performed the experiments with only static images, 'H' and 'J' out of the 26 alphabets were not considered, as they are dynamic gestures. We hope to deal with dynamic gestures in the future. Moreover, only 240 images were considered in this paper, so in the future we hope to extend the dataset further.