Classification of Hand Gestures Using Gabor Filter with Bayesian and Naïve Bayes Classifier

A hand gesture is a movement, position or posture of the hand, used extensively in our daily lives as part of non-verbal communication. A great deal of research is being carried out on classifying hand gestures in videos as well as images for various applications. The primary objective of this communication is to present an effective system that can classify various static hand gestures against complex backgrounds. The system is based on a hand region localized using a combination of morphological operations. A Gabor filter is applied to the extracted region of interest (ROI) to extract hand features, which are then fed to Bayesian and Naïve Bayes classifiers. The results of the system are very encouraging, with an average accuracy of over 90%.

Keywords—Human Computer Interaction; Hand Segmentation; Gesture recognition; Gabor Filter; Bayesian and Naïve Bayes classifiers; Feature Extraction; Image Processing


I. INTRODUCTION
Artificial intelligence is now commonly used in various daily life applications. Image processing and machine learning play a vital role in these applications. Systems such as face recognition, facial expression recognition and gesture recognition involve multiple steps of image segmentation and processing, and research in this domain continues to seek more suitable solutions for forthcoming challenges and problems.
Gesture recognition is considered a difficult task because of challenges that prevent even state-of-the-art systems from achieving 100% accurate results. Researchers use different techniques, such as Support Vector Machines (SVM), Neural Networks (NN) and Hidden Markov Models (HMM), to classify gestures into various classes. Hand gesture classification has various applications. In Human Computer Interaction, for example, gesture recognition and classification allow users to interact efficiently with computers using hand gestures that are easy to use and understand. William1995 was one of the first to establish the use of hand gesture recognition for controlling household machines. Ayesha2010 states that the main goal in gesture recognition is to create a system which can recognize precise human gestures for relaying useful information.
Hand gestures can be categorized as static hand gestures and dynamic hand gestures. Static hand gestures are still images showing a certain position and certain features of the hand. They can be differentiated by hand position features, in which finger positions are determined and features are found for the precise finger, thumb and palm pattern. Thus a static hand gesture is represented by an individual frame or image.
Dynamic hand gestures, on the other hand, can be differentiated through the starting and last stroke of the moving hand. A video is the best example of a dynamic hand gesture. Two main approaches are used for hand gesture tracking and analysis: glove based systems and vision based systems. In the first approach, sensors and other instruments are attached to the user's hand. To analyse the data, a magnetic field tracker device and a data glove or body suit are used. Features extracted from different human body positions, such as angle and rotation, are essential for examination in such systems. The vision based approach is the alternative to the glove based technique, because it normally does not need sophisticated or even small external equipment apart from a camera. These systems work using various image processing and machine learning algorithms. In any system, however, the three important steps are hand region segmentation, feature extraction, and final classification and recognition.

II. PROPOSED SYSTEM
In this paper, we present a scheme for static gesture recognition using a vision based approach. Our proposed system can be summarized in the following steps. In the first step, preprocessing is done: initial pre-filtering for noise removal is performed, followed by conversion of the images from RGB to LAB color space. Then the gray-level third component is converted to binary using a global threshold and is mapped onto the original image to extract just the desired hand region. To remove undesirable areas, a combination of different morphological operations is applied to the image. The resulting image is then used to extract the edges of the segmented region. Once the hand region is segmented, we extract certain features of the hand, which are then used in the classification module for gesture mapping. The classification is done using Bayesian and Naïve Bayes classifiers, and their performance is compared at the end. The procedures included in the block diagram of Fig. 1 are discussed in detail in the following sections.

A. Preprocessing

1) Conversion to L.A.B Space
We use RGB color images as input and convert them to the L.A.B color space, so that pixels can be differentiated on the basis of their color. Skin color has certain properties that can be extracted easily from color information. In contrast to RGB and CMYK, L.A.B color is designed to approximate human color perception and therefore gives better results for image segmentation.
The color space separates the luminance L* from two color components (a* and b*); the "L" component closely matches human perception of lightness. Accurate color balance corrections can be made by modifying output curves in the "a" and "b" components, and lightness and contrast can be adjusted using the "L" component.
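The conversion described above can be illustrated with a minimal per-pixel sketch. It assumes 8-bit sRGB input and a D65 reference white, neither of which is stated in the paper, and the function names and sample pixels are hypothetical. Plain Python is used to keep the sketch dependency-free; in practice an image library would perform this conversion on whole arrays.

```python
import math

# D65 reference white (an assumption; the paper does not state the illuminant).
XN, YN, ZN = 0.95047, 1.0, 1.08883

def srgb_to_linear(c):
    """Undo the sRGB gamma curve for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def _f(t):
    """Piecewise cube-root function from the CIE L*a*b* definition."""
    delta = 6.0 / 29.0
    return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

def rgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to (L*, a*, b*)."""
    rl, gl, bl = (srgb_to_linear(v / 255.0) for v in (r, g, b))
    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    fx, fy, fz = _f(x / XN), _f(y / YN), _f(z / ZN)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)

skin = rgb_to_lab(224, 172, 105)  # hypothetical skin-tone pixel
sky = rgb_to_lab(90, 140, 220)    # hypothetical bluish background pixel
white = rgb_to_lab(255, 255, 255)
```

In this sketch the yellowish skin pixel has a positive b* value while the bluish background pixel has a negative one, which is the kind of separation that thresholding the third component relies on.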

2) Thresholding
Automatic gray-level thresholding is performed on the third component, "b", of the L.A.B. color space image $A$, that is, on $A(:,:,3)$. The obtained gray image is converted to binary using a global threshold. The resultant binary image can be mapped onto the original image to view the segmented region:

$$J(x, y) = \begin{cases} I(x, y), & O(x, y) \neq 0 \\ 0, & \text{otherwise} \end{cases}$$

where $O$ is the binary image: if it has a nonzero pixel value, the intensity of $I$ at the same spatial coordinates is assigned to the output pixel $J(x, y)$. Thus the segmented skin area can be extracted through this mapping.
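The thresholding and mapping steps can be sketched as follows. The paper does not name the automatic thresholding method, so Otsu's method is used here as an assumption. The sketch also assumes that the "b" channel has been scaled to 8-bit levels and that unmasked output pixels are set to zero. All names and the tiny sample image are hypothetical.

```python
def otsu_threshold(gray):
    """Return the 8-bit level t that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

def binarize(gray, t):
    """O(x, y) = 1 where the gray level exceeds t, else 0."""
    return [[1 if v > t else 0 for v in row] for row in gray]

def map_mask(original, mask, background=(0, 0, 0)):
    """J(x, y) = I(x, y) where O(x, y) is nonzero, else the background value."""
    return [[pix if m else background for pix, m in zip(prow, mrow)]
            for prow, mrow in zip(original, mask)]

# Hypothetical 4x4 "b" channel with a brighter 2x2 skin patch in the middle.
b_channel = [[120, 122, 121, 119],
             [121, 160, 162, 120],
             [119, 161, 163, 122],
             [120, 121, 120, 118]]
original = [[(200, 150, 100)] * 4 for _ in range(4)]  # hypothetical RGB image

t = otsu_threshold(b_channel)
O = binarize(b_channel, t)
J = map_mask(original, O)
```

Only the pixels inside the 2x2 patch keep their original color in `J`; every other pixel is replaced by the background value.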

B. Segmentation
For segmentation, we have to perform the following operations.

1) Morphological Operations
Opening is performed in order to remove narrow joints and noise. In opening, we first perform erosion using a 5×5 structuring element B applied to the image A. If all the "ON" pixels of the structuring element hit "ON" pixels of the image, the pixel under the center of the mask is set in the new image. Whenever an "ON" pixel of the structuring element misses, that pixel is left at 0 and the mask is shifted to the next pixel.
After erosion we dilate the image using a 5×5 structuring element in order to fill the disturbed region. To recap, dilation adds pixels to the object boundary while erosion removes object pixels from the boundary.
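A minimal sketch of opening (erosion followed by dilation) on a binary image is given below. It assumes a square 5×5 structuring element with all elements "ON" and treats pixels outside the image as "OFF". The paper specifies neither the shape of the structuring element nor the border handling, so both are assumptions, and the function names are hypothetical.

```python
def erode(img, k=5):
    """Binary erosion with a k x k square structuring element.
    A pixel stays ON only if every pixel under the window is ON."""
    h, w, r = len(img), len(img[0]), k // 2
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(w)] for y in range(h)]

def dilate(img, k=5):
    """Binary dilation with a k x k square structuring element.
    A pixel turns ON if any pixel under the window is ON."""
    h, w, r = len(img), len(img[0]), k // 2
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(w)] for y in range(h)]

def opening(img, k=5):
    """Opening = erosion followed by dilation with the same element."""
    return dilate(erode(img, k), k)

# Hypothetical 12x12 mask: an 8x8 "hand" block plus two isolated noise pixels.
clean = [[1 if 2 <= y <= 9 and 2 <= x <= 9 else 0 for x in range(12)]
         for y in range(12)]
noisy = [row[:] for row in clean]
noisy[0][11] = 1
noisy[11][0] = 1

opened = opening(noisy)
```

Erosion shrinks the block to its 4x4 core and deletes the noise pixels; dilation then grows the core back to the original 8x8 block, so the noise is gone while the object is preserved.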

2) Canny Edge Detector
After the morphological operations, we need to detect the boundary of the region. For that we tried several well-known edge detectors: Sobel, Prewitt and Canny. We selected Canny edge detection as it gave more desirable results than the other two options. After this stage, we have a well detected hand region and its boundary.
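A full Canny detector (smoothing, gradient computation, non-maximum suppression and hysteresis thresholding) is too long to reproduce here. As a simpler illustration of gradient-based boundary detection, the sketch below computes the Sobel gradient magnitude, which is one of the alternatives mentioned above and is commonly used for the gradient stage of Canny. It is not the detector used in the paper, and the names and the sample mask are hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    """Gradient magnitude of a 2D image; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out

# Hypothetical 8x8 binary mask with a 4x4 segmented region.
mask = [[1 if 2 <= y <= 5 and 2 <= x <= 5 else 0 for x in range(8)]
        for y in range(8)]
edges = sobel_magnitude(mask)
```

The response is zero deep inside the region and nonzero along its boundary, which is the information an edge detector passes on to the feature extraction stage.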

C. Feature Extraction
Once the hand region is extracted, we have to extract good features with high discriminative power. These features are used for classifying various hand gestures, so that problems like interclass similarity and intraclass variability can be handled easily. Extracting the best features is a challenging task. Researchers have used various features for classifying gestures based on the internal information of the object in the image. A good feature set strongly affects the performance of a classifier.
For our technique, we use Gabor filter based features, which describe the nature of a gesture through its pattern and orientation. The advantage of using a Gabor filter is that we get a large variance between the features of different classes, which minimizes the chance of error if the features are used properly in a classifier.

1) Gabor Filter
Gabor filters have been used in the literature for various applications, including satellite image segmentation, urban zone detection [17], and edge detection in document and camera images. In the spatial domain, a 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave; that is, the impulse response of a Gabor filter is a sinusoidal wave (a plane wave for 2D Gabor filters) multiplied by a Gaussian function. Because of the multiplication-convolution property (convolution theorem), the Fourier transform of a Gabor filter's impulse response is the convolution of the Fourier transform of the sinusoidal wave and the Fourier transform of the Gaussian function.

$$g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \exp\left(i\left(2\pi\frac{x'}{\lambda} + \psi\right)\right)$$

with $x' = x\cos\theta + y\sin\theta$ and $y' = -x\sin\theta + y\cos\theta$.

In the above equation, $\lambda$ represents the wavelength of the sinusoidal factor, $\theta$ is the orientation of the normal to the parallel stripes of the Gabor function, $\psi$ is the phase offset, $\sigma$ is the standard deviation of the Gaussian envelope, and $\gamma$ is the spatial aspect ratio that specifies the ellipticity of the support of the Gabor function.
We apply this filter to each hand image to obtain the feature values of the image.
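The sketch below builds the real part of a standard 2D Gabor kernel, a Gaussian envelope multiplied by a cosine plane wave, and derives a small feature vector from the filter responses. The paper does not state the kernel size, the parameter values, or how the responses are summarized, so the 7×7 kernel, the default parameters, and the mean and standard deviation of the response magnitude per orientation are all assumptions. The function names are hypothetical.

```python
import math

def gabor_kernel(size, lam, theta, psi, sigma, gamma):
    """Real part of a Gabor function sampled on a size x size grid (size odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / lam + psi))
        kernel.append(row)
    return kernel

def filter_valid(img, kernel):
    """Correlate img with kernel at positions where the kernel fits entirely."""
    k = len(kernel)
    h, w = len(img), len(img[0])
    return [[sum(kernel[j][i] * img[y + j][x + i]
                 for j in range(k) for i in range(k))
             for x in range(w - k + 1)] for y in range(h - k + 1)]

def gabor_features(img, thetas, size=7, lam=4.0, psi=0.0, sigma=2.0, gamma=0.5):
    """Mean and standard deviation of the response magnitude per orientation."""
    feats = []
    for theta in thetas:
        resp = filter_valid(img, gabor_kernel(size, lam, theta, psi, sigma, gamma))
        mags = [abs(v) for row in resp for v in row]
        mean = sum(mags) / len(mags)
        std = math.sqrt(sum((m - mean) ** 2 for m in mags) / len(mags))
        feats += [mean, std]
    return feats

# Hypothetical test pattern: vertical stripes with a period of 4 pixels.
stripes = [[math.cos(2 * math.pi * x / 4.0) for x in range(16)] for _ in range(16)]
features = gabor_features(stripes, thetas=[0.0, math.pi / 2])
```

For this striped pattern the mean response at orientation 0 is much larger than at orientation π/2, which shows how the features encode the orientation of the pattern in the image.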

D. Classification
Classification means assigning regions of interest to groups called classes. There are two main types of classification: supervised and unsupervised.
In unsupervised classification (also known as clustering in pattern recognition) the resulting groups have no labels, while supervised classification needs training samples that help in classifying the unknown data or test cases. The grouping or matching is based on certain features that are extracted from the objects. These features should be discriminating in nature to give a correct and meaningful classification of the data. In our case, we give the Gabor filter based features to two classifiers (Bayesian and Naïve Bayes) to decide the class of the query gesture.

1) Bayesian Classifier
The Bayesian classifier is based on calculating posterior probabilities using Bayes' theorem. Posterior probabilities, which are also known as class density estimates, are calculated as below:

$$P(H \mid X) = \frac{P(X \mid H)\, P(H)}{P(X)}$$

In the above expression, $P(X \mid H)$ denotes the posterior probability of $X$ conditioned on $H$, and $P(X)$ denotes the prior probability of $X$. The classifier's accuracy depends heavily on this parameter.
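As a small worked example of Bayes' theorem, the sketch below computes posteriors for two hypothetical gesture classes from made-up priors and likelihoods. None of the numbers come from the paper; the evidence P(X) is obtained by summing the product of prior and likelihood over all hypotheses.

```python
def posteriors(priors, likelihoods):
    """Apply Bayes' theorem: P(H|X) = P(X|H) P(H) / P(X) for every hypothesis H."""
    evidence = sum(priors[h] * likelihoods[h] for h in priors)  # P(X)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}

# Hypothetical values for two gesture classes and one observed feature vector X.
priors = {"fist": 0.5, "palm": 0.5}        # P(H)
likelihoods = {"fist": 0.2, "palm": 0.6}   # P(X|H)

post = posteriors(priors, likelihoods)
best = max(post, key=post.get)
```

With equal priors the class with the larger likelihood wins: here P(X) = 0.4, so the posteriors are 0.25 for "fist" and 0.75 for "palm".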

2) Naïve Bayes Classifier
The other technique that we tried was Naïve Bayes classification. This classifier estimates the class of an unknown data item using probability models. Using Bayes' theorem, the posterior probability of class $C_i$ can be calculated using the following expression:

$$P(C_i \mid X) = \frac{P(X \mid C_i)\, P(C_i)}{P(X)}$$
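A minimal Naïve Bayes sketch is shown below. The class prior is estimated as the fraction of training samples in each class, and each feature is modeled with an independent Gaussian. The paper does not state which density model is used, so the Gaussian is an assumption, and the toy feature vectors and function names are hypothetical.

```python
import math
from collections import defaultdict

def train_gnb(samples, labels):
    """Estimate per-class prior, feature means and feature variances."""
    by_class = defaultdict(list)
    for x, c in zip(samples, labels):
        by_class[c].append(x)
    total = len(samples)
    model = {}
    for c, rows in by_class.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(cols, means)]
        model[c] = (n / total, means, variances)
    return model

def predict_gnb(model, x):
    """Return the class with the highest log posterior (up to the constant P(X))."""
    def log_posterior(c):
        prior, means, variances = model[c]
        log_likelihood = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                             for xi, m, v in zip(x, means, variances))
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)

# Hypothetical two-dimensional feature vectors for two gesture classes.
samples = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.9),
           (5.0, 5.1), (4.9, 4.8), (5.2, 5.0)]
labels = ["A", "A", "A", "B", "B", "B"]

model = train_gnb(samples, labels)
pred_a = predict_gnb(model, (1.1, 0.9))
pred_b = predict_gnb(model, (4.8, 5.2))
```

Working with log probabilities avoids numerical underflow when many features are multiplied together, and dropping P(X) is safe because it is the same for every class.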

III. EXPERIMENTAL RESULTS
To test the methodology explained above, we used data with complex, real backgrounds. The data were acquired with a 7-megapixel camera. We classify a total of ten hand gestures. For our data set, we collected 18 images of each hand gesture, of which 5 are used for training and 13 for final testing. To assess the accuracy of our proposed technique, we compared our results with similar work by Nawazish et al. (2013) [18], which used neural network based hand gesture recognition of dumb people. In that paper, the authors combined neural networks with mean and entropy features. Features were extracted block-wise: the image was divided into 4 blocks and the features of each block were extracted separately. The results of all blocks were then stored in an array, and neural networks were used for classification. The results are given in Table I.
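The data split described above can be sketched as follows. The paper does not say which 5 images of each gesture are used for training, so taking the first five is an assumption, and the (gesture, image index) identifiers and function names are hypothetical.

```python
def split_dataset(num_gestures=10, images_per_gesture=18, train_per_gesture=5):
    """Split (gesture, image_index) pairs into training and test lists."""
    train, test = [], []
    for g in range(num_gestures):
        for i in range(images_per_gesture):
            (train if i < train_per_gesture else test).append((g, i))
    return train, test

def accuracy(predicted, actual):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

train, test = split_dataset()
```

With ten gestures this gives 50 training images and 130 test images, so the reported classification rates are fractions of those 130 test images.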

IV. CONCLUSION AND FUTURE WORK
In this paper, we briefly presented a technique for gesture classification in static images using Gabor filter based features with Bayesian and Naïve Bayes classifiers. Our proposed technique shows promising results on our data set. The hand segmentation stage plays a very important role in our technique: if the segmentation is good, the results of gesture classification also improve. In future we intend to extend this work to dynamic gestures as well. Human Computer Interaction and Sign Language Recognition are two of the prime applications of this research. We also need to compare our method with more state-of-the-art methods and improve it further to incorporate more gestures.

Fig. 1. System Block Diagram

Here erosion is used to remove the extra region, where A represents the image and B denotes the structuring element.
Here A represents the gray scale image, and B again denotes the structuring element.

$P(C_i)$ are known as the class prior probabilities, and they can be estimated by the following expression:

$$P(C_i) = \frac{s_i}{S}$$

where $s_i$ represents the number of selected training samples which have class $C_i$, and $S$ denotes the total number of training samples.

Fig. 2. Segmentation Results (a) Original image (b) LAB color space image (c) Third color component of Lab image, A(:,:,3) (d) Binary image (e) Binary mapping (f) Morphological operation opening (g) Final binary image (h) Segmented hand.

The table shows that our method of Gabor filter based feature extraction and Bayesian classification works well and classifies the gestures correctly 90% of the time, while the method of Nawazish achieved a classification rate of 88.25%. Stepwise results of the various stages of our technique are shown in Fig. 2. Two examples are depicted here in detail.

TABLE I. COMPARISON OF RESULTS