Face Behavior Recognition Through Support Vector Machines

Communication between computers and humans has grown into a major field of research. Facial behavior recognition through computer algorithms is a motivating and difficult problem in establishing emotional interaction between humans and computers. Although researchers have suggested numerous methods of emotion recognition in the literature, most of these works have evaluated their systems on a single facial database. This may limit how well the methods generalize, and it narrows the range over which results can be compared. This paper presents a technique for recognizing emotional expressions from the facial features of still images. The technique uses Support Vector Machines (SVM) as the classifier of emotions. Substantive issues are considered, such as the diversity of facial databases, the samples included in each database, the number of facial expressions examined, an accurate method of extracting facial features, and the variety of structural models. After many experiments and a comparison of the results of different models, it is determined that this approach produces high recognition rates.
Keywords: Facial Behavior Recognition; Support Vector Machine; Human Computer Interaction

I. INTRODUCTION
Researchers have grouped facial expressions of emotion into various categories. Darwin (1872) proposed universal facial expressions of emotion in his evolutionary theory. Floyd Allport (1924), Solomon Asch (1952) and Tomkins (1962, 1963) also explored universal emotional facial behavior, although each theorist proposed a different theoretical basis for his expectation. These theorists proposed that cultural variation is a further factor in the variation of facial behavior [2].
Human Computer Interaction (HCI) has developed into an important field of computer science over the last few decades, resulting in many forms of communication between computers and human beings. Researchers have developed numerous solutions to increase the interaction between humans and computers. Emotion detection provides valuable insight into Human Computer Interaction by computing the direct response of the user. An important reason behind the growth of activity in this research area is the need to understand how well computers can distinguish between numerous facial emotions.
The main motivations behind this research work are as follows. First, facial behavior recognition through computer algorithms is a demanding problem for establishing emotional communication between humans and computers. Second, although researchers have proposed several methods of emotion recognition in the literature, their methods have primarily used a single facial database for evaluation, which may reduce both generalization and the comparability range. This paper proposes a technique for recognising facial expressions from still pictures of people showing various facial expressions. The main contributions of this system are as follows: first, two entirely different sets of facial expressions are used, with features extracted via the facial feature point extraction technique; second, the number of features in the database sets is reduced, and these reduced feature sets are fed into a classification model in order to improve the recognition rate and to determine the performance of the proposed method. This paper is organised as follows: Section 2 describes previous related works. Section 3 introduces the system structure of the proposed method, giving full details of the database sets, the feature extraction and selection techniques, the different structures of the classification method, and the experimental results. Section 4 presents the main conclusions and, finally, Section 5 suggests steps for future work.
www.ijacsa.thesai.org

II. RELATED WORKS
Moon Hwan Kim and his colleagues proposed a way of detecting emotions from frontal facial images in 2005 [3]. For feature extraction, they divided the face into three feature regions: the eye region, the mouth region, and the auxiliary region. For every region, features were extracted by comparing geometric and shape information. They used a fuzzy color filter and histogram analysis to extract the face region. The facial elements were extracted using histogram analysis based on a Virtual Face Model (VFM). A fuzzy classifier, identified by the Linear Matrix Inequality (LMI) optimization method, classified facial images into five different emotions.
In 2010, Suvam Chatterjee and Hao Shi proposed a new method for extracting a novel feature matrix from facial images. They used the Local Binary Pattern (LBP) to extract facial features, then applied these features to an Adaptive Neuro-Fuzzy Inference System (ANFIS) to determine five facial expressions [4]. Jagdish Lal Raheja and Umesh Kumar introduced a design for human gesture recognition in 2010. The human face is detected using the technique described by Viola and Jones, based on an AdaBoost Haar classifier. This is followed by an edge detection step. Since edge detection plays a vital role in finding the tokens, they used four established algorithms: Prewitt, Sobel, Prewitt Diagonal, and Sobel Diagonal. A thinning method is then applied to reduce the width of each edge to a single line. After the thinning process, tokens are generated. A back-propagation neural network classifier was trained on three different gesture images [5].
Priya Metri and colleagues studied a multimodal approach to emotion recognition in 2012. They used two different models, Facial Expression Recognition and Hand Gesture Recognition, and merged the results of the two classifiers using a third classifier that yields the resulting emotion. For extracting facial features, they followed the popular model of Ekman and Friesen, the Facial Action Coding System (FACS), which identifies all visually distinguishable facial movements. They used the CamShift algorithm to extract Hand Gesture Units (HGU). The data from communicative face and hand gestures is classified into labeled emotion classes by a Bayesian classifier that takes the features of both systems together, after which the emotion is recognized [6].
In this paper, the aforementioned works are studied and analysed and, in view of that, a new approach is proposed for recognizing facial expressions from still pictures in two different sets of facial expressions, with features extracted via the facial feature point extraction technique. The feature sets are then fed into an SVM classification model. Details of the new approach are explained in the next section.

III. THE SYSTEM STRUCTURE
The human face behavior recognition system can be divided into five main stages: Facial Database, Pre-Processing, Feature Extraction, Feature Selection, and Classification. Figure 1 displays the flowchart of the proposed emotion recognition system structure.

A. The Database of Images
Choosing an appropriate database of images is an important task, as the specification of the images has a significant impact on the results of the emotion recognition system. Different databases are compared on criteria such as the variety of subjects, variation in skin color, variation in size, and the presence of spectacles, beards, and hair. After comparing a collection of databases on these criteria, the Bosphorus and JAFFE databases were selected as appropriate for this research work [7,8]. The two databases are described below:

1) The Bosphorus Database
This database is intended for research on 3D and 2D human face processing problems such as expression recognition, facial action unit detection, facial action unit intensity estimation, face recognition under adverse conditions, deformable face modeling, and 3D face reconstruction. The database contains 105 subjects and 4666 faces altogether. Figure 2 shows seven emotions of the same subject from the Bosphorus database. The seven emotions are surprise, sadness, neutral, happy, fear, anger and disgust. This database is distinctive in three respects. First, it has a rich repertoire of expressions: up to 35 expressions per subject, FACS scoring (including intensity and asymmetry codes for each AU), and one third of the subjects are professional actors or actresses. Second, it has systematic head poses, covering 13 yaw and pitch rotations. Third, it has a variety of face occlusions (beard and moustache, hair, hand, eyeglasses).

2) The JAFFE Database
The JAFFE database contains 213 images of 7 facial expressions (6 basic facial expressions + 1 neutral) posed by 10 Japanese female models. The images are in grayscale. Each image has been rated on 6 emotion adjectives by 60 Japanese subjects. These images were taken at the Psychology Department of Kyushu University. Figure 3 shows seven emotions of the same subject.

B. Pre-Processing
The emotion recognition system requires the input images to be uniform in size and brightness. The number of pixels in an image plays an important role in the accuracy of emotion detection. Therefore, the dimensions of the images should be the same within each database. Since all of the images in the JAFFE database have the same dimensions (256 × 256 pixels) and are in grayscale, they do not need any filtering or scaling. Images in the Bosphorus database need to be scaled to an equal size, so they are scaled to 384 × 470 pixels.

C. Facial Feature Extraction
In this stage, previous research is reviewed to identify the characteristics of the human face that have the greatest impact on human behavior and the regions of the face on which researchers have focused. These regions are usually called regions of interest (ROI). Our proposed method for feature extraction divides the face into three regions: the eye region, the mouth region, and the auxiliary region. In the facial feature extraction stage, geometric information is extracted from these regions in two separate phases: facial feature point extraction and facial feature set detection.
The first step is to find a set of important points on the face, called facial feature points, which form the basis for the final facial feature set. There are many techniques and tools for extracting these points, but not all of them are accurate. For this purpose, the Luxand FaceSDK library is used in this research work, as it offers easy integration into applications, supports different programming languages (e.g. C++, Visual Basic, C#, Java), and supports different platforms (Windows, Linux, Mac OS X) [9]. This library can detect 66 facial feature points on the face, of which only 27 are used in our experiments. Figure 4 illustrates the facial feature points found on a sketched face. The red and blue filled circles mark the facial feature points that are used in this paper, while the ellipses mark the unused points.
The second step is to find the final facial feature set, which is needed for the classification stage. The distance between a pair of facial feature points can be used as a facial feature. Therefore, the Euclidean distance is used to measure the distance between each pair of these points. The Euclidean distance between two points p and q in a two-dimensional space is the length of the line segment between them. Identifying facial features among the many combinations that can be derived from these points is another important process that needs further investigation. Therefore, the most commonly used features are selected [3,10,11]. Figure 5 shows the 15 facial features that are derived from the 27 facial feature points used. Each line indicates a facial feature; the lines are labeled and drawn in different colors. Table 1 shows the features extracted from the eye region.
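The distance computation described in this section can be illustrated with a minimal Python sketch. The landmark names and pixel coordinates below are hypothetical and are not taken from the paper or from the Luxand FaceSDK output; only the Euclidean distance itself comes from the text.

```python
import math

# Hypothetical landmark coordinates in pixels (x, y); the names and values
# are illustrative and are not taken from the paper or from Luxand FaceSDK.
landmarks = {
    "right_eye_outer": (120.0, 150.0),
    "right_eye_inner": (160.0, 150.0),
    "right_eye_top": (140.0, 140.0),
    "right_eye_bottom": (140.0, 158.0),
}

def euclidean(p, q):
    """Length of the straight line between two 2D points p and q."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def distance_feature(points, name_a, name_b):
    """One geometric feature: the distance between two named landmarks."""
    return euclidean(points[name_a], points[name_b])

eye_width = distance_feature(landmarks, "right_eye_outer", "right_eye_inner")
eye_height = distance_feature(landmarks, "right_eye_top", "right_eye_bottom")
print(eye_width, eye_height)  # 40.0 18.0
```

Each feature is a single scalar, so a face described by 15 such distances becomes a 15-element vector that can be passed to a classifier.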

D. Feature Selection
In this stage, the most useful features are selected from the whole set of features. Important features can be identified by different techniques such as PCA and SVM. Some of these methods have been used to identify and rank the importance of each feature [12,13,14]. Some features are combined, since the face is composed of two similar sides, right and left. Thus, features such as the widths of the two eyes are likely to be nearly the same. Such duplicated features might decrease the classification accuracy. A typical solution is to combine these features by taking their average.
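The averaging of left and right features can be sketched as follows. The pairing uses the eye-region feature descriptions in Table I (F3/F4, F5/F7, F6/F8); whether the authors merged exactly these pairs is an assumption, and the sample values are invented for illustration.

```python
# Left/right pairs taken from the eye-region descriptions in Table I;
# whether the authors merged exactly these pairs is an assumption.
SYMMETRIC_PAIRS = [("F3", "F4"), ("F5", "F7"), ("F6", "F8")]

def merge_symmetric(features, pairs=SYMMETRIC_PAIRS):
    """Replace each right/left pair by a single averaged feature."""
    merged = dict(features)
    for right, left in pairs:
        merged[f"{right}_{left}_mean"] = (merged.pop(right) + merged.pop(left)) / 2.0
    return merged

# Hypothetical distances in pixels for one face image.
sample = {"F3": 21.0, "F4": 23.0, "F5": 40.0, "F6": 18.0, "F7": 42.0, "F8": 16.0}
print(merge_symmetric(sample))
# {'F3_F4_mean': 22.0, 'F5_F7_mean': 41.0, 'F6_F8_mean': 17.0}
```

Averaging halves the number of near-duplicate inputs while keeping the information both sides share.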

E. Classification
Different subsets of the facial feature set are used as inputs to the classifier. The seven facial expressions are encoded as natural numbers (see Table 4). These codes are used to measure the learning performance in both the training and testing stages of the system. SVM is a classification method that can be used for classifying emotions [14,15]. It assigns class labels to the input data by finding the maximum-margin hyperplane, which may be a line, a plane or a hyperplane depending on the dimension of the data. This process maximises the distance between the hyperplane and the nearest data points. The SVM process can be summarised in three steps. The first step is to find the hyperplane in the feature space that is able to separate the input features. Since emotion detection for seven different expressions is a nonlinear problem due to the high dimension of the input features, each input sample is then mapped to its representation in the feature space. Finally, maximising the margin and evaluating the decision function both require the computation of dot products in a high-dimensional space. In this research work, different experimental tests are carried out using different values of the support vector machine parameters: the type of kernel function, the degree of the kernel function, the cost, the coefficient, and gamma, all given as real numbers. In each test, the time taken to build the model in seconds (TTBM), the percentage of correctly classified instances (CCI), the percentage of incorrectly classified instances (ICI), the mean absolute error (MAE), and the root mean squared error (RMSE) are computed. This format is used for all of the case models. The case models for each database set are described below:
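To make the roles of the kernel parameters concrete, the following Python sketch writes out the linear, polynomial and RBF kernels in the parameterisation commonly used by SVM libraries, together with a binary decision function. It is an assumption that the authors' kernels follow these forms; the support vectors, coefficients and bias are made-up values. The cost parameter does not appear in the kernels themselves: it bounds the magnitude of the dual coefficients during training.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, gamma=1.0, coef0=0.0, degree=3):
    """(gamma * <x, y> + coef0) ** degree"""
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * ||x - y||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def decision_value(x, support_vectors, dual_coefs, bias, kernel):
    """Binary SVM decision function: sum_i alpha_i*y_i*K(sv_i, x) + b."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + bias

# Hypothetical two-feature example with made-up support vectors.
svs = [(1.0, 2.0), (3.0, 1.0)]
coefs = [0.5, -0.5]   # alpha_i * y_i, illustrative values only
x = (1.0, 2.0)
score = decision_value(x, svs, coefs, bias=0.0, kernel=rbf_kernel)
print("class +1" if score > 0 else "class -1")
```

The kernel replaces the explicit dot product in the high-dimensional space, so the mapping of each sample never has to be computed directly.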

a) The Bosphorus Database
1) Case 1
In case 1, Table 5 shows the input features selected for classification, the output class labels of emotions, and the parameters of the classifier used. These parameters are used for both training and testing the classifier. The input features are given as "F1-F15", which means that all 15 features are used. The output is given as "1-7", which means that all 7 facial expressions are used in this experiment.

2) Case 2
In this case, instances of the Fear class are removed from the data set in order to compare results with other models that discarded the Fear class. Table 9 shows the classifier parameters of the case 2 model. Table 12 shows the confusion matrix of the testing phase of the case 2 model classifier.
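Reducing the number of classes amounts to filtering the labelled samples before training. A minimal sketch, assuming each sample is stored as a (feature vector, label) pair; the data and the string labels are hypothetical, since the paper encodes expressions as numbers whose values are not reproduced here.

```python
# Hypothetical labelled samples: (feature vector, expression label).
samples = [
    ([40.0, 18.0], "happy"),
    ([38.5, 21.0], "fear"),
    ([41.2, 17.3], "neutral"),
    ([39.0, 20.1], "sadness"),
]

def drop_classes(data, excluded):
    """Keep only the samples whose label is not in `excluded`."""
    excluded = set(excluded)
    return [(x, y) for x, y in data if y not in excluded]

six_class = drop_classes(samples, ["fear"])              # as in case 2
five_class = drop_classes(samples, ["fear", "sadness"])  # as in case 3
print(len(six_class), len(five_class))  # 3 2
```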

3) Case 3
For identifying the accuracy of identical models for both databases, the Sadness class together with the Fear class are removed from the data set in this model. Table 13 shows the classifier parameters of the case 3 model. Table 14 shows the results of the case 3 model classifier.

b) The JAFFE Database
4) Case 4
In this case, Table 17 shows the input features selected for classification, the output class labels of emotions, and the parameters of the classifier used. Table 19 shows the confusion matrix of the training phase of the case 4 model. This matrix clearly shows the correctly and incorrectly classified instances. The vertical labels identify the desired facial expressions, and the horizontal labels identify the expressions they were classified as. Table 24 shows the confusion matrix of the testing phase of the case 5 model classifier.

It can be seen that the classification accuracy, in other words the recognition rate, increases as the number of classes decreases. This holds for all testing phases in both databases. The results show that almost all of the models give a satisfactory outcome, except the 7-class model of the JAFFE database, which scores the lowest of all the models (indicated in red). The five-class model of the JAFFE database using SVM classifiers provides a good recognition rate (indicated in green). Table 29 shows the classification accuracy, as a percentage, of the different experiments. Table 30 compares the results of the proposed technique with another approach, proposed in [3]. Although the facial databases used are not the same in the two approaches, the comparison is made because both research works use five classes of emotions in an experiment. The results indicate that the proposed approach provides a much greater recognition rate.
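The recognition rates discussed in this section are read from confusion matrices whose rows are the desired classes and whose columns are the predicted classes. The sketch below shows how the percentages of correctly and incorrectly classified instances (CCI and ICI) follow from such a matrix. The three-class matrix is a made-up example, not a result from the paper, and the per-class rate is an extra illustration rather than a quantity the paper reports.

```python
def recognition_rates(matrix):
    """Return (CCI %, ICI %) from a square confusion matrix.

    Rows are the desired classes and columns the predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    cci = 100.0 * correct / total
    return cci, 100.0 - cci

def per_class_recall(matrix):
    """Fraction of each desired class that was classified correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

# Hypothetical 3-class matrix; the counts are not results from the paper.
example = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 0, 10],
]
cci, ici = recognition_rates(example)
print(cci, ici)                    # 80.0 20.0
print(per_class_recall(example))   # [0.8, 0.6, 1.0]
```

A class with a low per-class rate, such as the second row here, is the kind of class the text describes as interfering with others.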

IV. CONCLUSION
From the results of all the case models considered in this research work, several conclusions can be drawn concerning the data samples used and the techniques used for facial feature extraction and emotion classification. Variation among the subjects of the dataset is very important, since it makes the dataset more general. The highest recognition rate of the SVM classifier is 89.5349% for the 5-class model of emotions on the Bosphorus database, which is close to the result of the same model on the JAFFE database, 90%. This demonstrates that this approach to emotion recognition obtains high recognition rates. It is also observed that the fear expression has the greatest interference with other expressions.

V. RECOMMENDATIONS FOR FUTURE WORK
Although the approach proposed in this paper provides promising results for emotion recognition, there are still other areas that can be studied as a step towards further improvement and generalization. Some of the most important ideas for future work are as follows:
1) Using other algorithms for classification, such as Learning Vector Quantization, Quadratic Classifier, Iterative Dichotomiser 3 (ID3) and Nearest Neighbour Classifiers.
2) Considering Multi-label Classification as another method for classification. This method assigns multiple target labels to each of the instances.

Fig. 1. The Flowchart of the Proposed Emotion Recognition System Structure
Initially, the database samples are divided into two sets: a training set and a testing set. The training set is composed of 80% of the original database samples, and the remaining 20% of the samples are reserved for testing. The samples are divided randomly. Pre-processing is performed to prepare images for the feature extraction stage. A set of facial feature points is extracted from the images, and facial features are then derived from these points. Different sets of facial features are used for both training and testing the classifiers. The facial features are applied to an SVM. Detailed descriptions of the five stages are given below:
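The random 80/20 division into training and testing sets can be sketched as below. The text does not say whether the split was stratified by class or which random seed was used, so this simple unstratified shuffle with a fixed seed is an assumption made for illustration.

```python
import random

def split_samples(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of `samples` and cut it into training and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed only for repeatability
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# 213 is the number of JAFFE images reported in the text; integers stand in for images.
train, test = split_samples(range(213))
print(len(train), len(test))  # 170 43
```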

Fig. 2. Seven Emotional Expressions of the Bosphorus Database

Fig. 5. Facial Features Derived from Facial Feature Points (Image Credit: Deepam Pathak)

TABLE I. FEATURES OF EYE REGION
Feature   Description
F3        Distance between the top of the right eye and the middle of the right eyebrow
F4        Distance between the top of the left eye and the middle of the left eyebrow
F5        Right eye width
F6        Right eye height
F7        Left eye width
F8        Left eye height

Table 8 shows the confusion matrix of the testing phase of the case 1 model. It appears that the Neutral class has the most incorrectly classified instances. The low classification accuracy of the Neutral class is due to its confusion with the Fear class and the Sadness class, since it has the most interference with these two classes.

Table 2 shows the features extracted from the mouth region.

TABLE II. FEATURES OF MOUTH REGION

Table 3 shows the features extracted from the auxiliary regions.

TABLE III. FEATURES OF THE AUXILIARY REGIONS

TABLE IV. CODE VALUE OF EACH EXPRESSION

Table 6 shows the results of the case 1 model classifier.

TABLE VI. CASE 1 - RESULTS OF TRAINING AND TESTING PHASE

Table 7 shows the confusion matrix of the training phase of the case 1 model. The results show that all the instances of all classes are correctly classified.

TABLE VII. CASE 1 - CONFUSION MATRIX OF TRAINING PHASE

TABLE VIII. CASE 1 - CONFUSION MATRIX OF TESTING PHASE

Table 10 shows the results of the case 2 model classifier.

TABLE X. CASE 2 - RESULTS OF TRAINING AND TESTING PHASE

TABLE XI. CASE 2 - CONFUSION MATRIX OF TRAINING PHASE

TABLE XII. CASE 2 - CONFUSION MATRIX OF TESTING PHASE

TABLE XIII. CASE 3 - CLASSIFIER PARAMETERS


TABLE XIV. CASE 3 - RESULTS OF TRAINING AND TESTING PHASE

Table 15 shows the confusion matrix of the training phase of the case 3 model classifier.

TABLE XV. CASE 3 - CONFUSION MATRIX OF TRAINING PHASE

TABLE XVI. CASE 3 - CONFUSION MATRIX OF TESTING PHASE

Table 18 shows the results of the case 4 classifier model.

TABLE XVIII. CASE 4 - RESULTS OF TRAINING AND TESTING PHASE

TABLE XIX. CASE 4 - CONFUSION MATRIX OF TRAINING PHASE

Table 20 shows the confusion matrix of the testing phase of the case 4 model. We can see that none of the instances of the Sadness class are correctly classified.

TABLE XX. CASE 4 - CONFUSION MATRIX OF TESTING PHASE

In case 5, the Fear class is removed from the data set for the reasons given in case 2. Table 21 shows the classifier parameters of the case 5 model.

TABLE XXII. CASE 5 - RESULTS OF TRAINING AND TESTING PHASE

TABLE XXIII. CASE 5 - CONFUSION MATRIX OF TRAINING PHASE

TABLE XXIV. CASE 5 - CONFUSION MATRIX OF TESTING PHASE

According to the case 5 results, the Sadness class has the lowest recognition accuracy. Therefore, the Sadness class is removed from the data set together with the Fear class. Table 25 shows the classifier parameters of the case 6 model.

TABLE XXV. CASE 6 - CLASSIFIER PARAMETERS

Table 26 shows the results of the case 6 model classifier.

TABLE XXVI. CASE 6 - RESULTS OF TRAINING AND TESTING PHASE

TABLE XXVII. CASE 6 - CONFUSION MATRIX OF TRAINING PHASE

TABLE XXVIII. CASE 6 - CONFUSION MATRIX OF TESTING PHASE

TABLE XXX. PERCENTAGE OF CLASSIFICATION ACCURACY OF OUR PROPOSED TECHNIQUE COMPARED TO A SIMILAR APPROACH