Facial Emotion Recognition using Neighborhood Features

We present a new method for recognizing human facial emotions. We first detect faces in images using the well-known cascade classifiers. We then extract a localized regional descriptor (LRD), which represents the features of a face based on regional appearance encoding. The LRD models spatial regional patterns based on the relationships between local areas rather than on raw, unprocessed intensity features alone. To classify facial emotions, we train a multiclass support vector machine (M-SVM) classifier, which recognizes these emotions during the testing stage. The proposed method relies on robust features and is independent of gender and facial skin color. Moreover, it is illumination and orientation invariant. We assessed our method on two benchmark datasets and compared it with four reference methods; it outperformed them on both datasets.
Keywords: Haar features; feature integration; emotion recognition; face detection; localized features; multiclass SVM


I. INTRODUCTION
Classifying emotions into different classes is a field that currently attracts significant attention. The most important part of this field is human emotion classification, which can be described as a chain procedure that recognizes human emotions via facial expressions (shown in Fig. 1), verbal expressions, gestures and body movements, and physiological signal measurements. The importance of people's feelings in research on the latest technology gadgets is well known. Today, the analysis and recognition of human emotions is significant for a wide range of applications, including machine learning based human-computer interaction, online automated tutoring systems, image and video retrieval, smart environments for health care, and automated driver warning systems, as described by Seyedehsamaneh et al. [1]. In addition, facial emotion recognition plays a very important role in helping doctors, psychiatrists, and psychologists identify various mental health conditions.
In the past few decades, researchers from multidisciplinary fields have proposed different approaches to identify emotions from facial features, speech signals, and many other sources. However, owing to its complexity, the problem remains difficult in machine learning, deep learning, computer vision, psychology, and physiology. Research on facial recognition started around the 1980s [41]. Researchers agree that facial expressions are the most influential cue in recognizing human emotion. However, it is difficult to interpret human emotion from facial expression characteristics because of their sensitivity to external noise, for example illumination conditions and dynamic head motion, as noted by Kwang et al. [2]. Moreover, the final results of emotion classification based on facial expressions still need to be improved. Several investigations have found that the clue lies in the backbone of most methods, namely the initial step of face recognition. This was further investigated by Jiankang et al. [49], who found that if a robust technique is used to detect faces, the complexity of the subsequent steps can be reduced substantially and their effectiveness improves significantly. Ray and Mishra [12] investigated EEG signals and, on top of that, considered different techniques to measure the performance of emotion recognition.
To handle these problems, we introduce a robust technique that classifies human facial emotions into various states using facial features in localized regions. The proposed technique does not assume a specific gender or skin color. It is illumination and orientation invariant, which makes it robust to these changes. The technique is characterized by a compact representation of spatial information, as illustrated by Manisha et al. [3], that effectively combines human facial emotion features. We fuse face detection and human facial emotion classification into a single framework. The technique is inspired by the investigation of the local structure of the facial image, with a different way of unifying localized features. It is also motivated by a small computational overhead, which makes it feasible to deploy on handheld devices, for example smart phones and other smart portable devices. The complete process of the proposed technique is outlined in Fig. 2. We detect faces in the images using the well-known Haar features. We then formulate the localized regional descriptor (LRD) and use a multi-class SVM to classify human facial emotions. Our contribution lies in the development of the localized regional descriptor, which we evaluate experimentally from different aspects. In the rest of the paper, we present the literature review in Section II and our proposed method for classifying facial emotions into seven classes in Section III. Results are presented in Section IV, the discussion in Section V, and the conclusion in Section VI.

II. LITERATURE REVIEW
This section reviews the literature. We partition state-of-the-art techniques for human emotion classification into three parts: methods based on speech signals, methods based on physiological signal measurements, and methods based on human facial expressions.
Speech is a complicated one-dimensional signal that carries many details, for instance about the message to be communicated, the speaker, the language, the region, and emotions. Speech processing is therefore a significant field in digital signal processing with many applications, including human-computer interfaces using machine learning techniques, telecommunication between peers, assistive technologies for health care, and security and safety at places where people gather. The acoustic properties of the speech signal serve as features, and the procedure by which data are extracted from the speech signal is called feature extraction, as introduced by Likitha et al. [4], who used the Mel Frequency Cepstral Coefficient (MFCC) method for human emotion recognition from speech signals. Lotfidereshgi et al. [5] introduced an algorithm that uses the speech signal directly as provided by various speech collection devices. Their technique fuses the robustness of the traditional source-filter model of human speech generation with that of the recently presented liquid state machine (LSM), a biologically inspired spiking neural network (SNN). Tzirakis et al. [6] presented a technique based on a Convolutional Neural Network (CNN). The model extracts features from the unprocessed signal and concatenates them as input to a 2-layer Long Short-Term Memory (LSTM) network. When speech emotion comes from multiple sources, the recognition rate decreases because emotional confusion grows. To fix this issue, Sun et al.
[7] presented a speech emotion recognition technique based on a decision tree support vector machine (SVM) algorithm with bottom-up Fisher feature selection. Liu et al. [8] introduced a speech emotion recognition technique based on an enhanced version of the brain emotional learning (BEL) algorithm, which is motivated by the emotional processing of the limbic system in the human brain. The output of the BEL algorithm is affected and improperly adapted by the reinforcement learning rule. Moreover, emotion classification from speech signals suffers from a lack of information and features, because speech signals alone do not provide improved interaction between human and machine. Enhancing the robustness of speech signal information still requires a very large amount of technical work from researchers in the field.
Now we consider a different category, in which emotions are classified using signal measurement procedures. Physiological signals are generated by physiological processes of human beings, e.g., heart rate (electrocardiogram, ECG/EKG), respiratory rate and content (capnogram), skin conductance (electrodermal activity, EDA), muscle current (electromyography, EMG, recorded with various commercially available hardware), and brain electrical activity (electroencephalography, EEG, measured with electrodes on the skull). These signals help in finding the emotions of human beings that arise from various mental and physical activities. For instance, Ferdinando et al.
[9] used Linear Discriminant Analysis (LDA), Neighbourhood Components Analysis (NCA), and Maximally Collapsing Metric Learning (MCML) for supervised reduction of features in human emotion recognition based on ECG signals collected via electrodes. Kanjo et al. [10] presented a technique that removes the need for manual feature extraction by using multiple learning methods, for example a hybrid of a Convolutional Neural Network and a Long Short-Term Memory Recurrent Neural Network (CNN-LSTM), applied to unprocessed sensor data from phones and wearable devices readily available on the market. Nakisa et al. [11] addressed the high dimensionality of EEG signals by presenting an algorithm that efficiently searches for the optimal subset of EEG features, using evolutionary computation (EC) methods. Moreover, taking into account signal pre-processing and emotion classification, their technique divides a huge set of emotions and combines extra features. Ray et al. (2019) introduced a method that uses computational intelligence algorithms, e.g., the discrete wavelet transform and the Bionic Wavelet Transform (BWT), for the evaluation of EEG signals (Ullah et al. [13]). Jirayucharoensak et al. [14] investigated the use of a deep learning network (DLN) to discover unknown feature correlations between input signals from various sources. The DLN is built with a stacked autoencoder using a hierarchical feature learning technique. It is worth mentioning that techniques for human emotion classification based on physiological signal measurements face several issues, as illustrated by Egon et al. [15].
www.ijacsa.thesai.org
These issues include the obtrusiveness of physiological sensors and their unreliability, for example due to movement artifacts, changing bodily position, changing air temperature, and varying humidity. In addition, these signals have many-to-many relationships; that is, multiple physiological signals can partially serve as indicators for multiple conventional biometric features of human emotions. These signals also have varying time windows over which measurements can differ.
We now turn to methods based on facial emotion recognition. Emotion classification based on facial expressions improves the fluency, accuracy, and genuineness of interaction in an environment, especially for human-computer interaction, as illustrated by Rota et al. [16] in their method related to particle groupings. For these reasons, researchers are devoting considerable effort to facial expression based emotion classification, and the literature keeps growing. Jain et al. [17] introduced an algorithm based on Deep Convolutional Neural Networks (DNNs) composed of various layers performing different functions and of deep residual blocks. Wang et al. [18] proposed a technique that uses stationary wavelet entropy to extract robust features and a single hidden layer feed-forward neural network as the classifier for facial expression classification. The Jaya method is used to prevent the training of the classifier from falling into local optima, which would compromise the overall performance. Yan et al. [19] introduced a novel and robust discriminative multi-metric learning approach for facial expression classification in videos.
Orientation feature descriptors from many directions are extracted for each face video to describe facial appearance and motion from dynamic aspects. Multiple metrics are then learned jointly from these extracted features in order to exploit complementary and discriminative information for emotion classification. Sun et al. [20] introduced a multi-channel deep neural network that learns and combines spatial-temporal descriptors for facial expression identification in static frames. The key idea of the algorithm is to compute optical flow from the difference between the peak-expression face frame and the neutral face frame as the temporal data of a specific facial expression, and to use the grey-level peak-expression frame as the spatial data. A Deep Spatial-Temporal feature Fusion neural Network then performs deep feature extraction and fusion on the frames and images. Lopes et al. [21] introduced a robust algorithm for facial expression identification that combines a Convolutional Neural Network with some novel pre-processing steps. Chen et al. [22] proposed a robust method to handle the key challenge of face motion by using Histograms of Oriented Gradients from three orthogonal planes to collect texture features from video data. To exploit variations in facial appearance, a robust geometric feature (Ullah et al. [23]) is derived from a novel transformation of facial landmarks. Recognizing the strengths of facial feature based emotion classification, researchers have turned their attention to such techniques for handheld smart devices, including mobile phones.
To this end, smart phones and smart wrist watches are equipped with different types of sensors, for instance accelerometer, gyroscope, fingerprint sensor, heart rate sensor, and microphone. Alshamsi et al. [24] investigated a method driven by sensor technology and cloud computing for identifying emotion in both speech and facial expression. Hossain et al. [25] introduced a framework that combines the strengths of emotion-aware big data and cloud technology towards 5G. They fused facial and verbal descriptors into a bimodal technique for big data emotion classification. Grünerbl et al. [26] presented a method that uses smartphone sensors to identify depressive and manic mental states and to recognize state changes in people with bipolar disorder. Sneha et al. [27] used the textual content of messages and user typing behaviour to build a model that classifies future instances. Hossain et al. [28] introduced a method in which the Bandlet transform is applied to the face areas and the resulting subband is partitioned into non-overlapping sections. A local binary pattern is then computed for each section. Kruskal-Wallis feature selection is used to choose the most discriminative bins of the fused histograms, which are passed to a Gaussian mixture model based classifier to identify human emotions. Sokolov et al. [29] presented a cross-platform system for human emotion identification. Their system is based on a convolutional neural network and can effectively identify human emotions on the arousal-valence scale. Lee et al. [42] proposed deep networks for context-aware emotion recognition that consider both human facial expression and context data jointly. Mao et al. [43] introduced three HMM-based frameworks and compared them throughout their paper. Han et al.
[44] briefly summarized the ideas, categories, techniques, and applications of transfer learning, and studied the combination of transfer learning and deep learning and its application to speech emotion recognition. Borra et al. [45] presented an attendance system using partial facial recognition. Nhuong et al. [46] proposed a feature extraction algorithm for face recognition. Imen et al. [47] introduced sequence kernels for emotion recognition. Erfana et al. [48] presented a survey of the emotional intelligence of different algorithms in the field.
The literature is very limited because of the challenges of developing a reliable technique with low computational requirements. The aforementioned methods require large computational power, since most of them are based on deep models. They are modelled for very narrow and specific emotions and are not easily extended to other emotional states. Therefore, we propose an efficient method for classifying emotions into a set of different states using facial features. Our method is independent of gender, skin colour, illumination changes, and face orientation. It uses a compact representation of spatial information (Verma et al. [3]) that effectively encodes emotion information. We integrate the strengths of face detection and emotion classification into a unified model. Additionally, our method is driven by low computational complexity, so it can be implemented easily on handheld devices, including smart phones.

III. PROPOSED METHOD
Feature modeling for facial emotion classification has been an active area in image processing and computer vision. The need for fast face modeling for realistic facial recognition and classification has led researchers to develop different model-based methods. The techniques in the literature for facial expression modeling and recognition differ in various aspects, depending on the application, computing efficiency, type of sensors, cost, and required accuracy. Some researchers proposed 3D generic face deformation for smartphone applications, in which a single image is used to adapt the generic model to the face in a video frame captured via smartphone. Different methods can be adapted for extracting facial features from video frames. Some researchers use stereo to model face features using differential geometry. However, this kind of technique requires prior knowledge about the shape of the facial surfaces and their differential geometry to perform accurately. Parallel stereo images can also be used; these rely on manually selected corresponding feature points to compute the rotation and translation matrices that fit the model to the computed feature points. For facial emotion recognition, we model features that are illumination and orientation invariant. For classification, we use a multiclass SVM, which is a powerful and accurate classifier. The multiclass SVM performs well on many problems, including non-linear ones. The classification strengths of the multiclass SVM help our method avoid both overfitting and underfitting. The multiclass SVM performs well even when trained on small samples. Combined with our proposed features, this makes the classifier well suited to different personality traits and highly segmented facial expressions. The multiclass SVM has generalization capability; therefore, our proposed method can handle unseen data.
The generalization capability of our method is determined by the complexity and the training of the multiclass SVM.
Face detection with Haar feature-based cascade classifiers is a well-known face detection model (Aguilar et al. [30] and Viola et al. [31]), valued for its simplicity and robustness. Inspired by this model, we train a cascade function on ground truth faces with their labels. The model requires many positive samples (faces) and negative samples (non-faces) to train the classifier. We then extract Haar features, which resemble convolutional kernels. Each feature is a single value obtained by subtracting the sum of pixels under one rectangle from the sum of pixels under a different rectangle in the video frame under observation. By varying the rectangles, we use different sizes and locations of each kernel to obtain a large number of features. For this purpose, the concept of the integral image is exploited:
$$\Phi(x, y) = \sum_{x' \le x,\; y' \le y} \Gamma(x', y')$$
where Φ is the integral image and Γ(x', y') is the original image. Using a cumulative row sum at each position (x, y), the integral image can be obtained in one pass over the original image. Additionally, we use the AdaBoost model to filter out irrelevant features. To remove irrelevant features, we evaluate every feature on all the training images. For each feature, we find the optimal threshold that separates faces from non-faces. We choose the features with the smallest error rate, since these features best separate faces from non-faces. In the beginning, each image is given an equal weight. After each classification, we increase the weights of the misclassified images and repeat the procedure. We then calculate new error rates and new weights. We found that in each video frame and image, a significant part consists of irrelevant areas. Therefore, if a window does not contain a face, we discard it. To this end, a cascade of classifiers is used, which groups the features into different stages of classifiers and applies them one by one instead of applying all the features to a window. We discard the window if it fails the first stage and do not evaluate the remaining features. If the window passes the first stage, we apply the second stage of features and continue the procedure. A window that passes all stages is a face region.
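The integral image, rectangle features, and stage-wise rejection described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the horizontal two-rectangle layout, and the representation of a stage as a single (feature function, threshold) pair are assumptions made for illustration, whereas in a boosted cascade each stage combines many weighted features.

```python
def integral_image(img):
    """Return phi where phi[y][x] sums img over all rows <= y and columns <= x."""
    height, width = len(img), len(img[0])
    phi = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # cumulative sum of the current row
        for x in range(width):
            row_sum += img[y][x]
            above = phi[y - 1][x] if y > 0 else 0
            phi[y][x] = above + row_sum
    return phi


def rect_sum(phi, top, left, height, width):
    """Sum of the pixels inside a rectangle, using at most four lookups."""
    bottom, right = top + height - 1, left + width - 1
    total = phi[bottom][right]
    if top > 0:
        total -= phi[top - 1][right]
    if left > 0:
        total -= phi[bottom][left - 1]
    if top > 0 and left > 0:
        total += phi[top - 1][left - 1]
    return total


def two_rect_feature(phi, top, left, height, width):
    """Haar-like feature: sum of the left half minus sum of the right half."""
    half = width // 2
    left_sum = rect_sum(phi, top, left, height, half)
    right_sum = rect_sum(phi, top, left + half, height, half)
    return left_sum - right_sum


def passes_cascade(phi, stages):
    """Apply stages in order; reject the window at the first failed stage.

    `stages` is a list of (feature_function, threshold) pairs. This format
    is a simplification chosen for this sketch.
    """
    for feature_fn, threshold in stages:
        if feature_fn(phi) < threshold:
            return False  # rejected early; later stages are never evaluated
    return True


if __name__ == "__main__":
    window = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    phi = integral_image(window)
    print(rect_sum(phi, 1, 1, 2, 2))
    print(two_rect_feature(phi, 0, 0, 4, 4))
```

Because every rectangle sum costs a constant number of lookups regardless of rectangle size, features at many scales and positions can be evaluated cheaply, which is what makes the early-rejection cascade practical.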
We then extract the localized regional descriptor (LRD), which represents the features of a face based on localized appearance encoding. The LRD models different patterns based on the relationships between local areas rather than on intensity information alone. For appearance information, we use localized regions in numerous directions and scales to compute regional patterns. We find the correspondence between localized areas by using the extrema of the appearance magnitudes. To summarize the local structures of the face efficiently, we treat each pixel in turn as the center pixel of the region under observation. For a detected face, given a center pixel Δc and neighboring pixels Δn (n = 1, 2, ..., 8), we compute a pattern number, where M and N are the radius of the neighborhood and the number of neighbors used for the pattern number. After calculating the pattern numbers of the face, a histogram is computed. The relationship between regions in terms of these pixels is used to assign a pattern number, and the histogram represents the face in the form of the LRD. For regional pixels Δn and a center pixel Δc, the LRD is formulated as follows. We find the difference of each region with two other regions, n1 and n2. Based on these two differences, we assign a pattern number to each region. For the central pixel Δc, the LRD is obtained from these numbers, and the histogram of the LRD map is then calculated. The LRD represents robust features, which are obtained by extracting the relationships among local regions considered mutually. The LRD also captures the relationship of the local regions with the central region. In the proposed method, face detection and the LRD are fused because they complement each other through the characteristics they represent individually.
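As a rough illustration of how pattern numbers and their histogram can be computed, the sketch below uses 8 neighbors at radius 1. The exact encoding rules of the LRD are not spelled out in this text, so two assumptions are made and should be read as such: the center-based pattern uses standard local-binary-pattern thresholding (a bit is set when a neighbor is at least as bright as the center), and the neighbor-difference pattern sets a bit when the differences with the two adjacent ring neighbors n1 and n2 have the same sign. All function names are hypothetical.

```python
# Clockwise ring of the 8 neighbours at radius 1, starting at the top-left.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def center_pattern(img, y, x):
    """Pattern number from comparing each neighbour with the centre pixel.

    Assumption: standard LBP thresholding (neighbour >= centre sets the bit).
    """
    center = img[y][x]
    code = 0
    for n, (dy, dx) in enumerate(RING):
        if img[y + dy][x + dx] >= center:
            code |= 1 << n
    return code


def neighbor_difference_pattern(img, y, x):
    """Pattern number from differences between each neighbour and two others.

    Assumption: n1 and n2 are the next and previous neighbours on the ring,
    and the bit is set when both differences share the same sign.
    """
    ring = [img[y + dy][x + dx] for dy, dx in RING]
    code = 0
    for n in range(8):
        k1 = ring[(n + 1) % 8] - ring[n]
        k2 = ring[(n - 1) % 8] - ring[n]
        if (k1 >= 0) == (k2 >= 0):
            code |= 1 << n
    return code


def pattern_histogram(img, pattern_fn):
    """256-bin histogram of pattern numbers over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[pattern_fn(img, y, x)] += 1
    return hist


if __name__ == "__main__":
    face = [[1, 2, 3], [8, 0, 4], [7, 6, 5]]  # toy 3x3 "face" patch
    print(center_pattern(face, 1, 1))
    print(neighbor_difference_pattern(face, 1, 1))
```

A descriptor in this spirit could concatenate the two histograms for each detected face before classification; whether the LRD combines them this way is not stated in the text.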
To classify LRD features into various classes, we use the M-SVM classifier (Liu et al. [32] and Du et al. [33]). The M-SVM consists of different parameter sets, which together form a combination of predictors. The M-SVM classifier takes the input features, classifies them with every set of parameters in the classifier, and outputs the class label that obtains the majority of votes. The classifier is trained with the same parameters on training sets that are produced from the original training set by the bootstrap process. For each training set, the classifier draws the same number of features as in the original set. The features are chosen with replacement, which means that some features are taken more than once and some are ignored. At each iteration of the algorithm, the classifier does not use all the variables to compute the best split, but a random subset of them. A new subset is generated with each set of parameters. The M-SVM classifier does not require a performance estimation process, such as cross-validation or bootstrap, or a separate test set to approximate the training error; the error is calculated internally during training. In machine learning, SVMs are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Given a set of training samples, each labeled as belonging to one of two categories, an SVM classifier builds a model that assigns new samples to one class or the other, making it a non-probabilistic binary linear classifier. An SVM model represents the samples as points in space, mapped so that the samples of the separate classes are divided by a clear gap that is as wide as possible. New samples are then mapped into the same space and assigned to a class based on the side of the gap on which they fall.
The M-SVM can also perform non-linear classification efficiently using the so-called kernel trick, which implicitly maps the inputs into high-dimensional feature spaces.
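The majority-vote decision and the kernel trick mentioned above can be sketched as follows. This is an illustrative toy, not a trained model: the three classes, the single hand-picked "support vector" per class with unit coefficient and zero bias, the RBF kernel, and the value of gamma are all assumptions. Each pair of classes gets a binary decision function, and the class that collects the most pairwise votes is returned.

```python
import math
from itertools import combinations


def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel: an inner product in an implicit high-dimensional space."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


# Hypothetical stand-in for a trained model: one support vector per class.
SUPPORT_VECTORS = {0: (0.0, 0.0), 1: (4.0, 0.0), 2: (0.0, 4.0)}


def pairwise_decision(x, class_a, class_b):
    """Binary decision value: positive favours class_a, negative class_b."""
    return (rbf_kernel(SUPPORT_VECTORS[class_a], x)
            - rbf_kernel(SUPPORT_VECTORS[class_b], x))


def predict(x):
    """One-vs-one voting: every class pair votes, the majority label wins."""
    votes = {label: 0 for label in SUPPORT_VECTORS}
    for a, b in combinations(sorted(SUPPORT_VECTORS), 2):
        winner = a if pairwise_decision(x, a, b) >= 0 else b
        votes[winner] += 1
    # Ties are broken in favour of the smallest label.
    return max(sorted(votes), key=lambda label: votes[label])


if __name__ == "__main__":
    for point in [(0.5, 0.2), (3.8, 0.3), (0.1, 3.5)]:
        print(point, predict(point))
```

With seven emotion classes, the same scheme would use 21 pairwise classifiers; a one-vs-rest arrangement with seven classifiers is a common alternative, and the text does not say which one is used.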

IV. RESULTS
For experimental evaluation, we use the static facial expressions in the wild (SFEW) 2.0 dataset (Dhall et al. [34] and Dhall et al. [35]) and the real-world affective faces (RAF) dataset (Li et al. [36]). SFEW was collected by selecting frames from AFEW, a collection that is popular in the facial emotion recognition community. The database presents many images and frames with unconstrained facial expressions, varied head poses, various age ranges, different occlusions, varying focus, different face resolutions, and close to real-world illumination in both background and foreground. The selected facial frames are taken from AFEW sequences and labeled according to the label of the sequence. In summary, SFEW consists of many images labeled with six facial expressions (angry, disgust, fear, happy, sad, and surprise) plus the neutral class, and was labeled by two independent participants. Similarly, the real-world affective faces database is a very large facial expression dataset consisting of very diverse facial frames downloaded from the Internet. Using a different annotation technique, each frame was independently labeled by a large number of participants. Frames in this dataset vary greatly in age, gender and ethnicity, head pose, lighting conditions, occlusions, and post-processing operations. The dataset therefore offers large diversity, large quantity, and rich annotations. Additionally, we compare our method with several state-of-the-art methods and report the results in terms of both confusion matrices and total accuracies. We consider seven facial emotion classes: sad, happy, angry, disgust, fear, neutral, and surprise.
We compare our proposed method with four reference methods over the two datasets. These reference methods are the implicit fusion model (Han et al. [37]), the biorthogonal model (Dong et al. [38]), the higher order model (Ali et al. [39]), and the bioinspired model (Vivek et al. [40]). The comparison results are listed in Table I in terms of total accuracy. Our proposed method achieved promising results and performed better than the four reference methods. Our method still has some limitations; for example, we did not exploit geometric features. Our method could be applied to support the diagnosis and treatment of patients with emotional issues.
It is worth noting that the angry facial expression is a tense emotional response that arises when a person feels that his or her personal limits have been violated. People in this emotional state generally stare intensely with eyes wide open, make uncomfortable sounds, bare their teeth, and attempt to appear physically larger. Staring with eyes wide open is a significant cue for computers to distinguish anger from other facial emotions. Other facial cues include V-shaped eyebrows, a wrinkled nose, narrowed eyes, and a forward-thrust jaw. All these elements help to recognize the anger emotion.
In facial expressions, happiness indicates an emotional state of joy. In this state, the forehead muscle relaxes and the eyebrows are pulled up slowly. In addition, the wrinkled outer corners of the eyes and the pulled-up lip corners form a distinctive pattern. The neutral facial emotion relaxes the muscles of the face, whereas all other facial emotions require extensive use of the facial muscles. The other six facial emotions in the datasets are more extreme.
We also provide the confusion matrices for the SFEW and RAF datasets in Table II and Table III, respectively. As can be seen, our proposed method achieves encouraging results across the facial emotions on both datasets.
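For readers who want to reproduce the two reported metrics, the sketch below shows how a confusion matrix and the total accuracy can be computed from true and predicted labels. The label lists are made-up toy values used only to exercise the functions; they are not results from this paper, and the function names are hypothetical.

```python
CLASSES = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]


def confusion_matrix(y_true, y_pred, classes):
    """Rows index the true class, columns index the predicted class."""
    index = {label: i for i, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def total_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = ["happy", "happy", "sad", "angry"]
    y_pred = ["happy", "sad", "sad", "angry"]
    cm = confusion_matrix(y_true, y_pred, CLASSES)
    for label, row in zip(CLASSES, cm):
        print(f"{label:>8}: {row}")
    print("total accuracy:", total_accuracy(cm))
```

Off-diagonal entries show which emotions are confused with one another, which the single total accuracy figure cannot reveal.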
A great diversity of approaches has been proposed to solve the problem of facial emotion recognition. However, most of them are designed for specific emotions, and they analyze different representations of structure and appearance with different models. In this paper, we consider the spatial properties of faces under different emotions. Facial emotions are complex spatial representations with unexpected appearance or spatial patterns. For facial emotion recognition, we propose a novel method in which we compute a localized regional descriptor from the face images. Considering these facial emotions, we design a set of robust features combined into a unified LRD descriptor. To encode the spatial patterns in these faces compactly, we use regional pixels that represent distinctive spatial patterns of faces. Localized regional features are mid-level characteristics that bridge the gap between low-level and high-level features for capturing facial emotions. For classification, we use the M-SVM, which belongs to a set of supervised learning methods used for classification and regression. Given a set of training samples, the SVM classifier builds a model that predicts the class of new, unseen samples. This classifier is prominent in both machine learning and data mining curricula and is frequently used by researchers. Its use spans a wide variety of applied research fields, including but not limited to neuroscience, text categorization, and finance. The effectiveness of M-SVMs in classification tasks across fields such as text or image processing and medical informatics has inspired research on the execution performance and scalability of the training phase of serial versions of the algorithm. Since we describe a facial emotion from the viewpoint of a set of features, our method can be widely exploited in different applications.
Moreover, our modeling does not limit the type of features or the type of scenes, which helps us extend the proposed technique to broader research fields. Experimental results demonstrate that our proposed approach is effective for the detection of various facial emotions.

V. DISCUSSION
We have presented a new method for facial emotion recognition based on image processing and computer vision techniques. Many methods have been proposed previously for the same problem, as discussed in the literature review. However, those methods suffer from various problems, ranging from limited datasets to limited metrics for evaluation. Moreover, our proposed method is invariant to the key challenges mentioned in the introduction. We carried out a detailed experimental analysis on two benchmark datasets that the community considers very challenging for this problem. The localized feature descriptor gives our method the robustness needed to deal with the difficult problem of facial emotion recognition. In the experimental assessment, we used two performance metrics, namely total accuracy and the confusion matrix. Our method showed very promising results on both datasets and both performance metrics. Our work can be further extended with many machine learning and deep learning approaches. However, these advanced learning approaches require a huge amount of data during the training stage. Therefore, we leave this as the next step for future work.

VI. CONCLUSION
We explored a new method for classifying facial emotions into seven different states. For this purpose, we detect faces and extract a localized regional descriptor (LRD) based on the relationships between neighboring regions. To classify facial emotions into seven classes, we train a multi-class SVM classifier, which recognizes these emotions during the testing stage. We evaluated our method on two benchmark datasets and compared it with four reference methods; the results show that our method outperformed them.
In future work, we would like to use additional publicly available datasets and to collect our own datasets in order to obtain a large amount of data. We will then explore a deep learning model for the same problem. A deep learning method is expected to address the weaknesses of our method, including the use of limited datasets and the limited number of human emotions considered for classification.
ACKNOWLEDGMENT
This work was supported by the Deanship of Scientific Research, University of Ha'il, Saudi Arabia [BA-1912].