Hybrid Technique for Human Face Emotion Detection

Abstract: This paper presents a novel approach for detecting emotions in a highly corrupted noisy environment by cascading Mutation Bacteria Foraging Optimization (MBFO) and an Adaptive Median Filter (AMF). The approach first removes noise from the image with the combination of MBFO and AMF and then detects local, global and statistical features from the image. The Bacterial Foraging Optimization Algorithm (BFOA) is currently gaining popularity among researchers for its effectiveness in solving certain difficult real-world optimization problems. Our results so far show the approach to have a promising success rate. An automatic system for the recognition of facial expressions is based on a representation of the expression, learned from a training set of pre-selected meaningful features. In reality, however, noise embedded in an image affects the performance of face recognition algorithms. As a first step, we investigate emotionally intelligent computers that can perceive human emotions. In this paper four classes, namely anger, fear, happiness and neutral, are tested on database images corrupted by salt and pepper noise. A very high recognition rate has been achieved for all emotions along with neutral on the training dataset as well as on a user-defined dataset. The proposed method uses the cascade of MBFO and AMF for noise removal and neural networks for emotion classification.

I. FACIAL EXPRESSION RECOGNITION
Biometrics is the science and technology of recording and authenticating identity using physiological or behavioral characteristics of the subject. A biometric is a measurable characteristic, whether physiological or behavioral, of a living organism that can be used to differentiate that organism as an individual. Biometric data is captured when the user attempts to be authenticated by the system. This data is used by the biometric system for real-time comparison against biometric samples. The identity of an individual may be viewed as the information associated with that person in a particular identity management system.

Automatic recognition of facial expressions may act as a component of natural human-machine interfaces. Such interfaces would enable the automated provision of services that require a good appreciation of the emotional state of the service user, as would be the case in transactions that involve negotiation [1], for example. Some robots can also benefit from the ability to recognize expressions.
Noise is introduced as unwanted variations in the image when the image is transmitted over a network [22]. It leads to wrong conclusions in the identification of images in authentication and in pattern recognition. Noise must therefore be removed from the image before features are detected. Noise in imaging systems is usually either additive or multiplicative. In practice these basic types can be further classified into various forms such as amplifier noise or Gaussian noise, impulsive noise or salt and pepper noise, quantization noise, shot noise, film grain noise and non-isotropic noise. However, in our experiments, we have considered only salt and pepper impulsive noise.

II. RELATED WORK
Automated analysis of facial expressions for behavioral science or medicine is another possible application domain. From the viewpoint of automatic recognition, a facial expression can be considered to consist of deformations of facial components and their spatial relations, or changes in the pigmentation of the face [1].
There is a vast body of literature on emotions. Recent discoveries suggest that emotions are intricately linked to other functions such as attention, perception, memory, decision making, and learning. This suggests that it may be beneficial for computers to recognize the human user's emotions and other related cognitive states and expressions. Ekman and Friesen [1] developed the Facial Action Coding System (FACS) to code facial expressions, where movements on the face are described by a set of action units (AUs). Ekman's work inspired many researchers to analyze facial expressions by means of image and video processing.
The Active Appearance Model (AAM) approach is used in facial feature tracking because of its ability to detect the desired features as the warped texture in each iteration of an AAM search approaches the fitted image. Ahlberg [7] uses AAM in his work. In addition, Active Shape Models (ASMs), which are the predecessors of AAMs and use only shape information and the intensity values along the profiles perpendicular to the shape surface, are also used to extract features, as in the work of Votsis et al. [9].
http://ijacsa.thesai.org/
Many algorithms have been developed to remove salt and pepper noise from document images, with differing performance in removing noise and retaining fine image detail. Simard and Malvar [10] show that image noise can originate in film grain, in electronic noise in the input device (scanner, digital camera, sensor and circuitry), or in the unavoidable shot noise of an ideal photon detector. Beaurepaire et al. [2] state that identifying the nature of the noise is an important part of determining the type of filtering needed to rectify the noisy image. Noise Models from Wikipedia [11] shows that the noise in imaging systems is usually either additive or multiplicative. Image Noise [12] shows that in practice these basic types can be further classified into various forms such as amplifier noise or Gaussian noise, impulsive noise or salt and pepper noise, quantization noise, shot noise, film grain noise and non-isotropic noise. Al-Khaffaf [13] proposes several noise removal filtering algorithms. Most of them assume certain statistical parameters and knowledge of the noise type a priori, which does not hold in practical cases.
Prof. K. M. Passino [8] proposed an optimization technique known as the Bacterial Foraging Optimization Algorithm (BFOA), based on the foraging strategies of E. coli bacterium cells. To date there have been a few successful applications of the algorithm in optimal control engineering, harmonic estimation [15], transmission loss reduction [16], machine learning [14] and so on. Its performance is also heavily affected by the growth of search space dimensionality. Kim et al. [17] proposed a hybrid approach involving GA and BFOA for function optimization. Biswas et al. [18] proposed a hybrid optimization technique that synergistically couples BFOA with PSO.
To the best of our knowledge, this is the first time this hybridized technique has been used for the detection of emotions in a noisy environment. However, in our experiments, we have considered only salt and pepper impulsive noise.

Organization of the Paper
The rest of the paper is organized as follows. Section 3 describes the general setup of the proposed technique. Section 4 discusses the experimental results in terms of the percentage of correct recognition on the JAFFE facial image database under salt and pepper noise, and ends with a comparative analysis of the proposed technique against existing techniques. Section 5 concludes the paper with some closing remarks and future scope.

III. THE GENERAL SET UP
The design and implementation of the Facial Expression Recognition System can be subdivided into three main parts:
1. The first part is image pre-processing.
2. The second part is the recognition technique, which includes training on the images.
3. The third part is testing, followed by the result of the classification of images.

A. Image Preprocessing
The image pre-processing part consists of acquisition of the noisy image, filtering, feature extraction, region of interest clipping and quality enhancement of the image. This part uses several image-processing techniques. First, the noisy face image is acquired from a scanner or from the JAFFE database by introducing noise into the images. The adaptive median filter is then used to remove noise from the image, after which the Mutation Bacteria Foraging Optimization technique removes the noise that remains after the median filter. Finally, features are extracted. The regions of interest are the eyes, the lips and the mouth (eyes and lips), which are selected independently with the mouse for feature extraction. Statistical analysis is also performed: the mean, median and standard deviation of the noisy frame, restored frame, cropped frame and enhanced frame are calculated. The mean and standard deviation should be lower in the enhanced frame than in the noisy frame. The extracted features of the image are then fed into the Back-Propagation and Radial Basis Neural Network for training, and emotions are then detected.
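The statistical features mentioned above (mean, median and standard deviation of a frame) can be illustrated with a minimal Python sketch. The function name `frame_statistics`, the use of the population standard deviation and the two tiny 3 x 3 frames are assumptions made only for illustration; they are not taken from the paper.

```python
import statistics


def frame_statistics(frame):
    """Return mean, median and standard deviation of a 2-D grayscale frame.

    `frame` is a list of rows of pixel intensities. The population standard
    deviation is used here; the paper does not say which variant it uses.
    """
    pixels = [p for row in frame for p in row]
    return {
        "mean": statistics.mean(pixels),
        "median": statistics.median(pixels),
        "std": statistics.pstdev(pixels),
    }


# Hypothetical frames: a noisy frame with salt (255) and pepper (0) impulses
# and a restored frame in which the impulses have been replaced.
noisy = [[0, 120, 255], [118, 255, 121], [0, 119, 122]]
restored = [[119, 120, 121], [118, 120, 121], [119, 119, 122]]

noisy_stats = frame_statistics(noisy)
restored_stats = frame_statistics(restored)
print(noisy_stats)
print(restored_stats)
```

In this toy case the restored frame has both a lower mean and a much lower standard deviation than the noisy frame, which is the direction of change the text expects for an enhanced frame.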

B. Training of Neural Network
The second part consists of the artificial intelligence, which is composed of a Back Propagation Neural Network and a Radial Basis Neural Network. The neurons are trained first and testing is done afterwards. Back-propagation and RBF algorithms are used in this part. It consists of two layers. From the input to the hidden layer, a Back-propagation Neural Network is used, which consists of feed-forward and feed-backward stages; from the hidden to the output layer, an RBF Neural Network is used for classification of expressions.
As the recognition machine of the system, a three-layer neural network has been used. It is trained several times by trial and error on various ideal and noisy input images, which forces the network to learn how to deal with noise. The window size is 9. The window size can be increased, but computation time increases with it. The noise density is varied from 0.05 to 0.9.

The combination of the adaptive median filter and Mutation BFO removes noise of density up to 90% from the image with higher accuracy, and the learning parameter ranges from 0.1 to 0.9. The main accuracy goal is 0.01, which requires more computation time. For a single image a total of 10 iterations is needed to reach zero error, but for 21 images the error reaches zero after a total of 15000 epochs, because the images differ and show different types of emotions, so more iterations are needed.

C. Testing
The third phase consists of testing the expressions, which shows the percentage of accurate results and the classification result for each expression. The technique has three particular advantages. First, the expression is recognized even when the noise density in the image is as high as 0.9; the density range is taken from 0.05 to 0.9. Second, using statistical features such as mean, median and standard deviation makes classification easier and recognition of the facial expression more accurate. Third, the image is enhanced twice: first when noise is removed by the Adaptive Median Filter and Mutation BFO, and second by histogram equalization. The flowchart and implementation overview of the Facial Recognition System with the proposed method are shown in Figure 1 and Figure 2.

A. Input Noisy Image
The first module is the input phase. A face image corrupted with salt and pepper noise is passed to this module as the input to the system. The input image samples are taken from the JAFFE database. The input image is picked at random from the database used for training and evaluated for recognition accuracy.
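As an illustration of how such a noisy input can be produced, the following Python sketch corrupts a grayscale image with salt and pepper noise of a given density. The function name `add_salt_and_pepper`, the equal split between salt (255) and pepper (0) and the seeded random generator are assumptions for this example; the paper does not specify how the noise was generated.

```python
import random


def add_salt_and_pepper(image, density, seed=None):
    """Return a copy of `image` with salt and pepper noise.

    Each pixel is replaced with probability `density`; a replaced pixel
    becomes 255 (salt) or 0 (pepper) with equal probability (an assumption).
    """
    rng = random.Random(seed)
    noisy = [row[:] for row in image]  # leave the input untouched
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < density:
                row[x] = 255 if rng.random() < 0.5 else 0
    return noisy


# Hypothetical 20 x 20 mid-gray image corrupted at the two extremes and at
# the highest density used in the experiments (0.9).
clean = [[128] * 20 for _ in range(20)]
untouched = add_salt_and_pepper(clean, 0.0, seed=1)
heavy = add_salt_and_pepper(clean, 0.9, seed=1)
saturated = add_salt_and_pepper(clean, 1.0, seed=1)
```

A density of 0.05 to 0.9 in this sketch corresponds to the fraction of pixels that are overwritten, which is how the noise density range in the text is interpreted here.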

B. Adaptive Median Filter
A median filter is an example of a non-linear filter [20]. It is very good at preserving image detail. To run a median filter:
1. Consider each pixel in the image.
2. Sort the neighboring pixels into order based upon their intensities.
3. Replace the original value of the pixel with the median value from the list.
Algorithm of Adaptive Median Filter
1. Initialize w = 3 and Zm = 39.
2. Compute Zmin, Zmax and Zmed.
3. If Zmin < Zmed < Zmax, then go to step 5. Otherwise w = w + 2.
4. If w ≤ Zm, go to step 2. Otherwise, replace Zxy by Zmed.
5. If Zmin < Zxy < Zmax, then Zxy is not a noisy pixel; else replace Zxy by Zmed.
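The adaptive median filter algorithm can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `adaptive_median_filter`, the clipping of windows at the image border and the choice of the upper median for even-sized windows are assumptions that the paper does not specify.

```python
def adaptive_median_filter(image, max_window=39):
    """Apply the adaptive median filter to a 2-D grayscale image.

    `max_window` plays the role of Zm, the maximum allowed window size.
    Windows are clipped at the image border (an assumption).
    """
    height, width = len(image), len(image[0])
    output = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            w = 3  # step 1: initial window size
            while True:
                half = w // 2
                window = sorted(
                    image[yy][xx]
                    for yy in range(max(0, y - half), min(height, y + half + 1))
                    for xx in range(max(0, x - half), min(width, x + half + 1))
                )
                # step 2: window statistics
                z_min, z_max = window[0], window[-1]
                z_med = window[len(window) // 2]
                z_xy = image[y][x]
                if z_min < z_med < z_max:
                    # step 5: keep the pixel unless it is an extreme value
                    if not (z_min < z_xy < z_max):
                        output[y][x] = z_med
                    break
                w += 2  # step 3: the median is itself an extreme, enlarge
                if w > max_window:
                    # step 4: window limit reached, fall back to the median
                    output[y][x] = z_med
                    break
    return output


# Hypothetical 5 x 5 flat image with one salt and one pepper impulse.
img = [[100] * 5 for _ in range(5)]
img[2][2] = 255
img[0][4] = 0
out = adaptive_median_filter(img)
```

On this toy input both impulses are replaced by the local median and the remaining pixels are left unchanged.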

C. Mutation Bacteria Foraging Optimization
During the first stage, the input image corrupted with salt-and-pepper noise of densities varying from 0.05 to 0.9 is applied to the adaptive median filter. In the second stage, both the noisy image and the adaptive median filter output image are passed as search space variables to the BFO technique [20] to minimize the error due to differences between the filtered image and the noisy image.
Bacterial Foraging Optimization with a fixed step size suffers from two main problems:
I. If the step size is very small, many generations are required to reach the optimum solution, and the global optimum may not be reached in a small number of iterations.
II. If the step size is very large, the bacterium reaches the optimum value quickly, but the accuracy of the optimum value is low.
In BFO, the chemotaxis step provides a basis for local search, the reproduction process speeds up convergence, and elimination and dispersal help to avoid premature convergence.
To obtain an adaptive step size, increase speed and avoid premature convergence, mutation by PSO is used in BFO instead of the elimination and dispersal event, according to equation 1.
Step 2: Chemotaxis loop: j = j+1
a) For i = 1, 2, …, S, take a chemotaxis step for bacterium i as follows.
b) Compute fitness function J(i, j, k, l).
c) Let Jlast = J(i, j, k, l) to save this value, since we may find a better cost via a run.
d) Tumble: Generate a random vector
Else, let m = Ns. This is the end of the while statement.
In this case, continue chemotaxis, since the life of the bacteria is not over.
Step 4: Reproduction:
a) For the given k and l, and for each i = 1, 2, …, S, let
$$J^{i}_{health} = \sum_{j=1}^{N_c+1} J(i, j, k)$$
be the health of bacterium i. Sort bacteria and chemotaxis parameters C(i) in order of ascending cost Jhealth (higher cost means lower health).
b) The Sr = S/2 bacteria with the highest Jhealth values die and the other Sr = S/2 bacteria with the best values split.
Step 5 (new step): Mutation. For i = 1, 2, …, S, with probability Pm, change the bacteria position by pfPSO.
Step 6: If k < Nre, go to step 2. We have not reached the specified number of reproduction steps. Therefore, we have to start the next generation in the chemotaxis loop.

D. Pre-Processing and Feature Extraction
In this phase the face image passed is transformed to an operationally compatible format: the face image is resized to uniform dimensions, the data type of the image sample is converted to double precision, and the result is passed on for feature extraction.
As the first step in image processing, the region of interest (ROI), which is either a lip and an eye region together, only the lips region or only the eyes region, is selected independently in the acquired images with the mouse. The ROI image is converted into a grayscale image.
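The pre-processing operations described here (ROI cropping, grayscale conversion, resizing to a uniform dimension and conversion to double precision) can be sketched in Python as below. All function names are hypothetical, the rectangle coordinates stand in for the mouse selection, the luminance weights 0.299, 0.587 and 0.114 are a common convention rather than something the paper states, and the 25 x 25 target size is only a guess motivated by the 625 input neurons mentioned for the neural network.

```python
def crop_roi(image, top, left, height, width):
    """Cut a rectangular region of interest out of a 2-D image."""
    return [row[left:left + width] for row in image[top:top + height]]


def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to grayscale intensities.

    Uses the common luminance weights (an assumption, not from the paper).
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def resize_nearest(image, new_height, new_width):
    """Resize by nearest-neighbour sampling and return double-precision values."""
    old_height, old_width = len(image), len(image[0])
    return [[float(image[y * old_height // new_height][x * old_width // new_width])
             for x in range(new_width)]
            for y in range(new_height)]


# Hypothetical 6 x 8 colour image whose intensity grows from left to right.
rgb = [[(x * 10, x * 10, x * 10) for x in range(8)] for _ in range(6)]
roi = crop_roi(rgb, top=1, left=2, height=4, width=4)  # stands in for the mouse selection
gray = to_grayscale(roi)
fixed = resize_nearest(gray, 25, 25)  # 25 * 25 = 625 values
```

Nearest-neighbour sampling is used only to keep the sketch short; any interpolation method would serve the same purpose of producing a fixed-size feature vector.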

E. Histogram Equalization
A histogram equalization method has been applied before obtaining the filtered grayscale image. Histogram equalization improves the contrast of the grayscale image, and its goal is to obtain a uniform histogram. The method also redistributes the intensities of the image. New intensities are not introduced into the image. Existing intensity values are mapped to new values, but the number of distinct intensity values in the resulting image is equal to or less than the original number. In the image sequence, the histogram-equalized image is filtered using average and median filters to make the image smoother. The histogram-equalized image is then split into lip ROI and eye ROI regions, and the regions are cropped from the full image. This addresses the problem of light intensity variations.
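The mapping performed by histogram equalization can be illustrated with a short Python sketch based on the cumulative histogram. The function name `equalize_histogram` and the 8-bit intensity range are assumptions for illustration; the paper does not give its exact formulation.

```python
def equalize_histogram(image, levels=256):
    """Equalize a 2-D grayscale image with integer intensities in [0, levels)."""
    pixels = [p for row in image for p in row]
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    # Cumulative distribution of the intensities.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]

    # Map every existing intensity through the normalized cumulative histogram.
    lookup = [max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
              for c in cdf]
    return [[lookup[p] for p in row] for row in image]


# Hypothetical low-contrast 2 x 2 image.
dull = [[50, 50], [100, 150]]
stretched = equalize_histogram(dull)
```

Because every input intensity is mapped to exactly one output intensity, the number of distinct values cannot grow, which matches the remark in the text that the resulting count is equal to or less than the original.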

F. Modified Back Propagation and Radial Basis Neural Network
The neurons are trained by trial and error. A total of 625 input neurons are used, with 75 hidden neurons and 4 output neurons. A total of 13 pairs are trained with the different emotions of happiness, anger, fear and neutral. In this phase, two parameters, the total number of epochs and the error, are calculated for the particular face region. Higher accuracy requires more computation time; the main accuracy goal is 0.01. Two types of neural networks were trained based on the extracted input parameters:
(i) Back Propagation Neural Network
(ii) Radial Basis Neural Network

(i) Back Propagation Neural Network
The most widely used neural network training method is the Back Propagation algorithm. It is applied from the input to the hidden layer because of its relative simplicity and its universal approximation capacity. The learning algorithm is performed in two stages: feed-forward and feed-backward.
In the first phase the inputs are propagated through the layers of processing elements, generating an output pattern in response to the input pattern presented. In the second phase, the errors calculated in the output layer are back propagated to the hidden layers, where the synaptic weights are updated to reduce the error. This learning process is repeated until the output error value, for all patterns in the training set, is below a specified value.
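The two phases can be sketched with a small fully connected network in Python. This is a generic back-propagation illustration under stated assumptions, not the paper's 625-75-4 configuration: the sigmoid activation, the per-sample weight updates, the learning rate of 0.5, the function names and the toy one-hot patterns are all chosen only to keep the example short. The `goal` parameter mirrors the error goal of 0.01 mentioned in the text.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_backprop(samples, n_hidden, rate=0.5, goal=0.01, max_epochs=2000, seed=0):
    """Train a one-hidden-layer network and return weights plus error history."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    # Each weight row ends with a bias weight.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_output = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    history = []
    for _ in range(max_epochs):
        squared_error = 0.0
        for inputs, targets in samples:
            # Phase 1: feed-forward.
            x = list(inputs) + [1.0]
            hidden = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
            h = hidden + [1.0]
            outputs = [sigmoid(sum(w * v for w, v in zip(row, h))) for row in w_output]
            # Phase 2: feed-backward, output deltas first, then hidden deltas.
            delta_out = [(t - o) * o * (1.0 - o) for t, o in zip(targets, outputs)]
            delta_hid = [
                hidden[j] * (1.0 - hidden[j])
                * sum(delta_out[k] * w_output[k][j] for k in range(n_out))
                for j in range(n_hidden)
            ]
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    w_output[k][j] += rate * delta_out[k] * h[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hidden[j][i] += rate * delta_hid[j] * x[i]
            squared_error += sum((t - o) ** 2 for t, o in zip(targets, outputs))
        history.append(squared_error / (len(samples) * n_out))
        if history[-1] <= goal:  # stop once the error goal is reached
            break
    return w_hidden, w_output, history


# Hypothetical training set: four one-hot patterns standing in for four classes.
patterns = [([1, 0, 0, 0], [1, 0, 0, 0]), ([0, 1, 0, 0], [0, 1, 0, 0]),
            ([0, 0, 1, 0], [0, 0, 1, 0]), ([0, 0, 0, 1], [0, 0, 0, 1])]
_, _, errors = train_backprop(patterns, n_hidden=3)
```

Plotting `errors` against the epoch index gives the kind of epochs-versus-error curve reported in the results.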
Back Propagation, however, has two major limitations: a very long training process, with problems such as local minima and network paralysis, and the restriction to learning only static input-output mappings. To overcome these restrictions, an RBF Neural Network is used. Table 1 shows the parameters for the Back Propagation Neural Network. The classification system of the neural network consists of three stages.

IV. EXPERIMENTAL RESULTS
The proposed technique is applied to sample images. As the noise level increases, the face images become more affected and are sometimes no longer visible. Hence in our experiments we have considered mean and variance varying from 0.05 to 0.9. To start with, salt and pepper noise with mean and variance equal to 0.05 is applied to all the images of the JAFFE face database to form the probe image set. All the images in the JAFFE database without any added noise are taken as the prototype image set. Hence we get all images in the prototype set. Applying our technique to both sets forms the feature set. In the experimental phase, we take the first image of the first subject from the prototype image set as the query image, and the top ten matching images are found from the set of all probe images. If the top matching images lie in the same row (subject) as the prototype query image, this is treated as a correct recognition. The number of correctly recognized images for each query image in the prototype image set is calculated, and the results are shown for salt and pepper noise of variance 0.8.
For our experiments, facial images from the JAFFE facial image database are used. The database contains 213 images of 7 facial expressions (6 basic facial expressions + 1 neutral) posed by 10 Japanese female models. 60 Japanese subjects have rated each image on 6 emotion adjectives. Figure 3 shows the sample images used in our experiments, collected from the JAFFE face database. In our experiments, we have taken 21 images of a single person, of which 13 pairs are used for training with different emotions. We have used a common type of noise, namely salt and pepper impulsive noise, that affects biometric image processing applications. To show the robustness of our face recognition method, this noise is introduced into the JAFFE database face images.

Figure 3. Sample images from JAFFE database
The results consist of five sections:

(Section 2) Training of Neural Network Results
This part shows the results for training of the neural network. It uses a single frame of one subject, comprising a total of 21 images of different emotions, of which 13 images are trained through the neural network: 3 frames for the angry feature, 4 for the fear feature, 4 for the happy feature and 2 for the neutral or normal feature. Two parameters, epochs and errors, are considered for the detection of expression from the lip, eye and mouth features. Epochs specify the maximum number of iterations to be taken for the detection of a feature.
Errors specify the total error encountered at the given number of epochs. The more epochs there are, the smaller the error. A smaller error gives more accuracy and efficiency in detecting emotions from features. The acceptable range for errors is from 0.1 to 0.9, and the goal is to reduce the error to 0.01 for more accuracy. For more clarity, reports and graphs are shown that include both parameters.

Comparative Analysis
The proposed technique is compared with existing ones [21]. The number of correctly recognized images for each query image in the prototype image set is calculated and the results are shown in table 28. The proposed method, Mutation BFO & AMF, is compared with existing techniques taken from Reference [21], namely CGLPF, PCA, LDA and LPP. Table 13 shows the results for salt and pepper noise of variance from 0.05 to 0.2. The graph illustrates the efficiency of the proposed technique. The proposed technique removes noise of variance up to 0.9, whereas all the other methods remove noise only up to a variance of 0.2. In other words, noise of up to 90% is removed with an accuracy of approximately 90% by the proposed method, so it performs better than the other conventional techniques and shows the high robustness of our algorithm.

V. CONCLUSION
In this work Bacteria Foraging Optimization with mutation is used to remove salt and pepper noise from highly corrupted images, with variance density up to 0.9.
Bacteria Foraging Optimization with a fixed step size requires more computation time and gives less accuracy; with mutation, the speed of BFO is increased and the accuracy, in terms of image quality, is enhanced. This technique can be used as a robust face emotion detection algorithm.
In this work multiple feature options such as face, eyes and lips are used for emotion detection. The global and local features of facial expression images can be independently selected with the mouse for feature extraction.
The Radial Basis Neural Network requires fewer epochs than other neural networks; the proposed method is therefore suitable for identification of emotions in the presence of salt and pepper noise as high as 90%.
Comparative analysis shows that the proposed technique is more efficient in recognizing expressions even in noisier environments.

FUTURE WORK
In future work, the same technique can be used for detection of emotions in the presence of other noise, such as speckle noise, with the adaptive median filter or with a Wiener filter.
Other neural networks that can be trained through optimization techniques can be used to improve the overall performance of the system.
Replacing BFO with tools that have lower computational requirements would improve the computation time. The system can also be built as a graphical system, and a larger number of emotions should be considered to make the algorithm universal.
θ^i(j, k) = Position vector of the i-th bacterium in the j-th chemotaxis step and k-th reproduction step.
θ_global = Best position in the entire search space.
In the pre-processing step of the hybrid soft computing technique, the adaptive median filter is used to identify pixels that are affected by noise and to replace them with the median value, keeping the uncorrupted information as far as possible. The BF-pfPSO follows chemotaxis, swarming, mutation and reproduction steps to obtain the global optimum. The algorithm of BF-pfPSO is presented below.
Step by step algorithm of mutation based BFO
Initialize parameters p, S, Nc, Ns, Nre, Ned, Ped and C(i), i = 1, 2, …, S, where
p = Dimension of search space
S = Number of bacteria in the population
Nc = Number of chemotaxis steps
Ns = Number of swimming steps
Nre = Number of reproduction steps
Pm = Mutation probability
C(i) = Step size taken in the random direction specified by the tumble
θ^i(j, k) = Position vector of the i-th bacterium, in j-th chemotaxis step, in k-th reproduction step and in l-th elimination and dispersal step
Step 1: Reproduction loop: k = k+1

1. Training of Neural Network
2. Testing of Neural Network
3. Performance Evaluation of Neural Network

G. Classification
A selected database of 21 images of 4 classes is considered to demonstrate the capability and accuracy of the recognition stage. The faces presented are the inputs to the training stage, where a representative set of facial features is determined. After training, new images are processed and entered into the recognition stage for identification. Emotions are then detected as angry, happy, fear and normal.


The first section (Section 4.1) consists of the feature extraction stage and enhancement results, including pre-processing results of the lip, eye and mouth features used to recognize the facial expression of a single person. It starts from an image corrupted by salt and pepper noise; the restored image is then obtained by applying the adaptive median filter and mutation bacteria foraging optimization. The cropped region is shown in figures and is clearly expressed in the histograms. Finally, the enhanced region and enhanced histogram clarify the results. The statistical features shown in tables also quantify all results.
The second section (Section 4.2) consists of the results for training of the neural network. These results are presented in tabular form, and the report and graph for all of the features are also shown.
The third section (Section 4.3) consists of the results for testing of frames, which are also shown in tabular form.
The fourth section (Section 4.4) shows the classification results for angry, happy, fear and neutral features.
The fifth section (Section 4.5) shows the comparison of the proposed technique with existing ones.

Figure 8. Graph of Epochs versus Error for Lips

Figure 14. Comparative Analysis through graph.

In the adaptive median filter algorithm, step 4 reads: if w ≤ Zm, go to step 2; otherwise, replace Zxy by Zmed. Step 5 reads: if Zmin < Zxy < Zmax, then Zxy is not a noisy pixel; else replace Zxy by Zmed. Here, for the noisy image:
Zmin = Minimum intensity value.
Zmax = Maximum intensity value.
Zmed = Median of the intensity values.
Zxy = Intensity value at coordinates (x, y).
Zm = Maximum allowed size of the adaptive median filter window.
A median filter is a rank-selection (RS) filter, a particularly harsh member of the family of rank-conditioned rank-selection (RCRS) filters. A much milder member of that family, for example one that selects the closest of the neighboring values when a pixel's value is external in its neighborhood and leaves it unchanged otherwise, is sometimes preferred, especially in photographic applications. Median filters are good at removing salt and pepper noise from an image and cause relatively little blurring of edges, and hence are often used in computer vision applications.

TABLE I
The RBF network is basically composed of three different layers: the input layer, which distributes the input data; one hidden layer, with a radially symmetric activation function, hence the network's name; and one output layer, with a linear activation function. Table 2 shows the parameters for the Radial Basis Neural Network.
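The three-layer structure of an RBF network can be illustrated with the following Python sketch. It is a minimal example under stated assumptions: Gaussian hidden units, centers placed on the training samples, a width `sigma` of 0.5 and a delta-rule update for the linear output layer are all choices made for illustration, and the two-dimensional feature vectors are hypothetical stand-ins for the four classes.

```python
import math


def rbf_activations(x, centers, sigma):
    """Hidden layer: one radially symmetric (Gaussian) unit per center."""
    return [math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2.0 * sigma ** 2))
            for c in centers]


def train_rbf_output(samples, centers, sigma, rate=0.5, epochs=200):
    """Fit the linear output layer with the delta rule (an assumed training rule)."""
    n_out = len(samples[0][1])
    weights = [[0.0] * len(centers) for _ in range(n_out)]
    for _ in range(epochs):
        for x, target in samples:
            phi = rbf_activations(x, centers, sigma)
            for k in range(n_out):
                output = sum(w * p for w, p in zip(weights[k], phi))
                error = target[k] - output
                for j in range(len(centers)):
                    weights[k][j] += rate * error * phi[j]
    return weights


def rbf_predict(x, centers, sigma, weights):
    """Output layer: linear combination of hidden activations, then arg max."""
    phi = rbf_activations(x, centers, sigma)
    outputs = [sum(w * p for w, p in zip(row, phi)) for row in weights]
    return outputs.index(max(outputs))


# Hypothetical 2-D feature vectors, one per class (anger, fear, happiness, neutral).
samples = [([0.0, 0.0], [1, 0, 0, 0]), ([0.0, 1.0], [0, 1, 0, 0]),
           ([1.0, 0.0], [0, 0, 1, 0]), ([1.0, 1.0], [0, 0, 0, 1])]
centers = [x for x, _ in samples]
weights = train_rbf_output(samples, centers, sigma=0.5)
predictions = [rbf_predict(x, centers, 0.5, weights) for x, _ in samples]
```

Only the output weights are trained in this sketch, which is one reason an RBF layer can need fewer epochs than a network trained end to end with back propagation.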

TABLE II. PARAMETERS FOR RADIAL BASIS NEURAL NETWORK

TABLE III. STATISTICAL FEATURES FOR LIPS

B. Preprocessing Results for Eye feature

TABLE V. STATISTICAL FEATURES FOR MOUTH

TABLE VIII. TRAINING OF EYES

Figure 11. Report of Epochs and Errors for Eyes

The proposed technique gives more accuracy as compared to other methods. The existing methods include Combined Global and Local Preserving Features (CGLPF), Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Locality Preserving Projection (LPP).