Multiclass Pattern Recognition of Facial Images using Correlation Filters

Pattern recognition comes naturally to humans, and there are many pattern recognition tasks which humans perform admirably well. However, human pattern recognition cannot compete with machine speed when the number of classes to be recognized becomes very large. In this paper, we analyze the effectiveness of correlation filters for pattern classification problems, using the Distance Classifier Correlation Filter (DCCF) for the classification of facial images. Two essential qualities of a correlation filter are distortion tolerance and discrimination ability. The DCCF transforms the feature space in such a way that images belonging to the same class get closer while images from different classes move farther apart, thereby increasing both the distortion tolerance and the discrimination ability. The results obtained demonstrate the effectiveness of the approach for face recognition applications.

Keywords—Pattern recognition; correlation filter; multiclass recognition


I. INTRODUCTION
There are many daily pattern recognition tasks that come naturally to humans. For example, we can recognize a close friend even after a gap of many years, though his features may have changed considerably, and we can understand a familiar voice even if it is slightly distorted. However, human pattern recognition suffers from three main drawbacks: poor speed, difficulty in scaling, and inability to handle some recognition tasks. Not surprisingly, humans cannot match machines on pattern recognition tasks for which good algorithms exist, and human pattern recognition has limitations when the number of classes to be recognized becomes large. Although humans have evolved to perform well on some recognition tasks such as face or voice recognition, most humans, except for a few trained experts, cannot tell whose fingerprint they are looking at. Thus, there are many interesting pattern recognition tasks for which we need machines.
The main goal of pattern recognition is to assign an observation, which may be a signal, an image, or a high-dimensional object, to one of multiple classes. An important class of pattern recognition applications is the use of biometric signatures such as face, fingerprint, and iris images for person identification [1-3]. The use of two-dimensional (2-D) correlation to detect, locate, and classify targets in observed scenes has been a topic of research for a long time [4-10].
In this paper, we analyze the possibility of using correlation filters to solve multiclass classification problems. The DCCF design uses a global transformation (correlation filter) to transform the feature space so as to decrease the intra-class distance and increase the inter-class distance. Results of experiments conducted on a benchmark dataset demonstrate the robustness of the method. The rest of this paper is organized as follows. Section 2 discusses related work on correlation filters. Section 3 presents the salient features of the DCCF and outlines the strategies adopted to apply it to a multiclass facial recognition problem. Section 4 provides the experimental results, and Section 5 concludes the paper.

II. RELATED WORK
Correlation filters have been widely used for several pattern recognition tasks and for visual tracking of objects [11]. Their advantage for object tracking is that they can track objects that are rotated, occluded, or subject to various photometric and geometric challenges. Pattern recognition of complex, partially occluded objects can also be done efficiently using multiple correlation filters [12][13]. Composite correlation filters [14-17] give superior performance to matched filters because they are designed from multiple reference images: if the reference image set is chosen to incorporate all the distortions an object is likely to undergo, the resulting filter will be distortion tolerant. Three questions need to be considered when designing a correlation filter: (1) how well does the filter suppress clutter and noise; (2) how easy is it to detect a correlation peak; and (3) how tolerant is the filter to distortions of the object.
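The correlation operation underlying all of these filters is typically implemented in the frequency domain via the correlation theorem. The following toy illustration (not from the paper; array names are ours) shows how the correlation peak localizes a pattern in a scene:

```python
import numpy as np

def cross_correlate(scene, template):
    # 2D cross-correlation via the correlation theorem:
    # corr = IFFT( FFT(scene) * conj(FFT(template)) )
    F_scene = np.fft.fft2(scene)
    F_tpl = np.fft.fft2(template, s=scene.shape)  # zero-pad template to scene size
    return np.real(np.fft.ifft2(F_scene * np.conj(F_tpl)))

# A peak in the correlation plane marks where the template sits in the scene.
rng = np.random.default_rng(0)
scene = np.zeros((64, 64))
patch = rng.random((8, 8))
scene[20:28, 30:38] = patch
corr = cross_correlate(scene, patch)
row, col = np.unravel_index(np.argmax(corr), corr.shape)  # expect (20, 30)
```

The questions above then map onto properties of this correlation plane: how sharp the peak is, and how stable its location is under distortions of the pattern.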
One of the earliest composite correlation filters proposed was the Synthetic Discriminant Function (SDF) [18]. The SDF uses a linear combination of reference images to create a composite image. When the designed filter is correlated with a test image, a peak appears in the correlation plane if the test image belongs to the TRUE class to be recognized; for inputs belonging to the FALSE class, there is no peak. For digital pattern recognition applications, an SDF synthesized in the computer is correlated with the test image digitally, whereas for optical pattern recognition applications the digitally designed SDF is converted to a hologram using multiple-exposure holographic techniques.
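The SDF's linear-combination idea can be sketched in a few lines. This is an equal-correlation-peak variant under our own naming (a minimal illustration, not the paper's code): the filter h = Xa is chosen so that each reference image produces a prescribed correlation value at the origin.

```python
import numpy as np

def sdf_filter(refs, u):
    """Equal-correlation-peak SDF sketch: h = X a with the constraint
    X^+ h = u, which gives a = (X^+ X)^(-1) u.
    X holds one vectorized reference image per column."""
    X = np.stack([r.ravel() for r in refs], axis=1).astype(complex)
    a = np.linalg.solve(X.conj().T @ X, u)
    return X @ a

# Every TRUE-class reference then yields the prescribed origin value:
rng = np.random.default_rng(1)
refs = [rng.random((16, 16)) for _ in range(3)]
h = sdf_filter(refs, np.ones(3))                     # prescribe peak value 1
origin_vals = [np.vdot(r.ravel(), h) for r in refs]  # each approximately 1
```

Because only the origin values are constrained, nothing controls the rest of the correlation plane — precisely the limitation the later MACE and MACH designs address.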
Initially, the design of SDFs did not consider noise, and hence the filters were not noise tolerant. The Minimum Variance SDF (MVSDF) [19] was one of the earliest attempts to introduce noise analysis into SDF design by maximizing the noise tolerance of the SDF. The original SDF design considered only the cross-correlation values at the origin and therefore could not ensure that the output had its peak at the origin. Since a shift in the reference images shifted the peak location, the peak location was ambiguous when the reference image shifts were unknown. This problem was addressed by the Minimum Average Correlation Energy (MACE) filter [20], which produces sharp correlation peaks at the origin and is more likely to produce a correlation peak at the corresponding location for a shifted input.
However, it was later realized that by focusing exclusively on the correlation-peak values, one neglects the information in the rest of the correlation plane. The minimum-squared-error synthetic discriminant function minimizes the average squared error between the resulting correlation outputs and a desired correlation plane, and the maximum-average-correlation-height (MACH) filter essentially uses this idea to achieve the correlation shape that yields the smallest squared error. The Distance Classifier Correlation Filter (DCCF) [21] incorporates both ideas: rather than just the peak, the entire correlation plane is considered. Early applications of the DCCF were limited, however, as the approach handled only two classes at a time.

III. MULTICLASS PATTERN RECOGNITION OF FACIAL IMAGES
A facial recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame obtained from a video source. One way to do this is to compare selected facial features from the image against a facial database. DCCFs were first proposed by Mahalanobis et al. [21]. The DCCF uses a correlation filter h, designed from a set of training images, to classify a test image into one of a set of predefined classes. As illustrated in Fig. 1, the DCCF design uses a global transformation (correlation filter) h which transforms the space in such a way that images belonging to the same class get closer and images belonging to different classes move apart. This improves both the distortion tolerance and the discrimination ability of the correlation filter.

A. Formulation of DCCF Filter
Let x_ki denote the 2D Fourier transform of the i-th training image of class k, ordered as a vector, and let X_ki be the diagonal matrix with x_ki as its diagonal elements. Let r_k represent the mean vector of class k and R_k the diagonal matrix with r_k as its diagonal elements. If the transformation h is to make the inter-class distance large, then the distance between the mean correlation peak values of the different classes must be made as large as possible. This is formulated as the measure h^+ M h, where M is given by equation (1) [16]:

M = (1/C) Σ_{k=1..C} (R_k − R̄)^* (R_k − R̄)    (1)

In equation (1), R_k represents the (diagonalized) mean of class k and R̄ is the mean over all C classes, given by equation (2):

R̄ = (1/C) Σ_{k=1..C} R_k    (2)

Simultaneously, the transformation h must make each class compact. The compactness of a class after applying h is measured by the average similarity measure [9], a metric of the similarity between the training images and the mean of their class, given by h^+ S h, where S is the intra-class scatter matrix of equation (3):

S = (1/C) Σ_{k=1..C} (1/N) Σ_{i=1..N} (X_ki − R_k)^* (X_ki − R_k)    (3)

It follows that the correlation filter h should be designed to maximize the ratio of the two measures,

J(h) = (h^+ M h) / (h^+ S h)    (4)

which is maximized by the dominant eigenvector of S^{-1} M.
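The design criterion — maximizing the between-class measure relative to the within-class scatter — can be sketched numerically. The following is a toy illustration under our own naming (not the paper's code), in which the diagonal matrices reduce to per-frequency vectors:

```python
import numpy as np

# Toy frequency-domain data: C classes, N training images each, d frequencies.
rng = np.random.default_rng(2)
C, N, d = 3, 5, 16
X = rng.standard_normal((C, N, d)) + 1j * rng.standard_normal((C, N, d))

R = X.mean(axis=1)      # class mean spectra r_k, shape (C, d)
R_bar = R.mean(axis=0)  # grand mean over all classes

# With diagonal matrices, M and S reduce to per-frequency quantities:
M = np.mean(np.abs(R - R_bar) ** 2, axis=0)               # between-class spread
S = np.mean(np.abs(X - R[:, None, :]) ** 2, axis=(0, 1))  # within-class scatter

# Maximize J(h) = (h^+ M h)/(h^+ S h).  For diagonal M and S the
# generalized eigenproblem decouples per frequency, so the optimum
# concentrates on the frequency with the largest ratio M/S; practical
# designs regularize S or retain several dominant components.
h = np.zeros(d)
h[np.argmax(M / S)] = 1.0
J = (h @ (M * h)) / (h @ (S * h))  # equals the largest per-frequency ratio
```

This is only meant to make the Rayleigh-quotient structure of the criterion concrete; the full design works with the complete training spectra rather than this toy data.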
Once the correlation filter h has been designed from the training image set, a test image is classified by correlating it with h. The distance metric is calculated from two correlation outputs. The first corresponds to the correlation of the filter with the test image which, as shown in Fig. 1, is the transformation of the test image. The second corresponds to the correlation of the filter with the mean vector of class k, i.e. the transformation of class k. The test image is assigned to the class that gives the minimum distance.

B. Classification using the DCCF Filter: Method 1
Let z be the test image, represented as a vector in the frequency domain, and let H be the diagonal matrix with the filter h as its diagonal. The transformed test image, corresponding to the correlation between the test image and the correlation filter, is given by H^* z. The transformed mean vector of class k, corresponding to the correlation between the class mean and the filter, is given by H^* m_k. The distance metric between the transformed test image and the transformed mean vector of class k is given by equation (5):

d_k = ||H^* z − H^* m_k||^2    (5)
The given test image is assigned to the class that gives the minimum value of d_k.

C. Classification using the DCCF Filter: Method 2
The above equation for d_k can be rewritten as in expression (6):

d_k = a + b_k − 2 Re(z^+ h_k)    (6)

In this expression, a = |H^* z|^2 is the energy of the transformed input image, b_k = |H^* m_k|^2 is the energy of the transformed k-th class mean, and h_k = H H^* m_k is the effective filter for class k. This gives us an alternate strategy for classifying a test image using 2D correlation techniques. The cross term 2 Re(z^+ h_k), i.e. the third and fourth terms of expression (6) when expanded, is implemented as a 2D correlation of the test image z with the effective filter h_k of class k; d_k is minimized when this term is maximized, which corresponds to the peak correlation value of z with h_k. Shown in Fig. 2 is the schematic for implementing this method using 2D correlation. The test image is correlated with the effective filter of each class, and the peak value, which is the value at the origin of the correlation plane, is used to compute the distance using expression (6). As discussed earlier, coherent optical processing systems can also be used to implement these techniques.
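A minimal sketch of Method 2 under these definitions (our own naming, not the paper's code; `H` is taken as the filter's 2D frequency response and `class_means` as the per-class mean spectra m_k):

```python
import numpy as np

def method2_distances(z, H, class_means):
    """d_k = a + b_k - 2 Re(z^+ h_k), evaluated in the frequency domain.
    z: test image (2D, spatial domain); H: transform filter spectrum;
    class_means: list of class-mean spectra m_k (2D arrays)."""
    Z = np.fft.fft2(z)
    a = np.sum(np.abs(H * Z) ** 2)             # energy of transformed input
    dists = []
    for Mk in class_means:
        b_k = np.sum(np.abs(H * Mk) ** 2)      # energy of transformed class mean
        h_k = (np.abs(H) ** 2) * Mk            # effective filter for class k
        peak = np.real(np.vdot(Z, h_k))        # correlation value at the origin
        dists.append(a + b_k - 2.0 * peak)
    return np.array(dists)

# The class whose effective filter gives the largest peak (smallest d_k) wins:
# label = int(np.argmin(method2_distances(test_image, H, class_means)))
```

Note that only one scalar per class (the origin value of the correlation) is needed here, which is what makes the optical-correlator implementation attractive.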

D. Facial Recognition using DCCF
In this section we discuss the classification of c different objects, in this case facial images, using the DCCF. Facial images of one person with different facial expressions form the members of one class. The training images are used to design the filter, which is then used to classify a test image. The algorithms for the classification of facial images using the DCCF are given in Table I.

Algorithm for classification of the test image (Method 1)
Step 1: Form the diagonal matrix H whose diagonal consists of the vector h.
Step 2: Calculate the distance metric of the test image vector z to the class-k mean vector m_k as d_k = ||H^* z − H^* m_k||^2 for k = 1 to C.
Step 3: Classify the vector z to the class that gives the minimum value of the distance metric.
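These steps can be sketched as follows (a minimal illustration under our own naming; `H` is the filter's 2D frequency response and `class_means` holds the per-class mean spectra m_k):

```python
import numpy as np

def method1_classify(z, H, class_means):
    """Method 1: compute d_k = ||H^* Z - H^* m_k||^2 for each class k in
    the frequency domain and assign z to the class with the smallest d_k.
    z is the 2D test image; H and the class means are 2D spectra."""
    Z = np.fft.fft2(z)
    d = [np.sum(np.abs(np.conj(H) * (Z - Mk)) ** 2) for Mk in class_means]
    return int(np.argmin(d))
```

As a sanity check, a class's own mean image is at zero distance from its transformed mean and is therefore assigned to that class.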

Algorithm for classification of the test image (Method 2)
Step 1: Calculate h_k and b_k for each k = 1 to C as given in expression (6).
Step 2: Perform the 2D correlation of the test image z with the effective filter h_k of each class.
Step 3: Compute the distance metric d_k for each class k using expression (6).
Step 4: Classify the vector z to the class that gives the minimum value of the distance metric.

IV. EXPERIMENTAL RESULTS
The experiments were conducted on the Essex face database [22], which consists of 3040 facial images of 152 persons, with 20 images available from each class. The subjects were seated at a fixed distance from the camera and asked to speak while a sequence of 20 images was taken; the speech introduced facial expression variation. The resolution of each image is 180 x 200 pixels. The algorithm discussed in Section 3 was used to design a DCCF that can classify any image in the database into one of the 152 classes. The 20 images in each class were divided into two sets: one set, referred to as the training images, was used to design the filter, and the other set, referred to as the test images, was used to test it. Representative images from the database are shown in Fig. 3.
The robustness of the designed filter depends on the size of the training image set and on how well it represents the distortions likely to occur. The larger the training set, the more robust the filter will be. Usually, however, only a limited number of images is available: if most images are used to train the filter, few remain for testing, while too few training images increase the classification error. The training and test sets must therefore be chosen judiciously. Table II gives the results obtained by varying the sizes of the training and test image sets; in each case the total number of images used (training + testing) is kept constant. The filter is generated from the training images, and the remaining images are used for testing. The distance of the transformed test image from the transformed mean image of each class is calculated as described in the previous section, and the test image is classified to the class that gives the minimum value of the distance metric.
The total classification error over all classes, as a percentage of the total number of facial images, is plotted in Fig. 4 against the number of training images as a percentage of the total. It is observed that when more than 40% of the images are used for training, the classification error is significantly low.
Fig. 5 shows the distribution of errors when 95% of the images are used for training: there are only 5 errors out of the 152 test images (one image from each of the 152 classes), and these 5 errors fall in 5 different classes; in other words, the errors are distributed evenly. Fig. 6 shows the distribution of errors when 90% of the dataset is used for training: the 16 errors are distributed over 10 classes, with a maximum of 2 errors per class. Fig. 7 shows the case of 85% training images: the 25 errors are distributed over 13 classes, with a maximum of 3 errors per class. Fig. 8 shows 38 errors over 14 classes when 80% of the images are used for training, with a maximum of 4 and a minimum of 1 error per class.
Figs. 9-15 show the distribution of errors as the proportion of training images decreases from 75% to 45%; the classification errors correspondingly increase from 50 to 175. With 75% training images, the errors fall in 15 classes, with a maximum of 5 and a minimum of 1 error per class; with 45%, the errors fall in 40 classes, with a maximum of 11 and a minimum of 1 error per class. In all these cases the errors are distributed fairly uniformly across classes. Figs. 16-23 show the distribution of errors as the training images decrease from 40% to 5%; the total errors increase from 219 to 2868, falling in 46 and 152 classes, respectively.
In the worst case, when only 5% of the images are used for training, the errors are evenly distributed across all 152 classes, with an average of 19 errors per class. As expected, the error increases as the number of training images decreases, and these errors remain evenly distributed across multiple classes. The experimental results show that the DCCF successfully classifies the facial images of 152 persons from the Essex database. The facial images belong to persons from quite different ethnic backgrounds, with 20 images of each person showing different facial expressions. The DCCF exhibits good discrimination ability in classifying the 152 persons while maintaining good distortion tolerance towards the variety of facial expressions present in each class. The classification error stays below 10% when at least 40% of the available images are used to train the filter.

V. CONCLUSION
In this paper, correlation-based pattern recognition is adopted for the classification of facial images. A good correlation filter must have distortion tolerance and discrimination ability in equal measure. A Distance Classifier Correlation Filter is designed with a global transformation H that maximizes the separation between different classes while minimizing the spread of each class. Experimental results show that DCCFs work well for face recognition applications, provided the training set used to design the filter is large and representative: the training images must constitute at least 40% of the available dataset for the errors to be reasonably low. We also find that as the number of training images decreases, the increase in classification error is distributed evenly across multiple classes. Our future work will study DCCF-based classification for other biometrics such as fingerprint and iris.