Dimensionality Reduction technique using Neural Networks – A Survey

A self-organizing map (SOM) is a classical neural network method for dimensionality reduction and belongs to the unsupervised class of learning methods. It is trained with unsupervised learning to produce a low-dimensional, discretized representation of the input space of the training samples, called a map, and it uses a neighborhood function to preserve the topological properties of the input space. SOM operates in two modes: training and mapping. Training builds the map from the input examples, a process also known as vector quantization. In this paper, we first survey related dimensionality reduction methods and then examine their capabilities for face recognition. Principal component analysis [PCA], independent component analysis [ICA] and the self-organizing map [SOM] are selected and applied in order to reduce the loss of classification performance caused by changes in facial expression. The experiments were conducted on the ORL face database, and the results show that SOM is the better-performing technique.

Keywords: Principal component analysis [PCA]; Independent component analysis [ICA]; self-organizing map [SOM]; Face recognition.


I. INTRODUCTION
Biometrics refers to the study of methods for uniquely recognizing humans based upon one or more intrinsic or behavioral characteristics. A biometric system identifies an input sample by comparing it with a stored template, and is used to identify specific people by certain characteristics.
Face recognition is an important part of today's emerging biometrics and video surveillance markets. It can benefit areas such as airport security, access control, driver's licenses, passports, homeland defense, and customs and immigration. Face recognition has been a research area for almost 30 years, with significantly increased research activity since 1990.
This has resulted in successful algorithms and the introduction of commercial products. The benefit of using a computer system for face recognition is its capacity to handle large amounts of data and its ability to perform a job in a predefined, repeatable manner. This leads to methods of dimensionality reduction, which allow data to be represented in a lower-dimensional space. The steps for face recognition in [3] are as follows:

A. Selection and Sampling
Sampling is the selection of the points required to represent the given image. It maps the image from a continuum of points in space to a discrete set.

B. Dimensionality Reduction
In this phase, the dataset, i.e. the set of images, is reduced to a minimum size by sampling. Thus, a reduced data set is obtained.

C. Feature Extraction
This process deals with extracting patterns from the data by using techniques such as classification, regression, segmentation or deviation detection.

D. Classifier
Classification involves mapping data into one of several predefined or newly discovered classes.
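As an illustration of the classification stage, the sketch below assigns each test feature vector the label of its closest training vector. The function name `nearest_neighbor_predict` and the choice of Euclidean distance are assumptions made for this example; the nearest neighbor classifier itself is the one named later in the algorithm.

```python
import numpy as np

def nearest_neighbor_predict(train_features, train_labels, test_features):
    """Assign each test vector the label of its closest training vector."""
    train_features = np.asarray(train_features, dtype=float)
    test_features = np.asarray(test_features, dtype=float)
    train_labels = np.asarray(train_labels)
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    # Euclidean distance is an assumption; the text does not fix the metric.
    diffs = test_features[:, None, :] - train_features[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # For every test vector, pick the label of the nearest training vector.
    return train_labels[np.argmin(dists, axis=1)]
```

In a face recognition setting the rows would be the reduced feature vectors of the face images and the labels would be subject identities.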
In practical situations one is often forced to use linear or even sub-linear techniques. Principal component analysis [PCA] and independent component analysis [ICA] are popular linear techniques, while the self-organizing map [SOM] is a popular nonlinear alternative. Using the SOM as a feature extraction method in face recognition applications is a promising approach: because the learning is unsupervised, no pre-classified image data are needed at all. When highly compressed representations of face images or their parts are formed by the SOM, the final classification procedure is fairly simple and needs only a moderate number of labeled training samples.
In this paper, we introduce face recognition algorithms based on this consideration.

II. PRINCIPAL COMPONENT ANALYSIS
Principal component analysis is a variable reduction procedure. Technically, a principal component can be defined as a linear combination of optimally weighted observed variables.
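A minimal sketch of this variable reduction is given below, assuming the data are arranged with one sample per row and one observed variable per column. The function name `pca` and its return values are illustrative choices, not notation from the paper.

```python
import numpy as np

def pca(data, n_components):
    """Project the rows of `data` (samples x variables) onto the leading principal components."""
    data = np.asarray(data, dtype=float)
    # Centre each variable so the covariance is measured around the mean.
    centered = data - data.mean(axis=0)
    # Covariance matrix of the observed variables (variables x variables).
    cov = np.cov(centered, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]      # weights of each linear combination
    scores = centered @ components      # the principal components themselves
    return scores, components, eigvals[order]
```

Each column of `components` holds the weights of one linear combination of the observed variables, and the returned eigenvalues give the variance that each retained component accounts for.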
PCA is useful for removing redundancy among the data (i.e. some variables are correlated with each other) and for reducing the observed variables to a smaller number of principal components that account for most of the variance in the observed variables.
Here $Y = (y_1, y_2, \ldots, y_n)^T$ is an $n$-dimensional output vector, and $W$ is an $n \times m$ weight matrix. In information-theoretic techniques such as ICA, various objective functions based on information-theoretic concepts, such as negentropy, minimization of mutual information, maximum entropy and maximum likelihood, have been used for the source separation problem. This paper follows the maximum-entropy-based ICA method for face recognition [5]; the corresponding weight update rule is given in [6], where $z$ is the output of the nonlinearity (logistic function) used. ICA has been performed on both architectures (I and II) proposed in [5].
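For orientation, one widely used form of the maximum-entropy (infomax) update with a logistic nonlinearity is written out below. Whether [6] uses exactly this natural-gradient form is an assumption here, and the symbols $x$ (an input vector), $u$ (the output before the nonlinearity) and $\eta$ (a learning rate) are introduced only for this sketch.

```latex
u = W x, \qquad
z = \frac{1}{1 + e^{-u}}, \qquad
\Delta W = \eta \left[ I + (1 - 2z)\, u^{T} \right] W
```

The logistic function is applied element-wise, so $z$ has the same dimension as $u$.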

IV. SELF-ORGANIZING MAP
T. Kohonen introduced the self-organizing map [1]. It is an unsupervised learning process that learns the distribution of a set of patterns without any class information, and it has the property of topology preservation.
SOMs have also been successfully used in dimensionality reduction and feature selection for face space representations.

Algorithm
PCA is applied because it generates a set of orthogonal axes of projection known as principal components or eigenvectors. PCA is applied to the weight matrix generated by mapping the image onto a lower-dimensional space using SOM. Only the eigenvectors for large eigenvalues are considered; those for smaller eigenvalues are ignored. The steps are as follows:
Step 1. A face image of size $m \times m$ is divided into sub-blocks of size $b \times b$, resulting in a total of $p = (m \cdot m)/(b \cdot b)$ blocks, each of which gives $q = b \cdot b$ elements. Concatenating the elements of a block gives a vector representing that block, resulting in a matrix $X = [X_1, X_2, \ldots, X_p]$ of size $q \times p$. This gives a stream of training vectors.
Step 2. Consider a 2-dimensional ($s \times s$) map of neurons, each identified by an index $jk$, with $j, k = 1, 2, \ldots, s$. The $jk$-th neuron has an incoming weight vector $W_{jk} = (w_{1,jk}, \ldots, w_{q,jk})$ at instant $i$. Denote the value of the neighborhood function around the winning neuron at instant $i$ by $h_{jk}$. Initialize the weights $W_{jk}$, the neighborhood $h_{jk}$ and the learning rate $\eta_0$.
Step 3. Pick a sample vector $X_i$ at random and present it to the two-dimensional ($s \times s$) map of neurons, which has a total of $z = s \cdot s$ neurons.
Step 4. Find the best-matching (winning) neuron using a minimum-distance criterion between $X_i$ and the weight vectors, where $W_{jk}$ is the best-matching weight vector.
Step 5. Update the synaptic weight vectors of only the winning cluster: $W_{jk}(i+1) = W_{jk}(i) + \eta_i \, (X(i) - W_{jk}(i))$, $jk \in h_{JK}(i)$.
Step 6. Update the learning rate $\eta_i$ and the neighborhood $h_{jk}(i)$.
Step 7. Continue with Step 3 until no noticeable changes in the feature map are observed. Finally, a matrix $M$ of size $z \times q$ is obtained.
Step 8. Compute the eigenvectors and eigenvalues of the covariance matrix $M^T M$, sort the eigenvectors, and retain those corresponding to the highest eigenvalues.
Step 9. Calculate the KL coefficients ($M^T \cdot$ Eigenvectors) and retain them.
Step 10. Repeat the above steps for all training images.
Step 11. At recognition time, reconstruct the images and match them with the test image using a nearest neighbor classifier.
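Steps 1 to 7 can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the Euclidean distance in Step 4, the linear decay of the learning rate and of the neighborhood radius in Step 6, the initialization from random training vectors, and the fixed iteration count standing in for the convergence check of Step 7 are all choices made for this example, as are the function names `image_to_blocks` and `train_som`.

```python
import numpy as np

def image_to_blocks(image, b):
    """Step 1: split an m x m image into b x b blocks, one column of q = b*b values per block."""
    m = image.shape[0]
    assert image.shape == (m, m) and m % b == 0
    blocks = image.reshape(m // b, b, m // b, b).swapaxes(1, 2)  # (m/b, m/b, b, b)
    return blocks.reshape(-1, b * b).T.astype(float)             # X of shape (q, p)

def train_som(X, s, n_iter=2000, eta0=0.5, seed=0):
    """Steps 2-7: train an s x s map on the columns of X and return M of shape (z, q)."""
    rng = np.random.default_rng(seed)
    q, p = X.shape
    # Step 2: initialise the z = s*s weight vectors from random training vectors (assumption).
    M = X[:, rng.integers(0, p, size=s * s)].T.copy()
    # Grid position (j, k) of every neuron, used to define the neighborhood.
    grid = np.stack(np.meshgrid(np.arange(s), np.arange(s), indexing="ij"), axis=-1).reshape(-1, 2)
    for i in range(n_iter):
        frac = i / n_iter
        eta = eta0 * (1.0 - frac)              # Step 6: decaying learning rate (assumed schedule)
        radius = (s / 2.0) * (1.0 - frac)      # Step 6: shrinking neighborhood (assumed schedule)
        x = X[:, rng.integers(0, p)]           # Step 3: pick a training vector at random
        # Step 4: winning neuron = smallest Euclidean distance (assumed criterion).
        winner = np.argmin(np.sum((M - x) ** 2, axis=1))
        # Step 5: update only the neurons inside the winner's neighborhood.
        in_neighborhood = np.sum((grid - grid[winner]) ** 2, axis=1) <= radius ** 2
        M[in_neighborhood] += eta * (x - M[in_neighborhood])
    # Step 7 is replaced here by a fixed number of iterations.
    return M
```

For example, a 4 x 4 image with b = 2 gives a 4 x 4 matrix X (q = 4, p = 4), and training a 2 x 2 map on it returns a weight matrix M with z = 4 rows and q = 4 columns.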

V. TRAINING AND TEST DATA
For the face recognition experiment, we partitioned the ORL database into a training set and a testing set. The partition is done as follows: first, k images are selected for testing; the remaining (10 - k) images are used as the training set and for computing the projection matrix. All ten images in the training and test sets were projected to a dimension-reduced space.
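The partition can be sketched as below, assuming that the split is made per subject, that the k test images are chosen at random, and that images are indexed so that subject number times ten plus image number gives a flat index. The paper does not state how the k images are chosen, so the random selection and the function name `split_per_subject` are assumptions.

```python
import numpy as np

def split_per_subject(n_subjects=40, images_per_subject=10, k=3, seed=0):
    """Return (train_idx, test_idx) with k test images and the rest for training, per subject."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for subject in range(n_subjects):
        # Shuffle this subject's image numbers; random choice is an assumption.
        order = rng.permutation(images_per_subject)
        base = subject * images_per_subject
        test_idx.extend(base + order[:k])    # k images held out for testing
        train_idx.extend(base + order[k:])   # remaining (10 - k) images for training
    return np.array(train_idx), np.array(test_idx)
```

With 40 subjects and k = 3 this gives 120 test images and 280 training images, with no image appearing in both sets.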

VI. EXPERIMENTATION
Here the eigenvectors of the weight matrix were found. PCA was then applied to the transpose of the weight matrix, and the eigenvectors corresponding to the largest eigenvalues were retained for reconstruction of the image. The table shows the results for PCA, SOM and PCA+SOM.

TABLE II. EXPERIMENTAL OBSERVATIONS
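The eigen-decomposition and reconstruction described here can be sketched as follows, for a SOM weight matrix M of size z x q. The orientation of the projection (M multiplied by the retained eigenvectors) is an assumption chosen so that the matrix shapes agree, no mean-centering is applied because the text works with M^T M directly, and the names `pca_on_som_weights` and `reconstruct` are hypothetical.

```python
import numpy as np

def pca_on_som_weights(M, r):
    """Eigen-decompose M^T M and keep the r eigenvectors with the largest eigenvalues."""
    M = np.asarray(M, dtype=float)
    # M^T M is a symmetric q x q matrix, so eigh is appropriate.
    eigvals, eigvecs = np.linalg.eigh(M.T @ M)
    order = np.argsort(eigvals)[::-1][:r]   # indices of the r largest eigenvalues
    V = eigvecs[:, order]                   # retained eigenvectors, shape (q, r)
    coeffs = M @ V                          # coefficients, shape (z, r); orientation assumed
    return V, coeffs, eigvals[order]

def reconstruct(coeffs, V):
    """Approximate the original weight matrix from the retained coefficients."""
    return coeffs @ V.T
```

Keeping all q eigenvectors reproduces M exactly; keeping fewer gives the closest approximation of M within the span of the retained eigenvectors.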

VII. CONCLUSION
The SOM algorithm is a typical dimensionality reduction technique that preserves topological relationships even in a lower-dimensional space. The algorithm is well suited to use with a k-nearest-neighbor classifier. The experimental results show that our proposed algorithm performs better and faster on a real data set.
An efficient system for face recognition using SOM has been proposed. Firstly, this system provides a general integration of multiple feature sets using multiple self-organizing maps. Secondly, with the help of the compressed feature vector, the SOM is trained to organize all face images in the database.
The highest average recognition rate of 85.5% is obtained on the AT&T database of 400 images of 40 persons, where training is done on only 30 images and testing on the remaining images. Thus, the SOM method is an efficient face recognition process.