Classification of Image Database Using Independent Principal Component Analysis

Abstract—The paper presents a modified approach to Principal Component Analysis (PCA) for automatic classification of an image database. Principal components are the distinctive features of an image, and PCA also retains information about the structure of the data. PCA can be applied to all training images of the different classes together, forming a universal subspace, or to an individual image, forming an object subspace. If, however, PCA is applied independently to the different classes of objects, the principal directions differ from class to class, and these directions can be used to construct a classifier that makes decisions regarding the class. Dimension reduction of the feature vector is also possible. Initially, a training image set is chosen for each class. PCA, using eigenvector decomposition, is applied to each class independently, forming an individual and independent eigenspace for that class. If there are n classes of training images, we obtain n eigenspaces. The dimension of an eigenspace depends upon the number of selected eigenvectors. Each training image is projected onto the corresponding eigenspace, giving its feature vector; thus n sets of training feature vectors are produced. In the testing phase, a new image is projected onto all eigenspaces, forming n feature vectors. These feature vectors are compared with the training feature vectors in the corresponding eigenspace, and the training feature vector nearest to the new image in each eigenspace is found. Classification of the new image is accomplished by comparing, across eigenspaces, the distances between the new image's feature vector and the nearest training feature vector. Two distance criteria, Euclidean and Manhattan distance, are used. The system is tested on the COIL-100 database, and performance is tabulated for different sizes of the training image database.


I. INTRODUCTION
Principal Component Analysis (PCA) is the most popular and the oldest multivariate statistical technique [1]. PCA was invented in 1901 by Karl Pearson [2], who formulated the analysis as "finding lines and planes of closest fit to systems of points in space"; the focus was on geometric optimization. It was later re-invented by Harold Hotelling in 1933 [3], and in image analysis the term Hotelling transformation is often used for a principal component projection. PCA is a way of identifying patterns in data and expressing the data in such a way as to highlight their similarities and differences. Since such patterns can be hard to find in data of high dimension, where the luxury of graphical representation is not available, PCA is a powerful tool for analysing data.
Generally, classification of images is a two-step process: feature vector generation followed by a nearest-neighbor classifier [4][5]. Classification accuracy depends on many factors; one major factor is the extraction of features to represent the image. Feature extraction is a special form of dimensionality reduction, and PCA is a method used to reduce the number of features used to represent the data. The benefits of this dimensionality reduction include a simpler representation of the data, reduced memory requirements, and faster classification. PCA transforms the original variables into a new set of variables that are uncorrelated and ordered such that the first few retain most of the information present in the data [6]. These uncorrelated components are called principal components (PCs) and are estimated from the eigenvectors of the covariance or correlation matrix of the original variables. PCA has been widely used for image processing applications such as face recognition [7][8][9][10], palm print recognition [11], image compression [12][13], image fusion [14], image enhancement [15][16], object recognition [17], etc.
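As a rough illustration of this two-step pipeline, the following Python sketch (the paper's own implementation is in MATLAB) reduces synthetic feature vectors with PCA and classifies them with a 1-nearest-neighbour rule; the dataset, the number of components and the neighbour setting are placeholders, not the paper's configuration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in for vectorized training images: 200 samples of dimension 1024.
X_train = rng.normal(size=(200, 1024))
y_train = rng.integers(0, 10, size=200)          # 10 hypothetical classes
X_test = rng.normal(size=(20, 1024))

# Step 1: dimensionality reduction - keep a handful of principal components.
pca = PCA(n_components=20)
train_features = pca.fit_transform(X_train)
test_features = pca.transform(X_test)

# Step 2: nearest-neighbour classification in the reduced feature space.
clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(train_features, y_train)
predicted_classes = clf.predict(test_features)
```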
This paper is structured as follows: Section II explains the generation of principal components by the PCA method. Section III describes the methodology of the system. Section IV presents the results. Finally, Section V draws conclusions and proposes possible future work, followed by the references.

II. PRINCIPAL COMPONENT ANALYSIS
Principal Component Analysis (PCA) is a well-known method to identify statistical trends in data. It projects the data from a higher dimension to a lower dimensional manifold such that the error incurred by reconstructing the data in the higher dimension is minimized. As shown in fig. 1, given a set of points in Euclidean space, the first principal component Z1 corresponds to a line that passes through the multidimensional mean and minimizes the sum of squares of the distances of the points from the line. The second principal component Z2 corresponds to the same concept after all correlation with the first principal component has been subtracted from the points. Principal components are thus a series of linear least squares fits to a sample, each orthogonal to all previous ones. The principal components reveal important information about the dispersion of the original data set.

Fig. 1. Principal components Z1 and Z2.

PCA is based upon eigenvector decomposition of a covariance matrix. For multivariate data, covariance is a measure of the relationship between different variables, or dimensions, of the data set. The general steps of PCA [18][19] are as follows:
1) Acquire the data.
2) Subtract the mean from the data.
3) Generate the covariance matrix [20]. An important property of the covariance matrix is that it is square, real, and symmetric; this means that an n×n covariance matrix always has n real eigenvalues.
4) Calculate the eigenvalues and eigenvectors of the covariance matrix. The first principal component is the eigenvector of the covariance matrix with the largest eigenvalue; it represents the most significant relationship between the data dimensions.
5) Compute the cumulative energy content for each eigenvector.
6) By ordering the eigenvectors in order of descending eigenvalues (largest first), create an ordered orthogonal basis.
7) Use this basis to transform the input data vector. Instead of using all the eigenvectors of the covariance matrix, the data can be represented in terms of only a few basis vectors of the orthogonal basis.
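As an illustration of these steps (not part of the original MATLAB implementation), a minimal NumPy sketch is given below; the synthetic data matrix, the variable names and the number of retained components are assumptions made for the example.

```python
import numpy as np

# 1) Acquire data: rows are samples, columns are variables (synthetic here).
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 5))

# 2) Subtract the mean of each variable.
zero_mean = data - data.mean(axis=0)

# 3) Covariance matrix (square, real, symmetric).
cov = np.cov(zero_mean, rowvar=False)

# 4) Eigenvalues and eigenvectors of the covariance matrix.
eig_vals, eig_vecs = np.linalg.eigh(cov)    # eigh: for symmetric matrices

# 5)-6) Order eigenvectors by descending eigenvalue and inspect cumulative energy.
order = np.argsort(eig_vals)[::-1]
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]
cumulative_energy = np.cumsum(eig_vals) / np.sum(eig_vals)

# 7) Keep only the first k basis vectors and project the data onto them.
k = 2                                        # illustrative choice
features = zero_mean @ eig_vecs[:, :k]       # reduced representation, shape (50, k)
```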

III. PROPOSED ALGORITHM
From the image database, some images are used for training and the remaining images are used for testing. The algorithm used to generate the feature vector for each training image is given below.

A. Feature vector generation for training images
Consider that there are 'M' training images in each class and 'N' such classes. All images are converted into grayscale images.
For each class, do steps 1 to 10.
Step 1: Find the average image of the class as given in equation 1.

$I_{avg} = \frac{1}{M}\sum_{i=1}^{M} I_i$   (1)

Step 2: Find zero mean images by subtracting the average image from each image as given in equation 2.

$I_{zi} = I_i - I_{avg}$, for $i = 1, 2, \ldots, M$   (2)

Step 3: Convert each zero mean image into a one dimensional column vector by arranging the columns of the image one below the other, as shown in fig. 2.

Step 4: Arrange the M zero mean image vectors as columns of the data matrix A, as given in equation 3.

$A = [\,I_{z1}\ \ I_{z2}\ \ \cdots\ \ I_{zM}\,]$   (3)

Step 5: Compute the covariance matrix $C = A^{T}A$ of size M×M.

Step 6: Calculate the eigenvalues ($\lambda_1$ to $\lambda_M$) and eigenvectors ($X_1$ to $X_M$) of C by solving equation 4. The eigenvectors are ordered according to the corresponding eigenvalues from high to low.

$C X_i = \lambda_i X_i$   (4)

Step 7: Construct the eigen images as given in equation 5 (the eigenvector corresponding to the lowest eigenvalue is excluded, as that eigenvalue is comparatively extremely small).

$F_j = A X_j$, for $j = 1, 2, \ldots, M-1$   (5)

Step 8: Convert each vector $F_j$ into a two dimensional eigen image, as shown in fig. 4.

Step 9: Calculate the cumulative energy $\mu$ for each eigen image $F_j$.

Step 10: Calculate the feature vector $V_i$ for each training image $I_i$ of the class as given in equation 6, where the coefficient $w_{ji}$ is calculated as given in equation 7.

$V_i = [\,w_{1i}\ \ w_{2i}\ \ \cdots\ \ w_{(M-1)i}\,]^{T}$   (6)

$w_{ji} = F_j^{T} I_{zi}$   (7)

After applying this entire procedure to all 'N' classes, we get the average image of each class, M-1 eigen images for each class, and one feature vector of size (M-1)×1 for each training image of that class. Since there are M×N training images in total, we get MN training feature vectors.
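The steps above can be summarized in the following Python sketch for a single class, assuming its M grayscale training images are stacked in one array; the function name train_class, the row-major flattening and the array layout are choices made for the example, not details taken from the paper.

```python
import numpy as np

def train_class(images):
    """Build the eigenspace of one class from its M training images.

    images: array of shape (M, h, w) holding grayscale training images.
    Returns the average image, the M-1 eigen images (as columns) and the
    (M-1) x M matrix of training feature vectors for this class.
    """
    M, h, w = images.shape
    avg = images.mean(axis=0)                          # equation (1)
    zero_mean = images - avg                           # equation (2)
    A = zero_mean.reshape(M, h * w).T                  # columns are I_zi, eq. (3)

    C = A.T @ A                                        # M x M covariance, step 5
    eig_vals, eig_vecs = np.linalg.eigh(C)             # equation (4)
    order = np.argsort(eig_vals)[::-1][:M - 1]         # drop the smallest eigenvalue
    F = A @ eig_vecs[:, order]                         # eigen images, equation (5)

    features = F.T @ A                                 # w_ji = F_j^T I_zi, eqs. (6)-(7)
    return avg, F, features
```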

B. Feature vector generation for testing images
Each testing image is converted into a grayscale image. The procedure used to generate the feature vector for a testing image I_test is given below (steps 1 and 2). This procedure is repeated for all other testing images.

For each training class, do the following:
Step 1: Find the zero mean test image as given in equation 8, where $I_{avg}$ is the average image of that class.

$I_{ztest} = I_{test} - I_{avg}$   (8)

Step 2: Calculate the feature vector $V_{test}$ as given in equation 9, where each coefficient $w_{testj}$ is given in equation 10.

$V_{test} = [\,w_{test1}\ \ w_{test2}\ \ \cdots\ \ w_{test(M-1)}\,]^{T}$   (9)

$w_{testj} = F_j^{T} I_{ztest}$   (10)

Since there are 'N' classes, we get 'N' feature vectors for a single test image. Each feature vector is of size (M-1)×1.
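A matching sketch for this testing-phase projection is shown below, reusing the hypothetical train_class outputs (per-class average image and eigen images) from the previous sketch.

```python
def project_test_image(test_image, avg, F):
    """Project one grayscale test image onto the eigenspace of one class.

    avg: average image of that class; F: its eigen images (columns).
    Returns the (M-1)-dimensional feature vector of the test image in that class.
    """
    zero_mean = (test_image - avg).reshape(-1)         # equation (8)
    return F.T @ zero_mean                             # equations (9)-(10)

# One feature vector per class for a single test image; class_models is assumed
# to hold the (avg, F, features) tuples returned by train_class for each class.
test_features = [project_test_image(img, avg_c, F_c)
                 for (avg_c, F_c, _) in class_models]
```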

C. Classification of testing image
When we apply the algorithm explained in section A, we get the training feature vector set V containing M×N column feature vectors, where the column vector V_ij denotes the feature vector of the i-th training image of the j-th class.
After applying the algorithm from section B, we get the feature vector set V_test (for a single testing image) containing N column vectors, where each vector V_testj denotes the feature vector of the testing image on the j-th class. The procedure to classify the given testing image is as follows: for each class j, the distance between V_testj and each training feature vector V_ij of that class is computed, and the minimum distance d_j over the M training images is recorded; the given testing image is then assigned to the J-th class, where d_J is the smallest among d_1, ..., d_N. This procedure is executed for all testing images. Besides the Euclidean distance, the Manhattan distance criterion is also used to find the distance between training and testing feature vectors.
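A compact Python sketch of this decision rule, again using the hypothetical per-class models (average image, eigen images, training feature vectors) produced by the earlier training sketch, with both distance criteria:

```python
import numpy as np

def classify(test_image, class_models, metric="euclidean"):
    """Assign a test image to the class whose eigenspace contains the
    nearest training feature vector (Euclidean or Manhattan distance)."""
    best_class, best_dist = None, np.inf
    for j, (avg, F, train_features) in enumerate(class_models):
        v_test = F.T @ (test_image - avg).reshape(-1)    # projection on class j
        diff = train_features - v_test[:, None]          # compare with all M vectors
        if metric == "euclidean":
            dists = np.sqrt((diff ** 2).sum(axis=0))
        else:                                            # Manhattan distance
            dists = np.abs(diff).sum(axis=0)
        d_j = dists.min()                                # nearest vector in class j
        if d_j < best_dist:
            best_class, best_dist = j, d_j
    return best_class
```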

IV. RESULTS
The proposed technique is implemented in MATLAB 7.0 on a computer with an Intel Core i5 CPU (2.50 GHz) and 6 GB RAM. It is tested on the COIL-100 [21] image database. The Columbia Object Image Library (COIL-100) is a database of color images of 100 objects. The objects have a wide variety of complex geometric and reflectance characteristics.
For different training database sizes, the accuracy for each class is calculated. Figures 3, 4 and 5 show the number of classes in the different ranges of accuracy for the different sizes of training database. When around 14% of the data is used for training, no object class gives more than 80% accuracy; only one object class reaches 71% accuracy, and most object classes give accuracy below 60%. When around 33% of the data is used for training, most object classes give more than 70% accuracy and around 40 classes give more than 90% accuracy; with Euclidean distance 57 classes, and with Manhattan distance 55 classes, give more than 80% accuracy.

V. CONCLUSIONS
The paper presents the application of PCA for automatic classification of an image database. The database used is COIL-100, a very large database containing 100 classes of 72 images each, i.e. 7200 images in total, with each image of size 128×128. If PCA is applied directly to the vectorized image pixels, the covariance matrix is of size 16384×16384, and finding its eigenvectors is computationally very intensive. In classification of data, the training data are generally organized as columns of a matrix and PCA is applied to that matrix; but with such a large database, even if only 10 images per class are used for training, the size of the covariance matrix becomes 1000×1000. To reduce the computational complexity, independent PCA is proposed and tested. In this technique the size of the covariance matrix is n×n if 'n' images per class are used for training. Experiments were performed with three sizes of training database: 10 images per class (13.88%), 18 images per class (25%) and 24 images per class (33.33%). When the training database is increased from around 14% to 25% and then to around 33%, the overall classification accuracy increases from 45% to 68% to 76%. The Manhattan distance criterion gives better overall performance than the Euclidean distance criterion when the training database is small. When 24 images per class are used for training, 40 object classes with Euclidean distance and 37 object classes with Manhattan distance give more than 90% accuracy. In this paper the technique is applied to grayscale images; it can be extended to all three planes of a color image, combining the results.
