Texture Based Image Retrieval Using Framelet Transform–Gray Level Co-occurrence Matrix (GLCM)

This paper presents a novel content based image retrieval (CBIR) system based on the Framelet transform combined with the gray level co-occurrence matrix (GLCM). The proposed method is shift invariant, captures edge information more accurately than conventional transform domain methods, and can handle images of arbitrary size. The system uses texture as the visual content for feature extraction. First, texture features are obtained by computing the energy, standard deviation and mean on each subband of the Framelet transform decomposed image. Then a new method that combines the Framelet transform with the GLCM is applied. The retrieval results of these two proposed methods are compared with each other and with conventional methods. Euclidean distance, Canberra distance and city block distance are used as similarity measures in the proposed CBIR system.


I. INTRODUCTION
Content based image retrieval is emerging as an important research area with applications to digital libraries and multimedia databases [1]. It is a technique that uses visual contents to search for images in large scale image databases according to the user's interests. During the past decade remarkable progress has been made in both theoretical research and system development. Nevertheless, many challenging research problems remain and continue to attract researchers from multiple disciplines. The main goal of content based image retrieval is to find images that are visually similar to a query image without using any textual description of the image.
Feature extraction is the basis of content based image retrieval. Images are usually represented by visual features such as color, shape and texture. There are two main approaches to feature extraction in content based image retrieval: (i) feature extraction in the spatial domain and (ii) feature extraction in the transform domain. Feature extraction in the spatial domain is based on statistical calculations on the image. Many spatial domain methods suffer from an insufficient number of features and are also sensitive to noise. Transform domain methods include the discrete cosine transform (DCT) and multiresolution methods such as Gabor filters, the wavelet transform, the Curvelet transform and the Contourlet transform for feature representation. The drawback of most transform domain methods is that they do not capture the edge information of an image efficiently. Finding better transform domain approaches that can capture the edge information is a challenging problem in content based image retrieval.
A content based image retrieval (CBIR) system performs two main tasks. The first is feature extraction, in which a set of features, called the feature vector, is generated to represent the content of each image in the database. The second is similarity measurement, in which a distance between the query image and each image in the database is computed from their feature vectors so that the closest images can be retrieved.
Most of the existing methods focus on the efficient extraction of color, shape and texture features. Color is the most basic feature, and color histograms are commonly used for color feature extraction. The color histogram method requires only simple calculation. However, it is unsuitable for images with great color variation, and it does not include any spatial information.
Shape features are based on contour information in an image and are obtained with methods such as edge detection and correlograms. Edge detection gives good results only when the contour information is clear.
Texture is one of the important features because it is present in most real and synthetic radar imagery, and it has attracted considerable attention in CBIR as well as in medical imaging, remote sensing, etc. The wavelet transform has been used widely in many areas of image processing such as noise removal, image compression, image super resolution and image retrieval. The texture feature of an image is extracted from the mean and variance of the wavelet subbands. However, wavelets [2][3][4] lose their effectiveness in capturing the edge discontinuities in an image, which are important in texture representation.
Another multiresolution approach, Gabor filters [5][6][7][8], consists of a group of wavelets, each of which captures energy at a specific resolution and orientation. Gabor filters are therefore able to capture the local energy of the entire image. However, they suffer from computational complexity, non-invariance to rotation, and the non-orthogonality of the Gabor filters, which implies redundancy in the filtered image. The dual tree complex wavelet transform (DT-CWT) [9][10][11][12], introduced by Kingsbury, has been found to be an important tool for image texture analysis and feature extraction, and it overcomes the drawbacks of both Gabor filters and the discrete wavelet transform (DWT).
www.ijarai.thesai.org
The Curvelet transform [13]-[15], introduced by Donoho, is another multiresolution transform that provides more edge information. However, it is not computationally efficient for large images.
The Contourlet transform [16][17][18][19], proposed by Do and Vetterli, is a multiscale and directional image representation that uses a wavelet like structure for edge detection and then a local directional transform for contour segment detection. This transform is also shift sensitive because of the up and down sampling of the Laplacian filters.
Cunha and Zhou proposed a modified version [20][21] of the Contourlet transform, constructed by combining a nonsubsampled Laplacian pyramid with nonsubsampled directional filter banks and known as the nonsubsampled Contourlet transform. Although much progress has been made in content based image retrieval systems, finding an efficient retrieval method remains a major challenge for researchers. In this paper a new texture feature based on the Framelet transform is proposed. The technique makes use of the Framelet transform, which represents recent research on multiresolution analysis in digital image processing. This method overcomes the weakness of conventional wavelets and obtains noise free edges of images with less computational complexity.

II. FRAMELET TRANSFORM
The Framelet transform [22][23][24][25] is similar to the wavelet transform but has some differences. Framelets have two or more high frequency filter banks, which produce more subbands in the decomposition.
This achieves better time frequency localization in image processing. There is redundancy between the Framelet subbands, which means that a change in the coefficients of one band can be compensated by the coefficients of the other subbands. After Framelet decomposition, a coefficient in one subband is correlated with coefficients in the other subbands. Consequently, a change in one coefficient can be compensated by its related coefficients in the reconstruction stage, which introduces less noise into the image.

A. Mathematical overview
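In contrast to wavelets, Framelets have one scaling function $\phi(t)$ and two wavelet functions $\psi_1(t)$ and $\psi_2(t)$.
A set of functions $\{\psi_1, \psi_2, \ldots, \psi_{N-1}\}$ in a square integrable space $L_2$ is called a frame if there exist $A > 0$, $B < \infty$ so that, for any function $f \in L_2$,

$$A\,\|f\|^2 \le \sum_{i=1}^{N-1} \sum_{j} \sum_{k} \left| \langle f, \psi_i(2^j t - k) \rangle \right|^2 \le B\,\|f\|^2 \qquad (1)$$

where A and B are known as frame bounds. The special case A = B is known as a tight frame. In a tight frame we have, for all $f \in L_2$,

$$\sum_{i=1}^{N-1} \sum_{j} \sum_{k} \left| \langle f, \psi_i(2^j t - k) \rangle \right|^2 = A\,\|f\|^2$$

To obtain a fast wavelet frame transform, multiresolution analysis is generally used to derive tight wavelet frames from scaling functions. We then obtain the following spaces:

$$V_j = \mathrm{span}_k \{\phi(2^j t - k)\}, \qquad W_{i,j} = \mathrm{span}_k \{\psi_i(2^j t - k)\}, \quad i = 1, 2$$

The scaling function $\phi(t)$ and the wavelets $\psi_1(t)$ and $\psi_2(t)$ are defined through the following equations by the low pass filter $h_0(n)$ and the two high pass filters $h_1(n)$ and $h_2(n)$:

$$\phi(t) = \sqrt{2} \sum_n h_0(n)\, \phi(2t - n), \qquad \psi_i(t) = \sqrt{2} \sum_n h_i(n)\, \phi(2t - n), \quad i = 1, 2$$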
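The frame inequality and the tight frame case A = B can be illustrated with a small finite-dimensional analogue. The sketch below is not taken from the paper: it assumes three unit vectors spaced 120° apart in the plane (a hypothetical toy frame) and checks numerically that the sum of squared inner products equals A times the squared norm, with A = B = 3/2.

```python
import numpy as np

# Hypothetical toy frame (not from the paper): three unit vectors 120 degrees apart in R^2.
angles = np.pi / 2 + np.arange(3) * 2 * np.pi / 3
frame = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (3, 2)


def frame_energy(f, frame):
    """Sum of squared inner products |<f, psi_k>|^2 over all frame vectors."""
    return float(np.sum((frame @ f) ** 2))


def frame_bounds(frame):
    """Optimal frame bounds: smallest and largest eigenvalue of the frame operator."""
    eigvals = np.linalg.eigvalsh(frame.T @ frame)
    return float(eigvals[0]), float(eigvals[-1])


A, B = frame_bounds(frame)
f = np.array([0.3, -1.7])                      # arbitrary test vector
ratio = frame_energy(f, frame) / float(f @ f)  # equals A when the frame is tight
```

Because A and B coincide here, the ratio is the same for every nonzero test vector, which is exactly the tight frame property stated above.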

B. Perfect Reconstruction conditions and Symmetry Conditions
The perfect reconstruction (PR) conditions for the three band filter bank are given by the following two equations:

$$H_0(z)H_0(1/z) + H_1(z)H_1(1/z) + H_2(z)H_2(1/z) = 2$$

$$H_0(-z)H_0(1/z) + H_1(-z)H_1(1/z) + H_2(-z)H_2(1/z) = 0$$

A wavelet tight frame with only two symmetric or antisymmetric wavelets is generally impossible to obtain with a compactly supported symmetric scaling function $\phi(t)$. If $h_0(n)$ is symmetric and compactly supported, then an antisymmetric solution $h_1(n)$ and $h_2(n)$ exists if and only if all the roots of $2 - H_0(z)H_0(z^{-1}) - H_0(-z)H_0(-z^{-1})$ have even multiplicity.
The goal is to design a set of three filters that satisfy the PR conditions, in which the low pass filter $h_0(n)$ is symmetric and the filters $h_1(n)$ and $h_2(n)$ are each either symmetric or antisymmetric. There are two cases. Case I denotes the case where $h_1(n)$ is symmetric and $h_2(n)$ is antisymmetric. Case II denotes the case where $h_1(n)$ and $h_2(n)$ are both antisymmetric. The symmetry condition for $h_0(n)$ is

$$h_0(n) = h_0(N - 1 - n) \qquad (9)$$

where N is the length of the filter $h_0(n)$. We deal with Case I for even length filters. Solutions for Case I can be obtained from solutions in which $h_2(n)$ is a time reversed version of $h_1(n)$ and neither filter is antisymmetric. To show this, suppose that $h_0(n)$, $h_1(n)$ and $h_2(n)$ satisfy the PR conditions and that

$$h_2(n) = h_1(N - 1 - n) \qquad (10)$$

Then, by defining

$$h_1^{new}(n) = \frac{1}{\sqrt{2}} \left( h_1(n) + h_2(n) \right), \qquad h_2^{new}(n) = \frac{1}{\sqrt{2}} \left( h_1(n) - h_2(n) \right)$$

the filters $h_0$, $h_1^{new}$ and $h_2^{new}$ also satisfy the PR conditions, and $h_1^{new}$ and $h_2^{new}$ are symmetric and antisymmetric, respectively:

$$h_1^{new}(n) = h_1^{new}(N - 1 - n), \qquad h_2^{new}(n) = -h_2^{new}(N - 1 - n)$$

The polyphase components of the filters $h_0(n)$, $h_1(n)$ and $h_2(n)$ are given in [25]; with the symmetries in Eq. (9) and Eq. (10) they satisfy the PR conditions. The 2D extension of the filter bank is illustrated in Fig. 1.
Fig. 1. An Oversampled Filter Bank for 2D Image
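The two PR conditions can be checked numerically for any candidate three band filter set by evaluating them on the unit circle. The sketch below is only an illustration under stated assumptions: the filters are the well known piecewise-linear B-spline tight frame filters, rescaled so that the first condition sums to 2. They are a stand-in and are not the filters of [25] used in this paper; the function names are hypothetical.

```python
import numpy as np

s = np.sqrt(2.0)
# Stand-in filters (assumption, not the paper's): piecewise-linear B-spline tight frame.
h0 = s / 4 * np.array([1.0, 2.0, 1.0])    # symmetric low pass
h1 = 0.5 * np.array([1.0, 0.0, -1.0])     # antisymmetric high pass
h2 = s / 4 * np.array([-1.0, 2.0, -1.0])  # symmetric high pass


def z_transform(h, z):
    """Evaluate H(z) = sum_n h[n] z^(-n) at an array of complex points z."""
    n = np.arange(len(h))
    return np.sum(h[None, :] * z[:, None] ** (-n[None, :]), axis=1)


def pr_residuals(filters, num_points=64):
    """Largest deviation from each PR condition over samples of the unit circle."""
    omega = np.linspace(0.0, 2.0 * np.pi, num_points, endpoint=False)
    z = np.exp(1j * omega)
    first = sum(z_transform(h, z) * z_transform(h, 1 / z) for h in filters)
    second = sum(z_transform(h, -z) * z_transform(h, 1 / z) for h in filters)
    return float(np.max(np.abs(first - 2.0))), float(np.max(np.abs(second)))


res_first, res_second = pr_residuals([h0, h1, h2])
```

Both residuals are at the level of floating point round-off for this filter set, so the stand-in filters satisfy the PR conditions; a filter set that violates them would give a clearly nonzero residual.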

III. GRAY LEVEL CO-OCCURRENCE MATRIX (GLCM)
Haralick first introduced the use of co-occurrence probabilities using the GLCM [26][27][28] for extracting various texture features. The GLCM is also called the gray level dependency matrix. It is defined as "a two dimensional histogram of gray levels for a pair of pixels, which are separated by a fixed spatial relationship." The GLCM of an image is computed using a displacement vector d, defined by its radius δ and orientation θ. δ takes values from 1 to 10, and every pixel has eight neighbouring pixels, allowing eight choices for θ: 0°, 45°, 90°, 135°, 180°, 225°, 270° or 315°.
The gray level co-occurrence matrix (GLCM) is generated by counting the number of times a pixel with value i is adjacent to a pixel with value j and then dividing the entire matrix by the total number of such comparisons made. Each entry is therefore considered to be the probability that a pixel with value i will be found adjacent to a pixel of value j:

$$C_{ij} = \frac{P_{ij}}{\sum_{i,j=1}^{G} P_{ij}}$$

where $C_{ij}$ is the co-occurrence probability between gray levels i and j, $P_{ij}$ is the number of occurrences of gray levels i and j within the given image window for a certain (δ, θ) pair, and G is the quantized number of gray levels.
The sum in the denominator thus represents the total number of gray level pairs (i, j) within the window. The MATLAB function graycomatrix computes the GLCM from a scaled version of the image. By default, if I is a binary image, graycomatrix scales the image to two gray levels. If I is an intensity image, graycomatrix scales the image to eight gray levels. In order to use the information contained in the GLCM, Haralick defined 14 statistical measures to extract textural characteristics. In this paper we use four features that can successfully characterize the statistical behaviour of the texture: contrast, energy, correlation and homogeneity. Let P(i, j) denote the normalized GLCM of the input texture image, i.e. the matrix of co-occurrence probabilities $C_{ij}$. The choice of the displacement vector d is an important parameter of the gray level co-occurrence matrix. In general the GLCM is computed for several values of d, and the one that maximizes a statistical measure computed from P(i, j) is used.
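The GLCM normalization and the four features can be sketched in a few lines. This is a minimal illustration in Python with NumPy rather than the MATLAB implementation used in the paper. The function names are hypothetical, and the feature formulas follow the common definitions (the ones documented for MATLAB's graycoprops), which are assumed here because the paper's own formulas are not reproduced in the text.

```python
import numpy as np


def glcm(image, levels, dr=0, dc=1):
    """Normalized GLCM for the displacement (dr, dc); default is distance 1 at 0 degrees."""
    image = np.asarray(image)
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c], image[r2, c2]] += 1
    # Divide by the total number of pixel pairs (assumes at least one pair exists).
    return counts / counts.sum()


def glcm_features(P):
    """Contrast, energy, correlation and homogeneity (assumed common definitions)."""
    levels = P.shape[0]
    i, j = np.meshgrid(np.arange(levels), np.arange(levels), indexing="ij")
    mu_i, mu_j = np.sum(i * P), np.sum(j * P)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * P))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * P))
    return {
        "contrast": float(np.sum((i - j) ** 2 * P)),
        "energy": float(np.sum(P ** 2)),
        # Undefined for a constant image, where sd_i or sd_j is zero.
        "correlation": float(np.sum((i - mu_i) * (j - mu_j) * P) / (sd_i * sd_j)),
        "homogeneity": float(np.sum(P / (1.0 + np.abs(i - j)))),
    }


# Small 4-level test image, already quantized to gray levels 0..3.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])
P = glcm(image, levels=4)
features = glcm_features(P)
```

The image must be quantized to integer gray levels in the range 0 to levels-1 before calling glcm. Other displacements are selected through dr and dc, for example dr=-1, dc=1 for 45° at distance 1.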

IV. THE PROPOSED ALGORITHM USING FRAMELET TRANSFORM
The basic steps involved in the proposed CBIR system are as follows.
Fig. 2. Flow of Algorithm Using Framelet Transform

V. THE PROPOSED ALGORITHM USING FRAMELET TRANSFORM-GRAY LEVEL CO-OCCURRENCE MATRIX (GLCM)
1) The images are decomposed using the Framelet transform.
2) The GLCMs of the decomposed subbands are calculated for a given orientation and distance.
3) Finally, features such as contrast, energy, correlation and homogeneity are extracted from the GLCMs of the subbands.
4) The resulting feature vector $f$ = [contrast, energy, correlation, homogeneity] is used to create the feature database.
5) Apply the query image and calculate the feature vector as given in steps (2) and (3).
6) Calculate the similarity measure using Euclidean distance, Canberra distance and Manhattan distance.
7) Retrieve all images relevant to the query image based on minimum Euclidean distance, Canberra distance and Manhattan distance.
The feature vectors are stored to be used in the similarity measurement. To create the feature database, the above procedure is repeated for every image in the database and the resulting feature vectors are stored in the feature database. The flow of the algorithm is shown in Fig. 3.
Fig. 3. Flow of the Algorithm Using Framelet Transform + GLCM

VI. SIMILARITY MEASUREMENT
A query image is any one of the images from the image database. The query image is processed to compute its feature vector. Distance metrics are calculated between the query image and every image in the database, and this process is repeated until all the images in the database have been compared with the query image. When the distance computation is complete, an array of distances is obtained, which is then sorted. In the present work three similarity distance metrics are used, as given below:

C. Euclidean Distance
Euclidean distance is not always the best metric. Because the distances in each dimension are squared before summation, great emphasis is placed on those features for which the dissimilarity is large. Hence it is necessary to normalize the individual feature components before finding the distance between two images. The Euclidean distance is given by

$$D(q, p) = \sqrt{\sum_{i=1}^{L} \left( f_{p,i} - f_{q,i} \right)^2}$$

where q is the query image, L is the feature vector length, p is an image in the database, $f_{p,i}$ is the $i$-th feature of image p in the database, and $f_{q,i}$ is the $i$-th feature of the query image q.
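The three similarity measures used in this work (Euclidean, Manhattan and Canberra) and the sorting step can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: all function names are hypothetical, the Canberra denominator follows the form |a + b| used in this paper (the more common definition uses |a| + |b|, which is identical for non-negative features), and the normalization by the per-feature standard deviation is an assumed scheme, since the paper does not state which normalization it uses.

```python
import numpy as np


def euclidean(fq, fp):
    fq, fp = np.asarray(fq, dtype=float), np.asarray(fp, dtype=float)
    return float(np.sqrt(np.sum((fp - fq) ** 2)))


def manhattan(fq, fp):
    fq, fp = np.asarray(fq, dtype=float), np.asarray(fp, dtype=float)
    return float(np.sum(np.abs(fp - fq)))


def canberra(fq, fp):
    fq, fp = np.asarray(fq, dtype=float), np.asarray(fp, dtype=float)
    num = np.abs(fp - fq)
    den = np.abs(fp + fq)
    # Terms with a zero denominator are skipped (treated as contributing 0).
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return float(np.sum(terms))


def normalize_features(features):
    """Assumed scheme: scale each feature by its standard deviation over the database."""
    features = np.asarray(features, dtype=float)
    std = features.std(axis=0)
    return features / np.where(std > 0, std, 1.0)


def rank_database(query, database, distance):
    """Indices of database images sorted from most to least similar to the query."""
    distances = [distance(query, row) for row in database]
    return [int(k) for k in np.argsort(distances, kind="stable")]


# Toy feature database with three images and two features each.
database = np.array([[1.0, 2.0], [4.0, 6.0], [1.0, 2.5]])
query = database[0]
ranking = rank_database(query, database, euclidean)
```

In a full system, normalize_features would be applied to the stacked query and database vectors before ranking, and the top entries of the ranking would be returned as the retrieved images.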

VII. PERFORMANCE EVALUATION
To evaluate the retrieval efficiency of the proposed system, we use two performance measures, recall and precision. Recall measures the ability of the system to retrieve all the images that are relevant, while precision measures the ability of the system to retrieve only the images that are relevant. To evaluate the Framelet transform and Framelet co-occurrence features in image retrieval, two related algorithms, DWT and DWT combined with the gray level co-occurrence matrix, are used for comparison. We have used three different similarity measures: Euclidean distance, Manhattan distance and Canberra distance.
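As a small illustration of the two measures, the sketch below computes precision as the fraction of retrieved images that are relevant and recall as the fraction of all relevant images that were retrieved. The function name, the class labels and the numbers are hypothetical and are not results from this paper.

```python
def precision_recall(retrieved_labels, query_label, total_relevant):
    """Precision and recall for one query, given the class labels of the retrieved images."""
    relevant_retrieved = sum(1 for label in retrieved_labels if label == query_label)
    precision = relevant_retrieved / len(retrieved_labels)
    recall = relevant_retrieved / total_relevant
    return precision, recall


# Hypothetical query: 5 images retrieved, 3 of them from the query's class,
# with 100 images of that class in the database.
retrieved = ["horse", "horse", "car", "horse", "flower"]
precision, recall = precision_recall(retrieved, "horse", total_relevant=100)
```

An image is counted as relevant here when it belongs to the same class as the query, which matches the class based evaluation described in this section.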
The first algorithm is based on extracting features from the coefficients in the subbands of the Framelet transform. Energy, standard deviation, and energy + standard deviation are used to create the feature vectors. Three distance measures are used to match 4 sample query images against a database of 400 images from four different classes, and the results are tabulated in Table 1. The proposed method gives higher precision than discrete wavelet transform (DWT) based image retrieval.
In the second algorithm, the Framelet-gray level co-occurrence matrix (GLCM) method performs better than the method based on the discrete wavelet transform combined with the co-occurrence matrix (DWT-GLCM).
In the proposed method we use four GLCM statistical measures, namely energy, contrast, homogeneity and correlation. The GLCMs of the coefficients of the Framelet transform subbands were computed with angle 0° and distance d = 1, and the precision results are shown in Table 2, which shows that the proposed method is more efficient than the DWT-GLCM method. The Canberra distance measure gives better retrieval results in both of the proposed methods. The average precision is calculated and plotted against the distance measures as shown in Fig. 5.
The search for relevant information in large databases has become more challenging, and more precise retrieval techniques are needed in such cases. In this paper a new algorithm for content based image retrieval was presented. Feature vectors were built from (i) the Framelet transform [Energy + Standard Deviation] and (ii) the Framelet transform combined with the GLCM. Euclidean, Manhattan and Canberra distances were used to match four sample query images against 400 images of four classes [Cars, Flowers, Horse, Buildings] from the WANG image database, which includes 1000 images of ten different classes. The proposed method gives better retrieval results and higher precision than the DWT based methods. We have used only four GLCM statistical features with angle 0° and distance d = 1. Future work will study Framelet co-occurrence features with different angles and various distances for extracting the feature vectors and designing the CBIR system.

1) Decompose each image in the Framelet transform domain.
2) Calculate the energy, mean and standard deviation of each subband of the Framelet transform decomposed image, where $\mu_k$ is the mean value of the $k$-th Framelet transform subband, $W_k$ is the coefficient of the $k$-th Framelet transform subband, and $M \times N$ is the size of the decomposed subband.
3) The resulting feature vector $f = [E_1, E_2, \ldots, E_n, \sigma_1, \sigma_2, \ldots, \sigma_n]$ is used to create the feature database.
4) Apply the query image and calculate the feature vector as given in steps (2) and (3).
5) Calculate the similarity measure using Euclidean distance, Canberra distance and Manhattan distance.
6) Retrieve all images relevant to the query image based on minimum Euclidean distance, Canberra distance and Manhattan distance.
The flow of the algorithm is shown in Fig. 2.
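Steps 2 and 3 above can be sketched as follows. The paper's energy and standard deviation formulas are not reproduced in the text, so this sketch assumes definitions that are common in the wavelet texture literature: the energy of a subband is the mean absolute coefficient value over its M × N entries, and the standard deviation is taken about the subband mean. The Framelet decomposition itself is not shown; the hypothetical function below simply takes a list of already computed subband arrays.

```python
import numpy as np


def subband_features(subbands):
    """Feature vector [E_1..E_n, sigma_1..sigma_n] from a list of 2D subband arrays."""
    energies = []
    stds = []
    for w in subbands:
        w = np.asarray(w, dtype=float)
        energies.append(float(np.mean(np.abs(w))))             # assumed energy definition
        stds.append(float(np.sqrt(np.mean((w - w.mean()) ** 2))))  # deviation about the mean
    return np.array(energies + stds)


# Two tiny stand-in "subbands" (hypothetical values, not Framelet output).
subbands = [np.array([[1.0, -1.0], [1.0, -1.0]]),
            np.array([[2.0, 2.0], [2.0, 2.0]])]
f = subband_features(subbands)
```

The same function would be applied to every database image and to the query image, so that all feature vectors share the same subband ordering and length.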

D. Manhattan or City Block Distance
Manhattan or city block distance is computationally less expensive than Euclidean distance because only the absolute differences in each feature are considered. City block distance is given by

$$D(q, p) = \sum_{i=1}^{L} \left| f_{p,i} - f_{q,i} \right|$$

where $f_{p,i}$ and $f_{q,i}$ are the $i$-th features of the database image p and the query image q, and L is the feature vector length.

E. Canberra Distance
In this distance, the numerator signifies the difference and the denominator normalizes the difference. Thus each term of the distance never exceeds one, being equal to one whenever either of the attributes is zero. It therefore seems to be a good expression to use, since it avoids scaling effects. It is obvious that the distance of an image from itself is zero.

$$D(q, p) = \sum_{i=1}^{L} \frac{\left| f_{p,i} - f_{q,i} \right|}{\left| f_{p,i} + f_{q,i} \right|}$$

Precision and recall are defined as

$$\text{Precision} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of images retrieved}}$$

$$\text{Recall} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of relevant images}}$$

Precision results are tabulated in Table 1 and Table 2.

VIII. EXPERIMENTAL RESULTS
The algorithm is implemented in MATLAB. A database of 400 images of 4 different classes is used to check the performance of the algorithms developed. The same representative sample images, shown in Fig. 4, are used as query images.

TABLE I