Content-based Image Retrieval for Image Indexing

Abstract: Content-based image retrieval has attained a prominent position in computer vision with the advent of digital cameras and the explosion of images on the Internet and in the cloud. Finding the most relevant images in a short time is a challenging job, with many large cloud sites competing in image search in terms of accuracy and recall. This paper addresses an image retrieval system employing color information indexing. The system is organized around the hue component of the HSV color model. To assess the precision of the image retrieval system, experiments have been carried out on a database of 450 images, consisting of pictures drawn by the Japanese traditional painters Sharaku, Hokusai, and Hiroshige, and multicolor natural scenes obtained from the World Wide Web (WWW). To query the database, the user specifies an object on which the same color attributes are evaluated, and all similar-looking images are returned as the outcome of the query.


I. INTRODUCTION
Content-based image retrieval is attracting tremendous interest in image processing, computer graphics, computer vision, pattern recognition, image management systems, and so on [1][2][3][4]. In contrast to the traditional text-based approach, several benefits have been reported in the literature for content-based access to images, such as automatic identification, classification, recognition, and retrieval in large digital libraries of photographic, satellite, and medical images for remote searching and browsing over the ever-increasing World Wide Web (WWW).
A fair number of developments have taken place in image retrieval systems over the last couple of decades, driven by the enormous interest in establishing multimedia information systems and database systems. The convergence of image processing, computer graphics, and database technology provides the basis for the creation of such digital image archives. Moreover, with the proliferation of the WWW, a remarkable amount of visual information is readily accessible to the public. As a result, there is a growing demand for search strategies that retrieve pictorial entities from large image collections [5].

Over the past half century there has emerged an increasing interest in the cultural heritage of Japanese society. In tandem with this, and indeed as a logical consequence of it, there has been growing concern with preserving, disseminating, displaying, and effectively exploiting the rich cultural resources embodied in Japanese traditional paintings, known as Ukiyoe pictures [6], held in many museum and art gallery collections. The benefits of digital technology are numerous. The points most often cited are that digital surrogates of works of art can assist security, provide a central database of information for easy retrieval of relevant material, assist in the preservation of originals, and provide a networkable resource of images that greatly increases availability and access.
Image contents include color, shape, and texture. Among these, color provides an efficient visual cue for image retrieval. Managing image data in this regard entails processing, storage, and retrieval of pictorial representations [5]. Owing to its simple graphical representation, the color histogram has become the most frequently used technique for image indexing. It provides a convenient tool for computing the similarity between different images, since it proves to be robust to object translation, scaling, rotation, occlusion, deformation, and so on [7].
A substantial amount of research has been reported in the literature [8][9][10][11][12][13][14][15][16] on content-based image retrieval (CBIR). Swain and Ballard [17] proposed color-based object recognition employing color histograms for matching between image regions and query objects. Kjeldsen and Kender [18] applied Gaussian kernels to smooth the histograms when finding skin in color images. Funt and Finlayson [19] developed a color indexing algorithm for object recognition that takes the influence of lighting conditions into account. Ennesser and Medioni [20] proposed a local histogram method to localize objects in images. Chang and Wang [21] developed a texture segmentation algorithm employing the color histogram. McKenna, Raja and Gong [22] employed adaptive Gaussian mixtures to model the color distributions of objects. Androutsos, Plataniotis and Venetsanopoulos [23] established a cosine-metric-based distance measure for color image indexing and retrieval. Their query method is very flexible and supports single and multiple color queries. Liu and Ozawa [24] proposed the spatial neighborhood adjacency graph (SNAG), which could serve as a basis for detecting objects by color content in the candidate images. Sharma et al. [25] represented images by a global descriptor and developed a CBIR system that used color histogram processing. This system is not yet a commercial success because the distribution of RGB values changes proportionally with the illumination; the approach suits some images but gives low precision on others.

This paper addresses a color-histogram-based method for indexing and retrieving color images. Different dominant and perceptually relevant colors have been extracted from each image in the RGB model and stored in the respective database files. Images are identified and classified in HSV space depending on the color contents prevailing in these dominant colors, that is, on whether a particular color component is significantly present in an image or not. Similarity between different images has been calculated on the basis of the Minkowski distance metric. Experimental results demonstrate that the method is capable of indexing, classifying, and retrieving images with distinct color properties.
The rest of the paper is organized as follows. Section II describes the color model. Section III illustrates the color histogram and image retrieval. Sections IV and V describe color quantization and image query, respectively. Section VI highlights the experimental results and, finally, Section VII draws the overall conclusions of this paper.

II. COLOR MODEL
Numerous color models have been proposed for color specification, such as CIE (R,G,B), (X,Y,Z), (L*,u*,v*), and so on. The main drawback of the CIE (R,G,B) model is that it is not perceptually uniform: the proximity of colors in RGB space does not indicate color similarity. The (X,Y,Z) color space is not uniform either, that is, the Euclidean distance between two colors is not proportional to the color difference perceived by humans. Although the (L*,u*,v*) space is uniform, it is nevertheless not intimately related to the way in which humans perceive color [21].
A color is represented in HSV color space by three features: hue, saturation, and value. Hue is the characteristic of visual perception that corresponds to the color sensation related to the dominant color, saturation indicates the comparative purity of the color content, and value specifies the brightness of a color. The HSV color model organizes similar colors under similar hue alignments. The transformation from RGB color space to HSV space follows the equations given in [23][26][27][28], where R, G, and B are the red, green, and blue component values, each of which lies in the range [0,255].
This research employs the HSV color model for the classification of pictures drawn by the painters Sharaku, Hokusai, and Hiroshige, and of natural pictures. The RGB model has been used to identify the Ukiyoe pictures, because these are distinguished according to the red, green, and blue color components in the face parts.
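As an illustration of the RGB-to-HSV step, the following minimal sketch uses the standard hexcone conversion from Python's `colorsys` module. This is an assumption made for illustration: the conversion equations of [23][26][27][28] may differ in form, and the function name `rgb_to_hsv_degrees` is hypothetical.

```python
import colorsys


def rgb_to_hsv_degrees(r, g, b):
    """Map 8-bit R, G, B values in [0, 255] to (hue in degrees, saturation, value)."""
    # colorsys expects components scaled to [0, 1] and returns hue in [0, 1).
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Scale hue to degrees so that it spans the range [0, 360).
    return h * 360.0, s, v


if __name__ == "__main__":
    for name, rgb in [("red", (255, 0, 0)), ("green", (0, 255, 0)), ("gray", (128, 128, 128))]:
        print(name, rgb_to_hsv_degrees(*rgb))
```

Achromatic pixels such as gray have no defined hue; `colorsys` reports hue 0 and saturation 0 for them, so a hue-based index may need to treat low-saturation pixels separately.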

III. COLOR HISTOGRAM AND IMAGE RETRIEVAL
A color histogram [29,30] represents the distribution of colors in an image. It is a stable object representation that is unaffected by occlusion and changes in viewing conditions, and it has the advantage of being insensitive to scaling, rotation, and small deformations of objects, as well as being robust to noise [31].
The basic idea of image retrieval by color content is to extract the characteristic colors from target images and match them with those of the query. The images in the database are then searched to check whether a specific color feature value is prominent in each image or not. If several images contain a substantial amount of the query color, they are ranked in order of priority. The architecture of the proposed system is shown in Fig. 1.
Since the hue component of the HSV color model corresponds more closely to human chromatic perception [17], the hue component has been chosen to designate the colors of images. Pixels in the image are characterized in the RGB space, so it appears natural to define the color attributes as the red, green, and blue values at each pixel. Let $Q(x, y)$ denote a color image. The hue histogram of a color image is obtained by counting the number of pixels that have a given hue value in the image:
$$h(i) = \frac{\#\{(x, y) : H(x, y) = i\}}{N_p},$$
where $\#$ denotes the number of pixels with hue value $i$, $H(x, y)$ is the hue at location $(x, y)$ of $Q$, and $N_p$ is the total number of image locations.
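The hue histogram described above can be sketched as follows. This is a minimal illustration rather than the system's implementation: it assumes pixels are given as 8-bit (R, G, B) tuples, uses the `colorsys` conversion, and the function name `hue_histogram` and the default of 360 bins are hypothetical choices.

```python
import colorsys


def hue_histogram(pixels, bins=360):
    """Normalized hue histogram of an iterable of 8-bit (R, G, B) pixels."""
    counts = [0] * bins
    total = 0
    for r, g, b in pixels:
        # Hue is returned in [0, 1); achromatic pixels are reported with hue 0.
        h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        index = min(int(h * bins), bins - 1)
        counts[index] += 1
        total += 1
    if total == 0:
        return [0.0] * bins
    # Divide by the number of image locations so that the bins sum to 1.
    return [c / total for c in counts]
```

For example, an image with three pure red pixels and one pure blue pixel places 0.75 of its mass in the first bin and the remaining 0.25 in a single bin near 240 degrees.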
A few sample color images and the hue histograms computed from the respective images are illustrated in Fig. 2.
Images are classified according to their prominent colors. Color segmentation has been employed to extract the regions of dominant and perceptually relevant colors. Natural pictures are separated from paintings on the basis of the ratio $r_a$ of the area $a_{dc}$ containing the ten dominant colors to the total area $a_{ac}$ containing all colors, that is, $r_a = a_{dc} / a_{ac}$. The reason behind this choice is that paintings contain only a small number of colors (painters use a limited number of colors while painting) in comparison with the practically unlimited number of colors in nature. The dominant colors therefore contribute more to a painting than to a natural picture. Paintings are identified and classified by painter (Sharaku, Hiroshige, or Hokusai) according to the dominant color components, because the experimental results show that the pictures are rendered in different colors according to the color choices of the painters. Ten prominent colors are therefore extracted from the hue histogram in RGB space for each image, and representative vectors of colors are identified that belong to the images and resemble the most widely used colors of a given painter.

The similarity between different painters is calculated on the basis of the Minkowski-form vector distance metric from their hue histograms. The generalized Minkowski-form distance metric ($L_M$ norm) is given by:
$$L_M(h_q, h_t) = \left( \sum_{i=1}^{N} \left| h_q^i - h_t^i \right|^M \right)^{1/M},$$
where $N$ is the dimension of the vectors $h_q$ and $h_t$, and $h_q^i$ is the i-th component of $h_q$. This research uses $M = 2$ (which is often used for the $L_M$ metric).
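The measures of this section can be sketched in a few lines. The sketch rests on stated assumptions: the dominant colors are approximated by the largest histogram bins (the paper extracts them by color segmentation), and the histogram intersection follows the common Swain and Ballard form normalized by the target histogram, which may differ from the exact expression used in [32]. All function names are hypothetical.

```python
def dominant_color_ratio(hist, n_dominant=10):
    """Ratio r_a of the area under the n_dominant largest bins to the total area."""
    total = sum(hist)
    if total == 0:
        return 0.0
    # Assumption: "dominant colors" are taken to be the largest histogram bins.
    return sum(sorted(hist, reverse=True)[:n_dominant]) / total


def minkowski_distance(hq, ht, m=2):
    """Minkowski-form (L_M) distance between two histograms of equal length."""
    if len(hq) != len(ht):
        raise ValueError("histograms must have the same dimension")
    return sum(abs(a - b) ** m for a, b in zip(hq, ht)) ** (1.0 / m)


def histogram_intersection(hq, ht):
    """Histogram intersection of query hq and target ht, normalized by the target."""
    if len(hq) != len(ht):
        raise ValueError("histograms must have the same dimension")
    target_mass = sum(ht)
    if target_mass == 0:
        return 0.0
    return sum(min(a, b) for a, b in zip(hq, ht)) / target_mass
```

With `m=2` the distance reduces to the Euclidean distance between histograms, where smaller values mean more similar images; the intersection is a similarity, so larger values mean a better match.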
Let $h_q$ and $h_t$ be the query and target histograms, respectively. Application of the histogram intersection operator introduced in [17] then provides a simple way to match two different images $I_q$ and $I_t$ through their color histograms [32].

Ukiyoe actors are distinguished from the natural pictures and the paintings on the basis of face colors. Human skin colors cluster in a small region of a color space. Although the color representation of a face obtained by a camera is influenced by many factors such as lighting conditions and facial expressions, and skin colors cluster differently from person to person and across races [33,34], Ukiyoe actors are nevertheless drawn in distinct colors by different painters. The presence of certain colors in a specific zone indicates whether or not the images are of actors' faces. Fig. 3(a) illustrates a face image, and Fig. 3(b) illustrates the skin color distributions in the RGB color model.

IV. COLOR QUANTIZATION
In order to reduce the computational cost of segmentation, an input color image is quantized so that the number of colors contained in the image is reduced while the primary chromatic information of the image is preserved. In the quantization method [21], the number of quantized colors, say k, is first determined by a histogram thresholding technique. The fuzzy c-means classification algorithm is then performed to classify each pixel in the image into one of the k colors [34]. The number of clusters is decided by the threshold values. In fuzzy clustering, each color has a degree of belonging to each cluster, rather than belonging completely to just one cluster. Thus, colors on the edge of a cluster may belong to the cluster to a lesser degree than colors at its center. For each color $C$ there is a coefficient $u_j(C)$ giving its degree of belonging to the j-th cluster. Usually, the sum of these coefficients for any given $C$ is defined to be 1:
$$\sum_{j=1}^{k} u_j(C) = 1,$$
where k is the number of clusters.
In fuzzy c-means, the centroid of a cluster is the mean of all colors, weighted by their degree of belonging to the cluster. Therefore, the center of the j-th cluster, $r_j$, is:
$$r_j = \frac{\sum_{C} u_j(C)^m \, C}{\sum_{C} u_j(C)^m},$$
where $r_j$ is the center of cluster $j$ and $m > 1$ is the fuzzification exponent.
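The fuzzy c-means updates of this section, namely the weighted centroid above and the normalized membership coefficients with exponent $2/(m-1)$, can be sketched as follows. The sketch makes several assumptions that are not taken from the paper: colors are numeric tuples, distances are Euclidean, the initial centers are sampled from the data, and a fixed number of iterations replaces a convergence test. The function names are hypothetical.

```python
import math
import random


def memberships(color, centers, m=2.0):
    """Degrees of belonging u_j(C) of one color to every cluster center."""
    dists = [math.dist(color, r) for r in centers]
    if any(d == 0.0 for d in dists):
        # The color coincides with a center: share full membership among those centers.
        hits = sum(1 for d in dists if d == 0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dl) ** exponent for dl in dists) for dj in dists]


def fuzzy_c_means(colors, k, m=2.0, iterations=50, seed=0):
    """Return (centers, memberships) after a fixed number of update rounds."""
    rng = random.Random(seed)
    centers = [tuple(c) for c in rng.sample(colors, k)]  # assumed initialization
    dim = len(colors[0])
    for _ in range(iterations):
        u = [memberships(c, centers, m) for c in colors]
        updated = []
        for j in range(k):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                updated.append(centers[j])  # no color supports this cluster; keep it
                continue
            # Centroid: mean of all colors weighted by u_j(C)^m.
            updated.append(tuple(
                sum(w * c[d] for w, c in zip(weights, colors)) / total
                for d in range(dim)
            ))
        centers = updated
    return centers, [memberships(c, centers, m) for c in colors]
```

Each pixel can then be assigned to the cluster with the largest membership to obtain the k-color quantized image.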
The degree of belonging is related to the inverse of the distance to the cluster center:
$$u_j(C) = \frac{1}{d(r_j, C)}.$$
The coefficients are then normalized and fuzzified with a real parameter $m > 1$ so that their sum is 1. Therefore,
$$u_j(C) = \frac{1}{\sum_{l=1}^{k} \left( \frac{d(r_j, C)}{d(r_l, C)} \right)^{2/(m-1)}}.$$
This investigation uses $m = 2$, which is equivalent to normalizing the coefficients linearly so that their sum equals 1.

V. IMAGE QUERY
The query process is to find and retrieve efficiently those images from the database that are most similar to the user's query image. In this case, a z-nearest neighbor query is employed, which retrieves the z images that are most similar to the query image (typically sorted by lowest dissimilarity to the query image). Given a set $I$ of $N$ images and a feature dissimilarity function $d_f$, a threshold-based query retrieves every image $I_t$ for which $d_f(I_q, I_t) \le T_{fd}$, where $I_q$ is the query image and $T_{fd}$ is the threshold for feature dissimilarity. In this case a query returns any number of images, depending on the bounds defined by the threshold of feature dissimilarity $T_{fd}$.
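The two query modes described above can be sketched as follows. The sketch assumes, for illustration only, that the database is a dictionary mapping image names to hue histograms and that the feature dissimilarity is the $L_2$ distance; the names `z_nearest_query`, `threshold_query`, and `t_fd` are hypothetical.

```python
import math


def l2_distance(hq, ht):
    """Euclidean (L_2) dissimilarity between two histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(hq, ht)))


def z_nearest_query(query_hist, database, z, dissimilarity=l2_distance):
    """Return the names of the z images least dissimilar to the query, best first."""
    ranked = sorted(database, key=lambda name: dissimilarity(query_hist, database[name]))
    return ranked[:z]


def threshold_query(query_hist, database, t_fd, dissimilarity=l2_distance):
    """Return every image whose dissimilarity to the query does not exceed t_fd."""
    return [name for name, hist in database.items()
            if dissimilarity(query_hist, hist) <= t_fd]
```

The z-nearest neighbor query always returns at most z images, whereas the threshold query may return none or many, depending on how tightly `t_fd` is set.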

VI. EXPERIMENTAL RESULTS
The effectiveness of the proposed method has been evaluated experimentally. The database furnished for this experiment contains a total of 450 images: 80 drawn by Sharaku, 80 by Hokusai, 80 by Hiroshige, and 210 natural pictures (sea, flowers, sunrise, sunset, scenery, animals, architecture, towns, etc.) downloaded from the Internet. A snapshot of the CBIR system is shown in Fig. 4. When a user selects the query image and specifies the threshold value for the $L_2$ norm, all similar-looking images are displayed. Classification of the images drawn by the painters and of the natural pictures has been achieved on the basis of the ratio of the area containing the dominant colors to the total area in their respective hue histograms. The percentage $r_a$ of the area bounded by the five dominant colors relative to the total area has been calculated from the hue histograms for different images. The hue component values of the five dominant colors found for different actors are given in Table I. For natural pictures the dominant colors vary within the range [0,360], depending on the color properties of the images. The graph of $r_a$ versus the number of occurrences is shown in Fig. 5. It reveals that if the threshold value is taken up to 1.0 times the variance for the five dominant colors, pictures drawn by Sharaku have $r_a$ values within the range [19.29,28.67], those by Hokusai within [12.45,20.77], and those by Hiroshige within [29.90,39.72]. Sharaku differs markedly from the other painters in that almost all of his pictures are of Ukiyoe actors. The painters used only a small number of colors in their pictures. The number of colors used by the different painters has been determined for different threshold values of $r_a$ and is shown in Fig. 6, which reveals that the natural pictures contain an innumerable number of colors, whereas the number of colors in the paintings is limited. Finally, the Ukiyoe actors are distinguished from the normal pictures according to the RGB color distribution of the face color. Fig. 7 shows the skin color distribution of the faces of 80 Ukiyoe actors, where two distinct color zones are found. For the larger cluster, mean values for red, green and blue are R m =103.65,G m =104.

The precision versus number of images and recall versus number of images are shown graphically in Fig. 8 and Fig. 9, respectively, for different classes of images. Finally, the proposed method has been compared with existing influential methods for which similar measures have been established. The comparison has been made in terms of precision and is shown graphically in Fig. 10. The graph reveals that the proposed method performs better for smaller numbers of images, while for larger numbers of images its performance is more or less the same as that of the existing methods.
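For reference, precision and recall for a single query can be computed as in the following minimal sketch. It uses the standard definitions, which are assumed here because the exact evaluation protocol behind Fig. 8 to Fig. 10 is not spelled out; the function name `precision_recall` is hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against the set of relevant images."""
    relevant = set(relevant)
    hits = sum(1 for item in retrieved if item in relevant)
    # Precision: fraction of retrieved images that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant images that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Evaluating the function on the first n results of a ranked query, for increasing n, yields curves of precision and recall against the number of images.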

VII. CONCLUSION
This paper describes the design and implementation of a content-based image retrieval system. A quantitative analysis of color distribution has been presented for searching, indexing, and retrieving color images. In the query process, the goal is to retrieve images of interest. Prior to the trials, 450 images have been inspected, of which 320 are designated as painting pictures according to the ratio of the area of the dominant colors to the total area in the hue histograms. When a query is issued, the corresponding color index file is analyzed to select a set of candidate images containing regions with colors similar to those of the query. The major limitation of the proposed method is that the similarity measure for image retrieval has been established only on the basis of the Minkowski distance metric with the $L_2$ norm. Other similarity measures, such as the Mahalanobis metric or the Hausdorff distance, can be considered in future for expressing similarity between colors. Our future plan is to develop a multimedia information system that will be able to perform the storage, browsing, indexing, and retrieval of multimedia data based on their text, sound, and video contents.

Fig. 1. Architecture of the image retrieval system
www.ijacsa.thesai.org

Fig. 3. RGB color distribution of a typical Ukiyoe actor's face: (a) face image; (b) skin color distribution in the RGB color model

TABLE I. HUE COMPONENT VALUES OF DIFFERENT ACTORS