An Intelligent Multi-Object Retrieval System for Historical Mosaics

In this work we present a Mosaics Intelligent Retrieval System (MIRS) for digital museums. The objective is to attain a semantic interpretation of images of historical mosaics. We use fuzzy logic techniques and a semantic similarity measure to extract knowledge from the images for multi-object indexing. The extracted knowledge provides users (experts and laypersons) with an intuitive way to describe and query the images in the database. Our contribution is threefold: first, we define semantic fuzzy linguistic terms to encode object position and inter-object spatial relationships in the mosaic image; second, we present a fuzzy color quantization approach using the perceptual HSV color space; and third, we classify the mosaic images semantically using a semantic similarity measure. The automatically extracted knowledge is collected and translated into XML to create the mosaics metadata. The system offers a simple natural-language Graphical User Interface (GUI) and applies the classification approach both to the mosaic image database and to user queries, in order to limit the image classes considered in the retrieval process. MIRS is tested on images from the exceptional Tunisian collection of complex mosaics. Experimental results, based on queries of various complexities, yield recall and precision rates of 86.6% and 87.1%, respectively, while the classification approach gives an average success rate of 76%.
Keywords: retrieval; mosaics; metadata; classification; multi-objects


I. INTRODUCTION
Many visual information retrieval systems of varying complexity and recall capability have been developed, tested, and even made available online. The Content-Based Image Retrieval (CBIR) approach has been studied and explored for decades [1][2][3][4][5]. However, such systems usually rely on a restricted set of low-level features such as color, texture and shape. These features are computed either globally, as in the QBIC system [1], or locally, as in the FourEyes system [2].
The QBIC system uses texture and color as global features to index images. Global features have limitations in modeling perceptual aspects of shapes and usually perform poorly when computing similarity with partially occluded shapes.
The FourEyes system uses local features to index an input image: the image is first divided into small, equal square regions; shape, texture and other local features are then extracted from these regions and used to index the whole image.
Recently, various museums have been building digital archives of high-resolution images of paintings and artifacts, both to preserve the originals and to make them available to a wider audience via the Internet. For example, the Hermitage Museum of Amsterdam and its partner IBM use a browsing and retrieval system to make images of its collection available online [6]. Vuurpijl et al. [7] demonstrate, based on a statistical study, that 72% of users are interested in the objects in the images. Consequently, many research efforts have emerged that give prominence to the semantic objects in the image. For example, Schomaker et al. [8] propose the Vindx system, which uses cooperative annotation of object shapes accompanied by a semantic textual classification to index the digitized collection of the National Gallery of the Netherlands (the Rijksmuseum [9]).
The Vindx database consists of images of complex paintings from the 17th century containing multiple objects. Some effort has been invested in improving the Vindx system. For example, Broek et al. [10] presented the C-BAR system (Content Based Art Retrieval) to describe the paintings of the Rijksmuseum using the CBIR approach. Broek et al. [11] went on to improve the performance of the system and its user interface. The system developed by Berretti et al. [12,13] indexes objects in an input image based on their shape. Chang et al. [14] developed an XML-based document browsing and retrieval system for the digital museum of Korean porcelain. Its high-resolution images show various porcelain artifacts photographed on a uniform background.
Recently, some researchers have tackled the problem of indexing and cataloging images of mosaics, with all the challenges that the particularities of such images present [15][16][17]. M'hedhbi et al. [16] use a CBIR approach to retrieve mosaics based on shape descriptors, with the aim of providing archaeologists with a dedicated recognition system. Maghrebi et al. [17] present a retrieval system for images of Roman mosaics using drawing queries. They use robust MPEG-7 low-level shape descriptors to index objects. These retrieval systems [1,2,7,8,10-17] mainly use low-level features and do not allow multiple objects to be specified in the user query. Nevertheless, a few multi-object image indexing and retrieval systems have been developed [18][19][20]. They allow spatial relationships between objects in an image to be specified, but they require expert users to design a complex drawing query.
In this paper we present MIRS, a Mosaics Intelligent indexing and Retrieval System. Our purpose is to extract knowledge from complex mosaic images for multi-object indexing and retrieval. The system has been designed to be user-friendly and to simplify the query process as much as possible.
www.ijacsa.thesai.org
The paper is organized as follows. Section 2 describes the mosaics database and its particularities. Section 3 discusses the general system architecture. Section 4 details our approach to defining the mosaics metadata. Section 5 presents the classification approach. Section 6 details the retrieval process. Section 7 is dedicated to the experiments and results, and Section 8 concludes the paper.

II. MOSAICS DATABASE
Tunisian museums (e.g. Bardo, El-Jem, Enfidha, Sousse, Sfax) house a huge and exceptional collection of mosaics of great historical value. Some of these mosaics date back to 420 BC and depict various artistic themes. The mosaics have very rich and complex content consisting of many objects of different shapes, colors, sizes, and textures, which makes the automatic extraction of meaningful objects from an image very challenging, if not impossible. These treasures were carefully photographed and catalogued, not only to make them available to researchers but also to limit direct handling of these fragile articles. Fig. 1 shows samples of Tunisian mosaics in our database. Mosaic is the art of creating images by assembling small pieces (tesserae) of colored natural marble. By its nature, this technique can cause natural color variation in supposedly uniform regions within the mosaic.
Our database is composed of 200 mosaic images, which have been filtered with a Gaussian low-pass filter (σ = 1) to reduce the noise caused by the inherent structure of mosaics. We also filtered the images with a 3x3 median filter to reduce abrupt intensity variations within the images.
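The median filtering step can be illustrated with a minimal pure-Python sketch. It operates on a grayscale image stored as a list of rows; the function name `median_filter_3x3` and the border handling (border pixels are copied unchanged) are assumptions made for illustration, and the Gaussian step is not shown.

```python
from statistics import median


def median_filter_3x3(image):
    """Apply a 3x3 median filter to a grayscale image (list of rows).

    Border pixels are copied unchanged; this is an assumption, since the
    text does not say how image borders are handled.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Collect the 3x3 neighborhood around (x, y).
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            result[y][x] = median(window)
    return result


# A single bright outlier (e.g. a gap between tesserae) is suppressed.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
]
smoothed = median_filter_3x3(noisy)
```

The median is preferred over a mean here because an isolated outlier does not shift the result, which matches the stated goal of reducing abrupt intensity variations.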

III. GENERAL ARCHITECTURE OF MIRS SYSTEM
In the proposed system, we use low-level features such as color and shape descriptors, but we also use high-level features such as object position and the spatial relationships between objects. In addition, we propose a fuzzy color quantization approach in the HSV color space and a semantic classification of mosaics. Our goal is to transform hard and complex multi-object content-based queries into simple textual queries.
As shown in Fig. 2, the system uses the objects database to extract crisp features describing the object region, area, centroid and smoothed boundary. These crisp features are then used to derive fuzzy linguistic terms that represent the object's color in the perceptual HSV color space, its position in the mosaic, and its spatial relationships with other objects.

Fig. 2. MIRS: mosaic metadata definition
Our purpose is to attain a semantic interpretation of the mosaic and to let the user express a query as imagined in natural language, such as "man between horse and dog", "poet at the middle near to muse" or "Brown or dark brown horse very far from dog". Both sets of features are formalized in XML to create the mosaic metadata.

IV. MOSAICS IMAGES METADATA
We apply fuzzy logic techniques to define the semantic features representing object position, spatial relationships and color. The semantic linguistic fuzzy features are collected and formalized in XML to describe mosaics and to facilitate information exchange. XML is self-describing, has an intuitive and readable format, and is simple and extensible, since features can be added to the description schemas. In the following, we detail our approach to automatically defining the mosaic semantic features.
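As an illustration of how such features could be serialized, the following sketch builds an XML description with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`mosaic`, `object`, `position`, `color`, `relationship`) are hypothetical and are not taken from the system's actual schema.

```python
import xml.etree.ElementTree as ET


def build_metadata(mosaic_id, objects):
    """Serialize fuzzy object descriptions to an XML string.

    The tag and attribute names used here are hypothetical placeholders.
    """
    root = ET.Element("mosaic", id=mosaic_id)
    for obj in objects:
        node = ET.SubElement(root, "object", name=obj["name"])
        ET.SubElement(node, "position").text = obj["position"]
        for color in obj["colors"]:
            ET.SubElement(node, "color").text = color
        for relation, target in obj["relations"]:
            ET.SubElement(node, "relationship", type=relation, target=target)
    return ET.tostring(root, encoding="unicode")


xml_text = build_metadata("m001", [
    {"name": "horse", "position": "Middle", "colors": ["dark_brown"],
     "relations": [("Near_to", "dog")]},
])
```

Because new child elements can be added without breaking existing readers, a layout of this kind reflects the extensibility argument made above.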

A. Fuzzy object position and spatial relationships
The idea is to divide the image into 3x3 regions, which defines a set of three vertical positions and a set of three horizontal ones. We assign fuzzy linguistic terms to each set to encode the object position within the mosaic as a set of membership values to the fuzzy sets Left, Right, Middle, Up and Down. The centroid of the object is used in these calculations. For example, the memberships to the sets Left, Right, and Middle are calculated from the horizontal location of the object's centroid within the mosaic, using the fuzzy membership functions shown in Fig. 3. We enable users (historians or laypersons) not only to specify the objects they are looking for in a mosaic, but also to specify the object position and its spatial relationships with the other objects in the image, which enhances the relevance of the retrieved images.
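The horizontal part of this encoding can be sketched as follows. The exact shapes of the membership functions are given only in Fig. 3, so the piecewise-linear shapes used here (Left falling from 1 to 0 over the left half, Right rising over the right half, Middle peaking at the center) are assumptions, as is the function name.

```python
def horizontal_memberships(x_centroid, image_width):
    """Return membership degrees of a centroid to Left, Middle and Right.

    The piecewise-linear shapes are assumed for illustration; the actual
    functions are those of Fig. 3.
    """
    x = min(max(x_centroid / image_width, 0.0), 1.0)  # normalize to [0, 1]
    left = max(0.0, 1.0 - 2.0 * x)        # 1 at the left edge, 0 from the center on
    right = max(0.0, 2.0 * x - 1.0)       # 0 up to the center, 1 at the right edge
    middle = 1.0 - abs(2.0 * x - 1.0)     # triangle peaking at the center
    return {"Left": left, "Middle": middle, "Right": right}


centered = horizontal_memberships(50, 100)
quarter = horizontal_memberships(25, 100)
```

With these assumed shapes, an object whose centroid lies a quarter of the way across the image belongs equally (0.5) to Left and Middle, which is the kind of graded answer a crisp 3x3 grid cannot give.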
Our approach uses fuzzy logic techniques to extract the spatial relationships between objects. Our aim is to define semantic linguistic terms that specify twelve fuzzy spatial relationships: Over, Under, On, Beside_of, Very_near, Near_to, Far_from, Very_far_from, At_left_of, At_right_of, between, and Surrounded_by. These spatial relationships are determined in three steps. In the step that relies on predefined models, we predefine models for each spatial relationship to describe the fuzzy positions of neighboring objects. Consequently, we have defined sixty fuzzy rules. Fig. 5 shows some samples of the neighboring-object models associated with the "Between" spatial relationship. The fuzzy rules are evaluated with fuzzy set operations: we use the min and max operations for the "AND" and "OR" operators, respectively. After evaluating each rule, the inference process retains the spatial relationships whose activation degree is greater than or equal to 0.5.
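The rule evaluation described here (min for "AND", max for "OR", and a 0.5 activation threshold) can be sketched as follows. The sixty rules themselves are not listed in the text, so the "Between" rule below is a hypothetical example built only from horizontal positions, and all function names are illustrative.

```python
ACTIVATION_THRESHOLD = 0.5  # relationships below this degree are discarded


def fuzzy_and(*degrees):
    """Fuzzy AND implemented as the minimum of the degrees."""
    return min(degrees)


def fuzzy_or(*degrees):
    """Fuzzy OR implemented as the maximum of the degrees."""
    return max(degrees)


def between_degree(a, b, c):
    """Hypothetical rule: object a is between b and c along the horizontal axis.

    Each argument maps the terms Left/Middle/Right to a membership degree.
    """
    b_left_c_right = fuzzy_and(b["Left"], a["Middle"], c["Right"])
    c_left_b_right = fuzzy_and(c["Left"], a["Middle"], b["Right"])
    return fuzzy_or(b_left_c_right, c_left_b_right)


def retained_relationships(activations):
    """Keep only the relationships whose activation reaches the threshold."""
    return {name: degree for name, degree in activations.items()
            if degree >= ACTIVATION_THRESHOLD}


man = {"Left": 0.1, "Middle": 0.9, "Right": 0.0}
horse = {"Left": 0.8, "Middle": 0.2, "Right": 0.0}
dog = {"Left": 0.0, "Middle": 0.3, "Right": 0.7}
activations = {"between": between_degree(man, horse, dog), "Over": 0.2}
kept = retained_relationships(activations)
```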
Fig. 5. Samples of predefined models of the "Between" spatial relationship
To define the "On" spatial relationship, we apply the Wang similarity measure [21] between the fuzzy positions (i.e. vertical and horizontal) of the two objects. The measure, defined in [21], compares the vertical and horizontal fuzzy sets of object1 and object2, indexed by i in [1,6]. The proposition "object1 ON object2" is true if: i) the similarity of the objects' fuzzy positions is greater than or equal to a threshold t1; ii) the area of the intersection region between the two objects is greater than an experimentally predefined threshold t2; and iii) a further condition holds on the coordinates (x1min, x1max, y1min, y1max) of the object1 region and (x2min, x2max, y2min, y2max) of the object2 region. Table 1 shows samples of fuzzy object positions and spatial relationships formalized in XML.
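A rough sketch of the first two conditions of the "On" test is given below. Several points are assumptions: the similarity is taken as one minus the mean absolute difference of the six membership values, which may differ from the measure in [21]; the intersection is measured on bounding boxes relative to the area of object1; and the numeric thresholds are the values 0.9 (position similarity) and 0.1 (intersection) reported in the experimental section. The third condition on the region coordinates is not reproduced.

```python
SIMILARITY_THRESHOLD = 0.9    # position similarity value from the experimental section
INTERSECTION_THRESHOLD = 0.1  # intersection value from the experimental section


def position_similarity(pos1, pos2):
    """Assumed similarity: 1 - mean absolute difference of membership degrees."""
    total_difference = sum(abs(a - b) for a, b in zip(pos1, pos2))
    return 1.0 - total_difference / len(pos1)


def intersection_ratio(box1, box2):
    """Overlap area of two boxes (x_min, x_max, y_min, y_max) over box1's area.

    Normalizing by the area of box1 is an assumption.
    """
    x_overlap = min(box1[1], box2[1]) - max(box1[0], box2[0])
    y_overlap = min(box1[3], box2[3]) - max(box1[2], box2[2])
    if x_overlap <= 0 or y_overlap <= 0:
        return 0.0
    area1 = (box1[1] - box1[0]) * (box1[3] - box1[2])
    return (x_overlap * y_overlap) / area1


def is_on(pos1, box1, pos2, box2):
    """Check conditions i) and ii) only; condition iii) is omitted here."""
    return (position_similarity(pos1, pos2) >= SIMILARITY_THRESHOLD
            and intersection_ratio(box1, box2) > INTERSECTION_THRESHOLD)


rider_pos = [0.0, 1.0, 0.0, 0.0, 0.8, 0.2]
horse_pos = [0.0, 0.9, 0.1, 0.0, 0.7, 0.3]
rider_box = (10, 50, 10, 40)
horse_box = (0, 60, 30, 80)
rider_on_horse = is_on(rider_pos, rider_box, horse_pos, horse_box)
```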

B. Color fuzzy quantization
Mosaics are a harmony of marble tesserae, so their colors are natural ones. To extract the color descriptor, we use the HSV color space instead of the RGB space, because HSV is more intuitive and closer to human perception.
A color is represented in the HSV space by its hue (H), with values between 0 and 360, and by its saturation (S) and value (V), each with values between 0 and 100. The hue indicates the nature of the color (e.g. red, blue), while saturation and value indicate the richness and brightness of the color. For example, Fig. 6 shows the hue variations of the sample mosaics in Fig. 1 (a and c). Notice that the hue values of the first ...
The brightness or darkness of an image is determined by its S and V values in the HSV color space. Fig. 8 shows the saturation values S of the two sample mosaics shown in Fig. 1 (a and c). In the first sample, the saturation values lie mainly between 0 and 40 but can go as high as 80, while in the second sample they lie mainly between 0 and 50. Based on these values we define four fuzzy linguistic terms to describe the saturation of an object in the mosaic: gray, almost_gray, medium and clear. The value component V of the two sample mosaics, presented in Fig. 9, ranges between 0 and 70. We therefore decompose the V component into four fuzzy linguistic terms: very_dark, dark, medium and light.
Based on the H, S, and V membership degrees of an object, we created a set of 12x4x4=192 rules that define what a human would perceive as the dominant color of an object. For example, for H=15, S=100 and V=40, our fuzzy controller returns the color dark_brown, which is represented by the fuzzy sets hue=dark_orange, saturation=clear and value=dark. In addition, for every hue value, a gray saturation combined with a light value returns white, and a gray saturation combined with a very_dark value returns black.
The following are samples of the predefined fuzzy rules used to find the equivalent semantic linguistic colors:
If (Hue = "dark_orange" and Saturation = "medium" and Value = "light") then Color = "dark_orange"
If (Hue = "dark_orange" and Saturation = "clear" and Value = "medium") then Color = "Honey"
If (Hue = "green" and Saturation = "medium" and Value = "dark") then Color = "Dark_green"
If (Hue = "red" and Saturation = "almost_gray" and Value = "medium") then Color = "Dark_pink"
The fuzzy rules are evaluated with fuzzy set operations: we use the min and max operations for the "AND" and "OR" operators, respectively. After evaluating each rule, the inference process uses the maximum algorithm as its accumulation method. Our approach computes the fuzzy color of each object pixel, builds the histogram of the object's colors, and saves the five most dominant colors that represent more than 80% of the object area. Table 2 shows examples of object colors returned by the predefined fuzzy rules.
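The rule evaluation and the selection of dominant colors can be sketched as follows. Only the four rules quoted above are encoded, the membership degrees are supplied directly as inputs, and the function names are illustrative. The stopping criterion in `dominant_colors` (keep at most five colors and stop once 80% of the pixels are covered) is one possible reading of the text and is an assumption.

```python
from collections import Counter

# The four sample rules quoted in the text: (hue, saturation, value) -> color.
RULES = {
    ("dark_orange", "medium", "light"): "dark_orange",
    ("dark_orange", "clear", "medium"): "Honey",
    ("green", "medium", "dark"): "Dark_green",
    ("red", "almost_gray", "medium"): "Dark_pink",
}


def infer_color(hue_degrees, sat_degrees, val_degrees):
    """Return the color with the highest activation, or None if no rule fires.

    AND is the minimum of the three degrees; rules concluding the same
    color are accumulated with the maximum.
    """
    activation = {}
    for (hue, sat, val), color in RULES.items():
        degree = min(hue_degrees.get(hue, 0.0),
                     sat_degrees.get(sat, 0.0),
                     val_degrees.get(val, 0.0))
        activation[color] = max(activation.get(color, 0.0), degree)
    best = max(activation, key=activation.get)
    return best if activation[best] > 0.0 else None


def dominant_colors(pixel_colors, max_colors=5, coverage=0.8):
    """Keep the most frequent colors until `coverage` of the pixels is reached."""
    counts = Counter(pixel_colors)
    total = sum(counts.values())
    kept, covered = [], 0
    for color, count in counts.most_common(max_colors):
        kept.append(color)
        covered += count
        if covered / total >= coverage:
            break
    return kept


pixel_color = infer_color(
    {"dark_orange": 0.8, "red": 0.2},
    {"clear": 0.7, "medium": 0.3},
    {"medium": 0.6, "light": 0.4},
)
object_colors = dominant_colors(["Honey"] * 6 + ["Dark_green"] * 3 + ["white"])
```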

V. MOSAIC SEMANTIC CLASSIFICATION
Historians who study the mosaics usually classify them into religion, cultural, social, economic and natural semantic classes. In religious scenes we find many gods or goddesses such as "Dionysus" the wine god, "Vulcain" the god of fire and "Oceanus" the sea god. Moreover, these scenes can contain wild animals such as lions or tigers and/or imaginary creatures such as seahorses, sea panthers or centaurs 1.
1 Centaurs: imaginary creatures from Greek mythology with the head and torso of a man and the body of a horse.
We decided that any mosaic containing god names or imaginary creatures is classified as a religious scene. Other mosaics are classified as economic scenes when they depict Roman economic activities such as hunting, olive harvesting, or fishing. Some mosaics depict Roman leisure, such as the circus with its many games and scenes celebrating fighting prowess, in which prisoners or gladiators must fight dangerous wild animals (e.g. tiger and bear). Other mosaics show the importance that the Romans gave to literature; in this context we find mosaics depicting famous musicians (e.g. Orphee) or the Muses who protect cultural activities such as music, song or poetry. These mosaics can be classified as cultural ones. Mosaics showing the value of wild animals in the Roman period can be classified as natural scenes. To classify the mosaics into these five classes we use semantic concepts, and we define a set of keywords for each concept. For example, "Dionysus" is a keyword for the "religion" concept. We apply an extensional measure, which uses the instances of the concept or the occurrences of the terms that denote the concept in the corpus. This measure is introduced by Sanderson & Croft [22]. They use the probability p(C) of finding the concept C in a given corpus:

$$p(C) \propto \sum_{n \in \mathrm{terms}(C)} \frac{\mathrm{Count}(n)}{\mathrm{nbc}(n)}$$
where the sum runs over the terms n that denote the concept C, Count(n) is the number of occurrences of the term n in the corpus, and nbc(n) is the number of concepts for which the term n is a label. This probability measure p(C) takes into account that one term can belong to one or more concepts. For example, the term "lion" can be found with "Dionysus" in a religion scene, or in a wildlife scene.
We then compute the information content Ψ(c) introduced by Resnik [23]:
$$\Psi(c) = -\log p(c)$$
The relevant concept C of a given mosaic is the concept that satisfies the condition of formula 4, where C denotes the class number and takes a value between 1 and 5. When applied to the mosaics metadata, this approach gives, in some cases, equal relevance values for two or more classes, and consequently returns more than one class. We therefore keep all the relevant classes that satisfy formula 4.
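The classification step can be sketched under explicit assumptions. The keyword lists below are hypothetical, the raw concept scores are normalized by their sum so that they add up to one, and the relevant classes are taken to be those with the highest probability (equivalently, the lowest information content), with ties returning several classes. The selection condition actually used by the system is the one referred to as formula 4.

```python
import math
from collections import Counter

# Hypothetical keyword lists; the actual lists are not given in the text.
CONCEPT_KEYWORDS = {
    "religion": {"dionysus", "oceanus", "centaur", "lion"},
    "natural": {"lion", "tiger", "bear"},
    "cultural": {"muse", "poet"},
}


def concept_probabilities(object_labels):
    """Score each concept by sum(Count(n) / nbc(n)), normalized to sum to 1."""
    counts = Counter(label.lower() for label in object_labels)

    def nbc(term):
        # Number of concepts for which the term is a label.
        return sum(term in keywords for keywords in CONCEPT_KEYWORDS.values())

    raw = {
        concept: sum(counts[term] / nbc(term) for term in keywords if counts[term])
        for concept, keywords in CONCEPT_KEYWORDS.items()
    }
    total = sum(raw.values())
    return {c: (score / total if total else 0.0) for c, score in raw.items()}


def information_content(probability):
    """Resnik-style information content, -log p."""
    return -math.log(probability) if probability > 0 else math.inf


def relevant_classes(object_labels):
    """Return every class that reaches the highest probability (ties kept)."""
    probabilities = concept_probabilities(object_labels)
    best = max(probabilities.values(), default=0.0)
    if best == 0.0:
        return []
    return sorted(c for c, p in probabilities.items() if math.isclose(p, best))


tied = relevant_classes(["Dionysus", "lion", "tiger"])
single = relevant_classes(["poet", "muse", "muse"])
```

In the first example the shared term "lion" contributes half of its count to each of its two concepts, which produces a tie and therefore a multi-class result.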

VI. QUERY AND RETRIEVAL PROCESS
We enable users not only to specify the objects they are looking for in a mosaic, but also to specify other semantic descriptors such as the object position, semantic color and/or spatial relationships between these objects. Based on the objects described by the user, the system classifies the query using the classification approach explained in Section 5 and returns the relevant class(es). The architecture of the retrieval system is summarized in Fig. 11. The user enters a textual query describing the object(s), the object position, the object color(s), and the object's position relative to other object(s). The system processes the query by filtering out non-useful words (e.g. the, a, to, of), parsing Boolean operators (e.g. or, and, not), and formalizing the user query in SEQL (Search Engine Query Language). Using the integrated Niagara search engine [24,25], it then returns the relevant XML documents and, consequently, the corresponding mosaic images.
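The preprocessing of a textual query (stop-word filtering and recognition of Boolean operators, colors, positions and relationships) can be sketched as follows. The vocabularies are small hypothetical samples, "at" is added to the stop words as an assumption, and the translation to SEQL is not shown because its syntax is not described in the text.

```python
import re

STOP_WORDS = {"the", "a", "to", "of", "at"}   # "at" is an added assumption
BOOLEAN_OPERATORS = {"or", "and", "not"}      # examples given in the text
# Hypothetical multi-word terms mapped to single tokens.
PHRASES = {
    "dark brown": "dark_brown",
    "very far from": "very_far_from",
    "far from": "far_from",
    "near to": "near_to",
}
COLORS = {"brown", "pink", "dark_brown"}
POSITIONS = {"left", "right", "middle", "up", "down"}
RELATIONSHIPS = {"near", "near_to", "far_from", "very_far_from", "between", "on"}


def parse_query(query):
    """Split a natural-language query into objects, colors, positions, etc."""
    text = query.lower()
    # Replace the longest phrases first so "very far from" wins over "far from".
    for phrase in sorted(PHRASES, key=len, reverse=True):
        text = text.replace(phrase, PHRASES[phrase])
    parsed = {"objects": [], "colors": [], "positions": [],
              "relationships": [], "operators": []}
    for token in re.findall(r"[a-z_]+", text):
        if token in STOP_WORDS:
            continue
        if token in BOOLEAN_OPERATORS:
            parsed["operators"].append(token)
        elif token in COLORS:
            parsed["colors"].append(token)
        elif token in POSITIONS:
            parsed["positions"].append(token)
        elif token in RELATIONSHIPS:
            parsed["relationships"].append(token)
        else:
            parsed["objects"].append(token)  # anything else is treated as an object
    return parsed


parsed = parse_query(
    "brown or pink or dark brown bear or horse or tiger at right near a dog")
```

This sketch does not track which color or operator applies to which object; a full implementation would need to preserve that grouping before building the search-engine query.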
The textual query is entered in natural language. The retrieval process handles this query automatically and recognizes the textual description of the object(s), their position, color(s) and spatial relationships. The system can be queried with one or more objects and one or more colors using Boolean operators. Fig. 12a shows a user looking for mosaics containing "brown or pink or dark brown bear or horse or tiger at right near a dog", and Fig. 12b shows the returned results.

VII. EXPERIMENTAL RESULTS
In this study, we use 100 queries to test the system. With these queries and the 1050 objects of our XML metadata, we test the system's recall and precision performance and its ability to handle semantic queries accurately. The overall system is implemented in Java, using a client/server architecture and threads.
As shown in Fig. 13a, the precision and recall performance of the MIRS retrieval system varies depending on the query. Processing a query takes between 70 and 800 ms, depending on its complexity. Overall, we achieved recall and precision rates of 86.6% and 87.1%, respectively. Fig. 13b shows the precision-recall curve for 40 different queries using the extracted knowledge such as object position, spatial relationship and color. Precision is a decreasing function of recall and reaches a minimum value of 37% when the recall is equal to 1.
In our tests, MIRS gives acceptable results but fails in some particular cases. For example, Fig. 14 shows that for queries requesting a particular object on another one, the system gives a good result for 80% of the tested images and fails for 20%. This may be due to the use of the experimentally predefined thresholds t1 and t2, where t1 bounds the objects' intersection region and is chosen to be greater than or equal to 0.1, while t2 bounds the fuzzy similarity of the positions and is chosen to be greater than or equal to 0.9.
The proposed classification approach is also tested; it gives an average success rate of 91.6%, which reaches 97.1% when more than one relevant class is returned. The return of multiple classes may be due to mosaics containing several scenes, such as agriculture and social scenes.
In this case, if the probability of finding one concept is greater than the probability of finding another, our approach returns the most relevant class. However, if the probabilities are equal, this can lead to multiple classes, as shown in Fig. 15.
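For reference, precision and recall for a single query can be computed as in the following sketch, which uses the standard definitions; the function name and the mosaic identifiers in the example are illustrative, and how per-query values were averaged into the overall rates is not stated in the text.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four mosaics returned, two of which are among the three relevant ones.
precision, recall = precision_recall(["m1", "m2", "m3", "m4"], ["m1", "m2", "m5"])
```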

VIII. CONCLUSION
In this paper, we presented an intelligent system to index and retrieve images of historical mosaics. Our purpose is to transform hard and complex multi-object content-based queries into simple textual queries. The MIRS system proposes a fuzzy color quantization approach in the HSV color space. In addition, it extracts semantic linguistic terms using a fuzzy similarity measure and proposes a semantic classification approach for mosaics metadata.
The system has an intuitive graphical user interface for queries and responses. The queries are written in natural language and specify the textual descriptions of the objects, their colors, their positions and/or the spatial relationships between them. The system was tested with a variety of complex queries on a database containing 1050 objects extracted from images of Tunisian historical mosaics. It achieved recall and precision rates of 86.6% and 87.1%, respectively. The average query processing time was around 400 ms.

Fig. 3. Membership functions of the fuzzy sets Left, Middle, and Right
First step: definition of the membership degrees of the fuzzy relationships Beside_of, Near_to, Very_near, Far_from, and Very_far_from. This step is based on the relative Euclidean distance between the objects' centroids. Using a training image database, we defined the fuzzy membership functions of this subset of spatial relationships, as shown in Fig. 4.
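The distance-based step can be sketched as follows. The actual membership functions are those learned from the training database and shown in Fig. 4; the trapezoid breakpoints below are hypothetical, and normalizing the centroid distance by the image diagonal is an assumed interpretation of "relative" distance.

```python
import math

# Hypothetical trapezoid breakpoints (a, b, c, d) on the relative distance.
DISTANCE_SETS = {
    "Beside_of": (0.0, 0.0, 0.05, 0.1),
    "Very_near": (0.0, 0.0, 0.1, 0.2),
    "Near_to": (0.1, 0.2, 0.3, 0.4),
    "Far_from": (0.3, 0.4, 0.6, 0.7),
    "Very_far_from": (0.6, 0.7, 1.0, 1.0),
}


def relative_distance(centroid1, centroid2, width, height):
    """Euclidean centroid distance divided by the image diagonal (assumed)."""
    return math.dist(centroid1, centroid2) / math.hypot(width, height)


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def distance_memberships(centroid1, centroid2, width, height):
    """Membership degree of a pair of objects to each distance relationship."""
    x = relative_distance(centroid1, centroid2, width, height)
    return {name: trapezoid(x, *params) for name, params in DISTANCE_SETS.items()}


# Centroids 25 pixels apart in a 300x400 image: relative distance 0.05.
degrees = distance_memberships((0, 0), (15, 20), 300, 400)
```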

Fig. 4. Fuzzy membership functions of some of the spatial relationships, determined from the distances between objects' centroids
Second step: characterization of the spatial relationships Over, Under, At_left_of, At_right_of, between, and Surrounded_by. This step relies on the predefined neighboring-object models and the sixty fuzzy rules described in Section IV.A, which are evaluated with the min and max operators and retained when their activation degree is greater than or equal to 0.5.

Fig. 9. Example of the value component extracted from two mosaics of Fig. 1

Fig. 10. Sample of image classification: a) Cultural class. The poet Virgil is between the Muses Clio and Melpomene. b) Religion class. Venus, the goddess of love and beauty, is on a sea lion. The image also contains a seahorse and a sea panther.

Fig. 12. The system's query GUI showing: a) a sample query in which we search for "brown or pink or dark brown bear or horse or tiger at right near a dog"

TABLE I. SAMPLES OF FUZZY OBJECTS SPATIAL RELATIONSHIPS FORMALIZED INTO XML LANGUAGE

TABLE II. EXAMPLES OF FORMALIZED FUZZY COLOR IN XML