Shape-Based Image Retrieval Utilising Colour Moments and an Enhanced Boundary Object Detection Technique

The need for automatic object recognition and retrieval has increased rapidly in the last decade. In content-based image retrieval (CBIR), visual cues such as colour, texture, and shape are the most prominent features used. Texture features are not considered a significant discriminator unless they are integrated with colour features. Colour-based image retrieval using global and/or local features has proved its ability to retrieve images with a high degree of accuracy. In contrast, shape-based retrieval still suffers from numerous unsolved problems, such as precise edge detection, overlapping objects, and the high cost of feature extraction. In this paper, global colour features are utilised to discriminate unrelated images. Furthermore, a novel hybrid approach to image retrieval is proposed, combining a boundary-based shape descriptor (BBSD) with a region-based shape descriptor (RBSD). An enhanced boundary object detection (EBOD) algorithm is proposed, which uses the Canny edge detector to detect shape boundaries and morphological opening to remove isolated nodes. Subsequently, morphological closing is utilised to solidify objects within the target image and thereby enhance the representation of shape-based features. Finally, shape features are extracted, and the Euclidean distance measure with different threshold values is adopted to measure the similarity between feature vectors. Five semantic categories of the WANG image database are selected to test the proposed approach. The experimental results are promising when compared with the most common related approaches.

Keywords—Boundary Based Shape Descriptor (BBSD); Region Based Shape Descriptor (RBSD); CBIR; EBOD; edge detectors


I. INTRODUCTION
CBIR has been one of the most active research areas since the 1970s. It enables us to retrieve images based on visual content rather than textual description. Image databases with thousands or even millions of images are easy to create, maintain, and manipulate at low cost and with a high level of efficiency. It is obvious that many fields, such as biometric security and medicine, need image database retrieval systems with a higher degree of precision. As stated in [1], colour is one of the most important features to be extracted in any CBIR system. Many researchers have deployed the colour histogram approach. The colour histogram is easy to compute, with an acceptable level of retrieval accuracy, but it lacks spatial distribution information and is less efficient in handling noise. To overcome the limitations of the colour histogram, colour moments are applied [2]. Humans can easily recognize objects within an image. Therefore, the shape descriptor is considered one of the most significant descriptors that may enhance image-based retrieval.
There are two major methods of feature extraction in shape-based image retrieval, namely the boundary (contour) based descriptor and the region-based descriptor.
Boundary-based feature extraction relies on the outer boundary, while with region-based feature extraction the whole region is considered [3]. The boundaries of objects within an image may be identified by determining sharp discontinuities (changes in pixel intensity) within the image. Detecting sharp discontinuities in an image is well known as edge detection [4]. Sobel, Prewitt, Canny, Laplacian, and Roberts are examples of traditional edge detection operators [5].
As stated in [6], boundary-based shape descriptors (BBSDs) are needed when the boundary (contour) has importance over the interior content of the shape, while region-based shape descriptors (RBSDs) are needed when the interior content of the shape is significant to the retrieval process. BBSDs and RBSDs are further classified into local descriptors and global descriptors. When an image is segmented into different regions and features are computed based on these regions, the descriptor is considered local, while if the whole shape is treated as one region, the descriptor is considered global. Most researchers have considered either BBSDs [7] or RBSDs [8]. Furthermore, simple shape descriptors such as major axis length, eccentricity, and circularity do not perform as good discriminators if there are no big differences between shapes [6]. There is a lack of research exploring the integration of boundary-based and region-based image retrieval techniques. Whenever it comes to shape-based retrieval, much attention is given to global feature extraction and boundary-based retrieval. That is because shape-based feature extraction is a time-consuming process, so most researchers compromise between efficiency and accuracy. In this paper, a hybrid approach combining boundary-based shape descriptors and region-based shape descriptors is proposed. To enhance the accuracy of retrieval and maintain a high level of feature extraction efficiency, a systematic approach is proposed to isolate the shape region from the background, enhance object recognition with a new algorithm named enhanced boundary object detection (EBOD), utilise morphological opening to remove isolated nodes and morphological closing to solidify objects within the image, and extract global shape-based descriptors. The time-consuming processes are done offline, while the simple global feature extraction for the query image is done online. The rest of the paper is organized as
follows. Section 2 illustrates the proposed approach in detail. Section 3 presents the similarity measures deployed in this research. Section 4 discusses experimental results. Finally, Section 5 highlights the conclusion and future work.

II. PROPOSED APPROACH
The proposed approach allows colour feature extraction based on colour moments, while object-based recognition is done using boundary-based and region-based techniques. The features extracted via colour and shape are combined into one feature vector to represent the target image. Then similarity based on the Euclidean distance measure is used to rank and retrieve images. Figure 1 shows the system diagram of the proposed approach.

A. Colour Features Extraction
Global and local colour features are widely used in CBIR. Many research attempts have been made to localize colour features by dividing images into equal sub-images or into overlapping sub-images [9]. Colour-based image retrieval utilising local features overcomes the limitations of global features, such as the depiction of the spatial distribution of colours. In this research, the spatial distribution of colours is not significant because shape rather than colour is the main concern. Consequently, global colour features were adopted to reduce the cost of computation. Colour feature extraction involves two steps:
 Separate each channel of the RGB image into R, G, and B.
 Extract the features (colour moments) shown in Table 1 for each channel.

Mean
The mean value for each colour channel R, G, and B (MATLAB: mean2).

Standard deviation
The standard deviation value for each colour channel R, G, and B (MATLAB: std2).

Entropy
The entropy value for each colour channel R, G, and B. As stated in [10], entropy is a measure of randomness used to depict the texture of the input image. Entropy is defined as -sum(p.*log2(p)), where p contains the normalized histogram counts.
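The three colour moments above can be sketched as follows. This is an illustrative Python/NumPy version of the MATLAB mean2, std2, and entropy calls, not the authors' code; the 256-bin histogram used for entropy is an assumption:

```python
import numpy as np

def colour_moments(channel):
    """Compute mean, standard deviation, and entropy of one colour channel.

    `channel` is a 2-D array of 8-bit intensities (one of R, G, or B).
    """
    mean = channel.mean()                      # MATLAB mean2
    std = channel.std()                        # MATLAB std2
    counts, _ = np.histogram(channel, bins=256, range=(0, 256))
    p = counts / counts.sum()                  # normalized histogram
    p = p[p > 0]                               # avoid log2(0)
    entropy = -np.sum(p * np.log2(p))          # -sum(p.*log2(p))
    return mean, std, entropy

def colour_feature_vector(rgb):
    """Concatenate the three moments of each of the R, G, B channels."""
    return [m for ch in range(3) for m in colour_moments(rgb[:, :, ch])]
```

For a completely uniform channel the histogram collapses to a single bin, so the entropy is zero, matching the intuition that a flat region carries no texture.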

B. Object Recognition
One of the most challenging topics in shape-based image retrieval is accurate object identification and recognition. In order to retrieve images based on shape, we need to identify objects, isolate them from the background, and extract shape-based features. Figure 3 shows the proposed system architecture to identify, enhance, and solidify objects.
b) To identify the target point Tp, continue scanning after the start point Sp until a node with 1's is found, then consider that point as Tp and apply the suitable rule as shown in Algorithm II.
 After filling the nodes between Sp and Tp, check the connectivity of Tp based on the following priority rules.

If rule 1 is true then move to 1; else if rule 2 is true, move to 2; else if rule 3 is true, move to 3; else if rule 4 is true, move to 4; else let Tp = new Sp and go to b.

Algorithm II: Join Disconnected Edges (JDE)
Sp = (x1, x2), Tp = (y1, y2)
∆r = y1 − x1 : difference in rows
∆c = y2 − x2 : difference in columns
1st case:
 Fill in between (x1, x2) and (y1, y2) with 1's.
2nd case: This indicates left connectivity; then do:
 Based on the ∆r value, add one to x1 gradually and fill the new points with 1's.
3rd case: This indicates diagonally right connectivity; then do:
 Add one to x1 and x2 until the target point is reached, and fill the new points with 1's.
4th case:
If (∆r == 0 && ∆c ≥ 1), the start point and the target point are on the same row. Do:
 Add ∆c gradually to the start point and fill the new points with 1's until it reaches the target point.
Figure 7 shows an example of the 2nd case. The index of the last point is (4, 13), the target point is (6, 6), ∆r = 2, ∆c = −7. Add one to the row of Sp (5, 13) and fill with 1; add 1 to Sp again (6, 13) and fill with 1. Finally, fill in between (6, 6) and (6, 13) with 1's. Figure 8 shows an example of the 3rd case. The index of the last point is (7, 5), the target point is (10, 8), ∆r = 3, ∆c = 3. Add one to x1 and x2 gradually to reach the target point and fill the in-between points with 1's. Figure 9 shows an example of the 4th case. The index of the last point is (14, 13), the target point is (14, 16), ∆r = 0, ∆c = 3. The two points are on the same row, so fill in between with 1's. Between points (15, 15) and (15, 19), the 4th case was applied. Figure 10 shows an example of the 5th case. The index of the last point is (15, 19), the target point is (20, 14), ∆r = 5, ∆c = −12. Add 1 to x1 and −1 to x2 gradually to reach the target point and fill the in-between points with 1's.
Figure 11 shows the final result. Figure 12 shows the result of applying the object recognition model to a real image shown in the upper left corner. In order to achieve better results, the scanning process is done twice, from left to right and from right to left. The same EBOD approach is applied with a slight modification.
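The gap-filling idea behind Algorithm II can be sketched as follows. This is a simplified, generic Python version rather than the authors' MATLAB implementation: it steps diagonally from Sp towards Tp while both the row and the column differ, then fills the remaining straight run, which reproduces the effect of the diagonal, same-row, and mixed cases with one loop:

```python
def sign(d):
    """Return -1, 0, or 1 according to the sign of d."""
    return (d > 0) - (d < 0)

def join_disconnected_edges(grid, sp, tp):
    """Fill the cells between start point `sp` and target point `tp` with 1's.

    `grid` is a mutable 2-D list of 0/1 values; `sp` and `tp` are (row, col)
    tuples. Moves one step towards the target in each coordinate that still
    differs, producing an 8-connected run of filled cells.
    """
    r, c = sp
    tr, tc = tp
    grid[r][c] = 1
    while (r, c) != (tr, tc):
        r += sign(tr - r)  # one row towards the target (follows delta-r)
        c += sign(tc - c)  # one column towards the target (follows delta-c)
        grid[r][c] = 1     # fill the new point with 1
    return grid
```

With Sp = (4, 13) and Tp = (6, 6), as in the Figure 7 example, this connects the two points with an unbroken run of 1's ending at (6, 6); the intermediate path differs slightly from the 2nd-case rule in the text but joins the same endpoints.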

2) Shape-based features:
Feature selection is crucial to any CBIR system. In this research, many shape-based features were tested, and the most significant features were selected to represent objects within an image and to compare the query image with the image database. Table 2 shows the shape features selected from the list of features available in the MATLAB R2015a documentation. The regionprops function in MATLAB is used with the features shown in Table 2.

Shape feature Description

Area
A scalar that specifies the actual number of pixels in the region.

Centroid
A vector that represents the center of mass of the region. In this research, the first two elements of the vector are considered. These elements specify the horizontal and vertical coordinates of the center of mass.

Major Axis Length
A scalar that gives the length of the major axis of the ellipse that has the same normalized second central moments as the region.

Minor Axis Length
A scalar that specifies the length of the minor axis of the ellipse that has the same normalized second central moments as the region.

Eccentricity
The ratio of the distance between the foci of the ellipse that has the same second moments as the region to its major axis length.

Equivalent Diameter
A scalar that gives the diameter of a circle with the same area as the region, calculated as sqrt(4*Area/pi). Table 3 shows an example of the extracted shape features, which represent the image shown in Figure 13.
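The Table 2 features can be sketched from first principles as follows. This is an illustrative NumPy version following the same moment-based definitions as MATLAB's regionprops (the 1/12 term treats each pixel as a unit square), not a call to regionprops itself:

```python
import numpy as np

def shape_features(mask):
    """Compute the Table 2 shape features for a binary object mask.

    `mask` is a 2-D boolean/0-1 array containing a single region.
    """
    rows, cols = np.nonzero(mask)
    area = rows.size
    centroid = (cols.mean(), rows.mean())          # (horizontal, vertical)

    # Normalized second central moments of the pixel coordinates.
    mu_rr = rows.var() + 1 / 12
    mu_cc = cols.var() + 1 / 12
    mu_rc = np.mean((rows - rows.mean()) * (cols - cols.mean()))

    # Axes of the ellipse with the same second moments as the region.
    common = np.sqrt((mu_rr - mu_cc) ** 2 + 4 * mu_rc ** 2)
    major = 2 * np.sqrt(2) * np.sqrt(mu_rr + mu_cc + common)
    minor = 2 * np.sqrt(2) * np.sqrt(mu_rr + mu_cc - common)
    eccentricity = np.sqrt(1 - (minor / major) ** 2)
    equiv_diameter = np.sqrt(4 * area / np.pi)     # sqrt(4*Area/pi)

    return {"Area": area, "Centroid": centroid,
            "MajorAxisLength": major, "MinorAxisLength": minor,
            "Eccentricity": eccentricity, "EquivDiameter": equiv_diameter}
```

For a square region the two axes coincide and the eccentricity is zero, which is why such simple descriptors discriminate poorly between near-symmetric shapes, as noted in the introduction.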

III. SIMILARITY AND PERFORMANCE MEASURES
Distance measures or distance functions are used to calculate the similarity between feature vectors. As stated in [11], the Euclidean distance measure is more precise compared with the most common distance measures available. Euclidean distance measures similarity by calculating the distance between two vectors v1 and v2 as the square root of the sum of the squared differences between them.

d(v1, v2) = √( Σi (v1i − v2i)² )    (1)

To measure the performance of any CBIR system, precision and recall are the most common measures available. Precision measures the effectiveness of retrieval, as it measures the accuracy of the retrieved set compared with the query image [12]. Precision is calculated by dividing the number of relevant (accurate) retrieved images by the number of all retrieved images, while recall measures the system's ability to return as many as possible of the relevant images available in the database [13]. As stated in [13], the following formulas define precision and recall:

Precision = (number of relevant images retrieved) / (total number of images retrieved)
Recall = (number of relevant images retrieved) / (total number of relevant images in the database)

When precision and recall are measured for the same image using different thresholds, and many hits (images) are used to evaluate the precision and recall of the same data set, then average precision, average recall, and overall average precision are needed to measure the performance of the proposed system.
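These measures can be sketched as follows; an illustrative Python version in which the function and variable names are my own:

```python
import math

def euclidean_distance(v1, v2):
    """Square root of the sum of squared differences between two
    feature vectors (the distance of eq. 1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the relevant set.

    Precision = |retrieved AND relevant| / |retrieved|
    Recall    = |retrieved AND relevant| / |relevant|
    """
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Raising the retrieval threshold grows the retrieved set, which tends to raise recall while lowering precision, matching the trade-off reported in the experiments.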

IV. EXPERIMENTAL RESULTS
The proposed approach is implemented using MATLAB R2015a with the WANG database [14]. The WANG database is extensively used in CBIR to test the effectiveness of any CBIR system because of its clear categorization and the reasonable size of each category [11]. From this database, five semantic categories are chosen, as shown in Table 4. If the distance between the query image features and the database image features is less than or equal to a given threshold (α), the image is considered similar. The distance measure is calculated based on eq. 1. Then precision, recall, average precision, and average recall are used to evaluate the system performance. Image queries are shown in the upper left corner of each retrieved set. Due to space limitations, it is not possible to show all samples of the retrieval results.
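The threshold rule above can be sketched as follows; an illustrative Python fragment, where the database layout (a mapping from image name to a precomputed feature vector) is an assumption:

```python
import math

def retrieve(query_features, database, alpha):
    """Return the names of database images whose Euclidean distance (eq. 1)
    to the query features is <= the threshold alpha, ranked by distance.

    `database` maps image name -> precomputed feature vector (the offline
    part of the system); only the query's features are computed online.
    """
    scored = []
    for name, fv in database.items():
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(query_features, fv)))
        if d <= alpha:                 # similarity decision at threshold alpha
            scored.append((d, name))
    return [name for d, name in sorted(scored)]
```

Varying alpha directly controls the number of retrieved images (NRI), and hence the precision/recall balance reported in Tables 5 through 10.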
From each category, five different images are selected as image queries. At different thresholds (α), and for each query image, the number of retrieved images (NRI), precision (PR), and recall (RE) are calculated. Tables 5 through 9 show the test results for the different semantic categories. These tables show that there is a slight decrease in precision as recall increases. Recall is controlled through the tolerance (threshold) value. Table 10 summarizes the testing results.
As shown in Table 10, the best precision is achieved with the dinosaurs and buses categories, while roses achieved better results compared with buildings and horses. The nature of the image has a dramatic effect on any CBIR system. Moreover, when the object is in contrast with the image background, shape-based image retrieval achieves better results. Dealing with real-world images, as in the WANG database, is challenging compared with synthesized image databases such as MPEG-7. Table 11 shows two of the most relevant approaches for comparison. The overall precision of the proposed approach is 91%. The image database used to test the proposed system is the WANG database, which contains real-world (natural) images. Despite the nature of the image database, the results obtained are better than those of two of the most common approaches, presented in [15] and [16]. In [15] the overall precision is 90%, while in [16] it is 88%. Consequently, the proposed approach is efficient, robust, and realistic.

V. CONCLUSION AND FUTURE WORK
A new and efficient approach to shape-based image retrieval is proposed in this paper. Edge detection as a boundary-based technique and a region-based technique utilising morphology are used in addition to colour moments. Combining colour and shape is essential to discriminate the unrelated images being retrieved. Furthermore, shape features extracted using boundary-based image retrieval are not necessarily suitable for region-based image retrieval. In order to improve the precision of retrieval, a new edge detection enhancement algorithm is proposed. This algorithm is used to remove isolated nodes and to solve the problem of disconnected edges raised by the most common edge detectors, such as Canny, Sobel, and Prewitt. Global colour features and region-based local features are integrated to enhance the accuracy of retrieval. Furthermore, the morphological operators opening and closing are used in a systematic way to solidify an object's region within the target image. As a result, the shape discriminators are able to discriminate images precisely. The experimental results obtained are promising when compared with the most common approaches related to the method proposed in this paper. Despite using real-world images from the WANG database, the precision rate is still high. As future work, user feedback to bridge the semantic gap will be explored, in order to enhance system performance based on supervised learning.

Figure 2
Figure 2 presents the interface of the proposed system. Query by example (QBE) is adopted in this work.

Figure 5
Figure 5 shows an example of removing isolated nodes.

Figure 10
Figure 10 shows an example of the 5th case. The index of the last point is (15, 19), the target point is (20, 14), ∆r = 5, ∆c = −12. Add 1 to x1 and −1 to x2 gradually to reach the target point and fill the in-between points with 1's.

TABLE I .
COLOUR FEATURES DESCRIPTION

TABLE IV .
CATEGORIES OF TEST IMAGES

TABLE V .
TEST RESULTS PRECISION VS. RECALL (BUILDINGS)

TABLE XI .
RELEVANT APPROACHES