Feature Selection Based on Minimum Overlap Probability (MOP) in Identifying Beef and Pork

Feature selection is one of the most important techniques in image processing for classification. In classifying beef and pork based on texture features, feature overlap is a difficult issue. This paper proposes a feature selection method based on Minimum Overlap Probability (MOP) to obtain the best features. The method was tested on two datasets of features extracted from digital images of beef and pork, which had similar textures and overlapping features. The selected features were used for training and testing a Backpropagation Neural Network (BPNN). Training used single features and several combinations of selected features. The test results showed that the BPNN detected beef or pork images with 97.75% accuracy. From this performance, it was concluded that the MOP method can be used to select the best features for classifying or identifying two digital image objects with similar textures.
Keywords: overlap; feature selection; best feature; minimum overlap probability (MOP); identifying


I. INTRODUCTION
Image identification based on texture features from many images with varying feature types, X = (x_i, i = 1…M), is a difficult task. The multiple features of multiple objects used for a given application fall into three categories, i.e. relevant, irrelevant, and redundant features [1]. Similarly, according to [2][3], features can be divided into three main groups, i.e. 1) strongly relevant, 2) weakly relevant and 3) irrelevant. It follows that lacking good knowledge of texture features is a problem when determining the best feature to use as a key for classification or identification. Therefore, the feature selection process is important.
Feature selection is very effective in supporting performance in special tasks [3][4][5][6][7]. Such tasks in image processing and computer vision include classification [8][9][10], clustering [11], computational neuroscience, imaging genomics [12][13], protein property prediction [14], text mining, and image annotation [15][16]. Features have become an important part of the study of image processing and computer vision [17][18]. A feature is the unique identity of an image. This unique identity can be used as a key to recognize an image or to distinguish one image from another. Key features are essential when an application involves hundreds of data items with tens of characteristics [1][12][18]. In practice, the same feature taking different values for two different objects is a problem in its own right [18].
Feature selection is one of the main tasks in classification. In a large feature collection, it is possible that some or all of the features are irrelevant or redundant. Features can be extracted from three aspects of an object, i.e. texture, shape, and color. In this study, the discussion is focused on texture features. Feature separation must be performed with the correct extraction because the extracted features are used to train the classifier. Selecting key features is part of the process of improving classifier accuracy [19]. One feature selection technique is selecting features by a minimum redundancy criterion in the classification process [7]. Feature selection by minimum redundancy is also used by [4] for classification. The main function of selection is choosing minimum redundancy features from two objects which can be used to form distinct classes [2]. However, both approaches apply feature selection to multi-label features of a single object. Unlike previous researchers [2], [4], [8], in this paper the writer performed multi-label feature selection for multiple objects in the identification of digital images of beef and pork. The selection criterion is the relevance of a feature to classification accuracy. The writer proposes that a feature is relevant when it has minimum overlap. The challenge was choosing one or a few of the overlapping features as candidates for the best feature. The basic assumption was that the smaller the range of overlapping values of a feature, the better suited the feature was to win the selection (key feature). The basic problem with feature overlap is that the feature group is not a comprehensive representation [20] of the value of a feature as a target, because it still contains values belonging to other features.
In this paper, the writer offers a feature selection method to obtain the best features from two groups of features of two digital images with similar textures, using the MOP method. The identity of each feature was determined by its min-max values, after which the overlapping value of each feature was calculated. Then the overlap probability of each feature was calculated. The next step was selecting key (best) features by applying a threshold to the probability values. Features with a probability less than the threshold won the selection. This paper is arranged as follows: Section 2 describes relevant scientific works. Section 3 describes the methodology. The MOP method the author offers is explained in Section 4, Test and Result are described in Section 5 and Discussion in Section 6. Lastly is the Conclusion.

II. RELATED WORK
Kamyab and Eftekhari [21] present a study on the use of the Multimodal Optimization (MO) method for feature selection. To do this, modifications of several well-known Evolutionary Algorithm (EA) based methods and a proposed niching method called GA_SN_CM are applied to the feature selection task and compared with several well-known EA-based feature selection methods, to study the strength of the MO method in improving feature selection results. Sotoca and Filiberto [22] treat feature selection, or variable selection, as selecting the most relevant features (attributes) from a group of data variables. In this framework, the term relevant refers to the effect of a given feature or feature set on achieving the minimum possible error in classification or recognition problems.
Al-Ani, Alsukker and Rami [23] propose a differential evolution algorithm for wrapper feature selection which uses a simple yet effective way to narrow the search without removing any feature. A number of datasets of different sizes are used to evaluate the performance of the proposed method, which gives a good indication of its exploration and exploitation ability.
Kabir, Islam and Kazuyuki [24] propose a new algorithm called the Constructive Approach for Feature Selection (CAFS), based on the wrapper approach with a sequential search strategy. As a learning model, CAFS employs a three-layer feed-forward neural network (NN). The proposed technique combines feature selection (FS) with NN architecture determination. It uses a constructive approach which involves correlation information in selecting features and determining the network architecture.
Lutu and Engelbrecht [25] discuss algorithms for feature selection in predictive data mining for classification problems, aiming to select relevant and non-redundant features for the classification task. A relevant feature is defined as one which correlates with the target function. A redundant feature is defined as one which correlates with other features. They propose a new algorithm that combines certain threshold values and decision rules to select a feature subset.
Hanchuan, Fuhui and Chris [26] combine max-dependency, max-relevance and min-redundancy for feature selection. In terms of mutual information, the purpose of feature selection is to find a feature set S with m features (x) which together have the largest dependency on the target class c. This scheme is called max-dependency, but it is difficult to carry out. The alternative is max-relevance, which searches for features that approximate max-dependency using the mean of all mutual information values between the individual features x and the class c.

III. FRAMEWORK OF MOP
Feature selection was performed to get the best features from the feature sets of beef and pork for the classification task. Every feature set has 20 features, and every feature consists of 200 data values. The problem was that the extracted values of beef and pork for the same feature name did not produce independent features, but overlapping features instead. To determine whether the selected features were the best features as expected, they were tested on an artificial neural network (ANN).

A. Feature Extraction
Extraction is a pre-processing stage, the basic stage for obtaining as much data as possible before processing. Extraction was performed on each image to determine its texture characteristics. The object of the writer's research was the texture features of digital images of beef and pork. The main goal was to obtain strong features which could be used to differentiate the textures of the two. As is usual in pre-processing, the study extracted several types of features that have been used by previous researchers. Some of those features were used to look for unique features in the extracted images. The extraction model in this study was the gray level co-occurrence matrix (GLCM) method. A GLCM is a tabulation of how often different combinations of gray levels co-occur in an image or image section [27]. The texture features were calculated from the GLCM to measure the variations in intensity (i.e. image texture) at the pixels of interest. The co-occurrence matrix was calculated with two parameters: the relative distance d between a pixel pair, measured in pixels, and their relative orientation θ. These two parameters were expected to reveal the special characteristics of the two kinds of digital images, beef and pork. The unique features expected to be found in the digital images were: autocorrelation, contrast, correlation, cluster prominence, cluster shade, dissimilarity, energy, entropy, homogeneity, maximum probability, sum of squares variance, sum average, sum variance, sum entropy, difference variance, difference entropy, information measure of correlation, inverse difference normalized,

B. Feature Range
Based on the extraction, the values of the 20 types of extracted features showed that no feature had an independent range, i.e. none could be categorized as strongly relevant. Instead, there were overlaps. Therefore, in this study the values of feature overlap received special attention. This problem required a formulation that could be used to determine the values of features in the overlap area [16]. The method aims to find features with minimum overlap range probability and to select features by a threshold to get the best features. A feature range is the range formed by the minimum and maximum values of a feature. The formulas were as follows:

1) Max Value: The maximum value is the highest value of a feature in a data set.

$$\mathrm{Max}_x = \max(x_1 : x_m) \tag{1}$$

2) Min Value: The minimum value is the lowest value of a feature.

$$\mathrm{Min}_x = \min(x_1 : x_m) \tag{2}$$

3) Feature Range: The area bounded by the minimum and maximum values of a feature.

$$\mathrm{Feature}_x = (\mathrm{Min}_x : \mathrm{Max}_x) \tag{3}$$

Here $x_1 : x_m$ is the group of values of feature x, from the 1st to the m-th data item.

The ranges of beef features (Fs) and pork features (Fb) were the areas formed by $\mathrm{Min}_x$ and $\mathrm{Max}_x$ in (3) of the extracted data. These min-max values were used as the lower and upper limits of the feature areas, so the area of every Fs and Fb could be determined. To determine visually the area where Fs and Fb overlapped, each feature area was drawn in two dimensions (2D). The feature areas were formed by assigning the range of the feature to both the x and y axes. Therefore, the value range on the x axis was (min, max), and the same value range applied to the y axis. When the same feature (X) from beef and pork image data was drawn in this plane, the features could be analyzed.

Fb Range: Coordinate point $(x_1, y_1)$ was the lower left corner, corresponding to the feature values (min, min) of $X_1$; $(x_2, y_1)$ was the lower right corner, corresponding to (max, min); $(x_1, y_2)$ was the upper left corner, corresponding to (min, max); and $(x_2, y_2)$ was the upper right corner, corresponding to (max, max). The Fb area could be determined from these coordinates, so Fb was

$$Fb = ((x_1, y_1), (x_2, y_1); (x_1, y_2), (x_2, y_2)) \tag{4}$$

Fs Range: Coordinate point $(x_3, y_3)$ was the lower left corner, corresponding to the feature values (min, min) of $X_1$; $(x_4, y_3)$ was the lower right corner, corresponding to (max, min); $(x_3, y_4)$ was the upper left corner, corresponding to (min, max); and $(x_4, y_4)$ was the upper right corner, corresponding to (max, max). The Fs area could be determined from these coordinates, so Fs was

$$Fs = ((x_3, y_3), (x_4, y_3); (x_3, y_4), (x_4, y_4)) \tag{5}$$
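The range construction in (1) to (5) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `feature_range` and `square_corners` and the sample contrast values are hypothetical, chosen only to show how one min-max range yields the four corner points of a square feature area.

```python
from typing import Sequence, Tuple

Range = Tuple[float, float]


def feature_range(values: Sequence[float]) -> Range:
    """Return (min, max) of one feature's extracted values, as in (1)-(3)."""
    if len(values) == 0:
        raise ValueError("feature has no values")
    return (min(values), max(values))


def square_corners(rng: Range):
    """Corner points of the 2D area obtained by putting the same range on
    both axes, ordered as in (4)-(5): lower left, lower right, upper left,
    upper right."""
    lo, hi = rng
    return ((lo, lo), (hi, lo), (lo, hi), (hi, hi))


# Hypothetical contrast values for a few pork images (not data from the paper).
pork_contrast = [0.42, 0.37, 0.55, 0.48]
fb = feature_range(pork_contrast)
fb_corners = square_corners(fb)
```

Because the same range is used on both axes, each feature area is a square lying on the diagonal of the plane, which is what later allows the overlap area to be expressed from a single length.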

C. Overlap
Overlap between Fs and Fb occurred when their value ranges for one or more features extended into each other. In this study, every feature had a value range (3). Based on (3), the features Fs(X_1) and Fb(X_1) overlapped when the maximum value of Fs(X_1) was bigger than the minimum value of Fb(X_1) and the minimum value of Fs(X_1) was less than the minimum value of Fb(X_1). In set theory, an overlapping set is called an intersection. The set-theoretic formulation of the intersection, (6), was effective for determining the elements of an intersection. However, this study aimed to determine a range, so (6) was modified to find the intersection value. In this study, the intersection is called overlap.
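As a hedged illustration of the overlap idea, the sketch below computes the shared part of two min-max ranges. Note the assumption: it uses the general interval-intersection test (the larger of the two minima must lie below the smaller of the two maxima), which covers either ordering of Fs and Fb, whereas the text above states the condition for one ordering only. The name `overlap_interval` and the sample ranges are hypothetical.

```python
from typing import Optional, Tuple

Range = Tuple[float, float]


def overlap_interval(a: Range, b: Range) -> Optional[Range]:
    """Return the interval shared by ranges a and b, or None if they are disjoint.

    Assumption: standard interval intersection, symmetric in a and b.
    """
    lo = max(a[0], b[0])   # the overlap starts at the larger minimum
    hi = min(a[1], b[1])   # the overlap ends at the smaller maximum
    return (lo, hi) if lo < hi else None


# Hypothetical ranges of one feature for beef (fs) and pork (fb).
fs = (0.5, 0.9)
fb = (0.2, 0.6)
shared = overlap_interval(fs, fb)
disjoint = overlap_interval((0.7, 0.9), (0.2, 0.6))
```

Here `shared` is the range in which a value of this feature could belong to either beef or pork, while `disjoint` is `None` because the two ranges never meet.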

D. MOP
A probability is a number which expresses how likely an event is. In this study, Fs and Fb could take similar values in the overlap area. The question was how big the overlap between them was. Using set theory, the number of members of the set Fs could be written as nFs, and the number of members of the set Fb as nFb. Based on this, the probability of the overlap area (ProbArea) was defined in (8). In this study, the author modified (8) by replacing the number of set members with the size of the feature area. The size of each area is defined below.

Size of the feature areas of beef (Ls) and pork (Lb): The general formula for area is length multiplied by width. The length of Fs was the range of Fs along the X axis, i.e. $\Delta x_s$, the distance between the maximum point ($x_4$) and the minimum point ($x_3$). The width of Fs was the range of Fs along the Y axis, i.e. $\Delta y_s$, the distance between the maximum point ($y_4$) and the minimum point ($y_3$). Likewise, the length of Fb was the range of Fb along the X axis, i.e. $\Delta x_b$, the distance between the maximum point ($x_2$) and the minimum point ($x_1$), and the width of Fb was the range of Fb along the Y axis, i.e. $\Delta y_b$, the distance between the maximum point ($y_2$) and the minimum point ($y_1$). So Ls and Lb were defined as:

$$\Delta x_s = |x_4 - x_3| \tag{9}$$
$$\Delta y_s = |y_4 - y_3| \tag{10}$$
$$Ls = \Delta x_s \cdot \Delta y_s \tag{11}$$
$$\Delta x_b = |x_2 - x_1| \tag{12}$$
$$\Delta y_b = |y_2 - y_1| \tag{13}$$
$$Lb = \Delta x_b \cdot \Delta y_b \tag{14}$$

Based on (11) and (14), Ls and Lb were areas with the same length on both sides, because the same feature range was used on the x and y axes. The overlap area therefore also had equal sides, $\Delta x = \Delta y$, so Lo in (17) could be written as

$$Lo = \Delta x^2 \tag{18}$$

Feature values falling in the area Lo (18) were expected to cause problems when identifying images of beef or pork, possibly because the feature values of beef and pork duplicate each other there. Thus, the bigger the value of Lo, the more feature values of beef and pork were duplicated, and vice versa. It should be noted that this area was formed by the range of feature values, so the overlap area of a feature was not a fixed quantity: it depended on how stable the range of each feature was. However, by using the area range based on the min-max values of the features, the system was still able to obtain the overlap area. The possibility of overlap, or overlap probability, was the main focus of this study. To determine the overlap probability of every feature between Fs and Fb, Lo could be compared with the combined size of both feature areas (Ls + Lb). In this study, the overlap probability computed by the author was called error probability (ProbError). The formula was

$$\mathrm{ProbError} = \frac{2 \cdot Lo}{Ls + Lb} \tag{19}$$

Equation (19) meant that the smaller the value of Lo, the smaller the value of ProbError. Conversely, the bigger the value of Lo, the bigger the value of ProbError.
The author used this ProbError value as the basis for selecting the best features. Selection was performed by applying a threshold of ProbError < 10%. The author named this the Minimum Overlap Probability (MOP) method.
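As a worked illustration of (18) and (19), suppose, purely hypothetically (these are not values from the paper's tables), that a feature has the pork range Fb = (0.20, 0.60) and the beef range Fs = (0.55, 1.00), with each area being a square built from its range on both axes:

```latex
\begin{align*}
\Delta x &= |x_2 - x_3| = |0.60 - 0.55| = 0.05 \\
Lo &= \Delta x^2 = 0.0025 \\
Ls &= (1.00 - 0.55)^2 = 0.2025, \qquad Lb = (0.60 - 0.20)^2 = 0.16 \\
\mathrm{ProbError} &= \frac{2 \cdot Lo}{Ls + Lb} = \frac{0.005}{0.3625} \approx 0.0138
\end{align*}
```

With a ProbError of about 1.4%, this hypothetical feature would fall below the 10% threshold and be selected.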

1) Algorithm of feature selection by the Minimum Overlap Probability (MOP) method
a) Calculate the min-max values of the extracted features of the digital images of beef and pork.
b) Determine the feature area (F_x) of the digital images of beef and pork.
   • If F_x had no overlap (independent), it was a selected feature.
   • If the F_x areas were a subset or superset of each other, F_x was not a selected feature (rejected).
   • If F_x fell into neither of the two cases above, the process continued with step c.
c) Calculate the ProbError value.
d) Determine the threshold (as the filter for choosing selected features).
e) Find the features with ProbError less than the threshold.
f) Output the selected features.

F. MOP Flowchart
The algorithm of feature selection by the MOP method is illustrated in the flowchart in Fig 3.
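The MOP selection steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code: the names `prob_error`, `is_nested` and `mop_select`, the dictionary layout of the data, and all sample values are hypothetical. The overlap length is taken as the intersection of the two min-max ranges, and the areas are squares, following (18) and (19).

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]


def prob_error(fs: Range, fb: Range) -> float:
    """ProbError = 2*Lo / (Ls + Lb), with square areas built from each range."""
    overlap = min(fs[1], fb[1]) - max(fs[0], fb[0])  # side length of the overlap
    lo_area = overlap ** 2
    ls_area = (fs[1] - fs[0]) ** 2
    lb_area = (fb[1] - fb[0]) ** 2
    return 2 * lo_area / (ls_area + lb_area)


def is_nested(a: Range, b: Range) -> bool:
    """True if one range is a subset (or superset) of the other."""
    return (a[0] >= b[0] and a[1] <= b[1]) or (b[0] >= a[0] and b[1] <= a[1])


def mop_select(beef: Dict[str, List[float]],
               pork: Dict[str, List[float]],
               threshold: float = 0.10) -> List[str]:
    """Return the names of features that pass the MOP selection."""
    selected = []
    for name in beef:
        fs = (min(beef[name]), max(beef[name]))   # step a: min-max of beef
        fb = (min(pork[name]), max(pork[name]))   # step a: min-max of pork
        if fs[1] < fb[0] or fb[1] < fs[0]:
            selected.append(name)                 # step b: no overlap, selected
        elif is_nested(fs, fb):
            continue                              # step b: subset/superset, rejected
        elif prob_error(fs, fb) < threshold:      # steps c-e: threshold filter
            selected.append(name)
    return selected                               # step f


# Hypothetical feature values (not data from the paper).
beef = {"contrast": [0.5, 0.7, 0.9], "energy": [0.3, 0.4],
        "entropy": [0.1, 0.5, 0.9], "homogeneity": [1.0, 1.2]}
pork = {"contrast": [0.2, 0.4, 0.55], "energy": [0.1, 0.6],
        "entropy": [0.0, 0.45, 0.8], "homogeneity": [0.5, 0.8]}
chosen = mop_select(beef, pork)
```

In this toy data, "energy" is rejected because the beef range lies entirely inside the pork range, and "entropy" is rejected because its overlap is large, while "contrast" (small overlap) and "homogeneity" (no overlap) are kept.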

G. Testing the selected features
The selected features were tested on an artificial neural network. This test was done to determine their effect on the accuracy of the results. The type of neural network used was a multilayer backpropagation neural network. The network architecture was I-H-O, i.e. an input layer, a hidden layer and an output layer. To determine the relation between the selected features and the classification accuracy of the network, the input layer was set up with several nodes. The output layer was set up with two nodes. In the training stage, the target classes were labeled 00 for pork and 11 for beef. To support the performance of the network, the selected learning method was Levenberg-Marquardt, because this method has the best accuracy compared with other learning methods.
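The two-node target coding described above can be illustrated with a short Python sketch. The labels 00 for pork and 11 for beef come from the text; everything else is an assumption made for illustration, in particular the `decode` function, its 0.5 cutoff, and the "undecided" outcome for mixed outputs, none of which are specified in the paper.

```python
from typing import List

# Target vectors for the two output nodes, as stated in the text.
TARGETS = {"pork": [0, 0], "beef": [1, 1]}


def decode(output: List[float], cutoff: float = 0.5) -> str:
    """Map two output activations to a class label.

    Assumption: each activation is rounded at `cutoff`; mixed patterns
    (01 or 10) match neither target and are reported as undecided.
    """
    bits = [1 if value >= cutoff else 0 for value in output]
    if bits == TARGETS["beef"]:
        return "beef"
    if bits == TARGETS["pork"]:
        return "pork"
    return "undecided"


# Hypothetical network outputs.
label_a = decode([0.93, 0.88])
label_b = decode([0.10, 0.20])
label_c = decode([0.90, 0.10])
```

Using two redundant output nodes means that a disagreement between them can be detected, rather than being silently forced into one of the two classes.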

IV. EXPERIMENT AND RESULT
A. Experiment
Image data was acquired with a five-megapixel (5 MP) mobile digital camera. The data set totaled 400 images, 200 each for beef and pork. The digital images were 255 x 255 pixels in JPEG format. The data was preprocessed by converting RGB to grayscale and filtering the images with a Gabor filter. The number and names of the extracted features are listed in Table I.

3) Calculating the ProbError value by (19).

4) Selecting features. In this selection process, a criterion was used to select the best features. The best features met the following criterion:

$$f(x) = \begin{cases} 1, & \mathrm{ProbError}(x) < \text{threshold} \\ 0, & \text{otherwise} \end{cases}$$

where f(x) is the selection decision for feature x and the threshold is 10%. f(x) = 1 means the feature fulfills the criterion and is accepted, while f(x) = 0 means it does not fulfill the criterion and is rejected.

5) The final step was testing the selected features on the neural network.

B. Result
The range values of Fs and Fb and the error probability of every feature, obtained by implementing the MOP method, are shown in Table II. Table III shows the selection result, with error probability values of 6%, 7%, 8% and 9%. Some features had the same error probability value, e.g. contrast, difference variance and sum average at 7%, and autocorrelation, energy and sum entropy at 8%.
The selected features were tested for the classification task on a neural network. The architecture of the neural network was five input nodes, five hidden-layer nodes and two output nodes (5I-5H-2O). There were two test models: the first used the best feature as a single input and the second used combinations of some of the best features. On the training data, the neural network with the best single feature (maximum probability) as input achieved 95.50%. Several combinations of the best features gave a different result, reaching 100%. The results of testing with several feature combinations as network input are shown in Table IV, which reports the test results based on data classification. The combination of features 1, 2, 6, 8 had the highest accuracy (97.75%), while the combination of features 1, 2, 3, 4, 5 had the lowest accuracy (92.75%). This showed that classification with combinations of selected features gave a neural network accuracy above 92.00%.

V. DISCUSSION
Feature selection by MOP with a threshold of 0.1 selected maximum probability, contrast, difference variance, sum average, autocorrelation, energy, sum entropy, and entropy as the best features from 20 candidate features. This meant that these features had smaller overlap values between the digital images of beef and pork than the other features. Combinations of the selected features were used to train the network, and testing was then performed on new data, showing that the best features could support network performance. The lowest network accuracy was for feature combination 1, 2, 3, 4 and 5, with 92.75% accuracy, i.e. an error level of 7.25%. The best accuracy was for feature combination 1, 2, 6 and 8, with 97.75% accuracy, i.e. an error level of 2.25%. This showed that the choice of feature combination influenced classification accuracy. Based on the test results, the selected features were appropriate and could be used as unique characteristics to identify beef or pork from digital images.

VI. CONCLUSION
Overlap probability can be used to select the best features from a set of features whose values overlap one another. The MOP method can be used as one solution for selecting the best or strongest features of two objects with feature overlap.
The combination of maximum probability, contrast, energy and entropy was the best feature set based on the results of testing with artificial neural networks. This follows from the performance of the neural network, which reached an accuracy of 97.75%, or in other words an error rate of 2.25%. Future work will develop the Minimum Overlap Probability method further to determine the correlation between the selected features.
Fig 1 shows the framework for feature selection by the MOP method. The focus at this stage was building a model to calculate the overlap area of every feature between the features of digital images of beef (Fs) and pork (Fb). The first step was determining the area of each feature of Fs and Fb, followed by calculating the overlap area of the two. Then the probability value of every feature was calculated. Lastly, features were selected by their overlap probability value.

Fig. 1. Framework of feature selection by MOP

inverse difference moment normalized. The extraction result was numeric. These numbers were the data for subsequent processing. The extraction result gave the group of feature values for digital images of beef, S = (x_1, x_2, x_3 … x_n), and for digital images of pork, B = (x_1, x_2, x_3 … x_n), with x being the feature names. Each feature x has a group of values obtained from the extracted images. The names and formulas of the extraction features used for classification/identification of beef and pork in this research were taken from [28][29], as shown in Table I.

Fig 2 shows the overlap area between a feature of Fs and Fb.

Fig. 2. Overlap area between Fb and Fs

Fig 2 shows the interaction of Fs and Fb, in which the two areas overlap.

1) Size of overlap area (Lo): To determine the length and width of the Lo area, Fig 2 shows that the length of Lo is $\Delta x$ and the width is $\Delta y$. Each could be calculated by the following equations:

$$\Delta x = |x_2 - x_3| \tag{15}$$
$$\Delta y = |y_2 - y_3| \tag{16}$$

In (15), point $x_2$ was the maximum value of Fb and point $x_3$ was the minimum value of Fs. In (16), point $y_2$ was the maximum value of Fb and point $y_3$ was the minimum value of Fs. So, based on (15) and (16), the size of the overlap area (Lo) was

$$Lo = \Delta x \cdot \Delta y \tag{17}$$

TABLE I
The threshold value was 10%. The process in this experiment was as follows:

Based on Table II, feature selection was performed using the determined filter value (threshold < 10%). The result of the selection was the names of the selected features shown in Table III.

TABLE IV. ACCURACY OF FEATURE COMBINATIONS ON NN