Local Frequency Descriptor and Hybrid Features for Classification of Brain Magnetic Resonance Images using Ensemble Classifier

Abstract: A brain tumor is an irregular growth of cells in the human brain that disrupts the brain's normal functions. Early detection of a brain tumor is essential to help the patient live longer through timely treatment. Hence, in this paper, a hybrid ensemble model is proposed to classify input brain MRI images into two classes: images with a tumor and images with no tumor. The hybrid features are extracted by analyzing the texture and statistical properties of brain MRI images. Further, the Local Frequency Descriptor (LFD) technique is employed to extract the prominent features from the brain tumor region. Finally, an ensemble classifier is developed by combining Support Vector Machine (SVM), Decision Tree (DT) and K-Nearest Neighbour (KNN) techniques to classify the brain MRI images into tumor and non-tumor images. The proposed model is tested on the Kaggle brain tumor dataset and its performance is evaluated in terms of accuracy, sensitivity, specificity, precision, recall and F-measure (F1 score, the harmonic mean of precision and recall). The results show that the proposed model is promising.


I. INTRODUCTION
A brain tumor develops due to abnormal cell growth in the brain. Two imaging modalities are extensively used to identify brain tumors: Computed Tomography (CT) and Magnetic Resonance Imaging (MRI). MRI is less harmful to human tissue than CT and also gives a detailed visualization of the internal structure of the brain. Brain tumor classification models are categorized into Machine Learning (ML) and Deep Learning (DL) techniques. ML approaches rely extensively on feature selection and extraction for classification and can achieve good accuracy even on a small dataset while consuming less computational time. DL methods, on the other hand, extract and learn the features directly from the images using a large dataset. Hence, in this work, conventional hybrid features and an ensemble classifier are designed for the effective classification of tumor and non-tumor brain MRI images. Texture, statistical and descriptor features are combined as hybrid features for effective analysis of the tumor region. SVM [1,2], KNN [1] and DT [3,15] are combined into an ensemble for accurate classification of tumor and non-tumor brain MRI images. KNN needs less computation time with limited storage space, and DT considers all possible consequences of a decision by following each branch to its end. SVM is mainly used in two-class problems: it takes the labelled information from both classes to produce a model that can be used to categorize new labelled or unlabelled information. Overall, in this research, a hybrid ensemble classifier is used to improve the precision of the findings. A comparison of SVM, KNN, DT and the proposed hybrid ensemble classifier is also presented.
The remaining sections of the paper are structured as follows. Section II reviews the literature, and Section III presents the proposed model for classifying brain MRI images as tumor or non-tumor. The classifiers are discussed in Section IV. Section V outlines the experimental analysis of the proposed model and, finally, Section VI concludes the work and outlines future directions.

II. LITERATURE SURVEY
Detecting a brain tumor is a time-consuming and complex process due to intensity inhomogeneity, tissue overlap, and the lack of a clear boundary between tumor and non-tumor brain regions. Over the years, several research works have been carried out in the field of brain tumor detection and classification.
In [1] the authors have proposed a model for classifying brain MRI images by applying the GLCM technique for texture features, with classification achieved using supervised SVM and KNN algorithms. In [2] the authors have applied a morphological function, an anisotropic diffusion filter, Discrete Wavelet Transform (DWT) and SVM for classifying brain MRI images. In [3] the authors have implemented Convolutional Neural Network (CNN), Radial Basis Function (RBF) and Decision Tree (DT) classifiers for classifying brain MRI images. In [4] the authors have implemented a novel two-step approach for edge detection: an F-test for identifying the pixel variations and a T-test based on a contrast function, which observes the edges in all four directions. The work in [5] presents an edge detection model based on a neuro-fuzzy approach. The authors in [6] have developed an edge detection model by incorporating an active contour model driven by a cellular neural network. In [7] the authors have proposed a fuzzy logic based edge detection model by incorporating triangular norms on DICOM images. In [8] the authors have proposed an image segmentation model by incorporating Particle Swarm Optimization (PSO) and outlier rejection combined with the level set method. The authors in [9] have applied the level set approach to extract microarray spot intensity features for classifying foreground and background pixels.
In [10] the authors have developed a brain tumor prediction model using statistical features, a Naïve Bayes classifier and morphological operations. Gabor wavelets and statistical features are employed in [11] for brain tumor detection and segmentation of brain MRI images. In [12] the authors have implemented a hybrid approach using DSURF features, HoG features and SVM for brain tumor classification. The Local Frequency Descriptor (LFD) is used for texture feature extraction in [13] for tumor classification using Support Vector Machine (SVM), Decision Tree (DT) and Random Forest (RF). The Local Frequency Descriptor is applied by the authors in [14] on brain MRI images for studying the various properties of the brain tumor using Gray Level Co-occurrence Matrix (GLCM), Local Binary Patterns (LBP) and Second Orientation Pyramid (SOP). In [15] the authors used Grey Level Run Length Matrix (GLRLM), Fuzzy C-Means (FCM) and SVM techniques for brain tumor detection and classification in MRI images. In [16] the authors have applied Gabor Wavelet Transform (GWT), HOG and LBP techniques for studying the tumor region. SVM, DT, KNN, Naive Bayes and Random Forest (RF) classification models are used to categorize the brain MRI images into tumor and non-tumor classes. The authors in [17] have extracted Discrete Wavelet Transform (DWT) and statistical features for the classification of brain images using a Multi-Layer Perceptron (MLP) classifier. A novel approach is suggested in [18] by implementing Gray-Level Co-occurrence Matrix (GLCM), Probabilistic Neural Network (PNN) and the K-means clustering algorithm for brain tumor detection and classification. In [19] the authors have developed a classification model based on a CNN using texture and statistical features to predict normal and abnormal tissue in the brain. A comparative analysis was then carried out on KNN, Logistic Regression, Multi-Layer Perceptron, Naive Bayes, Random Forest (RF) and SVM classifiers.
In [20] the authors have designed a brain tumor grading model using texture features, morphological features and SVM for brain tumor classification. The authors in [21] have extracted features using the Gray-Level Co-occurrence Matrix (GLCM), followed by tumor segmentation based on Discrete Wavelet Transform (DWT) and morphological operations for brain tumor classification. The model categorizes the brain tumor using a Support Vector Machine (SVM) classifier with a classification accuracy of 98.91%. The authors in [22] have developed a brain tumor detection and classification model by applying the k-means clustering algorithm to identify the cluster containing the tumor, which is then separated by applying morphological operations and region properties. The neural network based classifier categorizes the resultant tumor into different classes by extracting features such as contrast, energy, correlation, kurtosis and homogeneity along with perimeter and area.
In [23] the authors have extracted area, perimeter and eccentricity features for the classification of brain tumors using the k-medoid clustering method and morphological operations. In [24] the authors have proposed a hybrid ensemble approach based on the majority voting method, which incorporates RF, KNN and DT for classification of brain tumors by extracting Stationary Wavelet Transform (SWT), Gray Level Co-occurrence Matrix (GLCM) and Principal Component Analysis (PCA) features. The authors in [25] have developed a brain tumor detection model using a deep learning based convolutional neural network to classify brain MRI images into tumor and non-tumor classes. In [26] the authors proposed a classification model that preprocesses the images with a Gaussian filter and segments the tumor region with a region growing technique. Texture features are then extracted, a Genetic Algorithm (GA) selects the optimal texture features, and a KNN classifier decides whether the brain MRI image is normal or not.
The author in [27] has presented and discussed an extensive survey of existing brain tumor segmentation and classification methods from 2014 to 2019.
As per the literature survey, the problem of brain tumor detection has been addressed by various image processing and machine learning algorithms, but the semantic gap between tumor and non-tumor regions remains small in the existing models. Hence, to address the semantic gap, we propose a combination of statistical, textural and descriptor-based models in our research.

III. PROPOSED METHODOLOGY
The proposed brain tumor classification model is presented in this section. The main goal of this research work is to use effective feature extraction methods to reduce the misclassification of brain MRI images. First, the MRI images are preprocessed to increase the semantic gap between the tumor and non-tumor regions; a morphological operation is then performed to eliminate possible non-tumor regions. Local Frequency Descriptor (LFD), texture and statistical features are extracted as hybrid features to analyze the various properties of the tumor region. Finally, an ensemble classifier is developed using Support Vector Machine (SVM), Decision Tree (DT) and K-Nearest Neighbour (KNN) along with the majority voting concept. The schematic representation of the proposed model is shown in Fig. 1.

A. Preprocessing
The preprocessing stage increases the distance between the tumor region and the non-tumor region by performing binarization and morphological operations. The outcomes of the binarization and morphological functions are shown in Fig. 2. The binarization process differentiates the tumor region pixels from the background pixels. The morphological function is then applied to the outcome of the binarization process to analyze the tumor region. Unwanted binary regions are eliminated by applying the morphological area opening technique so that the tumor region is retained.
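The binarization and area opening steps can be illustrated with a minimal sketch. The threshold value (128), the minimum area (3 pixels), the 4-connectivity and the function names `binarize` and `area_open` are assumptions made for illustration; the paper does not state the parameters it uses.

```python
from collections import deque

def binarize(image, threshold):
    """Return a 0/1 mask: 1 where the pixel intensity exceeds the threshold."""
    return [[1 if px > threshold else 0 for px in row] for row in image]

def area_open(mask, min_area):
    """Remove 4-connected foreground components smaller than min_area pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Keep only components that are large enough.
            if len(component) >= min_area:
                for y, x in component:
                    out[y][x] = 1
    return out

# Toy 4x6 "image": one bright 2x2 blob and one isolated bright pixel.
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 210, 10, 10, 10],
    [10, 220, 205, 10, 10, 180],
    [10, 10, 10, 10, 10, 10],
]
mask = binarize(image, threshold=128)     # hypothetical threshold
cleaned = area_open(mask, min_area=3)     # hypothetical minimum area
```

In this toy example the isolated bright pixel is removed while the 2x2 blob survives, which mirrors how small non-tumor regions are discarded before feature extraction.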

B. Feature Extraction
In this proposed work texture, statistical and descriptor measurements are used to extract various features such as contrast, correlation, energy, homogeneity, mean, standard deviation, kurtosis, skewness, variance, smoothness, IDM, RMS and Local Frequency Descriptor (LFD).

1) Texture and statistical features:
The texture of an image can be described with the help of statistical measurements. Texture is an intrinsic property of the spatial domain, and texture analysis characterizes image properties that belong to second-order statistics. In this paper, the GLCM method is applied to gray-level images to study the co-occurrence of pixels in the brain tumor region, and statistical approaches are employed to analyze the characteristics of the brain tumor region.
a) First-order statistical features: The statistical analyzer is applied on brain MRI images to study the relationship among the pixels using standard deviation, mean, energy, kurtosis, entropy and skewness.
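As an illustration of these first-order measurements, the sketch below computes them from a normalized gray-level histogram. It assumes textbook definitions (moments of the intensity distribution, energy as the sum of squared probabilities, entropy in bits); the function name `first_order_features` and the `levels` parameter are hypothetical, and the exact formulas used in the paper may differ.

```python
import math

def first_order_features(pixels, levels=256):
    """First-order statistics of a flat list of integer gray levels."""
    n = len(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    prob = [h / n for h in hist]          # normalized histogram

    mean = sum(i * p for i, p in enumerate(prob))
    variance = sum((i - mean) ** 2 * p for i, p in enumerate(prob))
    std = math.sqrt(variance)
    if std > 0:
        skewness = sum((i - mean) ** 3 * p for i, p in enumerate(prob)) / std ** 3
        kurtosis = sum((i - mean) ** 4 * p for i, p in enumerate(prob)) / std ** 4
    else:
        skewness = kurtosis = 0.0         # constant region: moments undefined
    energy = sum(p * p for p in prob)
    entropy = -sum(p * math.log2(p) for p in prob if p > 0)
    return {
        "mean": mean, "variance": variance, "std": std,
        "skewness": skewness, "kurtosis": kurtosis,
        "energy": energy, "entropy": entropy,
    }

# Four pixels, half at level 0 and half at level 2.
features = first_order_features([0, 0, 2, 2], levels=4)
```

For this toy input the distribution is symmetric around gray level 1, so the skewness is zero and the entropy is exactly one bit.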
The characteristics derived from first-order statistics provide information about the gray-level distribution of the image. However, they provide no information about the relative positions of gray levels in the image. These characteristics cannot determine whether all low gray levels are grouped together or whether they are interchanged with higher gray levels. A matrix of relative frequencies can be used to describe the occurrence of a gray-level arrangement. Second-order statistics are concerned with how often two gray levels appear in a window separated by a given distance.
b) Second-order statistical features: The Gray Level Co-occurrence Matrix (GLCM) gives the frequency of pairs of pixels that are separated by a specific distance. The GLCM technique uses the gray intensity values of an image, G, and the probability density function P(i) of intensity level i to study the second-order statistical property:
$P(i) = \frac{h(i)}{N}$
where h(i) is the histogram of intensity level i and N is the total number of intensities in the given image. The mathematical formulation and description of the first-order and second-order statistical measurements are tabulated in Table I.

2) Local Frequency Descriptor (LFD): The qualitative textural features are extracted from the brain MRI images by employing local descriptors such as Local Phase Quantization (LPQ) and Local Binary Pattern (LBP). These two techniques help to quantify the phase values in local neighbourhood pixels.
LPQ is applied to analyze the phase values in low-resolution and blurred MRI images using the Fast Fourier Transform (FFT). LBP is employed to analyze the local pattern properties of the brain MRI images: it assigns a label to each pixel of an image by thresholding the neighbourhood pixels and reading the result as a binary number. The different LBP variants are represented in Fig. 3 and the outcome of LBP is represented in Fig. 4. Finally, the LFD is obtained by combining LPQ and LBP to extract prominent textural properties from the MRI images.

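The basic LBP labelling (the R=1, N=8 variant) can be sketched as follows. This is a simplified illustration that uses the square 3x3 neighbourhood in place of circular sampling; the clockwise bit ordering and the function names `lbp_code` and `lbp_histogram` are assumptions. LPQ is not covered by this sketch.

```python
def lbp_code(image, r, c):
    """8-bit LBP code of pixel (r, c) from its 3x3 neighbourhood."""
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        # A neighbour at least as bright as the center sets its bit.
        if image[r + dy][c + dx] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 8, 3],
]
code = lbp_code(patch, 1, 1)   # neighbours 9, 7 and 8 exceed the center value 6
hist = lbp_histogram(patch)
```

The histogram of codes, rather than the per-pixel codes themselves, is what would typically be passed on as a texture feature vector.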
3) Hybrid features: Generally, the semantic gap between the tumor region and the non-tumor region in gray-level brain MRI images is considerably small. Because of this, a small number of features is insufficient to distinguish the tumor region from the non-tumor region. Even a combination of common features leads to an insufficient representation of brain tumors. Hence, a highly discriminative and sufficient combination of features is required to represent the brain tumor region in brain MRI images. In the proposed model, the statistical and textural features are combined to obtain highly discriminative features of the brain tumor. These hybrid features help to distinguish the brain tumor in brain MRI images.
Table I. First-order and second-order statistical measurements
1. Mean: To study the brightness of the tumor region.
2. Variance: The values of variance help to distinguish the brain tumor pixels and non-tumor pixels.
5. Energy: Energy studies the gray level distribution in the brain MRI images.
6. Entropy: Entropy analyses the randomness of textural regions in the brain MRI images.
7. Smoothness: The smoothness removes possible noise by performing spatial smoothing on brain MRI images.
8. Contrast: The contrast analyses the intensity variation in the brain MRI images.
9. Correlation: The correlation exhibits spatial relationships among intensity levels in the brain MRI images.
10. Homogeneity: Homogeneity extracts the affinity or closeness of brain MRI pixels.
11. IDM: IDM measures the local likelihood of the image and it gives a single or range of values to represent whether the brain MRI image is textured or non-textured.
12. RMS: $\sqrt{\frac{1}{N}\sum_{i=1}^{N}|x_i|^2}$. Root Mean Square calculates the number of changes across the pixels of brain MRI images.
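The second-order measurements listed above (contrast, correlation, energy, homogeneity and IDM) can be illustrated with a minimal GLCM sketch. It assumes a symmetric, normalized co-occurrence matrix for a single pixel offset and common textbook definitions of the features; the function names, the offset `(dy, dx)` and the quantization into `levels` gray levels are assumptions, and the formulas in the paper may differ.

```python
def glcm(image, levels, dy=0, dx=1):
    """Symmetric, normalized co-occurrence matrix for pixel offset (dy, dx)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1      # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image has no pixel pair at this offset")
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Texture features of a normalized, symmetric GLCM p."""
    levels = len(p)
    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + abs(i - j)) for i, j, v in cells)
    idm = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    # The matrix is symmetric, so rows and columns share one mean and variance.
    mu = sum(i * v for i, _, v in cells)
    var = sum((i - mu) ** 2 * v for i, _, v in cells)
    if var > 0:
        correlation = sum((i - mu) * (j - mu) * v for i, j, v in cells) / var
    else:
        correlation = 1.0              # constant image: set by convention here
    return {"contrast": contrast, "correlation": correlation,
            "energy": energy, "homogeneity": homogeneity, "idm": idm}

# Two gray levels alternating horizontally: every pair is (0, 1).
texture = glcm_features(glcm([[0, 1], [0, 1]], levels=2))
```

Because every horizontal neighbour pair in the toy image switches gray level, the correlation is -1 and the contrast is 1.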
Fig. 3. LBP variants: R=1, N=8; R=2, N=16; R=3, N=24.

IV. CLASSIFICATION
Image classification is the task of extracting a collection of different attributes from an image and then mapping them to a specified class. As this research work addresses a two-class problem, the supervised classification models KNN, SVM and DT are used to assign the input brain MRI images to normal and abnormal classes. Further, these classifiers are combined into an ensemble to achieve an accurate categorization of MRI images using the majority voting concept.

A. Support Vector Machine (SVM)
The linear SVM classifier is primarily used for binary classification. Since the proposed model classifies the input MRI images into tumor and non-tumor classes, the linear SVM classifier has been incorporated. The SVM classifier analyzes the hybrid features and trains the model to minimize the structural misclassification in brain MRI images. The trained SVM model is then tested on brain MRI images that were not used for training.
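A linear SVM of this kind can be sketched with sub-gradient descent on the regularized hinge loss. This is one common way to train a linear SVM, not necessarily the solver used in the paper; the regularization strength `lam`, the learning rate `lr`, the number of epochs, the toy 2-D feature vectors and the function names are all assumptions.

```python
def train_linear_svm(xs, ys, lam=0.01, epochs=200, lr=0.1):
    """Sub-gradient descent on the hinge loss; labels must be +1 or -1."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample violates the margin: move toward it and regularize.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Margin satisfied: only the regularization term acts.
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def svm_predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy, linearly separable stand-ins for hybrid feature vectors.
train_x = [[2, 2], [3, 3], [2, 3], [-2, -2], [-3, -3], [-2, -3]]
train_y = [1, 1, 1, -1, -1, -1]          # +1: tumor, -1: non-tumor
w, b = train_linear_svm(train_x, train_y)
svm_train_out = [svm_predict(w, b, x) for x in train_x]
```

In practice the hybrid feature vector would have many more dimensions and would usually be scaled before training, since the hinge loss is sensitive to feature magnitudes.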

B. K-Nearest Neighbour (KNN)
The KNN classifier finds the nearest neighbours by measuring the distance among the hybrid features to observe the similarity and dissimilarity among the samples. In the proposed model, the KNN classifier is incorporated to estimate the discrimination between the tumor region and the non-tumor region among the K nearest neighbours. To this end, the KNN model is trained using the hybrid features and then tested on brain MRI images that were not used for training.
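A minimal KNN sketch is given below, assuming Euclidean distance and K = 3; neither choice is stated in the paper, and the toy feature vectors and the function name `knn_predict` are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label of the query by majority vote among its k nearest training samples."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy stand-ins for hybrid feature vectors (0: non-tumor, 1: tumor).
knn_x = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15],
         [0.9, 0.8], [0.8, 0.9], [0.85, 0.85]]
knn_y = [0, 0, 0, 1, 1, 1]
near_low = knn_predict(knn_x, knn_y, [0.2, 0.2])
near_high = knn_predict(knn_x, knn_y, [0.8, 0.8])
```

An odd K avoids tied votes in a two-class problem such as this one.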

C. Decision Tree (DT)
The decision tree is a stage-wise prediction algorithm that assigns the given brain MRI image to a particular class. In the proposed model, the decision tree classifier repeatedly partitions the hybrid features into smaller and more uniform subsets. These uniform subsets are used to train the DT classifier to distinguish the tumor and non-tumor regions, and the trained DT model is then tested on brain MRI images that were not used for training.
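The repeated partitioning can be sketched with a small recursive tree that picks, at each node, the threshold split with the lowest weighted Gini impurity. The Gini criterion, the depth limit, the tuple-based tree representation and all function names are assumptions for illustration; the paper does not specify its splitting rule.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature, threshold) that lowers the weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Nodes are (feature, threshold, left, right) tuples; leaves are labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]   # leaf: majority label
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def tree_predict(tree, row):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Toy feature vectors: only the first feature separates the classes.
dt_x = [[0.1, 5.0], [0.2, 4.0], [0.3, 6.0], [0.7, 5.5], [0.8, 4.5], [0.9, 6.5]]
dt_y = [0, 0, 0, 1, 1, 1]
tree = build_tree(dt_x, dt_y)
```

Here a single split on the first feature already yields pure subsets, so the tree stops after one level.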

D. Ensemble Classifier (SVM+DT+KNN)
The ensemble classifier (SVM+DT+KNN) achieves improved accuracy compared with the individual classifiers. Each constituent classifier studies the hybrid features according to its own principle, so the predictions differ from one classifier to another. Hence, the majority voting concept is applied, and the class predicted by most of the classifiers is taken as the final prediction.
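Majority voting over the three classifier outputs can be sketched as follows. The prediction lists are made-up values used only to show the mechanics, and the function names are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by most classifiers for one sample."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(per_classifier_predictions):
    """Combine per-classifier lists; element k of each list is for sample k."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]

# Hypothetical outputs of the three classifiers on four test images.
svm_out = [1, 0, 1, 1]
dt_out = [1, 1, 0, 1]
knn_out = [0, 0, 1, 1]
final = ensemble_predict([svm_out, dt_out, knn_out])
```

With three voters and two classes a tie cannot occur, which is one practical reason for combining an odd number of classifiers.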

V. EXPERIMENTAL ANALYSIS
The performance of the proposed model is evaluated by conducting experiments on the brain MRI Kaggle data set [28]. This data set contains a total of 2065 tumor and non-tumor brain MRI images. The proposed model is trained with 600 tumor and 600 non-tumor brain MRI images and tested with the remaining 485 tumor and 380 non-tumor brain MRI images.
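The evaluation measures used in this work (accuracy, sensitivity, specificity, precision, recall and F1 score) all follow from the confusion-matrix counts, as the sketch below shows. The counts in the example are arbitrary illustrative numbers, not results from the paper, and the function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary metrics from confusion-matrix counts (tumor = positive)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)      # identical to recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision,
            "recall": sensitivity, "f1": f1}

# Arbitrary illustrative counts, not taken from the experiments.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Sensitivity and recall are the same quantity, so reporting both adds no independent information.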

A. Discussion
Texture, statistical and descriptor features play an important role in the classification of brain MRI images into tumor and non-tumor classes. The hybrid features help to identify the discriminative features of the tumor region. SVM, DT and KNN classifiers are then combined as an ensemble classifier using the majority voting concept. Overall, the proposed model outperforms the individual classifiers and existing models on all measures: accuracy, sensitivity, specificity, precision, recall and F1 score. The performance analysis of the individual classifiers and the ensemble classifier is given in Table II. Fig. 5 represents the classification performance on the Kaggle dataset. Finally, the proposed model is compared with existing state-of-the-art methods, and the comparative analysis is shown in Table III.

VI. CONCLUSION
In this paper, we developed a hybrid ensemble model for extracting hybrid features and classifying brain MRI samples into tumor and non-tumor classes. Texture and statistical features are extracted to determine the presence of the tumor region in the brain MRI image. The local magnitude descriptor and local phase descriptor of brain MRI images are analyzed by employing the Local Frequency Descriptor (LFD). The LFD helps the classifier to increase the efficiency of the classification process. The conventional classifiers SVM, DT and KNN are combined as an ensemble classifier using the majority voting concept for effective discrimination of brain MRI images. In the future, a probabilistic model needs to be incorporated to analyze the distribution of tumor pixels in brain MRI images.