Automatic Classification and Segmentation of Brain Tumor in CT Images Using Optimal Dominant Gray Level Run Length Texture Features

Tumor classification and segmentation from brain computed tomography (CT) image data is an important but time consuming task performed manually by medical experts. Automating this process is challenging because the appearance of tumor tissue varies widely among patients and, in many cases, tumor tissue resembles normal tissue. This paper presents an efficient segmentation algorithm for extracting brain tumors from CT images using a Support Vector Machine (SVM) classifier. The objective of this work is to compare the dominant gray level run length feature extraction method with the wavelet based texture feature extraction method and the SGLDM method. A dominant gray level run length texture feature set is derived from the region of interest (ROI) selected in the image. The optimal texture features are selected using a Genetic Algorithm. The selected optimal run length texture features are fed to the SVM classifier to classify and segment the tumor from brain CT images. The method is applied to real data consisting of 120 CT images, including normal and abnormal (tumor) images. The results are compared with radiologist labeled ground truth. Quantitative analysis between the ground truth and the segmented tumor is presented in terms of classification accuracy. From this analysis, it is inferred that brain tumor classification and segmentation is performed better by the SVM with the dominant run length feature extraction method than by the SVM with the wavelet based texture feature extraction method or the SVM with the SGLDM method. In this work, we have attempted to improve computing efficiency by selecting the most suitable feature extraction method for classifying and segmenting brain tumors in CT images efficiently and accurately. An average accuracy rate of above 97% was obtained using this classification and segmentation algorithm.


I. INTRODUCTION
In recent years, medical CT images have been widely applied in clinical diagnosis, where they assist physicians in detecting and locating pathological changes with greater accuracy. In computed tomography images, different tissues can be distinguished by their different gray levels. If processed appropriately, the images offer a wealth of information that helps doctors in medical diagnosis. Considerable research effort has been directed towards medical image analysis with the aim of assisting diagnosis and clinical studies [1]. Pathologies can be clearly identified using an automated CAD system [2], which also helps the radiologist analyze digital images to bring out the possible outcomes of the diseases. Medical images are obtained from different imaging systems such as MRI, CT and ultrasound B scans. Computed tomography has been found to be the most reliable method for early detection of tumors, and it is the modality most used in radiotherapy planning for two main reasons. The first reason is that scanner images contain anatomical information, which makes it possible to plan the direction and entry points of the radiotherapy rays so that they target only the tumor region and avoid other organs. The second reason is that CT images are obtained using rays, which is the same principle as radiotherapy. This is very important because the intensity of the radiotherapy rays is computed from the scanned image.
Advantages of using CT include good detection of calcification, hemorrhage and bony detail, plus lower cost, short imaging times and widespread availability. CT is also preferred in situations that include patients who are too large for the MRI scanner, claustrophobic patients, patients with metallic or electrical implants, and patients unable to remain motionless for the duration of the examination due to age, pain or medical condition. For these reasons, this study aims to explore methods for classifying and segmenting brain CT images. Image segmentation is the process of partitioning a digital image into sets of pixels. Accurate, fast and reproducible image segmentation techniques are required in various applications, and the results of the segmentation are significant for classification and analysis purposes. CT scanning of the head is limited by partial volume effects, which affect the edges, produce low brain tissue contrast and place different objects within the same range of intensity. All these limitations make the segmentation more difficult. Consequently, many different approaches have been proposed for the automatic segmentation of CT brain images.
The segmentation techniques proposed by Nathali Richarda et al and Zhang et al [3][4] include statistical pattern recognition techniques. Kaiping et al [5] introduced the effective Particle Swarm Optimization algorithm to segment the brain images into cerebrospinal fluid (CSF) and suspicious abnormal regions, but without annotation of the abnormal regions. Dubravko et al and Matesin et al [6][7] proposed a rule based approach to label abnormal regions such as calcification, hemorrhage and stroke lesion. Ruthmann et al [8] proposed to segment cerebrospinal fluid from computed tomography images using a local thresholding technique based on the maximum entropy principle. Luncaric et al [9] proposed to segment CT images into background, skull, brain, ICH and calcifications by using a combination of k means clustering and neural networks. Tong et al [10] proposed to segment CT images into CSF and brain matter and to detect abnormal regions using two stage unsupervised clustering. Clark et al [11] proposed to segment the brain tumor automatically using knowledge based techniques. Lauric et al [12] proposed to segment CT brain data into CSF and brain matter using a Bayesian classifier. Genesan et al [13] proposed to segment the tumor from CT brain images using a genetic algorithm. Hu et al [14] proposed to segment the brain matter from 3D CT images by applying Fuzzy C means and thresholding to 2D slices to create 2D masks and then propagating the 2D masks between neighboring slices.
The above literature survey shows that intensity based statistical features are the most straightforward and have been widely used, but because of the complexity of the pathology in the human brain and the high quality required for clinical diagnosis, intensity features alone cannot achieve acceptable results. In such applications, segmentation based on textural features gives more reliable results. Texture based analysis methods such as the dominant gray level run length matrix method, the SGLDM method and wavelet based texture features have therefore been used for tumor segmentation and achieve promising results.
Based on the above literature, better classification accuracy can be achieved using dominant run length statistical texture features. In this paper, the authors propose a dominant gray level run length statistical texture feature extraction method [15]. The extracted texture features are optimized by a Genetic Algorithm (GA) [16] to improve the classification accuracy and reduce the overall complexity. The optimal texture features are fed to the SVM [17] classifier to classify and segment the tumor region in brain CT images.

II. MATERIALS AND METHODS
Most classification techniques offer gray level (i.e., pixel based) statistical features. The proposed system is divided into five phases: (i) image preprocessing, (ii) segmentation of the region of interest (ROI), (iii) feature extraction, (iv) feature selection, and (v) classification and evaluation. For feature extraction, we consider three methods: (i) the dominant gray level run length feature extraction method, (ii) the wavelet based feature extraction method, and (iii) the SGLDM method. Once all the features are extracted, a Genetic Algorithm (GA) is used to select the optimal run length statistical texture features. The selected optimal run length texture features are then used by the SVM classifier to classify and segment the tumor region from brain CT images.

A. Image preprocessing
Brain CT images are noisy, inconsistent and incomplete, so a preprocessing phase is needed to improve the image quality and make the segmentation results more accurate. A cropping operation is performed to remove the background. Contrast Limited Adaptive Histogram Equalization (CLAHE) is used to enhance the contrast within the soft tissues of the brain images, and a hybrid median filtering technique is used to further improve the image quality.
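The hybrid median filtering step can be illustrated with a minimal sketch. The window size is not stated in the text, so a 3x3 window is assumed here; the function name `hybrid_median_3x3` and the choice to copy border pixels unchanged are likewise illustrative assumptions rather than the authors' implementation.

```python
from statistics import median

def hybrid_median_3x3(img):
    """Apply a 3x3 hybrid median filter to a 2D list of gray levels.

    Assumption: border pixels are copied unchanged (the paper does not
    specify the border handling or the window size).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            center = img[r][c]
            # Median of the "+" shaped neighborhood (horizontal and vertical).
            plus = [center, img[r - 1][c], img[r + 1][c],
                    img[r][c - 1], img[r][c + 1]]
            # Median of the "x" shaped neighborhood (the two diagonals).
            cross = [center, img[r - 1][c - 1], img[r - 1][c + 1],
                     img[r + 1][c - 1], img[r + 1][c + 1]]
            # The output is the median of the two medians and the center pixel.
            out[r][c] = median([median(plus), median(cross), center])
    return out
```

Compared with a plain median filter, this variant tends to preserve thin lines and corners better, which is one plausible reason for preferring it before texture analysis.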

B. Selection of ROI
First, the pixel with the highest intensity value in the image is selected and compared with its neighboring pixels. The comparison continues until the intensity level changes. All the pixels having similar intensity form the region of interest.
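The ROI selection described above resembles seeded region growing, and can be sketched as follows. The tolerance `tol` that decides when two intensities count as "similar" and the use of 4-connected neighbors are assumptions made for illustration; the text does not specify either.

```python
from collections import deque

def grow_roi(img, tol=10):
    """Grow a region from the brightest pixel of a 2D list of gray levels.

    Returns the set of (row, col) positions whose intensity is within
    `tol` of the seed intensity and that are 4-connected to the seed.
    `tol` is a hypothetical parameter, not a value from the paper.
    """
    h, w = len(img), len(img[0])
    # Seed: the pixel with the highest intensity value.
    seed = max(((r, c) for r in range(h) for c in range(w)),
               key=lambda p: img[p[0]][p[1]])
    seed_val = img[seed[0]][seed[1]]
    roi = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < h and 0 <= nc < w
            # Stop growing where the intensity level changes by more than tol.
            if inside and (nr, nc) not in roi and abs(img[nr][nc] - seed_val) <= tol:
                roi.add((nr, nc))
                queue.append((nr, nc))
    return roi
```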

C. Feature extraction
As the tissues present in the brain are difficult to classify using shape or intensity information, texture feature extraction is found to be very important for further classification. The purpose of feature extraction is to reduce the original data set by measuring certain features that distinguish one region of interest from another. The spatial features of each ROI are extracted by the dominant gray level run length method, which uses a multilevel dominant eigenvector estimation algorithm and the Bhattacharyya distance measure for texture extraction.

D. Dominant Gray Level Run Length Matrix method
The DGLRLM is based on computing the number of gray level runs of various lengths. A gray level run is a set of consecutive, collinear pixel points having the same gray level value. The length of the run is the number of pixel points in the run. The gray level run length matrix for a direction θ has elements p(i, j | θ), where the gray level i ranges up to Ng, the maximum gray level, and the run length j ranges up to Rmax, the maximum run length. The element p(i, j | θ) specifies the estimated number of runs of length j for gray level i that a given image contains in the direction of angle θ. Four gray level run length matrices corresponding to θ = 0°, 45°, 90° and 135° are computed for each region of interest. Four textural features, namely short run low gray level emphasis, short run high gray level emphasis, long run low gray level emphasis and long run high gray level emphasis, are extracted from each gray level run length matrix, and each feature is averaged over the four matrices.
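As an illustration of the run length computation, the sketch below builds a gray level run length matrix for each of the four directions and evaluates the four emphasis features. The feature formulas follow the commonly used joint run length and gray level definitions, which is an assumption here; gray levels are indexed from 1 inside the formulas to avoid division by zero, and all function names are hypothetical. The dominant eigenvector estimation step is not covered.

```python
# Pixel steps (row, col) for the four directions, in degrees.
DIRECTIONS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def run_length_matrix(img, levels, direction):
    """p[i][j-1] = number of runs of gray level i with length j."""
    h, w = len(img), len(img[0])
    dr, dc = direction
    p = [[0] * max(h, w) for _ in range(levels)]
    for r in range(h):
        for c in range(w):
            pr, pc = r - dr, c - dc
            # Skip pixels that continue a run started at an earlier pixel.
            if 0 <= pr < h and 0 <= pc < w and img[pr][pc] == img[r][c]:
                continue
            length, rr, cc = 0, r, c
            while 0 <= rr < h and 0 <= cc < w and img[rr][cc] == img[r][c]:
                length += 1
                rr, cc = rr + dr, cc + dc
            p[img[r][c]][length - 1] += 1
    return p

def run_length_features(p):
    """Short/long run, low/high gray level emphasis of one matrix."""
    n_r = sum(sum(row) for row in p)  # total number of runs
    f = {"SRLGE": 0.0, "SRHGE": 0.0, "LRLGE": 0.0, "LRHGE": 0.0}
    for i, row in enumerate(p, start=1):       # gray level index from 1
        for j, count in enumerate(row, start=1):  # run length
            f["SRLGE"] += count / (i * i * j * j)
            f["SRHGE"] += count * i * i / (j * j)
            f["LRLGE"] += count * j * j / (i * i)
            f["LRHGE"] += count * i * i * j * j
    return {name: value / n_r for name, value in f.items()}

def averaged_features(img, levels):
    """Average each feature over the 0, 45, 90 and 135 degree matrices."""
    totals = {}
    for step in DIRECTIONS.values():
        feats = run_length_features(run_length_matrix(img, levels, step))
        for name, value in feats.items():
            totals[name] = totals.get(name, 0.0) + value
    return {name: value / len(DIRECTIONS) for name, value in totals.items()}
```

For the 2 x 3 image `[[0, 0, 1], [1, 1, 1]]` in the 0° direction, there is one run of gray level 0 with length 2 and two runs of gray level 1 with lengths 1 and 3.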

Long Run High Gray Level Emphasis (LRHGE)
$$\mathrm{LRHGE} = \frac{1}{n_r} \sum_{i=1}^{N_g} \sum_{j=1}^{R_{max}} p(i, j)\, i^2 j^2 \qquad (5)$$
where p is the run length matrix, p(i, j) is the element of the run length matrix at position (i, j), and n_r is the number of runs in the image. The dominant gray level run length texture features are extracted for all four directions.

b) Spatial Gray Level Dependence Matrix method (SGLDM)
For each ROI, the Spatial Gray Level Dependence Matrix method (SGLDM) can be used to extract second order statistical texture features for better diagnosis. This method is based on the estimation of the second order joint conditional probability density functions P(i, j | d, θ), where θ = 0°, 45°, 90°, 135°. Each P(i, j | d, θ) is the probability that two pixels separated by an inter sample distance d in direction θ have gray levels i and j. The estimated values of these probability density functions are indexed by gray levels i and j up to Ng, the maximum gray level. In this method, four gray level co-occurrence matrices for four different directions (θ = 0°, 45°, 90°, 135°) are obtained for a given distance d (= 1, 2), the 13 Haralick statistical texture features [20] are calculated for each gray level co-occurrence matrix, and each feature is averaged over the four matrices. The set of second order statistical texture features extracted from each ROI forms the feature vector or feature set.
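The co-occurrence estimate P(i, j | d, θ) can be sketched as below. Counting each pixel pair in both orders (a symmetric matrix) and normalizing by the total count are assumptions, since the estimation formula is not reproduced in the text; only three of the 13 Haralick features are shown, using their standard definitions, and the function names are illustrative.

```python
import math

def cooccurrence_matrix(img, levels, d=1, direction=(0, 1)):
    """Normalized co-occurrence matrix for distance d and a unit step.

    direction is a (row, col) step, e.g. (0, 1) for 0 degrees and
    (-1, 1) for 45 degrees. Pairs are counted symmetrically (assumption).
    """
    h, w = len(img), len(img[0])
    dr, dc = direction[0] * d, direction[1] * d
    counts = [[0] * levels for _ in range(levels)]
    for r in range(h):
        for c in range(w):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                counts[img[r][c]][img[rr][cc]] += 1
                counts[img[rr][cc]][img[r][c]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * levels for _ in range(levels)]
    return [[v / total for v in row] for row in counts]

def haralick_subset(P):
    """Energy, contrast and entropy of a normalized co-occurrence matrix."""
    energy = sum(v * v for row in P for v in row)
    contrast = sum((i - j) ** 2 * v
                   for i, row in enumerate(P) for j, v in enumerate(row))
    entropy = -sum(v * math.log(v) for row in P for v in row if v > 0)
    return {"energy": energy, "contrast": contrast, "entropy": entropy}
```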

c) Feature selection
Feature selection is the process of choosing a subset of features relevant to a particular application. It improves classification by searching for the best feature subset from the fixed set of original features according to a given feature evaluation criterion (i.e., classification accuracy). Optimized feature selection reduces data dimensionality and computational time and increases the classification accuracy. The feature selection problem involves selecting a subset of the features from the total set of features, based on a given optimization criterion. T denotes the subset of selected features and V denotes the set of remaining features, so that the full feature set is S = T ∪ V at any time. J(T) denotes a function evaluating the performance of T; J depends on the particular application. Here J(T) denotes the performance of classifying and segmenting soft tissues from brain CT images using the set of features in T. In this work, a genetic algorithm is used.

GENETIC ALGORITHM:
We consider the standard GA, which begins by randomly creating its initial population. Solutions are combined via a crossover operator to produce offspring, thus expanding the current population of solutions. The individuals in the population are then evaluated via a fitness function, and the less fit individuals are eliminated to return the population to its original size. The process of crossover, evaluation and selection is repeated for a predetermined number of generations or until a satisfactory solution has been found. A mutation operator is generally applied to each generation in order to increase variation. In the feature selection formulation of the genetic algorithm, individuals are composed of chromosomes: a 1 in a bit position indicates that the corresponding feature should be selected, and a 0 indicates that it should not be selected. As an example, the chromosome 00101000 means that the 3rd and 5th features are selected and the 1st, 2nd, 4th, 6th, 7th and 8th features are not selected. That is, the chromosome represents T = {3, 5} and V = {1, 2, 4, 6, 7, 8}. The fitness function for a given chromosome is defined as fitness(T) = J(T) − penalty(T), where T is the corresponding feature subset and penalty(T) = w * (|T| − d) with a penalty coefficient w. The size value d is taken as a constraint, and a penalty is imposed on chromosomes breaking this constraint. The chromosome selection for the next generation is done on the basis of fitness. The fitness value decides whether a chromosome is good or bad in a population. The selection mechanism should ensure that fitter chromosomes have a higher probability of survival, so the design adopts the rank-based roulette-wheel selection scheme. If the mutated chromosome is superior to both parents, it replaces the similar parent. If it is in between the two parents, it replaces the inferior parent; otherwise, the most inferior chromosome in the population is replaced. The selected optimal feature set based on the test data set is used to train the SVM classifier to classify and segment the tumor region from brain CT images.
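The chromosome encoding and penalized fitness can be illustrated with a short sketch. It assumes the fitness is J(T) minus penalty(T) with penalty(T) = w * (|T| − d); the evaluation function `evaluate` stands in for J and is a hypothetical callable (in the paper's setting it would return the classification performance of the SVM on the selected features). The default values of `w` and `d` are placeholders, and the rank weighting in `rank_roulette_select` is one simple way to realize rank-based roulette-wheel selection.

```python
import random

def decode(chromosome):
    """Split 1-based feature indices into selected (T) and remaining (V)."""
    selected = [k for k, bit in enumerate(chromosome, start=1) if bit == "1"]
    remaining = [k for k, bit in enumerate(chromosome, start=1) if bit == "0"]
    return selected, remaining

def fitness(chromosome, evaluate, w=0.1, d=2):
    """fitness(T) = J(T) - w * (|T| - d); `evaluate` plays the role of J."""
    selected, _ = decode(chromosome)
    return evaluate(selected) - w * (len(selected) - d)

def rank_roulette_select(population, fitnesses, rng=random):
    """Pick one chromosome with probability proportional to its fitness rank.

    The least fit chromosome gets rank 1 and the fittest gets rank
    len(population), so fitter chromosomes are more likely to survive.
    """
    order = sorted(range(len(population)), key=lambda k: fitnesses[k])
    weights = [0] * len(population)
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank
    return rng.choices(population, weights=weights, k=1)[0]
```

With this encoding, `decode("00101000")` yields T = [3, 5] and V = [1, 2, 4, 6, 7, 8], matching the example in the text. Note that with the penalty exactly as written, subsets smaller than d receive a negative penalty, i.e. a small bonus.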

a) Classifier
Classification is the process in which a given test sample is assigned a class on the basis of knowledge gained by the classifier during training. To make the classification results comparable and for exhaustive data analysis, we have used the leave one out classification method for the SVM classifier.
Figure 1 shows the class separation of the SVM classifier. The Support Vector Machine (SVM) performs robust non-linear classification with the kernel trick. SVM is independent of the dimensionality of the feature space, and the results obtained are very accurate. It outperforms other classifiers even with small numbers of available training samples. SVM is a supervised learning method and is used for one class and n class classification problems. It combines linear algorithms with linear or non-linear kernel functions, which makes it a powerful tool in the machine learning community with applications such as data mining and medical imaging. To apply SVM to nonlinear data distributions, the data can be implicitly transformed to a high dimensional feature space where a separation might become possible. Consider a binary classification problem with a separable data set of N samples X = {Xi}, i = 1, 2, …, N, labeled Yi = ±1. It may be difficult to separate these two classes in the input space directly, so the samples are mapped into a higher dimensional feature space by X' = f(x).
The decision function can be expressed as f(x) = W·x + ρ, where W·x + ρ = 0 is a set of hyperplanes separating the two classes in the new feature space. Therefore, for all the correctly classified data, Yi f(x) = Yi (W·x + ρ) > 0, i = 1, 2, …, N (9). By scaling W and ρ properly, we can have f(x) = W·x + ρ = 1 for the data labeled +1 closest to the optimal hyperplane and f(x) = W·x + ρ = −1 for the data labeled −1 closest to the optimal hyperplane. In order to maximize the margin, a constrained optimization problem needs to be solved.
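Assuming the standard hard-margin formulation, which is consistent with the scaling f(x) = ±1 described above, the margin between the two classes is 2/‖W‖, and the margin-maximization problem can be written as follows (the index notation X_i, Y_i follows the text; the form itself is the textbook one rather than a transcription of the paper's equation):

```latex
\min_{W,\,\rho} \; \frac{1}{2}\,\lVert W \rVert^{2}
\quad \text{subject to} \quad
Y_i \,\bigl( W \cdot X_i + \rho \bigr) \;\ge\; 1,
\qquad i = 1, 2, \ldots, N
```

Minimizing ‖W‖² is equivalent to maximizing the margin 2/‖W‖, and the constraints state that every training sample lies on the correct side of its class boundary f(x) = ±1.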
Maximizing the margin is a quadratic programming problem, which can be solved by sequential minimal optimization. After optimization, the optimal separating hyperplane is expressed in terms of a kernel function K(·), a bias ρ, and the coefficients α that are the solutions of the quadratic programming problem of finding the maximum margin. The samples for which α is non-zero are called support vectors; they lie either on or near the separating hyperplane. On the decision boundary, i.e., the separating hyperplane, the decision values f(x) approach zero. Compared with the support vectors, positive samples have larger positive decision values and negative samples have larger negative decision values. Therefore, the magnitude of the decision value can also be regarded as the confidence of the classifier: the larger the magnitude of f(x), the more confident the classification. A Gaussian kernel function K(x, y) = exp(−γ ||x − y||²) (12) is chosen, with the value of γ set to 1; it performs well for the following two reasons. The first reason is that the Gaussian model has only one parameter, so it is easier to construct the Gaussian SVM classifier than a polynomial model, which has multiple parameters. The second reason is that the Gaussian kernel function is subject to fewer limitations because of its nonlinear mapping into a higher dimensional space.
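The Gaussian kernel and the use of the decision value as a confidence measure can be sketched as follows. The decision function is assumed to take the usual kernel SVM form f(x) = Σ αi Yi K(Xi, x) + ρ; the support vectors, labels and coefficients passed in are placeholders that would come from a trained classifier, and the function names are illustrative.

```python
import math

def gaussian_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2), with gamma = 1 as in the text."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def decision_value(x, support_vectors, labels, alphas, rho, gamma=1.0):
    """Assumed form: f(x) = sum_i alpha_i * Y_i * K(X_i, x) + rho."""
    return sum(a * y * gaussian_kernel(sv, x, gamma)
               for sv, y, a in zip(support_vectors, labels, alphas)) + rho

def classify(x, support_vectors, labels, alphas, rho, gamma=1.0):
    """Return (predicted label, confidence), confidence being |f(x)|."""
    f = decision_value(x, support_vectors, labels, alphas, rho, gamma)
    return (1 if f >= 0 else -1), abs(f)
```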

III. RESULTS AND DISCUSSIONS
This section describes the comparative study of the classification performance of the SVM classifier for the different texture analysis methods used for classification and segmentation of tumors from brain CT images. The texture features are extracted from each ROI, which is selected using the pixel based intensity method. The texture features are extracted from the same set of images, which are 16 bit gray level images. The SVM is used as the classifier. The results from the SVM classifier for all the texture analysis methods are evaluated using statistical analysis.
An experiment has been conducted on real CT scan brain images based on the proposed system. Accuracy is the proportion of correctly diagnosed cases out of the total number of cases. Sensitivity measures the ability of the method to identify abnormal cases. Specificity measures the ability of the method to identify normal cases. To make the classification results comparable and for exhaustive data analysis, the leave one out cross validation method can be used to estimate the classifier performance in an unbiased manner. At each step, one data set is left out, the classifier is trained using the rest, and the classifier is then applied to the left out data set. This procedure is repeated so that each data set is left out once. In our application, to evaluate classification accuracy, 10 fold cross validation is performed on the data set of 120 images (60 normal and 60 abnormal).
The images are divided into 10 sets, each consisting of 6 normal images and 6 abnormal images. In the first iteration, 9 sets are used for training and the remaining set is used for testing. In each subsequent iteration (2-10), a different set is held out for testing and the other 9 sets are used for training, so that 10 iterations are performed in total. The classification accuracy was calculated by taking the average over all correct classifications. All the textural features were normalized by the sample mean and standard deviation of the image. The sensitivity and specificity values are 96.72% and 98.3% for the new run length method, 96.6% and 95% for the wavelet co-occurrence method, and 95% and 91.6% for the co-occurrence method, respectively.
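The 10 fold partition and the feature normalization can be sketched as below. Assigning images to folds in round-robin order is an assumption (the text does not say how images are assigned to the 10 sets), and the function names are illustrative.

```python
from statistics import mean, stdev

def stratified_folds(normal_ids, abnormal_ids, k=10):
    """Deal each class round-robin into k folds so the classes stay balanced."""
    folds = [[] for _ in range(k)]
    for group in (normal_ids, abnormal_ids):
        for position, item in enumerate(group):
            folds[position % k].append(item)
    return folds

def train_test_splits(folds):
    """Yield (train, test) pairs; each fold is the test set exactly once."""
    for t, test in enumerate(folds):
        train = [x for idx, fold in enumerate(folds) if idx != t for x in fold]
        yield train, test

def zscore(values):
    """Normalize by the sample mean and sample standard deviation."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - m) / s for v in values]
```

With 60 normal and 60 abnormal images and k = 10, every fold holds 6 normal and 6 abnormal images, so each iteration trains on 108 images and tests on 12.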
The accuracy of the new run length method is 98.3%, that of the wavelet co-occurrence method is 95.8%, and that of the co-occurrence method is 93.3%. We now compare the new run length method with the co-occurrence method and the wavelet co-occurrence method on the brain CT images. For the co-occurrence method, 13 co-occurrence features are computed for each of the four directions (0°, 45°, 90° and 135°). For the wavelet co-occurrence method, the texture features used for each high frequency sub band of the wavelet decomposition are entropy, energy, contrast, sum average, variance, correlation, max probability, inverse difference moment and cluster tendency. Table 5 shows the accuracy of different segmentation methods. The same genetic feature selection method is used for all three texture analysis methods. In the dominant gray level run length feature extraction method, the selected features were long run high gray level emphasis and long run low gray level emphasis. The long run high gray level emphasis captures the inhomogeneous nature of the texture, and the long run low gray level emphasis captures its homogeneous nature. About 93% classification accuracy is achieved by most of the feature vectors. Figure 6 (a,b,c,d) shows the input CT image and the corresponding segmented tumor image. From the classification results of the SVM classifier for all three texture analysis methods, the new run length method performs comparably well with the wavelet co-occurrence features and better than the co-occurrence features. This demonstrates that there is more texture information contained in run length matrices and that a good method of extracting such information is important to the classification and segmentation algorithm.
In this work, a new dominant run length feature extraction methodology for the classification and segmentation of tumors in brain CT images using a support vector machine with genetic algorithm feature selection is proposed. The algorithm has been designed on the basis that different types of brain soft tissue (CSF, WM, GM, abnormal tumor region) have different textural features. The selection method of ROI is simple and accurate. The results show that, with the SVM classifier, the new run length feature extraction method yields better tumor segmentation and classification than the other texture analysis methods. It is found that this method gives favorable results, with an accuracy of above 98% for the images considered. This would be highly useful as a diagnostic tool for radiologists in the automated segmentation of tumors in CT images.
The goal of this work is to compare the classification performance of the SVM classifier based on different texture analysis methods. It is concluded that the SVM classifier supported by conventional texture analysis methods can be effectively used for classification and segmentation of brain tumors from CT images. Use of large databases is expected to improve the robustness of the system and ensure the repeatability of the resulting performance. The automation procedure proposed in this work using an SVM enables proper detection and segmentation of the abnormal tumor region, thereby saving time and reducing the complexity involved.
In this research work, the four segmentation methods, namely Bayesian classification, fuzzy C means, K means and the expectation maximization algorithm, produced good results. The dominant feature extraction method with the SVM classifier is proposed for the classification and segmentation of brain CT images, and its results are compared with Bayesian classification, fuzzy C means, K means clustering and the expectation maximization algorithm through statistical analysis. The proposed method has better performance compared to the other existing methods. Plans for future work include the specific annotation of abnormal regions such as hemorrhage, calcification and lesion.

Figure 1. SVM classifier

Figure 2(a-c) shows the original input CT image, the histogram equalized image and the hybrid median filtered image, which is used to reduce the effects of different illumination conditions and noise.

Figure 3(a-d). Selection of ROIs

Figure 3(a-d) represents the selected ROIs of the image to be segmented using the pixel based intensity method. For the comparative analysis of texture analysis methods using the SVM classifier, the 120 images were partitioned arbitrarily into a training set and a testing set with equal numbers of images. The accuracy of the classifier for the texture analysis methods is evaluated based on the error rate. This error rate can be described by the terms true and false positive and true and false negative as follows:
True Positive (TP): Abnormal cases correctly classified
True Negative (TN): Normal cases correctly classified
False Positive (FP): Normal cases classified abnormal
False Negative (FN): Abnormal cases classified normal
The above terms are used to describe the clinical efficiency of the classification and segmentation algorithm.
Sensitivity = TP / (TP + FN) * 100
Specificity = TN / (TN + FP) * 100
Accuracy = (TP + TN) / (TP + TN + FP + FN) * 100
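The three performance measures can be computed directly from the four counts. The sketch below uses the standard definition of specificity, TN / (TN + FP); the function name and the counts in the usage note are purely illustrative and are not results from this study.

```python
def clinical_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy as percentages."""
    return {
        # Share of abnormal cases that were detected.
        "sensitivity": 100 * tp / (tp + fn),
        # Share of normal cases that were correctly left unflagged.
        "specificity": 100 * tn / (tn + fp),
        # Share of all cases that were classified correctly.
        "accuracy": 100 * (tp + tn) / (tp + tn + fp + fn),
    }
```

For example, with hypothetical counts TP = 9, TN = 8, FP = 2 and FN = 1, the function returns a sensitivity of 90.0, a specificity of 80.0 and an accuracy of 85.0.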

Figure 4. Number of detections based on different texture features

Figure 4 shows the number of abnormal tumor detections based on the three texture analysis methods. The results show that classification accuracy under 10 fold cross validation improves as the number of representative samples increases. The sensitivity and specificity values of the texture analysis methods are shown in Figure 5.

Figure 6(a,b,c,d). CT image and the segmented tumor image


TABLE 1. FEATURES EXTRACTED USING DGLRLM METHOD

a) Wavelet based feature extraction method
A two level discrete wavelet decomposition [18] of the region of interest (ROI) is performed, which results in four sub bands: LL, LH, HH and HL. The sub band LL represents the low frequency part of the image and the sub bands LH, HH and HL represent the high frequency part of the image. The wavelet decomposition transform is applied to these high frequency sub bands only. A Daubechies wavelet filter of order two is used.
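One level of a separable 2D wavelet decomposition with the four-tap Daubechies (order two) filter can be sketched as follows. Periodic boundary extension, even image dimensions and the LH/HL naming convention used here are assumptions; a second level would apply the same function again to the chosen sub band. The function names are illustrative.

```python
import math

_S3 = math.sqrt(3.0)
_D = 4.0 * math.sqrt(2.0)
# Daubechies order-two (four-tap) low-pass filter coefficients.
LOW = [(1 + _S3) / _D, (3 + _S3) / _D, (3 - _S3) / _D, (1 - _S3) / _D]
# Matching high-pass filter: g[k] = (-1)^k * h[3 - k].
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]

def dwt1d(signal):
    """One decomposition level of an even-length signal (periodic extension)."""
    n = len(signal)
    approx, detail = [], []
    for k in range(0, n, 2):  # filter, then keep every second output
        approx.append(sum(LOW[m] * signal[(k + m) % n] for m in range(4)))
        detail.append(sum(HIGH[m] * signal[(k + m) % n] for m in range(4)))
    return approx, detail

def _transform_columns(mat):
    """Apply dwt1d to every column; return (low, high) matrices."""
    lows, highs = [], []
    for col in zip(*mat):
        a, d = dwt1d(list(col))
        lows.append(a)
        highs.append(d)
    return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]

def dwt2d(img):
    """Return the LL, LH, HL and HH sub bands of a 2D list of numbers."""
    row_low, row_high = [], []
    for row in img:  # filter along the rows first
        a, d = dwt1d(row)
        row_low.append(a)
        row_high.append(d)
    ll, lh = _transform_columns(row_low)   # then along the columns
    hl, hh = _transform_columns(row_high)
    return ll, lh, hl, hh
```

For a constant image, all of the energy ends up in LL and the three high frequency sub bands are zero, which is a quick sanity check on the filters.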

TABLE 2. WCT FEATURES EXTRACTED USING SGLDM METHOD

TABLE 3. FEATURES EXTRACTED USING SGLDM METHOD

TABLE 4. CLASSIFICATION RESULTS OF SVM CLASSIFIER USING CO-OCCURRENCE, WCT, NEW RUN LENGTH FEATURES

TABLE 5. ACCURACY OF DIFFERENT SEGMENTATION METHODS