A Novel Approach to Mammogram Classification using Spatio-Temporal and Texture Feature Extraction using Dictionary based Sparse Representation Classifier

Abstract: Cancer is a chronic disease whose incidence is increasing rapidly worldwide. Breast cancer is one of the most serious cancers affecting women's health and is a major cause of death among women. Mammography is considered a promising technique for identifying breast cancer at an early stage. Although several schemes have been developed during the last decade to address performance-related issues, achieving the desired performance is still a challenging task. To overcome this issue, we introduce a novel and robust approach to feature extraction and classification. In the proposed approach, a pre-processing stage is applied first, in which the image is binarized using Niblack's method, followed by Region of Interest (ROI) extraction and segmentation. In the next phase, we develop a mixed feature extraction strategy that combines Gray Level Co-occurrence Matrix (GLCM) features, Histogram of Oriented Gradients (HoG) features with Principal Component Analysis (PCA) for dimension reduction, Scale-Invariant Feature Transform (SIFT) features, and non-parametric Discrete Wavelet Transform (DWT) features. Finally, we present a K-Singular Value Decomposition (K-SVD) based dictionary learning scheme and apply the Sparse Representation Classifier (SRC); performance is evaluated using MATLAB. An extensive experimental study shows that the proposed approach achieves a classification accuracy of 98.13%, precision of 97.58%, recall of 98.36%, and F-score of 97.95%. A comparison with state-of-the-art techniques shows that the proposed approach gives better performance.


I. INTRODUCTION
Cancer is a leading cause of death worldwide among men and women. The burden of cancer is increasing worldwide due to changing lifestyles. Females make up 49.5% of the world population and form the larger share of the population over 60 years of age [1]. Recently, the American Cancer Society (ACS) presented a study which concluded that breast cancer is a frequently diagnosed cancer, is considered a chronic disease, and is the second leading cause of cancer death among U.S. women [2]. Facts and figures also indicate an increase in deaths among women due to breast cancer [45]. Breast cancer is a major health issue and is generally found in women around the age of 40. The ACS estimated 231,840 new cases among women and 2,350 among men, along with 40,730 deaths, during 2015 [3]. Hence, early prediction and detection of breast cancer is highly recommended to reduce the number of deaths due to breast cancer.
Breast cancer is the most common type of cancer among women after skin cancer. The lower region of the breast tissue structure is analyzed to identify abnormalities. Several techniques are available for detecting and identifying breast cancer at an early stage, such as biopsy, thermography, ultrasound imaging and mammography. Mammography is considered the most promising way to detect and diagnose breast cancer and can reduce breast cancer mortality [4]. It is a medical imaging process in which low-energy X-ray systems are used to visualize the inner region of the breast; the acquired images are known as mammograms [5]. However, mammography screening is a challenging task, and not all radiologists can achieve uniformly high accuracy because the analysis is tedious and error-prone. Screening-based analysis suffers from various issues such as high false positive rates, variability among clinicians and over-diagnosis of insignificant lesions [6]. In general, high false positive rates cause unnecessary cost and stress for the patient. To deal with these issues, a fully automated mammogram image analysis tool needs to be developed that can efficiently analyze mammogram images, improving breast cancer detection and reducing false positive rates using a machine vision-based solution.
In the field of medical imaging, computer-aided diagnosis (CAD) and detection-based techniques have demonstrated a significant impact on the early detection of several diseases. These systems are developed for various medical applications such as lung cancer, brain tumour, CT image processing, mammography and other pathology images. This work focuses on CAD systems for mammogram image analysis because of their significant performance in breast cancer mammogram analysis. The complete process of mammogram analysis requires multiple stages such as ROI extraction, pectoral muscle segmentation, image binarization, feature extraction and classifier construction. A general process flow is depicted in Fig. 1. Conventional approaches to mammogram classification require additional information, such as bounding boxes or ground truth images, to perform the segmentation. These techniques are based on handcrafted features obtained through a manual screening process, which is a tedious and time-consuming task. Moreover, the analysis can become unreliable, which may lead to inaccurate diagnosis.
www.ijacsa.thesai.org
Hence, CAD-based automated systems have been developed for the detection and classification of breast cancer [7]. Several techniques have been presented in the field of mammogram classification, where image pre-processing, feature extraction and classifier construction are considered the fundamental stages. During image pre-processing, image enhancement and image denoising play an important role. Maitra et al. [8] presented a contrast enhancement strategy using the contrast limited adaptive histogram equalization (CLAHE) scheme, after which a modified seeded region growing (SRG) algorithm is applied for pectoral muscle segmentation. For image enhancement, spatial [9] and transform domain [10] techniques are adopted. As discussed before, image binarization also plays an important role in mammogram image analysis. Various schemes have been used, such as Otsu's thresholding method [11], adaptive thresholding [12] and Kittler's method [13]. P. S. Vikhe proposed an adaptive threshold-based contrast enhancement method for enhancing suspicious regions (masses) in mammograms [46]. Homomorphic filtering and wavelet shrinkage methods are used for denoising and enhancement of mammograms in [47]. In the next stage, ROI extraction and segmentation are the main objective. Accurate image segmentation helps to obtain the desired information without mixing in unwanted sources. To perform image segmentation, techniques such as region growing [14] and watershed segmentation [15] are used. To obtain robust classification performance, feature extraction techniques are implemented for better learning. In this field, GLCM (Gray-Level Co-occurrence Matrix) features [16][17], wavelet transform [17][18][22], LBP (Local Binary Pattern) [19][20], HoG (Histogram of Oriented Gradients) [21], SIFT (Scale-Invariant Feature Transform) [22], CNN-based features [23][24] and texture features are widely applied to improve accuracy.
With the help of these techniques, classifiers are constructed for pattern learning and classification of breast cancer. Commonly used classifiers include DNN (Deep Neural Network) [25], SVM (Support Vector Machine) [26] and neural network classifiers [27].
A significant amount of work has been presented in the field of medical imaging for breast cancer detection and classification. Most schemes focus on feature extraction, where multiple features are extracted and classified using different classifier modules. However, degraded image quality during image capture may cause poor performance. Moreover, image binarization, pectoral muscle segmentation and ROI (Region of Interest) extraction also need to be considered for better classification performance. An optimal ROI can significantly improve classification performance because it provides a robust region for feature extraction. Improved and hybrid feature extraction can also help image analysis and classification of breast cancer. In current classification studies, feature vectors are provided directly to the learning scheme, so the data range and distribution are unpredictable, which results in erroneous learning and poor classification performance. Hence, there is a need to develop a novel and robust computer vision-based mammogram image analysis tool for improved early prediction of breast cancer.
To deal with these issues, an end-to-end image analysis tool is presented in which novel methods are applied for image pre-processing, feature extraction and classification. In the pre-processing phase, Niblack image binarization is applied first, which helps to preserve the edges of the image. This edge preservation improves the segmentation process through accurate boundary identification. In the next phase of pre-processing, pectoral muscle segmentation is performed using convex hull modeling and the ROI is extracted. The next important phase is feature extraction, where multiple feature extraction schemes are applied: GLCM, improved HoG, improved SIFT and geometric non-parametric feature extraction schemes are implemented. These features are then processed through sparse representation to obtain generalized features with a known distribution, which helps to reduce the training error.
Finally, a heuristic SRC classification scheme is presented, which is self-adaptive in nature and can perform efficiently on high-dimensional complex data. An important contribution of this article is the proposed non-parametric feature extraction in combination with SIFT, GLCM, and HoG features. This combination provides detailed information about mammogram images. Moreover, this feature extraction scheme can be helpful for other bio-medical imaging applications.
The rest of the article is arranged as follows: Section II presents recent trends and techniques in the field of breast cancer detection and classification; the proposed model and its description are presented in Section III; Section IV presents an experimental and comparative study using the proposed approach; and finally, Sections V and VI give concluding remarks and the future direction of the work.

II. LITERATURE SURVEY
The previous section gave a brief discussion of breast cancer, its effects, and solutions based on computer-aided diagnosis. As discussed before, several techniques have recently been presented for breast cancer detection and classification using mammography. This section briefly discusses recent trends and techniques in mammogram classification using machine vision. Many studies have focused mainly on robust feature extraction. Singh et al. [28] studied mammogram classification and introduced a center-symmetric approach to texture feature extraction, called the wavelet-based center-symmetric local binary pattern (WCS-LBP), and concluded that these features are adequate to achieve better performance. This approach extracts the features from non-overlapping regions. In addition, relevant feature selection and reduction are applied using an SVM-based recursive feature elimination process. Finally, a decision-tree-based random forest classifier is constructed and its performance is obtained. However, this approach is applied to the raw data: the input image is divided into four frequency bands and the LL band is used for LBP feature extraction. As a result, accurate ROI extraction becomes a tedious task, so the complete image needs to be processed, which may result in false positives. Shastri et al. [29] proposed a novel feature extraction process for mammogram classification. In this work, histogram of oriented texture features are computed by combining the histogram of gradients with a Gabor filter. For improved texture analysis, a Pass Band Discrete Cosine Transform (PB-DCT) scheme is also applied, followed by a feature selection technique. This approach achieves better performance but does not consider mammogram density, which becomes crucial during ROI extraction.
In this approach, the images are divided into different patches, but not all patches play a significant role in feature extraction; hence the authors introduced a Discrimination Potentiality scheme for feature selection.
In medical imaging applications, texture feature extraction plays an important role. Texture features can be categorized into four groups: statistical texture features, local pattern histograms, directional features and transform-based texture features. Khan et al. [30] used statistical texture feature analysis and presented optimized Gabor feature extraction for mass classification in mammography, improving performance by reducing false positives. To formulate an optimized filter bank, Particle Swarm Optimization (PSO) and an incremental clustering algorithm are applied, with a Gaussian kernel SVM as the fitness function for PSO. In [31], Abdel-Nasser et al. focused on local patterns for image classification and applied them to breast tissue classification. Two classification tasks are carried out in this study: classification of breast tissue within the region of interest and classification of breast density. The complete process is implemented in three main stages: ROI segmentation, feature extraction and classification. This study shows that detecting mass and breast density is challenging and has a serious impact on breast tissue classification. Hence, the authors presented a feature extraction model called the uniform local directional pattern. However, due to insignificant feature extraction and modelling, this study fails to obtain a promising estimate of breast density and mass identification. Recently, Gedik et al. [32] presented a transform-based feature extraction method using the fast finite shearlet transform, where the transform coefficients are used to construct the feature vector. A thresholding process is then implemented to distinguish between the classes, and finally an SVM-based classification approach is implemented to measure the performance.
This technique does not address image segmentation and ROI extraction, which causes inaccurate feature extraction and results in poor pattern learning.
Machine learning based schemes are also widely adopted in this field for feature extraction and classification, and CNN (Convolutional Neural Network) based feature extraction and classification is used in many studies. Wu et al. [23] presented a 2D mammogram classification study using a CNN. In this approach, a loop interpretation method is applied to identify the behavior of the CNN model, which helps to detect breast tissue patterns. These patterns can later be analyzed by experts to find the correlation with mass tissue and calcified vessels. The developed system is not fully automated and requires expert knowledge for the analysis of breast tissues. Recently, Tusa et al. [33] presented a comparative study of mammogram classification based on CNN models. The two classifiers compared in that study differ from each other in terms of connected layers, feature extraction techniques and bits per pixel. Further, the CNN classifier uses the TensorFlow library for classifying the mammograms. Choosing the optimal number of layers and managing the computational complexity are considered challenging in these processes. Gardezi et al. [34] introduced an approach for classifying normal and abnormal mammograms using deep learning, based on a VGG-16 CNN model. Moreover, ROI extraction is performed by applying a 3x3 convolutional filter, and 10-fold cross-validation is used for the SVM, simple logistic classifier, binary tree, and KNN classifier with K set to 1, 3, and 5. Lizzi et al. [35] discussed the rules and regulations on the use of ionizing radiation and information about the radiation dose according to the European Directive 59/2013/EURATOM.
Dose control during the cancer screening process is crucial because the breast contains radio-sensitive tissues that are exposed to radiation during screening. An inaccurate radiation dose may damage these tissues. To optimize the dose for each patient, the authors classify mammograms using a CNN according to the BI-RADS standard. This process helps to personalize the dose according to the severity of the breast cancer.
This section has presented a detailed study of recent techniques for breast cancer classification using mammogram images. These techniques are based on image pre-processing, feature extraction and classification. Some of the state-of-the-art techniques use feature extraction but do not consider ROI extraction and segmentation, which results in poor feature extraction. On the other hand, CNN and machine learning based strategies are capable of achieving improved performance, but their computational complexity remains a challenging issue. Hence, there is still a need to improve mammogram image analysis for breast cancer detection and classification. Table I shows a comparative analysis of these techniques.
The main disadvantage of the existing techniques is their poor accuracy. Classification accuracy depends on the robustness of the features. Hence, we focus on the feature extraction process and introduce a new feature vector model to improve classification accuracy.

III. PROPOSED MODEL AND METHODOLOGY
The previous section described current techniques in the field of medical imaging and briefly discussed recent techniques for mammogram classification and their drawbacks. This section presents a novel and robust solution for mammogram image classification using computer vision-based approaches. As discussed before, a computer vision-based pattern learning and classification scheme requires three basic steps: image pre-processing, feature extraction and classification. The proposed work is likewise divided into three main stages, as follows:
• Image pre-processing: This is an important phase of any pattern learning and classification model. In this work, image binarization using the Niblack algorithm is applied, which helps to preserve the edges of the image. This edge-preserving behaviour helps to identify object boundaries and can improve the segmentation process. Pectoral muscle segmentation is then applied to obtain the segmented breast region, and finally the ROI is extracted.
• Feature extraction: In the next stage, conventional GLCM features are extracted for texture analysis, and HoG (Histogram of Oriented Gradients) features are extracted for improved histogram analysis and object localization. In addition, SIFT (Scale-Invariant Feature Transform) features are computed, which improves feature extraction across varied scales and angles of image acquisition. Geometry-based non-parametric features are also extracted, which makes the features more robust without requiring any pre-defined parameters. Once feature extraction is done, we apply sparse representation, which performs feature normalization and reduces feature redundancy.
• Classification: Finally, a heuristic self-adaptive SRC classifier is presented, which can provide robust performance for high-dimensional datasets.
The complete process of the proposed approach is depicted in Fig. 2.

A. Image Pre-Processing
In the proposed approach, an image binarization scheme for boundary identification is applied first. In computer vision applications, binarization is also known as thresholding, where a basic function compares the gray levels with a computed threshold. This process sets a pixel to 0 if its value is less than the threshold and to 1 otherwise. Let the thresholded output be $t(x, y)$, the input image be $f(x, y)$ and the threshold be $Thr$; then the image binarization (thresholding) relation can be expressed as:

$$t(x, y) = \begin{cases} 0, & f(x, y) < Thr \\ 1, & \text{otherwise} \end{cases}$$
In this process, the optimal threshold plays an important role, hence Niblack's technique for image binarization is considered. In this technique, a rectangular window is created and slid pixel by pixel over the input gray image. The size of the window may vary. The local mean ($m$) and standard deviation ($\sigma$) within the window are used to compute the threshold, which can be expressed as:

$$Thr = m + k\,\sigma = m + k \sqrt{\frac{1}{N_P} \sum_{i=1}^{N_P} (p_i - m)^2}$$

where $N_P$ denotes the total number of pixels in the image region covered by the window, $m$ denotes the average pixel value, $k$ denotes the noise constant and $p_i$ the current pixel value.
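As a rough illustration of the local thresholding described above, the following minimal sketch applies a Niblack-style rule to a small grayscale image stored as a list of rows. The function name `niblack_binarize`, the 3x3 window, the value `k = -0.2` and the clipping of the window at the image border are assumptions made for this example; the paper does not state its window size or constant.

```python
import math

def niblack_binarize(image, window=3, k=-0.2):
    """Binarize a grayscale image (list of rows) with a Niblack-style local threshold.

    window and k are illustrative assumptions, not values taken from the paper.
    """
    rows, cols = len(image), len(image[0])
    half = window // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Pixels of the window centred on (r, c), clipped at the image borders.
            patch = [image[i][j]
                     for i in range(max(0, r - half), min(rows, r + half + 1))
                     for j in range(max(0, c - half), min(cols, c + half + 1))]
            mean = sum(patch) / len(patch)
            std = math.sqrt(sum((p - mean) ** 2 for p in patch) / len(patch))
            threshold = mean + k * std
            # Pixels below the local threshold become 0, all others become 1.
            out[r][c] = 1 if image[r][c] >= threshold else 0
    return out

if __name__ == "__main__":
    sample = [[10, 10, 10],
              [10, 200, 10],
              [10, 10, 10]]
    for row in niblack_binarize(sample):
        print(row)
```

Because the threshold is recomputed for every window, a bright structure on a dark background is kept even when the overall image contrast is low, which is the edge-preserving behaviour the text relies on.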
Fig. 3 shows a comparative analysis of image binarization using different techniques.
After binarization, pectoral muscle identification and ROI extraction are performed, where segmentation and connected component labelling are applied.

B. Segmentation and ROI
With the help of this process, the ROI and other components are extracted as shown in Fig. 4, which presents the input image, the binarized image, the image after pectoral muscle removal, and so on. The final outcome is the ROI with the pectoral muscle removed. This ROI can be processed further for feature extraction.
For pectoral muscle extraction, the mammogram image is segmented and the ROI is extracted from the segmented region. First, a multi-level Otsu image thresholding approach is applied for segmentation and the ROI is extracted. Multi-level Otsu thresholding classifies the pixels into different classes based on their gray levels. The next phase focuses on pectoral muscle extraction. The pectoral muscle is assumed to have high-intensity pixels, whereas other regions have low-intensity pixels. The extraction process is initialized with two thresholds, and the number of thresholds is increased by one at a time to analyse the region with the higher pixel intensities. Further, the pectoral muscle region and the intermediate region adjacent to it are considered, and the transitional area between the two regions is measured. If this area contains a large number of black pixels, the region beyond it is considered not useful for the calculations, and multi-level Otsu thresholding is reinitialized with a higher number of thresholds.
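The initial two-threshold step of multi-level Otsu thresholding can be sketched as below. This is only an illustration under stated assumptions: it performs an exhaustive search for the pair of thresholds that maximizes the between-class variance of three gray-level classes, and the function name `multi_otsu_two_thresholds` is hypothetical. The iterative rule that adds thresholds and checks the transitional area near the pectoral muscle is not implemented here.

```python
def multi_otsu_two_thresholds(pixels, levels=256):
    """Return (t1, t2) splitting gray levels into [0, t1), [t1, t2), [t2, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)

    # Prefix sums of pixel counts and of intensity-weighted counts.
    cum_w = [0] * (levels + 1)
    cum_m = [0] * (levels + 1)
    for g in range(levels):
        cum_w[g + 1] = cum_w[g] + hist[g]
        cum_m[g + 1] = cum_m[g] + g * hist[g]
    global_mean = cum_m[levels] / total

    def class_term(lo, hi):
        # Between-class variance contribution of gray levels lo .. hi - 1.
        w = cum_w[hi] - cum_w[lo]
        if w == 0:
            return 0.0
        mu = (cum_m[hi] - cum_m[lo]) / w
        return (w / total) * (mu - global_mean) ** 2

    best_score, best_pair = -1.0, (0, 0)
    for t1 in range(1, levels - 1):
        for t2 in range(t1 + 1, levels):
            score = (class_term(0, t1) + class_term(t1, t2)
                     + class_term(t2, levels))
            if score > best_score:
                best_score, best_pair = score, (t1, t2)
    return best_pair

if __name__ == "__main__":
    # Synthetic pixel values with three intensity groups (dark, mid, bright).
    data = [10] * 50 + [120] * 30 + [240] * 20
    print(multi_otsu_two_thresholds(data))
```

In a mammogram the brightest of the three classes would be the candidate pectoral muscle region; the exhaustive search is quadratic in the number of gray levels, which is acceptable for 256 levels but grows quickly when more thresholds are added.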

1) GLCM feature extraction:
This sub-section presents the proposed feature extraction technique, in which GLCM features are computed first for the given input image. In this phase, the Haralick [24] scheme for feature extraction is followed, which has been widely adopted in various applications. The general mathematical formulas for GLCM feature extraction are given in Table II. The GLCM is a matrix whose distribution is based on the angular and distance relations between pixels. The tabulated features provide the degree of correlation between pixel pairs and texture information for analyzing the image. Moreover, these are statistical properties computed from the normalized GLCM. In general, pixel values vary at every position in an image, and this variation can be used to identify texture information using the GLCM. The features computed from this matrix are classified as visual texture features, statistical features, information theory features and information correlation measures.
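To make the construction of the co-occurrence matrix concrete, the sketch below builds a symmetric, normalized GLCM for one pixel offset and computes three commonly used Haralick-style statistics. The formulas for contrast, energy and homogeneity are the standard textbook definitions and are an assumption here, since Table II is not reproduced in this section; the function names are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, energy and homogeneity of a normalized GLCM p."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in cells)
    energy = sum(p[i][j] ** 2 for i, j in cells)
    homogeneity = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells)
    return {"contrast": contrast, "energy": energy, "homogeneity": homogeneity}

if __name__ == "__main__":
    # Small 4-level test image; the offset (1, 0) pairs each pixel with its right neighbour.
    img = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [0, 2, 2, 2],
           [2, 2, 3, 3]]
    print(glcm_features(glcm(img, levels=4)))
```

Changing `dx` and `dy` selects the distance and angle of the pixel relation, which is the angular and distance dependence mentioned in the text.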
2) Improved HoG feature extraction: This sub-section discusses the improved HoG feature extraction process for the given input images. To improve performance further, PCA (Principal Component Analysis) is applied to every block for feature dimension reduction. HoG features are gradient features and are widely adopted for object recognition. These features are extracted by considering the entire region of the input image. In other words, HoG features represent the rough shape of the input, as depicted in Fig. 5.
To extract the features efficiently, the region of interest first needs to be extracted using background subtraction or image binarization methods. The input image is normalized and the object, here the breast, is located in the image. To form the feature, the image gradient is computed first, which is expressed as:

$$G_x(x, y) = f(x+1, y) - f(x-1, y), \qquad G_y(x, y) = f(x, y+1) - f(x, y-1)$$

$$m(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}, \qquad \theta(x, y) = \arctan\frac{G_y(x, y)}{G_x(x, y)}$$

where $f(x, y)$ is the pixel value at position $(x, y)$, $m(x, y)$ is the gradient magnitude and $\theta(x, y)$ is the gradient orientation. To make the feature extraction model more robust, we use the unsigned orientation, given as follows:

$$\tilde{\theta}(x, y) = \begin{cases} \theta(x, y) + \pi, & \theta(x, y) < 0 \\ \theta(x, y), & \text{otherwise} \end{cases}$$

The resulting gradient image is divided into cells of $c_w \times c_h$ pixels, where $c_w$ is the width of a cell and $c_h$ is its height. In each cell, $N$ orientation bins are computed by quantizing the unsigned gradient orientations, which are weighted by their magnitudes for histogram computation. This process of orientation histogram and bin computation is presented in Fig. 6, where the cells and blocks are shown. Each cell has $N$ orientations, hence the feature dimension of a block of $b_w \times b_h$ cells is $b_w \times b_h \times N$. Let $V$ be the feature vector of a block and let $h_{ij} = \{h_{ij}(1), \dots, h_{ij}(N)\}$ denote the histogram of the cell at position $(i, j)$. Finally, the normalized feature vector of each block can be given as:

$$v = \frac{V}{\lVert V \rVert_2}$$

In this process, the gradient is computed over the entire image, hence the feature vector dimension is high. For better classification, only the features obtained from the object region should be considered, and the background region features should be eliminated to reduce the error. To perform this task, we present a PCA-based feature reduction strategy. In digital image processing, any input image can be represented as a group of pixels arranged in a two-dimensional matrix. Generally, these pixels take floating-point values for colour images, while gray-scale images contain discrete values, which can be denoted in matrix form as follows:

$$I = \begin{bmatrix} f(1, 1) & \cdots & f(1, n) \\ \vdots & \ddots & \vdots \\ f(m, 1) & \cdots & f(m, n) \end{bmatrix}$$

where $x$ and $y$ denote the pixel coordinates and $f(x, y)$ denotes the corresponding pixel value of the image.
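Returning to the HoG cell histogram described earlier in this sub-section, the following sketch computes a magnitude-weighted histogram of unsigned gradient orientations for a single cell and L2-normalizes it. The use of 9 bins, central differences restricted to interior pixels, and the function names are assumptions made for illustration; the paper's own bin count and block layout are not recoverable from the text.

```python
import math

def cell_histogram(cell, bins=9):
    """Unsigned-orientation histogram of one cell (list of rows of intensities)."""
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = math.pi / bins
    # Central differences are only defined for interior pixels of the cell.
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            theta = math.atan2(gy, gx) % math.pi  # fold orientation into [0, pi)
            index = min(int(theta / bin_width), bins - 1)
            hist[index] += magnitude  # vote weighted by gradient magnitude
    return hist

def l2_normalize(vector):
    """Divide a vector by its Euclidean norm (returned unchanged if the norm is 0)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

if __name__ == "__main__":
    # A horizontal intensity ramp: every gradient points along the x axis.
    ramp = [[10 * x for x in range(4)] for _ in range(4)]
    print(l2_normalize(cell_histogram(ramp)))
```

A block descriptor would be obtained by concatenating the histograms of its cells before normalization, which gives the block dimension stated in the text.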
Image feature dimension reduction can be divided into four main steps: image normalization, covariance computation, eigenvector computation and data transformation.
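The four steps listed above can be sketched in a few lines. Note the swapped-in technique: the paper obtains the eigenvectors through an SVD, whereas this dependency-free sketch uses power iteration on the covariance matrix and recovers only the first principal component. The function name, the iteration count and the starting vector are assumptions for illustration.

```python
import math

def pca_first_component(samples, iterations=200):
    """Return (direction, projections) for the first principal component.

    samples is a list of equal-length feature vectors. Power iteration is used
    here instead of an SVD; it can fail if the start vector happens to be
    orthogonal to the leading eigenvector.
    """
    n, d = len(samples), len(samples[0])
    # Step 1: normalization (subtract the mean of every feature).
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centred = [[s[j] - mean[j] for j in range(d)] for s in samples]
    # Step 2: covariance matrix of the centred data.
    cov = [[sum(row[a] * row[b] for row in centred) / n for b in range(d)]
           for a in range(d)]
    # Step 3: leading eigenvector by power iteration.
    start_norm = math.sqrt(sum((j + 1) ** 2 for j in range(d)))
    v = [(j + 1) / start_norm for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Step 4: transform the data onto the reduced (one-dimensional) space.
    projections = [sum(row[j] * v[j] for j in range(d)) for row in centred]
    return v, projections

if __name__ == "__main__":
    points = [[0, 0], [1, 1], [2, 2], [3, 3]]
    direction, reduced = pca_first_component(points)
    print(direction, reduced)
```

Keeping more than one component would require deflation or a full eigen-decomposition; in practice a numerical library's SVD routine is the usual choice.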
First, image normalization is applied, in which the mean value is subtracted from the original image; this helps to improve the SNR quality. This data normalization can be obtained as:

$$Z(x, y) = f(x, y) - \bar{f}(x, y)$$

where $f(x, y)$ is the pixel value and $\bar{f}(x, y)$ denotes the mean value of the image. In the next stage, we compute the covariance, which is used to identify the lowest and highest variance points in the image. The covariance matrix of the normalized data $Z$ can be obtained as follows:

$$C = \frac{1}{N} Z Z^{T}$$

where $N$ denotes the total number of elements. In the next phase of PCA, the eigenvectors and eigenvalues of the covariance matrix are computed, which can be obtained from the SVD equation, expressed as:

$$Z = U S V^{T}$$

where $U$ denotes the eigenvectors of the covariance $C$ and the squares of the singular values ($S^2$) give the eigenvalues of $C$ up to the factor $1/N$. Finally, the complete data is transformed into new data with reduced dimension, which can be expressed in terms of the first $k$ eigenvectors $U_k$ as follows:

$$Y = U_k^{T} Z$$

3) SIFT feature extraction: This approach generates a large number of features over multiple locations and a wide range of scales. The number of features generated depends on the image size and the input parameters. The aim of SIFT is to find reliable locations in the given scale space from which features can be extracted efficiently. The first stage computes scale-space extrema in $D(x, y, \sigma)$ with the help of the Difference-of-Gaussian (DoG) function, where two nearby scales separated by a multiplicative factor $k$ are considered. This computation can be given as:

$$D(x, y, \sigma) = \big(G(x, y, k\sigma) - G(x, y, \sigma)\big) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$$

where $L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$ is the scale-space function obtained by convolving the Gaussian kernel $G(x, y, \sigma)$ with the input image $I(x, y)$. Before the keypoint descriptor is constructed, each keypoint is assigned an orientation so that the descriptor becomes invariant to rotation. To compute the keypoint orientation, the orientation histogram of local gradients is computed from the smoothed image $L(x, y)$.
For any image sample $L(x, y)$ at the given scale, the gradient magnitude $m(x, y)$ and orientation $\theta(x, y)$ can be computed using pixel differences as:

$$m(x, y) = \sqrt{\big(L(x+1, y) - L(x-1, y)\big)^2 + \big(L(x, y+1) - L(x, y-1)\big)^2}$$

$$\theta(x, y) = \tan^{-1}\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}$$

4) Non-parametric feature extraction: This section presents the non-parametric feature extraction process, which uses the wavelet transform of an input image of dimension $M \times N$. In this process, wavelet decomposition is applied with $J$ decomposition levels and descriptors are computed in patch form. These patches are characterized as feature vectors for the decomposed sub-bands. The first sub-band, the LL band, is computed by low-pass filtering in both the horizontal and vertical directions. The remaining sub-bands are obtained by combining low-pass and high-pass filtering: the LH band, where low-pass filtering is applied horizontally and high-pass filtering vertically; the HL band, where high-pass filtering is applied horizontally and low-pass filtering vertically; and the HH band, where high-pass filtering is applied in both directions.
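A non-parametric feature vector is formed by computing the sub-band variance $\sigma^2$, mean value $\mu$, L1-norm $\lVert W \rVert_1$, L2-norm $\lVert W \rVert_2$ and entropy $H$ for each sub-band. For a sub-band of size $M_s \times N_s$, the variance can be given as:

$$\sigma^2 = \frac{1}{M_s N_s} \sum_{i=1}^{M_s} \sum_{j=1}^{N_s} \big(W(i, j) - \mu\big)^2$$

where $W(i, j)$ denotes the wavelet coefficients and $\mu$ is the mean of the sub-band, which is expressed as:

$$\mu = \frac{1}{M_s N_s} \sum_{i=1}^{M_s} \sum_{j=1}^{N_s} W(i, j)$$

Similarly, the first-order and second-order norms are given as follows:

$$\lVert W \rVert_p = \left( \sum_{i=1}^{M_s} \sum_{j=1}^{N_s} \lvert W(i, j) \rvert^{p} \right)^{1/p}$$

where $p$ denotes the norm factor, i.e. $p = 1$ gives the first-order norm and $p = 2$ gives the second-order norm. In the next phase of feature computation, entropy and coding gain are computed in each sub-band. In this work, we use Shannon's entropy, which can be computed as:

$$H = -\sum_{i} p_i \log_2 p_i$$

where $p_i$ denotes the probability of occurrence of a value in the sub-band.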
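The sub-band statistics above can be illustrated with a one-level decomposition. The sketch below uses the Haar wavelet, which is an assumption (the paper does not name its wavelet or number of levels), and estimates the entropy from the relative frequency of rounded coefficient values, which is also an assumption about how the probabilities are obtained. The function names are hypothetical.

```python
import math

def haar_level1(image):
    """One-level 2-D Haar decomposition of an image with even dimensions."""
    bands = {"LL": [], "LH": [], "HL": [], "HH": []}
    for r in range(0, len(image) - 1, 2):
        rows = {name: [] for name in bands}
        for c in range(0, len(image[0]) - 1, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            rows["LL"].append((a + b + d + e) / 2)  # low-pass in both directions
            rows["LH"].append((a + b - d - e) / 2)  # low horizontal, high vertical
            rows["HL"].append((a - b + d - e) / 2)  # high horizontal, low vertical
            rows["HH"].append((a - b - d + e) / 2)  # high-pass in both directions
        for name in bands:
            bands[name].append(rows[name])
    return bands

def subband_features(coeffs):
    """Return [variance, mean, L1-norm, L2-norm, entropy] of one sub-band."""
    flat = [v for row in coeffs for v in row]
    n = len(flat)
    mean = sum(flat) / n
    variance = sum((v - mean) ** 2 for v in flat) / n
    l1 = sum(abs(v) for v in flat)
    l2 = math.sqrt(sum(v * v for v in flat))
    # Entropy from the relative frequency of (rounded) coefficient values.
    counts = {}
    for v in flat:
        key = round(v, 6)
        counts[key] = counts.get(key, 0) + 1
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return [variance, mean, l1, l2, entropy]

if __name__ == "__main__":
    img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    for name, band in haar_level1(img).items():
        print(name, subband_features(band))
```

Concatenating the five statistics of every sub-band gives a fixed-length descriptor that does not depend on any tuned parameter, which is the sense in which the text calls these features non-parametric.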
In the next phase, the feature vector is constructed by combining all features in a row vector where GLCM, HoG, SIFT and non-parametric features are arranged in a sequence.
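In the K-SVD dictionary learning problem of Section III-C, $D$ denotes the dictionary to be learned, $X = [x_1, \dots, x_N]$ denotes the sparse coding vectors, $\lVert \cdot \rVert_F$ denotes the Frobenius norm, and $T_0$ denotes the sparsity limit, which bounds the number of non-zero coefficients in each sparse coding vector.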
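Let the learned dictionary be $D$, so that a sample $y$ is expressed as the linear combination $y \approx D\hat{x}$, where $\hat{x}$ denotes the sparse code. The solution for the sparse code can be given as:

$$\hat{x} = \arg\min_{x} \lVert y - Dx \rVert_2^2 \quad \text{subject to } \lVert x \rVert_0 \le T_0$$

After dictionary training is finished for each class $c \in \{1, \dots, C\}$, a new incoming patch is classified using the representation error, given as:

$$e_c = \lVert y - D_c \hat{x}_c \rVert_2$$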
From the class errors, a pseudo-probability $P_c$ is computed for each class and used in the class assignment rule: the patch is assigned to the class with the highest pseudo-probability.
The proposed approach thus introduces a new model for significant feature extraction. This technique offers several advantages: a new hybrid approach to feature extraction, reduced computational complexity, non-parametric feature extraction that captures transform-domain information, and a new learning scheme that reduces the learning error.
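The classification-by-representation-error idea described above can be sketched as follows. Several simplifications are assumed: the per-class dictionaries are small hand-made sets of unit-norm atoms rather than dictionaries learned with K-SVD, the sparse code is found with plain matching pursuit rather than the solver used in the paper, and the decision is taken directly on the smallest error instead of through the pseudo-probability. The class labels and function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(y, atoms, sparsity):
    """Greedy sparse code of y over unit-norm atoms using `sparsity` steps."""
    residual = list(y)
    coeffs = [0.0] * len(atoms)
    for _ in range(sparsity):
        scores = [dot(residual, atom) for atom in atoms]
        # Pick the atom most correlated with the current residual.
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeffs[k] += scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual

def src_classify(y, class_dictionaries, sparsity=2):
    """Assign y to the class whose dictionary gives the smallest residual norm."""
    errors = {}
    for label, atoms in class_dictionaries.items():
        _, residual = matching_pursuit(y, atoms, sparsity)
        errors[label] = math.sqrt(dot(residual, residual))
    return min(errors, key=errors.get), errors

if __name__ == "__main__":
    # Hypothetical per-class dictionaries with unit-norm atoms.
    dictionaries = {
        "benign": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
        "malignant": [[0.0, 0.0, 1.0], [0.0, 0.6, 0.8]],
    }
    label, errors = src_classify([0.9, 0.1, 0.0], dictionaries)
    print(label, errors)
```

A sample that is well represented by the atoms of one class leaves almost no residual for that class and a large residual for the others, which is what makes the representation error usable as a class score.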

C. Classifier Construction
The previous section presented the proposed feature extraction techniques and formulated a robust feature vector. To improve the classification process, a data pattern learning scheme needs to be applied to the extracted features. Here, a dictionary learning based approach is considered for pattern learning, where the K-SVD approach is used. The main advantage of K-SVD is that it enforces data sparsity, followed by normalization, prior to training. Let us consider that the input feature vectors are given as $Y = [y_1, \dots, y_n]$, which are the L2-normalized training samples. During this phase, the K-SVD approach tries to solve the following problem over the dictionary $D$ and the sparse code matrix $X$, with sparsity limit $T_0$:

$$\min_{D, X} \|Y - DX\|_F^2 \quad \text{subject to} \quad \|x_i\|_0 \le T_0 \;\; \forall i$$
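To make the dictionary learning step more concrete, the sketch below shows only the atom-update pass of K-SVD, with the sparsity pattern of the codes held fixed. It is a simplified illustration, not the authors' MATLAB implementation: the sparse coding pass that alternates with it is omitted, and the names `l2_normalize_columns` and `ksvd_update_atoms` are hypothetical.

```python
import numpy as np

def l2_normalize_columns(Y):
    """Scale every column (training sample or atom) to unit L2 norm."""
    norms = np.linalg.norm(Y, axis=0, keepdims=True)
    return Y / np.where(norms > 0, norms, 1.0)

def ksvd_update_atoms(Y, D, X):
    """One K-SVD dictionary-update pass with the support of X held fixed."""
    D = D.copy()
    X = X.copy()
    for k in range(D.shape[1]):
        users = np.flatnonzero(X[k, :])  # samples whose code uses atom k
        if users.size == 0:
            continue  # unused atom: leave it unchanged
        # Residual of those samples with atom k's contribution removed.
        E_k = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
        # The best rank-1 approximation of E_k gives the new atom and coefficients.
        U, s, Vt = np.linalg.svd(E_k, full_matrices=False)
        D[:, k] = U[:, 0]
        X[k, users] = s[0] * Vt[0, :]
    return D, X
```

Because each atom update is the optimal rank-1 fit to its residual, the reconstruction error $\|Y - DX\|_F$ cannot increase during this pass. A full K-SVD loop would alternate this update with a sparse coding step that respects the limit $T_0$.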

IV. RESULTS AND DISCUSSION
This section presents the complete experimental study of the proposed approach for mammogram classification. The proposed approach is implemented using the MATLAB 2013b simulation tool running on a Windows platform with an Intel i3 processor. It is evaluated on an open-source database known as the Mammographic Image Analysis Society (MIAS) database [36]. Complete information about the dataset is given in the corresponding ids. This database contains a total of 322 images, of which 115 images are abnormal and 207 images belong to the normal class. Here we consider a total of 115 images for the experimental study to classify the benign and malignant cases, where 62 images are benign and 53 images are malignant.

A. Performance Measurement
To measure the performance of the proposed approach, the dataset is divided into multiple training and testing sets of images. The complete performance is evaluated by computing the confusion matrix, which contains the True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) cases. With the help of the confusion matrix, we compute the following performance measurement parameters.

Precision: It is a measurement of the ability to discriminate the test images of a class from other images and classes.

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall: It is a measurement of the system's recognition performance for different types of mammograms. It is also known as sensitivity.

$$\text{Recall} = \frac{TP}{TP + FN}$$

False Positive Rate: It is a measurement of the proportion of images classified as a particular class that actually belong to a different class.

$$\text{FPR} = \frac{FP}{FP + TN}$$

Later, the F-measure is computed, which is the combined measurement of precision and recall:

$$F = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

Finally, the classification accuracy is measured, which shows the overall classification performance, given as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
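The measures above follow directly from the four confusion-matrix counts, as the following minimal sketch shows. The function name `classification_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, FPR, F-measure and accuracy from raw counts."""

    def ratio(num, den):
        # Assumption: report 0.0 when a measure is undefined (zero denominator).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    fpr = ratio(fp, fp + tn)
    f_measure = ratio(2 * precision * recall, precision + recall)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    return {
        "precision": precision,
        "recall": recall,
        "fpr": fpr,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }
```

For example, with hypothetical counts (not taken from the reported experiments) of 8 true positives, 2 false positives, 9 true negatives and 1 false negative, the function gives a precision of 0.8 and an accuracy of 0.85.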

B. Experimental Results
In this sub-section, the experimental results for various test cases are presented and discussed. Moreover, the intermediate stages are depicted for multiple test cases.
In Fig. 7, different test cases such as normal, benign and malignant are considered. Each image is processed through different phases: binarization, segmentation, muscle removal, ROI extraction, HoG features, SIFT features, and wavelet-based feature extraction.

C. Performance Measurements
This section presents the comparative performance in terms of classification accuracy, confusion matrix and other statistical measurement parameters. The confusion matrix obtained with the proposed classifier is given in Table III. Based on this confusion matrix, we compute the other performance measurement parameters, which are given in Table IV. Table V shows a comparative performance analysis against various state-of-the-art techniques. In this analysis, we compared several parameters: accuracy, precision, recall and F-score. The experimental analysis shows that the proposed approach achieves better performance, with Accuracy, Precision, Recall, and F-Score of 98.13%, 97.58%, 98.36% and 97.95%, respectively.

V. CONCLUSION
In this article, we focus on the development of a novel framework for breast cancer classification using a computer-vision based approach. To develop this model, an image pre-processing stage is presented first, where image binarization is performed using Niblack's method, followed by ROI segmentation and pectoral muscle removal. In the next phase, a robust feature extraction model is developed where GLCM, HoG with PCA, SIFT and DWT-based non-parametric features are calculated and represented in a sparse manner to reduce the data training error. Finally, a pattern learning scheme is applied where a K-SVD based dictionary learning approach is implemented, the SRC classifier is applied, and the performance is evaluated. A comparative performance analysis shows that the proposed approach achieves improved performance when compared with state-of-the-art techniques. Moreover, this approach reduces computational complexity, which can help clinicians obtain faster and more reliable outcomes to support a suitable diagnosis.

VI. FUTURE WORK
Further, this work can be extended with the help of deep learning schemes such as convolutional neural networks to address the issues related to computational complexity.