Feature Extraction based Breast Cancer Detection using WPSO with CNN

Cancer reports from the past few years in India indicate that 30% of cases are breast cancer, and this share may increase in the near future. It is further reported that one woman is diagnosed every two minutes and one dies every nine minutes. Early diagnosis of cancer saves the lives of the affected individuals. Microcalcifications are considered a key symptom for detecting breast cancer in its early stages. Several scientific investigations have been performed to fight this disease, for which machine learning techniques can be used extensively. Particle swarm optimization (PSO) is recognized as one of several efficient and promising approaches for diagnosing breast cancer, assisting medical experts in providing timely and apt treatment. This paper uses a weighted particle swarm optimization (WPSO) approach to extract textural features from the segmented mammogram image in order to classify microcalcifications as normal, benign or malignant, thereby improving accuracy. In the breast region, the tumour part is extracted using optimization methods. Convolutional Neural Networks (CNNs) are proposed for detecting breast cancer, which reduces manual overhead. The CNN framework is constructed to extract features efficiently. The designed model detects the cancer regions in mammogram (MG) images and rapidly classifies those regions as normal or abnormal. The model uses MG images obtained from various local hospitals.
Keywords: Breast cancer; microcalcifications; weighted particle swarm optimization (WPSO); Convolutional Neural Networks (CNNs); mammogram


I. INTRODUCTION
Breast cancer is the cancer most commonly found in women and causes deaths among women aged 20 to 59. According to the Ministry of Health and Medical Education, it has become the most common disease in recent years in Iran [1]. Today, 88% of women diagnosed with breast cancer have a life expectancy of 10 years. In the United States, it has been reported that about 12% of women are diagnosed during their lifetime, and the disease is referred to as the second cause of death among women [2]. Diagnosing the disease at an early stage is important because, at that stage, cancer masses are restricted to the breast and the chance of less invasive surgical treatment is increased. The mortality rate is also lower in the early stage [3]. In addition, the use of classifiers such as artificial neural networks is increasing in various fields of engineering sciences for analysing time series and various classification problems. Owing to techniques recently invented for the early diagnosis of breast cancer, the survival rate of patients has improved. Nowadays, X-ray mammography and MRI (Magnetic Resonance Imaging) are widely used, with some implications and limitations. X-rays are harmful because of the ionizing radiation, and thus patients' exposure has to be kept very short. MRI, in contrast, is expensive, while mammography costs less but struggles to provide consistency and accuracy in analysing breast cancer [4]. Moreover, errors occur during analysis. To increase accuracy and reduce the occurrence of errors, supervised machine learning approaches such as KNN, SVM and LSSVM have been developed. These models efficiently classify the features into normal or abnormal classes. However, these methods are complex and even tedious, with low CR.
Therefore, to address these drawbacks in breast cancer diagnosis, an optimal classification model is required. To this end, machine learning approaches based on image processing have been developed to classify mammogram images as cancer or non-cancer. As features are essential for discriminating breast cancer as benign or malignant, the feature extraction process is of the utmost importance. Once the features are extracted, properties of the image such as depth, coarseness, smoothness and regularity are obtained with the help of the segmentation process [5]. In breast cancer, the division of tumor cells is uncontrolled, and abnormal tumor cells need more nutrients to grow continuously and reproduce. The cancer cells penetrate into the surrounding tissue to gain nutrients. Blood circulation varies heterogeneously across tumors, and hence lesion morphology characteristics and the ambiguity of edges in diagnostic images are significant indicators for evaluation. The paramagnetic contrast agent spreads in the blood, enters the blood vessels and passes easily into the intercellular space and the cells through the penetrable capillary wall; hence the sputum concentration is high in the tumor-rich region. This abnormality can be found using TIC when DCE-MRI is used to image the same tissue several times at various stages. Thus, static characteristics of the lesion such as edge and shape, and dynamic characteristics such as the initial increase and change in signal, play a major role in identifying the tumor as benign or malignant. MRI images are usually clear and complete, with multi-angle, multifaceted imaging. For the breast, a surface coil has been used for clinical purposes, and MRI technology has improved to provide much clearer images. The true positive rate and the true negative rate obtained while diagnosing breast cancer have also improved simultaneously [6].
This paper contributes a weighted particle swarm optimization (WPSO) approach for extracting textural features from the segmented mammogram image in order to classify microcalcifications as normal, benign or malignant, thereby improving accuracy. In the breast region, the tumor part is extracted using optimization methods. Convolutional Neural Networks (CNNs) are proposed for detecting breast cancer, which reduces manual overhead. The CNN framework is constructed to extract features efficiently. The designed model detects the cancer regions in mammogram (MG) images and rapidly classifies those regions as normal or abnormal.
The remainder of this work is organized as follows. Related work is outlined in Section II; Section III elaborates the proposed methodology, while Section IV describes the experiments and discusses the results obtained. Finally, Section V concludes the work with future improvements.

II. RELATED WORK
This section discusses a few related works on diagnosing breast cancer that involve various optimization techniques. Breast cancer is well known to be one of the most serious and dangerous cancers among women, and hence diagnosis at an early stage is more effective for providing treatment and protecting the lives of patients. To date, several approaches have been proposed for detecting breast cancer, addressing different sorts of challenges; a few of them are reviewed here. Asri et al. [7], in 2016, employed machine learning methods for prediction and classification on the actual WBC dataset. Various classifiers were used, including SVM, Naive Bayes, KNN and decision tree C4.5. SVM produced an accuracy with the Weka tool. In [8], Chowdhary et al. utilized mammography images for the detection of breast cancer using an intuitive fuzzy histogram magnification approach, whereby the data was processed and the image quality was improved. Then, a probabilistic fuzzy clustering approach was employed for segmenting and separating the cancer tissues. Hence, this model was suitable for processing larger cancer datasets with the objective of offering better accuracy. Next, textural properties were extracted with methods such as grey area coefficient and linear binary pattern. The accuracy obtained was 94%, but the method is hard to apply to larger datasets and extends the processing time. Aalaei et al. [9] employed genetic meta-specificity reduction for classifying breast cancer. Three datasets, namely WBC, WDBC and WPBC, were used for evaluation with an Artificial Neural Network (ANN) cluster. The accuracies estimated with the WBC, WDBC and WPBC datasets were 96, 96.1 and 76.3, respectively. Even though the feature set was reduced, the accuracy could still be improved. Nilashi et al. [10], in 2017, designed a knowledge-based system involving fuzzy logic. The process was carried out in three steps. Initially, the Wisconsin Breast Cancer data was processed.
Then, similar data was grouped into clusters using the Expectation Maximization (EM) clustering technique. Finally, once the features had been reduced by PCA, the fuzzy rule set was used to categorize the data by means of a regression tree. The accuracy obtained was 93.2%. When learning rules are applied to datasets, the classification task sometimes becomes complicated. In [11], the Bat algorithm was used to select optimal features for diagnosing breast cancer. 286 samples were selected from the WDBC dataset, for which a simple random sampling approach was involved in feature selection. After the features were selected, an overall ranking was performed according to the classification similarity obtained with Random Forest (RF), and an accuracy was obtained. As the samples are selected at random, feature selection was sometimes difficult. In [12], Doreswamy et al. improved the Bat algorithm to classify breast cancer images. 569 samples of UCI data were involved in the experiments with this method. The accuracy on the training set was 92.61, while that on the testing set was 89.95. In [13], an approach using PSO was utilized for reducing the specificity in diagnosing breast cancer. The objective was to estimate the level of breast cancer. 699 pre-processed samples of UCI data, after the specificity had been reduced, were used by the PSO algorithm along with decision tree C4.5 to classify the samples into two classes, namely malignant and benign. The accuracy achieved was 95.61%. Sahu et al. [14] used a hybrid approach for classifying and diagnosing breast cancer. With PCA feature reduction and various clusters, it was found that ANN classification achieved a performance of 97%, higher than the other clusters. 699 samples with 9 features were used in the experiment and labelled as benign or malignant. Even though the results achieved are good, every method has a few weaknesses and limitations.
In [15], Gao
The distinguishing features of the present investigation are a reduction in detection costs, the use of a better classifier without the adverse effects of aggressive approaches, higher detection accuracy than the papers cited, the choice of titles appropriate to the data available, and a comprehensive comparison with the research carried out so far.

III. PROPOSED METHODOLOGY
The workflow of the developed methodology is illustrated in Fig. 1. The pre-processing, segmentation and feature extraction phases are discussed below. A CNN classifier is used to obtain the classification accuracy.

A. Pre-processing Steps
Step 1: An input breast image is acquired.
Step 2: The raw input image is resized to 256 x 256.
Step 3: When 3-dimensional (3D) images are provided as input, they are converted to 2D, since image processing is mostly carried out on 2D images; i.e., an RGB image is converted into a grayscale image.
Step 4: Two filtering techniques are applied for de-noising, as described below:
Step 4.1: Out_1 = result of applying a Laplacian filter to the grayscale image.
Step 4.2: Out_2 = result of applying a mean filter to the grayscale image.
Step 4.4: The final output of the pre-processing stage is the pre-processed breast image Out_3.
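The pre-processing steps above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the nearest-neighbour resize, the BT.601 luminance weights, the 3 x 3 kernel sizes and the helper names (`resize_nearest`, `to_gray`, `filter3x3`, `preprocess`) are all assumptions made for illustration. Because the source does not state how Out_1 and Out_2 are combined into Out_3, the sketch returns both filtered images and leaves that step open.

```python
def resize_nearest(img, size=256):
    """Nearest-neighbour resize of an H x W image (list of rows) to size x size."""
    h, w = len(img), len(img[0])
    return [[img[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]

def to_gray(rgb):
    """Convert an image of (r, g, b) tuples to grayscale floats.
    The luminance weights are an assumption; the paper does not state them."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]

def filter3x3(img, kernel):
    """Apply a 3x3 kernel to every pixel, replicating edge pixels at the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out

LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]   # assumed 4-neighbour Laplacian
MEAN = [[1 / 9] * 3 for _ in range(3)]           # assumed 3x3 mean filter

def preprocess(rgb, size=256):
    """Steps 2 to 4.2: resize, convert to grayscale, then filter twice."""
    gray = to_gray(resize_nearest(rgb, size))
    out_1 = filter3x3(gray, LAPLACIAN)   # Step 4.1
    out_2 = filter3x3(gray, MEAN)        # Step 4.2
    return gray, out_1, out_2
```

On a perfectly uniform image the Laplacian response is zero everywhere and the mean filter leaves the intensities unchanged, which is a quick sanity check for both kernels.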

B. Segmentation
Step 1: The gradient is obtained along the X and Y axes in the variables Out_X and Out_Y.
Step 2: The gradient values are combined to obtain the gradient vector G_val.
Step 3: G_val, obtained in radians, is converted to degrees so that the orientation information of the image pixels can be attained.
Step 4: The Out_3 image is partitioned into grids GR_i.
Step 5: Threshold values are defined for intensity (T_i) and orientation (T_o).
Step 6.1: The histogram H_i for every pixel P_j is computed over grid GR_i.
Step 6.2: The most frequent histogram of grid GR_i is found, which is represented by Freq_H.
Step 6.3: An arbitrary pixel P_j related to Freq_H is selected and assigned as the seed point (SP), with intensity I_p and orientation O_p.
Step 6.4: The intensity and orientation constraints are verified for the adjacent pixels.
Step 6.5: When both constraints are fulfilled, the region is grown; otherwise the next grid is considered for further processing.
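As a rough illustration of the orientation and region-growing steps, the following pure-Python sketch computes a per-pixel gradient orientation in degrees and grows a region from a seed point while both threshold constraints hold. It is a simplified sketch under stated assumptions, not the authors' procedure: central differences for the gradients, `atan2` for the orientation, 4-connectivity, and the function names `orientation_deg` and `grow_region` are all assumed. Grid partitioning and histogram-based seed selection are omitted, and angular wrap-around is ignored.

```python
import math
from collections import deque

def orientation_deg(img):
    """Per-pixel gradient orientation in degrees (assumed central differences)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]   # Out_X
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]   # Out_Y
            out[y][x] = math.degrees(math.atan2(gy, gx))             # radians to degrees
    return out

def grow_region(img, orient, seed, t_i, t_o):
    """Grow a 4-connected region from `seed` (row, col).

    A neighbour joins the region when its intensity differs from the seed
    intensity I_p by at most t_i AND its orientation differs from the seed
    orientation O_p by at most t_o.
    """
    h, w = len(img), len(img[0])
    sy, sx = seed
    i_p, o_p = img[sy][sx], orient[sy][sx]
    region = {seed}
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and (ny, nx) not in region:
                if abs(img[ny][nx] - i_p) <= t_i and abs(orient[ny][nx] - o_p) <= t_o:
                    region.add((ny, nx))
                    queue.append((ny, nx))
    return region
```

For example, calling `grow_region(img, orientation_deg(img), seed, t_i, t_o)` with a seed inside a bright blob returns the set of pixel coordinates that satisfy both constraints relative to the seed.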

C. Weighted PSO based Feature Extraction
Weighted Particle Swarm Optimization (WPSO) is a heuristic global optimization technique that simulates the social behaviour of flocking birds moving towards a position in order to attain a given objective in a multidimensional space. The approach maintains a population of particles (called a swarm) in the search space. The status of every particle $i$ is characterized by its location $\vec{x}_i = \{x_{i1}, x_{i2}, \ldots, x_{iD}\}$ and its velocity $\vec{v}_i = \{v_{i1}, v_{i2}, \ldots, v_{iD}\}$. To find the optimal solution, every particle deviates from its current search direction to a new direction based on two quantities: the best location found so far by the particle itself (pbest) and the best location found so far by the swarm (gbest). WPSO identifies the optimal solution by updating the velocity and position of every particle according to

$$v_{id}^{t+1} = w\, v_{id}^{t} + c_1 r_1 \left(p_{id} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd} - x_{id}^{t}\right) \quad (1)$$

$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \quad (2)$$

where $t$ and $d$ represent the iteration in the evolutionary space and the dimension in the search space, respectively. $w$ denotes the inertia weight. $c_1$ and $c_2$ represent the personal and social learning factors. $r_1$ and $r_2$ are uniformly distributed random values between 0 and 1. $p_{id}$ and $p_{gd}$ denote pbest and gbest in dimension $d$.
The basic steps performed in the WPSO algorithm are as described below:
1) Initialization: Random positions and velocities are used to initialize the particles.
2) Evaluation: For every particle, the value of the objective function is estimated.
3) Finding pbest: When the value obtained with the objective function is better than the pbest of particle i, the current value is assigned as the new pbest.
4) Finding gbest: When pbest is better than gbest, the current value is assigned as the new gbest.
5) Updating the position and velocity: For every particle, the velocity is updated using Equation 1, and the particle is moved to the next position based on Equation 2.
6) Terminating criteria: When the required number of iterations is reached, the process ends; otherwise it is repeated from step 2.
In the search space, exploration and exploitation are controlled by the weight, as the velocity is adjusted dynamically. Moreover, the weight controls the impact of the previous velocities on the current one. The weight therefore sets the trade-off between the global and local exploration capabilities of the swarm. A larger weight facilitates global search for new areas, while a smaller weight facilitates local search. When the weight is chosen properly, the global and local exploration of the swarm is balanced, providing a better solution. Hence, the weight can initially be set to a larger value for better global exploration of the search space and then decreased gradually to obtain a refined solution. When the weight decreases linearly, the search changes linearly from global to local. Search algorithms are also required to have non-linear searching ability. With a few statistical features obtained, the PSO search is easily understood and a suitable weight is calculated for the next iteration. Here, as the total number of generations increases, the weight $w$ decreases linearly during optimization according to

$$w = w_{max} - \frac{w_{max} - w_{min}}{t_{max}}\, t$$

where $w_{max}$ and $w_{min}$ denote the maximum and minimum inertia weight, respectively, and $t$ and $t_{max}$ are the current iteration and the maximum number of iterations, respectively. For particle $i$, the best position is the position visited by the particle (a past value of $X_i$) that provides the highest fitness value. For minimization, a position with a smaller function value is considered to have higher fitness. $f(X)$ denotes the objective function to be minimized, for which the update equation is

$$P_i(t+1) = \begin{cases} P_i(t), & f(X_i(t+1)) \ge f(P_i(t)) \\ X_i(t+1), & f(X_i(t+1)) < f(P_i(t)) \end{cases}$$

The gbest model provides a faster rate of convergence at the expense of robustness; only a single best solution, termed the global best particle, is maintained. This particle acts as an attractor and pulls every particle towards it. Ultimately, every particle converges to this position, which therefore has to be updated regularly; otherwise the swarm converges prematurely. For every particle in the swarm, the fitness value is computed using the objective function.
Then, the $p_{id}$ and $p_{gd}$ values are evaluated and updated whenever a better particle best position or global best position is obtained.
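The update loop described in this subsection can be sketched in a few lines of pure Python. This is a minimal illustration, not the authors' code: the function name `wpso`, the default parameter values (`w_max=0.9`, `w_min=0.4`, `c1=c2=1.5`, 20 particles, 100 iterations), the clamping of positions to the search bounds and the scale of the initial velocities are assumptions. The inertia weight is assumed to decrease linearly from `w_max` to `w_min`, as the text describes.

```python
import random

def wpso(objective, dim, bounds, n_particles=20, t_max=100,
         w_max=0.9, w_min=0.4, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` with PSO using a linearly decreasing inertia weight."""
    rng = random.Random(seed)
    lo, hi = bounds
    span = hi - lo
    # Initialization: random positions and (small) random velocities.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(-span, span) * 0.1 for _ in range(dim)]
         for _ in range(n_particles)]
    # Evaluation and initial pbest / gbest.
    pbest = [p[:] for p in x]
    pbest_val = [objective(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(t_max):
        w = w_max - (w_max - w_min) * t / t_max   # linearly decreasing weight
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + personal pull + social pull.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                # Position update, clamped to the search bounds (assumption).
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)
            val = objective(x[i])
            if val < pbest_val[i]:                 # new personal best
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:                # new global best
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val
```

A typical call is `best, value = wpso(lambda p: sum(c * c for c in p), dim=2, bounds=(-5.0, 5.0))`, which minimizes a simple sphere function. In the paper's setting the objective would be the pixel-intensity-based function listed in the steps that follow.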
Steps for WPSO:
Step 1: Initialize the function.
Step 2: Create the objective function.
Step 3: The objective function is based on the intensity of the pixels.
Step 4: Calculate the pixel intensity of the images.
Step 5: Optimize the pixel intensity of the cancer image.
Step 6: Calculate the optimal value for the input image pixels.
Step 7: Extract the tumor part with maximum pixel intensity.
Step 8: Calculate the accuracy.

D. Classification using Convolution Neural Networks
The CNN takes the breast cancer image dataset as input for classification. Deep convolutional kernels are then trained using the introduced CNN architecture. ReLU nonlinearity is used in the convolution layers and is defined as:

$$f(x) = \max(0, x)$$

Generally, a convolution layer is stated as:

$$y_j = f\left(b_j + \sum_i k_{ij} * x_i\right)$$

Here, $x_i$ represents the $i$-th input map and $y_j$ denotes the $j$-th output map, $b_j$ is the bias parameter of the $j$-th map, $*$ denotes the convolution between two functions, and $k_{ij}$ is the convolutional kernel between the $i$-th input map and the $j$-th output map. A max-pooling layer follows the convolutional layer. In the max-pooling layer, every neuron in the $i$-th output map $y_i$ pools over an $s \times s$ non-overlapping area of the $i$-th input map $x_i$. In general, the max-pooling layer is defined as:

$$(y_i)_{p,q} = \max_{0 \le m,\, n < s} (x_i)_{p s + m,\; q s + n}$$

The convolutional and max-pooling layers are followed by fully connected layers and then by a Softmax classifier whose number of outputs equals the number of output classes. In the introduced architecture, tanh is used as the non-linearity connecting one layer with another. The Softmax function acts as a squashing function: a $K$-dimensional input is renormalized into real values ranging from 0 to 1 that sum to 1. This is represented mathematically as:

$$\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \quad j = 1, \ldots, K$$

The errors observed while developing ML approaches are training and generalization errors. The former is observed while training the neural network, whereas the latter is produced while testing the proposed classifier. In deep learning, training is frequently affected by over-fitting and under-fitting. To overcome these issues, batch normalization is applied after every layer in the proposed architecture for breast cancer classification (BCC). A dropout layer was added after the first fully connected layer. The entire architecture developed for breast cancer classification is illustrated in Fig. 2.
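The three building blocks named in this subsection (ReLU, non-overlapping max-pooling and Softmax) can be written out in plain Python to make their behaviour concrete. This is only an illustrative sketch with assumed function names; it operates on nested lists rather than on the tensors of a deep learning framework and is not the trained architecture of Fig. 2.

```python
import math

def relu(x):
    """ReLU nonlinearity: passes positive values, clips negatives to zero."""
    return max(0.0, x)

def max_pool(feature_map, s):
    """Max-pooling over non-overlapping s x s areas of a 2D feature map."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[p * s + m][q * s + n]
                 for m in range(s) for n in range(s))
             for q in range(w // s)]
            for p in range(h // s)]

def softmax(z):
    """Softmax: maps K scores to K probabilities in (0, 1) that sum to 1."""
    m = max(z)                                # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]
```

For instance, `max_pool` with `s=2` halves each spatial dimension of a 4 x 4 map, and `softmax([2.0, 1.0])` returns two probabilities whose larger entry corresponds to the larger score.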

E. Training of CNN
The proposed CNN architecture has two classes, namely benign and malignant. A weighted loss function was employed for training the proposed CNN classifier.
Here, xn represents the input vector, yn the prediction obtained from the classifier for the n-th clinical input, and tn its actual response. K and N are the number of classes and the total number of clinical samples, respectively.
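The exact form of the weighted loss is not reproduced in the text. One common choice that matches the symbols defined above is a class-weighted cross-entropy, in which the loss term of class k is scaled by a per-class weight and the result is averaged over the N samples. The sketch below assumes that form; the function name `weighted_cross_entropy`, the `class_weights` argument and the small `eps` constant are hypothetical and are not taken from the paper.

```python
import math

def weighted_cross_entropy(y, t, class_weights, eps=1e-12):
    """Class-weighted cross-entropy averaged over N samples (assumed form).

    y: N x K predicted class probabilities (e.g. Softmax outputs)
    t: N x K one-hot actual responses
    class_weights: K per-class weights (hypothetical; not given in the paper)
    """
    n = len(y)
    total = 0.0
    for y_n, t_n in zip(y, t):
        for k, (y_nk, t_nk) in enumerate(zip(y_n, t_n)):
            # eps guards against log(0) for confident wrong predictions.
            total -= class_weights[k] * t_nk * math.log(y_nk + eps)
    return total / n
```

Giving the rarer class a larger weight makes its misclassifications cost more, which is one usual motivation for weighting a loss on imbalanced clinical data.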
For recognition, the patch results of the entire image are combined. As the model is trained on image patches, a strategy is needed for partitioning the actual test images into patches, executing the model on them and combining the results to obtain the final result. Doing this over all possible patches is computationally too complex. Instead, grid patches are taken from the images, which provides a set of non-overlapping patches; this was reasonable and balanced classification performance against computational cost. With this model, every patch yields the probability of every possible class for that patch of the image. To combine the results produced by the patches of a test image, three different fusion rules were tried, and the Sum rule was found to produce better results.
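The grid-patch extraction and Sum-rule fusion described above can be sketched as follows. The helper names `grid_patches` and `fuse_sum_rule` are assumptions, and the per-patch class probabilities are taken as given (in the paper they would come from the trained CNN). Under the Sum rule, the class probabilities of all patches are added and the class with the largest total is chosen.

```python
def grid_patches(img, size):
    """Split a 2D image (list of rows) into non-overlapping size x size patches."""
    h, w = len(img), len(img[0])
    return [[row[x:x + size] for row in img[y:y + size]]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]

def fuse_sum_rule(patch_probs):
    """Combine per-patch class probabilities with the Sum rule.

    patch_probs: one probability vector per patch.
    Returns the winning class index and the summed score of every class.
    """
    n_classes = len(patch_probs[0])
    sums = [sum(p[k] for p in patch_probs) for k in range(n_classes)]
    label = max(range(n_classes), key=lambda k: sums[k])
    return label, sums
```

Note that the Sum rule can disagree with a simple majority vote: one very confident patch can outweigh two mildly confident patches that favour the other class.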

IV. PERFORMANCE ANALYSIS
Detecting breast cancer at an early stage is critical for treating and managing the condition. This study presented a detailed derivation of the methods and processes, along with the way they are applied to detecting tumors. Using the tissue segmentation method, the effect of the number of glandular tissues obtained is analysed. It is observed that the presence of numerous glandular tissues worsens the imaging effect. A progressive approach to detecting multiple tumors is also introduced. Imaging is done in three steps, namely preliminary examination, refocusing and image optimization, by which every tumor is detected successfully. Here, WPSO-CNN is used for feature extraction and tumor classification, and it obtained enhanced accuracy. Features were extracted from the histopathology image, which was then classified. Fig. 3 shows the features extracted from the histopathology image using WPSO-CNN. Fig. 4 shows the classified image with malignancy.
The accuracy, precision, recall and F1-score graphs are shown in Figs. 5, 6, 7 and 8, respectively. These graphs compare the parameters of the existing and proposed techniques.
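For reference, the four reported measures follow directly from the counts of true and false positives and negatives. The sketch below uses their standard definitions; the function name `classification_metrics` and the convention that label 1 marks the positive (abnormal) class are assumptions, and no values from the paper's figures are reproduced.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Precision penalizes false alarms, recall penalizes missed cancers, and the F1-score is their harmonic mean, so it is high only when both are high.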
From the analysis of the results obtained for the proposed technique, it is observed that performance is significantly improved, in particular in classification.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 12, 2021
Fig. 5 shows the classification accuracy analysis for the proposed technique. The accuracy measured for the proposed technique is significantly higher than that of the existing techniques.
Fig. 6 shows the precision analysis for the proposed technique, compared overall with the existing classifiers. The precision measurements show that the proposed technique provides improved performance over the existing techniques.
Fig. 7 shows the recall analysis for the proposed technique. The comparative analysis of the recall measurements shows that the proposed technique provides improved performance over the existing classification techniques. Fig. 8 shows the F1-score analysis for the proposed technique. From the measured F1-score, it is concluded that the proposed technique exhibits improved performance over the existing classification techniques.
Limitations of the proposed WPSO-CNN:
• WPSO-CNN is best suited to image classification, dealing with high-dimensional data (images).
• WPSO-CNN is not well suited to smaller datasets.
• WPSO-CNN becomes slower if there are more layers.

V. CONCLUSION
The objective of this research is to improve detection accuracy using a CAD technique for detecting breast cancer. With this objective, a framework was contributed along with its flow and the parameters used for simulation. A publicly available dataset is used to analyse the effectiveness of the method in classifying normal and abnormal breast images of several individuals. Here, weighted particle swarm optimization (WPSO) is combined with a CNN (Convolutional Neural Network), named WPSO-CNN, with the objective of extracting the features and estimating the error between the estimated and true density using a kernel density estimation based classifier for diagnosing breast cancer. The results show that WPSO-CNN performs remarkably better than the existing approaches. Future work is to develop an online breast cancer detection system, since the detection systems currently in use are offline.