Multi-Class Breast Cancer Classification using Deep Learning Convolutional Neural Network

Abstract-Breast cancer continues to be among the leading causes of death for women, and much effort has been expended on screening programs for prevention. Given the exponential growth in the number of mammograms collected by these programs, computer-assisted diagnosis has become a necessity. Computer-assisted detection techniques developed to date to improve diagnosis without multiple systematic readings have not resulted in a significant improvement in performance measures. In this context, automatic image processing techniques based on deep learning represent a promising avenue for assisting in the diagnosis of breast cancer. In this paper, we present a deep learning approach based on a Convolutional Neural Network (CNN) model for multi-class breast cancer classification. The proposed approach classifies breast tumors not just as benign or malignant but also predicts the tumor subclass, such as fibroadenoma or lobular carcinoma. Experimental results on histopathological images from the BreakHis dataset show that the DenseNet CNN model achieves 95.4% accuracy in the multi-class breast cancer classification task, which compares favorably with state-of-the-art models.
Keywords-Breast cancer classification; Convolutional Neural Network (CNN); deep learning; medical image processing; histopathological images


I. INTRODUCTION
Breast cancer is a major public health issue because it is the most common cancer in women and the leading cause of cancer death among them worldwide. Indeed, nearly one in seven women will be affected by this pathology during her lifetime, with the risk increasing with age [1]. In addition, worldwide studies reported 522,000 deaths from breast cancer in 2012, an increase of 14% over 2008 [2]. The development of mass breast cancer screening has led to earlier diagnosis and faster management, with a significant improvement in survival rate. The processing and analysis of medical images is a rapidly expanding area in which the automatic extraction of information contained in medical images is urgently needed. Indeed, the great diversity of medical imaging devices, the difficulty of interpreting these images, and their large number generate tedious work for those who must interpret them [3]. To process this large volume of information, doctors are turning to systems that assist in the analysis and interpretation of these images. This analysis aims to facilitate the practitioner's diagnosis and to make it as accurate and reliable as possible [4]. However, despite advanced technology in the medical sector, breast cancer analysis remains a real public health problem and a sensitive, topical research subject. Mammographic imaging is one of the most commonly used modalities [5]. This tool, which is of interest in this paper, has become indispensable for any clinical examination related to breast cancer. In the field of computational medical imaging, deep convolutional neural network (CNN) methods [6] have proved successful for the hierarchical unsupervised learning of imaging features of increasingly complex data directly from raw images, making it possible to discover the relevant characteristics instead of extracting features defined a priori by the user.
Variable selection can be integrated with feature learning, both on the raw data and on the learned features [6], [7]. Similarly, supervised classification can be integrated into the same architecture as the two previous steps to optimize and automate the process [8], [9]. Studies comparing conventional multi-step computational imaging methods with deep learning methods have shown better classification accuracy and mortality prediction with deep learning in the case of breast cancer screening mammograms [8]. Deep learning refers to advanced statistical learning methods organized in multiple layers that extract representations of data at multiple levels, and whose layers are not predefined by the user but learned directly from the data by the algorithm, thus mimicking human neuronal functioning [10].
It has been successfully applied to various pathologies and modalities, including the use of convolutional neural networks (CNNs) that exploit large databases for the extraction of relevant descriptors and for segmentation [11]. The main challenge for automatic computer-aided cancer diagnosis systems is dealing with the inherent complexity of histopathological images. To deal with this, we use a powerful convolutional neural network for the multi-class classification problem. We propose to use the DenseNet model [12], one of the state-of-the-art models in the ImageNet image recognition competition [13]. DenseNet was built for natural image processing, but we modified it to handle histopathology images for breast cancer classification using transfer learning.
The results obtained using transfer learning on the proposed custom model surpass the current best performance for all of the resolutions in the benchmark dataset. The remainder of this paper is divided into four sections. After this introduction, related works on breast cancer classification are reviewed in Section II. Section III presents the proposed CNN model for multi-class breast cancer classification. Experiments, results and a comparison with popular CNN models are detailed in Section IV. Finally, the paper is concluded in Section V.

II. RELATED WORKS
Research related to the detection of breast cancer has increased during the last decade. Much work has been directed towards detecting the presence of cancerous tissue in the breast and classifying tumors. Some researchers have preferred to design aided diagnosis systems based on content-based image retrieval techniques, which have the advantage of offering radiologists images from a medical image database whose content is known and which are similar to the query image about which the radiologist has doubts. However, this approach also raises problems of search time and of finding an adequate similarity measure between the query image and those contained in the database. For example, Tourassi et al. [14] proposed a content-based search system for tumor detection that makes use of the expert knowledge present in the mammograms that make up the image database. To achieve this, they first use template matching to find, among the images in the database, those that are similar to the query region of interest (ROI) presented by the user of the system. In order to determine whether the query ROI contains a tumor (of any kind) or only healthy tissue, they proposed a decision measure that effectively combines several similarity measures on the best matches.
For their part, Alto et al. [15] preferred extracting descriptors of texture, shape and edge sharpness. Descriptors related to pixel intensity, shape and texture were merged by Tao et al. [16] in order to find tumors similar to the one contained in the query ROI and to classify it as benign or malignant. For better visual similarity, Zheng et al. [17] proposed a system that provides further interaction with the user, who is asked to rate the degree of spiculation of the tumor in the query image so that the system only looks for matches with similar degrees of spiculation. This work was subsequently improved by removing from the search base the ROIs that gave the worst similarity scores [18].
Moreover, several works have tried to find, in image databases, mammary tumors with similar characteristics such as shape, contour and pathology. For example, shape, intensity and texture descriptors were combined using a suitable weighting system, also exploiting user interaction to optimize the quality of the proposed matches [19]. Narvaez et al. [20] proposed a method that begins by merging the shape and texture descriptors extracted from the two views of the breast to find the best matches; the matched images are then used to annotate the query ROI. Liu et al. [21] introduced an image search based on a hash function to produce a diagnosis for the tumors contained in the query ROIs. More specifically, a hash function inspired by graph theory and named anchor [22] was used to compress two descriptors, namely the SIFT histogram and GIST, into binary codes; the similarity search was then carried out in Hamming space.
Moayedi et al. [23] developed an automatic classification of tumors in mammograms. They exploited three kinds of descriptors, namely texture features based on contourlet transforms, geometric characteristics representing the orientation, the zone and the center of the tumor, and statistical descriptors obtained from the co-occurrence matrix. For localization and for classification of tumors into malignant and benign, they used successive enhancement learning (SEL) based weighted SVM, support vector-based fuzzy neural networks (SVFNN) and a kernel-based SVM classifier.
In [24], the authors carried out a quantitative approach to the classification of tumors in mammography based on texture descriptors. They extracted a set of texture descriptors from 130 mammograms under different configurations and scales. In addition, multivariate analysis of variance (MANOVA) was applied to construct additional subsets of statistically independent texture descriptors. Thus, a texture signature is attributed to both malignant and benign tumors. For the classification stage, the authors used linear and nonlinear classifiers: Linear Discriminant Analysis (LDA), Least Square Minimum Distance (LSMD), k-nearest neighbors (k-NN), Radial Basis Function (RBF) and Multilayer Perceptron (MLP) artificial neural networks (ANN), as well as Support Vector Machines (SVM). The authors asserted that texture descriptors extracted at large scales are richer in content than those extracted at small scales. They achieved a tumor classification rate of 83.9% using the SVM.
Guo et al. [25] proposed a Multilayer Perceptron (MLP) as a classifier for diagnosing breast cancer. As a first step, variable selection is performed on the data using Genetic Programming (GP). The selected variables are then fed to the input of the MLP to evaluate the classification performance. The neural network converged with an average classification rate of 96.21%. The results obtained by the authors highlight the ability of the GP method to transform the information by reducing the dimensionality of the variable space and to define the relationship between the data automatically. These two properties help to improve the accuracy of the classification. The recognition rates obtained are interesting; however, the classifier was binary, with one class of malignant cases and one class of benign cases.
The work done in the literature [26] on deep learning, through Convolutional Neural Networks among others, has opened the way to an automatic representation approach based on unsupervised descriptor extraction, i.e. independent of any human intervention that could affect its performance. Deep CNNs often have so many parameters that they cannot reasonably be trained without a very large dataset. Medical imaging datasets are often not large enough to train a deep CNN model from scratch adequately. Thus, the use of transfer learning in medical imaging has been explored. Transfer learning aims to transfer knowledge from a large source domain to a small target domain [27]. For CNNs, this is often done by pre-training a CNN model on the source dataset, then re-training parts of the model on the target dataset.
In [28], the authors adapted the popular CNN AlexNet to classify breast cancer tumors from histopathological images of the BreakHis dataset [29]. They proposed a sliding window mechanism to extract random patches for the training strategy and reached an average classification rate of 79.85%. Han et al. [30] proposed a breast cancer multi-classification framework using a class structure-based deep convolutional neural network model (CSDCNN). It learns features in a particular manner, using prior knowledge of the class structure of histopathological images. The structured deep learning model reached a remarkable performance of 93.2% average accuracy on the BreakHis dataset. Nuh et al. [31] distinguished cell and non-cell samples in breast tumor images using Convolutional Neural Networks with different spatial patches. The classification accuracy was estimated at 86.91% and 86.17% for 5x5 and 7x7 sub-window sizes respectively. In [32], the authors presented a CNN classifier for visual analysis of invasive ductal carcinoma tissue regions in malignant breast tumor images. The proposed framework yielded higher performance than a random forest classifier, with 84.23% detection accuracy. Hafemann et al. [33] have shown that, for histopathological images, Convolutional Neural Networks outperform traditional textural descriptors. Besides, the traditional approach of extracting appropriate features for classification tasks in pathological images requires considerable effort and expert domain knowledge, frequently leading to highly customized solutions that are specific to each problem and hardly applicable in other contexts [34].

III. PROPOSED CNN MODEL FOR MULTI-CLASS BREAST CANCER CLASSIFICATION
A Convolutional Neural Network (CNN) is a feedforward neural network introduced by Kunihiko Fukushima in 1980 [30] and improved by Yann LeCun et al. in 1998 [35], [36]. A CNN is composed of six types of layers: an input layer, a convolutional layer, a non-linear layer, a pooling layer, a fully connected layer, and an output layer. Fig. 1 illustrates a traditional CNN architecture.
Convolutional Neural Networks (CNNs) are one of the most remarkable approaches in deep learning, in which multiple layers of neurons are trained in a robust manner. They have demonstrated an impressive generalization capability on large datasets with millions of images [37], [38]. These results come mainly from the particular architecture of CNNs, which takes into account the specific topology of computer vision tasks that operate on two-dimensional images. Other dimensions can also be taken into account for color images with multiple channels.
To train a CNN, we compute the mapping function using the feedforward operation and optimize the loss function using backpropagation, in particular with the gradient descent algorithm. The CNN that we choose for the task of breast cancer classification is not a traditional CNN model. DenseNet [12] is a CNN model in which the convolution, non-linear and pooling layers, except for the first convolutional layer, are replaced with dense blocks and transition layers built from the original CNN layers. Fig. 2 presents the original DenseNet model with three dense blocks and two transition layers. The dense block proposed by DenseNet contains convolution and non-linear layers. It also applies optimization techniques such as dropout and batch normalization. In addition, in the dense block, outputs from the previous layers are concatenated instead of summed. Assume that an input image has the shape (28, 28, 3), in which three represents the RGB color channels. First, the image is expanded to an initial N channels, giving a feature map of shape (28, 28, N). Each subsequent convolution layer generates k new features while keeping the same height and width. The feature concatenation process is illustrated in Fig. 3. If we assume N = 24 and k = 12, then after two such layers (24 + 2 x 12 = 48) we obtain a feature map with the same spatial dimensions but many more features, (28, 28, 48). To reduce the size, DenseNet uses transition layers. These layers contain a convolution with kernel size 1 followed by 2x2 average pooling with stride 2. This reduces the height and width dimensions but leaves the feature dimension unchanged. The transition layer is presented in Fig. 4.
As a result, if the input is a feature map with shape (28, 28, 48), the output has shape (14, 14, 48). DenseNet scales naturally to hundreds of layers while exhibiting no optimization difficulties, which makes it one of the most powerful models for image recognition tasks. As mentioned above, we modify the DenseNet model to handle histopathology images and build a breast cancer classifier using transfer learning. Our custom-made model is inspired by DenseNet and contains four dense blocks and three transition layers to classify breast cancer tumors. The proposed CNN model is presented in Fig. 5. DenseNet was built to deal with natural, non-microscopic images.
To address this, we use a 7x7 kernel for the first convolutional layer to detect small variations and substance in the image and to extract more important features. Also, the kernel size of the convolutional layers in the dense blocks is reduced to deal with the complex structure of histopathology images. An average pooling layer with a 7x7 kernel and stride 2 is used before the fully connected layer to fix the size of the feature map fed to this layer. In addition, we configure the softmax layer for the eight classes of breast cancer histopathological images instead of the 1000 classes of the ImageNet dataset [13].
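The shape bookkeeping of the dense blocks and transition layers described in this section can be traced with a few lines of Python. This is only a sketch of the arithmetic, not of the network itself: the function names are illustrative, and the choice of two layers in the block is an assumption made so that the numbers match the (28, 28, 48) example in the text.

```python
from typing import Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)


def dense_block_shape(shape: Shape, num_layers: int, growth_rate: int) -> Shape:
    """Shape after a dense block: every layer concatenates k new feature maps."""
    h, w, c = shape
    for _ in range(num_layers):
        c += growth_rate  # concatenation adds channels; height and width are unchanged
    return (h, w, c)


def transition_shape(shape: Shape) -> Shape:
    """Shape after a 1x1 convolution (channels kept) and 2x2 average pooling, stride 2."""
    h, w, c = shape
    return (h // 2, w // 2, c)


x = (28, 28, 24)  # after the initial convolution, N = 24
x = dense_block_shape(x, num_layers=2, growth_rate=12)
print(x)                    # (28, 28, 48)
print(transition_shape(x))  # (14, 14, 48)
```

Because features are concatenated rather than summed, the channel count grows linearly with the number of layers in a block, which is why a transition layer is placed between blocks to keep the spatial size manageable.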
Transfer learning is defined here as fine-tuning CNN models pre-trained on a natural image dataset for medical image tasks. Learning from clinical images from scratch is often not the most practical strategy because of its computational cost, convergence problems, and the insufficient number of high-quality labeled samples. A growing body of experiments has investigated pre-trained models in the presence of limited training samples. We initialize the weights of the layers of our proposed network from a model pre-trained on ImageNet. Then, we fine-tune the last layer on the BreakHis cancer image dataset. During this step, the ImageNet pre-trained weights are preserved while only the last fully connected layer is updated. The first convolutional layer of the network is then unfrozen, and the entire network is fine-tuned on the BreakHis training data. The advantage of DenseNet is feature concatenation, which allows features from any stage to be learned without the need to compress them and gives us the ability to control and manipulate those features. This technique helps avoid an explosion in the number of parameters, which reduces the complexity of the training process and mitigates the overfitting problem.

IV. EXPERIMENTS AND RESULTS
In our experiments we use the BreakHis dataset [29] for training and testing. It contains 7909 microscopic biopsy images of benign and malignant breast tumors. Images are acquired by a microscope system coupled with a digital color camera, in RGB true color space, using four magnification factors: 40X, 100X, 200X, and 400X. The image distribution is summarized in Table I. Both benign and malignant breast tumors in the BreakHis dataset are sorted into four distinct subtypes. The malignant subtypes are Lobular carcinoma (LC), Ductal carcinoma (DC), Papillary carcinoma (PC) and Mucinous carcinoma (MC). The benign subtypes are Fibroadenoma (F), Adenosis (A), Tubular adenoma (TA) and Phyllodes tumor (PT). Fig. 6 shows examples of the breast cancer subclasses. To develop the model, we use the TensorFlow deep learning framework [39] and the NVIDIA DIGITS tools [40]. The model is trained and tested on an MSI Pro Series desktop equipped with an Intel i7 processor and an NVIDIA GeForce GTX 960 GPU.
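The eight-class label space described above can be written down as a small data structure. The sketch below is illustrative only: the enum name, the helper function, and the order of the class indices are assumptions and are not taken from the dataset or from our training code; only the subtype names and abbreviations come from the text.

```python
from enum import Enum


class TumorSubtype(Enum):
    # Benign subtypes
    ADENOSIS = "A"
    FIBROADENOMA = "F"
    TUBULAR_ADENOMA = "TA"
    PHYLLODES_TUMOR = "PT"
    # Malignant subtypes
    DUCTAL_CARCINOMA = "DC"
    LOBULAR_CARCINOMA = "LC"
    MUCINOUS_CARCINOMA = "MC"
    PAPILLARY_CARCINOMA = "PC"


MALIGNANT = {
    TumorSubtype.DUCTAL_CARCINOMA,
    TumorSubtype.LOBULAR_CARCINOMA,
    TumorSubtype.MUCINOUS_CARCINOMA,
    TumorSubtype.PAPILLARY_CARCINOMA,
}

# Index assigned to each subtype for an eight-way softmax (order is an assumption).
CLASS_INDEX = {subtype: i for i, subtype in enumerate(TumorSubtype)}


def is_malignant(subtype: TumorSubtype) -> bool:
    """Collapse the eight-way label to the binary benign/malignant label."""
    return subtype in MALIGNANT


print(len(CLASS_INDEX))                              # 8
print(is_malignant(TumorSubtype.LOBULAR_CARCINOMA))  # True
print(is_malignant(TumorSubtype.FIBROADENOMA))       # False
```

Keeping the binary label derivable from the subtype means a multi-class prediction can always be reported as a benign/malignant decision as well.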
Following the experimental protocol proposed in [29], the dataset is divided into 70% for the training set and 30% for the validation set. For medical images, there are two ways to report results. In the first, the decision is patient-wise, so the recognition rate is computed at the patient level. Let N_P be the number of histopathological images of patient P. For each patient, if N cancer images are correctly classified, the patient score and the global patient recognition rate are defined as in equation 1 and equation 2 respectively.
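With the notation above, the two patient-level measures can be written as follows. This is a reconstruction from the definitions in the text, in line with the protocol of [29]; the layout and the wording of the quantities are assumptions.

```latex
\begin{equation}
\text{Patient score} = \frac{N}{N_P}
\end{equation}

\begin{equation}
\text{Patient recognition rate} = \frac{\sum \text{Patient score}}{\text{Total number of patients}}
\end{equation}
```

Here the sum runs over all patients in the evaluation set, so every patient contributes equally regardless of how many images were acquired for that patient.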
In the second case the recognition rate is computed at the image level. Let N_t be the number of histopathological images in the testing set. If N_r cancer images are correctly classified, then the recognition rate at the image level is given by (3).
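Both ways of reporting results can be computed from the same list of predictions. The following is a minimal sketch, assuming the image-level rate is the ratio N_r / N_t and the patient-level rate is the mean over patients of the fraction of that patient's images that are correctly classified, as in the protocol of [29]. The record layout and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each record: (patient_id, true_label, predicted_label). The layout is illustrative.
Record = Tuple[str, str, str]


def image_level_rate(records: List[Record]) -> float:
    """N_r / N_t: correctly classified images over all test images."""
    correct = sum(1 for _, true, pred in records if true == pred)
    return correct / len(records)


def patient_level_rate(records: List[Record]) -> float:
    """Mean over patients of (correct images of the patient) / (images of the patient)."""
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for patient, true, pred in records:
        totals[patient] += 1
        correct[patient] += int(true == pred)
    scores = [correct[p] / totals[p] for p in totals]
    return sum(scores) / len(scores)


records = [
    ("p1", "DC", "DC"), ("p1", "DC", "LC"), ("p1", "DC", "DC"), ("p1", "DC", "DC"),
    ("p2", "F", "F"), ("p2", "F", "A"),
]
print(image_level_rate(records))    # 4 of 6 images are correct
print(patient_level_rate(records))  # mean of 3/4 and 1/2, i.e. 0.625
```

The two rates differ whenever patients contribute different numbers of images, which is why both are reported.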
The proposed CNN model is designed to handle the high-resolution images generally used for the histopathological classification of breast cancer. The DenseNet model is modified to extract global features from the histological images and use them in the training process through transfer learning. To this end, we resize all the images to 224x224x3 in RGB color space.
After obtaining the weights of the model pre-trained on ImageNet, transfer learning is carried out in the following steps. First, the fully connected layer is given randomly initialized weights. We freeze the convolutional layers of the network and train only the fully connected layer using the BreakHis training dataset. The fully connected layer is thus trained from scratch on the features extracted by the fixed convolutional layers. The first convolutional layer of the network is then unfrozen, and the entire network is fine-tuned on the BreakHis training data. This involves re-training the CNN, starting from the retained weights, with a very small step size.
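The two-stage schedule described above can be summarized in a framework-agnostic sketch. Everything here is illustrative: the layer names, the `Layer` class, and the learning rates are assumptions, not values from our experiments. In a Keras-style API, the `trainable` flag below corresponds to the `trainable` attribute of a layer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    name: str
    pretrained: bool        # True if the weights come from the ImageNet model
    trainable: bool = True


def build_layers() -> List[Layer]:
    """Hypothetical layer list: first conv, 4 dense blocks, 3 transitions, FC head."""
    backbone = ["conv1", "dense1", "trans1", "dense2", "trans2",
                "dense3", "trans3", "dense4"]
    layers = [Layer(name, pretrained=True) for name in backbone]
    layers.append(Layer("fc", pretrained=False))  # randomly initialized, 8 outputs
    return layers


def configure_stage(layers: List[Layer], stage: int) -> float:
    """Set the trainable flags for a stage and return an illustrative learning rate."""
    if stage == 1:
        # Stage 1: freeze the pre-trained layers and train only the new FC head.
        for layer in layers:
            layer.trainable = not layer.pretrained
        return 1e-3  # assumed value
    if stage == 2:
        # Stage 2: unfreeze everything and fine-tune with a very small step size.
        for layer in layers:
            layer.trainable = True
        return 1e-5  # assumed value
    raise ValueError("stage must be 1 or 2")


layers = build_layers()
lr_stage1 = configure_stage(layers, 1)
print([layer.name for layer in layers if layer.trainable])  # ['fc']
lr_stage2 = configure_stage(layers, 2)
print(all(layer.trainable for layer in layers))             # True
```

Training the head first avoids propagating large gradients from a randomly initialized layer into the pre-trained weights; the small step size in the second stage serves the same purpose.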
To train the model we use the Adam optimizer to minimize the loss function [41]. Adam is a gradient descent algorithm with adaptive momentum that computes adaptive learning rates for each parameter [42]. Fig. 7 shows the minimization of the total loss during the training process. Training took 11 hours, and the total loss reached a minimum of 0.3424.
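For reference, the Adam update rule can be sketched in plain Python as below. The defaults for `beta1`, `beta2` and `eps` are the commonly used ones; the hyperparameters used in our training runs are not listed here, and the quadratic toy objective is purely illustrative.

```python
import math
from typing import Callable, List


def adam_minimize(
    grad_fn: Callable[[List[float]], List[float]],
    x0: List[float],
    lr: float = 0.001,
    beta1: float = 0.9,
    beta2: float = 0.999,
    eps: float = 1e-8,
    steps: int = 1000,
) -> List[float]:
    """Minimize a function given its gradient, using the Adam update rule."""
    x = list(x0)
    m = [0.0] * len(x)  # first-moment (momentum) estimate per parameter
    v = [0.0] * len(x)  # second-moment estimate per parameter
    for t in range(1, steps + 1):
        g = grad_fn(x)
        for i in range(len(x)):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)  # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            # Per-parameter step: larger past gradients give a smaller effective rate.
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x, y) = (x - 3)^2 + (y + 1)^2, with its minimum at (3, -1).
result = adam_minimize(lambda p: [2 * (p[0] - 3), 2 * (p[1] + 1)],
                       [0.0, 0.0], lr=0.1, steps=500)
print(result)  # close to [3.0, -1.0]
```

The division by the square root of the second-moment estimate is what gives each parameter its own adaptive learning rate.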

B. Testing
After training our model, we use the dataset reserved for validation to test it. To provide a proper performance evaluation, we compare the results of the proposed model with the most powerful CNNs for multi-class classification of histopathological breast cancer images. The performance comparison with state-of-the-art models in Table III confirms that our model reaches the highest multi-classification accuracy. AlexNet [26], the state of the art in the ImageNet visual recognition competition (ILSVRC12), yielded 83% detection accuracy in the binary classification (benign and malignant) of histopathological images [28]. However, for the multi-classification task, it achieves about 80% accuracy. The CSDCNN [30] is a convolutional neural network proposed by Zhongyi Han et al. for breast tumor detection and multi-classification; it achieves about 94% accuracy. LeNet [43] is a traditional CNN used for handwritten character recognition, where it achieves remarkable accuracy. However, on histopathological images its performance was considerably inferior, with about 47% multi-classification accuracy [28].
Compared to these powerful CNN models, our proposed model achieved the highest multi-classification accuracy, with about 96% average accuracy at the image level. Histopathology tumor detection and classification into multiple classes would play a key role in breast cancer diagnosis, reduce the heavy workload of pathologists, and help doctors establish the appropriate therapeutic approach.

V. CONCLUSION
In the context of classification, deep convolutional neural networks (CNNs) have been widely proven in the scientific and industrial communities. In this work, we investigated the performance of a deep neural network model on a classification task related to breast cancer detection. The modification applied to the DenseNet model shows that a deep learning model used for natural image processing can achieve high performance in medical image processing. In our case we achieve about 96% accuracy in the multi-class breast cancer classification task, which outperforms human experts in the diagnostic domain. The performance achieved could be improved further with more data from larger datasets.
Majid Nawaz, Adel A. Sewissy, Taysir Hassan A. Soliman
Faculty of Computer and Information, Assiut University
www.ijacsa.thesai.org

TABLE I. IMAGE DISTRIBUTION IN THE BREAKHIS DATASET

Table II reports the accuracy of our model for the different magnification factors of the BreakHis dataset at both the image level and the patient level. Our model shows the best performance, with high multi-classification accuracy. It achieves, respectively, 95.4% average accuracy at the image level and 96.48% accuracy at the patient level across all magnification factors.

TABLE II. THE MODEL MULTI-CLASSIFICATION ACCURACY WITH THE DIFFERENT MAGNIFICATION FACTORS OF THE BREAKHIS DATASET

TABLE III. COMPARISON WITH SOME POPULAR CNNS IN THE MULTI-CLASS BREAST CANCER CLASSIFICATION