Deep Learning Predictive Model for Colon Cancer Patient using CNN-based Classification

Abstract—In recent years, the field of medicine and healthcare has made significant advances with the assistance of computational technology, and new diagnostic techniques have been developed during this time. Cancer is the world's second-largest cause of mortality, claiming the lives of one out of every six individuals. Of the numerous kinds of cancer, colon cancer is among the most frequent and lethal; identifying the illness at an early stage, however, substantially increases the odds of survival. A cancer diagnosis may be automated using the power of Artificial Intelligence (AI), allowing more cases to be evaluated in less time and at a lower cost. In this research, CNN models are employed to analyse imaging data of colon cells. For colon cell image classification, CNNs with max pooling and average pooling layers, as well as the MobileNetV2 model, are utilized. To determine the learning rate, the models are trained and evaluated at various epochs. It is found that the accuracies of the max pooling and average pooling models are 97.49% and 95.48%, respectively, and MobileNetV2 outperforms the other two models with the most remarkable accuracy of 99.67% at a data loss rate of 1.24.

Keywords—Colon cancer; MobileNetV2; Max pooling; Average pooling; data loss; accuracy


I. INTRODUCTION
Cancer refers to a category of illnesses in which abnormal cells develop within the human body as a result of random mutations. Once formed, these cells divide abnormally and spread throughout the organs, and if left untreated, most cancers will eventually kill their victims. Fig. 1A, based on the 4-tier Human Development Index (HDI) from the UN's 2019 Human Development Report, illustrates how strongly cancer's position as a cause of early death corresponds with a nation's level of social and economic development.
In rare situations, a person inherits from their parents a faulty gene that causes cancer, and regular check-ups are required for those at risk of hereditary malignancies. Many individuals cannot afford these diagnostic procedures since they are expensive; cancer is responsible for over 70% of fatalities in low- and middle-income nations [1]. To address this issue, countries must make significant investments in public health, establish a large number of labs and pathology centres with the requisite technology, and educate more people to perform diagnostic operations. Furthermore, the costs of these examinations must be kept within reach of the poor. Finding new techniques for diagnosing cancer will give patients a genuine chance of survival. Most cancers have five stages, according to the Tumor-Node-Metastasis (TNM) classification devised and maintained by the American Joint Committee on Cancer (AJCC): Stage 0, Stage I, Stage II, Stage III, and Stage IV [2]. The four stages of colon cancer are shown in Fig. 2. The approach considers a number of parameters, including the main tumor's size and location, the extent of its dissemination to lymph nodes and other organs, and the existence of any biomarkers that affect cancer spread. The odds of survival fluctuate dramatically across stages. In the case of colon cancer, for example, more than 93% of persons between the ages of 18 and 65 may survive with effective treatment if the disease is discovered at Stage 0, whereas survival rates at the later stages are 87%, 74%, and 18%, respectively [3]. The possibility of survival for colon cancer patients drops from 70% at Stage 0 to a terrifying 13% at Stage IV. Since there is no sure therapy for cancer, the sooner a person is diagnosed, the more time physicians have to design a treatment plan, and the greater the chance of surviving the condition.
Early detection and early treatment are presently the only ways to prevent cancer-related fatalities [4]. However, most of the population lacks access to competent diagnostic facilities, making the fight against this deadly illness even more difficult.
In the field of diagnostics, AI has shown tremendous promise and offers a viable alternative to conventional diagnostic approaches. Currently, diagnosing an illness entails obtaining samples from a patient, executing a series of tests on those samples, putting the findings into an understandable format, and enlisting a skilled expert to make judgments based on those findings. If the samples taken from a patient are digital or have been digitized in some way, machines can evaluate them, drawing on a body of data comprising previous judgments on comparable cases, and can then be instructed on how to detect the disorders that a new patient has. In machine learning, supervised learning refers to making judgments based on information obtained from past experience. Different forms of biological signals have been classified and predicted using machine learning methods, and machines can now analyze high-dimensional data such as images, multidimensional anatomy scans, and video thanks to the advent of Deep Learning (DL) algorithms. DL, a sub-field of ML, describes learning algorithms inspired by the structure and function of the human brain [3]; it uses Artificial Neural Networks (ANNs) to improve pattern recognition. Above all, it is clear that AI has given the area of medical diagnostics a new dimension and is increasingly replacing old diagnostic procedures as a viable alternative [5-7].
The rest of the paper is organized as follows. Section II provides a comprehensive summary of the many ML approaches utilized in colon cancer diagnosis. Section III describes the contents of the employed dataset, the method used for classification, and the techniques required to build this model; it also contains the criteria on which the performance of the model is measured. Section IV elucidates the outcomes of the model, and the results obtained at different stages of the model's learning process are compared in brief. Finally, Section V gives a summary of the work described in this article, along with some scopes for further research.

II. RELATED WORK
In the past three decades, several supervised learning algorithms have been created, and they are quite good at dealing with biological data. Toraman et al. [8] presented research aimed at classifying the probability of colon cancer using Fourier Transform Infrared (FTIR) spectroscopy signals; the authors collected various statistical characteristics from the signals and then used SVM and ANN to categorize them, yielding a classification accuracy of 95.71% for the ANN. Liping Jiao et al. [9] used the Gray-Level Co-occurrence Matrix (GLCM) method to extract eighteen ordinary characteristics, including grayscale mean, grayscale variance, and 16 texture features; on 60 colon tissue images partitioned evenly into two groups, an SVM-based classifier obtained accuracy, F1-score, and recall of 96.67%, 83.33%, and 89.51%, respectively. S. Rathore et al. [10] developed a feature extraction method that mathematically mimics the geometric properties of colon tissue components. A hybrid feature set is created by combining conventional features such as morphological, texture, SIFT, and elliptic Fourier descriptors; an SVM is then applied as a classifier on 174 colon biopsy images, with an accuracy of 98%. Yuan et al. [11] described a DL technique for automatically detecting polyps in colonoscopy videos; the authors utilized AlexNet, a well-known CNN-based architecture, for classification, which resulted in a classification accuracy of 91.47%. In [12], Babu et al. presented an RF-based classification algorithm for predicting the existence of colon cancer from histological cancer images: the R-G-B images are first transferred to the HSV plane, and wavelet decomposition is then used for feature selection, obtaining a maximum classification accuracy of 85.4% by varying the degree of image magnification. Mo et al. utilized a Faster R-CNN-based approach to identify colon cancer in [13].
The authors utilized a joint approximation optimization, which can optimize classification and regression losses simultaneously. In [14], Urban et al. developed a technique for detecting polyps in colonoscopy images with 96% classification accuracy; the authors hand-labeled 8641 colonoscopy images from 2000 individuals, used them to train a CNN model, and then tested their technique on 20 colonoscopy videos totaling five hours in length. Akbari et al. developed a CNN-based classification approach with binarized weights in [15] to detect colorectal cancer from colonoscopy videos; the approach was tested on the ASU-Mayo Clinic database and obtained over 90% classification accuracy. Masud et al. [16] describe a classification framework to distinguish colon tissues (two benign and three malignant) by evaluating their histological images using CNN and Digital Image Processing (DIP) methods, and the obtained findings indicate that the proposed framework can detect cancer tissues with an accuracy of up to 96.33%. Garg et al. [17] used and modified existing pre-trained CNN-based models to detect lung and colon cancer from histopathology images with improved augmentation methods. On the LC25000 dataset, eight different pre-trained CNN models, VGG16, NASNetMobile, InceptionV3, InceptionResNetV2, ResNet50, Xception, MobileNet, and DenseNet169, were trained; precision, recall, F1-score, and accuracy were used to evaluate model performance. The findings show that all eight models achieved notable outcomes, with accuracies ranging from 96% to 100%.
In the proposed study, the authors test colon cell image data obtained from online data sources to detect colon cancer, using the transfer learning model MobileNetV2 alongside two CNN models, one with a max pooling layer and one with an average pooling layer. The image data go through a number of preprocessing steps to give a better classification outcome, and the performance of the models is evaluated based on the confusion matrix.

III. METHODOLOGY
In the proposed method, image data of colon cells are used to detect colon cancer. The images are labeled in order to determine which cells are cancerous, and the prediction is made using the MobileNetV2 classifier. Fig. 3 illustrates the overall flow diagram of the system.

A. Data Description
The dataset was gathered from Kaggle.com. It contains 25,000 images of 768 x 768 pixels in JPEG format, divided into two classes: colon cancer cells and cells without colon cancer.
Of all the images in the dataset, 12,500 are of colon cancer cells, as shown in Fig. 4(a, b); Fig. 4(c, d) shows samples of the remaining cell images without colon cancer.

B. Environment Setup
This analysis was carried out using TensorFlow and the Keras library. TensorFlow is a free, open-source Python library for performing large-scale machine learning computations; it is used extensively for artificial neural networks and serves as the backend for Keras.

C. Data Preprocessing
To make sure the image data are fit to be used to train and test the classifiers, preprocessing is performed. Raw data have to be preprocessed according to the needs of the study. The following steps are applied:
• To expand the volume of the dataset, the ImageDataGenerator class in the Keras library is used to create augmented images using the attributes in Table I.
• Images are resized to 224 x 224 pixels.
• LabelBinarizer() is used to assign unique values to each label in categorical features.
• The image data are converted to a NumPy array.
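The labeling and array-conversion steps above can be sketched as follows. This is a minimal Python illustration using scikit-learn and NumPy; the class names and the blank placeholder images are hypothetical stand-ins, not values from the original study:

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Hypothetical labels for the two classes in the dataset
# (the names are illustrative stand-ins, not from the paper).
labels = ["colon_cancer", "colon_normal", "colon_cancer", "colon_normal"]

# LabelBinarizer() assigns a unique value to each label; for a
# two-class problem it yields a single 0/1 column per image.
lb = LabelBinarizer()
encoded = lb.fit_transform(labels)

# Resized 224 x 224 RGB images are collected in a list and then
# converted to one NumPy array of shape (num_images, 224, 224, 3).
images = [np.zeros((224, 224, 3), dtype="float32") for _ in labels]
data = np.array(images)
```

In the full pipeline, the augmented images produced by ImageDataGenerator would feed into the same conversion step.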

D. CNN Classifier
CNN is a Deep Learning algorithm that takes an input image and assigns importance to different aspects of the image, allowing it to distinguish one image from another based on its features. In this system, two convolutional layers are used in the CNN model, each implemented as a Conv2D layer with ReLU activation. For full connectivity, two Dense layers are used: ReLU activation for the first dense layer and sigmoid activation for the second. Aside from these layers, there are several hidden layers as well as an input layer. In this study, two pooling layers, MaxPooling2D and AveragePooling2D, are implemented [18]. Finally, the MobileNetV2 classifier is used for the classification of the image data.
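The described stack (two Conv2D layers with ReLU, pooling, and two dense layers ending in a sigmoid node) can be sketched in Keras as follows; the filter counts and the 128-unit dense width are assumptions for illustration, not values reported in the paper:

```python
from tensorflow.keras import layers, models

# Minimal sketch of the described CNN: two Conv2D layers with ReLU,
# pooling after each, and two Dense layers (ReLU, then sigmoid).
# Filter counts and dense width are illustrative assumptions.
model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # binary output: cancerous vs. benign
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

The average pooling variant of the model is obtained by swapping the MaxPooling2D layers for AveragePooling2D.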

1) Max pooling layer: It is a pooling operation that selects the maximum element from the region of the feature map covered by the filter. By decreasing the number of pixels in the output, max pooling lowers the dimensionality of images [19]. Fig. 5 shows the study model based on the max pooling layer.
2) Average pooling layer: It is a pooling operation that computes the average of the elements in the region of the feature map covered by the filter. Average pooling takes all values into account and passes them on to the next layer, implying that all values contribute to feature mapping and output generation, which is a comprehensive calculation [20]. Fig. 6 shows the study model based on the average pooling layer.
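The two pooling operations can be contrasted on a toy 4 x 4 feature map. The NumPy sketch below (not part of the original study code) implements both with a 2 x 2 non-overlapping window:

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping 2D pooling with a size x size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    windows = x[:h * size, :w * size].reshape(h, size, w, size)
    if mode == "max":
        return windows.max(axis=(1, 3))   # keep only the strongest activation
    return windows.mean(axis=(1, 3))      # average every value in the window

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 1],
                 [0, 1, 5, 6],
                 [2, 2, 7, 8]], dtype=float)

print(pool2d(fmap, mode="max"))   # per-block maxima: 4, 2 / 2, 8
print(pool2d(fmap, mode="avg"))   # per-block means: 2.5, 1.0 / 1.25, 6.5
```

Max pooling discards everything but the peak response in each window, while average pooling blends all responses, which matches the comprehensive-calculation behaviour described above.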
3) MobileNetV2 classifier: The MobileNetV2 model has 32 filters in its initial full convolution layer, followed by 19 residual bottleneck layers. It is widely used in image classification [21]. MobileNetV2 introduces two new kinds of blocks:
i. a downsizing block with stride 2; and
ii. a residual block with stride 1.
Both kinds of blocks are made up of three layers. The first layer is a 1x1 convolution with the ReLU6 activation function. The second layer is a depthwise convolution, and the third layer is another 1x1 convolution, this time without a non-linearity (a linear bottleneck). The architecture of the model is illustrated in Fig. 7.
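The three-layer block described above can be sketched in Keras as follows. Batch normalization is omitted for brevity, and the 56 x 56 x 24 input shape and expansion factor are illustrative assumptions rather than values from the paper:

```python
from tensorflow.keras import layers, Input, Model

def bottleneck(x, out_ch, stride, expansion=6):
    """Sketch of one MobileNetV2 inverted-residual block (BatchNorm omitted)."""
    in_ch = x.shape[-1]
    y = layers.Conv2D(in_ch * expansion, 1)(x)                        # layer 1: 1x1 expansion
    y = layers.ReLU(6.0)(y)                                           # ReLU6 activation
    y = layers.DepthwiseConv2D(3, strides=stride, padding="same")(y)  # layer 2: depthwise conv
    y = layers.ReLU(6.0)(y)
    y = layers.Conv2D(out_ch, 1)(y)                                   # layer 3: linear 1x1 projection
    if stride == 1 and in_ch == out_ch:
        y = layers.Add()([x, y])                                      # stride-1 blocks add a residual
    return y

inp = Input(shape=(56, 56, 24))                                       # arbitrary example shape
block = Model(inp, bottleneck(inp, out_ch=24, stride=1))
```

The stride-2 variant halves the spatial resolution and therefore carries no residual connection, which is why only stride-1 blocks add the input back.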

E. Performance Evaluation
After the training and testing process, the performance is evaluated using specificity, recall, precision, accuracy, and F1-score; Eq. 1, 2, 3, 4, and 5 are the equations used for the task.
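The five metrics follow the standard confusion-matrix definitions; the sketch below restates them in Python, with equation numbers mapped in the order the metrics are listed above (the example counts are illustrative, not taken from the paper):

```python
def evaluate(tp, tn, fp, fn):
    """Standard definitions behind Eq. 1-5, from confusion-matrix counts."""
    specificity = tn / (tn + fp)                        # Eq. 1
    recall = tp / (tp + fn)                             # Eq. 2 (sensitivity)
    precision = tp / (tp + fp)                          # Eq. 3
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Eq. 4
    f1 = 2 * precision * recall / (precision + recall)  # Eq. 5
    return specificity, recall, precision, accuracy, f1

# Illustrative counts only (not the paper's results).
spec, rec, prec, acc, f1 = evaluate(tp=90, tn=85, fp=5, fn=10)
```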

IV. RESULTS AND DISCUSSION

A. Outcome of Max Pooling Layer
The training set contains 80% of the data from the dataset, and the remaining 20% forms the test set. During the process of data classification, an accuracy of 94.44% is obtained on the training data set and 97.49% on the testing data set, as shown in Table II for the max pooling layer.
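The 80/20 partition can be reproduced with scikit-learn's train_test_split; the arrays below are small dummy stand-ins for the preprocessed images and binarized labels:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-ins: 100 blank "images" and balanced 0/1 labels.
X = np.zeros((100, 224, 224, 3), dtype="float32")
y = np.array([0, 1] * 50)

# 80% of the samples go to training, the remaining 20% to testing;
# stratify keeps the class ratio equal in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
```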
The accuracy of the max pooling model gradually increases as the number of epochs increases, as shown in Fig. 9. The training set reaches its highest accuracy at epoch 49, whereas the test set has its highest accuracy at epoch 46. The data loss of the model on the training and testing datasets decreases rapidly with the number of epochs, as illustrated in Fig. 10. The lowest data loss is found at epoch 48 for both the training and test sets.

1) MSE (Mean Square Error) and AUC:
The following MSE and AUC are achieved by applying the max pooling model on the test data set:
• MSE (Mean Square Error) of 0.0286 (Fig. 11)
• AUC of 0.9932 (Fig. 12)

B. Confusion Matrix using Max Pooling Layer
Using 3000 image data samples, the confusion matrix is created for the max pooling layer. The outcome of the matrix is as follows:
• Sensitivity or Recall = 0.993 = 99.3%
• Specificity = 0.9782 = 97.82%
• Precision = 0.983 = 98.3%

C. Outcome of Average Pooling Layer
In the average pooling model, an accuracy of 90.73% on the training data set and 95.48% on the testing data set was achieved. The record of the outcomes of all the epochs is shown in Table III. As shown in Fig. 13, the accuracy of the average pooling model progressively improves as the number of epochs grows. The highest accuracy for the test set is at the 46th epoch and for the training set at the 45th epoch.
The model's data loss in the training and testing datasets reduces quickly with the number of epochs, as seen in Fig. 14 for the average pooling layer.

D. MSE (Mean Square Error) and AUC
The following MSE and AUC were achieved by applying the average pooling model on the test data set:
• MSE (Mean Square Error) of 0.0588 (Fig. 15)
• AUC of 0.9753 (Fig. 16)

E. Classification Outcome of MobileNetV2 Model
After loading the MobileNetV2 model, the weights from ImageNet are loaded and the base layers are frozen by setting their trainable parameter to False. A customized trainable head, consisting of one hidden layer with 128 neurons and the ReLU activation function to extract features correctly, is placed on top, and the architecture is trained. An AveragePooling2D operation with a pool size of (7, 7) is included in the model. Because deep learning models are prone to overfitting, dropout is used to randomly drop units during training. The Adam optimizer is used so the model learns better from its errors. The process is shown in Fig. 17.
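The transfer-learning setup described above can be sketched in Keras as follows. This is a sketch under stated assumptions: weights=None is used here only to avoid the ImageNet download (the paper loads the ImageNet weights), the dropout rate of 0.5 is an assumption the paper does not state, and the optimizer and loss settings follow the paper's reported values:

```python
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import MobileNetV2

# Base network; weights=None here avoids the ImageNet download,
# whereas the described setup loads weights="imagenet".
base = MobileNetV2(weights=None, include_top=False, input_shape=(224, 224, 3))
base.trainable = False                            # freeze all base layers

model = models.Sequential([
    base,                                         # outputs 7 x 7 x 1280 feature maps
    layers.AveragePooling2D(pool_size=(7, 7)),    # pool size (7, 7) as described
    layers.Flatten(),
    layers.Dense(128, activation="relu"),         # 128-neuron hidden layer
    layers.Dropout(0.5),                          # dropout against overfitting (rate assumed)
    layers.Dense(2, activation="softmax"),        # two-class softmax output
])
model.compile(optimizer=optimizers.Adam(learning_rate=0.01),
              loss="binary_crossentropy", metrics=["accuracy"])
```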
For the back-propagation process, the learning rate is set to 0.01. Binary cross-entropy is used as the loss function, and SoftMax activation is used in the output layer. Table IV displays the training and test accuracy, as well as the data loss rate. The maximum training accuracy is 99.81%, and the minimum data loss is 1.46%, at epoch 15 for the training set. The overall accuracy of the model is consistently high and the data loss is consistently low for the training data, as shown in Fig. 18 and 19.
As deep learning models learn faster with experience, data loss decreases as the number of epochs increases. At epoch 15, the data loss is 1.7% and the accuracy is 99.67% on the test set. The gradual decrease of data loss and gain of accuracy are illustrated in Fig. 20 and 21, respectively. The confusion matrix was used to assess the results, and the outcome reflects the model's high accuracy on this dataset. The performance calculation is demonstrated in Fig. 22.
Table V compares the findings obtained from the suggested colon cancer cell categorization techniques. Image data were utilized for the study's training and testing purposes. MobileNetV2 outperforms the other two models (max pooling and average pooling) in terms of performance. Based on the discussion in this section, it can be inferred that the suggested models can perform the task of colon cancer tissue categorization with excellent accuracy and reliability. Though earlier work on the prediction of colon cancer cells also reports excellent accuracy, it is limited to models built on smaller datasets. In the final prediction stage, the suggested model outperformed the previously described studies. Furthermore, the models in this research are trained and tested on a larger dataset, making them more efficient and reliable.

V. CONCLUSION
In recent years, machine learning and deep learning have had a significant impact on image processing, the medical industry, and a variety of other applications. The proposed approach takes around a minute to identify colon cancer from the input images, and the goal of the study is to make this procedure as easy, quick, and real-time as feasible. The dataset utilized for training and testing includes both cancer cells and healthy cells, and enhanced images were added to the dataset. In this work, the CNN algorithm with max and average pooling layers, as well as a transfer learning MobileNetV2 model, are used to identify colon cancer.
It is observed that the CNN-based max pooling and average pooling models achieve high accuracies of 97.49% and 95.48%, respectively, and the MobileNetV2 model achieves an accuracy of 99.67%. In future work, the model can be trained and tested on a more extensive dataset, and it can also be tested on other cancer datasets for classification and prediction. The study could be carried out in cooperation with medical researchers in hospitals or clinics that handle colon cancer, which would be beneficial for further application of this work in the medical sector.