Development of A Clinically-Oriented Expert System for Differentiating Melanocytic from Non-melanocytic Skin Lesions Classification of PSLs by Abbas Q

Differentiating melanocytic from non-melanocytic (MnM) skin lesions is the first and an important step that clinical experts must take when diagnosing pigmented skin lesions (PSLs). In this paper, a new clinically-oriented expert system (COE-Deep) is presented for automatic classification of MnM skin lesions through deep-learning algorithms, without relying on pre- or post-processing steps. In the COE-Deep system, a convolutional neural network (CNN) model is employed to extract prominent features from region-of-interest (ROI) skin images. These features are then refined by stack-based autoencoders (SAE) and classified by a softmax linear classifier into melanocytic and non-melanocytic skin lesions. The performance of the COE-Deep system is evaluated on a dataset of 5200 clinical images obtained from different public and private sources. Performance is measured statistically in terms of sensitivity (SE), specificity (SP), accuracy (ACC) and area under the receiver operating curve (AUC) using a 10-fold cross-validation test. On average, an SE of 90%, SP of 93%, ACC of 91.5% and AUC of 0.92 are obtained. The results of the COE-Deep system are statistically significant. These experimental results indicate that the proposed COE-Deep system outperforms state-of-the-art systems. Hence, the COE-Deep system is able to assist dermatologists during the screening process for skin cancer.
Keywords: Skin cancer; melanocytic; non-melanocytic; dermoscopy; deep learning; convolutional neural network; stack-


I. INTRODUCTION
Melanocytic and non-melanocytic (MnM) skin lesions [1] are the two major forms of skin cancer. According to 2016 estimates, the incidence of skin cancer is increasing rapidly throughout the world, and it is very common in white-skinned populations. In the United States, skin cancer is the most common form of cancer. Clinical experts first have to decide whether a lesion belongs to the melanocytic or the non-melanocytic class. After this step, they classify a melanocytic lesion as benign or malignant, whereas a non-melanocytic lesion has to be further classified as basal cell carcinoma (BCC), squamous cell carcinoma (SCC) or seborrheic keratosis (SK). Examples of these lesions are shown in Fig. 1. All these classes are known as pigmented skin lesions (PSLs). Among the different types of PSLs, malignant melanoma has the highest mortality rate, and the incidence of both melanoma and non-melanoma skin cancers is increasing rapidly. Early detection of skin cancer can reduce the mortality of this disease. To diagnose PSLs, dermatologists widely use digital dermoscopy together with automatic image-analysis computer-aided diagnostic (CADx) [2] systems. In general, dermoscopy equipped with a CADx system provides the most cost-effective non-invasive technique for early detection.
Over the last few years, computer-aided diagnostic (CADx) systems have been developed for automatic classification of pigmented skin lesions (PSLs). These CADx systems provide a second opinion to dermatologists and assist them in better diagnosis of skin cancer.
Classification by a CADx system into melanocytic and non-melanocytic categories is crucial but difficult because of the high similarity between the two categories. Compared to existing melanoma CAD systems [3], the recognition rate for non-melanoma skin lesions is below 75%.
To differentiate PSLs, many state-of-the-art CADx tools [4] have been developed, because diagnosis by clinical experts is subjective whereas a CADx system is more objective and reliable. Current CADx tools [5], [6] are based on hand-crafted features combined with machine learning algorithms such as neural networks (NN), support vector machines (SVMs), AdaBoost and deep learning, and they achieve very good performance on certain skin cancers such as melanoma. However, they are unable to perform diagnosis [7] over broader classes of skin diseases, as in the case of the melanocytic and non-melanocytic (MnM) categories.
www.ijacsa.thesai.org
Hand-crafted features do not provide a complete solution for developing a CADx system for automatic diagnosis of MnM skin lesions. In practice, hand-crafted features require a high level of domain-expert knowledge and are suitable only for a limited number of skin diseases. Deep learning algorithms, on the other hand, have been used in only a few studies for the development of CADx tools. With deep learning algorithms, features need not be defined by hand; they are extracted automatically from an image. As a result, neither domain-expert knowledge nor pre- or post-processing steps are needed to recognize PSLs. Even on large-scale datasets, deep-learning algorithms have shown high performance compared to other algorithms such as NN, SVM or AdaBoost. Inspired by deep-learning algorithms, a convolutional neural network (CNN), stack-based autoencoders (SAE) and a softmax linear classifier are integrated in this paper to achieve higher performance and large-scale applicability of CADx tools for automatic diagnosis of PSLs.
The rest of the paper is organized as follows. Section 2 introduces the background of this research study and deep learning architectures. In Section 3, the dataset and the proposed methodology are described. Section 4 presents the experimental results on the performance of the deep-learning algorithms using different training settings. Conclusions and future work are given in Section 5.

II. BACKGROUND
Past studies focused only on the classification of melanocytic lesions (benign and melanoma) from dermoscopy images, due to the issues mentioned in the previous section. In practice, it is not easy for clinical experts to differentiate among non-melanocytic lesions [8] such as SK, BCC or SCC compared with melanocytic lesions. For this reason, the differentiation between melanocytic and non-melanocytic (MnM) skin lesions is the first and an important step, yet it is currently ignored by many computer-aided diagnostic (CADx) systems. Because those CADx tools were trained and developed on melanocytic lesions, their results are unreliable when they are given non-melanocytic lesions. If a CADx system is to be extended to non-melanocytic lesions, it should have the capability to recognize them as well.
Developing such CADx systems involves four main steps: image enhancement, segmentation, feature extraction and selection, and recognition. As a result, it is very difficult to develop a CADx system without expertise in complex image processing techniques. In addition, the segmentation of non-melanocytic lesions is much more difficult than that of melanocytic lesions because of roughness and intensity variation around the lesion border. Moreover, earlier CADx tools were developed with older machine learning algorithms such as artificial neural networks (ANN), support vector machines (SVMs) and AdaBoost classifiers to recognize only melanocytic lesions. Those CADx tools required many pre- or post-processing steps and domain-expert knowledge for feature selection, and they were applied only to limited datasets. Therefore, in this paper, modern deep-learning algorithms are used to differentiate melanocytic from non-melanocytic (MnM) pigmented skin lesions in a large-scale setting. To the best of the author's knowledge, no existing study classifies MnM lesions through deep learning algorithms.
A few CADx tools have been developed in the past to recognize only melanocytic skin lesions based on deep learning architectures. Initially, the most widely used architecture was the CNN model, which extracts the features, with the classification decision then made by a softmax linear classifier. As mentioned above, the CNN model can be used to select features for multiple objects. Therefore, the use of a CNN model alone is not suitable for differentiating between MnM skin lesions. Those CADx tools are described in the following paragraphs.
Support vector machines (SVM) and a deep belief network (DBN) are combined in [9] to recognize a limited number of dermoscopy images (100). Because this system was tested on a limited dataset, it is unsuitable for a large-scale environment. In [10], a hybrid of AdaBoost-SVM and a deep neural network is used to learn hand-crafted features for classification of melanoma skin lesions. In [11], an SVM is combined with deep learning and sparse encoder techniques to classify melanoma on 2624 images, with a reported accuracy of 91.2%. Using a deep convolutional neural network (DC-NN) in [12], the authors developed a three-pattern-detector approach on a set of 211 images and reported an accuracy below 85%. The CNN model was used in [13] to extract features with pooling techniques to recognize PSLs and achieved 85.8% accuracy. A deep neural network (DNN) was used to classify melanoma and achieved 89.3% accuracy. Similarly, the authors in [14] applied a CNN model to dermoscopy images to classify malignant melanoma skin lesions.
The above-mentioned CADx tools only classify melanoma skin lesions; they do not address the differentiation from non-melanoma lesions, which is the first step required by dermatologists. Among past approaches, only one study [15] addressed differentiation between melanocytic and non-melanocytic skin lesions, but it required pre- or post-processing steps.
Hence, this paper addresses both categories and develops an automatic system through deep-learning algorithms. Deep learning algorithms are based on a multilayer architecture in which the layers are connected to one another through non-linear combinations [16]. There are many variants of deep-learning algorithms, such as the convolutional neural network (CNN), deep belief network (DBN), restricted Boltzmann machine (RBM) and stack-based autoencoders (SAE). For differentiation between melanocytic and non-melanocytic (MnM) skin lesions, the CNN and SAE are integrated and the final decision is made by a softmax linear classifier [17]. The CNN model is used to extract features from the pixels of the images and convert them into edges through its multilayer architecture. Because the features extracted by the CNN model are not optimized, stack-based autoencoders (SAE) are employed to automatically select the most discriminative features for better classification. In this way, deep-learning algorithms are used to diagnose pigmented skin lesions.

III. METHODOLOGY
The clinically-oriented expert system based on deep learning (COE-Deep) algorithms involves three main steps: extraction of deep features, optimization of deep features, and classification of these features into melanocytic and non-melanocytic skin lesions. The overall systematic diagram of the COE-Deep system is shown in Fig. 2. These steps are explained in the following sub-sections.

A. Dataset Acquisition
The clinically-oriented expert system using deep learning (COE-Deep) algorithms is tested on 5200 dermoscopy images containing an equal number of melanocytic and non-melanocytic skin lesions. These images were obtained from several public and private sources. Among 2300 dermoscopy images, 400 melanocytic and 400 non-melanocytic skin lesions were collected from EDRA [18], available as a CD-ROM. A further dataset was collected from the Department of Dermatology, University of Auckland (DermAuck) [19]; it contains 600 melanocytic and 600 non-melanocytic lesions. A total of 1600 melanocytic and 1600 non-melanocytic skin lesions were collected from the International Skin Imaging Collaboration (ISIC) [20]. In total, the dataset of 5200 dermoscopy images is obtained from these three sources, with varying image sizes. All images were resized to a standard resolution of (800 × 800) pixels. Moreover, an expert dermatologist was asked to verify the images in both categories. The images contain the skin lesion together with surrounding skin areas. Therefore, a circular region-of-interest (ROI) of size (400 × 400) pixels is automatically selected from the center of each image. An example of this dataset is displayed in Fig. 1.
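The center-based ROI selection described above can be sketched as follows. This is a minimal illustration in plain Python, not the original Matlab implementation: the function name `center_circular_roi`, the nested-list image representation and the `fill` value for pixels outside the circle are all assumptions made for the example, and a toy 8 × 8 image stands in for an 800 × 800 dermoscopy image.

```python
def center_circular_roi(image, roi_size, fill=0):
    """Crop a roi_size x roi_size window around the image center and
    replace pixels outside the inscribed circle with `fill`.

    `image` is a list of rows (grayscale values); this layout is an
    assumption made for the sketch.
    """
    height, width = len(image), len(image[0])
    if roi_size > min(height, width):
        raise ValueError("ROI is larger than the image")
    top = (height - roi_size) // 2
    left = (width - roi_size) // 2
    center = (roi_size - 1) / 2.0   # center of the cropped window
    radius = roi_size / 2.0
    roi = []
    for r in range(roi_size):
        row = []
        for c in range(roi_size):
            inside = (r - center) ** 2 + (c - center) ** 2 <= radius ** 2
            row.append(image[top + r][left + c] if inside else fill)
        roi.append(row)
    return roi


# Toy example: an 8 x 8 image and a 4 x 4 ROI stand in for the
# 800 x 800 image and 400 x 400 ROI mentioned in the text.
toy_image = [[10 * r + c for c in range(8)] for r in range(8)]
toy_roi = center_circular_roi(toy_image, 4)
```

With real data, the same call would be `center_circular_roi(image, 400)` on an 800 × 800 image; whether the pixels outside the circle are zeroed or handled differently is not stated in the text.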

B. Features Extraction
Over the last few decades, the extraction and selection of discriminative features has been one of the most difficult and challenging tasks, because the subsequent recognition step depends on it. As mentioned above, defining hand-crafted features requires domain-expert knowledge and many pre- or post-processing steps. Therefore, in this paper, a convolutional neural network (CNN) model [17] is used to automatically select features from the raw pixels of the image. The CNN model is chosen because it has been a major tool in past studies of classification problems. It is applied directly to the image pixels, so there is no need to manually perform feature extraction to define a hand-crafted feature set. When the CNN model is used to extract the features, it is possible to train the deep network in a reasonable amount of time without overfitting.
In this article, the CNN model takes the form of a three-layer deep neural network to solve the problem of feature selection from dermoscopy images. The first layer is directly linked to the image pixels and generates feature maps after convolution with the layer filters. In the second layer, similar feature maps are combined to generate the edges present in dermoscopy images. Finally, the third layer selects the mean activation of the features from the edge map. In this paper, the unsupervised variant of the CNN model is employed.
Mathematically, the CNN model is defined on a set of k filters, each with C channels of size (m × n), applied to a set of N images, each with C channels of size (l × k). Under this description, each element of the first convolutional layer output is the sum, over the C channels and the (m × n) filter window, of the products of the image and filter elements. For an entire image/filter pair, the output is the sum over the C channels of the 2D correlation between each image channel and the corresponding filter channel. Fig. 3 illustrates the use of the CNN model to extract features from the dermoscopy images.
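To make the convolutional step concrete, the following plain-Python sketch computes the output of one image/filter pair as the channel-wise sum of 2D correlations. It is an illustration under stated assumptions rather than the paper's implementation: the function names `correlate2d_valid` and `conv_layer_output`, the "valid" border handling (no padding) and the toy values are all hypothetical.

```python
def correlate2d_valid(channel, kernel):
    """2D correlation of one image channel with one kernel, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [[sum(channel[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def conv_layer_output(image, filters):
    """image: list of C channels; filters: list of filters, each a list of
    C kernels. Returns one feature map per filter, obtained by summing the
    per-channel correlations."""
    feature_maps = []
    for filt in filters:
        per_channel = [correlate2d_valid(ch, ker)
                       for ch, ker in zip(image, filt)]
        out_h, out_w = len(per_channel[0]), len(per_channel[0][0])
        summed = [[sum(m[y][x] for m in per_channel) for x in range(out_w)]
                  for y in range(out_h)]
        feature_maps.append(summed)
    return feature_maps


# Toy example: one 2-channel 3 x 3 image and one 2-channel 2 x 2 filter.
toy_image = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],   # channel 0
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],   # channel 1
]
toy_filter = [
    [[1, 0], [0, 1]],                    # kernel for channel 0
    [[1, 1], [1, 1]],                    # kernel for channel 1
]
toy_maps = conv_layer_output(toy_image, [toy_filter])
```

Here `toy_maps` holds a single 2 × 2 feature map; a practical implementation would use an optimized library routine instead of nested loops.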

C. Optimization
The features defined by the CNN model are not optimized. To select the most discriminative deep-invariant features, stack-based autoencoders (SAEs) [17] are applied. The SAE algorithm is chosen because it mimics the behavior of the human brain. Past studies reported the best results when a supervised SAE with four layers was used to optimize the deep features. In practice, the SAE hypotheses are tested through a greedy layer-wise pre-training approach on the testing dataset. The main steps of feature optimization through SAEs are presented here. In general, the image pixels that represent the feature vectors are used as the input layer of an autoencoder. In this paper, however, the input layer is defined on the features generated in the previous step. The second and third hidden layers transform those features into a better representation, and the final output layer matches the input layer for reconstruction. An autoencoder is considered deep if it has more than one hidden layer. Moreover, in this study, the dimensions of the hidden layers are kept small to perform feature reduction. Specifically, the autoencoders are trained by stochastic gradient descent with back-propagation variants.
Mathematically, an autoencoder learns a code ($y$) from the feature data ($x$) by mapping it with weights ($W$) through a sigmoid function ($s$). It is defined as:

$$y = s(Wx + b)$$

where $b$ represents the biases of the autoencoder. The code is then mapped back by a decoder into a reconstruction ($R$) through a similar transformation, defined as:

$$R = s(W'y + b')$$

where $W'$ and $b'$ are the decoder weights and biases. The reconstruction error is measured as:

$$E(x, R) = \lVert x - R \rVert^{2}$$

To minimize this mean squared reconstruction error, stochastic gradient descent was used in training the autoencoder. The minimization searches for the weights on the encoder and decoder connections, and the same weights are shared between the encoder and the decoder. As a result, this step reduces the number of features by ½ without degrading the performance of the autoencoders. Autoencoders with these four layers are not sufficient to make the final classification decision because of the over-fitting problem in this deep neural architecture. Therefore, a softmax linear classifier is used to make the final classification decision.
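As a minimal sketch of a single autoencoder layer with shared (tied) encoder and decoder weights, the forward pass and the mean squared reconstruction error can be written as below. The symbol names (`x` for features, `y` for the code, `W`, `b`, `b_prime`), the function names and the toy weights are assumptions made for illustration; training by stochastic gradient descent is omitted.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def encode(x, W, b):
    """Code y = s(W x + b); W has one row per hidden unit."""
    return [sigmoid(sum(w * xj for w, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def decode(y, W, b_prime):
    """Reconstruction R = s(W^T y + b'), reusing the encoder weights."""
    n_inputs = len(W[0])
    return [sigmoid(sum(W[i][j] * y[i] for i in range(len(W))) + b_prime[j])
            for j in range(n_inputs)]


def reconstruction_error(x, R):
    """Mean squared difference between the input and its reconstruction."""
    return sum((a - r) ** 2 for a, r in zip(x, R)) / len(x)


# Toy example: 4 input features compressed to a 2-unit code, i.e. the
# number of features is halved. The weights are arbitrary, not trained.
x = [0.2, 0.8, 0.5, 0.1]
W = [[0.1, -0.2, 0.3, 0.0],
     [0.0, 0.5, -0.5, 0.2]]
b = [0.0, 0.0]
b_prime = [0.0, 0.0, 0.0, 0.0]

code = encode(x, W, b)
reconstruction = decode(code, W, b_prime)
error = reconstruction_error(x, reconstruction)
```

In a stacked arrangement, the `code` of one layer would serve as the input `x` of the next; how the layers were sized and trained in the original system is not specified beyond the description above.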

D. Classification
The softmax classifier has commonly been used in past studies to recognize objects or features; it generalizes the binary logistic regression classifier. The softmax linear classifier [17] takes a vector of arbitrary real-valued scores and compresses it into a vector of values between zero and one. The classification decision is made from these normalized class probabilities, and the classifier is trained to minimize the cross-entropy between the estimated class probabilities and the known distribution.
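The normalization and loss described above can be illustrated with a short sketch. The function names and the example scores are hypothetical; the two scores stand for the melanocytic and non-melanocytic classes, and the maximum score is subtracted before exponentiation only for numerical stability.

```python
import math


def softmax(scores):
    """Map real-valued class scores to probabilities that sum to one."""
    shift = max(scores)                      # numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_index):
    """Cross-entropy loss for one sample whose true class is true_index."""
    return -math.log(probs[true_index])


# Toy example: index 0 = melanocytic, index 1 = non-melanocytic.
scores = [2.0, -1.0]
probs = softmax(scores)
predicted_class = probs.index(max(probs))
loss_if_melanocytic = cross_entropy(probs, 0)
```

The loss is small when the probability assigned to the true class is close to one and grows as that probability approaches zero.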

IV. EXPERIMENTAL RESULTS
The proposed clinically-oriented deep-learning (COE-Deep) system was implemented in Matlab® 2016 and tested on a Windows 10 platform with a Core i7 CPU. The statistical analysis was performed in terms of sensitivity (SE), specificity (SP), accuracy (ACC) and area under the receiver operating curve (AUC) on the dataset of 5200 dermoscopy images collected from different sources. In this dataset, melanocytic and non-melanocytic lesions are present in equal numbers so that both receive equal weight during the testing and classification stages. For developing the COE-Deep system, the dataset is divided into 40% for training and 60% for testing through a 10-fold cross-validation test. Table 1 shows the results of the proposed COE-Deep system on 5200 melanocytic and non-melanocytic (MnM) skin lesions diagnosed from digital dermoscopy images. The table reports sensitivity (SE), specificity (SP), accuracy (ACC), training error (E) and area under the receiver operating curve (AUC). As shown in Table 1, average values of SE 92%, SP 94%, ACC 93%, AUC 0.94 and E 0.73 are obtained for melanocytic skin lesions, whereas for non-melanocytic skin lesions SE 88%, SP 92%, ACC 90%, AUC 0.90 and E 0.65 are achieved. These results show that the proposed COE-Deep system achieves significantly higher results for melanocytic than for non-melanocytic skin lesions, because non-melanocytic lesions are much more difficult to recognize than melanocytic lesions. To the best of the author's knowledge, there is no other effective study on differentiation between MnM skin lesions through a deep-neural-network approach without the need for hand-crafted features and pre- or post-processing steps.
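For reference, SE, SP and ACC follow directly from the confusion-matrix counts, as in the sketch below. The counts used here are hypothetical, chosen only so that the output reproduces the reported averages on a balanced set; they are not taken from the paper's experiments.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: positive-class samples classified correctly/incorrectly.
    tn/fp: negative-class samples classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical balanced example: 100 positive and 100 negative samples.
se, sp, acc = classification_metrics(tp=90, fn=10, tn=93, fp=7)
```

On a balanced dataset the accuracy equals the mean of sensitivity and specificity, which is consistent with the reported averages (SE 90%, SP 93%, ACC 91.5%).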
Among past studies, only one paper [15] was found in which the authors used domain-expert knowledge of image processing and machine learning algorithms to perform this classification of MnM skin lesions, but that system required many steps with pre- and post-processing stages. They reported classification results for melanocytic lesions on 548 lesions with a sensitivity of 98.0% and a specificity of 86.6% using a cross-validation test. These results were obtained on a small dataset, and the classifier may over-fit when applied in a large-scale environment. Therefore, the proposed system compares favorably with [15] in terms of large-scale applicability. The results above confirm that the COE-Deep system, based on advanced deep learning algorithms, is capable of classifying melanocytic and non-melanocytic skin lesions. Drawing a clear line between MnM skin lesions is the first and fundamentally difficult step for dermatologists in the diagnosis process, and the proposed method assists clinical experts in drawing this line.
Comparisons are also performed with state-of-the-art deep-learning algorithms in terms of SE, SP, ACC, AUC and E on the same dataset. As reported in Table 2, the convolutional neural network (CNN) with four layers obtained on average SE 80%, SP 84%, ACC 82%, AUC 0.81 and E 0.75 in differentiating MnM skin lesions. When the CNN is combined with the softmax linear classifier, the recognition results are significantly better: SE 84%, SP 88%, ACC 86%, AUC 0.87 and E 0.73. In contrast, when stack-based autoencoders (SAEs) are used instead of the CNN, SE 85%, SP 88%, ACC 86.5%, AUC 0.86 and E 0.71 are obtained on average. Significantly better results are obtained with the SAE combined with the softmax linear classifier: SE 89%, SP 90%, ACC 89.5%, AUC 0.88 and E 0.69 on average. The best results, however, are obtained by the proposed COE-Deep system, which combines CNN, SAE and softmax classifiers to recognize melanocytic and non-melanocytic skin lesions.
All results in Tables 1 and 2 were obtained through a 10-fold cross-validation test for classifying MnM skin lesions. Fig. 3 shows the corresponding receiver operating characteristic (ROC) curves for differentiation between MnM skin lesions. The area under the curve (AUC) of the COE-Deep system is well above 0.5 and higher than that of the CNN and the stack-based autoencoders (SAEs). The SAEs achieve a higher AUC than the CNN model but a lower one than the proposed COE-Deep system. As shown in Table 1, the best performance is measured for melanocytic skin lesions (AUC: 0.94). The proposed system based on deep-learning algorithms improves performance significantly, with an average AUC of 0.92. This is attributed to the design of an effective classification system through advanced deep-learning algorithms, without focusing on separate feature extraction and selection steps.
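One common way to compute the AUC from classifier scores is the pairwise (Mann-Whitney) formulation: the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half. The sketch below illustrates this on made-up labels and scores; it is not necessarily how the AUC values in the paper were computed.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a positive sample outranks a negative one.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier outputs, higher meaning "more likely positive".
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example with two positive and two negative samples.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.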

V. CONCLUSIONS
A clinically-oriented expert system based on deep-learning (COE-Deep) algorithms is presented in this paper to automatically differentiate between melanocytic and non-melanocytic (MnM) skin lesions. A convolutional neural network (CNN) is employed to extract deep features, and the most discriminative features are then selected by a stack-based autoencoders (SAEs) model. Finally, the classification decision is made by a softmax linear classifier. On 5200 clinical dermoscopy images, statistically significant results were obtained in terms of sensitivity (SE), specificity (SP), accuracy (ACC) and area under the receiver operating curve (AUC) using a 10-fold cross-validation test. On average, an SE of 90%, SP of 93%, ACC of 91.5% and AUC of 0.92 are obtained. Hence, the proposed COE-Deep system is well suited for classification of non-melanocytic skin lesions and should improve the accuracy, reliability, and accessibility of pigmented skin lesion screening systems. In future work, further effort will be devoted to improving the accuracy.

Fig. 2. A systematic flow diagram of the proposed COE-Deep system for classification of melanocytic and non-melanocytic skin lesions.

Fig. 3. Performance comparison of the proposed COE-Deep system with state-of-the-art classification systems in terms of area under the receiver operating curve.

TABLE II
a. Sensitivity, b. Specificity, c. Accuracy, d. Area under ROC curve, e. Training errors