Lung-Deep: A Computerized Tool for Detection of Lung Nodule Patterns using Deep Learning Algorithms

Detecting lung-related disease is a tedious and time-consuming task for radiologists. For this reason, automatic computer-aided diagnosis (CAD) systems have been developed that operate on digital CT scan images of the lungs. The detection of lung nodule patterns is an important step in the development of an automatic CAD system. Currently, lung nodule patterns are detected using domain-expert knowledge of image processing, and the resulting accuracy is not satisfactory. Therefore, this paper presents a computerized CAD tool, known as Lung-Deep, that identifies six different patterns of lung nodules using multilayer deep learning algorithms and, unlike state-of-the-art systems, does not rely on specialized image processing methods. Lung-Deep integrates a multilayer combination of a convolutional neural network (CNN), recurrent neural networks (RNNs), and a softmax linear classifier, without any pre- or post-processing steps. The Lung-Deep system is tested, using statistical measures, against contours manually drawn by a radiologist on 1200 images containing 3250 nodules. On this dataset, a sensitivity (SE) of 88%, a specificity (SP) of 80%, and an area under the receiver operating characteristic curve (AUC) of 0.89 are obtained, which is higher than for the other systems compared. Hence, by integrating different layers of deep learning algorithms, the proposed Lung-Deep system outperforms these systems in detecting six patterns of nodules.

Keywords: Computer-aided diagnosis; lung nodules; patterns detection; deep learning; convolutional neural network; recurrent


I. INTRODUCTION
Lung cancer is increasing rapidly throughout the world, as estimated in 2016 [1]. If lung cancer is detected at an early stage, the chances of successful treatment improve, but the survival rate is still below 70%. Radiologists extensively use high-resolution computed tomography (HRCT) [2] digital imaging and computer-aided diagnosis (CAD) systems to detect and diagnose lung cancer. However, if clinical experts use only HRCT scan images to diagnose lung cancer, detecting small lung nodules is a time-consuming job [3]. In addition, the size of lung nodules varies widely, from a few millimeters to several centimeters. It is also difficult for radiologists to maintain the screening process during regular patient visits.
To solve the above-mentioned problems, automatic computer-aided diagnosis (CAD) systems were developed to detect and differentiate disease nodules in HRCT digital images. In recent years, automatic CAD systems [4]-[11] have been developed to improve the diagnostic accuracy of clinical experts. All of these CAD systems try to compensate for the problems of manual interpretation of lung nodules and to reduce false positives. Although HRCT scan images are visually much clearer than those of other scanning techniques, it is still hard to detect small pulmonary nodules [9]. It is therefore also difficult for CAD systems to detect them automatically in CT scan images, because other objects are present as well.
It was noticed that many authors used complex image processing techniques to detect lung nodules and that the classification accuracy was below 80%. Beyond the detection of lung nodules, a further step required by radiologists is to classify the nodules into six different patterns: honeycombing, ground glass, bronchovesicular, nodular, emphysema-like, and normal. However, the author did not find any CAD tool or study that classifies these six patterns of lung nodules without pre- or post-processing steps. Classifying them accurately is important for the diagnosis of lung-related diseases, rather than merely differentiating between benign and malignant nodules. Hence, the main focus of this paper is to develop an effective system for classifying six lung nodule patterns from HRCT scan images using state-of-the-art deep learning systems while avoiding complex image processing techniques.
CAD systems were developed in several recent studies, and they are described here to provide background on past work. In [10], the authors used image processing and pattern recognition methods to differentiate between malignant and benign lung nodules, after extracting various forms of features, instead of classifying lung nodule patterns. The classification decision was based on traditional machine learning algorithms such as the genetic algorithm (GA) and the support vector machine (SVM). On 1405 lung nodules, the authors reported an accuracy of 93.19%. That study focused only on recognizing benign and malignant nodules rather than identifying different patterns of lung nodules.
The previous CAD systems [3]-[10] are based on three main steps: segmentation of the lungs or nodules, extraction of features, and selection of the most prominent features. The last stage classifies these discriminative features to recognize lung disease patterns. Past studies addressed these steps well in searching for the most effective features for categorizing lung nodules. Unfortunately, many of those CAD systems required pre- or post-processing steps and complicated image processing algorithms, which makes it very hard for them to recognize all kinds of lung nodules for the diagnosis of lung cancer. Instead of older machine learning and image processing algorithms, the latest trend is to use deep learning methods. In practice, deep learning algorithms do not require domain-expert knowledge to define and select features. Deep learning based CAD systems are described in the following paragraphs.
In [11], the authors recognized the malignancy of lung nodules through a Multi-crop Convolutional Neural Network (MC-CNN) model that automatically extracts nodule features without time-consuming pre- or post-processing steps. The classification decision is made through a max-pooling technique on the CNN feature maps. In [12], features of lung nodules in CT scan images are automatically extracted and classified using deep learning algorithms on 1018 cases. The authors integrated a convolutional neural network (CNN), a deep belief network (DBN), and a stacked denoising autoencoder (SDAE), and compared the performance of the proposed system with hand-crafted features using 10-fold cross-validation and the area under the receiver operating characteristic curve (AUC). In [13], lung nodules are classified through multi-view convolutional neural networks (MV-CNN), and the authors achieved higher classification results in differentiating benign and malignant lung nodules. In [14], the authors used a different approach, combining a genetic algorithm with deep learning to classify lung nodules without computing the shape of the nodules. The presented methodology was tested on the LIDC-IDRI dataset and showed a best sensitivity of 94.66%, specificity of 95.14%, accuracy of 94.78%, and AUC of 0.949.
In [15], the authors used three pairs of convolutional layers and two fully connected layers in a CNN model to differentiate between benign and malignant lung nodules in CT scan images. Similarly, in [16], a CNN model was employed to automatically learn image features and detect pulmonary nodules in CT scan images. In contrast to these approaches, the authors of [17] used both hand-crafted features and deep learning features. To define the deep features automatically, they used stacked denoising autoencoder (SDAE) and CNN models, whereas for the hand-crafted features they used Haar-like and HoG features for the detection of lung nodules in CT images. Likewise, in [18], the authors combined hand-crafted features with deep features to identify pulmonary nodules in CT scan images. They obtained higher accuracy compared to manual segmentation by radiologists.
The author did not find any study that classified six different patterns of lung nodules (honeycombing (HCmb), ground glass (GGlass), bronchovesicular (BCho), nodular (NDLR), emphysema-like (EmpMlk), and normal (NRM)) without pre- or post-processing steps. Classifying these lung patterns is important for accurately identifying lung-related diseases, rather than only distinguishing benign from malignant lesions.
The purpose of this paper is to develop a computerized diagnostic system, Lung-Deep, that detects lung nodules using advanced deep learning algorithms for the early detection of lung cancer, without extracting and selecting hand-crafted features. The paper demonstrates that patterns of lung nodules can be classified without segmenting the nodules or defining hand-crafted features, both of which are time-consuming tasks. The primary aim of this study is to develop a system that classifies various patterns of lung nodules by integrating different layers of deep learning algorithms, in contrast to conventional machine learning algorithms. There are six lung disease tissue patterns: honeycombing (HCmb), ground glass (GGlass), bronchovesicular (BCho), nodular (NDLR), emphysema-like (EmpMlk), and normal (NRM). In this study, the six lung nodule patterns are classified using a multilayer combination of a convolutional neural network (CNN), recurrent neural networks (RNNs), and a softmax linear classifier [19]. Fig. 1 shows an example of the six tissue patterns in the dataset.

A. Acquisition of Dataset
To test the proposed Lung-Deep system, a dataset of CT scans was acquired from the Lung Imaging Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) [20]. The images in the LIDC-IDRI dataset contain lung nodules of different sizes, so this dataset was used to test the performance of the Lung-Deep system. From these images, 300 CT cases and 3,250 lung nodules were selected for evaluating the Lung-Deep system. From each scan image, a region of interest (ROI) of size 200 x 200 pixels was defined around the lung nodules. Moreover, contours around these 3,250 lung nodules were drawn manually by an experienced radiologist. An example of the manual segmentation of lung nodules in one CT scan image is shown in Fig. 1.
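The ROI extraction step could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `crop_roi`, the 512 x 512 slice size, and the rule of shifting the window inward near the image border are assumptions. Only the 200 x 200 ROI size comes from the text.

```python
import numpy as np

ROI_SIZE = 200  # ROI side length in pixels, as stated in the text


def crop_roi(ct_slice, center_row, center_col, size=ROI_SIZE):
    """Return a size x size patch centred, as far as possible, on a nodule.

    Assumption: when the nodule lies near the border, the window is
    shifted inward so the patch always has the requested shape.
    """
    rows, cols = ct_slice.shape
    if rows < size or cols < size:
        raise ValueError("slice is smaller than the requested ROI")
    top = min(max(center_row - size // 2, 0), rows - size)
    left = min(max(center_col - size // 2, 0), cols - size)
    return ct_slice[top:top + size, left:left + size]


# Hypothetical 512 x 512 slice filled with synthetic values.
slice_512 = np.arange(512 * 512, dtype=np.float32).reshape(512, 512)
patch = crop_roi(slice_512, center_row=30, center_col=480)
print(patch.shape)
```

Clamping the window keeps every patch the same shape, which a fixed-size network input requires. Padding the slice instead would be an equally plausible choice.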

B. Proposed Method
A combination of convolutional neural network (CNN) and recurrent neural network (RNN) deep learning algorithms is used in this paper to detect lung nodules in CT scan images. The CNN model [19] transforms input images into layered feature representations in an unsupervised fashion. In practice, the CNN is the variant of deep learning algorithm most commonly used when an image contains multiple objects. The CNN model therefore extracts the features of lung nodules and represents them as multiple feature maps. Afterward, a supervised RNN model is integrated to optimize the features extracted by the CNN layers. Finally, the six lung nodule patterns are recognized through a softmax linear classifier.
The six lung nodule patterns are thus identified using a combination of CNN, RNN, and softmax multilayer deep learning algorithms. According to the literature, CNN models define effective descriptive feature sets for recognition tasks, in place of hand-crafted features. In practice, the CNN model transforms low-level pixels into high-level features. However, the features defined by CNN models are not optimized, so a recurrent neural network (RNN) model is integrated to optimize them.
In this paper, two layers are used in an untrained CNN model to extract features from each extracted ROI lung nodule image of size 200 x 200 pixels, which is the input to the CNN model. The first layer of the CNN model contains 10 feature maps and the second has 20 feature maps, with a kernel size of 1. The two fully connected layers contain 4000 and 2000 nodes, respectively. To optimize the features, the RNN model is applied with two fully connected layers. In past studies, RNN models performed well at selecting the most discriminative features, which can provide better classification results. RNN models are called recurrent because they perform the same task for every element of a sequence, with the output depending on the previous computations. RNN models differ from feed-forward neural networks. In a feed-forward neural network, the network is organized in layers and information flows in one direction, from input pixels to output. In an RNN architecture, by contrast, the connectivity contains cycles along which information can flow. A multilayer RNN model does not have to be arranged in layers, and directed cycles are admissible. In practice, the neurons in this architecture are allowed to be fully connected. In this paper, two fully connected layers are used to optimize the features extracted by the CNN model in the previous step. The first layer has 1000 nodes, and the second layer has 500 nodes to represent the probabilities of the six different lung nodule patterns. In this paper, the RNN model is used in an unsupervised manner. The architecture of the RNN model with the CNN is shown in Fig. 2.
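To make the notion of recurrence concrete, the sketch below applies the same update to every element of a sequence, with each hidden state depending on the previous one. It is a generic Elman-style recurrence and is only an assumption about the general mechanism. The paper does not specify its recurrence equations, and the function name `rnn_forward`, the tanh activation, and the toy sizes are all hypothetical.

```python
import numpy as np


def rnn_forward(features, w_in, w_rec, bias):
    """Run a simple recurrence over a sequence of feature vectors.

    Each hidden state is computed from the current input and the
    previous hidden state, using the same weights at every step.
    """
    hidden = np.zeros(w_rec.shape[0])
    states = []
    for x_t in features:
        hidden = np.tanh(w_in @ x_t + w_rec @ hidden + bias)
        states.append(hidden)
    return np.stack(states)


# Toy sizes for illustration only (not the 1000/500 nodes used in the paper).
rng = np.random.default_rng(0)
seq = rng.normal(size=(5, 8))               # 5 sequence elements, 8 features each
w_in = rng.normal(scale=0.1, size=(4, 8))   # input-to-hidden weights
w_rec = rng.normal(scale=0.1, size=(4, 4))  # hidden-to-hidden (recurrent) weights
bias = np.zeros(4)
states = rnn_forward(seq, w_in, w_rec, bias)
print(states.shape)
```

The term `w_rec @ hidden` is what distinguishes this from a feed-forward layer. Setting `w_rec` to zero would make each output depend on its own input alone.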
The softmax linear classifier is used to classify the six different patterns in a supervised fashion with already known class labels (Y). It is a statistical model that learns all of the weight and bias parameters using the learned features of the last hidden layer. In the multi-class case considered here (K = 6 classes), the softmax regression hypothesis h(x) outputs one estimated probability per class. The predictive distribution is therefore multinomial, and it can be naturally parameterized by a softmax function at the output layer. In general, the experiments in this paper aim to predict at the finest granularity found in the data, so as to maximize the generative flexibility of the network.
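A softmax output layer over the six pattern classes can be sketched as follows. This is an illustrative example rather than the authors' code: the function names `softmax` and `classify` are hypothetical, and random numbers stand in for learned features and parameters. The class abbreviations and the 500-dimensional feature size are taken from the text.

```python
import numpy as np

PATTERNS = ["HCmb", "GGlass", "BCho", "NDLR", "EmpMlk", "NRM"]


def softmax(scores):
    """Map a vector of real-valued scores to probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()


def classify(features, weights, bias):
    """Return the most probable pattern label and the full probability vector."""
    probs = softmax(weights @ features + bias)
    return PATTERNS[int(np.argmax(probs))], probs


# Toy example: random values stand in for learned features and parameters.
rng = np.random.default_rng(0)
features = rng.normal(size=500)                  # 500 = size of the last hidden layer
weights = rng.normal(scale=0.01, size=(6, 500))  # one weight row per class
bias = np.zeros(6)
label, probs = classify(features, weights, bias)
print(label, probs.round(3))
```

In training, the weights and bias would be fitted by minimizing the cross-entropy between `probs` and the known class labels.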

III. EXPERIMENTAL RESULTS
The performance of the Lung-Deep system is evaluated and compared using statistical measures: sensitivity (SE), specificity (SP), and area under the receiver operating characteristic curve (AUC) [21]. For comparison with ground truth, 3,250 lung nodules manually defined by an experienced radiologist are used. This lung nodule dataset is divided into 35% training and 65% testing examples, and a 10-fold cross-validation test is applied to assess the robustness of the Lung-Deep system. For training and testing Lung-Deep, the lung disease patterns are divided into six classes: honeycombing (HCmb), ground glass (GGlass), bronchovesicular (BCho), nodular (NDLR), emphysema-like (EmpMlk), and normal (NRM). A higher AUC value indicates that the system achieves better classification results.
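The three measures can be computed per class in a one-vs-rest manner, as in the sketch below. The one-vs-rest treatment and the pairwise (rank-based) AUC formula are assumptions about how the multi-class results were obtained. The function names and the toy labels and scores are hypothetical and are not data from the paper.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity (SE) and specificity (SP) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    se = tp / (tp + fn) if tp + fn else float("nan")
    sp = tn / (tn + fp) if tn + fp else float("nan")
    return se, sp


def auc(pos_scores, neg_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Toy labels and scores for illustration only.
y_true = ["NDLR", "NDLR", "NRM", "HCmb", "NRM", "NDLR"]
y_pred = ["NDLR", "NRM", "NRM", "HCmb", "NDLR", "NDLR"]
se, sp = sensitivity_specificity(y_true, y_pred, positive="NDLR")
area = auc(pos_scores=[0.9, 0.4, 0.8], neg_scores=[0.3, 0.5, 0.2])
print(f"SE={se:.3f} SP={sp:.3f} AUC={area:.3f}")
```

Averaging the per-class values over the six patterns would give summary figures of the kind reported as averages in the results.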
The average statistical measures of the Lung-Deep system on the total of 3,250 lung nodules are displayed in Table 1. On average, the lung nodule detection results are an SE of 88%, an SP of 80.0%, and an AUC of 0.89. For HCmb lung patterns, an SE of 87%, an SP of 74.5%, and an AUC of 0.89 are obtained. For GGlass nodule patterns, the Lung-Deep system obtains an SE of 90%, an SP of 79.5%, and an AUC of 0.90. For BCho patterns, an SE of 85%, an SP of 80%, and an AUC of 0.87 are obtained. Compared to the other lung nodule patterns, the best results are obtained for nodular (NDLR) lung nodules: an SE of 93%, an SP of 82.5%, and an AUC of 0.92. As a result, the proposed Lung-Deep system improves the detection accuracy of lung nodules. This is attributed to the combination of the convolutional neural network (CNN), the recurrent neural network (RNN), and the softmax classifier for the detection of lung disease patterns.
In this paper, a computerized system is developed to automatically classify disease patterns into six categories using HRCT scan images. For the early detection of lung cancer, radiologists face many difficulties in interpreting a large number of CT scan images. In such cases, an automatic CAD system may improve the lung cancer detection rate and reduce errors in classifying lung nodules. The proposed Lung-Deep system uses the feature set generated by a convolutional neural network (CNN) and optimized by a recurrent neural network (RNN) model. These features are finally used to classify candidate lung nodules as honeycombing (HCmb), ground glass (GGlass), bronchovesicular (BCho), nodular (NDLR), emphysema-like (EmpMlk), or normal (NRM). The performance of the Lung-Deep system improves when a large dataset is used. For the 3,250 lung nodules in this dataset, the Lung-Deep approach outperforms the others, with less than 11% observed false negatives. It is therefore concluded that the presented system is expected to perform with high accuracy given the availability of a large dataset. Moreover, because it uses deep learning algorithms, this technique does not require any pre- or post-processing steps or domain-expert knowledge for the selection of features.
Comparisons with other state-of-the-art deep learning systems are also performed in this study, using the AUC, to show the importance of integrating various layers in the development of the Lung-Deep system. The comparison results are displayed in Fig. 3. The obtained results indicate that an effective computerized system has been developed to detect six lung nodule patterns using a combination of CNN, RNN, and softmax multilayer deep learning algorithms. According to the literature, CNN models define effective descriptive feature sets for recognition tasks, in place of hand-crafted features.

IV. CONCLUSIONS
A new computerized lung nodule pattern detection system using multilayer deep learning algorithms is developed in this paper for the early diagnosis of lung cancer or lung-related disease. The proposed Lung-Deep system performs better than state-of-the-art systems because it uses recent machine learning techniques without complex image processing algorithms. A dataset of 3250 lung nodules is used in this study to test the performance of the proposed Lung-Deep system. Worldwide, the early detection of lung cancer improves the patient survival rate; therefore, in this study, an improved computerized system is proposed to classify lung nodules without clinical experts. Accordingly, the major contribution of this work lies in the application and analysis of two variants of deep learning architectures for the classification of six lung nodule disease patterns. The developed system was tested and evaluated on the LIDC/IDRI database, where it achieved the best result. For the detection of six different patterns on the LIDC/IDRI dataset, good performance is obtained, with sensitivity, specificity, and area under the ROC curve of about 88%, 80%, and 0.89, respectively. This outperforms the results obtained using variants of deep learning techniques. Classifying lung nodules as benign or malignant based on disease patterns is also important and will be the focus of future work. In addition, the segmentation of lung nodules will be performed automatically [22], in place of the manual segmentation by an expert radiologist used in this study. Hence, this CAD tool for pattern classification has great clinical importance, and it can assist radiologists in better identifying lung-related disease.

Fig. 1. Example images from the LIDC/IDRI dataset showing the six lung disease patterns to be classified.

Fig. 2. Schematic diagram of the proposed Lung-Deep system for classification of six lung nodule patterns based on a deep learning architecture.

Fig. 3. Area under the receiver operating characteristic curve (AUC) of the proposed Lung-Deep system compared to other deep learning systems for classification of six lung disease patterns.