Detection of nCoV-19 from Hybrid Dataset of CXR Images using Deep Convolutional Neural Network

The coronavirus spreads very quickly among humans and has infected more than 72 million people around the world to date. To limit its spread, it is very important to identify infected individuals. This article proposes a Deep Learning (DL) technique for detecting patients with coronavirus infection from Chest X-ray (CXR) images, and shows how to implement an advanced deep learning model that identifies COVID-19 (nCoV-19) from such images. The goal is to provide over-stressed medical professionals with an intelligent image recognition model that acts as a second pair of eyes. We highlight the challenges (including image data-set size and image quality) of developing a valuable deep learning model from the currently publicly available COVID-19 data-sets. We suggest a pre-trained model and a semi-automated procedure for creating a robust image data-set for designing and evaluating a deep learning algorithm. This will provide researchers and practitioners with a solid path toward the future development of an improved model.
Keywords: COVID-19; artificial intelligence; deep learning; chest x-ray image analysis; convolutional neural network; InceptionV3.


I. INTRODUCTION
COVID-19 is a pandemic that has continued to have catastrophic consequences for populations worldwide. Individual screening is one of the key steps for managing the spread of infection at hot spots and for monitoring the health of seriously ill patients. The need for fast testing calls for simple and effective screening methods. The novel coronavirus (nCoV-19) is so called because of its distinctive solar-corona (crown-like) appearance when viewed under an electron microscope [1]. Since December 2019, the nCoV-19 epidemic that began in Wuhan, China, has spread rapidly to other nations [2]. The transmission from animals to human beings [3] is shown in Fig. 1. The infectious disease caused by this type of virus was identified as nCoV-19 by the World Health Organization (WHO) on Feb 12, 2020 [4]. China had recorded about 90,000 confirmed cases by March 21, 2020, and more than seventy-two million cases have been confirmed worldwide [5], as shown in Fig. 2(a); total COVID-19 deaths are shown in Fig. 2(b). Existing AI approaches all detect COVID-19-induced pneumonia from databases of X-ray images. One of the difficulties AI faces when detecting pneumonia is "how the machine knows that COVID-19 triggers the identification of pneumonia in the chest x-rays". Researchers also accept that there are complications. According to Dr. Tom Naunton Morgan, Chief Physician at behold.ai [6], although the majority of deaths of vulnerable nCoV-19 patients are due to pneumonia, a variety of pathogens may cause pneumonia, including direct or indirect COVID-19 infections. Algorithms are being prepared for real-time detection of pneumonia. In the majority of cases, the main cause of pneumonia is bacteria. The signs are a cold, flu-like symptoms, and affected lungs. Soil and bird droppings are also potential causes of pneumonia.
Some of the viruses that cause colds and influenza also lead to pneumonia. The reason for describing these forms of pneumonia and their causes is to motivate the use of a robust Artificial Intelligence (AI) program to address the problem. AI can assist the public in various areas, including early warnings, diagnosis prediction, therapies, monitoring, classification, treatments, analysis tools, and social control. A Canadian AI platform, BlueDot [7], has proved its value and gained a strong reputation. According to its AI model, the system learned of an epidemic on 31 December 2019, earlier than the announcement from the WHO on 9 January 2020. In a January article in the Journal of Travel Medicine [8], several researchers collaborated with BlueDot and reported that the virus from Wuhan, China, was spread by travelers to 20 different cities worldwide. While BlueDot is a powerful AI tool, its capabilities have also been exaggerated in several forums. HealthMap [9], at Boston Children's Hospital, U.S.A., was the second AI model to raise an early alarm; a scientist predicted a significant COVID-19 outbreak half an hour later. The AI model reacted more quickly than a human being. Accurate identification with little computational power can save many lives. The spread and development of the disease can also be managed with a well-trained model. United Nations (UN) Global Pulse scientists have analyzed several AI-based COVID-19 applications. They reported that both Chest X-ray (CXR) and Computed Tomography (CT) scans can be used for AI-based COVID-19 detection. These researchers also proposed that CT images for the identification of COVID-19 could be scanned using a cell phone [10]. AI will also provide ample help for prediction and monitoring, for example in estimating how long the COVID-19 pandemic will continue to spread. Following the 2015 epidemic, an AI-based system for predicting the propagation of the Zika virus has been in development [11].
These existing models can be reused for COVID-19 by retraining the system with COVID-19 data. The algorithm for the prediction of seasonal flu may likewise be retrained with nCoV-19 data [12]. Deep learning architectures need a great deal of training data, so we use several of the available datasets [13] [14]. The databases used in state-of-the-art methods are very small. The performance of a system trained with 60-120 images of just one particular type of pneumonia, namely that caused by nCoV-19, cannot be relied upon. To improve the performance of AI-based systems for detecting nCoV-19, a large dataset of CXR images covering multiple categories of pneumonia is essential.
AI also supports the exploration of potential new drugs. In view of the COVID-19 outbreak, various research laboratories have indicated that AI should be used in the search for a vaccine. Scientists consider that AI models can speed up the search for COVID-19 treatments and vaccines; for example, Google DeepMind has predicted protein structures of COVID-19, which can provide useful information for vaccine discovery. The DeepMind website also notes: "We highlight that these predictions of the structure were not experimentally confirmed. We cannot be sure that the systems we provide are correct" [15]. Data dashboards deliver refined, visualized information on COVID-19. MIT Technology Review has listed various AI-based dashboards for tracking and predicting the nCoV-19 outbreak, including HealthMap [16], Nextstrain [17], and Upcode [18]. These dashboards offer a global view of the COVID-19 outbreak in every country. AI can also support the screening of people in congested or possibly affected areas; for example, infrared cameras at a Chinese railway station were used to measure body temperature in order to flag possible nCoV-19 cases [19]. AI is used to manage the pandemic through screening and through the enforcement of social distancing and lockdown. As the South China Morning Post describes, infrared cameras can scan crowds for high temperatures in airports and railway stations throughout China. China uses a facial recognition system that can identify people with a high body temperature and determine whether they are wearing a surgical mask. Positive COVID-19 patients can be reliably confirmed by imaging methods such as CT and CXR [20]. Recent studies on classifying COVID-19 have examined CT images of the lungs and soft tissue. However, the downsides of CT imaging are the cost of the scan and the high radiation dose to the patient.
Conversely, CXR is available in every hospital and clinic to generate 2-dimensional (2D) thorax projection images [21]. CXR is generally the primary option for identifying chest pathology and has been applied by radiologists to a limited number of patients to confirm COVID-19. This research therefore focuses only on the chest X-ray imaging method for possible nCoV-19 patients [22]. In CXR images, however, soft tissue with poor contrast cannot readily be detected [23]. To overcome these limitations, Computer-Aided Diagnosis (CAD) systems have been developed to help clinicians automatically identify and measure suspected infections of vital tissues in CXR images. CAD systems rely essentially on the rapid progress of computing equipment such as graphical processing units (GPUs) to run medical image processing algorithms, including image enhancement, segmentation of organs and tumors, and interventional navigation [24]. Deep Neural Networks are one of the most influential DL architectures and have become widely used in multiple applications [25]-[27]. So far, the use of DL techniques to classify and identify nCoV-19 in chest X-rays is still very limited. In the context of the COVID-19 pandemic, data is the first step in creating a diagnostic method. Large public collections of CXR images exist, and nCoV-19 CXR images are now being collected as well. In this research, we draw on public databases of CXR pneumonia cases, and specifically COVID-19 cases. Data must be obtained from public databases so as not to violate patient confidentiality. Fig. 3 shows an example of a 55-year-old woman who survived COVID-19. Such data provide important knowledge for the development and training of a deep learning system. These tools can be built to distinguish the characteristics of nCoV-19 from those of other pneumonia types.
Fig. 3. 55-Year-Old Woman Survived from nCoV-19 [28].
www.ijacsa.thesai.org
The purpose of this research is, therefore, to propose a system based on a pre-trained DL classifier, a Convolutional Neural Network (CNN), as an advanced means of analyzing X-ray images for nCoV-19 automatically. The article is organized as follows. Section 2 describes the existing deep learning image classifiers. Section 3 gives an overview of the related work. The proposed CNN model is defined in detail in Section 4. Section 5 provides experimental results and discusses the performance of the model. The discussion and concluding remarks are given in Sections 6 and 7.

II. DEEP LEARNING IMAGE CLASSIFIERS
One of the significant objectives of this study was to achieve state-of-the-art classification results using publicly available data and models, with transfer learning to compensate for the limited sample size and to speed up training so that modest hardware can provide reasonable results. In this section, we describe some of the deep learning classifiers that are available today.

A. VGG19
The Visual Geometry Group (VGG) network is a CNN architecture created by Karen Simonyan and Andrew Zisserman of the University of Oxford [29]. It was presented at the Large Scale Visual Recognition Challenge in 2014 on the ImageNet data-set. To improve feature extraction, VGGNet uses small 3×3 filters, as compared to AlexNet with its 11×11 filters. This deep network architecture comes in two versions, VGG19 and VGG16, which differ in the number of layers and depth. VGG19 is deeper than VGG16. However, VGG19 has more parameters and is therefore costlier to train than VGG16.

B. DenseNet121
The Dense Convolutional Network has several important benefits: it reduces the vanishing-gradient problem, strengthens feature propagation, encourages feature reuse, and significantly lowers the number of parameters [30]. DenseNet121 is a 121-layer dense network that can be loaded with weights pre-trained on the ImageNet database.

C. InceptionV3
The network consists of 159 layers, and the Inception architecture secured the 2014 ImageNet challenge with a top-5 accuracy of 93.3% [31]. Later versions are named InceptionVN, where N is the version number: InceptionV1, V2, and V3. The InceptionV3 implementation is built from many symmetric and asymmetric building blocks, including separate branches of convolutions, average pooling, max pooling, concatenations, dropouts, and fully connected layers.

D. ResNetV2
To achieve strong convergence behavior, He et al. [32] introduced Residual Neural Network models, which use skip connections to jump over certain layers of the network. The improved ResNet version is known as ResNet-V2. Although ResNet resembles VGGNet, it is about 8 times deeper [33].
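The skip connection described above can be illustrated with a minimal sketch in plain Python. It is not the ResNet implementation; the function `halve` is a hypothetical stand-in for the stacked layers F(x), and the block simply returns F(x) + x.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, layer_fn: Callable[[Vector], Vector]) -> Vector:
    """Return layer_fn(x) + x, i.e. add the block input back to its output."""
    fx = layer_fn(x)
    if len(fx) != len(x):
        raise ValueError("an identity skip connection needs matching shapes")
    return [f + xi for f, xi in zip(fx, x)]


def halve(v: Vector) -> Vector:
    # Hypothetical stand-in for the stacked convolution layers F(x).
    return [0.5 * vi for vi in v]


print(residual_block([1.0, -2.0, 4.0], halve))  # [1.5, -3.0, 6.0]
```

Because the input is added back unchanged, the block only has to learn the residual F(x), which is what makes very deep networks easier to optimize.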

E. Inception-ResNet-V2
The network is 572 layers deep and combines the Inception architecture with residual connections. Inception-ResNet-V2 is a variant of InceptionV3 [34]. It is trained on more than a million images from the ImageNet dataset.

F. Xception
The Xception model is a linear stack of depthwise separable convolution layers with residual connections, which makes the deep-network design easy to describe and modify [35]. Xception is an enhanced version of the Inception framework that replaces the standard Inception modules with depthwise separable convolutions.
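To indicate why depthwise separable convolutions are attractive, the following sketch compares weight counts for a single layer. The layer sizes (3×3 kernel, 128 input channels, 256 output channels) are illustrative assumptions, not values taken from Xception, and bias terms are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k step followed by a 1 x 1 pointwise step."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


# Illustrative layer sizes (assumed, not taken from the Xception paper).
print(standard_conv_params(3, 128, 256))        # 294912
print(depthwise_separable_params(3, 128, 256))  # 33920
```

For these assumed sizes the separable form needs roughly one ninth of the weights of the standard convolution.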

G. MobileNetV2
MobileNet is a CNN architecture for devices with restricted computing power, such as smartphones [36]. Sandler et al. [37] proposed the MobileNetV2 model. MobileNet achieves its primary advantage by reducing the number of learnable parameters and, in turn, memory consumption, through inverted residuals with linear bottleneck blocks. Besides, pre-trained implementations of MobileNetV2 are widely available in many standard deep learning environments.

III. RELATED WORK
Reaching a clear diagnosis and identifying the cause of an illness in time remains a major obstacle for doctors seeking to reduce patient distress. The use of Image Processing (IP) and DL methods in biomedical image processing and analysis has delivered very satisfactory results. This section provides a brief overview of a few significant contributions from the existing literature.
Sethy et al. [38] proposed identifying nCoV-19 from X-ray images using deep features and a Support Vector Machine (SVM). They collected CXR images from the Kaggle, GitHub, and Open-I repositories. They extracted deep features from CNN models and fed each set individually to the SVM classifier. They obtained an accuracy of 98.66% with ResNet50 plus SVM.
Shan et al. [39] aimed to assess COVID-19 in CT scans using a DL model named VB-Net. They used 300 images for validation and 250 images for training, and achieved a precision of 91.6%.
Butt et al. [40] proposed a model that uses deep learning techniques on pulmonary CT images to distinguish nCoV-19 from influenza-A viral pneumonia. The CNN model provided 86.7% accuracy for CT images.
Wang et al. [41] used CT images for nCoV-19. A transfer-learning model was used to construct the algorithm, with a reported accuracy of 89.5%.
Bhandary et al. [42] proposed a framework for the diagnosis of pneumonia and cancer, comprising two separate DL techniques. In the first, CXR images were classified with the help of an SVM into normal and pneumonia classes, and performance was validated against additional pre-trained deep learning models (ResNet50, VGG19, VGG16, and AlexNet). The second combined handcrafted and learned features in the Modified AlexNet (MAN) to improve classification accuracy in lung cancer assessment.
Stephen et al. [43] proposed a new pneumonia classification approach based on a ConvNet model trained from scratch on a data-set of CXR images. They obtained a training loss of 12.88%, a training accuracy of 95.31%, and a validation accuracy of 93.73%.
Ayan et al. [44] implemented an early diagnostic system based on the Xception and VGG16 CNN models. The study used 5856 frontal CXR images from the Kermany data-set. Test results show that the VGG16 network outperforms the Xception network, with precision 86%, sensitivity 85%, and recall 94%, and is therefore more effective for classifying CXR images.
In this article, we present a DL-based CNN model for classifying nCoV-19 from CXR images. The CXR images are fed into the CNN, and a classifier is then used to produce the classification outputs.

IV. RESEARCH METHODOLOGY
Deep learning (DL) methods, together with state-of-the-art Computer Vision (CV) and IP, have recently been shown to offer enormous potential [46]. These technologies have been employed in various methods for high-performance segmentation, recognition, and classification in medical imaging [47]. DL applications include skin cancer identification, breast cancer identification and classification, and lung cancer detection. While these methods have achieved great success in medical imaging, they require a huge amount of data, which is not yet available in this application area.
To identify nCoV-19, we implement a simple CNN model consisting of a convolution layer with a 5×5 filter followed by a batch normalization layer, a rectified linear unit (ReLU), a fully connected layer, a SoftMax layer, and an output layer. The weights are initialized, and the cross-entropy loss function is used in the classification layer [48]. Fig. 4 gives details of the CNN model.

A. Input Layer
This layer is responsible for reading the pre-processed image dataset. Medical images contain letters and medical symbols, and image sizes vary because the images are taken from different sources. We therefore resize each input image to 255×255 pixels and crop it as tightly as possible to the lung and chest region so that it does not contain any extra region.

B. Convolution Layer
This is the critical layer of our proposed CNN model and performs most of the computation. Its chief function is to extract features from the image data-set while preserving the spatial relationship between pixels. The features are obtained using a series of filters, each of size 5×5.
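As a rough illustration of what a single 5×5 filter computes, the sketch below applies a "valid" convolution (strictly, a cross-correlation, as in most DL libraries) in plain Python. Stride 1 and no padding are assumptions, since the text does not state them; under those assumptions a 255×255 input would yield a 251×251 feature map.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image with stride 1 and no padding."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            # Weighted sum of the kh x kw patch whose top-left corner is (r, c).
            out[r][c] = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
    return out


# Toy 7 x 7 image of ones and a 5 x 5 filter of ones (illustrative values).
toy_image = [[1.0] * 7 for _ in range(7)]
toy_kernel = [[1.0] * 5 for _ in range(5)]
feature_map = conv2d_valid(toy_image, toy_kernel)
print(len(feature_map), len(feature_map[0]))  # 3 3
print(feature_map[0][0])                      # 25.0
```

Each output value depends only on a local 5×5 neighborhood, which is how the layer preserves the spatial relationship between pixels.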

C. Batch Normalization Layer
Batch normalization is a technique for training very deep neural networks that stabilizes the convolved feature values. The aim of using this layer is to reduce the number of training epochs needed to train a deep network and to stabilize the learning process.
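The normalization step can be sketched as follows for a one-dimensional batch of feature values. The scale `gamma`, shift `beta`, and stabilizer `eps` are the usual parameters of the technique, but the default values used here are assumptions rather than the settings of this study.

```python
import math
from typing import List


def batch_norm(values: List[float], gamma: float = 1.0,
               beta: float = 0.0, eps: float = 1e-5) -> List[float]:
    """Normalize a batch to zero mean and unit variance, then scale and shift."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
print(normalized)
```

After this step the batch has a mean of approximately zero and a variance of approximately one, regardless of the scale of the incoming features.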

D. ReLU Layer
This layer replaces the negative values in the convolved features with zero. This introduces nonlinearity into the features of the CNN.
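A minimal sketch of this operation on a small feature map (the input values are illustrative only):

```python
from typing import List

Matrix = List[List[float]]


def relu(feature_map: Matrix) -> Matrix:
    """Replace every negative value with zero; keep other values unchanged."""
    return [[max(0.0, v) for v in row] for row in feature_map]


print(relu([[-1.5, 2.0], [0.0, -0.1]]))  # [[0.0, 2.0], [0.0, 0.0]]
```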

E. Fully Connected Layer
All activations of the preceding layer are connected to the neurons of this layer. The main task of this layer is to classify the features extracted from the image data-sets into the specified classes.

F. SoftMax Layer
This layer converts the activations of the previous layer into class probabilities. In the diagnosis case, the values are interpreted as two groups, '0' and '1'.
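The SoftMax computation can be sketched as below. The two input scores are hypothetical outputs of the fully connected layer for the classes '0' and '1'; subtracting the maximum score is a standard numerical-stability step.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to one."""
    m = max(logits)                          # subtract the max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for class '0' and class '1'.
probs = softmax([0.3, 2.1])
label = probs.index(max(probs))              # index of the most probable class
print(probs, label)
```

The predicted label is the index of the larger probability, which here is class '1'.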

G. Output Layer
The last layer of the CNN model assigns labels based on the outputs of the preceding layer. For example, '1' denotes an nCoV-19-positive case (CoV-19+) and '0' denotes an nCoV-19-negative case (CoV-19-). To test the efficiency of the proposed DL classifier, 70% of the Chest X-ray (CXR) images, containing abnormal and normal cases, are randomly selected for CNN training. The training parameters for the deep convolutional neural network (DCNN) architecture in this study include an initial learning rate of 3 × 10^-3, chosen to reach the desired convergence on this small image dataset within few iterations and to avoid the degradation problem as far as possible. The proposed CNN model, with weights pre-trained on the ImageNet data-set, has been trained with the Adam optimizer. The following hyper-parameters were used for training: minimum learning rate = 4.78 × 10^-9, number of epochs = 500, batch size = 16, factor = 0.3, and patience = 1. The running times of all DL models are fairly short, ranging from 410 to 2845 seconds, owing to the use of powerful GPU tools with a small chest X-ray (CXR) image data-set. On 10 test images, the test time of the proposed model did not exceed 10 seconds. Finally, a batch rebalancing scheme is put in place to improve the class distribution within batches. The CNN model prototype has been created and evaluated using the Keras deep learning library with the TensorFlow2 backend. The performance of the trained DL classifiers was evaluated graphically through the loss and accuracy in the training and testing phases. The best results were achieved in training and testing accuracy. The resulting confusion matrices of the tested deep learning classifiers are shown in Fig. 5.
To check the classification efficiency of the DL classifier in terms of the false positive rate (FPR) and true positive rate (TPR), we have also plotted a receiver operating characteristic (ROC) curve for distinguishing positive nCoV-19 cases in the tested CXR images, as shown in Fig. 6.
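The hyper-parameters "factor" and "patience" suggest a reduce-on-plateau learning-rate rule, although the text does not name the scheduler; this reading is an assumption. Under that assumption, the following simplified sketch in plain Python shows how the learning rate would evolve. The validation losses are made-up values, and the logic is a simplification rather than the exact behavior of any library callback.

```python
from typing import List


def reduce_on_plateau(val_losses: List[float], lr: float = 3e-3,
                      factor: float = 0.3, patience: int = 1,
                      min_lr: float = 4.78e-9) -> List[float]:
    """Return the learning rate in effect after each epoch."""
    best = float("inf")
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best:
            best = loss          # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)   # shrink, but never below min_lr
                wait = 0
        history.append(lr)
    return history


# Made-up validation losses for six epochs (illustrative only).
schedule = reduce_on_plateau([0.90, 0.70, 0.80, 0.60, 0.65, 0.66])
print(schedule)
```

With a patience of 1, every epoch that fails to improve the validation loss multiplies the learning rate by 0.3 until the minimum of 4.78 × 10^-9 is reached.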
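For readers unfamiliar with ROC curves, the sketch below shows how the (FPR, TPR) points and the area under the curve can be computed from predicted scores. The labels and scores are invented for illustration and are not results of this study; distinct scores are assumed, so ties are not treated specially.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def roc_points(labels: List[int], scores: List[float]) -> List[Point]:
    """Return (FPR, TPR) points obtained by lowering the decision threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    # Visit samples from the highest score to the lowest.
    for _, label in sorted(zip(scores, labels), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: List[Point]) -> float:
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Invented labels (1 = positive) and classifier scores.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.20]
curve = roc_points(toy_labels, toy_scores)
print(curve[-1], auc(curve))
```

An area close to 1 indicates that positive cases tend to receive higher scores than negative ones; an area of 0.5 corresponds to random guessing.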

A. Dataset and Experimental Setup
Several publicly accessible data-sets provide a huge number of CXR images. However, because nCoV-19 is so new, none of the large repositories contains labeled nCoV-19 data, so we depend on at least two data-sets for nCoV-19 and normal images. The nCoV-19 chest X-rays were taken from the publicly open COVID-19 Image Data Collection [49]. As expected, the images in this collection vary in size and quality. The collection contained a total of 115 PA images classified as COVID-19. The images differ greatly in resolution, from as small as 224×224 pixels to as large as 1024×1024 pixels. Contrast, brightness, and subject positioning are all extremely variable in this data-set. The collection primarily contains adult patient data. The image samples used for this research are shown in Fig. 7, where Fig. 7(a) shows a positive nCoV-19 CXR and Fig. 7(b) a negative nCoV-19 CXR. The National Institutes of Health (NIH) CXR data-set [50] provides 112,120 anonymized X-ray images with 14 condition labels, including pneumonia and normal conditions. The NIH images have a similar quality, size, and aspect ratio, with dimensions of typically 1024×1024 pixels in portrait orientation. This data-set was chosen as the source of pneumonia and normal X-ray images. Table I summarizes our findings concerning the available data sources. In this analysis, we aim to use real chest X-ray data rather than, at this point, creating and using synthetic data. For our model experiments, we also intended to use a balanced data-set. The master data-set containing COVID-19, pneumonia, and normal images for model creation and testing was assembled from the two source data-sets, COVID-19 and NIH. The COVID-19 data-set was filtered to eliminate images with an incorrect projection, low resolution, or poor cropping, leaving more than 100 usable COVID-19 samples.
We use sampling methods to select appropriate numbers of normal and infected samples from the NIH data-set. The pneumonia and normal images were selected at random from the downloaded NIH data-set, excluding samples from young patients. A few of the selected images contained medical devices, which we were concerned might draw the attention of the machine learning (ML) algorithm; these images were discarded and replaced by random selections until no such devices were present in the data-set. Table II summarizes the outcomes of this process. For reported COVID-19 disease, the radiographs showed a Ground Glass Opacity/Opacification (GGO) pattern, occasionally with consolidated patches, in a peripheral and bilateral distribution. The methodology was implemented using the Anaconda3 software bundle. The implementation was GPU-specific. All trials were run on a Dell Inspiron 5570 with a Core(TM) i5-8250U CPU at 1.6 GHz (8 CPUs) and 16 GB of RAM. All experiments used 70% of the data-set for training, 25% for testing, and the remaining 5% for validation.
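The 70%/25%/5% partition can be sketched as a simple random split. The shuffling seed, the function name, and the file names below are hypothetical and only serve the illustration; the text does not specify how the partition was drawn.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(items: Sequence[str], train: float = 0.70,
                  test: float = 0.25, seed: int = 0
                  ) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle once, then cut into train, test, and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)    # fixed seed for repeatability
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    train_set = shuffled[:n_train]
    test_set = shuffled[n_train:n_train + n_test]
    val_set = shuffled[n_train + n_test:]    # remainder, about 5%
    return train_set, test_set, val_set


# Hypothetical file names standing in for the CXR images.
files = [f"cxr_{i:03d}.png" for i in range(200)]
train_set, test_set, val_set = split_dataset(files)
print(len(train_set), len(test_set), len(val_set))  # 140 50 10
```

Shuffling before cutting keeps the three parts disjoint while giving each image the same chance of landing in any part.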

B. Testing Accuracy and Confusion Matrix
Testing accuracy is an estimate of how accurately and precisely a selected deep model classifies unseen data. In addition, the Confusion Matrix (CM) is a quantitative tool that offers further information about the test accuracy achieved. The confusion matrix of the model is presented in Fig. 5. In the definitions below, TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
1) Accuracy: The principal metric for the outcomes of DL classifiers, as specified in equation (1). It is the sum of TP and TN divided by the total of all values in the CM.
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
2) Precision: Given in equation (2), it relates the TP predicted values to the total positive predicted values.
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{2}$$
3) Recall: Given in equation (3), it relates the TP predicted values to the total actual positive values.
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3}$$
4) F1-score: The F1-score is an overall measure of accuracy that combines precision and recall, as given in equation (4). It is twice the product of the recall and precision metrics divided by their sum.
$$\mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{4}$$
5) Sensitivity: Measures the fraction of actual positives that are correctly recognized as such (e.g., the ratio of sick individuals who are correctly recognized as having the disorder), as given in equation (5).
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{5}$$
6) Specificity: Also termed the TN rate, it measures the fraction of actual negatives that are correctly recognized as such (e.g., the ratio of healthy individuals who are correctly recognized as not having the disorder), as given in equation (6).
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{6}$$
One of the benefits of generating a confusion matrix is that it allows the accuracy of the tests to be measured. Precision, Recall, F1-score, Sensitivity, and Specificity are given in equations (2)-(6). They are among the most important performance measures in the field of DL. Table III reports the performance of the model in terms of specificity, sensitivity, precision, and F1-score.
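The metrics above follow directly from the four cells of a binary confusion matrix, as the sketch below shows using the standard definitions. The counts passed to the function are invented for illustration and are not the results reported in Table III.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute the standard metrics from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # identical to sensitivity
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
    }


# Invented counts, not the confusion matrix of this study.
metrics = classification_metrics(tp=45, tn=47, fp=3, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that recall and sensitivity are the same quantity under these definitions, so they always take the same value.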

VI. DISCUSSION
People in more than one hundred and seventy countries have been infected by COVID-19 so far, and this number may, unfortunately, increase over the coming days. COVID-19 has been declared a pandemic by the WHO. The deep learning approach shows promising results for identifying, from digital CXR images, morphological changes in the lungs of COVID-19-infected patients [51]. In this analysis, InceptionV3 was used and achieved an accuracy of 95.74% and an F1-score of 95.55%. One of the limitations of this analysis is the number of cases. Organized gathering of COVID-19 patient data from various countries could be strategically valuable and should be explored to further improve the diagnostic method. Varying some parameters yields different results, which are shown in Table III. The observations from the present research indicate that the proposed approach is capable and can be further applied to the multi-step prediction of a diverse set of diseases, such as cancer and cardiac, infectious, and liver disease. In the future, we plan to validate our model by integrating more CXR images. This should decrease clinician workload significantly. We will address CT images for nCoV-19 recognition and compare the outcomes with those of the proposed model trained on CXR images. We will also attempt to assemble more local radiology images of nCoV-19 cases from locations in Pakistan and evaluate them using our CNN model. Once the required trials are finished, we aim to deploy the model in local hospitals for screening.

VII. CONCLUSION
In this research, we presented a Deep Convolutional Neural Network (DCNN) strategy for recognizing nCoV-19 cases from CXR images that is open-source and accessible to the broad community, and which could be used as an auxiliary tool by overburdened health specialists in determining the course of treatment. We also defined a hybrid CXR data-set for training the DCNN that contains 10,000 CXR images from two open-access data repositories, to which we added 150 new COVID-19-positive CXR images. Additionally, we examined how the model makes predictions in an attempt to gain deeper insights into critical aspects of COVID-19 cases, which can support clinicians in improved screening and provide transparency when using a DCNN for enhanced computer-aided screening (CAS). Our results of around 95% for both precision and sensitivity are encouraging as an initial outcome, considering that proficient clinicians have an interpretive radiological error rate of 2-10%, depending on the radiological examination. We found that the InceptionV3 classifier provided good results within the experimental limitations of the small number of presently available nCoV-19 chest X-rays. As the number of available nCoV-19 CXR images rises, we will be able to supply sufficient training data to the deeper network and thus obtain better results from the InceptionV3 classifier. We aim to make our CNN model more reliable and accurate by using additional chest X-ray images from our local hospitals.