Using Deep Learning Algorithms to Diagnose Coronavirus Disease (COVID-19)

With the rapid development of Machine Learning (ML) and Deep Learning, it is important to exploit these tools to help mitigate the effects of the coronavirus pandemic. Early diagnosis of the presence of this virus in the human body can be crucially helpful to healthcare professionals. In this paper, three well-known Convolutional Neural Network deep learning algorithms (VGGNet 16, GoogleNet and ResNet50) are applied to measure their ability to distinguish COVID-19 patients from other patients and to identify the best-performing algorithm on a large dataset. Two stages are conducted, the first with 14994 X-ray images and the second with 33178. Each model is applied with batch sizes of 16, 32 and 64 in each stage to measure the impact of data size and batch size on accuracy. The second stage achieved better accuracy than the first, and a batch size of 64 gave better results than 16 and 32. ResNet50 achieves the highest accuracy of 99.31%, the GoogleNet model achieves 95.55%, and VGG16 achieves 96.5%. Ultimately, the results can help expedite the diagnosis and referral of these treatable conditions, thereby facilitating earlier treatment and resulting in improved clinical outcomes.


I. INTRODUCTION
In November 2019, the novel COVID-19 was reportedly discovered for the first time in Wuhan, in Hubei Province, China. A month later, the World Health Organization (WHO) announced that the virus is capable of causing a respiratory illness that manifests clinically as coughing, fever, and inflammation of the lungs. Although COVID-19 was first discovered in China, it has since been found in a significant number of places across the globe [1,2]. This is owing not only to the quick transmission of the virus but also to the high incidence of death that has been seen as a result. WHO declared a public health emergency on January 30, 2020, and subsequently declared the emergence of the new COVID-19 virus a pandemic [3]. According to the recommendations of the Chinese National Health Commission, radiographic evidence of pneumonia is needed as part of the clinical diagnostic criteria in Hubei Province [4], which emphasizes the significance of CT scan images for determining the degree of COVID-19 lung inflammation.
Owing to a recent increase in the number of COVID-19 patients, waiting times at hospitals for CT scan image evaluation have increased significantly. This creates a significant danger of spreading the illness to other patients. The healthcare system becomes overburdened because the number of radiologists is generally much lower than the number of patients. In turn, this delays the discovery and quarantine of infected persons and reduces the effectiveness of patient treatment [4]. Because of the rapid spread of COVID-19 and the enormous demand for diagnosis, researchers are creating more intelligent, highly responsive, and efficient diagnostic approaches. In the past several years, artificial intelligence in the medical field has received widespread attention as a potentially helpful tool for guiding clinical choices and the diagnosis of diseases [5,6]. Notably, artificial intelligence has been emphasized as an effective tool for predicting outbreaks in the current epidemic.
A Canadian company (Blue Dot) successfully reported this outbreak's location in late December. Consequently, there is an urgent need to create an intelligent system that can detect occurrences of COVID-19 correctly and automatically. The research community needs to develop a dataset that is comprehensive and ready for testing as soon as possible.
The Convolutional Neural Network (CNN) is one of the most popular machine learning algorithms for image processing and training. CNNs have achieved very successful results in tasks such as image classification, text-based tasks, sign identification, object identification, and face recognition. A CNN can also detect traces of COVID-19.
This study aims to automatically identify and quantitatively evaluate the pneumonia lesions that are visible in individuals with COVID-19 who have chest CT scans. We apply three well-known CNN models, ResNet50, VGG16, and GoogleNet, because of their efficiency in recognizing and classifying images. The models are compared to select the one that achieves the highest accuracy. Each model is evaluated in two stages, each with a different large dataset and different batch sizes, to measure the effect of data size and batch size on model accuracy. Each model is trained and tested for 100 epochs. The remainder of the paper is organized as follows. Section II reviews the literature. Section III discusses the proposed method, including the datasets and materials, and explains the deep transfer learning models. Section IV presents the analytical results, and Section V concludes the paper.

II. RELATED WORK
There is rising interest in alternative ways of identifying coronavirus infection using medical imaging due to the rapid spread of COVID-19. Processing and analysis of X-rays and Computed Tomography (CT) scans have been performed using different deep learning methods, which have assisted physicians in predicting COVID-19 infection [7]. Several approaches that identify coronaviruses using deep learning have been presented. Wang and Wong [8] propose COVID-Net, an artificially intelligent system based on convolutional neural networks that is designed to differentiate COVID-19 cases from other types of cases by assessing lung abnormalities in X-ray images. A variant of the Inception model, which extracts COVID-19 features from CT images, achieves an accuracy of 89.5% [9]. A 3D deep learning system based on the location-attention mechanism is constructed using CT scans. Its objective is to locate infected regions in COVID-19 patients, and it is able to differentiate COVID-19 pneumonia from influenza-related viral pneumonia with an accuracy of 86.7% [10]. To identify cases of coronavirus, a deep neural network is trained on CT scans; this network is able to differentiate between the areas infected by COVID-19 and those affected by other lung diseases [11]. Song et al. [12] built a ResNet architecture to extract complex features from the CT data, and for the classification of COVID-19, a feature pyramid network is integrated with an attention module. A diagnostic technique that relies on CT scans has been developed to identify instances of COVID-19 [13]. Islam et al. [14] investigate whether those who have the coronavirus can be identified from chest X-rays. The Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) have both been included in the new design that the researchers constructed. Machine learning, including deep learning, has been widely applied in imaging-based medical research on COVID-19. The primary objective of the researchers in [1] is to design a model that can diagnose COVID-19 in a manner analogous to that of radiologists, but in a shorter period. They attain performance equivalent to that of expert radiologists while reducing diagnosis time by 65% compared with in-clinic radiologists. Nevertheless, there is still room to improve both their suggested model and the overall system so that people may have personal access to it. DeTraC (Decompose, Transfer, and Compose) is a deep convolutional neural network that was created and validated to recognize COVID-19 patients from their chest X-ray images [15]. Its authors recommend a decomposition approach to check for anomalies in the dataset by studying the class boundaries, which achieves a high accuracy (93.1%) and sensitivity (100%). In [16], a deep learning technique is founded on the ResNet-101 CNN model. In the suggested technique, the pre-training step uses thousands of images to learn to distinguish significant objects, and the re-training phase then adapts the model to detect abnormalities in chest X-ray images. This approach achieves an accuracy of only 71.9%.
The structure proposed in [17] includes three layers: the patient layer, the cloud layer, and the hospital layer. Wearable sensors and a mobile application are used to collect information in the patient layer. An identification model based on deep learning and neural networks is applied to the patients' X-ray images in order to detect COVID-19, and these images are used in conjunction with the diagnosis. The proposed model achieves an accuracy of 97.9% and a specificity of 98.85% in its predictions. In [18], pre-trained deep learning models such as ResNet50, VGG16, VGG19, and DenseNet121 are used, and a unique architecture is developed for classifying X-ray images as either COVID-19 or normal. The VGG16 and VGG19 models have the highest accuracy among those considered. The proposed model consists of three phases: preprocessing, data augmentation, and transfer learning. It achieves an accuracy of 99.3%. In [19], three deep transfer models, AlexNet, GoogleNet, and ResNet18, are applied to a set of 307 images covering four classes: COVID-19, normal, bacterial pneumonia, and viral pneumonia. The study is divided into three scenarios to reduce execution time and memory usage. Among the deep transfer models, GoogleNet achieves a testing accuracy of 100% and a validation accuracy of 99.9%. In [20], a deep learning-based system is developed for identifying COVID-19 from chest X-ray images by fine-tuning four models: ResNet18, ResNet50, SqueezeNet, and DenseNet-121. The proposed solution applies data augmentation to generate transformed versions of the COVID-19 images, which increases the total number of samples. The obtained results achieve a sensitivity of 98% and a specificity of 90%. In [21], a model that makes use of deep learning as well as machine learning classifiers is developed. This model is used in a total of 38 experiments to identify COVID-19 from chest X-ray images. Ten of these experiments are carried out with a range of machine learning strategies, and fourteen are carried out using pre-trained networks with transfer learning. The system achieves an accuracy of 98.50%, a specificity of 99.18%, and a sensitivity of 93.84%. The authors conclude that the CNN system can recognize COVID-19 from a limited number of images without any preprocessing and with a reduced number of layers.
In this research, convolutional neural network models are presented to distinguish COVID-19 images from other images. The proposed methods are based on the most effective architectures, which are chosen from the top 5 CNN architectures of the ILSVRC competition. The GoogLeNet architecture won the 2014 ILSVRC; it reduces the error rate in comparison to AlexNet and ZF-Net, and also reduces the number of parameters to 4 million, compared with 60 million in AlexNet [22,23]. VGG, in contrast, owes its performance to its multilayer design [24], which has up to nineteen layers and is deeper than ZF-Net and AlexNet. This depth was chosen to show the relation between network depth and representational capacity. VGG uses 3 × 3 filters, which are smaller than the 11 × 11 and 5 × 5 filters used in earlier networks such as AlexNet and ZF-Net. Stacked small filters can produce the same effect as large ones while reducing the computational complexity and the number of parameters. The ResNet (Residual Network), developed in [25], won the ILSVRC in 2015. Its goal was to create an ultra-deep network that avoids the vanishing gradient issue by using shortcut connections to improve the convergence of deep networks. Compared with VGG, ResNet has lower computational complexity despite its greater depth.
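The parameter saving from small filters can be checked with simple arithmetic. The sketch below is illustrative only: the helper name `conv_params` and the channel width of 64 are assumptions, not values taken from this study. It compares one 5 × 5 convolution with two stacked 3 × 3 convolutions, which cover the same 5 × 5 receptive field.

```python
def conv_params(kernel, in_ch, out_ch, bias=True):
    """Number of trainable weights in one 2D convolution layer."""
    return kernel * kernel * in_ch * out_ch + (out_ch if bias else 0)


C = 64  # hypothetical channel width, chosen only for illustration

# One 5x5 layer versus two stacked 3x3 layers with the same receptive field.
single_5x5 = conv_params(5, C, C)
stacked_3x3 = 2 * conv_params(3, C, C)

print("one 5x5 layer :", single_5x5)
print("two 3x3 layers:", stacked_3x3)
```

Under these assumptions the stacked 3 × 3 design needs fewer weights while also inserting an extra non-linearity between the two layers.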
In addition, many factors affect whether a model produces poor or accurate results. The study in [35] aimed to highlight the importance of choosing an appropriate batch size to obtain the best accuracy within the expected time. Different batch sizes, such as 16, 32, 64, and 128, were used to measure accuracy, and the authors observed that a small batch size improved the performance of the models. The study in [36] used a batch size of 6 and obtained 97.78% accuracy, while [39] achieved 75.51% with the same model, GoogleNet; the number of positive COVID-19 images in [36] was close to that in [39], but the number of non-COVID-19 images was greater in [36] than in [39]. Similarly, [37] and [38] both used a batch size of 32 with the same model, ResNet50, but [37] achieved a high accuracy of 99.8% while [38] obtained 76.38%.
Consequently, some previous studies found that a small batch size is an essential factor with a major impact on model accuracy, while another study [40] obtained its best accuracy with a batch size of 64. Thus, in this study, we measure the impact of batch size and dataset size on accuracy.

III. PROPOSED METHOD
The proposed method utilizes three transfer learning CNN models, VGG16, ResNet50 and GoogleNet, with larger datasets than those used in previous studies. Different batch sizes and image sizes are used to evaluate performance in order to improve the results and obtain the best accuracy and the lowest loss. The open-source Anaconda platform, which provides more than 1,500 open-source packages, is used to install and manage the Python packages; it simplifies package deployment and management, which made our experiments faster and easier.
Within Anaconda, we created a TensorFlow environment and installed all required packages in it.

A. Datasets
In this study, a publicly available dataset of chest X-rays is used. The individuals imaged either have pneumonia, have normal chest X-rays, or are suspected of having COVID-19. The dataset comes not only from publicly available sources, but also indirectly from hospitals and medical professionals. The Kaggle dataset repository contains all of the images and data that have been made public. A COVID-19 detection model is constructed using these X-ray images. Two datasets are used: the first contains 14994 images and the second contains 33178 images, each of which is a pneumonia, normal, or COVID-19 X-ray scan. Each dataset is divided into 70% for training and the remaining 30% for testing. The source in [26] contains 2843 COVID-19 images; the source in [27] includes images in three subsets: train, test, and validation. Similarly, [28] contains 2313 COVID-19 images, while [29] covers four categories: COVID, Lung_Opacity, Normal, and Viral Pneumonia. After being rescaled to a size of 160 × 160 pixels, the X-ray images are shown in Fig. 2.
This dataset can be accessed by the general public. At the time of writing, it is the largest labeled dataset available. In the near future, the number of datasets and the number of samples in each are expected to rise. Labeling is a separate topic to be considered. The X-ray images in this collection depict the illness.
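To make the 70/30 division and the role of the batch size concrete, the following sketch computes the training and test counts for both datasets and the resulting number of batches per epoch. The function name `split_counts` and the round-to-nearest rule are assumptions; the paper does not state the exact per-split counts.

```python
import math


def split_counts(n_images, train_pct=70):
    """Return (train, test) counts for a percentage split, rounded to nearest."""
    train = (n_images * train_pct + 50) // 100  # integer round-to-nearest
    return train, n_images - train


for n_images in (14994, 33178):
    train, test = split_counts(n_images)
    # Batches needed to pass over the training set once, per batch size.
    steps = {batch: math.ceil(train / batch) for batch in (16, 32, 64)}
    print(n_images, "->", train, "train /", test, "test; steps per epoch:", steps)
```

A larger batch size means fewer weight updates per epoch, which is one reason batch size can affect the accuracy reached within a fixed number of epochs.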

B. Image Preprocessing
Applying the algorithms to unclear data will not give accurate outcomes, as they will fail to recognize the patterns effectively. Thus, data preprocessing is essential before computation to improve data quality.
Preprocessing is applied to the COVID-19 scan images to achieve greater consistency in classification results and improved feature extraction. In the current study, each image is resized to draw attention to COVID-19 in the Region of Interest (ROI), eliminate irrelevant details, and lessen the amount of computation. The CNN method requires a significant amount of iterative training, for which a large-scale image dataset is necessary; a large-scale dataset also reduces the chance of overfitting. The flow chart of the proposed approach, which illustrates the basic processes of our study, is shown in Fig. 1.

C. Data Augmentation
Several data augmentation procedures are applied to the training set using the image data generator function of the Keras library for Python. This is performed to minimize overfitting and to boost the variety of the dataset. The pixel values are normalized by scaling them into the same range: a rescaling factor of 1./255 is applied to convert each pixel value from the range [0, 255] to the range [0, 1]. When performing a shear transformation, one axis of the image is kept in its original position while the other axis is stretched by a predetermined angle known as the shear angle. In this study, the shear angle is set to 0.2. For the random zoom transformation, the zoom range argument is used; a zoom factor of less than 1.0 magnifies the image, whereas a factor greater than 1.0 zooms out. A zoom range of 0.2 is applied, so that, in Keras, the zoom factor is drawn at random from [0.8, 1.2]. The image can also be flipped vertically by using the flip option. Together, these transformations increase the variety of the training data.
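The rescaling and zoom settings described here can be illustrated without any deep learning library. The sketch below is a plain-Python approximation, not the Keras implementation: the function names are hypothetical, and the interval rule `[1 - z, 1 + z]` follows the documented behaviour of the `zoom_range` argument of the Keras `ImageDataGenerator` when a single float is given.

```python
import random


def rescale(pixels, factor=1.0 / 255):
    """Map 8-bit pixel values in [0, 255] to floats in [0, 1]."""
    return [p * factor for p in pixels]


def zoom_interval(zoom_range):
    """Interval that a float zoom range expands to: [1 - z, 1 + z]."""
    return (1.0 - zoom_range, 1.0 + zoom_range)


def sample_zoom(zoom_range, rng):
    """Draw one random zoom factor; below 1.0 zooms in, above 1.0 zooms out."""
    low, high = zoom_interval(zoom_range)
    return rng.uniform(low, high)


scaled = rescale([0, 128, 255])
low, high = zoom_interval(0.2)
factor = sample_zoom(0.2, random.Random(0))  # seeded only for repeatability
print(scaled, (low, high), factor)
```

In an actual Keras pipeline the same values would be passed as the `rescale`, `shear_range`, and `zoom_range` arguments of the generator rather than computed by hand.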

D. Convolutional Neural Network
In this section, we consider the transfer learning methods and the deep neural networks that are used. In deep learning, the CNN offers solutions in particular for the recognition, classification, and analysis of images and videos. The CNN architecture was developed with the visual cortex of living organisms serving as a source of inspiration, and its design is comparable to the connectivity pattern of neurons in the human brain [17]. CNN's recent success can be credited, at least in part, to the network's capacity to learn effectively from large-scale datasets such as ImageNet. The fundamental components of the CNN may be separated into three types of layers: the convolution layer, the pooling layer, and the fully connected layer, which is the last layer.
In summary, the convolutional and pooling layers are in charge of the learning that the model performs, whilst the fully connected layer is in charge of the classification [18]. The primary component of the CNN architecture is the convolutional layer, in which the characteristics of the inputs are learned. To produce the feature map, filters that capture low-level and high-level features are first applied to the input image. In general, an activation function such as the sigmoid or the ReLU function is applied in this layer.

E. Visual Geometry Group Network (VGG)
Simonyan et al. [30] developed the VGG-Net model, which uses small convolution filters throughout the network. Its structure, in which blocks of two or three consecutive convolution layers follow one another, is frequently used in CNN models [30]. This is the most significant difference between this model and the models that came before it, in which convolution and pooling layers simply alternate; in other respects the model is relatively straightforward. The model has approximately 138 million parameters [31]. The pre-trained VGG model offers an accurate representation of features learned from more than one million images (the ImageNet dataset) spanning one thousand categories. The model is therefore able to function as an effective feature extractor for new images. Having been trained on ImageNet, it can extract relevant features from different images, including entirely new ones that either do not exist in the dataset or belong to an entirely different category from those already present. As a result, employing pre-trained models as effective feature extractors gives a distinct competitive edge [30]. Fig. 3 shows the framework on which VGG16 is built.
Each convolution layer in VGG16 is followed by a ReLU layer, and max pooling layers are used for downsampling. The architecture of VGG16 employs 3 × 3 convolution filters and a total of 13 convolution layers to extract features. It contains three fully connected layers for classification, two of which are hidden layers, and the final classification layer consists of one thousand units representing the image categories of the ImageNet database [30]. Stacking small filters provides the effect of a larger filter while retaining the advantages of smaller filter sizes; compared with a single larger filter, this design needs fewer parameters. Additionally, two ReLU layers are used for the two convolution layers rather than a single ReLU layer. As a result of the convolution and pooling layers, the spatial size of the input volume shrinks from layer to layer, while the depth of the volume increases because the number of filters increases. This makes the model more effective when applied to object classification and edge detection problems [31].
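The shrinking spatial size and growing depth can be traced by hand. The sketch below assumes the standard VGG16 block layout (2, 2, 3, 3, 3 convolution layers with 64 to 512 channels), "same" padding, and the 160 × 160 input size used in this study; the names `VGG16_BLOCKS` and `trace_shapes` are hypothetical.

```python
# Standard VGG16 convolutional blocks: (number of 3x3 conv layers, output channels).
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_shapes(size, blocks=VGG16_BLOCKS):
    """Return (spatial size, channels) after the 2x2 max pooling of each block."""
    shapes = []
    for _, channels in blocks:
        size //= 2  # 'same'-padded convolutions keep the size; pooling halves it
        shapes.append((size, channels))
    return shapes


n_conv_layers = sum(n for n, _ in VGG16_BLOCKS)
shapes = trace_shapes(160)
print("convolution layers:", n_conv_layers)
print("feature map shapes:", shapes)
```

With these assumptions the 13 convolution layers reduce a 160 × 160 image to a 5 × 5 map with 512 channels before the fully connected layers.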

1) ResNet50:
ResNet is an architecture designed to have a deeper structure than the architectures that came before it, with variants of up to 152 layers; ResNet50 is the 50-layer variant. ResNet was introduced in 2015 [32] and took first place in the ImageNet competition held that year with an error rate of 3.6% [32]. The most significant aspect that separates the model from other designs is the addition of blocks that feed their input forward to later layers. Fig. 4 illustrates this residual mapping structure, which other architectures do not have. The input value is added every two layers, between the linear operation and the ReLU activation. In a very deep network without such connections, the derivative can shrink to 0, which is not desired [32]. With the shortcut, however, the fed-forward value still allows the learning error to be optimized even if the output of the two layers below is 0, so the network trains much more rapidly. The architecture is made up of the residual blocks displayed in Fig. 4. The convolutions applied to the input value x in the residual block produce F(x) after the convolution and ReLU series. This result is then added to the initial input x, giving H(x) = F(x) + x. Learning residuals rather than direct mappings is what makes the ResNet50 model simple to train, which is a significant advantage in its application [32].
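The relation H(x) = F(x) + x can be illustrated with a toy example. The sketch below is a simplification under stated assumptions: it uses small dense layers instead of convolutions, omits batch normalization and the ReLU that standard ResNet blocks apply after the addition, and all function names are hypothetical.

```python
def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]


def linear(values, weights):
    """Dense layer without bias; `weights` is a list of rows."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def residual_block(x, w1, w2):
    """Compute H(x) = F(x) + x with F(x) = W2 * ReLU(W1 * x)."""
    f_x = linear(relu(linear(x, w1)), w2)
    return [f + xi for f, xi in zip(f_x, x)]


zeros = [[0.0] * 3 for _ in range(3)]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

# When the two layers output 0, the shortcut still passes x through unchanged.
passthrough = residual_block([1.0, -2.0, 3.0], zeros, zeros)
doubled = residual_block([1.0, 2.0, 3.0], identity, identity)
print(passthrough, doubled)
```

The first call shows why a block whose weights contribute nothing does not block the signal: the output equals the input, so deeper layers still receive useful values.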
2) GoogleNet: In contrast, the GoogleNet architecture has a total of 22 layers, making it significantly deeper and wider than the AlexNet architecture, despite having a significantly lower total number of parameters (five million) than AlexNet (60 million). An implementation of the "network in network" design of Lin et al. [8], realized through so-called "inception modules", is among the most essential components of the GoogleNet architecture. An inception module uses parallel 1 × 1, 3 × 3, and 5 × 5 convolutions together with a parallel max-pooling layer in order to capture a wide range of features simultaneously. To keep the amount of computation under control in practical applications, dimensionality reduction is performed by adding 1 × 1 convolutions before the 3 × 3 and 5 × 5 convolutions (and also after the max-pooling layer). The last layer of the module is the filter concatenation layer, which simply aggregates the outputs of all the parallel branches. Together these form a single inception module; the version of the GoogleNet architecture used in our experiments has a total of nine inception modules. See [32] for a more comprehensive summary of the structure of this architecture.
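The saving from the 1 × 1 dimensionality reduction can be estimated by counting multiply-accumulate operations. The sketch below is illustrative: the 28 × 28 × 192 input and the 16 and 32 channel counts are assumed example values, and `conv_macs` is a hypothetical helper, not part of any library.

```python
def conv_macs(height, width, kernel, in_ch, out_ch):
    """Multiply-accumulate count of a stride-1, 'same'-padded convolution."""
    return height * width * kernel * kernel * in_ch * out_ch


H = W = 28                                # assumed feature map size
IN_CH, REDUCE_CH, OUT_CH = 192, 16, 32    # assumed channel counts

# 5x5 branch applied directly to the full-depth input.
direct = conv_macs(H, W, 5, IN_CH, OUT_CH)

# Same branch with a 1x1 reduction placed before the 5x5 convolution.
reduced = conv_macs(H, W, 1, IN_CH, REDUCE_CH) + conv_macs(H, W, 5, REDUCE_CH, OUT_CH)

print("direct 5x5      :", direct)
print("1x1 then 5x5    :", reduced)
print("reduction factor:", round(direct / reduced, 2))
```

For these example values the reduced branch needs roughly a tenth of the operations, which is how the module can run several filter sizes in parallel at an acceptable cost.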

1) Accuracy (ACC):
Accuracy is the proportion of images that are classified correctly, whether positive (COVID-19) or negative (non-COVID-19). Accuracy is computed as:
$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$
where True Positive (TP) refers to COVID-19 images that have been correctly classified as COVID-19, and False Negative (FN) refers to COVID-19 images that have been incorrectly classified as non-COVID-19. True Negative (TN) denotes non-COVID-19 images that are classified correctly, whereas False Positive (FP) refers to non-COVID-19 images that are incorrectly classified as COVID-19. Table I presents the fundamentals of the evaluation metrics.

3) True positive rate (TPR):
The TPR is the proportion of positive images that are correctly detected, and is calculated as follows:
$$\mathrm{TPR} = \frac{TP}{TP + FN} \qquad (3)$$

4) True negative rate (TNR):
The TNR is the proportion of negative images that are correctly classified, and is calculated as follows:
$$\mathrm{TNR} = \frac{TN}{TN + FP} \qquad (4)$$

5) False positive rate (FPR):
The FPR is the fraction of negative images that are incorrectly classified as positive. It may be computed as follows:
$$\mathrm{FPR} = \frac{FP}{FP + TN} \qquad (5)$$

6) Precision:
Precision is the proportion of positive predictions that are correct, and is calculated as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (6)$$

7) Recall:
Recall is the proportion of positive samples that the model correctly identifies, and is calculated as follows:
$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (7)$$

8) F1-score:
The F1 score is the harmonic mean of the precision and the recall of the model:
$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (8)$$
Its maximal value is one and its minimal value is zero. The closer the value is to one, the higher the quality of the model.
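All of the metrics in this section follow from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the evaluation metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # identical to the true positive rate
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "tpr": recall,
        "tnr": tn / (tn + fp),
        "fpr": fp / (fp + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts chosen only to exercise the formulas.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name:9s} {value:.4f}")
```

Note that the FPR and the TNR always sum to one, so reporting either one is sufficient in practice.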

A. Feature Extraction Performance
The trained CNN operates on extracted features before it is used to categorize the data. To assess the effectiveness of the CNN models, test features are extracted from the test images and analyzed with a variety of pre-trained models. The CNN models extract features from both the training and testing datasets using the pre-trained GoogleNet, ResNet50, and VGG16 models. In the current study, a comparison of the performance of these CNN models is carried out. Although ResNet50 does not achieve the highest accuracy in every experiment, it obtains better overall accuracy and specificity than the other pre-trained models. However, when compared to the pre-trained CNN models, the performance of a model trained from scratch is not considered to be sufficient. The results of the ResNet50 model are superior to those obtained by the VGG16 and GoogleNet models, as presented in Table II.

B. Comparative Analysis
Assembling appropriate datasets of chest X-ray images for COVID-19 detection is a time-consuming process. Researchers have made use of a variety of preprocessing methods, feature extraction strategies, and classification approaches [24,33].
However, because there are so many possible approaches, it is difficult to identify the strategy or combination of techniques that is most helpful in diagnosing COVID-19 from chest X-ray images. In the vast majority of instances, an accuracy greater than 90% is observed, which from a statistical perspective constitutes a very high degree of accuracy. However, the objective is to bring the accuracy as near as possible to 100%, given that an incorrect diagnosis, even in a very small number of instances, is not acceptable. The proposed approach achieves a higher classification accuracy in identifying COVID-19 than most of the other strategies proposed in the literature. However, the study by Loey et al. [19] reports a higher accuracy of 100%. This might be because the dataset that they use to assess the performance of the system has only a very small number of images (69 COVID-19 and 79 normal images, respectively) [34]. The accuracy of the suggested technique, which utilizes feature fusion derived from GoogleNet, ResNet50, and VGG16, is demonstrated to be higher when the CNN is used as the classifier. On the other hand, when used as a binary classifier on the chest X-ray dataset, ResNet50 achieves a satisfactory level of performance.
Table II illustrates the experiments with different batch sizes and epochs on the 14994-image dataset. The first set of experiments, with GoogleNet, employs three batch sizes (16, 32, and 64) with 30, 21, and 34 epochs, and the testing accuracy reaches 94%, 94.46%, and 93.64%, respectively, as shown in Fig. 5; see also Fig. 6. The final experiment on the 14994-image dataset is performed using the VGG16 model, which achieves 93.55%, 93.75%, and 94%, respectively, with the same batch sizes of 16, 32, and 64 and with 61, 89, and 41 epochs, as shown in Fig. 7. Following that, experiments are performed on the larger dataset of 33178 X-ray images. Each model is again run three times, changing the batch size and the number of epochs each time. The GoogleNet model reaches a testing accuracy of 95.64% with a batch size of 64 and 84 epochs. The ResNet50 model is then run with batch sizes of 64 and 128, reaching the same testing accuracy of 96.07% in both cases. Finally, the VGG16 model, with a batch size of 64 and 89 epochs, reaches an accuracy of 95.81%. The VGG16 and GoogleNet models achieve almost the same level of performance in COVID-19 classification. The ResNet50 model, which provides the maximum classification performance with a testing accuracy of 95%, is used for both datasets in order to conduct the evaluation. Batch sizes of 32, 64, and 128 are applied in order to measure the accuracy for each and select the best batch size.
Although the image size is reduced to less than half of the original size in this experiment, the system still demonstrates its resilience by accurately detecting COVID-19 instances. One possible explanation is that the state-of-the-art algorithms are able to identify and extract a greater variety of distinctive characteristics. Fig. 9 represents the confusion matrix of VGG16 in the first stage, while Fig. 10 shows the confusion matrices of VGG16 and ResNet50 in the second stage.
The rationale behind utilizing X-ray images in the COVID-19 detection process is first discussed in this paper. After that, several related papers on pre-trained CNN systems utilizing X-ray images are highlighted. To build a COVID-19 detection model, a large public dataset of chest X-rays is used because public COVID-19 datasets are insufficient. The first dataset contains 14994 X-ray images, while the second contains 33178 X-ray images. These are used for the training and testing stages. The images are scaled to 160 by 160 pixels. The dataset is made more comprehensive by applying different image augmentation techniques.
The transfer learning models GoogleNet, ResNet50, and VGG16 were applied to the X-ray images of COVID-19 patients and other patients. The pre-trained ResNet50 model produced the highest performance in automated COVID-19 classification, with an accuracy of 96%, compared with the other two models, which both had a classification accuracy of 94%. By presenting our results in graphs and tables, we were able to draw attention to the classification performance. It is recommended that the number of examples in the dataset be increased so that the model can achieve more accurate performance. In addition, we hope in the future to use preprocessing techniques that positively affect the results, improve the models, and lead to accurate predictions that generalize.