Optimization of Multilayer Perceptron Hyperparameter in Classifying Pneumonia Disease Through X-Ray Images with Speeded-Up Robust Features Extraction Method

Abstract: Pneumonia is an illness that can affect almost anyone, from children to the elderly. It is an infectious disease of the lungs caused by viruses, bacteria, or fungi. Pneumonia is difficult to recognize because it has multiple levels of classification, so the symptoms experienced may vary. This study uses the Multilayer Perceptron to classify pneumonia and to determine the level of accuracy achieved, as a contribution to scientific development. The Multilayer Perceptron is the classification method, with learning rate and momentum as hyperparameters, while Speeded-Up Robust Features (SURF) is used to extract the features. Based on the experiments, the learning rate in general has little influence on the learning process at any of the momentum values 0.1, 0.3, 0.5, 0.7, and 0.9. The best accuracy for momentum 0.1 is at a learning rate of 0.05. The best accuracy for momentum 0.3 is at a learning rate of 0.09. The best accuracy for momentum 0.3 is at learning rates of 0.05 and 0.07. At a learning rate of 0.03 the highest accuracy is obtained. The best accuracy for momentum 0.9 is at a learning rate of 0.09. Future work should repeat this research while varying the number of hidden layers and the number of nodes in each hidden layer. Adding hidden layers and varying the number of nodes per hidden layer will affect computation time and may yield better accuracy.


I. INTRODUCTION
Medical images play an important role in classifying or identifying a disease. One of the most frequently used techniques is X-ray imaging [1] [2] [3]. Radiologists use this technique to view and obtain information about the condition of the patient's body. It is also easy to use and economical [4] [5]. A chest X-ray is used by radiographers to see the condition of the patient's chest area. The resulting image shows the patient's chest, lungs, heart, and trachea [6] [7]. Lung infections caused by bacteria, viruses, fungi, or parasites appear as white-gray areas. This pattern makes it easier for doctors to identify the patient's illness, such as pneumonia [8] [9].
Pneumonia is a type of acute lower respiratory infection (ARI) in which the tissues and air sacs of the lungs are inflamed by bacteria, viruses, fungi, or parasites. The air sacs fill with fluid, which can cause coughing up phlegm, fever, difficulty breathing, and chills [10] [11]. Pneumonia is an airborne disease. People aged 65 years and over are a high-risk group for pneumonia. In the elderly, pneumonia tends to be severe and can even lead to death [12] [13].
Wijaya's 2020 study addresses pneumonia classification using the k-nearest neighbor (KNN) method with gray level co-occurrence matrix (GLCM) feature extraction. The findings suggest that cropping the X-ray image of the lungs can improve accuracy. With GLCM feature extraction and KNN classification at k = 5, the greatest accuracy per class is 66.20 percent, and the maximum accuracy is 72.90 percent for the viral lung image class. Because GLCM does not yield suitable values, the precision and recall levels tend to have the same value [14].
Xu [15] used a CNN model to classify CT image data sets and determine the probability of COVID-19 infection. Two models were tested: one that ignored the distraction component and another that accounted for the distraction effect by adding a noisy-OR Bayesian function. The first model achieved 79.4% accuracy, 68.9% precision, 76.5% recall, and a 72.5% f1-score. The second model achieved 86.7% accuracy, 81.3% precision, 86.7% recall, and an 83.9% f1-score.
Das [16], in a study on detecting COVID-19 infection in chest X-rays based on deep transfer learning, used chest X-ray images to detect and diagnose COVID-19 in order to circumvent the low sensitivity of RT-PCR. Although CT scans were favored over chest X-rays, the authors chose chest X-ray images because X-ray machines are less expensive than CT scan devices and X-rays emit less ionizing radiation than CT scans. According to their review, chest X-ray scans of COVID-19 infected patients show several unique patterns and bilateral alterations. Manual COVID-19 testing from chest X-ray images, on the other hand, is a difficult task.
Bae et al. [36] studied optimizing [46] the hyperparameters of a multilayer perceptron with greedy search, in experiments using the Fashion MNIST and Kuzushiji MNIST datasets. The proposed algorithm performs similarly to complete search, which suggests it can be a potential alternative to complete search. The experiments varied the learning rate over [0.1; 0.05; 0.01; 0.005; 0.001], and the most optimal results were at learning rates of 0.1 and 0.001 [36].
Ke [37], in a 2021 report, discusses enhancing the training accuracy of a multilayer perceptron model through hyperparameter optimization, using the quality prediction of injection-molded parts as a case study. The experiment used the learning rates [0.1; 0.001; 0.0001; 0.0001; 0.00001] and the activation functions [Sigmoid, Tanh, ReLU, LeakyReLU, ELU]. The most optimal accuracy was obtained at a learning rate of 0.1 for all five activation functions [37].
Kermany et al. [39] used the Inception deep learning model to determine medical conditions. The layers were trained using the Adam optimizer with a learning rate of 0.001 and stochastic gradient descent in batches of 1,000 images per step. In all categories, training lasted 10,000 steps or 100 epochs.
Ekonomou [40] included a wavelet technique as a noise remover before the forecasting data was processed by the ANN model. He found that although the ANN model alone produced good results on forecasting problems, more optimal results were obtained with the wavelet step. This demonstrates that, to produce more accurate results, ANN still has to be extended with different combinations and enhancements.
In his manuscript on power system topology observability analysis, Reddy [43] investigated an enhanced Hopfield neural network model and used the particle swarm optimization technique as a comparative model. According to his research, the improved Hopfield neural network required less computational time than the particle swarm optimization algorithm, with a time ratio of 0.2811:18,592 in seconds. It also required fewer iterations to produce the best results, with an iteration ratio of 45:189.
In his research, Alqudah [44] claimed that integrating a deep learning model for feature extraction with machine learning for classification improved the CNN model over the basic neural network model. The study examined accuracy, sensitivity, specificity, and precision for input image sizes of 32x32, 64x64, 128x128, and 256x256, and achieved the best results with 64x64 inputs, with values of 80.07, 79.24, 89.55, and 78.80, respectively.
The ResNet model was used by Latif [45] in his research on pneumonia detection, which also lists CNN models with 9, 10, and 12 layers. In [39], the comparison of chest X-rays presenting as pneumonia versus normal achieved an accuracy of 92.8%, with a sensitivity of 93.2% and a specificity of 90.1%. The area under the ROC curve for detection of pneumonia from normal was 96.8%. Binary comparison of bacterial and viral pneumonia resulted in a test accuracy of 90.7%, with a sensitivity of 88.6% and a specificity of 90.9%. The area under the ROC curve for distinguishing bacterial and viral pneumonia was 94.0% [39].
According to earlier studies, the multilayer perceptron as a classification approach combined with Speeded-Up Robust Features (SURF) as feature extraction has not been identified in X-ray image classification research. This study therefore investigates the multilayer perceptron classifier with variations in momentum and learning rate, and SURF as its feature extraction model, for categorizing pneumonia based on X-ray images.
This work uses SURF as a feature selection and feature extraction model because SURF can reduce data loss in the high-quality data extraction process [35]. Feature selection and feature extraction are difficult tasks when dealing with high-dimensional data [41] [42] [47].

A. Pneumonia
Pneumonia is inflammation of the lung parenchyma in which the acini fill with inflammatory fluid, with or without infiltration of inflammatory cells into the walls of the alveoli and interstitial spaces. In children under five it is characterized by coughing accompanied by rapid breathing and/or shortness of breath [17]. The inflammation makes breathing difficult and reduces oxygen intake. Pneumonia is caused by microorganisms such as pneumococcus, staphylococcus, streptococcus, and viruses, and can be transmitted through the air, saliva droplets, direct contact through the mouth, and contact with shared objects [18] [19]. Indonesia, as a country located in the tropics, has the potential to become an endemic area for infectious diseases that can threaten public health at any time. Pneumonia is the second leading cause of death for children under five in Indonesia after diarrhea. The number of pneumonia sufferers in Indonesia in 2013 ranged from 23-27%, and deaths from pneumonia were 1.19% [20].

B. Multilayer Perceptron
The Multilayer Perceptron is a variant of the original Perceptron model proposed by Rosenblatt in the 1950s [21]. A Neural Network is a computing system modeled on the brain: it is inspired by the human brain, which can receive input, process it, and produce output. A Neural Network can produce output because it has acquired knowledge through a learning process [22] [23]. Neural Networks serve several functions, such as pattern classification, mapping input patterns to new output patterns, storing patterns for later recall, mapping similar patterns, solving optimization problems, and prediction [24] [25].
The Multilayer Perceptron is a development of the Perceptron. It was developed to overcome the Perceptron's weakness, namely its inability to perform complex logic operations [26].
The perceptron algorithm, which forms the basis of the Multilayer Perceptron model, was invented by Frank Rosenblatt in 1957 at the Cornell Aeronautical Laboratory, with funding from the United States Office of Naval Research.
A Multilayer Perceptron is arranged in three levels: one input layer, one or more hidden layers, and one output layer. Computation runs from the input layer to the output layer with no feedback loops. A Neural Network architecture like this is called feedforward [27] (Fig. 1).
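The feedforward arrangement described above can be sketched in a few lines of Python. This is only an illustration, not the implementation used in this study: the layer sizes, the sigmoid activation, the random weight range, and the helper names `init_layer` and `forward` are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Create one fully connected layer: small random weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layers, x):
    """Propagate x through the layers in order; no value is ever fed back."""
    activation = x
    for weights, biases in layers:
        activation = [
            sigmoid(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation


rng = random.Random(0)
# Hypothetical sizes: 4 input features, one hidden layer of 3 nodes, 2 output nodes.
layers = [init_layer(4, 3, rng), init_layer(3, 2, rng)]
output = forward(layers, [0.2, 0.7, 0.1, 0.9])
print(output)  # two values, one per output node
```

Adding another hidden layer only means appending one more `init_layer(...)` entry to `layers`; the forward loop itself does not change.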

C. SURF
The Speeded-Up Robust Features (SURF) algorithm was first published in 2006 by Herbert Bay, a researcher at ETH Zurich [28]. In its development, Bay was assisted by two colleagues, Tinne Tuytelaars and Luc Van Gool [29] [30]. The SURF algorithm can detect local features in an image reliably and quickly. It is inspired by the Scale Invariant Feature Transform (SIFT), especially at the scale-space representation stage. The SURF algorithm combines integral images with blob detection based on the determinant of the Hessian matrix [31].
SURF is a very powerful local feature detector that can be used in computer vision tasks such as object recognition and 3D reconstruction [32]. One of its advantages is processing speed, which is due in part to the use of integral images. Each entry of the integral image is the sum of the grayscale values of all pixels above and to the left of that position [33]. SURF is designed to extract the uniqueness and similarity of features from images. The SURF algorithm is divided into stages, namely interest point detection and feature description [34].
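To illustrate why integral images make SURF fast, the following minimal sketch builds an integral image and uses it to sum any rectangular region with four lookups, regardless of the region's size. The function names `integral_image` and `box_sum` and the 3x3 sample values are hypothetical and chosen only for this example.

```python
def integral_image(gray):
    """Return an (h+1) x (w+1) table where ii[y][x] is the sum of gray[0:y][0:x]."""
    h, w = len(gray), len(gray[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            # Sum above this row plus the running sum of the current row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1 (four lookups)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


gray = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(gray)
print(box_sum(ii, 0, 0, 3, 3))  # whole image: 45
print(box_sum(ii, 1, 1, 3, 3))  # bottom-right 2x2 block: 28
```

SURF's box filters, which approximate the second-order derivatives in the Hessian matrix, are evaluated with exactly this kind of constant-time rectangle sum.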

A. Dataset
The dataset used in this study was taken from Kaggle and can be accessed at https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia. It consists of 5,863 X-ray images (JPEG) in two categories, Pneumonia and Normal.

B. Data Analysis
Biomedical images come in a wide variety of shapes and sizes. Images of a comparable pathological condition can vary significantly from person to person, and even from encounter to encounter. These discrepancies can be due to differences in illumination, marker stains (for pathological investigations), image extraction procedure, image dimension, and so on. Image preprocessing ensures that all of the images are in the same format and are free of noise that is irrelevant to the study.

C. Research Architecture
This study's research architecture is based on multiple previous investigations [35] [38]. The created model includes a training model and a test model. Fig. 2 shows the research steps, which are carried out in two processes. The first is the training process, in which the data is extracted: each image is converted to grayscale, reducing the three color channels R, G, B to a single channel, and features are extracted using SURF. The extracted data is stored as a pattern model for use in the testing phase. The second is the testing process, which matches the trained pattern model using the Multilayer Perceptron as the classifier. Fig. 3 shows some pneumonia and normal X-ray image samples used in this study.
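The grayscale step of the training process can be sketched as below. The weighting coefficients are an assumption: this paper does not state which conversion formula was used, so the example adopts the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114). The function name `rgb_to_gray` and the sample pixels are likewise hypothetical.

```python
def rgb_to_gray(image):
    """Convert rows of (R, G, B) tuples into rows of single gray values (0-255)."""
    # Assumed weights: ITU-R BT.601 luma coefficients, not confirmed by the paper.
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


# Hypothetical 2x2 color image: white, black, pure red, pure blue.
rgb = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
gray = rgb_to_gray(rgb)
print(gray)  # [[255, 0], [76, 29]]
```

After this step each pixel carries one value instead of three, which is the single-channel input that SURF's integral image is built from.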

A. Pneumonia X-Ray Sample Set
A sample set of pneumonia X-ray images is illustrated in Fig. 3.

B. SURF Detection
The detected SURF features are marked with circles on the image. Fig. 4 shows the 100 strongest SURF features detected on a pneumonia X-ray image.

C. Performance Measure
A confusion matrix was used to describe the performance of the classification model on a test dataset for which the true values were known. It enables the visualization of an algorithm's performance. Accuracy indicates how many of all predictions are correct. Precision indicates how many of the positive predictions are correct. Recall indicates how many of the actual positives are found. The f1-score combines precision and recall into a balanced value by taking their harmonic mean. The equations below show how to calculate these values, with TP, TN, FP, and FN standing for true positive, true negative, false positive, and false negative, respectively.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \frac{TP}{TP + FN}$$
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
The experimental results are reported in Table I, Table II, Table III, Table IV and Table V.
Based on the results of the experiments, the accuracy, precision, recall, and f1-score are relatively stable. The accuracy ranges from 78.4% to 82.8%, the precision from 80.5% to 82.8%, the recall from 78.4% to 82.8%, and the f1-score from 76.4% to 82.8%.
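As a worked illustration of the performance measures defined in this subsection, the sketch below computes them from confusion-matrix counts. The counts are hypothetical and are not taken from this study's tables; the function name `classification_metrics` is also an assumption, and the sketch assumes non-zero denominators.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and f1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)  # share of positive predictions that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for a binary pneumonia-vs-normal test set of 200 images.
metrics = classification_metrics(tp=80, tn=70, fp=30, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the accuracy is 150/200 = 0.75 and the recall is 80/100 = 0.80, while the precision is lower (80/110) because of the 30 false positives.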

E. Discussion
The highest classification report values were found at momentum 0.3 with a learning rate of 0.09, and at momentum 0.9 with a learning rate of 0.09, with 82.8%.
Previous research [15] discussed a CNN model for classifying CT image data sets and determining the probability of COVID-19, using the noisy-OR Bayesian function to account for the effects of interference on the image. That work reports 86.7% accuracy, 81.3% precision, 86.7% recall, and an 83.9% f1-score.
Deep learning with the Inception model has been used in previous research to learn about medical disorders [39]. The layers were trained using the Adam optimizer with a learning rate of 0.001 and stochastic gradient descent in batches of 1,000 images per step. In all categories, training lasted 10,000 steps or 100 epochs. In a comparison of pneumonia to normal chest X-rays, that work achieved an accuracy of 92.8%, with a sensitivity of 93.2% and a specificity of 90.1%. The area under the ROC curve for pneumonia detection from normal was 96.8%.
The classification of pneumonia is also covered by Khairina [48]. That study achieved accuracy, precision, recall, and f1-score values of 0.8067, 0.7948, 0.9237, and 0.8544, respectively, in identifying the symptoms of pneumonia by combining the K-Nearest Neighbor method with the Histogram of Oriented Gradients as a feature selection and feature extraction model.

IV. CONCLUSION
Based on the experiments, the learning rate in general has little influence on the learning process at any of the momentum values 0.1, 0.3, 0.5, 0.7, and 0.9. The best accuracy for momentum 0.1 is at a learning rate of 0.05. The best accuracy for momentum 0.3 is at a learning rate of 0.09. The best accuracy for momentum 0.3 is at learning rates of 0.05 and 0.07. At a learning rate of 0.03 the highest accuracy is obtained. The best accuracy for momentum 0.9 is at a learning rate of 0.09. Future work should repeat this research while varying the number of hidden layers and the number of nodes in each hidden layer. Adding hidden layers and varying the number of nodes per hidden layer will affect computation time and may yield better accuracy.
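For readers who want to reproduce a learning rate and momentum sweep of the kind summarized here, the sketch below shows how the two hyperparameters enter a gradient descent update. It is a toy illustration under stated assumptions: the classical momentum update rule, the one-dimensional quadratic loss, and the function name `sgd_momentum` are not taken from this study, and the grids list only the learning rate and momentum values mentioned in the text.

```python
def sgd_momentum(grad, w0, learning_rate, momentum, steps):
    """Minimize a function from its gradient using the classical momentum update."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        # Momentum keeps a fraction of the previous step; the learning rate
        # scales the new gradient contribution.
        velocity = momentum * velocity - learning_rate * grad(w)
        w += velocity
    return w


def toy_grad(w):
    """Gradient of the toy loss (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


learning_rates = [0.03, 0.05, 0.07, 0.09]  # values mentioned in the text
momentums = [0.1, 0.3, 0.5, 0.7, 0.9]      # values mentioned in the text

results = {
    (lr, mu): sgd_momentum(toy_grad, w0=0.0, learning_rate=lr, momentum=mu, steps=200)
    for lr in learning_rates
    for mu in momentums
}
for (lr, mu), w in sorted(results.items()):
    print(f"lr={lr:.2f} momentum={mu:.1f} -> w={w:.4f}")
```

On this simple loss every combination in the grid ends close to the minimum at w = 3, which mirrors the observation above that the learning rate had little influence within the tested range; a real multilayer perceptron loss surface is not guaranteed to behave this way.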