Modeling a Fault Detection Predictor in Compressor using Machine Learning Approach based on Acoustic Sensor Data

Proper functioning of the air compressor ensures stability for many critical systems. The ill effects of breakdowns caused by wear and tear can be mitigated if an effective automated fault classification system exists. Traditionally, simulation-based methods have been used to identify faults; however, such systems are not effective enough to support real-time adaptive detection of a fault and its type. This paper proposes an effective model for fault classification in the air compressor based on real-time empirical acoustic sensor time-series data sampled at 50 kHz. In the proposed work, the time-series data is transformed into the frequency domain using the fast Fourier transform, where only half of the spectrum is considered because of its symmetric representation. Afterward, a masking operation is carried out to extract significant feature vectors, which are fed to a multilayer perceptron neural network. The uniqueness of the proposed system is that it requires fewer trainable parameters, which reduces the training time and imposes lower memory overhead. The model is benchmarked on accuracy, and the proposed masked-feature-set-based MLP-ANN exhibits an accuracy of 91.32%. In contrast, the LSTM-based fault classification model gives only 83.12% accuracy, takes more training time, and consumes more memory. Thus, the proposed model is realistic enough to be considered for real-time fault monitoring and control. However, other performance metrics like precision, recall, and F1-score are also promising with the LSTM-based fault classifier.
Keywords: Air-compressor; fault detection; LSTM; multi-layer perceptron; ANN; acoustic sensor data


I. INTRODUCTION
Air compressors (ACs) play a significant role in essential functions like fuel injection and metal finishing in aircraft design [1]. ACs are widely used in thermal plants [2], power generation systems [3], vehicle propulsion [4], and pipeline systems [5]. In building an effective quality control system, compressor simulation plays an essential role in evaluating the pressure tolerated by the different components of the aircraft while in transit [6]. Specifically, aircraft manufacturers depend on very high-quality compressors in every phase of production for the operational success of these functions. Another important aspect is that the aircraft components must avoid contamination caused by air mixing with the lubricants in ACs. Fig. 1 shows the common applications of ACs in aircraft. A compressed air system is used to design the complete product line of an air compressor. Two main types of air compressors are widely used, depending on the application requirements: i) rotary screw ACs and ii) reciprocating ACs.
The automated fault classification (AFC) problem in compressors has recently attracted researchers to build solution paradigms so that early detection can minimize the damage caused to the overall system. Early warning systems are a step towards preventive maintenance [7], which is broadly classified into two categories, namely: i) maintenance on breakdown [8] and ii) condition-based maintenance (CBM) [9][10]. CBM has gained an edge over maintenance on breakdown because it performs both the detection and the isolation of faults at an early stage of the breakdown. An early and intelligent fault detection system [11] tracks the condition [12] of the machine through dynamic changes in pressure, temperature, vibration, or acoustics [13]. Dhosi et al. [14] reviewed the correlation between vibrations and faults for various machines like pumps, turbines, and compressors. However, the most predominant measurement for fault detection is the acoustic signal. Recent studies include fault diagnosis in the planetary gearbox [15]; Polak et al. [16] highlighted fault detection in compressors and combustion engines using acoustic signals; and Ahmed et al. [17] reviewed the use of acoustic measurement for power unit fault detection in aircraft.
This paper proposes fault classification in the air compressor based on real-time empirical acoustic sensor data. The challenging point in designing such an intelligent fault detection system (IFDS) is identifying the sensitive points from which the signals are acquired, which is done by sensitive point or position analysis (SPA) [18][19]. The raw data collected from all the sensitive positions are exposed to various noises and require appropriate denoising [20]. The learning model does not take these denoised data directly as input. Therefore, a meaningful mathematical representation of the acoustic signals plays a vital role in both dimensionality reduction and predictor accuracy. Various representations of the signals, including the time, frequency, and time-frequency domains, are found in the design of fault detection systems for pipeline leakage [21], mechanical compound faults [22], centrifugal compressors [23], and reciprocating compressors [24]. Theoretically, the detection probability and accuracy are assumed to be higher if more features are considered while designing the detection models [25]. In practice, however, the performance of machine learning-based models degrades with a higher number of features [26] because of the ill effects of high polynomial complexity and approximation.
This is handled by using an effective feature selection or dimension reduction technique, such as partial least squares (PLS), which has been used for the gas turbine's compressor blade [27]. Other popular techniques used in machine fault detection systems are PCA [28] and variants of ICA [29]. Moreover, there exist various types of faults, which the model is trained on using these feature sets. Many learning models act as a function approximator such that, if Yn represents the full feature set, a function Fn emerges as Fn(Yn) → C, where C is the class or type of the fault. Popular learning models used for machine fault classification are linear SVM for vehicle power systems [30], CNN for bearing fault classification [31], and LSTM for compressor valves [32]. Towards achieving better classification performance, this paper presents an effective model for fault classification in the air compressor using a multilayer perceptron neural network that avoids the memory overhead of Long Short-Term Memory (LSTM) based approaches. The paper utilizes the acoustic signal data prepared by Verma et al. [33] from 24 sensor positions, chosen by SPA, on an air compressor. The entire process flow of this method is given as a snap-view in Fig. 2. The remaining part of the paper is organized as follows: Section II discusses the related work in the context of AC fault prediction; Section III presents dataset visualization and analysis; Section IV discusses the implementation of the LSTM model; Section V discusses the implementation of the proposed model based on MLP; Section VI presents the results and performance analysis of the proposed system; and finally, the work is concluded in Section VII.

II. REVIEW OF LITERATURE
This section discusses the related work carried out towards air compressor fault prediction. In the existing literature, two kinds of approaches are used for AC fault classification. The first is the model-based approach, and the second is the data-driven approach [34]. Model-based approaches utilize mathematical modeling for machine life estimation and fault prediction [35][36]. In contrast, data-driven approaches are based on statistical analysis and soft computing approaches like machine learning, deep learning, and evolutionary computing. However, model-based approaches involve complicated procedures to describe the attributes of the mechanical system [37]. The condition of a mechanical system like an air compressor can be analyzed by processing sensory data using data-driven and soft-computing approaches [38]. Ouadine et al. [39] applied a neural network optimized using a genetic algorithm to predict aircraft AC bearing faults. The dataset used in this study consists of vibratory signals captured from different bearing defects. The features are determined based on spectral density estimation, and the prediction outcomes were compared with a discriminant analysis classifier. Li et al. [40] introduced an intelligent fault detection system for mechanical rotating systems. The study uses a recurrent neural network (RNN) with fuzzy logic. The RNN filters the input signal, and the filtered signal is then fed to fuzzy logic to detect faults. Ghorbanian and Gholamrezaei [41] investigated the application of different machine learning mechanisms to analyzing compressor performance. The authors utilized a general regression neural network (GRN) and an MLP and simulated the performance of these models. The results indicated that the GRN has a lower mean error and performed well with the experimental data but is limited to the interpolation factor.
On the other hand, the MLP was evaluated, and the results indicated the most favorable outcome for analyzing compressor performance. Aravinth and Sugumaran [42] adopted a statistical feature extraction approach and a random forest (RF) classifier to monitor and predict faults in the AC in order to avoid regular failures in industrial and domestic applications. In this study, the accelerometer signal is processed via a statistical approach, and RF is applied to detect the type of fault in the AC. Fan et al. [43] considered the case study of vehicle communication and presented their work on predicting AC breakdown using data streamed by the vehicles. The authors used histogram analysis to model the signal. However, the histogram is a simple approach that can determine the deviation in the signal only to some extent. Cui et al. [44] suggested an intelligent model for the early detection of faults in the AC. The approach used in this study is based on the construction of an adaptive matrix using PCA and backpropagation techniques. This matrix is constructed to store the signals and determine a function of the deviation in the signal pattern.
Further, to identify an early fault signature, a threshold is computed using a sliding statistical window method. Work towards evaluating the trustworthiness and prediction reliability of the AC in an ammonia plant is considered by Musyafa et al. [45]. Chen et al. [46] presented an LSTM-oriented approach for classifying compressor breakdown using aggregated sensory data. The performance of the presented model is evaluated using information captured from large heavy-duty vehicles. The authors formulated a classification task to identify whether a compressor fault will occur within a specified horizon. An LSTM learning model is used for prediction, and its performance is evaluated against the RF classifier. The experimental outcome showed that RF slightly outperforms LSTM regarding AUC. However, the predictions from LSTM are stable over time and maintain a stable trend in healthy versus faulty classification. Another work in a similar direction, by Yang et al. [47], suggested an AC fault classification mechanism using a lifting wavelet approach. Initially, this study decomposes the vibration signal of the AC using wavelets, and statistical features of the decomposition are then computed as the AC fault features. In the classification process, a probabilistic model-based supervised classifier is employed to predict the fault class. The study outcome suggested that the features determined using the wavelet-based approach describe the faults comprehensively and lead to higher accuracy in the supervised classifier.

III. DATASET DESCRIPTION, EXPLORATION, AND VISUALIZATION
This section describes the process of data creation, the dataset itself, and the exploratory analysis and visualization that guide how the data is presented to the learning model, with the aim of overcoming the memory constraints of the traditionally applied LSTM-based fault classification system while achieving higher accuracy.

A. Data Description
The Department of Electrical Engineering of the Indian Institute of Technology (IIT), Kanpur, India, has an air compressor of the single-stage reciprocating type. Dr. N. K. Verma and his team provide an open-source dataset [33][34] recorded on this compressor, with the specification given in Table I.
In the work of Verma et al. [33], all sensors, as in Fig. 3, record raw data every 5 seconds at a sampling rate of 50 kHz per sensor, and the recordings are stored in the respective eight folders with the structure shown in Fig. 3.
A closer analysis of each record shows that every folder contains precisely 225 'dat' files. Since there are 24 sensors at the sensitive points and one additional sensor kept at a far distance, recording acoustic data every 5 seconds over nine different time slots, each folder holds 25 x 9 = 225 files. The raw data undergoes various pre-processing stages, including filtering to eliminate undesirable frequency components using an FIR filter with a 400 Hz cut-off frequency (COF) and a low-pass filter with a COF of 12 kHz to retain the valuable information. Further, clipping, smoothing, and normalization operations are performed to obtain the preprocessed data. The names of the fault classes are then extracted, as listed in Table II. In total, there are eight classes: seven fault classes and the normal class.

1) Bearing fault:
A bearing fault in the compressor arises when the bearings, which keep the compressor wheels running smoothly, malfunction. A bearing may break or become imbalanced due to wear and tear. As a result, friction in the machine increases and noise arises.
2) Piston fault: The piston is the major part of the mechanism that converts rotary motion into linear motion or vice versa. If there is a fault in the piston, the RPM of the entire machine may drop, and as a result the overall sound of the machine becomes less loud.
3) Flywheel: The flywheel is the main store of kinetic energy in any machine. The main source of rotary motion is the motor or IC engine, which may not provide continuous energy to the machine. If there is a fault in the flywheel due to wear and tear, the wheel spins faster but stores less kinetic energy. Since it spins faster, the frequency of the sound may increase.

4) Leakage in inlet valve:
This fault occurs when the inlet valve of the compressor leaks, which significantly reduces the pressure in the cylinder. The noise becomes less loud since the compressor is no longer working at optimal efficiency. The speed of the piston drops, and the frequency of the noise drops with it.

5) Leakage in outlet valve:
Contrary to the previous problem, high-frequency and loud noises appear when the leakage is in the outlet valve. This is because the pressure in the cylinder and the speed of the piston remain the same, yet the air escapes from the outlet valve at high pressure. This causes extra noises of various frequencies to appear.

6) NRV fault: NRV refers to a non-return valve, which closes when air tries to flow in the opposite direction. The fault arises when the air starts hitting the NRV in the opposite direction, which might be caused by blockage or damage. In either case, there is an impulsive load on the valve. This induces low-frequency noises that appear along with the rest of the noises.

B. Data Exploration and Visualization
Initially, all 225 data files stored in the respective directories are read and converted into a 2D vector of size 50,000 x 1, as each file contains a reading of '1' sec at a 50 kHz sampling rate. The explicit procedure for this operation is given in Algorithm 1 and can be understood using the flow chart in Fig. 6. At this stage of the data representation, all the data stored in the respective fault folders (fault_Dir) under the main folder (C) are read. Then the '225' files in each fault_Dir are read by building the file name (Fn) from the joined strings {fault_Dir, fault, and the file name.wav}. When 'Fn' is read, its content is in string format; it is converted into a comma-separated string and then into numeric values (V). Finally, the values and the fault label are appended to the initialized prediction vector (P) and response vector (R).
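The reading procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' Algorithm 1: the helper names `parse_record` and `load_dataset` are hypothetical, and the sketch assumes one sub-folder per fault class, each holding plain-text files of comma-separated amplitude values.

```python
import os
import numpy as np

def parse_record(text):
    """Convert one comma-separated record string into a 1-D float array (V)."""
    tokens = text.replace("\n", ",").split(",")
    return np.array([float(tok) for tok in tokens if tok.strip()])

def load_dataset(root):
    """Walk root/<fault>/<file> and return (P, R): value arrays and fault labels.

    Assumed layout: one sub-folder per fault class under the main folder.
    """
    P, R = [], []
    for fault in sorted(os.listdir(root)):
        fault_dir = os.path.join(root, fault)
        if not os.path.isdir(fault_dir):
            continue  # skip stray files next to the fault folders
        for name in sorted(os.listdir(fault_dir)):
            with open(os.path.join(fault_dir, name)) as fh:
                P.append(parse_record(fh.read()))
            R.append(fault)  # the folder name serves as the class label
    return P, R
```

With the dataset described here, `P` would hold 1800 arrays of 50,000 values each, and `R` the matching 1800 fault labels.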

C. Signal Transformation
This section presents the transformation of the time-domain signal into the frequency domain, as demonstrated in Fig. 7.
The time-domain audio signal is transformed into the frequency domain using the numerical expression given in equation 1.

1) Time domain analysis:
Advantages
• Phase shifts can be easily recognized in the time domain representation.
However, even considering this advantage, the time domain may not be a suitable representation of the data, since the readings are taken some time after the machine has been started. In a mechanical system like an AC, there is absolutely no chance of phase shifts occurring. Phase shifts occur only in electronic systems. In a complex machine like an AC, many types of audio signals are mixed. There may be sound coming from the main cylinder, sound from the flywheel, and minor sounds due to friction between moving parts. The disadvantages of AC signal analysis in the time domain are described as follows:
Disadvantages
• AC signals are captured at different frequencies and often change depending on the sampling frequency of the sensors.
• Time-domain analysis may not be suitable for determining the fault accurately because a high number of captured signals overlap.
• The ambient noises are removed in the adopted dataset [BP]. However, there are common sounds recorded by all the sensors, for example, the sound of the motor, which is captured by the ASs since it transmits efficiently through the metal shell of the AC.

2) Frequency domain analysis:
The AS signals in the frequency domain represent the amplitude of the quantity over various frequencies. The signal in the frequency domain is called a spectrum. The advantages of representing the AS signal in the frequency domain are described as follows:
Advantages
• Any frequency domain transformation works as a frequency un-mixer.
• It is easier to find out which component is faulty by looking at variations in the natural frequencies in the spectrum. For example, when there is a fault in the bearing, the friction increases and produces high-frequency noise from the rubbing of metal pieces. So, the higher-frequency noise becomes more dominant when a bearing fault is present. Hence, such a fault can be easily recognized.
Disadvantages
• Representing phase shifts in the frequency domain is quite a challenging task. However, phase shifts are not important in a mechanical system.
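As an illustration of the transformation discussed in this section, the following sketch computes the non-redundant half of the spectrum with NumPy's real-input FFT. Only the 50 kHz sampling frequency comes from the dataset description; the function name `half_spectrum` and the synthetic 1 kHz tone are assumptions made for the example.

```python
import numpy as np

FS = 50_000  # sampling frequency in Hz, as stated for the dataset

def half_spectrum(signal, fs=FS):
    """Return (freqs, magnitude) for the non-negative half of the FFT.

    For a real-valued signal the spectrum is symmetric, so rfft keeps
    only the bins from 0 Hz up to fs/2.
    """
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, np.abs(spectrum) / n

# Synthetic stand-in for one recording: 1 second of a 1 kHz tone.
t = np.arange(FS) / FS
tone = np.sin(2 * np.pi * 1_000 * t)
freqs, mag = half_spectrum(tone)
peak_hz = freqs[np.argmax(mag)]  # frequency bin with the largest magnitude
```

A 50,000-sample recording yields 25,001 frequency bins, the last of which sits at 25 kHz, half of the sampling frequency.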
Further, a descriptive statistical analysis is performed to better understand the overall data through summarization and to generate actionable information from the signal representation data. It covers '225 x 8 = 1800' samples.
Each sample has nine summary values belonging to the set {min, Q1, Q2, Q3, max, count, standard deviation, mean, fault-type}. Table III provides a random subset of samples from the complete description.
The count for every sample is 50,000, indicating that no cleaning is needed, as there are no missing values in any sample.
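The nine summary values listed above can be computed per recording as in the sketch below. The function name `describe` and the toy five-value signal are illustrative only; the sketch also assumes the population standard deviation (NumPy's default), whereas the paper does not state which variant was used.

```python
import numpy as np

def describe(signal, fault_type):
    """Summarize one recording as {min, Q1, Q2, Q3, max, count, std, mean, fault-type}."""
    q1, q2, q3 = np.percentile(signal, [25, 50, 75])
    return {
        "min": float(np.min(signal)),
        "Q1": float(q1),
        "Q2": float(q2),
        "Q3": float(q3),
        "max": float(np.max(signal)),
        "count": int(np.size(signal)),
        "std": float(np.std(signal)),  # population std (ddof=0), an assumption
        "mean": float(np.mean(signal)),
        "fault-type": fault_type,
    }

# Toy signal standing in for a 50,000-sample recording.
row = describe(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), "Healthy")
```

Applying `describe` to all 1800 recordings gives one row per sample; a `count` below 50,000 in any row would flag missing values.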
However, a better understanding is obtained from the histogram of one sample per fault type, shown in Fig. 8, which gives a better visual perception of the data pattern. As shown in Fig. 8, the amplitude range varies from wider to narrower depending on the fault type. The histograms for the respective fault types, Fig. 8(a) to Fig. 8(h), show the repetition of amplitudes, with the central tendency of each curve at zero. However, Fig. 8(d) and 8(h) show multiple central tendencies, with a peak at zero plus a slightly higher peak for the Bearing fault and a lower peak for the LOV fault. The detailed observations for the rest of the distributions are as follows:
• Flywheel: The flywheel curve is wider than both histograms above (wider in terms of the x-axis: the curve extends to about -2 and +2). It can be concluded that when there is a flywheel fault, the noise of a particular frequency from the machine gets louder. This is an important indicator.
• LIV: The noise becomes less loud compared to normal operation. This is understandable since LIV stands for leakage in the inlet valve: the pressure drops, and the loudness of the machine drops with it.
• LOV: Contrary to the previous example, a leakage in the outlet valve (LOV) induces an additional high-frequency noise. Upon closer inspection of the peak, two peak points are present. The lower one corresponds to normal operation, and the higher one to the noise. The air escapes from the outlet valve at a much higher velocity, which induces the high-frequency noise.
• NRV: An NRV (non-return valve) fault occurs only when the air hits the NRV with an impulsive load. The purpose of the NRV is to ensure the unidirectional flow of air. Apart from this, the machine is in normal working condition, so the histogram is very similar to that of normal operation. However, due to the lack of pressure at the output, some noises are not present.
• PISTON: In this case, the outside piston is malfunctioning. Since both the flywheel and the piston are external components to the main turbine, this histogram looks very similar to that of the flywheel fault.
• Ridge Belt: The ridge belt is the belt connecting the flywheel to some machine tool or energy converter. If it is at fault, an acoustic effect similar to that of the flywheel fault is produced.
• Healthy: In a healthy air compressor, the central tendency of the KDE plot lies slightly towards the positive side of the plot. This means that a very low-frequency noise is present when the compressor is working normally. This could be due to the rotation of the wheels and bearings.

IV. LEARNING MODEL DESIGN USING LSTM
LSTM is a specific class of Recurrent Neural Network (RNN), which is better suited to predicting time-domain sequences and their long-term dependencies than ordinary machine learning models. An RNN treats the association among cells as a directed graph. The previous state of a cell can be fed back, which gives the network the ability to "remember" information. With this structure, an RNN can make decisions based on the previous output value and the current input. However, RNNs encounter the issue of exploding and vanishing gradients during the training phase. LSTM has been conceptually designed to address this issue. The LSTM network has a unique structure called a cell (neuron), which allows it to control the flow of information in the network. The elementary unit structure of the LSTM cell is shown in Fig. 9.

In Fig. 9, the cell connects vectors through different functions such as the sigmoid $\sigma$ and the hyperbolic tangent $\tanh$, with point-by-point addition '∑' and multiplication $\odot$ operations. The cell has nodes such as the input node $x_t$, which takes input samples in the form of a vector; the activation $a_t$, which is the output of the node; the current short-term memory, or current cell state, $c_t$; the previous short-term memory $c_{t-1}$, which is the previous cell state; and the activation $a_{t-1}$, which is the output of the previous node. Moreover, to better control and memorize the flow of information, the LSTM cell uses gating mechanisms: the input gate $i_t$, the forget gate $f_t$, and the output gate $o_t$, where each gate takes values between 0 and 1. The input gate $i_t$ uses $a_{t-1}$ and $x_t$ and determines which values are used to update $c_t$. The update operation of the input gate is numerically expressed as follows:

$$i_t = \sigma\left(W_i [a_{t-1}, x_t] + b_i\right) \quad (1)$$
$$\tilde{c}_t = \tanh\left(W_c [a_{t-1}, x_t] + b_c\right) \quad (2)$$

where, in equation (1), $i_t$ denotes the input gate of the cell at timestep $t$ (occurrence of the LSTM cell), and $W_i$ and $b_i$ are the weights and bias of the sigmoid operation $\sigma$ between $a_{t-1}$ and $x_t$. In equation (2), $\tilde{c}_t$ denotes the candidate cell-state values generated by $\tanh$, and $W_c$ and $b_c$ denote the weights and bias of the $\tanh$ operation between $a_{t-1}$ and $x_t$. The next gate, $f_t$, decides what information from $c_{t-1}$ is considered to update $c_t$. The operation of information flow by the forget gate is numerically expressed as follows:

$$f_t = \sigma\left(W_f [a_{t-1}, x_t] + b_f\right) \quad (3)$$

where, in equation (3), $f_t$ denotes the forget gate of the cell at $t$, and $W_f$ and $b_f$ are the weights and bias of the sigmoid operation $\sigma$ between $a_{t-1}$ and $x_t$. Using equations (1), (2), and (3), the update of $c_t$ can be numerically expressed as follows:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \quad (4)$$

Further, the gate $o_t$ determines what information in $c_t$ becomes the value of $a_t$. The operation of information flow by the output gate is numerically expressed as follows:

$$o_t = \sigma\left(W_o [a_{t-1}, x_t] + b_o\right) \quad (5)$$
$$a_t = o_t \odot \tanh(c_t) \quad (6)$$

where, in equation (5), $o_t$ denotes the output gate of the cell at $t$, and $W_o$ and $b_o$ are the weights and bias of the sigmoid operation $\sigma$. In equation (6), $a_t$ denotes the output of the LSTM network, computed by point-by-point multiplication of the result of equation (5) with the $\tanh$ function applied to the input argument $c_t$.

In the proposed work, the task of compressor fault prediction using time-domain AS signals is regarded as a sequence classification problem. Therefore, the proposed study explores implementing a deep learning mechanism, particularly LSTM, for modeling large-scale time-domain AS signals for fault prediction in the AC. The proposed learning model architecture for AC fault classification is demonstrated in Fig. 10. In this model, the function f refers to the LSTM cell method discussed above, which generalizes the long-term dependence within the time-domain input signal. The LSTM is trained on input vectors built with a sliding window (w) approach, where the input is a sequence of time-domain signals with length L and w+1 window length. The process of window sliding is illustrated in Fig. 11 with window length w = 1000 AS signal samples.
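The gate operations referred to as equations (1) to (6) can be sketched as one forward step of an LSTM cell in NumPy. The sketch follows the standard LSTM formulation (input gate, candidate state, forget gate, cell-state update, output gate, output); the parameter names, the dictionary layout, and the tiny layer sizes are assumptions for illustration, not the configuration used in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, a_prev, c_prev, params):
    """One forward step of a standard LSTM cell."""
    z = np.concatenate([a_prev, x_t])                    # previous output joined with current input
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])     # (1) input gate
    c_hat = np.tanh(params["W_c"] @ z + params["b_c"])   # (2) candidate cell state
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])     # (3) forget gate
    c_t = f_t * c_prev + i_t * c_hat                     # (4) new cell state
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])     # (5) output gate
    a_t = o_t * np.tanh(c_t)                             # (6) cell output
    return a_t, c_t

# Hypothetical sizes and random weights, only to exercise the function.
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
params = {}
for gate in "icfo":
    params[f"W_{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in))
    params[f"b_{gate}"] = np.zeros(n_hidden)
a_t, c_t = lstm_step(rng.normal(size=n_in), np.zeros(n_hidden), np.zeros(n_hidden), params)
```

Because the output is a sigmoid gate multiplied by a hyperbolic tangent, every component of `a_t` lies strictly between -1 and 1.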
In the above illustration, the model takes as input the first window, containing the first 1000 AS signal samples. The next window starts from the second signal sample of the first window, i.e., it spans the 2nd to the 1001st sample. This process is repeated until all time-domain input signals are windowed and fed to the LSTM model. The process flow of AC fault prediction using LSTM is shown in Fig. 12.
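The windowing just described can be sketched as follows. The function name `sliding_windows` is hypothetical, and a short ramp signal stands in for a real recording so that the window contents are easy to check.

```python
import numpy as np

def sliding_windows(signal, w=1000, step=1):
    """Cut a 1-D signal into overlapping windows of length w, advancing by step samples."""
    signal = np.asarray(signal)
    if len(signal) < w:
        raise ValueError("signal is shorter than the window length")
    n_windows = (len(signal) - w) // step + 1
    return np.stack([signal[k * step : k * step + w] for k in range(n_windows)])

# Ramp signal of 1005 samples: sample k simply holds the value k.
demo = np.arange(1005)
windows = sliding_windows(demo, w=1000, step=1)
```

With a step of one sample, a full 50,000-sample recording would produce 49,001 windows, so a view-based or batched implementation is preferable in practice to keep memory use manageable.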
The system initially imports the dataset, consisting of 1800 AS signals captured at a 50,000 Hz sampling frequency. To execute the sequence classification task, the dataset is split into two sub-datasets, i.e., training and testing sets. The training dataset is used to train the model, and the testing dataset is used to evaluate the model performance. This helps in understanding the characteristics of the trained model and provides scope for minimizing the effects of overfitting and underfitting. The dataset is split with a ratio of 80%-20% for training and testing, respectively. Therefore, the training dataset consists of 1440 AS signal samples, and the testing dataset consists of 360 AS signal samples. The study further considers feature selection, where descriptive statistics are analyzed in the time domain. In the training phase, a sliding window operation is carried out to represent the AS signals as fixed-size frames, which are further processed by the LSTM layer. Its output is then combined with that of a dense layer that takes the descriptive statistics as input.
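The 80%-20% split can be reproduced with a shuffled index split, as in the sketch below. The function name, the random shuffling, and the seed are assumptions; the paper does not state whether the split was shuffled or stratified by fault class.

```python
import numpy as np

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) from a random permutation of sample indices."""
    rng = np.random.default_rng(seed)   # fixed seed, an assumption for repeatability
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

train_idx, test_idx = train_test_split_indices(1800)
```

For 1800 recordings this yields 1440 training indices and 360 testing indices, matching the counts given above.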
The adoption of a dense layer enhances the generalization of the learning model and minimizes the issue of overfitting and underfitting during the learning process. In the proposed LSTM architecture, the Adam optimizer is used with a categorical cross-entropy loss function to reduce the training loss by adjusting learning attributes such as weights, biases, and learning rate. The configuration details of the LSTM model implemented in this study are given in Table IV. Moreover, Softmax activation is used at the output layer of the LSTM model, as it is designed to address multiclass sequence classification problems (i.e., multiple AC faults). A similar procedure is carried out during model testing, and its effectiveness is assessed in terms of accuracy, precision, recall, and F1-score. Fig. 13 illustrates the learning curve of the LSTM model training performance over 1000 epochs. As the epochs pass, the learning rate of the LSTM also decreases. Generally, the learning rate is reduced when the error has not decreased for more than 5 epochs. The learning rate drops rapidly because the error is not converging quickly enough in the LSTM. A sharp exponential decrease in the learning rate can be observed in the LSTM model. When evaluated with the testing dataset, the model achieved an accuracy of 83.12% in AC fault classification. The following section discusses the proposed learning model based on a multilayer perceptron neural network.
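The learning-rate behaviour described above, a reduction whenever the error has not improved for more than 5 epochs, can be sketched in plain Python. Only the patience of 5 epochs comes from the text; the initial rate, the reduction factor of 0.5, and the floor value are assumed for illustration.

```python
def reduce_on_plateau(losses, lr=1e-3, patience=5, factor=0.5, min_lr=1e-6):
    """Return the learning rate used at each epoch for a given loss history.

    The rate is multiplied by `factor` once the loss has failed to improve
    for more than `patience` consecutive epochs.
    """
    best = float("inf")
    stale = 0
    history = []
    for loss in losses:
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale > patience:
                lr = max(lr * factor, min_lr)
                stale = 0   # start counting again after a reduction
        history.append(lr)
    return history

# A loss that never improves after the first epoch triggers one reduction.
rates = reduce_on_plateau([1.0] * 7)
```

Repeated plateaus multiply the rate by the same factor each time, which produces the exponential decay pattern noted for the LSTM learning curve.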

V. PROPOSED LEARNING MODEL USING MULTILAYER PERCEPTRON
The neural network consists of artificial neurons interconnected by synaptic weights to form a network. Each neuron is modeled by a linear threshold unit, which maps its inputs to a single output using the mathematical operation described as follows:

$$y = \varphi\left(\sum_{i=1}^{n} w_i x_i\right)$$

where $y$ denotes the output of the neuron, $w_i$ indicates the synaptic weight, $i \in \{1, 2, \dots, n\}$, $x_i$ is the input, and $\varphi$ indicates a threshold function. A non-linear $\varphi(x)$ can be a sigmoid function or a hyperbolic tangent function.
A multilayer perceptron (MLP) is a class of NN. In an MLP, the signal travels only in the forward direction; numerically, it can be represented as follows:

$$y = \varphi\left(W_o h + \theta_o\right)$$
$$h = \varphi\left(W_h x + \theta_h\right)$$

where $y$ is an $m \times 1$ vector of the outputs of the neurons at the output layer; $h$ is a $q \times 1$ vector of the outputs of the neurons at the hidden layer; $x$ is an $n \times 1$ feature vector of the input signal; $\theta_o$ and $\theta_h$ are the threshold vectors for the neurons at the output and hidden layers, respectively; the size of $\theta_o$ is $m \times 1$ and that of $\theta_h$ is $q \times 1$; and $W_o$ and $W_h$ are matrices of size $m \times q$ and $q \times n$, respectively, holding the synaptic weights that connect the hidden layer to the output layer and the input layer to the hidden layer. The nonlinearity $\varphi$ is taken to be a sigmoid function, i.e.,

$$\varphi(z) = \frac{1}{1 + e^{-z}}$$

The unknown parameters $W_o$, $W_h$, $\theta_o$, and $\theta_h$ can be determined by reducing an error criterion such as the sum of squared errors:

$$E = \sum_{p=1}^{N} \lVert d_p - y_p \rVert^2$$

where $d_p$ indicates the expected outputs that the MLP is required to learn, $p \in \{1, 2, \dots, N\}$, and $N$ indicates the total number of instances.
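The forward pass of a single-hidden-layer MLP with sigmoid units, together with a squared-error criterion, can be sketched as follows. The variable names and the sum-of-squares form of the error are assumptions made for the example; the layer sizes are arbitrary.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W_h, theta_h, W_o, theta_o):
    """Forward pass: input -> hidden layer -> output layer, sigmoid at both layers."""
    h = sigmoid(W_h @ x + theta_h)   # hidden-layer outputs
    y = sigmoid(W_o @ h + theta_o)   # output-layer outputs
    return h, y

def squared_error(targets, outputs):
    """Sum of squared differences between expected and actual outputs."""
    return float(np.sum((np.asarray(targets) - np.asarray(outputs)) ** 2))

# Arbitrary sizes: 3 inputs, 4 hidden neurons, 2 outputs, all weights zero.
x = np.array([0.2, -0.1, 0.7])
h, y = mlp_forward(x, np.zeros((4, 3)), np.zeros(4), np.zeros((2, 4)), np.zeros(2))
err = squared_error([1.0, 0.0], y)
```

Training then amounts to adjusting the weights and thresholds, typically by backpropagation, so that this error shrinks over the training instances.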
The proposed system implements MLP to classify frequency domain AS signals to predict AC faults because MLP can address complex non-linear problems. It works with both large and small input data and offers quick prediction after training. All these factors are highly significant to the real-time scenario. Although the LSTM is suitable for time sequence data prediction, it is prone to computational overhead and sometimes overfitting problems. The architecture of the proposed AC fault classification system using MLP is shown in Fig. 14.
The MLP performs better when the AS signal is presented to it in a more informative form; as observed in the data exploration, the frequency content of the audio samples provides better insight into the AS signals representing the faults. Hence, in the proposed study, the frequency-domain AS signal is used to train the MLP model to obtain better accuracy in classifying the fault classes. The proposed model is composed of two core modules: i) an adaptive filter and ii) an MLP module. The adaptive filter acts as a frequency-domain bandpass filter, also known as a digital filter, which restricts some frequencies from being given as input to the MLP. The functional process of the implemented digital filter with the MLP is shown in Fig. 15. In a real-time scenario, AS-generated signals are often affected by noisy environmental factors and contain recurring or redundant frequencies. Since the proposed learning model takes its input in the frequency domain, it is essential to ensure that the input frequency-domain AS signal is free of irrelevant factors in order to achieve higher accuracy in the output (O/P). The adaptive digital filter restricts the irrelevant and redundant frequencies before the signal is introduced to the MLP. As a result, the reduction in the number of input frequency-domain signal samples x(k) reduces computational complexity and feature-space complexity by removing irrelevant frequency-domain AS signal features. The information processed by the filter is a more precise input, which gives the MLP better generalization ability in the training phase. The architecture of the MLP for AC fault prediction is shown in Fig. 16.
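The effect of a frequency-domain bandpass mask on the feature vector can be sketched as below. This is a simplified stand-in with a fixed passband, whereas the filter described in the paper is adapted by the optimizer; the function names and the toy 0 to 10 Hz spectrum are assumptions.

```python
import numpy as np

def bandpass_mask(freqs, low_hz, high_hz):
    """Return 1.0 for bins inside [low_hz, high_hz] and 0.0 elsewhere."""
    return ((freqs >= low_hz) & (freqs <= high_hz)).astype(float)

def apply_mask(magnitude, mask, drop_zeros=True):
    """Multiply the spectrum by the mask; optionally keep only the passed bins."""
    masked = magnitude * mask
    return masked[mask > 0] if drop_zeros else masked

# Toy spectrum with 11 bins at 0..10 Hz; magnitude equals the bin frequency.
freqs = np.arange(11, dtype=float)
magnitude = freqs.copy()
mask = bandpass_mask(freqs, low_hz=2.5, high_hz=6.5)
features = apply_mask(magnitude, mask)
```

Dropping the masked-out bins shrinks the input layer of the MLP, which is where the saving in trainable parameters comes from.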
The MLP architecture proposed in the current study consists of a single input layer that receives the frequency-domain sensor signal, covering frequencies up to 25,000 Hz (the Nyquist frequency), which is mapped to the output classes at the output layer via dense hidden layers. Since the time-domain AS signal is transformed into a frequency-domain signal, the theoretical maximum frequency that can be detected using the FFT is always half of the sampling frequency. In the current study, the AS signal sampling frequency is 50 kHz, so the Nyquist frequency after transforming to the frequency domain using the FFT is 25 kHz. In the proposed MLP architecture, a linear activation function is used at each hidden layer. In the output layer, SoftMax activation is used to handle the prediction of multiclass AC faults. The output layer therefore contains 8 neurons, one for each of the 8 output classes. The SoftMax function ensures that the sum of all outputs is always 1; the class with the maximum output is then selected as the final output with the help of the argmax function. A common optimizer is used for both the ANN and the filter. The optimizer sets h(n), known as the filter's impulse response.
[Fig. 17]
The process flow of AC fault prediction using the MLP is shown in Fig. 18. The system initially imports the dataset, consisting of 1800 AS signals captured at a 50,000 Hz sampling frequency. To carry out the classification task, the dataset is split into two subsets, i.e., training and testing sets. The training dataset is used to train the model, and the testing dataset is used to evaluate the model performance. The input AS signal is converted to the frequency domain using the FFT. In the present study, since the sampling frequency is 50 kHz, the Nyquist frequency of the AS signal is 25 kHz, which is the theoretical maximum frequency of the FFT.
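The output stage described above (dense layers with linear activation, SoftMax over 8 classes, then argmax) can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the trained model from the study: the layer sizes, the random weights, and the names `softmax` and `mlp_forward` are hypothetical, and only the 8 output classes and the activation choices come from the text.

```python
import numpy as np

NUM_CLASSES = 8  # from the study: 8 output neurons, one per output class


def softmax(z):
    """Numerically stable SoftMax over the last axis."""
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def mlp_forward(x, weights, biases):
    """Forward pass: linear activation on hidden layers, SoftMax on output."""
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = h @ w + b  # dense layer with linear (identity) activation
    logits = h @ weights[-1] + biases[-1]
    return softmax(logits)


rng = np.random.default_rng(0)
n_features, n_hidden = 64, 16  # assumed sizes, for illustration only
weights = [
    rng.normal(scale=0.1, size=(n_features, n_hidden)),
    rng.normal(scale=0.1, size=(n_hidden, NUM_CLASSES)),
]
biases = [np.zeros(n_hidden), np.zeros(NUM_CLASSES)]

batch = rng.normal(size=(4, n_features))  # stand-in for filtered spectra
probs = mlp_forward(batch, weights, biases)
predicted_class = np.argmax(probs, axis=1)  # final class per sample
```

Each row of `probs` sums to 1, so argmax simply picks the class to which the network assigns the highest probability.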
Next, descriptive statistics of the frequency-domain AS signals are computed, and the signals are processed with a frequency-domain bandpass adaptive filter. As a result, redundant frequencies are removed from the input AS signals. The filtered AS signal is then fed to the MLP, where training is carried out using linear activation functions at each dense layer. After training, the testing dataset is used to evaluate the model. Fig. 19 illustrates the learning curve of the MLP model's training performance over 1000 epochs.
In Fig. 19, the learning curve shows that the learning rate of the MLP decreases more slowly than that of the LSTM. Because the learning rate is lowered only when the training error stops improving, this indicates that the error keeps reducing during training and that the MLP generalizes effectively. The next section discusses the performance metrics considered for the performance analysis of the proposed learning model.
The outcome and performance of both learning models are evaluated with respect to multiple performance parameters: accuracy, precision, recall, and F1 score.
Accuracy (A): Accuracy is the ratio of correct predictions to the total number of predictions. In the context of the current case study, accuracy can be described as A = (number of correct predictions) / (total number of predictions).
Precision (P): Precision is the ratio of the number of correct predictions for a fault class to the total number of predictions made for that fault class.
Recall (R): Recall is the ratio of correctly predicted values to the number of expected faults, i.e., the total number of samples of that fault class present in the test dataset. A lower recall represents the inability of the system to detect the particular class. As with precision, the weighted average is taken for recall.
F1 score: This performance metric is the harmonic mean of precision and recall, and it summarizes the balance between the two in a single value.
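The four metrics defined above can be computed from predicted and true class labels as in the following sketch. It assumes NumPy and small made-up label vectors; the function name `weighted_metrics` is hypothetical. Precision and recall are averaged over classes with weights proportional to the number of true samples per class, and F1 is taken as the harmonic mean of those two averages, which is how the definition above reads. Averaging per-class F1 scores is another common convention and can give a slightly different value.

```python
import numpy as np


def weighted_metrics(y_true, y_pred, num_classes):
    """Accuracy plus support-weighted precision, recall, and their harmonic mean."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accuracy = float(np.mean(y_true == y_pred))

    precision = np.zeros(num_classes)
    recall = np.zeros(num_classes)
    support = np.zeros(num_classes)
    for c in range(num_classes):
        true_pos = np.sum((y_true == c) & (y_pred == c))
        n_predicted = np.sum(y_pred == c)  # predictions made for class c
        n_actual = np.sum(y_true == c)     # samples that really are class c
        precision[c] = true_pos / n_predicted if n_predicted else 0.0
        recall[c] = true_pos / n_actual if n_actual else 0.0
        support[c] = n_actual

    weights = support / support.sum()
    p = float(np.dot(weights, precision))
    r = float(np.dot(weights, recall))
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return {"accuracy": accuracy, "precision": p, "recall": r, "f1": f1}


# Made-up labels for three classes, for illustration only.
metrics = weighted_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
print(metrics)
```

Note that support-weighted recall always equals accuracy, so under this averaging the two values differ only if another averaging scheme is used for recall.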

VII. RESULTS AND PERFORMANCE ANALYSIS
This section presents the outcomes obtained from both implemented learning models, analyzes their performance, and discusses air compressor fault classification using acoustic sensor signals. The entire modelling and development of the proposed system is carried out using Python.

A. Analysis of Learning Rate
This comparative analysis considers the reduction in learning rate to assess the training performance of both the LSTM and the proposed learning model. Fig. 20 compares the implemented LSTM and the proposed MLP with respect to learning rate. The learning curve trend shows that, at the beginning, the proposed MLP takes a little longer than the LSTM to reduce the learning rate. However, the proposed MLP model maintains a significant reduction in error during the initial stage of the training process. This indicates that the proposed method optimizes and generalizes better than the LSTM. Note that the larger the area between the curves, the greater the improvement, and that the learning rate is reduced if the training error does not decrease for five consecutive epochs. The learning rate of the proposed model stays higher, which signifies that the proposed MLP learns faster than the LSTM.
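The rule stated above, that the learning rate is reduced when the training error has not decreased for five consecutive epochs, can be sketched as follows. Only the patience of five epochs comes from the text; the reduction factor of 0.5, the initial rate of 0.01, the error sequence, and the function name `reduce_on_plateau` are assumptions made for illustration.

```python
def reduce_on_plateau(errors, initial_lr, patience=5, factor=0.5):
    """Return the learning rate used at each epoch.

    The rate is multiplied by `factor` whenever the error has failed to
    improve on its best value for `patience` consecutive epochs.
    """
    lr = initial_lr
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    schedule = []
    for err in errors:
        schedule.append(lr)
        if err < best:
            best = err
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                lr *= factor
                stale = 0
    return schedule


# Made-up training errors: improvement, a five-epoch plateau, then improvement.
errors = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.6]
schedule = reduce_on_plateau(errors, initial_lr=0.01)
print(schedule)
```

Under such a rule, a learning rate that stays high for longer means the error kept improving, which is the reading applied to Fig. 20 in this section.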

B. Analysis of Classification Performance
The performance analysis is carried out considering multiple evaluation metrics, because accuracy alone is not always a good measure of performance, especially for learning models, and may not represent the performance of the system completely. If the model correctly predicts the fault classes, it achieves a high accuracy even if it cannot predict the negative (healthy) cases. Therefore, the performance of the implemented learning models, i.e., the LSTM and the proposed MLP, is also evaluated with two further metrics, precision and recall. However, there is always a trade-off between precision and recall, because precision focuses on the exactness of the learning model, whereas recall focuses on its completeness. For example, if the model recognizes the air compressor as faulty when there are no faults in the air compressor, it has a lower precision, which indicates many false positives or a bias in the air compressor fault predictions. A model with a higher precision score has a low false-positive rate. A low recall indicates many false negatives, and a high recall indicates few false negatives in the prediction result. Since precision represents the correctness of the positive predictions and recall represents how many of the actual positives are detected, the model should be built to balance both. To measure the balance between precision and recall, the F1_score metric is evaluated, which is the harmonic mean of precision and recall. The harmonic mean is used instead of the arithmetic mean because it is dominated by the smaller of the two values, so a model cannot score well unless both precision and recall are high. Table V presents the numerical outcome obtained, followed by the graphical outcome in Fig. 21, to evaluate the implemented learning models.
In Fig. 21, the comparative analysis shows that the proposed learning model outperforms the LSTM in all performance metrics. The LSTM achieved an accuracy of 83.12%, whereas the proposed model achieved 91.32%. For precision, the LSTM scored 86.34% and the proposed model attained 96.23%. The proposed model achieved a recall of 89.23%, whereas the LSTM achieved 81.54%. The proposed model also has a higher F1_score than the LSTM, i.e., 92.60% and 83.87%, respectively. Based on these observations, the LSTM is biased to some extent towards the faulty results. This comes as no surprise, since the data contains far fewer healthy signal samples than faulty ones. The proposed method also shows a difference of approximately 7% between precision and recall. This is within acceptable limits, which indicates that the proposed model is better at detecting air compressor faults. Even so, the system is more reliable when precision and recall are balanced.
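As a quick consistency check on the figures quoted above, the F1_score can be recomputed as the harmonic mean of the reported precision and recall. The sketch below uses only the percentages stated in this section; the helper name `f1_from` is hypothetical.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Percentages as reported in this section.
reported = {
    "LSTM": {"precision": 86.34, "recall": 81.54, "f1": 83.87},
    "MLP": {"precision": 96.23, "recall": 89.23, "f1": 92.60},
}

for name, m in reported.items():
    computed = f1_from(m["precision"], m["recall"])
    gap = m["precision"] - m["recall"]
    print(
        f"{name}: F1 = {computed:.2f} (reported {m['f1']:.2f}), "
        f"precision - recall = {gap:.2f}"
    )
```

Both recomputed values agree with the reported F1_scores to two decimal places, and the gap between precision and recall is about 7 percentage points for the MLP and about 4.8 for the LSTM.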

VIII. CONCLUSION
The proposed study aimed to predict different types of air compressor faults. The analysis was carried out using sensor signals captured from the acoustic sensors mounted on the air compressor. Data visualization and exploratory analysis were carried out to characterize the signal features and faults in the time and frequency domains. The study is concerned with two aspects of the classification process. The first is the classification of air compressor faults using the LSTM learning model, where the time-domain signal is used as input. The second is the proposed MLP learning model, where the frequency-domain signal is used with a digital filter. The results indicate that the proposed learning model outperforms the LSTM in accuracy, precision, recall, and F1_score; the LSTM and the MLP achieved accuracies of 83.12% and 91.32%, respectively. The learning performance of both models was also evaluated, and the analysis showed that the proposed MLP requires less training time than the LSTM. Therefore, the proposed learning model can be claimed to be efficient and suitable for real-time implementation: it has less training time, does not suffer from feature extraction problems, has less memory overhead, and has good generalization ability because the preciseness of the input signal leads to higher accuracy.