Hybrid Fault Diagnosis Method based on Wavelet Packet Energy Spectrum and SSA-SVM

Abstract: As one of the key components of mechanical equipment, the rolling bearing is widely used, and its running state affects the safety and performance of the equipment. To enhance the fault feature information in the bearing signal and improve the classification accuracy of the support vector machine (SVM), a hybrid fault diagnosis method based on the wavelet packet energy spectrum (WPES) and SSA-SVM is proposed. First, wavelet packet decomposition is applied to the vibration signals to generate the frequency band energy spectrum, from which the bearing fault features are extracted and enhanced. Second, the penalty and kernel parameters of the SVM are optimized globally by the sparrow search algorithm (SSA) to improve its classification accuracy, yielding the WPES-SSA-SVM model. Finally, the proposed model is used to diagnose measured signals. Comparison with BP, ELM and SVM verifies the effectiveness and superiority of the proposed method.


I. INTRODUCTION
With the deep integration of new-generation information technology and the manufacturing industry, mechanical equipment is becoming increasingly complex, precise and intelligent. As equipment runs continuously, its running state and key parts gradually degrade, and the probability of failure and shutdown increases, which affects normal production. As one of the key components of machinery, rolling bearings are widely used because of their convenient use and maintenance, reliable operation and good starting performance [1]. Bearings transform the sliding friction between parts into rolling friction, which improves the production efficiency of the equipment. Once a bearing is damaged, the operation of the equipment is disturbed, working efficiency is reduced, and rotating machinery may even fail, resulting in serious economic losses and casualties [2][3]. Therefore, it is of great practical value to detect rolling bearing faults in time and take corresponding measures, and this has become a research hotspot in intelligent fault diagnosis.
In recent years, fault diagnosis methods for rolling bearings have emerged and developed continuously [4][5][6][7]. In general, fault diagnosis techniques for rolling bearings are based on the vibration signal [8], acoustic signal [9], electrical signal [10] or temperature signal [11]. Among them, the vibration signal is the most widely used because it is intuitive, simple to acquire, and best represents the fault characteristic information during bearing operation.
With the rapid and continuous development of machine learning and artificial intelligence, more and more researchers combine bearing fault diagnosis with these techniques, and intelligent fault diagnosis methods and systems are gradually improving. Common fault identification methods include deep learning (DL) [12], artificial neural networks (ANN) [13], decision trees (DT) [14] and support vector machines (SVM) [15,16]. Literature [17] proposed an improved BP neural network based on the Levenberg-Marquardt algorithm to improve the diagnostic efficiency of the BP neural network. Literature [18] proposed a fault extraction method based on modified Fourier mode decomposition (MFMD) and multi-scale displacement entropy, combined with a BP neural network; experiments show that this method has high recognition accuracy for different types of faults. In literature [19], wavelet packet energy and a decision tree algorithm are combined: fault features are extracted using wavelet packet energy, and faults are then identified and classified using the decision tree model. In view of the low fault diagnosis rate of rolling bearings, literature [20] proposed a method combining wavelet packet decomposition and the gradient boosting decision tree (GBDT), in which the extracted fault feature data set is input into the GBDT classification model for fault diagnosis. In literature [21], scale invariant feature transform (SIFT) and kernel principal component analysis (KPCA) were used to extract fault features, and an SVM classifier was used to achieve fault classification. Literature [22] applied SVM to fault state identification of rolling bearings and achieved good results. Literature [23] proposed a rolling bearing fault diagnosis method in which SVM is optimized by a simplex evolutionary algorithm. Literature [24] diagnoses fault types by reducing high-dimensional data and using LSSVM.
At present, various intelligent optimization algorithms have emerged one after another, such as particle swarm optimization (PSO), the whale optimization algorithm (WOA), ant colony optimization (ACO), the genetic algorithm (GA) and the sparrow search algorithm (SSA), and their combination with, and improvement of, other algorithms have also achieved good results [25]. In reference [26], PSO was used to optimize SVM to identify multiple fault states of rolling bearings. In [27], the gray wolf optimization algorithm (GWO) was used to optimize the kernel function parameters of SVM globally, so as to achieve the best classification performance of SVM and improve recognition accuracy. To address the influence of mixed noise in bearing vibration signals on the extraction of useful information, literature [28] proposed an optimized classifier based on multi-scale permutation entropy and the cuckoo search algorithm (CS), which used CS to search for the globally optimal SVM parameters. Literature [29] proposed a method based on the quantum-behaved particle swarm optimization algorithm (QPSO), multi-scale displacement entropy and SVM to construct fault feature sets and identify rolling bearing faults. Compared with single methods, combined optimization methods achieve higher accuracy, but different optimization methods have different problems. For example, a BP model must be trained on a large amount of sample data; even if the BP network parameters have been optimized globally by an optimization algorithm, the model is still not ideal in a small-sample environment. SVM parameters can be optimized by PSO and other optimization algorithms to improve classification accuracy, but such algorithms are prone to falling into local extrema.
www.ijacsa.thesai.org
Therefore, combining the advantages of individual algorithms and applying them jointly to improve the effectiveness of rolling bearing state identification and fault diagnosis is the current research trend.
To improve the accuracy of bearing fault diagnosis, this paper first uses the wavelet packet energy spectrum to extract energy spectrum feature vectors from bearing vibration signals, which serve as the input of the SVM. Meanwhile, SSA is used to optimize the parameters of the SVM globally, so as to build a hybrid model. The feasibility and effectiveness of the model are verified by experiments.
The rest of this paper is organized as follows. Section 2 presents the preliminaries. Section 3 describes the proposed method. Section 4 details the experimental setup. Section 5 analyzes and discusses the experimental results. Finally, Section 6 outlines the main conclusions.

II. PRELIMINARIES
A. Wavelet Packet Energy Spectrum
Wavelet packet decomposition can decompose a signal into different frequency bands, without leakage or overlap, at any desired time-frequency resolution. After the wavelet packet transform, the information is intact and all frequencies are retained, which provides favorable conditions for extracting the main information in the signal. The decomposition can be repeated as many times as needed to reach the desired frequency resolution. Fig. 1 shows the schematic diagram of the orthogonal wavelet packet decomposition of a signal. The original signal is denoted as $S$, and the two sub-bands $S_{1,0}$ and $S_{1,1}$ of layer 1 are obtained after wavelet packet decomposition through filters H and G. Decomposing the two sub-bands of the first layer in turn yields the four sub-bands $S_{2,0}$, $S_{2,1}$, $S_{2,2}$ and $S_{2,3}$ of the second layer. By analogy, the sub-bands $S_{n,j}$ ($j = 0, 1, \ldots, 2^n - 1$) of layer $n$ are finally obtained.
As can be seen from Fig. 1, wavelet packet decomposition splits every frequency band again at each level, including the high-frequency part that ordinary wavelet decomposition leaves unsubdivided. In addition, according to the characteristics of the signal to be decomposed, the corresponding sub-frequency bands can be selected adaptively to match the frequency spectrum of the signal. After wavelet packet decomposition, all the characteristic information, in both the low-frequency and the high-frequency parts, is preserved, which provides strong support for extracting the feature information of the signal.
It can also be seen from Fig. 1 that the decomposition cannot be continued without limit: with too many decomposition layers, the dimension of the data to be processed grows rapidly. In practical applications, an appropriate decomposition level must be selected according to the actual situation.
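As a rough guide to how the decomposition level trades frequency resolution against feature dimension, the following worked example assumes an idealized decomposition in which each level splits every band exactly in half. The symbols $f_s$ (sampling frequency) and $\Delta f$ (sub-band width) are introduced here for illustration; the values 48 kHz and 8 sub-bands are the ones used later in the experiments.

```latex
\[
\Delta f = \frac{f_s / 2}{2^{n}} = \frac{f_s}{2^{n+1}},
\qquad
f_s = 48\ \text{kHz},\ n = 3
\ \Rightarrow\
2^{3} = 8 \ \text{sub-bands},\quad
\Delta f = \frac{48\ \text{kHz}}{16} = 3\ \text{kHz}.
\]
```

Each additional level doubles the number of sub-bands (and hence the feature dimension) while halving $\Delta f$. Real wavelet filters are not ideal, so adjacent bands overlap somewhat in practice.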
The wavelet packet energy spectrum improves on the stability of the raw wavelet packet decomposition coefficients by using the energy of each sub-band to construct the feature vector. The wavelet packet frequency band energy is defined as follows. Decomposing the original signal to level $n$ with the wavelet packet yields $2^n$ sub-frequency bands. The energy of sub-frequency band $j$ is given by Formula 1.
$$E_{n,j} = \sum_{k=1}^{N} \left| d_{n,j}(k) \right|^{2} \quad (1)$$
where $d_{n,j}(k)$ is the $k$-th coefficient of sub-frequency band $j$, $N$ is the number of coefficients in the band, and $j = 0, 1, \ldots, 2^n - 1$.
Therefore, the wavelet packet frequency band energy spectrum is defined as Formula 2.
$$E = \left[ E_{n,0},\ E_{n,1},\ \ldots,\ E_{n,2^n-1} \right] \quad (2)$$
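The band-energy computation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in this paper: it uses the Haar wavelet for brevity (the experiments use db3), the function names `haar_step` and `wavelet_packet_energy` are hypothetical, and the test signal is synthetic.

```python
import math


def haar_step(x):
    """One Haar analysis step: return (approximation, detail), each half as long."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_energy(signal, level):
    """Return the 2**level sub-band energies (sum of squared coefficients)."""
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            # Unlike the plain wavelet transform, BOTH outputs are split again.
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    # Nodes are in natural (filter-bank) order, not sorted by frequency.
    return [sum(c * c for c in node) for node in nodes]


# Synthetic test signal (assumption): two sinusoids, 64 samples.
signal = [
    math.sin(2 * math.pi * 3 * k / 64) + 0.5 * math.sin(2 * math.pi * 20 * k / 64)
    for k in range(64)
]
energies = wavelet_packet_energy(signal, level=3)
print(len(energies), "sub-band energies")
```

Because the Haar filters are orthonormal, the band energies add up to the energy of the original signal, which is a convenient sanity check on any implementation of the energy spectrum.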
B. Support Vector Machines
SVM is a machine learning algorithm based on statistical learning theory that can successfully handle many data mining problems such as pattern recognition, classification and regression analysis. It shows many unique advantages in solving small-sample, nonlinear and high-dimensional pattern recognition problems, and largely overcomes the problems of the curse of dimensionality and over-learning. Based on the structural risk minimization principle, the support vector machine maximizes the distance between the hyperplane and the samples closest to it. Its core is to establish the optimal classification hyperplane, so as to improve the generalization ability of the learning machine.
Taking binary classification as an example, the basic idea can be summarized as follows. First, the input vector is mapped to a high-dimensional feature space through a preselected nonlinear mapping, such as a kernel function. Then the optimal classification hyperplane is sought in the feature space, so that it separates the two classes of data points as correctly as possible while keeping the separated points as far as possible from the classification surface, as shown in Fig. 2.
In Fig. 2, the squares and triangles represent the two types of samples. $H$ is the optimal classification hyperplane; $H_1$ and $H_2$ are the lines that pass through the boundary points of the two types of samples and are parallel to $H$, and the distance $\gamma$ between them is the margin. The optimal classification line must not only classify the two categories correctly but also maximize the margin. The vectors closest to the optimal classification hyperplane are called support vectors.
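Assume the training sample set $T = \{(x_i, y_i)\}_{i=1}^{l}$ with $x_i \in \mathbb{R}^{d}$ and $y_i \in \{-1, +1\}$, where $x_i$ is the input, $y_i$ is the output, $l$ is the number of samples, and $d$ is the feature dimension of the sample. In the linearly separable case, there is a hyperplane that separates the two types of samples completely, as shown in Formula 3.
$$w \cdot x + b = 0 \quad (3)$$
where $w = (w_1, w_2, \ldots, w_d)$ is the weight vector, which determines the direction of the hyperplane; $x$ is the input vector; and $b$ is the bias, which determines the distance between the hyperplane and the origin.
Solving for the optimal classification hyperplane means finding the optimal $w$ and $b$, which can be summed up as the following quadratic programming problem:
$$\min_{w,b}\ \frac{1}{2}\|w\|^{2} \quad \text{s.t.}\quad y_i\,(w \cdot x_i + b) \ge 1,\quad i = 1, \ldots, l \quad (4)$$
In order to solve the quadratic programming problem of Formula 4, the Lagrange function is introduced and the duality principle is used to transform the original optimization problem into Formula 5:
$$\max_{\alpha}\ \sum_{i=1}^{l} \alpha_i - \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j\,(x_i \cdot x_j) \quad \text{s.t.}\quad \sum_{i=1}^{l} \alpha_i y_i = 0,\quad \alpha_i \ge 0 \quad (5)$$
According to Formula 5, the optimal solution is $\alpha^{*} = (\alpha_1^{*}, \ldots, \alpha_l^{*})^{T}$. Thus, the optimal weight vector $w^{*}$ and the optimal bias $b^{*}$ can be calculated by Formula 6 and Formula 7.
$$w^{*} = \sum_{i=1}^{l} \alpha_i^{*} y_i x_i \quad (6)$$
$$b^{*} = y_j - \sum_{i=1}^{l} \alpha_i^{*} y_i\,(x_i \cdot x_j) \quad (7)$$
where $x_j$ is any support vector, that is, a sample with $\alpha_j^{*} > 0$.
Then the optimal classification hyperplane is $w^{*} \cdot x + b^{*} = 0$, and the optimal classification function is obtained:
$$f(x) = \operatorname{sgn}\left(w^{*} \cdot x + b^{*}\right) = \operatorname{sgn}\left(\sum_{i=1}^{l} \alpha_i^{*} y_i\,(x_i \cdot x) + b^{*}\right)$$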
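To make Formula 6 and Formula 7 concrete, the following sketch recovers the weight vector and bias from a dual solution on a two-sample toy problem. The data and the multipliers `alpha` are assumptions chosen for illustration: the multipliers were worked out by hand for this toy set, not produced by a solver, and the formulas are used in their standard textbook form.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Toy training set (assumption): one sample per class.
X = [(2.0, 2.0), (0.0, 0.0)]
y = [1, -1]

# Dual solution for this toy set, derived by hand; both samples are support vectors.
alpha = [0.25, 0.25]

# Formula 6: w* = sum_i alpha_i* y_i x_i
w = [sum(a * yi * xi[k] for a, yi, xi in zip(alpha, y, X)) for k in range(2)]

# Formula 7 with support vector j = 0: b* = y_j - sum_i alpha_i* y_i (x_i . x_j)
j = 0
b = y[j] - sum(a * yi * dot(xi, X[j]) for a, yi, xi in zip(alpha, y, X))


def classify(x):
    """Optimal classification function: the sign of w* . x + b*."""
    return 1 if dot(w, x) + b >= 0 else -1


print("w* =", w, "b* =", b)
print(classify((3.0, 3.0)), classify((0.0, 1.0)))
```

Both training samples satisfy $y_i (w^{*} \cdot x_i + b^{*}) = 1$, so they lie exactly on the margin boundaries $H_1$ and $H_2$.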
C. Sparrow Search Algorithm
SSA performs optimization based on the idea that swarm organisms in nature can obtain a better living environment through mutual cooperation [30]. The bionic principle is as follows: in order to obtain abundant food, the sparrow population is divided into explorers and followers during foraging. The explorers, which find abundant food sources, are responsible for providing the foraging area and the direction of the food sources for the population, and the followers are responsible for finding more food according to the locations provided by the explorers. At the same time, individual sparrows also monitor the behavior of other individuals and compete for food with peers that have a high food intake. When the population is in danger, it exhibits anti-predation behavior: sparrows on the edge of the group constantly adjust their positions to move closer to interior or adjacent partners to increase their own safety. The distribution of food in space can therefore be regarded as the value of a function over the search space, and the purpose of the sparrow search is to find the global optimum.
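The specific implementation process of the sparrow search algorithm is as follows. In the process of searching for food, the randomly generated position matrix $X$ of $n$ sparrows in the $d$-dimensional space is as follows:
$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix}$$
where $n$ represents the number of sparrows, $d$ represents the dimension of the variables of the problem to be optimized, and $x_{i,j}$ is the position of the $i$-th sparrow in the $j$-th dimension.
The fitness values are calculated and sorted to determine the explorers (finders) and followers (entrants), and 10% of the individuals are randomly selected as scouters. The current optimal individual position and the best fitness value are then obtained. For the first generation of sparrows, the initial optimum is obtained from
$$F_X = \begin{bmatrix} f([x_{1,1}\ x_{1,2}\ \cdots\ x_{1,d}]) \\ f([x_{2,1}\ x_{2,2}\ \cdots\ x_{2,d}]) \\ \vdots \\ f([x_{n,1}\ x_{n,2}\ \cdots\ x_{n,d}]) \end{bmatrix}$$
where $f$ represents the fitness value of an individual sparrow.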
During the iterative optimization process, the explorers in the sparrow population have two main tasks: looking for food and guiding the movement of the population. When the scouters sense danger, they alert the population, and the followers are guided to a safe area. The location of the explorers is updated as follows:
$$X_{i,j}^{t+1} = \begin{cases} X_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot T}\right), & R_2 < ST \\ X_{i,j}^{t} + Q \cdot L, & R_2 \ge ST \end{cases}$$
where $X_{i,j}^{t}$ represents the position of the $i$-th sparrow in the $j$-th dimension at generation $t$; $\alpha$ is a random number in the range (0, 1]; $T$ represents the maximum number of iterations; $Q$ is a random number that follows the normal distribution; $L$ represents a $1 \times d$ matrix in which each element is 1; and $R_2$ and $ST$ represent the alarm value and the alarm threshold respectively, with $R_2 \in [0, 1]$ and $ST \in [0.5, 1]$. When $R_2 < ST$, there are no predators around the foraging area and the explorers can forage extensively. Conversely, when $R_2 \ge ST$, some sparrows in the group have found predators and send danger warnings to the rest, so that all sparrows can quickly move to a safe area to forage.
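Followers search for food by monitoring and following the explorers with the highest fitness. According to the sorting principle, when $i > n/2$, the fitness value of the individual is low, and these followers need to search other locations to improve their fitness. Otherwise, the sparrow randomly finds a location near the current optimal location for feeding.
$$X_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{X_{worst}^{t} - X_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\ X_{P}^{t+1} + \left| X_{i,j}^{t} - X_{P}^{t+1} \right| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases}$$
where $X_{worst}^{t}$ represents the global worst position at iteration $t$; $X_{P}^{t+1}$ is the best position occupied by the explorers at generation $t+1$; and $A$ is a $1 \times d$ matrix with each element randomly set to 1 or -1, with $A^{+} = A^{T}(AA^{T})^{-1}$.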
Individual sparrows move toward the center of the group or toward other companions when they encounter danger during foraging. The position of these sparrows is updated according to Formula (14).
$$X_{i,j}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left| X_{i,j}^{t} - X_{best}^{t} \right|, & f_i > f_g \\ X_{i,j}^{t} + K \cdot \dfrac{\left| X_{i,j}^{t} - X_{worst}^{t} \right|}{(f_i - f_w) + \varepsilon}, & f_i = f_g \end{cases} \quad (14)$$
where $\beta$ is the step size control parameter, which follows the normal distribution with mean 0 and variance 1; $K$ is the moving direction of the sparrow, with values in the range [-1, 1]; $\varepsilon$ is a very small constant that avoids a zero denominator; $X_{best}^{t}$ represents the current global optimal location; $f_i$ represents the fitness value of sparrow $i$; and $f_w$ and $f_g$ represent the current worst and best fitness values, respectively.
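The three update rules above (explorers, followers, scouters) can be combined into a compact loop. The following is a minimal sketch under several assumptions rather than a reference implementation: it minimizes the objective (to maximize an accuracy, one would minimize its negative), it clips positions to the search bounds, and the default population size, role proportions and threshold are illustrative. The function name `sparrow_search` and the sphere test function are hypothetical.

```python
import math
import random


def sparrow_search(f, dim, lower, upper, n=20, t_max=50,
                   explorer_frac=0.2, scout_frac=0.1, st=0.8, seed=0):
    """Minimize f over [lower, upper]^dim; returns (best_x, best_f, history)."""
    rng = random.Random(seed)

    def clip(v):
        return min(max(v, lower), upper)

    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n)]
    fit = [f(x) for x in pop]
    n_exp = max(1, int(explorer_frac * n))
    n_scout = max(1, int(scout_frac * n))
    best_f = min(fit)
    best_x = pop[fit.index(best_f)][:]
    history = [best_f]

    for _ in range(t_max):
        # Sort so that index 0 holds the fittest sparrow (smallest f).
        order = sorted(range(n), key=lambda k: fit[k])
        pop = [pop[k] for k in order]
        fit = [fit[k] for k in order]
        worst_x = pop[-1][:]

        # Explorer update: shrink while safe (R2 < ST), otherwise random walk.
        r2 = rng.random()
        for i in range(n_exp):
            if r2 < st:
                alpha = 1.0 - rng.random()  # random number in (0, 1]
                scale = math.exp(-(i + 1) / (alpha * t_max))
                pop[i] = [clip(v * scale) for v in pop[i]]
            else:
                q = rng.gauss(0.0, 1.0)
                pop[i] = [clip(v + q) for v in pop[i]]
        xp = min(pop[:n_exp], key=f)[:]  # best explorer position

        # Follower update: the worse half flies elsewhere, the rest stay near xp.
        for i in range(n_exp, n):
            if i + 1 > n / 2:
                q = rng.gauss(0.0, 1.0)
                pop[i] = [clip(q * math.exp((w - v) / (i + 1) ** 2))
                          for v, w in zip(pop[i], worst_x)]
            else:
                a = [rng.choice((-1, 1)) for _ in range(dim)]
                # For a 1 x d matrix A with +/-1 entries, A+ reduces to A^T / d.
                step = sum(abs(v - p) * s for v, p, s in zip(pop[i], xp, a)) / dim
                pop[i] = [clip(p + step) for p in xp]
        fit = [f(x) for x in pop]

        # Scouter update for a random subset of the population.
        f_g, f_w = min(fit), max(fit)
        x_best = pop[fit.index(f_g)][:]
        x_worst = pop[fit.index(f_w)][:]
        for i in rng.sample(range(n), n_scout):
            if fit[i] > f_g:
                beta = rng.gauss(0.0, 1.0)
                pop[i] = [clip(b + beta * abs(v - b)) for v, b in zip(pop[i], x_best)]
            else:
                k = rng.uniform(-1.0, 1.0)
                denom = (fit[i] - f_w) + 1e-50
                pop[i] = [clip(v + k * abs(v - w) / denom)
                          for v, w in zip(pop[i], x_worst)]
            fit[i] = f(pop[i])

        if min(fit) < best_f:
            best_f = min(fit)
            best_x = pop[fit.index(best_f)][:]
        history.append(best_f)
    return best_x, best_f, history


def sphere(x):
    """Simple test objective (assumption): minimum 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f, history = sparrow_search(sphere, dim=2, lower=-5.0, upper=5.0)
print(best_x, best_f)
```

The returned `history` records the best value found so far after each generation, so it never increases; plotting it is a simple way to inspect convergence.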

III. PROPOSED MODEL
To improve the fault diagnosis accuracy for bearing vibration signals, a hybrid fault diagnosis model, named WPES-SSA-SVM, is constructed from the wavelet packet energy spectrum, SSA and SVM. To extract features accurately, the wavelet packet energy spectrum is used to extract feature information from the vibration signals: the energy of the reconstructed signals is calculated through wavelet packet decomposition and reconstruction, and the feature vector is established. Then, SSA is used to optimize the penalty parameter c and the kernel parameter g globally to improve the learning and generalization ability of the SVM classifier. The model consists of data feature extraction, SSA optimization and SVM recognition. The functions of each part and the information transmitted between them are shown in Fig. 3.
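The roles of c and g can be made concrete with the standard soft-margin SVM formulation and the RBF kernel. This is a sketch under the assumption that the model follows the common LIBSVM-style parameterization, which the text does not state explicitly; the slack variables $\xi_i$ and the feature map $\phi$ are introduced here for illustration.

```latex
\[
\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^{2} + c \sum_{i=1}^{l} \xi_i
\quad \text{s.t.}\quad
y_i\,\bigl(w \cdot \phi(x_i) + b\bigr) \ge 1 - \xi_i,\qquad \xi_i \ge 0,
\]
\[
K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j) = \exp\!\left(-g\,\|x_i - x_j\|^{2}\right).
\]
```

A larger c penalizes training errors more heavily and narrows the margin, while a larger g makes the kernel more local. Both extremes tend to overfit, which is why the two parameters are searched jointly.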

1) Data feature extraction module: The bearing vibration signal is decomposed by the wavelet packet, and the wavelet packet frequency band energy spectrum is generated from the decomposition results. The energy spectrum information is taken as the fault diagnosis features and divided proportionally into training and test data sets. Then, the training data are transmitted to the SSA optimization module, and the training and test data are transmitted to the SVM recognition module.
2) SSA optimization module: The SSA optimization module receives the training data from the data feature extraction module and the value range of penalty parameter c and kernel parameter g from the SVM recognition module respectively, uses SSA to find the best penalty parameter c and kernel parameter g, and returns them to the SVM recognition module.
3) SVM recognition module: The SVM recognition module first transmits the value ranges of the penalty parameter c and the kernel parameter g to the SSA optimization module for parameter optimization, then receives the optimized parameters and trains the classifier using the training data received from the data feature extraction module. After that, the faults in the test data are diagnosed to evaluate the recognition effect.
The algorithm of the model is divided into nine steps, and the flow chart is shown in Fig. 4.
Step 1: Decompose the original vibration signal by the wavelet packet, calculate the frequency band energy spectrum, and randomly divide the data into training data and test data in proportion.
Step 2: Select the kernel function to construct the SVM (candidates mainly include the linear, RBF, polynomial and sigmoid kernel functions), and set the value ranges of the penalty parameter c and the kernel parameter g.
Step 3: Initialize the sparrow population. Set the population size Size, the maximum number of iterations $T_{max}$, the individual position X (a multidimensional coordinate composed of the penalty parameter c and the kernel parameter g), the proportions E, F and S of explorers, followers and scouters, and the safety threshold ST.
Step 4: Use the classification accuracy as the fitness function of SSA.
Step 5: Find the global optimal position. The fitness value f of each individual position is obtained using the training data. The larger the value, the better the position, and the global optimal position is the position with the largest f. If multiple positions have the same f, the optimal position is the one with the smallest penalty parameter c.
Step 6: Update the population positions and the global optimal position.
Step 7: Check the iteration condition. If the current iteration number $t$ is less than $T_{max}$, return to Step 6; otherwise, execute Step 8.
Step 8: Take the best parameters obtained by SSA and train the SVM on the training data.
Step 9: Input the test data into the SVM, output the predicted bearing fault label, identify the fault type, and compare it with the real fault type label in the original data to verify the diagnosis effect.
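The selection rule in Step 5 (largest fitness first, smallest penalty parameter c among ties) can be expressed as a single sort key. This is a minimal sketch; the `Candidate` type, the function name and the numeric values are hypothetical and serve only as an illustration.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    c: float        # penalty parameter
    g: float        # kernel parameter
    fitness: float  # classification accuracy obtained on the training data


def select_global_best(candidates):
    """Largest fitness wins; among equal fitness values, the smallest c wins."""
    return max(candidates, key=lambda p: (p.fitness, -p.c))


# Illustrative population (values are made up, not experimental results).
population = [
    Candidate(c=50.0, g=0.10, fitness=0.95),
    Candidate(c=8.0, g=0.50, fitness=0.97),
    Candidate(c=2.0, g=1.20, fitness=0.97),
]
best = select_global_best(population)
print(best)
```

Preferring the smaller c among equally accurate candidates favors a wider margin, which is a common way to reduce the risk of overfitting.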

IV. EXPERIMENTATION
Feature extraction and fault diagnosis were performed using the simulated fault data from the bearing experiments provided by Case Western Reserve University (CWRU). This data set has been applied in many experimental studies with good results. The time-domain and wavelet packet characteristics of the vibration signals are extracted from the official experimental data, and fault diagnosis is carried out. The structure of the bearing test bench is shown in Fig. 5 [31]. The test bench is composed of a three-phase induction motor, a torque sensing device, an electronic control unit, a dynamometer and an intermediate shaft. The experiment simulates the motion state of the rolling bearing in actual operation. Single-point defects with widths of 0.007, 0.014, 0.021, 0.028 and 0.040 inch are machined on different parts of the bearing by electrical discharge (spark) machining, so as to obtain experimental data for different fault types, such as rolling element, inner race and outer race faults.
In this paper, vibration signal data for the normal bearing, inner race fault, outer race fault and rolling element fault at the drive end are used, with a fault diameter of 0.007 inch, a motor load of 1 hp, the bearing model SKF-6205-2RS-JEM, and an acceleration sensor sampling frequency of 48 kHz. For each state, 100 groups of data samples are taken, giving 400 groups in total. After feature extraction by the wavelet packet energy spectrum, the 100 samples of each state are randomly divided into 70 training samples and 30 test samples. The training samples are used to train the classification model, and the test samples are used to test its effect. The parameters of the rolling bearing are shown in Table I. The division and label setting of the experimental data are shown in Table II.
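The per-state 70/30 division described above can be sketched as follows. This is an illustration under assumptions: the state names, the integer labels 0 to 3 and the placeholder samples are hypothetical (the actual labels are those of Table II), and the function name `split_per_state` is not taken from the paper.

```python
import random

# Hypothetical state names; the integer label is the index in this list.
STATES = ["normal", "inner_race", "outer_race", "rolling_element"]


def split_per_state(samples_by_state, n_train=70, seed=0):
    """Shuffle each state's samples and split them into train and test lists."""
    rng = random.Random(seed)
    train, test = [], []
    for label, state in enumerate(STATES):
        samples = list(samples_by_state[state])
        rng.shuffle(samples)
        train += [(s, label) for s in samples[:n_train]]
        test += [(s, label) for s in samples[n_train:]]
    return train, test


# Placeholder "feature vectors": each sample is only a (state, index) tag here.
data = {state: [(state, k) for k in range(100)] for state in STATES}
train, test = split_per_state(data)
print(len(train), len(test))
```

Splitting within each state keeps the classes balanced, giving 280 training and 120 test samples in total.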

V. RESULT AND DISCUSSION
The time-domain waveform makes it possible to observe intuitively the waveform distribution and amplitude of the vibration signal in each state. The waveform varies with the fault location and size. The vibration signals of the bearing in the normal state and under the different faults are shown in Fig. 6.
Wavelet packet decomposition with db3 as the wavelet basis function is applied to the normal-state, inner race, outer race and rolling element fault signals to obtain the decomposition coefficients. The signal of each node is then reconstructed from these coefficients, finally yielding the energies of 8 sub-bands, and the energy proportion of each frequency band is analyzed. Due to space limitations, this paper lists only the wavelet packet components of the reconstructed nodes in the normal state, as shown in Fig. 7. The energy proportions of the 8 sub-bands in the different states are shown in Fig. 8.
It can be clearly seen from Fig. 8 that, after the reconstruction of each node, the normalized amplitude of the wavelet energy spectrum differs between frequency bands. The energy of sub-bands 1 and 2 is relatively large in all four states, followed by sub-bands 3 and 4, while the energy of sub-bands 5, 6, 7 and 8 is relatively small, with slight differences between states. For example, when the bearing is in the normal state, the energy of sub-band 4 is higher than in the fault states. When an outer race fault occurs, the energy of sub-band 1 is lower than in the other cases. For the inner race fault and the rolling element fault, the energy spectra are relatively close, but there is still a certain gap in the values of sub-band 4 and sub-band 6. These differences between states show that the features extracted by the wavelet packet transform are sensitive to the fault feature information in the vibration signal. Therefore, the energy amplitude of each sub-band and the energy differences between frequency bands can be used to evaluate the different states of bearings.
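The energy proportion of a sub-band can be read as its energy divided by the total energy of all sub-bands. The sketch below assumes this common normalization, which the text does not spell out; the function name is hypothetical and the eight energy values are made up for illustration, not taken from Fig. 8.

```python
def energy_proportions(energies):
    """Normalize sub-band energies so that they sum to 1."""
    total = sum(energies)
    if total <= 0:
        raise ValueError("total energy must be positive")
    return [e / total for e in energies]


# Made-up energies for 8 sub-bands (illustrative only).
example = [4.0, 3.0, 1.5, 1.0, 0.2, 0.15, 0.1, 0.05]
props = energy_proportions(example)
print([round(p, 3) for p in props])
```

Normalizing in this way removes the dependence on the overall signal amplitude, so feature vectors from different recordings become directly comparable.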
To verify the feasibility and effectiveness of WPES-SSA-SVM, experiments were conducted with BP, ELM, SVM and WPES-SSA-SVM. The diagnosis results are shown in Fig. 9, where 'o' stands for the actual fault category in the test set and '*' stands for the fault category predicted by the model. As Fig. 9 shows, the BP model misjudged 18 samples in total: 5 rolling element faults (3 as inner race faults and 2 as outer race faults), 8 inner race faults (3 as outer race faults, 4 as rolling element faults and 1 as normal), and 5 outer race faults (2 as inner race faults, 2 as rolling element faults and 1 as normal), giving a diagnostic accuracy of 85%. The ELM model misjudged 16 samples in total: 4 rolling element faults (1 as an inner race fault and 3 as outer race faults), 6 inner race faults (3 as rolling element faults and 3 as outer race faults), and 6 outer race faults (1 as a rolling element fault and 5 as inner race faults), giving a diagnostic accuracy of 86.67%. The SVM model misjudged 14 samples: 3 rolling element faults, 2 inner race faults and 9 outer race faults, giving a diagnostic accuracy of 88.33%. The WPES-SSA-SVM model misjudged 4 samples in total: 1 rolling element fault as an inner race fault, 2 inner race faults as outer race faults, and 1 outer race fault as a rolling element fault. The number of misjudgments is thus reduced markedly across the four states. Compared with the ELM, SVM and BP models, the WPES-SSA-SVM model has the best diagnostic effect, with a diagnostic accuracy of 96.67%. The experimental results show that using the wavelet packet energy spectrum for feature extraction and SSA to optimize the SVM model improves fault diagnosis performance, with obvious advantages over the other, non-optimized models.
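The reported accuracies follow directly from the misjudgment counts and the size of the test set (four states with 30 test samples each). The short check below reproduces that arithmetic; the dictionary and function names are chosen for illustration.

```python
TEST_SAMPLES = 4 * 30  # four bearing states, 30 test samples each

# Misjudged test samples per model, as reported in the text.
misjudged = {"BP": 18, "ELM": 16, "SVM": 14, "WPES-SSA-SVM": 4}


def accuracy_percent(errors, total=TEST_SAMPLES):
    """Share of correctly classified samples, in percent, to two decimals."""
    return round(100.0 * (total - errors) / total, 2)


accuracies = {name: accuracy_percent(e) for name, e in misjudged.items()}
for name, acc in accuracies.items():
    print(f"{name}: {acc}%")
```

In these terms, the proposed model reduces the number of misjudged samples from 14 (plain SVM) to 4 out of 120.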

VI. CONCLUSION
In this paper, we proposed a hybrid fault diagnosis method for rolling bearings based on the wavelet packet energy spectrum, SSA and SVM. To address the difficulty of extracting features from bearing vibration signals, wavelet packet decomposition was used to extract the wavelet packet features of the vibration signals, and the energy spectrum of the wavelet components was calculated and normalized to form the feature vector set, which captures the fault feature information of the vibration signals. To improve the accuracy of fault diagnosis, the penalty parameter c and kernel parameter g of the SVM were optimized using the global optimization ability of SSA, so as to build the hybrid fault diagnosis model WPES-SSA-SVM. To verify the classification performance of WPES-SSA-SVM, the CWRU bearing vibration data set was used to extract fault features and diagnose faults. The results show that, compared with BP, ELM and SVM, the proposed method extracts the feature information from the original vibration signals accurately and achieves higher diagnostic accuracy. SSA helps to optimize the parameters and improve the classification performance of the SVM. In the future, we will use data from other industries and scenarios for diagnosis, and further investigate how to improve model performance and diagnostic accuracy.