Fast-ICA for Mechanical Fault Detection and Identification in Electromechanical Systems for Wind Turbine Applications

Approaches based on source separation are increasingly adopted for fault diagnosis in industrial applications. In particular, Independent Component Analysis (ICA) is attractive thanks to its simplicity of implementation. In variable-speed electrical rotating machinery such as wind turbines, the interaction between the electrical and mechanical parts in the presence of a fault is complex. The essential system variables are therefore affected and must be analyzed in order to detect the presence of certain faults. In this paper, the target system is the classical association of a doubly-fed induction motor with a two-stage gearbox for wind energy applications. The investigated mechanical fault is a uniform wear of the two gear wheels of the same stage. The idea behind the proposed technique is to treat fault detection and identification as a source separation problem. Based on the analysis into independent components, the Fast-ICA algorithm is adopted to separate and identify the sources of the gear faults. Afterwards, a spectral analysis is applied to the signals resulting from the separation in order to identify the fault components related to the damaged wheels. The efficiency of the proposed technique for the separation and identification of the fault components is evaluated by numerical simulations.

Keywords: source separation; fault diagnosis; independent component analysis; Fast-ICA; spectral analysis


I. INTRODUCTION
Wind power is increasingly gaining ground thanks to its characteristics as an inexhaustible and clean source of energy, which have made it a privileged field of scientific research and technological development worldwide. A recent report shows the large-scale expansion of wind farm installations in the world [1]. Yet an electric machine, whether running as a motor or as a generator, is sized according to its torque.
At small power ratings, the speed is relatively high. At large power ratings (several hundred kW to a few MW), however, the low speeds lead to very high torques and prohibitive generator masses. For this reason, a gearbox is typically interposed between the turbine and the generator, and the fast shaft of the gearbox is coupled to the shaft of the electric generator [2], [3], [4]. A recent study of faults in wind energy conversion systems revealed that about 10% of the identified defects are related to the gearbox [5], [6]. Although this proportion is apparently low, this type of fault often leads to prohibitive production stops, hence the need to continuously monitor the proper functioning of this essential component of the energy conversion chain. Several diagnostic techniques have therefore been developed for fault detection in these speed multipliers. They include the analysis of acoustic emissions [7], [8], oil analysis [9], [11] and, above all, vibration analysis. In particular, the investigation of vibratory signals has been proposed in different works using different approaches: statistical analysis [10]-[11], and temporal and/or frequency-domain analysis [7], [12], [13].
In reality, the vibratory signals collected during operation contain relevant information reflecting several sources of faults, relating both to the speed multiplier itself and to the machine coupled with it. This is illustrated in [14], [15], where the characterization of broken-bar faults and of unbalance was based on the time-frequency analysis of the vibratory signals.
However, the measured observations are often mixtures of the vibrations produced by the defects mentioned above, which makes the diagnosis of defects a very difficult task. To solve this problem, several techniques have been used to identify the sources of defects from the spectral mixtures resulting from vibratory signals [16], [17], [18].
In the literature, Independent Component Analysis (ICA) has been widely applied to source separation in different domains, including medical imaging, telecommunications and, more recently, the diagnosis of faults in electromechanical systems [19], [20], [21], [22].
More recently, new ICA-based techniques have been proposed for fault diagnosis in electromechanical systems. The most widely used ICA algorithms can be classified as follows:
• The InfoMax algorithm [23] solves the ICA problem by maximizing the differential entropy of the output of an invertible non-linear transformation of the whitened observations;
• JADE [24]-[25] consists in jointly diagonalizing the set of eigen-matrices constructed from the eigenvectors associated with the P greatest eigenvalues of the covariance matrix of the whitened observations;
• Fast-ICA [26] tries, after the whitening step, to maximize a contrast function based on negentropy.
In the present work, the fast temporal algorithm known as Fast-ICA has been adopted for the identification of gear faults because of its appealing characteristics: high convergence speed and low computational cost. Moreover, this technique is interesting since it is relatively insensitive to an increase in the number of sources. This paper is organized as follows. Fast-ICA is formulated for fault diagnosis in the second section. The gear vibration data are described in the third section. The fourth section is dedicated to the spectral analysis. Finally, the paper ends with a conclusion.

II. FORMULATION OF THE FAST-ICA FOR FAULT DIAGNOSIS
The Fast-ICA algorithm is an advanced version of ICA, characterized mainly by a very fast convergence, in which the separation into independent components takes place in a whitened space [27], [28]. The instantaneous linear mixtures (the sensor signals) are first preprocessed by projecting them into a whitened space. They are then separated by the Fast-ICA algorithm itself. The preprocessing step and the Fast-ICA processing step are detailed in the following. Several non-linearity functions are also presented because of their impact on the performance of the Fast-ICA algorithm.

A. Preprocessing step
Let the n fault sources $s_j$ be denoted by $S = [s_1, \ldots, s_n]^T$ and mixed before being picked up by the sensors. The m mixtures $x_i$, each of length N, are represented as the rows of an m×N matrix denoted $X = [x_1, \ldots, x_m]^T$.
Moreover, X can be represented by a linear model as

$X = A\,S + b\,\eta$    (2)

where $A$ is an m×n mixing matrix and $\eta$ is the additive noise, with the corresponding Gaussian weight vector given by $b$.
In order to apply Principal Component Analysis (PCA) to the mixtures, they should be viewed differently: the mixtures X are seen as a set of N m-dimensional points, each column of X being interpreted as the coordinates of a point in the space $\mathbb{R}^m$.
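First of all, PCA computes the mean of the N points, denoted $\mu$, as follows

$\mu = \frac{1}{N} \sum_{j=1}^{N} X_{:,j}$

where $X_{:,j}$ is the j-th column of X. Then, PCA centers each point relative to $\mu$ as follows:

$X'_{:,j} = X_{:,j} - \mu$, for all j = 1, ..., N    (4)

Therefore, the resulting matrix, denoted X', has the centered mixtures as rows.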
Afterwards, the covariance matrix of X' is computed as follows

$C = \frac{1}{N} X' X'^{T}$    (6)

Then, the matrix C is diagonalized as follows

$C = E\,D\,E^{T}$    (7)

Therefore, two matrices are obtained:
• a diagonal matrix, denoted D, composed of the decreasingly sorted eigenvalues of the covariance matrix of X';
• a matrix, denoted E, whose columns are the eigenvectors of the covariance matrix of X'. These eigenvectors are pairwise orthogonal.
Once PCA is achieved, the whitening matrix, denoted U, is calculated by the following expression

$U = D^{-1/2} E^{T}$    (8)

Finally, this step results in the matrix of whitened mixtures, denoted V, obtained by

$V = U\,X'$    (9)
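The preprocessing chain above (centering, covariance, eigendecomposition, whitening) can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function name `pca_whiten`, the synthetic Laplacian sources, the random mixing matrix and the 1/N normalization of the covariance are all assumptions made for the example.

```python
import numpy as np

def pca_whiten(X):
    """Whiten the mixtures X (m x N); return V, the whitening matrix U and the mean."""
    mu = X.mean(axis=1, keepdims=True)       # mean of the N points
    Xc = X - mu                              # centered mixtures X'
    C = Xc @ Xc.T / Xc.shape[1]              # covariance matrix of X' (1/N assumed)
    eigvals, E = np.linalg.eigh(C)           # C = E D E^T, eigenvalues ascending
    order = np.argsort(eigvals)[::-1]        # sort the eigenvalues decreasingly
    eigvals, E = eigvals[order], E[:, order]
    U = np.diag(eigvals ** -0.5) @ E.T       # whitening matrix U = D^(-1/2) E^T
    V = U @ Xc                               # whitened mixtures
    return V, U, mu

# Hypothetical data: three synthetic sources mixed by a random matrix.
rng = np.random.default_rng(0)
S = rng.laplace(size=(3, 5000))
A = rng.normal(size=(3, 3))
V, U, mu = pca_whiten(A @ S)
cov_V = V @ V.T / V.shape[1]                 # should be close to the identity
```

By construction, the rows of `V` have zero mean and an identity covariance matrix, which is the property the separation step relies on.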

B. Processing step: Implementation of the "Fast-ICA" fixed-point algorithm
The ICA method defines a separation model in order to estimate the sources $\hat{S}$ given the whitened mixtures V:

$\hat{S} = W^{T} V$    (10)

The goal of ICA is therefore to estimate $W^{T}$, called the whitened separation matrix. In particular, Fast-ICA estimates the independent components by maximizing the non-gaussianity, defined as the deviation of the signal distribution from the distribution of a Gaussian signal of the same power. It is thus possible to separate the sources of a linear mixture by maximizing the non-gaussianity of an output signal obtained as a linear combination of the observations. There are multiple approaches to measuring non-gaussianity. After several trials with different measures, mainly normalized kurtosis and negentropy [26], [29], the authors in the literature opted for negentropy.
Next, the steps of the Fast-ICA algorithm, represented by the flowchart of Figure 1, are described. The algorithm starts with step (e1), which assigns a small positive value to a parameter called the convergence threshold and denoted ε.
The next initialization step, denoted (e2), consists in constructing W(k) for k = 0, which is then orthogonalized in step (e3). Thereafter, the algorithm iteratively performs the following steps:
• step (e4) updates the matrix W at order k using the fixed-point equation of the negentropy which, in the standard formulation of [26], reads for each vector w of W

$w(k) = E\{V\,g(w(k-1)^{T} V)\} - E\{g'(w(k-1)^{T} V)\}\,w(k-1)$

where the function g represents the non-linearity of the Fast-ICA algorithm, which will be detailed later;
• the orthogonalization step (e5), based on the symmetric method, which does not favor any vector w, starts directly from any matrix W and orthogonalizes it by the Gram-Schmidt approach;
• finally, at the end of each iteration, the algorithm checks in step (e6) whether it has reached a maximum of the negentropy, using the thresholding test $\lVert W(k) - W(k-1) \rVert < \varepsilon$.
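The iteration (e1) to (e6) can be sketched as follows, assuming the standard symmetric fixed-point update of [26] with g = tanh. Several choices are assumptions of this sketch rather than details confirmed by the text: the rows of `W` hold the separating vectors (so `W` plays the role of $W^{T}$), the orthogonalization is $(W W^{T})^{-1/2} W$, and the stopping test is a sign-insensitive variant of the threshold on ε. The two synthetic sources and the mixing matrix are hypothetical.

```python
import numpy as np

def sym_orth(W):
    """Symmetric orthogonalization W <- (W W^T)^(-1/2) W (no row is favored)."""
    d, P = np.linalg.eigh(W @ W.T)
    return P @ np.diag(d ** -0.5) @ P.T @ W

def fast_ica(V, eps=1e-6, max_iter=200, seed=0):
    """Symmetric fixed-point Fast-ICA on whitened mixtures V (n x N), with g = tanh."""
    n, N = V.shape                                   # eps is the threshold of step (e1)
    rng = np.random.default_rng(seed)
    W = sym_orth(rng.normal(size=(n, n)))            # steps (e2) and (e3)
    for k in range(1, max_iter + 1):
        Y = W @ V
        gY = np.tanh(Y)                              # g(y)
        g_prime = 1.0 - gY ** 2                      # g'(y)
        # step (e4): w+ = E{v g(w^T v)} - E{g'(w^T v)} w, applied to every row of W
        W_new = gY @ V.T / N - np.diag(g_prime.mean(axis=1)) @ W
        W_new = sym_orth(W_new)                      # step (e5)
        # step (e6): each row is defined up to its sign, so test |<w_new, w_old>| against 1
        delta = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if delta < eps:
            break
    return W, k

# Hypothetical two-source example: a square wave and uniform noise.
rng = np.random.default_rng(1)
t = np.arange(4000) / 1000.0
S = np.vstack([np.sign(np.sin(2 * np.pi * 5 * t + 0.3)), rng.uniform(-1, 1, t.size)])
X = np.array([[1.0, 0.6], [0.4, 1.0]]) @ S

# Whitening as in the preprocessing step: V = D^(-1/2) E^T X'
Xc = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
V = np.diag(d ** -0.5) @ E.T @ Xc

W, n_iter = fast_ica(V)
S_hat = W @ V                                        # estimated sources
# Absolute correlation between each true source (rows) and each estimate (columns)
corr = np.abs(np.corrcoef(np.vstack([S, S_hat]))[:2, 2:])
```

The estimated sources are recovered only up to order and sign, which is why the check uses absolute correlations rather than a direct comparison with `S`.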

C. Choice of the nonlinearity:
The function g of equation (11) is the non-linearity of Fast-ICA. As shown in the literature, several choices are possible. Those compared in this work are tanh, gauss, kurtosis and skew.

The choice of the function g has a direct impact on the updating of W, as indicated in equation (12), and consequently on the overall performance of the algorithm.
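As an illustration of how the non-linearity enters the update, the sketch below defines the four functions named in the results (tanh, gauss, kurtosis, skew) together with their derivatives. Their exact expressions are not recoverable from the text, so the usual Fast-ICA forms are assumed here: tanh(u), u·exp(−u²/2), u³ and u². The helper `update_direction` is a hypothetical name for one un-normalized fixed-point update of a single vector.

```python
import numpy as np

# (g, g') pairs; the analytic forms are the usual Fast-ICA choices (assumption).
NONLINEARITIES = {
    "tanh": (lambda u: np.tanh(u),
             lambda u: 1.0 - np.tanh(u) ** 2),
    "gauss": (lambda u: u * np.exp(-u ** 2 / 2),
              lambda u: (1.0 - u ** 2) * np.exp(-u ** 2 / 2)),
    "kurtosis": (lambda u: u ** 3,
                 lambda u: 3.0 * u ** 2),
    "skew": (lambda u: u ** 2,
             lambda u: 2.0 * u),
}

def update_direction(w, V, name):
    """One un-normalized fixed-point update of the vector w on whitened data V (n x N)."""
    g, g_prime = NONLINEARITIES[name]
    y = w @ V                                         # current output signal w^T v
    return (V * g(y)).mean(axis=1) - g_prime(y).mean() * w
```

Swapping the dictionary key is then enough to compare the non-linearities on the same whitened mixtures.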

A. System Description
In order to evaluate the efficiency of the method described above, the association of a doubly-fed asynchronous machine with a speed multiplier, shown in Figure 2, is considered. More precisely, the defects of interest are those relating to the gears of the two-stage speed multiplier. The gear train in question is composed of four toothed wheels (R1, R2, R3 and R4). The system under consideration is assumed to operate at a nominal speed of 1012 rpm on the generator side (wheel R4) and 46 rpm on the turbine side (wheel R1).
The vibrations resulting from the gearbox operation are due to the mutual contact forces between the teeth of the meshing wheels. For two meshing wheels of the same stage, the meshing frequency is given by

$f_{mesh} = Z_i\, f_{r,i} = Z_{i+1}\, f_{r,i+1}$

where $f_{r,i}$ and $f_{r,i+1}$ are the rotational frequencies of the two wheels of the considered stage, and $Z_i$ and $Z_{i+1}$ are their respective numbers of teeth.
For a healthy gear, the vibration spectrum typically shows the harmonic chain in (19) with small amplitudes.


On the other hand, in the presence of a uniform wear fault on all the teeth of the same wheel, the amplitude of the harmonics in (19) shows a noticeable increase, making it possible to identify the wheel affected by the fault. In the considered system, the wheels R1 and R2 share the same meshing frequency $f_{mesh,12}$ but have different lateral frequencies, $(f_{l,1}, f_{r,1})$ and $(f_{l,2}, f_{r,2})$ respectively.
Likewise, the wheels R3 and R4 share the same meshing frequency $f_{mesh,34}$ but have different lateral frequencies, $(f_{l,3}, f_{r,3})$ and $(f_{l,4}, f_{r,4})$ respectively, as detailed in Table I.
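The frequency signature of each wheel can be computed with a short sketch. The tooth counts below are hypothetical, chosen only so that 46 rpm on R1 yields 1012 rpm on R4, since the values of Table I are not available in the text. The sketch also assumes that R2 and R3 sit on the same intermediate shaft and that the lateral frequencies are the first-order sidebands located at the meshing frequency plus or minus the rotational frequency of the wheel.

```python
def stage(rpm_in, z_driver, z_driven):
    """Return (f_in, f_out, f_mesh) in Hz for one gear stage."""
    f_in = rpm_in / 60.0
    f_out = f_in * z_driver / z_driven
    f_mesh = z_driver * f_in          # equals z_driven * f_out
    return f_in, f_out, f_mesh

def sidebands(f_mesh, f_rot):
    """First-order lateral frequencies (left, right) around the meshing frequency."""
    return f_mesh - f_rot, f_mesh + f_rot

# Hypothetical tooth counts (not the Table I values).
Z1, Z2, Z3, Z4 = 100, 20, 88, 20

f1, f2, f_mesh12 = stage(46.0, Z1, Z2)        # stage 1: R1 drives R2
f3, f4, f_mesh34 = stage(f2 * 60.0, Z3, Z4)   # stage 2: R3 (same shaft as R2) drives R4

lateral = {
    "R1": sidebands(f_mesh12, f1), "R2": sidebands(f_mesh12, f2),
    "R3": sidebands(f_mesh34, f3), "R4": sidebands(f_mesh34, f4),
}
```

Two wheels of the same stage share the meshing frequency but rotate at different speeds, so their sideband pairs differ, which is what makes each wheel identifiable in the spectrum.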

B. Description of mixtures
The mixtures are linear combinations of the sources as indicated in equation (2), where the matrix A and the vector b must be specified:
• the adopted mixing matrix A.
On the other hand, a mixture can be represented either in the time domain or in the frequency domain. An appropriate representation must be chosen, since it determines whether a mixture can be recognized as being in healthy mode or in faulty mode.
The temporal representation of the mixtures makes it possible to distinguish the healthy mode from the faulty mode: the amplitudes in the faulty mode, shown in Figure 4, are generally greater than the amplitudes in the healthy mode, illustrated in Figure 3. However, the temporal representation does not show exactly which wheels are affected by the fault. For this purpose, the spectral representation is preferred. For instance, in the frequency band centered at $f_{mesh,34}$, the spectrum of mixture 1 in the faulty mode (Figure 6) is similar to the spectrum of this mixture in the healthy mode (Figure 5). This result indicates that the wheels R3 and R4 are healthy.
On the other hand, in the frequency band centered at $f_{mesh,12}$, the spectrum of mixture 1 in the faulty mode (Figure 6) differs from the spectrum of the same mixture in the healthy mode (Figure 5). This indicates that the wheels R1 and R2 are affected by the uniform wear fault.
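The band-by-band comparison of healthy and faulty spectra can be illustrated on synthetic signals. Everything numerical in this sketch is an assumption: the sampling rate, the duration, the two meshing frequencies, the noise level, and the modelling of uniform wear as a larger amplitude at the first-stage meshing frequency. The helper names `mixture` and `band_peak` are hypothetical.

```python
import numpy as np

fs, T = 2000.0, 6.0                                   # sampling rate (Hz), duration (s)
t = np.arange(int(fs * T)) / fs
f_mesh12, f_mesh34 = 230.0 / 3.0, 1012.0 / 3.0        # hypothetical meshing frequencies (Hz)

def mixture(a12, a34, seed):
    """Synthetic mixture: one component per stage plus weak Gaussian noise."""
    rng = np.random.default_rng(seed)
    return (a12 * np.sin(2 * np.pi * f_mesh12 * t)
            + a34 * np.sin(2 * np.pi * f_mesh34 * t)
            + 0.05 * rng.normal(size=t.size))

def band_peak(x, f_center, half_width=5.0):
    """Largest amplitude of the one-sided spectrum within f_center +/- half_width."""
    spec = np.abs(np.fft.rfft(x)) * 2.0 / x.size
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    band = np.abs(freqs - f_center) <= half_width
    return spec[band].max()

healthy = mixture(0.2, 0.2, seed=0)
faulty = mixture(1.0, 0.2, seed=1)                    # wear modelled as a larger first-stage amplitude

ratio_12 = band_peak(faulty, f_mesh12) / band_peak(healthy, f_mesh12)
ratio_34 = band_peak(faulty, f_mesh34) / band_peak(healthy, f_mesh34)
```

A ratio well above 1 in one band and close to 1 in the other reproduces, on toy data, the reasoning used above to decide which stage is affected.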

C. Study of the whitening preprocessing
The mixtures are first whitened using the Principal Component Analysis (PCA) technique.
Let x be the sample composed of points extracted from the mixtures X in the faulty mode (Figure 7). PCA computes the following two moments of x:
• the arithmetic mean µ, equation (22);
• the covariance matrix.
The points of x are centered relative to µ. Diagonalizing the covariance matrix then yields two matrices:
• D: the matrix whose diagonal values are the eigenvalues of the covariance matrix;
• E: the matrix whose columns are the eigenvectors of the covariance matrix.
In Figures 7 and 8, only three eigenvectors are displayed.
The whitening matrix U expressed in (8) is then obtained from D and E. The whitened mixtures V are therefore the projection of X by U, as shown in equation (9). These resulting mixtures V are more appropriate for source separation than the original mixtures X. Indeed, the sample v composed of the points belonging to V, shown in Figure 8, has the following appealing characteristics:
• a zero arithmetic mean, equation (27);
• an identity covariance matrix;
• an orthonormal basis formed by the eigenvectors corresponding to the columns of E.

D. Study of the source separation processing
In this section, the performance of the Fast-ICA algorithm is evaluated for the source separation task.

1) Performance measures
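In the context of the separation of vibratory signal sources, performance measurement is essential for assessing the separation quality. Therefore, the following measures are adopted [30]:
• the Signal-to-Distortion Ratio (SDR): $SDR = 10 \log_{10} \frac{\lVert s_{target} \rVert^2}{\lVert e_{interf} + e_{artif} \rVert^2}$
• the Signal-to-Interference Ratio (SIR): $SIR = 10 \log_{10} \frac{\lVert s_{target} \rVert^2}{\lVert e_{interf} \rVert^2}$
• the Signal-to-Artifact Ratio (SAR): $SAR = 10 \log_{10} \frac{\lVert s_{target} + e_{interf} \rVert^2}{\lVert e_{artif} \rVert^2}$
where $s_{target}$ is a version of the original source modified by an allowed distortion, the set of allowed distortions encompassing time-invariant gains, and $e_{interf}$ and $e_{artif}$ are, respectively, the error terms relative to interferences and artifacts.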
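A minimal sketch of these measures is given below for the simplest allowed distortion, a time-invariant gain. It follows the usual projection-based decomposition of an estimate into target, interference and artifact terms, without a separate noise term. The function name `bss_eval_gain` and the three sinusoids used as sources and artifact are assumptions of the example.

```python
import numpy as np

def bss_eval_gain(s_hat, S, j):
    """SDR, SIR, SAR (dB) of the estimate s_hat of source j, gain distortion only."""
    s_j = S[j]
    s_target = (s_hat @ s_j) / (s_j @ s_j) * s_j          # projection on the true source
    coeffs, *_ = np.linalg.lstsq(S.T, s_hat, rcond=None)  # projection on all sources
    proj = coeffs @ S
    e_interf = proj - s_target                            # contribution of the other sources
    e_artif = s_hat - proj                                # what no source explains

    def db(num, den):
        return 10.0 * np.log10(np.sum(num ** 2) / np.sum(den ** 2))

    sdr = db(s_target, e_interf + e_artif)
    sir = db(s_target, e_interf)
    sar = db(s_target + e_interf, e_artif)
    return sdr, sir, sar

# Hypothetical example: two orthogonal sinusoidal sources and an orthogonal artifact.
t = np.arange(1000) / 1000.0
S = np.vstack([np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 12 * t)])
artifact = np.sin(2 * np.pi * 30 * t)
s_hat = S[0] + 0.1 * S[1] + 0.01 * artifact               # imperfect estimate of source 0

sdr, sir, sar = bss_eval_gain(s_hat, S, 0)
```

In this toy case the interference has one hundredth of the target energy, so the SIR is 20 dB, while the much weaker artifact gives a SAR of about 40 dB. The SDR, which accounts for both error terms, is slightly below the SIR.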

2) Results and discussion
Our goal in this section is to identify the non-linearity that results in the best source separation performance using the Fast-ICA algorithm. The experiments are conducted with healthy and faulty gears.
In the case of the healthy mode, the results are given in Table II. What is interesting in the faulty mode is that the sources of the damaged gear wheels R1 and R2 are well separated using tanh. Indeed, tanh gives the highest values for the estimated source of damaged wheel R1 (SIR: 88.67, SDR: 59). Similarly, tanh gives the highest values for the estimated source of damaged wheel R2 (SIR: 75.75, SDR: 38.85).
Moreover, the average SAR values obtained are between 45 and 48 for all the non-linearities. These values are higher than the average SAR values obtained in the healthy mode. Therefore, Fast-ICA results in lower overlap artifacts in the faulty mode.

IV. SPECTRAL ANALYSIS
In this section, the results obtained by the Fast-ICA algorithm are studied. First, Fast-ICA converges in at most 15 iterations, which confirms that this algorithm is a fast variant of ICA. In our experiments, Fast-ICA is applied to two types of mixtures: healthy mode and fault mode. Two questions therefore arise: is Fast-ICA able to separate the gear signals associated with the four wheels R1, R2, R3 and R4? And, in the case of a fault mode, can it distinguish between damaged wheels and healthy wheels?
The spectra of the estimated sources show that Fast-ICA succeeded in separating the gear signals corresponding to each wheel. The spectrum presented in Figure 9 is composed of a fundamental frequency $f_{mesh,12}$ and two lateral frequencies $f_{l,1}$ and $f_{r,1}$. Thus, source 1 corresponds to the wheel R1. The second spectrum, illustrated in Figure 10, is composed of a fundamental frequency $f_{mesh,12}$ and two lateral frequencies $f_{l,2}$ and $f_{r,2}$. Thus, source 2 identifies the wheel R2. The third spectrum, shown in Figure 11, is composed of a fundamental frequency $f_{mesh,34}$ and two lateral frequencies $f_{l,3}$ and $f_{r,3}$, leading to a clear identification of the wheel R3. Finally, the spectrum given in Figure 12 is composed of a fundamental frequency $f_{mesh,34}$ and two lateral frequencies $f_{l,4}$ and $f_{r,4}$. Thus, source 4, corresponding to the wheel R4, is clearly identified.
By comparing the results obtained in fault mode with those obtained in healthy mode, Fast-ICA distinguishes between the faulty sources and the healthy sources of the gears:
• The wheel R1 has two slightly different spectra: its spectrum in fault mode (Figure 13) is slightly different from its spectrum in healthy mode (Figure 9).
• The wheel R2 has two slightly different spectra: its spectrum in fault mode (Figure 14) is slightly different from its spectrum in healthy mode (Figure 10).
• On the other hand, the other two wheels R3 and R4 are healthy: each of them keeps almost the same spectrum in healthy mode and in fault mode, as shown in Figures 11 and 15 and in Figures 12 and 16, respectively.
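The identification rule used in this section (a meshing frequency plus a pair of lateral frequencies designates one wheel) can be sketched as a simple peak-matching routine. The frequencies, amplitudes, noise level, tolerance and the helper names `top_peaks`, `identify_wheel` and `synthetic_source` are all assumptions of this example, and the estimated sources are replaced by synthetic three-line signals.

```python
import numpy as np

def top_peaks(x, fs, n_peaks=3):
    """Frequencies (Hz, sorted) of the n_peaks largest bins of the amplitude spectrum."""
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    return np.sort(freqs[np.argsort(spec)[-n_peaks:]])

def identify_wheel(x, fs, signatures, tol=0.5):
    """Return the wheel whose (f_l, f_mesh, f_r) triple matches the three main peaks."""
    peaks = top_peaks(x, fs)
    for wheel, expected in signatures.items():
        if np.allclose(peaks, np.sort(expected), atol=tol):
            return wheel
    return None

fs, T = 1000.0, 30.0                                    # hypothetical acquisition settings
t = np.arange(int(fs * T)) / fs
f_mesh12, f_rot1, f_rot2 = 230.0 / 3.0, 46.0 / 60.0, 230.0 / 60.0   # hypothetical (Hz)
signatures = {
    "R1": (f_mesh12 - f_rot1, f_mesh12, f_mesh12 + f_rot1),
    "R2": (f_mesh12 - f_rot2, f_mesh12, f_mesh12 + f_rot2),
}

def synthetic_source(sig, seed):
    """Meshing line with two weaker sidebands and a little Gaussian noise."""
    rng = np.random.default_rng(seed)
    f_l, f_m, f_r = sig
    return (np.sin(2 * np.pi * f_m * t)
            + 0.3 * np.sin(2 * np.pi * f_l * t)
            + 0.3 * np.sin(2 * np.pi * f_r * t)
            + 0.01 * rng.normal(size=t.size))

found = [identify_wheel(synthetic_source(signatures[w], i), fs, signatures)
         for i, w in enumerate(["R1", "R2"])]
```

A long record is used so that the sidebands of R1, which lie less than 1 Hz from the meshing line, fall on distinct frequency bins.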

V. CONCLUSION
In this paper, a diagnostic technique is presented for separating and identifying uniform wear in a two-stage gearbox, classically associated with a doubly-fed induction machine in modern wind energy conversion systems. Based on Fast-ICA, the main contribution of the proposed technique is its ability to isolate the fault frequency components representative of a uniform wear for each pinion or gear of the gearbox.
The obtained results also clearly show the ability of Fast-ICA to separate the characteristic frequency components of the gears from noisy mixtures. Moreover, the spectral analysis makes it possible to decide, for each estimated source associated with a gear, whether it is healthy or faulty. As a perspective, faults other than gear faults will be taken into account in future work.

Fig. 2. Simplified representation of the association of the doubly-fed generator with the two-stage speed multiplier.


the vector b, of dimension 4×1, for the weighting of noise in the mixtures

Fig. 3. Mixtures used for the separation under healthy condition.

Fig. 9. Spectrum of the signal resulting from the estimation of the component relative to the wheel R1, in healthy mode.

Fig. 11. Spectrum of the signal resulting from the estimation of the component relative to the wheel R3, in healthy mode.

Fig. 12. Spectrum of the signal resulting from the estimation of the component relative to the wheel R4, in healthy mode.

Fig. 13. Spectrum of the signal resulting from the estimation of the component relative to the wheel R1, in fault mode with uniform wear of R1 and R2.

Fig. 14. Spectrum of the signal resulting from the estimation of the component relative to the wheel R2, in fault mode with uniform wear of R1 and R2.

Fig. 16. Spectrum of the signal resulting from the estimation of the component relative to the wheel R4, in fault mode with uniform wear of R1 and R2.

TABLE I. FREQUENCIES OF FAULT IDENTIFICATION
Rotating speed (rpm) | Number of teeth Z_i | f_l,k (k = 1..4) | f_r,k (k = 1..4)

In the case of the healthy mode, the results of Table II are obtained. The tanh non-linearity clearly outperforms the others: it gives the highest average values (SIR: 69.3, SDR: 41.47). Kurtosis gives the second best performance (SIR: 65.87, SDR: 38.02). The gauss non-linearity gives significantly lower average values (SIR: 27.23, SDR: 21.95). The worst performance is obtained with the skew non-linearity, which gives very low average values (SIR: 3.96, SDR: 3.97). All the non-linearities result in similarly high SAR values, between 41 and 44. Therefore, Fast-ICA leads to low overlap artifacts in the estimated sources. In the case of faulty mixtures, the results of Table III are obtained. In particular, Fast-ICA performs the best separation using the tanh non-linearity, with an average SIR of 81.27 and an average SDR of 48.15.