Diagnosis of Parkinson's Disease based on Wavelet Transform and Mel Frequency Cepstral Coefficients

This paper aims to determine which analyzing wavelet, combined with the extraction of Mel Frequency Cepstral Coefficients (MFCC), is most appropriate for assisting the diagnosis of Parkinson's disease. The analysis is based on a database of 18 healthy subjects and 20 patients with Parkinson's disease. The proposed processing transforms the speech signal with the discrete wavelet transform, testing several families of wavelets, extracts the MFCC from the transformed signals, and applies a support vector machine (SVM) as the classifier. The test results show that the best recognition rate, 86.84%, is obtained by combining the level-2 wavelets at the 3rd scale (Daubechies, Symlet, ReverseBior or BiorSpline) with MFCC and SVM.

Keywords: Parkinson's disease; discrete wavelet transform; MFCC; Support Vector Machine (SVM)


I. INTRODUCTION
The history of Parkinson's disease began in 1817 with its description by James Parkinson [1]. The disease results in a slow and gradual destruction of the neurons of the substantia nigra of the brain. It is the second most common neurodegenerative disease, after Alzheimer's disease. Its most obvious motor symptoms are tremor, rigidity, slowness of movement, and difficulty with walking and communication.
Signal processing techniques have evolved rapidly in the last few years in the biomedical field, in applications such as respiratory sound analysis, electrocardiography (ECG), and the diagnosis of Parkinson's disease.
Acoustic processing has recently been used in the diagnosis of many diseases. MFCC extraction has been used by Yasmina Kheddache and Chakib Tadj [2] to identify diseases in newborns and by Takaya Taguchi et al. [3] to discriminate major depressive disorder. Salsabil Besbes and Zied Lachiri used multitaper MFCC features for stress recognition from speech [4], and Zied Lachiri has also worked on emotion recognition [5,6]. Also in acoustic processing, Nawel Souissi and Adnane Cherif worked on the identification of voice disorders [7].
Several approaches detect Parkinson's disease from different characteristics of speech. Mohammad Shahbakhi et al. [8] and Athanasios Tsanas et al. [9] worked on short-time jitter and shimmer parameters; the latter reported an accuracy of over 90% in discriminating patients with Parkinson's disease from healthy subjects.
Savitha S. Upadhyaa et al. [10] detected Parkinson's disease from MFCCs extracted with the Thomson multitaper windowing technique. Orozco-Arroyave et al. [11] obtained a recognition accuracy of 60% by extracting MFCC coefficients. Achraf Benba et al. [12] obtained an accuracy of 80% by combining MFCC and SVM on a database of 17 healthy subjects and 17 patients with Parkinson's disease.
We are interested in the Mel Frequency Cepstral Coefficients (MFCC) method [15], which extracts parameters according to the Mel scale. In fact, the perception of speech by the human auditory system is based on a frequency scale similar to the Mel scale [16]. The diagnosis of Parkinson's disease from the detection of vocal disorders using MFCC was first suggested by Fraile et al. [17,18].
In this work, we extend the diagnostic model of Parkinson's disease of [12] by introducing a compression of the vocal signals of a database [19], composed of 18 healthy subjects and 20 patients, by the wavelet transform, testing several families of wavelets. We then extract the Mel-scale cepstral coefficients (MFCC) of the transformed signals and, finally, classify them with a support vector machine (SVM), a machine learning algorithm [20]. We create a learning base from 73% of the database and test on the entire database, for each wavelet, to decide whether each subject is ill or not and thereby choose the most accurate wavelet.

II. WAVELET TRANSFORM
Wavelets were introduced in the early 1980s by Morlet and Grossmann [21]. Mallat [22], Daubechies [23] and Meyer [24] then established the mathematical basis of wavelets. The wavelet transform decomposes a signal on a family of wavelets derived from a mother wavelet: each wavelet is dilated or compressed by a scale coefficient "a", which is inversely proportional to the frequency, and translated by a translation coefficient "b", which characterizes the displacement of the window along the time axis. The continuous wavelet transform (CWT) of a signal s(t) is defined by [25]:

$$CWT(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{+\infty} s(t)\, \Psi^{*}\!\left(\frac{t - b}{a}\right) dt$$

where $\Psi(t)$ is the mother wavelet and $\Psi^{*}(t)$ is its complex conjugate. In this study, we use the discrete wavelet transform (DWT) because of its simplicity and its lower computation time: in the CWT, the scale and translation coefficients vary continuously over the time and frequency domains of the analysed signal, which is costly in computation time.
Mallat [22] proposed an algorithm for computing the wavelet coefficients based on multi-resolution analysis, which implements the discrete wavelet transform as a sequence of filtering operations.
In fact, every signal consists of low-frequency components called approximations and high-frequency components called details. According to Mallat [22], the details and approximations can be separated by a pair of complementary filters H and G, a low-pass filter and a high-pass filter. The low-pass filter is associated with the scaling function and the high-pass filter with the wavelet function. Thus, the multi-resolution analysis allows a multi-scale decomposition of the starting signal by separating, at each level of resolution, the low frequencies (approximations) and the high frequencies (details) of the signal (Fig. 1).
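As a rough illustration of the filter-bank decomposition described above, the following sketch implements a periodic-extension version of the Mallat algorithm in plain Python with the Daubechies db2 filter pair. The function names, the periodic boundary handling, and the filter orientation are assumptions made for this sketch; they are not taken from the original implementation, and wavelet libraries may use different conventions.

```python
import math


def db2_filters():
    """Return the db2 low-pass (scaling) and high-pass (wavelet) filters."""
    s3 = math.sqrt(3.0)
    d = 4.0 * math.sqrt(2.0)
    h = [(1 + s3) / d, (3 + s3) / d, (3 - s3) / d, (1 - s3) / d]
    # Quadrature mirror relation: reverse h and alternate the signs.
    g = [h[3], -h[2], h[1], -h[0]]
    return h, g


def dwt_step(x, h, g):
    """One analysis level: filter, then keep one output sample out of two."""
    n = len(x)
    if n % 2:
        raise ValueError("signal length must be even at every level")
    approx, detail = [], []
    for k in range(n // 2):
        # Indices wrap around (periodic extension, an assumption of this sketch).
        approx.append(sum(h[m] * x[(2 * k + m) % n] for m in range(len(h))))
        detail.append(sum(g[m] * x[(2 * k + m) % n] for m in range(len(g))))
    return approx, detail


def mallat_decompose(x, levels):
    """Return the final approximation and the list of details, finest first."""
    h, g = db2_filters()
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = dwt_step(approx, h, g)
        details.append(d)
    return approx, details
```

With `levels=3`, the returned approximation plays the role of the a3 signal that is passed on to the feature extraction stage.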

III. MFCC
The MFCC method proposed by Davis and Mermelstein [26] aims at extracting the characteristic parameters of the vocal signal.
MFCC analysis exploits the properties of the human auditory system by transforming the linear frequency scale into the Mel scale [27], which provides the most efficient representation of the voice. The block diagram in Fig. 2 outlines the process of generating the MFCC coefficients. Having described the general operation of this procedure, we now explain its main blocks.

A. Pre-emphasis
Pre-emphasis is the filtering of a voice signal {s_n, n = 1, ..., N} by a first-order finite impulse response digital filter whose transfer function H(z) is given by [28]:

$$H(z) = 1 - k z^{-1} \qquad (2)$$

In this study, we experimentally set the pre-emphasis coefficient k to 0.97 [26].
Thus, the pre-emphasized signal $s'_n$ is related to the signal $s_n$ by the following formula:

$$s'_n = s_n - k\, s_{n-1}$$

This operation accentuates the high frequencies of the signal.
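The pre-emphasis filter can be sketched in a few lines of Python. The function name is hypothetical, and passing the first sample through unchanged (because it has no predecessor) is an assumption of this sketch rather than a detail stated in the text.

```python
def pre_emphasis(signal, k=0.97):
    """Apply y[n] = s[n] - k * s[n-1] to a list of samples."""
    if not signal:
        return []
    # Assumption: the first sample has no predecessor and is kept as is.
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - k * signal[n - 1])
    return out
```

On a constant signal the output drops to a small residual after the first sample, which reflects the attenuation of low frequencies relative to high ones.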

B. Segmentation
The voice signal is non-stationary, whereas signal processing methods assume stationary signals. It is therefore necessary, before extracting the recognition parameters, to cut the signal into frames of N speech samples lasting from 10 to 30 ms. This step yields, for each speech segment, a quasi-stationary signal. Adjacent frames overlap to avoid abrupt frame-to-frame transitions [12].
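A minimal sketch of the segmentation step follows. The 25 ms frame length and 10 ms hop are illustrative assumptions chosen inside the 10 to 30 ms range mentioned above; the text does not state the values actually used, and the function name is hypothetical.

```python
def frame_signal(signal, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Cut a list of samples into overlapping frames of equal length."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if frame_len < 1 or hop_len < 1:
        raise ValueError("frame and hop must contain at least one sample")
    frames = []
    # Trailing samples that do not fill a whole frame are dropped.
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frames.append(signal[start:start + frame_len])
    return frames
```

Because the hop is shorter than the frame, consecutive frames share samples, which is the overlap that smooths the frame-to-frame transitions.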

C. Windowing
Segmentation produces discontinuities at the ends of the frames. The purpose of windowing is to reduce these discontinuities by multiplying the samples {s_n, n = 1, ..., N} of each frame of the vocal signal by a Hamming window, given by the following equation [29,30]:

$$w(n) = 0.54 - 0.46 \cos\!\left(\frac{2\pi (n - 1)}{N - 1}\right), \quad n = 1, \ldots, N$$

The advantage of this window is that its frequency resolution is high and its secondary spectral lobe is very small compared to its primary lobe (attenuation of -43 dB) [31].
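The windowing step can be sketched as follows, assuming the usual Hamming coefficients 0.54 and 0.46. The code uses a zero-based index, so the cosine argument is 2πn/(N - 1) for n = 0, ..., N - 1; the function names are hypothetical.

```python
import math


def hamming_window(n_samples):
    """Return the Hamming window coefficients for a frame of n_samples."""
    if n_samples == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def apply_window(frame):
    """Multiply each sample of a frame by the matching window coefficient."""
    window = hamming_window(len(frame))
    return [s * w for s, w in zip(frame, window)]
```

The coefficients taper from 0.08 at both ends to 1.0 at the centre, so the frame edges contribute little to the spectrum.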

E. Mel Filtering with Filter Bank
The MFCC method extracts parameters according to the Mel scale in the frequency domain. Indeed, the perception of speech by the human auditory system is based on a frequency scale similar to the Mel scale. This scale is linear at low frequencies and logarithmic at high frequencies and is given by the following equation:

$$Mel(f) = 2595 \log_{10}\!\left(1 + \frac{f}{700}\right)$$

where f is the frequency in Hz.
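The sketch below converts between hertz and mels with the commonly used formula Mel(f) = 2595 log10(1 + f/700) and places filter centres at equal steps on the Mel scale. The function names and the choice of band edges are assumptions of this sketch, not values stated in the text.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse conversion, from mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_low, f_high):
    """Centre frequencies (Hz) of filters equally spaced on the Mel scale."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + step * (i + 1)) for i in range(n_filters)]
```

For example, `mel_center_frequencies(20, 0.0, 8000.0)` yields centres that are closely spaced at low frequencies and increasingly far apart at high frequencies.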

F. Logarithm and DCT
The MFCC coefficients can be calculated directly by applying the discrete cosine transform (DCT) to the logarithms of the energies $m_j$ at the outputs of the N triangular filters (in this study, N = 20) [29,30]:

$$c_i = \sqrt{\frac{2}{N}} \sum_{j=1}^{N} m_j \cos\!\left(\frac{\pi i}{N}\,(j - 0.5)\right)$$

where i is the index of the cepstral coefficient to extract and N is the number of triangular filters.
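The following sketch computes cepstral coefficients from a list of log filter-bank energies. It assumes the DCT form c_i = sqrt(2/N) Σ_j m_j cos(πi(j - 0.5)/N) with j running from 1 to N, which is one common convention; the function name and the default of 12 coefficients (the number retained in this study) are choices of this sketch.

```python
import math


def mfcc_from_log_energies(log_energies, n_coeffs=12):
    """Return c_1 .. c_n_coeffs from the log energies of the filter bank."""
    n = len(log_energies)
    scale = math.sqrt(2.0 / n)
    coeffs = []
    for i in range(1, n_coeffs + 1):
        # j runs from 1 to N, matching the (j - 0.5) term of the DCT.
        total = sum(m_j * math.cos(math.pi * i / n * (j - 0.5))
                    for j, m_j in enumerate(log_energies, start=1))
        coeffs.append(scale * total)
    return coeffs
```

A flat log spectrum gives coefficients that are all zero, since c_1 and above only capture the shape of the spectral envelope.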

G. Liftering
The higher-order cepstral coefficients are very small in amplitude. To overcome this problem, we apply liftering, which rescales the cepstrum so that the amplitudes of the coefficients become comparable [29,30]:

$$c'_i = \left(1 + \frac{L}{2} \sin\!\left(\frac{\pi i}{L}\right)\right) c_i$$

where L is the cepstral sine lifter parameter. In this study, we used L = 22 [29].
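A sketch of the liftering step, assuming the sinusoidal lifter c'_i = (1 + (L/2) sin(πi/L)) c_i with i counted from 1; the function and parameter names are hypothetical.

```python
import math


def lifter(cepstra, lifter_param=22):
    """Rescale cepstral coefficients c_1, c_2, ... with a sine lifter."""
    return [(1.0 + (lifter_param / 2.0) * math.sin(math.pi * i / lifter_param)) * c
            for i, c in enumerate(cepstra, start=1)]
```

With L = 22 the gain grows from about 2.6 for the first coefficient to 12 for the eleventh, which lifts the small higher-order coefficients towards the range of the lower-order ones.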
IV. CLASSIFIER SVM
The Support Vector Machine (SVM) is a class of machine learning methods (kernel learning methods [32]) developed by Vapnik et al. in the early 1990s [33]. In classification problems with small samples, the SVM is considered one of the most powerful tools.
An SVM can transform a nonlinearly separable problem into a linearly separable one by using a kernel function to project the training set into a feature space and then constructing the hyperplane that maximizes the margin between the classes [34,35]; this hyperplane is used to classify the test samples. The hyperplane can be described as:

$$f(x) = w^{T} x + b = 0$$

where w is the normal vector of the hyperplane and b is the bias.
Suppose that the training set is described as:

$$S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$$

where $x_i \in \mathbb{R}^{d}$, i = 1, 2, ..., n, is a feature vector of dimension d, $y_i \in \{-1, 1\}$ is the class label of $x_i$, and n is the number of training samples.
With the training set S, the optimal w and b are obtained by solving the following optimization problem, which is handled by introducing Lagrange multipliers [36]:

$$\min_{w, b, \xi} \; \frac{1}{2} \|w\|^{2} + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i \left(w^{T} x_i + b\right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, n$$

where w is the weight vector and b is the bias, both of which are determined only by the training samples. The regularization parameter C is a penalty factor that balances the model complexity and the empirical risk. The $\xi_i$ are non-negative slack variables, which represent the distance between a misclassified sample and the optimal hyperplane. The function $K(x_i, x_j)$ is the kernel function, which replaces the scalar product of two samples. Common kernels include:
• Linear kernel (simple scalar product): $K(x_i, x_j) = x_i^{T} x_j$
• Radial Basis Function (RBF) kernel: $K(x_i, x_j) = \exp\!\left(-\gamma \|x_i - x_j\|^{2}\right)$
• Polynomial kernel: $K(x_i, x_j) = \left(\gamma\, x_i^{T} x_j + r\right)^{p}$
where $\gamma > 0$, r and p are kernel parameters. Then f(x) can be computed with the following formula:

$$f(x) = \operatorname{sign}\!\left(\sum_{i=1}^{n} \alpha_i^{*}\, y_i\, K(x_i, x) + b\right)$$

where the $\alpha_i^{*}$ are the non-zero components of the Lagrange multiplier vector $\alpha = (\alpha_1, \ldots, \alpha_n)^{T}$.
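To make the role of the kernel and of the multipliers concrete, the sketch below evaluates the three kernels and the decision rule sign(Σ α_i y_i K(x_i, x) + b) for multipliers that are already known. It does not solve the optimization problem; the function names, the kernel parameter defaults, and the toy support vectors in the usage note are assumptions of this sketch.

```python
import math


def dot(u, v):
    """Scalar product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def rbf_kernel(u, v, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def polynomial_kernel(u, v, gamma=1.0, coef0=1.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree


def svm_decide(x, support_vectors, labels, alphas, bias, kernel=linear_kernel):
    """Return +1 or -1 from the kernel expansion over the support vectors."""
    score = sum(a * y * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas)) + bias
    return 1 if score >= 0 else -1
```

For the two toy points (1, 1) with label +1 and (-1, -1) with label -1, multipliers of 0.25 each and a zero bias give the maximum-margin hyperplane x_1 + x_2 = 0, so `svm_decide([2.0, 0.0], ...)` falls on the positive side.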

VI. RESULTS
This study aims to choose the analyzing wavelet. The criterion for choosing the best wavelet remains an open problem: no wavelet is better than all the others, and the choice depends on the application. In some cases, the simplest wavelet (Haar) is optimal; in other applications, it is the worst choice.
To assess the performance of the SVM classifier, we insert a block based on the discrete wavelet transform before the extraction of the MFCC coefficients in order to obtain an accurate diagnosis of Parkinson's disease (see Fig. 4).
We use the database [19], which consists of 20 recordings of patients with Parkinson's disease and 18 recordings of healthy subjects. All of them utter the vowel "a".
We apply Mallat's multi-resolution analysis algorithm (Fig. 1) to these vocal signals using different analyzing wavelets (see Table I), then extract the Mel-scale cepstral coefficients, and finally classify them with the SVM.
Fig. 5 presents the wavelet and scaling functions of each wavelet in Table I at level 2 (db2, coif2, sym2, ...). In the first phase, we transform the recordings using the discrete wavelets. Fig. 6(a) presents the vocal signal of one patient before compression and after compression with the different types of DWT; Fig. 6(b) presents a zoom on the two representations of the signal. In the second phase, the approximation a3 of each DWT is the input to the MFCC block, in which we extract the first 12 MFCC coefficients of each patient using the program "Htk mfcc matlab" [37]. We keep only the first 12 coefficients because the accuracy starts to decrease beyond that number [12]. These coefficients are the features on which we rely for the classification in order to obtain an accurate diagnosis. The MFCC matrix contains a large number of frames, which require significant processing time for classification and prevent an accurate diagnosis [27]. To solve this problem, we calculate the average value of these frames to obtain the voiceprint (Fig. 7).

The performance of the classifier is measured by the accuracy, the sensitivity, and the specificity:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (15)$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \qquad (16)$$

$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (17)$$

With:
• TP: a true positive (healthy subject who was correctly classified).
• TN: a true negative (patient with Parkinson's disease who was correctly classified).
• FP: a false positive (patient with Parkinson's disease who was incorrectly classified).
• FN: a false negative (healthy subject who was incorrectly classified).
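The three measures can be computed from the four counts as in the sketch below, which uses the standard definitions of accuracy, sensitivity and specificity. The function name is hypothetical, and the counts in the usage note are an illustrative split chosen only so that 33 of the 38 recordings are correct; they are not the confusion matrix of this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) as fractions in [0, 1]."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity
```

For instance, `classification_metrics(tp=16, tn=17, fp=3, fn=2)` describes 18 healthy and 20 Parkinsonian recordings with 33 correct decisions, and 33/38 corresponds to the 86.84% accuracy figure; other splits of the five errors would give the same accuracy with different sensitivity and specificity.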
Table II gives the accuracy, sensitivity, and specificity obtained on all the recordings when the SVM is trained on the learning base (73% of the database) built from the output of the MFCC block.
The earlier diagnostic process published by Benba et al. [12], which was based on MFCC without wavelets, reached an accuracy of 80%, whereas our study, which applies several wavelets before the extraction of the cepstral coefficients, achieves an accuracy of more than 80%, as shown in Table II.
An accuracy of 86.84% is reached at level 2 and at the 3rd scale for some wavelets, namely Daubechies, Symlet, Bior Splines and Reverse Bior. The vocal signals are transformed by the DWT, and the approximations at the first five scales are injected into the MFCC block in order to extract the 12 cepstral coefficients each time. These coefficients are then classified by the SVM with a learning base made of 73% of the database. When we test on data containing all the recordings, we obtain an accuracy of 86%, which is higher than the result achieved without wavelets. From this we conclude that the discrete wavelet transform increases the accuracy of the classifier at level 2 and at the 3rd scale with the Daubechies, Symlet, Bior Splines and Reverse Bior wavelets. We also conclude that there is no need to work with the 4th and 5th scales, because the accuracy starts to decrease beyond the 3rd scale.

Fig. 2. Process of the Extraction of the Cepstral Coefficients.
V. METHODOLOGY
This study aims to assess the performance of the SVM classifier when a block based on the wavelet transform is inserted before the extraction of the MFCC coefficients, for an accurate diagnosis of Parkinson's disease (see Fig.

Fig. 3. Process of Diagnosis of Parkinson's Disease.
Our methodology for fulfilling the objectives of this study, which is based on the combination of the wavelet choice, MFCC and the SVM classifier, works in two phases, a learning phase and a test phase, as shown in Fig. 4, on a database [19] composed of 18 healthy subjects and 20 patients with Parkinson's disease. During the learning phase, the vocal signals, after the wavelet transform and the extraction of the MFCC coefficients, yield a model of the patients and of the healthy subjects. During the test phase, again after the wavelet transform and the extraction of the MFCC coefficients, the classifier makes the membership decision based on the similarity between the model established during training and the test sample.

Fig. 6. (a) A Zoom of Speech before Compression. (b) A Zoom of Speech after Compression using.
Fig. 7(a) presents the 12 MFCC coefficients and the voiceprint of a healthy subject, whereas Fig. 7(b) presents the MFCC coefficients and the voiceprint of a patient with Parkinson's disease. The third phase aims at classifying patients and healthy subjects. To this end, we create a training base from 73% of the database. We then run a diagnostic test on the whole database with an SVM classifier with a linear kernel trained on this base.
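The voiceprint and the size of the training base can be sketched as follows. The voiceprint is taken here as the per-coefficient mean of the MFCC frames of one recording, and the number of training recordings is assumed to be 73% of the database rounded to the nearest integer; the function names and the rounding rule are assumptions of this sketch.

```python
def voiceprint(mfcc_frames):
    """Average a list of per-frame MFCC vectors into one vector."""
    n_frames = len(mfcc_frames)
    if n_frames == 0:
        raise ValueError("at least one frame is required")
    n_coeffs = len(mfcc_frames[0])
    return [sum(frame[i] for frame in mfcc_frames) / n_frames
            for i in range(n_coeffs)]


def training_size(n_recordings, fraction=0.73):
    """Number of recordings in the training base (rounding is an assumption)."""
    return int(round(n_recordings * fraction))
```

Under this rounding assumption, a database of 38 recordings gives a training base of 28 voiceprints, each holding 12 values.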

TABLE II. ACCURACY, SENSITIVITY AND SPECIFICITY OF THE DIFFERENT WAVELETS AT THE FIRST 5 SCALES
In this article, we have presented a study of the diagnosis of Parkinson's disease based on the extraction of MFCC coefficients from a database of recordings of patients and healthy subjects. The transformation of vocal signals is treated by numerous types of