Identification of People with Parkinson's Suspicions through Voice Signal Processing

Parkinson's disease has a highly variable prognosis. It originates in a multisystemic neurodegenerative process that affects the central nervous system, which is responsible for the motor control of the body. It also produces chronic joint pain, and untreated patients additionally suffer states of depression. The disease currently has no cure, so the patient's family is advised to help maintain the patient's quality of life. The age of incidence starts at 40 years, and the INCN (Instituto Nacional de Ciencias Neurológicas) reports 3,000 cases of Parkinson's disease in Peru annually. This research paper proposes an algorithm in MATLAB that extracts characteristics of the voice spectrum through voice signal processing, to support early detection so that patients can receive treatment that eases and slows Parkinson's disease. The processing consists of applying the Fast Fourier Transform (FFT) to the audio, identifying the signal bodies, separating the signal into frequency periods, and finally finding the average and maximum values of each period. The largest differences were found at the lower frequencies. The test was also performed with patients suspected of having Parkinson's disease, and the same differences were obtained, in the frequency periods [9 Hz to 13 Hz], [20 Hz to 30 Hz] and [40 Hz to 54 Hz]. In particular, in the period from 20 Hz to 30 Hz, amplitude values below 3.5 are an initial indication of suspected Parkinson's disease.
Keywords: Voice signal processing; Parkinson Disease (PD); Fast Fourier Transform; speech signal segmentation; audio treatment


I. INTRODUCTION
Parkinson's disease (PD) is caused by a multisystemic neurodegenerative process that primarily affects the central nervous system. It is characterized by bradykinesia (slow movement), stiffness, tremor and progressive loss of postural control [1]. The progression of the damage differs from person to person, and the causes of the disease are not known with certainty. PD is not fatal in itself: the affected person does not die from PD, and the life expectancy of a patient with PD is equal to that of a person without the disease. Unfortunately, there is currently no definitive cure for PD, which is why it is described as a chronic and incurable disease [2].
According to the INCN (Instituto Nacional de Ciencias Neurológicas) of the Ministerio de Salud (MINSA), Parkinson's disease affects approximately 30 thousand people, and the current incidence in Peru is 3,000 cases of PD annually [3]. Some of the most common symptoms, which appear at the onset and persist throughout the disease, are: joint pain; extreme tiredness, meaning that the body feels tired very frequently, in some cases without physical effort; dragging of the foot; difficulty writing because of the tremor; a broken voice; and long-term depressive symptoms [4].
According to the INCN, Parkinson's disease affects both men and women. The INCN also indicates that 90% of cases occur from the age of 40 onward, although the average age of incidence is between 50 and 60 years [5]. It further indicates that many medical investigations are under way into the treatment and the cause of this chronic disease, which affects the central nervous system and the control of body movement [6]. The neurologist Carlos Cosentino Esquerre indicates that most patients with PD experience states of depression, which is why they should be provided with quality of life as well as motivation to exercise and to walk for periods of time [7].
Parkinson's disease is classified in grades from I to V. It usually begins with mild symptoms on only one half of the body. In the second grade, the symptoms appear throughout the body. Next, postural instability appears; the symptoms can already be observed, but the patient remains independent. In the following grade, the patient presents a severe disability in addition to well-marked symptoms. Finally, in the last grade, the patient becomes dependent, needs help with everything, and remains sitting or in bed [8]. In all grades, constant tremor occurs in the body. There are currently medical and non-medical treatments to reduce symptoms, slow the evolution of the disease and effectively improve the quality of life of patients [9].
Identifying people with PD enables early medication, which in many cases prevents visible symptoms and slows the progression of PD. A patient who is not treated in time is known to experience very strong chronic pain and fast progression of the disease. The physical exercise needed to keep the body active should also be stressed, since it helps avoid depression and disease progression [10].
The disruption of motor speech control is very frequent and notable in patients with PD; it is estimated that more than 90% of these patients develop a speech disorder known as Hypokinetic Dysarthria (HD) [11]. Among the speech symptoms associated with HD, vocalization abnormalities are the most frequent and widely observed in patients with PD. They can be characterized by a general reduction in tone and volume during prolonged speech, in addition to abnormal self-control of speech [12].
In [13], the authors describe an electronic device that provides speech exercises for patients with Parkinson's disease and analyzes their voice spectrum. The analyses were carried out in closed environments to avoid ambient noise and thus obtain data without distortion. They report that the frequency range from 10 Hz to 30 Hz is where anomalies appeared relative to the audio of a person without the disease. That paper fragments the voice signal of the patient with PD for each exercise and obtains notable characteristics, such as distortion at low frequencies and low amplitudes. The same results were obtained in this paper, with the low frequencies being among the most differentiated.
In [14], the authors study the resting voices of patients with Parkinson's disease. After obtaining the voice signal, they fragment it into samples and analyze the variations in the peaks and drops of the signal, which reveals the letters that are most difficult for patients with PD to pronounce. They also report that the greatest variation compared with people who do not suffer from the disease lies between 15 Hz and 45 Hz. In addition, they used medication such as dopamine to observe how the patient improved, and obtained an improvement in the pronunciation of words and even of certain phrases. Voice spectrum analysis is important for identifying patients with PD because of its characteristic frequency and amplitude.
In [15], the authors present a predictive analysis of people with suspected PD based on the voice spectrum. Samples were taken from people with tremor in the voice and represented in the time and amplitude domain. They also analyzed people with PD to identify the time periods in which differences appear. The analysis was done to prevent the symptoms of Parkinson's disease from accelerating. In addition, by applying artificial intelligence methods, they trained a system to identify people with possible signs of Parkinson's disease.
The main objective of this research paper is to identify people with Parkinson's disease through voice processing, because these patients show a tremor when reading or speaking. For this reason, audio of patients with PD and of people who do not suffer from the disease was recorded. Both sets of voice signals were then processed and compared to obtain the most notable differences and to identify the frequencies, the averages and the maximum values where there is a significant variation.
Digital voice processing is the study of voice signals and of the techniques used to process them; it is used for voice recognition, voice coding, voice analysis, and so on. In the biomedical field, voice processing can be used to analyze the voice spectrum, followed by filtering techniques, because recordings can contain noise that must be eliminated. Currently, voice processing is also used for voice distortion and for pattern recognition, such as recognizing specific words [16].
Digital signal processing is the mathematical manipulation of an information signal to modify or improve it with respect to some parameter. Signals are usually represented in the time domain or in the frequency domain. To compare signals or identify differences between them, they must be filtered and all brought into the frequency and amplitude domain, where the average and maximum values can be identified. This procedure serves to verify differences between voice recordings; it is widely used in voice tuners to modulate the voice at different frequencies and powers. In this research paper, it is used to find the maximum amplitude and the average of each frequency period, and to identify the differences between patients with Parkinson's disease and people who do not suffer from it.
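As a brief reference for the frequency-domain representation described above, the standard discrete Fourier transform, and the power and frequency axis commonly derived from it, can be written as follows. The symbols $x[m]$ (the sampled signal), $X[k]$, $N$ (number of samples), $P[k]$ and $f_k$ are notation introduced here for illustration, with $f_s$ the sampling frequency; they are not taken from the original text.

```latex
\begin{align}
X[k] &= \sum_{m=0}^{N-1} x[m]\, e^{-j 2\pi k m / N}, \qquad k = 0, 1, \dots, N-1 \\
P[k] &= \frac{\lvert X[k] \rvert^{2}}{N}, \qquad f_k = \frac{k\, f_s}{N}
\end{align}
```

Under this notation, comparing two recordings amounts to comparing $P[k]$ over the same range of $f_k$.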
This research paper is structured as follows. Section II presents the digital voice processing methodology for conversion to the frequency and amplitude domain, together with the filtering of ambient noise to obtain the real voice signal. Section III shows bar diagrams of the maximum and average values of the frequency periods, identifying the differences in voice between patients with early PD and people who do not suffer from the disease. Finally, Section IV presents the discussion and conclusions of the research work.

II. METHODOLOGY
This section presents the steps followed for the identification of people with suspected Parkinson's disease through voice processing. Fig. 1 shows the process of the algorithm; several stages were used to obtain clean audio without external noise.

A. Audio Acquisition
For audio acquisition, a smartphone was used because the recording had to be made as close to the person as possible, as shown in Fig. 2.
In addition, Milton Gonzales, who holds a Bachelor's degree in Literature, was asked to provide paragraphs whose pronunciation requires good modulation. These paragraphs are fragments of Peruvian literary books, as follows:
Audio 1: "Un poema es una obra. La poesía se polariza, se congrega y aísla en un producto humano: cuadro, canción, tragedia. Lo poético es poesía en un estado amorfo; el poema es creación, poesía erguida. Sólo en el poema la poesía se aísla y revela plenamente" [17].
Audio 2: "Apenas desviamos los ojos de lo poético para fijarlos en el poema, nos asombra la multitud de formas que asume ese ser que pensábamos único. ¿Cómo asir la poesía si cada poema se ostenta como algo diferente e irreducible? La ciencia de la literatura pretende reducir a géneros la vertiginosa pluralidad del poema" [17].
After the audio was obtained, it was converted to .wav format, which is the preferred format for the subsequent voice processing in MATLAB. Note that the time each person takes to read does not matter, because the analysis is carried out in the frequency and amplitude domain.
To load an audio file, two variables are declared, the data and the sampling frequency (fs), using the following code:
[data,fs]=audioread('audio_patient.wav');
These variables are used in the following steps of the voice processing. The resulting signal is like the one shown in Fig. 3(a).
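The loading step can be tried without a real recording. The following is a minimal sketch, assuming a synthetic 200 Hz tone in place of a patient's voice; the tone, the 8000 Hz sampling frequency, the temporary file name and the stereo-to-mono step are all assumptions made for illustration, not part of the study.

```matlab
% Sketch: write a synthetic tone to a temporary .wav file, then load it
% with audioread. None of these values come from the study.
fs_out = 8000;                       % assumed sampling frequency in Hz
t = (0:fs_out-1)' / fs_out;          % one second of sample times (column)
tone = 0.5 * sin(2*pi*200*t);        % placeholder signal, 200 Hz

wav_file = [tempname '.wav'];        % hypothetical stand-in for audio_patient.wav
audiowrite(wav_file, tone, fs_out);

[data, fs] = audioread(wav_file);    % data: samples, fs: sampling frequency
if size(data, 2) > 1
    data = mean(data, 2);            % assumption: average stereo channels to mono
end
duration_s = length(data) / fs;      % recording length in seconds
delete(wav_file);                    % remove the temporary file
```

Because `audioread` returns the samples and the sampling frequency together, the recording length follows directly from `length(data)/fs`.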

B. Audio Processing
At this stage the voice signal has been loaded into the software, so filters are applied to improve it. The first filter applied is noise elimination. This filter isolates the frequencies that contain the data relevant to the analysis and eliminates the others.
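The text does not specify which noise filter was used. One simple way to keep only a band of interest is to zero the unwanted FFT bins and transform back, sketched below under stated assumptions: the band limits `f_low` and `f_high`, the synthetic test signal and the frequency-domain masking approach are all illustrative choices, not the authors' filter.

```matlab
% Sketch of a band-limiting noise filter by masking FFT bins.
% Signal, band limits and method are assumptions for illustration.
fs = 8000;                                  % assumed sampling frequency in Hz
t = (0:fs-1)' / fs;                         % one second of sample times
clean = sin(2*pi*200*t);                    % placeholder "voice" component
noisy = clean + 0.5*sin(2*pi*3000*t);       % added high-frequency interference

f_low = 50;                                 % hypothetical lower band limit in Hz
f_high = 1000;                              % hypothetical upper band limit in Hz

n = length(noisy);
Y = fft(noisy);
f = (0:n-1)' * (fs/n);                      % frequency of each FFT bin in Hz
f_sym = min(f, fs - f);                     % fold mirrored bins onto 0..fs/2
keep = (f_sym >= f_low) & (f_sym <= f_high);% symmetric mask keeps output real
Y(~keep) = 0;                               % discard bins outside the band
filtered = real(ifft(Y));                   % back to the time domain
```

The mask is built on the folded frequency `f_sym` so that each retained bin keeps its mirrored counterpart; otherwise the inverse transform would not be real-valued.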
After the external noise was eliminated, the Fast Fourier Transform was applied to the voice signals; this changes the domain of the signal so that it can be analyzed. The software computes a sum over complex numbers in which the frequency, time and amplitude of the signal are taken into account, as shown in Fig. 3(b). The data are extracted using the following code:
n = length(data);
y = fft(data);
f = (0:n-1)*(fs/n)/10;
power = abs(y).^2/n;
As the code shows, the length of the data and the sampling frequency are needed to graph the signal after the Fast Fourier Transform. In the "power" line, the absolute value of the FFT output is taken because that output is complex-valued.
The next step of the voice processing is to identify the signal bodies. This is done to determine the frequency periods into which the signal must be segmented, so that the characteristics of each period can be extracted [18]. The frequencies were sectored by observing the bodies, as shown in Fig. 3. These partitions by frequency period can be seen in Fig. 3(d), where each one is fitted to its amplitude and parameterized frequency. For this, the following code was used:
if f(i) < frecuencia2
C(control) = f(i);
control=control+1;
end
end
end
As the code shows, a "for" loop was used because the algorithm has to review the data element by element to select a given frequency range and to build a frequency array that stores the selected values. Next, each frequency position is associated with its data value. The "find" function is used for this: because each frequency has an assigned value, the algorithm knows that the amplitude is at the same position as the frequency, so that it can be plotted.
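The spectrum and segmentation steps can be sketched end to end as follows. This is a hedged reconstruction, not the authors' full listing: the outer `for`, the first `if` and the name `frecuencia1` are guesses at the missing start of the fragment, the way `data1` and `data2` are obtained with `find` is an assumption, and the test signal is synthetic. The sketch also uses the plain frequency axis `(0:n-1)*(fs/n)` in Hz, without the additional division by 10 that appears in the paper's listing.

```matlab
% Sketch: power spectrum plus selection of one frequency period.
% Reconstructed parts are marked as assumptions in the comments.
fs = 8000;                                   % assumed sampling frequency in Hz
t = (0:fs-1)' / fs;
data = sin(2*pi*25*t) + 0.5*sin(2*pi*100*t); % placeholder signal, not voice data

n = length(data);
y = fft(data);
f = (0:n-1)' * (fs/n);                       % frequency axis in Hz (no /10 scaling)
power = abs(y).^2 / n;                       % abs() because y is complex-valued

frecuencia1 = 20;                            % assumed name: lower limit of the period
frecuencia2 = 30;                            % upper limit, as in the fragment

C = [];                                      % frequencies that fall inside the period
control = 1;
for i = 1:n                                  % assumed outer loop
    if f(i) > frecuencia1                    % assumed first condition
        if f(i) < frecuencia2
            C(control) = f(i);
            control = control + 1;
        end
    end
end

% Assumed use of find: positions of the first and last selected frequency.
data1 = find(f == C(1));
data2 = find(f == C(end));
band_f = f(data1:data2);                     % frequencies of the period
band_power = power(data1:data2);             % power values at the same positions
```

In idiomatic MATLAB the loop can be replaced by a logical mask such as `idx = f > frecuencia1 & f < frecuencia2;`, which selects the same positions without growing `C` element by element.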
To graph the voice signal in the declared frequency range and to find its average and maximum values, the following code is used:
plot(f(data1:data2),power(data1:data2));
avg_value = mean(power(data1:data2));
max_value = max(power(data1:data2));
Finally, the averages and the maximum values are graphed to identify the differences, as can be seen in Fig. 3(e) and (f). These bar graphs show that the difference between the voice spectrum of a person with PD and that of a person who does not suffer from the disease appears mostly at the low frequencies.
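The per-period statistics and the bar graph can be sketched as below for the three frequency periods named in this paper. The input signal is synthetic, the variable names `bands`, `band_mean` and `band_max` are hypothetical, and the frequency axis is in plain Hz; the numbers this sketch produces are therefore not comparable with the values in the paper's tables.

```matlab
% Sketch: average and maximum power per frequency period, then a bar graph.
% Synthetic input; variable names are illustrative assumptions.
fs = 8000;                                   % assumed sampling frequency in Hz
t = (0:fs-1)' / fs;
data = sin(2*pi*11*t) + 0.8*sin(2*pi*25*t) + 0.3*sin(2*pi*47*t);

n = length(data);
f = (0:n-1)' * (fs/n);                       % frequency axis in Hz
power = abs(fft(data)).^2 / n;

bands = [9 13; 20 30; 40 54];                % frequency periods from the paper, in Hz
n_bands = size(bands, 1);
band_mean = zeros(n_bands, 1);
band_max = zeros(n_bands, 1);
for b = 1:n_bands
    idx = (f >= bands(b, 1)) & (f <= bands(b, 2));
    band_mean(b) = mean(power(idx));         % average of the period
    band_max(b) = max(power(idx));           % maximum of the period
end

bar([band_mean band_max]);                   % grouped bars: average and maximum
set(gca, 'XTickLabel', {'9-13 Hz', '20-30 Hz', '40-54 Hz'});
legend('Average', 'Maximum');
ylabel('Power');
```

The results are stored under names other than `mean` and `max`, so the built-in functions remain callable on every pass through the loop.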

III. RESULTS
Voice audio was acquired from the same paragraphs for every participant so that the recordings could be compared. The analysis was performed on 10 people: 2 patients with clinically confirmed Parkinson's disease, 3 people with early symptoms of Parkinson's disease, and 5 people who do not suffer from the disease. Two audio recordings were taken from each person, giving 20 recordings that were analyzed by the algorithm to recognize suspected Parkinson's disease.
As mentioned earlier, the first symptoms of Parkinson's disease are dysfunction of the motor control of the body and a voice with tremors and a lower volume than normal. These tremors cause a variation in frequencies, and the differences are notable when compared with audio of people who do not have the disease.
Table I shows the characteristics extracted from the voice signals of the patient with Parkinson's disease, organized by frequency period, audio, average and maximum value. As can be verified, the values are much lower, and the maximum values are also very low.
Table II shows the characteristics extracted from the voice signals of a person who does not suffer from the disease, organized in the same way as the previous table. The values are higher at the first frequencies, and the maximum values are also higher than those of the patient with Parkinson's disease. In Fig. 4, the blue bars represent the average and the red bars the maximum value of each frequency range. The most notable differences are in the first frequency ranges, from 9 Hz to 30 Hz, as reported in the previous studies discussed earlier.
The same test was applied to a patient with suspected Parkinson's disease, who presented some symptoms at an early stage, such as motor dysfunction, a trembling voice and reduced voice volume. Table III shows the characteristics extracted from the voice signals of this patient; similarities can be identified with the data obtained from the patient with Parkinson's disease.
These data were compared with the voice data of another person who neither suffers from nor is suspected of having Parkinson's disease; as can be seen in Table IV, this person has higher maximum values in the lower frequency ranges. The same result was obtained in the previous tests, and it is similar to the results of previous papers on voice processing of patients with Parkinson's disease. In Fig. 5, the blue bars again represent the average and the red bars the maximum value of the frequency period indicated at the bottom. The differences are again found at the lower frequencies; in addition, the signals of patients with suspected Parkinson's disease have more noticeable maximum values at higher frequencies.
From both analyses, the frequency periods where notable differences appear are [9 Hz to 13 Hz], [20 Hz to 30 Hz] and [40 Hz to 54 Hz]. This means that notable differences are found mostly at the lower frequencies. The maximum values also vary, but in both studies the frequency range from 20 Hz to 30 Hz presented a higher maximum value as well as higher averages.
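The abstract states that amplitude values below 3.5 in the period from 20 Hz to 30 Hz are an initial indication of suspected Parkinson's disease. A minimal sketch of that rule follows, assuming it is applied to the maximum value of the period; that interpretation, the variable names and the two input values are hypothetical and are not data from the study.

```matlab
% Sketch of the 20-30 Hz threshold rule described in the abstract.
% Input values are made up for illustration; they are not study data.
threshold = 3.5;                     % threshold reported for the 20-30 Hz period
band_max_values = [2.1 5.8];         % hypothetical period maxima for two recordings

% Assumption: a recording is flagged when its period maximum is below the threshold.
is_suspect = band_max_values < threshold;
```

The threshold is tied to the amplitude scale of the authors' recordings and processing, so it would need to be re-established for any other recording setup.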

IV. DISCUSSION AND CONCLUSIONS
This research paper confirms that voice processing techniques can be used to detect suspected Parkinson's disease, by analyzing the voice spectrum and identifying the frequency periods where the greatest differences appear. It also underlines the importance of early detection of the disease, in order to slow down and ease the chronic damage suffered by patients.
This research paper focused on frequency and amplitude because, when audio is recorded in different environments, not everyone says a sentence at the same speed. For this reason, the axes of the signal were changed so that the voice spectrum could be examined in frequency periods, together with the maximum observable amplitude values.
It is concluded that people with suspected Parkinson's disease have a lower voice volume and a choppy, nonlinear voice. These differences are most evident in the frequency ranges and the amplitude of the signal, which is why the average and the maximum value of the data were computed in this research paper.
It is also concluded that the signal was separated by frequency periods because notable bodies were present in those periods. In addition, a fixed range was parameterized and applied to all analyzed signals, so that the analyses of the voice spectra are consistent and their results can be compared.
As future work, we want to implement the algorithm in a device on which more tests can be carried out, thereby improving its accuracy, detecting the disease in time to prevent its progression, and letting people know whether they are suspected of having Parkinson's disease.