Hearing Aid Method by Equalizing Frequency Response of Phoneme Extracted from Human Voice

A hearing aid method that equalizes the frequency response of each phoneme extracted from the human voice is proposed. One problem with existing hearing aids is poor customization of the frequency response compensation, even though frequency response characteristics differ from one hearing aid user to another. The proposed hearing aid equalizes the frequency response phoneme by phoneme; that is, the frequency characteristics of each phoneme are equalized, which is the distinguishing feature of the method. Experiments show that the proposed phoneme-based hearing aid is superior to the conventional hearing aid.
Keywords: Hearing aid; phoneme; frequency response; equalization filter; hidden Markov model (HMM)


INTRODUCTION
In general, the ability to hear human voices declines in elderly persons because the high-frequency response of their ears deteriorates. Hearing capability is commonly expressed by the well-known averaged hearing level, defined as the average hearing level for human voices over frequency components ranging from 500 Hz to 4000 Hz. Under this definition, a level of 25-40 dB means that soft voices are slightly difficult to hear, while a level of 40-70 dB means that voices at a normal level are difficult to hear.
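The averaged hearing level described above can be sketched in a few lines. This is only an illustration: the unweighted mean, the function names, the audiogram values, and the assignment of the 40 dB boundary to the upper band are all assumptions, not details given in the text.

```python
from statistics import mean

def averaged_hearing_level(thresholds_db):
    """Unweighted mean threshold (dB) over test frequencies in 500-4000 Hz.

    Assumption: a plain mean is used; the text does not state a weighting.
    """
    in_band = [db for hz, db in thresholds_db.items() if 500 <= hz <= 4000]
    if not in_band:
        raise ValueError("no thresholds between 500 Hz and 4000 Hz")
    return mean(in_band)

def describe(level_db):
    """Map a level onto the two bands named in the text."""
    if 25 <= level_db < 40:
        return "slight difficulty with soft voices"
    if 40 <= level_db <= 70:
        return "difficulty with voices at a normal level"
    return "outside the two bands described in the text"

# Hypothetical audiogram: frequency (Hz) -> hearing threshold (dB).
audiogram = {250: 20, 500: 30, 1000: 30, 2000: 35, 4000: 45, 8000: 70}
level = averaged_hearing_level(audiogram)
print(level, describe(level))
```

Only the 500, 1000, 2000 and 4000 Hz entries contribute to the average here; the 250 Hz and 8000 Hz entries fall outside the stated range and are ignored.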
Earlier devices, known as ear trumpets or ear horns [1], [2], were passive funnel-like amplification cones designed to gather sound energy and direct it into the ear canal. Since then, a considerable number of methods have been proposed [3]-[10].
A mobile-device-based personalized equalizer for improving elderly persons' ability to hear human voices has been proposed. Experiments showed that the equalizer improves the voice recognition success ratio by 2 to 55%. According to frequency component analysis and formant detection, the first to third formant frequencies of most voice sounds lie within the range of 3445 Hz. Therefore, a nonlinear equalizing multiplier is better suited to enhancing the frequency components of the first to third formants. The experimental results with the above voice inputs show that frequency components from 0 to more than 8000 Hz are required for a good Percent Correct Recognition (PCR). A cut-off frequency of 8162 Hz is also better both for noise suppression and for keeping a good PCR [11].
As described above, hearing capability deteriorates with age. This is called "senile deafness". In Japan, around 18% of people aged 65 to 74 have trouble hearing, while 40% of people older than 74 do. There are also some young people who have trouble hearing specific frequency components. Although they need a hearing aid, most of them do not like to wear a conventional hearing aid for several reasons: it does not look good, hearing capability (frequency response) varies over time, and hearing capability differs from person to person. There are other reasons as well.
For these reasons, customization of the hearing aid is required, as is equalization of specific spectral components. Furthermore, for listening to human voices it would be better to equalize specific frequency components phoneme by phoneme. Therefore, a method for improving the ability to hear human voices by equalizing the frequency response phoneme by phoneme is proposed. This is the distinguishing feature of the proposed hearing aid method.
The following section describes the proposed equalization method, followed by some experiments. Conclusions are then presented together with some discussion and future research work.

A. Frequency Response Model
Fig. 1 shows a model of the cochlea of the human ear. From the vestibular window to the end of the cochlea, the responsive frequency varies from high to low, covering components from zero to around 20 kHz. The high-frequency response usually deteriorates with age. On the other hand, in deafness among the younger generation, the response at certain specific frequencies is degraded.

B. Japanese Phoneme Model
The frequency range of each Japanese phoneme is shown in Table 1. Japanese has just 23 phonemes. The number of phonemes differs by language; Japanese has the smallest number, followed by German (25 phonemes).
It is therefore reasonable to equalize phoneme by phoneme, because the frequency components of each phoneme differ from one another. This is the fundamental idea of the proposed hearing aid. It is realized as application software installed on a mobile device such as a smartphone or iPhone. It can therefore be customized by the user, and the equalization characteristics can be changed even if the user's frequency response changes over time. It can also work in real time because the equalization filters are created prior to use.
Auditory Steady-State Response (ASSR) [12] allows the frequency response of the human ear to be measured objectively (Galambos et al. (1981) [13], Rickards et al. (1994) [14], Kuwada et al. (1986) [15]). Frequency response can be measured using ASSR during sleep. It is proposed to measure the responses by presenting the 23 different phonemes to the human ear using ASSR. An appropriate equalizer is then designed for each phoneme and installed on the smartphone or iPhone prior to use.
Table 1 (excerpt as recovered). Frequency range (Hz) by phoneme: a, 0-1500; i, 0-1000 and 4000-5000.

C. Procedure of the Proposed Design of Equalization
Before the proposed equalizer is used, the equalization must be customized. The most appropriate equalization filter response is designed as follows:
1) The frequency response characteristic for each phoneme is measured with ASSR.
2) An equalization filter is designed for each phoneme.
Phonemes are extracted from the acquired voice signals based on the Hidden Markov Model (HMM), as shown in Fig. 2. Speech recognition is performed with the "Julius" software, which is developed by the Julius development team composed of Kyoto University, Nagoya Institute of Technology, and others. First, the input voice signal is divided into frames (25 ms in this case) with a pre-assigned short-term shift (10 ms in this case), as shown in Fig. 3. After that, phonemes are extracted from the frame signals together with the quality assessment result "n_score", as shown in Fig. 4. Each frame is assigned a frame ID, and each assessed group of frames is assigned a "unit number". These units are the phoneme candidates, and the most reliable phoneme is selected from them. In the case of Fig. 4, unit #2 is selected based on the assessed "n_score".
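The framing and candidate selection steps can be illustrated with a minimal sketch. It does not call Julius; the candidate records, their phoneme labels and scores, and the rule that a higher "n_score" is more reliable are all assumptions made for illustration.

```python
def frame_bounds(n_samples, sample_rate, frame_ms=25, shift_ms=10):
    """Return (start, end) sample indices of every full frame."""
    frame_len = sample_rate * frame_ms // 1000  # 1102 samples at 44,100 Hz
    shift = sample_rate * shift_ms // 1000      # 441 samples at 44,100 Hz
    last_start = n_samples - frame_len
    return [(s, s + frame_len) for s in range(0, last_start + 1, shift)]

def most_reliable(candidates):
    """Pick the candidate unit with the highest score.

    Assumption: a higher n_score means a more reliable phoneme candidate.
    """
    return max(candidates, key=lambda c: c["n_score"])

# One second of audio at the sampling frequency used in the experiment.
frames = frame_bounds(n_samples=44100, sample_rate=44100)

# Hypothetical candidate units; the values are invented for illustration.
candidates = [
    {"unit": 1, "phoneme": "k", "n_score": -28.1},
    {"unit": 2, "phoneme": "a", "n_score": -21.4},
    {"unit": 3, "phoneme": "o", "n_score": -30.2},
]
best = most_reliable(candidates)
print(len(frames), frames[0], best["unit"])
```

Because the shift (10 ms) is shorter than the frame (25 ms), consecutive frames overlap, so each sample is covered by two or three frames.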
The input voice signals are equalized using the equalizing filters designed beforehand for each phoneme. The equalizing filter is designed as a bandpass filter, as shown in Fig. 5. Such a filter can be synthesized by combining low-pass, bandpass and high-pass filters: the low-pass filter suppresses existing noise, the bandpass filter enhances the required frequency response, and the high-pass filter suppresses low-frequency noise.
Another method for creating the equalizing filter is to combine low-pass and high-pass filters, as shown in Fig. 6. By combining the low-pass and high-pass filters, an equalizing filter with an arbitrary frequency response can be designed.
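One way to picture how low-pass and high-pass responses combine into an equalizing response is the sketch below. It uses the analytic Butterworth magnitude response; the filter order, the boost factor, the way bands are merged, and all function names are assumptions rather than the design used in the paper. The example bands are the ranges recovered for the phoneme "i" (0-1000 Hz and 4000-5000 Hz).

```python
import math

def lowpass_gain(f, fc, order=4):
    """Butterworth low-pass magnitude at frequency f (Hz), cut-off fc (Hz)."""
    return 1.0 / math.sqrt(1.0 + (f / fc) ** (2 * order))

def highpass_gain(f, fc, order=4):
    """Butterworth high-pass magnitude; fc <= 0 means 'pass everything'."""
    if fc <= 0:
        return 1.0
    if f <= 0:
        return 0.0
    return 1.0 / math.sqrt(1.0 + (fc / f) ** (2 * order))

def band_gain(f, f_low, f_high, order=4):
    """Cascading a high-pass at f_low and a low-pass at f_high gives a bandpass."""
    return highpass_gain(f, f_low, order) * lowpass_gain(f, f_high, order)

def equalizer_gain(f, bands, boost=2.0):
    """Gain near `boost` inside any band and near 1 elsewhere (assumed shape)."""
    shape = max(band_gain(f, lo, hi) for lo, hi in bands)
    return 1.0 + (boost - 1.0) * shape

bands_i = [(0, 1000), (4000, 5000)]  # ranges for "i" in Hz
for f in (500, 2500, 4500):
    print(f, round(equalizer_gain(f, bands_i), 3))
```

Frequencies inside a band are boosted toward the chosen factor, while frequencies between the bands stay close to unity gain. A narrow band such as 4000-5000 Hz does not reach the full boost at this order because the two skirts overlap.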
The filter responses in Fig. 7 are the candidates for the low-pass filter. From these candidates, a filter with a calm (smooth) frequency response is selected. The detailed flow chart of the proposed procedure is shown in Fig. 8.
After the voice is input to the PC through a microphone, phonemes are extracted from the input voice signal, and the signal is divided into 25 ms frames. The equalization filter for each phoneme is then retrieved from the phoneme database, and the equalized frames are integrated until the end of the divided frames is reached. After that, the equalized voice signal is output from the PC through a speaker.
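The per-frame flow described above (look up the filter for each frame's phoneme, filter the frame, then integrate the frames) might be sketched as follows. The FIR filter representation, the Hann-windowed overlap-add used to join frames, and the contents of `filter_db` are assumptions for illustration; the paper does not specify how frames are recombined.

```python
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def fir(x, taps):
    """Causal FIR filter; output has the same length as x."""
    return [sum(t * x[i - k] for k, t in enumerate(taps) if i - k >= 0)
            for i in range(len(x))]

def equalize(signal, labels, filter_db, frame_len, shift):
    """Filter frame j (starting at j * shift) with the filter of labels[j]."""
    window = hann(frame_len)
    out = [0.0] * len(signal)
    norm = [0.0] * len(signal)
    for j, phoneme in enumerate(labels):
        start = j * shift
        if start + frame_len > len(signal):
            break  # ignore labels that would run past the end of the signal
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        filtered = fir(frame, filter_db[phoneme])
        for i, v in enumerate(filtered):
            out[start + i] += v          # overlap-add the filtered frame
            norm[start + i] += window[i] # track the summed window weight
    return [o / n if n > 1e-9 else 0.0 for o, n in zip(out, norm)]

# Hypothetical phoneme database: phoneme -> FIR taps (invented values).
filter_db = {"a": [1.0], "i": [0.5, 0.5]}
signal = [math.sin(2 * math.pi * i / 16) for i in range(100)]
labels = ["a", "i"] * 4 + ["a"]  # one label per frame
equalized = equalize(signal, labels, filter_db, frame_len=20, shift=10)
print(len(equalized))
```

Dividing by the summed window weight means that frames passed through an identity filter reproduce the input. Abrupt changes of filter between neighbouring frames can still leave audible seams, which is in line with the sound defects reported in the experiments.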

A. Experimental Environment
The experimental environment is shown in Table 2. The entire program used for the experiment is written in MATLAB.

B. Preliminary Experiment
The basic idea behind the proposed equalizing filter is illustrated in Fig. 9(a). Examples of the designed low-pass, high-pass and bandpass filters are shown in Fig. 9(b). Specific frequency ranges can be enhanced as shown in Fig. 9(c).

C. Experimental Results
An example of the actual spectrum of a phoneme, in this case "a", is shown in Fig. 10. The peaks are called formants (the first to the n-th formant) and represent the features of the input voice.
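As a rough illustration of how formant peaks might be located on a smoothed spectrum envelope, the sketch below applies a moving average to a magnitude spectrum and picks its local maxima. The moving-average envelope is a simple stand-in and not necessarily the envelope estimator used in the paper, and the synthetic spectrum (three bumps at 700, 1200 and 2600 Hz) is invented for the example.

```python
import math

def smooth(values, width=3):
    """Moving-average envelope; the window is truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def formant_peaks(freqs, magnitudes, width=3, max_peaks=3):
    """Return the frequencies of the first local maxima of the envelope."""
    env = smooth(magnitudes, width)
    peaks = [freqs[i] for i in range(1, len(env) - 1)
             if env[i - 1] < env[i] >= env[i + 1]]
    return peaks[:max_peaks]

def bump(f, center, height, spread=150.0):
    """Gaussian-shaped spectral bump used to build the synthetic spectrum."""
    return height * math.exp(-0.5 * ((f - center) / spread) ** 2)

# Synthetic magnitude spectrum on a 100 Hz grid (hypothetical values).
freqs = list(range(0, 4001, 100))
mags = [bump(f, 700, 1.0) + bump(f, 1200, 0.8) + bump(f, 2600, 0.5)
        for f in freqs]
formants = formant_peaks(freqs, mags)
print(formants)
```

A wider smoothing window suppresses more spectral ripple but can shift or merge neighbouring peaks, so the window width would need tuning on real spectra.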
The frequency ranges that must be enhanced are determined from the formants. The formants are estimated from the envelope of the frequency spectrum of each phoneme, and an appropriate filter response can then be designed accordingly. Four patients participated in a validation test of the proposed system. The user pronounced "Kyo-wa Ii-Tenki-da" in Japanese ("It is a fine day today" in English). A monaural voice signal was recorded at a sampling frequency of 44,100 Hz with 16-bit quantization. A degraded input voice signal was also created by applying a Butterworth low-pass filter with a cut-off frequency of 5 kHz; this is called input voice signal #1 hereafter. Input voice signal #2 was created by conventional frequency equalization with a high-pass filter (cut-off frequency of 5 kHz), and input voice signal #3 was created with the proposed frequency equalization method. The four patients listened to these three input voice signals and evaluated the voice quality on a five-grade scale. Table 3 shows the evaluation results.
The evaluation of the three input voice signals shows that the proposed method outperforms both the degraded voice signal and the voice signal restored with the conventional high-pass filter by about 10 points. Some of the consonants, however, are not clear enough. Input voice #3 also does not sound entirely natural, because the reconstruction introduces sound defects when the frame signal pieces of different phonemes are combined in the proposed frequency equalization method. Compared with the conventional method, the voice signal reconstructed by the proposed method is less noisy, which is one of the features of the proposed method. Overall, the experiments show that the proposed phoneme-based hearing aid is superior to the conventional hearing aid by 9.4%.
It is found that the proposed method outperforms both the degraded voice signal and the voice signal restored with the conventional high-pass filter by about 10 points. Some of the consonants, however, are not clear enough. Input voice #3 also does not sound entirely natural, because the reconstruction introduces sound defects when the frame signal pieces of different phonemes are combined in the proposed frequency equalization method. Compared with the conventional method, the voice signal reconstructed by the proposed method is less noisy, which is one of the features of the proposed method.
Further investigation is required on the simultaneous estimation of the cornea curvature center and cornea radius, and on noise removal from the depth image.

Fig. 7. Candidates of the low-pass filter of frequency responses for equalizing filter.

Fig. 8. Detailed flow chart of the proposed procedure.
Fig. 9. Basic idea of the proposed equalizing filter and example of the frequency responses of the designed low-pass, high-pass and bandpass filters.

Fig. 11(a) shows the frequency responses with frequency enhancement, while Fig. 11(b) shows the frequency responses without enhancement. Unit #2 in Fig. 4 must be enhanced, while #1 and #3 need not be. The left image shows the frequency responses of #1 and #3, while the right image shows the frequency response of #2, which must be equalized. The voice signals processed by the proposed frequency response equalization are shown in Fig. 12. The left image is the original input voice signal, while the right image is the output voice signal reconstructed after frequency equalization. These correspond to the voice signals shown in the left and right images of Fig. 11(a), respectively.

Fig. 12. Example of the processed voice signals by the proposed frequency response equalization.

TABLE I. FREQUENCY RANGE OF EACH JAPANESE PHONEME

TABLE III. EVALUATED RESULTS FOR THREE INPUT VOICE SIGNALS
Human voice hearing capability is improved by equalizing the frequency response phoneme by phoneme.