**4. Results**

Tables 1 and 2 show the measured location of the first formant for healthy and esophageal voices, respectively. Tables 3 and 4 show the measurement errors in absolute and relative terms; the relative value is obtained by comparing the error with the average formant value. As those tables show, conventional methods give very poor results, with an average deviation of about 70 Hz, approximately the pitch value in esophageal speech. Such deviations may be unacceptable for applications that require great accuracy, so a new measurement method is necessary.

A simple wavelet algorithm with approximation to the formant band improves these results considerably, reducing the deviation by about 30%. This is a clear improvement over LPC, but higher resolution can be obtained without substantially increasing the computational cost. The results of the adjustable resolution algorithm show that the average deviation can be reduced by up to 50%.

The obtained values show that it is feasible to locate formant positions with small errors using efficient algorithms. This constitutes a fundamental advance in esophageal speech regeneration, because formant location plays a major role in many speech processing algorithms. For example, an algorithm such as the one previously presented by the research group in [2] would give much better results with more accurate formant location estimates.

It is important to highlight the relevance that these results may have in other speech technology fields, such as speech recognition. The applications of this analysis are therefore not restricted to esophageal speech processing; it can be applied for many other purposes.


| Speech Signal | Original Values (Hz) | F. with LPC (Hz) | F. with B.A. (Hz) | F. with R.A. (Hz) |
|---|---|---|---|---|
| **He. 1** | 851 | 842 | 883 | 848 |
| **He. 2** | 776 | 633 | 711 | 756 |
| **He. 3** | 938 | 893 | 969 | 950 |
| **He. 4** | 960 | 929 | 926 | 966 |

Table 1. 1st Formant location for **healthy voices** calculated with different methods: LPC, Band Approximation (B.A.) and Resolution Adjustment (R.A.).

| Speech Signal | Original Values (Hz) | F. with LPC (Hz) | F. with B.A. (Hz) | F. with R.A. (Hz) |
|---|---|---|---|---|
| **Es. 1** | 894 | 698 | 883 | 890 |
| **Es. 2** | 830 | 762 | 754 | 778 |
| **Es. 3** | 808 | 774 | 754 | 805 |
| **Es. 4** | 776 | 744 | 754 | 756 |

Table 2. 1st Formant location for **esophageal voices** calculated with different methods: LPC, Band Approximation (B.A.) and Resolution Adjustment (R.A.).


Table 3. Deviations obtained in formant location values with different methods.


Table 4. Percentage deviations obtained in formant location values with different methods.
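The absolute and relative deviations described for Tables 3 and 4 can be illustrated with a short calculation on the values listed for Tables 1 and 2. This is only a sketch under two assumptions that the text does not state explicitly: that the four values per speaker are ordered as original value, LPC, B.A. and R.A., and that the "average formant value" is the mean of the original values within each group. The names `HEALTHY`, `ESOPHAGEAL` and `deviations` are hypothetical.

```python
from statistics import mean

# First-formant values (Hz) transcribed from Tables 1 and 2.
# Assumption: each tuple is ordered (original, LPC, B.A., R.A.).
HEALTHY = {
    "He. 1": (851, 842, 883, 848),
    "He. 2": (776, 633, 711, 756),
    "He. 3": (938, 893, 969, 950),
    "He. 4": (960, 929, 926, 966),
}
ESOPHAGEAL = {
    "Es. 1": (894, 698, 883, 890),
    "Es. 2": (830, 762, 754, 778),
    "Es. 3": (808, 774, 754, 805),
    "Es. 4": (776, 744, 754, 756),
}
METHODS = ("LPC", "B.A.", "R.A.")


def deviations(table):
    """Return {method: (mean absolute deviation in Hz, relative deviation in %)}."""
    rows = list(table.values())
    # Assumption: the relative error is taken against the mean original value.
    reference_mean = mean(row[0] for row in rows)
    result = {}
    for col, method in enumerate(METHODS, start=1):
        abs_dev = mean(abs(row[col] - row[0]) for row in rows)
        result[method] = (abs_dev, 100.0 * abs_dev / reference_mean)
    return result


if __name__ == "__main__":
    for label, table in (("healthy", HEALTHY), ("esophageal", ESOPHAGEAL)):
        for method, (abs_dev, rel_dev) in deviations(table).items():
            print(f"{label:10s} {method:5s} {abs_dev:6.2f} Hz  {rel_dev:5.2f} %")
```

Under this column assignment, the LPC deviation averaged over both groups comes out at roughly 70 Hz, which is consistent with the figure quoted in the text.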

After applying the wavelet transform to the oesophageal speech signal, we can measure the final values of the acoustic parameters. Below is an example describing the basic operation of the software that the authors have developed, named "PASVoice software pack".

Oesophageal Speech's Formants Measurement Using Wavelet Transform 93


It analyses a speech signal in order to obtain objective parameters and a graphic representation of their values, helping doctors to understand the patient's stage.

When applied to a healthy voice, the results can be as follows:


Fig. 7. 'sana.wav' Results

If 'Show Pitch' is selected, the marks that have been located as references for measuring pitch are displayed:


Fig. 8. Details of pitch marks
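The text does not describe how the software derives its parameters from these pitch marks. As a minimal sketch, assuming the marks are available as a list of cycle instants in seconds, the mean pitch and the commonly used local jitter measure (mean absolute difference between consecutive periods, relative to the mean period) could be computed as follows. The function name `pitch_and_jitter` and the input format are hypothetical and are not taken from PASVoice.

```python
from statistics import mean


def pitch_and_jitter(mark_times):
    """Estimate mean pitch (Hz) and local jitter (%) from pitch-mark times (s)."""
    if len(mark_times) < 3:
        raise ValueError("at least three pitch marks are needed")
    # Each period is the time between two consecutive pitch marks.
    periods = [t1 - t0 for t0, t1 in zip(mark_times, mark_times[1:])]
    mean_period = mean(periods)
    # Local jitter: average cycle-to-cycle period change over the mean period.
    cycle_changes = [abs(p1 - p0) for p0, p1 in zip(periods, periods[1:])]
    jitter_percent = 100.0 * mean(cycle_changes) / mean_period
    return 1.0 / mean_period, jitter_percent


if __name__ == "__main__":
    marks = [0.000, 0.010, 0.021, 0.031]  # illustrative values, not measured data
    pitch_hz, jitter = pitch_and_jitter(marks)
    print(f"pitch = {pitch_hz:.1f} Hz, jitter = {jitter:.2f} %")
```

A perfectly regular sequence of marks gives a jitter close to zero, while irregular cycle lengths, as can be expected in oesophageal voices, raise it.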

When the 'Vocaligram' button is clicked, the corresponding vocaligram is displayed (in this example the results are below the threshold for each parameter, as expected):

Fig. 9. 'sana.wav' Vocaligram

92 Advances in Wavelet Theory and Their Applications in Engineering, Physics and Technology


If any of the spectrogram 'Show' buttons are selected, the corresponding spectrograms are automatically calculated and visualized:

Fig. 10. Details of the application with both spectrograms calculated

Finally, to save the numerical results, the File/Save Results option can be chosen, after which the following dialogue appears. As can be seen, it contains data corresponding to other patients:



Fig. 11. Dialogue box used to save results

If this is not the person's first session, his/her name can be searched for by typing its first letters in the box at the top. Any matches will then be displayed in the main box below.


Fig. 12. Search for people whose names begin with 'w'

In our case we are going to create a new profile. As the name "Example" does not exist, typing it out completely and clicking on "Save Results…" creates the new name and saves the data. No previous results appear, as this is the patient's first session:


Fig. 13. Message displayed when saving results

Fig. 14. Message displayed at initial session

We could also have saved results for a patient who is not coming for the first time. To do so, we select an existing patient, "Man 1", from the list and click on "Save Results…". A graph showing all the results saved to date from previous sessions is then displayed (pitch is plotted separately from jitter and shimmer because they are expressed in different units):

Fig. 15. Evolution of results by session

Finally, when this window is closed (top-right x), a message informs us of the evolution since the previous session:

Fig. 16. Message providing information on evolution since last session

Recent scientific progress has made it possible to take great steps forward in fields of major interest such as biomedical engineering. In this area, the application of new technologies is essential in order to improve techniques for the diagnosis, treatment and rehabilitation of certain medical pathologies. However, there are also groups suffering from illnesses or treatments that affect only a minority of people. This usually means that the technological development of the resources used for these pathologies lags well behind that for more common disorders.

The laryngectomized are people who, for various reasons, have had to undergo surgery to remove their larynx, vocal cords, epiglottis and the cartilages surrounding the larynx. These elements are of vital importance for the generation of speech, as they form part of the phonatory apparatus. Their removal therefore seriously affects the quality of speech.

Treating a barely intelligible voice is also of great value from the point of view of the patient's psychology. We have noticed that a high proportion of the laryngectomized feel embarrassed when using this voice, particularly women, who would rather not speak than do so with an oesophageal voice, as they consider it unfeminine.

The results obtained from this research work have been useful mainly because of the IT contribution: the design, development and implementation of a software application specifically intended for the assessment of laryngectomized voices, with a view to proper medical monitoring that makes it possible to measure evolution and prevent relapses. To verify improvement in the quality of oesophageal voices, we worked with a database containing several phonemes from all kinds of voices; these voices, both pathological and healthy, were recorded with the help of members of the Asociación Vizcaína de Laringectomizados.

Future work deriving from this research includes, most importantly, incorporating functionality for voice recognition and phrase synthesis, as well as implementing the digital signal processing algorithms developed here on systems based on cell phones and PDAs, all with the goal of improving the quality of life of the laryngectomized.

**5. Conclusions**

Given the great relevance of the wavelet transform for the analysis and processing of esophageal speech, and assuming that the final goal is implementation in a hardware DSP-based device with very strict real-time requirements, a significant optimization of computing resources has been achieved and, consequently, a reduction in code length that minimizes the computational load. It is also important to highlight that the wavelet computations obtained can be reused in later processing.

These advantages are achieved through a preprocessing algorithm which, although wavelet-based, includes some improvements: first, an approximation to the formant subband; and second, an adjustable resolution applied over the bands among which the formant energy is shared.

On the other hand, the algorithm proposed here makes it possible to improve on previous research work concerning the treatment of the poles of the system that models esophageal speech according to LPC. Given the accuracy obtained, it is reasonable to expect an improvement in results if this technique is used as the first stage of the whole algorithm.
