**1. Introduction**

80 Advances in Wavelet Theory and Their Applications in Engineering, Physics and Technology


> One of the main concerns for otolaryngologists and for patients who have undergone a laryngectomy is the complex rehabilitation process that follows. At present, no advanced technique is available for either learning or evaluating this process.

> Esophageal speech is characterized by its low intelligibility, which means that its objective measurement parameters (e.g. pitch, jitter, shimmer or HNR) take values outside normal ranges [1]. As a consequence, speech recognizers, speech-to-text converters and any other automatic response device that requires a speech signal cannot be used.

> This paper presents work that is part of a research project whose objective is to adapt speech-controlled systems so that they can be used by people with vocal disorders. Esophageal voice is the most severe of these disorders.

> Our research group has published several works [2], [3] aimed at improving esophageal speech quality by stabilizing the poles of the system that models the vocal tract with LPC. The wavelet transform is currently being used to enhance the harmonics-to-noise ratio (HNR). For this task, it is crucial to know the formant frequencies of vowels accurately [7].

> This paper presents the results of a new algorithm that is based on the wavelet transform but introduces a new technique to improve calculation accuracy. To evaluate this technique, its results are compared with those obtained with LPC, taking the results of FFT analysis as the reference [4].

> The general objective of the chapter is to enhance the quality of esophageal speech in communication with both humans and machines. This aim arises from the low intelligibility of people who speak with esophageal voice after a laryngectomy, an operation carried out as a treatment for larynx cancer [6].

Oesophageal Speech's Formants Measurement Using Wavelet Transform 83

parameters: position, scale (as in wavelet decomposition), and frequency. The most suitable decomposition of a given signal is then selected with respect to an entropy-based criterion.

At the present time, many otolaryngologists (ORLs) use the software tools they have available to corroborate the diagnosis of vocal cord pathologies by means of objective parameters. These parameters complete the information gathered by the specialist, which usually comprises the images obtained from a stroboscope and several perceptual tests carried out on the patient.

Special attention needs to be paid to vocal cord cancer, that is to say, to its diagnosis, treatment, rehabilitation and monitoring, as this cancer can cause the death of the patient. Once the cancer has been detected, the ORL specialist removes the patient's vocal cords. This means that the patient will no longer be able to produce what is called laryngeal voice and thus loses his/her speech.

After the operation, during rehabilitation, the patient begins the process of learning how to produce oesophageal voice: the voice produced by modulating air coming from the oesophagus. This enables the patient to communicate, albeit with great difficulty in maintaining fluent conversations, due to the poor quality of oesophageal voice. However, one of the major problems is that oesophageal voice cannot be evaluated during the rehabilitation process, as there is no application available on the market that can automatically obtain the previously mentioned acoustic parameters. The quality of oesophageal voice is so low that the algorithms that estimate the periodicity of the voice do not work properly, and thus the measurements obtained by such software packages are not reliable.

Obviously, the accuracy of the measurements made by the software package presented in this work will also be applicable to less severe pathologies, such as polyps, nodules, hypomobility of the vocal cords, etc. The deterioration of the voice in this type of pathology is also too high for the measurement of objective parameters to be precise. This means that commercial software packages are not suitable for measuring these parameters in voices suffering from some kind of pathology. Being able to obtain accurate objective parameters is advantageous for the early detection of cancer in cases where the patient's laryngeal voice is of a very poor quality and has high noise levels [1].

**2.2 Basis of speech analysis**

The pitch, or fundamental frequency of the speech, is one of the properties of sound or musical tone perceived through frequency. Due to the natural pseudo-periodicity of voiced speech, there are small variations in the peaks of the voice which change their frequency $f_i$, so that the pitch can be defined as:

$$\text{Pitch (Hz)} = \frac{1}{N}\sum_{i=1}^{N} f_i \tag{4}$$

$N$ being the number of pitch periods.
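The pitch definition in Eq. (4) is an arithmetic mean of per-period frequencies, and it can be illustrated with a minimal sketch. The function name `mean_pitch_hz`, the conversion $f_i = 1/T_i$ from period duration to frequency, and the sample durations are assumptions made for illustration only; they are not taken from the chapter or from measured data.

```python
def mean_pitch_hz(period_durations_s):
    """Mean pitch in Hz, computed as (1/N) * sum(f_i) over N pitch periods.

    Assumption: each per-period frequency f_i is taken as 1 / T_i, where
    T_i is the duration of the i-th pitch period in seconds.
    """
    if not period_durations_s:
        raise ValueError("at least one pitch period is required")
    if any(t <= 0 for t in period_durations_s):
        raise ValueError("period durations must be positive")
    frequencies_hz = [1.0 / t for t in period_durations_s]
    return sum(frequencies_hz) / len(frequencies_hz)


# Hypothetical period durations in seconds (illustrative, not measured data).
periods = [0.010, 0.008, 0.0125]
pitch = mean_pitch_hz(periods)
```

Averaging the frequencies rather than the period durations follows the form of Eq. (4) directly; the two averages differ slightly whenever the periods are not all equal, which is exactly the pseudo-periodic case described above.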
