86 Advances in Wavelet Theory and Their Applications in Engineering, Physics and Technology

**3.1.2 Step 2: Adjustable resolution analysis**

In the step explained above, an approximation to the formant frequencies was obtained. As will be explained in the next section, the resolution obtained with this approximation, though better than the one obtained with conventional methods, may not be enough for some environments.

In order to achieve a finer resolution, an adjustable resolution analysis was designed. The scheme of this analysis is shown in Figure 1. The core idea of the technique is to obtain a higher resolution in the previously detected bands by dividing the selected nodes and their adjacent ones.

The main reason for using narrower bands is that the energy in wavelet packets spreads among various adjacent nodes; the solution to this problem is to divide the spectrum into such narrow bands that the energy of the formant is located in only one node.

As can be seen in Figure 1, the first step of the algorithm consists of splitting the approximated formant bands and their adjacent ones as many times as necessary. Secondly, the energy of each node is calculated again and the maximum value is located; this value indicates the formant location.

Fig. 1. Adjustable Resolution Analysis Schema

**3.2 User interface**
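The adjustable resolution analysis of Section 3.1.2 (split the approximated band and its adjacent bands, recompute the energies, keep the maximum) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the sampling rate and band limits are illustrative, and band energies are computed from a naive DFT as a stand-in for wavelet packet node energies.

```python
import cmath
import math


def dft_power(signal):
    """Power |X[k]|^2 for bins k = 0..N/2, using a naive O(N^2) DFT."""
    n = len(signal)
    power = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        power.append(abs(acc) ** 2)
    return power


def band_energy(power, fs, n, lo, hi):
    """Energy of the DFT bins whose frequency lies in [lo, hi).

    Assumption: this replaces the wavelet packet node energy of the text.
    """
    return sum(p for k, p in enumerate(power) if lo <= k * fs / n < hi)


def refine_formant_band(signal, fs, coarse_band, splits):
    """Split the coarse band and its neighbours `splits` times and
    return the sub-band (lo, hi) holding the maximum energy."""
    lo, hi = coarse_band
    width = hi - lo
    # The selected band plus its two adjacent bands, clipped to [0, fs/2].
    bands = [(max(0.0, lo - width), lo), (lo, hi),
             (hi, min(fs / 2, hi + width))]
    bands = [b for b in bands if b[1] > b[0]]
    for _ in range(splits):
        # Halve every band, doubling the resolution at each pass.
        bands = [half for a, b in bands
                 for half in ((a, (a + b) / 2), ((a + b) / 2, b))]
    power = dft_power(signal)
    energies = [band_energy(power, fs, len(signal), a, b) for a, b in bands]
    best = max(range(len(bands)), key=energies.__getitem__)
    return bands[best]


# Illustrative input: a 700 Hz tone sampled at 8 kHz (400 samples).
fs = 8000
signal = [math.sin(2 * math.pi * 700 * i / fs) for i in range(400)]
refined = refine_formant_band(signal, fs, (500.0, 1000.0), splits=3)
print(refined)  # (687.5, 750.0)
```

Including the adjacent bands means the maximum is still found when the first approximation falls one band away from the true location: starting from `(1000.0, 1500.0)` with the same signal returns the same refined band.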

Building on the algorithms described above, the authors have developed a tool called "PAS Voice". When the application is launched, the welcome screen is displayed:

Fig. 2. Welcome Screen

Once the application has been started up, the main screen will be displayed:


Fig. 3. Main Screen

Oesophageal Speech's Formants Measurement Using Wavelet Transform 89

The following areas can be observed on this screen:

1. Menu: The program's general option menus can be found in this area:

File.- Menu with the "Open file" option, which opens a voice signal for processing. The signal must be in .WAV, .AU or .AIFF format. Voice processing begins automatically once the file to be analyzed has been chosen.

Save Results.- This saves the signal processing results; results from several sessions can be added for the same person, or a new profile can be created. Once the results have been saved correctly, a graph is displayed showing the evolution of the parameters across all analysis sessions. When this graph is closed, an informative message on the development since the previous session is displayed.

Tools.- Tool menu for application configuration.

Language.- This selects the program language (initially English and Spanish, although personalized translations can be applied). If the language is changed, the application has to be restarted.

Octave Path.- This option specifies the location of the octave.exe file, which is essential for running the program.

Help.- Clicking on this provides help for running the program.

2. Measurement area: Once a voice signal has been processed (through the File/Open file option), the numerical measurements of Pitch, Jitter and Shimmer are displayed in this area. To view the measurements in graphic form against the normality threshold, click the "Vocaligram" button; it is only enabled once the processing needed to create the vocaligram has been performed. The vocaligram is a graphic representation with one measurement on each axis (in blue), superimposed over the threshold values for each parameter. The measurements are scaled so that abnormal values are always greater than the threshold (a value above the threshold implies that it is abnormal).

Fig. 4. Sample Vocaligram

3. Graphic Representation Area: In this area, the voice signal graph and the evolution of pitch over time are displayed once the voice has been analyzed. Underneath is a progress bar indicating the approximate percentage of analysis completed.

4. Graphic Representation Options Area: Once the voice and pitch evolution have been plotted, this area is enabled so that other data can be checked in greater detail:



Amplitude/Time/Pitch Detail.- When a particular point in the graphs above is clicked, these frames are filled with information corresponding to the selected point. The Amplitude/Time values (clicking on the upper graph) or the Pitch Detail values (clicking on the lower one) are displayed according to the graph selected.

Show Pitch.- Once the pitch of the voice signal has been calculated, pressing this button shows marks on the signal at the relevant points indicating periodicity.

Play.- Pressing this button plays the voice signal through the computer loudspeakers (if available).

Fig. 5. Example of oscillogram with the "Show Pitch" option activated

5. Spectrogram Options Area: By default, the spectrograms are not calculated during the analysis process. To calculate them, use the following controls:

Frame Size/Overlap.- These are the parameters that define the spectrogram. The default values are typical ones for the representation of broad-band and narrow-band spectrograms respectively. Beware! Changing these parameters is not recommended … A poor configuration may considerably increase the spectrogram calculation time.

Show.- This displays the spectrogram in its area according to the indicated parameters.

Fig. 6. Example of Broad-Band Spectrogram

6. Spectrogram Area: This shows the spectrograms when the "Show" button is pressed.
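To see why the Frame Size/Overlap settings affect both the appearance of the spectrogram and its calculation time, the following sketch computes the number of frames and the frequency bin spacing for a given configuration. It is only an illustration: the function name, the sampling rate and the frame values are assumptions chosen for the example and are not the defaults used by PAS Voice.

```python
def spectrogram_layout(n_samples, fs, frame_size, overlap):
    """Return (hop, n_frames, bin_spacing_hz) for a framed analysis.

    frame_size and overlap are given in samples; the overlap must be
    smaller than the frame size so that the analysis advances.
    """
    if not 0 <= overlap < frame_size:
        raise ValueError("overlap must satisfy 0 <= overlap < frame_size")
    hop = frame_size - overlap              # samples advanced per frame
    if n_samples < frame_size:
        n_frames = 0
    else:
        n_frames = 1 + (n_samples - frame_size) // hop
    return hop, n_frames, fs / frame_size   # bin spacing of the frame DFT


fs = 16000            # assumed sampling rate (Hz)
n_samples = 2 * fs    # a 2 s recording

# Short frames (4 ms): coarse frequency bins, fine time detail (broad-band).
broad = spectrogram_layout(n_samples, fs, frame_size=64, overlap=32)
# Long frames (32 ms): fine frequency bins, coarse time detail (narrow-band).
narrow = spectrogram_layout(n_samples, fs, frame_size=512, overlap=256)
# Near-total overlap: same bins as `narrow`, but far more frames to compute.
dense = spectrogram_layout(n_samples, fs, frame_size=512, overlap=511)

print(broad)   # (32, 999, 250.0)
print(narrow)  # (256, 124, 31.25)
print(dense)   # (1, 31489, 31.25)
```

In this example the near-total overlap produces roughly 250 times as many frames as the half-overlap setting for the same frame size, which is the kind of configuration that can considerably increase the calculation time.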


| **Speech Signal** | **LPC F1** | **LPC F2** | **LPC F3** | **Band approx. only F1** | **Band approx. only F2** | **Band approx. only F3** | **Resolution adj. F1** | **Resolution adj. F2** | **Resolution adj. F3** |
|---|---|---|---|---|---|---|---|---|---|
| **Healthy 1** | 10 | 31 | 185 | 31 | 9 | 159 | 4 | 4 | 17 |
| **Healthy 2** | 143 | 12 | 78 | 65 | 31 | 41 | 20 | 9 | 10 |
| **Healthy 3** | 45 | 68 | 157 | 31 | 106 | 30 | 12 | 34 | 16 |
| **Healthy 4** | 31 | 15 | 10 | 34 | 12 | 35 | 6 | 12 | 5 |
| **Esophageal 1** | 196 | 77 | 72 | 11 | 33 | 35 | 4 | 26 | 75 |
| **Esophageal 2** | 50 | 193 | 101 | 76 | 77 | 2 | 42 | 48 | 11 |
| **Esophageal 3** | 34 | 88 | 51 | 54 | 33 | 2 | 3 | 15 | 10 |
| **Esophageal 4** | 32 | 21 | 60 | 22 | 0 | 13 | 20 | 9 | 21 |
| **Average Deviation** | 65 | 61 | 89 | 41 | 36 | 40 | 18 | 23 | 21 |

Table 3. Deviations (Hz) obtained in formant location values with different methods: LPC, band approximation only, and resolution adjustment.

| **Speech Signal** | **LPC F1** | **LPC F2** | **LPC F3** | **Band approx. only F1** | **Band approx. only F2** | **Band approx. only F3** | **Resolution adj. F1** | **Resolution adj. F2** | **Resolution adj. F3** |
|---|---|---|---|---|---|---|---|---|---|
| **Healthy 1** | 1.135 | 2.292 | 7.000 | 3.518 | 0.665 | 6.016 | 0.454 | 0.296 | 0.643 |
| **Healthy 2** | 16.227 | 0.887 | 2.951 | 7.376 | 2.292 | 1.551 | 2.270 | 0.665 | 0.378 |
| **Healthy 3** | 5.106 | 5.028 | 5.941 | 3.518 | 7.837 | 1.135 | 1.362 | 2.514 | 0.605 |
| **Healthy 4** | 3.518 | 1.109 | 0.378 | 3.858 | 0.887 | 1.324 | 0.681 | 0.887 | 0.189 |
| **Esophageal 1** | 23.700 | 5.487 | 2.580 | 1.330 | 2.352 | 1.254 | 0.484 | 1.853 | 2.687 |
| **Esophageal 2** | 8.222 | 13.754 | 3.619 | 9.190 | 5.487 | 0.072 | 6.288 | 3.421 | 0.394 |
| **Esophageal 3** | 4.111 | 6.271 | 1.827 | 6.530 | 2.352 | 0.072 | 0.363 | 1.069 | 0.358 |
| **Esophageal 4** | 3.869 | 1.497 | 2.150 | 2.660 | 0.000 | 0.466 | 2.418 | 0.641 | 0.752 |
| **Average Deviation** | 7.964 | 4.359 | 3.306 | 4.747 | 2.632 | 1.486 | 1.790 | 1.781 | 0.751 |

Table 4. Percentage deviations (%) obtained in formant location values with different methods: LPC, band approximation only, and resolution adjustment.

After applying the wavelet transform to the oesophageal speech signal, we can measure the final value of the acoustic parameters. Below is an example describing the basic operation of the software that the authors have developed, named "PASVoice software pack". It