*F*1. On the other hand, the second formant frequency (*F*2) corresponds to the length and size of the speaker's oral cavity: front vowels have a high *F*2, whereas back vowels have a low *F*2. The formant frequencies decrease through the cardinal vowels, which can be consulted in [18]. Nevertheless, these relationships are not straightforward, since other factors also influence sound production (e.g. lip rounding and tongue retroflexion, among others).

The articulatory properties of vowels are determined by the *F*1 and *F*2 formants, which are plotted against each other. Because of the inverse relationship between articulatory parameters and formant frequencies, zero frequency is at the top right corner. Fig. 2 [1] shows where English vowels are pronounced inside the oral cavity:

**Figure 2.** Vowel trapezium inserted in the oral cavity, indicating tongue movements for the pronunciation of the different vocalic phonemes [1].

**4. Methodology**

In this section, we present the algorithm implemented to find the frequency and amplitude of the first formants during any vowel-like segment. Analysing any speech fragment requires a time-frequency analysis. Short-time Fourier transforms (STFT), constant-Q [19] and wavelet transforms are among the most commonly employed solutions. In this paper, the main idea is based on a previous work [20]: tested on a large number of utterances produced by several different speakers, McCandless's approach was found to be extremely successful. This algorithm is combined with some other ideas already developed by the authors [11] in the context of polyphonic piano recordings.

We should remark that this manuscript complements the work initiated in [1], so the recordings accepted by our system consist of only one vowel each, unlike the system presented in [20]. The latter developed a completely automatic algorithm which was meant to yield

**4.1. Data acquisition and preprocessing**

This stage consists of recording a vowel file. The audio data are stored in a WAV file at a sample rate of 44.1 kHz. The system accepts both monaural and stereophonic files. The digitized signal is then low-pass filtered in order to eliminate high-frequency components.
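As an illustration only, the preprocessing stage described above could be sketched in Python as follows. This is not the implementation used in this work: the function names, the 16-bit PCM restriction, the averaging of stereo channels, the windowed-sinc FIR design, the 5 kHz cutoff and the 101 filter taps are all assumptions chosen for the example, since the text only states that the signal is low-pass filtered.

```python
import wave

import numpy as np


def read_wav(path):
    """Read a 16-bit PCM WAV file (assumed format for this sketch).

    Returns (sample_rate, samples), where samples is a float array of
    shape (frames, channels) scaled to the range [-1, 1).
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM files")
        fs = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    data = np.frombuffer(raw, dtype="<i2").astype(np.float64) / 32768.0
    return fs, data.reshape(-1, channels)


def to_mono(samples):
    """Collapse a (frames, channels) array to one channel by averaging.

    Averaging is an assumption; a monaural input is returned unchanged.
    """
    samples = np.asarray(samples, dtype=np.float64)
    if samples.ndim == 1:
        return samples
    return samples.mean(axis=1)


def lowpass_fir(signal, fs, cutoff_hz=5000.0, num_taps=101):
    """Low-pass filter with a Hamming-windowed sinc FIR filter.

    cutoff_hz and num_taps are hypothetical values for illustration.
    num_taps must be odd so that the filter introduces no net delay.
    """
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd")
    if not 0.0 < cutoff_hz < fs / 2.0:
        raise ValueError("cutoff_hz must lie between 0 and fs / 2")
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    fc = cutoff_hz / fs  # cutoff as a fraction of the sample rate
    taps = 2.0 * fc * np.sinc(2.0 * fc * n)  # ideal low-pass response
    taps *= np.hamming(num_taps)  # taper to reduce ripple
    taps /= taps.sum()  # unit gain at 0 Hz
    return np.convolve(signal, taps, mode="same")
```

For a recording stored in a hypothetical file `vowel.wav`, the stage would then read `fs, samples = read_wav("vowel.wav")` followed by `filtered = lowpass_fir(to_mono(samples), fs)`. The cutoff should be set above the highest formant of interest.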
