**4.2. Onset detection and temporal segmentation: windowing**

As in [11], our system divides the vowel segment into temporal slots and then performs a frequency analysis of each slot. This temporal segmentation is based on onset detection, so the system can detect when a phoneme starts in the recording. This information makes it possible to discard frames whose total spectral energy is below a silence threshold, since such frames must not be processed by the system.
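The silence test described above can be sketched as follows. This is only an illustrative sketch, not the implementation used in the system: the function names, the fixed frame length, the hop size, and the threshold value are all assumptions. By Parseval's theorem, the total spectral energy of a frame equals its time-domain energy up to a normalization constant, so the sketch sums squared samples instead of computing a transform.

```python
def frame_energy(frame):
    """Time-domain energy of a frame; by Parseval's theorem this is
    proportional to the frame's total spectral energy."""
    return sum(x * x for x in frame)


def non_silent_frames(signal, frame_len, hop, threshold):
    """Return the start indices of frames whose energy reaches the threshold.

    frame_len, hop and threshold are hypothetical parameters chosen for
    illustration; the paper does not specify them here.
    """
    starts = range(0, len(signal) - frame_len + 1, hop)
    return [s for s in starts
            if frame_energy(signal[s:s + frame_len]) >= threshold]


# Usage: 100 silent samples followed by 100 samples of constant amplitude.
signal = [0.0] * 100 + [1.0] * 100
kept = non_silent_frames(signal, frame_len=50, hop=50, threshold=1.0)
```

In this toy input only the frames starting at samples 100 and 150 pass the test; the first frame kept can be read as a crude onset estimate.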

After that, a Hamming window [13, Eq. 56] is applied to the segmented signal so that the extreme samples of each segment have less weight than the central samples. In this paper, we use an *M*-point Hamming window, symmetric about the point *M*/2, of the form

$$w[n] = \begin{cases} 0.54 - 0.46 \cos(2 \pi n / M), & 0 \le n \le M, \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

because it is optimized to minimize the maximum (nearest) side lobe.
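As a minimal sketch of how Eq. (2) can be evaluated and applied to a segment, the following code computes the window samples directly from the formula. The function names are hypothetical, and the code assumes that the segment to be weighted has exactly *M* + 1 samples, matching the range 0 ≤ *n* ≤ *M* of Eq. (2).

```python
import math


def hamming_window(M):
    """Samples w[0], ..., w[M] of Eq. (2): 0.54 - 0.46*cos(2*pi*n/M)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / M)
            for n in range(M + 1)]


def apply_window(segment):
    """Weight a segment of M + 1 samples (M >= 1) by the window of Eq. (2)."""
    M = len(segment) - 1
    return [x * w for x, w in zip(segment, hamming_window(M))]


# Usage: weight a constant segment of 9 samples (M = 8).
w = hamming_window(8)
weighted = apply_window([1.0] * 9)
```

The end samples are scaled by 0.54 - 0.46 = 0.08 and the central sample (*n* = *M*/2) by 1, which is the tapering effect described in the text.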
