**2. Methods**

### **2.1 Wavelet transform**

One of the most important techniques in spectral analysis is the Short-Time Fourier Transform (STFT), which reveals the spectral components of a speech signal and thereby makes it possible to distinguish pathological voices and process them.

The STFT has a resolution limit imposed by the Heisenberg uncertainty principle: time resolution and frequency resolution cannot both be made arbitrarily fine. The Wavelet Transform (WT) was developed to overcome some of the resolution-related problems of the STFT. Any signal can be analyzed with this alternative approach, called multiresolution analysis (MRA).

MRA, as implied by its name, analyzes the signal at different frequencies with different resolutions. It is designed to give good time resolution and poor frequency resolution at high frequencies, and good frequency resolution and poor time resolution at low frequencies. The Continuous Wavelet Transform (CWT), which is used in many different applications, is defined for a translation *τ* and a scale *s* as follows:

$$
\Psi_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int x(t) \cdot \psi^{*}\left(\frac{t-\tau}{s}\right) dt \tag{1}
$$
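As an illustration of Eq. (1), the sketch below approximates a single CWT coefficient with a Riemann sum. The Mexican hat wavelet, the sampling step `dt`, and the names `mexican_hat` and `cwt_coefficient` are assumptions chosen for the example; the text does not specify a mother wavelet.

```python
import math

def mexican_hat(t):
    """Real-valued Mexican hat (Ricker) wavelet, left unnormalized (assumed choice)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def cwt_coefficient(x, dt, tau, s, psi=mexican_hat):
    """Approximate Eq. (1) at one (tau, s) pair with a Riemann sum.

    x   : list of samples x(t_n), taken at t_n = n * dt
    psi : real-valued mother wavelet, so the complex conjugate is a no-op
    """
    if s == 0:
        raise ValueError("scale s must be non-zero")
    total = 0.0
    for n, xn in enumerate(x):
        t = n * dt
        total += xn * psi((t - tau) / s)
    return total * dt / math.sqrt(abs(s))

# Test signal: one bump shaped like the wavelet, centred at t = 5 with scale 0.5.
dt = 0.01
x = [mexican_hat((n * dt - 5.0) / 0.5) for n in range(1000)]

# The coefficient is large where translation and scale match the bump,
# and small where the wavelet is shifted away from it.
at_match = cwt_coefficient(x, dt, tau=5.0, s=0.5)
off_position = cwt_coefficient(x, dt, tau=8.0, s=0.5)
print(at_match, off_position)
```

Evaluating `cwt_coefficient` over a grid of translations and scales would give the full time-scale map; the double loop this requires is one reason discretized transforms are preferred for digital signals.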

Because the signals used here are digital, it is more useful to use the semi-discrete wavelet transform (discretized on a dyadic grid, *s* = 2<sup>*j*</sup> and *τ* = *k*·2<sup>*j*</sup>) or the Discrete Wavelet Transform (DWT). The DWT analyzes the signal in different frequency bands with different resolutions by decomposing it into a coarse approximation and detail information [5].

The decomposition of the signal into different frequency bands is simply obtained by successive highpass and lowpass filtering of the time domain signal. The original signal x[n] is first passed through a halfband highpass filter g[n] and a lowpass filter h[n]. This constitutes one level of decomposition and can mathematically be expressed as follows:

$$y_{high}[k] = \sum_{n} x[n] \cdot g[2k - n] \tag{2}$$

$$y_{low}[k] = \sum_{n} x[n] \cdot h[2k - n] \tag{3}$$

where *y*<sub>high</sub>[*k*] and *y*<sub>low</sub>[*k*] are the outputs of the highpass and lowpass filters, respectively, after subsampling by 2. This decomposition halves the time resolution, because only half as many samples now characterize the entire signal.

However, this operation doubles the frequency resolution, since each output now spans only half the previous frequency band, effectively reducing the uncertainty in frequency by half. The above procedure, also known as subband coding, can be repeated on the lowpass output for further decomposition.
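The filtering and subsampling of Eqs. (2) and (3), repeated on the lowpass branch, can be sketched as follows. The Haar filter pair, the periodic extension at the signal boundary, and the names `analysis_step` and `dwt` are assumptions made for this example; the text does not state which filters or boundary handling are used.

```python
import math

def analysis_step(x, h, g):
    """One decomposition level: y[k] = sum_n x[n] * f[2k - n] for f in (h, g).

    The signal is extended periodically at the boundaries (an assumption).
    Returns (y_low, y_high), each with half as many samples as x.
    """
    n_samples = len(x)
    if n_samples % 2:
        raise ValueError("signal length must be even")

    def filter_and_downsample(f):
        out = []
        for k in range(n_samples // 2):
            acc = 0.0
            for m, fm in enumerate(f):  # m = 2k - n, hence n = 2k - m
                acc += x[(2 * k - m) % n_samples] * fm
            out.append(acc)
        return out

    return filter_and_downsample(h), filter_and_downsample(g)

def dwt(x, h, g, levels):
    """Subband coding: split the lowpass output again at every level."""
    approx = list(x)
    details = []
    for _ in range(levels):
        approx, detail = analysis_step(approx, h, g)
        details.append(detail)
    return approx, details

r = 1.0 / math.sqrt(2.0)
HAAR_LOW = [r, r]    # h[n], lowpass
HAAR_HIGH = [r, -r]  # g[n], highpass

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = dwt(signal, HAAR_LOW, HAAR_HIGH, levels=2)
print(len(approx), [len(d) for d in details])
```

Each level halves the number of samples in the approximation, which is the loss of time resolution described above. With an orthonormal pair such as Haar, the total energy of the coefficients equals that of the input.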

The wavelet packet method is a generalization of wavelet decomposition that offers a richer signal analysis: both the approximation and the detail branches are split at every level. Wavelet packet atoms are waveforms indexed by three naturally interpreted parameters: position, scale (as in wavelet decomposition), and frequency. The most suitable decomposition of a given signal is then selected with respect to an entropy-based criterion.
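A minimal sketch of entropy-based selection is given below. It assumes Haar filters, the additive Shannon entropy cost, and a bottom-up rule that keeps a split only when the children cost less than the parent; the text does not specify the criterion beyond "entropy-based", and the names `haar_split`, `shannon_cost`, and `best_basis` are hypothetical.

```python
import math

R = 1.0 / math.sqrt(2.0)

def haar_split(x):
    """Split a node into lowpass and highpass children (Haar, non-overlapping pairs)."""
    low = [(x[i] + x[i + 1]) * R for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * R for i in range(0, len(x) - 1, 2)]
    return low, high

def shannon_cost(c):
    """Additive Shannon entropy cost: -sum c^2 * log(c^2), taking 0 * log(0) = 0."""
    return -sum(v * v * math.log(v * v) for v in c if v != 0.0)

def best_basis(x, max_level, path=""):
    """Return (cost, leaves); leaves is a list of (path, coefficients).

    In a path, 'a' marks a lowpass branch and 'd' a highpass branch.
    """
    own_cost = shannon_cost(x)
    if max_level == 0 or len(x) < 2 or len(x) % 2:
        return own_cost, [(path, x)]
    low, high = haar_split(x)
    cost_l, leaves_l = best_basis(low, max_level - 1, path + "a")
    cost_h, leaves_h = best_basis(high, max_level - 1, path + "d")
    if cost_l + cost_h < own_cost:
        # Splitting concentrates the energy in fewer coefficients: keep the children.
        return cost_l + cost_h, leaves_l + leaves_h
    return own_cost, [(path, x)]

# Unit-energy alternating signal: all of its energy sits in the highest band.
c = 1.0 / math.sqrt(8.0)
signal = [c if n % 2 == 0 else -c for n in range(8)]

cost, leaves = best_basis(signal, max_level=3)
for path, coeffs in leaves:
    print(path, coeffs)
```

For this signal the selected tree leaves the lowpass branch unsplit and keeps refining the highpass branch, something the plain wavelet decomposition cannot do because it only ever splits the lowpass output.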
