A Voice Signal Filtering Methods for Speaker Biometric Identification

*Eugene Fedorov, Tetyana Utkina and Tetyana Neskorodeva*

### **Abstract**

Voice signal filtering is the preliminary stage of biometric identification of a person by voice. For biometric identification, the following methods of noise suppression in a voice signal are considered and numerically investigated: smoothing adaptive linear time filtering (the minimum root mean square error algorithm, the recursive least squares algorithm, the Kalman filtering algorithm, and the Lee algorithm); smoothing adaptive linear frequency filtering (the generalized method and the MLEE (maximum likelihood envelope estimation) method); wavelet analysis with threshold processing (the universal threshold, the SURE (Stein's Unbiased Risk Estimator) threshold, the minimax threshold, the FDR (False Discovery Rate) threshold, and the Bayesian threshold); smoothing non-adaptive linear time filtering (the arithmetic mean filter, the normalized Gaussian filter, and the normalized binomial filter); and smoothing nonlinear filtering (the geometric mean filter, the harmonic mean filter, the contraharmonic filter, the *α*-trimmed mean filter, the median filter, the rank filter, the midpoint filter, the conservative filter, and the morphological filter). Results of a numerical study of the denoising methods are obtained for voice signals of speakers from the TIMIT (Texas Instruments and Massachusetts Institute of Technology) database corrupted by additive Gaussian noise and multiplicative Gaussian noise.

**Keywords:** speaker biometric identification, voice signal filtering methods, smoothing adaptive linear filtering, wavelet analysis with threshold processing, smoothing non-adaptive linear filtering, smoothing nonlinear filtering

### **1. Introduction**

Voice signal filtering is the preliminary stage of biometric identification of a person by voice. Methods for cleaning a signal from noise emerged and developed extensively in the twentieth century. With the development of wavelet analysis, the wavelet filter joined the conventional time and frequency filters.

Noise (interference) is sound from an undesirable additional source that is added to the desired signal during its recording or its transmission over a communication channel.

Noise can be classified by the following features: periodicity/aperiodicity; additive/multiplicative; continuity/impulsivity; bandwidth in the signal spectrum; color.

By continuity/impulsivity, noises are divided into continuous noise, impulse (point) noise, and combined continuous and impulse noise.

By bandwidth, noises are divided into:


Among colored noises, the most difficult to filter is white noise, which has a uniform energy spectrum over the whole frequency range. The most widespread kind of white noise is Gaussian noise.

Additive and multiplicative continuous and continuous-impulse noises are removed from a signal by wavelet analysis with threshold processing, by smoothing linear filters and many nonlinear filters, and by spectral subtraction. Impulse noises are removed by many smoothing nonlinear filters. Additive aperiodic noise is removed by low-pass filters. Additive periodic noise is removed by band-pass and band-stop (rejection) filters.

### **2. The smoothing adaptive linear time filtering**

*Adaptive linear time filters* are linear filters with an adaptive impulse response function [1–3].

#### **2.1 Algorithm of the minimum root mean square error**

The minimum root mean square error algorithm, applied to a signal *x*(*n*) of length *N*, is as follows [1]:

1. Initialize the impulse response function

$$\mathbf{h} = \begin{pmatrix} h_1 \\ \dots \\ h_{2M+1} \end{pmatrix} = \begin{pmatrix} h(-M) \\ \dots \\ h(M) \end{pmatrix} = \begin{pmatrix} 0 \\ \dots \\ 0 \end{pmatrix}. \tag{1}$$

2. Set *n* = *M*.

3. Form the noise vector from the noise signal

$$\mathbf{v} = \begin{pmatrix} v_1 \\ \dots \\ v_{2M+1} \end{pmatrix} = \begin{pmatrix} v(n-M) \\ \dots \\ v(n+M) \end{pmatrix}. \tag{2}$$

4. Filter the signal (obtain the noise estimate)

$$\breve{z}(n) = \mathbf{h}^T \mathbf{v}. \tag{3}$$

5. Calculate the current value of the error signal

$$e(n) = x(n) - \breve{z}(n). \tag{4}$$

6. Adapt the impulse response function

$$\mathbf{h} = \mathbf{h} + \mu \mathbf{v} e(n), \tag{5}$$

where 0 < *μ* < 1.

7. If *n* < *N* − *M*, then set *n* = *n* + 1 and go to step 3.

The result of the algorithm is the signal *e*(*n*), where *e*(*n*) ≈ *s*(*n*).
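The steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function name `lms_denoise`, the default step size, the 0-based indexing, and the choice to leave the *M* border samples unfiltered are assumptions, and a reference noise signal `v` is assumed to be available, as step 3 requires.

```python
def lms_denoise(x, v, M=1, mu=0.05):
    """Sketch of steps 1-7: x is the noisy signal, v a reference noise signal.

    Returns the error signal e(n), which approximates the clean signal s(n).
    Border samples (the first and last M) are copied unchanged (assumption).
    """
    N = len(x)
    h = [0.0] * (2 * M + 1)                      # step 1: zero impulse response
    e = list(x)
    for n in range(M, N - M):                    # steps 2 and 7 (0-based indices)
        window = v[n - M:n + M + 1]              # step 3: noise vector, Eq. (2)
        z = sum(hk * vk for hk, vk in zip(h, window))        # step 4, Eq. (3)
        e[n] = x[n] - z                          # step 5, Eq. (4)
        h = [hk + mu * vk * e[n] for hk, vk in zip(h, window)]  # step 6, Eq. (5)
    return e
```

If the input consists of noise only (`x` equal to `v`), the error signal should decay towards zero as the impulse response adapts, which gives a quick sanity check.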

#### **2.2 Recursive least squares algorithm**

The recursive least squares algorithm, applied to a signal *x*(*n*) of length *N*, is as follows [1]:

1. Initialize the impulse response function and the adaptation matrix

$$\mathbf{h} = \begin{pmatrix} h_1 \\ \dots \\ h_{2M+1} \end{pmatrix} = \begin{pmatrix} h(-M) \\ \dots \\ h(M) \end{pmatrix} = \begin{pmatrix} 0 \\ \dots \\ 0 \end{pmatrix}, \quad \mathbf{P} = \begin{pmatrix} p_{11} & \dots & p_{1,2M+1} \\ \dots & \dots & \dots \\ p_{2M+1,1} & \dots & p_{2M+1,2M+1} \end{pmatrix} = \lambda \mathbf{I}, \tag{6}$$

where *λ* is a regularization parameter that is small at a high signal-to-noise ratio and large at a low signal-to-noise ratio.

2. Set *n* = *M*.

3. Form the noise vector

$$\mathbf{v} = \begin{pmatrix} v_1 \\ \dots \\ v_{2M+1} \end{pmatrix} = \begin{pmatrix} v(n-M) \\ \dots \\ v(n+M) \end{pmatrix}. \tag{7}$$

4. Filter the signal (obtain the noise estimate)

$$\breve{z}(n) = \mathbf{h}^T \mathbf{v}. \tag{8}$$

5. Calculate the current value of the error signal

$$e(n) = x(n) - \breve{z}(n). \tag{9}$$

6. Calculate the adaptive gain vector **Γ**

$$\Gamma = \frac{\mathbf{P}\mathbf{v}}{\mathbf{v}^T \mathbf{P} \mathbf{v} + r}, \tag{10}$$

where 0 < *r* < 1.

7. Calculate the estimate covariance matrix **P**

$$\mathbf{P} = \frac{1}{r} \left( \mathbf{P} - \frac{\mathbf{P} \mathbf{v} \mathbf{v}^T \mathbf{P}}{\mathbf{v}^T \mathbf{P} \mathbf{v} + r} \right). \tag{11}$$

8. Calculate the impulse response function

$$\mathbf{h} = \mathbf{h} + \Gamma e(n). \tag{12}$$

9. If *n* < *N* − *M*, then set *n* = *n* + 1 and go to step 3.

The result of the algorithm is the signal *e*(*n*), where *e*(*n*) ≈ *s*(*n*).
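A minimal pure-Python sketch of the recursive least squares steps follows. The function name `rls_denoise`, the default values of `lam` and `r`, the 0-based indexing, and the unfiltered border samples are assumptions made for illustration; a reference noise signal `v` is again assumed to be available.

```python
def rls_denoise(x, v, M=1, lam=1.0, r=0.99):
    """Sketch of the RLS steps: x is the noisy signal, v a reference noise signal."""
    N, K = len(x), 2 * M + 1
    h = [0.0] * K                                            # Eq. (6): h = 0
    P = [[lam if i == j else 0.0 for j in range(K)] for i in range(K)]  # P = lam*I
    e = list(x)                                              # borders stay unfiltered
    for n in range(M, N - M):
        w = v[n - M:n + M + 1]                               # noise vector, Eq. (7)
        z = sum(hk * wk for hk, wk in zip(h, w))             # noise estimate, Eq. (8)
        e[n] = x[n] - z                                      # error signal, Eq. (9)
        Pw = [sum(P[i][j] * w[j] for j in range(K)) for i in range(K)]  # P v
        denom = sum(wk * pk for wk, pk in zip(w, Pw)) + r    # v^T P v + r
        gain = [pk / denom for pk in Pw]                     # gain vector, Eq. (10)
        # Eq. (11); P is symmetric, so v^T P equals (P v)^T
        P = [[(P[i][j] - Pw[i] * Pw[j] / denom) / r for j in range(K)]
             for i in range(K)]
        h = [hk + gk * e[n] for hk, gk in zip(h, gain)]      # Eq. (12)
    return e
```

Compared with the gradient update of Section 2.1, the matrix **P** rescales the update direction, which usually speeds up adaptation at the cost of more arithmetic per sample.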

#### **2.3 Kalman filtering algorithm**

The Kalman filtering algorithm, applied to a signal *x*(*n*) of length *N*, is as follows [1]:

1. Initialize the impulse response function, the estimate covariance matrix, and the process white noise covariance matrix

$$\mathbf{h} = \begin{pmatrix} h_1 \\ \dots \\ h_{2M+1} \end{pmatrix} = \begin{pmatrix} h(-M) \\ \dots \\ h(M) \end{pmatrix} = \begin{pmatrix} 0 \\ \dots \\ 0 \end{pmatrix}, \quad \mathbf{P} = \begin{pmatrix} p_{11} & \dots & p_{1,2M+1} \\ \dots & \dots & \dots \\ p_{2M+1,1} & \dots & p_{2M+1,2M+1} \end{pmatrix} = \lambda \mathbf{I}, \tag{13}$$

$$\mathbf{Q} = \begin{pmatrix} q_{11} & \dots & q_{1,2M+1} \\ \dots & \dots & \dots \\ q_{2M+1,1} & \dots & q_{2M+1,2M+1} \end{pmatrix} = \sigma_1^2 \mathbf{I},$$

where *λ* is a regularization parameter that is small at a high signal-to-noise ratio and large at a low signal-to-noise ratio, and *σ*<sub>1</sub><sup>2</sup> is the variance of the zero-mean white process noise.

2. Set *n* = *M*.

3. Form the noise vector from the noise signal

$$\mathbf{v} = \begin{pmatrix} v_1 \\ \dots \\ v_{2M+1} \end{pmatrix} = \begin{pmatrix} v(n-M) \\ \dots \\ v(n+M) \end{pmatrix}. \tag{14}$$

4. Filter the signal (obtain the noise estimate)

$$\breve{z}(n) = \mathbf{h}^T \mathbf{v}. \tag{15}$$

5. Calculate the current value of the error signal

$$e(n) = x(n) - \breve{z}(n). \tag{16}$$

6. Calculate the adaptive gain vector **Γ**

$$\Gamma = \frac{\mathbf{P}\mathbf{v}}{\mathbf{v}^T \mathbf{P} \mathbf{v} + \sigma_2^2}, \tag{17}$$

where *σ*<sub>2</sub><sup>2</sup> is the variance of the zero-mean white measurement noise.

7. Calculate the estimate covariance matrix **P**

$$\mathbf{P} = \mathbf{P} - \frac{\mathbf{P} \mathbf{v} \mathbf{v}^T \mathbf{P}}{\mathbf{v}^T \mathbf{P} \mathbf{v} + \sigma_2^2} + \mathbf{Q}. \tag{18}$$

8. Calculate the impulse response function

$$\mathbf{h} = \mathbf{h} + \Gamma e(n). \tag{19}$$

9. If *n* < *N* − *M*, then set *n* = *n* + 1 and go to step 3.

The result of the algorithm is the signal *e*(*n*), where *e*(*n*) ≈ *s*(*n*).
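The Kalman variant differs from recursive least squares only in the gain denominator and in the covariance update, as the following hedged sketch illustrates. The function name `kalman_denoise`, the parameter names `var_process` and `var_meas` (standing for *σ*<sub>1</sub><sup>2</sup> and *σ*<sub>2</sub><sup>2</sup>), their default values, and the border handling are assumptions, not values taken from the chapter.

```python
def kalman_denoise(x, v, M=1, lam=1.0, var_process=1e-4, var_meas=1e-2):
    """Sketch of the Kalman steps: x is the noisy signal, v a reference noise signal."""
    N, K = len(x), 2 * M + 1
    h = [0.0] * K                                            # Eq. (13): h = 0
    P = [[lam if i == j else 0.0 for j in range(K)] for i in range(K)]  # P = lam*I
    e = list(x)                                              # borders stay unfiltered
    for n in range(M, N - M):
        w = v[n - M:n + M + 1]                               # noise vector, Eq. (14)
        e[n] = x[n] - sum(hk * wk for hk, wk in zip(h, w))   # Eqs. (15)-(16)
        Pw = [sum(P[i][j] * w[j] for j in range(K)) for i in range(K)]  # P v
        denom = sum(wk * pk for wk, pk in zip(w, Pw)) + var_meas
        gain = [pk / denom for pk in Pw]                     # gain vector, Eq. (17)
        # Eq. (18): subtract the rank-one term and add Q = var_process * I
        P = [[P[i][j] - Pw[i] * Pw[j] / denom + (var_process if i == j else 0.0)
              for j in range(K)] for i in range(K)]
        h = [hk + gk * e[n] for hk, gk in zip(h, gain)]      # Eq. (19)
    return e
```

Adding **Q** at every step keeps **P** from collapsing to zero, so the filter can keep tracking a slowly changing noise path.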

#### **2.4 Lee algorithm**

The Lee algorithm [2], applied to a signal *x*(*n*) of length *N*, is as follows:

1. Calculate the local mean for each signal sample

$$\mu(n) = \frac{1}{2M + 1} \sum_{m \in U_n} x(m), \quad n \in \overline{M, N - M + 1}, \tag{20}$$

where *U<sub>n</sub>* is the neighborhood of sample *n* of size 2*M* + 1.

2. Calculate the local variance for each signal sample

$$\sigma_x^2(n) = \frac{1}{2M + 1} \sum_{m \in U_n} x^2(m) - \mu^2(n), \quad n \in \overline{M, N - M + 1}. \tag{21}$$

3. Estimate the noise variance as the mean of the local variances

$$\sigma_\nu^2 = \frac{1}{N - 2M} \sum_{n} \sigma_x^2(n), \quad n \in \overline{M, N - M + 1}. \tag{22}$$

4. Execute adaptive filtering of the signal

$$y(n) = \sum_{m=-M}^{M} h(m) \, x(n-m), \quad n \in \overline{M, N-M+1}, \tag{23}$$

where the impulse response function depends on the local variance:

$$h(m) = \begin{cases} \frac{1}{2M+1} + \frac{\max\left\{0, \sigma_x^2(n) - \sigma_\nu^2\right\}}{\sigma_x^2(n)} \left(1 - \frac{1}{2M+1}\right), & m = 0 \\ \frac{1}{2M+1} - \frac{\max\left\{0, \sigma_x^2(n) - \sigma_\nu^2\right\}}{\sigma_x^2(n)} \cdot \frac{1}{2M+1}, & \text{otherwise} \end{cases}. \tag{24}$$

**Figure 1.** *Source signal for smoothing adaptive linear time filtering.*

**Figure 2.** *A signal with an additive Gaussian noise for smoothing adaptive linear time filtering.*

**Figure 3.** *The signal denoised by the adaptive filter.*

#### **Example**

**Figure 1** shows the source signal, **Figure 2** shows the noisy signal (additive white Gaussian noise with mean 0 and variance 0.001 is added), and **Figure 3** shows the signal filtered with *M* = 1.
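The Lee algorithm can be sketched in plain Python as follows. Substituting Eq. (24) into Eq. (23) gives the equivalent form *y*(*n*) = *μ*(*n*) + *k*(*n*)(*x*(*n*) − *μ*(*n*)) with *k*(*n*) = max{0, *σ<sub>x</sub>*<sup>2</sup>(*n*) − *σ<sub>ν</sub>*<sup>2</sup>}/*σ<sub>x</sub>*<sup>2</sup>(*n*), which is what the code uses. The function name `lee_filter`, the 0-based indexing, the unfiltered border samples, and the guard for zero local variance are assumptions of this sketch.

```python
def lee_filter(x, M=1):
    """Sketch of the Lee algorithm; border samples are copied unchanged."""
    N, K = len(x), 2 * M + 1
    idx = range(M, N - M)
    # Eq. (20): local mean of each neighborhood
    mean = {n: sum(x[n - M:n + M + 1]) / K for n in idx}
    # Eq. (21): local variance (mean of squares minus squared mean)
    var = {n: sum(t * t for t in x[n - M:n + M + 1]) / K - mean[n] ** 2
           for n in idx}
    # Eq. (22): noise variance estimated as the mean of the local variances
    noise_var = sum(var.values()) / len(var)
    y = list(x)
    for n in idx:
        # gain k(n); set to 0 when the local variance is not positive (assumption)
        k = max(0.0, var[n] - noise_var) / var[n] if var[n] > 0 else 0.0
        y[n] = mean[n] + k * (x[n] - mean[n])    # equivalent to Eqs. (23)-(24)
    return y
```

In flat regions the gain is close to 0 and the filter behaves like an arithmetic mean; where the local variance is well above the noise variance the gain approaches 1 and the sample is mostly preserved.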

### **3. The smoothing adaptive linear frequency filtering**

*Adaptive linear frequency filters* are linear filters with an adaptive transfer function [4]. Smoothing adaptive linear frequency filtering is also called *spectral subtraction*. Let *X<sub>p</sub>*(*k*) be the spectrum of the noisy signal on the *p*-th frame, *V*(*k*) the mean noise spectrum, and *S<sub>p</sub>*(*k*) the spectrum of the restored signal on the *p*-th frame.

Adaptive linear frequency filtering is the inverse discrete Fourier transform of the product of the adaptive transfer function of the filter *H<sub>p</sub>*(*k*) and the signal spectrum *X<sub>p</sub>*(*k*) on the *p*-th frame (the frame index *p* is omitted below):

$$y(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k) H(k) e^{j 2\pi k n / N}. \tag{25}$$

The following spectral subtraction methods are selected [1]:

1. General method (proposed by Berouti, Schwartz and Makhoul)

$$S_p(k) = H_p(k) X_p(k), \tag{26}$$

$$H_p(k) = \begin{cases} G \left( \frac{|X_p(k)|^\gamma - \alpha |V(k)|^\gamma}{|X_p(k)|^\gamma} \right)^{1/\gamma}, & G \left( \frac{|X_p(k)|^\gamma - \alpha |V(k)|^\gamma}{|X_p(k)|^\gamma} \right)^{1/\gamma} > \beta |V(k)| \\ \beta |V(k)|, & \text{otherwise} \end{cases} \tag{27}$$

where *G*, *α*, *β*, *γ* are parameters.

2. The Boll method

$$H_p(k) = \begin{cases} \frac{|X_p(k)| - |V(k)|}{|X_p(k)|}, & |X_p(k)| - |V(k)| > 0 \\ 0, & \text{otherwise} \end{cases}. \tag{28}$$

3. Wiener filtering

$$H_p(k) = \begin{cases} \frac{|X_p(k)|^2 - |V(k)|^2}{|X_p(k)|^2}, & |X_p(k)|^2 - |V(k)|^2 > 0 \\ 0, & \text{otherwise} \end{cases}. \tag{29}$$

4. The MLEE method

$$S_p(k) = H_p(k) X_p(k),$$

$$H_p(k) = \begin{cases} \frac{1}{2} + \frac{1}{2} \sqrt{\frac{|X_p(k)|^2 - |V(k)|^2}{|X_p(k)|^2}}, & |X_p(k)|^2 - |V(k)|^2 > 0 \\ 0, & \text{otherwise} \end{cases}. \tag{30}$$
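The four transfer functions can be written as per-bin gain functions. The sketch below takes the magnitudes |*X<sub>p</sub>*(*k*)| and |*V*(*k*)| of a single frequency bin as non-negative floats; computing the spectra themselves (framing and DFT) is outside its scope. The function names are hypothetical, and returning the floor value when the bracket in Eq. (27) is not positive (where the real-valued power 1/*γ* is undefined) is an assumption of this sketch.

```python
import math

def gain_general(X, V, G=1.0, alpha=6.0, beta=0.1, gamma=2.0):
    """Eq. (27) for one bin; X = |X_p(k)|, V = |V(k)|."""
    floor = beta * V
    if X <= 0.0:
        return floor
    ratio = (X ** gamma - alpha * V ** gamma) / X ** gamma
    if ratio <= 0.0:                 # assumption: fall back to the spectral floor
        return floor
    H = G * ratio ** (1.0 / gamma)
    return H if H > floor else floor

def gain_boll(X, V):
    """Eq. (28): magnitude spectral subtraction."""
    return (X - V) / X if X > V else 0.0

def gain_wiener(X, V):
    """Eq. (29): power spectral subtraction."""
    return (X * X - V * V) / (X * X) if X > V else 0.0

def gain_mlee(X, V):
    """Eq. (30): MLEE gain."""
    return 0.5 + 0.5 * math.sqrt((X * X - V * V) / (X * X)) if X > V else 0.0
```

For a bin with |*X*| = 2 and |*V*| = 1 the Boll gain is 0.5 and the Wiener gain is 0.75, so for the same bin the Boll rule attenuates more strongly.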

**Figure 4.** *Source signal for smoothing adaptive linear frequency filtering.*

*A Voice Signal Filtering Methods for Speaker Biometric Identification DOI: http://dx.doi.org/10.5772/intechopen.101975*

**Figure 5.** *A signal with additive Gaussian noise for smoothing adaptive linear frequency filtering.*

#### **Example**

**Figure 4** shows the source signal and **Figure 5** shows the noisy signal (additive white Gaussian noise with mean 0 and variance 0.001 is added). The signals denoised by filtering are shown for the general method (*G* = 1, *γ* = 2, *α* = 6, *β* = 0.1) (**Figure 6**), the Boll method (**Figure 7**), the Wiener method (**Figure 8**), and the MLEE method (**Figure 9**). For these methods, the frame length Δ*N* = 512 (about 20 ms) was selected. The test signal was the word "Sasha" recorded with a sampling rate of 22050 Hz, 8 bits, mono.

**Figure 6.** *The signal denoised by means of filtering according to general method.*

**Figure 7.** *The signal denoised by means of filtering according to Boll.*

**Figure 8.** *The signal denoised by means of filtering according to Wiener.*

### **4. Wavelet analysis threshold processing**

In wavelet analysis, soft and hard (rigid) threshold processing is widely used [5].


**Figure 9.** *The signal denoised by means of filtering according to MLEE.*

#### **4.1 Signal analysis**

1. Initialization.

Set the decomposition level number *i* = 1.

$$c_{0x} = s(x), \quad x \in \overline{0, N/2^{i-1} - 1}, \tag{31}$$

where *s*(*x*) is the original signal of length *N*.

2. At the current *i*-th decomposition level, convolve the signal with the impulse response functions *g*(*k*) and *h*(*k*) of the FIR-HP (finite impulse response, high-pass) and FIR-LP (finite impulse response, low-pass) filters, respectively

$$d_{im} = \sqrt{2} \sum_{k=0}^{N/2^{i-1}-1} c_{i-1,k} \, g(k+2m), \quad m \in \overline{0, N/2^i - 1}, \tag{32}$$

$$c_{im} = \sqrt{2} \sum_{k=0}^{N/2^{i-1}-1} c_{i-1,k} \, h(k+2m), \quad m \in \overline{0, N/2^i - 1}. \tag{33}$$

3. If *i* < *P*, where *P* is the number of decomposition levels, then set *i* = *i* + 1 and go to step 2.

#### **4.2 Decomposition coefficients conversion**

1. Set the decomposition level number *i* = 1.

2. Create the vector of absolute values of the detail coefficients arranged in ascending order

$$a_i = \left( |d_{i0}|, \dots, |d_{i, N/2^i - 1}| \right), \quad |d_{im}| < |d_{i, m+1}|. \tag{34}$$

3. Calculate the noise standard deviation from the median of the resulting vector

$$\sigma_i = \frac{\operatorname{median}(a_i)}{0.6745}, \tag{35}$$

where median(*x*) is a function that returns the median of a vector *x*.

4. Calculate the threshold *T<sub>i</sub>* by one of the following methods.

(a) Calculate the universal threshold

$$T_i = \sigma_i \sqrt{2 \ln \left( N / 2^i \right)}. \tag{36}$$

(b) Calculate the SURE threshold.

(i) Define a threshold based on minimal risk

$$r_{im} = 1 + \frac{-2(m-1) + \sum_{k=0}^{m-1} (a_{ik})^2 + (a_{im})^2 \left(N/2^i - 1 - m\right)}{N/2^i}, \quad m \in \overline{0, N/2^i - 1}, \tag{37}$$

$$m^* = \arg\min_m r_{im}, \quad m \in \overline{0, N/2^i - 1}, \quad \tilde{T}_i = a_{im^*}. \tag{38}$$

(ii) Calculate the SURE threshold from the threshold obtained

$$T_i = \begin{cases} \sigma_i \sqrt{2 \ln \left( N / 2^i \right)}, & \sum_{m=0}^{N/2^i - 1} d_{im}^2 - \left( N / 2^i \right) \sigma_i^2 \le \varepsilon_i \\ \tilde{T}_i, & \sum_{m=0}^{N/2^i - 1} d_{im}^2 - \left( N / 2^i \right) \sigma_i^2 > \varepsilon_i \end{cases}, \quad \varepsilon_i = \sigma_i^2 \sqrt{\left( N / 2^i \right) \ln^3 \left( N / 2^i \right)}. \tag{39}$$

(c) Calculate the minimax threshold

$$T_i = \begin{cases} \sigma_i \left( 0.3936 + 0.1829 \ln \left( N/2^i \right) \right), & N/2^i > 32 \\ 0, & N/2^i \le 32 \end{cases}. \tag{40}$$

(d) Calculate the FDR threshold

$$\mu_i = \frac{\sum_{m=1}^{N/2^i - 1} a_{im}}{N/2^i - 1}, \quad \Delta_{im} = \operatorname{erfc}\left(\frac{1}{\sqrt{2}} \left| \frac{a_{im} - \mu_i}{\sigma_i} \right| \right) - q \frac{m}{N/2^i - 1}, \quad m \in \overline{0, N/2^i - 1}, \tag{41}$$

$$m^* = \arg\min_{m} \left( \operatorname{sgn}\left(\Delta_{im}\right) \operatorname{sgn}\left(\Delta_{i,m+1}\right) \right), \quad m \in \overline{0, N/2^i - 2}, \quad T_i = a_{im^*}, \tag{42}$$

where *q* is a parameter, *q* ∈ (0, 0.5] (typically *q* = 0.05), and erfc(*x*) is the complementary error function.

(e) Calculate the Bayesian threshold (using the quasi-Cauchy distribution, which is the most effective).

(i) Calculate the function *β*

$$\beta(a_{im}) = \frac{g(a_{im})}{\varphi(a_{im})} - 1, \quad m \in \overline{0, N/2^i - 1}, \quad g(x) = (\gamma \ast \varphi)(x) = \frac{1}{\sqrt{2\pi}} x^{-2} \left(1 - e^{-x^2/2}\right), \tag{43}$$

$$\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \quad \gamma(x) = \frac{1}{\sqrt{2\pi}} \frac{\varphi(x) - |x|(1 - \Phi(x))}{\varphi(x)}, \tag{44}$$

where *φ*(*x*) is the standard normal distribution density, Φ(*x*) is the standard normal distribution function, and *γ*(*x*) is the quasi-Cauchy distribution density.

(ii) Calculate the minimum value of the parameter *w<sub>i</sub>*, that is, the value for which Eq. (47) has the solution *T<sub>i</sub>* = $\tilde{T}_i$

$$w_i^{\min} = \frac{\frac{1}{2} \tilde{T}_i^2 e^{-\tilde{T}_i^2/2}}{\frac{1}{2} \tilde{T}_i^2 e^{-\tilde{T}_i^2/2} + \Phi\left(\tilde{T}_i\right) - \tilde{T}_i \varphi\left(\tilde{T}_i\right) - \frac{1}{2}}, \quad \tilde{T}_i = \sqrt{2\ln\left(N/2^i\right)}. \tag{45}$$

(iii) Find the value of the parameter *w<sub>i</sub>* by numerically solving the following equation on the interval [*w<sub>i</sub>*<sup>min</sup>, 1]

$$S_i(w_i) = \sum_{m=0}^{N/2^i - 1} \frac{\beta(a_{im})}{1 + w_i \beta(a_{im})} = 0. \tag{46}$$

(iv) Find the Bayesian threshold *T<sub>i</sub>* by numerically solving the following equation on the interval [0, *T*<sup>max</sup>]

$$-\Phi(T_i) + T_i \varphi(T_i) + \frac{1}{2} + \frac{1}{2} T_i^2 e^{-T_i^2/2} \left(1/w_i - 1\right) = 0. \tag{47}$$

5. Execute threshold processing of the detail coefficients.

(a) Execute soft threshold processing (for the universal, minimax, Bayesian, and SURE thresholds)

$$\tilde{d}_{im} = \begin{cases} d_{im} - T_i, & d_{im} > T_i \\ d_{im} + T_i, & d_{im} < -T_i \\ 0, & |d_{im}| \le T_i \end{cases}, \quad m \in \overline{0, N/2^i - 1}. \tag{48}$$

(b) Execute hard (rigid) threshold processing (for the universal, minimax, Bayesian, SURE, and FDR thresholds)

$$\tilde{d}_{im} = \begin{cases} d_{im}, & |d_{im}| > T_i \\ 0, & |d_{im}| \le T_i \end{cases}, \quad m \in \overline{0, N/2^i - 1}. \tag{49}$$

6. If *i* < *P*, then set *i* = *i* + 1 and go to step 2.
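The simplest branches of the procedure above (the noise estimate of Eq. (35), the universal and minimax thresholds of Eqs. (36) and (40), and soft and hard threshold processing of Eqs. (48) and (49)) can be sketched as follows for the detail coefficients of one level. The function names are hypothetical; the list `d` is assumed to hold the *N*/2<sup>*i*</sup> detail coefficients of level *i*, so that `len(d)` plays the role of *N*/2<sup>*i*</sup>. The SURE, FDR, and Bayesian thresholds are not covered by this sketch.

```python
import math
from statistics import median

def noise_sigma(d):
    """Eq. (35): robust noise standard deviation from the median of |d|."""
    return median([abs(t) for t in d]) / 0.6745

def universal_threshold(d):
    """Eq. (36), with len(d) standing for N / 2^i."""
    return noise_sigma(d) * math.sqrt(2.0 * math.log(len(d)))

def minimax_threshold(d):
    """Eq. (40): zero threshold for 32 coefficients or fewer."""
    n = len(d)
    return noise_sigma(d) * (0.3936 + 0.1829 * math.log(n)) if n > 32 else 0.0

def soft_threshold(d, T):
    """Eq. (48): shrink every coefficient towards zero by T."""
    return [t - T if t > T else t + T if t < -T else 0.0 for t in d]

def hard_threshold(d, T):
    """Eq. (49): keep coefficients above T in magnitude, zero the rest."""
    return [t if abs(t) > T else 0.0 for t in d]
```

For example, with threshold 1 the coefficients (3, −3, 0.5) become (2, −2, 0) under soft processing and (3, −3, 0) under hard processing.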

#### **4.3 Signal synthesis**

1. Initialization.

Set the reconstruction level number *i* = *P*.

2. At the current *i*-th reconstruction level, convolve the signal with the impulse response functions *g*(*k*) and *h*(*k*) of the FIR-HP and FIR-LP filters, respectively

$$c_{i-1,n} = \sqrt{2} \sum_{m=0}^{N/2^i - 1} c_{im} \, h(n + 2m) + \sqrt{2} \sum_{m=0}^{N/2^i - 1} \tilde{d}_{im} \, g(n + 2m), \quad n \in \overline{0, N/2^{i-1} - 1}. \tag{50}$$

3. If *i* > 1, then set *i* = *i* − 1 and go to step 2.
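To make the analysis and synthesis loops concrete, the sketch below uses the Haar wavelet, the shortest orthogonal wavelet, instead of the Daubechies filters used in the chapter; this substitution is made only to keep the example short. The function names and the requirement that the signal length be divisible by 2<sup>*P*</sup> are assumptions of this sketch.

```python
import math

S = math.sqrt(2.0)

def haar_analysis(s, P):
    """P-level Haar decomposition: returns (approximation, list of detail vectors)."""
    c, details = list(s), []
    for _ in range(P):                              # one pass per decomposition level
        pairs = list(zip(c[0::2], c[1::2]))
        details.append([(a - b) / S for a, b in pairs])   # high-pass branch
        c = [(a + b) / S for a, b in pairs]               # low-pass branch
    return c, details

def haar_synthesis(c, details):
    """Inverse of haar_analysis: rebuilds the signal from level P down to level 1."""
    c = list(c)
    for d in reversed(details):
        out = []
        for ck, dk in zip(c, d):
            out += [(ck + dk) / S, (ck - dk) / S]
        c = out
    return c
```

Denoising then amounts to calling `haar_analysis`, replacing each detail vector by its thresholded version, and calling `haar_synthesis`; without thresholding the original signal is recovered up to rounding error.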

#### **Example**

**Figure 10** shows the source signal, **Figure 11** shows the noisy signal (additive white Gaussian noise with mean 0 and variance 0.001 is added), and **Figure 12** shows the filtered signal. The soft SURE threshold with a Daubechies wavelet with *L* = 4 vanishing moments was used (i.e., the order of the FIR-HP and FIR-LP filters is *M* = 8).

**Figure 10.** *Source signal for wavelet analysis threshold processing.*

**Figure 11.** *A signal with an additive Gaussian noise for wavelet analysis threshold processing.*

**Figure 12.** *The signal cleaned using a wavelet analysis with threshold processing.*

### **5. The smoothing non-adaptive linear time filtering**

The smoothing non-adaptive linear time filters are low-pass filters [6].

In the case of an FIR-LP filter with a symmetric impulse response function *h*(−*M*), … , *h*(0), … , *h*(*M*), non-adaptive linear time filtering is the convolution of the non-adaptive impulse response function *h*(*m*) with the signal *x*(*n*):

*Recent Advances in Biometrics*

$$y(n) = \sum_{m=-M}^{M} h(m) \, x(n-m). \tag{51}$$

The impulse response functions of the most widespread one-dimensional smoothing linear filters are as follows:

1. Impulse response function of the arithmetic mean filter

$$h(m) = \frac{1}{2M + 1}, m \in \overline{-M, M}.\tag{52}$$

2. Impulse response function of the normalized Gaussian filter

$$h(m) = \frac{\frac{1}{2\pi\sigma^2} \exp\left(-\frac{1}{2}\frac{m^2}{\sigma^2}\right)}{\sum_{l=-M}^{M} \frac{1}{2\pi\sigma^2} \exp\left(-\frac{1}{2}\frac{l^2}{\sigma^2}\right)}, \quad m \in \overline{-M, M}. \tag{53}$$

3. Impulse response function of the normalized binomial filter

$$h(m) = \frac{C_{2M}^{M+m}}{\sum_{l=0}^{2M} C_{2M}^{l}}, \quad C_{n}^{m} = \frac{n!}{m!(n-m)!}, \quad m \in \overline{-M, M}. \tag{54}$$
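The three impulse response functions and the convolution of Eq. (51) can be sketched as follows. The function names are hypothetical, and leaving the *M* border samples unfiltered is an assumption. In the Gaussian kernel the constant factor 1/(2π*σ*<sup>2</sup>) cancels in the normalization, so the code omits it; in the binomial kernel the denominator equals 2<sup>2*M*</sup>.

```python
import math

def mean_kernel(M):
    """Eq. (52): arithmetic mean filter."""
    return [1.0 / (2 * M + 1)] * (2 * M + 1)

def gauss_kernel(M, sigma):
    """Eq. (53): normalized Gaussian filter (constant factor cancels)."""
    w = [math.exp(-0.5 * m * m / sigma ** 2) for m in range(-M, M + 1)]
    total = sum(w)
    return [t / total for t in w]

def binomial_kernel(M):
    """Eq. (54): normalized binomial filter; the binomial sum is 2^(2M)."""
    return [math.comb(2 * M, M + m) / 2 ** (2 * M) for m in range(-M, M + 1)]

def fir_smooth(x, h):
    """Eq. (51): y(n) = sum over m of h(m) x(n - m); borders are copied unchanged."""
    M = len(h) // 2
    y = list(x)
    for n in range(M, len(x) - M):
        y[n] = sum(h[m + M] * x[n - m] for m in range(-M, M + 1))
    return y
```

For *M* = 1 the binomial kernel is (0.25, 0.5, 0.25), a common lightweight approximation of a Gaussian kernel.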

#### **Example**

**Figure 13** shows the source signal, **Figure 14** shows the noisy signal (additive white Gaussian noise with mean 0 and variance 0.001 is added), and **Figure 15** shows the signal filtered with the arithmetic mean filter with *M* = 1.

**Figure 13.** *Source signal for smoothing non-adaptive linear time filtering.*

**Figure 14.** *A signal with an additive Gaussian noise for smoothing non-adaptive linear time filtering.*

**Figure 15.** *The signal denoised by means of the arithmetic mean filter.*

### **6. The smoothing nonlinear filtering**

The smoothing nonlinear filters [6] are low-pass filters.

#### **6.1 Geometric mean, harmonic mean, contraharmonic filters**

1. Geometric mean filter

$$y(n) = \left(\prod_{m=-M}^{M} x(n-m)\right)^{\frac{1}{2M+1}}. \tag{55}$$

2. Harmonic mean filter

$$y(n) = \frac{2M + 1}{\sum_{m=-M}^{M} \frac{1}{x(n-m)}}. \tag{56}$$

3. Contraharmonic filter

$$y(n) = \frac{\sum_{m=-M}^{M} x^{Q+1}(n-m)}{\sum_{m=-M}^{M} x^{Q}(n-m)}. \tag{57}$$

The geometric mean, harmonic mean, and contraharmonic filters remove additive and multiplicative continuous and continuous-impulse noises.

In the case of impulse noise, the harmonic mean filter suppresses only white points. The contraharmonic filter suppresses only black points at *Q* > 0 and only white points at *Q* < 0. At *Q* = −1 it reduces to the harmonic mean filter, and at *Q* = 0 it reduces to the arithmetic mean filter.

Therefore, for suppression of impulse noise it is better to use the *α*-trimmed mean, median, rank, conservative, and morphological filters.
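The three filters of this subsection can be sketched as window functions applied by a common sliding-window helper. The function names are hypothetical, the border samples are left unfiltered by assumption, and all samples are assumed to be strictly positive (for example, 8-bit unsigned audio with no zero samples), since the geometric and harmonic means are not defined otherwise.

```python
import math

def apply_window(x, M, f):
    """Apply the window function f to each neighborhood of size 2M + 1."""
    y = list(x)
    for n in range(M, len(x) - M):
        y[n] = f(x[n - M:n + M + 1])
    return y

def geometric_mean(w):
    """Eq. (55)."""
    return math.prod(w) ** (1.0 / len(w))

def harmonic_mean(w):
    """Eq. (56)."""
    return len(w) / sum(1.0 / t for t in w)

def contraharmonic_mean(w, Q):
    """Eq. (57); Q = 0 gives the arithmetic mean, Q = -1 the harmonic mean."""
    return sum(t ** (Q + 1) for t in w) / sum(t ** Q for t in w)
```

A contraharmonic filter of order 1 with *Q* = 1 can then be applied as `apply_window(x, 1, lambda w: contraharmonic_mean(w, 1))`.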

#### **6.2** *α***-trimmed mean filter**

The *α*-trimmed mean filtering algorithm, applied to a signal *x*(*n*) of length *N*, is as follows:

1. For each signal sample, create a vector from the elements of its neighborhood *U<sub>n</sub>* of size 2*M* + 1

$$a_n = (x(n-M), \dots, x(n+M)), \quad n \in \overline{M, N-M+1}. \tag{58}$$

2. For each signal sample, sort the elements of its vector in ascending order

$$\tilde{a}_n = \operatorname{sort}(a_n), \quad n \in \overline{M, N - M + 1}. \tag{59}$$

3. Execute *α*-trimmed mean filtering of the signal

$$y(n) = \frac{\sum_{m=1+\alpha/2}^{2M+1-\alpha/2} \tilde{a}_n(m)}{2M+1-\alpha}, \quad n \in \overline{M, N-M+1}, \tag{60}$$

where *α* is an even parameter, 0 ≤ *α* ≤ 2*M*.

At *α* = 0 we obtain the arithmetic mean filter, and at *α* = 2*M* we obtain the median filter. The *α*-trimmed mean filter removes additive and multiplicative continuous, continuous-impulse, and impulse noises.
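A short sketch of the *α*-trimmed mean filter follows. The function name, the 0-based indexing, and the unfiltered border samples are assumptions of this sketch. After sorting, *α*/2 elements are dropped at each end of the window and the remaining 2*M* + 1 − *α* elements are averaged.

```python
def alpha_trimmed_mean(x, M=2, alpha=2):
    """Sketch of the alpha-trimmed mean filter; borders are copied unchanged."""
    if alpha % 2 != 0 or not 0 <= alpha <= 2 * M:
        raise ValueError("alpha must be even and satisfy 0 <= alpha <= 2M")
    t = alpha // 2
    y = list(x)
    for n in range(M, len(x) - M):
        w = sorted(x[n - M:n + M + 1])      # sorted neighborhood of sample n
        kept = w[t:len(w) - t]              # drop alpha/2 elements at each end
        y[n] = sum(kept) / len(kept)        # average of the remaining elements
    return y
```

With *M* = 2 and *α* = 2, an isolated impulse is always among the trimmed elements, so it does not affect the output, whereas with *α* = 0 it is averaged into five neighboring samples.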

#### **6.3 Median and rank filters**

Median filtering is defined as

$$y(n) = \underset{m \in U_n}{\operatorname{median}} \{ x(m) \}, \quad n \in \overline{M, N - M + 1}, \tag{61}$$

where *U<sub>n</sub>* is the neighborhood of sample *n* of size 2*M* + 1.

Median filtering is a special case of rank filtering with rank *r* = *M* + 1. In rank filtering, the output is not the central element of the sorted neighborhood but the element whose position corresponds to the rank *r*, where 1 ≤ *r* ≤ 2*M* + 1.

Median and rank filters remove additive and multiplicative continuous, continuous-impulse, and impulse noises.
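Rank filtering reduces to sorting each neighborhood and picking one element, as this sketch shows. The function names and the unfiltered border samples are assumptions; the rank *r* is 1-based, as in the text.

```python
def rank_filter(x, M, r):
    """Return the r-th smallest element (1-based) of each neighborhood."""
    if not 1 <= r <= 2 * M + 1:
        raise ValueError("rank must satisfy 1 <= r <= 2M + 1")
    y = list(x)
    for n in range(M, len(x) - M):
        y[n] = sorted(x[n - M:n + M + 1])[r - 1]
    return y

def median_filter(x, M):
    """Eq. (61): rank filter with rank r = M + 1."""
    return rank_filter(x, M, M + 1)
```

The extreme ranks *r* = 1 and *r* = 2*M* + 1 give the local minimum and the local maximum, respectively.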

#### **6.4 Midpoint filter**

The midpoint filtering algorithm, applied to a signal *x*(*n*) of length *N*, is as follows:

1. For each signal sample, calculate the minimum and maximum values in its neighborhood

$$\alpha(n) = \min_{m \in U_n} \{ x(m) \}, \quad \beta(n) = \max_{m \in U_n} \{ x(m) \}, \quad n \in \overline{M, N - M + 1}, \tag{62}$$

where *U<sub>n</sub>* is the neighborhood of sample *n* of size 2*M* + 1.

2. Execute midpoint filtering of the signal

$$y(n) = \frac{1}{2}(\alpha(n) + \beta(n)), \quad n \in \overline{M, N - M + 1}. \tag{63}$$

The midpoint filter removes additive and multiplicative continuous and continuous-impulse noises.
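The two steps can be sketched as follows; the function name and the unfiltered border samples are assumptions of this sketch.

```python
def midpoint_filter(x, M=1):
    """Eqs. (62)-(63): average of the local minimum and the local maximum."""
    y = list(x)
    for n in range(M, len(x) - M):
        w = x[n - M:n + M + 1]
        y[n] = 0.5 * (min(w) + max(w))
    return y
```

Because the output depends only on the two extreme values of the window, a single impulse shifts the result strongly, which is consistent with the filter not being recommended for impulse noise.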

#### **6.5 Conservative filter algorithm**

The conservative filtering algorithm, applied to a signal *x*(*n*) of length *N*, is as follows:

1. For each signal sample, calculate the minimum and maximum values in its neighborhood, excluding the sample itself

$$\alpha(n) = \min_{m \in U_n \setminus \{n\}} \{ x(m) \}, \quad \beta(n) = \max_{m \in U_n \setminus \{n\}} \{ x(m) \}, \quad n \in \overline{M, N - M + 1}, \tag{64}$$

where *U<sub>n</sub>* is the neighborhood of sample *n* of size 2*M* + 1.

2. Execute conservative filtering of the signal

$$y(n) = \begin{cases} x(n), & \alpha(n) < x(n) < \beta(n) \\ \alpha(n), & x(n) \le \alpha(n) \\ \beta(n), & x(n) \ge \beta(n) \end{cases}, \quad n \in \overline{M, N - M + 1}. \tag{65}$$

The conservative filter removes additive and multiplicative continuous, continuous-impulse, and impulse noises.

#### **6.6 Morphological filter**

Morphological filtering is carried out by consecutively performing the opening and closing operations, or the closing and opening operations. In opening, erosion and then dilation are executed in sequence; in closing, dilation and then erosion.

Dilation can be defined as

$$z(n) = \max_{m \in U_n} \{ x(m) \}, \quad n \in \overline{M, N - M + 1}. \tag{66}$$

Erosion can be defined as

$$z(n) = \min_{m \in U_n} \{ x(m) \}, \quad n \in \overline{M, N - M + 1}, \tag{67}$$

where *U<sub>n</sub>* is the neighborhood of sample *n*.

The morphological filter removes impulse noises.
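A compact sketch of the morphological filter with a flat structuring element of size 2*M* + 1 follows. The function names and the unfiltered border samples are assumptions, and the opening-then-closing order is one of the two orders mentioned in the text.

```python
def _slide(x, M, f):
    """Apply f (min or max) to each neighborhood; borders are copied unchanged."""
    y = list(x)
    for n in range(M, len(x) - M):
        y[n] = f(x[n - M:n + M + 1])
    return y

def dilate(x, M):
    return _slide(x, M, max)          # Eq. (66)

def erode(x, M):
    return _slide(x, M, min)          # Eq. (67)

def opening(x, M):
    return dilate(erode(x, M), M)     # removes narrow positive impulses

def closing(x, M):
    return erode(dilate(x, M), M)     # removes narrow negative impulses

def morph_filter(x, M):
    """Opening followed by closing."""
    return closing(opening(x, M), M)
```

Opening alone removes only positive impulses and closing alone only negative ones, which is why the two operations are chained.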

#### **Example**

**Figure 16** shows the source signal and **Figure 17** shows the noisy signal (additive white Gaussian noise with mean 0 and variance 0.001 is added). The signals denoised by means of the geometric mean filter (*M* = 1) (**Figure 18**), the *α*-trimmed mean filter (*M* = 2, *α* = 2) (**Figure 19**), the median filter (*M* = 2) (**Figure 20**), the midpoint filter (*M* = 1) (**Figure 21**), and the conservative filter (*M* = 1) (**Figure 22**) are also shown. The test signal was the syllable "sa" recorded with a sampling rate of 22050 Hz, 8 bits, mono.

**Figure 16.** *Source signal for smoothing nonlinear filtering of additive Gaussian noise.*

**Figure 17.** *A signal with an additive Gaussian noise for smoothing nonlinear filtering.*

**Figure 18.** *The signal denoised by means of the geometric mean filter.*

**Figure 19.** *The signal denoised by means of the α-trimmed mean filter.*

**Figure 20.** *The signal denoised by means of the median filter.*

**Figure 21.** *The signal denoised by means of the midpoint filter.*

**Figure 22.** *The signal denoised by means of the conservative filter.*

#### **Example**

**Figure 23** shows the source signal and **Figure 24** shows the noisy signal ("salt and pepper" impulse noise affecting 1% of the signal samples is added). The signals denoised by means of the *α*-trimmed mean filter (*M* = 2, *α* = 2) (**Figure 25**), the median filter (*M* = 2) (**Figure 26**), the conservative filter (*M* = 1) (**Figure 27**), and the morphological filter (opening followed by closing, with *M* = 3) (**Figure 28**) are also shown. The test signal was the syllable "sa" recorded with a sampling rate of 22050 Hz, 8 bits, mono.

**Figure 23.** *Source signal for smoothing nonlinear filtering of impulse noise.*

**Figure 24.** *A signal with an impulse noise "salt and pepper" for smoothing nonlinear filtering.*

### **7. Numerical study of denoising methods**

For the voice signals containing vocal sounds, a sampling rate of 8 kHz and 256 quantization levels were set.

**Table 1** presents the results of the numerical study of the following denoising methods: wavelet analysis with threshold processing (Daubechies wavelet of order 8, soft threshold processing with the SURE threshold), the adaptive filter of order 1, the Gaussian filter of order 1 with parameter *σ* = 0.7, the arithmetic mean filter of order 1, the geometric mean filter of order 1, the harmonic mean filter of order 1, the contraharmonic filter of order 1 with parameter *Q* = 1, the median filter of order 2, the *α*-trimmed mean filter of order 2 with parameter *α* = 2, the midpoint filter of order 1, and the conservative filter of order 1. The methods were applied to voice signals of speakers from the TIMIT database corrupted by additive Gaussian noise with mean 0 and variance 0.001 (signal-to-noise ratio about 11 dB) and by multiplicative Gaussian noise with mean 1 and variance 0.07 (signal-to-noise ratio about 23 dB). In the table, MSE denotes the mean square error.

**Figure 25.** *The signal denoised by means of the α-trimmed mean filter.*

**Figure 26.** *The signal denoised by means of the median filter.*

**Figure 27.** *The signal denoised by means of the conservative filter.*

**Figure 28.** *The signal denoised by means of the morphological filter.*


#### **Table 1.**

*Results of a numerical research of denoising methods from additive Gaussian noise and multiplicative Gaussian noise.*

The results in **Table 1** show that the *α*-trimmed mean filter provides the smallest MSE.
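The evaluation setup described in this section can be sketched as follows. This is only an illustration of how the two noise models and the MSE criterion could be computed; it does not reproduce the values of Table 1. The function names and the use of a seeded `random.Random` generator are assumptions, and the signal is assumed to be a list of floats.

```python
import math
import random

def add_additive_noise(s, var, rng):
    """x(n) = s(n) + v(n), with v ~ N(0, var)."""
    sd = math.sqrt(var)
    return [t + rng.gauss(0.0, sd) for t in s]

def add_multiplicative_noise(s, var, rng):
    """x(n) = s(n) * v(n), with v ~ N(1, var)."""
    sd = math.sqrt(var)
    return [t * rng.gauss(1.0, sd) for t in s]

def mse(a, b):
    """Mean square error between two signals of equal length."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def snr_db(s, x):
    """Signal-to-noise ratio in dB of the noisy signal x relative to the clean s."""
    noise_energy = sum((p - q) ** 2 for p, q in zip(s, x))
    return 10.0 * math.log10(sum(t * t for t in s) / noise_energy)
```

A denoising method would then be scored by `mse(s, y)`, where `y` is the filter output for the noisy input and `s` is the clean signal.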

### **8. Conclusion**

For biometric identification, the following methods of noise suppression in a voice signal were considered and numerically investigated: smoothing adaptive linear time filtering (the minimum root mean square error algorithm, the recursive least squares algorithm, the Kalman filtering algorithm, and the Lee algorithm); smoothing adaptive linear frequency filtering (the generalized method and the MLEE method); wavelet analysis with threshold processing (the universal threshold, the SURE threshold, the minimax threshold, the FDR threshold, and the Bayesian threshold); smoothing non-adaptive linear time filtering (the arithmetic mean filter, the normalized Gaussian filter, and the normalized binomial filter); and smoothing nonlinear filtering (the geometric mean filter, the harmonic mean filter, the contraharmonic filter, the *α*-trimmed mean filter, the median filter, the rank filter, the midpoint filter, the conservative filter, and the morphological filter). Results of a numerical study of the denoising methods were obtained for voice signals of speakers from the TIMIT database corrupted by additive Gaussian noise and multiplicative Gaussian noise. The *α*-trimmed mean filter proved to be the most effective for both noise types.


## **Author details**

Eugene Fedorov<sup>1</sup> \*, Tetyana Utkina<sup>1</sup> and Tetyana Neskorodeva<sup>2</sup>

1 Cherkasy State Technological University, Cherkasy, Ukraine

2 Vasyl' Stus Donetsk National University, Vinnytsia, Ukraine

\*Address all correspondence to: fedorovee75@ukr.net

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


### **References**

[1] Diniz PSR. Adaptive Filtering: Algorithms and Practical Implementation. Berlin: Springer; 2020. 505 p. DOI: 10.1007/978-1-4614-4106-9

[2] Lim JS. Two-Dimensional Signal and Image Processing. Englewood Cliffs, NJ: Prentice Hall; 1990. p. 694

[3] Rabiner LR, Schafer RW. Theory and Applications of Digital Speech Processing. Upper Saddle River, NJ: Pearson Higher Education, Inc.; 2011. p. 1042

[4] Yektaeian M, Amirfattahi R. Comparison of spectral subtraction methods used in noise suppression algorithms. In: Proceedings of 6th International Conference on Information, Communications and Signal Processing (ICICS 2007). 2007. pp. 1-4

[5] Mallat S. A Wavelet Tour of Signal Processing: The Sparse Way. 3rd ed. Burlington, MA: Academic Press; 2008. p. 832

[6] Gonzalez R, Woods R. Digital Image Processing. Hoboken, NJ: Pearson Education, Inc.; 2018. p. 1306

## *Edited by Muhammad Sarfraz*

Biometrics are widely used in various real-life applications, including personal recognition, identification, verification, and more. They may also be used for safety, security, permission, banking, crime prevention, forensics, medical applications, and communication. This book explores the latest developments, theories, methods, approaches, algorithms, analysis, systems, hardware, and software in biometrics and related systems.

Published in London, UK © 2022 IntechOpen © Rost-9D / iStock
