**4. Wavelet thresholding**

The signals carry a large amount of useful information which is difficult to find. The discovery of orthogonal bases and local time-frequency analysis opens the door to the world of sparse representation of signals. An orthogonal basis is a dictionary of minimum size that can yield a sparse representation if designed to concentrate the signal energy over a set of few vectors (Mallat, 2009a). The smaller amount of wavelet coefficients reveals the information of signal we are looking for. The generation of those vectors is an approximation of original signal by linear combination of wavelets. For all *f* in <sup>2</sup>*L R*( ) ,

$$P\_j f = P\_{j+1} f + \left. \int f \, \nu \right|\_{j,k} > \left. \nu \right|\_{j,k} \,, \tag{31}$$

where , , *j k f* stands for the inner product of *f* and *<sup>j</sup>*,*<sup>k</sup>* , *Pj* is the orthogonal projection onto *Vj* . In orthogonal decomposition, *Vj* is the subspace which satisfies *VVVV V* 210 1 2 , <sup>2</sup> ( ) *<sup>j</sup> <sup>j</sup> V L* and {0} *<sup>j</sup> <sup>j</sup> V* (Daubechies, 1992).

Thresholding the wavelet coefficients keeps the local regularity of signal. Usually, wavelet thresholding includes three steps (Shim et al., 2001; Zhang et al., 2007):


$$\tilde{F} = \sum\_{j=L+1}^{l} \sum\_{k=0}^{2^{-j}} \rho\_{\Gamma}() \mu\_{j,k} + \sum\_{k=0}^{2^{-l}} \rho\_{\Gamma}() \phi\_{l,k} \tag{32}$$

where *<sup>T</sup>* is either a hard threshold or a soft threshold. Normally, the selected threshold is applied on all coefficients except the coefficients contain the lowest frequency energy , , *X J k* . This aims to keep the regularity of reconstructed signal. The difference between keeping and not keeping the lowest-frequency approximate coefficients is illustrated by Fig.3. Universal threshold with hard thresholding is used in estimation. The original data in Fig.3(a) has a wide frequency range. It contains both low frequency regular component and high frequency irregular components. Fig.3(c) shows when lowest-frequency part is kept,

Wavelet Denoising 69

pass filter, respectively. They usually denote the filter banks at reconstruction. At decomposition, the wavelet coefficients are calculated with *h k*[ ] and *g k*[ ] where *hk h k* [] [ ] and *gk g k* [] [ ] . Accordingly, the coefficients generated by low pass filter and

high pass filter are called approximates and detail, respectively (Mallat, 2009b)

1 1

*a hp d gp*

Fig. 4. Thresholding procedure with multi-resolution analysis (the lowest-frequency

Different time-frequency structures are contained in complex signals. This motivates the exploration of time-frequency representation with adaptive properties. Although similar to multi-resolution analysis, wavelet package can divide the frequency axis in separate interval of various sizes. Its spaces *Wj* are also divided into two orthogonal spaces. In order to discriminate the detail spaces of wavelet packet from those of multi-resolution analysis, the *Wj* is represented as *<sup>p</sup> Wj* . Thus, the relation between detail spaces is 2 21

. The orthogonal bases at the children nodes can be represented as

Wavelet packet coefficients are computed with a filter bank that is the same as multiresolution analysis. The wavelet packet transform is an iteration of the two-channel filter bank decomposition presented in section 4.1. At the decomposition, the wavelet coefficients

<sup>1</sup> () [ ] ( 2 ) *p p <sup>j</sup>*

 *t gk t k*

of 2 1

are obtained by subsampling the convolutions of

*j j k*

1 *<sup>p</sup> Wj* , and 2 1

1 *p*

*<sup>j</sup> d* with low-pass filter *h k*[ ] and high-pass filter *g k*[ ]:

*<sup>j</sup> <sup>d</sup>* and 2 1

1 *p <sup>j</sup> <sup>d</sup>*

*j j*

Fig.4 shows the thresholding procedure with multi-resolution analysis.

, and 1[ ] [ 2 ] [ ] [2 ] *j j <sup>j</sup>*

1

*a p hp na n gp na n*

*jj j k k*

 

[] []

[] [ 2] [] [ 2] []

*k*

*d p gk pa n a g p*

. (34)

. (33)

1 1 *p p W W j j*

> 1 *<sup>p</sup> Wj* (Mallat,

 , and

1[ ] [ 2 ] [ ] [2 ] *j j <sup>j</sup>*

*a p hk pa n a h p*

*k*

At the reconstruction,

approximate 2 *a* is kept)

2 21 1 1 *pp p WW W jj j* 

*j j k*

2

2009c).

*p*

**4.2 Wavelet packet transform** 

<sup>1</sup>() [ ] ( 2 ) *p p <sup>j</sup>*

of wavelet packet children <sup>2</sup>

 *t hk t k*

of <sup>2</sup>

the regular component is still included in reconstructed signal. But if the lowest-frequency part is removed, only the high-frequency irregular components left as in Fig.3(d).

3. Reconstruction. After thresholding, all the coefficients are reconstructed to form the denoised signal.

Fig. 3. Difference between keeping lowest-frequency approximates and not keeping it. (a) original data, (b) noisy data (SNR=22.93dB), (c) estimation with , , *X J k* kept (SNR=34.83dB), (d) estimation without , , *X J k* kept (SNR=0.24dB)

Multi-resolution analysis and wavelet packet transform are the most widely employed orthogonal analyses. The wavelet thresholding by using multi-resolution analysis and wavelet packet are introduced in section 4.1 and section 4.2.

#### **4.1 Multi-resolution analysis**

Multi-resolution analysis is discrete wavelet transform using series of conjugate mirror filter pairs. The signal *f* is projected onto a multi-resolution approximation space *Vj* . This space is then decomposed into a lower resolution space *Vj*1 and a detail space *Wj*<sup>1</sup> . The two spaces satisfy *V W <sup>j</sup>*1 1 *<sup>j</sup>* , and *V WV j jj* 1 1 (Daubechies, 1992). The orthogonal basis ( 2) *j j n t n* of *f* in *Vj* is also divided into two new orthogonal bases <sup>1</sup> 1 (2 ) *j j k t k* of *Vj*<sup>1</sup> , and <sup>1</sup> 1 (2 ) *j j k t k* of *Wj*<sup>1</sup> .

This decomposition process is realized by filtering *f* by a pair of conjugate mirror filters *h k*[ ] and <sup>1</sup> [ ] ( 1) [1 ] *<sup>k</sup> g k hk* . The [ ] *h k* and *g k*[ ] are also called low pass filter and high

the regular component is still included in reconstructed signal. But if the lowest-frequency

3. Reconstruction. After thresholding, all the coefficients are reconstructed to form the

Fig. 3. Difference between keeping lowest-frequency approximates and not keeping it. (a)

Multi-resolution analysis and wavelet packet transform are the most widely employed orthogonal analyses. The wavelet thresholding by using multi-resolution analysis and

Multi-resolution analysis is discrete wavelet transform using series of conjugate mirror filter pairs. The signal *f* is projected onto a multi-resolution approximation space *Vj* . This space is then decomposed into a lower resolution space *Vj*1 and a detail space *Wj*<sup>1</sup> . The two spaces satisfy *V W <sup>j</sup>*1 1 *<sup>j</sup>* , and *V WV j jj* 1 1 (Daubechies, 1992). The orthogonal basis

This decomposition process is realized by filtering *f* by a pair of conjugate mirror filters *h k*[ ] and <sup>1</sup> [ ] ( 1) [1 ] *<sup>k</sup> g k hk* . The [ ] *h k* and *g k*[ ] are also called low pass filter and high

*j n t n* of *f* in *Vj* is also divided into two new orthogonal bases <sup>1</sup>

*J k* kept (SNR=0.24dB)

*J k* kept

1 (2 ) *j j k t k* of *Vj*<sup>1</sup> ,

original data, (b) noisy data (SNR=22.93dB), (c) estimation with , , *X*

wavelet packet are introduced in section 4.1 and section 4.2.

(SNR=34.83dB), (d) estimation without , , *X*

**4.1 Multi-resolution analysis** 

 ( 2) *j*

and <sup>1</sup> 1 (2 ) *j j k t k* of *Wj*<sup>1</sup> .

part is removed, only the high-frequency irregular components left as in Fig.3(d).

denoised signal.

pass filter, respectively. They usually denote the filter banks at reconstruction. At decomposition, the wavelet coefficients are calculated with *h k*[ ] and *g k*[ ] where *hk h k* [] [ ] and *gk g k* [] [ ] . Accordingly, the coefficients generated by low pass filter and high pass filter are called approximates and detail, respectively (Mallat, 2009b)

$$a\_{j+1}[p] = \sum\_{k=-\infty}^{+\infty} h[k - 2p] a\_j[n] = a\_j \ast \overline{h}[2p], \text{and } d\_{j+1}[p] = \sum\_{k=-\infty}^{+\infty} g[k - 2p] a\_j[n] = a\_j \ast \overline{g}[2p]. \tag{33}$$

At the reconstruction,

$$\begin{split} a\_j[p] &= \sum\_{k=-\infty}^{+\infty} h[p-2n] a\_{j+1}[n] + \sum\_{k=-\infty}^{+\infty} g[p-2n] a\_j[n] \\ &= \bar{a}\_{j+1} \* h[p] + \bar{d}\_{j+1} \* g[p] \end{split} \tag{34}$$

Fig.4 shows the thresholding procedure with multi-resolution analysis.

Fig. 4. Thresholding procedure with multi-resolution analysis (the lowest-frequency approximate 2 *a* is kept)

#### **4.2 Wavelet packet transform**

Different time-frequency structures are contained in complex signals. This motivates the exploration of time-frequency representation with adaptive properties. Although similar to multi-resolution analysis, wavelet package can divide the frequency axis in separate interval of various sizes. Its spaces *Wj* are also divided into two orthogonal spaces. In order to discriminate the detail spaces of wavelet packet from those of multi-resolution analysis, the *Wj* is represented as *<sup>p</sup> Wj* . Thus, the relation between detail spaces is 2 21 1 1 *p p W W j j* , and 2 21 1 1 *pp p WW W jj j* . The orthogonal bases at the children nodes can be represented as 2 <sup>1</sup>() [ ] ( 2 ) *p p <sup>j</sup> j j k t hk t k* of <sup>2</sup> 1 *<sup>p</sup> Wj* , and 2 1 <sup>1</sup> () [ ] ( 2 ) *p p <sup>j</sup> j j k t gk t k* of 2 1 1 *<sup>p</sup> Wj* (Mallat,

$$\text{2009c}\text{).}$$

Wavelet packet coefficients are computed with a filter bank that is the same as multiresolution analysis. The wavelet packet transform is an iteration of the two-channel filter bank decomposition presented in section 4.1. At the decomposition, the wavelet coefficients of wavelet packet children <sup>2</sup> 1 *p <sup>j</sup> <sup>d</sup>* and 2 1 1 *p <sup>j</sup> <sup>d</sup>* are obtained by subsampling the convolutions of *p <sup>j</sup> d* with low-pass filter *h k*[ ] and high-pass filter *g k*[ ]:

$$d\_{j+1}^{2^p}[k] = d\_j^{p^\*} \ast \overline{h}[2k]\_{j'} \text{ and } \ d\_{j+1}^{2^{p+1}}[k] = d\_j^{p^\*} \ast \overline{g}[2k] \cdot \tag{35}$$

Iterating the decomposition of coefficients along the branches forms a binary wavelet packet tree with 2 1 *<sup>L</sup>* leaves (0 2 1) *n L <sup>L</sup> d n* at level *L* . Then, at the reconstruction,

$$\mathbf{d}\_{\dot{j}}^{p}[k] = \breve{d}\_{\dot{j}+1}^{2p} \* h[k] + \breve{d}\_{\dot{j}+1}^{2p+1} \* g[k] \, \cdot \,\tag{36}$$

Wavelet Denoising 71

factor in threshold *T* . In practical application, the variance is unknown and its estimation is

the influence from *f n*[ ] must be considered. When *f* is piecewise regular, a robust estimator of variance can be calculated from the median of the finest-scale wavelet coefficients. Fig.7 depicts the wavelet transforms of three functions: blocks, pulses and heavisine. They are chosen because they often arise in signal processing. It is easy to find the large-magnitude coefficients only occur exclusively near the areas of major spatial activities

Fig. 7. Three functions and their wavelet transform, (a) blocks (left) and its wavelet

coefficients (right), (b) pulses (left) and its wavelet coefficients (right), (c) heavisine (left) and

of noise *W n*[ ] from the data *Xn f n Wn* [] [] [] ,

of noise *W* is an important

**4.3 Noise variance estimation** 

(Donoho &Johnstone, 1994).

its wavelet coefficients (right)

needed. When estimating the variance <sup>2</sup>

In threshold estimation discussed in section 3, the variance <sup>2</sup>

The decomposition and reconstruction of wavelet packet transform are illustrated in Fig.5. The thresholding procedure is added before reconstruction.

Fig. 5. Thresholding procedure by using wavelet packet transform (the lowest-frequency approximate 0 <sup>2</sup> *d* is kept)

Both multi-resolution analysis and wavelet packet transform are used in estimation of a same noisy signal. The estimations are shown in Fig.6. The coiflet 2 is used to calculate wavelet coefficients and the hard threshold is set as *T N* 2log*<sup>e</sup>* .

Fig. 6. Estimation with different wavelet transforms, (a) original data, (b) noisy data (SNR=23.56dB), (c) estimation with multi-resolution analysis (SNR=35.3dB), (d) estimation with wavelet packet transform (SNR=33.22dB)

Iterating the decomposition of coefficients along the branches forms a binary wavelet packet

2 2 1 1 1 [] \*[] \* [] *pp p jj j d k d hk d gk*

The decomposition and reconstruction of wavelet packet transform are illustrated in Fig.5.

Fig. 5. Thresholding procedure by using wavelet packet transform (the lowest-frequency

Fig. 6. Estimation with different wavelet transforms, (a) original data, (b) noisy data (SNR=23.56dB), (c) estimation with multi-resolution analysis (SNR=35.3dB), (d) estimation

with wavelet packet transform (SNR=33.22dB)

Both multi-resolution analysis and wavelet packet transform are used in estimation of a same noisy signal. The estimations are shown in Fig.6. The coiflet 2 is used to calculate

2log*<sup>e</sup>* .

*<sup>L</sup> d n* at level *L* . Then, at the reconstruction,

<sup>1</sup> [ ] \* [2 ] *p p j j d k d gk*

. (36)

. (35)

2

tree with 2 1 *<sup>L</sup>* leaves (0 2 1) *n L*

<sup>2</sup> *d* is kept)

approximate 0

<sup>1</sup>[ ] \* [2 ] *p p j j d k d hk* , and 2 1

The thresholding procedure is added before reconstruction.

wavelet coefficients and the hard threshold is set as *T N*

#### **4.3 Noise variance estimation**

In threshold estimation discussed in section 3, the variance <sup>2</sup> of noise *W* is an important factor in threshold *T* . In practical application, the variance is unknown and its estimation is needed. When estimating the variance <sup>2</sup> of noise *W n*[ ] from the data *Xn f n Wn* [] [] [] , the influence from *f n*[ ] must be considered. When *f* is piecewise regular, a robust estimator of variance can be calculated from the median of the finest-scale wavelet coefficients. Fig.7 depicts the wavelet transforms of three functions: blocks, pulses and heavisine. They are chosen because they often arise in signal processing. It is easy to find the large-magnitude coefficients only occur exclusively near the areas of major spatial activities (Donoho &Johnstone, 1994).

Fig. 7. Three functions and their wavelet transform, (a) blocks (left) and its wavelet coefficients (right), (b) pulses (left) and its wavelet coefficients (right), (c) heavisine (left) and its wavelet coefficients (right)

If *f* is piecewise smooth, the wavelet coefficients |, | *l k*, *f* at finest scale *l* are very small, in which case , , , , *X W l k l k* . As mentioned in section 3.1, the wavelet coefficients , , *W l k* are still white if *W* is white. Therefore, most coefficients contribute to noise with variance <sup>2</sup> and only a few of them contribute to signal. Then, a robust estimator of <sup>2</sup> is calculated from the median of wavelet coefficients |, | *W l k*, . Different from mean value, median is independent of the magnitude of those few largemagnitude coefficients related with signal. If *M* is the median of absolute value of independent Gaussian random variables with zero mean and variance <sup>2</sup> <sup>0</sup> , then one can show that

$$E\{M\} \approx 0.6745\sigma\_0\tag{37}$$

Wavelet Denoising 73

Fig. 8. Wavelet thresholding with hard and soft threshold, (a) original data (left) and its wavelet transform, (b) noisy data (SNR=23.58dB) (left) and its wavelet transform, (c) Estimation with hard thresholding (SNR=36.47dB) (left) and its wavelet coefficients (right), (d) Estimation with soft thresholding (SNR=31.98dB) (left) and its wavelet coefficients

(right).

The variance of noise *W* is estimated from the median *MX* of absolute wavelet coefficients |, | *W l k*, by neglecting the influence of *f* (Mallat, 2009d):

$$
\tilde{\sigma} = \frac{M\_{\chi}}{0.6745} \tag{38}
$$

Actually, piecewise smooth signal *f* is only responsible for a few large-magnitude coefficients, and has little impact on the value of *MX* .

#### **4.4 Hard or soft threshold**

As mentioned in section 3.2, the estimation can be done with hard and soft thresholding. The estimator *F* with soft threshold is at least as regular as original signal *f* since the wavelet coefficients have a smaller magnitude. But this will result in a slight difference in magnitude when comparing estimation with original signal. This is not true if hard threshold is applied. All the coefficients with large-amplitude above threshold *T* are unchanged. However, because of the error induced by hard-thresholding, oscillations or small ripples are created near irregular points.

Fig.8 shows the wavelet estimation with hard and soft thresholding. The original data consists of a pulse signal and a sinusoidal. It includes both piecewise smooth signal and irregular segments. The wavelet coefficients are calculated with a coiflet 2. The variance <sup>2</sup> of white noise is calculated with (38) and the threshold is set to *T N* 2log*<sup>e</sup>* . In Fig.8(c), the hard thresholding removes the noise in the area where the original signal *f* is regular. But the coefficients near the singularities are still kept. The SNR of estimation with hard thresholding is 36.47dB. Compared with hard thresholding, the magnitude of coefficients with soft thresholding is a little smaller. The soft thresholding estimation attenuates the noise affect at the discontinuities, but it also reduces the magnitude of estimation. The SNR of soft-thresholding estimation reduces to 31.98dB. The lower SNR of soft thresholding doesn't mean poor ability of signal estimation. The two thresholdings are selected in different applications.

Different from mean value, median is independent of the magnitude of those few largemagnitude coefficients related with signal. If *M* is the median of absolute value of

<sup>0</sup> *E M*{ } 0.6745

The variance of noise *W* is estimated from the median *MX* of absolute wavelet coefficients

0.6745 *MX*

Actually, piecewise smooth signal *f* is only responsible for a few large-magnitude

As mentioned in section 3.2, the estimation can be done with hard and soft thresholding.

wavelet coefficients have a smaller magnitude. But this will result in a slight difference in magnitude when comparing estimation with original signal. This is not true if hard threshold is applied. All the coefficients with large-amplitude above threshold *T* are unchanged. However, because of the error induced by hard-thresholding, oscillations or

Fig.8 shows the wavelet estimation with hard and soft thresholding. The original data consists of a pulse signal and a sinusoidal. It includes both piecewise smooth signal and irregular segments. The wavelet coefficients are calculated with a coiflet 2. The variance <sup>2</sup>

the hard thresholding removes the noise in the area where the original signal *f* is regular. But the coefficients near the singularities are still kept. The SNR of estimation with hard thresholding is 36.47dB. Compared with hard thresholding, the magnitude of coefficients with soft thresholding is a little smaller. The soft thresholding estimation attenuates the noise affect at the discontinuities, but it also reduces the magnitude of estimation. The SNR of soft-thresholding estimation reduces to 31.98dB. The lower SNR of soft thresholding doesn't mean poor ability of signal estimation. The two thresholdings are selected in

of white noise is calculated with (38) and the threshold is set to *T N*

with soft threshold is at least as regular as original signal *f* since the

 *l k* are still white if *W* is white. Therefore, most coefficients

is calculated from the median of wavelet coefficients |, | *W*

*l k* . As mentioned in section 3.1, the wavelet

and only a few of them contribute to signal. Then, a

at finest scale *l* are very

(37)

*l k*, .

(38)

2log*<sup>e</sup>* . In Fig.8(c),

<sup>0</sup> , then one can

If *f* is piecewise smooth, the wavelet coefficients |, | *l k*, *f*

 *l k* 

independent Gaussian random variables with zero mean and variance <sup>2</sup>

*l k*, by neglecting the influence of *f* (Mallat, 2009d):

coefficients, and has little impact on the value of *MX* .

small ripples are created near irregular points.

small, in which case , , , , *X W*

contribute to noise with variance <sup>2</sup>

coefficients , , *W*

robust estimator of <sup>2</sup>

show that


**4.4 Hard or soft threshold** 

The estimator *F*

different applications.

Fig. 8. Wavelet thresholding with hard and soft threshold, (a) original data (left) and its wavelet transform, (b) noisy data (SNR=23.58dB) (left) and its wavelet transform, (c) Estimation with hard thresholding (SNR=36.47dB) (left) and its wavelet coefficients (right), (d) Estimation with soft thresholding (SNR=31.98dB) (left) and its wavelet coefficients (right).

Wavelet Denoising 75

Wavelet thresholding explores the ability of wavelet bases to approximate signal *f* with only a few non-zero coefficients. Therefore, optimal selection of wavelet bases is an important factor in wavelet thresholding. This depends on the properties of signal and

The number of vanishing moments determines what the wavelet doesn't "see" (Hubbard,

with two vanishing moments cannot see the linear functions; the wavelet with three vanishing moments will be blind to both linear and quadratic functions; and so on. If *f* is regular and *<sup>k</sup> C* , which means *f* is *p* times continuously differentiable function, when *k p* then the wavelet can generate small coefficients at fine scales 2 *<sup>j</sup>* (Mallat, 2009b). But it is not the higher the better. Too high vanishing moment may miss the useful information in signal, and leave more useless information such as noise. The proper number of vanishing

The size of support is the length of interval in which the wavelet values are non-zero. If *f* has an isolated singularity at 0*t* and if 0*t* is inside the support of

<sup>0</sup>*t* (Mallat, 2009b). In wavelet thresholding application, the signal *f* is supposed to be represented by a few non-zero coefficients. The support of wavelet should be in a smaller

*p* 12 . The Daubechies wavelets are optimal to have minimum size of support for a given number of vanishing moments. When choosing a wavelet, we have to face a trade-off between number of vanishing moments and size of support. This is dependent on the

A polynomial function with degree less than 4 is shown in Fig.10. The noisy data and estimations with Daubechies wavelets are also listed. Here, level-dependent threshold is set

. The estimation with wavelet Daubechies 3 whose vanishing moments *p* is

,*kj* may have a large amplitude. If

for 0 *k p* . (39)

has *p* vanishing moments, its support size must be at least

,*kj* whose support includes

has a compact

is orthogonal to any polynomial of degree *p* 1 . Therefore, the wavelet

has *p* vanishing moments if

wavelets such as regularity, number of vanishing moments, and size of support.

*t t dt* 

**5. Selection of optimal wavelet bases** 

moments is thus important in optimal wavelet selection.

support of size *N* , at each scale *<sup>j</sup>* 2 there are *N* wavelets

, then *f* ,

() 0 *<sup>k</sup>*

**5.1 Vanishing moments** 

1998). Usually, the wavelet

This means that

**5.2 Size of support** 

size.

)2(2)( , 2/ , *t ktj kj j kj*

If an orthogonal wavelet

regularity of signal *f* .

3 has larger SNR than others.

as *<sup>T</sup>* log2 *<sup>e</sup> <sup>N</sup>* <sup>~</sup> 

#### **4.5 Fixed or level-dependent variance estimation**

If the influence of level is neglected, the estimated variance of white noise can be set as the estimation of finest scale, or 1 *d* in Fig.9. As discussed in section 4.3, most wavelet coefficients at finest scale contribute to noise, and only a few of them contributes to signal. The use of fixed estimator reduces the influence of signal and edge effect in wavelet transform. But when the added noise is no longer white noise, for example, colored Gaussian noises, the noise variance should be estimated level by level (Johnstone & Silverman, 1997). Fig.9 gives the estimation of original signal in Fig.8(a).

In Fig.9, a Gaussian noise is added. Section 4.3 explains how to calculate the threshold value from the wavelet coefficients. Here, the universal threshold *T N* 2log*<sup>e</sup>* proposed in section 3.3.1 is used. In Fig.9(a), we estimate the noise variance with the median formula in (38) at the finest scale. Only one threshold *T* is produced. In level-dependent estimation, the estimation of noise variance (38) is done for each scale. That is to say, six scales in Fig.9(b) will generate six estimated variances and thus six thresholds *T* . Each threshold is applied on each scale accordingly. The SNR of estimation with level-dependent estimation (36.7dB) is greater than that of fixed estimation (36.4dB).

Fig. 9. Wavelet thresholding with fixed and level-dependent variance estimation, (a) Estimation with fixed estimation (SNR=36.4dB) (left) and its wavelet transform (right), (b) Estimation with level-dependent estimation (SNR=36.7dB) (left) and its wavelet transform (right).

the estimation of finest scale, or 1 *d* in Fig.9. As discussed in section 4.3, most wavelet coefficients at finest scale contribute to noise, and only a few of them contributes to signal.

transform. But when the added noise is no longer white noise, for example, colored Gaussian noises, the noise variance should be estimated level by level (Johnstone &

In Fig.9, a Gaussian noise is added. Section 4.3 explains how to calculate the threshold value

section 3.3.1 is used. In Fig.9(a), we estimate the noise variance with the median formula in (38) at the finest scale. Only one threshold *T* is produced. In level-dependent estimation, the estimation of noise variance (38) is done for each scale. That is to say, six scales in

is applied on each scale accordingly. The SNR of estimation with level-dependent estimation

Fig. 9. Wavelet thresholding with fixed and level-dependent variance estimation, (a) Estimation with fixed estimation (SNR=36.4dB) (left) and its wavelet transform (right), (b) Estimation with level-dependent estimation (SNR=36.7dB) (left) and its wavelet transform

reduces the influence of signal and edge effect in wavelet

and thus six thresholds *T* . Each threshold

of white noise can be set as

2log*<sup>e</sup>* proposed in

**4.5 Fixed or level-dependent variance estimation** 

Fig.9(b) will generate six estimated variances

(36.7dB) is greater than that of fixed estimation (36.4dB).

The use of fixed estimator

(right).

If the influence of level is neglected, the estimated variance

Silverman, 1997). Fig.9 gives the estimation of original signal in Fig.8(a).

from the wavelet coefficients. Here, the universal threshold *T N*

## **5. Selection of optimal wavelet bases**

Wavelet thresholding explores the ability of wavelet bases to approximate signal *f* with only a few non-zero coefficients. Therefore, optimal selection of wavelet bases is an important factor in wavelet thresholding. This depends on the properties of signal and wavelets such as regularity, number of vanishing moments, and size of support.

#### **5.1 Vanishing moments**

The number of vanishing moments determines what the wavelet doesn't "see" (Hubbard, 1998). Usually, the wavelet has *p* vanishing moments if

$$\int\_{-\eta}^{+\eta} t^k \nu(t) dt = 0 \quad \text{for } 0 \le k < p \text{ .} \tag{39}$$

This means that is orthogonal to any polynomial of degree *p* 1 . Therefore, the wavelet with two vanishing moments cannot see the linear functions; the wavelet with three vanishing moments will be blind to both linear and quadratic functions; and so on. If *f* is regular and *<sup>k</sup> C* , which means *f* is *p* times continuously differentiable function, when *k p* then the wavelet can generate small coefficients at fine scales 2 *<sup>j</sup>* (Mallat, 2009b). But it is not the higher the better. Too high vanishing moment may miss the useful information in signal, and leave more useless information such as noise. The proper number of vanishing moments is thus important in optimal wavelet selection.

#### **5.2 Size of support**

The size of support is the length of interval in which the wavelet values are non-zero. If *f* has an isolated singularity at 0*t* and if 0*t* is inside the support of )2(2)( , 2/ , *t ktj kj j kj* , then *f* , ,*kj* may have a large amplitude. If has a compact support of size *N* , at each scale *<sup>j</sup>* 2 there are *N* wavelets ,*kj* whose support includes <sup>0</sup>*t* (Mallat, 2009b). In wavelet thresholding application, the signal *f* is supposed to be represented by a few non-zero coefficients. The support of wavelet should be in a smaller size.

If an orthogonal wavelet has *p* vanishing moments, its support size must be at least *p* 12 . The Daubechies wavelets are optimal to have minimum size of support for a given number of vanishing moments. When choosing a wavelet, we have to face a trade-off between number of vanishing moments and size of support. This is dependent on the regularity of signal *f* .

A polynomial function with degree less than 4 is shown in Fig.10. The noisy data and estimations with Daubechies wavelets are also listed. Here, level-dependent threshold is set as *<sup>T</sup>* log2 *<sup>e</sup> <sup>N</sup>* <sup>~</sup> . The estimation with wavelet Daubechies 3 whose vanishing moments *p* is 3 has larger SNR than others.

Wavelet Denoising 77

Both orthogonal wavelets and biorthogonal wavelets can be used in orthogonal wavelet transform. Thus, Daubechies wavelets, symlets, coiflets and biorthogonal wavelets are

Daubechies *N <sup>N</sup>* }1{ *N N* 12 Orthogonal Symlets *N <sup>N</sup>* }2{ *N N* 12 Orthogonal Coiflets *N <sup>N</sup>* }51{ 2*N N* 16 Orthogonal

vanishing moments Size of support Orthogonality

12*Nd* for dec.

*Nr* 12 for rec. Biorthogonal

studied in this chapter. Their properties are listed in Table 1 (Mallat, 2009b).

for rec. *Nr*

Table 1. Information of some wavelet families, 'dec'. is short for decomposition, 'rec'. is

Fig. 11. Estimation of regular data, (a) original data, (b) noisy data (SNR=13.03dB), (c) estimation with 'db2' (SNR=24.75dB), (d) estimation with 'coif3' (SNR=36.96dB)

Choosing the suitable wavelet in wavelet thresholding depends on the features of signal and wavelet properties mentioned in section 5.1, 5.2 and 5.3. For different applications, the optimal wavelets change. For instance, the irregular wavelet Daubechies 2 induces irregular errors in wavelet thresholding of regular signal processing. But it achieves better estimation when applied to estimate transient signal in power system which are often composed by pulses and heavy noises (Ma et al., 2002). As illustrated in Fig.11 and Fig.12, two original datasets are tested with an irregular wavelet Daubechies 2 and a regular wavelet coiflet 3.

Wavelet name Order Number of

for dec.

*Nd <sup>N</sup>* }81{

*Nr <sup>N</sup>* }61{

**5.4 Wavelet families** 

Biorthogonal wavelets

short for reconstruction

Fig. 10. The estimations by using wavelets with different vanishing moments, (a) original data, (b) noisy data (SNR=22.4dB), (c) estimation with Daubechies 3 (SNR=35.5dB), (d) estimation with Daubechies 7 (SNR=34.6dB), (e) estimation with Daubechies 11 (SNR=33.8dB), (f) estimation with Daubechies 15 (SNR=32.8dB)

#### **5.3 Regularity**

The regularity of wavelet induces an obvious influence on wavelet coefficients in thresholding. When reconstructing a signal from its wavelet coefficients *f* , ,*kj* , an error is added. Then a wavelet component ,*kj* will be added to the reconstructed signal. If is smooth, ,*kj* is a smooth error. For example, in image-denoising, the smooth error is often less visible than irregular errors (Mallat, 2009b). Although the regularity of a function is independent of the number of vanishing moments, the smoothness of some wavelets is related to their vanishing moments such as biorthogonal wavelets.

Fig. 10. The estimations by using wavelets with different vanishing moments, (a) original data, (b) noisy data (SNR=22.4dB), (c) estimation with Daubechies 3 (SNR=35.5dB), (d) estimation with Daubechies 7 (SNR=34.6dB), (e) estimation with Daubechies 11 (SNR=33.8dB), (f) estimation with Daubechies 15 (SNR=32.8dB)

thresholding. When reconstructing a signal from its wavelet coefficients *f* ,

related to their vanishing moments such as biorthogonal wavelets.

is added. Then a wavelet component ,*kj*

The regularity of wavelet induces an obvious influence on wavelet coefficients in

often less visible than irregular errors (Mallat, 2009b). Although the regularity of a function is independent of the number of vanishing moments, the smoothness of some wavelets is

is a smooth error. For example, in image-denoising, the smooth error is

will be added to the reconstructed signal. If

,*kj* , an error

**5.3 Regularity** 

is smooth, ,*kj*

#### **5.4 Wavelet families**

Both orthogonal wavelets and biorthogonal wavelets can be used in orthogonal wavelet transform. Thus, Daubechies wavelets, symlets, coiflets and biorthogonal wavelets are studied in this chapter. Their properties are listed in Table 1 (Mallat, 2009b).


Table 1. Information of some wavelet families, 'dec'. is short for decomposition, 'rec'. is short for reconstruction

Choosing the suitable wavelet in wavelet thresholding depends on the features of signal and wavelet properties mentioned in section 5.1, 5.2 and 5.3. For different applications, the optimal wavelets change. For instance, the irregular wavelet Daubechies 2 induces irregular errors in wavelet thresholding of regular signal processing. But it achieves better estimation when applied to estimate transient signal in power system which are often composed by pulses and heavy noises (Ma et al., 2002). As illustrated in Fig.11 and Fig.12, two original datasets are tested with an irregular wavelet Daubechies 2 and a regular wavelet coiflet 3.

Fig. 11. Estimation of regular data, (a) original data, (b) noisy data (SNR=13.03dB), (c) estimation with 'db2' (SNR=24.75dB), (d) estimation with 'coif3' (SNR=36.96dB)

Wavelet Denoising 79

Daubechies, I. (1992). *Ten lectures on wavelets*, Society for Industrial and Applied

Donoho D. L. (1995). De-noising by soft-thresholding. *IEEE transactions on information theory*,

Donoho D. L. & Johnstone I. M. (1994). Ideal Denoising In an Orthonormal Basis

Donoho D. L. & Johnstone I. M. (1995). Adapting to unknown smoothness via wavelet

Donoho D. L. & Johnstone I. M. (1998). Minimax estimation via wavelet shrinkage. *Annals of* 

Hubbard, B. B. (1998). *The world according to wavelets: the story of a mathematical technique in the* 

Johnstone I. M. & Silverman B. W. (1997). Wavelet threshold estimators for data with

Lehmann, E. L. & Casella G. (1998). *Theory of point estimation*, Springer, ISBN 0-387-98502-6,

Ma, X., Zhou, C. & Kemp, I. J. (2002). Automated wavelet selection and thresholding for PD

Mallat, S. G. (2009). Sparse Representations, In: *A wavelet tour of signal processing : the Sparse* 

Mallat, S. G. (2009). Wavelet Bases, In: *A wavelet tour of signal processing : the Sparse way,* 

Mallat, S. G. (2009). Wavelet Packet and Local Cosine Bases, In: *A wavelet tour of signal* 

Mallat, S. G. (2009). Denoising, In: *A wavelet tour of signal processing : the Sparse way,* Mallat, S.

Shim I., Soraghan J. J. & Siew W. H. (2001). Detection of PD utilizing digital signal

Stein C. M. (1981). Estimation of the Mean of a Multivariate Normal-Distribution. *Annals of Statistics,* Vol. 9, No. 6, (November 1981), pp. 1317-1322, ISSN 0090-5364

*Magazine*, Vol. 17, No. 1, (February, 2001), pp.6-13, ISSN 0883-7554

*statistics*, Vol. 26, No. 3, (June 1998), pp. 879-921, ISSN 00905364

*making,* A. K. Peters, ISBN 1568810725, Wellesley, Mass

Vol. 59, No.2, (May, 1997), pp. 319-351, ISSN 13697412

Chosen From A Library of Bases. *Comptes Rendus De L Academie Des Sciences Serie I-Mathematique,* Vol. 319, No. 12, (December 1994), pp. 1317-1322, ISSN 0764-

shrinkage. *Journal of the American Statistical Association,* Vol. 90, No. 432, (December

correlated noise. *Journal of the royal statistical society: series B (statistical methodology)*,

detection. *IEEE Electrical Insulation Magazine*, Vol. 18, No. 2, (August, 2002), pp. 37-

*way,* Mallat, S. G., pp. 1-30, Elsevier/Academic Press, ISBN 13:978-0-12-374370-1,

Mallat, S. G., pp. 263-370, Elsevier/Academic Press, ISBN 13:978-0-12-374370-1,

*processing : the Sparse way,* Mallat, S. G., pp. 377-432, Elsevier/Academic Press, ISBN

G., pp. 535-606, Elsevier/Academic Press, ISBN 13:978-0-12-374370-1,

processing methods. Part 3: Open-loop noise reduction. *IEEE Electrical Insulation* 

Mathematics, ISBN 0898712742, Philadelphia, Pa.

1995), pp. 1200-1224, ISSN 0162-1459

Vol.41, No. 3, (May, 1995), pp. 613-627, ISSN 00189448

**7. References** 

4442

New York

35, ISSN 0883-7554

Amsterdam,Boston

Amsterdam,Boston

Amsterdam,Boston

13:978-0-12-374370-1, Amsterdam,Boston

Fig. 12. Estimation of irregular data, (a) original data, (b) noisy data (SNR=10.38dB), (c) estimation with 'db2' (SNR=21.68dB), (d) estimation with 'coif3' (SNR=20.61dB)
