**4. Methodology**

All experiments used MATLAB® R2020b with the Deep Learning Toolbox, and the CNN was implemented in MatConvNet, as reported in Williams and Li [9]. Only one change was made to the CNN: a local response normalization (LRN) layer was used instead of the batch normalization (BN) layer for practicality of the applications. All training used stochastic gradient descent with an initial learning rate of 0.001, a minibatch size of 64, and two training stages, the first for 20 epochs and the second for 40 epochs. All tests were run on a 64-bit operating system (Windows 10) with an AMD Ryzen 7 4800H CPU with Radeon Graphics @ 2.90 GHz, 16.0 GB of RAM, and an NVIDIA GeForce GTX 1050 GPU. This study used three wavelet functions implemented with the lifting scheme: Haar, Daubechies 4 (Db4), and Daubechies 6 (Db6).

#### **4.1 Max-pooling frequency analysis procedure**

In a CNN, the pooling process is where most of the size reduction of the information takes place, so feature loss in the pooling layers is more relevant than in the convolution layers. The main idea of pooling is to select just one value from each region and discard the rest; depending on the pooling method, that value can be the maximum, the average, or a randomly selected one. For example, average pooling loses relevant information by acting like a low-pass filter, generalizing features when computing their mean value. In contrast, max pooling focuses on the highest magnitude, producing an irregular effect that allows the CNN to improve its accuracy and prevent overfitting, which is why it was selected for this study over the other methods.
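The contrast between the two standard methods can be illustrated with a small numpy sketch (an illustrative snippet, not the MatConvNet implementation used in the experiments; `pool2x2` is a hypothetical helper name):

```python
import numpy as np

def pool2x2(x, mode="max"):
    """Pool a 2D array over non-overlapping 2x2 regions (stride 2, no padding)."""
    h, w = x.shape
    blocks = x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    if mode == "max":
        return blocks.max(axis=(1, 3))   # keeps the strongest activation per region
    return blocks.mean(axis=(1, 3))      # smooths features (low-pass effect)

x = np.array([[1., 9., 2., 2.],
              [1., 1., 2., 2.],
              [0., 0., 8., 0.],
              [0., 0., 0., 0.]])

print(pool2x2(x, "max"))  # [[9. 2.] [0. 8.]] -- isolated peaks survive
print(pool2x2(x, "avg"))  # [[3. 2.] [0. 2.]] -- peaks are diluted by the mean
```

Note how the isolated high values (9 and 8) pass through max pooling unchanged but are averaged away by the mean, which is the "low-pass" behavior described above.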

The pooling methods used in this study are divided into two groups: the single or standard methods and the wavelet-based methods. The objective is to perform a complete comparison among all these methods. The first group consists of five methods:

Average (Avg.), Maximum (Max.), Mixed (Mix.), Mixed by Region, and Stochastic (Stoch.), which are the most frequently cited methods for CNNs. The second group is made up of seven methods based on the lifting scheme:

- the lifting scheme keeping only the LL coefficient (Lift.);
- a hybrid of the lifting-scheme LL coefficient and max-pooling (Lift. + Max.);
- the lifting scheme keeping all four coefficients (Lift. + Coeff.);
- a hybrid of all the lifting-scheme coefficients and max-pooling (Lift. + Max. + Coeff.);
- the LL coefficient with one other coefficient (LH, HL, or HH) selected randomly for the entire channel (Lift. + Rand. Channel);
- the same as the previous one, but with the random selection made per 2*x*2 region (Lift. + Rand. Region);
- and finally, the previous one combined with max-pooling for each 2*x*2 region (Lift. + Max. + Rand. Region).
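The difference between the per-channel and per-region random selection can be sketched with a small numpy example (an illustrative sketch, not the actual implementation; `random_region_mix` is a hypothetical helper that picks among the three detail subbands independently at every output position):

```python
import numpy as np

def random_region_mix(lh, hl, hh, rng):
    """Per-region variant: choose LH, HL, or HH independently at every spatial
    position, instead of one subband for the whole channel."""
    idx = rng.integers(3, size=lh.shape)   # one random choice per position
    stacked = np.stack([lh, hl, hh])       # shape (3, H, W)
    return np.take_along_axis(stacked, idx[None], axis=0)[0]

# Toy subbands with constant values so each choice is visible in the output.
rng = np.random.default_rng(0)
lh, hl, hh = np.zeros((2, 2)), np.ones((2, 2)), np.full((2, 2), 2.0)
mixed = random_region_mix(lh, hl, hh, rng)  # every entry is 0.0, 1.0, or 2.0
print(mixed)
```

In the per-channel variant, a single draw would select one of the three subbands for the entire feature map instead of one draw per position.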

In a previous study [22], max-pooling was considered a kind of high-pass filter that retains some high-frequency features. However, the results suggested otherwise. Consequently, the max-pooling effect was analyzed in the frequency domain. To keep the analysis as simple as possible, the pooling process was reduced to a 1D signal. To analyze the max-pooling frequency response, several simulated signals were generated randomly to cover as many spectral values and combinations as possible, as shown in **Figure 5**.
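The 1D procedure can be reproduced along the following lines (a minimal numpy sketch; the signal length, number of sinusoidal components, and frequency ranges are illustrative choices, not the exact values used in the study):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random 1D test signal: a sum of sinusoids with random frequencies,
# amplitudes, and phases, so the spectrum covers many combinations.
n = 256
t = np.arange(n)
freqs = rng.uniform(0.01, 0.45, size=8)    # normalized frequency, cycles/sample
amps = rng.uniform(0.5, 2.0, size=8)
phases = rng.uniform(0, 2 * np.pi, size=8)
signal = sum(a * np.sin(2 * np.pi * f * t + p)
             for a, f, p in zip(amps, freqs, phases))

# 1D max-pooling, window 2 and stride 2 (the 1D analogue of 2x2 max-pooling).
pooled = signal.reshape(-1, 2).max(axis=1)

# Magnitude spectra before and after pooling, for comparing the responses.
spec_in = np.abs(np.fft.rfft(signal))
spec_out = np.abs(np.fft.rfft(pooled))
print(spec_in.shape, spec_out.shape)  # (129,) (65,)
```

Repeating this over many random draws and comparing `spec_in` with `spec_out` gives an empirical picture of the max-pooling frequency response.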

Once the max-pooling frequency response is analyzed, its behavior may be emulated by a new model. In this sense, the proposed model aims to reproduce the effects of max-pooling while eliminating the need for an input signal with specific characteristics to trigger them.

#### **4.2 Proposed model**

The proposed model in this study is a pooling method that incorporates the 2D lifting scheme. The lifting scheme is used as a base of the model because it is very suitable within CNN architecture, as in Ma et al. [24] and Bastidas-Rodriguez et al. [25].

**Figure 5.** *Example of a random input signal used to produce max-pooling frequency responses.*

*Random Wavelet Coefficients Pooling for Convolutional Neural Networks DOI: http://dx.doi.org/10.5772/intechopen.105162*

**Figure 6.** *The proposed model is a pooling method that uses the lifting scheme. It extracts the LL coefficient and concatenates it with a randomly selected coefficient (LH, HL, or HH). The model shown is for the Haar case, but it may also be constructed for Db4 and Db6.*

It is also a multiresolution analysis (MRA) technique that preserves frequency and spatial information when reducing dimensionality, without any loss. Even if only the approximation coefficients were used, the lifting scheme could extract relevant features from images to improve CNN performance.

The model is shown in **Figure 6**; although it is drawn for the Haar case, it can also be constructed for other wavelet functions such as Db4 and Db6. First, it extracts the four wavelet coefficients (LL, LH, HL, and HH) and randomly selects one of the three coefficients with the highest frequencies. Finally, the selected coefficient is concatenated with the lowest-frequency coefficient (LL); according to Williams and Li [26], this subband carries most of the image energy and structure information and is the most similar to the original image. The proposed model is then embedded within the CNN in place of every pooling layer. This pooling layer performs 2*x*2 region-based downsampling with a 2*x*2 stride and no padding.
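For the Haar case, the lifting steps (predict and update) and the random-channel variant of the model can be sketched as follows (an illustrative numpy version, assuming the unnormalized average/difference form of the Haar lifting; `haar_lifting_2d` and `lift_rand_channel_pool` are hypothetical names):

```python
import numpy as np

def haar_lifting_2d(x):
    """One level of the 2D Haar wavelet via the lifting scheme.

    Returns the four subbands (LL, LH, HL, HH), each half the input size.
    """
    def lift_1d(a, axis):
        even = np.take(a, range(0, a.shape[axis], 2), axis=axis)
        odd = np.take(a, range(1, a.shape[axis], 2), axis=axis)
        d = odd - even        # predict step: detail (high-pass)
        s = even + d / 2      # update step: approximation (low-pass)
        return s, d

    lo, hi = lift_1d(x, axis=1)    # lift along rows
    ll, lh = lift_1d(lo, axis=0)   # lift the low band along columns
    hl, hh = lift_1d(hi, axis=0)   # lift the high band along columns
    return ll, lh, hl, hh

def lift_rand_channel_pool(x, rng):
    """Pooling sketch: keep LL and concatenate one randomly chosen detail subband."""
    ll, lh, hl, hh = haar_lifting_2d(x)
    chosen = (lh, hl, hh)[rng.integers(3)]
    return np.stack([ll, chosen])  # two output channels per input channel

rng = np.random.default_rng(42)
out = lift_rand_channel_pool(np.arange(16.0).reshape(4, 4), rng)
print(out.shape)  # (2, 2, 2): LL plus one detail subband, each downsampled 2x2
```

With this lifting form, the LL subband is exactly the 2x2 block average, so the spatial reduction matches that of a standard 2*x*2 pooling layer while the concatenated detail subband preserves some high-frequency information.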

**Figure 8.** *The benchmark dataset selected for this study was the MNIST.*
