**2.2. CNN basics**

Convolutional neural networks (CNNs) involve an inference process for recognition and a back-propagation process for training. Since training a CNN takes a long time, many CNN applications complete training offline in advance and then deploy the trained network on terminal devices to execute recognition tasks. The inference process running on terminals therefore has more pressing demands on speed and power. In this work, we focus on the inference process of CNNs and explore how to speed up inference with a hardware accelerator.

**Figure 1** shows the simplified structure of a typical CNN. A typical CNN model usually consists of several parts: convolutional layers (CONV), nonlinear activation functions, pooling layers and fully connected layers (FC). The layers are executed one after another: the CNN reads an input image, passes it through a series of CONV layers, nonlinear activation functions and pooling layers, and generates output feature maps. These feature maps are flattened into a feature vector in the FC layers. Finally, a classifier reads the feature vector, classifies the input image into the most probable category and outputs the probability of each class.

**Figure 1.** Structure of a typical convolutional neural network.

**Figure 3.** Pseudo code of the computing process of the convolutional layer.
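The computing process of the CONV layer, whose pseudo code is given in **Figure 3**, is essentially a deep loop nest of multiply-accumulate operations. The C sketch below is a minimal reconstruction of such a loop nest, not the chapter's exact code; the function name, parameter names and the unit-stride, no-padding assumptions are illustrative.

```c
/*
 * Minimal CONV-layer loop nest (illustrative names; unit stride, no padding).
 *   in : N input feature maps of size H x W
 *   w  : M x N convolution kernels of size K x K
 *   out: M output feature maps of size (H-K+1) x (W-K+1)
 */
void conv_layer(int N, int M, int H, int W, int K,
                const float in[N][H][W],
                const float w[M][N][K][K],
                float out[M][H - K + 1][W - K + 1])
{
    for (int m = 0; m < M; m++)                   /* output feature map */
        for (int r = 0; r < H - K + 1; r++)       /* output row         */
            for (int c = 0; c < W - K + 1; c++) { /* output column      */
                float sum = 0.0f;
                for (int n = 0; n < N; n++)             /* input feature map */
                    for (int i = 0; i < K; i++)         /* kernel row        */
                        for (int j = 0; j < K; j++)     /* kernel column     */
                            sum += in[n][r + i][c + j] * w[m][n][i][j];
                out[m][r][c] = sum;
            }
}
```

This six-deep loop nest is the computation a hardware accelerator reorders, tiles and unrolls; the loop order shown here is only one of many valid schedules.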

*2.2.2. Nonlinear activation function*

A CNN usually applies a nonlinear activation function after each CONV or FC layer. The main purpose of the activation function is to introduce nonlinearity into the CNN. In general, an activation function should satisfy two conditions: it should be nonlinear and differentiable. Conventional activation functions used in CNNs, such as sigmoid and tanh, are shown in **Figure 4(a)**; however, they lead to long training times. In recent years, the Rectified Linear Unit (ReLU) [19] has become more and more popular in CNN models. ReLU is defined as f(x) = max(0, x) and is shown in **Figure 4(b)**. Compared with the conventional activation functions, ReLU is simpler to compute, which makes training faster. In addition, since many of ReLU's outputs are exactly 0, it introduces sparsity into the CNN model.

**Figure 4.** Nonlinear activation functions: (a) sigmoid and tanh; (b) Rectified Linear Unit.
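To make the cost difference concrete, here is a minimal C sketch of the three activation functions (the helper names are my own; the chapter gives only the formula for ReLU). Sigmoid and tanh each require a transcendental evaluation per element, while ReLU is a single comparison whose zero outputs make the feature maps sparse.

```c
#include <math.h>

/* Sigmoid and tanh (Figure 4(a)) need an exponential per element. */
static inline float sigmoid_act(float x) { return 1.0f / (1.0f + expf(-x)); }
static inline float tanh_act(float x)    { return tanhf(x); }

/* ReLU (Figure 4(b)): f(x) = max(0, x) -- a single comparison. */
static inline float relu_act(float x)    { return x > 0.0f ? x : 0.0f; }
```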


*2.2.3. Pooling layer*

A pooling layer usually follows a CONV layer. As shown in **Figure 5**, the pooling operation is usually either maximum or average pooling. A pooling layer reads the input feature maps and computes the maximum or average value of every sub-area of the input, producing lower-dimensional feature maps. Usually the stride of the pooling window is equal to its size.
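A minimal C sketch of max pooling under that assumption (non-overlapping P × P windows, stride equal to P; the function and parameter names are illustrative):

```c
/*
 * Max pooling with window size P and stride P (non-overlapping windows),
 * applied to one H x W input feature map. Average pooling would replace
 * the running maximum with a sum divided by P*P.
 */
void max_pool(int H, int W, int P,
              const float in[H][W], float out[H / P][W / P])
{
    for (int r = 0; r < H / P; r++)
        for (int c = 0; c < W / P; c++) {
            float m = in[r * P][c * P];   /* first value of the sub-area */
            for (int i = 0; i < P; i++)
                for (int j = 0; j < P; j++)
                    if (in[r * P + i][c * P + j] > m)
                        m = in[r * P + i][c * P + j];
            out[r][c] = m;                /* one output per P x P sub-area */
        }
}
```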

