3. Convolutional neural network

A convolutional neural network (CNN) is an artificial neural network inspired by the animal visual system [29]. A CNN architecture is built from three main types of layers: convolutional layers, pooling layers, and fully connected layers. Compared with traditional neural networks, CNNs can extract features while preserving most of the spatial correlation in the input. Each layer consists of neurons that have learnable weights and biases. The model is trained by feeding data into the network and minimizing the loss function computed at the output layer. Several CNN architectures have been proposed; in this work, we used LeNet-5. LeNet-5 [30] was first applied to handwritten digit recognition, where it achieved an error rate as low as 0.8%. Figure 4 shows the architecture of the LeNet-5 convolutional neural network used to classify the red blood cell images.

A major challenge in this research is that the current image dataset is still small, which can lead to overfitting when it is used to train a deep convolutional neural network. To address this, we consider data augmentation. Additional, similar images can be generated by applying operations such as rotation, translation, flipping, zooming, and color perturbation to the existing images. Other methods include data augmentation in the spatial domain by learning statistical models of data transformations [31], and data augmentation in the feature domain through interpolation and extrapolation [32, 33]. In the following, we present our work on augmenting the red blood cell image dataset and discuss the impact of data augmentation on image classification accuracy with a deep convolutional neural network.
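As an illustration of the geometric operations listed above, the following is a minimal sketch in plain Python, not the pipeline used in this work. It assumes a square grayscale image stored as a list of rows, restricts rotation to multiples of 90 degrees, and omits zoom and color perturbation. The function names (rotate90, flip_horizontal, translate, augment) and the parameters max_shift and seed are hypothetical and chosen only for this example.

```python
import random


def rotate90(img):
    """Rotate an image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img):
    """Mirror an image left to right."""
    return [row[::-1] for row in img]


def translate(img, dx, dy, fill=0):
    """Shift an image by dx columns and dy rows; vacated pixels get `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def augment(img, n_copies, max_shift=1, seed=0):
    """Return n_copies randomly transformed versions of a square image.

    Each copy gets a random number of 90-degree rotations, an optional
    horizontal flip, and a small random translation (assumed settings).
    """
    rng = random.Random(seed)
    copies = []
    for _ in range(n_copies):
        new = img
        for _ in range(rng.randrange(4)):
            new = rotate90(new)
        if rng.random() < 0.5:
            new = flip_horizontal(new)
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        copies.append(translate(new, dx, dy))
    return copies


if __name__ == "__main__":
    cell = [[0, 1, 0],
            [1, 1, 0],
            [0, 0, 0]]
    for variant in augment(cell, n_copies=3):
        print(variant)
```

Because every transformed copy keeps the label of its source image, a routine of this kind enlarges the training set without additional annotation. Arbitrary-angle rotation and zoom would additionally require pixel interpolation.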

Figure 4. LeNet-5 convolutional neural network architecture. There are two convolution layers (C1 and C3), two subsampling (pooling) layers (S2 and S4), and two fully connected layers.
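To make the layer sequence in Figure 4 concrete, the sketch below traces feature-map sizes through C1, S2, C3, and S4. The dimensions are assumptions taken from the classic LeNet-5 design (a 32 x 32 single-channel input, 5 x 5 convolution kernels without padding, 6 and 16 feature maps, and 2 x 2 non-overlapping pooling); the input size and layer widths used for the red blood cell images may differ. The helper names conv_out, pool_out, trace_shapes, and flattened_features are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window, stride=None):
    """Spatial output size of a pooling layer (non-overlapping by default)."""
    stride = window if stride is None else stride
    return (size - window) // stride + 1


# (name, kind, number of feature maps, kernel or window size)
# Assumed values from the classic LeNet-5 design, not from this work.
LAYERS = [
    ("C1", "conv", 6, 5),
    ("S2", "pool", None, 2),
    ("C3", "conv", 16, 5),
    ("S4", "pool", None, 2),
]


def trace_shapes(input_size=32, in_channels=1):
    """Return (name, channels, height, width) after each layer."""
    shapes = [("input", in_channels, input_size, input_size)]
    channels, size = in_channels, input_size
    for name, kind, maps, k in LAYERS:
        if kind == "conv":
            size = conv_out(size, k)
            channels = maps
        else:
            size = pool_out(size, k)
        shapes.append((name, channels, size, size))
    return shapes


def flattened_features(shapes):
    """Number of values passed from the last layer to the fully connected layers."""
    _, c, h, w = shapes[-1]
    return c * h * w


if __name__ == "__main__":
    shapes = trace_shapes()
    for name, c, h, w in shapes:
        print(f"{name}: {c} x {h} x {w}")
    print("features into fully connected layers:", flattened_features(shapes))
```

Under these assumed dimensions, the spatial size shrinks from 32 to 28, 14, 10, and finally 5, so S4 passes 16 x 5 x 5 = 400 values to the fully connected layers.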
