Classification of Malaria-Infected Cells Using Deep Convolutional Neural Networks

http://dx.doi.org/10.5772/intechopen.72426


Machine Learning - Advanced Techniques and Emerging Applications

To evaluate the quality of the augmented dataset, we used only half of the infected cell images (400 images) for data augmentation, with the remaining 400 images untouched. The same configuration applies to the set of normal red blood cell images, which contains 4000 images; only half (2000 images) of that set were used for augmentation.

We first describe the algorithms for data augmentation by image interpolation in the spatial domain (Section 4.1) and in the feature domain (Section 4.2), respectively. As a comparison, we then present some example red blood cell images to show the effect of image interpolation in the spatial and feature domains at the end of Section 4.2.

4.1. Image interpolation in the spatial domain

For any two images A and B in the dataset, we can generate a new image C by finding a weighted average. Specifically, the pixel at location (i, j) in C can be obtained by

$$C(i, j) = \min[A(i, j), B(i, j)] + k \times \{ \max[A(i, j), B(i, j)] - \min[A(i, j), B(i, j)] \},\tag{1}$$

where k is a weight ranging between 0 and 1. It can be seen that C(i, j) = min[A(i, j), B(i, j)] for k = 0 and C(i, j) = max[A(i, j), B(i, j)] for k = 1. By varying the k values, for example, from 0 to 1 with a step size of 0.1, we can create 11 different images for any two input images. If the number of images in the dataset to be augmented is N, we can generate 11N(N − 1)/2 images, which can lead to a much enlarged dataset.

4.2. Image interpolation in the feature domain

To obtain the features of the red blood cell images, we used Hinton's autoencoder [34], which in essence is an artificial neural network that performs unsupervised learning on the input data [35]. In the encoding phase, low-dimensional representations of the input data are learned by training the neural network; these learned representations are the extracted features of the input image. The features can then be used to reconstruct the original data (decoding). The training algorithm seeks to optimize the neural network by minimizing the reconstruction loss as a cost function on a sufficiently large amount of data. Moreover, a deep neural network can be constructed by stacking multiple autoencoders, which allows for a hierarchical representation of the data through a multilayer architecture. In [34], the Restricted Boltzmann Machine (RBM) was used as an autoencoder, serving as the building block of a deep autoencoder network. Each RBM was pretrained and unrolled; then, back propagation was carried out to fine-tune the entire stacked autoencoder with cross entropy as the cost function.

In our implementation, the numbers of neurons in the four layers are 2500-1500-500-30. The maximum number of epochs for training the autoencoder was set to 1000, and back propagation was set to 500 iterations. Figure 5 shows the architecture of the stacked autoencoders and autodecoders, where data interpolation is performed on the 30-point feature vectors. Figure 6 shows some examples of reconstructed images.
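For illustration, the spatial-domain interpolation of Eq. (1) in Section 4.1 can be sketched in a few lines of NumPy. The 50 × 50 image size is an assumption here (chosen to match the 2500-neuron input layer of the autoencoder); the random arrays merely stand in for real cell images.

```python
import numpy as np

def interpolate_pair(A, B, k):
    """Eq. (1): pixel-wise weighted average between the min and max of two images."""
    lo = np.minimum(A, B)
    hi = np.maximum(A, B)
    return lo + k * (hi - lo)

# Hypothetical 50x50 grayscale cell images standing in for real data.
rng = np.random.default_rng(0)
A = rng.random((50, 50))
B = rng.random((50, 50))

# k from 0 to 1 with a step size of 0.1 gives 11 new images per image pair.
ks = np.round(np.arange(0, 1.01, 0.1), 1)
new_images = [interpolate_pair(A, B, k) for k in ks]

assert len(new_images) == 11
assert np.allclose(new_images[0], np.minimum(A, B))   # k = 0 -> pixel-wise min
assert np.allclose(new_images[-1], np.maximum(A, B))  # k = 1 -> pixel-wise max

# With N source images, all unordered pairs yield 11 * N * (N - 1) / 2 images.
N = 400
print(11 * N * (N - 1) // 2)  # 877800
```

Note that the pair (A, B) and the pair (B, A) produce the same images, since min and max are symmetric; this is why the count uses unordered pairs, N(N − 1)/2.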

Figure 5. The four-layer stacked autoencoder and autodecoder used to extract the 30-point features of the input image in an unsupervised learning manner. After network training, interpolation of the input images is performed using their 30-point features. The resulting feature vector is then used to reconstruct the red blood cell image by the autodecoders.

Figure 6. Original images (top row) and the reconstructed images (bottom row) using the stacked autodecoders. (a) Malaria-infected cells and (b) normal cells.
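For reference, the cross-entropy cost minimized during fine-tuning can be written, assuming pixel values normalized to [0, 1] with reconstruction $\hat{x}$, as

$$E = -\sum\_i \left[ x\_i \log \hat{x}\_i + (1 - x\_i) \log (1 - \hat{x}\_i) \right],$$

which is minimized when the reconstruction $\hat{x}$ matches the input $x$ pixel by pixel.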

For any two images A and B in the dataset, two 30-point feature vectors FA and FB can be obtained from the trained stacked autoencoders. Similar to the image interpolation in the spatial domain, we can generate a new 30-point feature vector FC as a weighted average, computed element-wise by

$$F\_C = \min[F\_A, F\_B] + k \times \{ \max[F\_A, F\_B] - \min[F\_A, F\_B] \},\tag{2}$$

where k is a weight varied between 0 and 1 with a step size of 0.1. The newly generated feature vectors are then fed into the trained autodecoders to reconstruct the image C in the spatial domain.
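The feature-domain pipeline can be sketched end to end as follows. Everything here is a simplified, untrained stand-in: the weights are random, the decoder simply ties (transposes) the encoder weights, and sigmoid activations are assumed, so the reconstructions are not meaningful. The sketch only shows the 2500-1500-500-30 architecture and how Eq. (2) is applied element-wise to the 30-point codes.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# 2500-1500-500-30 encoder with a mirrored decoder, as in the chapter's
# stacked autoencoder. Random (untrained) weights: an architectural sketch only.
sizes = [2500, 1500, 500, 30]
enc_W = [rng.normal(0, 0.01, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
dec_W = [W.T.copy() for W in reversed(enc_W)]  # tied weights, for simplicity

def encode(x):
    for W in enc_W:
        x = sigmoid(x @ W)
    return x  # 30-point feature vector

def decode(f):
    for W in dec_W:
        f = sigmoid(f @ W)
    return f  # 2500-pixel reconstruction

# Two flattened 50x50 cell images (hypothetical data).
A = rng.random(2500)
B = rng.random(2500)
FA, FB = encode(A), encode(B)

# Eq. (2): element-wise min/max interpolation in the feature domain,
# then decoding back to the spatial domain.
for k in np.round(np.arange(0, 1.01, 0.1), 1):
    FC = np.minimum(FA, FB) + k * (np.maximum(FA, FB) - np.minimum(FA, FB))
    C = decode(FC)
    assert FC.shape == (30,) and C.shape == (2500,)
```

In the actual system the encoder/decoder weights would of course come from the RBM pretraining and back-propagation fine-tuning described above, not from a random draw.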
