**Acknowledgements**

The authors would like to thank Guan-Yu Huang for conducting the experiments and collecting the results. This work was supported in part by Pervasive Artificial Intelligence Research (PAIR) Labs, Taiwan.

**Author details**

*Incomplete Data Analysis*

*DOI: http://dx.doi.org/10.5772/intechopen.94068*


Bo-Wei Chen<sup>1</sup> and Jia-Ching Wang<sup>2</sup>\*

1 Department of Electrical Engineering, National Sun Yat-sen University, Kaohsiung City, Taiwan

2 Department of Computer Science and Information Engineering, National Central University, Taoyuan City, Taiwan

\*Address all correspondence to: jcw@csie.ncu.edu.tw

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


*Applications of Pattern Recognition*

In addition, RMSEs rose as the missing rate increased. KNRImpute, RTImpute, and RFImpute produced similar RMSEs. Overall, KNNImpute and PCAImpute were sensitive to their hyperparameters.

**4. Conclusions**

This chapter introduced recent methods for processing missing values. Four types of commonly used algorithms, namely, *K*-Nearest Neighbors, regression, tree-based algorithms, and latent component-based approaches, were examined, and their advantages and disadvantages were discussed in each subsection. It is worth noting that data imputation usually does not require training data. Imputation becomes impractical when it needs supervisory information or the ground truth, because the ground truth of a missing value is unobservable. When missing values occur in the training data and the ground truth itself is missing, supervised methods have nothing to learn from. Therefore, the four types of algorithms selected in this chapter do not rely on or require any supervisory information.

To evaluate these algorithms, this chapter conducted experiments on open datasets, using root-mean-squared errors and coefficients of determination as criteria. Numerical results are reported in the experimental section for reference.

In recent years, surveys have shown that a deep learning model, the Generative Adversarial Network (GAN), has attracted much attention, and several novel imputation methods based on GANs have been proposed, e.g., MisGAN [49], MIWAE [50], and GAIN [51]. For future studies, deep learning architectures such as Deep PCA, PCANet, and Deep NMF can be integrated into the four types of algorithms examined here, namely, *K*-Nearest Neighbors, regression, tree-based algorithms, and latent component-based approaches, to further enhance data imputation.
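As a brief illustration of the two evaluation criteria adopted in this chapter, the following sketch computes the root-mean-squared error and the coefficient of determination between held-out true values and their imputed estimates. It uses the standard textbook definitions and is not necessarily the exact procedure used in the experiments. The function names `rmse` and `r_squared` and the toy values are hypothetical. Because the ground truth of a genuinely missing value is unobservable, such criteria are assumed here to be computed on entries that were deliberately masked from complete data.

```python
import math


def rmse(truth, imputed):
    """Root-mean-squared error between true and imputed values."""
    if len(truth) != len(imputed) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(truth, imputed)]
    return math.sqrt(sum(squared_errors) / len(truth))


def r_squared(truth, imputed):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(truth) != len(imputed) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_truth = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, imputed))
    ss_tot = sum((t - mean_truth) ** 2 for t in truth)
    if ss_tot == 0:
        raise ValueError("undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: values that were masked, and their imputed estimates.
truth = [1.0, 2.0, 3.0, 4.0]
imputed = [1.5, 2.0, 2.5, 4.0]

print("RMSE:", rmse(truth, imputed))
print("R^2:", r_squared(truth, imputed))
```

A lower RMSE and a coefficient of determination closer to 1 both indicate imputed values that lie closer to the held-out truth.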
