Impact of Missing Data on Classification Success in Health and Comparative Analysis of Imputation Methods

Ergun E. U., Kok I., ÖZDEMİR S.

2022 International Symposium on Networks, Computers and Communications, ISNCC 2022, Shenzhen, China, 19-21 July 2022

  • Publication Type: Conference Paper / Full Text
  • DOI Number: 10.1109/isncc55209.2022.9851791
  • City: Shenzhen
  • Country: China
  • Keywords: Healthcare systems, Internet of Things, Machine learning, Missing data imputation
  • Hacettepe University Affiliated: Yes


© 2022 IEEE. Data quality plays an important role in the success and reliability of IoT applications. However, by the nature of IoT, generated data can be missing, erroneous, or noisy because of hardware failures, synchronization issues, unstable network communication, and manual system shutdown. In particular, missing data must be imputed correctly to reduce erroneous or inaccurate decisions in IoT healthcare systems. Therefore, in this paper, we apply the naive Bayes, k-nearest neighbors, decision tree, and XGBoost algorithms in the IoT healthcare domain to examine in detail how missing data affects the results of machine learning algorithms. We then present a comparative analysis of missing data imputation methods.
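
The kind of experiment the abstract describes can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not give the datasets, missingness mechanism, or imputation methods used, so everything here is an assumption chosen for illustration. The data are synthetic, values are removed completely at random, only a k-nearest neighbors classifier is used, and mean fill and zero fill stand in for the imputation methods being compared. All function names (`make_data`, `mask_at_random`, `run_experiment`, and so on) are hypothetical.

```python
import math
import random


def make_data(n, rng):
    """Synthetic two-class data (assumption): class 1 is shifted by +2 on both features."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        rows.append([rng.gauss(2.0 * y, 1.0), rng.gauss(2.0 * y, 1.0)])
        labels.append(y)
    return rows, labels


def mask_at_random(rows, rate, rng):
    """Replace each value with None with probability `rate` (missing completely at random)."""
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def column_means(rows):
    """Mean of the observed (non-None) values in each column; 0.0 if a column is empty."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means


def impute(rows, fill):
    """Fill every None in column j with fill[j]."""
    return [[fill[j] if v is None else v for j, v in enumerate(row)] for row in rows]


def knn_predict(train_x, train_y, x, k=5):
    """Majority vote among the k training points closest to x (binary labels 0/1)."""
    nearest = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train_x, train_y, test_x, test_y, k=5):
    """Fraction of test points classified correctly."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y))
    return hits / len(test_x)


def run_experiment(rate, seed=0):
    """Compare accuracy on complete training data against two simple imputations."""
    rng = random.Random(seed)
    x, y = make_data(300, rng)
    train_x, train_y, test_x, test_y = x[:200], y[:200], x[200:], y[200:]
    masked = mask_at_random(train_x, rate, rng)  # only the training set is degraded
    return {
        "complete": accuracy(train_x, train_y, test_x, test_y),
        "mean_imputed": accuracy(impute(masked, column_means(masked)), train_y, test_x, test_y),
        "zero_imputed": accuracy(impute(masked, [0.0, 0.0]), train_y, test_x, test_y),
    }


if __name__ == "__main__":
    for rate in (0.1, 0.3, 0.5):
        print(rate, run_experiment(rate))
```

Running the script prints one accuracy dictionary per missing rate, which makes it easy to see how each fill strategy behaves as more values are removed. Swapping in other classifiers or imputers, such as those named in the abstract, only requires replacing `knn_predict` and `impute`.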