Article Information

  • Title: Clustering-Based Hybrid Approach for Multivariate Missing Data Imputation
  • Authors: Aditya Dubey; Akhtar Rasool
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Electronic ISSN: 2156-5570
  • Publication year: 2020
  • Volume: 11
  • Issue: 11
  • DOI: 10.14569/IJACSA.2020.0111186
  • Publisher: Science and Information Society (SAI)
  • Abstract: In the era of big data, a significant amount of data is produced in many application areas. However, due to various reasons including sensor failures, communication failures, environmental disruptions, and human errors, missing values occur frequently. Missing values in the observed data pose a challenge for other data mining approaches, so they must be handled at the preprocessing stage of data mining. Several approaches for handling missing data have been proposed in the past. These approaches consider the whole dataset when making a prediction, which makes the imputation process cumbersome. This paper proposes a procedure that uses the local similarity structure of the dataset for imputation. The K-means clustering technique, combined with weighted KNN, imputes the missing values efficiently. The results are compared against imputation by mean substitution and Fuzzy C Means (FCM). The proposed imputation technique performs better than these other imputation procedures.
  • Keywords: Clustering; imputation; KNN; missing at random; multivariate
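
The abstract describes combining K-means clustering with weighted KNN so that imputation uses only locally similar records rather than the whole dataset. The following is a minimal sketch of that general idea, not the authors' implementation: the function names `kmeans` and `cluster_knn_impute`, the choice to cluster only the complete rows, the inverse-distance weights, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on complete data; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old centroid if a cluster happens to be empty.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    # Recompute labels so they match the final centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)

def cluster_knn_impute(X, k_clusters=2, k_neighbors=3, seed=0):
    """Fill NaNs using weighted KNN restricted to the nearest cluster.

    Assumption: clusters are built from complete rows only, and neighbours
    are weighted by inverse distance over the observed features.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    complete = X[~missing.any(axis=1)]
    centroids, labels = kmeans(complete, k_clusters, seed=seed)
    out = X.copy()
    for i in np.where(missing.any(axis=1))[0]:
        obs = ~missing[i]
        # Pick the closest centroid using only the observed features.
        c = np.linalg.norm(centroids[:, obs] - X[i, obs], axis=1).argmin()
        members = complete[labels == c]
        if len(members) == 0:  # fallback: search all complete rows
            members = complete
        d = np.linalg.norm(members[:, obs] - X[i, obs], axis=1)
        nn = np.argsort(d)[:k_neighbors]
        w = 1.0 / (d[nn] + 1e-9)  # inverse-distance weights
        out[i, ~obs] = (w[:, None] * members[nn][:, ~obs]).sum(axis=0) / w.sum()
    return out

# Toy data: two well-separated groups, with one missing value in each.
X_demo = np.array([
    [1.0, 1.1], [0.9, 1.0], [1.1, 0.9],
    [10.0, 10.2], [9.9, 10.0], [10.1, 9.8],
    [1.05, np.nan], [np.nan, 10.1],
])
filled = cluster_knn_impute(X_demo, k_clusters=2, k_neighbors=2)
print(filled)
```

Restricting the neighbour search to one cluster is what keeps the work local: each incomplete row is compared against the members of a single cluster instead of every complete row in the dataset.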