Article Information

  • Title: Replace Missing Values with EM algorithm based on GMM and Naïve Bayesian
  • Authors: Xi-Yu Zhou; Joon S. Lim
  • Journal: International Journal of Software Engineering and Its Applications
  • Print ISSN: 1738-9984
  • Publication year: 2014
  • Volume: 8
  • Issue: 5
  • Pages: 177-188
  • DOI: 10.14257/ijseia.2014.8.5.14
  • Publisher: SERSC
  • Abstract: In data mining applications, experimental datasets contain various kinds of missing values. Leaving missing values unsubstituted, or treating them inappropriately, is likely to cause many warnings or errors. In addition, many classification algorithms are very sensitive to missing values. For these reasons, handling missing values is an important phase in many classification and data mining tasks. This paper introduces the traditional EM algorithm and its disadvantages. We propose a new method for replacing missing values that is based on the EM algorithm and uses Naive Bayesian to improve accuracy. We conclude by classifying the seeds dataset and the vertebral column dataset and comparing the results with those obtained by two other missing-value handling methods: the traditional EM algorithm and the non-substitution method. The experimental results indicate that the proposed algorithm is stable and improves classification accuracy on large datasets that contain many missing values.
  • Keywords: missing values; EM algorithm; GMM; Naive Bayesian
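
The abstract names EM-based replacement of missing values as the baseline the paper builds on. As a rough illustration of that baseline only, the sketch below fits a single multivariate Gaussian (a one-component GMM) by EM and fills each missing entry with its conditional mean given the observed entries of the same row. It is not the authors' method: the Naive Bayesian step is not reproduced, and the function name `em_impute` and the parameters `n_iter` and `ridge` are assumptions made for this example.

```python
import numpy as np


def em_impute(X, n_iter=50, ridge=1e-6):
    """Fill NaN entries of X by EM under one multivariate Gaussian.

    Illustrative sketch only; names and defaults are hypothetical.
    Returns the filled array plus the estimated mean and covariance.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    missing = np.isnan(X)

    # Start from column means of the observed values.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    mu = filled.mean(axis=0)
    sigma = np.cov(filled, rowvar=False, bias=True) + ridge * np.eye(d)

    for _ in range(n_iter):
        # E-step: conditional mean of the missing block given the observed
        # block, plus the conditional covariance needed by the M-step.
        correction = np.zeros((d, d))
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed in this row: fall back to the mean.
                filled[i] = mu
                correction += sigma
                continue
            s_oo = sigma[np.ix_(o, o)]
            s_mo = sigma[np.ix_(m, o)]
            # gain = s_mo @ inv(s_oo), computed with a linear solve.
            gain = np.linalg.solve(s_oo, s_mo.T).T
            filled[i, m] = mu[m] + gain @ (filled[i, o] - mu[o])
            correction[np.ix_(m, m)] += sigma[np.ix_(m, m)] - gain @ s_mo.T

        # M-step: re-estimate mean and covariance from the completed data.
        mu = filled.mean(axis=0)
        centered = filled - mu
        sigma = (centered.T @ centered + correction) / n + ridge * np.eye(d)

    return filled, mu, sigma
```

Observed entries are never modified; only the NaN positions are overwritten. A mixture with several components would add per-row responsibilities to the E-step, and a class-aware variant would fit parameters per class, but both are beyond what the abstract specifies.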