Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2014
Volume: 5
Issue: 4
Pages: 5315-5321
Publisher: TechScience Publications
Abstract: Data mining has made great progress in recent years, but missing data remains a major challenge. Missing values in a dataset can degrade the performance of a classifier, which makes it harder to extract useful information from the data. The dataset used in this work is a student dataset that contains missing values in the attributes tm_10 and tm_12. Three techniques are used to handle these missing values: listwise deletion, mean imputation, and KNN imputation. Applying them yields three imputed datasets, and the C4.5 (J48) classification algorithm is applied to each. This work analyzes the performance of the three methods on the basis of C4.5 classification accuracy in order to decide which one handles missing values best. In the experimental results, the accuracy obtained with KNN imputation is higher than that of the other two techniques, so KNN imputation is the better of the three ways of handling missing values. The Weka data mining tool is used for this analysis.
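
The three missing-value techniques named in the abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the Weka workflow used in the paper: the records are invented, the `attendance` column is hypothetical (only `tm_10` and `tm_12` come from the abstract), and the KNN variant shown (Euclidean distance over the attributes both records have, averaging the `k` nearest donors) is one common formulation that may differ from the one the authors used.

```python
import math

# Toy records invented for illustration; None marks a missing value.
# tm_10 and tm_12 are named in the abstract; "attendance" is a hypothetical attribute.
rows = [
    {"attendance": 90.0, "tm_10": 80.0, "tm_12": 78.0},
    {"attendance": 85.0, "tm_10": None, "tm_12": 74.0},
    {"attendance": 60.0, "tm_10": 55.0, "tm_12": None},
    {"attendance": 88.0, "tm_10": 82.0, "tm_12": 80.0},
    {"attendance": 58.0, "tm_10": 50.0, "tm_12": 52.0},
]
COLUMNS = ["attendance", "tm_10", "tm_12"]


def listwise_deletion(data):
    """Drop every record that has at least one missing value."""
    return [dict(r) for r in data if all(r[c] is not None for c in COLUMNS)]


def mean_imputation(data):
    """Replace each missing value with the mean of the observed values in its column."""
    means = {}
    for c in COLUMNS:
        observed = [r[c] for r in data if r[c] is not None]
        means[c] = sum(observed) / len(observed)
    return [{c: (r[c] if r[c] is not None else means[c]) for c in COLUMNS} for r in data]


def knn_imputation(data, k=2):
    """Fill each missing value with the column mean over the k nearest donor records."""
    result = []
    for r in data:
        filled = dict(r)
        for c in COLUMNS:
            if r[c] is not None:
                continue
            candidates = []
            for other in data:
                # A donor must be a different record with an observed value in column c.
                if other is r or other[c] is None:
                    continue
                # Compare only on attributes observed in both records.
                shared = [x for x in COLUMNS
                          if x != c and r[x] is not None and other[x] is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((r[x] - other[x]) ** 2 for x in shared) / len(shared))
                candidates.append((dist, other[c]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [value for _, value in candidates[:k]]
            filled[c] = sum(nearest) / len(nearest)
        result.append(filled)
    return result


deleted = listwise_deletion(rows)
mean_filled = mean_imputation(rows)
knn_filled = knn_imputation(rows, k=2)
```

On this toy data, listwise deletion keeps only the three complete records, mean imputation fills each gap with a single column-wide value, and KNN imputation fills each gap from the records most similar to the incomplete one. Each resulting dataset could then be passed to a decision-tree classifier to compare accuracy, as the abstract describes for C4.5/J48.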