Journal: International Journal of Computer Trends and Technology
Electronic ISSN: 2231-2803
Publication year: 2014
Volume: 12
Issue: 4
Pages: 193-195
DOI: 10.14445/22312803/IJCTT-V12P138
Publisher: Seventh Sense Research Group
Abstract: In statistical analysis, missing data is a common data quality problem, and many real datasets contain missing values. Imputation preserves all cases by replacing each missing value with a probable value based on other available information. Once all missing values have been imputed, the dataset can be analyzed using standard techniques for complete data. This paper describes efficient imputation methods such as Mean, Median, Refined Mean, Standard Deviation, Linear Regression, and a discretization-based method, together with clustering-based techniques such as K-Means and KNN, all of which are used to impute missing values in a dataset. The datasets are taken from the UCI ML repository. The results are compared in terms of accuracy.
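
As an illustration of the simplest methods named in the abstract, the following is a minimal sketch of mean and median imputation for a single numeric column. It is not the paper's implementation: the function name `impute_column`, the `strategy` parameter, and the sample `ages` values are hypothetical and chosen only for this example.

```python
import math
from statistics import mean, median


def impute_column(values, strategy="mean"):
    """Return a copy of `values` with missing entries (None or NaN)
    replaced by the mean or median of the observed entries.

    Hypothetical helper for illustration only.
    """
    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    observed = [v for v in values if not is_missing(v)]
    if not observed:
        raise ValueError("column has no observed values to impute from")

    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    # Observed values are kept as-is; only missing entries are replaced.
    return [fill if is_missing(v) else v for v in values]


# Made-up sample column with two missing entries and one outlier (100.0).
ages = [22.0, None, 30.0, 28.0, float("nan"), 100.0]
mean_filled = impute_column(ages, "mean")
median_filled = impute_column(ages, "median")
print(mean_filled)
print(median_filled)
```

In this made-up column the outlier pulls the mean well above the typical value while the median stays close to it, which is one reason the two fill values can lead to different downstream accuracy.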