Journal: International Journal of Advanced Research In Computer Science and Software Engineering
Print ISSN: 2277-6451
Electronic ISSN: 2277-128X
Publication year: 2013
Volume: 3
Issue: 8
Publisher: S.S. Mishra
Abstract: Data mining refers to extracting knowledge from large amounts of data. Organizing data into valid groupings is one of the most basic ways of understanding and learning. Cluster analysis is important for analysing the number of clusters of natural data in several domains. Outlier detection is a fundamental part of data mining. A key challenge with outlier detection is that it is not a well-formulated problem like clustering. This paper discusses two techniques and compares them in terms of accuracy, mean squared error, and time complexity. The first is a threshold-based approach and the second is an entropy-based approach. Several performance measures are used to find the best clustering algorithm for outlier detection. The experimental results show that the threshold-based clustering algorithm achieves very good outlier detection accuracy compared to the existing algorithms.
Keywords: clustering; outlier; entropy; threshold; mean squared error