Article information

  • Title: Impact of Outlier Removal and Normalization Approach in Modified k-Means Clustering Algorithm
  • Authors: Vaishali Rajeev Patel; Rupa G. Mehta
  • Journal: International Journal of Computer Science Issues
  • Print ISSN: 1694-0784
  • Electronic ISSN: 1694-0814
  • Year of publication: 2011
  • Volume: 8
  • Issue: 5
  • Publisher: IJCSI Press
  • Abstract: Clustering is mainly concerned with pattern recognition for further organizational design analysis: it finds groups of data objects such that the objects in a group are similar to one another and dissimilar from the objects in other groups. Preprocessing is important because data may contain noise, errors, inconsistencies, outliers and missing variable values. Data preprocessing techniques such as cleaning, outlier detection, data integration and transformation can be carried out before clustering to achieve a successful analysis. Normalization is an important preprocessing step in data mining that standardizes the values of all variables from their dynamic range into a specific range. Outliers can significantly affect data mining performance, so outlier detection and removal is an important task in a wide variety of data mining applications. k-Means is one of the best-known clustering algorithms, yet it suffers from major shortcomings: the number of clusters and the seed values must be initialized in advance, and it converges to local minima. This paper analyzes the performance of a modified k-Means clustering algorithm with data preprocessing techniques, including cleaning, normalization and outlier detection, and with automatic initialization of seed values, on datasets from the UCI dataset repository.
  • Keywords: Clustering; k-Means; Normalization Approach; Outlier Removal; Preprocessing
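
The abstract names normalization and outlier removal as preprocessing steps but does not say which variants the paper uses. As a rough illustration only, the sketch below assumes min-max normalization and a z-score outlier filter on a single numeric variable; the function names, the threshold, the sample data and the order of the two steps are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def remove_outliers(values, z_threshold=3.0):
    """Drop values whose z-score exceeds z_threshold (assumed rule, not the paper's)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= z_threshold]


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values from their own range into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant variable: map every value to the lower bound.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


# Hypothetical sample: five typical readings and one extreme value.
raw = [10.0, 12.0, 11.0, 13.0, 12.0, 95.0]

# Filtering first keeps the extreme value from stretching the normalized range.
cleaned = remove_outliers(raw, z_threshold=2.0)
scaled = min_max_normalize(cleaned)
```

With this sample, the value 95.0 is dropped by the filter, and the remaining values are rescaled so that 10.0 maps to the lower bound and 13.0 to the upper bound. The rescaled values could then be passed to any k-Means implementation.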