Basic article information

  • Title: An adaptive outlier removal aided k-means clustering algorithm
  • Authors: Nawaf H.M.M. Shrifan; Muhammad F. Akbar; Nor Ashidi Mat Isa
  • Journal: Journal of King Saud University - Computer and Information Sciences
  • Print ISSN: 1319-1578
  • Year of publication: 2022
  • Volume: 34
  • Issue: 8
  • Pages: 6365-6376
  • Language: English
  • Publisher: Elsevier
  • Abstract: K-means is one of ten popular clustering algorithms. However, k-means performs poorly in the presence of outliers in real datasets. In addition, the choice of distance metric affects clustering accuracy. Improving the clustering accuracy of k-means therefore remains an active research topic in the data clustering community, from both the outlier removal and the distance metric perspectives. Herein, a novel modification of the k-means algorithm is proposed, based on Tukey’s rule in conjunction with a new distance metric. The standard Tukey rule is modified to remove outliers adaptively by considering whether the data are distributed to the left of, to the right of, or evenly around the input data's mean value. In the proposed modification of k-means, outliers are eliminated before the centroids are calculated, to minimize the outliers' influence. Meanwhile, a new distance metric is proposed to assign each data point to the nearest cluster. In this research, the modified k-means significantly improves clustering accuracy and centroid convergence. Moreover, the proposed distance metric outperforms most of the distance metrics in the literature in overall performance. The presented work demonstrates the significance of the proposed technique, improving the overall clustering accuracy up to 80.57% on nine standard multivariate datasets.
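
The general idea in the abstract, removing outliers before each centroid update, can be illustrated with a minimal Python sketch. It is not the authors' method. It assumes the standard Tukey fences (Q1 - 1.5·IQR, Q3 + 1.5·IQR) applied per feature and plain Euclidean distance, because the abstract does not specify the adaptive rule or the proposed distance metric. All function names are hypothetical.

```python
import math
import statistics


def tukey_fences(values, k=1.5):
    """Standard (non-adaptive) Tukey fences: (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def robust_centroid(points, k=1.5):
    """Mean of the points that fall inside the Tukey fences on every feature."""
    if len(points) < 4:
        # Assumption: with very few points, skip filtering and use a plain mean.
        kept = points
    else:
        n_features = len(points[0])
        fences = [tukey_fences([p[d] for p in points], k)
                  for d in range(n_features)]
        kept = [p for p in points
                if all(lo <= p[d] <= hi for d, (lo, hi) in enumerate(fences))]
        kept = kept or points  # never filter a cluster down to nothing
    return [statistics.fmean(column) for column in zip(*kept)]


def kmeans_with_outlier_removal(points, centroids, max_iterations=20):
    """K-means whose centroid update ignores per-cluster Tukey outliers.

    Points are assigned with Euclidean distance (an assumption; the paper
    proposes its own metric). Returns the final centroids.
    """
    centroids = [list(c) for c in centroids]
    for _ in range(max_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Empty clusters keep their previous centroid.
        updated = [robust_centroid(c) if c else old
                   for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids
```

In this sketch an outlier is still assigned to its nearest cluster; it is only excluded from the mean that defines the centroid, so a single extreme point cannot drag the centroid away from the bulk of the cluster.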