Article Information

  • Title: Efficient Clustering Algorithm for Large Data Set
  • Authors: N. Sudhakar Reddy; KVN Sunitha
  • Journal: International Journal of Advanced Research in Computer Science and Software Engineering
  • Print ISSN: 2277-6451
  • Online ISSN: 2277-128X
  • Year: 2012
  • Volume: 2
  • Issue: 1
  • Publisher: S.S. Mishra
  • Abstract: Concept-drift detection is used for outlier detection and data labeling, and plays a vital role in detecting outliers. Its disadvantage is that the sliding window must be reclustered whenever drift occurs, so two scans are required: one to detect the drift and another to recluster the sliding window. It is therefore necessary to investigate the principles of clustering in order to design efficient algorithms that minimize disk I/O and the number of scans. This paper addresses the scanning problem and extends the leader algorithm to categorical data, whereas in the literature the leader algorithm is applied to numerical domains and sequence data sets. The main idea is to cluster the outliers with the leader algorithm, instead of reclustering the entire sliding window or data set, by calculating the threshold with the average method and the maximal resemblance of leaders. This is more efficient than reclustering the sliding window.
  • Keywords: reclustering; incremental clustering; sliding window; resemblance; threshold
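
The abstract gives no pseudocode, so the following is only a minimal sketch of a single-scan leader algorithm on categorical records, not the authors' implementation. The function names `resemblance` and `leader_cluster` are hypothetical, and the simple-matching measure (fraction of agreeing attributes) is an assumption standing in for the paper's resemblance measure. The threshold is passed in as a parameter because the abstract does not specify how the average method computes it.

```python
from typing import List, Sequence, Tuple

Record = Tuple[str, ...]


def resemblance(a: Sequence[str], b: Sequence[str]) -> float:
    """Fraction of attribute positions on which a and b agree.

    Simple matching is an assumption; the paper may define resemblance differently.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("records must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def leader_cluster(
    records: Sequence[Sequence[str]], threshold: float
) -> Tuple[List[Record], List[int]]:
    """Cluster records in one pass.

    Each record joins the leader it resembles most if that resemblance
    reaches the threshold; otherwise it becomes a new leader.
    Returns the leaders and, for each record, the index of its leader.
    """
    leaders: List[Record] = []
    labels: List[int] = []
    for rec in records:
        best_idx, best_sim = -1, -1.0
        for i, leader in enumerate(leaders):
            sim = resemblance(rec, leader)
            if sim > best_sim:
                best_idx, best_sim = i, sim
        if best_idx >= 0 and best_sim >= threshold:
            labels.append(best_idx)
        else:
            leaders.append(tuple(rec))
            labels.append(len(leaders) - 1)
    return leaders, labels


# Illustrative data only (not from the paper).
window = [("a", "x", "1"), ("a", "x", "2"), ("b", "y", "3"), ("b", "y", "3")]
leaders, labels = leader_cluster(window, threshold=0.6)
print(labels)  # [0, 0, 1, 1]
```

Because every record is compared only against the current leaders, the data is read once, which is the property the abstract relies on to avoid a second scan for reclustering.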