Article Information

  • Title: A Novel High Dimensional and High Speed Data Streams Algorithm: HSDStream
  • Authors: Irshad Ahmed; Irfan Ahmed; Waseem Shahzad
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Electronic ISSN: 2156-5570
  • Year of publication: 2016
  • Volume: 7
  • Issue: 9
  • DOI: 10.14569/IJACSA.2016.070952
  • Publisher: Science and Information Society (SAI)
  • Abstract: This paper presents a novel high-speed clustering scheme for high-dimensional data streams. Data stream clustering has gained importance in applications such as network monitoring, intrusion detection, and real-time sensing. Clustering high-dimensional stream data is inherently difficult because both the evolving nature of the stream and its high dimensionality make the problem non-trivial. To tackle this problem, clustering is performed on projected subspaces of the high-dimensional space and on a limited window of data per unit of time. We propose a High Speed and Dimensions data stream clustering scheme (HSDStream), which employs exponential moving averages to reduce memory usage and speed up the processing of the projected subspace data stream. It works in three steps: i) initialization, ii) real-time maintenance of core and outlier micro-clusters, and iii) on-demand offline generation of the final clusters. The proposed algorithm is tested against high-dimensional density-based projected clustering (HDDStream) for cluster purity, memory usage, and cluster sensitivity. Experimental results are obtained on the corrected KDD intrusion detection dataset. These results show that HSDStream outperforms HDDStream in all performance metrics, especially memory usage and processing speed.
  • Keywords: Evolving data stream; high dimensionality; projected clustering; density-based clustering; micro-clustering
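
The abstract says that exponential moving averages let the scheme summarize the stream without storing a window of points. The following is a minimal sketch of that general idea only, not the update rule from the paper: the class `EmaMicroCluster`, the smoothing factor `alpha`, and the variance threshold `max_var` are all hypothetical names chosen for illustration. It keeps a per-dimension mean and variance for one micro-cluster in constant memory and reports low-variance dimensions as a stand-in for a projected subspace.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EmaMicroCluster:
    """Hypothetical micro-cluster summary; not the structure defined in the paper."""
    alpha: float                                   # assumed smoothing factor in (0, 1]
    mean: List[float] = field(default_factory=list)
    var: List[float] = field(default_factory=list)
    count: int = 0

    def update(self, point: List[float]) -> None:
        """Fold one stream point into the summary without storing the point."""
        if self.count == 0:
            # The first point initializes the summary.
            self.mean = list(point)
            self.var = [0.0] * len(point)
        else:
            for i, x in enumerate(point):
                delta = x - self.mean[i]
                # Standard exponentially weighted mean and variance updates.
                self.mean[i] += self.alpha * delta
                self.var[i] = (1.0 - self.alpha) * (self.var[i] + self.alpha * delta * delta)
        self.count += 1

    def preferred_dims(self, max_var: float) -> List[int]:
        """Indices of dimensions whose variance is at most max_var (assumed threshold)."""
        return [i for i, v in enumerate(self.var) if v <= max_var]


mc = EmaMicroCluster(alpha=0.5)
mc.update([0.0, 0.0])
mc.update([2.0, 0.0])
print(mc.mean, mc.var, mc.preferred_dims(max_var=0.5))
```

Memory per micro-cluster is two floats per dimension regardless of how many points have arrived, which is the kind of saving the abstract attributes to moving averages. How HSDStream actually weights points and selects subspaces is specified in the paper itself.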