Journal title: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2018
Volume: 96
Issue: 12
Publisher: Journal of Theoretical and Applied
Abstract: Density-based clustering has emerged as a prominent class of methods for clustering data streams. It can discover clusters of arbitrary shape and can handle noise in the data. Several density-based clustering algorithms for data streams have recently been proposed in the literature, but each has limitations that render it ineffective on big data, which makes a new algorithm necessary. Existing density-based clustering algorithms require high computation time and large amounts of memory for the clustering process. In this paper, we present a novel density-based clustering algorithm for evolving data streams, called Real-time Density-based Clustering (RTDBStream). It is a hybrid algorithm that integrates the advantages of density-grid and density micro-clustering algorithms to obtain better results. The quality of the proposed algorithm is evaluated on various data sets with distinct characteristics using different quality metrics.
Keywords: Big data; Data stream; Density-based clustering; Grid-based clustering; Micro-clustering