Journal: International Journal of Computer Science Issues
Print ISSN: 1694-0784
Online ISSN: 1694-0814
Year of publication: 2011
Volume: 8
Issue: 6
Publisher: IJCSI Press
Abstract: Squeezer is an effective histogram-based approach for clustering categorical data streams. Its drawback is that it does not scale in terms of memory: the histogram grows as the number of records in the dataset increases, and holding an unpredictably large histogram in main memory is not always feasible. To address this bottleneck, this paper proposes FLoMSqueezer, a modified version of Squeezer that uses the concise sampling technique to handle Squeezer's growing memory requirement. Experimental results show that the proposed approach scales better in terms of quantitative cluster, memory, and execution time.
Keywords: Cluster analysis; data stream; histogram; sampling; quantitative cluster.
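
The abstract names concise sampling as the way the histogram's memory use is bounded, but does not describe the mechanism. The following is a minimal sketch of generic concise sampling over a stream of categorical values, not the FLoMSqueezer algorithm itself. It assumes the usual formulation: each value is admitted with probability 1/tau, a value seen once costs one slot, a (value, count) pair costs two, and tau is raised whenever the footprint exceeds the budget. The class name `ConciseSample`, the parameter `max_footprint`, and the growth factor of 1.1 are illustrative assumptions, not taken from the paper.

```python
import random


class ConciseSample:
    """Bounded-memory sample of a categorical stream (illustrative sketch)."""

    def __init__(self, max_footprint, growth=1.1, seed=None):
        self.max_footprint = max_footprint  # assumed memory budget, in slots
        self.growth = growth                # assumed factor for raising tau
        self.tau = 1.0                      # admit each item with prob. 1/tau
        self.counts = {}                    # value -> sampled count
        self.rng = random.Random(seed)

    def footprint(self):
        # A singleton costs 1 slot; a (value, count) pair costs 2.
        return sum(1 if c == 1 else 2 for c in self.counts.values())

    def add(self, value):
        # Admit the incoming value with probability 1/tau.
        if self.rng.random() < 1.0 / self.tau:
            self.counts[value] = self.counts.get(value, 0) + 1
        # Raise the threshold until the sample fits the budget again.
        while self.footprint() > self.max_footprint:
            self._raise_threshold()

    def _raise_threshold(self):
        new_tau = self.tau * self.growth
        keep = self.tau / new_tau  # each sampled point survives with this prob.
        for value in list(self.counts):
            kept = sum(
                1 for _ in range(self.counts[value]) if self.rng.random() < keep
            )
            if kept:
                self.counts[value] = kept
            else:
                del self.counts[value]
        self.tau = new_tau

    def estimate(self, value):
        # Scale the sampled count back up to estimate the true frequency.
        return self.counts.get(value, 0) * self.tau


# Small stream that fits the budget: tau stays at 1 and counts are exact.
small = ConciseSample(max_footprint=10, seed=0)
for v in ["a"] * 5 + ["b"] * 3:
    small.add(v)

# Many distinct values: the threshold rises to keep the footprint bounded.
large = ConciseSample(max_footprint=50, seed=0)
for i in range(1000):
    large.add(f"v{i}")
```

While the stream fits within the budget the sample is an exact histogram. Once it does not, counts become estimates scaled by `tau`, which is the trade-off the abstract alludes to between memory and cluster quality.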