Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Electronic ISSN: 2320-9801
Publication year: 2013
Volume: 1
Issue: 8
Publisher: S&S Publications
Abstract: Data stream mining is an emerging research area in data mining. It refers to the process of extracting knowledge structures from continuous, rapidly growing data records. Emerging data stream applications are motivated by research involving continuous, massive data sets such as customer click streams, e-commerce, wireless sensor networks, network monitoring, telecommunication systems, stock markets, and meteorological data. Current data mining systems are not equipped to handle data of this scale, which leads to numerous computational and mining challenges due to hardware limitations. Many researchers have focused on mining data streams and have proposed techniques for data stream classification, data stream clustering, and finding frequent items in data streams. Data stream clustering and outlier detection pose a number of unique challenges in an evolving data stream environment. Data stream clustering algorithms are widely used to detect outliers efficiently. The main objective of this research work is to perform clustering and detect outliers in data streams. Two clustering algorithms, CURE with K-Means and CURE with CLARANS, are used to find outliers in data streams. Data sets of different sizes and types and two performance factors, clustering accuracy and outlier detection accuracy, are used for the analysis. The experimental results show that the proposed CURE with CLARANS clustering algorithm is more accurate than the existing CURE with K-Means algorithm.
Keywords: Data stream; Data stream clustering; Outlier detection; CURE; K-Means; CLARANS