Journal: International Journal of Engineering and Computer Science
Print ISSN: 2319-7242
Publication year: 2015
Volume: 4
Issue: 11
Pages: 14946-14949
DOI: 10.18535/Ijecs/v4i1.11
Publisher: IJECS
Abstract: Outlier detection is an important branch of data mining. Data mining is an extensively studied research field in which most work focuses on information discovery. A data stream is a massive sequence of data objects generated continuously at a high rate. Various approaches and methods are used for outlier detection. Some of them use the K-Means algorithm for outlier detection in data streams, which helps to group similar data points into clusters. K-Means is the best-known partitional clustering algorithm. Because streaming data usually cannot be scanned multiple times, and because new concepts may keep evolving in the incoming data over time, outlier detection in streaming data is challenging. Irrelevant attributes can be regarded as noisy attributes when working with data stream objects, and such attributes pose a further challenge. In high-dimensional data, the number of attributes associated with the dataset is very large, which makes the dataset unmanageable. Clustering is a data stream mining task that is very useful for gaining insight into the data and its characteristics. Clustering is also used as a pre-processing step in the overall mining process; for example, clustering is used for outlier detection and for building a hybrid approach. The purpose of this paper is to review the hybrid approach to outlier detection alongside other approaches that use the K-Means algorithm to cluster the dataset together with other techniques such as the Euclidean distance approach. Various application domains of outlier detection are also discussed.
Keywords: Outlier Detection; Euclidean Distance; K-Means; Dataset; Information Discovery
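
Illustrative sketch: the abstract describes combining K-Means clustering with a Euclidean distance test to find outliers, but it does not give the exact procedure. The following minimal Python sketch shows one common way such a combination might look. The function names, the choice of the first k points as initial centroids, and the "mean plus two standard deviations" threshold are all assumptions made for illustration, not the method reviewed in the paper.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, k, iters=20):
    """Plain batch K-Means; returns the final centroids.

    Assumption: the first k points seed the centroids, which keeps the
    sketch deterministic. Real implementations usually use random or
    k-means++ seeding.
    """
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def find_outliers(points, centroids, factor=2.0):
    """Flag points that lie unusually far from their nearest centroid.

    Assumption: "unusually far" means farther than
    mean + factor * standard deviation of all point-to-centroid distances.
    """
    dists = [min(euclidean(p, c) for c in centroids) for p in points]
    mean = sum(dists) / len(dists)
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    threshold = mean + factor * std
    return [p for p, d in zip(points, dists) if d > threshold]


if __name__ == "__main__":
    data = [(0, 0), (10, 10), (0, 1), (1, 0), (1, 1),
            (10, 11), (11, 10), (11, 11), (10, 30)]
    centers = kmeans(data, k=2)
    print(find_outliers(data, centers))
```

This batch version scans the data repeatedly, which the abstract notes is usually not possible for a true data stream; a streaming variant would have to update centroids incrementally as points arrive.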