Journal: International Journal of Data Mining & Knowledge Management Process
Print ISSN: 2231-007X
Online ISSN: 2230-9608
Publication year: 2012
Volume: 2
Issue: 4
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Data mining is a user-centric process for extracting useful patterns from large volumes of data. With the growth of the Internet, data streams have become a key area of advanced analysis and data mining. Handling data streams is difficult because the data vary and concept drift occurs frequently. No single classifier can be relied upon to classify stream data correctly, since each is built with one specific learning approach. Hence we use a multi-chunk ensemble of classifiers to classify evolving data streams and improve prediction accuracy over single classifiers. We evaluate the ensemble on synthetic as well as real-time data, compute its precision under both majority voting and a newly proposed weighted averaging scheme, present the results graphically, and compare its performance against individual classifiers. Current techniques include a single-chunk approach, in which the entire data set is considered as a whole; the method used here shows better efficiency.
Keywords: Ensemble; data stream; multi-chunk; weighted averaging
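
The abstract names two ways of combining ensemble members, majority voting and weighted averaging, without giving their details. The following is a minimal Python sketch of the general idea only, not the authors' method. The function names (`majority_vote`, `weighted_vote`, `chunk_accuracy`) are hypothetical, and weighting each classifier by its accuracy on the most recent data chunk is an assumption made for illustration; the record does not say how the paper computes its weights.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight."""
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def chunk_accuracy(classifier, chunk):
    """Fraction of (features, label) pairs in the chunk that the classifier gets right.

    Assumption: this accuracy is used as the classifier's weight; the
    source does not specify the weighting scheme.
    """
    correct = sum(1 for features, label in chunk if classifier(features) == label)
    return correct / len(chunk)


if __name__ == "__main__":
    # Three toy threshold classifiers standing in for ensemble members.
    classifiers = [lambda x: x > 0, lambda x: x > 5, lambda x: x > -5]
    latest_chunk = [(1, True), (7, True), (-3, False), (-8, False)]

    weights = [chunk_accuracy(c, latest_chunk) for c in classifiers]
    predictions = [c(3) for c in classifiers]

    print(majority_vote(predictions))
    print(weighted_vote(predictions, weights))
```

With equal weights, `weighted_vote` reduces to `majority_vote`; the two differ only when a minority of classifiers carries more total weight than the majority.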