Journal: International Journal of Electronics, Communication and Soft Computing Science and Engineering
Print ISSN: 2277-9477
Year of publication: 2013
Volume: 2
Issue: 9
Publisher: IJECSCSE
Abstract: Data mining is an increasingly important technology for extracting useful knowledge hidden in large collections of data. This work presents the design and implementation of an architecture for the analysis of data streams in distributed environments. In particular, data streams are analyzed to compute the items and item sets whose frequency exceeds a given threshold. The mining approach is hybrid: frequent items are computed in a single pass using a sketch algorithm, while frequent item sets are computed by a further multi-pass analysis. The architecture combines parallel and distributed processing to keep pace with the rate of the distributed data streams. To keep computation close to the data, miners are distributed among the domains where the data streams are generated.
Keywords: Data Mining; Frequent Item; Frequent Item Sets; Data Streams
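
The hybrid approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which sketch algorithm is used, so a Count-Min sketch is assumed here, and the names `CountMinSketch`, `frequent_items`, and `frequent_pairs`, the `width`/`depth` defaults, and the sample transactions are all hypothetical. The `merge` method is likewise an assumption about how sketches built by miners in different domains could be combined.

```python
import hashlib
from collections import Counter
from itertools import combinations


class CountMinSketch:
    """Assumed sketch type; estimates never fall below the true count."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One salted hash per row, deterministic across processes.
        digest = hashlib.blake2b(
            item.encode("utf-8"), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item):
        for r in range(self.depth):
            self.rows[r][self._index(item, r)] += 1

    def estimate(self, item):
        return min(self.rows[r][self._index(item, r)] for r in range(self.depth))

    def merge(self, other):
        # Assumption: sketches from different miners share width and depth,
        # so they can be combined by cell-wise addition.
        for r in range(self.depth):
            for c in range(self.width):
                self.rows[r][c] += other.rows[r][c]


def frequent_items(transactions, min_count, sketch):
    """Single pass: keep items whose estimated count reaches min_count."""
    frequent = set()
    for transaction in transactions:
        for item in set(transaction):
            sketch.add(item)
            if sketch.estimate(item) >= min_count:
                frequent.add(item)
    return frequent


def frequent_pairs(transactions, frequent, min_count):
    """Further pass: exact counts for pairs made only of frequent items."""
    counts = Counter()
    for transaction in transactions:
        kept = sorted(set(transaction) & frequent)
        counts.update(combinations(kept, 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical sample data, for illustration only.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"], ["a", "b", "d"]]
sketch = CountMinSketch()
items = frequent_items(transactions, min_count=3, sketch=sketch)
pairs = frequent_pairs(transactions, items, min_count=3)
```

Because a Count-Min sketch can overestimate but never underestimate, the first pass may admit a few false candidates; the exact counting in the second pass removes them. Only pairs are counted here, whereas a full multi-pass analysis would continue to larger item sets.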