Journal title: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2021
Volume: 99
Issue: 14
Language: English
Publisher: Journal of Theoretical and Applied
Abstract: Frequent itemset mining is a data mining technique for discovering frequent patterns in a collection of databases. However, it becomes computationally expensive when applied to large volumes of data, so a scalable algorithm that can handle bigger datasets is needed. The Binary-based Technique Algorithm (BBT) simplifies the generation of frequent patterns by using bitwise operations and a binary database representation. However, it still performs poorly when dealing with high volumes of data and low minimum support thresholds, because it is designed to run in a single thread of execution. This research proposes a Parallel Binary-Based Algorithm (P-BBA) to address this problem. The proposed algorithm uses collaborative threads that work simultaneously to generate frequent itemsets in a big data environment. A master/slave architecture is used to fit the algorithm to a distributed computing platform. The results show significant reductions in execution time when using the proposed parallel binary-based algorithm.
Keywords: Big Data mining; Distributed Framework; Frequent Itemsets Mining; Parallel Frequent Item Mini
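
The abstract describes two ideas: counting support with bitwise operations over a binary database representation, and splitting the work between a master and several workers. The record does not give the details of BBT or P-BBA, so the following is only a minimal, generic sketch of those two ideas, not the authors' algorithm. All names (`build_bitmaps`, `support`, `count_chunk`, `mine`), the vertical bitmap layout, the prefix-extension candidate generation, and the use of a local thread pool in place of a distributed platform are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def build_bitmaps(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it."""
    bitmaps = {}
    for i, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << i)
    return bitmaps


def support(itemset, bitmaps):
    """AND the item bitmaps together, then count the set bits."""
    items = iter(itemset)
    mask = bitmaps[next(items)]
    for item in items:
        mask &= bitmaps[item]
    return bin(mask).count("1")


def count_chunk(chunk, bitmaps, min_support):
    """Worker task: keep the candidates in this chunk that meet min_support."""
    kept = {}
    for candidate in chunk:
        s = support(candidate, bitmaps)
        if s >= min_support:
            kept[candidate] = s
    return kept


def mine(transactions, min_support, workers=2):
    """Master loop (hypothetical): hand candidate chunks to workers, level by level."""
    bitmaps = build_bitmaps(transactions)
    items = sorted(bitmaps)
    frequent = {}
    candidates = [(item,) for item in items]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while candidates:
            # Split the current level's candidates round-robin across workers.
            chunks = [candidates[w::workers] for w in range(workers)]
            level = {}
            for partial in pool.map(
                count_chunk, chunks, [bitmaps] * workers, [min_support] * workers
            ):
                level.update(partial)
            frequent.update(level)
            # Extend each frequent itemset with every item larger than its last one.
            candidates = [
                c + (item,) for c in sorted(level) for item in items if item > c[-1]
            ]
    return frequent


transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
result = mine(transactions, min_support=3)
```

In CPython, threads share one interpreter lock, so this sketch shows how the work is divided rather than delivering a real speedup; a process pool or a cluster framework would be the closer analogue to the distributed setting the abstract mentions.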