Article Information

  • Title: AN EFFICIENT IMPLEMENTATION OF APRIORI ALGORITHM BASED ON HADOOP-MAPREDUCE MODEL
  • Authors: OTHMAN YAHYA; OSMAN HEGAZY; EHAB EZAT
  • Journal: International Journal of Reviews in Computing
  • Print ISSN: 2076-3328
  • Electronic ISSN: 2076-3336
  • Year of publication: 2012
  • Volume: 12
  • Pages: 59-67
  • Publisher: Little Lion Scientific Research and Developement
  • Abstract: Finding frequent itemsets is one of the most important tasks in data mining. The Apriori algorithm is the most established algorithm for finding frequent itemsets in a transactional dataset; however, it must scan the dataset many times and generate many candidate itemsets. When the dataset is huge, both memory use and computational cost can be very high. In addition, a single processor's memory and CPU resources are very limited, which makes the algorithm inefficient. Parallel and distributed computing are effective strategies for accelerating algorithm performance. In this paper, we implement an efficient MapReduce Apriori algorithm (MRApriori) based on the Hadoop-MapReduce model that needs only two phases (MapReduce jobs) to find all frequent k-itemsets, and we compare it with two existing algorithms that need either one or k phases (where k is the maximum length of the frequent itemsets) to find the same frequent k-itemsets. Experimental results show that the proposed MRApriori algorithm outperforms the other two algorithms.
  • Keywords: Hadoop; MapReduce; Parallel Computing; Distributed Computing; Apriori Algorithm; Frequent Itemset; Data Mining; Association Rule
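
The abstract states that MRApriori needs two MapReduce jobs but does not say what each job does. As a rough illustration only, the sketch below shows one common way to find all frequent itemsets in two passes: a partition-based scheme in which phase 1 mines each data split locally to build a candidate set and phase 2 counts those candidates over the whole dataset. This is an assumption, not the authors' implementation. It runs in plain Python without Hadoop, and the names local_apriori, two_phase_frequent_itemsets, splits, and min_support are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def min_count(min_support, n):
    # Smallest integer count that reaches min_support over n transactions.
    # The small tolerance guards against float rounding (e.g. 0.1 * 30).
    return max(1, math.ceil(min_support * n - 1e-9))


def local_apriori(transactions, threshold):
    """In-memory Apriori over one split; returns itemsets with count >= threshold."""
    transactions = [frozenset(t) for t in transactions]
    counts = Counter(frozenset([i]) for t in transactions for i in t)
    current = {s for s, c in counts.items() if c >= threshold}
    frequent = set(current)
    k = 2
    while current:
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = {c for c in candidates
                   if sum(1 for t in transactions if c <= t) >= threshold}
        frequent |= current
        k += 1
    return frequent


def two_phase_frequent_itemsets(splits, min_support):
    """splits: list of transaction lists; min_support: fraction in (0, 1]."""
    total = sum(len(s) for s in splits)
    # Phase 1 (map): mine each split with the threshold scaled to the split size.
    # Phase 1 (reduce): the union of the local results is the candidate set.
    candidates = set()
    for split in splits:
        candidates |= local_apriori(split, min_count(min_support, len(split)))
    # Phase 2 (map): count every candidate in every split.
    # Phase 2 (reduce): sum the counts and keep the globally frequent itemsets.
    counts = Counter()
    for split in splits:
        for t in map(frozenset, split):
            counts.update(c for c in candidates if c <= t)
    threshold = min_count(min_support, total)
    return {c: n for c, n in counts.items() if n >= threshold}


# Toy dataset: two splits of two transactions each, 50% minimum support.
splits = [[["a", "b", "c"], ["a", "b"]], [["a", "c"], ["b", "c"]]]
result = two_phase_frequent_itemsets(splits, 0.5)
```

In this scheme no frequent itemset is missed, because an itemset that meets the global threshold must meet the scaled threshold in at least one split. On the toy data, the itemset {a, b, c} becomes a candidate in phase 1 but is dropped in phase 2, since it occurs in only one of the four transactions.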