
Article Information

  • Title: Efficient Processing Distributed Joins with Bloomfilter using MapReduce
  • Authors: Changchun Zhang; Lei Wu; Jing Li
  • Journal: International Journal of Grid and Distributed Computing
  • Print ISSN: 2005-4262
  • Year of publication: 2013
  • Volume: 6
  • Issue: 3
  • Publisher: SERSC
  • Abstract: The MapReduce framework has been widely used to process and analyze large-scale datasets over large clusters. As an essential problem, the join operation among large clusters has attracted more and more attention in recent years due to the utilization of MapReduce. Many strategies have been proposed to improve the efficiency of distributed joins, among which the bloomfilter is a successful one. However, the bloomfilter's potential has not yet been fully exploited, especially in the MapReduce environment. In this paper, three strategies are presented to build the bloomfilter for large datasets using MapReduce. Based on these strategies, we design two algorithms for two-way join and one algorithm for multi-way join. The experimental results show that our algorithms can significantly improve the efficiency of current join algorithms. Moreover, cost models of these algorithms are characterized in order to find ways of improving the performance of two-way and multi-way joins.
  • Keywords: Bloomfilter; MapReduce; Query optimization; Cost model
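
The abstract does not spell out the paper's three filter-building strategies or its join algorithms, so the following is only a minimal, single-process sketch of the general idea behind a Bloom-filter-assisted join: build a filter over the join keys of the smaller relation, use it to discard non-matching tuples of the larger relation early (the step a map phase would perform), then run an exact join on the survivors. The names `BloomFilter`, `bloom_join`, `num_bits`, and `num_hashes` are hypothetical and chosen for illustration; they are not taken from the paper or from any MapReduce library.

```python
import hashlib


class BloomFilter:
    """A simple Bloom filter over strings (illustrative, not tuned)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting a SHA-256 digest.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_join(small, large, num_bits=1024, num_hashes=3):
    """Join two lists of (key, value) pairs, pre-filtering `large`.

    Returns the joined tuples and the number of `large` tuples that
    passed the filter (assumption: this stands in for shuffle volume).
    """
    # Phase 1: build the filter from the join keys of the smaller relation.
    bf = BloomFilter(num_bits, num_hashes)
    for key, _ in small:
        bf.add(str(key))

    # Phase 2 ("map"): drop tuples of the larger relation whose key
    # cannot match; false positives may survive this step.
    survivors = [(k, v) for k, v in large if bf.might_contain(str(k))]

    # Phase 3 ("reduce"): exact hash join on the survivors, which
    # removes any false positives the filter let through.
    index = {}
    for k, v in small:
        index.setdefault(k, []).append(v)
    joined = [(k, sv, lv) for k, lv in survivors for sv in index.get(k, [])]
    return joined, len(survivors)
```

In a real MapReduce job the filter would typically be built in one pass and distributed to the mappers that scan the larger relation; how that is done, and at what cost, is what the paper's strategies and cost models address.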