Article Information

  • Title: Improving Hadoop Performance by Decouple Shuffle and Reduce with MapReduce
  • Authors: Sindhuja.R; Priyadharshini.P
  • Journal: International Journal of Innovative Research in Computer and Communication Engineering
  • Print ISSN: 2320-9798
  • Electronic ISSN: 2320-9801
  • Publication year: 2018
  • Volume: 6
  • Issue: 1
  • Page: 302
  • DOI: 10.15680/IJIRCCE.2017.0601051
  • Publisher: S&S Publications
  • Abstract: Hadoop parallelizes job execution with map and reduce tasks. Shuffle, the all-to-all input data fetch phase in a reduce task, can significantly affect job performance. The delay in job completion is attributed to the coupling of the shuffle phase and reduce tasks, which fails to address data distribution skew among reduce tasks and makes task scheduling inefficient. This work proposes to decouple shuffle from reduce tasks and convert it into a platform service provided by Hadoop. It presents iShuffle, a user-transparent shuffle service that proactively pushes map output data to nodes via a novel shuffle-on-write operation and flexibly schedules reduce tasks with workload balance in mind.
  • Keywords: MapReduce; iShuffle; Data distribution skew; Task scheduling
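
The abstract mentions scheduling reduce tasks with workload balance in mind but gives no algorithm. As a rough illustration only, the sketch below uses a generic greedy heuristic (largest partition first, onto the least-loaded node). This is a stand-in technique, not the method described in the paper; the function names `place_partitions` and `node_loads`, the partition sizes, and the node names are all hypothetical.

```python
import heapq


def place_partitions(partition_sizes, nodes):
    """Assign each partition to a node with a greedy heuristic.

    Assumption: this is a generic largest-first, least-loaded-node
    heuristic, not the placement algorithm used by iShuffle.
    """
    # Min-heap of (current load, node name); every node starts empty.
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)

    placement = {}
    # Visit partitions from largest to smallest; ties broken by partition id.
    for pid, size in sorted(partition_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)  # least-loaded node so far
        placement[pid] = node
        heapq.heappush(heap, (load + size, node))
    return placement


def node_loads(partition_sizes, placement):
    """Sum the partition sizes assigned to each node."""
    loads = {}
    for pid, node in placement.items():
        loads[node] = loads.get(node, 0) + partition_sizes[pid]
    return loads


# Hypothetical skewed input: one partition is three times the others.
sizes = {"p0": 90, "p1": 30, "p2": 30, "p3": 30}
placement = place_partitions(sizes, ["n1", "n2"])
print(node_loads(sizes, placement))
```

With this skewed input, the large partition is placed alone on one node and the three small ones share the other, so both nodes end up with a load of 90. A plain round-robin assignment of the same partitions would give loads of 120 and 60.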