Article Information

  • Title: Data-Replicas Scheduler for Heterogeneous MapReduce Cluster
  • Authors: Yang, Yang; Shi, Biaobiao; Jiang, Bo
  • Journal: Journal of Networks
  • Print ISSN: 1796-2056
  • Publication year: 2013
  • Volume: 8
  • Issue: 5
  • Pages: 1096-1103
  • DOI: 10.4304/jnw.8.5.1096-1103
  • Language: English
  • Publisher: Academy Publisher
  • Abstract: Large-scale data processing has grown rapidly in recent years. The MapReduce programming model, which originated in functional languages, has been applied to distributed systems and has performed well in large-scale data processing since 2006. Hadoop, the most popular open-source MapReduce runtime framework, provides a reliable, scalable, distributed system that processes large-scale data across clusters of computers using this programming model. In this system, files are split into many blocks, and every block is replicated across several computers in the cluster. To process these blocks efficiently, each job runs in parallel and is divided into many tasks, each of which handles one file block. To make full use of network bandwidth in these systems, data locality is receiving increasing attention. Considering the existence of replicated data blocks, we propose a data-replicas scheduler that covers both task scheduling and data allocation. The scheduler takes full advantage of the data replicas on the local DataNode, reduces the cost of data transfer, and improves system performance. Experimental results show that our scheduler not only improves the CPU ratio but also reduces the number of packets transferred over the network.
  • Keywords: Hadoop; virtual machine; data locality