Article Information

  • Title: Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Implemented on Different Hadoop Platforms
  • Authors: Purvi Parmar; MaryEtta Morris; John R. Talburt
  • Journal: International Journal of Computer Science & Information Technology (IJCSIT)
  • Print ISSN: 0975-4660
  • Online ISSN: 0975-3826
  • Year of publication: 2020
  • Volume: 12
  • Issue: 4
  • Pages: 27-34
  • DOI: 10.5121/ijcsit.2020.12403
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: This paper describes the outcome of an attempt to implement the same transitive closure (TC) algorithm for Apache MapReduce running on different Apache Hadoop distributions. Apache MapReduce is a software framework used with Apache Hadoop, which has become the de facto standard platform for processing and storing large amounts of data in a distributed computing environment. The research presented here focuses on the variations observed among the results of an efficient iterative transitive closure algorithm when run against different distributed environments. The results from these comparisons were validated against the benchmark results from OYSTER, an open source Entity Resolution system. The experiment results highlighted the inconsistencies that can occur when using the same codebase with different implementations of Map Reduce.
  • Keywords: Entity Resolution; Hadoop; MapReduce; Transitive Closure; HDFS; Cloudera; Talend