
Article information

  • Title: Heterogeneous Computing Based K-Means Clustering Using Hadoop-MapReduce Framework
  • Authors: Sandip A. Ganage; Dr. R. C. Thool; Heshsham Abdul Basit
  • Journal: International Journal of Advanced Research In Computer Science and Software Engineering
  • Print ISSN: 2277-6451
  • Electronic ISSN: 2277-128X
  • Publication year: 2013
  • Volume: 3
  • Issue: 6
  • Publisher: S.S. Mishra
  • Abstract: K-means is a well-known clustering algorithm in the field of data mining. It is simple to implement, and its speed allows it to run on large data sets. However, it also has a drawback. Advances in data collection techniques are generating enormous amounts of data, leaving scientists with the challenging task of processing them, and the performance of k-means is not sufficient for data sets of that size. To solve this problem, this paper proposes a method in which k-means is implemented on the OpenCL heterogeneous computing platform with the help of the Hadoop-MapReduce framework. MapReduce is a framework for distributed programming pioneered by Google. It includes user-specified Map and Reduce functions, which process inputs in the form of key/value pairs. Along with the MapReduce paradigm, Hadoop also implements HDFS, a distributed file system. GPU computing with many-core graphics processors plays an important role today in the advancement of modern highly concurrent processors. Their ability to accelerate computation is being explored in several scientific fields. OpenCL is a heterogeneous computing platform and one of the most widely used for GPU computing. In this paper we present the acceleration of a widely used data clustering algorithm, K-means, implemented using the Hadoop MapReduce framework, in the context of heterogeneous computing devices such as CPUs and GPUs.
  • Keywords: Hadoop; MapReduce; GPU; OpenCL; HDFS
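
The abstract describes k-means expressed as user-specified Map and Reduce functions over key/value pairs. The following is a minimal single-process sketch of one such iteration, not the authors' implementation: the function names (map_point, reduce_cluster, kmeans_iteration) are hypothetical, the in-memory dictionary merely stands in for Hadoop's shuffle phase, and no OpenCL or GPU offloading is shown.

```python
from collections import defaultdict


def map_point(point, centroids):
    """Map step (hypothetical name): emit (nearest centroid index, (point, 1))."""
    nearest = min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )
    return nearest, (point, 1)


def reduce_cluster(values):
    """Reduce step (hypothetical name): average the points that share one key."""
    total = None
    count = 0
    for point, n in values:
        if total is None:
            total = list(point)
        else:
            total = [t + p for t, p in zip(total, point)]
        count += n
    return tuple(t / count for t in total)


def kmeans_iteration(points, centroids):
    """One k-means iteration; the dict below stands in for the shuffle phase."""
    groups = defaultdict(list)
    for point in points:
        key, value = map_point(point, centroids)
        groups[key].append(value)
    # A centroid that received no points keeps its previous position.
    return [
        reduce_cluster(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    centroids = [(1.0, 1.0), (9.0, 9.0)]
    print(kmeans_iteration(points, centroids))
```

In a real Hadoop job the map and reduce steps would run on separate nodes, and the per-point distance computation in the map step is the part a heterogeneous setup would most plausibly hand to a GPU; that mapping is an assumption here, since the abstract does not say how the work is divided.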