Article Information

  • Title: Research of Performance of Distributed Platforms Based on Clustering Algorithm
  • Authors: Di Jian; Yanfeng Peng
  • Journal: Journal of Computers
  • Print ISSN: 1796-203X
  • Publication year: 2016
  • Volume: 11
  • Issue: 3
  • Pages: 195-200
  • DOI: 10.17706/jcp.11.3.195-200
  • Publisher: Academy Publisher
  • Abstract: With the continued development and application of Internet technology, ever larger volumes of data need to be processed. Spark is a versatile, high-performance parallel computing framework that can be applied to data mining. This paper parallelizes the K-means algorithm on two distributed platforms, builds a YARN cluster environment, and runs experiments to compare the platforms' performance. The results show that the combination of Spark and YARN produces more effective clustering results and takes less time to execute, making it better suited to cluster analysis of big data.
  • Other keywords: Clustering algorithm, distributed platforms, research of performance.