Article Information

  • Title: Replica Scheduling Strategy for Streaming Data Mining
  • Authors: Shufan Li; Siyuan Yu; Fang Xiao
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Electronic ISSN: 2156-5570
  • Publication year: 2022
  • Volume: 13
  • Issue: 5
  • DOI: 10.14569/IJACSA.2022.0130503
  • Language: English
  • Publisher: Science and Information Society (SAI)
  • Abstract: In a distributed storage and computing framework, traditional streaming data mining techniques are inefficient when processing massive amounts of data. In this paper, we treat data replicas in cloud storage as an allocatable resource for scheduling and propose the RepRM strategy to improve the efficiency of data mining and analysis. The key idea is to treat the data replica as the resource to be allocated and to use the backward inference method of dynamic programming to solve for the replica ratio, which yields the optimal number of replicas. Experiments show that, compared with the traditional Hadoop scheduling method, scheduling with the RepRM strategy saves about 40% to 50% of the memory resources of a homogeneous cluster during parallel mining of streaming data and increases throughput by 20% to 30%.
  • Keywords: Streaming data mining; dynamic programming; replica scheduling strategy; cloud computing
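
The abstract names backward inference in dynamic programming as the way the number of replicas is determined, but this record does not give the RepRM formulation itself. The sketch below only illustrates the general technique on a textbook resource-allocation problem: a fixed budget of replicas is divided among data blocks so that total benefit is maximal. The function name `allocate_replicas`, the `gain` table, and its values are hypothetical and are not taken from the paper.

```python
def allocate_replicas(gain, total):
    """Split `total` replicas among blocks by backward dynamic programming.

    gain[i][r] is an assumed (hypothetical) benefit of giving block i
    exactly r replicas, for r = 0..total. Returns (best_value, allocation).
    """
    n = len(gain)
    # best[i][k]: maximum benefit from blocks i..n-1 with k replicas left.
    # Row n is the terminal stage: no blocks remain, so the benefit is 0.
    best = [[0.0] * (total + 1) for _ in range(n + 1)]
    # choice[i][k]: number of replicas given to block i in that optimum.
    choice = [[0] * (total + 1) for _ in range(n)]

    # Backward pass: solve the last block first, then move toward block 0.
    for i in range(n - 1, -1, -1):
        for k in range(total + 1):
            best_val, best_r = float("-inf"), 0
            for r in range(k + 1):
                val = gain[i][r] + best[i + 1][k - r]
                if val > best_val:
                    best_val, best_r = val, r
            best[i][k] = best_val
            choice[i][k] = best_r

    # Forward pass: read the optimal decisions back out of the table.
    allocation, k = [], total
    for i in range(n):
        r = choice[i][k]
        allocation.append(r)
        k -= r
    return best[0][total], allocation


# Hypothetical benefit table: 3 blocks, up to 4 replicas each.
example_gain = [
    [0, 5, 8, 9, 9],
    [0, 4, 8, 9, 10],
    [0, 6, 7, 7, 7],
]
value, allocation = allocate_replicas(example_gain, 4)
print(value, allocation)
```

With this made-up table the optimum gives one replica to the first block, two to the second, and one to the third. In a real setting the benefit table would have to come from measured or modeled cluster behavior, which this record does not describe.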