
Article Information

  • Title: SnIClustering Algorithm based on Sampling Filtering under the MapReduce Framework
  • Authors: Fei Yang; Wan-zhen Zhang; Wei Dai
  • Journal: International Journal of Hybrid Information Technology
  • Print ISSN: 1738-9968
  • Year: 2015
  • Volume: 8
  • Issue: 2
  • Pages: 301-310
  • DOI: 10.14257/ijhit.2015.8.2.28
  • Publisher: SERSC
  • Abstract: The SnIClustering algorithm is proposed to reduce the large volume of intermediate values generated during MapReduce processing. It first selects a small set of representative data points through cluster sampling, then filters the data according to its distribution characteristics and retains only the useful points. This sharply reduces the intermediate values of MapReduce, which saves time and eases network load. Finally, the retained data and the samples are clustered together. Experimental results show that SnIClustering is suitable for large-scale data: it processes such data in a short time while maintaining good clustering quality.
  • Keywords: Data mining; MapReduce; Clustering; Hadoop
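
The abstract names three stages: sampling, filtering, and a final clustering pass. The following is a minimal single-machine sketch of that pipeline, not the authors' implementation. Several choices are assumptions made purely for illustration: plain k-means stands in for the clustering step, simple random sampling stands in for cluster sampling, and the filter treats a point as "useful" when it lies farther than a hypothetical `radius` from every rough center found on the sample. The function names and parameters (`sample_filter_cluster`, `sample_size`, `radius`) do not come from the paper.

```python
import math
import random


def nearest(point, centers):
    """Return (index, distance) of the center closest to point."""
    dists = [math.dist(point, c) for c in centers]
    idx = min(range(len(centers)), key=dists.__getitem__)
    return idx, dists[idx]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; an assumed stand-in for the clustering step."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)[0]].append(p)
        # Keep the old center if a group ends up empty.
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def sample_filter_cluster(points, k, sample_size, radius, seed=0):
    """Sample, filter, then cluster the retained points plus the sample."""
    rng = random.Random(seed)
    # 1. Sampling: cluster a small random sample to get rough centers.
    sample = rng.sample(points, sample_size)
    rough = kmeans(sample, k, seed=seed)
    # 2. Filtering (the map side in a MapReduce job): emit only points that
    #    lie farther than `radius` from every rough center. This criterion
    #    is an assumption; the abstract does not specify the filter.
    retained = [p for p in points if nearest(p, rough)[1] > radius]
    # 3. Final clustering (the reduce side) over retained points and sample.
    final = kmeans(retained + sample, k, seed=seed)
    return final, len(retained)


# Usage: two synthetic 2-D blobs around (0, 0) and (10, 10).
gen = random.Random(1)
points = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)]
points += [(gen.gauss(10, 1), gen.gauss(10, 1)) for _ in range(200)]
centers, kept = sample_filter_cluster(points, k=2, sample_size=40, radius=1.5)
print(sorted(centers), kept, "of", len(points), "points passed the filter")
```

In this sketch only the points that pass the filter would cross the network as intermediate values, which is the saving the abstract describes. How much is saved depends entirely on the assumed `radius` and on how the data is distributed.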