Article Information

Title: The Large-scale Dynamic Data Rapid Reduction Algorithm Based on Map-Reduce
Authors: Yuan, Jing-ling; Xie, Jing; Yuan, Yan; et al.
Journal: Journal of Software
Print ISSN: 1796-217X
Publication year: 2014
Volume: 9
Issue: 4
Pages: 1028-1035
DOI: 10.4304/jsw.9.4.1028-1035
Language: English
Publisher: Academy Publisher
Abstract: With the advent of the era of "Big Data", applications of large-scale data are becoming popular, and using and analyzing such data efficiently has become an important problem. Traditional knowledge reduction algorithms read a small data sample into main memory all at once for reduction, which is not suitable for large-scale data. This paper takes large-scale dynamic sensor monitoring data as its research object and puts forward an incremental reduction algorithm based on Map-Reduce. Using a hash-based fast partitioning strategy, the algorithm divides the initial data set into multiple sub-datasets for computation, which greatly reduces the time and space complexity at each node. Finally, experiments on data sets from the UCI Machine Learning Repository, run on the Hadoop platform, show that the algorithm is more efficient and better suited to large-scale dynamic data. Compared with the traditional algorithm, the parallel algorithm achieves a speedup of up to 1.55 times.
Keywords: Large-scale dynamic data; increment knowledge reduction; Hash algorithm; Map-Reduce
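
The hash partitioning idea in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes a rough-set style decision table in which each record has condition attributes and a `decision` field, and it computes one quantity commonly used in knowledge reduction: the number of records whose condition values determine the decision unambiguously. The names `map_phase`, `reduce_phase`, `consistent_record_count`, and the `decision` field are hypothetical.

```python
from collections import defaultdict
import zlib


def partition_key(record, attrs):
    """Join the chosen condition-attribute values into one string key."""
    return "|".join(str(record[a]) for a in attrs)


def map_phase(records, attrs, n_partitions):
    """Map step (assumed): route each record to a sub-dataset by a stable hash.

    Records with the same key always land in the same partition, so each
    partition can be reduced independently of the others.
    """
    partitions = defaultdict(list)
    for rec in records:
        key = partition_key(rec, attrs)
        pid = zlib.crc32(key.encode("utf-8")) % n_partitions
        partitions[pid].append((key, rec["decision"]))
    return partitions


def reduce_phase(partition):
    """Reduce step (assumed): count records in groups with a single decision."""
    decisions = defaultdict(set)
    sizes = defaultdict(int)
    for key, decision in partition:
        decisions[key].add(decision)
        sizes[key] += 1
    return sum(sizes[k] for k, ds in decisions.items() if len(ds) == 1)


def consistent_record_count(records, attrs, n_partitions=4):
    """Sum the per-partition results, as a final merge step would."""
    parts = map_phase(records, attrs, n_partitions)
    return sum(reduce_phase(p) for p in parts.values())


# Hypothetical toy decision table, not data from the paper.
records = [
    {"a": 1, "b": 0, "decision": "x"},
    {"a": 1, "b": 0, "decision": "y"},
    {"a": 0, "b": 1, "decision": "x"},
    {"a": 0, "b": 0, "decision": "y"},
]
```

Because each key maps to exactly one partition, the result does not depend on `n_partitions`. On a real cluster the two phases would correspond to Hadoop mapper and reducer tasks rather than in-memory function calls.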