Journal: International Journal of Grid and Distributed Computing
Print ISSN: 2005-4262
Publication year: 2014
Volume: 7
Issue: 6
Pages: 53-70
DOI: 10.14257/ijgdc.2014.7.6.05
Publisher: SERSC
Abstract: MapReduce, a large-scale data processing paradigm, is gaining popularity. However, like other distributed computing frameworks, MapReduce is vulnerable to integrity attacks: malicious workers in a MapReduce cluster can tamper with their computation results and thereby render the overall job result inaccurate. Existing solutions are effective against non-collusive malicious workers but less effective at detecting collusive ones. In this paper, we propose the Verification-based Integrity Assurance Framework (VIAF). By combining task replication with probabilistic result verification, VIAF can detect both non-collusive and collusive workers, even when malicious workers dominate the environment. We have implemented VIAF on Hadoop, an open source MapReduce implementation. Our theoretical analysis and experimental results show that VIAF can achieve high job accuracy while imposing moderate performance overhead.
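
The abstract names two mechanisms, task replication and probabilistic result verification, without giving details. The following is a minimal sketch of how the two could fit together. It is an illustration under stated assumptions, not the VIAF implementation: the function names (`run_task`, `verify`, `process`), the replication factor of two, the squaring workload, and the model in which colluding workers return an identical wrong answer are all hypothetical.

```python
import random


def run_task(task, worker):
    """Return the result a worker reports for a task (hypothetical model)."""
    honest = task * task  # stand-in for the real map/reduce computation
    if worker["malicious"]:
        # Assumption: colluding workers agree on the same wrong answer.
        return honest + 1
    return honest


def verify(task):
    """Trusted verifier recomputes the task (assumed always correct)."""
    return task * task


def process(task, workers, verify_prob, rng):
    """Replicate a task on two distinct workers, then verify agreeing
    results with probability verify_prob."""
    a, b = rng.sample(workers, 2)
    result_a, result_b = run_task(task, a), run_task(task, b)
    if result_a != result_b:
        # Replication alone exposes a cheater who does not collude.
        return "mismatch"
    if rng.random() < verify_prob and result_a != verify(task):
        # Agreeing but wrong results are only exposed by verification.
        return "caught"
    return "accepted"


if __name__ == "__main__":
    rng = random.Random(0)
    colluders = [{"malicious": True}, {"malicious": True}]
    print(process(3, colluders, verify_prob=1.0, rng=rng))
```

In this sketch, replication catches disagreement between an honest and a malicious replica, while colluding replicas pass the comparison and are caught only when the verification step is triggered. Raising `verify_prob` trades extra verifier work for a higher chance of catching collusion, which mirrors the accuracy versus overhead trade-off the abstract mentions.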