Journal: Advanced Computing : an International Journal
Print ISSN: 2229-726X
Electronic ISSN: 2229-6727
Publication year: 2012
Volume: 3
Issue: 3
DOI: 10.5121/acij.2012.3302
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: With the traditional Message Passing Interface (MPI) approach, it is difficult to implement synchronization and coordination and to prevent deadlocks in distributed systems. Apache's Hadoop/MapReduce and ZooKeeper lessen this difficulty by providing fault tolerance in a homogeneously distributed hardware/software environment. This paper presents a mathematical model for the availability of the JobTracker in Hadoop/MapReduce when ZooKeeper's Leader Election Service is used. Although the availability is lower than expected of f+1 fault-tolerant systems under crash failures, the approach makes coordination and synchronization easy, reduces the effect of Byzantine faults, and provides fault tolerance for distributed systems. The results show that availability varies with the number of ZooKeeper servers. The model can help determine how many servers are optimal for high availability, from which vendor they should be purchased, and when to use a ZooKeeper-coordinated Hadoop cluster for safety-critical tasks.