Journal title: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Electronic ISSN: 2320-9801
Publication year: 2015
Volume: 3
Issue: 10
DOI: 10.15680/IJIRCCE.2015.0310037
Publisher: S&S Publications
Abstract: Hadoop is an open-source, Java-based implementation of Google's MapReduce framework. Hadoop is designed for any application that can take advantage of massively parallel distributed processing, particularly on clusters composed of unreliable hardware. The term "big data" is pervasive, yet the notion still engenders confusion. Big data has been used to convey all sorts of concepts, including huge quantities of data, social media analytics, next-generation data management capabilities, real-time data, and much more. The HDFS architecture of Hadoop implements the mapping and reducing of data across clusters and then reduces the space. In this paper we present an overview of open-source Hadoop with Hortonworks Pig, Hive, and InfoSphere.