Journal: International Journal of Computer Trends and Technology
Electronic ISSN: 2231-2803
Publication year: 2017
Volume: 48
Issue: 1
Pages: 36-40
DOI: 10.14445/22312803/IJCTT-V48P109
Publisher: Seventh Sense Research Group
Abstract: In recent years, new technologies have been producing large quantities of data every day. Companies face the problems of collecting, storing, analyzing, and exploiting these large volumes of data in order to create added value. The central issue, for companies and administrations alike, is not to overlook valuable information drowned in the mass. This is where "Big Data" technology comes in. This technology is based on very fine-grained analysis of masses of data. Several vendors offer ready-to-use distributions for managing a Big Data system, namely Hortonworks [1], Cloudera [2], MapR [3], IBM InfoSphere BigInsights [4], Pivotal HD [5], Microsoft HDInsight [6], etc. Each distribution takes a different approach and positioning with respect to its vision of a Hadoop platform. These solutions are Apache projects and are therefore available on their own. Yet the interest of a complete package lies in the compatibility between components, the simplicity of installation, support, etc. In this article, we discuss the world of Big Data by defining its characteristics and its architecture. We then present some Hadoop distributions, and finally we conclude with a comparative study of the top five suppliers of Hadoop distributions for Big Data.
Keywords: Big Data; 5 V's; Hadoop distribution; comparison