
Article information

  • Title: Comparative Study of Hive and Map Reduce to Analyze Big Data
  • Authors: Nisha Bhardwaj; Dr. Balkishan; Dr. Anubhav Kumar
  • Journal: International Journal of Computer Science & Technology
  • Print ISSN: 2229-4333
  • Online ISSN: 0976-8491
  • Year of publication: 2015
  • Volume: 6
  • Issue: 3
  • Pages: 75-81
  • Language: English
  • Publisher: Ayushmaan Technologies
  • Abstract: Big data is the combination of large datasets, and managing datasets of this size is very difficult, so new techniques are required to handle them. The challenge is to collect or extract the data from multiple sources, process or transform it according to the analytical need, and then load it for analysis; this process is known as "Extract, Transform and Load" (ETL). In this paper, we first set up Hadoop in pseudo-distributed mode and then implement Hive on Hadoop to analyze a large dataset. We use the Book-Crossing dataset and take only the BX-Books.csv file from it. Over this dataset, we run a query from the Hive command line to calculate the number of books published each year. The Hive code is then compared with the MapReduce code. Finally, the paper shows how Hive is better than MapReduce.
  • Keywords: Hadoop; Hive; Map Reduce; Hadoop Distributed File System
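
The per-year book count described in the abstract can be illustrated with a small local sketch of the map, shuffle, and reduce steps. This is not the paper's code: it runs in plain Python rather than on Hadoop, the sample rows are made-up placeholders, and the file layout (semicolon-delimited, quoted fields, a "Year-Of-Publication" header column) is an assumption about BX-Books.csv. In Hive the same aggregation would typically be a single GROUP BY query, for example `SELECT year, COUNT(*) FROM books GROUP BY year;`, where the table and column names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows for illustration only; not taken from the real dataset.
# Assumed layout: semicolon-delimited, quoted fields, header row present.
SAMPLE = '''"ISBN";"Book-Title";"Book-Author";"Year-Of-Publication";"Publisher"
"0000000001";"Title A";"Author A";"2001";"Publisher X"
"0000000002";"Title B";"Author B";"1999";"Publisher Y"
"0000000003";"Title C";"Author C";"2001";"Publisher X"
'''


def map_phase(rows):
    """Emit a (year, 1) pair for each record with a numeric year."""
    for row in rows:
        year = (row.get("Year-Of-Publication") or "").strip()
        if year.isdigit():
            yield year, 1


def shuffle(pairs):
    """Group the emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values to get the number of books per year."""
    return {year: sum(values) for year, values in groups.items()}


def books_per_year(text):
    """Run the three steps over CSV text and return {year: count}."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";", quotechar='"')
    return reduce_phase(shuffle(map_phase(reader)))


if __name__ == "__main__":
    for year, count in sorted(books_per_year(SAMPLE).items()):
        print(year, count)
```

The explicit mapper, grouping step, and reducer show the hand-written work that the MapReduce version needs, which a declarative Hive query expresses in one statement.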