
Article Information

  • Title: Statistical Analysis of Web Server Logs Using Apache Hive in Hadoop Framework
  • Authors: Harish S; Kavitha G
  • Journal: International Journal of Innovative Research in Computer and Communication Engineering
  • Print ISSN: 2320-9798
  • Online ISSN: 2320-9801
  • Publication year: 2015
  • Volume: 3
  • Issue: 5
  • DOI: 10.15680/ijircce.2015.0305074
  • Publisher: S&S Publications
  • Abstract: A web log file is a log file automatically created and maintained by a web server. Analyzing web server access log files offers valuable insight into website usage. Because of heavy web usage, web log files are growing rapidly and becoming very large. Processing these fast-growing log files with relational database technology has hit a bottleneck. Analyzing such large datasets requires a parallel processing system and a reliable data storage mechanism. Hadoop handles big data by processing massive quantities of information on a cluster of commodity hardware. In this paper, based on the architecture of the Hadoop Distributed File System, the Hadoop MapReduce framework, and the HiveQL query language, we present the methodology used to preprocess huge volumes of web log files, compute website statistics, and learn user behavior.
  • Keywords: big data; Hadoop; MapReduce; web server logs; log analysis; Hive
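
The preprocessing and statistics steps mentioned in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' Hadoop or Hive pipeline: it assumes the logs are in Common Log Format (the abstract does not state the format), and the names `LOG_PATTERN`, `parse_line`, and `summarize` are hypothetical, chosen for illustration only. The idea is the same at small scale: parse each line into fields, drop malformed lines, then aggregate counts.

```python
import re
from collections import Counter

# Assumption: Common Log Format, for example
# 127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Preprocessing step: turn one raw log line into a dict, or None if malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A size of "-" means no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def summarize(lines):
    """Aggregation step: count hits per path and per status, and total bytes served."""
    hits_by_path = Counter()
    hits_by_status = Counter()
    total_bytes = 0
    skipped = 0
    for line in lines:
        record = parse_line(line)
        if record is None:
            skipped += 1  # malformed lines are dropped during preprocessing
            continue
        hits_by_path[record["path"]] += 1
        hits_by_status[record["status"]] += 1
        total_bytes += record["size"]
    return {
        "hits_by_path": hits_by_path,
        "hits_by_status": hits_by_status,
        "total_bytes": total_bytes,
        "skipped": skipped,
    }
```

In a Hive setting, the per-line parsing would typically be expressed in the table definition and the counting as `GROUP BY` queries; the sketch above only shows the shape of the computation.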