Article Information

  • Title: Analysis of Log Data and Statistics Report Generation Using Hadoop
  • Authors: Siddharth Adhikari; Devesh Saraf; Mahesh Revanwar
  • Journal: International Journal of Innovative Research in Computer and Communication Engineering
  • Print ISSN: 2320-9798
  • Electronic ISSN: 2320-9801
  • Publication year: 2014
  • Volume: 2
  • Issue: 4
  • Publisher: S&S Publications
  • Abstract: A web log analyzer is a tool for finding the statistics of web sites. Through the web log analyzer, web log files are uploaded into the Hadoop distributed framework, where they are processed in parallel in a master-slave structure. Pig scripts are written on the classified log files to satisfy specific queries. The log files are maintained by the web servers. Analysing these log files gives an idea about the users, such as which IP addresses have generated the most errors and which users visit a web page frequently. This paper discusses these log files, their formats, access procedures, and uses; the additional parameters that can be used in the log files, which in turn enable effective mining; and the tools used to process the log files. It also presents the idea of creating an extended log file and learning user behaviour. Analysing user activities is particularly useful for studying user behaviour when using highly interactive systems. This paper presents the details of the methodology used, in which the focus is on studying the information-seeking process and on finding log errors and exceptions. The next part of the paper describes the working and techniques used by the web log analyzer.
  • Keywords: Hadoop; MapReduce; Pig; Web log files