
Article Information

  • Title: Agglomerative Approach for Identification and Elimination of Web Robots from Web Server Logs to Extract Knowledge about Actual Visitors
  • Authors: Dilip Singh Sisodia, Shrish Verma, Om Prakash Vyas
  • Journal: Journal of Data Analysis and Information Processing
  • Print ISSN: 2327-7211
  • Online ISSN: 2327-7203
  • Publication year: 2015
  • Volume: 03
  • Issue: 01
  • Pages: 1-10
  • DOI: 10.4236/jdaip.2015.31001
  • Language: English
  • Publisher: Scientific Research Publishing
  • Abstract: In this paper we investigate the effectiveness of ensemble-based learners for identifying web robot sessions in web server logs. We also perform multi-fold robot session labeling to improve the performance of the learner. We conduct a comparative study of various ensemble methods (Bagging, Boosting, and Voting) against simple classifiers from the perspective of classification. We also evaluate the effectiveness of these classifiers (both ensemble and simple) on five different data sets of varying session length. At present, the results of web server log analyzers are not very reliable because the input log files are heavily inflated by sessions of automated web traversal software, known as web robots. The presence of web robot traffic entries in web server log repositories makes it very challenging to extract any actionable and usable knowledge about the browsing behavior of actual visitors. Web robot sessions therefore need accurate and fast detection in web server log repositories, both to extract knowledge about genuine visitors and to produce correct log analyzer results.
  • Keywords: Web Robots; Web Server Log Repositories; Ensemble Learning; Bagging; Boosting; Voting; Actionable Knowledge; Usable Knowledge; Browsing Behavior; Genuine Visitors