Article Information

  • Title: AN OVERVIEW OF PREPROCESSING OF WEB LOG FILES FOR WEB USAGE MINING
  • Authors: C.P. SUMATHI ; R. PADMAJA VALLI ; T. SANTHANAM
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2011
  • Volume: 34
  • Issue: 1
  • Pages: 088-095
  • Publisher: Journal of Theoretical and Applied
  • Abstract: With Internet usage gaining popularity and the number of users growing steadily, the World Wide Web has become a huge repository of data and serves as an important platform for the dissemination of information. Users' accesses to Web sites are stored in Web server logs. However, the data stored in the log files do not present an accurate picture of the users' accesses to the Web site. Hence, preprocessing of the Web log data is an essential prerequisite phase before the data can be used for knowledge-discovery or mining tasks. The preprocessed Web data are then suitable for the discovery and analysis of useful information, referred to as Web mining. Web usage mining, a category of Web mining, is the application of data mining techniques to discover usage patterns from clickstream and associated data stored in one or more Web servers. This paper presents an overview of the various steps involved in the preprocessing stage.
  • Keywords: Web Server; Data Cleaning; User Identification; Session Identification; Path Completion