
Article Information

  • Title: An Enhanced Pre-Processing Research Framework for Web Log Data Using a Learning Algorithm
  • Authors: V.V.R. Maheswara Rao; V. Valli Kumari
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Year of publication: 2013
  • Volume: 3
  • Issue: 2
  • Pages: 01-15
  • DOI:10.5121/csit.2011.1101
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: With the continued growth and proliferation of Web services and Web-based information systems, the volume of user data has reached astronomical proportions. Before such data can be analyzed with web mining techniques, the web log has to be pre-processed, integrated and transformed. As the World Wide Web is growing continuously and rapidly, web miners need intelligent tools to find, extract, filter and evaluate the desired information. Data pre-processing is the most important phase in investigating web user usage behaviour. It requires extracting only the human user accesses from the web log data, a task that is critical and complex. Because the web log is incremental in nature, conventional data pre-processing techniques have proved unsuitable. Hence an extensive learning algorithm is required to obtain the desired information. This paper introduces an extensive research framework capable of pre-processing web log data completely and efficiently. The learning algorithm of the proposed framework can separate human user accesses from search engine accesses intelligently and in less time. To create suitable target data, the further essential pre-processing tasks of Data Cleansing, User Identification, Sessionization and Path Completion are designed collectively. The framework reduces the error rate and significantly improves the learning performance of the algorithm. The work ensures the goodness of split by using popular measures such as Entropy and the Gini index. This framework helps to investigate web user usage behaviour efficiently. Experimental results supporting this claim are given in the paper.
  • Keywords: Web usage mining; intelligent pre-processing system; cleansing; sessionization and
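
The abstract names Entropy and the Gini index as the measures used to judge the goodness of a split. The sketch below shows the standard textbook definitions of both measures applied to a two-way split of labelled log entries. It is not the authors' implementation: the function names, the "human"/"robot" labels and the example split are assumptions made only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_impurity(left, right, measure):
    """Size-weighted impurity of a two-way split; lower means a better split."""
    n = len(left) + len(right)
    if n == 0:
        return 0.0
    return (len(left) / n) * measure(left) + (len(right) / n) * measure(right)


# Hypothetical labelled accesses, split on an assumed attribute
# (for example, whether the client requested robots.txt).
left = ["robot", "robot", "robot"]
right = ["human", "human", "human", "robot"]

entropy_after_split = split_impurity(left, right, entropy)
gini_after_split = split_impurity(left, right, gini)
```

A split that sends every robot access to one side and every human access to the other would score 0 under both measures, while an evenly mixed group of two classes scores 1 bit of entropy and a Gini index of 0.5.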