
Article Information

  • Title: Mining Sequential Access Pattern with Low Support From Large Pre-Processed Web Logs
  • Authors: Vijayalakshmi, S.; Mohan, V.
  • Journal: Journal of Computer Science
  • Print ISSN: 1549-3636
  • Year of publication: 2010
  • Volume: 6
  • Issue: 11
  • Pages: 1293-1300
  • DOI: 10.3844/jcssp.2010.1293.1300
  • Publisher: Science Publications
  • Abstract: Problem statement: The goal is to find frequently occurring sequential patterns in a web log file, given a minimum support threshold, and we introduce an efficient strategy for discovering them. Web usage mining is the application of sequential pattern mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Approach: The approach adopts a divide-and-conquer, pattern-growth principle. The proposed method combines the tree-projection and prefix-growth features of the pattern-growth category with the position-coded feature of the early-pruning category. Because these features are key characteristics of their respective categories, we regard the method as a pattern-growth, early-pruning hybrid algorithm. Results: The proposed hybrid algorithm eliminates the need to store numerous intermediate WAP-trees during mining. Because only the original tree is stored, it drastically cuts memory access costs, which may include disk I/O in a virtual memory environment, especially when mining very long sequences with millions of records. Conclusion: The approach aims at improving efficiency: it entirely eliminates the reconstruction of intermediate WAP-trees during mining and considerably reduces execution time.
  • Keywords: Data mining; sequential pattern mining; frequent pattern mining; web usage mining; hybrid algorithm; WAP-tree
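
To make the abstract's notion of minimum support concrete, the following Python sketch counts how many user sessions contain each ordered page sequence and keeps those that meet a threshold. It is a naive enumeration for illustration only, not the authors' WAP-tree hybrid algorithm. The function name `frequent_sequences`, the `max_len` cap, and the toy sessions are all assumptions that do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def frequent_sequences(sessions, min_support, max_len=3):
    """Return {pattern: support} for ordered patterns (gaps allowed) whose
    support, the fraction of sessions containing them, is >= min_support.

    Naive enumeration; hypothetical helper, not the paper's algorithm.
    """
    counts = Counter()
    for session in sessions:
        seen = set()
        for k in range(1, max_len + 1):
            # Index combinations are increasing, so page order is preserved.
            for idx in combinations(range(len(session)), k):
                seen.add(tuple(session[i] for i in idx))
        counts.update(seen)  # a session counts each pattern at most once
    n = len(sessions)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


# Toy pre-processed web log: one list of visited pages per session (assumed data).
sessions = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
patterns = frequent_sequences(sessions, min_support=0.5)
```

Lowering `min_support` makes the candidate set grow quickly under this brute-force scheme, and that growth is the cost that tree-based methods such as the one described in the abstract aim to avoid.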