Article Information

  • Title: FoCUS: A Technique to Overcome Existing Crawl Methods
  • Authors: G. Bharathi; B. Poorna Satyanarayana
  • Journal: International Journal of Computer Science & Technology
  • Print ISSN: 2229-4333
  • Electronic ISSN: 0976-8491
  • Year: 2014
  • Volume: 5
  • Issue: 4
  • Pages: 25-27
  • Language: English
  • Publisher: Ayushmaan Technologies
  • Abstract: The goal of FoCUS is to crawl only relevant forum content from the web with minimal overhead. Forum threads contain the information content that forum crawlers target. Although forums differ in layout and style and are powered by different forum software packages, they consistently share similar implicit navigation paths, connected by specific URL types, that lead users from entry pages to thread pages. Robust page type classifiers can be trained on as few as 5 annotated forums and applied to a large set of unseen forums.
  • Keywords: EIT Path; Forum Crawling; ITF Regex; Page Classification; Page Type; URL Pattern Learning; URL Type
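
As a rough illustration of the URL-type idea in the abstract, the Python sketch below labels links by matching them against one regular expression per page type and keeps only those that look like index, thread, or page-flipping links. The patterns, the `classify_url` and `select_links` names, and the forum.example.com URLs are all assumptions made for this example and are not taken from the paper; the keywords suggest that FoCUS learns such patterns per site rather than relying on hand-written ones.

```python
import re

# Hypothetical URL patterns for a single forum layout (assumption, not from
# the paper). A real crawler would learn these per site.
URL_TYPE_PATTERNS = {
    "index": re.compile(r"/forumdisplay\.php\?f=\d+$"),
    "thread": re.compile(r"/showthread\.php\?t=\d+$"),
    "page_flipping": re.compile(
        r"/(?:forumdisplay|showthread)\.php\?[ft]=\d+&page=\d+$"
    ),
}


def classify_url(url: str) -> str:
    """Return the first URL type whose pattern matches, else 'other'."""
    for url_type, pattern in URL_TYPE_PATTERNS.items():
        if pattern.search(url):
            return url_type
    return "other"


def select_links(links: list[str]) -> list[str]:
    """Keep only links that match one of the known URL types."""
    return [url for url in links if classify_url(url) != "other"]


if __name__ == "__main__":
    sample = [
        "http://forum.example.com/forumdisplay.php?f=3",
        "http://forum.example.com/showthread.php?t=42",
        "http://forum.example.com/showthread.php?t=42&page=2",
        "http://forum.example.com/member.php?u=7",
    ]
    for url in sample:
        print(classify_url(url), url)
```

Filtering on URL type before fetching is what keeps overhead low in this sketch: links such as user profiles are discarded without ever being downloaded.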