
Article information

  • Title: LSI Based Relevance Computation for Topical Web Crawler
  • Authors: Minhas, Gurmeen; Kumar, Mukesh
  • Journal: Journal of Emerging Technologies in Web Intelligence
  • Print ISSN: 1798-0461
  • Year of publication: 2013
  • Volume: 5
  • Issue: 4
  • Pages: 401-406
  • DOI: 10.4304/jetwi.5.4.401-406
  • Language: English
  • Publisher: Academy Publisher
  • Abstract: The web is exceptionally large and growing rapidly, with a huge number of web pages and web sites added each day, so results that are effective, factual and authentic are needed. A simple crawler cannot cover every web page, as doing so would take polynomial time. To overcome these issues, this paper proposes an algorithm for developing an efficient, focused, domain-specific crawler using LSI (Latent Semantic Indexing). The algorithm makes the crawler highly efficient at downloading relevant documents, avoiding overhead and wasted resources, and it increases the precision and recall of the IR system built on it.
  • Keywords: crawling; focused crawler; latent semantic indexing; domain-specific crawler
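
The abstract does not spell out how relevance is computed, so the following is only a minimal sketch of generic LSI relevance scoring, not the authors' algorithm. It assumes a small seed corpus for the target domain, raw term counts as weights, a truncated SVD of the term-document matrix, and cosine similarity between a candidate page and a topic description in the reduced space. All function names (`fit_lsi`, `fold_in`, `relevance`) and the choice of `k` are hypothetical.

```python
import numpy as np

def build_vocab(docs):
    # Map each distinct lowercase token in the seed corpus to a row index.
    vocab = sorted({w for d in docs for w in d.lower().split()})
    return {w: i for i, w in enumerate(vocab)}

def term_vector(text, vocab):
    # Raw term-count vector; tokens outside the vocabulary are ignored.
    v = np.zeros(len(vocab))
    for w in text.lower().split():
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

def fit_lsi(docs, k):
    # Build the terms x documents matrix and keep the top-k singular triplets.
    # Assumption: k does not exceed the rank of the matrix, so s[:k] > 0.
    vocab = build_vocab(docs)
    A = np.column_stack([term_vector(d, vocab) for d in docs])
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k], s[:k]

def fold_in(text, model):
    # Project new text into the latent space: Sigma_k^{-1} U_k^T q.
    vocab, U_k, s_k = model
    return (U_k.T @ term_vector(text, vocab)) / s_k

def relevance(page_text, topic_text, model):
    # Cosine similarity in the latent space; 0.0 if either vector is empty.
    p = fold_in(page_text, model)
    t = fold_in(topic_text, model)
    denom = np.linalg.norm(p) * np.linalg.norm(t)
    return float(p @ t / denom) if denom > 0 else 0.0
```

In a focused crawler, a score like this could be compared against a threshold to decide whether a downloaded page is kept and its links followed; the threshold and the seed corpus would have to be tuned for the domain.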