Article Information

  • Title: Review of Domain Based Crawling System
  • Authors: Radhika Gupta; AP Gurpinder Kaur
  • Journal: International Journal of Advanced Research In Computer Science and Software Engineering
  • Print ISSN: 2277-6451
  • Online ISSN: 2277-128X
  • Year of publication: 2013
  • Volume: 3
  • Issue: 6
  • Publisher: S.S. Mishra
  • Abstract: In this paper we survey the developments in building the crawlers that feed search engines. After a systematic literature review of information retrieval algorithms, we found that the results of most search engines became less relevant as the internet grew, and that developing algorithms with high precision and recall remains an open challenge. Since all search engines obtain their data from crawlers, improving how crawlers work is critical. Because of the size of Big Data, generic crawlers are no longer practical in real-world use. There is therefore an urgent need for a domain-specific crawler built on existing algorithms such as LSI, so that search engines become relevant again. The paper proposes such a domain-specific crawler algorithm.
  • Keywords: Web crawlers; Latent Semantic Index; Domain based crawler; Focused crawler; Recall; Precision