Article information

  • Title: AN IMPROVED APPROACH TO PERFORM CRAWLING AND AVOID DUPLICATE WEB PAGES
  • Authors: Dhiraj Khurana; Satish Kumar
  • Journal: International Journal of Computer Science and Management Studies
  • Electronic ISSN: 2231-5268
  • Publication year: 2012
  • Volume: 2
  • Issue: Special Issue
  • Publisher: Imperial Foundation
  • Abstract: Web search results often include many duplicate web pages or websites, because similar pages can be hosted on different web servers. We propose a web crawling approach to detect and avoid duplicate or near-duplicate web pages. The proposed work presents a keyword prioritization based approach to identify a web page on the web. Once such pages are identified, the web search can be optimized.
  • Keywords: Crawler; Optimization; Duplicate; Webpage; Prioritization
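
The abstract does not spell out how keyword prioritization is used to identify a page, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that a page is summarized by its k most frequent non-stopword terms and that two pages count as near duplicates when the Jaccard similarity of those term sets reaches a threshold. The names `top_keywords`, `jaccard`, and `DuplicateFilter`, the stopword list, and the default values of `k` and `threshold` are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = frozenset({
    "a", "an", "the", "and", "or", "of", "to", "in", "is", "it",
    "for", "on", "with", "as", "at", "by", "we", "this", "that", "are", "be",
})


def top_keywords(text, k=10):
    """Return the k highest-priority keywords of a page as a frozenset.

    Priority here is plain term frequency; ties are broken alphabetically
    so the result is deterministic.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return frozenset(word for word, _ in ranked[:k])


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class DuplicateFilter:
    """Remembers keyword signatures of crawled pages and flags near duplicates."""

    def __init__(self, k=10, threshold=0.8):
        self.k = k
        self.threshold = threshold
        self.seen = []  # list of (url, keyword signature) pairs

    def check(self, url, text):
        """Return the URL of an earlier near duplicate, or None if the page is new.

        A page that is not a near duplicate is remembered for later checks.
        """
        signature = top_keywords(text, self.k)
        for seen_url, seen_signature in self.seen:
            if jaccard(signature, seen_signature) >= self.threshold:
                return seen_url
        self.seen.append((url, signature))
        return None
```

A crawler could call `check(url, page_text)` after fetching each page and skip indexing whenever a URL is returned. The linear scan over `self.seen` keeps the sketch short; a real crawler would likely need an index over signatures to scale.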