Journal: International Journal of Computer Science and Management Studies
Electronic ISSN: 2231-5268
Publication year: 2012
Volume: 2
Issue: Special Issue
Publisher: Imperial Foundation
Abstract: A web search often returns many duplicate web pages or websites, because similar pages can be found on different web servers. We propose a web crawling approach to detect and avoid duplicate or near-duplicate web pages. In the proposed work, we present a keyword-prioritization-based approach to identify such pages on the web. Identifying these pages allows them to be avoided, which optimizes the web search.