期刊名称:International Journal of Computer Science and Network Security
印刷版ISSN:1738-7906
出版年度:2008
卷号:8
期号:12
页码:349-354
出版社:International Journal of Computer Science and Network Security
摘要:Due to the deficiency in their refresh techniques [12], current crawlers add unnecessary traffic to the already overloaded Internet. Moreover there exist no certain ways to verify whether a document has been updated or not. In this paper, an efficient approach is being proposed for building an effective incremental web crawler [13]. It selectively updates its database and/ or local collection of web pages instead of periodically refreshing the collection in batch mode thereby improving the ��freshness�� of the collection significantly and bringing new pages in more timely manner. It also detects web pages which frequently undergo up-dation and dynamically calculates the refresh time of the page for its next update.