
Article Information

  • Title: Validation of Architecture of Migrating Parallel Web Crawler using Finite State Machine
  • Authors: Md Faizan Farooqui; Md Rizwan Beg; Md Qasim Rafiq
  • Journal: International Journal of Computer Science Issues
  • Print ISSN: 1694-0784
  • Electronic ISSN: 1694-0814
  • Publication year: 2014
  • Volume: 11
  • Issue: 1
  • Publisher: IJCSI Press
  • Abstract: The process of downloading web pages is known as web crawling. In this paper we validate the architecture of a migrating parallel web crawler using a finite state machine (FSM). The migrating parallel web crawling approach detects changes in page content and structure. Domain-specific crawling yields high-quality pages: the crawling process migrates to a host or server within a specific domain and downloads pages from that domain. Incremental crawling keeps the pages in the local database fresh, thus increasing the quality of the downloaded pages. This crawling strategy makes the web crawling system more effective and efficient. Test cases are generated to validate the architecture. Generating test cases through the FSM is reliable and efficient, and it does not admit invalid test cases; only valid input strings are generated as test cases.
  • Keywords: Web crawling; parallel migrating web crawler; search engine; validation