Article information

  • Title: Hybrid focused crawling on the Surface and the Dark Web
  • Authors: Christos Iliou; George Kalpakis; Theodora Tsikrika
  • Journal: EURASIP Journal on Information Security
  • Print ISSN: 1687-4161
  • Electronic ISSN: 1687-417X
  • Publication year: 2017
  • Volume: 2017
  • Issue: 1
  • Pages: 1-13
  • DOI: 10.1186/s13635-017-0064-5
  • Publisher: Hindawi Publishing Corporation
  • Abstract: Focused crawlers enable the automatic discovery of Web resources about a given topic by automatically navigating through the Web link structure and selecting the hyperlinks to follow by estimating their relevance to the topic of interest. This work proposes a generic focused crawling framework for discovering resources on any given topic that reside on the Surface or the Dark Web. The proposed crawler is able to seamlessly navigate through the Surface Web and several darknets present in the Dark Web (i.e., Tor, I2P, and Freenet) during a single crawl by automatically adapting its crawling behavior and its classifier-guided hyperlink selection strategy based on the destination network type and the strength of the local evidence present in the vicinity of a hyperlink. It investigates 11 hyperlink selection methods, among them a novel strategy based on the dynamic linear combination of a link-based and a parent Web page classifier. This hybrid focused crawler is demonstrated for the discovery of Web resources containing recipes for producing homemade explosives. The evaluation experiments indicate the effectiveness of the proposed focused crawler for both the Surface and the Dark Web.
  • Keywords: Focused crawling; Dark web; Darknets; Tor; I2P; Freenet; Dynamic linear combination
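
The abstract mentions a hyperlink selection strategy that dynamically combines a link-based classifier with a parent Web page classifier, depending on the strength of the local evidence around a hyperlink. This record does not give the actual formula, so the following is only a minimal sketch of one way such a combination could work. The names `LinkEvidence`, `combined_score`, `select_links`, the `saturation` parameter, the term-count weighting, and the 0.5 threshold are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class LinkEvidence:
    """Hypothetical evidence for one hyperlink (not taken from the paper)."""
    link_score: float    # assumed link-based classifier score in [0, 1]
    parent_score: float  # assumed parent-page classifier score in [0, 1]
    local_terms: int     # assumed count of terms in anchor text and nearby text


def combined_score(ev: LinkEvidence, saturation: int = 10) -> float:
    """Linearly combine the two scores with an evidence-dependent weight.

    Assumption: the weight on the link-based score grows with the amount
    of local text and is capped at 1 once `saturation` terms are present.
    """
    weight = min(ev.local_terms / saturation, 1.0)
    return weight * ev.link_score + (1.0 - weight) * ev.parent_score


def select_links(candidates: dict[str, LinkEvidence], threshold: float = 0.5) -> list[str]:
    """Return the URLs whose combined score reaches the (assumed) threshold."""
    return [url for url, ev in candidates.items() if combined_score(ev) >= threshold]
```

With little local text the parent page score dominates, and with plenty of local text the link-based score dominates; a real system would learn or tune the weighting rather than fix it by hand.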