Article Information

  • Title: A Focused Crawling Method Based on Detecting Communities in Complex Networks
  • Authors: Shen Gui-lan; Sun Jie; Yang Xiao-ping
  • Journal: International Journal of Smart Home
  • Print ISSN: 1975-4094
  • Publication year: 2015
  • Volume: 9
  • Issue: 8
  • Pages: 187-196
  • DOI: 10.14257/ijsh.2015.9.8.20
  • Publisher: SERSC
  • Abstract: The rapid growth of the large-scale World Wide Web poses a great challenge to existing focused crawling methods. Whether analyzing text content or link structure, traditional focused crawlers have mainly operated at page granularity. Walking randomly through a network composed of a large number of pages, a focused crawler easily gets lost. Narrowing the crawling range from the entire Web can improve precision and efficiency. A focused crawling method based on two granularities is put forward. First, a community detection algorithm is used to analyze the link structure of the network composed of web sites, building a group of web sites for a given topic, which narrows the crawling range. Second, all topic relevance analysis for web pages and link prediction are performed inside this generated group. Topic relevance analysis is implemented by calculating the topic similarity of the title and of the content separately. The similarity of parent pages, anchor texts, and the URL string are all considered when predicting the topic relevance of unknown links. The experimental results suggest that this method is very effective for a given topic and that it can improve precision.
  • Keywords: detecting community; focused crawling; web site granularity; similarity; analysis; link precision
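
The abstract describes scoring a page by the topic similarity of its title and content, and predicting the relevance of an unvisited link from its parent page, its anchor text, and its URL string. The following is a minimal sketch of that idea, not the authors' implementation: the bag-of-words cosine similarity, the function names (`tokenize`, `cosine`, `page_relevance`, `link_relevance`), and all weights are assumptions chosen for illustration, since the record does not give the actual formulas.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def page_relevance(topic, title, content, w_title=0.5, w_content=0.5):
    """Score a fetched page: title and content are compared to the topic
    separately, then combined. The 0.5/0.5 weights are hypothetical."""
    topic_tokens = tokenize(topic)
    title_sim = cosine(topic_tokens, tokenize(title))
    content_sim = cosine(topic_tokens, tokenize(content))
    return w_title * title_sim + w_content * content_sim


def link_relevance(topic, parent_score, anchor_text, url,
                   weights=(0.4, 0.4, 0.2)):
    """Predict the relevance of an unvisited link from the parent page's
    score, the anchor text, and the URL string. Weights are hypothetical."""
    topic_tokens = tokenize(topic)
    w_parent, w_anchor, w_url = weights
    anchor_sim = cosine(topic_tokens, tokenize(anchor_text))
    url_sim = cosine(topic_tokens, tokenize(url))
    return w_parent * parent_score + w_anchor * anchor_sim + w_url * url_sim
```

In a crawler built along these lines, links would presumably be queued in descending order of `link_relevance`, and only links pointing into the topic's web site group would be scored at all.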