
Article information

  • Title: Tractable near-optimal policies for crawling
  • Authors: Yossi Azar; Eric Horvitz; Eyal Lubetzky
  • Journal: Proceedings of the National Academy of Sciences
  • Print ISSN: 0027-8424
  • Electronic ISSN: 1091-6490
  • Publication year: 2018
  • Volume: 115
  • Issue: 32
  • Pages: 8099-8103
  • DOI: 10.1073/pnas.1801519115
  • Language: English
  • Publisher: The National Academy of Sciences of the United States of America
  • Abstract: The problem of maintaining a local cache of n constantly changing pages arises in multiple mechanisms such as web crawlers and proxy servers. In these, the resources for polling pages for possible updates are typically limited. The goal is to devise a polling and fetching policy that maximizes the utility of served pages that are up to date. Cho and Garcia-Molina [(2003) ACM Trans Database Syst 28:390–426] formulated this as an optimization problem, which can be solved numerically for small values of n, but appears intractable in general. Here, we show that the optimal randomized policy can be found exactly in
  • Keywords: web crawling; caching policies; scheduling optimization