
Basic article information

  • Title: Exploring Information Extraction Resilience
  • Author: Dawn G. Gregg
  • Journal: Journal of Universal Computer Science
  • Print ISSN: 0948-6968
  • Year of publication: 2008
  • Volume: 14
  • Issue: 11
  • Pages: 1911-1920
  • Publisher: Graz University of Technology and Know-Center
  • Abstract: There are many challenges developers face when attempting to reliably extract data from the Web. One of these challenges is the resilience of the extraction system to changes in the web pages information is being extracted from. This article compares the resilience of information extraction systems that use position based extraction with an ontology based extraction system and a system that combines position based extraction with ontology based extraction. The findings demonstrate the advantages of using a system that combines multiple extraction techniques, especially in environments where web sites change frequently and where data collection is conducted over an extended period of time.