Journal name: Conference on European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2012
Volume: 2012
Publisher: ACL Anthology
Abstract: The Web and digitized text sources contain
a wealth of information about named entities
such as politicians, actors, companies, or cultural
landmarks. Extracting this information
has enabled the automated construction of large
knowledge bases, containing hundreds of millions
of binary relationships or attribute values about
these named entities. However, in reality most
knowledge is transient, i.e. changes over time,
requiring a temporal dimension in fact extraction.
In this paper we develop a methodology
that combines label propagation with constraint
reasoning for temporal fact extraction. Label
propagation aggressively gathers fact candidates,
and an Integer Linear Program is used
to clean out false hypotheses that violate temporal
constraints. Our method improves recall
while maintaining precision,
which we demonstrate by experiments
with biography-style Wikipedia pages and a
large corpus of news articles.
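
The pruning step described in the abstract can be illustrated with a minimal sketch. It is not the paper's actual formulation: the `Fact` record, the `conflict` rule (one subject cannot hold the same relation to two different objects in overlapping time spans), the scores, and the placeholder entity names are all assumptions made for illustration. The selection is written as a 0/1 optimization (maximize the total score of kept facts so that no two conflicting facts are both kept), which is one common shape for such an Integer Linear Program; here it is solved by exhaustive search instead of an ILP solver, so it only suits a handful of candidates.

```python
from itertools import combinations
from typing import List, NamedTuple


class Fact(NamedTuple):
    """A candidate temporal fact (hypothetical record layout)."""
    subject: str
    relation: str
    obj: str
    begin: int    # first year the fact holds (inclusive)
    end: int      # last year the fact holds (inclusive)
    score: float  # confidence assigned by the candidate-gathering stage


def overlaps(a: Fact, b: Fact) -> bool:
    """True if the two inclusive year intervals share at least one year."""
    return a.begin <= b.end and b.begin <= a.end


def conflict(a: Fact, b: Fact) -> bool:
    """Assumed temporal constraint: same subject and relation,
    different objects, overlapping validity intervals."""
    return (a.subject == b.subject and a.relation == b.relation
            and a.obj != b.obj and overlaps(a, b))


def select_consistent(facts: List[Fact]) -> List[Fact]:
    """Return the highest-scoring subset with no conflicting pair.

    Brute-force stand-in for an ILP solver: every subset is one
    assignment of the 0/1 variables x_i, and a conflicting pair (i, j)
    corresponds to the constraint x_i + x_j <= 1.
    """
    best, best_score = [], 0.0
    n = len(facts)
    for mask in range(1 << n):
        chosen = [facts[i] for i in range(n) if (mask >> i) & 1]
        if any(conflict(a, b) for a, b in combinations(chosen, 2)):
            continue  # violates a temporal constraint
        total = sum(f.score for f in chosen)
        if total > best_score:
            best, best_score = chosen, total
    return best


# Invented candidates with placeholder entity names.
facts = [
    Fact("PersonA", "spouse", "PersonB", 1990, 2000, 0.9),
    Fact("PersonA", "spouse", "PersonC", 1995, 1998, 0.4),  # overlaps the first
    Fact("PersonA", "spouse", "PersonC", 2002, 2010, 0.7),
]
kept = select_consistent(facts)
```

In this toy input, the low-scoring second candidate conflicts with the first and is dropped, while the first and third are kept. A real system would hand the same objective and constraints to an ILP solver rather than enumerate subsets.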