Journal: Conference on European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2010
Volume: 2010
Publisher: ACL Anthology
Abstract: Many researchers are trying to use information
extraction (IE) to create large-scale knowledge
bases from natural language text on the
Web. However, the primary approach (supervised
learning of relation-specific extractors)
requires manually-labeled training data
for each relation and doesn't scale to the thousands
of relations encoded in Web text.
This paper presents LUCHS, a self-supervised,
relation-specific IE system which learns 5025
relations, more than an order of magnitude
greater than any previous approach, with an
average F1 score of 61%. Crucial to LUCHS's
performance is an automated system for dynamic
lexicon learning, which allows it to
learn accurately from heuristically-generated
training data, which is often noisy and sparse.