Journal name: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2006
Volume: 2006
Publisher: ACL Anthology
Abstract: Aligning sentences belonging to comparable monolingual corpora has been suggested as a first step towards training text rewriting algorithms, for tasks such as summarization or paraphrasing. We present here a new monolingual sentence alignment algorithm, combining a sentence-based TF*IDF score, turned into a probability distribution using logistic regression, with a global alignment dynamic programming algorithm. Our approach provides a simpler and more robust solution, achieving a substantial improvement in accuracy over existing systems.
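
The pipeline named in the abstract (sentence-level TF*IDF similarity, a logistic mapping to a match probability, and a global alignment by dynamic programming) can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything specific here is an assumption: the smoothed IDF, the cosine similarity, the logistic coefficients `a` and `b` (hypothetical placeholders that would normally be fit on annotated sentence pairs), the `p - 0.5` match gain, and the function names themselves.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Sentence-level TF*IDF: each sentence is treated as its own 'document'."""
    docs = [Counter(s.lower().split()) for s in sentences]
    n = len(docs)
    df = Counter(w for d in docs for w in d)  # number of sentences containing w
    # Smoothed IDF (an assumption) so words shared by many sentences keep weight.
    return [
        {w: tf * (math.log((1 + n) / (1 + df[w])) + 1.0) for w, tf in d.items()}
        for d in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def match_probability(sim, a=-4.0, b=8.0):
    """Logistic map from similarity to a match probability.

    a and b are hypothetical values, not coefficients from the paper.
    """
    return 1.0 / (1.0 + math.exp(-(a + b * sim)))


def global_align(src, tgt, skip_penalty=0.0):
    """Order-preserving global alignment of two sentence lists.

    Returns (src_index, tgt_index) pairs. A match contributes p - 0.5,
    so only pairs judged more likely than not to match are rewarded.
    """
    vecs = tfidf_vectors(src + tgt)
    sv, tv = vecs[:len(src)], vecs[len(src):]
    p = [[match_probability(cosine(s, t)) for t in tv] for s in sv]

    n, m = len(src), len(tgt)
    # best[i][j]: best score for the first i source and first j target sentences.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if i > 0 and j > 0:
                gain = p[i - 1][j - 1] - 0.5
                options.append((best[i - 1][j - 1] + gain, "match"))
            if i > 0:
                options.append((best[i - 1][j] - skip_penalty, "skip_src"))
            if j > 0:
                options.append((best[i][j - 1] - skip_penalty, "skip_tgt"))
            best[i][j], back[i][j] = max(options)

    # Trace back from the bottom-right cell to recover the matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "skip_src":
            i -= 1
        else:
            j -= 1
    return pairs[::-1]
```

Calling `global_align(src, tgt)` on two lists of sentence strings returns index pairs in document order. Because the alignment is global, a locally similar pair is dropped whenever keeping it would force a worse overall alignment, which is the main difference from thresholding each pair independently.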