Basic Article Information

  • Title: CODRA: A Novel Discriminative Framework for Rhetorical Analysis
  • Authors: Shafiq Joty; Giuseppe Carenini; Raymond T. Ng
  • Journal: Computational Linguistics
  • Print ISSN: 0891-2017
  • Electronic ISSN: 1530-9312
  • Publication year: 2015
  • Volume: 41
  • Issue: 3
  • Pages: 385-435
  • DOI: 10.1162/COLI_a_00226
  • Language: English
  • Publisher: MIT Press
  • Abstract: Clauses and sentences rarely stand on their own in an actual discourse; rather, the relationship between them carries important information that allows the discourse to express a meaning as a whole beyond the sum of its individual parts. Rhetorical analysis seeks to uncover this coherence structure. In this article, we present CODRA, a COmplete probabilistic Discriminative framework for performing Rhetorical Analysis in accordance with Rhetorical Structure Theory, which posits a tree representation of a discourse. CODRA comprises a discourse segmenter and a discourse parser. First, the discourse segmenter, which is based on a binary classifier, identifies the elementary discourse units in a given text. Then the discourse parser builds a discourse tree by applying an optimal parsing algorithm to probabilities inferred from two Conditional Random Fields: one for intra-sentential parsing and the other for multi-sentential parsing. We present two approaches to combine these two stages of parsing effectively. By conducting a series of empirical evaluations over two different data sets, we demonstrate that CODRA significantly outperforms the state-of-the-art, often by a wide margin. We also show that a reranking of the k-best parse hypotheses generated by CODRA can potentially improve the accuracy even further.
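
The abstract mentions applying an optimal parsing algorithm to inferred probabilities in order to build a discourse tree. The sketch below illustrates one generic way such a step could look: a CKY-style dynamic program that finds the most probable binary tree over a sequence of elementary discourse units. It is an illustration under stated assumptions, not the authors' implementation. The names `best_tree` and `toy_prob` are hypothetical, the probabilities are hand-picked toy values rather than outputs of a Conditional Random Field, and relation labels and nuclearity are left out entirely.

```python
import math
from typing import Callable, Dict, Tuple, Union

# A tree is either a unit index (leaf) or a pair of subtrees.
Tree = Union[int, Tuple["Tree", "Tree"]]


def best_tree(n: int, prob: Callable[[int, int, int], float]) -> Tuple[float, Tree]:
    """Return (log-probability, tree) of the best binary tree over units 0..n-1.

    prob(i, k, j) is an ASSUMED input: the probability of joining the spans
    [i..k] and [k+1..j] into one constituent. In a real system it would come
    from a trained model; here it is supplied by the caller.
    """
    if n < 1:
        raise ValueError("need at least one unit")

    # best[(i, j)] holds the best score and tree for units i..j inclusive.
    best: Dict[Tuple[int, int], Tuple[float, Tree]] = {
        (i, i): (0.0, i) for i in range(n)
    }

    # Fill the chart from short spans to long spans.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length - 1
            candidates = []
            for k in range(i, j):  # split point: left = i..k, right = k+1..j
                left_score, left_tree = best[(i, k)]
                right_score, right_tree = best[(k + 1, j)]
                p = prob(i, k, j)
                join_score = math.log(p) if p > 0 else -math.inf
                candidates.append(
                    (left_score + right_score + join_score, (left_tree, right_tree))
                )
            best[(i, j)] = max(candidates, key=lambda c: c[0])

    return best[(0, n - 1)]


def toy_prob(i: int, k: int, j: int) -> float:
    """Hand-picked toy probabilities for three units (illustrative only)."""
    table = {
        (0, 0, 1): 0.9,  # units 0 and 1 attach strongly
        (1, 1, 2): 0.2,  # units 1 and 2 attach weakly
        (0, 0, 2): 0.5,  # unit 0 joined with span 1..2
        (0, 1, 2): 0.5,  # span 0..1 joined with unit 2
    }
    return table[(i, k, j)]
```

Calling `best_tree(3, toy_prob)` on these toy values prefers grouping units 0 and 1 first, because that pairing has the higher probability. The chart has O(n²) cells and each cell tries O(n) split points, so the search is cubic in the number of units.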