Article Information

  • Title: Pushdown Automata in Statistical Machine Translation
  • Authors: Cyril Allauzen; Bill Byrne; Adrià de Gispert
  • Journal: Computational Linguistics
  • Print ISSN: 0891-2017
  • Electronic ISSN: 1530-9312
  • Year: 2014
  • Volume: 40
  • Issue: 3
  • Pages: 687-723
  • DOI: 10.1162/COLI_a_00197
  • Language: English
  • Publisher: MIT Press
  • Abstract: This article describes the use of pushdown automata (PDA) in the context of statistical machine translation and alignment under a synchronous context-free grammar. We use PDAs to compactly represent the space of candidate translations generated by the grammar when applied to an input sentence. General-purpose PDA algorithms for replacement, composition, shortest path, and expansion are presented. We describe HiPDT, a hierarchical phrase-based decoder using the PDA representation and these algorithms. We contrast the complexity of this decoder with a decoder based on a finite state automata representation, showing that PDAs provide a more suitable framework to achieve exact decoding for larger synchronous context-free grammars and smaller language models. We assess this experimentally on a large-scale Chinese-to-English alignment and translation task. In translation, we propose a two-pass decoding strategy involving a weaker language model in the first-pass to address the results of PDA complexity analysis. We study in depth the experimental conditions and tradeoffs in which HiPDT can achieve state-of-the-art performance for large-scale SMT.