首页    期刊浏览 2024年11月10日 星期日
登录注册

文章基本信息

  • 标题:Finite-State Chart Constraints for Reduced Complexity Context-Free Parsing Pipelines
  • 本地全文:下载
  • 作者:Brian Roark ; Kristy Hollingshead ; Nathan Bodenstab
  • 期刊名称:Computational Linguistics
  • 印刷版ISSN:0891-2017
  • 电子版ISSN:1530-9312
  • 出版年度:2012
  • 卷号:38
  • 期号:4
  • 页码:719-753
  • DOI:10.1162/COLI_a_00109
  • 语种:English
  • 出版社:MIT Press
  • 摘要:We present methods for reducing the worst-case and typical-case complexity of a context-free parsing pipeline via hard constraints derived from finite-state pre-processing. We perform O ( n ) predictions to determine if each word in the input sentence may begin or end a multi-word constituent in chart cells spanning two or more words, or allow single-word constituents in chart cells spanning the word itself. These pre-processing constraints prune the search space for any chart-based parsing algorithm and significantly decrease decoding time. In many cases cell population is reduced to zero, which we term chart cell “closing.” We present methods for closing a sufficient number of chart cells to ensure provably quadratic or even linear worst-case complexity of context-free inference. In addition, we apply high precision constraints to achieve large typical-case speedups and combine both high precision and worst-case bound constraints to achieve superior performance on both short and long strings. These bounds on processing are achieved without reducing the parsing accuracy, and in some cases accuracy improves. We demonstrate that our method generalizes across multiple grammars and is complementary to other pruning techniques by presenting empirical results for both exact and approximate inference using the exhaustive CKY algorithm, the Charniak parser, and the Berkeley parser. We also report results parsing Chinese, where we achieve the best reported results for an individual model on the commonly reported data set.
国家哲学社会科学文献中心版权所有