Article Information

Title: A General Technique to Train Language Models on Language Models
Author: Mark-Jan Nederhof
Journal: Computational Linguistics
Print ISSN: 0891-2017
Electronic ISSN: 1530-9312
Publication year: 2005
Volume: 31
Issue: 2
Pages: 173-185
DOI: 10.1162/0891201054223986
Language: English
Publisher: MIT Press
Abstract: We show that under certain conditions, a language model can be trained on the basis of a second language model. The main instance of the technique trains a finite automaton on the basis of a probabilistic context-free grammar, such that the Kullback-Leibler distance between grammar and trained automaton is provably minimal. This is a substantial generalization of an existing algorithm to train an n-gram model on the basis of a probabilistic context-free grammar.