Venue: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2017
Volume: 2017
Pages: 15-20
Language: English
Publisher: ACL Anthology
Abstract: Noise Contrastive Estimation (NCE) is a learning procedure that is regularly used to train neural language models, since it avoids the computational bottleneck caused by the output softmax. In this paper, we attempt to explain some of the weaknesses of this objective function and to suggest directions for further development. Experiments on a small task show the issues raised by a unigram noise distribution, and that a context-dependent noise distribution, such as the bigram distribution, can solve these issues and provide stable and data-efficient learning.
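For context, the standard NCE objective (as formulated by Gutmann & Hyvärinen and applied to language modeling by Mnih & Teh; the paper's exact notation may differ) trains the model p_θ(w|h) to discriminate each observed word w in context h from k noise words drawn from a noise distribution q. A sketch of the per-example loss, where q may be a context-independent unigram distribution q(w) or, as advocated in the paper, a context-dependent bigram distribution q(w|h):

\[
\mathcal{L}_{\mathrm{NCE}}(w, h) =
-\log \frac{p_\theta(w \mid h)}{p_\theta(w \mid h) + k\, q(w \mid h)}
- \sum_{i=1}^{k} \log \frac{k\, q(\tilde{w}_i \mid h)}{p_\theta(\tilde{w}_i \mid h) + k\, q(\tilde{w}_i \mid h)},
\qquad \tilde{w}_i \sim q(\cdot \mid h).
\]

Because every term involves only the observed word and the k sampled noise words, the full-vocabulary softmax normalization is never computed, which is the bottleneck the abstract refers to.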