首页    期刊浏览 2025年02月20日 星期四
登录注册

文章基本信息

  • 标题:Fully Unsupervised Word Segmentation with BVE and MDL
  • 本地全文:下载
  • 作者:Daniel Hewlett ; Paul Cohen
  • 期刊名称:Conference on European Chapter of the Association for Computational Linguistics (EACL)
  • 出版年度:2011
  • 卷号:2011
  • 出版社:ACL Anthology
  • 摘要:Several results in the word segmentation literature suggest that description length provides a useful estimate of segmentation quality in fully unsupervised settings. However, since the space of potential segmentations grows exponentially with the length of the corpus, no tractable algorithm follows directly from the Minimum Description Length (MDL) principle. Therefore, it is necessary to generate a set of candidate segmentations and select between them according to the MDL principle. We evaluate several algorithms for generating these candidate segmentations on a range of natural language corpora, and show that the Bootstrapped Voting Experts algorithm consistently outperforms other methods when paired with MDL.
国家哲学社会科学文献中心版权所有