Article information

  • Title: CloudLM: a Cloud-based Language Model for Machine Translation
  • Authors: Jorge Ferrández-Tordera; Sergio Ortiz-Rojas; Antonio Toral
  • Journal: The Prague Bulletin of Mathematical Linguistics
  • Print ISSN: 0032-6585
  • Electronic ISSN: 1804-0462
  • Year of publication: 2016
  • Volume: 105
  • Issue: 1
  • Pages: 51-61
  • DOI: 10.1515/pralin-2016-0002
  • Language: English
  • Publisher: Walter de Gruyter GmbH
  • Abstract: Language models (LMs) are an essential element in statistical approaches to natural language processing for tasks such as speech recognition and machine translation (MT). The advent of big data has made massive amounts of data available for building LMs; in fact, for the most prominent languages, it is not feasible with current techniques and hardware to train LMs on all the data available today. At the same time, it has been shown that the more data is used for an LM, the better the performance (e.g. for MT), with no indication yet of reaching a plateau. This paper presents CloudLM, an open-source cloud-based LM intended for MT, which allows distributed LMs to be queried. CloudLM relies on Apache Solr and provides the functionality of state-of-the-art language modelling (it builds upon KenLM), while allowing massive LMs to be queried (as the use of local memory is drastically reduced), at the expense of slower decoding speed.