Basic Article Information

  • Title: A Large-scale Evaluation of Neural Machine Transliteration for Indic Languages
  • Authors: Anoop Kunchukuttan; Siddharth Jain; Rahul Kejriwal
  • Venue: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
  • Publication year: 2021
  • Volume: 2021
  • Pages: 3469-3475
  • DOI: 10.18653/v1/2021.eacl-main.303
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: We take up the task of large-scale evaluation of neural machine transliteration between English and Indic languages, with a focus on multilingual transliteration to utilize orthographic similarity between Indian languages. We create a corpus of 600K word pairs mined from parallel translation corpora and monolingual corpora, which is the largest transliteration corpus for Indian languages mined from public sources. We perform a detailed analysis of multilingual transliteration and propose an improved multilingual training recipe for Indic languages. We analyze various factors affecting transliteration quality, such as language family, transliteration direction, and word origin.