Article Information

  • Title: Towards a Better Integration of Fuzzy Matches in Neural Machine Translation through Data Augmentation
  • Authors: Arda Tezcan; Bram Bulté; Bram Vanroy
  • Journal: Informatics
  • Electronic ISSN: 2227-9709
  • Publication year: 2021
  • Volume: 8
  • Issue: 1
  • Pages: 7
  • DOI: 10.3390/informatics8010007
  • Publisher: MDPI Publishing
  • Abstract: We identify a number of aspects that can boost the performance of Neural Fuzzy Repair (NFR), an easy-to-implement method to integrate translation memory matches and neural machine translation (NMT). We explore various ways of maximising the added value of retrieved matches within the NFR paradigm for eight language combinations, using Transformer NMT systems. In particular, we test the impact of different fuzzy matching techniques, sub-word-level segmentation methods and alignment-based features on overall translation quality. Furthermore, we propose a fuzzy match combination technique that aims to maximise the coverage of source words. This is supplemented with an analysis of how translation quality is affected by input sentence length and fuzzy match score. The results show that applying a combination of the tested modifications leads to a significant increase in estimated translation quality over all baselines for all language combinations.
  • Keywords: translation memories; data augmentation; fuzzy matching; NMT; sub-word units