
Basic Article Information

  • Title: Multi-split Reversible Transformers Can Enhance Neural Machine Translation
  • Authors: Yuekai Zhao; Shuchang Zhou; Zhihua Zhang
  • Venue: Conference on European Chapter of the Association for Computational Linguistics (EACL)
  • Year: 2021
  • Volume: 2021
  • Pages: 244-254
  • DOI:10.18653/v1/2021.eacl-main.19
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: Large-scale transformers have shown state-of-the-art performance on neural machine translation. However, training these increasingly wider and deeper models can be tremendously memory intensive. We reduce the memory burden by employing the idea of reversible networks, in which a layer’s input can be reconstructed from its output. We design three types of multi-split based reversible transformers. We also devise a corresponding backpropagation algorithm, which does not need to store activations for most layers. Furthermore, we present two fine-tuning techniques, splits shuffle and self ensemble, to boost translation accuracy. Specifically, our best models surpass the vanilla transformer by at least 1.4 BLEU points on three datasets. Our large-scale reversible models achieve 30.0 BLEU on WMT’14 En-De and 43.5 BLEU on WMT’14 En-Fr, beating several very strong baselines with less than half of the training memory.
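
The abstract's claim that "a layer's input can be reconstructed from its output" rests on the additive coupling used in reversible (RevNet-style) networks, which the paper generalizes to multiple splits. The sketch below is a minimal two-split illustration in NumPy, not the authors' code: the functions F and G and their weight matrices are hypothetical placeholders standing in for the paper's sub-layers (e.g., attention and feed-forward).

```python
import numpy as np

# Minimal sketch of a two-split additive reversible coupling.
# Because the coupling is exactly invertible, the backward pass can
# recompute inputs from outputs instead of storing activations.
rng = np.random.default_rng(0)
d = 4
W_f = rng.normal(size=(d, d))  # weights of the hypothetical sub-layer F
W_g = rng.normal(size=(d, d))  # weights of the hypothetical sub-layer G

def F(x):
    return np.tanh(x @ W_f)

def G(x):
    return np.tanh(x @ W_g)

def forward(x1, x2):
    """Additive coupling: y1 = x1 + F(x2), y2 = x2 + G(y1)."""
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def invert(y1, y2):
    """Reconstruct the layer's input exactly from its output."""
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1, x2 = rng.normal(size=d), rng.normal(size=d)
y1, y2 = forward(x1, x2)
r1, r2 = invert(y1, y2)
assert np.allclose(x1, r1) and np.allclose(x2, r2)  # exact reconstruction
```

The paper's multi-split variants extend this idea from two partitions of the hidden state to several, with a corresponding backpropagation algorithm that reconstructs inputs on the fly for most layers.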