Article Information

Title: SuVashantor: English to Bangla Machine Translation Systems
Authors: Mahjabeen Akter; M. Shahidur Rahman; Muhammed Zafar Iqbal; et al.
Journal: Journal of Computer Science
Print ISSN: 1549-3636
Year: 2020
Volume: 16
Issue: 8
Pages: 1128-1138
DOI: 10.3844/jcssp.2020.1128.1138
Publisher: Science Publications
Abstract: This paper presents the system description of Machine Translation (MT) systems for the English-Bangla language pair. Our goal was to create two benchmark MT systems that produce better quality translations and comparatively higher evaluation scores than existing MT systems for English to Bangla. In our experiments, we implemented two baseline MT systems using both statistical and neural methods for the said language pair. Our phrase-based statistical model and 2-layer LSTM neural model were trained and evaluated with a large dataset that is carefully pre-processed and contains unique training data to avoid biases from the cross-validation and test data. We achieved the highest BLEU scores in our experiments with these setups. Furthermore, we improved the performance of the neural model using pre-trained embeddings and synthetic monolingual data, which are cutting-edge techniques for neural models.
Keywords: Machine Learning; Machine Translation Systems; Statistical Machine Translation Systems; Neural Network; Neural Machine Translation Systems; Pre-trained Word Embedding; Synthetic Monolingual Data