Neural machine translation for low-resource languages: A survey

S Ranathunga, ESA Lee, M Prifti Skenduli… - ACM Computing …, 2023 - dl.acm.org
Neural Machine Translation (NMT) has seen tremendous growth in the last ten years since
the early 2000s and has already entered a mature phase. While considered the most widely …

A survey of multilingual neural machine translation

R Dabre, C Chu, A Kunchukuttan - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
We present a survey on multilingual neural machine translation (MNMT), which has gained
a lot of traction in recent years. MNMT has been useful in improving translation quality as a …

Survey of low-resource machine translation

B Haddow, R Bawden, AVM Barone, J Helcl… - Computational …, 2022 - direct.mit.edu
We present a survey covering the state of the art in low-resource machine translation (MT)
research. There are currently around 7,000 languages spoken in the world and almost all …

Domain adaptation and multi-domain adaptation for neural machine translation: A survey

D Saunders - Journal of Artificial Intelligence Research, 2022 - jair.org
The development of deep learning techniques has allowed Neural Machine Translation
(NMT) models to become extremely powerful, given sufficient training data and training time …

A comparison of transformer and recurrent neural networks on multilingual neural machine translation

SM Lakew, M Cettolo, M Federico - arXiv preprint arXiv:1806.06957, 2018 - arxiv.org
Recently, neural machine translation (NMT) has been extended to multilinguality, that is to
handle more than one translation direction with a single system. Multilingual NMT showed …

As good as new. How to successfully recycle English GPT-2 to make models for other languages

W de Vries, M Nissim - arXiv preprint arXiv:2012.05628, 2020 - arxiv.org
Large generative language models have been very successful for English, but other
languages lag behind, in part due to data and computational limitations. We propose a …

Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages

G Ramesh, S Doddapaneni, A Bheemaraj… - Transactions of the …, 2022 - direct.mit.edu
We present Samanantar, the largest publicly available parallel corpora collection for Indic
languages. The collection contains a total of 49.7 million sentence pairs between English …

Pivot-based transfer learning for neural machine translation between non-English languages

Y Kim, P Petrov, P Petrushkov, S Khadivi… - arXiv preprint arXiv …, 2019 - arxiv.org
We present effective pre-training strategies for neural machine translation (NMT) using
parallel corpora involving a pivot language, i.e., source-pivot and pivot-target, leading to a …

Effective cross-lingual transfer of neural machine translation models without shared vocabularies

Y Kim, Y Gao, H Ney - arXiv preprint arXiv:1905.05475, 2019 - arxiv.org
Transfer learning or multilingual model is essential for low-resource neural machine
translation (NMT), but the applicability is limited to cognate languages by sharing their …

A survey on low-resource neural machine translation

R Wang, X Tan, R Luo, T Qin, TY Liu - arXiv preprint arXiv:2107.04239, 2021 - arxiv.org
Neural approaches have achieved state-of-the-art accuracy on machine translation but
suffer from the high cost of collecting large scale parallel data. Thus, a lot of research has …