Neural machine translation for low-resource languages: A survey

S Ranathunga, ESA Lee, M Prifti Skenduli… - ACM Computing …, 2023 - dl.acm.org
Neural Machine Translation (NMT) has seen tremendous growth in the last ten years and
has already entered a mature phase. While considered the most widely …

Progress in machine translation

H Wang, H Wu, Z He, L Huang, KW Church - Engineering, 2022 - Elsevier
After more than 70 years of evolution, great achievements have been made in machine
translation. Especially in recent years, translation quality has been greatly improved with the …

Finetuned language models are zero-shot learners

J Wei, M Bosma, VY Zhao, K Guu, AW Yu… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper explores a simple method for improving the zero-shot learning abilities of
language models. We show that instruction tuning--finetuning language models on a …
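The method named here, instruction tuning, amounts to rewriting labeled examples as natural-language instructions before fine-tuning on them. A minimal sketch of that data transformation in Python (the task names and templates below are illustrative placeholders, not the paper's exact templates):

```python
# Sketch: casting labeled examples into instruction/response pairs for
# instruction tuning. Templates are illustrative, not the paper's own.

TEMPLATES = {
    "sentiment": "Is the sentiment of the following review positive or negative?\n{text}",
    "nli": "Premise: {premise}\nHypothesis: {hypothesis}\nDoes the premise entail the hypothesis?",
}

def to_instruction_example(task, fields, answer):
    """Render one labeled example as an (instruction, target) pair."""
    return {"input": TEMPLATES[task].format(**fields), "target": answer}

pair = to_instruction_example("sentiment", {"text": "A deeply moving film."}, "positive")
print(pair["input"])
print(pair["target"])
```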

mT5: A massively multilingual pre-trained text-to-text transformer

L Xue - arXiv preprint arXiv:2010.11934, 2020 - fq.pkwyx.com
The recent" Text-to-Text Transfer Transformer"(T5) leveraged a unified text-to-text format and
scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this …
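The unified text-to-text format referenced here casts every task as mapping an input string to an output string, typically with a task prefix. A small sketch in the style of the T5 setup (the exact prefix strings vary by task and are shown here only as examples):

```python
# Sketch: the T5/mT5 text-to-text convention, where every task becomes
# input text -> output text. Prefixes follow the style of the T5 paper.

def as_text_to_text(task_prefix, source, target):
    return {"input": f"{task_prefix}: {source}", "target": target}

examples = [
    as_text_to_text("translate English to German",
                    "The house is wonderful.", "Das Haus ist wunderbar."),
    as_text_to_text("summarize",
                    "A long article ...", "A short summary ..."),
]
for ex in examples:
    print(ex["input"], "->", ex["target"])
```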

Exploring the limits of transfer learning with a unified text-to-text transformer

C Raffel, N Shazeer, A Roberts, K Lee, S Narang… - Journal of machine …, 2020 - jmlr.org
Transfer learning, where a model is first pre-trained on a data-rich task before being
fine-tuned on a downstream task, has emerged as a powerful technique in natural language …
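The pre-train-then-fine-tune recipe the abstract describes reduces to two optimization phases over the same weights. A toy sketch of that structure (the model, data, and objectives are random stand-ins, not the paper's span-corruption setup):

```python
import torch
from torch import nn

# Sketch: the two-phase transfer-learning recipe. Tiny model, random
# token ids, and toy objectives stand in for the real corpus and tasks.
model = nn.Sequential(nn.Embedding(1000, 32), nn.Flatten(), nn.Linear(32 * 8, 1000))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def step(batch_x, batch_y):
    loss = nn.functional.cross_entropy(model(batch_x), batch_y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Phase 1: pre-train on a large unlabeled corpus (here: random ids).
for _ in range(100):
    x = torch.randint(0, 1000, (16, 8))
    step(x, x[:, 0])

# Phase 2: fine-tune the same weights on the downstream task.
for _ in range(10):
    x = torch.randint(0, 1000, (16, 8))
    step(x, x[:, -1])
```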

Unsupervised cross-lingual representation learning at scale

A Conneau - arXiv preprint arXiv:1911.02116, 2019 - fq.pkwyx.com
This paper shows that pretraining multilingual language models at scale leads to significant
performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer …
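One concrete ingredient of pre-training a single model on many languages at scale is how languages are sampled: raw corpus proportions would drown out low-resource languages, so sampling probabilities are exponentially smoothed. A sketch of that computation, with alpha = 0.3 following the XLM-R setting (the corpus counts below are made up):

```python
# Sketch: exponentially smoothed language sampling for multilingual
# pre-training, so low-resource languages are sampled more often than
# their raw corpus share. alpha = 0.3 follows the XLM-R setting.

def sampling_probs(sentence_counts, alpha=0.3):
    total = sum(sentence_counts.values())
    weights = {lang: (n / total) ** alpha for lang, n in sentence_counts.items()}
    z = sum(weights.values())
    return {lang: w / z for lang, w in weights.items()}

counts = {"en": 300_000_000, "sw": 1_000_000, "yo": 100_000}  # fabricated sizes
for lang, p in sampling_probs(counts).items():
    print(f"{lang}: {p:.3f}")
```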

Multilingual denoising pre-training for neural machine translation

Y Liu - arXiv preprint arXiv:2001.08210, 2020 - fq.pkwyx.com
This paper demonstrates that multilingual denoising pre-training produces significant
performance gains across a wide variety of machine translation (MT) tasks. We present …
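Denoising pre-training trains a sequence-to-sequence model to reconstruct text that has been corrupted, for example by masking contiguous spans and permuting sentences. A much-simplified, word-level sketch of such a noising function (span lengths drawn from a Poisson(3.5) as in the mBART setup; the real implementation operates on subword tokens):

```python
import random
import numpy as np

# Sketch: simplified mBART-style noising. Sentences are permuted, then
# contiguous spans are replaced by a single mask token; the model is
# trained to reconstruct the original text from this corrupted input.

def add_noise(sentences, mask_ratio=0.35, poisson_lambda=3.5, mask="<mask>"):
    random.shuffle(sentences)                     # sentence permutation
    noised = []
    for sent in sentences:
        words = sent.split()
        n_to_mask = int(len(words) * mask_ratio)
        while n_to_mask > 0 and words:
            span = max(1, min(int(np.random.poisson(poisson_lambda)), n_to_mask))
            start = random.randrange(len(words))
            words[start:start + span] = [mask]    # whole span -> one mask token
            n_to_mask -= span
        noised.append(" ".join(words))
    return noised

print(add_noise(["the cat sat on the mat .", "it was warm ."]))
```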

Beyond English-centric multilingual machine translation

A Fan, S Bhosale, H Schwenk, Z Ma, A El-Kishky… - Journal of Machine …, 2021 - jmlr.org
Existing work in translation demonstrated the potential of massively multilingual machine
translation by training a single model able to translate between any pair of languages …
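A single many-to-many model of this kind can translate directly between non-English pairs by conditioning generation on a target-language token. A sketch using the released M2M-100 checkpoint through the Hugging Face transformers API, assuming the package and checkpoint are available in the environment:

```python
# Sketch: direct (non-English-pivot) translation with the released
# M2M-100 checkpoint, following the transformers library's documented
# usage. Requires the transformers package and a model download.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "fr"                         # source language: French
encoded = tokenizer("La vie est belle.", return_tensors="pt")
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.get_lang_id("zh"),  # target: Chinese
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```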

Language-agnostic BERT sentence embedding

F Feng, Y Yang, D Cer, N Arivazhagan… - arXiv preprint arXiv …, 2020 - arxiv.org
While BERT is an effective method for learning monolingual sentence embeddings for
semantic similarity and embedding-based transfer learning (Reimers and Gurevych, 2019) …
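A language-agnostic sentence embedding maps sentences from different languages into one vector space, so translation pairs can be scored by cosine similarity. A sketch using the LaBSE checkpoint hosted for the sentence-transformers package (the package and model id are assumptions about the reader's environment):

```python
# Sketch: embedding sentences in different languages into one space
# with LaBSE and comparing them by cosine similarity. Requires the
# sentence-transformers package and a model download.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/LaBSE")
embeddings = model.encode([
    "How old are you?",        # English
    "Quel âge as-tu ?",        # French translation of the above
    "The weather is nice.",    # unrelated English sentence
])
# A translation pair should score much higher than an unrelated pair.
print(util.cos_sim(embeddings[0], embeddings[1]).item())
print(util.cos_sim(embeddings[0], embeddings[2]).item())
```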

GPipe: Efficient training of giant neural networks using pipeline parallelism

Y Huang, Y Cheng, A Bapna, O Firat… - Advances in neural …, 2019 - proceedings.neurips.cc
Scaling up deep neural network capacity is known to be an effective approach to
improving model quality for several different machine learning tasks. In many cases …
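Pipeline parallelism as in GPipe partitions a sequential network into stages (one per device) and splits each mini-batch into micro-batches so the stages can work concurrently on different micro-batches. A single-device toy sketch that shows only the data flow, not the parallel schedule or re-materialization:

```python
import torch
from torch import nn

# Sketch: the GPipe data flow on one device. Each stage would live on
# its own accelerator; micro-batches stream through the stages so that,
# in the real system, stages process different micro-batches at once.

stages = [nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10)]  # pipeline stages

def pipeline_forward(batch, n_micro=4):
    outputs = []
    for micro in batch.chunk(n_micro):            # split into micro-batches
        for stage in stages:                      # each stage = one "device"
            micro = stage(micro)
        outputs.append(micro)
    return torch.cat(outputs)                     # reassemble the full batch

print(pipeline_forward(torch.randn(32, 64)).shape)
```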