Neural machine translation for low-resource languages: A survey
Neural Machine Translation (NMT) has seen tremendous growth in the last ten years since
the early 2000s and has already entered a mature phase. While considered the most widely …
the early 2000s and has already entered a mature phase. While considered the most widely …
Beyond english-centric multilingual machine translation
Existing work in translation demonstrated the potential of massively multilingual machine
translation by training a single model able to translate between any pair of languages …
translation by training a single model able to translate between any pair of languages …
Language-agnostic BERT sentence embedding
While BERT is an effective method for learning monolingual sentence embeddings for
semantic similarity and embedding based transfer learning (Reimers and Gurevych, 2019) …
semantic similarity and embedding based transfer learning (Reimers and Gurevych, 2019) …
Making monolingual sentence embeddings multilingual using knowledge distillation
We present an easy and efficient method to extend existing sentence embedding models to
new languages. This allows to create multilingual versions from previously monolingual …
new languages. This allows to create multilingual versions from previously monolingual …
Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond
We introduce an architecture to learn joint multilingual sentence representations for 93
languages, belonging to more than 30 different families and written in 28 different scripts …
languages, belonging to more than 30 different families and written in 28 different scripts …
MLQA: Evaluating cross-lingual extractive question answering
Question answering (QA) models have shown rapid progress enabled by the availability of
large, high-quality benchmark datasets. Such annotated datasets are difficult and costly to …
large, high-quality benchmark datasets. Such annotated datasets are difficult and costly to …
Survey of low-resource machine translation
We present a survey covering the state of the art in low-resource machine translation (MT)
research. There are currently around 7,000 languages spoken in the world and almost all …
research. There are currently around 7,000 languages spoken in the world and almost all …
SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
What does it take to create the Babel Fish, a tool that can help individuals translate speech
between any two languages? While recent breakthroughs in text-based models have …
between any two languages? While recent breakthroughs in text-based models have …
Scaling neural machine translation to 200 languages
NLLB Team - Nature, 2024 - pmc.ncbi.nlm.nih.gov
The development of neural techniques has opened up new avenues for research in
machine translation. Today, neural machine translation (NMT) systems can leverage highly …
machine translation. Today, neural machine translation (NMT) systems can leverage highly …
Wikimatrix: Mining 135m parallel sentences in 1620 language pairs from wikipedia
We present an approach based on multilingual sentence embeddings to automatically
extract parallel sentences from the content of Wikipedia articles in 85 languages, including …
extract parallel sentences from the content of Wikipedia articles in 85 languages, including …