Neural machine translation: A review

F Stahlberg - Journal of Artificial Intelligence Research, 2020 - jair.org
The field of machine translation (MT), the automatic translation of written text from one
natural language into another, has experienced a major paradigm shift in recent years …

Position information in transformers: An overview

P Dufter, M Schmitt, H Schütze - Computational Linguistics, 2022 - direct.mit.edu
Transformers are arguably the main workhorse in recent natural language processing
research. By definition, a Transformer is invariant with respect to reordering of the input …

Language model tokenizers introduce unfairness between languages

A Petrov, E La Malfa, P Torr… - Advances in Neural …, 2023 - proceedings.neurips.cc
Recent language models have shown impressive multilingual performance, even when not
explicitly trained for it. Despite this, there are concerns about the quality of their outputs …

ByT5: Towards a token-free future with pre-trained byte-to-byte models

L Xue, A Barua, N Constant, R Al-Rfou… - Transactions of the …, 2022 - direct.mit.edu
Most widely used pre-trained language models operate on sequences of tokens
corresponding to word or subword units. By comparison, token-free models that operate …

AdapterFusion: Non-destructive task composition for transfer learning

J Pfeiffer, A Kamath, A Rücklé, K Cho… - arXiv preprint arXiv …, 2020 - arxiv.org
Sequential fine-tuning and multi-task learning are methods aiming to incorporate knowledge
from multiple tasks; however, they suffer from catastrophic forgetting and difficulties in …

Adversarial attacks on deep-learning models in natural language processing: A survey

WE Zhang, QZ Sheng, A Alhazmi, C Li - ACM Transactions on Intelligent …, 2020 - dl.acm.org
With the development of high computational devices, deep neural networks (DNNs), in
recent years, have gained significant popularity in many Artificial Intelligence (AI) …

CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation

JH Clark, D Garrette, I Turc, J Wieting - Transactions of the Association …, 2022 - direct.mit.edu
Pipelined NLP systems have largely been superseded by end-to-end neural modeling, yet
nearly all commonly used models still require an explicit tokenization step. While recent …

Improving massively multilingual neural machine translation and zero-shot translation

B Zhang, P Williams, I Titov, R Sennrich - arXiv preprint arXiv:2004.11867, 2020 - arxiv.org
Massively multilingual models for neural machine translation (NMT) are theoretically
attractive, but often underperform bilingual models and deliver poor zero-shot translations. In …

Massively multilingual neural machine translation in the wild: Findings and challenges

N Arivazhagan, A Bapna, O Firat, D Lepikhin… - arXiv preprint arXiv …, 2019 - arxiv.org
We introduce our efforts towards building a universal neural machine translation (NMT)
system capable of translating between any language pair. We set a milestone towards this …

Unsupervised neural machine translation

M Artetxe, G Labaka, E Agirre, K Cho - arXiv preprint arXiv:1710.11041, 2017 - arxiv.org
In spite of the recent success of neural machine translation (NMT) in standard benchmarks,
the lack of large parallel corpora poses a major practical problem for many language pairs …