Neural machine translation: A review

F Stahlberg - Journal of Artificial Intelligence Research, 2020 - jair.org
The field of machine translation (MT), the automatic translation of written text from one
natural language into another, has experienced a major paradigm shift in recent years …

An overview of neural network compression

JO Neill - arXiv preprint arXiv:2006.03669, 2020 - arxiv.org
Overparameterized networks trained to convergence have shown impressive performance
in domains such as computer vision and natural language processing. Pushing state of the …

Knowledge distillation: A survey

J Gou, B Yu, SJ Maybank, D Tao - International Journal of Computer Vision, 2021 - Springer
In recent years, deep neural networks have been successful in both industry and academia,
especially for computer vision tasks. The great success of deep learning is mainly due to its …

Towards understanding ensemble, knowledge distillation and self-distillation in deep learning

Z Allen-Zhu, Y Li - arXiv preprint arXiv:2012.09816, 2020 - arxiv.org
We formally study how an ensemble of deep learning models can improve test accuracy, and
how the superior performance of the ensemble can be distilled into a single model using …

Multilingual neural machine translation with knowledge distillation

X Tan, Y Ren, D He, T Qin, Z Zhao, TY Liu - arXiv preprint arXiv …, 2019 - arxiv.org
Multilingual machine translation, which translates multiple languages with a single model,
has attracted much attention due to its efficiency in offline training and online serving …

ALP-KD: Attention-based layer projection for knowledge distillation

P Passban, Y Wu, M Rezagholizadeh… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
Knowledge distillation is considered a training and compression strategy in
which two neural networks, namely a teacher and a student, are coupled together during …
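The teacher–student coupling mentioned in this snippet usually refers to the standard distillation objective: the student is trained to match the teacher's temperature-softened output distribution. As a minimal illustration only (none of these abstracts give the exact formulation; the KL-divergence form with Hinton-style T² scaling and the function names below are assumptions), the core loss can be sketched in plain Python:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: higher T yields a softer distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL divergence between the teacher's and student's softened
    # distributions, scaled by T^2 so gradients keep comparable
    # magnitude across temperatures (a common convention).
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q)) * temperature ** 2
```

In practice this term is combined with the ordinary cross-entropy against gold labels; layer-projection methods such as ALP-KD add further losses matching intermediate representations, which this sketch does not cover.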

Compression of deep learning models for text: A survey

M Gupta, P Agrawal - ACM Transactions on Knowledge Discovery from …, 2022 - dl.acm.org
In recent years, the fields of natural language processing (NLP) and information retrieval (IR)
have made tremendous progress thanks to deep learning models like Recurrent Neural …

Domain adaptation and multi-domain adaptation for neural machine translation: A survey

D Saunders - Journal of Artificial Intelligence Research, 2022 - jair.org
The development of deep learning techniques has allowed Neural Machine Translation
(NMT) models to become extremely powerful, given sufficient training data and training time …

End-to-end speech translation with knowledge distillation

Y Liu, H Xiong, Z He, J Zhang, H Wu, H Wang… - arXiv preprint arXiv …, 2019 - arxiv.org
End-to-end speech translation (ST), which directly translates from source language speech
into target language text, has attracted intensive attention in recent years. Compared to …

Data diversification: A simple strategy for neural machine translation

XP Nguyen, S Joty, K Wu… - Advances in Neural …, 2020 - proceedings.neurips.cc
We introduce Data Diversification: a simple but effective strategy to boost neural
machine translation (NMT) performance. It diversifies the training data by using the …