Progress in machine translation

H Wang, H Wu, Z He, L Huang, KW Church - Engineering, 2022 - Elsevier
After more than 70 years of evolution, great achievements have been made in machine
translation. Especially in recent years, translation quality has been greatly improved with the …

VioLA: Conditional language models for speech recognition, synthesis, and translation

T Wang, L Zhou, Z Zhang, Y Wu, S Liu… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
Recent research shows a big convergence in model architecture, training objectives, and
inference methods across various tasks for different modalities. In this paper, we propose …

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

L Barrault, YA Chung, MC Meglioli, D Dale… - arXiv preprint arXiv …, 2023 - arxiv.org
What does it take to create the Babel Fish, a tool that can help individuals translate speech
between any two languages? While recent breakthroughs in text-based models have …

Direct speech-to-speech translation with discrete units

A Lee, PJ Chen, C Wang, J Gu, S Popuri, X Ma… - arXiv preprint arXiv …, 2021 - arxiv.org
We present a direct speech-to-speech translation (S2ST) model that translates speech from
one language to speech in another language without relying on intermediate text …

Direct speech-to-speech translation with a sequence-to-sequence model

Y Jia, RJ Weiss, F Biadsy, W Macherey… - arXiv preprint arXiv …, 2019 - arxiv.org
We present an attention-based sequence-to-sequence neural network which can directly
translate speech from one language into speech in another language, without relying on an …

DASpeech: Directed acyclic transformer for fast and high-quality speech-to-speech translation

Q Fang, Y Zhou, Y Feng - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Direct speech-to-speech translation (S2ST) translates speech from one language into
another using a single model. However, due to the presence of linguistic and acoustic …

UnitY: Two-pass direct speech-to-speech translation with discrete units

H Inaguma, S Popuri, I Kulikov, PJ Chen… - arXiv preprint arXiv …, 2022 - arxiv.org
Direct speech-to-speech translation (S2ST), in which all components can be optimized
jointly, is advantageous over cascaded approaches to achieve fast inference with a …

Enhanced direct speech-to-speech translation using self-supervised pre-training and data augmentation

S Popuri, PJ Chen, C Wang, J Pino, Y Adi, J Gu… - arXiv preprint arXiv …, 2022 - arxiv.org
Direct speech-to-speech translation (S2ST) models suffer from data scarcity issues as there
exists little parallel S2ST data, compared to the amount of data available for conventional …

TranSpeech: Speech-to-speech translation with bilateral perturbation

R Huang, J Liu, H Liu, Y Ren, L Zhang, J He… - arXiv preprint arXiv …, 2022 - arxiv.org
Direct speech-to-speech translation (S2ST) with discrete units leverages recent progress in
speech representation learning. Specifically, a sequence of discrete representations derived …

PolyVoice: Language models for speech to speech translation

Q Dong, Z Huang, Q Tian, C Xu, T Ko, Y Zhao… - arXiv preprint arXiv …, 2023 - arxiv.org
We propose PolyVoice, a language model-based framework for speech-to-speech
translation (S2ST) system. Our framework consists of two language models: a translation …