SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

L Barrault, YA Chung, MC Meglioli, D Dale… - arXiv preprint arXiv …, 2023 - arxiv.org
What does it take to create the Babel Fish, a tool that can help individuals translate speech
between any two languages? While recent breakthroughs in text-based models have …

Seamless: Multilingual Expressive and Streaming Speech Translation

L Barrault, YA Chung, MC Meglioli, D Dale… - arXiv preprint arXiv …, 2023 - arxiv.org
Large-scale automatic speech translation systems today lack key features that help machine-
mediated communication feel seamless when compared to human-to-human dialogue. In …

End-to-end speech-to-text translation: A survey

N Sethiya, CK Maurya - Computer Speech & Language, 2024 - Elsevier
Speech-to-Text (ST) translation pertains to the task of converting speech signals in
one language to text in another language. It finds its application in various domains, such as …

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - arXiv preprint arXiv …, 2024 - arxiv.org
Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …

Multi-resolution HuBERT: Multi-resolution speech self-supervised learning with masked unit prediction

J Shi, H Inaguma, X Ma, I Kulikov, A Sun - arXiv preprint arXiv:2310.02720, 2023 - arxiv.org
Existing Self-Supervised Learning (SSL) models for speech typically process speech signals
at a fixed resolution of 20 milliseconds. This approach overlooks the varying informational …

SALM: Speech-augmented language model with in-context learning for speech recognition and translation

Z Chen, H Huang, A Andrusenko… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
We present a novel Speech Augmented Language Model (SALM) with multitask and
in-context learning capabilities. SALM comprises a frozen text LLM, an audio encoder, a …

A survey of multilingual large language models

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao, M Li… - Patterns, 2025 - cell.com
Multilingual large language models (MLLMs) leverage advanced large language models to
process and respond to queries across multiple languages, achieving significant success in …

Evaluating multilingual speech translation under realistic conditions with resegmentation and terminology

E Salesky, K Darwish, M Al-Badrashiny… - Proceedings of the …, 2023 - aclanthology.org
We present the ACL 60/60 evaluation sets for multilingual translation of ACL 2022 technical
presentations into 10 target languages. This dataset enables further research into …

Joint speech and text machine translation for up to 100 languages

Nature, 2025 - nature.com
Creating the Babel Fish, a tool that helps individuals translate speech between any
two languages, requires advanced technological innovation and linguistic expertise …

Evaluating self-supervised speech representations for indigenous American languages

CC Chen, W Chen, R Zevallos, JE Ortega - arXiv preprint arXiv …, 2023 - arxiv.org
The application of self-supervision to speech representation learning has garnered
significant interest in recent years, due to its scalability to large amounts of unlabeled data …