SeamlessM4T: Massively Multilingual & Multimodal Machine Translation
What does it take to create the Babel Fish, a tool that can help individuals translate speech
between any two languages? While recent breakthroughs in text-based models have …
Seamless: Multilingual Expressive and Streaming Speech Translation
Large-scale automatic speech translation systems today lack key features that help machine-
mediated communication feel seamless when compared to human-to-human dialogue. In …
End-to-end speech-to-text translation: A survey
N Sethiya, CK Maurya - Computer Speech & Language, 2024 - Elsevier
Abstract Speech-to-Text (ST) translation pertains to the task of converting speech signals in
one language to text in another language. It finds its application in various domains, such as …
Multilingual large language model: A survey of resources, taxonomy and frontiers
Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …
Multi-resolution HuBERT: Multi-resolution speech self-supervised learning with masked unit prediction
Existing Self-Supervised Learning (SSL) models for speech typically process speech signals
at a fixed resolution of 20 milliseconds. This approach overlooks the varying informational …
SALM: Speech-augmented language model with in-context learning for speech recognition and translation
We present a novel Speech Augmented Language Model (SALM) with multitask and in-
context learning capabilities. SALM comprises a frozen text LLM, an audio encoder, a …
A survey of multilingual large language models
Multilingual large language models (MLLMs) leverage advanced large language models to
process and respond to queries across multiple languages, achieving significant success in …
Evaluating multilingual speech translation under realistic conditions with resegmentation and terminology
We present the ACL 60/60 evaluation sets for multilingual translation of ACL 2022 technical
presentations into 10 target languages. This dataset enables further research into …
Joint speech and text machine translation for up to 100 languages
Nature, 2025 - nature.com
Abstract Creating the Babel Fish, a tool that helps individuals translate speech between any
two languages, requires advanced technological innovation and linguistic expertise …
Evaluating self-supervised speech representations for indigenous American languages
The application of self-supervision to speech representation learning has garnered
significant interest in recent years, due to its scalability to large amounts of unlabeled data …