Google Scholar

A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 246

A comprehensive review of multimodal large language models: Performance and challenges across different tasks

J Wang, H Jiang, Y Liu, C Ma, X Zhang, Y Pan… - arXiv preprint arXiv …, 2024 - arxiv.org

In an era defined by the explosive growth of data and rapid technological advancements,
Multimodal Large Language Models (MLLMs) stand at the forefront of artificial intelligence …

Cited by 28

SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

L Barrault, YA Chung, MC Meglioli, D Dale… - arXiv preprint arXiv …, 2023 - arxiv.org

What does it take to create the Babel Fish, a tool that can help individuals translate speech
between any two languages? While recent breakthroughs in text-based models have …

Cited by 116

Direct speech-to-speech translation with discrete units

A Lee, PJ Chen, C Wang, J Gu, S Popuri, X Ma… - arXiv preprint arXiv …, 2021 - arxiv.org

We present a direct speech-to-speech translation (S2ST) model that translates speech from
one language to speech in another language without relying on intermediate text …

Cited by 172

STEMM: Self-learning with speech-text manifold mixup for speech translation

Q Fang, R Ye, L Li, Y Feng, M Wang - arXiv preprint arXiv:2203.10426, 2022 - arxiv.org

How to learn a better speech representation for end-to-end speech-to-text translation (ST)
with limited labeled data? Existing techniques often attempt to transfer powerful machine …

Cited by 103

ESPnet-ST: All-in-one speech translation toolkit

H Inaguma, S Kiyono, K Duh, S Karita… - arXiv preprint arXiv …, 2020 - arxiv.org

We present ESPnet-ST, which is designed for the quick development of speech-to-speech
translation systems in a single framework. ESPnet-ST is a new project inside end-to-end …

Cited by 179

Multilingual speech translation with efficient finetuning of pretrained models

X Li, C Wang, Y Tang, C Tran, Y Tang, J Pino… - arXiv preprint arXiv …, 2020 - arxiv.org

We present a simple yet effective approach to build multilingual speech-to-text (ST)
translation by efficient transfer learning from pretrained speech encoder and text decoder …

Cited by 148

Enhanced direct speech-to-speech translation using self-supervised pre-training and data augmentation

S Popuri, PJ Chen, C Wang, J Pino, Y Adi, J Gu… - arXiv preprint arXiv …, 2022 - arxiv.org

Direct speech-to-speech translation (S2ST) models suffer from data scarcity issues as there
exists little parallel S2ST data, compared to the amount of data available for conventional …

Cited by 66

Cascade versus direct speech translation: Do the differences still make a difference?

L Bentivogli, M Cettolo, M Gaido, A Karakanta… - arXiv preprint arXiv …, 2021 - arxiv.org

Five years after the first published proofs of concept, direct approaches to speech translation
(ST) are now competing with traditional cascade solutions. In light of this steady progress …

Cited by 84

Learning shared semantic space for speech-to-text translation

C Han, M Wang, H Ji, L Li - arXiv preprint arXiv:2105.03095, 2021 - arxiv.org

Having numerous potential applications and great impact, end-to-end speech translation
(ST) has long been treated as an independent task, failing to fully draw strength from the …

Cited by 82
