Prompting the hidden talent of web-scale speech models for zero-shot task generalization

P Peng, B Yan, S Watanabe, D Harwath - arXiv preprint arXiv:2305.11095, 2023 - arxiv.org
We investigate the emergent abilities of the recently proposed web-scale speech model
Whisper, by adapting it to unseen tasks with prompt engineering. We selected three tasks …

T-modules: Translation modules for zero-shot cross-modal machine translation

PA Duquenne, H Gong, B Sagot, H Schwenk - arXiv preprint arXiv …, 2022 - arxiv.org
We present a new approach to perform zero-shot cross-modal transfer between speech and
text for translation tasks. Multilingual speech and text are encoded in a joint fixed-size …

Modular speech-to-text translation for zero-shot cross-modal transfer

PA Duquenne, H Schwenk, B Sagot - arXiv preprint arXiv:2310.03724, 2023 - arxiv.org
Recent research has shown that independently trained encoders and decoders, combined
through a shared fixed-size representation, can achieve competitive performance in speech …

Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech understanding

Y Li, A Mehrish, R Bhardwaj… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Fine-tuning is widely used as the default algorithm for transfer learning from pre-trained
models. Parameter inefficiency can however arise when, during transfer learning, all the …

End-to-end speech translation with pre-trained models and adapters: UPC at IWSLT 2021

GI Gállego, I Tsiamas, C Escolano… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper describes the submission to the IWSLT 2021 offline speech translation task by
the UPC Machine Translation group. The task consists of building a system capable of …

Multimodal robustness for neural machine translation

Y Zhao, I Calapodescu - Proceedings of the 2022 conference on …, 2022 - aclanthology.org
In this paper, we look at the case of a Generic text-to-text NMT model that has to deal with
data coming from various modalities, like speech, images, or noisy text extracted from the …

Discrete cross-modal alignment enables zero-shot speech translation

C Wang, Y Liu, B Chen, J Zhang, W Luo… - arXiv preprint arXiv …, 2022 - arxiv.org
End-to-end Speech Translation (ST) aims at translating the source language speech into
target language text without generating the intermediate transcriptions. However, the …

An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation

P Gao, R Zhang, Z He, H Wu, H Wang - arXiv preprint arXiv:2308.14482, 2023 - arxiv.org
Consistency regularization methods, such as R-Drop (Liang et al., 2021) and CrossConST
(Gao et al., 2023), have achieved impressive supervised and zero-shot performance in the …

Towards Zero-shot Learning for End-to-end Cross-modal Translation Models

J Yang, K Fan, M Liao, B Chen… - Findings of the …, 2023 - aclanthology.org
One of the main problems in speech translation is the mismatch between different
modalities. The second problem, scarcity of parallel data covering multiple modalities …

Learning multilingual and multimodal representations with language-specific encoders and decoders for machine translation

C Escolano Peinado - 2022 - upcommons.upc.edu
This thesis aims to study different language-specific approaches for Multilingual Machine
Translation without parameter sharing and their properties compared to the current state-of …