Transformer: A general framework from machine translation to others

Y Zhao, J Zhang, C Zong - Machine Intelligence Research, 2023 - Springer
Abstract: Machine translation is an important and challenging task that aims at automatically
translating natural language sentences from one language into another. Recently …

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

ONE-PEACE: Exploring one general representation model toward unlimited modalities

P Wang, S Wang, J Lin, S Bai, X Zhou, J Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
In this work, we explore a scalable way for building a general representation model toward
unlimited modalities. We release ONE-PEACE, a highly extensible model with 4B …

Recent advances in direct speech-to-text translation

C Xu, R Ye, Q Dong, C Zhao, T Ko, M Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, speech-to-text translation has attracted more and more attention and many studies
have emerged rapidly. In this paper, we present a comprehensive survey on direct speech …

Cross-modal contrastive learning for speech translation

R Ye, M Wang, L Li - arXiv preprint arXiv:2205.02444, 2022 - arxiv.org
How can we learn unified representations for spoken utterances and their written text?
Learning similar representations for semantically similar speech and text is important for …

SLTUNET: A simple unified model for sign language translation

B Zhang, M Müller, R Sennrich - arXiv preprint arXiv:2305.01778, 2023 - arxiv.org
Despite recent successes with neural models for sign language translation (SLT), translation
quality still lags behind spoken languages because of the data scarcity and modality gap …

SpeechUT: Bridging speech and text with hidden-unit for encoder-decoder based speech-text pre-training

Z Zhang, L Zhou, J Ao, S Liu, L Dai, J Li… - arXiv preprint arXiv …, 2022 - arxiv.org
The rapid development of single-modal pre-training has prompted researchers to pay more
attention to cross-modal pre-training methods. In this paper, we propose a unified-modal …

UnitY: Two-pass direct speech-to-speech translation with discrete units

H Inaguma, S Popuri, I Kulikov, PJ Chen… - arXiv preprint arXiv …, 2022 - arxiv.org
Direct speech-to-speech translation (S2ST), in which all components can be optimized
jointly, is advantageous over cascaded approaches for achieving fast inference with a …

SpeechLM: Enhanced speech pre-training with unpaired textual data

Z Zhang, S Chen, L Zhou, Y Wu, S Ren… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
How to boost speech pre-training with textual data is an unsolved problem due to the fact
that speech and text are very different modalities with distinct characteristics. In this paper …

Pre-training for speech translation: CTC meets optimal transport

PH Le, H Gong, C Wang, J Pino… - International …, 2023 - proceedings.mlr.press
The gap between speech and text modalities is a major challenge in speech-to-text
translation (ST). Different methods have been proposed to reduce this gap, but most of them …