Over-generation cannot be rewarded: Length-adaptive average lagging for simultaneous speech translation

S Papi, M Gaido, M Negri, M Turchi - arXiv preprint arXiv:2206.05807, 2022 - arxiv.org
Simultaneous speech translation (SimulST) systems aim at generating their output with the
lowest possible latency, which is normally computed in terms of Average Lagging (AL). In …
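
Since this entry centers on how latency is measured, a minimal Python sketch of the commonly used SimulEval-style Average Lagging computation for speech input is given below for reference. The optional ref_len argument hints at the length-adaptive correction the paper proposes (normalizing by the longer of hypothesis and reference so over-generation is not rewarded); function and argument names are illustrative, not taken from the paper or any library.

```python
def average_lagging(delays_ms, src_duration_ms, ref_len=None):
    """Sketch of Average Lagging (AL) for speech input, following the
    commonly used SimulEval-style definition. If ref_len is given, the
    normalization uses max(hypothesis length, reference length), i.e. the
    length-adaptive variant discussed in the paper above (assumption).

    delays_ms: d_i, the amount of source audio (ms) read before emitting target token i.
    src_duration_ms: total duration of the source audio in ms.
    ref_len: number of tokens in the reference translation (optional).
    """
    hyp_len = len(delays_ms)
    norm_len = max(hyp_len, ref_len) if ref_len else hyp_len
    gamma = norm_len / src_duration_ms  # target tokens per ms of source

    # tau: 1-based index of the first token emitted after the full source was read
    tau = hyp_len
    for i, d in enumerate(delays_ms, start=1):
        if d >= src_duration_ms:
            tau = i
            break

    lagging = [delays_ms[i - 1] - (i - 1) / gamma for i in range(1, tau + 1)]
    return sum(lagging) / tau


# Example: a 6-second utterance with 5 emitted tokens and a 6-token reference
# print(average_lagging([1000, 2000, 3500, 5000, 6000], 6000, ref_len=6))
```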

Attention as a guide for simultaneous speech translation

S Papi, M Negri, M Turchi - arXiv preprint arXiv:2212.07850, 2022 - arxiv.org
The study of the attention mechanism has sparked interest in many fields, such as language
modeling and machine translation. Although its patterns have been exploited to perform …

Recent advances in end-to-end simultaneous speech translation

X Liu, G Hu, Y Du, E He, YF Luo, C Xu, T Xiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Simultaneous speech translation (SimulST) is a demanding task that involves generating
translations in real-time while continuously processing speech input. This paper offers a …

Alignatt: Using attention-based audio-translation alignments as a guide for simultaneous speech translation

S Papi, M Turchi, M Negri - arXiv preprint arXiv:2305.11408, 2023 - arxiv.org
Attention is the core mechanism of today's most used architectures for natural language
processing and has been analyzed from many perspectives, including its effectiveness for …
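
The AlignAtt entry above describes using audio-translation attention as an emission guide. Below is a minimal sketch of such a decision rule, under the assumption that the core idea is to hold back a candidate token whose cross-attention peaks on the most recent audio frames; the function name, the averaging convention, and the frame_threshold value are illustrative, not the paper's actual interface.

```python
import torch

def alignatt_can_emit(cross_attention, frame_threshold=4):
    """Decide whether a candidate token may be emitted, following an
    AlignAtt-style rule (assumption): emit only if the cross-attention
    does not peak on the last `frame_threshold` audio frames currently
    available, which would suggest the token depends on audio not yet received.

    cross_attention: 1-D tensor of attention weights over encoder frames
                     for the candidate target token (e.g. averaged over heads).
    frame_threshold: number of final frames treated as "too recent"
                     (a tunable hyperparameter; the value here is illustrative).
    """
    num_frames = cross_attention.size(0)
    most_attended = int(torch.argmax(cross_attention).item())
    return most_attended < num_frames - frame_threshold


# Example: attention peaking well before the end of the available audio -> emit
# attn = torch.softmax(torch.randn(50), dim=0)
# print(alignatt_can_emit(attn, frame_threshold=4))
```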

Efficient yet competitive speech translation: FBK@IWSLT2022

M Gaido, S Papi, D Fucci, G Fiameni, M Negri… - arXiv preprint arXiv …, 2022 - arxiv.org
The primary goal of FBK's systems submission to the IWSLT 2022 offline and
simultaneous speech translation tasks is to reduce model training costs without sacrificing …

How "Real" is Your Real-Time Simultaneous Speech-to-Text Translation System?

S Papi, P Polak, O Bojar, D Macháček - arXiv preprint arXiv:2412.18495, 2024 - arxiv.org
Simultaneous speech-to-text translation (SimulST) translates source-language speech into
target-language text concurrently with the speaker's speech, ensuring low latency for better …

Adapting offline speech translation models for streaming with future-aware distillation and inference

B Fu, M Liao, K Fan, Z Huang, B Chen, Y Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
A popular approach to streaming speech translation is to employ a single offline model with
a wait-k policy to support different latency requirements, which is simpler than training …
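
The snippet above refers to the wait-k policy used to serve different latency requirements with a single offline model. For orientation, a minimal sketch of the wait-k read/write schedule is shown below; it is a simplified illustration that treats fixed-size audio chunks as source "segments", not the paper's implementation.

```python
def wait_k_schedule(num_src_segments, num_tgt_tokens, k=3):
    """Return the wait-k read/write schedule as a list of actions:
    read k source segments first, then alternate one write with one read
    until the source is exhausted, after which the remaining tokens are written.

    For speech, a "segment" is typically a fixed-size audio chunk
    (the chunk size is a system choice, not fixed by the policy).
    """
    actions = []
    read, written = 0, 0
    while written < num_tgt_tokens:
        # write target token i only after at least k + i source segments were read
        if read < min(k + written, num_src_segments):
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions


# Example: 10 source chunks, 8 target tokens, k = 3
# print(wait_k_schedule(10, 8, k=3))
```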

Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection

H Wang, G Hu, G Lin, WQ Zhang, J Li - arXiv preprint arXiv:2406.10052, 2024 - arxiv.org
As a robust and large-scale multilingual speech recognition model, Whisper has
demonstrated impressive results in many low-resource and out-of-distribution scenarios …

wav2vec-S: Adapting Pre-trained Speech Models for Streaming

B Fu, K Fan, M Liao, Y Chen, X Shi… - Findings of the …, 2024 - aclanthology.org
Pre-trained speech models, such as wav2vec 2.0, have significantly advanced speech-
related tasks, including speech recognition and translation. However, their applicability in …

Learning when to speak: Latency and quality trade-offs for simultaneous speech-to-speech translation with offline models

L Dugan, A Wadhawan, K Spence… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent work in speech-to-speech translation (S2ST) has focused primarily on offline
settings, where the full input utterance is available before any output is given. This, however …