Self-supervised speech representation learning: A review

A Mohamed, H Lee, L Borgholt… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …

SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

L Barrault, YA Chung, MC Meglioli, D Dale… - arXiv preprint arXiv …, 2023 - arxiv.org
What does it take to create the Babel Fish, a tool that can help individuals translate speech
between any two languages? While recent breakthroughs in text-based models have …

Whisper-AT: Noise-robust automatic speech recognizers are also strong general audio event taggers

Y Gong, S Khurana, L Karlinsky, J Glass - arXiv preprint arXiv:2307.03183, 2023 - arxiv.org
In this paper, we focus on Whisper, a recent automatic speech recognition model trained
with a massive 680k-hour labeled speech corpus recorded in diverse conditions. We first …

A joint speech enhancement and self-supervised representation learning framework for noise-robust speech recognition

QS Zhu, J Zhang, ZQ Zhang… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
Though speech enhancement (SE) can be used to improve speech quality in noisy
environments, it may also cause distortions that degrade the performance of automatic …

How does pre-trained wav2vec 2.0 perform on domain-shifted ASR? An extensive benchmark on air traffic control communications

J Zuluaga-Gomez, A Prasad… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
Recent work on self-supervised pre-training focuses on leveraging large-scale unlabeled
speech data to build robust end-to-end (E2E) acoustic models (AM) that can later be fine …

Robust data2vec: Noise-robust speech representation learning for ASR by combining regression and improved contrastive learning

QS Zhu, L Zhou, J Zhang, SJ Liu… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Self-supervised pre-training methods based on contrastive learning or regression tasks can
utilize more unlabeled data to improve the performance of automatic speech recognition …

Improving distortion robustness of self-supervised speech processing tasks with domain adaptation

KP Huang, YK Fu, Y Zhang, H Lee - arXiv preprint arXiv:2203.16104, 2022 - arxiv.org
Speech distortions are a long-standing problem that degrades the performance of
supervised speech processing models. It is high time that we enhance the …

Gradient remedy for multi-task learning in end-to-end noise-robust speech recognition

Y Hu, C Chen, R Li, Q Zhu… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Speech enhancement (SE) has proved effective in reducing noise from noisy speech signals
for downstream automatic speech recognition (ASR), where a multi-task learning strategy is …

Wav2code: Restore clean speech representations via codebook lookup for noise-robust ASR

Y Hu, C Chen, Q Zhu, ES Chng - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org
Automatic speech recognition (ASR) has achieved remarkable success thanks to recent
advances in deep learning, but it usually degrades significantly under real-world noisy …

De'HuBERT: Disentangling noise in a self-supervised model for robust speech recognition

D Ng, R Zhang, JQ Yip, Z Yang, J Ni… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Existing self-supervised pre-trained speech models have offered an effective way to
leverage massive unannotated corpora to build good automatic speech recognition (ASR) …