HyPoradise: An open baseline for generative speech recognition with large language models

C Chen, Y Hu, CHH Yang… - Advances in …, 2023 - proceedings.neurips.cc
Advancements in deep neural networks have allowed automatic speech recognition (ASR)
systems to attain human parity on several publicly available clean speech datasets …

[BOOK][B] Deep learning for NLP and speech recognition

U Kamath, J Liu, J Whitaker - 2019 - Springer
With the widespread adoption of deep learning, natural language processing (NLP), and
speech applications in various domains such as finance, healthcare, and government and …

The CHiME-7 DASR challenge: Distant meeting transcription with multiple devices in diverse scenarios

S Cornell, M Wiesner, S Watanabe, D Raj… - arXiv preprint arXiv …, 2023 - arxiv.org
The CHiME challenges have played a significant role in the development and evaluation of
robust automatic speech recognition (ASR) systems. We introduce the CHiME-7 distant ASR …

Self-taught recognizer: Toward unsupervised adaptation for speech foundation models

Y Hu, C Chen, CH Yang, C Qin… - Advances in …, 2025 - proceedings.neurips.cc
We propose an unsupervised adaptation framework, Self-TAught Recognizer (STAR), which
leverages unlabeled data to enhance the robustness of automatic speech recognition (ASR) …

Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study

X Chang, B Yan, K Choi, JW Jung, Y Lu… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Speech signals, typically sampled at rates in the tens of thousands per second, contain
redundancies, evoking inefficiencies in sequence modeling. High-dimensional speech …

Gradient remedy for multi-task learning in end-to-end noise-robust speech recognition

Y Hu, C Chen, R Li, Q Zhu… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Speech enhancement (SE) is proved effective in reducing noise from noisy speech signals
for downstream automatic speech recognition (ASR), where multi-task learning strategy is …

Wav2code: Restore clean speech representations via codebook lookup for noise-robust ASR

Y Hu, C Chen, Q Zhu, ES Chng - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org
Automatic speech recognition (ASR) has gained remarkable successes thanks to recent
advances of deep learning, but it usually degrades significantly under real-world noisy …

Improving noise robustness of contrastive speech representation learning with speech reconstruction

H Wang, Y Qian, X Wang, Y Wang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Noise robustness is essential for deploying automatic speech recognition (ASR) systems in
real-world environments. One way to reduce the effect of noise interference is to employ a …

Deep neural network-based generalized sidelobe canceller for dual-channel far-field speech recognition

G Li, S Liang, S Nie, W Liu, Z Yang - Neural Networks, 2021 - Elsevier
The traditional generalized sidelobe canceller (GSC) is a common speech enhancement
front end to improve the noise robustness of automatic speech recognition (ASR) systems in …

Dual-path style learning for end-to-end noise-robust speech recognition

Y Hu, N Hou, C Chen, ES Chng - arXiv preprint arXiv:2203.14838, 2022 - arxiv.org
Automatic speech recognition (ASR) systems degrade significantly under noisy conditions.
Recently, speech enhancement (SE) is introduced as front-end to reduce noise for ASR, but …