Large language models are efficient learners of noise-robust speech recognition

Y Hu, C Chen, CHH Yang, R Li, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in large language models (LLMs) have promoted generative error
correction (GER) for automatic speech recognition (ASR), which leverages the rich linguistic …

Interactive feature fusion for end-to-end noise-robust speech recognition

Y Hu, N Hou, C Chen, ES Chng - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Speech enhancement (SE) aims to suppress the additive noise from noisy speech signals to
improve the speech's perceptual quality and intelligibility. However, the over-suppression …

Gradient remedy for multi-task learning in end-to-end noise-robust speech recognition

Y Hu, C Chen, R Li, Q Zhu… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Speech enhancement (SE) is proved effective in reducing noise from noisy speech signals
for downstream automatic speech recognition (ASR), where multi-task learning strategy is …

Improving RNN transducer based ASR with auxiliary tasks

C Liu, F Zhang, D Le, S Kim, Y Saraf… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org
End-to-end automatic speech recognition (ASR) models with a single neural network have
recently demonstrated state-of-the-art results compared to conventional hybrid speech …

Knowledge distillation-based training of speech enhancement for noise-robust automatic speech recognition

GW Lee, HK Kim, DJ Kong - IEEE Access, 2024 - ieeexplore.ieee.org
This paper addresses the training issues associated with neural network-based automatic
speech recognition (ASR) under noise conditions. In particular, conventional joint training …

Unifying speech enhancement and separation with gradient modulation for end-to-end noise-robust speech separation

Y Hu, C Chen, H Zou, X Zhong… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Recent studies in neural network-based monaural speech separation (SS) have achieved a
remarkable success thanks to increasing ability of long sequence modeling. However, they …

A novel cross-attention fusion-based joint training framework for robust underwater acoustic signal recognition

A Zhou, X Li, W Zhang, D Li, K Deng… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Underwater acoustic signal recognition (UASR) systems face challenges in achieving high
accuracy when processing complex data with low signal-to-noise ratio (SNR) in underwater …

Two-stage deep spectrum fusion for noise-robust end-to-end speech recognition

C Fan, M Ding, J Yi, J Li, Z Lv - Applied Acoustics, 2023 - Elsevier
Recently, speech enhancement (SE) methods have achieved quite good performances.
However, because of the speech distortion problem, the enhanced speech may lose …

Dual-path style learning for end-to-end noise-robust speech recognition

Y Hu, N Hou, C Chen, ES Chng - arXiv preprint arXiv:2203.14838, 2022 - arxiv.org
Automatic speech recognition (ASR) systems degrade significantly under noisy conditions.
Recently, speech enhancement (SE) is introduced as front-end to reduce noise for ASR, but …

Multitask-based joint learning approach to robust ASR for radio communication speech

D Ma, N Hou, H Xu, ES Chng - 2021 Asia-Pacific Signal and …, 2021 - ieeexplore.ieee.org
To realize robust End-to-end Automatic Speech Recognition (E2E ASR) under radio
communication condition, we propose a multitask-based method to jointly train a Speech …