Hyporadise: An open baseline for generative speech recognition with large language models
Advancements in deep neural networks have allowed automatic speech recognition (ASR)
systems to attain human parity on several publicly available clean speech datasets …
systems to attain human parity on several publicly available clean speech datasets …
[KSIĄŻKA][B] Deep learning for NLP and speech recognition
With the widespread adoption of deep learning, natural language processing (NLP), and
speech applications in various domains such as finance, healthcare, and government and …
speech applications in various domains such as finance, healthcare, and government and …
The chime-7 dasr challenge: Distant meeting transcription with multiple devices in diverse scenarios
The CHiME challenges have played a significant role in the development and evaluation of
robust automatic speech recognition (ASR) systems. We introduce the CHiME-7 distant ASR …
robust automatic speech recognition (ASR) systems. We introduce the CHiME-7 distant ASR …
Self-taught recognizer: Toward unsupervised adaptation for speech foundation models
We propose an unsupervised adaptation framework, Self-TAught Recognizer (STAR), which
leverages unlabeled data to enhance the robustness of automatic speech recognition (ASR) …
leverages unlabeled data to enhance the robustness of automatic speech recognition (ASR) …
Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study
Speech signals, typically sampled at rates in the tens of thousands per second, contain
redundancies, evoking inefficiencies in sequence modeling. High-dimensional speech …
redundancies, evoking inefficiencies in sequence modeling. High-dimensional speech …
Gradient remedy for multi-task learning in end-to-end noise-robust speech recognition
Speech enhancement (SE) is proved effective in reducing noise from noisy speech signals
for downstream automatic speech recognition (ASR), where multi-task learning strategy is …
for downstream automatic speech recognition (ASR), where multi-task learning strategy is …
Wav2code: Restore clean speech representations via codebook lookup for noise-robust asr
Automatic speech recognition (ASR) has gained remarkable successes thanks to recent
advances of deep learning, but it usually degrades significantly under real-world noisy …
advances of deep learning, but it usually degrades significantly under real-world noisy …
Improving noise robustness of contrastive speech representation learning with speech reconstruction
Noise robustness is essential for deploying automatic speech recognition (ASR) systems in
real-world environments. One way to reduce the effect of noise interference is to employ a …
real-world environments. One way to reduce the effect of noise interference is to employ a …
Deep neural network-based generalized sidelobe canceller for dual-channel far-field speech recognition
The traditional generalized sidelobe canceller (GSC) is a common speech enhancement
front end to improve the noise robustness of automatic speech recognition (ASR) systems in …
front end to improve the noise robustness of automatic speech recognition (ASR) systems in …
Dual-path style learning for end-to-end noise-robust speech recognition
Automatic speech recognition (ASR) systems degrade significantly under noisy conditions.
Recently, speech enhancement (SE) is introduced as front-end to reduce noise for ASR, but …
Recently, speech enhancement (SE) is introduced as front-end to reduce noise for ASR, but …