Automatic speech recognition: Systematic literature review

S Alharbi, M Alrazgan, A Alrashed, T Alnomasi… - IEEE …, 2021 - ieeexplore.ieee.org
A huge amount of research has been done in the field of speech signal processing in recent
years. In particular, there has been increasing interest in the automatic speech recognition …

Adaptation algorithms for neural network-based speech recognition: An overview

P Bell, J Fainberg, O Klejch, J Li… - IEEE Open Journal …, 2020 - ieeexplore.ieee.org
We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …

Personalizing ASR for dysarthric and accented speech with limited data

J Shor, D Emanuel, O Lang, O Tuval, M Brenner… - arXiv preprint arXiv …, 2019 - arxiv.org
Automatic speech recognition (ASR) systems have dramatically improved over the last few
years. ASR systems are most often trained from 'typical' speech, which means that …

Toward domain-invariant speech recognition via large scale training

A Narayanan, A Misra, KC Sim… - 2018 IEEE Spoken …, 2018 - ieeexplore.ieee.org
Current state-of-the-art automatic speech recognition systems are trained to work in
specific 'domains', defined based on factors like application, sampling rate and codec. When …

Personalization of end-to-end speech recognition on mobile devices for named entities

KC Sim, F Beaufays, A Benard… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org
We study the effectiveness of several techniques to personalize end-to-end speech models
and improve the recognition of proper names relevant to the user. These techniques differ in …

An investigation into on-device personalization of end-to-end automatic speech recognition models

KC Sim, P Zadrazil, F Beaufays - arXiv preprint arXiv:1909.06678, 2019 - arxiv.org
Speaker-independent speech recognition systems trained with data from many users are
generally robust against speaker variability and work well for a large population of speakers …

Modular domain adaptation for Conformer-based streaming ASR

Q Li, B Li, D Hwang, TN Sainath… - arXiv preprint arXiv …, 2023 - arxiv.org
Speech data from different domains has distinct acoustic and linguistic characteristics. It is
common to train a single multidomain model such as a Conformer transducer for speech …

A Comparison of Supervised and Unsupervised Pre-Training of End-to-End Models.

A Misra, D Hwang, Z Huo, S Garg, N Siddhartha… - Interspeech, 2021 - isca-archive.org
In the absence of large-scale in-domain supervised training data, ASR models can achieve
reasonable performance through pre-training on additional data that is unlabeled …

A comparison of parameter-efficient ASR domain adaptation methods for universal speech and language models

KC Sim, Z Huo, T Munkhdalai… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
A recent paradigm shift in artificial intelligence has seen the rise of foundation models, such
as the large language models and the universal speech models. With billions of model …

Boosting cross-domain speech recognition with self-supervision

H Zhu, G Cheng, J Wang, W Hou… - … /ACM Transactions on …, 2023 - ieeexplore.ieee.org
The cross-domain performance of automatic speech recognition (ASR) could be severely
hampered due to the mismatch between training and testing distributions. Since the target …