Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
Self-supervised speech representation learning: A review
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …
SLURP: A spoken language understanding resource package
Spoken Language Understanding infers semantic meaning directly from audio data, and
thus promises to reduce error propagation and misunderstandings in end-user applications …
Deep learning in diverse intelligent sensor based systems
Deep learning has become a predominant method for solving data analysis problems in
virtually all fields of science and engineering. The increasing complexity and the large …
Large-scale ASR domain adaptation using self- and semi-supervised learning
Self- and semi-supervised learning methods have been actively investigated to reduce
labeled training data or enhance model performance. However, these approaches mostly …
Unsupervised domain adaptation for speech recognition via uncertainty driven self-training
The performance of automatic speech recognition (ASR) systems typically degrades
significantly when the training and test data domains are mismatched. In this paper, we …
Modular domain adaptation for Conformer-based streaming ASR
Speech data from different domains has distinct acoustic and linguistic characteristics. It is
common to train a single multidomain model such as a Conformer transducer for speech …
Confidence score based speaker adaptation of conformer speech recognition systems
Speaker adaptation techniques provide a powerful solution to customise automatic speech
recognition (ASR) systems for individual users. Practical application of unsupervised model …
On addressing practical challenges for RNN-transducer
In this paper, several works are proposed to address practical challenges for deploying
RNN Transducer (RNN-T) based speech recognition systems. These challenges are …
Generalizing speaker verification for spoof awareness in the embedding space
It is now well-known that automatic speaker verification (ASV) systems can be spoofed using
various types of adversaries. The usual approach to counteract ASV systems against such …