An overview of end-to-end automatic speech recognition
D Wang, X Wang, S Lv - Symmetry, 2019 - mdpi.com
Automatic speech recognition, especially large vocabulary continuous speech recognition,
is an important issue in the field of machine learning. For a long time, the hidden Markov …
Adaptation algorithms for neural network-based speech recognition: An overview
We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …
End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
End-to-end attention-based large vocabulary speech recognition
Many state-of-the-art Large Vocabulary Continuous Speech Recognition (LVCSR) Systems
are hybrids of neural networks and Hidden Markov Models (HMMs). Recently, more direct …
An actor-critic algorithm for sequence prediction
We present an approach to training neural networks to generate sequences using actor-
critic methods from reinforcement learning (RL). Current log-likelihood training methods are …
cuDNN: Efficient primitives for deep learning
S Chetlur, C Woolley, P Vandermersch… - arXiv preprint arXiv …, 2014 - arxiv.org
We present a library of efficient implementations of deep learning primitives. Deep learning
workloads are computationally intensive, and optimizing their kernels is difficult and time …
EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding
The performance of automatic speech recognition (ASR) has improved tremendously due to
the application of deep neural networks (DNNs). Despite this progress, building a new ASR …
TED-LIUM 3: Twice as much data and corpus repartition for experiments on speaker adaptation
In this paper, we present TED-LIUM release 3 corpus (TED-LIUM 3 is available on
https://lium.univ-lemans.fr/ted-lium3/) dedicated to speech recognition in English, which …
Streaming automatic speech recognition with the transformer model
Encoder-decoder based sequence-to-sequence models have demonstrated state-of-the-art
results in end-to-end automatic speech recognition (ASR). Recently, the transformer …
Stochastic fine-grained labeling of multi-state sign glosses for continuous sign language recognition
In this paper, we propose novel stochastic modeling of various components of a continuous
sign language recognition (CSLR) system that is based on the transformer encoder and …