Accelerating RNN-T training and inference using CTC guidance
We propose a novel method to accelerate training and inference process of recurrent neural
network transducer (RNN-T) based on the guidance from a co-trained connectionist …
Adapting large language model with speech for fully formatted end-to-end speech recognition
Most end-to-end (E2E) speech recognition models are composed of encoder and decoder
blocks that perform acoustic and language modeling functions. Pretrained large language …
Decoder-only architecture for speech recognition with CTC prompts and text data augmentation
E Tsunoo, H Futami, Y Kashiwagi, S Arora… - ar** in neural transducer