- Academic Search

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 440

WeNet: Production oriented streaming and non-streaming end-to-end speech recognition toolkit

Z Yao, D Wu, X Wang, B Zhang, F Yu, C Yang… - ar…
… from source conditional data X to target data Y. The target Y (e.g., text, speech, music, image …

Cited by 13

Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition

Q Shao, P Guo, J Yan, P Hu… - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org

Accents pose significant challenges for speech recognition systems. Although joint
automatic speech recognition (ASR) and accent recognition (AR) training has been proven …

Cited by 6

Non-autoregressive error correction for CTC-based ASR with phone-conditioned masked LM

H Futami, H Inaguma, S Ueno, M Mimura… - arXiv preprint arXiv …, 2022 - arxiv.org

Connectionist temporal classification (CTC)-based models are attractive in automatic
speech recognition (ASR) because of their non-autoregressive nature. To take advantage of …

Cited by 13

Improving deliberation by text-only and semi-supervised training

K Hu, TN Sainath, Y He, R Prabhavalkar… - arXiv preprint arXiv …, 2022 - arxiv.org

Text-only and semi-supervised training based on audio-only data has gained popularity
recently due to the wide availability of unlabeled text and speech data. In this work, we …

Cited by 12

Cascade RNN-transducer: Syllable based streaming on-device Mandarin speech recognition with...

Recent advances in end-to-end automatic speech recognition
