Self-supervised end-to-end ASR for low resource L2 Swedish
Unlike traditional (hybrid) Automatic Speech Recognition (ASR), end-to-end ASR
systems simplify the training procedure by directly mapping acoustic features to sequences …
Improving cross-lingual transfer learning for end-to-end speech recognition with speech translation
Transfer learning from high-resource languages is known to be an efficient way to improve
end-to-end automatic speech recognition (ASR) for low-resource languages. Pre-trained or …
Learning to Jointly Transcribe and Subtitle for End-To-End Spontaneous Speech Recognition
TV subtitles are a rich source of transcriptions of many types of speech, ranging from read
speech in news reports to conversational and spontaneous speech in talk shows and soaps …
Large scale weakly and semi-supervised learning for low-resource video ASR
Many semi- and weakly-supervised approaches have been investigated for overcoming the
labeling cost of building high quality speech recognition systems. On the challenging task of …
Multimodal N-best List Rescoring with Weakly Supervised Pre-training in Hybrid Speech Recognition
Y Song, X Huang, X Zhao, D Jiang… - … Conference on Data …, 2021 - ieeexplore.ieee.org
N-best list rescoring, an essential step in hybrid automatic speech recognition (ASR), aims to
re-evaluate the N-best hypothesis list decoded by the acoustic model (AM) and language …
A Language Model Based Pseudo-Sample Deliberation for Semi-supervised Speech Recognition
C Yi, J Wang, N Cheng, S Zhou… - 2021 International Joint …, 2021 - ieeexplore.ieee.org
End-to-end modeling requires tremendous amounts of transcribed speech to achieve an
automatic speech recognition (ASR) model with high performance. For low-resource ASR …