Unsupervised automatic speech recognition: A review
Abstract Automatic Speech Recognition (ASR) systems can be trained to achieve
remarkable performance given large amounts of manually transcribed speech, but large …
End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
VoxPopuli: A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
We introduce VoxPopuli, a large-scale multilingual corpus providing 100K hours of
unlabelled speech data in 23 languages. It is the largest open data to date for unsupervised …
MLS: A large-scale multilingual dataset for speech research
This paper introduces the Multilingual LibriSpeech (MLS) dataset, a large multilingual corpus
suitable for speech research. The dataset is derived from read audiobooks from LibriVox …
ContextNet: Improving convolutional neural networks for automatic speech recognition with global context
Convolutional neural networks (CNN) have shown promising results for end-to-end speech
recognition, albeit still behind other state-of-the-art methods in performance. In this paper …
QuartzNet: Deep automatic speech recognition with 1D time-channel separable convolutions
We propose a new end-to-end neural acoustic model for automatic speech recognition. The
model is composed of multiple blocks with residual connections between them. Each block …
A comparison of Transformer and LSTM encoder-decoder models for ASR
We present competitive results using a Transformer encoder-decoder-attention model for
end-to-end speech recognition needing less training time compared to a similarly …
End-to-end ASR: From supervised to semi-supervised learning with modern architectures
We study pseudo-labeling for the semi-supervised training of ResNet, Time-Depth
Separable ConvNets, and Transformers for speech recognition, with either CTC or Seq2Seq …
RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation
We present state-of-the-art automatic speech recognition (ASR) systems employing a
standard hybrid DNN/HMM architecture compared to an attention-based encoder-decoder …
Self-training for end-to-end speech recognition
We revisit self-training in the context of end-to-end speech recognition. We demonstrate that
training with pseudo-labels can substantially improve the accuracy of a baseline model. Key …