A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
Robust speech recognition via large-scale weak supervision
We study the capabilities of speech processing systems trained simply to predict large
amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual …
SpeechT5: Unified-modal encoder-decoder pre-training for spoken language processing
Motivated by the success of T5 (Text-To-Text Transfer Transformer) in pre-trained natural
language processing models, we propose a unified-modal SpeechT5 framework that …
VoxPopuli: A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
We introduce VoxPopuli, a large-scale multilingual corpus providing 100K hours of
unlabelled speech data in 23 languages. It is the largest open data to date for unsupervised …
Direct speech-to-speech translation with discrete units
We present a direct speech-to-speech translation (S2ST) model that translates speech from
one language to speech in another language without relying on intermediate text …
Transformers in speech processing: A survey
The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …
CoVoST 2 and massively multilingual speech translation
Speech translation (ST) is an increasingly popular topic of research, partly due to the
development of benchmark datasets. Nevertheless, current datasets cover a limited number …
STEMM: Self-learning with speech-text manifold mixup for speech translation
How to learn a better speech representation for end-to-end speech-to-text translation (ST)
with limited labeled data? Existing techniques often attempt to transfer powerful machine …
Speech emotion recognition with multi-task learning
Speech emotion recognition (SER) classifies speech into emotion categories such as:
Happy, Angry, Sad and Neutral. Recently, deep learning has been applied to the SER task …
Cross-modal contrastive learning for speech translation
How can we learn unified representations for spoken utterances and their written text?
Learning similar representations for semantically similar speech and text is important for …