Turnitin
降AI改写
早检测系统
早降重系统
Turnitin-UK版
万方检测-期刊版
维普编辑部版
Grammarly检测
Paperpass检测
checkpass检测
PaperYY检测
[PDF][PDF] Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
Data augmentation techniques in time series domain: a survey and taxonomy
With the latest advances in deep learning-based generative models, it has not taken long to
take advantage of their remarkable performance in the area of time series. Deep neural …
take advantage of their remarkable performance in the area of time series. Deep neural …
Voicebox: Text-guided multilingual universal speech generation at scale
Large-scale generative models such as GPT and DALL-E have revolutionized the research
community. These models not only generate high fidelity outputs, but are also generalists …
community. These models not only generate high fidelity outputs, but are also generalists …
A high-performance neuroprosthesis for speech decoding and avatar control
Speech neuroprostheses have the potential to restore communication to people living with
paralysis, but naturalistic speed and expressivity are elusive. Here we use high-density …
paralysis, but naturalistic speed and expressivity are elusive. Here we use high-density …
Qwen-audio: Advancing universal audio understanding via unified large-scale audio-language models
Recently, instruction-following audio-language models have received broad attention for
audio interaction with humans. However, the absence of pre-trained audio models capable …
audio interaction with humans. However, the absence of pre-trained audio models capable …
Robust speech recognition via large-scale weak supervision
We study the capabilities of speech processing systems trained simply to predict large
amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual …
amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual …
Beats: Audio pre-training with acoustic tokenizers
The massive growth of self-supervised learning (SSL) has been witnessed in language,
vision, speech, and audio domains over the past few years. While discrete label prediction is …
vision, speech, and audio domains over the past few years. While discrete label prediction is …
End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
learning has brought considerable reductions in word error rate of more than 50% relative …
Masked autoencoders that listen
This paper studies a simple extension of image-based Masked Autoencoders (MAE) to self-
supervised representation learning from audio spectrograms. Following the Transformer …
supervised representation learning from audio spectrograms. Following the Transformer …
Wavlm: Large-scale self-supervised pre-training for full stack speech processing
Self-supervised learning (SSL) achieves great success in speech recognition, while limited
exploration has been attempted for other speech processing tasks. As speech signal …
exploration has been attempted for other speech processing tasks. As speech signal …