Towards universal speech discrete tokens: A case study for ASR and TTS
Self-supervised learning (SSL) proficiency in speech-related tasks has driven research into
utilizing discrete tokens for speech tasks like recognition and translation, which offer lower …
CTC variations through new WFST topologies
A Laptev, S Majumdar, B Ginsburg - ar** in neural transducer
Neural Transducer and connectionist temporal classification (CTC) are popular end-to-end
automatic speech recognition systems. Due to their frame-synchronous design, blank …
Unsupervised Domain Adaptation on End-to-End Multi-talker Overlapped Speech Recognition
Serialized Output Training (SOT) has emerged as the mainstream approach for addressing
the multi-talker overlapped speech recognition challenge due to its simplicity. However, SOT …
Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR
Self-supervised learning (SSL) based discrete speech representations are highly compact
and domain adaptable. In this paper, SSL discrete speech features extracted from WavLM …
Efficient Cascaded Streaming ASR System via Frame Rate Reduction
In this paper, we explore various frame rate reduction schemes on the two-pass cascaded
encoder model to improve its efficiency without sacrificing the transcription quality. We …
The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge
Discrete speech tokens have been more and more popular in multiple speech processing
fields, including automatic speech recognition (ASR), text-to-speech (TTS) and singing voice …
TTS-Transducer: End-to-End Speech Synthesis with Neural Transducer
This work introduces TTS-Transducer, a novel architecture for text-to-speech, leveraging the
strengths of audio codec models and neural transducers. Transducers, renowned for their …
The Vicomtech Speech Transcription Systems for the Albayzin 2024 Bilingual Basque-Spanish Speech to Text (BBS-S2T) Challenge
This paper presents Vicomtech's submission to the Albayzín 2024 Bilingual Basque-
Spanish Speech-to-Text Challenge, which involves evaluating automatic speech …
Powerful and Extensible WFST Framework for RNN-Transducer Losses
This paper presents a framework based on Weighted Finite-State Transducers (WFST) to
simplify the development of modifications for RNN-Transducer (RNN-T) loss. Existing …