emotion2vec: Self-supervised pre-training for speech emotion representation
We propose emotion2vec, a universal speech emotion representation model. emotion2vec
is pre-trained on open-source unlabeled emotion data through self-supervised online …
Foundation models for music: A survey
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …
SpeechUT: Bridging speech and text with hidden-unit for encoder-decoder based speech-text pre-training
The rapid development of single-modal pre-training has prompted researchers to pay more
attention to cross-modal pre-training methods. In this paper, we propose a unified-modal …
SpeechLM: Enhanced speech pre-training with unpaired textual data
How to boost speech pre-training with textual data is an unsolved problem, because
speech and text are very different modalities with distinct characteristics. In this paper …
Reducing barriers to self-supervised learning: HuBERT pre-training with academic compute
Self-supervised learning (SSL) has led to great strides in speech processing. However, the
resources needed to train these models have become prohibitively large as they continue to …
MT4SSL: Boosting self-supervised speech representation learning by integrating multiple targets
In this paper, we provide a new perspective on self-supervised speech models from how the
self-training targets are obtained. We generalize the targets extractor into Offline Targets …
Pushing the limits of unsupervised unit discovery for SSL speech representation
The excellent generalization ability of self-supervised learning (SSL) for speech foundation
models has garnered significant attention. HuBERT is a successful example that utilizes …
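The offline unit discovery that the snippet above attributes to HuBERT can be illustrated in a few lines: k-means clustering assigns each acoustic frame a discrete unit ID, and those IDs serve as masked-prediction targets during pre-training. This is a minimal sketch with random features standing in for real MFCCs; the array shapes and cluster count are illustrative assumptions, not the paper's configuration.

    import numpy as np
    from sklearn.cluster import KMeans

    # Toy stand-in for acoustic features (random here; HuBERT's first iteration
    # clusters MFCCs, and later iterations re-cluster the model's hidden states).
    rng = np.random.default_rng(0)
    frames = rng.standard_normal((5000, 39)).astype(np.float32)  # (num_frames, feat_dim)

    # Offline k-means turns each frame into a discrete "hidden unit" ID; these
    # pseudo-labels become the prediction targets for masked pre-training.
    kmeans = KMeans(n_clusters=100, n_init=10, random_state=0).fit(frames)
    pseudo_labels = kmeans.predict(frames)  # one unit ID per frame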
Fast-HuBERT: An efficient training framework for self-supervised speech representation learning
Recent years have witnessed significant advancements in self-supervised learning (SSL)
methods for speech-processing tasks. Various speech-based SSL models have been …
CTCBERT: Advancing hidden-unit BERT with CTC objectives
In this work, we present a simple but effective method, CTCBERT, for advancing hidden-unit
BERT (HuBERT). HuBERT applies a frame-level cross-entropy (CE) loss, which is similar to …
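To make the contrast in this snippet concrete, here is a minimal PyTorch sketch of the two objectives it names: HuBERT's frame-level cross-entropy against aligned per-frame pseudo-labels, versus a CTC loss over a shorter, unaligned label sequence, the direction CTCBERT explores. All shapes, the unit vocabulary size, and the random targets are illustrative assumptions, not values from the paper.

    import torch
    import torch.nn.functional as F

    B, T, V = 2, 100, 504                  # batch, frames, unit vocabulary (assumed)
    logits = torch.randn(B, T, V)          # frame-level encoder predictions

    # HuBERT-style objective: cross-entropy with one pseudo-label per frame,
    # i.e. a strict frame-to-label alignment.
    frame_targets = torch.randint(0, V, (B, T))
    ce_loss = F.cross_entropy(logits.transpose(1, 2), frame_targets)

    # CTC-style objective: the label sequence is shorter than T and unaligned;
    # CTC marginalizes over all monotonic alignments. Index 0 is the blank.
    log_probs = F.log_softmax(logits, dim=-1).transpose(0, 1)  # (T, B, V)
    target_lens = torch.tensor([30, 42])
    targets = torch.cat([torch.randint(1, V, (int(n),)) for n in target_lens])
    input_lens = torch.full((B,), T, dtype=torch.long)
    ctc_loss = F.ctc_loss(log_probs, targets, input_lens, target_lens, blank=0)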
Token2vec: A joint self-supervised pre-training framework using unpaired speech and text
Self-supervised pre-training has been successful in both text and speech processing.
Speech and text offer different but complementary information. The question is whether we …