Audio self-supervised learning: A survey
Similar to humans' cognitive ability to generalize knowledge and skills, self-supervised
learning (SSL) targets discovering general representations from large-scale data. This …
WavLM: Large-scale self-supervised pre-training for full stack speech processing
Self-supervised learning (SSL) achieves great success in speech recognition, while limited
exploration has been attempted for other speech processing tasks. As speech signal …
Comparative layer-wise analysis of self-supervised speech models
Many self-supervised speech models, varying in their pre-training objective, input modality,
and pre-training data, have been proposed in the last few years. Despite impressive …
ML-SUPERB: Multilingual speech universal performance benchmark
Speech processing Universal PERformance Benchmark (SUPERB) is a leaderboard to
benchmark the performance of Self-Supervised Learning (SSL) models on various speech …
Generative pre-training for speech with flow matching
Generative models have gained more and more attention in recent years for their
remarkable success in tasks that required estimating and sampling data distribution to …
A survey of reasoning with foundation models
Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-
world settings such as negotiation, medical diagnosis, and criminal investigation. It serves …
SpeechPrompt: An exploration of prompt tuning on generative spoken language model for speech processing tasks
Speech representations learned from Self-supervised learning (SSL) models can benefit
various speech processing tasks. However, utilizing SSL representations usually requires …
SUPERB @ SLT 2022: Challenge on generalization and efficiency of self-supervised speech representation learning
We present the SUPERB challenge at SLT 2022, which aims at learning self-supervised
speech representation for better performance, generalization, and efficiency. The challenge …
Speech self-supervised representation benchmarking: Are we doing it right?
Self-supervised learning (SSL) has recently allowed leveraging large datasets of unlabeled
speech signals to reach impressive performance on speech tasks using only small amounts …
On the utility of self-supervised models for prosody-related tasks
Self-Supervised Learning (SSL) from speech data has produced models that have achieved
remarkable performance in many tasks, and that are known to implicitly represent many …