Full-band general audio synthesis with score-based diffusion
Recent works have shown the capability of deep generative models to tackle general audio
synthesis from a single label, producing a variety of impulsive, tonal, and environmental …
The ICML 2022 expressive vocalizations workshop and competition: Recognizing, generating, and personalizing vocal bursts
The ICML Expressive Vocalization (ExVo) Competition is focused on understanding and
generating vocal bursts: laughs, gasps, cries, and other non-verbal vocalizations that are …
Laughter synthesis using pseudo phonetic tokens with a large-scale in-the-wild laughter corpus
We present a large-scale in-the-wild Japanese laughter corpus and a laughter synthesis
method. Previous work on laughter synthesis lacks not only data but also proper ways to …
T-Foley: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis
Foley sound, audio content inserted synchronously with videos, plays a critical role in the
user experience of multimedia content. Recently, there has been active research in Foley …
Optimization Techniques for a Physical Model of Human Vocalisation
We present a non-supervised approach to optimize and evaluate the synthesis of non-
speech audio effects from a speech production model. We use the Pink Trombone …
Generating diverse vocal bursts with StyleGAN2 and mel-spectrograms
We describe our approach for the generative emotional vocal burst task (ExVo Generate) of
the ICML Expressive Vocalizations Competition. We train a conditional StyleGAN2 …
Foley sound synthesis in waveform domain with diffusion model
Foley sound synthesis becomes an important task due to the growing popularity of
multimedia content, which is an industrial use case of general audio synthesis. We propose a …