Full-band general audio synthesis with score-based diffusion

S Pascual, G Bhattacharya, C Yeh… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Recent works have shown the capability of deep generative models to tackle general audio
synthesis from a single label, producing a variety of impulsive, tonal, and environmental …

The ICML 2022 expressive vocalizations workshop and competition: Recognizing, generating, and personalizing vocal bursts

A Baird, P Tzirakis, G Gidel, M Jiralerspong… - arXiv preprint arXiv …, 2022 - arxiv.org
The ICML Expressive Vocalization (ExVo) Competition is focused on understanding and
generating vocal bursts: laughs, gasps, cries, and other non-verbal vocalizations that are …

Laughter synthesis using pseudo phonetic tokens with a large-scale in-the-wild laughter corpus

D Xin, S Takamichi, A Morimatsu… - arXiv preprint arXiv …, 2023 - arxiv.org
We present a large-scale in-the-wild Japanese laughter corpus and a laughter synthesis
method. Previous work on laughter synthesis lacks not only data but also proper ways to …

T-Foley: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis

Y Chung, J Lee, J Nam - ICASSP 2024-2024 IEEE International …, 2024 - ieeexplore.ieee.org
Foley sound, audio content inserted synchronously with videos, plays a critical role in the
user experience of multimedia content. Recently, there has been active research in Foley …

Optimization Techniques for a Physical Model of Human Vocalisation

M Cámara, Z Xu, Y Zong, JL Blanco… - arXiv preprint arXiv …, 2023 - arxiv.org
We present a non-supervised approach to optimize and evaluate the synthesis of non-
speech audio effects from a speech production model. We use the Pink Trombone …

Generating diverse vocal bursts with StyleGAN2 and mel-spectrograms

M Jiralerspong, G Gidel - arXiv preprint arXiv:2206.12563, 2022 - arxiv.org
We describe our approach for the generative emotional vocal burst task (ExVo Generate) of
the ICML Expressive Vocalizations Competition. We train a conditional StyleGAN2 …

Foley sound synthesis in waveform domain with diffusion model

Y Chung, J Lee, J Nam - 2023 - dcase.community
Foley sound synthesis becomes an important task due to the growing popularity of multimedia
content, which is an industrial use case of general audio synthesis. We propose a …