- Academic Search

Full-band general audio synthesis with score-based diffusion

S Pascual, G Bhattacharya, C Yeh… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Recent works have shown the capability of deep generative models to tackle general audio
synthesis from a single label, producing a variety of impulsive, tonal, and environmental …

Cited by 41

The ICML 2022 expressive vocalizations workshop and competition: Recognizing, generating, and personalizing vocal bursts

A Baird, P Tzirakis, G Gidel, M Jiralerspong… - arXiv preprint arXiv …, 2022 - arxiv.org

The ICML Expressive Vocalization (ExVo) Competition is focused on understanding and
generating vocal bursts: laughs, gasps, cries, and other non-verbal vocalizations that are …

Cited by 22

Laughter synthesis using pseudo phonetic tokens with a large-scale in-the-wild laughter corpus

D Xin, S Takamichi, A Morimatsu… - arXiv preprint arXiv …, 2023 - arxiv.org

We present a large-scale in-the-wild Japanese laughter corpus and a laughter synthesis
method. Previous work on laughter synthesis lacks not only data but also proper ways to …

Cited by 9

T-Foley: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis

Y Chung, J Lee, J Nam - ICASSP 2024-2024 IEEE International …, 2024 - ieeexplore.ieee.org

Foley sound, audio content inserted synchronously with videos, plays a critical role in the
user experience of multimedia content. Recently, there has been active research in Foley …

Cited by 13

Optimization Techniques for a Physical Model of Human Vocalisation

M Cámara, Z Xu, Y Zong, JL Blanco… - arXiv preprint arXiv …, 2023 - arxiv.org

We present a non-supervised approach to optimize and evaluate the synthesis of non-speech
audio effects from a speech production model. We use the Pink Trombone …

Cited by 5

Generating diverse vocal bursts with StyleGAN2 and mel-spectrograms

M Jiralerspong, G Gidel - arXiv preprint arXiv:2206.12563, 2022 - arxiv.org

We describe our approach for the generative emotional vocal burst task (ExVo Generate) of
the ICML Expressive Vocalizations Competition. We train a conditional StyleGAN2 …

Cited by 5

Foley sound synthesis in waveform domain with diffusion model

Y Chung, J Lee, J Nam - 2023 - dcase.community

Foley sound synthesis becomes an important task due to the growing popularity of multimedia
content, which is an industrial use case of general audio synthesis. We propose a …

Cited by 1

Generating diverse realistic laughter for interactive art