- Academic Search

A review of differentiable digital signal processing for music and speech synthesis

B Hayes, J Shier, G Fazekas, A McPherson… - Frontiers in Signal …, 2024 - frontiersin.org

The term “differentiable digital signal processing” describes a family of techniques in which
loss function gradients are backpropagated through digital signal processors, facilitating …

Cited by 27 · All 6 versions

[PDF] qmul.ac.uk

The state of the art in procedural audio

D Menexopoulos, P Pestana, J Reiss - Journal of the Audio Engineering …, 2023 - aes.org

Procedural audio may be defined as real-time sound generation according to programmatic
rules and live input. It is often considered a subset of sound synthesis and is especially …

Cited by 9 · All 5 versions

[PDF] arxiv.org

Adapting frechet audio distance for generative music evaluation

A Gui, H Gamper, S Braun… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

The growing popularity of generative music models underlines the need for perceptually
relevant, objective music quality metrics. The Frechet Audio Distance (FAD) is commonly …

Cited by 50 · All 3 versions

[PDF] mdpi.com

Multi-modal latent diffusion

M Bounoua, G Franzese, P Michiardi - Entropy, 2024 - mdpi.com

Multimodal datasets are ubiquitous in modern applications, and multimodal Variational
Autoencoders are a popular family of models that aim to learn a joint representation of …

Cited by 13 · All 9 versions

[PDF] arxiv.org

Configurable EBEN: Extreme bandwidth extension network to enhance body-conducted speech capture

J Hauret, T Joubaud, V Zimpfer… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org

This article presents a configurable version of Extreme Bandwidth Extension Network
(EBEN), a Generative Adversarial Network (GAN) designed to improve audio captured with …

Cited by 8 · All 7 versions

[PDF] arxiv.org

Siamese SIREN: Audio compression with implicit neural representations

LA Lanzendörfer, R Wattenhofer - arXiv preprint arXiv:2306.12957, 2023 - arxiv.org

Implicit Neural Representations (INRs) have emerged as a promising method for
representing diverse data modalities, including 3D shapes, images, and audio. While recent …

Cited by 9 · All 5 versions

[PDF] arxiv.org

PAGURI: a user experience study of creative interaction with text-to-music models

F Ronchini, L Comanducci, G Perego… - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, text-to-music models have been the biggest breakthrough in automatic
music generation. While they are unquestionably a showcase of technological progress, it is …

Cited by 2 · All 5 versions

[PDF] umons.ac.be

Latent space interpolation of synthesizer parameters using timbre-regularized auto-encoders

G Le Vaillant, T Dutoit - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org

Sound synthesizers are ubiquitous in modern music production but manipulating their
presets, i.e., the sets of synthesis parameters, demands expert skills. This study presents a …

Cited by 1 · All 3 versions

[PDF] arxiv.org

What you hear is what you see: Audio quality metrics from image quality metrics

T Namgyal, A Hepburn, R Santos-Rodriguez… - arXiv preprint arXiv …, 2023 - arxiv.org

In this study, we investigate the feasibility of utilizing state-of-the-art image perceptual
metrics for evaluating audio signals by representing them as spectrograms. The …

Cited by 3 · All 3 versions

[PDF] researchgate.net

[PDF] Conditional sound effects generation with regularized WGAN

Y Liu, C Jin - Proceedings of the Sound and Music Computing …, 2023 - researchgate.net

Over recent years generative models utilizing deep neural networks have demonstrated
outstanding capacity in synthesizing high-quality and plausible human speech and music …

Cited by 4
