Generative semantic communication: Diffusion models beyond bit recovery
Semantic communication is expected to be one of the cores of next-generation AI-based
communications. One of the possibilities offered by semantic communication is the capability …
AERO: Audio super resolution in the spectral domain
We present AERO, an audio super-resolution model that processes speech and music
signals in the spectral domain. AERO is based on an encoder-decoder architecture with …
Deep prior-based audio inpainting using multi-resolution harmonic convolutional neural networks
In this manuscript, we propose a novel method to perform audio inpainting, i.e., the
restoration of audio signals presenting multiple missing parts. Audio inpainting can be …
VRDMG: Vocal restoration via diffusion posterior sampling with multiple guidance
Restoring degraded music signals is essential to enhance audio quality for downstream
music manipulation. Recent diffusion-based music restoration methods have demonstrated …
Rethinking Multi-User Semantic Communications with Deep Generative Models
In recent years, novel communication strategies have emerged to face the challenges that
the increased number of connected devices and the higher quality of transmitted information …
Blind Zero-Shot Audio Restoration: A Variational Autoencoder Approach for Denoising and Inpainting
V Boukun, J Drefs, J Lücke - Proc. Interspeech 2024, 2024 - isca-archive.org
We address the task of blind 'zero-shot' audio signal denoising and inpainting. In the blind
zero-shot setting, only the corrupted audio signal is used for signal restoration (no other …
Improving Audio Recognition with Randomized Area Ratio Patch Masking: A Data Augmentation Perspective
W Wong, Y Li, S Li - IEEE Access, 2024 - ieeexplore.ieee.org
In audio recognition, improving the accuracy and generalizability of Pretrained Audio Neural
Networks (PANNs) remains challenging. This study introduces Randomized Area Ratio …
Handcrafted Feature From Classification Mood Music Indonesia With Machine Learning BERT and Transformer
Music is a combination of the human voice and instruments that bring beauty to the listener.
Many people like music to create an atmosphere or atmosphere. In the field of computer …
Unsupervised speech enhancement with spectral kurtosis and double deep priors
This paper proposes an unsupervised DNN-based speech enhancement approach founded
on deep priors (DPs). Here, DP signifies that DNNs are more inclined to produce clean …
Model-based and data-driven approaches meet redundancy in signal processing
T Theoharis, V Kouni, I Panagakis, I Emiris… - 2023 - repository-empedu-rd.ekt.gr
The present dissertation is divided in two parts. In the first one, we address two inverse
problems, namely compressed sensing (CS) and speech denoising, through the lens of the …