Generative semantic communication: Diffusion models beyond bit recovery

E Grassucci, S Barbarossa, D Comminiello - arXiv preprint arXiv …, 2023 - arxiv.org
Semantic communication is expected to be one of the cores of next-generation AI-based
communications. One of the possibilities offered by semantic communication is the capability …

AERO: Audio super resolution in the spectral domain

M Mandel, O Tal, Y Adi - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
We present AERO, an audio super-resolution model that processes speech and music
signals in the spectral domain. AERO is based on an encoder-decoder architecture with …

Deep prior-based audio inpainting using multi-resolution harmonic convolutional neural networks

F Miotello, M Pezzoli, L Comanducci… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
In this manuscript, we propose a novel method to perform audio inpainting, i.e., the
restoration of audio signals presenting multiple missing parts. Audio inpainting can be …

VRDMG: Vocal restoration via diffusion posterior sampling with multiple guidance

C Hernandez-Olivan, K Saito, N Murata… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Restoring degraded music signals is essential to enhance audio quality for downstream
music manipulation. Recent diffusion-based music restoration methods have demonstrated …

Rethinking Multi-User Semantic Communications with Deep Generative Models

E Grassucci, J Choi, J Park, RF Gramaccioni… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, novel communication strategies have emerged to face the challenges that
the increased number of connected devices and the higher quality of transmitted information …

Blind Zero-Shot Audio Restoration: A Variational Autoencoder Approach for Denoising and Inpainting

V Boukun, J Drefs, J Lücke - Proc. Interspeech 2024, 2024 - isca-archive.org
We address the task of blind 'zero-shot' audio signal denoising and inpainting. In the blind
zero-shot setting, only the corrupted audio signal is used for signal restoration (no other …

Improving Audio Recognition with Randomized Area Ratio Patch Masking: A Data Augmentation Perspective

W Wong, Y Li, S Li - IEEE Access, 2024 - ieeexplore.ieee.org
In audio recognition, improving the accuracy and generalizability of Pretrained Audio Neural
Networks (PANNs) remains challenging. This study introduces Randomized Area Ratio …

Handcrafted Feature From Classification Mood Music Indonesia With Machine Learning BERT and Transformer

N Rosmawarni, I Ahmad, SS Hilabi… - … Multimedia, Cyber and …, 2023 - ieeexplore.ieee.org
Music is a combination of the human voice and instruments that bring beauty to the listener.
Many people like music to create an atmosphere or atmosphere. In the field of computer …

Unsupervised speech enhancement with spectral kurtosis and double deep priors

H Ohnaka, R Miyazaki - Acoustical Science and Technology, 2024 - jstage.jst.go.jp
This paper proposes an unsupervised DNN-based speech enhancement approach founded
on deep priors (DPs). Here, DP signifies that DNNs are more inclined to produce clean …

Model-based and data-driven approaches meet redundancy in signal processing

T Theoharis, V Kouni, I Panagakis, I Emiris… - 2023 - repository-empedu-rd.ekt.gr
The present dissertation is divided in two parts. In the first one, we address two inverse
problems, namely compressed sensing (CS) and speech denoising, through the lens of the …