Sparks of large audio models: A survey and outlook

S Latif, M Shoukat, F Shamshad, M Usama… - arXiv preprint arXiv …, 2023 - arxiv.org
This survey paper provides a comprehensive overview of the recent advancements and
challenges in applying large language models to the field of audio signal processing. Audio …

High fidelity neural audio compression

A Défossez, J Copet, G Synnaeve, Y Adi - arXiv preprint arXiv:2210.13438, 2022 - arxiv.org
We introduce a state-of-the-art real-time, high-fidelity, audio codec leveraging neural
networks. It consists in a streaming encoder-decoder architecture with quantized latent …

Soundstream: An end-to-end neural audio codec

N Zeghidour, A Luebs, A Omran… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
We present SoundStream, a novel neural audio codec that can efficiently compress speech,
music and general audio at bitrates normally targeted by speech-tailored codecs …

Universal speech enhancement with score-based diffusion

J Serrà, S Pascual, J Pons, RO Araz… - arXiv preprint arXiv …, 2022 - arxiv.org
Removing background noise from speech audio has been the subject of considerable effort,
especially in recent years due to the rise of virtual communication and amateur recordings …

Funcodec: A fundamental, reproducible and integrable open-source toolkit for neural speech codec

Z Du, S Zhang, K Hu, S Zheng - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
This paper presents FunCodec, a fundamental neural speech codec toolkit, which is an
extension of the open-source speech processing toolkit FunASR. FunCodec provides …

Tramba: A hybrid transformer and mamba architecture for practical audio and bone conduction speech super resolution and enhancement on mobile and wearable …

Y Sui, M Zhao, J Xia, X Jiang, S Xia - … of the ACM on Interactive, Mobile …, 2024 - dl.acm.org
We propose TRAMBA, a hybrid transformer and Mamba architecture for acoustic and bone
conduction speech enhancement, suitable for mobile and wearable platforms. Bone …

HILCodec: High-Fidelity and Lightweight Neural Audio Codec

S Ahn, BJ Woo, MH Han, C Moon… - IEEE Journal of Selected …, 2024 - ieeexplore.ieee.org
The recent advancement of end-to-end neural audio codecs enables compressing audio at
very low bitrates while reconstructing the output audio with high fidelity. Nonetheless, such …

Aero: Audio super resolution in the spectral domain

M Mandel, O Tal, Y Adi - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
We present AERO, an audio super-resolution model that processes speech and music
signals in the spectral domain. AERO is based on an encoder-decoder architecture with …

Audio super-resolution with robust speech representation learning of masked autoencoder

SB Kim, SH Lee, HY Choi… - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
This paper proposes Fre-Painter, a high-fidelity audio super-resolution system that utilizes
robust speech representation learning with various masking strategies. Recently, masked …

Hifi++: a unified framework for bandwidth extension and speech enhancement

P Andreev, A Alanov, O Ivanov… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Generative adversarial networks have recently demonstrated outstanding performance in
neural vocoding, outperforming the best autoregressive and flow-based models. In this paper …