High fidelity neural audio compression

A Défossez, J Copet, G Synnaeve, Y Adi - arXiv preprint arXiv:2210.13438, 2022 - arxiv.org
We introduce a state-of-the-art real-time, high-fidelity, audio codec leveraging neural
networks. It consists in a streaming encoder-decoder architecture with quantized latent …

SoundStream: An end-to-end neural audio codec

N Zeghidour, A Luebs, A Omran… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
We present SoundStream, a novel neural audio codec that can efficiently compress speech,
music and general audio at bitrates normally targeted by speech-tailored codecs …

Quality of experience in telemeetings and videoconferencing: a comprehensive survey

J Skowronek, A Raake, GH Berndtsson… - IEEE …, 2022 - ieeexplore.ieee.org
Telemeetings such as audiovisual conferences or virtual meetings play an increasingly
important role in our professional and private lives. For that reason, system developers and …

Towards audio language modeling - an overview

H Wu, X Chen, YC Lin, K Chang, HL Chung… - arXiv preprint arXiv …, 2024 - arxiv.org
Neural audio codecs are initially introduced to compress audio data into compact codes to
reduce transmission latency. Researchers recently discovered the potential of codecs as …

AudioDec: An open-source streaming high-fidelity neural audio codec

YC Wu, ID Gebru, D Marković… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
A good audio codec for live applications such as telecommunication is characterized by
three key properties: (1) compression, i.e., the bitrate that is required to transmit the signal …

Language-Codec: Reducing the gaps between discrete codec representation and speech language models

S Ji, M Fang, Z Jiang, S Zheng, Q Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, large language models have achieved significant success in generative
tasks (e.g., speech cloning and audio generation) related to speech, audio, music, and other …

Generative speech coding with predictive variance regularization

WB Kleijn, A Storus, M Chinen, T Denton… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
The recent emergence of machine-learning based generative models for speech suggests a
significant reduction in bit rate for speech codecs is possible. However, the performance of …

FunCodec: A fundamental, reproducible and integrable open-source toolkit for neural speech codec

Z Du, S Zhang, K Hu, S Zheng - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
This paper presents FunCodec, a fundamental neural speech codec toolkit, which is an
extension of the open-source speech processing toolkit FunASR. FunCodec provides …

BigCodec: Pushing the limits of low-bitrate neural speech codec

D Xin, X Tan, S Takamichi, H Saruwatari - arXiv preprint arXiv:2409.05377, 2024 - arxiv.org
We present BigCodec, a low-bitrate neural speech codec. While recent neural speech
codecs have shown impressive progress, their performance significantly deteriorates at low …

LMCodec: A low bitrate speech codec with causal transformer models

T Jenrungrot, M Chinen, WB Kleijn… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
We introduce LMCodec, a causal neural speech codec that provides high quality audio at
very low bitrates. The backbone of the system is a causal convolutional codec that encodes …