Speech enhancement and dereverberation with diffusion-based generative models

J Richter, S Welker, JM Lemercier… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
In this work, we build upon our previous publication and use diffusion-based generative
models for speech enhancement. We present a detailed overview of the diffusion process …

Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural network (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

Deep learning-based non-intrusive multi-objective speech assessment model with cross-domain features

RE Zezario, SW Fu, F Chen, CS Fuh… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
This study proposes a cross-domain multi-objective speech assessment model, called
MOSA-Net, which can simultaneously estimate the speech quality, intelligibility, and …

Unsupervised speech enhancement using dynamical variational autoencoders

X Bie, S Leglaive, X Alameda-Pineda… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Dynamical variational autoencoders (DVAEs) are a class of deep generative models with
latent variables, dedicated to modeling time series of high-dimensional data. DVAEs can be …

Self-supervised visual acoustic matching

A Somayazulu, C Chen… - Advances in Neural …, 2023 - proceedings.neurips.cc
Acoustic matching aims to re-synthesize an audio clip to sound as if it were recorded in a
target acoustic environment. Existing methods assume access to paired training data, where …

Self-supervised speech quality estimation and enhancement using only clean speech

SW Fu, KH Hung, Y Tsao, YCF Wang - arXiv preprint arXiv:2402.16321, 2024 - arxiv.org
Speech quality estimation has recently undergone a paradigm shift from human-hearing
expert designs to machine-learning models. However, current models rely mainly on …

USDnet: Unsupervised Speech Dereverberation via Neural Forward Filtering

ZQ Wang - IEEE/ACM Transactions on Audio, Speech, and …, 2024 - ieeexplore.ieee.org
In reverberant conditions with a single speaker, each far-field microphone records a
reverberant version of the same speaker signal at a different location. In over-determined …

HD-DEMUCS: General speech restoration with heterogeneous decoders

D Kim, SW Chung, H Han, Y Ji, HG Kang - arXiv preprint arXiv:2306.01411, 2023 - arxiv.org
This paper introduces an end-to-end neural speech restoration model, HD-DEMUCS,
demonstrating efficacy across multiple distortion environments. Unlike conventional …

TEA-PSE 3.0: Tencent-Ethereal-Audio-Lab personalized speech enhancement system for ICASSP 2023 DNS-Challenge

Y Ju, J Chen, S Zhang, S He, W Rao… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
This paper introduces the Unbeatable Team's submission to the ICASSP 2023 Deep Noise
Suppression (DNS) Challenge. We expand our previous work, TEA-PSE, to its upgraded …

SKILL: Similarity-aware knowledge distillation for speech self-supervised learning

L Zampierin, GB Hacene, B Nguyen… - … on Acoustics, Speech …, 2024 - ieeexplore.ieee.org
Self-supervised learning (SSL) has achieved remarkable success across various speech-
processing tasks. To enhance its efficiency, previous works often leverage the use of …