Unsupervised speech enhancement using dynamical variational autoencoders

X Bie, S Leglaive, X Alameda-Pineda… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Dynamical variational autoencoders (DVAEs) are a class of deep generative models with
latent variables, dedicated to model time series of high-dimensional data. DVAEs can be …

A recurrent variational autoencoder for speech enhancement

S Leglaive, X Alameda-Pineda, L Girin… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
This paper presents a generative approach to speech enhancement based on a recurrent
variational autoencoder (RVAE). The deep generative speech model is trained using clean …

Supervised determined source separation with multichannel variational autoencoder

H Kameoka, L Li, S Inoue, S Makino - Neural computation, 2019 - direct.mit.edu
This letter proposes a multichannel source separation technique, the multichannel
variational autoencoder (MVAE) method, which uses a conditional VAE (CVAE) to model …

Fast multichannel nonnegative matrix factorization with directivity-aware jointly-diagonalizable spatial covariance matrices for blind source separation

K Sekiguchi, Y Bando, AA Nugraha… - … on Audio, Speech …, 2020 - ieeexplore.ieee.org
This article describes a computationally-efficient blind source separation (BSS) method
based on the independence, low-rankness, and directivity of the sources. A typical approach …

Audio-visual speech enhancement using conditional variational auto-encoders

M Sadeghi, S Leglaive… - … on Audio, Speech …, 2020 - ieeexplore.ieee.org
Variational auto-encoders (VAEs) are deep generative latent variable models that can be
used for learning the distribution of complex data. VAEs have been successfully used to …

Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization

S Leglaive, L Girin, R Horaud - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
In this paper we address speaker-independent multichannel speech enhancement in
unknown noisy environments. Our work is based on a well-established multichannel local …

Unsupervised speech enhancement based on multichannel NMF-informed beamforming for noise-robust automatic speech recognition

K Shimada, Y Bando, M Mimura… - … on Audio, Speech …, 2019 - ieeexplore.ieee.org
This paper describes multichannel speech enhancement for improving automatic speech
recognition (ASR) in noisy environments. Recently, the minimum variance distortionless …

Fast multichannel source separation based on jointly diagonalizable spatial covariance matrices

K Sekiguchi, AA Nugraha, Y Bando… - 2019 27th European …, 2019 - ieeexplore.ieee.org
This paper describes a versatile method that accelerates multichannel source separation
methods based on full-rank spatial modeling. A popular approach to multichannel source …

Semi-supervised multichannel speech enhancement with a deep speech prior

K Sekiguchi, Y Bando, AA Nugraha… - … on Audio, Speech …, 2019 - ieeexplore.ieee.org
This paper describes a semi-supervised multichannel speech enhancement method that
uses clean speech data for prior training. Although multichannel nonnegative matrix …

A flow-based deep latent variable model for speech spectrogram modeling and enhancement

AA Nugraha, K Sekiguchi… - IEEE/ACM Transactions on …, 2020 - ieeexplore.ieee.org
This article describes a deep latent variable model of speech power spectrograms and its
application to semi-supervised speech enhancement with a deep speech prior. By …