An overview of variational autoencoders for source separation, finance, and bio-signal applications

A Singh, T Ogunfunmi - Entropy, 2021 - mdpi.com
Autoencoders are a self-supervised learning system where, during training, the output is an
approximation of the input. Typically, autoencoders have three parts: Encoder (which …

Audio-visual speech enhancement using conditional variational auto-encoders

M Sadeghi, S Leglaive… - … on Audio, Speech …, 2020 - ieeexplore.ieee.org
Variational auto-encoders (VAEs) are deep generative latent variable models that can be
used for learning the distribution of complex data. VAEs have been successfully used to …

Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models

F Roche, T Hueber, S Limier, L Girin - arXiv preprint arXiv:1806.04096, 2018 - arxiv.org
This study investigates the use of non-linear unsupervised dimensionality reduction
techniques to compress a music dataset into a low-dimensional representation which can be …

A flow-based deep latent variable model for speech spectrogram modeling and enhancement

AA Nugraha, K Sekiguchi… - IEEE/ACM Transactions on …, 2020 - ieeexplore.ieee.org
This article describes a deep latent variable model of speech power spectrograms and its
application to semi-supervised speech enhancement with a deep speech prior. By …

Minimum-volume multichannel nonnegative matrix factorization for blind audio source separation

J Wang, S Guan, S Liu, XL Zhang - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org
Multichannel blind audio source separation aims to recover the latent sources from their
multichannel mixtures without supervised information. One state-of-the-art blind audio …

Notes on the use of variational autoencoders for speech and audio spectrogram modeling

L Girin, F Roche, T Hueber, S Leglaive - DAFx 2019-22nd International …, 2019 - hal.science
Variational autoencoders (VAEs) are powerful (deep) generative artificial neural networks.
They have been recently used in several papers for speech and audio processing, in …

Deep generative variational autoencoding for replay spoof detection in automatic speaker verification

B Chettri, T Kinnunen, E Benetos - Computer Speech & Language, 2020 - Elsevier
Automatic speaker verification (ASV) systems are highly vulnerable to presentation attacks,
also called spoofing attacks. Replay is among the simplest attacks to mount—yet difficult to …

Joint separation and localization of moving sound sources based on neural full-rank spatial covariance analysis

H Munakata, Y Bando, R Takeda… - IEEE Signal …, 2023 - ieeexplore.ieee.org
This paper presents an unsupervised multichannel method that can separate moving sound
sources based on an amortized variational inference (AVI) of joint separation and …

Determined BSS by combination of IVA and DNN via proximal average

K Matsumoto, K Yatabe - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
This paper proposes a novel approach for determined blind source separation (BSS)
assisted by deep neural network (DNN). Determined BSS algorithms, including independent …

A metaheuristic autoencoder deep learning model for intrusion detector system

JK Pandey, S Kumar, M Lamin, S Gupta… - Mathematical …, 2022 - Wiley Online Library
A multichannel autoencoder deep learning approach is developed to address the present
intrusion detection systems' detection accuracy and false alarm rate. First, two separate …