Conv-TasNet: Surpassing ideal time–frequency magnitude masking for speech separation

Y Luo, N Mesgarani - IEEE/ACM transactions on audio, speech …, 2019 - ieeexplore.ieee.org
Single-channel, speaker-independent speech separation methods have recently seen great
progress. However, the accuracy, latency, and computational cost of such methods remain …

A survey on audio diffusion models: Text to speech synthesis and enhancement in generative AI

C Zhang, C Zhang, S Zheng, M Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative AI has demonstrated impressive performance in various fields, among which
speech synthesis is an interesting direction. With the diffusion model as the most popular …

Vocal activity informed singing voice separation with the iKala dataset

TS Chan, TC Yeh, ZC Fan, HW Chen… - … , Speech and Signal …, 2015 - ieeexplore.ieee.org
A new algorithm is proposed for robust principal component analysis with predefined
sparsity patterns. The algorithm is then applied to separate the singing voice from the …

Non-negative matrix factorization: a survey

J Gan, T Liu, L Li, J Zhang - The Computer Journal, 2021 - academic.oup.com
Non-negative matrix factorization (NMF) is a powerful tool for data science researchers, and
it has been successfully applied to data mining and machine learning community, due to its …

Minimum-volume rank-deficient nonnegative matrix factorizations

V Leplat, AMS Ang, N Gillis - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
In recent years, nonnegative matrix factorization (NMF) with volume regularization has been
shown to be a powerful identifiable model; for example for hyperspectral unmixing …

Student's t nonnegative matrix factorization and positive semidefinite tensor factorization for single-channel audio source separation

K Yoshii, K Itoyama, M Goto - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
This paper presents a robust variant of nonnegative matrix factorization (NMF) based on
complex Student's t distributions (t-NMF) for source separation of single-channel audio …

Single channel blind source separation of complex signals based on spatial‐temporal fusion deep learning

W Luo, R Yang, H Jin, X Li, H Li… - IET Radar, Sonar & …, 2023 - Wiley Online Library
Blind Source Separation (BSS) of complex signals composed of radar,
communication and jamming signals is the first step in an integrated electronic system …

Efficient personalized speech enhancement through self-supervised learning

A Sivaraman, M Kim - IEEE Journal of Selected Topics in Signal …, 2022 - ieeexplore.ieee.org
This work presents self-supervised learning methods for monaural speaker-specific (i.e.,
personalized) speech enhancement models. While general-purpose models must broadly …

Independent low-rank tensor analysis for audio source separation

K Yoshii, K Kitamura, Y Bando… - 2018 26th European …, 2018 - ieeexplore.ieee.org
This paper describes a versatile tensor factorization technique called independent low-rank
tensor analysis (ILRTA) and its application to single-channel audio source separation. In …

Correlated tensor factorization for audio source separation

K Yoshii - 2018 IEEE International Conference on Acoustics …, 2018 - ieeexplore.ieee.org
This paper presents an ultimate extension of nonnegative matrix factorization (NMF) for
audio source separation based on full covariance modeling over all the time-frequency (TF) …