A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Universal speech enhancement with score-based diffusion

J Serrà, S Pascual, J Pons, RO Araz… - ar**
Y Koizumi, H Zen, K Yatabe, N Chen… - arxiv preprint arxiv …, 2022 - arxiv.org
Neural vocoder using denoising diffusion probabilistic model (DDPM) has been improved by
adaptation of the diffusion noise distribution to given acoustic features. In this study, we …

Self-supervised speech quality estimation and enhancement using only clean speech

SW Fu, KH Hung, Y Tsao, YCF Wang - arxiv preprint arxiv:2402.16321, 2024 - arxiv.org
Speech quality estimation has recently undergone a paradigm shift from human-hearing
expert designs to machine-learning models. However, current models rely mainly on …

Universal melgan: A robust neural vocoder for high-fidelity waveform generation in multiple domains

W Jang, D Lim, J Yoon - arxiv preprint arxiv:2011.09631, 2020 - arxiv.org
We propose Universal MelGAN, a vocoder that synthesizes high-fidelity speech in multiple
domains. To preserve sound quality when the MelGAN-based structure is trained with a …

Speech enhancement with integration of neural homomorphic synthesis and spectral masking

W Jiang, K Yu - IEEE/ACM Transactions on Audio, Speech, and …, 2023 - ieeexplore.ieee.org
Speech enhancement refers to suppressing the background noise to improve the perceptual
quality and intelligibility of the observed noisy speech. Recently, speech enhancement …