DCCRN: Deep complex convolution recurrent network for phase-aware speech enhancement

Y Hu, Y Liu, S Lv, M Xing, S Zhang, Y Fu, J Wu… - arXiv preprint arXiv …, 2020 - arxiv.org
Speech enhancement has benefited from the success of deep learning in terms of
intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods …

Boosting self-supervised embeddings for speech enhancement

KH Hung, S Fu, HH Tseng, HT Chiang, Y Tsao… - arXiv preprint arXiv …, 2022 - arxiv.org
Self-supervised learning (SSL) representation for speech has achieved state-of-the-art
(SOTA) performance on several downstream tasks. However, there remains room for …

DRC-NET: Densely connected recurrent convolutional neural network for speech dereverberation

J Liu, X Zhang - … 2022-2022 IEEE International Conference on …, 2022 - ieeexplore.ieee.org
Under our previous work on frequency bin-wise independent processing, a dramatic
reduction of the computational complexity for recurrent neural networks (RNN) is achieved …

Content-based music-image retrieval using self- and cross-modal feature embedding memory

T Nakatsuka, M Hamasaki… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
This paper describes a method based on deep metric learning for content-based cross-modal
retrieval of a piece of music and its representative image (i.e., a music audio signal …

Denoising-and-dereverberation hierarchical neural vocoder for statistical parametric speech synthesis

Y Ai, ZH Ling, WL Wu, A Li - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
This paper presents a denoising and dereverberation hierarchical neural vocoder (DNR-
HiNet) to convert noisy and reverberant acoustic features into clean speech waveforms. The …

Stacked multiscale densely connected temporal convolutional attention network for multi-objective speech enhancement in an airborne environment

P Huang, Y Wu - Aerospace, 2024 - mdpi.com
Airborne speech enhancement is always a major challenge for the security of airborne
systems. Recently, multi-objective learning technology has become one of the mainstream …

Perceptual loss with recognition model for single-channel enhancement and robust ASR

P Plantinga, D Bagchi, E Fosler-Lussier - arXiv preprint arXiv:2112.06068, 2021 - arxiv.org
Single-channel speech enhancement approaches do not always improve automatic
recognition rates in the presence of noise, because they can introduce distortions unhelpful …

Monaural Speech Enhancement Based on Spectrogram Decomposition for Convolutional Neural Network-sensitive Feature Extraction

H Shi, L Wang, S Li, J Dang, T Kawahara - Interspeech, 2022 - sap.ist.i.kyoto-u.ac.jp
Many state-of-the-art speech enhancement (SE) systems have recently used convolutional
neural networks (CNNs) to extract multi-scale feature maps. However, CNN relies more on …

Denoising-and-dereverberation hierarchical neural vocoder for robust waveform generation

Y Ai, H Li, X Wang, J Yamagishi… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org
This paper presents a denoising and dereverberation hierarchical neural vocoder (DNR-
HiNet) to convert noisy and reverberant acoustic features into a clean speech waveform. We …

Computational intelligence for speech enhancement using deep neural network

D Hepsiba, J Justin - Computer Assisted Methods in Engineering …, 2022 - cames.ippt.gov.pl
In real time, the speech signal received contains noise produced in the background and
reverberations. These disturbances reduce the quality of speech; therefore, it is important to …