TSTNN: Two-stage transformer based neural network for speech enhancement in the time domain

K Wang, B He, WP Zhu - ICASSP 2021-2021 IEEE international …, 2021 - ieeexplore.ieee.org
In this paper, we propose a transformer-based architecture, called two-stage transformer
neural network (TSTNN) for end-to-end speech denoising in the time domain. The proposed …

CMGAN: Conformer-based metric GAN for speech enhancement

R Cao, S Abdulatif, B Yang - arXiv preprint arXiv:2203.15149, 2022 - arxiv.org
Recently, convolution-augmented transformer (Conformer) has achieved promising
performance in automatic speech recognition (ASR) and time-domain speech enhancement …

MP-SENet: A speech enhancement model with parallel denoising of magnitude and phase spectra

YX Lu, Y Ai, ZH Ling - arXiv preprint arXiv:2305.13686, 2023 - arxiv.org
This paper proposes MP-SENet, a novel Speech Enhancement Network which directly
denoises Magnitude and Phase spectra in parallel. The proposed MP-SENet adopts a …

CMGAN: Conformer-based metric-GAN for monaural speech enhancement

S Abdulatif, R Cao, B Yang - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org
In this work, we further develop the conformer-based metric generative adversarial network
(CMGAN) model for speech enhancement (SE) in the time-frequency (TF) domain. This …

Dense CNN with self-attention for time-domain speech enhancement

A Pandey, DL Wang - IEEE/ACM transactions on audio, speech …, 2021 - ieeexplore.ieee.org
Speech enhancement in the time domain is becoming increasingly popular in recent years,
due to its capability to jointly enhance both the magnitude and the phase of speech. In this …

A nested U-Net with self-attention and dense connectivity for monaural speech enhancement

X Xiang, X Zhang, H Chen - IEEE Signal Processing Letters, 2021 - ieeexplore.ieee.org
With the development of deep neural networks, speech enhancement technology has been
vastly improved. However, commonly used speech enhancement approaches cannot fully …

MANNER: Multi-view attention network for noise erasure

HJ Park, BH Kang, W Shin, JS Kim… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In the field of speech enhancement, time domain methods have difficulties in achieving both
high performance and efficiency. Recently, dual-path models have been adopted to …

S-DCCRN: Super wide band DCCRN with learnable complex feature for speech enhancement

S Lv, Y Fu, M Xing, J Sun, L Xie… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In speech enhancement, complex neural network has shown promising performance due to
their effectiveness in processing complex-valued spectrum. Most of the recent speech …

Upsampling artifacts in neural audio synthesis

J Pons, S Pascual, G Cengarle… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
A number of recent advances in neural audio synthesis rely on up-sampling layers, which
can introduce undesired artifacts. In computer vision, upsampling artifacts have been …

An overview of speech enhancement based on deep learning techniques

C Jannu, SD Vanambathina - International Journal of Image and …, 2025 - World Scientific
Recent years have seen a significant amount of studies in the area of speech enhancement.
This review looks at several speech improvement methods as well as Deep Neural Network …