- Academic Search

Sixty years of frequency-domain monaural speech enhancement: From traditional to deep learning methods

C Zheng, H Zhang, W Liu, X Luo, A Li, X Li… - Trends in …, 2023 - journals.sagepub.com

Frequency-domain monaural speech enhancement has been extensively studied for over
60 years, and a great number of methods have been proposed and applied to many …

Cited by 43

Real time speech enhancement in the waveform domain

A Defossez, G Synnaeve, Y Adi - arXiv preprint arXiv:2006.12847, 2020 - arxiv.org

We present a causal speech enhancement model working on the raw waveform that runs in
real-time on a laptop CPU. The proposed model is based on an encoder-decoder …

Cited by 567

Music source separation with band-split RNN

Y Luo, J Yu - IEEE/ACM Transactions on Audio, Speech, and …, 2023 - ieeexplore.ieee.org

The performance of music source separation (MSS) models has been greatly improved in
recent years thanks to the development of novel neural network architectures and training …

Cited by 115

TSTNN: Two-stage transformer based neural network for speech enhancement in the time domain

K Wang, B He, WP Zhu - ICASSP 2021-2021 IEEE International …, 2021 - ieeexplore.ieee.org

In this paper, we propose a transformer-based architecture, called two-stage transformer
neural network (TSTNN) for end-to-end speech denoising in the time domain. The proposed …

Cited by 204

Separate anything you describe

X Liu, Q Kong, Y Zhao, H Liu, Y Yuan… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org

Language-queried audio source separation (LASS) is a new paradigm for computational
auditory scene analysis (CASA). LASS aims to separate a target sound from an audio …

Cited by 40

CMGAN: Conformer-based metric GAN for speech enhancement

R Cao, S Abdulatif, B Yang - arXiv preprint arXiv:2203.15149, 2022 - arxiv.org

Recently, convolution-augmented transformer (Conformer) has achieved promising
performance in automatic speech recognition (ASR) and time-domain speech enhancement …

Cited by 119

HiFi-GAN: High-fidelity denoising and dereverberation based on speech deep features in adversarial networks

J Su, Z Jin, A Finkelstein - arXiv preprint arXiv:2006.05694, 2020 - arxiv.org

Real-world audio recordings are often degraded by factors such as noise, reverberation,
and equalization distortion. This paper introduces HiFi-GAN, a deep learning method to …

Cited by 178

SE-Conformer: Time-Domain Speech Enhancement Using Conformer

E Kim, H Seo - Interspeech, 2021 - isca-archive.org

Convolution-augmented transformer (conformer) has recently shown competitive results in
speech-domain applications, such as automatic speech recognition, continuous speech …

Cited by 108

Poconet: Better speech enhancement with frequency-positional embeddings, semi-supervised conversational data, and biased loss

U Isik, R Giri, N Phansalkar, JM Valin… - arXiv preprint arXiv …, 2020 - arxiv.org

Neural network applications generally benefit from larger-sized models, but for current
speech enhancement models, larger scale networks often suffer from decreased robustness …

Cited by 115

Attention wave-u-net for speech enhancement

R Giri, U Isik, A Krishnaswamy - 2019 IEEE Workshop on …, 2019 - ieeexplore.ieee.org

We propose a novel application of an attention mechanism in neural speech enhancement,
by presenting a U-Net architecture with attention mechanism, which processes the raw …

Cited by 150

