- Academic Search

An overview of deep-learning-based audio-visual speech enhancement and separation

D Michelsanti, ZH Tan, SX Zhang, Y Xu… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org

Speech enhancement and speech separation are two related tasks, whose purpose is to
extract either one or more target speech signals, respectively, from a mixture of sounds …

Cited by 304

ADL-MVDR: All deep learning MVDR beamformer for target speech separation

Z Zhang, Y Xu, M Yu, SX Zhang… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Speech separation algorithms are often used to separate the target speech from other
interfering sources. However, purely neural network based speech separation systems often …

Cited by 150

Voicefixer: A unified framework for high-fidelity speech restoration

H Liu, X Liu, Q Kong, Q Tian, Y Zhao, DL Wang… - arXiv preprint …

Cited by 49

Audio-visual end-to-end multi-channel speech separation, dereverberation and recognition

G Li, J Deng, M Geng, Z Jin, T Wang… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

Accurate recognition of cocktail party speech containing overlapping speakers, noise and
reverberation remains a highly challenging task to date. Motivated by the invariance of …

Cited by 15

AdVerb: Visually guided audio dereverberation

S Chowdhury, S Ghosh, S Dasgupta… - Proceedings of the …, 2023 - openaccess.thecvf.com

We present AdVerb, a novel audio-visual dereverberation framework that uses visual cues
in addition to the reverberant sound to estimate clean audio. Although audio-only …

Cited by 9

Generalized spatio-temporal RNN beamformer for target speech separation

Y Xu, Z Zhang, M Yu, SX Zhang, D Yu - arXiv preprint arXiv:2101.01280, 2021 - arxiv.org

Although the conventional mask-based minimum variance distortionless response (MVDR)
could reduce the non-linear distortion, the residual noise level of the MVDR separated …

Cited by 49

Seeing through the conversation: Audio-visual speech separation based on diffusion model

S Lee, C Jung, Y Jang, J Kim… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org

The objective of this work is to extract the target speaker's voice from a mixture of voices
using visual cues. Existing works on audio-visual speech separation have demonstrated …

Cited by 10

Learning audio-visual dereverberation

C Chen, W Sun, D Harwath… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

Reverberation not only degrades the quality of speech for human perception, but also
severely impacts the accuracy of automatic speech recognition. Prior work attempts to …

Cited by 35

Audio-visual speech separation and dereverberation with a two-stage multimodal network

An overview of deep-learning-based audio-visual speech enhancement and separation