- Academic Search

Z Pan, M Borsdorf, S Cai, T Schultz… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org

Humans possess the remarkable ability to selectively attend to a single speaker amidst
competing voices and background noise, known as selective auditory attention. Recent …

Cited by 16 - All 5 versions

[PDF] PARIS: Pseudo-AutoRegressIve siamese training for online speech separation

Z Pan, G Wichern, FG Germain, K Saijo, J Le Roux - Proc. Interspeech, 2024 - merl.com

While offline speech separation models have made significant advances, the streaming
regime remains less explored and is typically limited to causal modifications of existing …

Cited by 3 - All 5 versions

AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling

VA Kalkhorani, C Yu, A Kumar, K Tan, B Xu… - arXiv preprint arXiv …, 2024 - arxiv.org

Adding visual cues to audio-based speech separation can improve separation performance.
This paper introduces AV-CrossNet, an audiovisual (AV) system for speech enhancement …

All 2 versions

SIR-progressive audio-visual TF-GridNet with ASR-aware selector for target speaker extraction in MISP 2023 challenge

Z Hou, T Sun, Y Hu, C Zhu, K Chen… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

TF-GridNet has demonstrated its effectiveness in speech separation and enhancement. In
this paper, we extend its capabilities for progressive audio-visual speech enhancement by …

Cited by 1

[PDF] A Target Speaker Extraction Method for the 3rd Audio-Visual Speech Enhancement Challenge

Z **, B Zeng, Z Li, X Liu, M Li - System, 2024 - isca-archive.org

This paper describes our audio-visual target speaker extraction method for the 3rd AVSE
Challenge. This method adopts early channel-wise concatenation to fuse audio-visual …

All 2 versions

Scenario-aware audio-visual TF-GridNet for target speech extraction

NeuroHeed: Neuro-steered speaker extraction using EEG signals
