NeuroHeed: Neuro-steered speaker extraction using EEG signals

Z Pan, M Borsdorf, S Cai, T Schultz… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
Humans possess the remarkable ability to selectively attend to a single speaker amidst
competing voices and background noise, known as selective auditory attention. Recent …

PARIS: Pseudo-AutoRegressIve siamese training for online speech separation

Z Pan, G Wichern, FG Germain, K Saijo, J Le Roux - Proc. Interspeech, 2024 - merl.com
While offline speech separation models have made significant advances, the streaming
regime remains less explored and is typically limited to causal modifications of existing …

AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling

VA Kalkhorani, C Yu, A Kumar, K Tan, B Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Adding visual cues to audio-based speech separation can improve separation performance.
This paper introduces AV-CrossNet, an audiovisual (AV) system for speech enhancement …

SIR-progressive audio-visual TF-GridNet with ASR-aware selector for target speaker extraction in MISP 2023 challenge

Z Hou, T Sun, Y Hu, C Zhu, K Chen… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
TF-GridNet has demonstrated its effectiveness in speech separation and enhancement. In
this paper, we extend its capabilities for progressive audio-visual speech enhancement by …

A Target Speaker Extraction Method for the 3rd Audio-Visual Speech Enhancement Challenge

Z **, B Zeng, Z Li, X Liu, M Li - System, 2024 - isca-archive.org
This paper describes our audio-visual target speaker extraction method for the 3rd AVSE
Challenge. This method adopts early channel-wise concatenation to fuse audio-visual …