NeuroHeed: Neuro-steered speaker extraction using EEG signals
Humans possess the remarkable ability to selectively attend to a single speaker amidst
competing voices and background noise, known as selective auditory attention. Recent …
Prompt-driven target speech diarization
We introduce a novel task named 'target speech diarization', which seeks to determine
'when target event occurred' within an audio signal. We devise a neural architecture called …
Multi-Level Speaker Representation for Target Speaker Extraction
Target speaker extraction (TSE) relies on a reference cue of the target to extract the target
speech from a speech mixture. While a speaker embedding is commonly used as the …
Audio-visual target speaker extraction with reverse selective auditory attention
Audio-visual target speaker extraction (AV-TSE) aims to extract the specific person's speech
from the audio mixture given auxiliary visual cues. Previous methods usually search for the …
Audio-visual active speaker extraction for sparsely overlapped multi-talker speech
Target speaker extraction aims to extract the speech of a specific speaker from a multi-talker
mixture as specified by an auxiliary reference. Most studies focus on the scenario where the …
Scenario-aware audio-visual TF-Gridnet for target speech extraction
Target speech extraction aims to extract, based on a given conditioning cue, a target speech
signal that is corrupted by interfering sources, such as noise or competing speakers …
On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
Deep learning technologies have significantly advanced the performance of target speaker
extraction (TSE) tasks. To enhance the generalization and robustness of these algorithms …
Enhancing Speaker Extraction Through Rectifying Target Confusion
Target Speaker Extraction (TSE) aims to extract target speech from mixed audio using clues
that identify the target speaker. However, TSE often faces the Target Confusion (TC) …
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy
Audio-visual target speech extraction (AV-TSE) is one of the enabling technologies in
robotics and many audiovisual applications. One of the challenges of AV-TSE is how to …
Audio-Visual Target Speaker Extraction with Selective Auditory Attention
Audio-visual target speaker extraction (AV-TSE) aims to extract the specific person's speech
from the audio mixture given auxiliary visual cues. Previous methods usually search for the …