Google Scholar

NeuroHeed: Neuro-steered speaker extraction using EEG signals

Z Pan, M Borsdorf, S Cai, T Schultz… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org

Humans possess the remarkable ability to selectively attend to a single speaker amidst
competing voices and background noise, known as selective auditory attention. Recent …

Cited by 18 · All 5 versions

[PDF] arxiv.org

Prompt-driven target speech diarization

Y Jiang, Z Chen, R Tao, L Deng… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

We introduce a novel task named 'target speech diarization', which seeks to determine
'when target event occurred' within an audio signal. We devise a neural architecture called …

Cited by 11 · All 5 versions

[PDF] arxiv.org

Multi-Level Speaker Representation for Target Speaker Extraction

K Zhang, J Li, S Wang, Y Wei, Y Wang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Target speaker extraction (TSE) relies on a reference cue of the target to extract the target
speech from a speech mixture. While a speaker embedding is commonly used as the …

Cited by 2 · All 2 versions

[PDF] arxiv.org

Audio-visual target speaker extraction with reverse selective auditory attention

R Tao, X Qian, Y Jiang, J Li, J Wang, H Li - arXiv preprint arXiv …, 2024 - arxiv.org

Audio-visual target speaker extraction (AV-TSE) aims to extract the specific person's speech
from the audio mixture given auxiliary visual cues. Previous methods usually search for the …

Cited by 2 · All 2 versions

[PDF] arxiv.org

Audio-visual active speaker extraction for sparsely overlapped multi-talker speech

J Li, R Tao, Z Pan, M Ge, S Wang… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org

Target speaker extraction aims to extract the speech of a specific speaker from a multi-talker
mixture as specified by an auxiliary reference. Most studies focus on the scenario where the …

Cited by 5 · All 3 versions

[PDF] arxiv.org

Scenario-aware audio-visual TF-Gridnet for target speech extraction

Z Pan, G Wichern, Y Masuyama… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

Target speech extraction aims to extract, based on a given conditioning cue, a target speech
signal that is corrupted by interfering sources, such as noise or competing speakers …

Cited by 5 · All 9 versions

[PDF] arxiv.org

On the effectiveness of enrollment speech augmentation for Target Speaker Extraction

J Li, K Zhang, S Wang, H Li, MW Mak… - 2024 IEEE Spoken …, 2024 - ieeexplore.ieee.org

Deep learning technologies have significantly advanced the performance of target speaker
extraction (TSE) tasks. To enhance the generalization and robustness of these algorithms …

All 3 versions

Enhancing Speaker Extraction Through Rectifying Target Confusion

J Wang, S Wang, J Li, K Zhang… - 2024 IEEE Spoken …, 2024 - ieeexplore.ieee.org

Target Speaker Extraction (TSE) aims to extract target speech from mixed audio using clues
that identify the target speaker. However, TSE often faces the Target Confusion (TC) …


[PDF] arxiv.org

Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy

W Wu, X Chen, X Wu, H Li… - 2024 International Joint …, 2024 - ieeexplore.ieee.org

Audio-visual target speech extraction (AV-TSE) is one of the enabling technologies in
robotics and many audiovisual applications. One of the challenges of AV-TSE is how to …

All 3 versions

Audio-Visual Target Speaker Extraction with Selective Auditory Attention

R Tao, X Qian, Y Jiang, J Li, J Wang… - IEEE Transactions on …, 2025 - ieeexplore.ieee.org

Audio-visual target speaker extraction (AV-TSE) aims to extract the specific person's speech
from the audio mixture given auxiliary visual cues. Previous methods usually search for the …


Rethinking the visual cues in audio-visual speaker extraction

NeuroHeed: Neuro-steered speaker extraction using EEG signals
