Google Scholar

Restoring speaking lips from occlusion for audio-visual speech recognition

J Wang, Z Pan, M Zhang, RT Tan, H Li - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Prior studies on audio-visual speech recognition typically assume the visibility of speaking
lips, ignoring the fact that visual occlusion occurs in real-world videos, thus adversely …

Cited by 9

NeuroHeed: Neuro-steered speaker extraction using EEG signals

Z Pan, M Borsdorf, S Cai, T Schultz… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org

Humans possess the remarkable ability to selectively attend to a single speaker amidst
competing voices and background noise, known as selective auditory attention. Recent …

Cited by 16

AV-SepFormer: Cross-attention SepFormer for audio-visual target speaker extraction

J Lin, X Cai, H Dinkel, J Chen, Z Yan… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Visual information can serve as an effective cue for target speaker extraction (TSE) and is
vital to improving extraction performance. In this paper, we propose AV-SepFormer, a …

Cited by 21

MSFNet: Multi-scale fusion network for brain-controlled speaker extraction

C Fan, J Zhang, H Zhang, W Xiang, J Tao, X Li… - Proceedings of the …, 2024 - dl.acm.org

Speaker extraction aims to selectively extract the target speaker from the multi-talker
environment under the guidance of auxiliary reference. Recent studies have shown that the …

Cited by 5

Speaker extraction with co-speech gestures cue

Z Pan, X Qian, H Li - IEEE Signal Processing Letters, 2022 - ieeexplore.ieee.org

Speaker extraction seeks to extract the clean speech of a target speaker from a multi-talker
mixture speech. There have been studies to use a pre-recorded speech sample or face …

Cited by 28

Time-domain speech separation networks with graph encoding auxiliary

T Wang, Z Pan, M Ge, Z Yang… - IEEE Signal Processing …, 2023 - ieeexplore.ieee.org

End-to-end time-domain speech separation with masking strategy has shown its
performance advantage, where a 1-D convolutional layer is used as the speech encoder to …

Cited by 15

NeuroHeed+: Improving neuro-steered speaker extraction with joint auditory attention detection

Z Pan, G Wichern, FG Germain… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Neuro-steered speaker extraction aims to extract the listener's brain-attended speech signal
from a multi-talker speech signal, in which the attention is derived from the cortical activity …

Cited by 8

Rethinking the visual cues in audio-visual speaker extraction

J Li, M Ge, R Cao, L Wang, J Dang, S Zhang - arXiv preprint arXiv …, 2023 - arxiv.org

The Audio-Visual Speaker Extraction (AVSE) algorithm employs parallel video recording to
leverage two visual cues, namely speaker identity and synchronization, to enhance …

Cited by 12

USED: Universal speaker extraction and diarization

J Ao, MS Yıldırım, R Tao, M Ge, S Wang… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org

Speaker extraction and diarization are two enabling techniques for real-world speech
applications. Speaker extraction aims to extract a target speaker's voice from a speech …

Cited by 5

New insights on target speaker extraction

M Elminshawi, W Mack, SR Chetupalli… - arXiv preprint arXiv …, 2022 - arxiv.org

Speaker extraction (SE) aims to segregate the speech of a target speaker from a mixture of
interfering speakers with the help of auxiliary information. Several forms of auxiliary …

Cited by 19

USEV: Universal speaker extraction with visual cue

Restoring speaking lips from occlusion for audio-visual speech recognition