A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier

Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

Cited by 420 · All 7 versions

[PDF] arxiv.org

Neural target speech extraction: An overview

K Zmolikova, M Delcroix, T Ochiai… - IEEE Signal …, 2023 - ieeexplore.ieee.org

Humans can listen to a target speaker even in challenging acoustic conditions that have
noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail …

Cited by 87 · All 5 versions

[PDF] arxiv.org

Voicefilter: Targeted voice separation by speaker-conditioned spectrogram masking

Q Wang, H Muckenhirn, K Wilson, P Sridhar… - ar…

… speakers is one of the
challenging problems with regards to today's automatic speech recognition systems …

Cited by 247 · All 5 versions

[PDF] ieee.org

Spex: Multi-scale time domain speaker extraction network

C Xu, W Rao, ES Chng, H Li - IEEE/ACM transactions on audio …, 2020 - ieeexplore.ieee.org

Speaker extraction aims to mimic humans' selective auditory attention by extracting a target
speaker's voice from a multi-talker environment. It is common to perform the extraction in …

Cited by 197 · All 6 versions

[PDF] arxiv.org

Speech enhancement using self-adaptation and multi-head self-attention

Y Koizumi, K Yatabe, M Delcroix… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

This paper investigates a self-adaptation method for speech enhancement using auxiliary
speaker-aware features; we extract a speaker representation used for adaptation directly …

Cited by 157 · All 7 versions

[PDF] arxiv.org

Spex+: A complete time domain speaker extraction network

M Ge, C Xu, L Wang, ES Chng, J Dang, H Li - arXiv preprint arXiv …, 2020 - arxiv.org

Speaker extraction aims to extract the target speech signal from a multi-talker environment
given a target speaker's reference speech. We recently proposed a time-domain solution …

Cited by 170 · All 9 versions

Single channel target speaker extraction and recognition with speaker beam

A review of speaker diarization: Recent advances with deep learning
