- Academic Search

Selective listening by synchronizing speech with lips

Z Pan, R Tao, C Xu, H Li - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org

A speaker extraction algorithm seeks to extract the speech of a target speaker from a
multi-talker speech mixture when given a cue that represents the target speaker, such as a pre …

Cited by 48 · All 4 versions

[PDF] arxiv.org

LC-TTFS: Towards lossless network conversion for spiking neural networks with TTFS coding

Q Yang, M Zhang, J Wu, KC Tan… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

The biological neurons use precise spike times, in addition to the spike firing rate, to
communicate with each other. The time-to-first-spike (TTFS) coding is inspired by such …

Cited by 8 · All 3 versions

[PDF] ssrn.com

MSRANet: Learning discriminative embeddings for speaker verification via channel and spatial attention mechanism in alterable scenarios

Q Zheng, Z Chen, H Liu, Y Lu, J Li, T Liu - Expert Systems with Applications, 2023 - Elsevier

Speaker embeddings have become the most popular feature representation in speaker
verification. Improving the robustness of speaker embedding extraction systems is a crucial …

Cited by 16 · All 3 versions

[PDF] arxiv.org

L-SpEx: Localized target speaker extraction

M Ge, C Xu, L Wang, ES Chng… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Speaker extraction aims to extract the target speaker's voice from a multi-talker speech
mixture given an auxiliary reference utterance. Recent studies show that speaker extraction …

Cited by 28 · All 3 versions

[PDF] arxiv.org

Speech separation with pretrained frontend to minimize domain mismatch

W Wang, Z Pan, X Li, S Wang… - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org

Speech separation seeks to separate individual speech signals from a speech mixture.
Typically, most separation models are trained on synthetic data due to the unavailability of …

Cited by 3 · All 5 versions

[PDF] arxiv.org

USED: Universal speaker extraction and diarization

J Ao, MS Yıldırım, R Tao, M Ge, S Wang… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org

Speaker extraction and diarization are two enabling techniques for real-world speech
applications. Speaker extraction aims to extract a target speaker's voice from a speech …

Cited by 5 · All 2 versions

[PDF] arxiv.org

Speaker verification using attentive multi-scale convolutional recurrent network

Y Li, Z Jiang, W Cao, Q Huang - Applied Soft Computing, 2022 - Elsevier

In this paper, we propose a speaker verification method by an Attentive Multi-scale
Convolutional Recurrent Network (AMCRN). The proposed AMCRN can acquire both local …

Cited by 10 · All 4 versions

[PDF] arxiv.org

Few-shot speaker identification using lightweight prototypical network with feature grouping and interaction

Y Li, H Chen, W Cao, Q Huang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Existing methods for few-shot speaker identification (FSSI) obtain high accuracy, but their
computational complexities and model sizes need to be reduced for lightweight applications …

Cited by 10 · All 4 versions

[PDF] arxiv.org

ACA-Net: Towards lightweight speaker verification using asymmetric cross attention

JQ Yip, T Truong, D Ng, C Zhang, Y Ma… - arXiv preprint arXiv …, 2023 - arxiv.org

In this paper, we propose ACA-Net, a lightweight, global context-aware speaker embedding
extractor for Speaker Verification (SV) that improves upon existing work by using Asymmetric …

Cited by 5 · All 4 versions

[PDF] arxiv.org

Improving curriculum learning for target speaker extraction with synthetic speakers

Y Liu, X Liu, J Yamagishi - 2024 IEEE Spoken Language …, 2024 - ieeexplore.ieee.org

Target speaker extraction (TSE) aims to isolate individual speaker voices from complex
speech environments. The effectiveness of TSE systems is often compromised when the …

Cited by 1 · All 4 versions

Target speaker verification with selective auditory attention for single and multi-talker speech

Selective listening by synchronizing speech with lips
