Google Scholar

Overview of speaker modeling and its applications: From the lens of deep speaker representation learning

S Wang, Z Chen, KA Lee, Y Qian… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org

Speaker individuality information is among the most critical elements within speech signals.
By thoroughly and accurately modeling this information, it can be utilized in various …

Cited by 5 · 4 versions

ASVspoof 5: Crowdsourced speech data, deepfakes, and adversarial attacks at scale

X Wang, H Delgado, H Tak, J Jung, H Shim… - arXiv preprint arXiv …, 2024 - arxiv.org

ASVspoof 5 is the fifth edition in a series of challenges that promote the study of speech
spoofing and deepfake attacks, and the design of detection solutions. Compared to previous …

Cited by 24 · 9 versions

An enhanced Res2Net with local and global feature fusion for speaker verification

Y Chen, S Zheng, H Wang, L Cheng, Q Chen… - arXiv preprint arXiv …, 2023 - arxiv.org

Effective fusion of multi-scale features is crucial for improving speaker verification
performance. While most existing methods aggregate multi-scale features in a layer-wise …

Cited by 38 · 5 versions

ESPnet-SPK: Full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

J Jung, W Zhang, J Shi, Z Aldeneh, T Higuchi… - arXiv preprint arXiv …, 2024 - arxiv.org

This paper introduces ESPnet-SPK, a toolkit designed with several objectives for training
speaker embedding extractors. First, we provide an open-source platform for researchers in …

Cited by 21 · 4 versions

Whisper-SV: Adapting Whisper for low-data-resource speaker verification

L Zhang, N Jiang, Q Wang, Y Li, Q Lu, L Xie - Speech Communication, 2024 - Elsevier

Trained on 680,000 h of massive speech data, Whisper is a multitasking, multilingual
speech foundation model demonstrating superior performance in automatic speech …

Cited by 7 · 4 versions

Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification

T Liu, KA Lee, Q Wang, H Li - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org

The residual neural networks (ResNet) demonstrate the impressive performance in
automatic speaker verification (ASV). They treat the time and frequency dimensions equally …

Cited by 14 · 6 versions

Leveraging ASR pretrained conformers for speaker verification through transfer learning and knowledge distillation

D Cai, M Li - IEEE/ACM Transactions on Audio, Speech, and …, 2024 - ieeexplore.ieee.org

This paper focuses on the application of Conformers in speaker verification. Conformers,
initially designed for Automatic Speech Recognition (ASR), excel at modeling both local and …

Cited by 13 · 4 versions

a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification

H Shim, J Jung, T Kinnunen, N Evans… - arXiv preprint arXiv …, 2024 - arxiv.org

Spoofing detection is today a mainstream research topic. Standard metrics can be applied to
evaluate the performance of isolated spoofing detection solutions and others have been …

Cited by 11 · 7 versions

Branch-ECAPA-TDNN: A parallel branch architecture to capture local and global features for speaker verification

J Yao, C Liang, Z Peng, B Zhang, XL Zhang - Proc. Interspeech, 2023 - xiaolei-zhang.net

Currently, ECAPA-TDNN is one of the state-of-the-art deep models for automatic speaker
verification (ASV). However, it focuses too much on local feature extraction with fixed local …

Cited by 15 · 4 versions

VoxBlink: A large scale speaker verification dataset on camera

Y Lin, X Qin, G Zhao, M Cheng, N Jiang… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

In this paper, we introduce a large-scale and high-quality audiovisual speaker verification
dataset, named VoxBlink. We propose an innovative and robust automatic audio-visual data …

Cited by 15 · 4 versions

MFA-Conformer: Multi-scale feature aggregation conformer for automatic speaker verification
