PCF: ECAPA-TDNN with progressive channel fusion for speaker verification

Z Zhao, Z Li, W Wang, P Zhang - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
ECAPA-TDNN is currently the most popular TDNN-series model for speaker verification,
which refreshed the state-of-the-art (SOTA) performance of TDNN models. However, one …

Studying squeeze-and-excitation used in CNN for speaker verification

M Rouvier, PM Bousquet - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org
In speaker verification, the extraction of voice representations is mainly based on the
Residual Neural Network (ResNet) architecture. ResNet is built upon convolution layers …

Explore long-range context features for speaker verification

Z Li, Z Zhao, W Wang, P Zhang, Q Zhao - Applied Sciences, 2023 - mdpi.com
Multi-scale context information, especially long-range dependency, has shown to be
beneficial for speaker verification (SV) tasks. In this paper, we propose three methods to …

EcoSpeak: Cost-Efficient Bias Mitigation for Partially Cross-Lingual Speaker Verification

D Sharma - Findings of the Association for Computational …, 2024 - aclanthology.org
Linguistic bias is a critical problem concerning the diversity, equity, and inclusiveness of
Natural Language Processing tools. The severity of this problem intensifies in security …

Progressive channel fusion for more efficient TDNN on speaker verification

Z Zhao, Z Li, W Wang, J Xu - Speech Communication, 2024 - Elsevier
ECAPA-TDNN is one of the most popular TDNNs for speaker verification. While most of the
updates pay attention to building precisely designed auxiliary modules, the depth-first …

Speaker Verification Using Spatial Attention Mechanism And Semantic Enhancement

P Li, X Liu, W Suqin, X Xie - 2024 43rd Chinese Control …, 2024 - ieeexplore.ieee.org
This paper introduces a novel speaker verification model that enhances the performance of
convolution-driven speaker recognition models. The proposed model incorporates a new …

Back-ends Selection for Deep Speaker Embeddings

Z Li, R Xiao, Z Zhang, Z Zhao, W Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Probabilistic Linear Discriminant Analysis (PLDA) was the dominant and necessary back-
end for early speaker recognition approaches, like i-vector and x-vector. However, with the …

I4U System Description for NIST SRE'20 CTS Challenge

KA Lee, T Kinnunen, D Colibro, C Vair… - arXiv preprint arXiv …, 2022 - arxiv.org
This manuscript describes the I4U submission to the 2020 NIST Speaker Recognition
Evaluation (SRE'20) Conversational Telephone Speech (CTS) Challenge. The I4U's …