MFA: TDNN with multi-scale frequency-channel attention for text-independent speaker verification with short utterances

T Liu, RK Das, KA Lee, H Li - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
The time delay neural network (TDNN) represents one of the state-of-the-art of neural
solutions to text-independent speaker verification. However, they require a large number of …

MasterKey: Practical backdoor attack against speaker verification systems

H Guo, X Chen, J Guo, L Xiao, Q Yan - Proceedings of the 29th Annual …, 2023 - dl.acm.org
Speaker Verification (SV) is widely deployed in mobile systems to authenticate legitimate
users by using their voice traits. In this work, we propose a backdoor attack MasterKey, to …

Text-independent speaker verification employing CNN-LSTM-TDNN hybrid networks

J Alam, A Fathan, WH Kang - … 2021, St. Petersburg, Russia, September 27 …, 2021 - Springer
Time Delay Neural Network (TDNN)-based speaker embeddings extraction have
become the dominant approach for text-independent speaker verification. Several single …

Hybrid neural network with cross-and self-module attention pooling for text-independent speaker verification

J Alam, WH Kang, A Fathan - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Extraction of a speaker embedding vector plays an important role in deep learning-based
speaker verification. In this contribution, to extract speaker discriminant utterance level …

Self-supervised learning based domain regularization for mask-wearing speaker verification

R Zhang, J Wei, X Lu, W Lu, D Jin, L Zhang, Y Ji… - Speech …, 2023 - Elsevier
Automatic speaker verification (ASV) faces an unprecedented problem due to mask-wearing
speakers, a consequence of COVID-19. Masked speakers unconsciously alter their normal …

Unsupervised adaptive speaker recognition by coupling-regularized optimal transport

R Zhang, J Wei, X Lu, W Lu, D Jin… - … /ACM Transactions on …, 2024 - ieeexplore.ieee.org
Cross-domain speaker recognition (SR) can be improved by unsupervised domain
adaptation (UDA) algorithms. UDA algorithms often reduce domain mismatch at the cost of …

Optimal transport with a diversified memory bank for cross-domain speaker verification

R Zhang, J Wei, X Lu, W Lu, D Jin… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Optimal transport (OT) can be applied to cross-domain adaptation in speaker verification
(SV) by converting speakers' probability distributions from source to target domains …

EfficientTDNN: Efficient architecture search for speaker recognition

R Wang, Z Wei, H Duan, S Ji, Y Long… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Convolutional neural networks (CNNs), such as the time-delay neural network (TDNN), have
shown their remarkable capability in learning speaker embedding. However, they …

Hybrid network with multi-level global-local statistics pooling for robust text-independent speaker recognition

WH Kang, J Alam, A Fathan - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org
In this paper, we propose a new hybrid system for extracting a speaker embedding vector.
More specifically, the proposed system employs a multi-level global-local statistics pooling …

Dynamic convolution with global-local information for session-invariant speaker representation learning

B Gu, W Guo - IEEE Signal Processing Letters, 2021 - ieeexplore.ieee.org
Various mismatched conditions result in performance degradation of the speaker verification
(SV) systems. To address this issue, we extract robust speaker representations by devising a …