Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …

A survey on text-dependent and text-independent speaker verification

Y Tu, W Lin, MW Mak - IEEE Access, 2022 - ieeexplore.ieee.org
Speaker verification (SV) aims to detect an individual's identity from his/her voice. SV has
been successfully applied in various areas such as access control, remote service …

Augmentation adversarial training for self-supervised speaker recognition

J Huh, HS Heo, J Kang, S Watanabe… - arXiv preprint arXiv …, 2020 - arxiv.org
The goal of this work is to train robust speaker recognition models without speaker labels.
Recent works on unsupervised speaker representations are based on contrastive learning …

A deep neural network for short-segment speaker recognition

A Hajavi, A Etemad - arXiv preprint arXiv:1907.10420, 2019 - arxiv.org
Today's interactive devices such as smart-phone assistants and smart speakers often deal
with short-duration speech segments. As a result, speaker recognition systems integrated …

Whisper-SV: Adapting Whisper for low-data-resource speaker verification

L Zhang, N Jiang, Q Wang, Y Li, Q Lu, L Xie - Speech Communication, 2024 - Elsevier
Trained on 680,000 h of massive speech data, Whisper is a multitasking, multilingual
speech foundation model demonstrating superior performance in automatic speech …

Self-supervised learning based domain adaptation for robust speaker verification

Z Chen, S Wang, Y Qian - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Large performance degradation is often observed for speaker verification systems when
applied to a new domain dataset. Given an unlabeled target-domain dataset, unsupervised …

Revisiting the statistics pooling layer in deep speaker embedding learning

S Wang, Y Yang, Y Qian, K Yu - 2021 12th International …, 2021 - ieeexplore.ieee.org
The pooling function plays a vital role in the segment-level deep speaker embedding
learning framework. One common method is to calculate the statistics of the temporal …

Augmentation adversarial training for self-supervised speaker representation learning

J Kang, J Huh, HS Heo… - IEEE Journal of Selected …, 2022 - ieeexplore.ieee.org
The goal of this work is to train robust speaker recognition models using self-supervised
representation learning. Recent works on self-supervised speaker representations are …

Overview of speaker modeling and its applications: From the lens of deep speaker representation learning

S Wang, Z Chen, KA Lee, Y Qian… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
Speaker individuality information is among the most critical elements within speech signals.
By thoroughly and accurately modeling this information, it can be utilized in various …

Delving into VoxCeleb: environment invariant speaker recognition

JS Chung, J Huh, S Mun - arXiv preprint arXiv:1910.11238, 2019 - arxiv.org
Research in speaker recognition has recently seen significant progress due to the
application of neural network models and the availability of new large-scale datasets. There …