Speaker recognition based on deep learning: An overview
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …
A survey on text-dependent and text-independent speaker verification
Speaker verification (SV) aims to detect an individual's identity from his/her voice. SV has
been successfully applied in various areas such as access control, remote service …
Augmentation adversarial training for self-supervised speaker recognition
The goal of this work is to train robust speaker recognition models without speaker labels.
Recent works on unsupervised speaker representations are based on contrastive learning …
A deep neural network for short-segment speaker recognition
Today's interactive devices such as smart-phone assistants and smart speakers often deal
with short-duration speech segments. As a result, speaker recognition systems integrated …
Whisper-SV: Adapting Whisper for low-data-resource speaker verification
Trained on 680,000 h of massive speech data, Whisper is a multitasking, multilingual
speech foundation model demonstrating superior performance in automatic speech …
Self-supervised learning based domain adaptation for robust speaker verification
Large performance degradation is often observed for speaker verification systems when
applied to a new domain dataset. Given an unlabeled target-domain dataset, unsupervised …
Revisiting the statistics pooling layer in deep speaker embedding learning
The pooling function plays a vital role in the segment-level deep speaker embedding
learning framework. One common method is to calculate the statistics of the temporal …
Augmentation adversarial training for self-supervised speaker representation learning
The goal of this work is to train robust speaker recognition models using self-supervised
representation learning. Recent works on self-supervised speaker representations are …
Overview of speaker modeling and its applications: From the lens of deep speaker representation learning
Speaker individuality information is among the most critical elements within speech signals.
By thoroughly and accurately modeling this information, it can be utilized in various …
Delving into voxceleb: environment invariant speaker recognition
Research in speaker recognition has recently seen significant progress due to the
application of neural network models and the availability of new large-scale datasets. There …