Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …

Deep representation learning in speech processing: Challenges, recent advances, and future trends

S Latif, R Rana, S Khalifa, R Jurdak, J Qadir… - arXiv preprint arXiv …, 2020 - arxiv.org
Research on speech processing has traditionally considered the task of designing
hand-engineered acoustic features (feature engineering) as a separate distinct problem from the …

ECAPA-TDNN embeddings for speaker diarization

N Dawalatabad, M Ravanelli, F Grondin… - arXiv preprint arXiv …, 2021 - arxiv.org
Learning robust speaker embeddings is a crucial step in speaker diarization. Deep neural
networks can accurately capture speaker discriminative characteristics and popular deep …

Discrete and Parameter-Free Multiple Kernel k-Means

R Wang, J Lu, Y Lu, F Nie, X Li - IEEE Transactions on Image …, 2022 - ieeexplore.ieee.org
The multiple kernel k-means (MKKM) and its variants utilize complementary information from
different sources, achieving better performance than kernel k-means (KKM). However, the …

Self-supervised representation learning with path integral clustering for speaker diarization

P Singh, S Ganapathy - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org
Automatic speaker diarization techniques typically involve a two-stage processing approach
where audio segments of fixed duration are converted to vector representations in the first …

Graph attention-based deep embedded clustering for speaker diarization

Y Wei, H Guo, Z Ge, Z Yang - Speech Communication, 2023 - Elsevier
Deep speaker embedding extraction models have recently served as the cornerstone for
modular speaker diarization systems. However, in current modular systems, the extracted …

Meta-learning with latent space clustering in generative adversarial network for speaker diarization

M Pal, M Kumar, R Peri, TJ Park, SH Kim… - … ACM Transactions on …, 2021 - ieeexplore.ieee.org
The performance of most speaker diarization systems with x-vector embeddings is both
vulnerable to noisy environments and lacks domain robustness. Earlier work on speaker …

Linguistically aided speaker diarization using speaker role information

N Flemotomos, P Georgiou, S Narayanan - arXiv preprint arXiv …, 2019 - arxiv.org
Speaker diarization relies on the assumption that speech segments corresponding to a
particular speaker are concentrated in a specific region of the speaker space; a region which …

Combination of deep speaker embeddings for diarisation

G Sun, C Zhang, PC Woodland - Neural Networks, 2021 - Elsevier
Significant progress has recently been made in speaker diarisation after the introduction of
d-vectors as speaker embeddings extracted from neural network (NN) speaker classifiers for …

Multi-scale speaker diarization with neural affinity score fusion

TJ Park, M Kumar, S Narayanan - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Predicting the speaker's identity of short speech segments in human dialogue has been
considered one of the most challenging problems in speech signal processing. Speaker …