A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

Speaker diarization with LSTM

Q Wang, C Downey, L Wan… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org
For many years, i-vector based audio embedding techniques were the dominant approach
for speaker verification and speaker diarization applications. However, mirroring the rise of …

Target-speaker voice activity detection: a novel approach for multi-speaker diarization in a dinner party scenario

I Medennikov, M Korenevsky, T Prisyach… - arXiv preprint arXiv …, 2020 - arxiv.org
Speaker diarization for real-life scenarios is an extremely challenging problem. Widely used
clustering-based diarization approaches perform rather poorly in such conditions, mainly …

Fully supervised speaker diarization

A Zhang, Q Wang, Z Zhu, J Paisley… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
In this paper, we propose a fully supervised speaker diarization approach, named
unbounded interleaved-state recurrent neural networks (UIS-RNN). Given extracted speaker …

Auto-tuning spectral clustering for speaker diarization using normalized maximum eigengap

TJ Park, KJ Han, M Kumar… - IEEE Signal Processing …, 2019 - ieeexplore.ieee.org
In this study, we propose a new spectral clustering framework that can auto-tune the
parameters of the clustering algorithm in the context of speaker diarization. The proposed …

An experimental review of speaker diarization methods with application to two-speaker conversational telephone speech recordings

L Serafini, S Cornell, G Morrone, E Zovato… - Computer Speech & …, 2023 - Elsevier
We performed an experimental review of current diarization systems for the conversational
telephone speech (CTS) domain. In detail, we considered a total of eight different algorithms …

Unsupervised methods for speaker diarization: An integrated and iterative approach

SH Shum, N Dehak, R Dehak… - IEEE Transactions on …, 2013 - ieeexplore.ieee.org
In speaker diarization, standard approaches typically perform speaker clustering on some
initial segmentation before refining the segment boundaries in a re-segmentation step to …

Incremental spectral clustering by efficiently updating the eigen-system

H Ning, W Xu, Y Chi, Y Gong, TS Huang - Pattern Recognition, 2010 - Elsevier
In recent years, the spectral clustering method has gained attention because of its superior
performance. To the best of our knowledge, the existing spectral clustering algorithms …

Discriminative neural clustering for speaker diarisation

Q Li, FL Kreyssig, C Zhang… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org
In this paper, we propose Discriminative Neural Clustering (DNC) that formulates data
clustering with a maximum number of clusters as a supervised sequence-to-sequence …

Target-speaker voice activity detection with improved i-vector estimation for unknown number of speaker

M He, D Raj, Z Huang, J Du, Z Chen… - arXiv preprint arXiv …, 2021 - arxiv.org
Target-speaker voice activity detection (TS-VAD) has recently shown promising results for
speaker diarization on highly overlapped speech. However, the original model requires a …