A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …

Integrating end-to-end neural and clustering-based diarization: Getting the best of both worlds

K Kinoshita, M Delcroix… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Recent diarization technologies can be categorized into two approaches, i.e., clustering and
end-to-end neural approaches, which have different pros and cons. The clustering-based …

Attention-based encoder-decoder end-to-end neural diarization with embedding enhancer

Z Chen, B Han, S Wang, Y Qian - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
Deep neural network-based systems have significantly improved the performance of
speaker diarization tasks. However, end-to-end neural diarization (EEND) systems often …