A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks

F Landini, J Profant, M Diez, L Burget - Computer Speech & Language, 2022 - Elsevier
The recently proposed VBx diarization method uses a Bayesian hidden Markov model to
find speaker clusters in a sequence of x-vectors. In this work we perform an extensive …

The third DIHARD diarization challenge

N Ryant, P Singh, V Krishnamohan, R Varma… - arXiv preprint arXiv …, 2020 - arxiv.org
DIHARD III was the third in a series of speaker diarization challenges intended to improve
the robustness of diarization systems to variability in recording equipment, noise conditions …

The CHiME-7 DASR challenge: Distant meeting transcription with multiple devices in diverse scenarios

S Cornell, M Wiesner, S Watanabe, D Raj… - arXiv preprint arXiv …, 2023 - arxiv.org
The CHiME challenges have played a significant role in the development and evaluation of
robust automatic speech recognition (ASR) systems. We introduce the CHiME-7 distant ASR …

The SpeakIn system for VoxCeleb speaker recognition challange 2021

M Zhao, Y Ma, M Liu, M Xu - arXiv preprint arXiv:2109.01989, 2021 - arxiv.org
This report describes our submission to the track 1 and track 2 of the VoxCeleb Speaker
Recognition Challenge 2021 (VoxSRC 2021). Both track 1 and track 2 share the same …

An experimental review of speaker diarization methods with application to two-speaker conversational telephone speech recordings

L Serafini, S Cornell, G Morrone, E Zovato… - Computer Speech & …, 2023 - Elsevier
We performed an experimental review of current diarization systems for the conversational
telephone speech (CTS) domain. In detail, we considered a total of eight different algorithms …

VoxSRC 2021: The third VoxCeleb speaker recognition challenge

A Brown, J Huh, JS Chung, A Nagrani… - arXiv preprint arXiv …, 2022 - arxiv.org
The third instalment of the VoxCeleb Speaker Recognition Challenge was held in
conjunction with Interspeech 2021. The aim of this challenge was to assess how well current …

VoxSRC 2022: The fourth VoxCeleb speaker recognition challenge

J Huh, A Brown, J Jung, JS Chung, A Nagrani… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper summarises the findings from the VoxCeleb Speaker Recognition Challenge
2022 (VoxSRC-22), which was held in conjunction with INTERSPEECH 2022. The goal of …

Encoder-decoder based attractors for end-to-end neural diarization

S Horiguchi, Y Fujita, S Watanabe… - … /ACM Transactions on …, 2022 - ieeexplore.ieee.org
This paper investigates an end-to-end neural diarization (EEND) method for an unknown
number of speakers. In contrast to the conventional cascaded approach to speaker …

The VoxCeleb Speaker Recognition Challenge: A Retrospective

J Huh, JS Chung, A Nagrani, A Brown… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
The VoxCeleb Speaker Recognition Challenges (VoxSRC) were a series of challenges and
workshops that ran annually from 2019 to 2023. The challenges primarily evaluated the …