A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

Speaker diarization: A review of recent research

X Anguera, S Bozonnet, N Evans… - … on audio, speech …, 2012 - ieeexplore.ieee.org
Speaker diarization is the task of determining “who spoke when?” in an audio or video
recording that contains an unknown amount of speech and also an unknown number of …

[BOOK] Speaker recognition

H Beigi - 2011 - Springer
The objective of the enrollment process is to modify (adapt) a speaker-independent model
into one that best characterizes the target speaker's vocal tract characteristics. Depending …

The ICSI RT07s speaker diarization system

C Wooters, M Huijbregts - International Evaluation Workshop on Rich …, 2007 - Springer
In this paper, we present the ICSI speaker diarization system. This system was used in the
2007 National Institute of Standards and Technology (NIST) Rich Transcription evaluation …

Speaker segmentation and clustering

M Kotti, V Moschou, C Kotropoulos - Signal processing, 2008 - Elsevier
This survey focuses on two challenging speech processing topics, namely: speaker
segmentation and speaker clustering. Speaker segmentation aims at finding speaker …

A review on speaker diarization systems and approaches

MH Moattar, MM Homayounpour - Speech Communication, 2012 - Elsevier
Speaker indexing or diarization is an important task in audio processing and retrieval.
Speaker diarization is the process of labeling a speech signal with labels corresponding to …

[BOOK] Robust speaker diarization for meetings

X Anguera Miró - 2006 - upcommons.upc.edu
This thesis shows research performed into the topic of speaker diarization for meeting
rooms. It looks into the algorithms and the implementation of an offline speaker …

Augmenting transformer-transducer based speaker change detection with token-level training loss

G Zhao, Q Wang, H Lu, Y Huang… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
In this work we propose a novel token-based training strategy that improves Transformer-
Transducer (TT) based speaker change detection (SCD) performance. The conventional TT …