Speaker recognition based on deep learning: An overview
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …
A review of speaker diarization: Recent advances with deep learning
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …
CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings
Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges we organize the
6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge …
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks
The recently proposed VBx diarization method uses a Bayesian hidden Markov model to
find speaker clusters in a sequence of x-vectors. In this work we perform an extensive …
Speaker recognition for multi-speaker conversations using x-vectors
Recently, deep neural networks that map utterances to fixed-dimensional embeddings have
emerged as the state-of-the-art in speaker recognition. Our prior work introduced x-vectors …
The third DIHARD diarization challenge
DIHARD III was the third in a series of speaker diarization challenges intended to improve
the robustness of diarization systems to variability in recording equipment, noise conditions …
End-to-end neural speaker diarization with self-attention
Speaker diarization has been mainly developed based on the clustering of speaker
embeddings. However, the clustering-based approach has two major problems; i.e., (i) it is not …
End-to-end neural speaker diarization with permutation-free objectives
In this paper, we propose a novel end-to-end neural-network-based speaker diarization
method. Unlike most existing methods, our proposed method does not have separate …
Target-speaker voice activity detection: a novel approach for multi-speaker diarization in a dinner party scenario
I Medennikov, M Korenevsky, T Prisyach… - arXiv preprint arXiv …, 2020 - arxiv.org
Speaker diarization for real-life scenarios is an extremely challenging problem. Widely used
clustering-based diarization approaches perform rather poorly in such conditions, mainly …
Spot the conversation: speaker diarisation in the wild
The goal of this paper is speaker diarisation of videos collected 'in the wild'. We make three
key contributions. First, we propose an automatic audio-visual diarisation method for …