A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
Speaker recognition based on deep learning: An overview
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …
A review of speaker diarization: Recent advances with deep learning
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …
VoiceFilter: Targeted voice separation by speaker-conditioned spectrogram masking
In this paper, we present a novel system that separates the voice of a target speaker from
multi-speaker signals, by making use of a reference signal from the target speaker. We …
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks
The recently proposed VBx diarization method uses a Bayesian hidden Markov model to
find speaker clusters in a sequence of x-vectors. In this work we perform an extensive …
End-to-end neural speaker diarization with self-attention
Speaker diarization has been mainly developed based on the clustering of speaker
embeddings. However, the clustering-based approach has two major problems; i.e., (i) it is not …
End-to-end neural speaker diarization with permutation-free objectives
In this paper, we propose a novel end-to-end neural-network-based speaker diarization
method. Unlike most existing methods, our proposed method does not have separate …
The third DIHARD diarization challenge
DIHARD III was the third in a series of speaker diarization challenges intended to improve
the robustness of diarization systems to variability in recording equipment, noise conditions …
End-to-end speaker diarization for an unknown number of speakers with encoder-decoder based attractors
End-to-end speaker diarization for an unknown number of speakers is addressed in this
paper. Recently proposed end-to-end speaker diarization outperformed conventional …
Spot the conversation: speaker diarisation in the wild
The goal of this paper is speaker diarisation of videos collected 'in the wild'. We make three
key contributions. First, we propose an automatic audio-visual diarisation method for …