- Academic Search

A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 224

Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier

Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …

Cited by 439

A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier

Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

Cited by 417

VoiceFilter: Targeted voice separation by speaker-conditioned spectrogram masking

Q Wang, H Muckenhirn, K Wilson, P Sridhar… - arXiv preprint arXiv …, 2018 - arxiv.org

In this paper, we present a novel system that separates the voice of a target speaker from
multi-speaker signals, by making use of a reference signal from the target speaker. We …

Cited by 452

Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks

F Landini, J Profant, M Diez, L Burget - Computer Speech & Language, 2022 - Elsevier

The recently proposed VBx diarization method uses a Bayesian hidden Markov model to
find speaker clusters in a sequence of x-vectors. In this work we perform an extensive …

Cited by 231

End-to-end neural speaker diarization with self-attention

Y Fujita, N Kanda, S Horiguchi, Y Xue… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org

Speaker diarization has been mainly developed based on the clustering of speaker
embeddings. However, the clustering-based approach has two major problems; i.e., (i) it is not …

Cited by 291

End-to-end neural speaker diarization with permutation-free objectives

Y Fujita, N Kanda, S Horiguchi, K Nagamatsu… - arXiv preprint arXiv …, 2019 - arxiv.org

In this paper, we propose a novel end-to-end neural-network-based speaker diarization
method. Unlike most existing methods, our proposed method does not have separate …

Cited by 288

The third DIHARD diarization challenge

N Ryant, P Singh, V Krishnamohan, R Varma… - arXiv preprint arXiv …, 2020 - arxiv.org

DIHARD III was the third in a series of speaker diarization challenges intended to improve
the robustness of diarization systems to variability in recording equipment, noise conditions …

Cited by 172

End-to-end speaker diarization for an unknown number of speakers with encoder-decoder based attractors

S Horiguchi, Y Fujita, S Watanabe, Y Xue… - arXiv preprint arXiv …, 2020 - arxiv.org

End-to-end speaker diarization for an unknown number of speakers is addressed in this
paper. Recently proposed end-to-end speaker diarization outperformed conventional …

Cited by 210

Spot the conversation: speaker diarisation in the wild

JS Chung, J Huh, A Nagrani, T Afouras… - arXiv preprint arXiv …, 2020 - arxiv.org

The goal of this paper is speaker diarisation of videos collected 'in the wild'. We make three
key contributions. First, we propose an automatic audio-visual diarisation method for …

Cited by 182

Fully supervised speaker diarization

A review of deep learning techniques for speech processing
