Speaker recognition for multi-speaker conversations using x-vectors

D Snyder, D Garcia-Romero, G Sell… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
Recently, deep neural networks that map utterances to fixed-dimensional embeddings have
emerged as the state-of-the-art in speaker recognition. Our prior work introduced x-vectors …

State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and Speakers in the Wild evaluations

J Villalba, N Chen, D Snyder, D Garcia-Romero… - Computer Speech & …, 2020 - Elsevier
We present a thorough analysis of the systems developed by the JHU-MIT consortium in the
context of NIST speaker recognition evaluation 2018. In the previous NIST evaluation, in …

State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18

J Villalba, N Chen, D Snyder, D Garcia-Romero… - Interspeech, 2019 - danielpovey.com
We present a condensed description of the joint effort of JHU-CLSP, JHU-HLTCOE, MIT-LL,
MIT CSAIL and LSE-EPITA for NIST SRE18. All the developed systems consisted of x-vector/i …

MagNetO: X-vector Magnitude Estimation Network plus Offset for Improved Speaker Recognition

D Garcia-Romero, G Sell, A McCree - Odyssey, 2020 - isca-archive.org
We present a magnitude estimation network that is combined with a modified ResNet x-vector
system to generate embeddings whose inner product is able to produce calibrated …

JHU-HLTCOE system for the VoxSRC speaker recognition challenge

D Garcia-Romero, A McCree… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
The VoxSRC speaker recognition challenge comprises data obtained from YouTube videos
of celebrity interviews in a wide range of recording environments. The challenge provides …

x-vector DNN refinement with full-length recordings for speaker recognition

D Garcia-Romero, D Snyder, G Sell, A McCree… - Interspeech, 2019 - danielpovey.com
State-of-the-art text-independent speaker recognition systems for long recordings (a few
minutes) are based on deep neural network (DNN) speaker embeddings. Current …

Self-supervised speaker embeddings

T Stafylakis, J Rohdin, O Plchot, P Mizera… - arXiv preprint arXiv …, 2019 - arxiv.org
Contrary to i-vectors, speaker embeddings such as x-vectors are incapable of leveraging
unlabelled utterances, due to the classification loss over training speakers. In this paper, we …

Variational domain adversarial learning for speaker verification

Y Tu, MW Mak, JT Chien - 2019 - ira.lib.polyu.edu.hk
Domain mismatch refers to the problem in which the distribution of training data differs from
that of the test data. This paper proposes a variational domain adversarial neural network …

[BOOK] Machine learning for speaker recognition

MW Mak, JT Chien - 2020 - books.google.com
This book will help readers understand fundamental and advanced statistical models and
deep learning models for robust speaker recognition and domain adaptation. This useful …

A speaker verification backend with robust performance across conditions

L Ferrer, M McLaren, N Brümmer - Computer Speech & Language, 2022 - Elsevier
In this paper, we address the problem of speaker verification in conditions unseen or
unknown during development. A standard method for speaker verification consists of …