VoxCeleb: Large-scale speaker verification in the wild

A Nagrani, JS Chung, W Xie, A Zisserman - Computer Speech & Language, 2020 - Elsevier
The objective of this work is speaker recognition under noisy and unconstrained conditions.
We make two key contributions. First, we introduce a very large-scale audio-visual dataset …

Optimization of data-driven filterbank for automatic speaker verification

S Sarangi, M Sahidullah, G Saha - Digital Signal Processing, 2020 - Elsevier
Most of the speech processing applications use triangular filters spaced in mel-scale for
feature extraction. In this paper, we propose a new data-driven filter design method which …

Xi-vector embedding for speaker recognition

KA Lee, Q Wang, T Koshinaka - IEEE Signal Processing Letters, 2021 - ieeexplore.ieee.org
We present a Bayesian formulation for deep speaker embedding, wherein the xi-vector is
the Bayesian counterpart of the x-vector, taking into account the uncertainty estimate. On the …

Audio-visual speaker recognition with a cross-modal discriminative network

R Tao, RK Das, H Li - arXiv preprint arXiv:2008.03894, 2020 - arxiv.org
Audio-visual speaker recognition is one of the tasks in the recent 2019 NIST speaker
recognition evaluation (SRE). Studies in neuroscience and computer science all point to the …

VoxCeleb enrichment for age and gender recognition

K Hechmi, TN Trong, V Hautamäki… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
VoxCeleb datasets are widely used in speaker recognition studies. Our work serves two
purposes. First, we provide speaker age labels and (an alternative) annotation of speaker …

A study of bias mitigation strategies for speaker recognition

R Peri, K Somandepalli, S Narayanan - Computer Speech & Language, 2023 - Elsevier
Speaker recognition is increasingly used in several everyday applications including smart
speakers, customer care centers and other speech-driven analytics. It is crucial to accurately …

Towards robust speaker verification with target speaker enhancement

C Zhang, M Yu, C Weng, D Yu - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
This paper proposes the target speaker enhancement based speaker verification network
(TASE-SVNet), an all neural model that couples target speaker enhancement and speaker …

An investigation of domain adaptation in speaker embedding space for speaker recognition

F Bahmaninezhad, C Zhang, JHL Hansen - Speech Communication, 2021 - Elsevier
Speaker recognition continues to grow as a research challenge in the field with expanded
application in commercial, forensic, educational and general speech technology interfaces …

Incorporating uncertainty from speaker embedding estimation to speaker verification

Q Wang, KA Lee, T Liu - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
Speech utterances recorded under differing conditions exhibit varying degrees of
confidence in their embedding estimates, i.e., uncertainty, even if they are extracted using the …

NEC-TT system for mixed-bandwidth and multi-domain speaker recognition

KA Lee, H Yamamoto, K Okabe, Q Wang, L Guo… - Computer Speech & …, 2020 - Elsevier
This paper describes the NEC-TT speaker recognition system designed for the 2018
Speaker Recognition Evaluation (SRE'18) benchmarking. The NEC-TT submission was …