Video and audio deepfake datasets and open issues in deepfake technology: being ahead of the curve

Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com
The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are
extensively being harnessed across a diverse range of domains, e.g., forensic science …

An experimental review of speaker diarization methods with application to two-speaker conversational telephone speech recordings

L Serafini, S Cornell, G Morrone, E Zovato… - Computer Speech & …, 2023 - Elsevier
We performed an experimental review of current diarization systems for the conversational
telephone speech (CTS) domain. In detail, we considered a total of eight different algorithms …

Whisper-SV: Adapting Whisper for low-data-resource speaker verification

L Zhang, N Jiang, Q Wang, Y Li, Q Lu, L Xie - Speech Communication, 2024 - Elsevier
Trained on 680,000 h of massive speech data, Whisper is a multitasking, multilingual
speech foundation model demonstrating superior performance in automatic speech …

Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification

T Liu, KA Lee, Q Wang, H Li - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org
The residual neural networks (ResNet) demonstrate the impressive performance in
automatic speaker verification (ASV). They treat the time and frequency dimensions equally …

The VoxCeleb Speaker Recognition Challenge: A Retrospective

J Huh, JS Chung, A Nagrani, A Brown… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
The VoxCeleb Speaker Recognition Challenges (VoxSRC) were a series of challenges and
workshops that ran annually from 2019 to 2023. The challenges primarily evaluated the …

VoxTube: a multilingual speaker recognition dataset

I Yakovlev, A Okhotnikov, N Torgashov… - Proc …, 2023 - isca-archive.org
The objective of this paper is to advance the development of technologies in the fields of
speaker recognition and speaker identification by introducing a large labeled audio …

VoxWatch: an open-set speaker recognition benchmark on VoxCeleb

R Peri, SO Sadjadi, D Garcia-Romero - arXiv preprint arXiv:2307.00169, 2023 - arxiv.org
Despite its broad practical applications such as in fraud prevention, open-set speaker
identification (OSI) has received less attention in the speaker recognition community …

Comparative Analysis of Modality Fusion Approaches for Audio-visual Person Identification and Verification

A Farhadipour, M Chapariniya, T Vukovic… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal learning involves integrating information from various modalities to enhance
learning and comprehension. We compare three modality fusion strategies in person …

Individual identification in acoustic recordings

E Knight, T Rhinehart, DR de Zwaan, MJ Weldy… - Trends in Ecology & …, 2024 - cell.com
Recent advances in bioacoustics combined with acoustic individual identification (AIID)
could open frontiers for ecological and evolutionary research because traditional methods of …

Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification

HS Heo, KH Nam, BJ Lee, Y Kwon… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
In the field of speaker verification, session or channel variability poses a significant
challenge. While many contemporary methods aim to disentangle session information from …