Video and audio deepfake datasets and open issues in deepfake technology: being ahead of the curve
Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com
The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are
extensively being harnessed across a diverse range of domains, e.g., forensic science …
An experimental review of speaker diarization methods with application to two-speaker conversational telephone speech recordings
We performed an experimental review of current diarization systems for the conversational
telephone speech (CTS) domain. In detail, we considered a total of eight different algorithms …
Whisper-SV: Adapting Whisper for low-data-resource speaker verification
Trained on 680,000 h of massive speech data, Whisper is a multitasking, multilingual
speech foundation model demonstrating superior performance in automatic speech …
Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification
The residual neural networks (ResNet) demonstrate the impressive performance in
automatic speaker verification (ASV). They treat the time and frequency dimensions equally …
The VoxCeleb Speaker Recognition Challenge: A Retrospective
The VoxCeleb Speaker Recognition Challenges (VoxSRC) were a series of challenges and
workshops that ran annually from 2019 to 2023. The challenges primarily evaluated the …
VoxTube: a multilingual speaker recognition dataset
The objective of this paper is to advance the development of technologies in the fields of
speaker recognition and speaker identification by introducing a large labeled audio …
VoxWatch: an open-set speaker recognition benchmark on VoxCeleb
Despite its broad practical applications such as in fraud prevention, open-set speaker
identification (OSI) has received less attention in the speaker recognition community …
Comparative Analysis of Modality Fusion Approaches for Audio-visual Person Identification and Verification
Multimodal learning involves integrating information from various modalities to enhance
learning and comprehension. We compare three modality fusion strategies in person …
Individual identification in acoustic recordings
Recent advances in bioacoustics combined with acoustic individual identification (AIID)
could open frontiers for ecological and evolutionary research because traditional methods of …
Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification
In the field of speaker verification, session or channel variability poses a significant
challenge. While many contemporary methods aim to disentangle session information from …