Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges

V Mingote, A Ortega, A Miguel, E Lleida - arXiv preprint arXiv:2409.05659, 2024 - arxiv.org
Nowadays, the large amount of audio-visual content available has fostered the need to
develop new robust automatic speaker diarization systems to analyse and characterise it …

Multimodal diarization systems by training enrollment models as identity representations

V Mingote, I Viñals, P Gimeno, A Miguel, A Ortega… - Applied Sciences, 2022 - mdpi.com
This paper describes a post-evaluation analysis of the system developed by the ViVoLAB
research group for the IberSPEECH-RTVE 2020 Multimodal Diarization (MD) Challenge …

Design of Intelligent models for Multimodal Socio-Affective Computing

C Luna Jiménez - 2023 - oa.upm.es
Dialog and human-machine communication systems have represented a revolution in
recent years. Nonetheless, users increasingly require more personalized and human-like …

Representation and metric learning advances for deep neural network face and speaker biometric systems

VM Bueno - 2022 - researchgate.net
The increasing use of technological devices and biometric recognition systems in people's
daily lives has motivated a great deal of research interest in the development of effective and …