The third DIHARD diarization challenge

N Ryant, P Singh, V Krishnamohan, R Varma… - arXiv preprint arXiv …, 2020 - arxiv.org
DIHARD III was the third in a series of speaker diarization challenges intended to improve
the robustness of diarization systems to variability in recording equipment, noise conditions …

Novel speech recognition systems applied to forensics within child exploitation: Wav2vec2.0 vs. Whisper

JC Vásquez-Correa, A Álvarez Muniain - Sensors, 2023 - mdpi.com
The growth in online child exploitation material is a significant challenge for European Law
Enforcement Agencies (LEAs). One of the most important sources of such online information …

Multiclass audio segmentation based on recurrent neural networks for broadcast domain data

P Gimeno, I Viñals, A Ortega, A Miguel… - EURASIP Journal on …, 2020 - Springer
This paper presents a new approach based on recurrent neural networks (RNN) to the
multiclass audio segmentation task whose goal is to classify an audio signal as speech …

Analysis of the BUT diarization system for VoxConverse challenge

F Landini, O Glembek, P Matějka… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
This paper describes the system developed by the BUT team for the fourth track of the
VoxCeleb Speaker Recognition Challenge, focusing on diarization on the VoxConverse …

Summary of the DISPLACE challenge 2023-DIarization of SPeaker and LAnguage in Conversational Environments

S Baghel, S Ramoji, S Jain, PR Chowdhuri… - Speech …, 2024 - Elsevier
In multi-lingual societies, where multiple languages are spoken in a small geographic
vicinity, informal conversations often involve a mix of languages. Existing speech technologies …

A comparison of hybrid and end-to-end ASR systems for the IberSpeech-RTVE 2020 speech-to-text transcription challenge

JM Perero-Codosero, FM Espinoza-Cuadros… - Applied Sciences, 2022 - mdpi.com
This paper describes a comparison between hybrid and end-to-end Automatic Speech
Recognition (ASR) systems, which were evaluated on the IberSpeech-RTVE 2020 Speech …

An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies

E Lleida, LJ Rodriguez-Fuentes, J Tejedor, A Ortega… - Applied Sciences, 2023 - mdpi.com
Evaluation campaigns provide a common framework with which the progress of speech
technologies can be effectively measured. The aim of this paper is to present a detailed …

TASE: Task-aware speech enhancement for wake-up word detection in voice assistants

G Cámbara, F López, D Bonet, P Gómez, C Segura… - Applied Sciences, 2022 - mdpi.com
Wake-up word spotting in noisy environments is a critical task for an excellent user
experience with voice assistants. Unwanted activation of the device is often due to the …

Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges

V Mingote, A Ortega, A Miguel, E Lleida - arXiv preprint arXiv:2409.05659, 2024 - arxiv.org
Nowadays, the large amount of audio-visual content available has fostered the need to
develop new robust automatic speaker diarization systems to analyse and characterise it …

LIP-RTVE: An audiovisual database for continuous Spanish in the wild

D Gimeno-Gómez, CD Martínez-Hinarejos - arXiv preprint arXiv …, 2023 - arxiv.org
Speech is considered as a multi-modal process where hearing and vision are two
fundamental pillars. In fact, several studies have demonstrated that the robustness of …