The third DIHARD diarization challenge
DIHARD III was the third in a series of speaker diarization challenges intended to improve
the robustness of diarization systems to variability in recording equipment, noise conditions …
Novel speech recognition systems applied to forensics within child exploitation: Wav2vec2.0 vs. Whisper
The growth in online child exploitation material is a significant challenge for European Law
Enforcement Agencies (LEAs). One of the most important sources of such online information …
Multiclass audio segmentation based on recurrent neural networks for broadcast domain data
This paper presents a new approach based on recurrent neural networks (RNN) to the
multiclass audio segmentation task whose goal is to classify an audio signal as speech …
Analysis of the BUT diarization system for VoxConverse challenge
This paper describes the system developed by the BUT team for the fourth track of the
VoxCeleb Speaker Recognition Challenge, focusing on diarization on the VoxConverse …
Summary of the DISPLACE challenge 2023-DIarization of SPeaker and LAnguage in Conversational Environments
In multi-lingual societies, where multiple languages are spoken in a small geographic
vicinity, informal conversations often involve a mix of languages. Existing speech technologies …
A comparison of hybrid and end-to-end ASR systems for the IberSpeech-RTVE 2020 speech-to-text transcription challenge
This paper describes a comparison between hybrid and end-to-end Automatic Speech
Recognition (ASR) systems, which were evaluated on the IberSpeech-RTVE 2020 Speech …
An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies
Evaluation campaigns provide a common framework with which the progress of speech
technologies can be effectively measured. The aim of this paper is to present a detailed …
TASE: Task-aware speech enhancement for wake-up word detection in voice assistants
Wake-up word spotting in noisy environments is a critical task for an excellent user
experience with voice assistants. Unwanted activation of the device is often due to the …
Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges
Nowadays, the large amount of audio-visual content available has fostered the need to
develop new robust automatic speaker diarization systems to analyse and characterise it …
LIP-RTVE: An audiovisual database for continuous Spanish in the wild
Speech is considered as a multi-modal process where hearing and vision are two
fundamental pillars. In fact, several studies have demonstrated that the robustness of …