Unified multisensory perception: Weakly-supervised audio-visual video parsing
In this paper, we introduce a new problem, named audio-visual video parsing, which aims to
parse a video into temporal event segments and label them as either audible, visible, or …
Sound event detection in domestic environments with weakly labeled data and soundscape synthesis
This paper presents Task 4 of the Detection and Classification of Acoustic Scenes and
Events (DCASE) 2019 challenge and provides a first analysis of the challenge results. The …
A framework for the robust evaluation of sound event detection
This work defines a new framework for performance evaluation of polyphonic sound event
detection (SED) systems, which overcomes the limitations of the conventional collar-based …
General-purpose tagging of Freesound audio with AudioSet labels: Task description, dataset, and baseline
This paper describes Task 2 of the DCASE 2018 Challenge, titled "General-purpose audio
tagging of Freesound content with AudioSet labels". This task was hosted on the Kaggle …
Sound event detection of weakly labelled data with CNN-Transformer and automatic threshold optimization
Sound event detection (SED) is a task to detect sound events in an audio recording. One
challenge of the SED task is that many datasets such as the Detection and Classification of …
Sound event detection in synthetic domestic environments
We present a comparative analysis of the performance of state-of-the-art sound event
detection systems. In particular, we study the robustness of the systems to noise and signal …
Training sound event detection on a heterogeneous dataset
Training a sound event detection algorithm on a heterogeneous dataset including both
recorded and synthetic soundscapes that can have various labeling granularity is a non …
A transformer-based audio captioning model with keyword estimation
One of the problems with automated audio captioning (AAC) is the indeterminacy in word
selection corresponding to the audio event/scene. Since one acoustic event/scene can be …
Sound event detection in the DCASE 2017 challenge
Each edition of the challenge on Detection and Classification of Acoustic Scenes and Events
(DCASE) contained several tasks involving sound event detection in different setups …
Cross-task learning for audio tagging, sound event detection and spatial localization: DCASE 2019 baseline systems
The Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 challenge
focuses on audio tagging, sound event detection and spatial localisation. DCASE 2019 …