FSD50K: an open dataset of human-labeled sound events

E Fonseca, X Favory, J Pons, F Font… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Most existing datasets for sound event recognition (SER) are relatively small and/or domain-
specific, with the exception of AudioSet, based on over 2 M tracks from YouTube videos and …

Unsupervised contrastive learning of sound event representations

E Fonseca, D Ortego, K McGuinness… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Self-supervised representation learning can mitigate the limitations in recognition tasks with
few manually labeled data but abundant unlabeled data—a common scenario in sound …

Audio tagging with noisy labels and minimal supervision

E Fonseca, M Plakal, F Font, DPW Ellis… - arXiv preprint arXiv …, 2019 - arxiv.org
This paper introduces Task 2 of the DCASE2019 Challenge, titled "Audio tagging with noisy
labels and minimal supervision". This task was hosted on the Kaggle platform as "Freesound …

Addressing missing labels in large-scale sound event recognition using a teacher-student framework with loss masking

E Fonseca, S Hershey, M Plakal… - IEEE Signal …, 2020 - ieeexplore.ieee.org
The study of label noise in sound event recognition has recently gained attention with the
advent of larger and noisier datasets. This work addresses the problem of missing labels …

Self-supervised learning from automatically separated sound scenes

E Fonseca, A Jansen, DPW Ellis… - … IEEE Workshop on …, 2021 - ieeexplore.ieee.org
Real-world sound scenes consist of time-varying collections of sound sources, each
generating characteristic sound events that are mixed together in audio recordings. The …

Improving sound event classification by increasing shift invariance in convolutional neural networks

E Fonseca, A Ferraro, X Serra - arXiv preprint arXiv:2107.00623, 2021 - arxiv.org
Recent studies have put into question the commonly assumed shift invariance property of
convolutional networks, showing that small shifts in the input can affect the output …

Lightweight convolutional-iConformer for sound event detection

TK Chan, CS Chin - IEEE Transactions on Artificial Intelligence, 2022 - ieeexplore.ieee.org
The development of a sound event detection (SED) system is no trivial task where one has
to consider both audio tagging and temporal localization concurrently. Often model …

A hybrid parametric-deep learning approach for sound event localization and detection

A Pérez-López, E Fonseca, X Serra - arXiv preprint arXiv:1908.10133, 2019 - arxiv.org
This work describes and discusses an algorithm submitted to the Sound Event Localization
and Detection Task of DCASE2019 Challenge. The proposed methodology relies on …

Noisy Web Supervision for Audio Classification

T Iqbal - 2022 - openresearch.surrey.ac.uk
Audio classification and other fields of pattern recognition have developed at an astounding
pace due to advances in machine learning. The availability of training data, especially …

Improving Generalization of Deep Learning Music Classifiers

M Buisson - 2021 - academia.edu
Deep learning models have recently led to significant improvements in a wide variety of
tasks. Known as being a very powerful tool capable of generalizing better than traditional …