Sound event detection: A tutorial

A Mesaros, T Heittola, T Virtanen… - IEEE Signal …, 2021 - ieeexplore.ieee.org
Imagine standing on a street corner in the city. With your eyes closed you can hear and
recognize a succession of sounds: cars passing by, people speaking, their footsteps when …

Deep learning on multi sensor data for counter UAV applications—A systematic review

S Samaras, E Diamantidou, D Ataloglou, N Sakellariou… - Sensors, 2019 - mdpi.com
Usage of Unmanned Aerial Vehicles (UAVs) is growing rapidly in a wide range of consumer
applications, as they prove to be both autonomous and flexible in a variety of environments …

WavCaps: A ChatGPT-assisted weakly-labelled audio captioning dataset for audio-language multimodal research

X Mei, C Meng, H Liu, Q Kong, T Ko… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
The advancement of audio-language (AL) multimodal learning tasks has been significant in
recent years, yet the limited size of existing audio-language datasets poses challenges for …

Machine learning in acoustics: Theory and applications

MJ Bianco, P Gerstoft, J Traer, E Ozanich… - The Journal of the …, 2019 - pubs.aip.org
Acoustic data provide scientific and engineering insights in fields ranging from biology and
communications to ocean and Earth science. We survey the recent advances and …

A multi-device dataset for urban acoustic scene classification

A Mesaros, T Heittola, T Virtanen - arXiv preprint arXiv:1807.09840, 2018 - arxiv.org
This paper introduces the acoustic scene classification task of DCASE 2018 Challenge and
the TUT Urban Acoustic Scenes 2018 dataset provided for the task, and evaluates the …

SoundNet: Learning sound representations from unlabeled video

Y Aytar, C Vondrick, A Torralba - Advances in neural …, 2016 - proceedings.neurips.cc
We learn rich natural sound representations by capitalizing on large amounts of unlabeled
sound data collected in the wild. We leverage the natural synchronization between vision …

DCASE 2017 challenge setup: Tasks, datasets and baseline system

A Mesaros, T Heittola, A Diment, B Elizalde… - … 2017-workshop on …, 2017 - inria.hal.science
DCASE 2017 Challenge consists of four tasks: acoustic scene classification, detection of
rare sound events, sound event detection in real-life audio, and large-scale weakly …

ESC: Dataset for environmental sound classification

KJ Piczak - Proceedings of the 23rd ACM international conference …, 2015 - dl.acm.org
One of the obstacles in research activities concentrating on environmental sound
classification is the scarcity of suitable and publicly available datasets. This paper tries to …

Environmental sound classification with convolutional neural networks

KJ Piczak - 2015 IEEE 25th international workshop on machine …, 2015 - ieeexplore.ieee.org
This paper evaluates the potential of convolutional neural networks in classifying short audio
clips of environmental sounds. A deep model consisting of 2 convolutional layers with max …

Detection and classification of acoustic scenes and events: Outcome of the DCASE 2016 challenge

A Mesaros, T Heittola, E Benetos… - … on Audio, Speech …, 2017 - ieeexplore.ieee.org
Public evaluation campaigns and datasets promote active development in target research
areas, allowing direct comparison of algorithms. The second edition of the challenge on …