A survey on deep learning based forest environment sound classification at the edge
Forest ecosystems are of paramount importance to the sustainable existence of life on earth.
Unique natural and artificial phenomena pose severe threats to the perseverance of such …
FSD50K: an open dataset of human-labeled sound events
Most existing datasets for sound event recognition (SER) are relatively small and/or domain-
specific, with the exception of AudioSet, based on over 2 M tracks from YouTube videos and …
HTS-AT: A hierarchical token-semantic audio transformer for sound classification and detection
Audio classification is an important task of mapping audio samples into their corresponding
labels. Recently, the transformer model with self-attention mechanisms has been adopted in …
A comprehensive review of polyphonic sound event detection
One of the most amazing functions of the human auditory system is the ability to detect all
kinds of sound events in the environment. With the technologies and hardware advances …
What's all the fuss about free universal sound separation data?
We introduce the Free Universal Sound Separation (FUSS) dataset, a new corpus for
experiments in separating mixtures of an unknown number of sounds from an open domain …
Training sound event detection on a heterogeneous dataset
N Turpault, R Serizel - ar** target sound events and non-target sounds, also referred to as interference or …
Zero-shot audio source separation through query-based learning from weakly-labeled data
Deep learning techniques for separating audio into different sound sources face several
challenges. Standard architectures require training separate models for different types of …
Sound event detection by consistency training and pseudo-labeling with feature-pyramid convolutional recurrent neural networks
Due to the high cost of large-scale strong labeling, sound event detection (SED) using only
weakly-labeled and unlabeled data has drawn increasing attention in recent years. To …
Audio lottery: Speech recognition made ultra-lightweight, noise-robust, and transferable
Lightweight speech recognition models have seen explosive demands owing to a growing
amount of speech-interactive features on mobile devices. Since designing such systems …