STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events
This report presents the Sony-TAu Realistic Spatial Soundscapes 2022 (STARSS22) dataset
for sound event localization and detection, comprised of spatial recordings of real scenes …
STARSS23: An audio-visual dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events
While direction of arrival (DOA) of sound events is generally estimated from multichannel
audio data recorded in a microphone array, sound events usually derive from visually …
SELD U-Net: Joint Optimization of Sound Event Localization and Detection with Noise Reduction
Y Shin, YG Kim, CH Choi, DJ Kim, C Chun - IEEE Access, 2023 - ieeexplore.ieee.org
Sound event localization and detection (SELD) is a combined task that classifies acoustic
events from audio signals, estimates temporal boundaries, and identifies event locations …
Tf-mamba: A time-frequency network for sound source localization
Sound source localization (SSL) determines the position of sound sources using multi-
channel audio data. It is commonly used to improve speech enhancement and separation …
FN-SSL: Full-band and narrow-band fusion for sound source localization
Extracting direct-path spatial features is critical for sound source localization in adverse
acoustic environments. This paper proposes a full-band and narrow-band fusion network for …
Audio inputs for active speaker detection and localization via microphone array
This study considers the problem of detecting and locating an active talker's horizontal
position from multichannel audio captured by a microphone array. We refer to this as active …
Leveraging Visual Supervision for Array-Based Active Speaker Detection and Localization
Conventional audio-visual approaches for active speaker detection (ASD) typically rely on
visually pre-extracted face tracks and the corresponding single-channel audio to find the …
Sound event localization and detection with pre-trained audio spectrogram transformer and multichannel separation network
We propose a sound event localization and detection system based on a CNN-Conformer
base network. Our main contribution is to evaluate the use of pre-trained elements in this …
Text-Queried Target Sound Event Localization
Sound event localization and detection (SELD) aims to determine the appearance of sound
classes, together with their Direction of Arrival (DOA). However, current SELD systems can …
Learning multi-target TDOA features for sound event localization and detection
Sound event localization and detection (SELD) systems using audio recordings from a
microphone array rely on spatial cues for determining the location of sound events. As a …