STARSS23: An audio-visual dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events

K Shimada, A Politis, P Sudarsanam… - Advances in neural …, 2023 - proceedings.neurips.cc
While direction of arrival (DOA) of sound events is generally estimated from multichannel
audio data recorded in a microphone array, sound events usually derive from visually …

A four-stage data augmentation approach to ResNet-Conformer based acoustic modeling for sound event localization and detection

Q Wang, J Du, HX Wu, J Pan, F Ma… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
In this paper, we propose a novel four-stage data augmentation approach to ResNet-
Conformer based acoustic modeling for sound event localization and detection (SELD) …

An experimental study on sound event localization and detection under realistic testing conditions

S Niu, J Du, Q Wang, L Chai, H Wu… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
We study four data augmentation (DA) techniques and two model architectures on realistic
data for sound event localization and detection (SELD). First, based on ResNet-Conformer …

BAT: Learning to reason about spatial sounds with large language models

Z Zheng, P Peng, Z Ma, X Chen, E Choi… - ar…

… for Multisource Sound Localization

C He, S Cheng, R Zheng, J Liu - IEEE Internet of Things …, 2024 - ieeexplore.ieee.org
Multisource sound localization can find applications in many domains, including auditory
scene analysis, fault detection, and diagnosis in manufacturing, augmented reality, etc. In far …

6DoF SELD: Sound event localization and detection using microphones and motion tracking sensors on self-motioning human

M Yasuda, S Saito, A Nakayama… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
We aim to perform sound event localization and detection (SELD) using wearable
equipment for a moving human, such as a pedestrian. Conventional SELD tasks have dealt …

Fusion of audio and visual embeddings for sound event localization and detection

D Berghi, P Wu, J Zhao, W Wang… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Sound event localization and detection (SELD) combines two subtasks: sound event
detection (SED) and direction of arrival (DOA) estimation. SELD is usually tackled as an …

Exploring audio-visual information fusion for sound event localization and detection in low-resource realistic scenarios

Y Jiang, Q Wang, J Du, M Hu, P Hu… - … on Multimedia and …, 2024 - ieeexplore.ieee.org
This study presents an audio-visual information fusion approach to sound event localization
and detection (SELD) in low-resource scenarios. We aim at utilizing audio and video …

Transformers and Audio Detection Tasks: An Overview

K Zaman, K Li, M Sah, C Direkoglu, S Okada… - Digital Signal …, 2024 - Elsevier
Audio detection refers to the process of identifying and analyzing audio signals to extract
useful information or detect specific events or patterns within the audio data while utilizing …