Localization of sound sources in robotics: A review

C Rascon, I Meza - Robotics and Autonomous Systems, 2017 - Elsevier
Sound source localization (SSL) in a robotic platform has been essential in the overall
scheme of robot audition. It allows a robot to locate a sound source by sound alone. It has an …

A survey of sound source localization with deep learning methods

PA Grumiaux, S Kitić, L Girin, A Guérin - The Journal of the Acoustical …, 2022 - pubs.aip.org
This article is a survey of deep learning methods for single and multiple sound source
localization, with a focus on sound source localization in indoor environments, where …

A consolidated perspective on multimicrophone speech enhancement and source separation

S Gannot, E Vincent… - … /ACM Transactions on …, 2017 - ieeexplore.ieee.org
Speech enhancement and separation are core problems in audio signal processing, with
commercial applications in devices as diverse as mobile phones, conference call systems …

EM algorithms for weighted-data clustering with application to audio-visual scene analysis

ID Gebru, X Alameda-Pineda, F Forbes… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
Data clustering has received a lot of attention and numerous methods, algorithms and
software packages are available. Among these techniques, parametric finite-mixture models …

Audio-visual speaker diarization based on spatiotemporal Bayesian fusion

ID Gebru, S Ba, X Li, R Horaud - IEEE Transactions on Pattern …, 2017 - ieeexplore.ieee.org
Speaker diarization consists of assigning speech signals to people engaged in a dialogue.
An audio-visual spatiotemporal diarization model is proposed. The model is well suited for …

Neural network adaptation and data augmentation for multi-speaker direction-of-arrival estimation

W He, P Motlicek, JM Odobez - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org
Deep neural networks have been successfully applied to sound direction-of-arrival
estimation under challenging conditions. However, such a learning-based approach …

Multi-target DoA estimation with an audio-visual fusion mechanism

X Qian, M Madhavi, Z Pan, J Wang… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Most of the prior studies in the spatial Direction of Arrival (DoA) domain focus on a single
modality. However, humans use auditory and visual senses to detect the presence of sound …

Audio–visual particle flow SMC-PHD filtering for multi-speaker tracking

Y Liu, V Kılıç, J Guan, W Wang - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Sequential Monte Carlo probability hypothesis density (SMC-PHD) filtering is a popular
method used recently for audio-visual (AV) multi-speaker tracking. However, due to the …

Localize to binauralize: Audio spatialization from visual sound source localization

KK Rachavarapu, V Sundaresha… - Proceedings of the …, 2021 - openaccess.thecvf.com
Videos with binaural audios provide an immersive viewing experience by enabling 3D
sound sensation. Recent works attempt to generate binaural audio in a multimodal learning …

Multi-speaker tracking from an audio–visual sensing device

X Qian, A Brutti, O Lanz, M Omologo… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Compact multi-sensor platforms are portable and thus desirable for robotics and personal-
assistance tasks. However, compared to physically distributed sensors, the size of these …