Localization of sound sources in robotics: A review

C Rascon, I Meza - Robotics and Autonomous Systems, 2017 - Elsevier
Sound source localization (SSL) in a robotic platform has been essential in the overall
scheme of robot audition. It allows a robot to locate a sound source by sound alone. It has an …

A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering

F Forbes, D Wraith - Statistics and computing, 2014 - Springer
We propose a family of multivariate heavy-tailed distributions that allow variable marginal
amounts of tailweight. The originality comes from introducing multidimensional instead of …

Multi-speaker tracking from an audio–visual sensing device

X Qian, A Brutti, O Lanz, M Omologo… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Compact multi-sensor platforms are portable and thus desirable for robotics and personal-
assistance tasks. However, compared to physically distributed sensors, the size of these …

The Vernissage corpus: A conversational human-robot-interaction dataset

DB Jayagopi, S Sheiki, D Klotz… - 2013 8th ACM/IEEE …, 2013 - ieeexplore.ieee.org
We introduce a new conversational Human-Robot-Interaction (HRI) dataset with a real-
behaving robot inducing interactive behavior with and between humans. Our scenario …

RAVEL: An annotated corpus for training robots with audiovisual abilities

X Alameda-Pineda, J Sanchez-Riera, J Wienke… - Journal on Multimodal …, 2013 - Springer
We introduce Ravel (Robots with Audiovisual Abilities), a publicly available data set
which covers examples of Human Robot Interaction (HRI) scenarios. These scenarios are …

Audio-visual speaker tracking: Progress, challenges, and future directions

J Zhao, Y Xu, X Qian, D Berghi, P Wu, M Cui… - arXiv preprint arXiv …, 2023 - arxiv.org
Audio-visual speaker tracking has drawn increasing attention over the past few years due to
its academic values and wide application. Audio and visual modalities can provide …

Vision-guided robot hearing

X Alameda-Pineda, R Horaud - The International Journal of …, 2015 - journals.sagepub.com
Natural human–robot interaction (HRI) in complex and unpredictable environments is
important with many potential applications. While vision-based HRI has been thoroughly …

Tragic Talkers: A Shakespearean sound- and light-field dataset for audio-visual machine learning research

D Berghi, M Volino, PJB Jackson - Proceedings of the 19th ACM …, 2022 - dl.acm.org
3D audio-visual production aims to deliver immersive and interactive experiences to the
consumer. Yet, faithfully reproducing real-world 3D scenes remains a challenging task. This …

Conjugate mixture models for clustering multimodal data

V Khalidov, F Forbes, R Horaud - Neural Computation, 2011 - ieeexplore.ieee.org
The problem of multimodal clustering arises whenever the data are gathered with several
physically different sensors. Observations from different modalities are not necessarily …

The Sheffield wargames corpus

CW Fox, Y Liu, E Zwyssig… - … of Interspeech 2013, 2013 - eprints.whiterose.ac.uk
Recognition of speech in natural environments is a challenging task, even more so if this
involves conversations between several speakers. Work on meeting recognition has …