Audio-visual speaker tracking: Progress, challenges, and future directions

J Zhao, Y Xu, X Qian, D Berghi, P Wu, M Cui… - arXiv preprint arXiv …, 2023 - arxiv.org
Audio-visual speaker tracking has drawn increasing attention over the past few years due to
its academic value and wide applications. Audio and visual modalities can provide …

AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness

Y Yang, S Yuan, M Cao, J Yang… - 2023 IEEE/RSJ …, 2023 - ieeexplore.ieee.org
In this study, we introduce AV-PedAware, a self-supervised audio-visual fusion system
designed to improve dynamic pedestrian awareness for robotics applications. Pedestrian …

Audio-Visual Talker Localization in Video for Spatial Sound Reproduction

D Berghi, PJB Jackson - arXiv preprint arXiv:2406.00495, 2024 - arxiv.org
Object-based audio production requires the positional metadata to be defined for each
point-source object, including the key elements in the foreground of the sound scene. In many …