EPIC-SOUNDS: A large-scale dataset of actions that sound

J Huh, J Chalk, E Kazakos, D Damen… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
We introduce EPIC-SOUNDS, a large-scale dataset of audio annotations capturing temporal
extents and class labels within the audio stream of the egocentric videos from EPIC …

EgoChoir: Capturing 3D human-object interaction regions from egocentric views

Y Yang, W Zhai, C Wang, C Yu… - Advances in Neural …, 2025 - proceedings.neurips.cc
Understanding egocentric human-object interaction (HOI) is a fundamental aspect of
human-centric perception, facilitating applications like AR/VR and embodied AI. For the egocentric …

Action2Sound: Ambient-aware generation of action sounds from egocentric videos

C Chen, P Peng, A Baid, Z Xue, WN Hsu… - … on Computer Vision, 2024 - Springer
Generating realistic audio for human actions is important for many applications, such as
creating sound effects for films or virtual reality games. Existing approaches implicitly …

ANAVI: Audio Noise Awareness using Visuals of Indoor environments for NAVIgation

V Jain, R Veerapaneni, Y Bisk - arXiv preprint arXiv:2410.18932, 2024 - arxiv.org
We propose Audio Noise Awareness using Visuals of Indoors for NAVIgation for quieter
robot path planning. While humans are naturally aware of the noise they make and its …

About Time: Advances, Challenges, and Outlooks of Action Understanding

A Stergiou, R Poppe - arXiv preprint arXiv:2411.15106, 2024 - arxiv.org
We have witnessed impressive advances in video action understanding. Increased dataset
sizes, variability, and computation availability have enabled leaps in performance and task …