ActionVOS: Actions as prompts for video object segmentation

L Ouyang, R Liu, Y Huang, R Furuta, Y Sato - European Conference on …, 2024 - Springer
Delving into the realm of egocentric vision, the advancement of referring video object
segmentation (RVOS) stands as pivotal in understanding human activities. However …

MADiff: Motion-aware mamba diffusion models for hand trajectory prediction on egocentric videos

J Ma, X Chen, W Bao, J Xu, H Wang - arXiv preprint arXiv:2409.02638, 2024 - arxiv.org
Understanding human intentions and actions through egocentric videos is important on the
path to embodied artificial intelligence. As a branch of egocentric vision techniques, hand …

Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos

J Ma, J Xu, X Chen, H Wang - arXiv preprint arXiv:2405.04370, 2024 - arxiv.org
Understanding how humans would behave during hand-object interaction is vital for
applications in service robot manipulation and extended reality. To achieve this, some …

PooDLe: Pooled and dense self-supervised learning from naturalistic videos

AN Wang, C Hoang, Y Xiong, Y LeCun… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-supervised learning has driven significant progress in learning from single-subject,
iconic images. However, there are still unanswered questions about the use of minimally …