PDPP: Projected diffusion for procedure planning in instructional videos

H Wang, Y Wu, S Guo, L Wang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In this paper, we study the problem of procedure planning in instructional videos, which aims
to make goal-directed plans given the current visual observations in unstructured real-life …

Uncertainty-aware action decoupling transformer for action anticipation

H Guo, N Agarwal, SY Lo, K Lee… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Human action anticipation aims at predicting what people will do in the future based on past
observations. In this paper we introduce Uncertainty-aware Action Decoupling Transformer …

Intention-conditioned long-term human egocentric action anticipation

EV Mascaró, H Ahn, D Lee - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
To anticipate how a person would act in the future, it is essential to understand the human
intention since it guides the subject towards a certain action. In this paper, we propose a …

Interaction region visual transformer for egocentric action anticipation

D Roy, R Rajendiran… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Human-object interaction (HOI) and temporal dynamics along the motion paths are the most
important visual cues for egocentric action anticipation. Especially, interaction regions …

A stroke of genius: Predicting the next move in badminton

M Ibh, S Graßhof, DW Hansen - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
This paper presents a transformer encoder-decoder model for predicting future badminton
strokes based on previous rally actions. The model uses court position skeleton poses and …

Predicting the next action by modeling the abstract goal

D Roy, B Fernando - International Conference on Pattern Recognition, 2025 - Springer
The problem of predicting human actions from observed videos is an inherently uncertain
one. We present an action anticipation model that leverages latent goal information to …

Pretrained language models as visual planners for human assistance

D Patel, H Eghbalzadeh, N Kamra… - Proceedings of the …, 2023 - openaccess.thecvf.com
In our pursuit of advancing multi-modal AI assistants capable of guiding users to achieve
complex multi-step goals, we propose the task of 'Visual Planning for Assistance (VPA)' …

PEAR: Phrase-based hand-object interaction anticipation

Z Zhang, H Luo, W Zhai, Y Cao, Y Kang - ar…

… action forecasting

I González-Díaz, M Molina-Moreno… - IEEE Journal of …, 2024 - hal.science
This work tackles the problem of automatically predicting the grasping intention of humans
observing their environment, with eye-tracker glasses and video cameras recording the …