Oops! predicting unintentional action in video

D Epstein, B Chen, C Vondrick - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
From just a short glance at a video, we can often tell whether a person's action is intentional
or not. Can we train a model to recognize this? We introduce a dataset of in-the-wild videos …

Deep bingham networks: Dealing with uncertainty and ambiguity in pose estimation

H Deng, M Bui, N Navab, L Guibas, S Ilic… - International Journal of …, 2022 - Springer
In this work, we introduce Deep Bingham Networks (DBN), a generic framework that can
naturally handle pose-related uncertainties and ambiguities arising in almost all real life …

6d camera relocalization in ambiguous scenes via continuous multimodal inference

M Bui, T Birdal, H Deng, S Albarqouni, L Guibas… - Computer Vision–ECCV …, 2020 - Springer
We present a multimodal camera relocalization framework that captures ambiguities and
uncertainties with continuous mixture models defined on the manifold of camera poses. In …

Transferring knowledge from text to video: Zero-shot anticipation for procedural actions

F Sener, R Saraf, A Yao - IEEE transactions on pattern analysis …, 2022 - ieeexplore.ieee.org
Can we teach a robot to recognize and make predictions for activities that it has never seen
before? We tackle this problem by learning models for video from text. This paper presents a …

What and how? jointly forecasting human action and pose

Y Zhu, D Doermann, Y Zhang, Q Liu… - 2020 25th …, 2021 - ieeexplore.ieee.org
Forecasting human actions and motion trajectories addresses the problem of predicting what a
person is going to do next and how they will perform it. This is crucial in a wide range of …

Multimodal Human Action and Motion Prediction

Y Zhu - 2023 - search.proquest.com
Human action and motion trajectory prediction addresses the problem of determining what
people will do next and how they will perform it. It is critical in various applications such as …

Modelling Complex Activities from Visual and Textual Data

F Sener Merzbach - 2021 - bonndoc.ulb.uni-bonn.de
Complex activity videos are long-range videos composed of multiple sub-activities following
some temporal structuring and connected purpose. Recognizing human activities in such …