EgoTV: Egocentric task verification from natural language task descriptions

R Hazra, B Chen, A Rai, N Kamra… - Proceedings of the …, 2023 - openaccess.thecvf.com
To enable progress towards egocentric agents capable of understanding everyday tasks
specified in natural language, we propose a benchmark and a synthetic dataset called …

Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos

KRY Nagasinghe, H Zhou… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we explore the capability of an agent to construct a logical sequence of action
steps, thereby assembling a strategic procedural plan. This plan is crucial for navigating from …

Pretrained language models as visual planners for human assistance

D Patel, H Eghbalzadeh, N Kamra… - Proceedings of the …, 2023 - openaccess.thecvf.com
In our pursuit of advancing multi-modal AI assistants capable of guiding users to achieve
complex multi-step goals, we propose the task of 'Visual Planning for Assistance (VPA)' …

Box2Flow: Instance-Based Action Flow Graphs from Videos

J Li, K Basioti, V Pavlovic - International Conference on Pattern …, 2024 - Springer
A large number of procedural videos on the web show how to complete various tasks. These
tasks can often be accomplished in different ways and step orderings, with some steps able …

Learning human actions on-demand based on graph theory

J Zhang - 2024 - odr.chalmers.se
Collaborative robots (Cobots) are designed to work side-by-side with humans, sharing
space and skills to achieve common goals. However, as human tasks become increasingly …