Learning object state changes in videos: An open-world perspective

Z Xue, K Ashutosh, K Grauman - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Object State Changes (OSCs) are pivotal for video understanding. While humans
can effortlessly generalize OSC understanding from familiar to unknown objects, current …

GenHowTo: Learning to generate actions and state transformations from instructional videos

T Souček, D Damen, M Wray… - Proceedings of the …, 2024 - openaccess.thecvf.com
We address the task of generating temporally consistent and physically plausible images of
actions and object state transformations. Given an input image and a text prompt describing …

GenHowTo: Learning to generate actions and state transformations from instructional videos

T Souček, D Damen, M Wray, I Laptev… - 2024 IEEE/CVF …, 2024 - ieeexplore.ieee.org
We address the task of generating temporally consistent and physically plausible images of
actions and object state transformations. Given an input image and a text prompt describing …

PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns

S Ning, D Wang, Y Qin, Z Jin… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper we propose a novel virtual try-on from unconstrained designs (ucVTON) task to
enable photorealistic synthesis of personalized composite clothing on input human images …

Active Object Detection with Knowledge Aggregation and Distillation from Large Models

D Yang, Y Liu - Proceedings of the IEEE/CVF Conference …, 2024 - openaccess.thecvf.com
Accurately detecting active objects undergoing state changes is essential for
comprehending human interactions and facilitating decision-making. The existing methods …

SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos

Y Niu, W Guo, L Chen, X Lin, SF Chang - arXiv preprint arXiv:2403.01599, 2024 - arxiv.org
We study the problem of procedure planning in instructional videos, which aims to make a
goal-oriented sequence of action steps given partial visual state observations. The …

Exploring the impact of knowledge graphs on zero-shot visual object state classification

F Gouidis, K Papoutsakis, T Patkos, A Argyros… - Proceedings …, 2024 - scitepress.org
In this work, we explore the potential of Knowledge Graphs (KGs) towards an effective Zero-
Shot Learning (ZSL) approach for Object State Classification (OSC) in images. For this …

OSCaR: Object State Captioning and State Change Representation

N Nguyen, J Bi, A Vosoughi, Y Tian, P Fazli… - arXiv preprint arXiv …, 2024 - arxiv.org
The capability of intelligent models to extrapolate and comprehend changes in object states
is a crucial yet demanding aspect of AI research, particularly through the lens of human …

Learning Object States from Actions via Large Language Models

M Tateno, T Yagi, R Furuta, Y Sato - arXiv preprint arXiv:2405.01090, 2024 - arxiv.org
Temporally localizing the presence of object states in videos is crucial in understanding
human activities beyond actions and objects. This task has suffered from a lack of training …

Real-world cooking robot system from recipes based on food state recognition using foundation models and PDDL

N Kanazawa, K Kawaharazuka, Y Obinata… - Advanced …, 2024 - Taylor & Francis
Although there is a growing demand for cooking behaviors as one of the expected tasks for
robots, a series of cooking behaviors based on new recipe descriptions by robots in the real …