Chop & learn: Recognizing and generating object-state compositions

N Saini, H Wang, A Swaminathan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recognizing and generating object-state compositions has been a challenging task,
especially when generalizing to unseen compositions. In this paper, we study the task of …

Learning object state changes in videos: An open-world perspective

Z Xue, K Ashutosh, K Grauman - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Object State Changes (OSCs) are pivotal for video understanding. While humans
can effortlessly generalize OSC understanding from familiar to unknown objects, current …

Towards scalable neural representation for diverse videos

B He, X Yang, H Wang, Z Wu, H Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Implicit neural representations (INR) have gained increasing attention in representing 3D
scenes and images, and have been recently applied to encode videos (e.g., NeRV, E-NeRV) …

Multi-task learning of object states and state-modifying actions from web videos

T Souček, JB Alayrac, A Miech, I Laptev… - IEEE Transactions on …, 2024 - computer.org
We aim to learn to temporally localize object state changes and the corresponding state-
modifying actions by observing people interacting with objects in long uncurated web …

Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes

G Shrivastava, A Shrivastava - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Diffusion models have made significant strides in image generation, mastering tasks such as
unconditional image synthesis, text-image translation, and image-to-image conversions …

Simpson: Simplifying photo cleanup with single-click distracting object segmentation network

C Huynh, Y Zhou, Z Lin, C Barnes… - Proceedings of the …, 2023 - openaccess.thecvf.com
In photo editing, it is common practice to remove visual distractions to improve the overall
image quality and highlight the primary subject. However, manually selecting and removing …

Multi-task learning of object state changes from uncurated videos

T Souček, JB Alayrac, A Miech, I Laptev… - arXiv preprint arXiv …, 2022 - arxiv.org
We aim to learn to temporally localize object state changes and the corresponding state-
modifying actions by observing people interacting with objects in long uncurated web …

Beyond Seen Primitive Concepts and Attribute-Object Compositional Learning

N Saini, K Pham, A Shrivastava - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Learning from seen attribute-object pairs to generalize to unseen compositions has been
studied extensively in Compositional Zero-Shot Learning (CZSL). However, CZSL setup is …

Video decomposition prior: Editing videos layer by layer

G Shrivastava, SN Lim, A Shrivastava - The Twelfth International …, 2024 - openreview.net
In the evolving landscape of video editing methodologies, a majority of deep learning
techniques are often reliant on extensive datasets of observed input and ground truth …