Instructpix2pix: Learning to follow image editing instructions

T Brooks, A Holynski, AA Efros - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We propose a method for editing images from human instructions: given an input image and
a written instruction that tells the model what to do, our model follows these instructions to …

Improving factuality and reasoning in language models through multiagent debate

Y Du, S Li, A Torralba, JB Tenenbaum… - arxiv preprint arxiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capabilities in language
generation, understanding, and few-shot learning in recent years. An extensive body of work …

Erasing concepts from diffusion models

R Gandikota, J Materzynska… - Proceedings of the …, 2023 - openaccess.thecvf.com
Motivated by concerns that large-scale diffusion models can produce undesirable output
such as sexually explicit content or copyrighted artistic styles, we study erasure of specific …

Compositional visual generation with composable diffusion models

N Liu, S Li, Y Du, A Torralba, JB Tenenbaum - European Conference on …, 2022 - Springer
Large text-guided diffusion models, such as DALLE-2, are able to generate stunning
photorealistic images given natural language descriptions. While such models are highly …

Planning with diffusion for flexible behavior synthesis

M Janner, Y Du, JB Tenenbaum, S Levine - arxiv preprint arxiv …, 2022 - arxiv.org
Model-based reinforcement learning methods often use learning only for the purpose of
estimating an approximate dynamics model, offloading the rest of the decision-making work …

Conceptgraphs: Open-vocabulary 3d scene graphs for perception and planning

Q Gu, A Kuwajerwala, S Morin… - … on Robotics and …, 2024 - ieeexplore.ieee.org
For robots to perform a wide variety of tasks, they require a 3D representation of the world
that is semantically rich, yet compact and efficient for task-driven perception and planning …

Diffusion models as plug-and-play priors

A Graikos, N Malkin, N Jojic… - Advances in Neural …, 2022 - proceedings.neurips.cc
We consider the problem of inferring high-dimensional data $ x $ in a model that consists of
a prior $ p (x) $ and an auxiliary differentiable constraint $ c (x, y) $ on $ x $ given some …

Foundation models for decision making: Problems, methods, and opportunities

S Yang, O Nachum, Y Du, J Wei, P Abbeel… - arxiv preprint arxiv …, 2023 - arxiv.org
Foundation models pretrained on diverse data at scale have demonstrated extraordinary
capabilities in a wide range of vision and language tasks. When such models are deployed …

Reduce, reuse, recycle: Compositional generation with energy-based diffusion models and mcmc

Y Du, C Durkan, R Strudel… - International …, 2023 - proceedings.mlr.press
Since their introduction, diffusion models have quickly become the prevailing approach to
generative modeling in many domains. They can be interpreted as learning the gradients of …

Teaching clip to count to ten

R Paiss, A Ephrat, O Tov, S Zada… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large vision-language models, such as CLIP, learn robust representations of text and
images, facilitating advances in many downstream tasks, including zero-shot classification …