One-2-3-45: Any single image to 3D mesh in 45 seconds without per-shape optimization

M Liu, C Xu, H Jin, L Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Single image 3D reconstruction is an important but challenging task that requires extensive
knowledge of our natural world. Many existing methods solve this problem by optimizing a …

MotionDirector: Motion customization of text-to-video diffusion models

R Zhao, Y Gu, JZ Wu, DJ Zhang, JW Liu, W Wu… - … on Computer Vision, 2024 - Springer
Large-scale pre-trained diffusion models have exhibited remarkable capabilities in diverse
video generations. Given a set of video clips of the same motion concept, the task of Motion …

CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets

L Zhang, Z Wang, Q Zhang, Q Qiu, A Pang… - ACM Transactions on …, 2024 - dl.acm.org
In the realm of digital creativity, our potential to craft intricate 3D worlds from imagination is
often hampered by the limitations of existing digital tools, which demand extensive expertise …

OBJect 3DIT: Language-guided 3D-aware image editing

O Michel, A Bhattad, E VanderBilt… - Advances in …, 2024 - proceedings.neurips.cc
Existing image editing tools, while powerful, typically disregard the underlying 3D geometry
from which the image is projected. As a result, edits made using these tools may become …

Michelangelo: Conditional 3D shape generation based on shape-image-text aligned latent representation

Z Zhao, W Liu, X Chen, X Zeng… - Advances in …, 2024 - proceedings.neurips.cc
We present a novel alignment-before-generation approach to tackle the challenging task of
generating general 3D shapes based on 2D images or texts. Directly learning a conditional …

Diversify your vision datasets with automatic diffusion-based augmentation

L Dunlap, A Umino, H Zhang, J Yang… - Advances in neural …, 2023 - proceedings.neurips.cc
Many fine-grained classification tasks, like rare animal identification, have limited training
data and consequently classifiers trained on these datasets often fail to generalize to …

Prompting AI art: An investigation into the creative skill of prompt engineering

J Oppenlaender, R Linder… - International Journal of …, 2024 - Taylor & Francis
We are witnessing a novel era of creativity where anyone can create digital content via
prompt-based learning (known as prompt engineering). This article investigates prompt …

Unsupervised semantic correspondence using stable diffusion

E Hedlin, G Sharma, S Mahajan… - Advances in …, 2024 - proceedings.neurips.cc
Text-to-image diffusion models are now capable of generating images that are often
indistinguishable from real images. To generate such images, these models must …

Linguistic binding in diffusion models: Enhancing attribute correspondence through attention map alignment

R Rassin, E Hirsch, D Glickman… - Advances in …, 2024 - proceedings.neurips.cc
Text-conditioned image generation models often generate incorrect associations between
entities and their visual attributes. This reflects an impaired mapping between linguistic …

Matte anything: Interactive natural image matting with segment anything model

J Yao, X Wang, L Ye, W Liu - Image and Vision Computing, 2024 - Elsevier
Natural image matting algorithms aim to predict the transparency map (alpha-matte) with the
trimap guidance. However, the production of trimap often requires significant labor, which …