One-2-3-45: Any single image to 3D mesh in 45 seconds without per-shape optimization
Single image 3D reconstruction is an important but challenging task that requires extensive
knowledge of our natural world. Many existing methods solve this problem by optimizing a …
MotionDirector: Motion customization of text-to-video diffusion models
Large-scale pre-trained diffusion models have exhibited remarkable capabilities in diverse
video generations. Given a set of video clips of the same motion concept, the task of Motion …
CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets
In the realm of digital creativity, our potential to craft intricate 3D worlds from imagination is
often hampered by the limitations of existing digital tools, which demand extensive expertise …
Object 3DIT: Language-guided 3D-aware image editing
Existing image editing tools, while powerful, typically disregard the underlying 3D geometry
from which the image is projected. As a result, edits made using these tools may become …
Michelangelo: Conditional 3D shape generation based on shape-image-text aligned latent representation
We present a novel alignment-before-generation approach to tackle the challenging task of
generating general 3D shapes based on 2D images or texts. Directly learning a conditional …
Diversify your vision datasets with automatic diffusion-based augmentation
Many fine-grained classification tasks, like rare animal identification, have limited training
data and consequently classifiers trained on these datasets often fail to generalize to …
Prompting AI art: An investigation into the creative skill of prompt engineering
We are witnessing a novel era of creativity where anyone can create digital content via
prompt-based learning (known as prompt engineering). This article investigates prompt …
Unsupervised semantic correspondence using stable diffusion
Text-to-image diffusion models are now capable of generating images that are often
indistinguishable from real images. To generate such images, these models must …
Linguistic binding in diffusion models: Enhancing attribute correspondence through attention map alignment
Text-conditioned image generation models often generate incorrect associations between
entities and their visual attributes. This reflects an impaired mapping between linguistic …
Matte anything: Interactive natural image matting with segment anything model
Natural image matting algorithms aim to predict the transparency map (alpha-matte) with the
trimap guidance. However, the production of trimap often requires significant labor, which …