Lumiere: A space-time diffusion model for video generation
We introduce Lumiere–a text-to-video diffusion model designed for synthesizing videos that
portray realistic, diverse and coherent motion–a pivotal challenge in video synthesis. To this …
portray realistic, diverse and coherent motion–a pivotal challenge in video synthesis. To this …
Grm: Large gaussian reconstruction model for efficient 3d reconstruction and generation
We introduce GRM, a large-scale reconstructor capable of recovering a 3D asset from
sparse-view images in around 0.1 s. GRM is a feed-forward transformer-based model that …
sparse-view images in around 0.1 s. GRM is a feed-forward transformer-based model that …
Opportunities and challenges of diffusion models for generative AI
Diffusion models, a powerful and universal generative artificial intelligence technology, have
achieved tremendous success and opened up new possibilities in diverse applications. In …
achieved tremendous success and opened up new possibilities in diverse applications. In …
Gpt-4v (ision) is a human-aligned evaluator for text-to-3d generation
Despite recent advances in text-to-3D generative methods there is a notable absence of
reliable evaluation metrics. Existing metrics usually focus on a single criterion each such as …
reliable evaluation metrics. Existing metrics usually focus on a single criterion each such as …
CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets
In the realm of digital creativity, our potential to craft intricate 3D worlds from imagination is
often hampered by the limitations of existing digital tools, which demand extensive expertise …
often hampered by the limitations of existing digital tools, which demand extensive expertise …
4d-fy: Text-to-4d generation using hybrid score distillation sampling
Recent breakthroughs in text-to-4D generation rely on pre-trained text-to-image and text-to-
video models to generate dynamic 3D scenes. However current text-to-4D methods face a …
video models to generate dynamic 3D scenes. However current text-to-4D methods face a …
Reconfusion: 3d reconstruction with diffusion priors
Abstract 3D reconstruction methods such as Neural Radiance Fields (NeRFs) excel at
rendering photorealistic novel views of complex scenes. However recovering a high-quality …
rendering photorealistic novel views of complex scenes. However recovering a high-quality …
Dmv3d: Denoising multi-view diffusion using 3d large reconstruction model
We propose\textbf {DMV3D}, a novel 3D generation approach that uses a transformer-based
3D large reconstruction model to denoise multi-view diffusion. Our reconstruction model …
3D large reconstruction model to denoise multi-view diffusion. Our reconstruction model …
Emdm: Efficient motion diffusion model for fast and high-quality motion generation
Abstract We introduce Efficient Motion Diffusion Model (EMDM) for fast and high-quality
human motion generation. Current state-of-the-art generative diffusion models have …
human motion generation. Current state-of-the-art generative diffusion models have …
Diffusion model-based video editing: A survey
The rapid development of diffusion models (DMs) has significantly advanced image and
video applications, making" what you want is what you see" a reality. Among these, video …
video applications, making" what you want is what you see" a reality. Among these, video …