State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

Wonder3D: Single image to 3D using cross-domain diffusion

X Long, YC Guo, C Lin, Y Liu, Z Dou… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this work, we introduce Wonder3D, a novel method for generating high-fidelity textured
meshes from single-view images with remarkable efficiency. Recent methods based on the …

Make-It-3D: High-fidelity 3D creation from a single image with diffusion prior

J Tang, T Wang, B Zhang, T Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this work, we investigate the problem of creating high-fidelity 3D content from only a single
image. This is inherently challenging: it essentially involves estimating the underlying 3D …

Stable video diffusion: Scaling latent video diffusion models to large datasets

A Blattmann, T Dockhorn, S Kulal… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Stable Video Diffusion, a latent video diffusion model for high-resolution,
state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained …

SyncDreamer: Generating multiview-consistent images from a single-view image

Y Liu, C Lin, Z Zeng, X Long, L Liu, T Komura… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we present a novel diffusion model called SyncDreamer that generates multiview-consistent
images from a single-view image. Using pretrained large-scale 2D diffusion models, recent …

GRM: Large Gaussian reconstruction model for efficient 3D reconstruction and generation

Y Xu, Z Shi, W Yifan, H Chen, C Yang, S Peng… - … on Computer Vision, 2024 - Springer
We introduce GRM, a large-scale reconstructor capable of recovering a 3D asset from
sparse-view images in around 0.1 s. GRM is a feed-forward transformer-based model that …

Text2Room: Extracting textured 3D meshes from 2D text-to-image models

L Höllein, A Cao, A Owens… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present Text2Room, a method for generating room-scale textured 3D meshes
from a given text prompt as input. To this end, we leverage pre-trained 2D text-to-image …

SparseFusion: Distilling view-conditioned diffusion for 3D reconstruction

Z Zhou, S Tulsiani - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
We propose SparseFusion, a sparse view 3D reconstruction approach that unifies recent
advances in neural rendering and probabilistic image generation. Existing approaches …

DiffRF: Rendering-guided 3D radiance field diffusion

N Müller, Y Siddiqui, L Porzi, SR Bulo… - Proceedings of the …, 2023 - openaccess.thecvf.com
We introduce DiffRF, a novel approach for 3D radiance field synthesis based on denoising
diffusion probabilistic models. While existing diffusion-based methods operate on images …

Single-stage diffusion NeRF: A unified approach to 3D generation and reconstruction

H Chen, J Gu, A Chen, W Tian, Z Tu… - Proceedings of the …, 2023 - openaccess.thecvf.com
3D-aware image synthesis encompasses a variety of tasks, such as scene
generation and novel view synthesis from images. Despite numerous task-specific methods …