Multimodal image synthesis and editing: A survey and taxonomy

F Zhan, Y Yu, R Wu, J Zhang, S Lu, L Liu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
As information exists in various modalities in the real world, effective interaction and fusion
among multimodal information plays a key role for the creation and perception of multimodal …

SyncDreamer: Generating multiview-consistent images from a single-view image

Y Liu, C Lin, Z Zeng, X Long, L Liu, T Komura… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we present a novel diffusion model called SyncDreamer that generates multiview-consistent
images from a single-view image. Using pretrained large-scale 2D diffusion models, recent …

GRM: Large Gaussian reconstruction model for efficient 3D reconstruction and generation

Y Xu, Z Shi, W Yifan, H Chen, C Yang, S Peng… - … on Computer Vision, 2024 - Springer
We introduce GRM, a large-scale reconstructor capable of recovering a 3D asset from
sparse-view images in around 0.1 s. GRM is a feed-forward transformer-based model that …

SDFusion: Multimodal 3D shape completion, reconstruction, and generation

YC Cheng, HY Lee, S Tulyakov… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this work, we present a novel framework built to simplify 3D asset generation for amateur
users. To enable interactive generation, our method supports a variety of input modalities …

A survey on deep generative 3D-aware image synthesis

W Xia, JH Xue - ACM Computing Surveys, 2023 - dl.acm.org
Recent years have seen remarkable progress in deep learning powered visual content
creation. This includes deep generative 3D-aware image synthesis, which produces high …

Gaussian shell maps for efficient 3D human generation

R Abdal, W Yifan, Z Shi, Y Xu, R Po… - Proceedings of the …, 2024 - openaccess.thecvf.com
Efficient generation of 3D digital humans is important in several industries, including virtual
reality, social media, and cinematic production. 3D generative adversarial networks (GANs) …

Autodecoding latent 3D diffusion models

E Ntavelis, A Siarohin, K Olszewski… - Advances in …, 2023 - proceedings.neurips.cc
Diffusion-based methods have shown impressive visual results in the text-to-image domain.
They first learn a latent space using an autoencoder, then run a denoising process on the …

3D-aware image generation using 2D diffusion models

J Xiang, J Yang, B Huang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In this paper, we introduce a novel 3D-aware image generation method that leverages 2D
diffusion models. We formulate the 3D-aware image generation task as multiview 2D image …

DMV3D: Denoising multi-view diffusion using 3D large reconstruction model

Y Xu, H Tan, F Luan, S Bi, P Wang, J Li, Z Shi… - arXiv preprint arXiv …, 2023 - arxiv.org
We propose DMV3D, a novel 3D generation approach that uses a transformer-based
3D large reconstruction model to denoise multi-view diffusion. Our reconstruction model …

Text2Tex: Text-driven texture synthesis via diffusion models

DZ Chen, Y Siddiqui, HY Lee… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present Text2Tex, a novel method for generating high-quality textures for 3D
meshes from the given text prompts. Our method incorporates inpainting into a pre-trained …