RichDreamer: A generalizable normal-depth diffusion model for detail richness in text-to-3D

L Qiu, G Chen, X Gu, Q Zuo, M Xu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Lifting 2D diffusion for 3D generation is a challenging problem due to the lack of geometric
prior and the complex entanglement of materials and lighting in natural images. Existing …

MiDaS v3.1 -- A model zoo for robust monocular relative depth estimation

R Birkl, D Wofk, M Müller - arXiv preprint arXiv:2307.14460, 2023 - arxiv.org
We release MiDaS v3.1 for monocular depth estimation, offering a variety of new models
based on different encoder backbones. This release is motivated by the success of …

ControlRoom3D: Room generation using semantic proxy rooms

J Schult, S Tsai, L Höllein, B Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Manually creating 3D environments for AR/VR applications is a complex process requiring
expert knowledge in 3D modeling software. Pioneering works facilitate this process by …

Towards text-guided 3d scene composition

Q Zhang, C Wang, A Siarohin… - Proceedings of the …, 2024 - openaccess.thecvf.com
We are witnessing significant breakthroughs in the technology for generating 3D objects
from text. Existing approaches either leverage large text-to-image models to optimize a 3D …

G3DR: Generative 3D reconstruction in ImageNet

P Reddy, I Elezi, J Deng - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
We introduce a novel 3D generative method Generative 3D Reconstruction (G3DR) in
ImageNet capable of generating diverse and high-quality 3D objects from single images …

[Book][B] SparseGS: Real-time 360° sparse view synthesis using Gaussian splatting

H Xiong - 2024 - search.proquest.com
The problem of novel view synthesis has grown significantly in popularity recently with the
introduction of Neural Radiance Fields (NeRFs) and other implicit scene representation …

IDOL: Unified dual-modal latent diffusion for human-centric joint video-depth generation

Y Zhai, K Lin, L Li, CC Lin, J Wang, Z Yang… - … on Computer Vision, 2024 - Springer
Significant advances have been made in human-centric video generation, yet the joint video-
depth generation problem remains underexplored. Most existing monocular depth …

Exploiting the signal-leak bias in diffusion models

MN Everaert, A Fitsios, M Bocchio… - Proceedings of the …, 2024 - openaccess.thecvf.com
There is a bias in the inference pipeline of most diffusion models. This bias arises from a
signal leak whose distribution deviates from the noise distribution, creating a discrepancy …

Diffusion priors for dynamic view synthesis from monocular videos

C Wang, P Zhuang, A Siarohin, J Cao, G Qian… - arXiv preprint arXiv …, 2024 - arxiv.org
Dynamic novel view synthesis aims to capture the temporal evolution of visual content within
videos. Existing methods struggle to distinguish between motion and structure …

iNVS: Repurposing diffusion inpainters for novel view synthesis

Y Kant, A Siarohin, M Vasilkovsky, RA Guler… - SIGGRAPH Asia 2023 …, 2023 - dl.acm.org
In this paper, we present a method for generating consistent novel views from a single
source image. Our approach focuses on maximizing the reuse of visible pixels from the …