Crm: Single image to 3d textured mesh with convolutional reconstruction model

Z Wang, Y Wang, Y Chen, C Xiang, S Chen… - … on Computer Vision, 2024 - Springer
Feed-forward 3D generative models like the Large Reconstruction Model (LRM) have
demonstrated exceptional generation speed. However, the transformer-based methods do …

Tc4d: Trajectory-conditioned text-to-4d generation

S Bahmani, X Liu, W Yifan, I Skorokhodov… - … on Computer Vision, 2024 - Springer
Recent techniques for text-to-4D generation synthesize dynamic 3D scenes using
supervision from pre-trained text-to-video models. However, existing representations, such …

Unidream: Unifying diffusion priors for relightable text-to-3d generation

Z Liu, Y Li, Y Lin, X Yu, S Peng, YP Cao, X Qi… - … on Computer Vision, 2024 - Springer
Recent advancements in text-to-3D generation technology have significantly advanced the
conversion of textual descriptions into imaginative well-geometrical and finely textured 3D …

Vd3d: Taming large video diffusion transformers for 3d camera control

S Bahmani, I Skorokhodov, A Siarohin… - arxiv preprint arxiv …, 2024 - arxiv.org
Modern text-to-video synthesis models demonstrate coherent, photorealistic generation of
complex videos from a text description. However, most existing models lack fine-grained …

Instantmesh: Efficient 3d mesh generation from a single image with sparse-view large reconstruction models

J Xu, W Cheng, Y Gao, X Wang, S Gao… - arxiv preprint arxiv …, 2024 - arxiv.org
We present InstantMesh, a feed-forward framework for instant 3D mesh generation from a
single image, featuring state-of-the-art generation quality and significant training scalability …

Sv4d: Dynamic 3d content generation with multi-frame and multi-view consistency

Y Xie, CH Yao, V Voleti, H Jiang, V Jampani - arxiv preprint arxiv …, 2024 - arxiv.org
We present Stable Video 4D (SV4D), a latent video diffusion model for multi-frame and multi-
view consistent dynamic 3D content generation. Unlike previous methods that rely on …

Learning-based multi-view stereo: a survey

F Wang, Q Zhu, D Chang, Q Gao, J Han… - arxiv preprint arxiv …, 2024 - arxiv.org
3D reconstruction aims to recover the dense 3D structure of a scene. It plays an essential
role in various applications such as Augmented/Virtual Reality (AR/VR), autonomous driving …

Scaledreamer: Scalable text-to-3d synthesis with asynchronous score distillation

Z Ma, Y Wei, Y Zhang, X Zhu, Z Lei, L Zhang - European Conference on …, 2024 - Springer
By leveraging the text-to-image diffusion prior, score distillation can synthesize 3D contents
without paired text-3D training data. Instead of spending hours of online optimization per text …

Im-3d: Iterative multiview diffusion and reconstruction for high-quality 3d generation

L Melas-Kyriazi, I Laina, C Rupprecht… - arxiv preprint arxiv …, 2024 - arxiv.org
Most text-to-3D generators build upon off-the-shelf text-to-image models trained on billions
of images. They use variants of Score Distillation Sampling (SDS), which is slow, somewhat …

Scube: Instant large-scale scene reconstruction using voxsplats

X Ren, Y Lu, H Liang, Z Wu, H Ling, M Chen… - arxiv preprint arxiv …, 2024 - arxiv.org
We present SCube, a novel method for reconstructing large-scale 3D scenes (geometry,
appearance, and semantics) from a sparse set of posed images. Our method encodes …