Magic123: One image to high-quality 3D object generation using both 2D and 3D diffusion priors

G Qian, J Mai, A Hamdi, J Ren, A Siarohin, B Li… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Magic123, a two-stage coarse-to-fine approach for high-quality, textured 3D
mesh generation from a single unposed image in the wild using both 2D and 3D priors. In …

SiT: Exploring flow and diffusion-based generative models with scalable interpolant transformers

N Ma, M Goldstein, MS Albergo, NM Boffi… - … on Computer Vision, 2024 - Springer
Abstract We present Scalable Interpolant Transformers (SiT), a family of generative models
built on the backbone of Diffusion Transformers (DiT). The interpolant framework, which …

4DGen: Grounded 4D content generation with spatial-temporal consistency

Y Yin, D Xu, Z Wang, Y Zhao, Y Wei - arXiv preprint arXiv:2312.17225, 2023 - arxiv.org
Aided by text-to-image and text-to-video diffusion models, existing 4D content creation
pipelines utilize score distillation sampling to optimize the entire dynamic 3D scene …

Learning the 3D Fauna of the Web

Z Li, D Litvak, R Li, Y Zhang, T Jakab… - Proceedings of the …, 2024 - openaccess.thecvf.com
Learning 3D models of all animals in nature requires massively scaling up existing
solutions. With this ultimate goal in mind, we develop 3D-Fauna, an approach that learns a …

ARTIC3D: Learning robust articulated 3D shapes from noisy web image collections

CH Yao, A Raj, WC Hung… - Advances in …, 2024 - proceedings.neurips.cc
Estimating 3D articulated shapes like animal bodies from monocular images is inherently
challenging due to the ambiguities of camera viewpoint, pose, texture, lighting, etc. We …

DragAPart: Learning a part-level motion prior for articulated objects

R Li, C Zheng, C Rupprecht, A Vedaldi - European Conference on …, 2024 - Springer
We introduce DragAPart, a method that, given an image and a set of drags as input,
generates a new image of the same object that responds to the action of the drags …

AnimatableDreamer: Text-guided non-rigid 3D model generation and reconstruction with canonical score distillation

X Wang, Y Wang, J Ye, F Sun, Z Wang, L Wang… - … on Computer Vision, 2024 - Springer
Advances in 3D generation have facilitated sequential 3D model generation (aka 4D
generation), yet its application to animatable objects with large motion remains scarce. Our …

Animal avatars: Reconstructing animatable 3D animals from casual videos

R Sabathier, NJ Mitra, D Novotny - European Conference on Computer …, 2024 - Springer
We present a method to build animatable dog avatars from monocular videos. This is
challenging as animals display a range of (unpredictable) non-rigid movements and have a …

VAREN: Very Accurate and Realistic Equine Network

S Zuffi, Y Mellbin, C Li, M Hoeschle… - Proceedings of the …, 2024 - openaccess.thecvf.com
Data-driven three-dimensional parametric shape models of the human body have gained
enormous popularity both for the analysis of visual data and for the generation of synthetic …

Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos

K Sun, D Litvak, Y Zhang, H Li, J Wu, S Wu - European Conference on …, 2024 - Springer
We introduce a new method for learning a generative model of articulated 3D animal
motions from raw, unlabeled online videos. Unlike existing approaches for 3D motion …