ZoeDepth: Zero-shot transfer by combining relative and metric depth

SF Bhat, R Birkl, D Wofk, P Wonka, M Müller - arXiv preprint arXiv …, 2023 - arxiv.org
This paper tackles the problem of depth estimation from a single image. Existing work either
focuses on generalization performance disregarding metric scale, i.e., relative depth …

MultiMAE: Multi-modal multi-task masked autoencoders

R Bachmann, D Mizrahi, A Atanov, A Zamir - European Conference on …, 2022 - Springer
We propose a pre-training strategy called Multi-modal Multi-task Masked Autoencoders
(MultiMAE). It differs from standard Masked Autoencoding in two key aspects: I) it can …

NeuralLift-360: Lifting an in-the-wild 2D photo to a 3D object with 360° views

D Xu, Y Jiang, P Wang, Z Fan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Virtual reality and augmented reality (XR) bring increasing demand for 3D content
generation. However, creating high-quality 3D content requires tedious work from a human …

Repurposing diffusion-based image generators for monocular depth estimation

B Ke, A Obukhov, S Huang, N Metzger… - Proceedings of the …, 2024 - openaccess.thecvf.com
Monocular depth estimation is a fundamental computer vision task. Recovering 3D depth
from a single image is geometrically ill-posed and requires scene understanding, so it is not …

Unsupervised scale-consistent depth and ego-motion learning from monocular video

J Bian, Z Li, N Wang, H Zhan, C Shen… - Advances in neural …, 2019 - proceedings.neurips.cc
Recent work has shown that CNN-based depth and ego-motion estimators can be learned
using unlabelled monocular videos. However, the performance is limited by unidentified …

GeoWizard: Unleashing the diffusion priors for 3D geometry estimation from a single image

X Fu, W Yin, M Hu, K Wang, Y Ma, P Tan… - … on Computer Vision, 2024 - Springer
We introduce GeoWizard, a new generative foundation model designed for estimating
geometric attributes, e.g., depth and normals, from single images. While significant research …

Metric3D: Towards zero-shot metric 3D prediction from a single image

W Yin, C Zhang, H Chen, Z Cai, G Yu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Reconstructing accurate 3D scenes from images is a long-standing vision task. Due to the
ill-posedness of the single-image reconstruction problem, most well-established methods are …

SinNeRF: Training neural radiance fields on complex scenes from a single image

D Xu, Y Jiang, P Wang, Z Fan, H Shi… - European Conference on …, 2022 - Springer
Despite the rapid development of Neural Radiance Field (NeRF), the necessity of dense
covers largely prohibits its wider applications. While several recent works have attempted to …

P3Depth: Monocular depth estimation with a piecewise planarity prior

V Patil, C Sakaridis, A Liniger… - Proceedings of the …, 2022 - openaccess.thecvf.com
Monocular depth estimation is vital for scene understanding and downstream tasks. We
focus on the supervised setup, in which ground-truth depth is available only at training time …

Text2NeRF: Text-driven 3D scene generation with neural radiance fields

J Zhang, X Li, Z Wan, C Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Text-driven 3D scene generation is widely applicable to video gaming, film industry, and
metaverse applications that have a large demand for 3D scenes. However, existing text-to …