Depth anything: Unleashing the power of large-scale unlabeled data

L Yang, B Kang, Z Huang, X Xu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Abstract This work presents Depth Anything, a highly practical solution for robust monocular
depth estimation. Without pursuing novel technical modules, we aim to build a simple yet …

Metric3d: Towards zero-shot metric 3d prediction from a single image

W Yin, C Zhang, H Chen, Z Cai, G Yu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Reconstructing accurate 3D scenes from images is a long-standing vision task. Due to the ill-
posedness of the single-image reconstruction problem, most well-established methods are …

Towards zero-shot scale-aware monocular depth estimation

V Guizilini, I Vasiljevic, D Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Monocular depth estimation is scale-ambiguous, and thus requires scale supervision to
produce metric predictions. Even so, the resulting models will be geometry-specific, with …

Orienternet: Visual localization in 2d public maps with neural matching

PE Sarlin, D DeTone, TY Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Humans can orient themselves in their 3D environments using simple 2D maps. Differently,
algorithms for visual localization mostly rely on complex 3D point clouds that are expensive …

Vip-deeplab: Learning visual perception with depth-aware video panoptic segmentation

S Qiao, Y Zhu, H Adam, A Yuille… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this paper, we present ViP-DeepLab, a unified model attempting to tackle the long-
standing and challenging inverse projection problem in vision, which we model as restoring …

The second monocular depth estimation challenge

J Spencer, CS Qian, M Trescakova… - Proceedings of the …, 2023 - openaccess.thecvf.com
This paper discusses the results for the second edition of the Monocular Depth Estimation
Challenge (MDEC). This edition was open to methods using any form of supervision …

Snap: Self-supervised neural maps for visual positioning and semantic understanding

PE Sarlin, E Trulls, M Pollefeys… - Advances in Neural …, 2024 - proceedings.neurips.cc
Semantic 2D maps are commonly used by humans and machines for navigation purposes,
whether it's walking or driving. However, these maps have limitations: they lack detail, often …

Accessing eye-level greenness visibility from open-source street view images: A methodological development and implementation in multi-city and multi …

IAV Sánchez, SM Labib - Sustainable Cities and Society, 2024 - Elsevier
The urban natural environment provides numerous benefits, including augmenting the
aesthetic appeal of urban landscapes and improving mental well-being. While diverse …

OpenStreetView-5M: The Many Roads to Global Visual Geolocation

G Astruc, N Dufour, I Siglidis… - Proceedings of the …, 2024 - openaccess.thecvf.com
Determining the location of an image anywhere on Earth is a complex visual task which
makes it particularly relevant for evaluating computer vision algorithms. Determining the …

A survey on RGB-D datasets

A Lopes, R Souza, H Pedrini - Computer Vision and Image Understanding, 2022 - Elsevier
RGB-D data is essential for solving many problems in computer vision. Hundreds of public
RGB-D datasets containing various scenes, such as indoor, outdoor, aerial, driving, and …