Depth Anything: Unleashing the power of large-scale unlabeled data
Abstract: This work presents Depth Anything, a highly practical solution for robust monocular
depth estimation. Without pursuing novel technical modules, we aim to build a simple yet …
Metric3D: Towards zero-shot metric 3D prediction from a single image
Reconstructing accurate 3D scenes from images is a long-standing vision task. Due to the ill-
posedness of the single-image reconstruction problem, most well-established methods are …
Towards zero-shot scale-aware monocular depth estimation
Monocular depth estimation is scale-ambiguous, and thus requires scale supervision to
produce metric predictions. Even so, the resulting models will be geometry-specific, with …
OrienterNet: Visual localization in 2D public maps with neural matching
Humans can orient themselves in their 3D environments using simple 2D maps. Differently,
algorithms for visual localization mostly rely on complex 3D point clouds that are expensive …
ViP-DeepLab: Learning visual perception with depth-aware video panoptic segmentation
In this paper, we present ViP-DeepLab, a unified model attempting to tackle the long-
standing and challenging inverse projection problem in vision, which we model as restoring …
The second monocular depth estimation challenge
This paper discusses the results for the second edition of the Monocular Depth Estimation
Challenge (MDEC). This edition was open to methods using any form of supervision …
SNAP: Self-supervised neural maps for visual positioning and semantic understanding
Semantic 2D maps are commonly used by humans and machines for navigation purposes,
whether it's walking or driving. However, these maps have limitations: they lack detail, often …
Accessing eye-level greenness visibility from open-source street view images: A methodological development and implementation in multi-city and multi …
IAV Sánchez, SM Labib - Sustainable Cities and Society, 2024 - Elsevier
The urban natural environment provides numerous benefits, including augmenting the
aesthetic appeal of urban landscapes and improving mental well-being. While diverse …
OpenStreetView-5M: The Many Roads to Global Visual Geolocation
Determining the location of an image anywhere on Earth is a complex visual task which
makes it particularly relevant for evaluating computer vision algorithms. Determining the …
A survey on RGB-D datasets
RGB-D data is essential for solving many problems in computer vision. Hundreds of public
RGB-D datasets containing various scenes, such as indoor, outdoor, aerial, driving, and …