Deep learning for monocular depth estimation: A review

Y Ming, X Meng, C Fan, H Yu - Neurocomputing, 2021 - Elsevier
Depth estimation is a classic task in computer vision, which is of great significance for many
applications such as augmented reality, target tracking and autonomous driving. Traditional …

Monocular depth estimation using deep learning: A review

A Masoumian, HA Rashwan, J Cristiano, MS Asif… - Sensors, 2022 - mdpi.com
In recent decades, significant advancements in robotics engineering and autonomous
vehicles have increased the requirement for precise depth measurements. Depth estimation …

Depth Anything: Unleashing the power of large-scale unlabeled data

L Yang, B Kang, Z Huang, X Xu… - Proceedings of the …, 2024 - openaccess.thecvf.com
This work presents Depth Anything, a highly practical solution for robust monocular
depth estimation. Without pursuing novel technical modules, we aim to build a simple yet …

ZoeDepth: Zero-shot transfer by combining relative and metric depth

SF Bhat, R Birkl, D Wofk, P Wonka, M Müller - arXiv preprint arXiv …, 2023 - arxiv.org
This paper tackles the problem of depth estimation from a single image. Existing work either
focuses on generalization performance disregarding metric scale, i.e. relative depth …

BEVDet: High-performance multi-camera 3D object detection in bird-eye-view

J Huang, G Huang, Z Zhu, Y Ye, D Du - arXiv preprint arXiv:2112.11790, 2021 - arxiv.org
Autonomous driving perceives its surroundings for decision making, which is one of the most
complex scenarios in visual perception. The success of paradigm innovation in solving the …

iDisc: Internal discretization for monocular depth estimation

L Piccinelli, C Sakaridis, F Yu - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Monocular depth estimation is fundamental for 3D scene understanding and downstream
applications. However, even under the supervised setup, it is still challenging and ill-posed …

BEVDet4D: Exploit temporal cues in multi-camera 3D object detection

J Huang, G Huang - arXiv preprint arXiv:2203.17054, 2022 - arxiv.org
Single frame data contains finite information which limits the performance of the existing
vision-based multi-camera 3D object detection paradigms. For fundamentally pushing the …

DETR3D: 3D object detection from multi-view images via 3D-to-2D queries

Y Wang, VC Guizilini, T Zhang… - … on Robot Learning, 2022 - proceedings.mlr.press
We introduce a framework for multi-camera 3D object detection. In contrast to existing works,
which estimate 3D bounding boxes directly from monocular images or use depth prediction …

Time will tell: New outlooks and a baseline for temporal multi-view 3D object detection

J Park, C Xu, S Yang, K Keutzer, K Kitani… - arXiv preprint arXiv …, 2022 - arxiv.org
While recent camera-only 3D detection methods leverage multiple timesteps, the limited
history they use significantly hampers the extent to which temporal fusion can improve object …

Metric3D: Towards zero-shot metric 3D prediction from a single image

W Yin, C Zhang, H Chen, Z Cai, G Yu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Reconstructing accurate 3D scenes from images is a long-standing vision task. Due to the
ill-posedness of the single-image reconstruction problem, most well-established methods are …