Repurposing diffusion-based image generators for monocular depth estimation

B Ke, A Obukhov, S Huang, N Metzger… - Proceedings of the …, 2024 - openaccess.thecvf.com
Monocular depth estimation is a fundamental computer vision task. Recovering 3D depth
from a single image is geometrically ill-posed and requires scene understanding so it is not …

Learning to upsample by learning to sample

W Liu, H Lu, H Fu, Z Cao - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
We present DySample, an ultra-lightweight and effective dynamic upsampler. While
impressive performance gains have been witnessed from recent kernel-based dynamic …

BEVDepth: Acquisition of reliable depth for multi-view 3D object detection

Y Li, Z Ge, G Yu, J Yang, Z Wang, Y Shi… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
In this research, we propose a new 3D object detector with a trustworthy depth estimation,
dubbed BEVDepth, for camera-based Bird's-Eye-View (BEV) 3D object detection. Our work …

DDP: Diffusion model for dense visual prediction

Y Ji, Z Chen, E Xie, L Hong, X Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose a simple, efficient, yet powerful framework for dense visual predictions based
on the conditional diffusion pipeline. Our approach follows a "noise-to-map" generative …

DatasetDM: Synthesizing data with perception annotations using diffusion models

W Wu, Y Zhao, H Chen, Y Gu, R Zhao… - Advances in …, 2023 - proceedings.neurips.cc
Current deep networks are very data-hungry and benefit from training on large-scale
datasets, which are often time-consuming to collect and annotate. By contrast, synthetic data …

DETRs with hybrid matching

D Jia, Y Yuan, H He, X Wu, H Yu… - Proceedings of the …, 2023 - openaccess.thecvf.com
One-to-one set matching is a key design for DETR to establish its end-to-end capability, so
that object detection does not require a hand-crafted NMS (non-maximum suppression) to …

MonoViT: Self-supervised monocular depth estimation with a vision transformer

C Zhao, Y Zhang, M Poggi, F Tosi… - … conference on 3D …, 2022 - ieeexplore.ieee.org
Self-supervised monocular depth estimation is an attractive solution that does not require
hard-to-source depth labels for training. Convolutional neural networks (CNNs) have …

BinsFormer: Revisiting adaptive bins for monocular depth estimation

Z Li, X Wang, X Liu, J Jiang - IEEE Transactions on Image …, 2024 - ieeexplore.ieee.org
Monocular depth estimation (MDE) is a fundamental task in computer vision and has drawn
increasing attention. Recently, some methods reformulate it as a classification-regression …

CompletionFormer: Depth completion with convolutions and vision transformers

Y Zhang, X Guo, M Poggi, Z Zhu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Given sparse depths and the corresponding RGB images, depth completion aims at spatially
propagating the sparse measurements throughout the whole image to get a dense depth …

RoboDepth: Robust out-of-distribution depth estimation under corruptions

L Kong, S Xie, H Hu, LX Ng… - Advances in Neural …, 2023 - proceedings.neurips.cc
Depth estimation from monocular images is pivotal for real-world visual perception systems.
While current learning-based depth estimation models train and test on meticulously curated …