2DPASS: 2D priors assisted semantic segmentation on LiDAR point clouds

X Yan, J Gao, C Zheng, C Zheng, R Zhang… - European conference on …, 2022 - Springer
As camera and LiDAR sensors capture complementary information in autonomous driving,
great efforts have been made to conduct semantic segmentation through multi-modality data …

Learning 3D representations from 2D pre-trained models via image-to-point masked autoencoders

R Zhang, L Wang, Y Qiao, P Gao… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Pre-training by numerous image data has become de-facto for robust 2D representations. In
contrast, due to the expensive data processing, a paucity of 3D datasets severely hinders …

Neural 3D scene reconstruction with the Manhattan-world assumption

H Guo, S Peng, H Lin, Q Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view
images. Many previous works have shown impressive reconstruction results on textured …

PEAL: Prior-embedded explicit attention learning for low-overlap point cloud registration

J Yu, L Ren, Y Zhang, W Zhou… - Proceedings of the …, 2023 - openaccess.thecvf.com
Learning distinctive point-wise features is critical for low-overlap point cloud registration.
Recently, it has achieved huge success in incorporating Transformer into point cloud feature …

UniT3D: A unified transformer for 3D dense captioning and visual grounding

Z Chen, R Hu, X Chen, M Nießner… - Proceedings of the …, 2023 - openaccess.thecvf.com
Performing 3D dense captioning and visual grounding requires a common and shared
understanding of the underlying multimodal relationships. However, despite some previous …

DepthCrafter: Generating consistent long depth sequences for open-world videos

W Hu, X Gao, X Li, S Zhao, X Cun, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite significant advancements in monocular depth estimation for static images,
estimating video depth in the open world remains challenging, since open-world videos are …

X3KD: Knowledge distillation across modalities, tasks and stages for multi-camera 3D object detection

M Klingner, S Borse, VR Kumar… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent advances in 3D object detection (3DOD) have obtained remarkably strong results for
LiDAR-based models. In contrast, surround-view 3DOD models based on multiple camera …

Image2Point: 3D point-cloud understanding with 2D image pretrained models

C Xu, S Yang, T Galanti, B Wu, X Yue, B Zhai… - … on Computer Vision, 2022 - Springer
3D point-clouds and 2D images are different visual representations of the physical
world. While human vision can understand both representations, computer vision models …

Pri3D: Can 3D priors help 2D representation learning?

J Hou, S Xie, B Graham, A Dai… - Proceedings of the …, 2021 - openaccess.thecvf.com
Recent advances in 3D perception have shown impressive progress in understanding
geometric structures of 3D shapes and even scenes. Inspired by these advances in …

Structured knowledge distillation for accurate and efficient object detection

L Zhang, K Ma - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org
Knowledge distillation, which aims to transfer the knowledge learned by a cumbersome
teacher model to a lightweight student model, has become one of the most popular and …