Robustness-aware 3D object detection in autonomous driving: A review and outlook

Z Song, L Liu, F Jia, Y Luo, C Jia… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
In the realm of modern autonomous driving, the perception system is indispensable for
accurately assessing the state of the surrounding environment, thereby enabling informed …

UniPAD: A universal pre-training paradigm for autonomous driving

H Yang, S Zhang, D Huang, X Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In the context of autonomous driving, the significance of effective feature learning is widely
acknowledged. While conventional 3D self-supervised pre-training methods have shown …

Semantically-aware neural radiance fields for visual scene understanding: A comprehensive review

TAQ Nguyen, A Bourki, M Macudzinski… - arXiv preprint arXiv …, 2024 - arxiv.org
This review thoroughly examines the role of semantically-aware Neural Radiance Fields
(NeRFs) in visual scene understanding, covering an analysis of over 250 scholarly papers. It …

OV-Uni3DETR: Towards unified open-vocabulary 3D object detection via cycle-modality propagation

Z Wang, Y Li, T Liu, H Zhao, S Wang - European Conference on Computer …, 2024 - Springer
In the current state of 3D object detection research, the severe scarcity of annotated 3D data,
substantial disparities across different data modalities, and the absence of a unified …

NeRF-MAE: Masked autoencoders for self-supervised 3D representation learning for neural radiance fields

MZ Irshad, S Zakharov, V Guizilini, A Gaidon… - … on Computer Vision, 2024 - Springer
Neural fields excel in computer vision and robotics due to their ability to understand the 3D
visual world such as inferring semantics, geometry, and dynamics. Given the capabilities of …

Pixel-aligned recurrent queries for multi-view 3D object detection

Y Xie, H Jiang, G Gkioxari… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We present PARQ, a multi-view 3D object detector with transformer and pixel-aligned
recurrent queries. Unlike previous works that use learnable features or only encode 3D point …

The NeRFect match: Exploring NeRF features for visual localization

Q Zhou, M Maximov, O Litany, L Leal-Taixé - European Conference on …, 2024 - Springer
In this work, we propose the use of Neural Radiance Fields (NeRF) as a scene
representation for visual localization. Recently, NeRF has been employed to enhance pose …

ConDense: Consistent 2D/3D Pre-training for Dense and Sparse Features from Multi-View Images

X Zhang, Z Wang, H Zhou, S Ghosh… - … on Computer Vision, 2024 - Springer
To advance the state of the art in the creation of 3D foundation models, this paper introduces
the ConDense framework for 3D pre-training utilizing existing pre-trained 2D networks and …

CVT-Occ: Cost volume temporal fusion for 3D occupancy prediction

Z Ye, T Jiang, C Xu, Y Li, H Zhao - European Conference on Computer …, 2024 - Springer
Vision-based 3D occupancy prediction is significantly challenged by the inherent limitations
of monocular vision in depth estimation. This paper introduces CVT-Occ, a novel approach …

PRED: Pre-training via semantic rendering on LiDAR point clouds

H Yang, H Wang, D Dai… - Advances in Neural …, 2023 - proceedings.neurips.cc
Pre-training is crucial in 3D-related fields such as autonomous driving where point cloud
annotation is costly and challenging. Many recent studies on point cloud pre-training …