Robustness-aware 3D object detection in autonomous driving: A review and outlook

Z Song, L Liu, F Jia, Y Luo, C Jia… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
In the realm of modern autonomous driving, the perception system is indispensable for
accurately assessing the state of the surrounding environment, thereby enabling informed …

Spherical transformer for LiDAR-based 3D recognition

X Lai, Y Chen, F Lu, J Liu, J Jia - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
LiDAR-based 3D point cloud recognition has benefited various applications. Without
specially considering the LiDAR point distribution, most current methods suffer from …

PointMamba: A simple state space model for point cloud analysis

D Liang, X Zhou, W Xu, X Zhu, Z Zou, X Ye… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers have become one of the foundational architectures in point cloud analysis
tasks due to their excellent global modeling ability. However, the attention mechanism has …

FocalFormer3D: Focusing on hard instance for 3D object detection

Y Chen, Z Yu, Y Chen, S Lan… - Proceedings of the …, 2023 - openaccess.thecvf.com
False negatives (FN) in 3D object detection, e.g., missing predictions of pedestrians, vehicles,
or other obstacles, can lead to potentially dangerous situations in autonomous driving. While …

GenAD: Generative end-to-end autonomous driving

W Zheng, R Song, X Guo, C Zhang, L Chen - European Conference on …, 2024 - Springer
Directly producing planning results from raw sensors has been a long-desired solution for
autonomous driving and has attracted increasing attention recently. Most existing end-to …

UniPAD: A universal pre-training paradigm for autonomous driving

H Yang, S Zhang, D Huang, X Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In the context of autonomous driving, the significance of effective feature learning is widely
acknowledged. While conventional 3D self-supervised pre-training methods have shown …

Cross modal transformer: Towards fast and robust 3D object detection

J Yan, Y Liu, J Sun, F Jia, S Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for
end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes the …

NuScenes-QA: A multi-modal visual question answering benchmark for autonomous driving scenario

T Qian, J Chen, L Zhuo, Y Jiao, YG Jiang - Proceedings of the AAAI …, 2024 - ojs.aaai.org
We introduce a novel visual question answering (VQA) task in the context of autonomous
driving, aiming to answer natural language questions based on street-view clues. Compared …

Uni3DETR: Unified 3D detection transformer

Z Wang, YL Li, X Chen, H Zhao… - Advances in Neural …, 2024 - proceedings.neurips.cc
Existing point cloud based 3D detectors are designed for the particular scene, either indoor
or outdoor ones. Because of the substantial differences in object distribution and point …

A survey on Segment Anything Model (SAM): Vision foundation model meets prompt engineering

C Zhang, FD Puspitasari, S Zheng, C Li, Y Qiao… - arXiv preprint arXiv …, 2023 - arxiv.org
Segment Anything Model (SAM), developed by Meta AI Research, has recently attracted
significant attention. Trained on a large segmentation dataset of over 1 billion masks, SAM is …