3D object detection for autonomous driving: A comprehensive survey

J Mao, S Shi, X Wang, H Li - International Journal of Computer Vision, 2023 - Springer
Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …

Multi-modal 3d object detection in autonomous driving: A survey and taxonomy

L Wang, X Zhang, Z Song, J Bi, G Zhang… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Autonomous vehicles require constant environmental perception to obtain the distribution of
obstacles to achieve safe driving. Specifically, 3D object detection is a vital functional …

Voxelnext: Fully sparse voxelnet for 3d object detection and tracking

Y Chen, J Liu, X Zhang, X Qi… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Abstract 3D object detectors usually rely on hand-crafted proxies, eg, anchors or centers,
and translate well-studied 2D frameworks to 3D. Thus, sparse voxel features need to be …

Bevformer v2: Adapting modern image backbones to bird's-eye-view recognition via perspective supervision

C Yang, Y Chen, H Tian, C Tao, X Zhu… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present a novel bird's-eye-view (BEV) detector with perspective supervision, which
converges faster and better suits modern image backbones. Existing state-of-the-art BEV …

Spherical transformer for lidar-based 3d recognition

X Lai, Y Chen, F Lu, J Liu, J Jia - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
LiDAR-based 3D point cloud recognition has benefited various applications. Without
specially considering the LiDAR point distribution, most current methods suffer from …

Virtual sparse convolution for multimodal 3d object detection

H Wu, C Wen, S Shi, X Li… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Abstract Recently, virtual/pseudo-point-based 3D object detection that seamlessly fuses
RGB images and LiDAR data by depth completion has gained great attention. However …

Multimodal learning with transformers: A survey

P Xu, X Zhu, DA Clifton - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Transformer is a promising neural network learner, and has achieved great success in
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …

Bevfusion: Multi-task multi-sensor fusion with unified bird's-eye view representation

Z Liu, H Tang, A Amini, X Yang, H Mao… - … on robotics and …, 2023 - ieeexplore.ieee.org
Multi-sensor fusion is essential for an accurate and reliable autonomous driving system.
Recent approaches are based on point-level fusion: augmenting the LiDAR point cloud with …

Bevfusion: A simple and robust lidar-camera fusion framework

T Liang, H **e, K Yu, Z **a, Z Lin… - Advances in …, 2022 - proceedings.neurips.cc
Fusing the camera and LiDAR information has become a de-facto standard for 3D object
detection tasks. Current methods rely on point clouds from the LiDAR sensor as queries to …

Bevformer: learning bird's-eye-view representation from lidar-camera via spatiotemporal transformers

Z Li, W Wang, H Li, E **e, C Sima, T Lu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Multi-modality fusion strategy is currently the de-facto most competitive solution for 3D
perception tasks. In this work, we present a new framework termed BEVFormer, which learns …