3D object detection for autonomous driving: A comprehensive survey

J Mao, S Shi, X Wang, H Li - International Journal of Computer Vision, 2023‏ - Springer
Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …

Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe

H Li, C Sima, J Dai, W Wang, L Lu… - … on Pattern Analysis …, 2023‏ - ieeexplore.ieee.org
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending
and drawing extensive attention both from industry and academia. Conventional …

Transfusion: Robust lidar-camera fusion for 3d object detection with transformers

X Bai, Z Hu, X Zhu, Q Huang, Y Chen… - Proceedings of the …, 2022‏ - openaccess.thecvf.com
LiDAR and camera are two important sensors for 3D object detection in autonomous driving.
Despite the increasing popularity of sensor fusion in this field, the robustness against inferior …

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - IEEE transactions on …, 2024‏ - ieeexplore.ieee.org
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Mask3d: Mask transformer for 3d semantic instance segmentation

J Schult, F Engelmann, A Hermans… - … on Robotics and …, 2023‏ - ieeexplore.ieee.org
Modern 3D semantic instance segmentation approaches predominantly rely on specialized
voting mechanisms followed by carefully designed geometric clustering techniques. Building …

Vision transformer with deformable attention

Z **a, X Pan, S Song, LE Li… - Proceedings of the IEEE …, 2022‏ - openaccess.thecvf.com
Transformers have recently shown superior performances on various vision tasks. The large,
sometimes even global, receptive field endows Transformer models with higher …

Dsvt: Dynamic sparse voxel transformer with rotated sets

H Wang, C Shi, S Shi, M Lei, S Wang… - Proceedings of the …, 2023‏ - openaccess.thecvf.com
Designing an efficient yet deployment-friendly 3D backbone to handle sparse point clouds is
a fundamental problem in 3D perception. Compared with the customized sparse …

On the integration of self-attention and convolution

X Pan, C Ge, R Lu, S Song, G Chen… - Proceedings of the …, 2022‏ - openaccess.thecvf.com
Convolution and self-attention are two powerful techniques for representation learning, and
they are usually considered as two peer approaches that are distinct from each other. In this …

An end-to-end transformer model for 3d object detection

I Misra, R Girdhar, A Joulin - Proceedings of the IEEE/CVF …, 2021‏ - openaccess.thecvf.com
We propose 3DETR, an end-to-end Transformer based object detection model for 3D point
clouds. Compared to existing detection methods that employ a number of 3D-specific …

Voxel transformer for 3d object detection

J Mao, Y Xue, M Niu, H Bai, J Feng… - Proceedings of the …, 2021‏ - openaccess.thecvf.com
Abstract We present Voxel Transformer (VoTr), a novel and effective voxel-based
Transformer backbone for 3D object detection from point clouds. Conventional 3D …