Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe

H Li, C Sima, J Dai, W Wang, L Lu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending
and drawing extensive attention both from industry and academia. Conventional …

3D object detection for autonomous driving: A comprehensive survey

J Mao, S Shi, X Wang, H Li - International Journal of Computer Vision, 2023 - Springer
Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …

Transfusion: Robust lidar-camera fusion for 3d object detection with transformers

X Bai, Z Hu, X Zhu, Q Huang, Y Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
LiDAR and camera are two important sensors for 3D object detection in autonomous driving.
Despite the increasing popularity of sensor fusion in this field, the robustness against inferior …

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Vision transformer with deformable attention

Z Xia, X Pan, S Song, LE Li… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Transformers have recently shown superior performances on various vision tasks. The large,
sometimes even global, receptive field endows Transformer models with higher …

On the integration of self-attention and convolution

X Pan, C Ge, R Lu, S Song, G Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
Convolution and self-attention are two powerful techniques for representation learning, and
they are usually considered as two peer approaches that are distinct from each other. In this …

An end-to-end transformer model for 3d object detection

I Misra, R Girdhar, A Joulin - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
We propose 3DETR, an end-to-end Transformer based object detection model for 3D point
clouds. Compared to existing detection methods that employ a number of 3D-specific …

Voxel transformer for 3d object detection

J Mao, Y Xue, M Niu, H Bai, J Feng… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present Voxel Transformer (VoTr), a novel and effective voxel-based
Transformer backbone for 3D object detection from point clouds. Conventional 3D …

A survey of visual transformers

Y Liu, Y Zhang, Y Wang, F Hou, J Yuan… - … on Neural Networks …, 2023 - ieeexplore.ieee.org
Transformer, an attention-based encoder–decoder model, has already revolutionized the
field of natural language processing (NLP). Inspired by such significant achievements, some …

Embracing single stride 3d object detector with sparse transformer

L Fan, Z Pang, T Zhang, YX Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
In LiDAR-based 3D object detection for autonomous driving, the ratio of the object size to
input scene size is significantly smaller compared to 2D detection cases. Overlooking this …