Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending
and drawing extensive attention both from industry and academia. Conventional …
3D object detection for autonomous driving: A comprehensive survey
Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …
TransFusion: Robust LiDAR-camera fusion for 3D object detection with transformers
LiDAR and camera are two important sensors for 3D object detection in autonomous driving.
Despite the increasing popularity of sensor fusion in this field, the robustness against inferior …
Transformer-based visual segmentation: A survey
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …
Vision transformer with deformable attention
Transformers have recently shown superior performances on various vision tasks. The large,
sometimes even global, receptive field endows Transformer models with higher …
On the integration of self-attention and convolution
Convolution and self-attention are two powerful techniques for representation learning, and
they are usually considered as two peer approaches that are distinct from each other. In this …
An end-to-end transformer model for 3D object detection
We propose 3DETR, an end-to-end Transformer based object detection model for 3D point
clouds. Compared to existing detection methods that employ a number of 3D-specific …
Voxel transformer for 3D object detection
We present Voxel Transformer (VoTr), a novel and effective voxel-based
Transformer backbone for 3D object detection from point clouds. Conventional 3D …
A survey of visual transformers
Transformer, an attention-based encoder–decoder model, has already revolutionized the
field of natural language processing (NLP). Inspired by such significant achievements, some …
Embracing single stride 3D object detector with sparse transformer
In LiDAR-based 3D object detection for autonomous driving, the ratio of the object size to
input scene size is significantly smaller compared to 2D detection cases. Overlooking this …