Multi-modal 3D object detection in autonomous driving: A survey and taxonomy
Autonomous vehicles require continuous environmental perception to obtain the distribution of
obstacles to achieve safe driving. Specifically, 3D object detection is a vital functional …
Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending
and drawing extensive attention from both industry and academia. Conventional …
BEVFusion: Multi-task multi-sensor fusion with unified bird's-eye view representation
Multi-sensor fusion is essential for an accurate and reliable autonomous driving system.
Recent approaches are based on point-level fusion: augmenting the LiDAR point cloud with …
BEVFormer: Learning bird's-eye-view representation from LiDAR-camera via spatiotemporal transformers
The multi-modality fusion strategy is currently the de facto most competitive solution for 3D
perception tasks. In this work, we present a new framework termed BEVFormer, which learns …
TransFusion: Robust LiDAR-camera fusion for 3D object detection with transformers
LiDAR and camera are two important sensors for 3D object detection in autonomous driving.
Despite the increasing popularity of sensor fusion in this field, the robustness against inferior …
Virtual sparse convolution for multimodal 3D object detection
Recently, virtual/pseudo-point-based 3D object detection, which seamlessly fuses
RGB images and LiDAR data by depth completion has gained great attention. However …
BEVFusion: A simple and robust LiDAR-camera fusion framework
Fusing camera and LiDAR information has become a de facto standard for 3D object
detection tasks. Current methods rely on point clouds from the LiDAR sensor as queries to …
DeepFusion: LiDAR-camera deep fusion for multi-modal 3D object detection
LiDARs and cameras are critical sensors that provide complementary information for 3D
detection in autonomous driving. While prevalent multi-modal methods simply decorate raw …
Unifying voxel-based representation with transformer for 3D object detection
In this work, we present a unified framework for multi-modality 3D object detection, named
UVTR. The proposed method aims to unify multi-modality representations in the voxel space …
Focal sparse convolutional networks for 3d object detection
Non-uniform 3D sparse data, e.g., point clouds or voxels at different spatial positions,
contribute to the task of 3D object detection in different ways. Existing basic components in …