Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe

H Li, C Sima, J Dai, W Wang, L Lu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending
and drawing extensive attention both from industry and academia. Conventional …

3D object detection for autonomous driving: A comprehensive survey

J Mao, S Shi, X Wang, H Li - International Journal of Computer Vision, 2023 - Springer
Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …

Tri-perspective view for vision-based 3d semantic occupancy prediction

Y Huang, W Zheng, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Modern methods for vision-centric autonomous driving perception widely adopt the bird's-
eye-view (BEV) representation to describe a 3D scene. Despite its better efficiency than …

Occ3d: A large-scale 3d occupancy prediction benchmark for autonomous driving

X Tian, T Jiang, L Yun, Y Mao, H Yang… - Advances in …, 2024 - proceedings.neurips.cc
Robotic perception requires the modeling of both 3D geometry and semantics. Existing
methods typically focus on estimating 3D bounding boxes, neglecting finer geometric details …

Surroundocc: Multi-camera 3d occupancy prediction for autonomous driving

Y Wei, L Zhao, W Zheng, Z Zhu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract 3D scene understanding plays a vital role in vision-based autonomous driving.
While most existing methods focus on 3D object detection, they have difficulty describing …

Exploring object-centric temporal modeling for efficient multi-view 3d object detection

S Wang, Y Liu, T Wang, Y Li… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In this paper, we propose a long-sequence modeling framework, named StreamPETR, for
multi-view 3D object detection. Built upon the sparse query design in the PETR series, we …

DriveDreamer: Towards Real-World-Drive World Models for Autonomous Driving

X Wang, Z Zhu, G Huang, X Chen, J Zhu… - European Conference on …, 2024 - Springer
World models, especially in autonomous driving, are trending and drawing extensive
attention due to their capacity for comprehending driving environments. The established …

Bevdepth: Acquisition of reliable depth for multi-view 3d object detection

Y Li, Z Ge, G Yu, J Yang, Z Wang, Y Shi… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
In this research, we propose a new 3D object detector with a trustworthy depth estimation,
dubbed BEVDepth, for camera-based Bird's-Eye-View~(BEV) 3D object detection. Our work …

Bevformer v2: Adapting modern image backbones to bird's-eye-view recognition via perspective supervision

C Yang, Y Chen, H Tian, C Tao, X Zhu… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present a novel bird's-eye-view (BEV) detector with perspective supervision, which
converges faster and better suits modern image backbones. Existing state-of-the-art BEV …

Bevfusion: Multi-task multi-sensor fusion with unified bird's-eye view representation

Z Liu, H Tang, A Amini, X Yang, H Mao… - … on robotics and …, 2023 - ieeexplore.ieee.org
Multi-sensor fusion is essential for an accurate and reliable autonomous driving system.
Recent approaches are based on point-level fusion: augmenting the LiDAR point cloud with …