Deep learning for object detection and scene perception in self-driving cars: Survey, challenges, and open issues

A Gupta, A Anpalagan, L Guan, AS Khwaja - Array, 2021 - Elsevier
This article presents a comprehensive survey of deep learning applications for object
detection and scene perception in autonomous vehicles. Unlike existing review papers, we …

Deep multi-modal object detection and semantic segmentation for autonomous driving: Datasets, methods, and challenges

D Feng, C Haase-Schütz, L Rosenbaum… - IEEE Transactions …, 2020 - ieeexplore.ieee.org
Recent advancements in perception for autonomous driving are driven by deep learning. In
order to achieve robust and accurate scene understanding, autonomous vehicles are …

ByteTrack: Multi-object tracking by associating every detection box

Y Zhang, P Sun, Y Jiang, D Yu, F Weng, Z Yuan… - European conference on …, 2022 - Springer
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in
videos. Most methods obtain identities by associating detection boxes whose scores are …

PETR: Position embedding transformation for multi-view 3D object detection

Y Liu, T Wang, X Zhang, J Sun - European Conference on Computer …, 2022 - Springer
In this paper, we develop position embedding transformation (PETR) for multi-view 3D
object detection. PETR encodes the position information of 3D coordinates into image …

nuScenes: A multimodal dataset for autonomous driving

H Caesar, V Bankiti, AH Lang, S Vora… - Proceedings of the …, 2020 - openaccess.thecvf.com
Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle
technology. Image based benchmark datasets have driven development in computer vision …

Objects as points

X Zhou, D Wang, P Krähenbühl - arXiv preprint arXiv:1904.07850, 2019 - arxiv.org
Detection identifies objects as axis-aligned boxes in an image. Most successful object
detectors enumerate a nearly exhaustive list of potential object locations and classify each …

DETR3D: 3D object detection from multi-view images via 3D-to-2D queries

Y Wang, VC Guizilini, T Zhang… - … on Robot Learning, 2022 - proceedings.mlr.press
We introduce a framework for multi-camera 3D object detection. In contrast to existing works,
which estimate 3D bounding boxes directly from monocular images or use depth prediction …

DeepFusion: Lidar-camera deep fusion for multi-modal 3D object detection

Y Li, AW Yu, T Meng, B Caine… - Proceedings of the …, 2022 - openaccess.thecvf.com
Lidars and cameras are critical sensors that provide complementary information for 3D
detection in autonomous driving. While prevalent multi-modal methods simply decorate raw …

PointRCNN: 3D object proposal generation and detection from point cloud

S Shi, X Wang, H Li - … of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com
In this paper, we propose PointRCNN for 3D object detection from raw point cloud. The
whole framework is composed of two stages: stage-1 for the bottom-up 3D proposal …

Lift, splat, shoot: Encoding images from arbitrary camera rigs by implicitly unprojecting to 3D

J Philion, S Fidler - Computer Vision–ECCV 2020: 16th European …, 2020 - Springer
The goal of perception for autonomous vehicles is to extract semantic representations from
multiple sensors and fuse these representations into a single “bird's-eye-view” coordinate …