BEVFormer: Learning Bird's-Eye-View Representation From LiDAR-Camera Via Spatiotemporal Transformers

Z Li, W Wang, H Li, E Xie, C Sima, T Lu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Multi-modality fusion strategy is currently the de facto most competitive solution for 3D
perception tasks. In this work, we present a new framework termed BEVFormer, which learns …

Planning-oriented autonomous driving

Y Hu, J Yang, L Chen, K Li, C Sima… - Proceedings of the …, 2023 - openaccess.thecvf.com
Modern autonomous driving system is characterized as modular tasks in sequential order,
i.e., perception, prediction, and planning. In order to perform a wide diversity of tasks and …

ByteTrack: Multi-object tracking by associating every detection box

Y Zhang, P Sun, Y Jiang, D Yu, F Weng, Z Yuan… - European conference on …, 2022 - Springer
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in
videos. Most methods obtain identities by associating detection boxes whose scores are …

Exploring object-centric temporal modeling for efficient multi-view 3D object detection

S Wang, Y Liu, T Wang, Y Li… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In this paper, we propose a long-sequence modeling framework, named StreamPETR, for
multi-view 3D object detection. Built upon the sparse query design in the PETR series, we …

ViP3D: End-to-end visual trajectory prediction via 3D agent queries

J Gu, C Hu, T Zhang, X Chen, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Perception and prediction are two separate modules in existing autonomous driving
systems. They interact with each other via hand-picked features such as agent bounding …

Standing between past and future: Spatio-temporal modeling for multi-camera 3D multi-object tracking

Z Pang, J Li, P Tokmakov, D Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
This work proposes an end-to-end multi-camera 3D multi-object tracking (MOT) framework. It
emphasizes spatio-temporal continuity and integrates both past and future reasoning for …

Visual point cloud forecasting enables scalable autonomous driving

Z Yang, L Chen, Y Sun, H Li - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com
In contrast to extensive studies on general vision, pre-training for scalable visual
autonomous driving remains seldom explored. Visual autonomous driving applications …

Sparse4D: Multi-view 3D object detection with sparse spatial-temporal fusion

X Lin, T Lin, Z Pei, L Huang, Z Su - arXiv preprint arXiv:2211.10581, 2022 - arxiv.org
Bird's-eye-view (BEV) based methods have made great progress recently in the multi-view 3D
detection task. Compared with BEV-based methods, sparse-based methods lag behind in …

Exploring recurrent long-term temporal fusion for multi-view 3D perception

C Han, J Yang, J Sun, Z Ge, R Dong… - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
Long-term temporal fusion is a crucial but often overlooked technique in camera-based
Bird's-Eye-View (BEV) 3D perception. Existing methods mostly operate in a parallel manner …

Panacea: Panoramic and controllable video generation for autonomous driving

Y Wen, Y Zhao, Y Liu, F Jia, Y Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com
The field of autonomous driving increasingly demands high-quality annotated training data.
In this paper, we propose Panacea, an innovative approach to generate panoramic and …