End-to-end autonomous driving: Challenges and frontiers

L Chen, P Wu, K Chitta, B Jaeger… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
The autonomous driving community has witnessed a rapid growth in approaches that
embrace an end-to-end algorithm framework, utilizing raw sensor input to generate vehicle …

GPT-Driver: Learning to drive with GPT

J Mao, Y Qian, J Ye, H Zhao, Y Wang - arXiv preprint arXiv:2310.01415, 2023 - arxiv.org
We present a simple yet effective approach that can transform the OpenAI GPT-3.5 model
into a reliable motion planner for autonomous vehicles. Motion planning is a core challenge …

DriveDreamer: Towards Real-World-Drive World Models for Autonomous Driving

X Wang, Z Zhu, G Huang, X Chen, J Zhu… - European Conference on …, 2024 - Springer
World models, especially in autonomous driving, are trending and drawing extensive
attention due to their capacity for comprehending driving environments. The established …

Driving into the future: Multiview visual forecasting and planning with world model for autonomous driving

Y Wang, J He, L Fan, H Li, Y Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com
In autonomous driving, predicting future events in advance and evaluating the foreseeable
risks empowers autonomous vehicles to plan their actions, enhancing safety and efficiency …

Vista: A generalizable driving world model with high fidelity and versatile controllability

S Gao, J Yang, L Chen, K Chitta… - Advances in …, 2025 - proceedings.neurips.cc
World models can foresee the outcomes of different actions, which is of paramount
importance for autonomous driving. Nevertheless, existing driving world models still have …

BEVFormer: learning bird's-eye-view representation from lidar-camera via spatiotemporal transformers

Z Li, W Wang, H Li, E Xie, C Sima, T Lu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Multi-modality fusion strategy is currently the de-facto most competitive solution for 3D
perception tasks. In this work, we present a new framework termed BEVFormer, which learns …

DriveVLM: The convergence of autonomous driving and large vision-language models

X Tian, J Gu, B Li, Y Liu, Y Wang, Z Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
A primary hurdle of autonomous driving in urban environments is understanding complex
and long-tail scenarios, such as challenging road conditions and delicate human behaviors …

SelfOcc: Self-supervised vision-based 3D occupancy prediction

Y Huang, W Zheng, B Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
3D occupancy prediction is an important task for the robustness of vision-centric
autonomous driving which aims to predict whether each point is occupied in the surrounding …

MapTRv2: An end-to-end framework for online vectorized HD map construction

B Liao, S Chen, Y Zhang, B Jiang, Q Zhang… - International Journal of …, 2024 - Springer
High-definition (HD) map provides abundant and precise static environmental information of
the driving scene, serving as a fundamental and indispensable component for planning in …

OccWorld: Learning a 3D occupancy world model for autonomous driving

W Zheng, W Chen, Y Huang, B Zhang, Y Duan… - European conference on …, 2024 - Springer
Understanding how the 3D scene evolves is vital for making decisions in autonomous
driving. Most existing methods achieve this by predicting the movements of object boxes …