A Survey of World Models for Autonomous Driving
T Feng, W Wang, Y Yang - arXiv preprint arXiv:2501.11260, 2025 - arxiv.org
Recent breakthroughs in autonomous driving have revolutionized the way vehicles perceive
and interact with their surroundings. In particular, world models have emerged as a linchpin …
Maximizing Alignment with Minimal Feedback: Efficiently Learning Rewards for Visuomotor Robot Policy Alignment
Visuomotor robot policies, increasingly pre-trained on large-scale datasets, promise
significant advancements across robotics domains. However, aligning these policies with …
Distilling Multi-modal Large Language Models for Autonomous Driving
Autonomous driving demands safe motion planning, especially in critical "long-tail"
scenarios. Recent end-to-end autonomous driving systems leverage large language models …
V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models
Current autonomous driving vehicles rely mainly on their individual sensors to understand
surrounding scenes and plan for future trajectories, which can be unreliable when the …
LEO: Boosting Mixture of Vision Encoders for Multimodal Large Language Models
Enhanced visual understanding serves as a cornerstone for multimodal large language
models (MLLMs). Recent hybrid MLLMs incorporate a mixture of vision experts to address …
MTA: Multimodal Task Alignment for BEV Perception and Captioning
Bird's eye view (BEV)-based 3D perception plays a crucial role in autonomous driving
applications. The rise of large language models has spurred interest in BEV-based …
World knowledge-enhanced Reasoning Using Instruction-guided Interactor in Autonomous Driving
The Multi-modal Large Language Models (MLLMs) with extensive world knowledge have
revitalized autonomous driving, particularly in reasoning tasks within perceivable regions …
Scaling LLM Pre-training with Vocabulary Curriculum
F Yu - arXiv preprint arXiv:2502.17910, 2025 - arxiv.org
Modern language models rely on static vocabularies, fixed before pretraining, in contrast to
the adaptive vocabulary acquisition observed in human language learning. To bridge this …