Google Scholar

Using multimodal large language models (MLLMs) for automated detection of traffic safety-critical events

M Abu Tami, HI Ashqar, M Elhenawy, S Glaser… - Vehicles, 2024 - mdpi.com

Traditional approaches to safety event analysis in autonomous systems have relied on
complex machine and deep learning models and extensive datasets for high accuracy and …

Cited by 13

V2X-VLM: End-to-end V2X cooperative autonomous driving through large vision-language models

J You, H Shi, Z Jiang, Z Huang, R Gan, K Wu… - arXiv preprint arXiv …, 2024 - arxiv.org

Advancements in autonomous driving have increasingly focused on end-to-end (E2E)
systems that manage the full spectrum of driving tasks, from environmental perception to …

Cited by 10

GRID: Visual layout generation

C Wan, X Luo, Z Cai, Y Song, Y Zhao, Y Bai… - arXiv preprint arXiv …, 2024 - arxiv.org

In this paper, we introduce GRID, a novel paradigm that reframes a broad range of visual
generation tasks as the problem of arranging grids, akin to film strips. At its core, GRID …

Cited by 2

VLM-MPC: Vision Language Foundation Model (VLM)-Guided Model Predictive Controller (MPC) for Autonomous Driving

K Long, H Shi, J Liu, X Li - arXiv preprint arXiv:2408.04821, 2024 - arxiv.org

Motivated by the emergent reasoning capabilities of Vision Language Models (VLMs) and
their potential to improve the comprehensibility of autonomous driving systems, this paper …

Cited by 2

Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving

T Li, H Wang, X Li, W Liao, T He, P Peng - arXiv preprint arXiv:2501.08861, 2025 - arxiv.org

Autonomous driving is a challenging task that requires perceiving and understanding the
surrounding environment for safe trajectory planning. While existing vision-based end-to …

FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training

A Cao, X Wei, Z Ma - arXiv preprint arXiv:2411.11927, 2024 - arxiv.org

Language-image pre-training faces significant challenges due to limited data in specific
formats and the constrained capacities of text encoders. While prevailing methods attempt to …

World knowledge-enhanced Reasoning Using Instruction-guided Interactor in Autonomous Driving

M Zhai, C Li, Z Guo, N Yang, X Qin, Y Wu… - arXiv preprint arXiv …, 2024 - arxiv.org

The Multi-modal Large Language Models (MLLMs) with extensive world knowledge have
revitalized autonomous driving, particularly in reasoning tasks within perceivable regions …

Cited by 2

Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?
