- Academic Search

Vision language models in autonomous driving: A survey and outlook

X Zhou, M Liu, E Yurtsever, BL Zagar… - IEEE Transactions …, 2024 - ieeexplore.ieee.org

The applications of Vision-Language Models (VLMs) in the field of Autonomous Driving (AD)
have attracted widespread attention due to their outstanding performance and the ability to …

Cited by 75

Prospective role of foundation models in advancing autonomous vehicles

J Wu, B Gao, J Gao, J Yu, H Chu, Q Yu, X Gong… - Research, 2024 - spj.science.org

With the development of artificial intelligence and breakthroughs in deep learning, large-
scale foundation models (FMs), such as generative pre-trained transformer (GPT), Sora, etc …

Cited by 3

SketchQL: Video Moment Querying with a Visual Query Interface

R Wu, P Chunduri, A Payani, X Chu, J Arulraj… - Proceedings of the …, 2024 - dl.acm.org

Localizing video moments based on the movement patterns of objects is an important task in
video analytics. Existing video analytics systems offer two types of querying interfaces based …

Cited by 1

V2V: Efficiently Synthesizing Video Results for Video Queries

D Winecki, A Nandi - 2024 IEEE 40th International Conference …, 2024 - ieeexplore.ieee.org

Querying video data has become increasingly popular and useful. Video queries can be
complex, ranging from retrieval tasks (“find me the top videos that have…”), to analytics …

Cited by 1

Radar spectra-language model for automotive scene parsing

M Pushkareva, Y Feldman, C Domokos… - arXiv preprint arXiv …, 2024 - arxiv.org

Radar sensors are low cost, long-range, and weather-resilient. Therefore, they are widely
used for driver assistance functions, and are expected to be crucial for the success of …

Cited by 2

Self-Enhancing Video Data Management System for Compositional Events with Large Language Models [Technical Report]

E Zhang, N Sullivan, B Haynes, R Krishna… - arXiv preprint arXiv …, 2024 - arxiv.org

Complex video queries can be answered by decomposing them into modular subtasks.
However, existing video data management systems assume the existence of predefined …

Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning

A Ziai, A Vartakavi - arXiv preprint arXiv:2402.06560, 2024 - arxiv.org

High-quality and consistent annotations are fundamental to the successful development of
robust machine learning models. Traditional data annotation methods are resource …

Large (Vision) Language Models for Autonomous Vehicles: Current Trends and Future Directions

H Tian, K Reddy, Y Feng, M Quddus, Y Demiris… - Authorea Preprints - techrxiv.org

As autonomous vehicles (AVs) advance, the integration of Large (Vision) Language Models
(L(V)LMs) has emerged as a promising approach to enhance AV capabilities in perception …


