- Academic Search

S Chen, X Chen, C Zhang, M Li, G Yu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recent progress in Large Multimodal Models (LMM) has opened up great
possibilities for various applications in the field of human-machine interactions. However …

Cited by 61 (3 versions)

Tod3cap: Towards 3d dense captioning in outdoor scenes

B Jin, Y Zheng, P Li, W Li, Y Zheng, S Hu, X Liu… - … on Computer Vision, 2024 - Springer

3D dense captioning stands as a cornerstone in achieving a comprehensive
understanding of 3D scenes through natural language. It has recently witnessed remarkable …

Cited by 8 (2 versions)

Chat-scene: Bridging 3d scene and large language models with object identifiers

H Huang, Y Chen, Z Wang, R Huang, R Xu… - The Thirty-eighth …, 2024 - openreview.net

Recent advancements in 3D Large Language Models (LLMs) have demonstrated promising
capabilities for 3D scene understanding. However, previous methods exhibit deficiencies in …

Cited by 12

Bi-directional contextual attention for 3d dense captioning

M Kim, HS Lim, S Lee, B Kim, G Kim - European Conference on Computer …, 2024 - Springer

3D dense captioning is a task involving the localization of objects and the
generation of descriptions for each object in a 3D scene. Recent approaches have …

Cited by 2 (7 versions)

Hint-ad: Holistically aligned interpretability in end-to-end autonomous driving

K Ding, B Chen, Y Su, H Gao, B Jin, C Sima… - arXiv …

… robust autonomous …

Cited by 1

Lightweight Model Pre-Training Via Language Guided Knowledge Distillation

M Li, L Zhang, M Zhu, Z Huang, G Yu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

This paper studies the problem of pre-training for small models, which is essential for many
mobile devices. Current state-of-the-art methods on this problem transfer the …

Cited by 2 (2 versions)

Vote2cap-detr++: Decoupling localization and describing for end-to-end 3d dense captioning

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding Reasoning and Planning
