Real-world robot applications of foundation models: A review

K Kawaharazuka, T Matsushima… - Advanced …, 2024 - Taylor & Francis
Recent developments in foundation models, like Large Language Models (LLMs) and Vision-
Language Models (VLMs), trained on extensive data, facilitate flexible application across …

A survey on integration of large language models with intelligent robots

Y Kim, D Kim, J Choi, J Park, N Oh, D Park - Intelligent Service Robotics, 2024 - Springer
In recent years, the integration of large language models (LLMs) has revolutionized the field
of robotics, enabling robots to communicate, understand, and reason with human-like …

Lerf: Language embedded radiance fields

J Kerr, CM Kim, K Goldberg… - Proceedings of the …, 2023 - openaccess.thecvf.com
Humans describe the physical world using natural language to refer to specific 3D locations
based on a vast range of properties: visual appearance, semantics, abstract associations, or …

Gaussian grouping: Segment and edit anything in 3d scenes

M Ye, M Danelljan, F Yu, L Ke - European Conference on Computer …, 2024 - Springer
The recent Gaussian Splatting achieves high-quality and real-time novel-view
synthesis of 3D scenes. However, it is solely concentrated on the appearance and …

Conceptgraphs: Open-vocabulary 3d scene graphs for perception and planning

Q Gu, A Kuwajerwala, S Morin… - … on Robotics and …, 2024 - ieeexplore.ieee.org
For robots to perform a wide variety of tasks, they require a 3D representation of the world
that is semantically rich, yet compact and efficient for task-driven perception and planning …

Gaussianavatar: Towards realistic human avatar modeling from a single video via animatable 3d gaussians

L Hu, H Zhang, Y Zhang, B Zhou… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present GaussianAvatar, an efficient approach to creating realistic human avatars with
dynamic 3D appearances from a single video. We start by introducing animatable 3D …

Segment anything in 3d with nerfs

J Cen, Z Zhou, J Fang, W Shen, L Xie… - Advances in …, 2023 - proceedings.neurips.cc
Recently, the Segment Anything Model (SAM) emerged as a powerful vision
foundation model which is capable of segmenting anything in 2D images. This paper aims to …

Physically grounded vision-language models for robotic manipulation

J Gao, B Sarkar, F Xia, T Xiao, J Wu… - … on Robotics and …, 2024 - ieeexplore.ieee.org
Recent advances in vision-language models (VLMs) have led to improved performance on
tasks such as visual question answering and image captioning. Consequently, these models …

Openshape: Scaling up 3d shape representation towards open-world understanding

M Liu, R Shi, K Kuang, Y Zhu, X Li… - Advances in neural …, 2024 - proceedings.neurips.cc
We introduce OpenShape, a method for learning multi-modal joint representations of text,
image, and point clouds. We adopt the commonly used multi-modal contrastive learning …

Shapellm: Universal 3d object understanding for embodied interaction

Z Qi, R Dong, S Zhang, H Geng, C Han, Z Ge… - … on Computer Vision, 2024 - Springer
This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM)
designed for embodied interaction, exploring a universal 3D object understanding with 3D …