RoboPoint: A vision-language model for spatial affordance prediction for robotics

W Yuan, J Duan, V Blukis, W Pumacay… - arXiv preprint arXiv …, 2024 - arxiv.org

GLOVER: Generalizable open-vocabulary affordance reasoning for task-oriented grasping

T Ma, Z Wang, J Zhou, M Wang, J Liang - arXiv preprint arXiv:2411.12286, 2024 - arxiv.org
Inferring affordable (i.e., graspable) parts of arbitrary objects based on human specifications
is essential for robots advancing toward open-vocabulary manipulation. Current grasp …

UniAff: A unified representation of affordances for tool usage and articulation with vision-language models

Q Yu, S Huang, X Yuan, Z Jiang, C Hao, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Previous studies on robotic manipulation are based on a limited understanding of the
underlying 3D motion constraints and affordances. To address these challenges, we …

ShowUI: One vision-language-action model for GUI visual agent

KQ Lin, L Li, D Gao, Z Yang, S Wu, Z Bai, W Lei… - arXiv preprint arXiv …, 2024 - arxiv.org
Building Graphical User Interface (GUI) assistants holds significant promise for enhancing
human workflow productivity. While most agents are language-based, relying on closed …

Improving Vision-Language-Action Models via Chain-of-Affordance

J Li, Y Zhu, Z Tang, J Wen, M Zhu, X Liu, C Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Robot foundation models, particularly Vision-Language-Action (VLA) models, have
garnered significant attention for their ability to enhance robot policy learning, greatly …

Objects and Actions: Learning Representations for Open-World Robotics

W Yuan - 2024 - search.proquest.com
Advancing robotics involves enabling systems to generalize across diverse and unseen
environments, known as "the open world." Traditional approaches rely on state estimators …

Understanding Depth and Height Perception in Large Visual-Language Models

S Azad, Y Jain, R Garg, YS Rawat, V Vineet - openreview.net
Geometric understanding—including depth and height perception—is fundamental to
intelligence and crucial for navigating our environment. Despite the impressive capabilities …