SpatialBot: Precise spatial understanding with vision language models

W Cai, I Ponomarenko, J Yuan, X Li, W Yang… - ar…

Z Wei, Z Xu, J Guo, Y Hou, C Gao, Z Cai, J Luo… - ar…
… is a fundamental yet challenging skill in robotic manipulation, requiring
precise interaction between robotic hands and objects. In this paper, we present D (R, O) …

FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks

C Gao, H Zhang, Z Xu, Z Cai, L Shao - arXiv preprint arXiv:2412.08261, 2024 - arxiv.org
We aim to develop a model-based planning framework for world models that can be scaled
with increasing model and data budgets for general-purpose manipulation tasks with only …