Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation

Y Jia, J Liu, S Chen, C Gu, Z Wang, L Luo… - arXiv preprint arXiv …, 2024 - arxiv.org
3D geometric information is essential for manipulation tasks, as robots need to perceive the
3D environment, reason about spatial relationships, and interact with intricate spatial …

GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-grained Video-language Learning

Y Wang, Z Zhang, J Wang, D Fan, Z Xu, L Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
In various video-language learning tasks, the challenge of achieving cross-modality
alignment with multi-grained data persists. We propose a method to tackle this challenge …