FoundationPose: Unified 6D pose estimation and tracking of novel objects

B Wen, W Yang, J Kautz… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
We present FoundationPose, a unified foundation model for 6D object pose estimation and
tracking, supporting both model-based and model-free setups. Our approach can be instantly …

BundleSDF: Neural 6-DoF tracking and 3D reconstruction of unknown objects

B Wen, J Tremblay, V Blukis, S Tyree… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present a near real-time (10Hz) method for 6-DoF tracking of an unknown object from a
monocular RGBD video sequence, while simultaneously performing neural 3D …

MimicGen: A data generation system for scalable robot learning using human demonstrations

A Mandlekar, S Nasiriany, B Wen, I Akinola… - arXiv preprint arXiv …, 2023 - arxiv.org
Imitation learning from a large set of human demonstrations has proved to be an effective
paradigm for building capable robot agents. However, the demonstrations can be extremely …

Instance-adaptive and geometric-aware keypoint learning for category-level 6D object pose estimation

X Lin, W Yang, Y Gao, T Zhang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Category-level 6D object pose estimation aims to estimate the rotation, translation, and size
of unseen instances within specific categories. In this area, dense correspondence-based …

IndustReal: Transferring contact-rich assembly tasks from simulation to reality

B Tang, MA Lin, I Akinola, A Handa… - arXiv preprint arXiv …, 2023 - arxiv.org
Robotic assembly is a longstanding challenge, requiring contact-rich interaction and high
precision and accuracy. Many applications also require adaptivity to diverse parts, poses …

TTA-COPE: Test-time adaptation for category-level object pose estimation

T Lee, J Tremblay, V Blukis, B Wen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Test-time adaptation methods have been gaining attention recently as a practical solution for
addressing source-to-target domain gaps by gradually updating the model without requiring …

SE(3)-equivariant relational rearrangement with neural descriptor fields

A Simeonov, Y Du, YC Lin, AR Garcia… - … on Robot Learning, 2023 - proceedings.mlr.press
We present a framework for specifying tasks involving spatial relations between objects
using only 5-10 demonstrations and then executing such tasks given point cloud …

Vision-based manipulation from single human video with open-world object graphs

Y Zhu, A Lim, P Stone, Y Zhu - arXiv preprint arXiv:2405.20321, 2024 - arxiv.org
We present an object-centric approach to empower robots to learn vision-based
manipulation skills from human videos. We investigate the problem of imitating robot …

HANDAL: A dataset of real-world manipulable object categories with pose annotations, affordances, and reconstructions

A Guo, B Wen, J Yuan, J Tremblay… - 2023 IEEE/RSJ …, 2023 - ieeexplore.ieee.org
We present the HANDAL dataset for category-level object pose estimation and affordance
prediction. Unlike previous datasets, ours is focused on robotics-ready manipulable objects …

Frame mining: A free lunch for learning robotic manipulation from 3D point clouds

M Liu, X Li, Z Ling, Y Li, H Su - arXiv preprint arXiv:2210.07442, 2022 - arxiv.org
We study how choices of input point cloud coordinate frames impact learning of
manipulation skills from 3D point clouds. There exist a variety of coordinate frame choices to …