Deep learning-based object pose estimation: A comprehensive survey

J Liu, W Sun, H Yang, Z Zeng, C Liu, J Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
Object pose estimation is a fundamental computer vision problem with broad applications in
augmented reality and robotics. Over the past decade, deep learning models, due to their …

What foundation models can bring for robot learning in manipulation: A survey

D Li, Y Jin, Y Sun, H Yu, J Shi, X Hao, P Hao… - arXiv preprint arXiv …, 2024 - arxiv.org
The realization of universal robots is an ultimate goal of researchers. However, a key hurdle
in achieving this goal lies in the robots' ability to manipulate objects in their unstructured …

ImageNet3D: Towards general-purpose object-level 3D understanding

W Ma, G Zeng, G Zhang, Q Liu, L Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
A vision model with general-purpose object-level 3D understanding should be capable of
inferring both 2D (e.g., class name and bounding box) and 3D information (e.g., 3D location …

InterTrack: Tracking human object interaction without object templates

X Xie, JE Lenssen, G Pons-Moll - arXiv preprint arXiv:2408.13953, 2024 - arxiv.org
Tracking human object interaction from videos is important to understand human behavior
from the rapidly growing stream of video data. Previous video-based methods require …

High-resolution open-vocabulary object 6D pose estimation

J Corsetti, D Boscaini, F Giuliari, C Oh… - arXiv preprint arXiv …, 2024 - arxiv.org
The generalisation to unseen objects in the 6D pose estimation task is very challenging.
While Vision-Language Models (VLMs) enable using natural language descriptions to …