Omni6DPose: A benchmark and model for universal 6D object pose estimation and tracking

J Zhang, W Huang, B Peng, M Wu, F Hu… - … on Computer Vision, 2024 - Springer
6D object pose estimation is crucial in the field of computer vision. However, it
suffers from a significant lack of large-scale and diverse datasets, impeding comprehensive …

Dream2Real: Zero-shot 3D object rearrangement with vision-language models

I Kapelyukh, Y Ren, I Alzugaray… - First Workshop on Vision …, 2024 - openreview.net
We introduce Dream2Real, a robotics framework which integrates vision-language models
(VLMs) trained on 2D data into a 3D object rearrangement pipeline. This is achieved by the …

Human-object interaction from human-level instructions

Z Wu, J Li, CK Liu - arXiv preprint arXiv:2406.17840, 2024 - arxiv.org
Intelligent agents need to autonomously navigate and interact within contextual
environments to perform a wide range of daily tasks based on human-level instructions …

Deep Networks and Sensor Fusion for Personal Care Robot Tasks-A Review

N Ramsai, K Sridharan - IEEE Sensors Journal, 2025 - ieeexplore.ieee.org
Welfare support systems involving robots have been actively researched during the last two
decades. Early attempts have been largely based on classical approaches to process …

Learning Instruction-Guided Manipulation Affordance via Large Models for Embodied Robotic Tasks

D Li, C Zhao, S Yang, L Ma, Y Li… - … on Advanced Robotics …, 2024 - ieeexplore.ieee.org
We study the task of language instruction-guided robotic manipulation, in which an
embodied robot is supposed to manipulate the target objects based on the language …

OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints

M Pan, J Zhang, T Wu, Y Zhao, W Gao… - arXiv preprint arXiv …, 2025 - arxiv.org
The development of general robotic systems capable of manipulating in unstructured
environments is a significant challenge. While Vision-Language Models (VLM) excel in high …