Omni6DPose: A benchmark and model for universal 6D object pose estimation and tracking
6D object pose estimation is crucial in the field of computer vision. However, it
suffers from a significant lack of large-scale and diverse datasets, impeding comprehensive …
Dream2Real: Zero-shot 3D object rearrangement with vision-language models
We introduce Dream2Real, a robotics framework which integrates vision-language models
(VLMs) trained on 2D data into a 3D object rearrangement pipeline. This is achieved by the …
Human-object interaction from human-level instructions
Intelligent agents need to autonomously navigate and interact within contextual
environments to perform a wide range of daily tasks based on human-level instructions …
Deep Networks and Sensor Fusion for Personal Care Robot Tasks-A Review
N Ramsai, K Sridharan - IEEE Sensors Journal, 2025 - ieeexplore.ieee.org
Welfare support systems involving robots have been actively researched during the last two
decades. Early attempts have been largely based on classical approaches to process …
Learning Instruction-Guided Manipulation Affordance via Large Models for Embodied Robotic Tasks
We study the task of language instruction-guided robotic manipulation, in which an
embodied robot is supposed to manipulate the target objects based on the language …
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints
The development of general robotic systems capable of manipulating in unstructured
environments is a significant challenge. While Vision-Language Models (VLM) excel in high …