EmbodiedOcc: Embodied 3D occupancy prediction for vision-based online scene understanding

Y Wu, W Zheng, S Zuo, Y Huang, J Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
3D occupancy prediction provides a comprehensive description of the surrounding scenes
and has become an essential task for 3D perception. Most existing methods focus on offline …

ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding

AT Wang, ZM Gong, AX Chang - arXiv preprint arXiv:2501.01366, 2025 - arxiv.org
3D visual grounding (3DVG) involves localizing entities in a 3D scene referred to by natural
language text. Such models are useful for embodied AI and scene retrieval applications …